[Scilab-users] Advice from Scilab community

Paul Carrico paul.carrico at free.fr
Fri Oct 25 23:40:10 CEST 2013


Dear All

First of all, thanks for all the advice I received; “members” works well
and is fast in the general case, but it is not completely applicable to my
study.

Due to the large size of my matrix (more than 400 000 rows by 4 columns,
and sometimes even more than a million rows), I have a stacksize issue. I
tried 2 different ways to look for duplicates in the matrix:

1-      Each row is compared to the whole matrix -> see example 1

2-      The matrix is split into blocks of 1000 rows (the maximum number of
rows without a stacksize issue) -> see example 2; this is a better/faster
solution than the previous one


 

Nevertheless, I haven’t found any way to avoid the (ugly) loops, which
drastically slow down the computation.

Can vectorization be used instead of loops in this study?

Once again, any additional advice from the community will be greatly
appreciated.

Have a good weekend

Paul

#############################################################################

// Example 1: each row is compared to the whole matrix
mode(0);
clear;

stacksize('max');

n = 425053; //5021
A = rand(n,4);

tic();
i = 1;
while (size(A, "r") > 0)
    // always test the first remaining row, so that no row is skipped
    // once rows have been deleted from A
    [nb,loc] = members(A(1,:), A, "rows", "shuffle", "last");
    Faces_tmp(i,:) = A(1,:);
    // the duplicate is removed from the matrix in order to decrease the
    // number of searches; it is removed first, while loc is still valid
    if nb == 2 then
        A(loc,:) = [];
    end
    A(1,:) = [];
    i = i + 1;
end
time = toc()

NB: the “members” function is quite fast.

 

###########################################################################

// Example 2: the matrix is split into blocks of 1000 rows
mode(0);
clear;

stacksize('max');

n = 425053; //5021
A = rand(n,4);

block_size = 1000;
rest = modulo(n,block_size);
number_of_blocks = (n - rest)/block_size;

printf("A split into blocks............\n");

k = 0;
tic()
if (rest <> 0) then
    // the remaining rows go into one extra, incomplete block
    k = 1;
    s0 = "splited" + string(number_of_blocks + k) + " = A($ - rest + 1 : $,:);";
    execstr(s0);
end

for h = 1 : number_of_blocks
    s1 = "splited" + string(h) + " = A(block_size * (" + string(h) + " - 1) + 1 : ";
    s1 = s1 + string(h) + " * block_size , :);";
    execstr(s1);
end
split_time = toc()

printf("Search for duplicates............\n");

// NB: when the block is compared to itself, "members" gives back 2 for
// duplicates; otherwise 1 is returned since 2 different blocks are compared.
// Duplicates are removed from the tested block.
tic()
for i = 1 : (number_of_blocks + k)
    printf("block ref %d ...\n",i);
    si = "splited" + string(i);
    for j = 1 : (number_of_blocks + k)
        sj = "splited" + string(j);
        if (j == i) then
            s2 = "[nb,loc] = members(" + si + "," + sj + ", ""rows"",""shuffle"",""last""); ";
            s2 = s2 + sj + "(loc(find(nb == 2)),:) = [];";
        else
            s2 = "[nb,loc] = members(" + si + "," + sj + ", ""rows"",""shuffle""); ";
            s2 = s2 + sj + "(loc(find(nb == 1)),:) = [];";
        end
        execstr(s2);
    end
end
time = toc()

// the blocks are gathered back into a single matrix
A_final = [];
for i = 1 : (number_of_blocks + k)
    s3 = "A_final = [A_final ; splited" + string(i) + "];";
    execstr(s3);
end
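
Regarding the vectorization question, one possible direction is sketched below. It is not taken from the thread: it assumes that two rows count as duplicates when they hold the same numbers in any order, and it swaps the row-by-row “members” search for gsort() followed by unique(). The small matrix and the names S, U, k and A_unique are hypothetical, and memory use on a matrix of 400 000+ rows has not been checked here.

```scilab
// Hypothetical data: rows 1, 2 and 4 hold the same numbers in a different order.
A = [1 2 3 4; 4 3 2 1; 5 6 7 8; 2 1 4 3; 9 9 9 9];

// Sort the entries inside each row, so that permuted rows become identical.
S = gsort(A, "c", "i");

// Keep one representative per distinct sorted row;
// k holds the corresponding row indices in S (and therefore in A).
[U, k] = unique(S, "r");

// Put the retained indices back in increasing order and extract the rows of A.
k = gsort(k, "g", "i");
A_unique = A(k, :);

disp(A_unique);
```

The loop disappears because the order of the numbers is neutralized once for the whole matrix, after which a single call to unique() handles all the rows together.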

 

 

-------- Original message --------
From: users on behalf of Samuel Gougeon
Date: Tue 22/10/2013 23:14
To: International users mailing list for Scilab.
Subject: Re: [Scilab-users] Advice from Scilab community

On 22/10/2013 13:58, Carrico, Paul wrote:
> .../..
>
> -I need to find and to remove the line which has the same numbers (but
> in a different order)
>
If you know them, you may use the function members() available since
Scilab 5.5.0,
with the options "rows" and "shuffle"

Samuel
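
To illustrate the suggestion above, here is a minimal sketch of members() with the "rows" and "shuffle" options. It assumes Scilab 5.5.0 or later; the matrices N (rows to look for) and H (rows to search in) are hypothetical and only serve the illustration.

```scilab
// H: the matrix searched in; N: the rows looked for (hypothetical data).
H = [1 2 3 4; 4 3 2 1; 5 6 7 8];
N = [2 1 4 3; 8 8 8 8];

// "rows": whole rows are compared; "shuffle": the order of the numbers
// inside a row is ignored.
[nb, loc] = members(N, H, "rows", "shuffle");

// nb(i): number of rows of H that match N(i,:)
// loc(i): index in H of a matching row (expected to be 0 when there is none)
disp(nb);
disp(loc);
```

Here the first row of N matches rows 1 and 2 of H despite the different order, while the second row of N matches nothing.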


