[Scilab-users] Why so slow?

Richard Llom richard.llom at gmail.com
Tue May 20 20:44:34 CEST 2014


Hello,
I need to read in a CSV file of about 360,000 lines with date and numerical
values. Attached is a sample excerpt of that file.

So far I did:
==== CODE ====

// read the numeric columns, skipping the 6 header lines
tic
mydat = csvRead('dat04-2011.csv', ';', ',', 'double', [], [], [], 6);
toc // (= 5.213 s)
mydat = mydat(:,2:6);

// read the same file again as strings to get the date column
tic
mystring = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
toc // (= 3.077 s)
mystring = mystring(:,1);

// split every date string on '.', ' ' and ':' and convert the pieces to numbers
tic
for i=1:size(mydat,1)
    mydate(i,:) = strtod(strsplit(mystring(i,1), ['.';' ';':']))';
end
toc // (= 186.473 s)


==== CODE ====
(I filled in the toc values).


As you can see, this is unfortunately very slow: reading the CSV is slow, but 
the for loop especially so.

So I have several questions:

1)
Is there a faster way to read in the CSV? Note that I need the 'header' 
option.
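
For example, would something along these lines be expected to help? The idea 
(an untested sketch on my side) is to read the file from disk only once, as 
strings, and then convert the numeric columns in memory; I am not sure whether 
strsubst/strtod is actually faster than csvRead's own 'double' parsing:

==== CODE ====

// untested idea: one pass over the file, numeric conversion done in memory
raw = csvRead('dat04-2011.csv', ';', ',', 'string', [], [], [], 6);
mystring = raw(:,1);                      // date column
tmp = strsubst(raw(:,2:6), ',', '.');     // decimal comma -> decimal point
mydat = strtod(tmp);                      // element-wise string -> double

==== CODE ====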

2)
Instead of the loop, I would like to use
mydate = strtod(strsplit(mystring(:,1),['.';' ';':']))';
but this doesn't work. Is there another way to avoid the loop?
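
For instance, would something like this be a reasonable replacement? The idea 
(again untested) is to turn each date string into a small CSV line and let 
csvTextScan convert the whole column in one go, assuming csvTextScan 
(Scilab 5.4+) and that every date splits into six numeric fields on '.', ' ' 
and ':' as in the loop above:

==== CODE ====

// untested idea: vectorised date conversion via csvTextScan
s = strsubst(mystring, '.', ';');
s = strsubst(s, ' ', ';');
s = strsubst(s, ':', ';');
mydate = csvTextScan(s, ';', '.', 'double');   // one row per date, six columns

==== CODE ====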

3)
The raw csv file is around 15MB, but when I want to read it in the second 
time, Scilab says this will exceed the stacksize. Which is default by 76MB. 
So I don't quite understand how two times the 15MB file takes so much 
memory? I raised the stacksize now, but I would rather like not to.
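
(What I currently do is essentially just the following, which works but feels 
like a workaround:)

==== CODE ====

// current workaround: enlarge the stack before the reads
// (the default corresponds to the 76 MB mentioned above)
stacksize('max');    // or stacksize(n) with n = number of 8-byte words

==== CODE ====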


Any help is appreciated.
Thanks!
Richard
-------------- next part --------------
Attachment: dat04-2011.csv (text/csv, 405 bytes)
URL: <https://lists.scilab.org/pipermail/users/attachments/20140520/0e7b0856/attachment.csv>

