[Scilab-users] large csv text file management issue

Serge Steer Serge.Steer at inria.fr
Wed Feb 13 18:39:26 CET 2013


If you only want to read part of such a big file, you can use the 
following function (or something similar).

The function assumes that the numbers are separated by spaces or tabs and 
that the decimal separator is a point (not a comma).

function M=getcsvpart(filename, from, to)
   //read lines from..to of a numeric text file into the matrix M
   //line from-1 is consumed to count the columns, so from must be >= 2
   u=mopen(filename,'r');//open the file
   //skip the first from-2 lines, in batches of 10000 lines
   n=floor((from-2)/10000);
   for k=1:n
     mgetl(u,10000);
   end
   n=(from-2)-10000*n;
   if n>0 then  mgetl(u,n);end
   //read line from-1 and use it to find the number of columns
   t=mgetl(u,1); //the from -1 line
   //compute the number of columns
   t=strsubst(t,"  *"," ");
   ncol=size(tokens(t,[" ",ascii(9)]),'*'); //split on space or tab
   //ncol=size(tokens(t,[","]),'*'); //for comma separated csv files

   //generate C format for reading
   fmt="%g";
   fmt=strcat(fmt(ones(1,ncol))," ");
   //fmt=strcat(fmt(ones(1,ncol)),",");    //for comma separated csv files
   //read the selected part of the file (lines from..to)
   M=mfscanf(to-from+1,u,fmt);
   mclose(u);
endfunction
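
As a quick check, the function can be tried on a small file first. The sketch below assumes getcsvpart has already been loaded into the session; the file name and its contents are made up for the illustration and are not taken from the original data.

```scilab
// Hypothetical demo file: 6 lines, 3 space-separated columns
fname = fullfile(TMPDIR, "getcsvpart_demo.txt");
lines = ["1 2 3"; "4 5 6"; "7 8 9"; "10 11 12"; "13 14 15"; "16 17 18"];
mputl(lines, fname);

// Read lines 3 to 5 only (line 2 is used internally to count the columns)
M = getcsvpart(fname, 3, 5);
disp(M);
```

For this file M should hold the three rows starting with 7, 10 and 13. Note that from has to be at least 2, since the line just before the requested range is read to determine the number of columns.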

Serge Steer
INRIA

On 12/02/2013 16:01, David Chèze wrote:
> Hi all !
>
> I'm struggling with a large csv text file (900 MB, 2628001 rows, 11 columns)
> from which I would like to extract some data. I tried csvRead() with the range
> option, but it fails with error 999 even with stacksize set to max.
> I opened the file with the LargeTextFileViewer software and checked
> that the overall format was OK. Using this software I copied the first 100 lines
> into a test file, which is read successfully by csvRead() using the range
> option, so the command looks OK.
> The problem with csvRead seems to occur before the data is converted into
> Scilab memory: just while trying to open the huge text file.
> Could csvRead (using the range option) work in a similar way to LTFviewer so
> that it can manage large text files?
> To work around this limitation I tried to split my huge text file into smaller
> pieces, but unfortunately I didn't find any free software under Windows to do
> that.
> Here's a test file with a few lines, the original is far larger:  test.plt
> <http://mailinglists.scilab.org/file/n4025917/test.plt>
>
>
> Any ideas that could help?
>
> Thanks David
>
> W7-32bit-scilab5.4.1branch



