[Scilab-users] HDF5 save is super slow

Antoine Monmayrant antoine.monmayrant at laas.fr
Mon Oct 15 12:22:58 CEST 2018


On 15/10/2018 at 11:55, Arvid Rosén wrote:
>
> Hi,
>
> Thanks for getting back to me!
>
> Unfortunately, we used Scilab’s pretty cool way of doing object 
> orientation, so we have big nested tlist structures with multiple 
> instances of various lists of filters and other structures, as in my 
> example. Saving those structures in some explicit manual way would be 
> extremely complicated. Or is there some way of writing explicit HDF5 
> saving/loading schemes using overloading? That would be great! I am 
> sure we could find the main culprits and do something explicit for 
> them, but as they can be located anywhere in a big nested structure, 
> it would be painful to do anything at the top level.
>
> Another, related I guess, problem here is that the new file format 
> uses about 15 times as much disk space as the old format (for a 
> typical ill-behaved nested structure). That adds to the save/load time 
> too I guess, but is probably not the main source here.
>
Argh, yes, I tested it and with your example the file I get is 8.5 times bigger.
I think that the increases in both time and size are real issues and should 
be reported as bugs.

By the way, I rewrote your script to run it under both 6.0 and 5.5:

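/////////////////////////////////
// Build a list of n random continuous-time state-space filters
// (N states, one input, one output).
N = 4;
n = 10000;
filters = list();

for i=1:n
    G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
    filters($+1) = G;
end

// getversion('scilab') returns a numeric vector; ver(1) is the major version.
ver=getversion('scilab');

if ver(1)<6 then
    // Scilab 5.x: old save, variable passed by value
    tic();
    save('filters_old.dat', filters);
    ts1 = toc();
else
    // Scilab 6.x: new HDF5-based save, variable passed by name
    tic();
    save('filters_new.dat', 'filters');
    ts1 = toc();
end

printf("Time for save %.2fs\n", ts1);
/////////////////////////////////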
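
To put a number on the file size as well, something like the sketch below could be used. It is only an illustration: the file name, the smaller n, and the "raw payload" estimate (just the doubles in A, B, C and D, 8 bytes each) are my own assumptions, not measurements from this thread.

```scilab
// Sketch: save a small list of state-space filters and report the file size.
N = 4;
n = 1000;   // assumption: smaller than in the script above, to keep the run short
filters = list();
for i = 1:n
    filters($+1) = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
end

// Hypothetical file name inside Scilab's temporary directory
fname = fullfile(TMPDIR, "filters_size_test.dat");

ver = getversion('scilab');
if ver(1) < 6 then
    save(fname, filters);      // Scilab 5.x call form
else
    save(fname, 'filters');    // Scilab 6.x call form
end

info = fileinfo(fname);                  // info(1) is the file size in bytes
raw_bytes = n * (N*N + 2*N + 1) * 8;     // doubles in A, B, C, D only
printf("File size: %d bytes\n", info(1));
printf("Ratio to raw payload: %.1f\n", info(1) / raw_bytes);
```

Running it under both versions and comparing the two printed sizes should give the size ratio directly.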

Hope it helps,

Antoine

> I think I might have reported this earlier using Bugzilla, but I’m not 
> sure. I’ll check and report it if not.
>
> Cheers,
>
> Arvid
>
> From: users <users-bounces at lists.scilab.org> on behalf of 
> "amonmayr at laas.fr" <amonmayr at laas.fr>
> Reply-To: "antoine.monmayrant at laas.fr" <antoine.monmayrant at laas.fr>, 
> Users mailing list for Scilab <users at lists.scilab.org>
> Date: Monday, 15 October 2018 at 11:08
> To: "users at lists.scilab.org" <users at lists.scilab.org>
> Subject: Re: [Scilab-users] HDF5 save is super slow
>
> Hello,
>
> I tried your code in 5.5.1 and the last nightly-build of 6.0: I see a 
> slowdown by a factor of around 175 between old save in 5.5.1 and new 
> (and only) save in 6.0.
> It's really related to the data structure, because we use hdf5 
> read/write a lot here and did not experience significant slowdowns 
> using 6.0.
> I think the overhead might come from the translation of your fairly 
> complex variable (a long list of tlists) into the corresponding hdf5 
> structure.
> In the old save, this translation was not necessary.
> Maybe you could try to save your data in a different way.
> For example:
> 1) you could save each element of "filters" in a separate file.
> 2) you could bypass save and write your data directly to a hdf5 file 
> using h5open() and h5write(). It means you need to write your 
> own load() for your custom file format. But this way, you can try to 
> find the best way to lay out your data in hdf5 format.
> 3) in addition to 2) you could try to save each entry of your 
> "filters" array as one dataset in a given hdf5 file.
>
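
As an illustration of option 2), here is a minimal sketch of what a hand-written save/load pair could look like. It assumes all filters are continuous-time, single-input single-output, share the same number of states N, and have a zero initial state (which is dropped). The function names, dataset names ("dims", "packed") and file name are hypothetical. The idea is to pack all A, B, C and D coefficients into one flat vector so that only two datasets are written instead of one HDF5 structure per tlist. I have not timed it against save().

```scilab
// Hypothetical helper: pack a list of SISO state-space filters into one
// flat vector and write it as a single HDF5 dataset.
function save_filters_h5(fname, filters, N)
    n = length(filters);
    m = N*N + 2*N + 1;             // coefficients per filter: A, B, C, D
    P = zeros(m, n);
    for i = 1:n
        G = filters(i);
        A = G.A; B = G.B; C = G.C; D = G.D;
        P(:, i) = [A(:); B(:); C(:); D(:)];
    end
    a = h5open(fname, "w");
    h5write(a, "dims", [N n]);     // needed to unpack on load
    h5write(a, "packed", P(:));    // one flat column vector
    h5close(a);
endfunction

// Hypothetical helper: read the flat vector back and rebuild the list.
function filters = load_filters_h5(fname)
    a = h5open(fname, "r");
    dims = h5read(a, "/dims");
    v = h5read(a, "/packed");
    h5close(a);
    N = dims(1); n = dims(2);
    m = N*N + 2*N + 1;
    P = matrix(v, m, n);           // element order is kept for row or column v
    filters = list();
    for i = 1:n
        p = P(:, i);
        A = matrix(p(1:N*N), N, N);
        B = matrix(p(N*N+1:N*N+N), N, 1);
        C = matrix(p(N*N+N+1:N*N+2*N), 1, N);
        D = p(m);
        filters($+1) = syslin('c', A, B, C, D);   // assumption: all are 'c'
    end
endfunction

// Usage on a small example
N = 4;
n = 100;
filters = list();
for i = 1:n
    filters($+1) = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
end
fname = fullfile(TMPDIR, "filters_packed.h5");
save_filters_h5(fname, filters, N);
loaded = load_filters_h5(fname);
printf("Reloaded %d filters\n", length(loaded));
```

For option 3), the same helpers could write one dataset per filter instead of a single packed vector, at the cost of many more h5write() calls. For nested structures like Arvid's, one would need such a pair per tlist type, which is the painful part.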
> Did you search on bugzilla whether this bug was already submitted?
> Could you try to report it?
>
>
> Antoine
>
> On 15/10/2018 at 10:11, Arvid Rosén wrote:
>
>     /////////////////////////////////
>     N = 4;
>     n = 10000;
>     filters = list();
>
>     for i=1:n
>         G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
>         filters($+1) = G;
>     end
>
>     tic();
>     save('filters.dat', filters);
>     ts1 = toc();
>
>     tic();
>     save('filters.dat', 'filters');
>     ts2 = toc();
>
>     printf("old save %.2fs\n", ts1);
>     printf("new save %.2fs\n", ts2);
>     printf("slowdown %.1f\n", ts2/ts1);
>     /////////////////////////////////
>
> -- 
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   Antoine Monmayrant LAAS - CNRS
>   7 avenue du Colonel Roche
>   BP 54200
>   31031 TOULOUSE Cedex 4
>   FRANCE
>   Tel:+33 5 61 33 64 59
>   
>   email : antoine.monmayrant at laas.fr
>   permanent email : antoine.monmayrant at polytechnique.org
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++


