[Scilab-Dev] save(..) with internal timestamp? MD5 varying with unchanged content to be saved.

antoine.monmayrant+scilab at laas.fr antoine.monmayrant+scilab at laas.fr
Wed Jun 10 09:55:00 CEST 2015


On 06/10/2015 09:36 AM, Clément David wrote:
> Hi Samuel,
>
> For the record,
>
> I checked, and even though the md5sums of the two files are different, the
> h5diff tool does not report any difference. That's weird!

Not really, it is an HDF5 issue, not a Scilab one:

http://stackoverflow.com/questions/16019656/hdf5-file-h5py-with-version-control-hash-changes-on-every-save
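
As a rough check (a sketch only, in Python rather than Scilab), one can decode the four bytes that differ in the hex diff quoted below. The assumption here, which is my reading of the dump and not something checked against the HDF5 sources, is that they hold a little-endian unsigned 32-bit count of seconds since the Unix epoch, as an HDF5 object modification time would:

```python
import struct
from datetime import datetime, timezone

# The only 4 bytes that differ on the "!" line of each dump (file offset
# 0x243bc), copied from the quoted xxd output of test1.sod and test2.sod.
word_test1 = bytes.fromhex("23e47755")
word_test2 = bytes.fromhex("28e47755")


def as_epoch_seconds(raw: bytes) -> int:
    """Interpret 4 bytes as a little-endian unsigned 32-bit integer.

    Treating that integer as seconds since the Unix epoch is an assumption.
    """
    return struct.unpack("<I", raw)[0]


t1 = as_epoch_seconds(word_test1)
t2 = as_epoch_seconds(word_test2)

print(datetime.fromtimestamp(t1, tz=timezone.utc).isoformat())  # 2015-06-10T07:15:47+00:00
print(datetime.fromtimestamp(t2, tz=timezone.utc).isoformat())  # 2015-06-10T07:15:52+00:00
print(t2 - t1)  # 5
```

Both values fall on the morning of 10 June 2015 and are 5 seconds apart, which is consistent with the sleep(5000) between the two save() calls in Samuel's test. That would explain why the MD5 changes while h5diff, which compares the stored data, reports nothing.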

Antoine

>
> For a more detailed analysis, I converted the .sod binary files to
> hex dumps (using xxd on Linux) and got multiple one-character diffs in
> something which may be an entry descriptor located just before the
> "SCILAB_Class" string value.
>
> Reduced extract of the hex diff: diff -C 5 <(xxd test1.sod) <(xxd test2.sod)
>
> *** 9271,9281 ****
>    0024360: 0100 0000 0000 0000 0100 0000 0000 0000  ................
>    0024370: 0300 0800 0100 0000 1700 0000 0800 0000  ................
>    0024380: 0500 0800 0100 0000 0202 0201 0000 0000  ................
>    0024390: 0800 1800 0100 0000 0301 f8fd 0100 0000  ................
>    00243a0: 0000 0800 0000 0000 0000 0000 0000 0000  ................
> ! 00243b0: 1200 0800 0000 0000 0100 0000 23e4 7755  ............#.wU
>    00243c0: 0c00 4000 0000 0000 0100 0d00 0800 1800  ..@.............
>    00243d0: 5343 494c 4142 5f43 6c61 7373 0000 0000  SCILAB_Class....
>    00243e0: 1300 0000 0400 0000 0101 0100 0000 0000  ................
>    00243f0: 0100 0000 0000 0000 0100 0000 0000 0000  ................
>    0024400: 6c69 7374 0000 0000 0c00 4800 0000 0000  list......H.....
> --- 9271,9281 ----
>    0024360: 0100 0000 0000 0000 0100 0000 0000 0000  ................
>    0024370: 0300 0800 0100 0000 1700 0000 0800 0000  ................
>    0024380: 0500 0800 0100 0000 0202 0201 0000 0000  ................
>    0024390: 0800 1800 0100 0000 0301 f8fd 0100 0000  ................
>    00243a0: 0000 0800 0000 0000 0000 0000 0000 0000  ................
> ! 00243b0: 1200 0800 0000 0000 0100 0000 28e4 7755  ............(.wU
>    00243c0: 0c00 4000 0000 0000 0100 0d00 0800 1800  ..@.............
>    00243d0: 5343 494c 4142 5f43 6c61 7373 0000 0000  SCILAB_Class....
>    00243e0: 1300 0000 0400 0000 0101 0100 0000 0000  ................
>    00243f0: 0100 0000 0000 0000 0100 0000 0000 0000  ................
>    0024400: 6c69 7374 0000 0000 0c00 4800 0000 0000  list......H.....
> ***************
>
> --
> Clément
>
> On Wednesday, 10 June 2015 at 00:03 +0200, Samuel Gougeon wrote:
>> Hi,
>>
>> I have N figures. I would like to re-export only those that were
>> modified in the meantime.
>> Exporting with xs2### is rather time-consuming, so I am looking for
>> a shortcut.
>>
>> The approach I have in mind is the following, for a figure with handle f:
>> 1) f is saved in file1.sod. This is much faster than exporting.
>> 2) later, f is re-saved in file2.sod
>> 3) we compute the MD5 checksums of the contents of file1.sod and
>> file2.sod
>> 4) we compare both checksums. If they are different, we re-export f.
>>
>> Unfortunately, it looks like save(..) also saves an internal
>> timestamp, or something else that varies independently of the main saved content.
>> This ruins the idea, and I do not see any fallback plan. Here is a
>> proof:
>> clf
>> plot2d()
>> f = gcf();
>> save test.sod f
>> getmd5 test.sod
>> sleep(5000)
>> save test.sod f
>> getmd5 test.sod
>> Yielding:
>>
>> -->clf
>> -->plot2d()
>> -->f = gcf();
>> -->save test.sod f
>> -->getmd5 test.sod
>>   ans  =
>>   e015481486eb9708a4fe1d3df1cbbbb9
>> -->sleep(5000)
>> -->save test.sod f
>> -->getmd5 test.sod
>>   ans  =
>>   3c6319b5d3a2299caaacc2f95c3efb32
>>
>> Other tests show that
>> 1) renaming the file does not change its checksum returned by
>> getmd5()
>> 2) modifying the OS timestamp of the file (any of creation, last
>> access, last write) does not change its checksum.
>>
>> Hence, it really looks as if an internal timestamp is recorded along
>> with the actual data.
>> So, my questions are:
>> 1) Do you, developers, confirm this behavior of save()? I did not
>> go through the source code.
>> 2) Why do that? Is it on purpose, or is it a bug?
>> 3) Is there a way to avoid it?
>>      a) There is presently no option to avoid it
>>      b) As an option, recording only the date with a fixed conventional
>> time such as 00:00:00.000 would be OK for me.
>>          10s of export is <<< 24h ;)
>> 4) Would you have any idea how to do what I want with figures, without
>> using their saved handles?
>>      I posted a wish and proposal
>> http://bugzilla.scilab.org/show_bug.cgi?id=11658 3 years ago, in
>> order to
>>      open the possibility of saving copies of handles (in structures)
>> and then being able to compare them.
>>      But the thread shows no sign of life...
>>
>> Reading you soon
>>
>> Samuel
>>
>>   _______________________________________________
>> dev mailing list
>> dev at lists.scilab.org
>> http://lists.scilab.org/mailman/listinfo/dev




