<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 19/08/2016 13:26, Gerhard Kreuzer
wrote:<br>
</div>
<blockquote cite="mid:072801d1fa0c$922bca70$b6835f50$@liftoff.at"
type="cite">
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">Hi,</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">I have a data file containing
metadata and binary data. I successfully scanned the file
with the .NET Regex class; now I want to scan it with Scilab,
but …</span></p>
<p class="MsoNormal"><span
style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">The relevant part of the file
looks like: #data# .. here comes binary data … #EOC# ..
here comes binary data … #EOC# .. and so on.</span></p>
<pre><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">My regex in .NET notation looks like: </span><span style="font-size:12.0pt;font-family:'Monospaced','serif';color:black">#data#((?&lt;data&gt;(?s:.*?))#EOC#)+</span></pre>
<pre><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">OK, in Scilab the notation is a little different: </span><span style="font-size:12.0pt;font-family:'Monospaced','serif';color:rosybrown">'/#data#((?P&lt;data&gt;(?s:.*?))#EOC#)+/'</span></pre>
<pre><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">This regex didn’t match at all. I started experimenting, and it looks like the regex engine stops at a newline character (0x0A), which is part of the binary data block. As far as I know (and that isn’t very far), the clause </span><span style="font-size:12.0pt;font-family:'Monospaced','serif';color:rosybrown">(?s:.*?)</span><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'"> means: take any character until you find #EOC#, but as few as possible.</span></pre>
<pre><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">No interpretation of 0x0A …</span></pre>
<pre><span style="font-size:12.0pt;font-family:'Comic Sans MS','serif'">Any idea how I can parse my file and get the binary data blocks into variables, or at least get pointers to the starting points, so that I can read the binary data with some file read function?</span></pre>
</div>
</blockquote>
<br>
Here is a working example supporting ascii(10). It looks like the "m" and "s"
modifiers must be used together:<br>
<tt>--> s = "abcd" + ascii(10) + "efghijkClmnop" + ascii(10) +
"fg hiJkl"</tt><tt><br>
</tt><tt> s = </tt><tt><br>
</tt><tt> abcd</tt><tt><br>
</tt><tt>efghijkClmnop</tt><tt><br>
</tt><tt>fg hiJkl</tt><tt><br>
</tt><tt><br>
</tt><tt>--> [trash,trash,captures] = regexp(s, "/c.*?i/<b>ms</b>i");
captures</tt><tt><br>
</tt><tt> captures = </tt><tt><br>
</tt><tt>!cd</tt><tt><br>
</tt><tt>efghi !</tt><tt><br>
</tt><tt>! !</tt><tt><br>
</tt><tt>!Clmnop</tt><tt><br>
</tt><tt>fg hi !</tt><tt><br>
</tt><tt><br>
</tt>So, in your case, you may try with:<br>
<br>
<tt>regexp(s, "/</tt><span
style="font-size:12.0pt;font-family:'Monospaced','serif';color:rosybrown">#data#(.*?)#EOC#</span><tt><b>/ms</b>");<br>
</tt><br>
Maybe the hardest part will be getting your binary content as a
string.<br>
I don't think that Scilab's regexp() will accept anything other than
a string.<br>
Maybe Scilab offers some regexp features for reading binary
files. To be investigated.<br>
<br>
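To illustrate the idea of a lazy, newline-aware match over raw bytes
(and of getting offsets back instead of strings), here is a minimal
sketch in Python rather than Scilab, since it is easy to run and check.
The sample payload, the name <tt>find_blocks</tt> and the variable names
are made up for the illustration; only the <tt>#data#</tt> and
<tt>#EOC#</tt> markers come from the description above. It assumes the
markers never occur inside a binary block.<br>
<pre>
```python
import re

# Hypothetical sample payload: a "#data#" header followed by two binary
# blocks, each terminated by "#EOC#". The first block contains a 0x0A byte.
payload = b"meta#data#\x01\x0a\x02#EOC#\xff\x00#EOC#tail"

# re.DOTALL plays the role of the "s" modifier: "." also matches 0x0A.
# The lazy ".*?" stops at the first "#EOC#" that follows.
block_re = re.compile(rb"(.*?)#EOC#", re.DOTALL)


def find_blocks(buf):
    """Return (start, end) byte offsets of each block after '#data#'."""
    pos = buf.find(b"#data#")
    if pos == -1:
        return []
    pos += len(b"#data#")
    spans = []
    while True:
        # Anchor each attempt at the end of the previous block.
        m = block_re.match(buf, pos)
        if m is None:
            break
        spans.append(m.span(1))
        pos = m.end()
    return spans


spans = find_blocks(payload)
blocks = [payload[a:b] for a, b in spans]
print(spans)
print(blocks)
```
</pre>
The offsets in <tt>spans</tt> are the "pointers to the starting points"
asked for: each pair could be handed to any binary read routine to fetch
the block directly from the file.<br>
<br>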
<tt>HTH<br>
Samuel Gougeon<br>
<br>
</tt>
</body>
</html>