[Scilab-users] Regex problem

Samuel Gougeon sgougeon at free.fr
Fri Aug 19 19:14:19 CEST 2016


On 19/08/2016 18:04, Samuel Gougeon wrote:
> On 19/08/2016 17:20, Samuel Gougeon wrote:
>> On 19/08/2016 13:26, Gerhard Kreuzer wrote:
>>> .../...
>>> Any idea how I can parse my file and get the binary data blocks into 
>>> variables, or at least get pointers to the starting points, so I am 
>>> able to read the binary data with some file read function?
>>
>> Here is a working example supporting ascii(10). It looks like the "ms" 
>> modifiers must be used together:
> Ah, actually not: "s" works alone, but as a global modifier (I have not 
> yet checked it locally, inside a capturing group):
It also works locally, as an inline modifier inside a group:
--> s = "abcd" + ascii(10) + "efghijkClmnop" + ascii(10) + "fg hiJkl";
--> [trash,trash,captures] = regexp(s, "/c(?s:.*?)i/i"); captures
  captures  =

!cd
efghi      !
!              !
!Clmnop
fg hi  !

So on your sample, it could be:

// Sample string including some \n:
s = "%!bk{#data#some binary code including \n as here" + ascii(10) ..
+ "etc etc#EOC# intersticial binary#data#Let''s go on with binary" + ascii(10) ..
+ "other bytes in 0:31 are also of concern#EOC# remaining binary content"
// Capturing the patterns
[t,t,captures] = regexp(s, "/#data#.*?#EOC#/s"); captures
// Removing the "#data#" and "#EOC#" delimiters:
part(captures,7:$-5)

// yielding:--------------

--> // Sample string including some \n:
--> s = "%!bk{#data#some binary code including \n as here" + ascii(10) ..
   > +"etc etc#EOC# intersticial binary#data#Let''s go on with binary" + ascii(10) ..
   > + "other bytes in 0:31 are also of concern#EOC# remaining binary content"
  s  =
  %!bk{#data#some binary code including \n as here
etc etc#EOC# intersticial binary#data#Let's go on with binary
other bytes in 0:31 are also of concern#EOC# remaining binary content

--> // Capturing the patterns
--> [t,t,captures] = regexp(s, "/#data#.*?#EOC#/s"); captures
  captures  =
!#data#some binary code including \n as here
etc etc#EOC#                     !
!#data#Let's go on with binary
other bytes in 0:31 are also of concern#EOC#  !

--> // Removing the "#data#" and "#EOC#" delimiters:
--> part(captures,7:$-5)
  ans  =
!some binary code including \n as here
etc etc !
!Let's go on with binary
other bytes in 0:31 are also of concern  !
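
Since the original question also asked for pointers to the starting points, here is a minimal sketch using the first two outputs of regexp(), which give the positions of the first and last character of each match. The sample string and the variable names (iStart, iEnd, payloadStart, payloadEnd) are made up for the illustration. The positions are 1-based character indices; treating them as byte positions in a file assumes one byte per character, which I have not verified on real binary content.

```scilab
// Sketch only: short made-up sample, same "#data#" ... "#EOC#" delimiters as above
s = "%!bk{#data#some binary code" + ascii(10) + "etc etc#EOC# in-between bytes" ..
  + "#data#more" + ascii(10) + "bytes#EOC# tail";

// 1st and 2nd outputs: index of the first and last character of each match
[iStart, iEnd] = regexp(s, "/#data#.*?#EOC#/s");

// Payload boundaries: skip the 6-character "#data#" and the 5-character "#EOC#"
payloadStart = iStart + 6;
payloadEnd   = iEnd - 5;

for k = 1:size(iStart, "*")
    block = part(s, payloadStart(k):payloadEnd(k));
    mprintf("block %d: characters %d to %d (%d chars)\n", ..
            k, payloadStart(k), payloadEnd(k), length(block));
end
```

If these positions are later used with mseek(), remember that mseek() offsets are counted from 0, so payloadStart(k) - 1 would be the offset to seek to (again assuming one byte per character).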

If you get results with mgetstr() or other ways to feed regexp(), it 
would be good to report them :)
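
A possible way to try the mgetstr() route is sketched below. It is only an assumption-laden demo: the file name and its content are invented, the file is written first so that the snippet is self-contained, and it only contains printable characters plus ascii(10). How mgetstr() and regexp() behave with other bytes (bytes equal to 0 in particular) is exactly what remains to be tested.

```scilab
// Sketch only: the file name and its content are made up for the demo
fname = fullfile(TMPDIR, "sample.bin");
fd = mopen(fname, "wb");
mputstr("head#data#abc" + ascii(10) + "def#EOC#mid#data#ghi#EOC#tail", fd);
mclose(fd);

// Read the whole file back as a single string
info = fileinfo(fname);        // info(1) is the file size in bytes
fd = mopen(fname, "rb");
s = mgetstr(info(1), fd);
mclose(fd);

// Same pattern as above: "s" lets "." match ascii(10) as well
[iStart, iEnd, blocks] = regexp(s, "/#data#.*?#EOC#/s");

// Strip the 6-character "#data#" and the 5-character "#EOC#"
payload = part(blocks, 7:$-5)
```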

Cheers
Samuel
