[Scilab-Dev] SEP#12: Scipad - Add support for encodings

fvogelnew1 at free.fr fvogelnew1 at free.fr
Thu Dec 25 23:27:22 CET 2008


According to Yung-Jang Lee <yjlee123 at gmail.com>:
>
> SEP says:

Preliminary comment: You have sent two messages, both based on SEP#12 in
revision V1.0. Are you aware that I sent a new version, V1.1, on 23 Dec.?


> I hope it is not too late to say "yes, auto-detection of encoding for xml
> files is a valuable feature for most online help authors".

Of course it is not too late.

Here is my proposal for a specification of this feature:

The encoding menu in the options menu receives a new saved preference
"Auto-detect encoding when loading xml files".

When this option is selected and the user opens a file, Scipad first detects the
file type using the existing mechanism based on the file extension (proc
extenstolang). When the file is detected to be an xml file, Scipad runs a
regular expression search for the encoding name specified in the xml prolog.
This prolog usually looks like:
    <?xml version="1.0" encoding="UTF-8"?>
It is precisely defined in the XML specification http://www.w3.org/TR/xml and I
will try to follow this specification as closely as is possible within a
single regexp. Writing (or integrating) a full-fledged XML parser is of course
out of the question.
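
To make the idea concrete, here is a minimal sketch of such a single-regexp
search. It is written in Python purely for illustration and is not Scipad
code. The names PROLOG_RE and declared_encoding are hypothetical. The pattern
approximates the XMLDecl, VersionInfo, EncodingDecl and EncName productions of
the XML 1.0 specification, and it assumes the prolog sits at the very start of
the text, with no byte order mark in front of it.

```python
import re

S = r"[ \t\r\n]"                      # white space allowed by the spec (S)
EQ = S + "*=" + S + "*"               # Eq ::= S? '=' S?
Q = "[\"']"                           # either quote character
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"  # EncName production

# Hypothetical pattern: '<?xml', version info, then the encoding declaration.
# The back-references force the closing quote to match the opening one.
PROLOG_RE = re.compile(
    r"<\?xml"
    + S + "+version" + EQ + "(?P<vq>" + Q + r")1\.[0-9]+(?P=vq)"
    + S + "+encoding" + EQ + "(?P<eq>" + Q + ")(?P<name>" + ENC_NAME + ")(?P=eq)"
)


def declared_encoding(text):
    """Return the encoding name from the xml prolog, or None if there is none."""
    m = PROLOG_RE.match(text)  # match() only looks at the start of the text
    return m.group("name") if m else None
```

For example, declared_encoding('<?xml version="1.0" encoding="UTF-8"?>') gives
the name exactly as written in the file, and a prolog without an encoding
declaration gives None.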

If no encoding name is found in the xml file just opened, nothing more happens.

If an encoding name is found, it is compared (in lower case) against the
encodings Scipad knows about (those appearing in the encoding menu).

If there is no match, nothing more happens.

In principle there cannot be more than one match ([encoding names] is not
supposed to contain duplicate entries). If there is at least one match, then the
first match is kept. If this match is the same as the current encoding, nothing
more happens.

If the match differs from the current encoding, Scipad switches the current
encoding to become the matched encoding, and then reopens the file in the same
buffer using the matched encoding.
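
The decision steps above can be summarised in a short sketch. Again this is
Python for illustration only, not Scipad code. The function name
encoding_to_switch_to and all its parameters are hypothetical, and I assume
here that both sides of the comparison are lower-cased. It returns the
encoding to reopen the file with, or None whenever "nothing more happens".

```python
def encoding_to_switch_to(option_enabled, is_xml, declared_name,
                          current_encoding, known_encodings):
    """Return the encoding to reopen the file with, or None to do nothing."""
    # The preference must be selected and the file detected as xml.
    if not option_enabled or not is_xml:
        return None
    # No encoding name found in the prolog.
    if declared_name is None:
        return None
    # Compare in lower case against the encodings the editor knows about.
    wanted = declared_name.lower()
    matches = [name for name in known_encodings if name.lower() == wanted]
    if not matches:
        return None
    # There should be at most one match; if not, the first one is kept.
    matched = matches[0]
    # Already using this encoding: nothing to do.
    if matched == current_encoding:
        return None
    return matched
```

The caller would then switch the current encoding to the returned value and
reload the file into the same buffer.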

What do you think?

Francois




