[Scilab-Dev] slint() : remarks and suggestions

Tue Apr 19 13:09:44 CEST 2016

Hello devs,

The so-called /slint /module is a new experimental tool proposed and 
somewhat documented in Scilab master: 
http://www.scilab.org/fr/development/nightly_builds/master
Scilab 6.0-b1 did not yet include it. slint is a Scilab code checker.

The Scilab community has not been involved to design it, i mean beyond 
the CNES (as likely the customer havingordered for it). S/E is planning 
to include slint into Scilab.

Due to this possible inclusion in Scilab, that would make slint() 
distributed to all users, i would like to make some remarks and 
suggestions about this module, as i would like a lot reading other 
comments from other users and developers about this module.
Below i will focus on some aspects about slint() "interfaces" 
(prototypes, data formats), since interfaces turn rather locked after 
the first publication (wherever it is published, in Scilab or on ATOMS). 
I have not made deep tests about the current parser / slint engine, and 
beyond some unit tests about a subset of split code checks, i did not 
see any code sample gathering all features presently checked by slint().

 1. *In Scilab, or a complementary module* ?
    slint is not specific to any type or area of applications, since it
    is dedicated to Scilab code "qualification", whatever is the purpose
    of the code.
    However, it is a rather specific development tool, clearly oriented
    to developers working on "high standards". It is more about
    computing sciences than for engineering and prototyping.
    In its "/Scilab development/" category, ATOMS already proposes other
    development-specific modules -- that are tagged /Complementary/,
    such as /Scibench/, or /assert/. /assert/ in now included in Scilab,
    since it is used by almost all non-regression tests that are as well
    included (whether they aren't unchecked in Scilab's installer).
    /slint/ is somewhat another module for benchmarking, but applied to
    the code style and syntax, instead of to the code speed. IMO, it
    would not be shocking to put it only on ATOMS as a complementary
    module, and/or to leave the user decide through Scilab's installer
    if he/she wants to install it by default.

 2. *slint's name: slint?*
    I am not convinced by this module name and (unique) function name.
    It does not clearly tell what it does. Should we guess that "s"
    stand for /S/cilab (if so, is it worthwhile to remind it?), and lint
    for code-washing residues?
    Here are some suggestions: codeCheck(), codeDiagn(), codeQualify(),
    codeValidate(), codeWash()... codeCheck() or codecheck() would look
    the most appropriate to me. By the way, slint's pages talk about
    /checking/ rules.

 3. *Syntaxes*
    The current slint() help page shows the following:

          Calling Sequence

    slint(files [, conf, out]) slint(files [, out]) out = slint(files [,
    conf], print)

          Arguments

    files
        a matrix of strings, the .sci files or the directories to analyze. 
    conf
        a scalar string, the name of the configuration file (by default
        it's SCI/modules/slint/etc/slint.xml). 
    out
        a scalar string, the name of the output file. 
    print
        a scalar boolean, if true the result is printed else the result
        is a struct. 
    out
        a struct (if print is false). 

          Description

    slint has been written to check the "quality" of the Scilab's code
    according to configurable rules.

          Examples

    slint("SCI/modules/elementary_functions/macros/atanm.sci");

    *Remarks*:
      * slint() application to .sce scripts is not documented. Is this
        usage possible? Is it already runnable?

      * slint() application to compiled Scilab functions is not possible
        or documented. Yet, as for profiling tools, this kind of input
        would be handy.

      * print = %f should rather be the default. %T is currently the
        default. For boolean variables, for my own i try to use or
        document variables names with a final "?" => "print?", to remind
        that they are boolean. BTW, it is here "disp?" rather than
        "print?". Just a matter of documentation.

      * The /out/ structure of results
          o can be as well an input parameter. This usage is quite
            unexpected and is not explained.

          o Fields and organization of the structure is not documented.
            But running the example provides some insight to it:
            --> slint("SCI/modules/elementary_functions/macros/atanm.sci");
            In SCI\modules\elementary_functions\macros\atanm.sci:
               At l. 0, c. 0: 00015: Maximum line length exceeded at
            lines: 24, 28, 35.
               At l. 24, c. 15: 00029: A function argument must be
            preceded by a single space.
               At l. 24, c. 15: 00029: A function argument must be
            preceded by a single space.
               At l. 27, c. 8: 00028: Operator <> should be surrounded
            by single spaces.
               At l. 28, c. 15: 00029: A function argument must be
            preceded by a single space.
               At l. 28, c. 15: 00029: A function argument must be
            preceded by a single space.
               At l. 31, c. 8: 00028: Operator == should be surrounded
            by single spaces.
            ...
               At l. 45, c. 7: 00028: Operator / should be surrounded by
            single spaces.
               At l. 45, c. 7: 00028: Operator * should be surrounded by
            single spaces.
               At l. 45, c. 7: 00033: Expression is not bracketed.
               At l. 47, c. 12: 00028: Operator == should be surrounded
            by single spaces.
               At l. 47, c. 29: 00028: Operator = should be surrounded
            by single spaces.
            Module developed with the contribution of CNES.

            And by grabbing the output in a variable and canceling the
            printing-in-console:

            --> results =
            slint("SCI/modules/elementary_functions/macros/atanm.sci", %f)
              results  =
               file: [1x1 string]
               info: [1x1 struct]

            --> results.file
              ans  =
              SCI\modules\elementary_functions\macros\atanm.sci

            --> results.info
              ans  =
               00029: [8x1 struct]
               00015: [1x1 struct]
               00028: [17x1 struct]
               00009: [1x1 struct]
               00033: [3x1 struct]

            --> results.info("00033").loc
              ans  =
                    ans(1)
                39.   22.
                39.   34.

                    ans(2)
                39.   22.
                39.   26.

                    ans(3)
                45.   7.
                45.   28.

            --> results.info("00033").msg
              ans  =
                    ans(1)
              Expression is not bracketed.

                    ans(2)
              Expression is not bracketed.

                    ans(3)
              Expression is not bracketed.

            *Comments and suggestions*:
             1. imo, this structure for the results looks uselessly
                complicated, and finally inefficient. It makes choosing
                how to filter and view results very difficult:
                  + the "info" field could be renamed "results". Its
                    contents are for example not "contextual or
                    configuration infos". To be clearer, they are results.
                  + this .results field could rather be a simple list,
                    with as many components as there are analyzed files.
                  + Each .result(i) component could rather be a matrix
                    of text. If a file has fully passed checkings, its
                    results are [].
                    The matrix would have the following columns:
                      # line number (ideally with leading 0 to make
                        lexicographical and numerical orders matching)
                      # column number (idem)
                      # checker id
                      # message
                      # file basename (with the file extension, but
                        without the path)
                      # file id (# rank in the list of files), converted
                        into text

                  + *Finally, wondering about all what is written
                    here-above, i think that removing the splitting
                    between .fileS and .results (.info) fields would
                    even be preferable. *
                    *A single matrix of text would be much simpler and
                    more efficient*. To do so, for each new file, its
                    full path shall be recorded in the results matrix,
                    in the /message/ column, with the code-row=0,
                    code-colum=0, checker-id=00000, given file basename
                    and rank#.
                    It would be much easier to filter and sort according
                    to any column or multicolumn sorting. Very easy to
                    record in a .csv file ; etc. Very easy to reimport,
                    to compare or merge with other files ; to make
                    statistics with that ; etc. All things very hard to
                    do with the current structure.

             2. The present structure is so rigid and unhandy that it
                prevents basic filtering operations (like for instance
                relisting results (out of console) in the order of rows
                of code in the file).
                By the way -- but this has no importance, since the
                structure should rather be abandoned --,
                 1. *id of checkers* starting with a digit prevents
                    using the .dot field addressing. Why not prefixing
                    them with a letter (say "r" as "rule" or "c" as
                    "criteria")?

                 2. *.loc* field is a list. Yet, its components have all
                    the same types and sizes, in such a way that they
                    could rather be stored in rows of a matrix. This
                    would allow filtering operations with find().

                 3. *.msg* field: same remark: a matrix of text would be
                    more handy.

      * *Checking rules look not to be categorized*. Yet, some rules are
        "only" about the code style, some others about deprecated or
        removed features, etc.
        Presently, it is not easy to filter results by type of "lint".
        We must somewhat look at all lints or at none or at some
        specific ids. For instance, if i want to use an external module
        and before i want to assess its runnability, i won't care about
        its code style (i am not the author, and i don't want to spend
        time on the code style of an external module), and i would wish
        to get and fix all deprecated or removed features in a
        straightforward way. To do so, defining subsets and tagging
        checkers in them would be handy.

      * Presently, slint() does not allow
          o to *provide a **Scilab version against which the Scilab code
            must be checked*.
          o to provide a subset of rules (or categories of rules) that
            must be checked, instead of checking all defined rules.

 4. *Documentation pages*.
    The 40 pages about checkers (checking criteria/rules), with for most
    of them only half-a-line of description, and nothing else, and
    nothing more expected, made me deeply wondering: which reviewer has
    accepted that, and who has accepted to merge that?
    For instance, if we spam in the same way the help tree just by
    splitting the /axes_properties/ page in 40 distinct pages, with one
    property per page, do we have a chance on review to be accepted and
    merged?
    This way of doing is shocking.
    By the way, i have searched these 40 pages on GIT/master to propose
    merging them (in a table, even more suitable than in a
    variablelist), but i did not find them. Quite strange.

Hoping to read other comments soon.
BR

Samuel Gougeon

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.scilab.org/pipermail/dev/attachments/20160419/b5b4dd98/attachment.htm>