[Scilab-users] lcm() output type: how to avoid overflowing and get an actual result?

Samuel Gougeon sgougeon at free.fr
Wed Mar 15 17:04:54 CET 2017


Hello,

Processing bug http://bugzilla.scilab.org/15017 about the lcm() function
calls for an open discussion about the datatype of the result when the operand
is a matrix of encoded integers.
Presently, the result of lcm(A) has the type of A:

--> r = lcm(int16([123 423])), typeof(r)
  r  =
   17343
  ans  =
  int16

--> r = lcm(uint8([123 423])), typeof(r)
  r  =
   61
  ans  =
  uint8

This behavior is questionable, because the Least Common Multiple of a set of
integers is always *equal to or (most often) greater than* the maximum of the
operand's components. Therefore, when the operand is a matrix of encoded integers,
the LCM can easily (and most often does) overflow, yielding a wrapped result that
is usually irrelevant and misleading.
Example:
--> A = int8([2 3 5 7]);
--> lcm(A)
  ans  =
  -46        // 210 expected
--> typeof(lcm(A))
  ans  =
  int8

So we see that even with a few small input values, the result is already
"corrupted", as expected.

This bad effect does not happen with gcd(), because its result is always
*smaller than or equal to* the maximum of the operand's components.
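
For instance, on the same vector (assuming gcd() accepts encoded-integer arrays,
as lcm() does):

--> gcd(int8([2 3 5 7]))
 ans  =
  1
--> typeof(gcd(int8([2 3 5 7])))
 ans  =
  int8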

Octave behaves in the same bad way as Scilab, except that it saturates to the
highest integer value instead of wrapping; in both cases the result is wrong:

>> A = int8([2 3 5 7]);
>> lcm(A(1), A(2), A(3), A(4))
ans = 127

So, the question is: *Do we improve the situation, and if so, in which way?*
Here are some suggestions:

* The result is always promoted to the next wider inttype class:
   int8 => int16, uint8 => uint16, int16 => int32, etc.
   int64 and uint64 could be promoted to decimal numbers, but with a possible
   loss of accuracy (only beyond 2^53). A sketch of this promotion is given
   after this list.

* The result is always promoted to the widest (int64 or uint64) class

* The result is always promoted to the decimal (aka floating point) class
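
To make the first suggestion concrete, here is a minimal sketch of the promotion
rule it implies (promote() is a hypothetical helper name, not a proposed API):

// Hypothetical helper: promote an encoded integer array to the next
// wider class; int64 and uint64 fall back to decimal.
function r = promote(x)
    select typeof(x)
    case "int8"   then r = int16(x)
    case "uint8"  then r = uint16(x)
    case "int16"  then r = int32(x)
    case "uint16" then r = uint32(x)
    case "int32"  then r = int64(x)
    case "uint32" then r = uint64(x)
    else               r = double(x)   // "int64", "uint64"
    end
endfunction

For example, promote(int8([2 3 5 7])) would return an int16 array, in which the
LCM could then be computed without wrapping (up to the bounds of the new type).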

My (slight) preference would be *to always cast the result into the decimal class*. Why?
  - the type of the result would always be the same: decimal

  - it would be OK and lossless for all int8, uint8, int16, and uint16 operands,
    and for all int32, uint32, int64, uint64 operands smaller than 2^26 (the LCM
    of two values is at most their product, and 2^26 * 2^26 = 2^52 < 2^53, the
    bound below which all integers are exactly representable in decimal encoding).
    OK, we could lose some low bits when processing some big 32- or 64-bit
    integers, but at least we would have a result; otherwise, casting to uint64
    could still lead to an overflow, so no actual result at all.

  - Usual +,-,*,/,^ operations between an int# or uint# and a decimal number
    force the result to the encoded integer type. So this choice would let any
    further operation recast the lcm() result into the original inttype (instead
    of promoting all further results in an unexpected way), as checked below.
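
For instance (a quick check of the recasting rule stated above):

--> x = int16(100);
--> x * 2.5              // int16 * decimal => int16
 ans  =
  250
--> typeof(x * 2.5)
 ans  =
  int16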

What's your opinion? What would you prefer to work with?

Hoping to hear from you soon,
Samuel
