[Scilab-Dev] BLAS use in Scilab

Thu Feb 15 16:50:46 CET 2018

Hello all,

Following the recent discussion with fujimoto, I discovered that Scilab 
does not (seem to) fully use SIMD operations in BLAS as it should. 
Besides the bottlenecks in its own code, there are also many operations of 
the kind

scalar*matrix

Although this operation is correctly delegated to the DSCAL BLAS function 
(as can be seen in the C function iMultiRealScalarByRealMatrix in 
modules/ast/src/c/operations/matrix_multiplication.c):

> int iMultiRealScalarByRealMatrix(
>     double _dblReal1,
>     double *_pdblReal2,    int _iRows2, int _iCols2,
>     double *_pdblRealOut)
> {
>     int iOne    = 1;
>     int iSize2    = _iRows2 * _iCols2;
>
>     C2F(dcopy)(&iSize2, _pdblReal2, &iOne, _pdblRealOut, &iOne);
>     C2F(dscal)(&iSize2, &_dblReal1, _pdblRealOut, &iOne);
>     return 0;
> }

in the code below the product "A*1" is likely using only one processor 
core, as seen on the CPU usage graph and in the elapsed time:

A=rand(20000,20000);
tic; for i=1:10; A*1; end; toc

  ans  =

    25.596843

but this second piece of code is more than 8 times faster and uses 100% 
of the CPU:

ONE=ones(20000,1);
tic; for i=1:10; A*ONE; end; toc

  ans  =

    2.938314

with roughly the same number of multiplications. This second computation 
is delegated to DGEMM (C <- alpha*A*B + beta*C, here with alpha=1 and beta=0):

> int iMultiRealMatrixByRealMatrix(
>     double *_pdblReal1,    int _iRows1, int _iCols1,
>     double *_pdblReal2,    int _iRows2, int _iCols2,
>     double *_pdblRealOut)
> {
>     double dblOne        = 1;
>     double dblZero        = 0;
>
>     C2F(dgemm)("n", "n", &_iRows1, &_iCols2, &_iCols1, &dblOne,
>                _pdblReal1 , &_iRows1 ,
>                _pdblReal2, &_iRows2, &dblZero,
>                _pdblRealOut , &_iRows1);
>     return 0;
> }
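
The two claims above (roughly the same number of multiplications, and a 
speedup of more than 8) can be checked with a little arithmetic. The C 
sketch below only uses the sizes and timings quoted in this message; the 
function names are made up for the illustration and are not part of 
Scilab or BLAS.

```c
#include <assert.h>

/* Multiplications in one evaluation of scalar*A for an n-by-n matrix A:
   one multiplication per entry. */
double mults_scalar_times_matrix(double n)
{
    assert(n >= 0.0);
    return n * n;
}

/* Multiplications in one evaluation of A*v for an n-by-n matrix A and an
   n-by-1 vector v: n multiplications for each of the n output entries. */
double mults_matrix_times_vector(double n)
{
    assert(n >= 0.0);
    return n * n;
}

/* Ratio of two elapsed times, slow over fast. */
double speedup(double t_slow, double t_fast)
{
    assert(t_fast > 0.0);
    return t_slow / t_fast;
}
```

For n = 20000 both counts are 4e8 multiplications per product (the 
matrix-vector product performs about as many additions on top of that), 
and 25.596843 / 2.938314 is about 8.7.
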

Maybe my intuition is wrong, but I have the feeling that using dgemm 
with alpha=0 (so that it reduces to C <- beta*C, with beta being the 
scalar) will be faster than dscal. I plan to test this by writing some 
quick and dirty code linked to Scilab, so my question to the devs is: 
which #includes must be added at the top of the (C) source to be able to 
call dgemm and dscal?
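
To make the idea concrete, here is a minimal sketch in plain C of why 
DGEMM with alpha=0 and beta set to the scalar gives the same result as 
DSCAL. It does not call BLAS at all: ref_dscal and ref_dgemm_nn are 
hypothetical plain-loop models of the documented semantics, and 
scale_via_dgemm is a made-up helper. It therefore says nothing about 
which variant is faster in an optimized BLAS; that still has to be 
measured.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-loop model of DSCAL with unit stride: x <- a*x. */
void ref_dscal(int n, double a, double *x)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        x[i] *= a;
    }
}

/* Plain-loop model of DGEMM("n", "n", ...) on column-major storage:
   C <- alpha*A*B + beta*C, with A m-by-k, B k-by-n and C m-by-n.
   As in the reference BLAS, A and B are not read when alpha is 0
   and C is not read when beta is 0. */
void ref_dgemm_nn(int m, int n, int k, double alpha,
                  const double *A, int lda, const double *B, int ldb,
                  double beta, double *C, int ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double prod = 0.0;
            if (alpha != 0.0) {
                for (int p = 0; p < k; p++) {
                    prod += A[i + p * lda] * B[p + j * ldb];
                }
                prod *= alpha;
            }
            double old = (beta == 0.0) ? 0.0 : beta * C[i + j * ldc];
            C[i + j * ldc] = prod + old;
        }
    }
}

/* Hypothetical helper: scale a rows-by-cols matrix in place by s using
   the DGEMM model with alpha = 0 and beta = s (k = 0, A and B unused). */
void scale_via_dgemm(int rows, int cols, double s, double *C)
{
    int ld = rows > 0 ? rows : 1;
    ref_dgemm_nn(rows, cols, 0, 0.0, NULL, ld, NULL, 1, s, C, ld);
}
```

Note that this variant scales the matrix in place, whereas the Scilab 
function quoted earlier first copies the input to the output with dcopy 
and then scales the copy.
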

Thanks for your help

S.

-- 
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet