[Scilab-Dev] Machine Learning Toolbox

Wed May 31 16:25:15 CEST 2017

Hi all,
It looks like you didn't receive my first email?

Sent from my mobile

On 31 May 2017 at 16:20, Amanda Osvaldo <lambdasoftware at yahoo.es> wrote:

Hi everyone, I think we have nothing on this yet.
So ... does somebody have a plan?

-- Amanda Osvaldo

On Mon, 2017-05-29 at 00:04 +0200, Philippe Saadé (ESI INENDI) wrote:
Dear All,

It took me some time to join the discussion because I wanted a better understanding of the current status of your discussions, of Mandar's profile and expertise, and of what is easy or hard to do with Scilab to meet some serious and legitimate demands from Scilab's users.

As I am the last to join, I will deliberately start from a clean slate and restart the discussion with you, so that we can structure the project and converge quickly on an achievable list of goals for this GSoC.

To that end, I would like to list a series of questions on which we need shared answers and a common understanding.
This should serve as a basis to decide what to do, how and when.

So, feel free to fill in...

  1.  Scilab has a way to use Python: PIMS, originally created in August 2014.
     *   How mature do you think it is?
     *   How compatible is it with the potential need to use existing Python-based ML frameworks from within Scilab?
     *   How easy or hard would it be for Mandar to build on what has been done here so that the ML frameworks work well from Scilab?
  2.  Data Management. I think the question of the actual size of the data that Scilab's users might handle is key. Many ML methods (not necessarily "Deep" ones) need to be trained on large data sets. This does not mean that everything has to sit in RAM during training or general pre-processing, but it must be possible to handle large data sets.
     *   Do we use only "pointers" from Scilab to give an access to the real data structures that are used by the ML frameworks?
     *   Do we want to integrate part or all of the data structures that are useful, as native Scilab data structures?
     *   Should the execution of ML algorithms be designed and architected so that it happens "remotely" from Scilab's perspective?
  3.  Use Cases. We need to list some use cases that are typical of what Scilab users do and that make the usage of ML an exciting prospect. If we cannot demonstrate that ML within Scilab is possible, easy and really useful on these use cases, I am not sure we will have reached the main target of this GSoC opportunity.
Can we list use cases together?
I will start by listing some items, but your input is important here.
     *   image classification
     *   object recognition in images and video
     *   Data Driven Industrial Process Control
     *   Anomaly Detection
     *   Dimensionality / Model reduction
     *   etc.
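
To make the "pointers" option from question 2 concrete, here is a minimal sketch in Python (not Scilab, and not based on the PIMS API). It assumes the ML side owns the data and the caller only ever holds an opaque integer handle. Every name in it (`HandleRegistry`, `load_dataset`, `dataset_size`) is hypothetical and chosen for illustration only.

```python
# Hypothetical sketch: the ML backend owns the data; the caller only holds a handle.
import itertools


class HandleRegistry:
    """Keeps backend-side objects alive and hands out integer handles for them."""

    def __init__(self):
        self._objects = {}
        self._ids = itertools.count(1)

    def put(self, obj):
        # Store the object on the backend side and return an opaque handle.
        handle = next(self._ids)
        self._objects[handle] = obj
        return handle

    def get(self, handle):
        # Raises KeyError if the handle was never issued or was released.
        return self._objects[handle]

    def release(self, handle):
        # Drop the backend-side object; the handle becomes invalid.
        del self._objects[handle]


registry = HandleRegistry()


def load_dataset(rows):
    """Backend call: store the rows and return only a handle."""
    return registry.put(list(rows))


def dataset_size(handle):
    """Backend call: answer a question about the data without copying it back."""
    return len(registry.get(handle))
```

With this design the calling environment never needs a native representation of the data structure; the trade-off is that every operation on the data has to be exposed as a backend call.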

For sure, these questions do not cover all the important topics for this "ML Toolbox" project but this is a way to bootstrap.
As we know, we need to be active and efficient for the 30th of May!

Thanks for your feedback and feel free to share your point of view.

Best regards,

Philippe SAADÉ
<http://www.esi-group.com/>

On 18/05/2017 at 21:50, Amanda Osvaldo wrote:
Hi everybody, may I ask some questions?

First of all, I really agree that Scilab needs a Machine Learning toolbox.

However, I'm fairly critical of Scilab's limitations.
I see a lot of potential in the software, but it requires a reform of its infrastructure.

So, here are my questions.

How large a training dataset are we talking about in Scilab?
Even with TensorFlow compatibility, if the whole dataset has to be loaded into RAM, I fear the toolbox's usefulness will be very limited.
In other words: will the toolbox be able to handle a 250 GB dataset, or just a few GB on a desktop?
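
As an illustration of why the dataset does not have to fit in RAM, here is a minimal Python sketch of one out-of-core technique: fitting a straight line by accumulating running sums chunk by chunk, so that only one chunk is in memory at a time. The chunk generator is a synthetic stand-in for reading a large file, and the function names are hypothetical; this says nothing about how a Scilab toolbox would actually do it.

```python
def iter_chunks(n_rows, chunk_size):
    """Stand-in for reading a large file chunk by chunk (synthetic y = 2x + 1)."""
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        yield [(float(i), 2.0 * i + 1.0) for i in range(start, stop)]


def fit_line_streaming(chunks):
    """Least-squares line fit that keeps only five running sums in memory."""
    n = sx = sy = sxx = sxy = 0.0
    for chunk in chunks:
        for x, y in chunk:
            n += 1.0
            sx += x
            sy += y
            sxx += x * x
            sxy += x * y
    # Closed-form solution of the 1-D least-squares normal equations.
    denom = n * sxx - sx * sx
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


slope, intercept = fit_line_streaming(iter_chunks(1000, 64))
```

The same pattern (one pass, small accumulated state) only covers models with such sufficient statistics; iterative methods would instead need several passes over the chunks.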

Have I understood correctly?
Are we talking about integrating Scilab with TensorFlow or scikit-learn?
I think it's a good idea; I just want to know whether I'm interpreting this correctly.

Does anybody have an idea of how to handle this project from a software engineering perspective?
I ask just to ensure proper tests and code quality.

-- Amanda Osvaldo

On Thu, 2017-05-18 at 16:01 +0000, Yann Debray wrote:
Dear Caio, Dhruv and Amanda,

I would like to include my colleague Philippe Saadé in the exchanges on Machine Learning for Scilab.
He is an experienced mathematician working with us at ESI Group, and has an interesting vision on the subject.
He will be scientific advisor and mentor for a joint internship on Machine Learning starting mid-June.

@Philippe Saadé (ESI INENDI) <philippe.saade at esi-group.com>: Could you maybe share with us your view on the subject?

We can keep this exchange public if it is alright with you all, since I believe our success on the subject will depend on our capacity to centralize and merge our community efforts.
You can all collaborate on the project on our forge:
http://forge.scilab.org/index.php/p/machine-learning-toolbox/

Yours
Yann @ Scilab

From: Amanda Osvaldo <lambdasoftware at yahoo.es>
Date: Friday 28 April 2017 at 01:03
To: List dedicated to the development of Scilab <dev at lists.scilab.org>, Yann Debray <Yann.Debray at esi-group.com>, Dhruv Khattar <dhruvk1996 at gmail.com>
Subject: Re: [Scilab-Dev] Machine Learning Toolbox

Hi Caio, sorry for the late reply.

I think we should ask ourselves what Scilab's focus and audience are.
I feel we lack knowledge of what Scilab users are looking for.

Personally, for example, I want to do everything from prototyping to running the script on hundreds of Intel Xeon servers with the least possible effort.
Even less effort than it would take if the script were written in Python.

I am sure that new data structures will expand the use of Scilab.

But what advantage will this bring to users?
Python, for example, already has optimized data structures and libraries.

-- Amanda Osvaldo

On Wed, 2017-04-26 at 14:32 -0300, Caio Souza wrote:
Hi,

I have been thinking about the usability of the toolbox. Regardless of which algorithms we end up with, it would be interesting to have some simplified structure (like TensorFlow).

Although building such a structure (data, model, cost function, minimizer) is a lot of work, it would make the toolbox easy to use and extend, with minimal impact on usability.

IMHO, this is something that should be defined before any coding starts, and also well explained to the student.

I would like to hear what you think, so we can start a discussion.
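
One possible reading of the (data, model, cost function, minimizer) split is sketched below in plain Python. It is only an illustration under assumptions: the four pieces are independent functions, the minimizer is a simple gradient descent with finite-difference gradients (chosen so that it works with any model/cost pair), and all names are hypothetical rather than a proposed toolbox API.

```python
def linear_model(params, x):
    """Model: maps parameters and one input to a prediction."""
    w, b = params
    return w * x + b


def mse_cost(model, params, data):
    """Cost function: mean squared error of the model over the data."""
    return sum((model(params, x) - y) ** 2 for x, y in data) / len(data)


def gradient_descent(cost, model, params, data, lr=0.2, steps=2000, eps=1e-6):
    """Minimizer: gradient descent using central finite differences."""
    params = list(params)
    for _ in range(steps):
        grad = []
        for i in range(len(params)):
            up = params.copy()
            up[i] += eps
            down = params.copy()
            down[i] -= eps
            grad.append((cost(model, up, data) - cost(model, down, data)) / (2 * eps))
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


# Data: points on the line y = 3x - 1 for x in [0, 1].
data = [(i / 10.0, 3.0 * (i / 10.0) - 1.0) for i in range(11)]
fitted = gradient_descent(mse_cost, linear_model, [0.0, 0.0], data)
```

Because each piece is passed in as an argument, swapping the model or the cost function does not require touching the minimizer, which is the extensibility benefit argued for above.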

Best,
Caio SOUZA
