[Scilab-users] leastsq question : what is 'gopt' useful for?

amonmayr at laas.fr
Wed Mar 23 15:08:26 CET 2016


On 03/22/2016 09:54 PM, Samuel Gougeon wrote:
> Hi,
>
> On 22/03/2016 10:55, Stéphane Mottelet wrote:
>> Hello,
>>
>> On 22/03/2016 10:41, antoine.monmayrant at laas.fr wrote:
>>> Hi everyone,
>>>
>>> I have a very general and naive question concerning leastsq: what am 
>>> I to do with "gopt", the "gradient of f at xopt"?
>>>
>>> Is there a way to link it to the confidence interval for each 
>>> parameter of my fit?
>> Not really, but since leastsq is a wrapper for optim, which returns 
>> the gradient at the returned "optimal" solution, leastsq returns it 
>> too. However, if the final gradient is far from the zero vector, 
>> then any confidence interval based on the inverse of the Fisher 
>> matrix (computed from the Jacobian) is meaningless, since these 
>> "linear" statistics rest on an expansion whose first-order term (the 
>> one involving the gradient) is supposed to vanish... Hence, having 
>> access to the final gradient can be of interest.
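A minimal sketch of that check, with a toy model and illustrative 
names (the 1e-4 tolerance is an arbitrary choice):

function e = resid(p, xm, ym)
    // residual vector of a toy model y = p(1)*exp(-p(2)*x)
    e = ym - p(1)*exp(-p(2)*xm);
endfunction

xm = linspace(0, 5, 50)';
ym = 2*exp(-0.7*xm) + 0.01*grand(50, 1, "nor", 0, 1);
[fopt, popt, gopt] = leastsq(list(resid, xm, ym), [1; 1]);
if norm(gopt) > 1e-4 then
    mprintf("gradient not ~0: linearized confidence intervals suspect\n");
end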
>
> We may guess that, if bound constraints are set on x, a non-zero 
> gradient can also be returned when xopt reaches a point on the 
> boundary where fun() has no true minimum, just a descending slope 
> that the bound interrupts at a low value.

Right.
I usually check whether a given parameter is equal to one of its 
bounds, to avoid this kind of issue.
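For instance, assuming popt comes from a bounded call like 
[fopt,popt]=leastsq(resid,"b",binf,bsup,p0) (sketch, with an arbitrary 
tolerance):

hit = (abs(popt - binf) < 1e-10) | (abs(bsup - popt) < 1e-10);
if or(hit) then
    // on a bound, gopt need not vanish even at the constrained optimum
    mprintf("parameter(s) %s sit on a bound\n", ..
            strcat(string(find(hit)), ","));
end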
>
>>> For the moment, I know how to estimate these confidence intervals 
>>> when I have access to the Jacobian matrix of my fit function.
> I am not sure that we are speaking about the same Jacobian.
Well, I am talking about the dfun() function as described in the help 
page of leastsq.
I use it to estimate the confidence interval for each parameter of my 
fit, following Sara A. van de Geer, "Least Squares Estimation", in 
Encyclopedia of Statistics in Behavioral Science, vol. 2, 
pp. 1041-1045, ISBN-13: 978-0-470-86080-9.
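In practice the recipe looks like this (a sketch under the usual 
assumptions: converged fit, roughly Gaussian residuals; resid, xm, ym, 
popt and fopt as in the sketch above, with numderivative standing in 
for an analytic dfun):

J = numderivative(list(resid, xm, ym), popt); // n x k residual Jacobian
n = size(ym, "*");  k = size(popt, "*");
s2 = fopt/(n - k);               // residual variance (fopt = ||resid||^2)
covp = s2*inv(J'*J);             // linearized covariance of the parameters
ci95 = 1.96*sqrt(diag(covp));    // ~95% half-widths (1.96: normal quantile)
mprintf("param(%d) = %g +/- %g\n", (1:k)', popt, ci95);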
> On one hand, we have some coordinates c(i) (say, spatial 
> coordinates: c(1)=x, c(2)=y, etc.) on which fun() depends. On the 
> other hand, we have parameters p(j) on which fun() also depends.
> As good quiet parameters, the p(j) have fixed values, whereas the 
> c(i) are varied by leastsq().

Well, I am not sure I follow you here.
I usually write my fit functions as yfit=myfit(x,param) and try to 
find the best param to fit some experimental measurements {yexp,xexp}, 
i.e. such that |yexp-myfit(xexp,param)| is minimized (in the 
least-squares sense).
Are my xexp your p(j) and my param your c(i)?
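In leastsq() terms, that reads (sketch; myfit, xexp, yexp and param0 
are placeholders):

function e = resid(param, xexp, yexp)
    e = yexp - myfit(xexp, param); // leastsq minimizes sum(e.^2) over param
endfunction
[fopt, popt] = leastsq(list(resid, xexp, yexp), param0);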

>
> AFAIU, Stephane's answer assumes that the x "passed" to leastsq() is 
> the full set gathering {c(i)} AND {p(j)}.
> Whereas, still AFAIU, you seem interested in getting the sensitivity 
> of fun() with respect to each parameter p(j) around the minimum 
> value of fun({c(i)}) (the parameters p(j) being fixed in the fun() 
> definition).
Nope, I don't think so.
I just want, in my case, to see how much I can vary each param(i) 
without really degrading the minimization of yfit-yexp.
The idea is to get a grasp of how sharp the minimum is with respect 
to each param(i), in order to determine how to interpret and later 
use the param(i) given by the fit (if param(1)=12.235658+/-1.0, it 
does not make sense to look too closely at the decimal part).

This can be achieved using jackknifing or bootstrapping (see for 
example http://www.jstor.org/stable/2289075), but these methods are 
resource intensive, since they require many refits; a sketch follows 
below.
Using dfun() and some assumptions gives a much faster estimate.
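For the record, a rough leave-one-out jackknife sketch (same 
placeholder names as above; it needs n refits, hence the cost):

n = size(xexp, "*");
P = [];
for i = 1:n
    keep = setdiff(1:n, i);               // drop one point, refit
    [f, p] = leastsq(list(resid, xexp(keep), yexp(keep)), param0);
    P = [P, p];                           // one column of params per refit
end
pbar = mean(P, "c");
se_jk = sqrt((n-1)/n*sum((P - pbar*ones(1, n)).^2, "c")); // jackknife sigma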
> Here, I don't think that we can speak about more than sensitivity. 
> "Confidence" is not the proper term, since the parameter values p(j) 
> are deterministic and fixed. They are not random variables.
> To assess this sensitivity, you will need the fun() Jacobian, BUT 
> with respect to the p(j), not w.r.t. the c(i)!
>
> To get what you want, I would suggest running leastsq() with 
> x = {c(i)} U {p(j)}.
> Then, unless the optimum is reached on a boundary, the absolute 
> value of the derivative of order *2* of fun() along each p(j) 
> direction, evaluated at xopt, will be related to the "confidence 
> interval"...
> unless the derivative of order 2 vanishes as well (so then the one 
> of order 3... etc.)
Well, that is how I understood the calculation of the confidence 
interval as described in "Least Squares Estimation".
The scaling factor depends on the confidence level you choose 
(90, 95, 99 %, ...).
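For what it's worth, that order-2 reading can be probed numerically, 
e.g. (sketch, same placeholder names as above, s2 being the residual 
variance of the fit; direction j and step h are arbitrary choices):

deff("f = fsum(p)", "f = sum(resid(p, xexp, yexp).^2)");
j = 1;  h = 1e-4*max(1, abs(popt(j)));
ej = zeros(popt);  ej(j) = h;
d2f = (fsum(popt+ej) - 2*fsum(popt) + fsum(popt-ej))/h^2; // central diff.
// linearized, diagonal approximation: var(popt(j)) ~ 2*s2/|d2f|
mprintf("param(%d): +/- %g (95%%)\n", j, 1.96*sqrt(2*s2/abs(d2f)));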
> A more pragmatic way to get this interval could be to evaluate 
> fun() around xopt, varying (with grand()) only the p(j) component 
> of xopt whose confidence you want, and measuring the spread of 
> fun()'s answers.
Well, I am not sure I see how to use what you propose in practice.
In particular, how do I measure the width of the spread around the 
optimum value?
It's easier with jackknifing, where this width is the sigma of the 
distribution of param(i) across the various fits.
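If I try to turn your grand() proposal into code, I get something like 
this (sketch; the 10% probe spread and the 5% acceptance threshold are 
arbitrary choices):

j = 1;  N = 200;
dp = grand(1, N, "nor", 0, 0.1*abs(popt(j))); // random probes around popt(j)
fvals = zeros(1, N);
for t = 1:N
    p = popt;  p(j) = p(j) + dp(t);
    fvals(t) = sum(resid(p, xexp, yexp).^2);
end
ok = fvals <= 1.05*fopt;       // probes keeping f within 5% of the optimum
mprintf("param(%d) tolerates roughly +/- %g\n", j, max(abs(dp(ok))));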
>
>>> Could "gopt" be of some use to estimate the confidence intervals 
>>> when the Jacobian matrix is not known?
> As Stephane said, in no way.
Yep, I was totally wrong there...

Thanks,

Antoine
>
> HTH
> Samuel
>
>
>
>
> _______________________________________________
> users mailing list
> users at lists.scilab.org
> http://lists.scilab.org/mailman/listinfo/users


