[Scilab-users] ?==?utf-8?q? Error in parameters determined in non-linear least-squares fitting

Mon Apr 6 08:17:36 CEST 2020

Hello Heinz,

See below the small example I built and I refer to whenever I need to do some data fitting with confidence intervals for the parameters of the model.
It is far from perfect but it might help you untangle the Jacobian and covariance matrix thingy.
Just two words of caution: (1) I am clearly less versed in applied maths than Stéphane or Samuel, so take this with a grain of salft; (2) I built this example ages ago, before Samuel improved datafit.

Good luck

Antoine

//////////////////////////////////////////////////////////////////////////////////////
// REFERENCE: 
//    
//    Least Squares Estimation
//    SARA A. VAN DE GEER
//    Volume 2, pp. 1041–1045
//    in
//    Encyclopedia of Statistics in Behavioral Science
//    ISBN-13: 978-0-470-86080-9
//    ISBN-10: 0-470-86080-4
//    Editors Brian S. Everitt & David C. Howell
//    John Wiley & Sons, Ltd, Chichester, 2005
//

//This is a short and definetly incomplete tutorial on data fitting in Scilab using leastsq.
//
// Basic assumption: this tutorial is for scientists that face this simple problem:
// they have an experimental dataset x_epx,y_exp and a certain model y=Model(x,p) to fit thie dataset.
//  This tutorial will try to answer the folowing questions:
//  1) how do you do that (simply)
//  2) how do you do that (more reliably and more quickly)
//      a) let's go faster with a Jacobian
//      b) how good is your fit? How big is the "error bar" associated with each parameter of your Model?
//      c) Can we bullet proof it?

//1) How do you do curve fitting in Scilab
//
//  We need a) a model function b) a dataset c) some more work
//  1)a) Let's start with the model function: a sinewave
//      here is the formula: y=A*sin(k*(x-x0))+y0;
//      here is the function prototypr [y]=SineModel(x,p) with  p=[x0,k,A,y0]'
function [y]=SineModel(x,p)
//INPUTS:   x 1D vector
//          p parameter vector of size [4,1] containing
//              x0  : sine origin
//              k   : sine spatial frequency ie k=2%pi/Xp with Xp the period
//              A   : sine amplitude
//              y0  : sine offset
//OUTPUTS:   y 1D vector of same size than x such that y=A*sin(k*(x-x0))+y0;
      x0=p(1);
      k=p(2);
      A=p(3);
      y0=p(4);
      y=y0+A*sin((x-x0)*k);
endfunction

//  1)b) Let's now have a fake dataset: a sine wave with some noise
//  We reuse the Model function we have just created to make this fake dataset

x_exp=[-10:0.1:10]';
x0=1.55;
k=1*%pi/3;
A=4.3;
y0=1.1
y_exp=SineModel(x_exp,[x0,k,A,y0])+(2*rand(x_exp)-1)*6/10;

//let's check and see what it looks like:
scf();
plot(x_exp,y_exp,'k.-');
xlabel('Exparimental X');
ylabel('Exparimental Y');
xtitle('Our fake experimental dataset to fit');

//  1)c) we are not done yet, we need some more work
//  First we need an error function that return for a given parameter set param the difference
//  between the experimental dataset and the model one:
//  Note that this function returns the difference at each point, not the square of the difference,
//  nor the sum over each point of the square of the differences
function e = ErrorFitSine(param, x_exp, y_exp)
   e = SineModel(x_exp,param)-y_exp;
endfunction
// Now we need a starting point that is not too far away from the solution
// Let's just fo it by hand for the moment we'll see later how to make it programmatically
// Just go and have a look at the previous plot and "guess", here is mine:
p0=[1,2*%pi/6,4,1];

//Ready to go:
[f,popt, gropt] = leastsq(list(ErrorFitSine,x_exp,y_exp),p0);
//popt contains the optimal parameter set that fits our dataset
//
scf();
plot(x_exp,y_exp,'k.');
plot(x_exp,SineModel(x_exp,popt),'r-');
xlabel('Experimental X')
ylabel('Experimental Y and best fit');
xtitle([...
    "x0="+msprintf('%1.3f     fit value:     \t%1.3f',x0,popt(1));...
    "k ="+msprintf('%1.3f     fit value:     \t%1.3f',k,popt(2));...
    "A ="+msprintf('%1.3f     fit value:     \t%1.3f',A,popt(3));...
    "y0="+msprintf('%1.3f     fit value:     \t%1.3f',y0,popt(4))...
]);

//Yep we are done popt is the optimal parameter set that fits our dataset x_exp,y_exp with our
// model SineModel
// How to assert the quality of our fit? We can use fopt and gropt for that. They should be both as small as possible.
// Ideally, the gradient should be zeros for each parameter, otherwise it means we have not found an optimum.

//2) How to go beyond that simple example?
// Namely:
//      a) how to go faster?
//      b) how to estimate how good our fit is (aka get error bars on our parameters)?
//      c) how to make thinks more reliable with less human guessing and more bulletproofing?

//2)a) and also 2)b)
//  We need the same extra function in order to speed things up and to estimate
//  how the "noise" on the experimental dataset translates in "noise" on each
//  individual parameter p(1), ...p($): the Jacobian matrix of our fit model.
//  Impressive name for a simple idea: providing leastsq with the partial
//  derivative of the model formula with respect to each parameter p(1)..p($).
function g = ErrorJMatSine(p, x, y)
//

//   g(1,:) = d(SinModel(x,p))/dx0 = d(SinModel(x,p))/d(p(1))
//   g(2,:) = d(SinModel(x,p))/dk = d(SinModel(x,p))/d(p(2))
//   g(3,:) = d(SinModel(x,p))/dA = d(SinModel(x,p))/d(p(3))
//   g(4,:) = d(SinModel(x,p))/dy0 = d(SinModel(x,p))/d(p(4))

//    y=A*sin(k*(x-x0))+y0;
   x0=p(1);
   k=p(2);
   A=p(3);
   y0=p(4);
   g = [...
         (-k)*SineModel(x,[x0-%pi/2/k,k,A,0]'),...
         (x-x0).*SineModel(x,[x0-%pi/2/k,k,A,0]'),...
         SineModel(x,[x0,k,1,0]'),...
         ones(x) ...
       ];
endfunction

//Ready to go again, and faster this time:
[f,popt_fast, gropt] = leastsq(list(ErrorFitSine,x_exp,y_exp),ErrorJMatSine,p0);

disp("This should be ~ [0,0,0] when both fits give exactly the same results")
disp(string((popt-popt_fast)./(popt+popt_fast)))

//Now we check that we are faster:
//  we do it several times to average over timing variations caused by
//  other processes running on our computer
speedup=zeros(1,10)
for i=1:100
    tic();
    [f,popt_fast, gropt] = leastsq(list(ErrorFitSine,x_exp,y_exp),p0);
    t1=toc();
    tic();
    [f,popt_fast, gropt] = leastsq(list(ErrorFitSine,x_exp,y_exp),ErrorJMatSine,p0);
    t2=toc();
    speedup(i)=t1/t2;
end

scf();
plot(speedup,'k.');
plot(mean(speedup)*ones(speedup),'r');
xlabel("Fit iteration");
ylabel("SpeedUp factor when using Jacobian Matrix");
xtitle("Here we have a "+msprintf("~%1.2f",mean(speedup))+...
" speed improvement using the Jacobian Matrix");

//2)b) How can we estimate "error bars" for each individual parameters?
//  We are going to estimate how the "noise" on our dataset turns into
//  noise on each parameter by estimating the confidence interval at 95%

g = ErrorJMatSine(popt_fast, x_exp, y_exp);//Jacobian matrix of the fit formula
//estimate of the initial noise on the dataset to fit
sigma2=1/(length(x_exp)-length(popt_fast))*f;

//covariance matrix of fitting parameters
pcovar=inv(g'*g)*sigma2;
//confidence interval for each parameter
ci=(1.96*sqrt(diag(pcovar)))';

//Let's present the results of the confidence interval calculation
str=msprintf("%4.3f\n", popt_fast')+' +/- '+msprintf("%4.3f\n", ci');
str=[["x0 = ";"k = ";"A = ";"y0 = "]+str];
str=["Fit results:";"y=A*sin(k*(x-x0))+y0";"Param +/- conf. interval @ 95%";str]

scf();
plot(x_exp,y_exp,'k.');
plot(x_exp,SineModel(x_exp,popt_fast),'r-');
xlabel('Experimental X')
ylabel('Experimental Y and best fit');
xtitle('Fit with confidence intervals at 95%');
xstringb(-6,-3,str,12,6,'fill');
e=gce();
e.box="on";
e.fill_mode="on";
e.background=color("light gray");

//Now we have a fast fit function that can give some estimation on
//  how seriously we can believe the fitted parameters.
//  You can see that the precision with which we retrieve each parameter varies
//  k0 is more precisely determined than y0 (15x more precisely!)
//You can play with the amount of noise to see how it affects the retrieved
//  parameters and confidence intervals

//2)b) How can we bullet proof our fitting script by putting all this stuff
//      together is several functions that checks the arguments and modify
//      them if needed to avoid difficult to understand complaints from
//      leastsq?

// test of a sinus fitting routine

////  y=A*sin(k*(x-x0))+y0;
//// function [y]=Sine(x,x0,k,A,y0)
//// function [y]=Sine(x,p) with  p=[x0,k,A,y0]'
//function [y]=Sine(x,varargin)
////    pause
//    select length(varargin)
//    case 1 then
//        //we use form [y]=Sine(x,p) where p=[x0,dx,A,y0]'
//        p=varargin(1);
//    case 4 then
//        //we use form [y]=Sine(x,x0,k,A,y0)
//        p=[varargin(1);varargin(2);varargin(3);varargin(4)];
//    else
//        //call is not correct, we give up
//        y=[];return
//    end
//      x0=p(1);
//      k=p(2);
//      A=p(3);
//      y0=p(4);
//      y=y0+A*sin((x-x0)*k);
//endfunction
//
function [p0,pinf,psup]=EstimateFitSine(x,y)
    //  y=A*sin(k*(x-x0))+y0;
    y0=mean(y);
    A=(max(y)-min(y))/2;
//    pause
    sy=fftshift(fft(y-mean(y)));
    //sy=fft(y-mean(y));
    msy=max(abs(sy));rg=find(abs(sy)==msy);
    delta=(rg(2)-rg(1))/length(y);
    k=delta*%pi/(mean(diff(x)));
    x0=mean(abs(atan(imag(sy(rg)),real(sy(rg)))/k))+%pi/2/k;
    p0=[x0,k,A,y0];
    pinf=[min(x),0      ,-%inf  ,-%inf];
    psup=[max(x),%inf   ,%inf   ,%inf];
endfunction

function e = ErrorFitSine(param, xi, yi)
   e = SineModel(xi,param)-yi;
endfunction

function g = dErrorFitSine(param, xi, yi)
//    y=A*sin(k*(x-x0))+y0;
   x0=param(1);
   k=param(2);
   A=param(3);
   y0=param(4);
//   pause
   g = [...
         (-k)*SineModel(xi,[x0-%pi/2/k,k,A,0]),...
         (xi-x0).*SineModel(xi,[x0-%pi/2/k,k,A,0]),...
         SineModel(xi,[x0,k,1,0]),...
         ones(xi) ...
       ];
endfunction

function [f,popt,yopt,gropt,ci,pcovar]=FitSine(x,y,p0,varargin)
// FIT Experimental dataset y(x) with a sine model
//      [y]=A*sin(k*(x-x0))+y0;
//
//INPUTS:
   //x             :       experiemental x dataset
   //y             :       experimental  y dataset
   //p0            :       starting parameter set [x0,k,A,y0]
   //varargin      :
   //               pinf   inferior limit for popt
   //               psup   superior limit for popt
//OUTPUTS
   //f             :       value of the cost function for the best parameter set
   //popt          :       best parameter set
   //yopt          :       fit function evaluated on the x dataset
   //gropt         :       gradient of the cost function at x
   //ci            ;       confidence interval @ 95% on each parameter in popt
   //pcovar        ;       covariance matrix of popt

   // CHECK x and y are col not row
   if  isrow(x) then
       x=x.';
   end
   if  isrow(y) then
       y=y.';
   end

   if (length(varargin) < 2) then
      // No constraints on parameter set were provided
      [f,popt, gropt] = leastsq(list(ErrorFitSine,x,y),dErrorFitSine,p0);
//      [f,popt_fast, gropt] = leastsq(list(ErrorFitSine,x_exp,y_exp),ErrorJMatSine,p0);
   else
      // Constraints on parameter set were provided
      pinf=varargin(1);
      psup=varargin(2);
      [f,popt, gropt] = leastsq(list(ErrorFitSine,x,y),dErrorFitSine,"b",pinf,psup,p0);
   end
   //normalized best fit evaluated on normalized x
   yopt=ErrorFitSine(popt, x, zeros(x));
   //rescale best fit param

    g = dErrorFitSine(popt, x, y);//Jacobian matrix of the fit formula
    //estimate of the noise on the signal to fit
    sigma2=1/(length(x)-length(popt))*f;

    //covariance matrix of fitting parameters
    pcovar=inv(g'*g)*sigma2;
    //confidence interval for each parameter
    ci=1.96*sqrt(diag(pcovar));
    ci=ci.';
endfunction

// Let's use our unified and bullet-proof routine

x=[-10:0.1:10];
//y=Sine(x,[1,2*%pi/3,4,1])+(2*rand(x)-1)*0.1;
y=SineModel(x,[5,1*%pi/3,4,1])+(2*rand(x)-1)*2;
//y=SineModel(x,[5,1*%pi/3,4,1])+(2*rand(x)-1)*2/10;

p0=EstimateFitSine(x,y);
[f,popt,yopt,gropt,ci,pcovar]=FitSine(x,y,p0);

str="$"+["x_0";"k";"A";"y_0"]+"="+string(popt.')+"\pm"+string(ci.')+"$";
str=["$\text{Model: }[y]=A\sin(k(x-x_0))+y_0$";str];

scf();
plot(x,y,'k.');
plot(x,SineModel(x,p0),'g');
plot(x,SineModel(x,popt),'r');
xlabel("$x$");
ylabel("$y$");
xtitle(str)
legend(["Data";"Guess";"Fit"]) 

////////////////////////////////////////////////////////////////////////////////////// 

Le Samedi, Avril 04, 2020 15:13 CEST, Heinz Nabielek <heinznabielek at me.com> a écrit: 

> Scilab friends: the power of Scilab is amazing and I have used it recently for non-linear least-squares fitting, below example from Scilab help function for "datafit". On occasions, I have also used "leastsq".
> 
> Question: how do I derive the 1sigma standard error in the three parameters p(1), p(2), and p(3)? And, if it is not too complicated, covariances?
> 
> I know this is written in dozens of textbooks, but I am always getting lost.
> Please provide a simple recipe written in Scilab.
> Best greetings
> Heinz
> 
> 
> 
> // -- 04/04/2020 14:57:30 -- //
> //generate the data
> function y=FF(x, p)
>   y=p(1)*(x-p(2))+p(3)*x.*x
> endfunction
> X=[];
> Y=[];
> pg=[34;12;14] //parameter used to generate data
> for x=0:.1:3
>   Y=[Y,FF(x,pg)+100*(rand()-.5)];
>   X=[X,x];
> end
> Z=[Y;X];
> //The criterion function
> function e=G(p, z),
>   y=z(1),x=z(2);
>   e=y-FF(x,p),
> endfunction
> //Solve the problem
> p0=[3;5;10]
> [p,err]=datafit(G,Z,p0);
> scf(0);clf()
> plot2d(X,FF(X,pg),5) //the curve without noise
> plot2d(X,Y,-1)  // the noisy data
> plot2d(X,FF(X,p),12) //the solution
> xgrid();legend("the curve without noise"," the noisy data", "THE FINAL SOLUTION.....",4);
> title("solution set   39.868419    10.312053    11.482521","fontsize",4);