[Scilab-users] Using GROCER ms_var parameters for forecasting

Thu Feb 19 21:43:50 CET 2015

Dear Brian

You cannot perform forecasts with the results of the function I sent you,
because these results are in matrix form, while ms_forecast needs a
results tlist (typed list). What is needed is therefore a results tlist
with all the fields required to make forecasts. You will find enclosed a new
run_ms_var function that does that. What I have done is replace, in the
estimated results tlist, the results that are new, while keeping all
invariant results (such as estimated parameters, t-stats, ...). I think I
have done it properly, but I cannot guarantee that this is the case.

Starting from the previous example, replace:
--> [y_hat,resid,PR,PR_STT,PR_STL]=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')

with:
--> newr=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')

and then make a forecast with:
--> rf=ms_forecast(newr,'2004m12')
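
Before relying on the forecast, it may be worth checking what the new tlist
contains. The lines below are only a sketch: they assume that r and newr
exist from the commands above, and that newr still carries the
'filtered probs' field of r, which has not been verified against the
attached function.

```scilab
// Sketch only: r and newr are assumed to exist from the commands above.
disp(typeof(newr))   // type name stored in the tlist
disp(newr(1))        // strings: type name followed by the field names

// Assumption: newr keeps the 'filtered probs' field. When run_ms_var is fed
// the same series as the one used for estimation, the difference with the
// original results should be numerically negligible.
disp(max(abs(newr('filtered probs') - r('filtered probs'))))
```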

Again, the function is rough and should be improved somehow.
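
For reference, the per-step rule in the pseudocode quoted below (pick the
regime with the highest filtered probability at time t, then apply that
regime's autoregressive coefficients to forecast t+1) could be sketched as
follows. This is not GROCER code: the function name hard_regime_forecast
and the layout of cte and phi are hypothetical, and would have to be
mapped by hand onto the coefficients stored in the results tlist. It is
also a simplification, since it ignores the transition probabilities
between regimes.

```scilab
// Hypothetical helper, not part of GROCER.
// y    : T x 1 column vector of observations
// prob : T x K matrix, prob(t,k) = filtered probability of regime k at time t
// cte  : 1 x K vector of regime intercepts (assumed layout)
// phi  : p x K matrix, phi(j,k) = coefficient of lag j in regime k (assumed layout)
// yf   : T x 1 vector, yf(t+1) = forecast of y(t+1) using data up to t only
function yf = hard_regime_forecast(y, prob, cte, phi)
    p = size(phi, 1);
    T = size(y, 1);
    yf = %nan * ones(T, 1);          // no forecast for the first p dates
    [pmax, k] = max(prob, 'c');      // k(t) = most probable regime at time t
    for t = p:T-1
        lags = y(t:-1:t-p+1);        // y(t), y(t-1), ..., y(t-p+1)
        yf(t+1) = cte(k(t)) + phi(:, k(t))' * lags;
    end
endfunction

// Toy inputs, invented for illustration only (2 regimes, 1 lag)
y    = [1; 2; 1.5; 3; 2.5];
prob = [0.9 0.1; 0.2 0.8; 0.7 0.3; 0.4 0.6; 0.5 0.5];
cte  = [0 1];
phi  = [0.5 0.2];
yf = hard_regime_forecast(y, prob, cte, phi)
```

Because yf(t+1) only uses y and prob up to date t, feeding it the filtered
probabilities (PR_STT) rather than the smoothed ones keeps each forecast
free of later observations.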

Éric.

2015-02-19 14:28 GMT+01:00 Brian Bouterse <bmbouter at gmail.com>:

> Hi Eric,
>
> Thank you so much for the function. The verification steps you demonstrate
> are convincing evidence that the implementation produces the correct filtered
> probability result on the benchmark data. I've been able to reproduce your
> demo results, and also apply it to my own data set. This is great!
>
> There is one more thing that I'm not sure how to do for the single
> variable case. How can I take the results I have from run_ms_var() and use
> them with ms_forecast() to produce a single variable filtered estimate? The
> results I have are [y_hat,resid,PR,PR_STT,PR_STL]. I imagine this could
> be done using the following pseudocode:
>
> for each time step in PR_STT:
>     select the regime with the highest filtered probability for this
>     time step (i.e., regime N); this is like a maximum likelihood selection
>     select the autoregressive parameters for regime N from the original
>     training step
>     forecast the next time step using the autoregressive parameters of
>     regime N
>
> This seems very similar to what ms_forecast() can do, but I'm not sure how
> to call ms_forecast given only the existence of parameters
> [y_hat,resid,PR,PR_STT,PR_STL]. Is this possible?
>
> Perhaps one of the variables [y_hat,resid,PR,PR_STT,PR_STL] already
> contains what I am looking for, but I want to be sure that it is based on
> the filtered probabilities and not considering data that comes later in the
> data set than the point of prediction. Does that make sense? In other words
> I want to predict the specific value at time t, and only consider data on
> the interval [0, t-1].
>
> Thanks again for everything you've done including writing this, helping
> me, responding so quickly, etc. This is really great.
>
> -Brian
>
>
>
> On Tue, Feb 17, 2015 at 3:50 PM, Eric Dubois <grocer.toolbox at gmail.com>
> wrote:
>
>> Dear Brian.
>>
>> 1) Sorry, I did indeed make a typo: I meant y_mat, x_mat and
>> z_mat.
>>
>> 2) I do not know exactly what you want, but you can calculate what you
>> want from the parameters and all other inputs
>>
>> 3) you will find attached a function run_ms_var that performs, I hope,
>> what you need: this function takes a results tlist from a ms_var execution
>> and a vector of endogenous variables to feed the VAR (your benchmark data).
>>
>> I have checked that if you give as endogenous variables exactly the same
>> variables as the ones used for estimation, you recover the same yhat,
>> filtered probs, etc.
>>
>> To use the function, you have to save it in a folder, say c:/newms, and
>> run in Scilab:
>> --> getd('c:/newms')
>>
>> To check what I mentioned above, run:
>> --> load(GROCERDIR+'\data\us_revu.dat')
>> --> bounds('1967m4','2004m2')
>> --> nb_states=2
>> --> switch_var=2 // variances are switching
>> --> var_opt=3 // heteroskedastic var-cov matrix
>>
>> --> r=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','transf=stud')
>>
>> --> [y_hat,resid,PR,PR_STT,PR_STL]=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')
>> --> PR_STT-r('filtered probs')
>>
>> The function is rather rough (no header, no options,...) and can be
>> improved, but I hope it answers your needs.
>>
>> Éric.
>>
>>
>>
>>
>> 2015-02-17 15:03 GMT+01:00 Brian Bouterse <bmbouter at gmail.com>:
>>
>>> Hi Eric,
>>>
>>> Thanks for the reply! Yes you understand my goals correctly, but one
>>> clarification: It would be better to have the estimated values directly
>>> instead of the filtered state probabilities. I usually get these with
>>> ms_forecast(r, n).
>>>
>>> I've been reading through the grocer code to determine how to write the
>>> function you suggest. I do need it sooner than a few weeks so I'm
>>> attempting to do it. It seems straightforward except for the y_hat, x_hat,
>>> and z_hat variables I need to provide to MSVAR_Filt(). Here are some
>>> questions:
>>>
>>> 1) You say I need to feed MSVAR_Filt() with y_hat, x_hat, and z_hat, but
>>> the variables in the function signature for MSVAR_Filt read
>>> as y_mat,x_mat,z_mat. Did you mean y_mat or y_hat?
>>>
>>> 2) y_hat (2nd output) is an output of MSVAR_Filt(). The function
>>> comments say that is my estimated y. Are those the direct estimates that I am
>>> looking for?
>>>
>>> 3) I read through ms_var() to see how to derive the y_hat, x_hat, and
>>> z_hat variables that are needed, but I don't see any code in ms_var that
>>> derives these variables. Can you point out more specifically where the code
>>> is that shows the derivation of these matrices?
>>>
>>> Separate from those questions I am wondering what kind of bias is
>>> introduced if I use the filtered probabilities from ms_var? Could I use
>>> those instead of attempting to predict with data set A and evaluate with
>>> data set B? The reason I like the two data set methodology is that the
>>> training data (A) is separated from the evaluation data (B) so there can't
>>> be any bias in terms of measuring how the trained data generalizes when
>>> benchmarked on evaluation data because the training model never saw data
>>> set (B). Chapter 23 says the filtered probabilities only use data up until
>>> that point in time, but it uses estimates that were built from all
>>> information that is available. It seems biased to evaluate the residuals
>>> using filtered probabilities (or smoothed probabilities) because training
>>> and evaluating error on the same data set seems wrong. What do you think
>>> the right way is to use these tools to avoid bias when measuring error of
>>> model performance?
>>>
>>> Thanks for any information. Also is there any possibility for us to chat
>>> on IRC? I'm 'bmbouter' in #scilab on freenode if you want to chat there. It
>>> would probably be faster than e-mail.
>>>
>>> Thanks!
>>> Brian
>>>
>>>
>>> On Thu, Feb 12, 2015 at 3:44 PM, Eric Dubois <grocer.toolbox at gmail.com>
>>> wrote:
>>>
>>>> Dear Brian.
>>>>
>>>> If I have understood correctly, you want:
>>>> - to estimate a ms_var model on a subset of your dataset;
>>>> - recover the estimated parameters;
>>>> - and calculate the filtered state probabilities on the other part of
>>>> your dataset with these parameters.
>>>>
>>>> This can be done:
>>>> - the function MSVAR_Filt calculates, among other things, the filtered
>>>> probabilities (5th output);
>>>> - the function needs, among other things, the parameters of the model;
>>>> they can be recovered from the output tlist of function ms_var: if you give
>>>> it the name res (with --> res=ms_var(...)), this is the field 'coeff' in the
>>>> output tlist (res('coeff') in this example).
>>>>
>>>> But the function MSVAR_Filt also has to be fed with matrices y_hat,
>>>> x_hat and z_hat that are matrices derived from the matrix of endogenous and
>>>> exogenous variables (see function ms_var to see how it is done).
>>>>
>>>> If you are not in too much of a hurry, I can write the function that
>>>> gathers all these operations within a few weeks.
>>>>
>>>> Éric.
>>>>
>>>> 2015-02-12 16:56 GMT+01:00 Brian Bouterse <bmbouter at gmail.com>:
>>>>
>>>>> I use GROCER's ms_var function to estimate a single variable VAR
>>>>> model, and it estimates parameters as expected and described by the
>>>>> manual. I want to train and evaluate my model on different data sets to
>>>>> avoid bias from training and benchmarking on the same data set. How can
>>>>> this be done?
>>>>>
>>>>> For example consider data set A (month 1) and data set B (month 2)
>>>>> from a 2 month sample. I would like to train on month 1 and then benchmark
>>>>> on month 2.
>>>>>
>>>>> I use ms_var to train on data set A. It gives me estimated parameters
>>>>> and filtered regime probabilities. That works well. How can I use the
>>>>> trained parameters to then estimate on month 2 data?
>>>>>
>>>>> I'm aware of the ms_forecast function, but it seems to only forecast
>>>>> using the results from an estimator like ms_var(). The forecasting will
>>>>> then only be done on the same data as was used for estimating. I want to
>>>>> use the trained parameters to produce estimates for a different data set.
>>>>>
>>>>> Thanks in advance. I really appreciate being able to use this software.
>>>>>
>>>>> -Brian
>>>>>
>>>>> --
>>>>> Brian Bouterse
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users at lists.scilab.org
>>>>> http://lists.scilab.org/mailman/listinfo/users
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run_ms_var.sci
Type: application/octet-stream
Size: 3498 bytes
Desc: not available
URL: <https://lists.scilab.org/pipermail/users/attachments/20150219/3d04bf4e/attachment.obj>