[Scilab-users] GROCER VAR-HMM Example Output Different from Chapter 23

Thu May 29 13:24:57 CEST 2014

Hi Eric,

Thanks again for the response. What you told me to do worked. If I closed
Scilab, and rerun the example from my earlier e-mail, I do not see any
warnings, which is good. The output I received is not what I expected in
several ways, so I'll ask some more questions here. There are a lot of
different questions here, but they all relate to the example below, and the
use of ms_var and GROCER. Thanks in advance for any insight into the issues
I am experiencing.

The example:

I have a file arhmm_example_280.dat [0] that was generated and loaded
from arhmm_example_280.csv [1] using the following commands:

    impexc2bd('arhmm_example_280.csv', ';', 'arhmm_example_280.dat')
    load('arhmm_example_280.dat')

I then run, what I expect to be a single variable, 2-regime,

    nb_states=2
    switch_var=2
    var_opt=3

  r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')

I then receive this output:      http://ur1.ca/hej98

1) In the "coefficients" section of the output, the output doesn't seem to
contain enough numbers to describe the expected number of outputs. The
model has 2 regimes, and it is autoregressive with order 3, so I expect
there to be 3 coefficients for EACH regime (regime 1 and regime 2).
Instead, I see a single coefficient for Regime 1 (0.9313736) and a single
coefficient for Regime 2 (8.2685973). Where are the three autoregressive
parameters for regime 1 and the other three autoregressive parameters for
regime 2?

2) There is a section in the output called "All Regimes" which is in the
"Coefficients" section contains 3 coefficients. Do these correspond to the
autoregressive aspect of the model [0.2258037, 0.4757391, -0.3161730). I
expected to see a section like this for Regime 1, and one for Regime 2, but
I did not. I'm not sure how these 3 coefficients for "All Regimes" fit in.
The Markov Switching Model switches the autoregressive coefficients with
each Regime, so I'm not sure how to interpret coefficients that are for
"both regimes".

3) I believe each regime will have its error term estimated, and I do see
that in the output. For example under Regime 2, the variance of the error
term is 2.3164941. This is listed in the section "Variance-covariance
matrix of residuals" which is a subsection under "Regime 2". Am I
interpreting the output correctly? Where is the mean value for the regime?

4)  My overall goal is to validate that I can hand a list of observations
to GROCER+scilab, along with the order of the autoregressive model, and the
number of regimes, and have it estimate the Markov hidden state transition
probabilities, the estimated regime that is in effect at time t, and
coefficients for each regime, and the error term that can be used for
prediction of the t+1 observation. I don't know of a "well known" data set
that allows for this type of validation, so I wrote a simple AR-HMM
generator named koa marhn [2], which outputs AR-HMM data that follows a set
of parameters. This is the program that generated the 280 point data set in
the arhmm_example_280.csv [1] file. You can see the parameters used to
generate this file here:

    https://github.com/bmbouter/koa_marhn/blob/master/arhmm_example.py

A[i][j] is the probability transition from regime i to regime j.
C[x][y] is the autoregressive coefficient for regime x used for generation
of the t+1 element from the t, t-1, t-2, ..., t-y summation. Each y is a
coefficient for regime x.
R[q] contains mean and std_deviation values for regime q
pi[m] contains the starting state probabilities that the regime is in state
m at time t=0

The output from http://ur1.ca/hej98 does not correspond at all with the
values of these inputs, which means something unexpected is going on! Do
you have insight into why scilab doesn't produce these expected numbers?

5)  One possibility is that maybe it is "working", but there isn't enough
data for it to estimate the parameters correctly. I had koa marhn [2]
output 10,000 observations, but I ran into trouble transforming it from a
csv file to a dat file using impexc2bd. My input file is
arhmm_example_10000.csv [3]. I tried to convert to .dat using:

    impexc2bd('arhmm_example_10000.csv', ';', 'arhmm_example_10000.dat')

I receive this output, and the .dat file in NOT created

 !--error 10000
dates are entered neither in chronlogical order nor in reverseerse
chronological order
at line      87 of function read_dates called by :
at line     188 of function impexc2bd called by :
arhmm_example_10000.dat'

How do you transform a large .csv file to .dat? Is there some problem with
the data file, the way I am making it?

[0]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_280.dat
[1]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_280.csv
[2]:  https://github.com/bmbouter/koa_marhn
[3]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_10000.csv
[4]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_1000.csv
[5]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_1000.dat

This e-mail is way to long, but I wanted to fully recap the issues I've
considered as I use ms_var(). Thanks for any help you can provide. You can
also find me in the #scilab channel on the OFTC IRC servers.

Thanks,
Brian

On Fri, May 23, 2014 at 3:48 PM, Eric Dubois <grocer.toolbox at gmail.com>
wrote:

> Dear Brian
>
> 1) For me it works fine; I suspect that you have run ms_var with your data
> after having run with the manual example. I expect that if you reopen
> Scialb and run your problem, starting from load('arhmm_example.dat'), it
> will work. A good advice anyway is to set the bounds before each estimation
> or to run:
> --> bounds()
> if you want to use the greatest available time span with your data
> 2) I agree; indeed the example in the manual as well as yours are
> univariate, but ms_var also works with multivariate series (I have run some
> tests which worked well)
> 3) If you want to contribute, do not hesitate to send me code (at
> grocer.toolbox at gmail.com or grocer.toolbox at free.fr); add your copyright;
> if you can create a help file it would still be better (I have some tools
> do help doing that if you want them); and if you can add to the manual, it
> would be marvellous!
> If you want to imporve the docs you are also welcome; I can sned you the
> OpenOffice files you need if you find it suitable
>
> Éric.
>
>
>
> 2014-05-23 14:13 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:
>
> Thanks for the reply Eric. It is great to get hints and suggestions from
>> the author directly!
>>
>> I used the commands that you outlined, and they were able to reproduce
>> the expected output verbatim, which is great. Thanks for clearing that up
>> for me. I've gotten further towards my goal.
>>
>> I've now got three questions:
>>
>> 1)  I get an unexpected result when I run ms_var() on my own data. I run
>> these commands:
>>
>> load('arhmm_example.dat')
>> nb_states=2
>> switch_var=2 // variances are switching
>> var_opt=3 // unrestricted var-cov matrix
>>
>> r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>
>> I receive this output:
>>
>> WARNING: in overlay, series number 2 has been ignored because of a bad
>> frequency
>>  !--error 10000
>> series ends before the end date of the bounds
>> at line      39 of function ts2vec0 called by :
>> at line     101 of function explone called by :
>> at line     253 of function ms_var called by :
>>
>> r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>
>> The arhmm_example.dat file is available here[0], and it was made by
>> running.the following command on the original csv[1] file arhmm_example.csv.
>>
>> impexc2bd('arhmm_example.csv', ';', 'arhmm_example.dat')
>>
>> I believe I either don't have the dates configured correctly, or it
>> requires a specific number of data points to match the frequency value,
>> which also may be wrong. Do you have some insight into this error message?
>> I've been reading the docs on the ts structure, and I will continue to try
>> to solve this roadblock.
>>
>>
>> 2) My goal in doing all this is to analyze Autoregressive Hidden Markov
>> Models. As I understand it, the VAR-HMM that ms_var provides is a
>> multivariate case of an Autoregressive Hidden Markov Model. The terms
>> Markov Switching Model, and Hidden Markov Model refer to the same thing.
>> Using a single variable with ms_var() as I show above in the example, will
>> simulate an AR-HMM(3). I would like to check if these statements agree with
>> your understanding.
>>
>>
>> 3) How could I contribute to the grocer code. At the very least I could
>> improve the docs some.
>>
>>
>> [0]:
>> https://s3.amazonaws.com/dfsklfdsklfds/fdsjkfsdjkfds/arhmm_example.dat
>> [1]:
>> https://s3.amazonaws.com/dfsklfdsklfds/fdsjkfsdjkfds/arhmm_example.csv
>>
>> Thanks,
>> Brian
>>
>>
>> On Tue, May 20, 2014 at 3:55 PM, Eric Dubois <grocer.toolbox at gmail.com>
>> wrote:
>>
>>> Hello Brian
>>>
>>> Sorry for this late answer, but I have been quite busy these days.
>>>
>>> I did not notice the problem with this version of the MS programs.
>>> Indeed I have changed the optimization device of all GROCER programs and I
>>> have not adapted the defaults for the MS programs.
>>>
>>> If you run:
>>> --> global GROCERDIR;
>>> --> load(GROCERDIR+'\data\us_revu.dat')
>>> --> bounds('1967m4','2004m2')
>>>
>>> --> nb_states=2
>>> --> switch_var=2 // variances are switching
>>> --> var_opt=3 // unrestricted var-cov matrix
>>>
>>> -->
>>> r=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>> (see chapter 6 of the manual for explanations)
>>>
>>> Then the results of the ms_var demo is restaured.
>>>
>>> I will change the default in Grocer next version .
>>>
>>> Regards.
>>>
>>> Éric.
>>>
>>>
>>> 2014-05-19 12:56 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:
>>>
>>>> Hi Scilab community!
>>>>
>>>> I'm new to Scilab, and the AR-HMM and VAR-HMM solving capabilities of
>>>> GROCER are what interest me.
>>>>
>>>> I have a question relating to Chapter 23 from the GROCER manual[0].
>>>>  This is the univariate MS-AR(3) solved using the function ms_reg_d() on
>>>> the us_revu.dat data included with GROCER.  I have made no adjustment from
>>>> the example statements in Chapter 23.
>>>>
>>>> The example output is shown on pages 4 and 5 of the Chapter 23 module.
>>>>  Compare that against the output I receive.
>>>>
>>>> http://fpaste.org/102978/14004958/
>>>>
>>>> Here are my questions:
>>>>
>>>> 1.  The numerical output is completely different.  I expected it to be
>>>> the same since the data is provided by GROCER, and I've done the example
>>>> exactly as shown in Chapter 23.  Is there some explanation to why the
>>>> solved solution I receive is different than the example output in the
>>>> chapter?
>>>>
>>>> 2.  I see output like %i*8.4469016 which seems like an error because %i
>>>> looks like a variable that yet needs to be replaced, and then multiplied to
>>>> get to its final value.  Is this some kind of bug or error?
>>>>
>>>> Thanks for any help the community can provide.  We'll be using this for
>>>> a seminar on HMM, AR-HMM, and VAR-HMM at North Carolina State University.
>>>>  I'm also a developer, so I really appreciate all the effort that has been
>>>> put into scilab and GROCER.
>>>>
>>>> Thanks,
>>>> Brian
>>>>
>>>>
>>>> [0]:  http://dubois.ensae.net/Grocer_manual_v1.6.zip
>>>>
>>>>
>>>> --
>>>> Brian Bouterse
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users at lists.scilab.org
>>>> http://lists.scilab.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> users at lists.scilab.org
>>> http://lists.scilab.org/mailman/listinfo/users
>>>
>>>
>>
>>
>> --
>> Brian Bouterse
>>
>> _______________________________________________
>> users mailing list
>> users at lists.scilab.org
>> http://lists.scilab.org/mailman/listinfo/users
>>
>>
>
> _______________________________________________
> users mailing list
> users at lists.scilab.org
> http://lists.scilab.org/mailman/listinfo/users
>
>

-- 
Brian Bouterse
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.scilab.org/pipermail/users/attachments/20140529/56b82ab1/attachment.htm>