[Scilab-users] GROCER VAR-HMM Example Output Different from Chapter 23

Tue Jun 3 22:40:24 CEST 2014

Hello Brian.

Thank you for your feedback. I will try to make the distinction between
'cte' and 'all' more precise in the manual.

I will also think about your suggestion to put the code on a hub.

Regards.

Éric.

2014-06-03 12:21 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:

> Hi Eric,
>
> Thanks for the quick and helpful reply! I adjusted the first parameter
> from 'cte' to 'all' and that produces the correct form of output, and the
> values look close to the expected values. This is great!
>
> I've read the help documents several times, but I didn't realize the
> impact of adjusting the value of the first parameter. The doc example on
> the ms_var help page describes the parameter, but the only example on that
> page uses 'cte'. There is no example with 'all' on the ms_var() help page.
> Also including some sample output may also be good there. I would like to
> contribute an example like this to the docs, manual, or help files. I think
> that would have caused me to solve my own problems instead of writing long
> e-mails.
>
> I also tried using a larger data file 10,000 observations in a column
> format, and then impexc2bd transformed the data file correctly. That is
> also solved thanks to your suggestion.
>
> I know the code is available for download, but would you ever consider
> putting a copy of the GROCER code in a repository on github.com? If it
> were in a place like that, I, and others, would have an easier time of
> contributing to the codebase instead of e-mailing patches. What do you
> think about an idea like this? Just a friendly suggestion based on what has
> worked well for me in the past. I can help put it up there, and configure
> it if that is helpful.
>
> Thanks again for making this great software, and helping me use it!
>
> Best,
> Brian
>
>
>
>
>
> On Thu, May 29, 2014 at 1:41 PM, Eric Dubois <grocer.toolbox at gmail.com>
> wrote:
>
>> Hello Brian
>>
>> You did not recover the intial parameters because you did not estimate
>> the same model as the one you generated.
>>
>> This is because you used in ms_var the option 'cte' which means that only
>> the constant is allowed to switch: and this is why the AR coeffcients are
>> only given for 'all regimes'. I f you rather use the option 'all', then all
>> parameters are allowed to switch and the result is now consistent with the
>> parameters of the generated model, that is the true values are within the
>> confidence interval of the estimated ones. This is explained in the help
>> command (see help ms_var): may I ask you why you did not infer it from the
>> help files?
>>
>> As for the big file, I suspect it is truncated into Scilab. Can you try
>> with data in column instead of in rows?
>>
>> Éric.
>>
>>
>> 2014-05-29 13:24 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:
>>
>> Hi Eric,
>>>
>>> Thanks again for the response. What you told me to do worked. If I
>>> closed Scilab, and rerun the example from my earlier e-mail, I do not see
>>> any warnings, which is good. The output I received is not what I expected
>>> in several ways, so I'll ask some more questions here. There are a lot of
>>> different questions here, but they all relate to the example below, and the
>>> use of ms_var and GROCER. Thanks in advance for any insight into the issues
>>> I am experiencing.
>>>
>>> The example:
>>>
>>> I have a file arhmm_example_280.dat [0] that was generated and loaded
>>> from arhmm_example_280.csv [1] using the following commands:
>>>
>>>     impexc2bd('arhmm_example_280.csv', ';', 'arhmm_example_280.dat')
>>>     load('arhmm_example_280.dat')
>>>
>>> I then run, what I expect to be a single variable, 2-regime,
>>>
>>>     nb_states=2
>>>     switch_var=2
>>>     var_opt=3
>>>
>>>   r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>>
>>> I then receive this output:      http://ur1.ca/hej98
>>>
>>> 1) In the "coefficients" section of the output, the output doesn't seem
>>> to contain enough numbers to describe the expected number of outputs. The
>>> model has 2 regimes, and it is autoregressive with order 3, so I expect
>>> there to be 3 coefficients for EACH regime (regime 1 and regime 2).
>>> Instead, I see a single coefficient for Regime 1 (0.9313736) and a single
>>> coefficient for Regime 2 (8.2685973). Where are the three autoregressive
>>> parameters for regime 1 and the other three autoregressive parameters for
>>> regime 2?
>>>
>>> 2) There is a section in the output called "All Regimes" which is in the
>>> "Coefficients" section contains 3 coefficients. Do these correspond to the
>>> autoregressive aspect of the model [0.2258037, 0.4757391, -0.3161730). I
>>> expected to see a section like this for Regime 1, and one for Regime 2, but
>>> I did not. I'm not sure how these 3 coefficients for "All Regimes" fit in.
>>> The Markov Switching Model switches the autoregressive coefficients with
>>> each Regime, so I'm not sure how to interpret coefficients that are for
>>> "both regimes".
>>>
>>> 3) I believe each regime will have its error term estimated, and I do
>>> see that in the output. For example under Regime 2, the variance of the
>>> error term is 2.3164941. This is listed in the section "Variance-covariance
>>> matrix of residuals" which is a subsection under "Regime 2". Am I
>>> interpreting the output correctly? Where is the mean value for the regime?
>>>
>>> 4)  My overall goal is to validate that I can hand a list of
>>> observations to GROCER+scilab, along with the order of the autoregressive
>>> model, and the number of regimes, and have it estimate the Markov hidden
>>> state transition probabilities, the estimated regime that is in effect at
>>> time t, and coefficients for each regime, and the error term that can be
>>> used for prediction of the t+1 observation. I don't know of a "well known"
>>> data set that allows for this type of validation, so I wrote a simple
>>> AR-HMM generator named koa marhn [2], which outputs AR-HMM data that
>>> follows a set of parameters. This is the program that generated the 280
>>> point data set in the arhmm_example_280.csv [1] file. You can see the
>>> parameters used to generate this file here:
>>>
>>>     https://github.com/bmbouter/koa_marhn/blob/master/arhmm_example.py
>>>
>>> A[i][j] is the probability transition from regime i to regime j.
>>> C[x][y] is the autoregressive coefficient for regime x used for
>>> generation of the t+1 element from the t, t-1, t-2, ..., t-y summation.
>>> Each y is a coefficient for regime x.
>>> R[q] contains mean and std_deviation values for regime q
>>> pi[m] contains the starting state probabilities that the regime is in
>>> state m at time t=0
>>>
>>> The output from http://ur1.ca/hej98 does not correspond at all with the
>>> values of these inputs, which means something unexpected is going on! Do
>>> you have insight into why scilab doesn't produce these expected numbers?
>>>
>>> 5)  One possibility is that maybe it is "working", but there isn't
>>> enough data for it to estimate the parameters correctly. I had koa marhn
>>> [2] output 10,000 observations, but I ran into trouble transforming it from
>>> a csv file to a dat file using impexc2bd. My input file is
>>> arhmm_example_10000.csv [3]. I tried to convert to .dat using:
>>>
>>>     impexc2bd('arhmm_example_10000.csv', ';', 'arhmm_example_10000.dat')
>>>
>>> I receive this output, and the .dat file in NOT created
>>>
>>>  !--error 10000
>>> dates are entered neither in chronlogical order nor in reverseerse
>>> chronological order
>>> at line      87 of function read_dates called by :
>>> at line     188 of function impexc2bd called by :
>>> arhmm_example_10000.dat'
>>>
>>> How do you transform a large .csv file to .dat? Is there some problem
>>> with the data file, the way I am making it?
>>>
>>> [0]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_280.dat
>>> [1]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_280.csv
>>> [2]:  https://github.com/bmbouter/koa_marhn
>>> [3]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_10000.csv
>>> [4]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_1000.csv
>>> [5]:  https://s3.amazonaws.com/scilab_data_files/arhmm_example_1000.dat
>>>
>>> This e-mail is way to long, but I wanted to fully recap the issues I've
>>> considered as I use ms_var(). Thanks for any help you can provide. You can
>>> also find me in the #scilab channel on the OFTC IRC servers.
>>>
>>> Thanks,
>>> Brian
>>>
>>>
>>>
>>> On Fri, May 23, 2014 at 3:48 PM, Eric Dubois <grocer.toolbox at gmail.com>
>>> wrote:
>>>
>>>> Dear Brian
>>>>
>>>> 1) For me it works fine; I suspect that you have run ms_var with your
>>>> data after having run with the manual example. I expect that if you reopen
>>>> Scialb and run your problem, starting from load('arhmm_example.dat'),
>>>> it will work. A good advice anyway is to set the bounds before each
>>>> estimation or to run:
>>>> --> bounds()
>>>> if you want to use the greatest available time span with your data
>>>> 2) I agree; indeed the example in the manual as well as yours are
>>>> univariate, but ms_var also works with multivariate series (I have run some
>>>> tests which worked well)
>>>> 3) If you want to contribute, do not hesitate to send me code (at
>>>> grocer.toolbox at gmail.com or grocer.toolbox at free.fr); add your
>>>> copyright; if you can create a help file it would still be better (I have
>>>> some tools do help doing that if you want them); and if you can add to the
>>>> manual, it would be marvellous!
>>>> If you want to imporve the docs you are also welcome; I can sned you
>>>> the OpenOffice files you need if you find it suitable
>>>>
>>>> Éric.
>>>>
>>>>
>>>>
>>>> 2014-05-23 14:13 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:
>>>>
>>>> Thanks for the reply Eric. It is great to get hints and suggestions
>>>>> from the author directly!
>>>>>
>>>>> I used the commands that you outlined, and they were able to reproduce
>>>>> the expected output verbatim, which is great. Thanks for clearing that up
>>>>> for me. I've gotten further towards my goal.
>>>>>
>>>>> I've now got three questions:
>>>>>
>>>>> 1)  I get an unexpected result when I run ms_var() on my own data. I
>>>>> run these commands:
>>>>>
>>>>> load('arhmm_example.dat')
>>>>> nb_states=2
>>>>> switch_var=2 // variances are switching
>>>>> var_opt=3 // unrestricted var-cov matrix
>>>>>
>>>>> r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>>>>
>>>>> I receive this output:
>>>>>
>>>>> WARNING: in overlay, series number 2 has been ignored because of a bad
>>>>> frequency
>>>>>  !--error 10000
>>>>> series ends before the end date of the bounds
>>>>> at line      39 of function ts2vec0 called by :
>>>>> at line     101 of function explone called by :
>>>>> at line     253 of function ms_var called by :
>>>>>
>>>>> r=ms_var('cte',3,'ardata',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>>>>
>>>>> The arhmm_example.dat file is available here[0], and it was made by
>>>>> running.the following command on the original csv[1] file arhmm_example.csv.
>>>>>
>>>>> impexc2bd('arhmm_example.csv', ';', 'arhmm_example.dat')
>>>>>
>>>>> I believe I either don't have the dates configured correctly, or it
>>>>> requires a specific number of data points to match the frequency value,
>>>>> which also may be wrong. Do you have some insight into this error message?
>>>>> I've been reading the docs on the ts structure, and I will continue to try
>>>>> to solve this roadblock.
>>>>>
>>>>>
>>>>> 2) My goal in doing all this is to analyze Autoregressive Hidden
>>>>> Markov Models. As I understand it, the VAR-HMM that ms_var provides is a
>>>>> multivariate case of an Autoregressive Hidden Markov Model. The terms
>>>>> Markov Switching Model, and Hidden Markov Model refer to the same thing.
>>>>> Using a single variable with ms_var() as I show above in the example, will
>>>>> simulate an AR-HMM(3). I would like to check if these statements agree with
>>>>> your understanding.
>>>>>
>>>>>
>>>>> 3) How could I contribute to the grocer code. At the very least I
>>>>> could improve the docs some.
>>>>>
>>>>>
>>>>> [0]:
>>>>> https://s3.amazonaws.com/dfsklfdsklfds/fdsjkfsdjkfds/arhmm_example.dat
>>>>> [1]:
>>>>> https://s3.amazonaws.com/dfsklfdsklfds/fdsjkfsdjkfds/arhmm_example.csv
>>>>>
>>>>> Thanks,
>>>>> Brian
>>>>>
>>>>>
>>>>> On Tue, May 20, 2014 at 3:55 PM, Eric Dubois <grocer.toolbox at gmail.com
>>>>> > wrote:
>>>>>
>>>>>> Hello Brian
>>>>>>
>>>>>> Sorry for this late answer, but I have been quite busy these days.
>>>>>>
>>>>>> I did not notice the problem with this version of the MS programs.
>>>>>> Indeed I have changed the optimization device of all GROCER programs and I
>>>>>> have not adapted the defaults for the MS programs.
>>>>>>
>>>>>> If you run:
>>>>>> --> global GROCERDIR;
>>>>>> --> load(GROCERDIR+'\data\us_revu.dat')
>>>>>> --> bounds('1967m4','2004m2')
>>>>>>
>>>>>> --> nb_states=2
>>>>>> --> switch_var=2 // variances are switching
>>>>>> --> var_opt=3 // unrestricted var-cov matrix
>>>>>>
>>>>>> -->
>>>>>> r=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','opt_convg=0')
>>>>>> (see chapter 6 of the manual for explanations)
>>>>>>
>>>>>> Then the results of the ms_var demo is restaured.
>>>>>>
>>>>>> I will change the default in Grocer next version .
>>>>>>
>>>>>> Regards.
>>>>>>
>>>>>> Éric.
>>>>>>
>>>>>>
>>>>>> 2014-05-19 12:56 GMT+02:00 Brian Bouterse <bmbouter at gmail.com>:
>>>>>>
>>>>>>> Hi Scilab community!
>>>>>>>
>>>>>>> I'm new to Scilab, and the AR-HMM and VAR-HMM solving capabilities
>>>>>>> of GROCER are what interest me.
>>>>>>>
>>>>>>> I have a question relating to Chapter 23 from the GROCER manual[0].
>>>>>>>  This is the univariate MS-AR(3) solved using the function ms_reg_d() on
>>>>>>> the us_revu.dat data included with GROCER.  I have made no adjustment from
>>>>>>> the example statements in Chapter 23.
>>>>>>>
>>>>>>> The example output is shown on pages 4 and 5 of the Chapter 23
>>>>>>> module.  Compare that against the output I receive.
>>>>>>>
>>>>>>> http://fpaste.org/102978/14004958/
>>>>>>>
>>>>>>> Here are my questions:
>>>>>>>
>>>>>>> 1.  The numerical output is completely different.  I expected it to
>>>>>>> be the same since the data is provided by GROCER, and I've done the example
>>>>>>> exactly as shown in Chapter 23.  Is there some explanation to why the
>>>>>>> solved solution I receive is different than the example output in the
>>>>>>> chapter?
>>>>>>>
>>>>>>> 2.  I see output like %i*8.4469016 which seems like an error because
>>>>>>> %i looks like a variable that yet needs to be replaced, and then multiplied
>>>>>>> to get to its final value.  Is this some kind of bug or error?
>>>>>>>
>>>>>>> Thanks for any help the community can provide.  We'll be using this
>>>>>>> for a seminar on HMM, AR-HMM, and VAR-HMM at North Carolina State
>>>>>>> University.  I'm also a developer, so I really appreciate all the effort
>>>>>>> that has been put into scilab and GROCER.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Brian
>>>>>>>
>>>>>>>
>>>>>>> [0]:  http://dubois.ensae.net/Grocer_manual_v1.6.zip
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Brian Bouterse
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> users at lists.scilab.org
>>>>>>> http://lists.scilab.org/mailman/listinfo/users
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> users at lists.scilab.org
>>>>>> http://lists.scilab.org/mailman/listinfo/users
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Brian Bouterse
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> users at lists.scilab.org
>>>>> http://lists.scilab.org/mailman/listinfo/users
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> users at lists.scilab.org
>>>> http://lists.scilab.org/mailman/listinfo/users
>>>>
>>>>
>>>
>>>
>>> --
>>> Brian Bouterse
>>>
>>> _______________________________________________
>>> users mailing list
>>> users at lists.scilab.org
>>> http://lists.scilab.org/mailman/listinfo/users
>>>
>>>
>>
>> _______________________________________________
>> users mailing list
>> users at lists.scilab.org
>> http://lists.scilab.org/mailman/listinfo/users
>>
>>
>
>
> --
> Brian Bouterse
>
> _______________________________________________
> users mailing list
> users at lists.scilab.org
> http://lists.scilab.org/mailman/listinfo/users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.scilab.org/pipermail/users/attachments/20140603/0fcc9e13/attachment.htm>