[scilab-Users] saisonality in time series

Charles Warner cwarner.cw711 at gmail.com
Mon Nov 21 23:57:20 CET 2011


Stephan-
Sounds like you are working with data sets that resemble some that I work
with frequently.  About the same size, non-stationary seasonality (which I
prefer to call low-frequency periodic trends), high, non-stationary
temporally variable "noisy" signals.  I usually have a priori knowledge
that there is likely aliasing due to the fact that I am limited in sampling
rate vs. total time period (varying the sampling rate for different
collection instances can sometimes resolve at least part of this issue).  I
haven't figured out how to get the aliasing out of the system, although
there should be some way to do this based on "unfolding" the spectrum.
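
A minimal sketch of the folding arithmetic behind that remark, assuming ideal uniform sampling. The function names and the example rates (10 and 8 samples per unit time) are hypothetical, chosen only for illustration. It lists the true frequencies that would fold onto an observed peak, and shows how a second sampling rate narrows the candidates without necessarily leaving just one.

```python
def alias_frequency(f_true, fs):
    """Apparent frequency of f_true after uniform sampling at rate fs."""
    return abs(f_true - fs * round(f_true / fs))


def candidate_frequencies(f_observed, fs, f_max):
    """All frequencies in [0, f_max] that fold onto f_observed at rate fs."""
    candidates = set()
    m = 0
    while m * fs - f_observed <= f_max:
        for f in (m * fs - f_observed, m * fs + f_observed):
            if 0 <= f <= f_max:
                candidates.add(f)
        m += 1
    return sorted(candidates)


# Hypothetical case: a true frequency of 7 sampled at two different rates.
seen_at_10 = alias_frequency(7, 10)
seen_at_8 = alias_frequency(7, 8)
from_rate_10 = candidate_frequencies(seen_at_10, 10, 30)
from_rate_8 = candidate_frequencies(seen_at_8, 8, 30)
# Only frequencies consistent with both observations remain plausible.
consistent = sorted(set(from_rate_10) & set(from_rate_8))
print(consistent)
```

In this made-up case the second rate removes some candidates but leaves more than one, which fits the point that varying the sampling rate resolves only part of the issue.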

Anyway, the approach I take is this.  Starting with a COPY of the raw data,
I run it through NIST Dataplot
<http://www.itl.nist.gov/div898/software/dataplot/> to get a feel for the
data, using what they call the 4-plot.  This gives me a feel for the
periodicity, auto-correlation, and statistical distribution, and helps me
identify the "outliers" (i.e., spikes, 0's, any data point that is
"unusual"), which are then filtered out by replacing specific data points
with a "more reasonable" value (which is why I use a copy of the raw data,
rather than the original).  I usually use a spreadsheet for this.  I might
run the data through Dataplot a couple of times to evaluate the effects my
"filtering" has had.  Usually, still in the spreadsheet, I next remove the
"DC offset" by subtracting one of the Pythagorean means
<http://en.wikipedia.org/wiki/Pythagorean_means>.  Which one is appropriate
depends on the nature of the data.  This reduces the low end of the
spectrum, which is where the "trends" are located.
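
The spreadsheet steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the spike rule (distance from the median in units of the median absolute deviation), the threshold of 5, and the function names are hypothetical choices, not the procedure actually used in the spreadsheet. The geometric and harmonic means require strictly positive data.

```python
import statistics


def mask_spikes(data, threshold=5.0):
    """Return a copy with spikes replaced by the mean of the nearest good neighbours.

    A point counts as a spike when it lies more than `threshold` median
    absolute deviations from the median (an assumed, crude rule).
    """
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    if mad == 0:
        return list(data)
    is_spike = [abs(x - med) > threshold * mad for x in data]
    cleaned = list(data)  # work on a copy, never on the raw data
    for i, spike in enumerate(is_spike):
        if not spike:
            continue
        left = next((data[j] for j in range(i - 1, -1, -1) if not is_spike[j]), None)
        right = next((data[j] for j in range(i + 1, len(data)) if not is_spike[j]), None)
        neighbours = [v for v in (left, right) if v is not None]
        cleaned[i] = statistics.fmean(neighbours) if neighbours else med
    return cleaned


def remove_offset(data, kind="arithmetic"):
    """Subtract one of the Pythagorean means to remove the DC offset."""
    means = {
        "arithmetic": statistics.fmean,
        "geometric": statistics.geometric_mean,
        "harmonic": statistics.harmonic_mean,
    }
    centre = means[kind](data)
    return [x - centre for x in data]


raw = [10.0, 11.0, 10.5, 250.0, 10.2, 9.8, 0.0, 10.4]  # made-up sample
cleaned = mask_spikes(raw)
centred = remove_offset(cleaned)
```

A global median is a blunt reference for data with strong trends; a sliding window would be the natural refinement, at the cost of one more parameter to choose.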

For the Fourier analysis, I could stay with Dataplot, but I find it much
easier to extract information from the Scilab approach.  Furthermore,
Scilab offers an alternative to the DFT called MESE.  The Maximum Entropy
Spectral Estimate (MESE) is designed to produce a high-resolution, low-bias
spectral estimate (refer to page 128 of the *Signal processing With Scilab*
manual
<http://wiki.scilab.org/Tutorials%20archives?action=AttachFile&do=view&target=signal.pdf>,
also available here <http://wiki.scilab.org/Tutorials%20archives>).  MESE
incorporates no information in the estimated spectrum other than the known
autocorrelation lags.  That is to say that the bias resulting from the
leakage from the window sidelobes should be eliminated (or at least
minimized in some sense).  In other words, one tends to get cleaner
"spikes" in the spectrum.  It is much easier to pick out the
lower-frequency components of the signals with this procedure.  One then
subtracts these components from the working data, and repeats the process.
Ideally, you have extracted all of the available information from the data
when the residual is Gaussian white noise (a point I have never actually
reached in practice).
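
The subtract-and-repeat loop can be sketched as below. Note the substitution: this uses a plain DFT periodogram to find each peak, not MESE, and it assumes every component falls exactly on a DFT bin. All function names are hypothetical, and the direct O(N^2) DFT is only sensible for short series.

```python
import cmath
import math


def dft_bin(x, k):
    """Return the k-th DFT coefficient of the sequence x."""
    n = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))


def strongest_component(x):
    """Return (bin, amplitude, phase) of the largest peak below Nyquist."""
    n = len(x)
    best = max(range(1, (n - 1) // 2 + 1), key=lambda k: abs(dft_bin(x, k)))
    coeff = dft_bin(x, best)
    # For A*cos(2*pi*k*t/n + phi), the k-th coefficient is (A*n/2)*exp(i*phi).
    return best, 2.0 * abs(coeff) / n, cmath.phase(coeff)


def extract_components(x, count):
    """Repeatedly subtract the strongest sinusoid; return (components, residual)."""
    n = len(x)
    residual = list(x)
    found = []
    for _ in range(count):
        k, amp, phase = strongest_component(residual)
        found.append((k / n, amp, phase))  # frequency in cycles per sample
        residual = [
            v - amp * math.cos(2.0 * math.pi * k * t / n + phase)
            for t, v in enumerate(residual)
        ]
    return found, residual


# Made-up test signal: two sinusoids that sit exactly on DFT bins 4 and 10.
n = 64
signal = [
    3.0 * math.cos(2.0 * math.pi * 4 * t / n + 0.5)
    + 1.0 * math.cos(2.0 * math.pi * 10 * t / n - 1.0)
    for t in range(n)
]
found, residual = extract_components(signal, 2)
```

On real data the components rarely sit on a bin, so each subtraction leaves leakage behind, and a whiteness check on the residual (not included here) would decide when to stop.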

I don't believe "windowing" techniques will work with low-frequency
components, although I could be mistaken in this.  I have toyed with
windowing when I have reduced my residual to what has the appearance of a
frequency-modulated signal- but, then, I am looking to characterize such
events.  The information you are trying to extract will ultimately dictate
what approach you take.

I am attaching a "working document" that I have put together giving more
detail on this approach.

Charlie

2011/11/21 Ginters Bušs <ginters.buss at gmail.com>

> Better stick with DFT, smoothed DFT or try seasonal adjustment freeware
> Demetra+   - that's what official statisticians might do.
>
> gin
>
>
> On Mon, Nov 21, 2011 at 10:00 AM, Schreckenbach Stephan <
> s.schreckenbach at truma.com> wrote:
>
>> Filtering temporal spikes is a good idea, since there are some of them. I
>> will try that.
>>
>> The data sample has around 7000 data points, the frequency I look for is
>> around 1/10 * sample rate.
>>
>> Maybe there are methods that are better suited for identifying frequency
>> components in that kind of data?
>>
>> FFT always describes the time series by harmonic oscillations, which
>> might not work well
>> if oscillations are not (strictly) harmonic.
>>
>> What about wavelets (don’t know much about it yet, though)?
>>
>> Stephan
>>
>>   ------------------------------
>>
>> *From:* Charles Warner [mailto:cwarner.cw711 at gmail.com]
>> *Sent:* Saturday, 19 November 2011 05:12
>> *To:* users at lists.scilab.org
>> *Subject:* Re: [scilab-Users] saisonality in time series
>>
>> Another trick I have found that greatly reduces FFT noise is to
>> temporarily mask any localized "spikes" in the data (such spikes, with a
>> narrow temporal profile, have a very broad spectral distribution).  One can
>> also try to eliminate any offset by subtracting the mean (or the geometric
>> mean or harmonic mean; the appropriate mean would be dictated by the nature
>> of the data).  This should hopefully reduce the scale of the FFT amplitude,
>> making it easier to spot any (especially low-frequency, or seasonal)
>> potential frequency components.
>>
>> On Fri, Nov 18, 2011 at 3:09 AM, Schreckenbach Stephan <
>> s.schreckenbach at truma.com> wrote:
>>
>> Hi,
>>
>> sorry, of course I meant seasonality.
>>
>> The time series consists of longer-term trends, short-term noise and
>> short-term seasonality.
>>
>> The oscillations / seasonality, if any, are most likely nonharmonic.
>> I look for distinct frequencies.
>>
>> When I did an FFT plot of the original time series there was only noise in
>> the spectrum.
>>
>> I will give it a run with the differenced series / the log of the
>> data.
>>
>> There is still the question of how to test for significance of the found
>> seasonality.
>>
>> Stephan
>>
>>   ------------------------------
>>
>> *From:* Charles Warner [mailto:cwarner.cw711 at gmail.com]
>> *Sent:* Friday, 18 November 2011 00:34
>> *To:* users at lists.scilab.org
>> *Subject:* Re: [scilab-Users] saisonality in time series
>>
>> Although "seasonality" is not the term I use for long term trends hidden
>> in noisy data, I have had some success by taking the log of the data, and
>> running an FFT on the log data.  Usually, I have some prior knowledge of
>> the long-term periodic trends I expect, so it is relatively easy to
>> determine quickly if this method works.  Plotting the log of the data also
>> gives one a good feel for whether the data is stationary, or whether there
>> are windows of data that can be treated as stationary.  Any changing
>> magnitude effect is, of course, reduced when one works with logs, but such
>> effects can help one understand what the raw data is really telling you.
>>
>> Charlie
>>
>> On Thu, Nov 17, 2011 at 12:40 PM, Mike Page <Mike at page-one.waitrose.com>
>> wrote:
>>
>> Hi,
>>
>> I don't know much about this application, but the Cepstrum can be used to
>> find hidden periodicity in time series.  Might be worth trying?  I have
>> used
>> it for finding rotational components in the vibration signatures from
>> rotating machinery.  There's a simple example here
>> (http://www.dliengineering.com/downloads/cepstrum%20analysis.pdf).
>>
>> Mike.
>>
>>
>>
>> -----Original Message-----
>> From: Petter Wingren [mailto:petterwr at gmail.com]
>> Sent: 17 November 2011 17:18
>> To: users at lists.scilab.org
>> Subject: Re: [scilab-Users] saisonality in time series
>>
>>
>> Did a quick search but couldn't find anything obvious. I suppose the
>> word you are looking for is seasonality - maybe that helps in finding
>> something useful.
>>
>> On Thu, Nov 17, 2011 at 3:36 PM, Schreckenbach Stephan
>> <s.schreckenbach at truma.com> wrote:
>> >
>> > Hi,
>> >
>> > I look for a test of saisonality in time series.
>> > The time series might be instationary and nonlinear and the saisonality
>> > / oscillation might have a changing amplitude. Furthermore the
>> > distribution
>> > might be unknown as well.
>> > I need something to test for significant saisonality without knowing /
>> > estimating a (linear) model of the time series.
>> >
>> > ideas I got so far: Chi Square Test for independency:
>> > I could test for independence of saison and mean value of the data
>> >
>> > Chi Square Test to test for different means of two data groups.
>> > I could test for a difference of the mean between several seasons.
>> >
>> > Any more or better ideas?
>> >
>> > Thanks in advance, Stephan
>> >
>> >
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Summary.odt
Type: application/vnd.oasis.opendocument.text
Size: 298787 bytes
Desc: not available
URL: <https://lists.scilab.org/pipermail/users/attachments/20111121/11a1dab5/attachment.odt>
