How to recognize a flat pattern in a given time period?

yuanliu
SplunkTrust

I have a search that returns too many data series to display or analyze easily. These series show three distinct patterns:

  1. Flat at beginning, rugged after some time.
  2. Irregular throughout.
  3. Zero (flat) at beginning, rugged after some time.

I then want to search by pattern. This is a pattern recognition problem, but for my purposes a simple method to identify the flat beginning is good enough. In other words, I only need to "search those with flat beginning greater than 11", "search those with flat beginning of 0", and "search those that are neither." Is there a simple method to do this?

lguinn2
Legend

Assume

yoursearchhere
| timechart count by ID

And you want to analyze the first 11 time periods reported by timechart, then do this

yoursearchhere
| bin span=1h _time
| stats count by _time ID
| appendpipe [ head 11
             | stats stdev(count) as sdev avg(count) as avg by ID
             | eval pattern=case(avg<.1,"Zero at beginning",
                                    sdev < .25,"Flat at beginning",
                                    1==1,"Other")
             | fields ID pattern ]
| stats first(pattern) first(count) by _time ID

I think this will give you a starting point. The appendpipe takes a copy of the data at that point in the execution pipeline, processes it and appends the results to the main pipeline. Oh, and I set the time interval to hours in the bin command - you could do this using timechart as you started, but I think it is easier to use bin and stats.
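For readers who want to check the thresholds outside Splunk, the classification rule in the case() expression can be sketched in Python. This is only a model of the logic, not Splunk code: the function name classify_head, the sample series, and the window size of 11 are assumptions made for illustration, and statistics.stdev is used on the assumption that a sample standard deviation is wanted.

```python
from statistics import mean, stdev

def classify_head(counts, window=11, zero_avg=0.1, flat_sdev=0.25):
    """Label a series by the behaviour of its first `window` buckets."""
    head = counts[:window]
    avg = mean(head)
    # Sample standard deviation needs at least two points.
    sdev = stdev(head) if len(head) > 1 else 0.0
    if avg < zero_avg:
        return "Zero at beginning"
    if sdev < flat_sdev:
        return "Flat at beginning"
    return "Other"

# Hypothetical per-ID counts, one value per time bucket.
series = {
    "A": [5] * 11 + [9, 2, 14, 7],
    "B": [3, 8, 1, 12, 6, 0, 9, 4, 11, 2, 7, 5, 10, 3, 8],
    "C": [0] * 11 + [4, 9, 1, 6],
}
patterns = {sid: classify_head(c) for sid, c in series.items()}
```

As in the case() expression, the order of the tests matters: an all-zero head also has zero deviation, so the zero test has to come first.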

yuanliu
SplunkTrust

"head" and "appendpipe" (and first()) are what I missed. Thanks! (Now I want even more commands to zoom in on any given interval :-)

I just realized that "count by _time ID" does not output 0 for missing values at the two ends. (Strangely, timechart always does.) I even tried fillnull, to no avail. (I know this came up in another question, but fillnull seemed to have solved the problem there.) Ideas?
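The difference can be modelled outside Splunk. The Python sketch below is an assumption-laden illustration, not a description of Splunk internals: grouping by (bucket, ID) only produces rows for combinations that actually occur, whereas a grid over every bucket and every ID has a cell, possibly 0, for each combination. The names events, buckets, and dense_counts are hypothetical.

```python
from collections import Counter

def dense_counts(events, buckets, ids):
    """Count (bucket, id) pairs, with an explicit 0 for every missing pair."""
    seen = Counter(events)
    # Counter returns 0 for keys it has never seen.
    return {(b, i): seen[(b, i)] for b in buckets for i in ids}

# Hypothetical events: ID "A" only appears late, ID "B" only early.
events = [(2, "A"), (2, "A"), (3, "A"), (0, "B"), (1, "B")]

sparse = Counter(events)                            # group-by: observed pairs only
dense = dense_counts(events, range(4), ["A", "B"])  # grid: every pair present
```

Here sparse has no entry for ("A" in bucket 0), while dense reports it as 0, which is what a flat-at-zero head test needs.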

lguinn2
Legend

Bah! I guess the timechart solution is better:

 yoursearchhere
| timechart count by ID
| untable _time ID count

then appendpipe etc as before

HTH!

yuanliu
SplunkTrust

Thanks, @lguinn. untable is such a handy command! I had previously asked about filling leading zeros, and got a slim but still lengthy method. (My memory lapsed when I said straight fillnull had worked. It hadn't.) Will test in other use cases.

Note that after untable, head returns the requested number of rows in total, not per ID. That is not what I want. (For one, there could be more than 11 IDs.) So untable should be performed after head, inside appendpipe. With this adjustment, and with max() added to the criteria, I can use the following to group my IDs:

  yoursearchhere
 | timechart count by ID
 | appendpipe [ head 11
   | untable _time ID count
   | stats stdev(count) as sdev max(count) as max by ID
   | eval pattern=case(max==0,"Zero at beginning",
                       max>0 and sdev < .25,"Flat at beginning",
                       1==1,"No head pattern")
   | fields ID pattern ]
 | stats dc(ID) as Count by pattern
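Why the order of head and untable matters can be sketched in Python. This is a rough model only: it assumes the wide table has one row per time bucket and that flattening emits one row per cell, bucket by bucket; the helper names untable and head simply mirror the commands and are not Splunk APIs.

```python
from collections import Counter

# Hypothetical wide table: one row per bucket, one column per ID.
wide = [{"_time": t, "A": 5, "B": t % 3, "C": 0} for t in range(15)]

def untable(rows, key="_time"):
    """Flatten wide rows into (key, series, value) tuples, one per cell."""
    return [(r[key], name, value)
            for r in rows
            for name, value in r.items() if name != key]

def head(rows, n=11):
    """Keep the first n rows, whatever a row happens to be."""
    return rows[:n]

long_then_head = head(untable(wide))   # 11 cells in total
head_then_long = untable(head(wide))   # 11 buckets for every ID

rows_per_id_before = Counter(name for _, name, _ in long_then_head)
rows_per_id_after = Counter(name for _, name, _ in head_then_long)
```

With three IDs, flattening first leaves only three or four buckets per ID inside the 11-row window, while taking head on the wide table keeps all 11 buckets for each ID.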

Now, there is also a tail pattern in my search, whereby some IDs disappear in the final time periods. When I tried the same untable technique with tail in place of head, I got no IDs back. I'll submit that one as a new question.

yuanliu
SplunkTrust

I realize that the original question is missing this info: the illustrated time pattern is produced by
| timechart count by ID
Some IDs fall into pattern 1, some into pattern 2, and some into pattern 3.

masonmorales
Influencer

You could try using multiple functions in your timechart command, along with some | where clauses. If you use the stdev function then you'll be able to detect the flat lines (since stdev would be 0). Take a look at: http://docs.splunk.com/Documentation/Splunk/6.2.5/SearchReference/CommonStatsFunctions

yuanliu
SplunkTrust

Getting stdev is easy. The problem is searching based on a zero stdev in a sub-period of the total search, because stdev is not 0 over the entire search period (if it were, I could use eventstats to identify those IDs).
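A small numeric illustration of that point, in Python rather than SPL and with made-up counts: the deviation over the whole period is well above zero, yet the deviation over the first 11 buckets alone is exactly zero. The variable names are hypothetical.

```python
from statistics import stdev

# Hypothetical counts for one ID: flat for 11 buckets, rugged afterwards.
counts = [4] * 11 + [9, 1, 13, 6]

whole_period = stdev(counts)      # deviation across all 15 buckets
head_only = stdev(counts[:11])    # deviation across the flat beginning only
```

So a test on the whole-period value would miss this ID, and the statistic has to be computed on the sub-period.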
