how to use makecontinuous in combination with stats to fill in time

Path Finder

Hello,

I have this search query:

sourcetype="device"
| bucket span=1d _time | makecontinuous _time 
| stats count by _time, user | fillnull count

I was expecting that, by using makecontinuous, the days when the count was 0 would also be added to the results. Instead, this query returns:

    _time       user   count
    2017-08-18  user2  5
    2017-08-21  user2  1
    2017-08-25  user2  4
    2017-08-27  user2  1
    2017-08-30  user2  6

I was expecting this result:

    _time       user   count
    2017-08-18  user2  5
    2017-08-19  user2  0
    2017-08-20  user2  0
    2017-08-21  user2  1
    2017-08-22  user2  0
    ... and so on
    2017-08-25  user2  4
    2017-08-26  user2  0
    2017-08-27  user2  1
    2017-08-30  user2  6

I know that this would work well with timechart but I really need to use stats, so that I can then use the results in Machine Learning Toolkit, and timechart would not work there.

0 Karma
1 Solution

SplunkTrust

Perhaps this could help if you wanted it in another format?

| timechart limit=0 span=5m count by user
| fillnull 
| untable _time, user, count
...

I've used that trick to fill in the missing time points before...let me know if that helps!
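
Adapted to the daily buckets in the question, the same trick might look like the sketch below. The `sourcetype="device"` base search and `span=1d` are taken from the question rather than from this answer, and `value=0` is spelled out only for clarity, so treat this as an untested adaptation rather than a confirmed query.

```spl
sourcetype="device"
| timechart limit=0 span=1d count by user
| fillnull value=0
| untable _time user count
```

The result should have one row per day and user, with the columns `_time`, `user`, and `count`, which is the same shape that `stats count by _time, user` produces, but with the zero-count days included.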


Path Finder

This works. I would mark it as the answer, but it is a reply, so I cannot.

0 Karma

SplunkTrust

Moved to answer!

0 Karma

SplunkTrust

You have some options here. Since the MLTK is appending stats on there, a command such as fillnull or makecontinuous will not solve your issue when placed before it; it needs to run after timechart/stats.

You need to mock up some dummy data and set its values to zero, then allow stats to fill in any non-null values.

An example would look like this:

| makeresults | eval field1="" | eval field2=""
| append [| search index=... sourcetype=... | bin _time span=10m | stats count by _time | fillnull value=0]

So if your time range is a 60-minute span, the makeresults side should supply six bins of 10 minutes each, and any empty bin is filled with a zero. You could also use a lookup table to populate your null values, or use the internal index to populate placeholders to prevent null values.
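
A fuller sketch of the placeholder idea follows. A bare `| makeresults` emits a single row, so this version generates one zero-count row per bin and then lets `stats` merge the placeholders with the real counts. The bin count of 6, the 600-second alignment, and the helper field `n` are assumptions for a 60-minute window with 10-minute bins, and the `index=... sourcetype=...` placeholders are left as in the example above. It has not been run against real data.

```spl
index=... sourcetype=...
| bin _time span=10m
| stats count by _time
| append
    [| makeresults count=6
     | streamstats count as n
     | eval _time = 600 * floor(now() / 600) - 600 * (n - 1)
     | eval count = 0
     | fields _time count]
| stats sum(count) as count by _time
| sort 0 _time
```

Because the placeholders contribute 0 to the sum, bins with real events keep their counts and empty bins come out as 0. Splitting by user would also need one placeholder row per bin and user, which this sketch does not cover.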

0 Karma

Could you please elaborate on your solution a bit? I am facing a similar issue where _time is discontinuous and MLTK throws an error when I try to fit or apply a model. TIA. FYI, I am quite new to Splunk but learning things fast.

0 Karma

SplunkTrust

I've come up with a much better solution since posting this reply. Ask a new question and I will give you the code.

0 Karma

Thanks. I used timechart to fix my issue for now. Please let me know if your solution is different, and I will start a new thread.

0 Karma

SplunkTrust

Any update on if this helped?

0 Karma

Path Finder

Your answer helped me understand why it does not work.
Someone else suggested this solution in one of the replies:

| timechart limit=0 span=5m count by user
| fillnull
| untable _time, user, count

0 Karma

SplunkTrust

@jorjiana88, try the timechart command.

sourcetype="device" user="*"
| timechart count by user useother=f limit=0
____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

Path Finder

It is not an option to use timechart because it changes how the result is displayed and I cannot later apply some machine learning algorithm after timechart. I really need to use stats.

0 Karma

SplunkTrust

@jorjiana88, is it one of the built-in Machine Learning Toolkit algorithms, or are you trying to create your own?

Can you please name the algorithm you are trying to use? Output-wise, the timechart command above generates the same fields as the stats command in your query, so I don't see how the algorithm would pick up the two differently.

0 Karma