Splunk Search

How to fill empty or missing data values based on approximations of previous values?

Engager

Hello All,

I spent a lot of time trying to figure out how to fill out missing data with approximations based on the previous values:

alt text

The problem I have is that my home automation system doesn't fire events at regular intervals, it might be because the value doesn't change (so the sensor doesn't use battery to re-send the data) or that the sensor is disconnected for a while.

I would like to create charts that looks complete, for that I would like to be able to determine the missing value based on the last one. The end result would be something like a straight line between the empty parts of the chart.

I've experimented with fillnull, filldown, streamstats to get previous values, but I can't seem to get the chart I'm looking for.
Could you help me figure out how to do this?

Cheers,
Benoit

0 Karma
1 Solution

SplunkTrust
SplunkTrust

For empty parts of a line chart, there's an option on the chart itself.

See this part of the docs, but basically get your chart displaying, then click the chart formatting button, select the data formatting and there's an option to "Connect" them together (there are also options to make them go to zero or to just leave gaps).

As an aside, in addition to the commands you looked into you may also want to investigate makecontinuous, which given a span like span=15m can fill in gaps, sort of, on tables (e.g. the data).

If you need more help, please post a sample of the events. And your karma isn't high enough yet to post links (though you can paste in a URL, just don't try to make it into a clickable link because that won't work) or screenshots, but you can put an image on imgur and paste in a link to it (NOT as a link, again, just as text). And don't worry, soon you'll have enough Karma to do both!

View solution in original post

New Member

In addition, When you series are missing some values, example, with given result from :

_time                  A    B
2020-04-20 07:15:00 1   1
2020-04-20 07:20:00     2
2020-04-20 07:25:00     3
2020-04-20 07:30:00 5   6
2020-04-20 07:35:00 7   
2020-04-20 07:40:00 10   11

The option to "Connect" does not work as expected and connect when using the stack display mode with an aera chart (if not stack, the connect option works like a charm).
stackvsnotstackeddisplay
You can use streamstats to progapate the last value :

somesearch | streamstats last(*) as *

That would transform your results to :

_time                  A    B
2020-04-20 07:15:00 1   1
2020-04-20 07:20:00 1   2
2020-04-20 07:25:00 1   3
2020-04-20 07:30:00 5   6
2020-04-20 07:35:00 7   6
2020-04-20 07:40:00 10   11

Hopping it will help someone.

0 Karma

Revered Legend

Give this a try

index="homeassistant" entity_id=flower_4_conductivity | timechart avg(value) as avg_conductivity | eval avg_conductivity=if(avg_conductivity=0,null(),avg_conductivity) | filldown avg_conductivity

Engager

This solution works pretty well and allows me to fill the missing data.

The problem that I see is that the previous value is kept until a new one is available, for a large gap such as the one I have above, a smoother slope seems closer to reality.

The if() statement is very interesting though, I will keep the snippet for further use, thanks for your help!

0 Karma

SplunkTrust
SplunkTrust

For empty parts of a line chart, there's an option on the chart itself.

See this part of the docs, but basically get your chart displaying, then click the chart formatting button, select the data formatting and there's an option to "Connect" them together (there are also options to make them go to zero or to just leave gaps).

As an aside, in addition to the commands you looked into you may also want to investigate makecontinuous, which given a span like span=15m can fill in gaps, sort of, on tables (e.g. the data).

If you need more help, please post a sample of the events. And your karma isn't high enough yet to post links (though you can paste in a URL, just don't try to make it into a clickable link because that won't work) or screenshots, but you can put an image on imgur and paste in a link to it (NOT as a link, again, just as text). And don't worry, soon you'll have enough Karma to do both!

View solution in original post

Engager

This answers my questions perfectly, by allowing me to tune the display without adding complexity the query, thanks!

Concerning makecontinuous, I'm struggling to find how to create the query, it seems like it should be close to this?
index="homeassistant" entity_id=flower_4_conductivity | makecontinuous _time span=1h | timechart avg(value)

0 Karma

SplunkTrust
SplunkTrust

So, what make continuous will do is if you have a scenario where you get rid of enough events, like
My search | bin _time span=1h | stats count | where count>30
You could have have stripped out entire hours where the count was too low. If that is the case you could do makecontinuous to put the missing hours back in with nulls. Just another option.

For more complex needs, somesoni2 has a good answer, and there are others too.

I am not sure there is any way to smoothly fill the between spots. The connect option is likely to be the least unlikely to be goofy, but it will be a straight line. If you had a spreadsheet with this data, how would you go about making a better line? Maybe with info we can at least make a stab at it. I can think of some things to do in a very tightly controlled data set but none that would handle arbitrary gaps. Sorry there isn't a better answer for that.

State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!