Splunk Search

How to replace all null values between two dates (min and max) / Replace null / "0" values in timechart, GantChart, CoverageChart, CoverageDashboard?

Hi all,

a simple question, to replace all null values between two dates (min and max) in a timechart with custom values to build some kind of "GantChart" or "CoverageGraph" you may have some ideas.

Find hereby the statement, even with streamstats, timechart, stats it seems to be tricky.
The subsearches which joins on source gives me the start/end date while "source_time" shall reply with a boolean(1/0) or count to colorize the timechart.

It works but for days without events the timechart table fields are still "0" also if streamstats position is changed incl. reverse etc.

The table at the end is to check the result only, it could be replaced by

timechart limit=0 span=1d first(runnum) by my_int_system
timechart limit=0 span=1d count(runnum) by my_int_system

Search:

index=index_x | 
    bucket _time span=1d | 
    makecontinuous _time span=1d | 
    fillnull _time | 
    rex field=source ".*/(?<my_int>\d+)_(?<my_max>\d+)_.*_(?<my_system>\w+).txt" | 
    eval my_max=strftime(strptime(my_max,"%Y%m%d"),"%Y/%m/%d") | 
    eval my_int_system = my_int + " " + my_system  | 
    streamstats count as runnum  | 
        join source [ search source=*.txt | 
            stats min(_time) as my_min, values(my_max) as my_max by source | 
            eval my_min=strftime(my_min,"%Y/%m/%d") | 
            eval my_max=strftime(strptime(my_max,"%Y%m%d"),"%Y/%m/%d") ] | 
    eval source_time=if(_time>=strftime(strptime(my_min, "%Y/%m/%d"),"%s") AND _time<=strftime(strptime(my_max, "%Y/%m/%d"),"%s"),"1","0") | 
    dedup _time,my_int,my_system | table _time,my_int,my_system,source_time,my_min,my_max,source

Comments are welcome.

The following data can act as sample data imported from *.txt file:

for file 1 20130114 ("%Y/%m/%d") is the min
for file 1 20130925 ("%Y/%m/%d") is the max

for file 2 20130302 ("%Y/%m/%d") is the min
for file 2 20130601 ("%Y/%m/%d") is the max

The join works and extracts min and max, due to the fact that some dates are missed in the input some values are "0" in the timechart which cause gaps, overwrite with statements or filling the values wasn't successful until now.

file1: 0001_20130925_foo_sys1.txt
content:
14.01.2013  evt1
15.01.2013  evt2
24.01.2013  evt10
26.01.2013  evt11
27.01.2013  evt12
28.01.2013  evt13
29.01.2013  evt14
30.01.2013  evt15
31.01.2013  evt16
01.02.2013  evt17
03.02.2013  evt18
04.02.2013  evt19
05.02.2013  evt20
06.02.2013  evt21
07.02.2013  evt22
09.02.2013  evt23
11.02.2013  evt24
13.02.2013  evt25
14.02.2013  evt26
15.02.2013  evt27
16.02.2013  evt28
17.02.2013  evt29
18.02.2013  evt30
19.02.2013  evt31
20.02.2013  evt32
21.02.2013  evt33
22.02.2013  evt34
23.02.2013  evt35

file2: 0002_20130601_bar_sys2.txt
content:
02.03.2013  evt2_2
03.03.2013  evt2_3
04.03.2013  evt2_4
05.03.2013  evt2_5
06.03.2013  evt2_6
07.03.2013  evt2_7
08.03.2013  evt2_8
09.03.2013  evt2_9
10.03.2013  evt2_10
11.03.2013  evt2_11
17.03.2013  evt2_12
18.03.2013  evt2_13
19.03.2013  evt2_14
20.03.2013  evt2_15
21.03.2013  evt2_16
27.03.2013  evt2_17
28.03.2013  evt2_18
31.03.2013  evt2_19
01.04.2013  evt2_20
02.04.2013  evt2_21
03.04.2013  evt2_22
04.04.2013  evt2_23
0 Karma
1 Solution

Influencer

That was bonkers hard to figure out. I took out some of the joins to make this clearer.

* | rex field=source ".*/(?<my_int>\d+)_(?<my_max>\d+)_.*_(?<my_system>\w+).txt"
  | eval intsys = my_int."_".my_system

The next 2 lines set up a running count over continuous days

  | timechart limit=0 span=1d count by intsys
  | untable _time intsys value 

Now we figure out the cumulative totals by day

  | streamstats sum(value) as total by intsys

Here's the tricky bit. For every given int_sys, find the range of time the series is valid

  | eventstats max(eval(if(value > 0,_time,NULL))) as mxtime min(eval(if(value > 0,_time,NULL))) as mntime by intsys 

Discard values outside of the valid range

  | eval total=if(_time < mntime or _time > mxtime,NULL,total)

Then plot it

  | xyseries _time,intsys,total 
  | makecontinuous _time

Before : ( reversed the direction, and reset the count for each integer/system )
alt text

After:
alt text

View solution in original post

Influencer

That was bonkers hard to figure out. I took out some of the joins to make this clearer.

* | rex field=source ".*/(?<my_int>\d+)_(?<my_max>\d+)_.*_(?<my_system>\w+).txt"
  | eval intsys = my_int."_".my_system

The next 2 lines set up a running count over continuous days

  | timechart limit=0 span=1d count by intsys
  | untable _time intsys value 

Now we figure out the cumulative totals by day

  | streamstats sum(value) as total by intsys

Here's the tricky bit. For every given int_sys, find the range of time the series is valid

  | eventstats max(eval(if(value > 0,_time,NULL))) as mxtime min(eval(if(value > 0,_time,NULL))) as mntime by intsys 

Discard values outside of the valid range

  | eval total=if(_time < mntime or _time > mxtime,NULL,total)

Then plot it

  | xyseries _time,intsys,total 
  | makecontinuous _time

Before : ( reversed the direction, and reset the count for each integer/system )
alt text

After:
alt text

View solution in original post

Thank you very much for the "tricky bit" 😉

0 Karma

just added

0 Karma

Influencer

Any chance you can post some sample data that exposes the problem ?

0 Karma
State of Splunk Careers

Access the Splunk Careers Report to see real data that shows how Splunk mastery increases your value and job satisfaction.

Find out what your skills are worth!