I need to filter different error values for a range of different instruments. To do this, I have created a macro and lookup that uses the host-field and the name of the measurement field to determine if the value should be removed or not. This part of the function works well, but in some cases, we also need to correct the measurements in different time periods due to calibrations etc. For those cases, I have created columns in the lookup named "case_<number>" which contains time_start, time_stop, value_to_remove, adjustment_value.
An example would be host=extensometer_001 with a distance_mm-field where we need to correct the following measurements:
- Remove -222 between unix time 1644546240 and 1644586240
- Adjust +30 for measurements between unix time 1641546240 and 1644566200
The case-columns in the lookup would then look like this:
case_001=1644546240,1644586240,-222
case_002=1641546240,1644566200,,30
To be able to handle zero or more cases per host and field, I use foreach in the following way:
| foreach "case_*"
[| makemv delim="," <<FIELD>>
| eval distance_mm= if(_time > mvindex(<<FIELD>>,0) AND _time < mvindex(<<FIELD>>,1) AND like(distance_mm,mvindex(<<FIELD>>,2)), "NULL", distance_mm)
| eval distance_mm= if(_time > mvindex(<<FIELD>>,0) AND _time < mvindex(<<FIELD>>,1) AND mvindex(<<FIELD>>,3)!="",distance_mm+tonumber(mvindex(<<FIELD>>,3)),distance_mm) ]
The problem is that when I looka at "All time" and graph distance_mm with timechart at the end of the search, I end up seeing empty buckets all the way back to the first event indexed in the index (even if the data in my search is not that old). If I remove the foreach section, the problem goes away. I cannot see what is happening that makes timechart show this period without data. The interesting thing is that if look at the data in "Statistics" view right before the timechart then it only shows the time period with data. It is only when the timechart command is run that the empty buckets appear.
Image of results with foreach:
Image of results without foreach:
Does anyone know what is going wrong here or how in the worst case to get around it? (I could use cont=false to make Splunk zoom into the area where there is data, but then I would not be able to choose "show gaps" where data is missing which is a requirement from the client.)
Full search:
| tstats summariesonly=false allow_old_summaries=false avg("Extensometer.distance_mm") as distance_mm FROM datamodel=Extensometer WHERE sourcetype="EXT" BY host, sourcetype, _time span=60min
| eval field="distance_mm"
| lookup error-filtering.csv instrument as host field as field
| foreach "case_*"
[| makemv delim="," <<FIELD>>
| eval distance_mm= if(_time > mvindex(<<FIELD>>,0) AND _time < mvindex(<<FIELD>>,1) AND like(distance_mm,mvindex(<<FIELD>>,2)), "NULL", distance_mm)
| eval distance_mm= if(_time > mvindex(<<FIELD>>,0) AND _time < mvindex(<<FIELD>>,1) AND mvindex(<<FIELD>>,3)!="",distance_mm+tonumber(mvindex(<<FIELD>>,3)),distance_mm) ]
| streamstats window=2 earliest(distance_mm) as earliest_distance_mm latest(distance_mm) as latest_distance_mm by host
| eval change_distance_mm=(latest_distance_mm - earliest_distance_mm)
| streamstats sum(change_distance_mm) as acc_change_distance_mm by host
| timechart span=1w limit=0 eval(round(avg(acc_change_distance_mm),2)) as distance_mm by host
The timechart command uses the timeframe from the search (earliest to latest) and creates rows for all timepoints between - use bin and chart by _time instead
| bin _time span=1w
| chart limit=0 eval(round(avg(acc_change_distance_mm),2)) as distance_mm by _time host