tl;dr : Need to manipulate rows / cols of a table in a specific way to avoid using subsearch, can't figure out how. Scroll to bottom and look at tables.
I'm working with some sensor data and I'd like to "shift" _time for some events, so that they are correctly correlated with an appropriate ItemID. I have previously done this with a subsearch (which I'll include below) but I'm trying to get rid of the subsearch because of it's limitations.
Here are the fields present in the data:
_time FieldName Value
There's a field called ItemID which identifies when the item is entering the processing queue. The queue runs at Speed. Some amount of time later, based on Speed and a fixed distance, the item of interest passes under the Sensors, (up to 100s). The raw data does not include the _time that an item is under these sensors, but it can be calculated with an equation from the Speed and the distance.
Here's the subsearch spl:
index=* Speed ItemID
| timechart span=1m avg(Measurement) by FieldName
| filldown
| timeshift=(##/Speed)*60
| append [ search SensorA | timechart avg(Measurement) span=1m | filldown | table _time Measurement FieldName ]
| sort _time
| filldown timeshift
| eval _time=if(FieldName==SensorA,_time-timeshift,_time)
| sort _time
| filldown Measurement
| rename Measurement as SensorA
| stats avg(SensorA) by ItemID
This search works because you have two sets of _time, one you're shifting by subtracting the shift (SensorA) and one you're not. However, when you get a large number of sensors over a longer period of time, the limitations of subsearch come into play.
Here's the new search, which eliminates subsearch:
index=* ItemID OR Speed OR Sensor*
| bin _time span=1m
| stats avg(Measurement) as Measurement by _time, FieldName
| table *
| xyseries _time FieldName Measurement
| filldown *
| eval timeshift=(##/Speed)*60
This search produces a table with records that look like this:
_time ItemID timeshift SensorA SensorB SensorC
8:00 9385723423 3333333 33.3 33.3 33.3
What I'd like to do is create a table like this:
_time FieldName Measurement timeshift
8:00 ItemID 9385723423 3333333
8:00 SensorA 33.3 3333333
8:00 SensorB 33.3 3333333
8:00 SensorC 33.3 3333333
8:00 SensorD 33.3 3333333
I can then manipulate this table to timeshift only the sensor data, xyseries everything to put it back together, and use stats against the results.
Any advice? I've looked at untable docs & examples but I think it only handles 3 fields, and I need timeshift to stay on every sensor record.
Give this a try
index=* ItemID OR Speed OR Sensor*
| bin _time span=1m
| stats avg(Measurement) as Measurement by _time, FieldName
| table *
| xyseries _time FieldName Measurement
| filldown *
| eval timeshift=(##/Speed)*60
| eval temp=_time."#".timeshift
| fields - _time timeshift
| untable temp FieldName Measurement
| eval _time=mvindex(split(temp,"#"),0)
| eval timeshift=mvindex(split(temp,"#"),1) | fields - temp
Give this a try
index=* ItemID OR Speed OR Sensor*
| bin _time span=1m
| stats avg(Measurement) as Measurement by _time, FieldName
| table *
| xyseries _time FieldName Measurement
| filldown *
| eval timeshift=(##/Speed)*60
| eval temp=_time."#".timeshift
| fields - _time timeshift
| untable temp FieldName Measurement
| eval _time=mvindex(split(temp,"#"),0)
| eval timeshift=mvindex(split(temp,"#"),1) | fields - temp
This solution works great! It appears to run quickly and efficiently, and doesn't introduce any limitations based on the commands used.
Many thanks! Folks like you make the splunk community great. 🙂
Another option I realized is to use timechart avg(*) as * span=30s to create "empty" buckets in between the 1 min buckets which were previously created. Timechart also imposes limitations though, so it's not as flexible as your solution.