Splunk Search

Manipulating Table Rows of Sensor Data to Shift _time

Communicator

tl;dr : Need to manipulate rows / cols of a table in a specific way to avoid using subsearch, can't figure out how. Scroll to bottom and look at tables.

I'm working with some sensor data and I'd like to "shift" _time for some events, so that they are correctly correlated with an appropriate ItemID. I have previously done this with a subsearch (which I'll include below) but I'm trying to get rid of the subsearch because of it's limitations.

Here are the fields present in the data:
_time FieldName Value

There's a field called ItemID which identifies when the item is entering the processing queue. The queue runs at Speed. Some amount of time later, based on Speed and a fixed distance, the item of interest passes under the Sensors, (up to 100s). The raw data does not include the _time that an item is under these sensors, but it can be calculated with an equation from the Speed and the distance.

Here's the subsearch spl:

index=* Speed ItemID 
| timechart span=1m avg(Measurement) by FieldName 
| filldown 
| timeshift=(##/Speed)*60 
| append [ search SensorA | timechart avg(Measurement) span=1m | filldown | table _time Measurement FieldName ] 
| sort _time 
| filldown timeshift 
| eval _time=if(FieldName==SensorA,_time-timeshift,_time)
| sort _time 
| filldown Measurement 
| rename Measurement as SensorA 
| stats avg(SensorA) by ItemID

This search works because you have two sets of _time, one you're shifting by subtracting the shift (SensorA) and one you're not. However, when you get a large number of sensors over a longer period of time, the limitations of subsearch come into play.

Here's the new search, which eliminates subsearch:

index=* ItemID OR Speed OR Sensor* 
| bin _time span=1m 
| stats avg(Measurement) as Measurement by _time, FieldName
| table * 
| xyseries _time FieldName Measurement 
| filldown * 
| eval timeshift=(##/Speed)*60

This search produces a table with records that look like this:

_time   ItemID               timeshift   SensorA   SensorB   SensorC
8:00     9385723423          3333333     33.3      33.3      33.3 

What I'd like to do is create a table like this:

_time       FieldName           Measurement             timeshift
8:00         ItemID                9385723423             3333333
8:00         SensorA                33.3                  3333333
8:00         SensorB                33.3                  3333333
8:00         SensorC                33.3                  3333333
8:00         SensorD                33.3                  3333333

I can then manipulate this table to timeshift only the sensor data, xyseries everything to put it back together, and use stats against the results.

Any advice? I've looked at untable docs & examples but I think it only handles 3 fields, and I need timeshift to stay on every sensor record.

0 Karma
1 Solution

Revered Legend

Give this a try

index=* ItemID OR Speed OR Sensor* 
 | bin _time span=1m 
 | stats avg(Measurement) as Measurement by _time, FieldName
 | table * 
 | xyseries _time FieldName Measurement 
 | filldown * 
 | eval timeshift=(##/Speed)*60
 | eval temp=_time."#".timeshift
 | fields - _time timeshift
 | untable temp FieldName Measurement
 | eval _time=mvindex(split(temp,"#"),0)
 | eval timeshift=mvindex(split(temp,"#"),1) | fields - temp

View solution in original post

Revered Legend

Give this a try

index=* ItemID OR Speed OR Sensor* 
 | bin _time span=1m 
 | stats avg(Measurement) as Measurement by _time, FieldName
 | table * 
 | xyseries _time FieldName Measurement 
 | filldown * 
 | eval timeshift=(##/Speed)*60
 | eval temp=_time."#".timeshift
 | fields - _time timeshift
 | untable temp FieldName Measurement
 | eval _time=mvindex(split(temp,"#"),0)
 | eval timeshift=mvindex(split(temp,"#"),1) | fields - temp

View solution in original post

Communicator

This solution works great! It appears to run quickly and efficiently, and doesn't introduce any limitations based on the commands used.

Many thanks! Folks like you make the splunk community great. 🙂

Another option I realized is to use timechart avg(*) as * span=30s to create "empty" buckets in between the 1 min buckets which were previously created. Timechart also imposes limitations though, so it's not as flexible as your solution.

0 Karma
Don’t Miss Global Splunk
User Groups Week!

Free LIVE events worldwide 2/8-2/12
Connect, learn, and collect rad prizes
and swag!