I have data where each row contains a timestamp and a set of fieldname=fieldvalue entries. I want to convert selected fieldnames from each row into multiple rows each with timestamp, category=fieldname, value=fieldvalue.
Source data
2014-05-19 10:00 first_field=1 second_field=1 third_field=2 other=5
2014-05-19 10:10 first_field=2 second_field=1 third_field=3 ignore=foo
If I want first_field, second_field, and third_value, I want the output to be something like
2014-05-19 10:00 category=first_field value=1
2014-05-19 10:00 category=second_field value=1
2014-05-19 10:00 category=third_field value=2
2014-05-19 10:10 category=first_field value=2
2014-05-19 10:10 category=second_field value=1
2014-05-19 10:10 category=third_field value=3
I want to run the final data through | timechart avg(value) by category (or other stats function) to look at multiple categories at once. Data will be "clean" in that ranges for different categories extracted at once will be compatible.
I know I've seen this done, but can't find the right keywords to search on to (re-)discover the examples.
assuming your initial results have fields that are _time first_field second_field third_field
then you want
| untable _time category count | rename count as value
and that will unwind the results you have into the results that have fields of _time category value
, just like you need.
The untable command is basically the inverse of the xyseries command. To really understand these two commands it helps to play around a little with the stats command vs the chart command. Take these two searches:
index=_internal group=per_sourcetype_thruput | stats count by date_hour series
vs
index=_internal group=per_sourcetype_thruput | chart count over date_hour by series
They look very similar and in fact they present the same information but in very different formats. Run them both and think about them.
Now, you can turn the first resultset into the second by appending | xyseries date_hour series count
, and you can turn the second resultset into the first by appending | untable date_hour series count
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Xyseries
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Untable
assuming your initial results have fields that are _time first_field second_field third_field
then you want
| untable _time category count | rename count as value
and that will unwind the results you have into the results that have fields of _time category value
, just like you need.
The untable command is basically the inverse of the xyseries command. To really understand these two commands it helps to play around a little with the stats command vs the chart command. Take these two searches:
index=_internal group=per_sourcetype_thruput | stats count by date_hour series
vs
index=_internal group=per_sourcetype_thruput | chart count over date_hour by series
They look very similar and in fact they present the same information but in very different formats. Run them both and think about them.
Now, you can turn the first resultset into the second by appending | xyseries date_hour series count
, and you can turn the second resultset into the first by appending | untable date_hour series count
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Xyseries
http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Untable
That was what I needed. As an example (to make sure I could do what I want) I offer the following which displays min, avg, max as three lines on a graph:
index=_internal group=per_sourcetype_thruput series=access_combined
| bucket _time span=10m
| stats min(kbps) as min avg(kbps) as avg max(kbps) as max by _time
| untable _time category count | rename count as value
| timechart span=10m avg(value) by category
With appropriate use of fields/table/stats and rename before the untable/rename, I can choose exactly what to show in the by split of the timechart.