I have a data set that looks like this:
Name, Month, Year, Data1, Data2, Data3, Data4, Data[x]
Steve, 2,2015, 1,1,1,0,0
Where the Month and Year field are essentially the start date and that date's value is Data1 and the following Data[x] fields are the next month sequentially.
Is there a Splunk command that will create the following for me. Other tools call this a "crosstab" or "crosstable". It's basically the opposite of a transpose.
Name,Month, Year, Value
Steve, 2, 2015, 1
Steve, 3, 2015, 1
Steve, 4, 2015, 1
Steve, 5, 2015, 0
Steve, 6, 2015, 0
This should do the trick
Your current search | veal temp=Name."#".Month."#".Year | fields - Name Month Year | untable temp data Value | rex field=temp "(?<Name>.*)#(?<Month>.*)#(?<Year>.*)" | table Name Month Year Value
Nice! I learned some new tricks from your search string, thanks.
I like your answer better than mine in several ways; for one, it's more concise.
On my system, with the "test" event I described in my answer, the following search:
sourcetype="steverimar" | eval temp=Name."#".Month."#".Year | fields - Name Month Year | untable temp data Value | rex field=temp "(?<Name>.*)#(?<Month>.*)#(?<Year>.*)" | table Name Month Year Value
produces the following results (as displayed on the Statistics tab):
Name Month Year Value
Steve 2 2015 1
Steve 2 2015 2
Steve 2 2015 3
Steve 2 2015 4
Steve 2 2015 5
Steve 2 2015 localhost:8088
Steve 2 2015 test
Steve 2 2015 1
Steve 2 2015 {"":_"",_"":_,_"":_,_"":_,_"":_,_"":_,_"":_,_"":_}
Steve 2 2015 http:GXH
Steve 2 2015 steverimar
Steve 2 2015 MYPC
(I wish this forum supported table formatting. Does it? I've tried GitHub-flavored Markdown table formatting: nope. And HTML table, tr, and td tags get "sanitized" away.)
Some nits:
The results include not just the desired Data1
, Data2
... values, but also the values of the following fields: host
, index
, linecount
, punct
, source
, sourcetype
, splunk_server
. (Or is this my problem; Splunk behavior that I can change on my system?)
@somesoni2,
The following search - your answer, slightly tweaked - produces the desired result (on my system):
sourcetype="steverimar" | eval temp=Name."#".Month."#".Year | fields temp Data* | untable temp data Value | rex field=temp "(?<Name>.*)#(?<Month>.*)#(?<Year>.*)" | eval Month=Month+tonumber(substr(data, 5))-1 | table Name Month Year Value
To exclude the unwanted rows, I replaced your original:
fields - Name Month Year
with:
fields temp Data*
To increment the Month values, I inserted:
eval Month=Month+tonumber(substr(data, 5))-1
One thing I liked about your answer is that it didn't rely on the field names being Data1
, Data2
...
But my tweaks rely on those names. Can you offer alternative tweaks to fix the nits without introducing this dependency?
I second the request from @somesoni2:
Would you be able to provide the actual field names from your current result?
The following solution is based on the specific field names that you cited: Data1
, Data2
...
This solution is deliberately designed to work for a variable number of Data fields.
I started with the following input data, in JSON format:
{
"Name": "Steve",
"Month": 2,
"Year": 2015,
"Data1": 1,
"Data2": 2,
"Data3": 3,
"Data4": 4,
"Data5": 5
}
For testing purposes, I deliberately specified unique values for each Data field.
The following search string (with line breaks inserted for readability) generates the output that you specified (I ingested the input data as sourcetype "steverimar"):
sourcetype=steverimar |
eval data_column_name="" |
foreach Data* [eval data_column_name=data_column_name + " <<FIELD>>"] |
makemv data_column_name |
mvexpand data_column_name |
eval Value=spath(_raw, data_column_name) |
eval month_offset=tonumber(substr(data_column_name, 5)) - 1 |
eval Month=Month+month_offset |
table Name, Month, Year, Value
Line-by-line explanation:
sourcetype=steverimar
Get the matching events.
eval data_column_name=""
Create a new field named data_column_name
, and set its value to an empty string.
foreach
...
For each field whose name starts with "Data", append the name of that field to the value of the data_column_name
field.
makemv
...
Convert data_column_name
into a multivalue field.
mvexpand
...
Expand the values of data_column_name
into separate events, one event for each value.
eval Value
...
Create a new field named Value
, and set its value to the value of the field whose name matches the value of the data_column_name
field. For example, if the value of the data_column_name
field is "Data1", then set the value of the Value
field to the value of the Data1
field ("1").
eval month_offset
...
Create a new field name month_offset
, and set its value to the numeric suffix of the value of the data_column_name
field, minus 1.
eval Month
...
Add the month_offset
to the value of the Month
field. (You could merge this step with the previous step. I kept them separate here for clarity.)
table
...
Transform the search results (for display in the Statistics tab), limited to the fields that we want in our final results.
The technique used in steps 2 to 6 is described in the Splunk docs topic "Build a chart of multiple data series".
Here's a screenshot of the results in the Statistics tab of Splunk Web (Splunk Enterprise 6.4):
It occurs to me that this solution might not be exactly what you need, because your actual column names might not be Data1
, Data2
, etc., but this solution matches the column names that you specified in your question.
I have not made the search string "smart" enough to increment the year if the month number is greater than 12. That would be easy to do.
I would welcome alternatives to:
eval Value=spath(_raw, data_column_name)
That is:
data_column_name
) contains the name of another field.Value
).Would you be able to provide the actual field names from your current result?
|transpose
Transpose isn't what I'm looking for. I'm looking to cross table or do the opposite essentially.