Splunk Search
Highlighted

How to extract multivalue pipe separated subfields (with header) from a field?

Builder

I have a data source from DBX that has a field called "description" that contains a pipe separated format with headers I would like to extract as a multivalue field (using the existing headers as field names). I would like to do it without config files if it's all possible in SPL. Example Event below.

Format from DBX output

Number,Type,Name,Description
10001,type1,name1,| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |

I would like to end up with 4 multivalue fields called (subheader1,subheader2,subheader3,subheader4) and contain all the corresponding row values from the pipe seperated format.

Is this possible?

Tags (2)
0 Karma
Highlighted

Re: How to extract multivalue pipe separated subfields (with header) from a field?

SplunkTrust
SplunkTrust

Sure, it can be done. Do the subheader field names sometimes have spaces in them, or not? Are the values purely numeric, or alpha, or do they occasionally contain spaces themselves? Is there only one set of values allowed, or could it be a table? Is it always a fixed number of fields in the description, or could it vary?

0 Karma
Highlighted

Re: How to extract multivalue pipe separated subfields (with header) from a field?

Builder

Sub header field names can have spaces and are alphanumeric. The values contain alpha numeric and special characters including spaces. It is always 4 total fields but the number of rows in the table is variable.

0 Karma
Highlighted

Re: How to extract multivalue pipe separated subfields (with header) from a field?

SplunkTrust
SplunkTrust

Here's one example of how you can convert the piped values at search time to a single multivalue field with a format more like this...

subheader1=value1
subheader2=value2
subheader3=value3
subheader 4=value4

From there it can be rectified into different fields, but I need the answers to my questions above in order to give you the most appropriate solution.

This run-anywhere code is probably overcomplicated, and only takes you halfway to your goal, but it should give you some ideas of what is possible.

| makeresults | eval number=1001,Type="type1",Name="name1",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |" 
| append [| makeresults | eval number=1002,Type="type2",Name="name2",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value11 | value12 | value13 | value14 |" ]
| streamstats count as grouprecordnumber
| eval flag="keepme"
| appendpipe [
| table grouprecordnumber Description | rex mode=sed field=Description "s/ \|\| / |!!!!| /g" 
| rex mode=sed field=Description "s/\|/||/g"
| makemv delim="!!!!" Description
| mvexpand Description
| streamstats count as reccount by grouprecordnumber
| where reccount!=2
| rex field=Description max_match=5 "\| (?<myfield>[^|]*) "
| eval fieldname = "myfield".reccount
| eval {fieldname} = myfield
| stats list(myfield1) as myfield1 list(myfield3) as myfield3 by grouprecordnumber
| eval myfield=mvzip(myfield1,myfield3,"=")
| eval flag="dumpme"
]
| eventstats list(myfield) as myfield by grouprecordnumber
| where flag!="dumpme"
0 Karma