How to extract multivalue pipe separated subfields...

jedatt01 · ‎03-29-2017

I have a data source from DBX that has a field called "description" that contains a pipe separated format with headers I would like to extract as a multivalue field (using the existing headers as field names). I would like to do it without config files if it's all possible in SPL. Example Event below.

Format from DBX output

Number,Type,Name,Description
10001,type1,name1,| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |

I would like to end up with 4 multivalue fields called (subheader1,subheader2,subheader3,subheader4) and contain all the corresponding row values from the pipe seperated format.

Is this possible?

DalJeanis · ‎03-30-2017

Here's one example of how you can convert the piped values at search time to a single multivalue field with a format more like this...

subheader1=value1
subheader2=value2
subheader3=value3
subheader 4=value4

From there it can be rectified into different fields, but I need the answers to my questions above in order to give you the most appropriate solution.

This run-anywhere code is probably overcomplicated, and only takes you halfway to your goal, but it should give you some ideas of what is possible.

| makeresults | eval number=1001,Type="type1",Name="name1",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |" 
| append [| makeresults | eval number=1002,Type="type2",Name="name2",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value11 | value12 | value13 | value14 |" ]
| streamstats count as grouprecordnumber
| eval flag="keepme"
| appendpipe [
| table grouprecordnumber Description | rex mode=sed field=Description "s/ \|\| / |!!!!| /g" 
| rex mode=sed field=Description "s/\|/||/g"
| makemv delim="!!!!" Description
| mvexpand Description
| streamstats count as reccount by grouprecordnumber
| where reccount!=2
| rex field=Description max_match=5 "\| (?<myfield>[^|]*) "
| eval fieldname = "myfield".reccount
| eval {fieldname} = myfield
| stats list(myfield1) as myfield1 list(myfield3) as myfield3 by grouprecordnumber
| eval myfield=mvzip(myfield1,myfield3,"=")
| eval flag="dumpme"
]
| eventstats list(myfield) as myfield by grouprecordnumber
| where flag!="dumpme"

DalJeanis · ‎03-30-2017

Sure, it can be done. Do the subheader field names sometimes have spaces in them, or not? Are the values purely numeric, or alpha, or do they occasionally contain spaces themselves? Is there only one set of values allowed, or could it be a table? Is it always a fixed number of fields in the description, or could it vary?

jedatt01 · ‎04-02-2017

Sub header field names can have spaces and are alphanumeric. The values contain alpha numeric and special characters including spaces. It is always 4 total fields but the number of rows in the table is variable.

How to extract multivalue pipe separated subfields (with header) from a field?

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Splunk App Dev Quarterly Roundup: AI, Agents, and Innovation!

Federated Search for Dynamic Data Self Storage Is Now Generally Available on Splunk ...

Index This | What has many keys but can’t unlock a door?

Join the Conversation