Splunk Search

How to extract multivalue pipe separated subfields (with header) from a field?

jedatt01
Builder

I have a data source from DBX that has a field called "description" that contains a pipe separated format with headers I would like to extract as a multivalue field (using the existing headers as field names). I would like to do it without config files if it's all possible in SPL. Example Event below.

Format from DBX output

Number,Type,Name,Description
10001,type1,name1,| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |

I would like to end up with 4 multivalue fields called (subheader1,subheader2,subheader3,subheader4) and contain all the corresponding row values from the pipe seperated format.

Is this possible?

Tags (2)
0 Karma

DalJeanis
Legend

Here's one example of how you can convert the piped values at search time to a single multivalue field with a format more like this...

subheader1=value1
subheader2=value2
subheader3=value3
subheader 4=value4

From there it can be rectified into different fields, but I need the answers to my questions above in order to give you the most appropriate solution.

This run-anywhere code is probably overcomplicated, and only takes you halfway to your goal, but it should give you some ideas of what is possible.

| makeresults | eval number=1001,Type="type1",Name="name1",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value1 | value2 | value3 | value4 |" 
| append [| makeresults | eval number=1002,Type="type2",Name="name2",Description="| subheader1 | subheader2 | subheader3 | subheader 4 || --- | --- | --- | --- || value11 | value12 | value13 | value14 |" ]
| streamstats count as grouprecordnumber
| eval flag="keepme"
| appendpipe [
| table grouprecordnumber Description | rex mode=sed field=Description "s/ \|\| / |!!!!| /g" 
| rex mode=sed field=Description "s/\|/||/g"
| makemv delim="!!!!" Description
| mvexpand Description
| streamstats count as reccount by grouprecordnumber
| where reccount!=2
| rex field=Description max_match=5 "\| (?<myfield>[^|]*) "
| eval fieldname = "myfield".reccount
| eval {fieldname} = myfield
| stats list(myfield1) as myfield1 list(myfield3) as myfield3 by grouprecordnumber
| eval myfield=mvzip(myfield1,myfield3,"=")
| eval flag="dumpme"
]
| eventstats list(myfield) as myfield by grouprecordnumber
| where flag!="dumpme"
0 Karma

DalJeanis
Legend

Sure, it can be done. Do the subheader field names sometimes have spaces in them, or not? Are the values purely numeric, or alpha, or do they occasionally contain spaces themselves? Is there only one set of values allowed, or could it be a table? Is it always a fixed number of fields in the description, or could it vary?

0 Karma

jedatt01
Builder

Sub header field names can have spaces and are alphanumeric. The values contain alpha numeric and special characters including spaces. It is always 4 total fields but the number of rows in the table is variable.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Thanks for the Memories! Splunk University, .conf25, and our Community

Thank you to everyone in the Splunk Community who joined us for .conf25, which kicked off with our iconic ...

Data Persistence in the OpenTelemetry Collector

This blog post is part of an ongoing series on OpenTelemetry. What happens if the OpenTelemetry collector ...

Introducing Splunk 10.0: Smarter, Faster, and More Powerful Than Ever

Now On Demand Whether you're managing complex deployments or looking to future-proof your data ...