Splunk Search

How to extract fields from a log that is comma delimited?

rhaarmann
Engager

Ok, complex extraction. I have a log that is comma delimited, but they have key,value,key,value,key,value, etc. It's key-value structured like JSON, but in CSV format. I have figured out how to work with this in Python, but not familiar with writing splunk apps and scripts that will do a stream on a file.

I found this Splunk Answers post (http://answers.splunk.com/answers/204994/complex-kv-extraction.html ), but it doesn't completely answer my problem as I will have to hardcode each field, but all my fields are unknown and can change as developers make changes.

Log Example:

Key.Value,Timestamp,$timestamp,ms,$ts_ms,key1,value1,key2,value2,key3,value3
Key.Value,Timestamp,$timestamp,ms,$ts_ms,key1,value1,key4,value4
Key.Value,Timestamp,$timestamp,ms,$ts_ms,key2,value2,key3,value3,key5,value5,key6,value6,key4,value4
0 Karma

woodcock
Esteemed Legend

Like this:

... | rex field=_raw "(?:[^,]+,){5}(?<KVPs>.*)" | streamstats current=t count AS serial | rex max_match=0 field=KVPs "(?<KVPNAME>[^,]+,[^,]+)(?:,|$)" | mvexpand KVPNAME | rex field=KVPNAME "(?<_KEY_1>[^,]+),(?<_VAL_1>[^,]+)" | eval {_KEY_1}=_VAL_1 | stats values(_*) AS _* values(*) AS * BY serial | fields - serial

rhaarmann
Engager

This worked for the case of when the raw data is already indexed, but I would like to setup a transform that converts it to a better format then index the transformed data. I am currently testing out the solution you provided with searches and summary indexes and it is very expensive on the search heads, especially since the amount of the log data is around 25GB/day. I tried doing a transform line that contained SEDCMD, but it would not work, not sure why yet.

0 Karma
Get Updates on the Splunk Community!

Index This | What is broken 80% of the time by February?

December 2025 Edition   Hayyy Splunk Education Enthusiasts and the Eternally Curious!    We’re back with this ...

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Hello Splunk Community,   We're thrilled to share an exciting update that will help you manage your data more ...

Splunk MCP & Agentic AI: Machine Data Without Limits

Discover how the Splunk Model Context Protocol (MCP) Server can revolutionize the way your organization uses ...