Splunk Search

How to extract fields from a log that is comma delimited?

rhaarmann
Engager

Ok, complex extraction. I have a log that is comma delimited, but they have key,value,key,value,key,value, etc. It's key-value structured like JSON, but in CSV format. I have figured out how to work with this in Python, but not familiar with writing splunk apps and scripts that will do a stream on a file.

I found this Splunk Answers post (http://answers.splunk.com/answers/204994/complex-kv-extraction.html ), but it doesn't completely answer my problem as I will have to hardcode each field, but all my fields are unknown and can change as developers make changes.

Log Example:

Key.Value,Timestamp,$timestamp,ms,$ts_ms,key1,value1,key2,value2,key3,value3
Key.Value,Timestamp,$timestamp,ms,$ts_ms,key1,value1,key4,value4
Key.Value,Timestamp,$timestamp,ms,$ts_ms,key2,value2,key3,value3,key5,value5,key6,value6,key4,value4
0 Karma

woodcock
Esteemed Legend

Like this:

... | rex field=_raw "(?:[^,]+,){5}(?<KVPs>.*)" | streamstats current=t count AS serial | rex max_match=0 field=KVPs "(?<KVPNAME>[^,]+,[^,]+)(?:,|$)" | mvexpand KVPNAME | rex field=KVPNAME "(?<_KEY_1>[^,]+),(?<_VAL_1>[^,]+)" | eval {_KEY_1}=_VAL_1 | stats values(_*) AS _* values(*) AS * BY serial | fields - serial

rhaarmann
Engager

This worked for the case of when the raw data is already indexed, but I would like to setup a transform that converts it to a better format then index the transformed data. I am currently testing out the solution you provided with searches and summary indexes and it is very expensive on the search heads, especially since the amount of the log data is around 25GB/day. I tried doing a transform line that contained SEDCMD, but it would not work, not sure why yet.

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...