Splunk Enterprise

How to extract JSON data within the logs which also has unstructured data in it.

arunsoni
Explorer

I want to extract JSON data alone into key value pairs and JSON is not fixed it can extend to extra lines. Everything need to be done on indexer level and nothing on search head.

 

Sample:

 

2024-03-11T20:58:12.605Z [INFO] SessionManager sgrp:System_default swn:99999 sreq:1234567 | {"abrMode":"NA","abrProto":"HLS","event":"Create","sUrlMap":"","sc":{"Host":"x.x.x.x","OriginMedia":"HLS","URL":"/x.x.x.x/vod/Test-XXXX/XXXXX.smil/transmux/XXXXX"},"sm":{"ActiveReqs":0,"ActiveSecs":0,"AliveSecs":360,"MediaSecs":0,"SpanReqs":0,"SpanSecs":0},"swnId":"XXXXXXXX","wflow":"System_default"}
2024-03-11T20:58:12.611Z [INFO] SessionManager sgrp:System_default swn:99999 sreq:1234567 | {"abrMode":"NA","abrProto":"HLS","event":"Cache","sUrlMap":"","sc":{"Host":"x.x.x.x","OriginMedia":"HLS","URL":"/x.x.x.x/vod/Test-XXXXXX/XXXXXX.smil/transmux/XXX"},"sm":{"ActiveReqs":0,"ActiveSecs":0,"AliveSecs":0,"MediaSecs":0,"SpanReqs":0,"SpanSecs":0},"swnId":"XXXXXXXXXXXXX","wflow":"System_default"}

Labels (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

1. Whether something is done on search-head or on indexer depends on the search as a whole. The same command(s) can be performed on either of those layers depending on the rest of the search.

2. Even indexers perform search-time operations (and it's a good thing)

So I suspect you wanted to say "in index-time" instead of "on indexer". And while we're at it...

1. Usually you don't extract fields in index time (so called indexed fields) unless you have a Very Good Reason (tm) to do so. The usual Splunk approach is to extract fields in search time

2. You can't use indexed extractions with data that is not fully well-formed json/xml/csv data.

3. You can try to define regex-based index time for single fields (which in itself isn't a great idea) but you cannot parse the json structure as a whole in index time.

4. Even in search time you have to explicitly use spath command on some extracted part of the raw data. There are severa ideas regarding this aspect of Splunk functionality which you could back up on ideas.splunk.com

0 Karma

arunsoni
Explorer

@PickleRick I am looking for options on the indexer to convert the data to a structured format not on the search head

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Again - it's not _where_ it's processed. It's when and how it's processed. Things are processed in search-time on indexers.

And no, you cannot use indexed extractions on data where whole events aren't fully well-formed structured data.

0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

@arunsoni 

Can you please add the below configurations in props.conf and try?

[YOUR_SOURCETYPE]
SEDCMD-a=s/^[^{]*//g

 

Note: it will be applied to new events only.

 

Screenshot 2024-09-04 at 9.41.36 AM.png

 

I hope this will help you.

Thanks
KV
An upvote would be appreciated if any of my replies help you solve the problem or gain knowledge.

 

0 Karma

arunsoni
Explorer

@kamlesh_vaghela . I want to get full event to splunk. The below sedcmd will remove first few lines and then the remaining event is viewed as json format. I want to keep full event as it is. Is there a way we can apply props/transform in which splunk identifies both structured(json) and unstrutured formatted data.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...