I want to extract JSON data alone into key value pairs and JSON is not fixed it can extend to extra lines. Everything need to be done on indexer level and nothing on search head.
Sample:
2024-03-11T20:58:12.605Z [INFO] SessionManager sgrp:System_default swn:99999 sreq:1234567 | {"abrMode":"NA","abrProto":"HLS","event":"Create","sUrlMap":"","sc":{"Host":"x.x.x.x","OriginMedia":"HLS","URL":"/x.x.x.x/vod/Test-XXXX/XXXXX.smil/transmux/XXXXX"},"sm":{"ActiveReqs":0,"ActiveSecs":0,"AliveSecs":360,"MediaSecs":0,"SpanReqs":0,"SpanSecs":0},"swnId":"XXXXXXXX","wflow":"System_default"}
2024-03-11T20:58:12.611Z [INFO] SessionManager sgrp:System_default swn:99999 sreq:1234567 | {"abrMode":"NA","abrProto":"HLS","event":"Cache","sUrlMap":"","sc":{"Host":"x.x.x.x","OriginMedia":"HLS","URL":"/x.x.x.x/vod/Test-XXXXXX/XXXXXX.smil/transmux/XXX"},"sm":{"ActiveReqs":0,"ActiveSecs":0,"AliveSecs":0,"MediaSecs":0,"SpanReqs":0,"SpanSecs":0},"swnId":"XXXXXXXXXXXXX","wflow":"System_default"}
1. Whether something is done on search-head or on indexer depends on the search as a whole. The same command(s) can be performed on either of those layers depending on the rest of the search.
2. Even indexers perform search-time operations (and it's a good thing)
So I suspect you wanted to say "in index-time" instead of "on indexer". And while we're at it...
1. Usually you don't extract fields in index time (so called indexed fields) unless you have a Very Good Reason (tm) to do so. The usual Splunk approach is to extract fields in search time
2. You can't use indexed extractions with data that is not fully well-formed json/xml/csv data.
3. You can try to define regex-based index time for single fields (which in itself isn't a great idea) but you cannot parse the json structure as a whole in index time.
4. Even in search time you have to explicitly use spath command on some extracted part of the raw data. There are severa ideas regarding this aspect of Splunk functionality which you could back up on ideas.splunk.com
@PickleRick I am looking for options on the indexer to convert the data to a structured format not on the search head
Again - it's not _where_ it's processed. It's when and how it's processed. Things are processed in search-time on indexers.
And no, you cannot use indexed extractions on data where whole events aren't fully well-formed structured data.
Can you please add the below configurations in props.conf and try?
[YOUR_SOURCETYPE]
SEDCMD-a=s/^[^{]*//g
Note: it will be applied to new events only.
I hope this will help you.
Thanks
KV
An upvote would be appreciated if any of my replies help you solve the problem or gain knowledge.
@kamlesh_vaghela . I want to get full event to splunk. The below sedcmd will remove first few lines and then the remaining event is viewed as json format. I want to keep full event as it is. Is there a way we can apply props/transform in which splunk identifies both structured(json) and unstrutured formatted data.