Getting Data In

How to extract specific key/value pairs from a JSON payload and index them separately

beetlegeuse
Path Finder

I have a JSON payload that's ingested through a REST API input on a heavy forwarder, with the following configuration in props.conf (on the heavy forwarder, not on the indexer):

    [json_result]

    INDEXED_EXTRACTIONS = json
    KV_MODE = none
    DATETIME_CONFIG = CURRENT
    SHOULD_LINEMERGE = false
    TRUNCATE = 200000

The resulting event in Splunk looks like this (minified):

{"totalCount":3,"nextPageKey":null,"result":[{"metricId":"builtin:synthetic.http.resultStatus","data":[{"dimensions":["HTTP_CHECK-02B087D58EC18C33","SUCCESS","SYNTHETIC_LOCATION-2CD023FA5F455E28"],"dimensionMap":{"Result status":"SUCCESS","dt.entity.synthetic_location":"SYNTHETIC_LOCATION-2CD023FA5F455E28","dt.entity.http_check":"HTTP_CHECK-02B087D58EC18C33"},"timestamps":[1639254360000],"values":[1]},{"dimensions":["HTTP_CHECK-02B087D58EC18C33","SUCCESS","SYNTHETIC_LOCATION-833A207E28766E49"],"dimensionMap":{"Result status":"SUCCESS","dt.entity.synthetic_location":"SYNTHETIC_LOCATION-833A207E28766E49","dt.entity.http_check":"HTTP_CHECK-02B087D58EC18C33"},"timestamps":[1639254360000],"values":[1]},{"dimensions":["HTTP_CHECK-02B087D58EC18C33","SUCCESS","SYNTHETIC_LOCATION-1D85D445F05E239A"],"dimensionMap":{"Result status":"SUCCESS","dt.entity.synthetic_location":"SYNTHETIC_LOCATION-1D85D445F05E239A","dt.entity.http_check":"HTTP_CHECK-02B087D58EC18C33"},"timestamps":[1639254360000],"values":[1]}]}]}

What I'm trying to extract (highlighted in red in the original post) is the content of each "dimensionMap" object: three fields ("Result status", "dt.entity.synthetic_location" and "dt.entity.http_check") and their associated values. I'd like three events created from the payload, one for each occurrence of the three fields, with the fields searchable in Splunk.

I've tried this approach in props.conf to get what I'm looking for...

    [json_result]    

    SHOULD_LINEMERGE = false
    LINE_BREAKER = },
    DATETIME_CONFIG = CURRENT
    TRUNCATE = 0

    SEDCMD-remove_prefix = s/{"totalCount":.*"nextPageKey":.*"result":\[{"metricId":.*"data":\[//g
    SEDCMD-remove_dimensions = s/{"dimensions":.*"dimensionMap"://g
    SEDCMD-remove_timevalues = s/,"timestamps":.*"values":.*}//g
    SEDCMD-remove_suffix = s/\]}\]}//g

...but I'm only getting one set of fields to show up as an event in Splunk:

beetlegeuse_0-1639255239483.png
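
One plausible reading of this result, offered as an assumption rather than something confirmed in the thread: if the payload reaches the SEDCMD stage as a single unbroken event (the props.conf spec requires LINE_BREAKER to contain a capturing group, and `},` has none), the greedy `.*` in SEDCMD-remove_dimensions matches from the first "dimensions" to the last "dimensionMap" and leaves only the final set of fields. The Python sketch below approximates the four SEDCMD rules with re.sub on a payload rebuilt from the event above. Python's regex engine is only a stand-in for Splunk's, and the braces are escaped for Python's benefit.

```python
import json
import re

CHECK = "HTTP_CHECK-02B087D58EC18C33"
LOCATIONS = [
    "SYNTHETIC_LOCATION-2CD023FA5F455E28",
    "SYNTHETIC_LOCATION-833A207E28766E49",
    "SYNTHETIC_LOCATION-1D85D445F05E239A",
]

# Rebuild the minified event shown above (no spaces after "," or ":").
raw_event = json.dumps(
    {
        "totalCount": 3,
        "nextPageKey": None,
        "result": [
            {
                "metricId": "builtin:synthetic.http.resultStatus",
                "data": [
                    {
                        "dimensions": [CHECK, "SUCCESS", loc],
                        "dimensionMap": {
                            "Result status": "SUCCESS",
                            "dt.entity.synthetic_location": loc,
                            "dt.entity.http_check": CHECK,
                        },
                        "timestamps": [1639254360000],
                        "values": [1],
                    }
                    for loc in LOCATIONS
                ],
            }
        ],
    },
    separators=(",", ":"),
)

# The four SEDCMD rules from the first attempt, in the order listed there.
FIRST_ATTEMPT_SEDCMDS = [
    (r'\{"totalCount":.*"nextPageKey":.*"result":\[\{"metricId":.*"data":\[', ""),
    (r'\{"dimensions":.*"dimensionMap":', ""),
    (r',"timestamps":.*"values":.*\}', ""),
    (r"\]\}\]\}", ""),
]

# Assumption: the whole payload is still one event when the rules run.
collapsed = raw_event
for pattern, replacement in FIRST_ATTEMPT_SEDCMDS:
    collapsed = re.sub(pattern, replacement, collapsed)

print(collapsed)
```

Under that assumption, the output is a single JSON object holding only the three fields for the last location, which is consistent with the one-event symptom described above.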

Also, the fields aren't showing up under "Interesting Fields" in the left sidebar (possibly because the props.conf is not on the indexer?).

Any assistance would be greatly appreciated.

UPDATE: I referenced this post that's pretty close to what I'm trying to accomplish:

https://community.splunk.com/t5/Getting-Data-In/How-to-split-a-json-array-into-multiple-events-with-...

The format of the JSON payload cited in that post is different from the format of my payload, though, so I'm guessing some additional logic would be needed to accommodate my format.

1 Solution

beetlegeuse
Path Finder

Did a few things to get this working correctly:

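- Since I'm using the REST API modular input, I wrote a custom response handler that strips the outer envelope ("totalCount", "nextPageKey" and the "result" wrapper) from the original minified JSON payload (revised payload below; note that it now has a space after each comma and colon):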

{"metricId": "builtin:synthetic.http.resultStatus", "data": [{"dimensions": ["HTTP_CHECK-F79EBFAF0B5C8BC1", "SUCCESS", "SYNTHETIC_LOCATION-2CD023FA5F455E28"], "dimensionMap": {"Result status": "SUCCESS", "dt.entity.synthetic_location": "SYNTHETIC_LOCATION-2CD023FA5F455E28", "dt.entity.http_check": "HTTP_CHECK-F79EBFAF0B5C8BC1"}, "timestamps": [1639412160000], "values": [1]}, {"dimensions": ["HTTP_CHECK-F79EBFAF0B5C8BC1", "SUCCESS", "SYNTHETIC_LOCATION-833A207E28766E49"], "dimensionMap": {"Result status": "SUCCESS", "dt.entity.synthetic_location": "SYNTHETIC_LOCATION-833A207E28766E49", "dt.entity.http_check": "HTTP_CHECK-F79EBFAF0B5C8BC1"}, "timestamps": [1639412160000], "values": [1]}, {"dimensions": ["HTTP_CHECK-F79EBFAF0B5C8BC1", "SUCCESS", "SYNTHETIC_LOCATION-1D85D445F05E239A"], "dimensionMap": {"Result status": "SUCCESS", "dt.entity.synthetic_location": "SYNTHETIC_LOCATION-1D85D445F05E239A", "dt.entity.http_check": "HTTP_CHECK-F79EBFAF0B5C8BC1"}, "timestamps": [1639412160000], "values": [1]}]}

- After taking a step back, I realized that the line break (parsing queue) must be spot on before the SEDCMD (typing queue) is applied, since each SEDCMD runs against the individual events produced by the break. I identified the break points in the revised JSON payload and defined my SEDCMD expressions to strip the remaining unwanted content:

    SHOULD_LINEMERGE = false
    LINE_BREAKER = \}(\,\s)
    DATETIME_CONFIG = CURRENT
    TRUNCATE = 0
    SEDCMD-remove_header = s/(^\{.*"data":\s\[)//g
    SEDCMD-remove_footer = s/\}\]\}/}/g
    SEDCMD-remove_dimensions = s/{"dimensions":.*"dimensionMap":\s//g
    SEDCMD-remove_timevalues = s/"timestamps":.*"values":.*}//g

- After stopping and starting the Splunk service and waiting a tick before searching, I validated that I now have the desired result:

beetlegeuse_1-1639427606911.png
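
The trimming step in the first bullet can be sketched in Python roughly as follows. This is a guess at the logic, not the actual handler: trim_payload is a hypothetical helper, and the wiring into the REST API modular input's response handler class is left out because the thread does not show it. The sample input is abbreviated to a single "data" entry.

```python
import json


def trim_payload(raw_response_output):
    """Return one JSON string per entry of the top-level "result" array.

    Hypothetical helper: drops the "totalCount"/"nextPageKey" envelope and
    re-serializes each metric with json.dumps defaults, which put a space
    after every "," and ":" (the layout of the revised payload).
    """
    envelope = json.loads(raw_response_output)
    return [json.dumps(metric) for metric in envelope.get("result", [])]


# Abbreviated sample in the shape of the original minified payload.
sample = (
    '{"totalCount":1,"nextPageKey":null,"result":['
    '{"metricId":"builtin:synthetic.http.resultStatus","data":['
    '{"dimensions":["HTTP_CHECK-02B087D58EC18C33","SUCCESS",'
    '"SYNTHETIC_LOCATION-2CD023FA5F455E28"],'
    '"dimensionMap":{"Result status":"SUCCESS",'
    '"dt.entity.synthetic_location":"SYNTHETIC_LOCATION-2CD023FA5F455E28",'
    '"dt.entity.http_check":"HTTP_CHECK-02B087D58EC18C33"},'
    '"timestamps":[1639254360000],"values":[1]}]}]}'
)

trimmed = trim_payload(sample)
for event in trimmed:
    print(event)
```

Each returned string would then be handed to Splunk as one raw event, so the LINE_BREAKER and SEDCMD rules only have to deal with the "metricId"/"data" structure.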

While this was working correctly with the props.conf on just the heavy forwarder, I found that I had to implement the props.conf on the indexers as well in order for the three fields in each event to show up in the "Interesting Fields" portion of the left navbar.
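
To sanity-check the LINE_BREAKER and SEDCMD settings above outside Splunk, the following Python sketch approximates them with the re module on a payload rebuilt from the revised JSON. The helper break_events is hypothetical, Python's regex engine only stands in for Splunk's, and dropping events that end up empty is an assumption made to match the three-event result reported above.

```python
import json
import re

CHECK = "HTTP_CHECK-F79EBFAF0B5C8BC1"
LOCATIONS = [
    "SYNTHETIC_LOCATION-2CD023FA5F455E28",
    "SYNTHETIC_LOCATION-833A207E28766E49",
    "SYNTHETIC_LOCATION-1D85D445F05E239A",
]

# Rebuild the revised payload; json.dumps defaults add a space after "," and ":".
payload = json.dumps(
    {
        "metricId": "builtin:synthetic.http.resultStatus",
        "data": [
            {
                "dimensions": [CHECK, "SUCCESS", loc],
                "dimensionMap": {
                    "Result status": "SUCCESS",
                    "dt.entity.synthetic_location": loc,
                    "dt.entity.http_check": CHECK,
                },
                "timestamps": [1639412160000],
                "values": [1],
            }
            for loc in LOCATIONS
        ],
    }
)

LINE_BREAKER = r"\}(\,\s)"
SEDCMDS = [
    (r'(^\{.*"data":\s\[)', ""),                   # remove_header
    (r"\}\]\}", "}"),                              # remove_footer
    (r'\{"dimensions":.*"dimensionMap":\s', ""),   # remove_dimensions
    (r'"timestamps":.*"values":.*\}', ""),         # remove_timevalues
]


def break_events(raw, line_breaker):
    """Split raw text the way LINE_BREAKER is documented to work: the text
    matched by the first capturing group is discarded, the previous event
    ends just before it and the next event starts just after it."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return events


events = []
for fragment in break_events(payload, LINE_BREAKER):
    for pattern, replacement in SEDCMDS:
        fragment = re.sub(pattern, replacement, fragment)
    if fragment:  # assumption: fragments reduced to nothing are not kept
        events.append(fragment)

for event in events:
    print(event)
```

In this approximation the break points fall after every `}, `, so the "timestamps"/"values" tails become separate fragments that remove_timevalues reduces to empty strings, and only the three "dimensionMap" objects survive as events.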

 


0 Karma

PickleRick
SplunkTrust

Yes. The props/transforms settings you deploy on HFs are applied while Splunk is ingesting the data (event breaking, timestamp recognition/parsing, indexed field extraction), but search-time extractions are performed on the search heads, so your search-time settings need to be there.

0 Karma

PickleRick
SplunkTrust

You could try cloning the event with CLONE_SOURCETYPE and cutting a different part out of each copy. At least that's the only way I could think of.

About the interesting fields issue - are you searching in fast mode or verbose mode?

0 Karma

beetlegeuse
Path Finder

I'm searching in Verbose mode. I updated my original post with a link to another post that falls in line with what I'm trying to accomplish. The layout of the JSON payload in that post is different from my payload's layout...but they were able to break out the events successfully. I'm having difficulty trying to replicate that success.

0 Karma