Getting Data In

How can I extract the nested JSON at index time

tkwaller
Builder

Hello

I have some logs that have nested JSON. If I add INDEXED_EXTRACTIONS = JSON the non-JSON data does not appear but the JSON is expandable and extracted.

Heres a sample of the log

2017-10-31 18:27:07,444 priority=INFO  app=apps thread=[stuff-2.0.177-v11111111].HttpsListenerConfig.worker.12 location=MessageProcessor line=151 _message="Message flow..." {appName=[stuff-2.0.177-v11111111, orderValue=10.00, field=1506373, retryCnt=0, field=12fdfg-123dsdf-213423vdc-dfg43, id=123456, field=123456789, field=2, field=220838349} responsePayload='{
  "field": 220838349,
  "field": 1292975431,
  "field": "1506373",
  "endTime": "2017-10-31T18:42:05.456Z",
  "field": true,
  "field": [
    {
      "field": -1,
      "field": "",
      "field": "31",
      "field": "27",
      "field": "16",
      "field": {
        "amount": 37.4,
        "currency": "USD"
      },
      "field": "HOLD"
    },
    {
      "field": -1,
      "field": "",
      "field": "31",
      "field": "27",
      "field": "17",
      "field": {
        "amount": 37.4,
        "currency": "USD"
      },
      "field": "HOLD"
    }
  ]
}' responseHttpStatus=200 timeTakenInMillis=2003

My current props are

   [sourcetype]
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%f
TRUNCATE = 100000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
REPORT-json = report-json-kv

And I added transforms

[report-json-kv]       
REGEX = \"(\w+)\":[\s]*\"([^\,\}\"]+)
FORMAT = $1::$2
MV_ADD = true

The problem is now that it does not extract the values within the JSON data.
I tested with my regex extractor and it works there but not in splunk.
Any ideas?

Thanks!!

0 Karma

sbbadri
Motivator

@tkwaller
Try this,

props.conf
[sourcetype]
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%f
TRUNCATE = 100000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
REPORT-json = report-json-kv

transforms.conf

[report-json-kv]
CLEAN_KEYS = 0
FORMAT = $1::$3
MV_ADD = 1
REGEX = \"(\w+)\":.(\"|)([a-z0-9-.A-Z:]+)
SOURCE_KEY = _raw

0 Karma

tkwaller
Builder

Hello

I removed the indexed data and the index, updated the configs with yours and the re-added the data but its still not extracting the fields. I DID test your regex and it IS correct but its still not working

0 Karma

valiquet
Contributor

From the UI you can use spath:

| makeresults count=1 
| eval myJson="{\"widget\": { \"text\": { \"data\": \"Click here\", \"size\": 36, \"data\": \"Learn more\", \"size\": 37, \"data\": \"Help\", \"size\": 38,}}" 
| spath input=myJson
0 Karma

tkwaller
Builder

So I updated the question with my new configs. It works in regex testers but doesnt extract in splunk.

0 Karma

swebb07g
Path Finder

Are you sending the JSON to HEC? if you want to do custom extraction at index time, make sure you use the HEC URL ending in /collector/raw.

 

if you use /collector (or /collector/event) endpoint, then it is probably bypassing some customizations.

0 Karma

koshyk
Super Champion

Had similar issue https://answers.splunk.com/answers/117121/extract-json-data-within-the-logs.html
Solved using props.conf and transforms.conf

0 Karma

ddrillic
Ultra Champion

Recently I had a similar embedded json challenge at How can we extract a json document within an event?

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...