Getting Data In

How can I extract the nested JSON at index time

tkwaller
Builder

Hello

I have some logs that have nested JSON. If I add INDEXED_EXTRACTIONS = JSON the non-JSON data does not appear but the JSON is expandable and extracted.

Heres a sample of the log

2017-10-31 18:27:07,444 priority=INFO  app=apps thread=[stuff-2.0.177-v11111111].HttpsListenerConfig.worker.12 location=MessageProcessor line=151 _message="Message flow..." {appName=[stuff-2.0.177-v11111111, orderValue=10.00, field=1506373, retryCnt=0, field=12fdfg-123dsdf-213423vdc-dfg43, id=123456, field=123456789, field=2, field=220838349} responsePayload='{
  "field": 220838349,
  "field": 1292975431,
  "field": "1506373",
  "endTime": "2017-10-31T18:42:05.456Z",
  "field": true,
  "field": [
    {
      "field": -1,
      "field": "",
      "field": "31",
      "field": "27",
      "field": "16",
      "field": {
        "amount": 37.4,
        "currency": "USD"
      },
      "field": "HOLD"
    },
    {
      "field": -1,
      "field": "",
      "field": "31",
      "field": "27",
      "field": "17",
      "field": {
        "amount": 37.4,
        "currency": "USD"
      },
      "field": "HOLD"
    }
  ]
}' responseHttpStatus=200 timeTakenInMillis=2003

My current props are

   [sourcetype]
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%f
TRUNCATE = 100000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
REPORT-json = report-json-kv

And I added transforms

[report-json-kv]       
REGEX = \"(\w+)\":[\s]*\"([^\,\}\"]+)
FORMAT = $1::$2
MV_ADD = true

The problem is now that it does not extract the values within the JSON data.
I tested with my regex extractor and it works there but not in splunk.
Any ideas?

Thanks!!

0 Karma

sbbadri
Motivator

@tkwaller
Try this,

props.conf
[sourcetype]
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%f
TRUNCATE = 100000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
REPORT-json = report-json-kv

transforms.conf

[report-json-kv]
CLEAN_KEYS = 0
FORMAT = $1::$3
MV_ADD = 1
REGEX = \"(\w+)\":.(\"|)([a-z0-9-.A-Z:]+)
SOURCE_KEY = _raw

0 Karma

tkwaller
Builder

Hello

I removed the indexed data and the index, updated the configs with yours and the re-added the data but its still not extracting the fields. I DID test your regex and it IS correct but its still not working

0 Karma

valiquet
Contributor

From the UI you can use spath:

| makeresults count=1 
| eval myJson="{\"widget\": { \"text\": { \"data\": \"Click here\", \"size\": 36, \"data\": \"Learn more\", \"size\": 37, \"data\": \"Help\", \"size\": 38,}}" 
| spath input=myJson
0 Karma

tkwaller
Builder

So I updated the question with my new configs. It works in regex testers but doesnt extract in splunk.

0 Karma

swebb07g
Path Finder

Are you sending the JSON to HEC? if you want to do custom extraction at index time, make sure you use the HEC URL ending in /collector/raw.

 

if you use /collector (or /collector/event) endpoint, then it is probably bypassing some customizations.

0 Karma

koshyk
Super Champion

Had similar issue https://answers.splunk.com/answers/117121/extract-json-data-within-the-logs.html
Solved using props.conf and transforms.conf

0 Karma

ddrillic
Ultra Champion

Recently I had a similar embedded json challenge at How can we extract a json document within an event?

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Design, Compete, Win: Submit Your Best Splunk Dashboards for a .conf26 Pass

Hello Splunkers,  We’re excited to kick off a Splunk Dashboard contest! We know that dashboards are a primary ...

May 2026 Splunk Expert Sessions: Security & Observability

Level Up Your Operations: May 2026 Splunk Expert Sessions Whether you are refining your security posture or ...

Network to App: Observability Unlocked [May & June Series]

In today’s digital landscape, your environment is no longer confined to the data center. It spans complex ...