
Ingesting JSON-format data in Splunk

Shashank_87
Explorer

Hi, I am trying to upload a file with JSON-formatted data like the sample below, but it is not being ingested properly. I tried two approaches:

  1. When selecting the sourcetype as automatic, a separate event is created for the timestamp line.
  2. When selecting the sourcetype as _json, the timestamp does not appear in the event at all.

Tue 21 Apr 14:16:26 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4466.046 MB","used.jvm.memory": "1573.752 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.03 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}

Tue 21 Apr 14:16:36 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4456.382 MB","used.jvm.memory": "1583.415 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.439 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}

Tue 21 Apr 14:16:46 BST 2020
{"items":[{"cpu.load": "0.84","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4449.94 MB","used.jvm.memory": "1589.858 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3867.042 MB","total.used.physical.system.memory": "12.789 GB","number.of.cpus": "8"}]}

Is there a way to ingest/upload this data properly?

1 Solution

harsmarvania57
Ultra Champion

Hi,

Your raw data contains a timestamp (Tue 21 Apr 14:16:26 BST 2020) followed by valid JSON on the next line, so you can't use the _json sourcetype or INDEXED_EXTRACTIONS = json.

At search time, use rex to capture the JSON blob and then spath to create/extract the fields from it.

For example:

your_base_query | rex field=_raw "(?<ext_json>{[^}]+}]})" | spath input=ext_json
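
Building on that search, a fuller sketch might look like this (the index, sourcetype, and span=1m here are placeholders/assumptions, and it relies on _time being parsed from the timestamp line, which the props.conf further down this thread handles). Note that spath names the fields items{}.cpu.load and so on because the values sit inside the items array:

index=your_index sourcetype=your_sourcetype
| rex field=_raw "(?<ext_json>{[^}]+}]})"
| spath input=ext_json
| rename "items{}.cpu.load" AS cpu_load
| timechart span=1m avg(cpu_load) AS avg_cpu_load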


Shashank_87
Explorer

@harsmarvania57 Thanks for the response, but how would I upload the data in the first place? Which sourcetype should I use?
If I use automatic, the timestamp comes through as a separate event.


harsmarvania57
Ultra Champion

Create your own sourcetype, such as app_json.


Shashank_87
Explorer

@harsmarvania57 I have already tried that, and as I said, it creates a separate event with just the timestamp. I don't want that; I want the whole thing in a single event because I need that timestamp value in my report. I have attached a screenshot where you can see two separate events that are actually a single entry in the log file.


harsmarvania57
Ultra Champion

I can't see any screenshot; also, please provide your raw data in code format (use the 101010 button).


Shashank_87
Explorer

@harsmarvania57 Added the raw data above.


harsmarvania57
Ultra Champion

Based on the data you provided, I created the sourcetype below on the indexer. If you are ingesting data via a Heavy Forwarder, you need to create this props.conf on the Heavy Forwarder instead.

props.conf

[test_st]
# Break events on the newline(s) after a closing brace, so each timestamp + JSON pair stays together as one event
LINE_BREAKER = }([\r\n]+)
# Only look at the first 28 characters of each event for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 28
SHOULD_LINEMERGE = false
# Matches timestamps such as "Tue 21 Apr 14:16:26 BST 2020"
TIME_FORMAT = %a %d %b %H:%M:%S %Z %Y

Then I used the search query I provided above, and it extracts the fields correctly.
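
For completeness, if the file is picked up by a forwarder rather than uploaded manually, an inputs.conf stanza along these lines would tie the monitored file to that sourcetype; the path and index below are illustrative placeholders, not values from this thread.

inputs.conf

# Monitor the application's metrics file (path and index are placeholders)
[monitor:///var/log/app/jvm_metrics.log]
sourcetype = test_st
index = main
disabled = false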


Shashank_87
Explorer

@harsmarvania57 That actually worked, thank you. I am getting the time and the JSON in the same event, though the _time field has not been extracted. How do I extract the time? I need it to plot a graph based on time.


harsmarvania57
Ultra Champion

I can see the time from the raw data in _time; see this screenshot from my lab instance: https://imgur.com/a/bW5T8ok

How are you ingesting the data?
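
As a quick sanity check that the timestamp really lands in _time (and not just in the raw text), a search like this can show the two side by side; the index is a placeholder:

index=your_index sourcetype=test_st
| eval parsed_time=strftime(_time, "%a %d %b %H:%M:%S %Z %Y")
| table _time parsed_time _raw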
