Getting Data In

What are the options for parsing JSON data at index time and search time, and are there any key constraints to be aware of?

Path Finder

Splunk Gurus -

I haven't yet ingested JSON data in my setup, but I anticipate that many sources will be generating a lot of JSON data in the near future, so I wanted to gather some input from this group.

I've read that spath is the way to work with JSON objects. In this regard:

  1. Do I have to use spath in every search when working with JSON data?
  2. Do JSON data and spath impose any constraints on creating summaries, or on creating and searching data models?
  3. What kind of performance impact should I anticipate when using spath versus not using it?
  4. Do Splunk users ever convert JSON into KV pairs before indexing? If so, what situation led you to do that?
  5. What are the key constraints I should be aware of when using spath to work with JSON data?

thanks for your inputs

best, ronak


Contributor

Feature request...
Is there any way to run a single extraction before KV_MODE = JSON kicks in? I keep seeing cases where only the body of the message is JSON, placed after the date, level, and logger. In that case, you have to run rex first, followed by spath.

To the original question...
If the whole body of the event is JSON, then the fields will be extracted automatically. It "just works." You can summarize them like any other extracted field. I've never measured the performance, but it seems pretty good. Still, think about what it's doing: you probably don't want to convert all of your logs to JSON.

The only caveat I can give is to avoid complicated JSON documents if you control their format. For example, if you have arrays of objects that are each really an event, you'll have to do some gymnastics with mvexpand to keep the fields of each nested event together.
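
As a rough sketch of the mvexpand gymnastics mentioned above: suppose each event carries an array named items whose elements are objects with id and status keys. The index, sourcetype, and field names here are hypothetical placeholders, not taken from anyone's real data. One common pattern is to pull each array element out as its own JSON string, expand those into separate results, and only then extract the inner fields, so that each id stays paired with its own status:

```
index=main sourcetype=my_json_sourcetype
| spath path=items{} output=item
| mvexpand item
| spath input=item
| table _time id status
```

Extracting items{}.id and items{}.status directly would instead give two independent multivalue fields on the same event, and expanding one of them loses the pairing with the other. Note that mvexpand multiplies the number of results, so it can get expensive on large arrays.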


Path Finder

If your data is all JSON, then you want this in your props.conf:

KV_MODE = JSON

http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf
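
For context, a minimal sketch of where that setting would live. The stanza name is a hypothetical sourcetype, so substitute your own. Since KV_MODE controls search-time extraction, the stanza would normally go in props.conf on the search head:

```
# props.conf (search-time setting, so this belongs on the search head)
# "my_json_sourcetype" is a hypothetical sourcetype name, used for illustration only
[my_json_sourcetype]
KV_MODE = json
```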

For my JSON data I do little else. For my large data sets I use a tsidx to store what I need for speed.

If the event is not pure JSON, you need to get the JSON into a field first and then extract from it:

  | rex field=_raw "(?s)(?<xxx>match_json_here)" | spath input=xxx

Path Finder

BTW, the version I have is 6.2.1.
