Splunk Search

Part 1: How to extract a json portion of an event then use spath to extract key=value pairs

lpolo
Motivator

I have the following log event but I have not been able to use spath to extract the json key=value pairs.

2013-03-12 10:37:10,205 <tvsquery id=58b6bf4d-948b-416b-8d17-cedcbc1059ec>{
"start" : 1,
"returned" : 0,
"count" : 0
}
</tvsquery>

Therefore, I tried to extract the json portion with this regex and then use spath:

|rex field=_raw "[^>]+)>(?.+?)"|spath input=response

But I having a hard time to make it work.

How can I extract the json portion of the event and then use spath to extract the key=value pairs?

Thanks,
Lp

Tags (3)

marciogh
Engager

By an unknown reason, I have to replace single quotes for double quotes in order to make a duplicate spath call.
"level" is a JSON field inside a JSON in message field.

 * | spath message | eval message=replace(message,"'","\"") |spath input=message | search level=INFO

Got the tip from here https://answers.splunk.com/answers/444133/extract-json-from-a-field.html

emiller42
Motivator

This is because using single-quotes isn't valid JSON, so it can't parse it as JSON.

{"aaa": 1, "bbb": "some value"}
vs.
{'aaa': 1, 'bbb': 'some value'}

The first is JSON. The second is not.

Since it's a field extracted from a larger JSON, I'm going to assume it's just incorrectly constructed. Something like this would work fine, and not require multiple spath calls:

{"id": 12345, "message": {"level": "INFO", "content": "foo bar baz"}}

But I'm going to guess what you have to work with is:

{"id": 12345, "message": "{'level': 'INFO', 'content': 'foo bar baz'}"}

Which isn't valid JSON at all. (or more specifically, message is just a string, and not a JSON object)

lpolo
Motivator

The following regex will work, if and only if, there is not any new line in the event:

rex "<tvsquery id=(?<id>[^>]+)>(?<response>.+?)</tvsquery>"

Therefore, I was able to make it work by trimming the event before the regular expression as follow:

| rex field=_raw mode=sed "s/[\r\n]//g" 
| rex "<tvsquery id=(?<id>[^>]+)>(?<response>.+?)</tvsquery>" 

Then, the extracted field "response" can be processed by spath search command.

Regards,
Lp

0 Karma

jonuwz
Influencer

With difficulty.

Here's a starter :

...
| rex "<tvsquery id=(?<id>[^>]+)>(?<response>.*)</tvsquery>" 
| table id response _time
| spath input=response
| eval key=_time.";".id
| fields - response _time id
| untable key field value

Now you can either have the values in seperate events :

| rex field=key "(?<_time>.*);(?<id>.*)"
| fields - key

Or for easy reading :

| stats list(field) as fields list(value) as values by key 
| rex field=key "(?<_time>.*);(?<id>.*)"
| fields - key

jonuwz
Influencer

Why not ? what does response get populated with ? (just run the 1st 2 lines)

0 Karma

lpolo
Motivator

This approach should work. However, I am still unable to make the regex work in my environment.

0 Karma

jonuwz
Influencer

is that json in xml ? yowsers

emiller42
Motivator

I believe you need to keep the {} in the field as well, which your rex omits. Try this for your rex:

rex "(?< json_field>{[^}]+})"

(remove the space used to keep text from disappearing)

Or if you want to keep the more complex regex, simply move the closing } into the capture group.

You should then be able to use spath on the rex'ed field.

EDIT TO ADD:
If you've got events with multiple JSON objects, then you'll have to do a bit more work with it. I would recommend adding the max_match param to the rex, which will find multiple matches and collect them into a multi-valued field. Then you can mvexpand that field to multiple events, and parse that with spath.

rex max_match=10 "(?< json_field>{[^}]+})" | mvexpand json_field | spath input=json_field ...

(replace rex as needed)

lpolo
Motivator

The regular expression behaves if an only if there is not any json array like the presented example.
It partially solves the problem.

0 Karma

emiller42
Motivator

If this is working, would you mind accepting the answer? That way other people searching will see this is something with a resolution. Thanks!

0 Karma

lpolo
Motivator

I reviewed my props.conf and I removed
KV_MODE = json from the related sourcetype. Then, reload props.conf and I ran the query with your recommendation. It seems to be working. I need to validate the result set.

Thanks,
Lp

0 Karma

emiller42
Motivator

I'm just another user, so I can't see your cases. Sorry!

0 Karma

lpolo
Motivator

It did not work. It continues to fail if the extracted json field contains multiple arrays and objects. I made sure that max_match=value was not greater that the number of objects.
Could you kindly see the log example I posted in case 114699?
thanks,LP

0 Karma

emiller42
Motivator

Then I would add a max_match= condition to the rex, so it could capture more than one JSON array into a multi-valued field. Then pipe that to mvexpand so that they get split to multiple events.

rex max_match=10 "regex_string" |mvexpand field_name | spath ...

0 Karma

lpolo
Motivator

what do you recommend?

0 Karma

lpolo
Motivator

Thanks for the observation. I corrected this problem as you recommended. And I was able to extract the json portion of the event and use spath. However, I am facing the same issue I had at the beginning: if the extracted json field contains multiple arrays and objects both regex fail to extract json portion of the event.

0 Karma

emiller42
Motivator

I think the most important thing is that in your original rex, the closing } wasn't part of the capture group, so the field being extracted was
{ "start" : 1, "returned" : 0, "count" : 0
which spath will fail on.

0 Karma

lpolo
Motivator

I tried using the regex in a more complex json field but it fails. It worked for the simple json presented in the example.

0 Karma

lpolo
Motivator

Thanks, It worked.

0 Karma
Get Updates on the Splunk Community!

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...