
How to split a json array into multiple events with separate timestamps?

Champion

Hi,
Below is a sample of the JSON input I am getting from a REST API:

{
    "IPRequestLog": [
        {
            "access_key": "test",
            "id": "0ac03844-a374-4237-9172-a7af9122bed2",
            "ip_address": "192.168.1.245",
            "requested_on": "2015-07-28 06:47:48",
            "source_ip": "49.248.183.29"
        },
        {
            "access_key": "test",
            "id": "7b1f5f38-77d1-453e-8a9e-e33f206474ff",
            "ip_address": "192.168.1.240",
            "requested_on": "2015-07-28 06:47:54",
            "source_ip": "49.248.183.29"
        },
        {
            "access_key": "test",
            "id": "83c6724b-2017-42fa-9cba-5c256d8d502e",
            "ip_address": "192.168.1.249",
            "requested_on": "2015-07-28 06:47:51",
            "source_ip": "49.248.183.29"
        }
    ]
}

Currently all the values within the array are combined into a single event, and the first timestamp value is recognised as the event time. I tried adding the following in props.conf:

[source::source_name]
TIME_PREFIX = requested_on":"
MAX_TIMESTAMP_LOOKAHEAD = 1000
BREAK_ONLY_BEFORE_DATE = false
MUST_BREAK_AFTER = },{

Does anybody know how to split the array into separate events, each with its own timestamp (in this case requested_on)?


Re: How to split a json array into multiple events with separate timestamps?

Path Finder

Just to clarify, did you want to do this at index time?
I'm told you can have Splunk parse JSON quite easily. I haven't tried it myself, but it may be worth researching.


Re: How to split a json array into multiple events with separate timestamps?

Champion

Yes, I want to do this at index time.


Re: How to split a json array into multiple events with separate timestamps?

SplunkTrust

You would then need to use SEDCMD in your props.conf to manipulate the data before the JSON transformation is done. Some reading on this:
http://docs.splunk.com/Documentation/Splunk/6.2.4/Data/Anonymizedatausingconfigurationfiles
http://answers.splunk.com/answers/210096/how-to-configure-sedcmd-in-propsconf.html


Re: How to split a json array into multiple events with separate timestamps?

Contributor

You can do this by adjusting the line-breaking settings and cleaning up the parts that are not needed. For instance, assume your JSON string looks like this:

{
    "IPRequestLog": [
        {
            "access_key": "test",
            "id": "0ac03844-a374-4237-9172-a7af9122bed2",
            "ip_address": "192.168.1.245",
            "requested_on": "2015-07-28 06:47:48",
            "source_ip": "49.248.183.29"
        },
        {
            "access_key": "test",
            "id": "0ac03844-a374-4237-9172-e33f206474ff",
            "ip_address": "192.168.1.245",
            "requested_on": "2015-07-28 06:47:54",
            "source_ip": "49.248.183.29"
        },
        {
            "access_key": "test",
            "id": "0ac03844-a374-4237-9172-5c256d8d502e",
            "ip_address": "192.168.1.245",
            "requested_on": "2015-07-28 06:47:51",
            "source_ip": "49.248.183.29"
        }
    ]
}

You can clean this up with this basic recipe:

# props.conf
[answers-1438103671]
BREAK_ONLY_BEFORE_DATE = false
BREAK_ONLY_BEFORE = (\{|\[\s+{)
MUST_BREAK_AFTER = (\}|\}\s+\])
SEDCMD-remove_header = s/(\{\s+.+?\[)//g
SEDCMD-remove_trailing_commas = s/\},/}/g
SEDCMD-remove_footer = s/\]\s+\}//g
TIME_PREFIX = \"requested_on\":\s+\"

This assumes your sourcetype is answers-1438103671. Each element of the array should then be indexed as its own event, timestamped from its requested_on value.
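One detail worth checking in the recipe is the TIME_PREFIX pattern, which requires at least one whitespace character after the colon. Here is a minimal Python sketch to try the pattern outside Splunk. It only approximates timestamp extraction; the helper name extract_timestamp, the 19-character timestamp length, and the strptime format are assumptions inferred from the sample data, not Splunk behaviour.

```python
import re
from datetime import datetime

# Same pattern as the TIME_PREFIX setting in the recipe above.
TIME_PREFIX = re.compile(r'"requested_on":\s+"')
# Assumed timestamp layout, inferred from the sample values.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract_timestamp(event_text):
    """Return the datetime right after TIME_PREFIX, or None if the prefix does not match."""
    match = TIME_PREFIX.search(event_text)
    if match is None:
        return None
    # The sample timestamps are always 19 characters long.
    raw = event_text[match.end():match.end() + 19]
    return datetime.strptime(raw, TIME_FORMAT)


pretty_event = '''{
    "access_key": "test",
    "requested_on": "2015-07-28 06:47:54",
    "source_ip": "49.248.183.29"
}'''
minified_event = '{"access_key":"test","requested_on":"2015-07-28 06:47:54"}'

# Whitespace follows the colon here, so the prefix matches.
print(extract_timestamp(pretty_event))
# No whitespace follows the colon here, so \s+ cannot match.
print(extract_timestamp(minified_event))
```

If your feed arrives minified, a pattern with \s* instead of \s+ would cover both layouts.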



Re: How to split a json array into multiple events with separate timestamps?

Motivator

If I had to parse something like this coming from an API, I would probably write a modular input. That way you can use your language of choice to query the REST endpoint, pull the JSON, split it into individual events, and send them to Splunk.

This is pretty advanced and requires some dev chops, but works very well. Trying to do this via conf files is likely going to be brittle.
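To give an idea of the "manipulate it into individual events" step, here is a rough Python sketch. It is not a modular input and does not use any Splunk SDK; it only shows the splitting logic such a script might contain. The names split_into_events and SAMPLE_RESPONSE are hypothetical, and treating the timestamps as UTC is an assumption.

```python
import json
from datetime import datetime, timezone

# Assumed timestamp layout, inferred from the sample values.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical stand-in for the body returned by the REST endpoint.
SAMPLE_RESPONSE = json.dumps({
    "IPRequestLog": [
        {"access_key": "test", "id": "0ac03844-a374-4237-9172-a7af9122bed2",
         "ip_address": "192.168.1.245", "requested_on": "2015-07-28 06:47:48",
         "source_ip": "49.248.183.29"},
        {"access_key": "test", "id": "7b1f5f38-77d1-453e-8a9e-e33f206474ff",
         "ip_address": "192.168.1.240", "requested_on": "2015-07-28 06:47:54",
         "source_ip": "49.248.183.29"},
        {"access_key": "test", "id": "83c6724b-2017-42fa-9cba-5c256d8d502e",
         "ip_address": "192.168.1.249", "requested_on": "2015-07-28 06:47:51",
         "source_ip": "49.248.183.29"},
    ]
})


def split_into_events(payload, array_key="IPRequestLog", time_field="requested_on"):
    """Return one (epoch_seconds, json_text) pair per element of the array."""
    document = json.loads(payload)
    events = []
    for record in document[array_key]:
        # Assumption: timestamps are UTC; adjust if the API reports local time.
        stamp = datetime.strptime(record[time_field], TIME_FORMAT)
        stamp = stamp.replace(tzinfo=timezone.utc)
        events.append((int(stamp.timestamp()), json.dumps(record)))
    return events


# Print one JSON object per line, so each line can become its own event.
for epoch, text in split_into_events(SAMPLE_RESPONSE):
    print(epoch, text)
```

Because every record is written on its own line with its own timestamp already parsed, the line breaking on the Splunk side can stay simple.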

Relevant Documentation

EDIT: I had a try at parsing this and came up with a working example. It appears to be similar to the other answer, although I prefer using LINE_BREAKER when possible. It breaks only on newline characters or on commas that are not immediately preceded by a double quote, which in this sample means the commas between events. It also strips the outer portions of the JSON where found.

NOTE: This assumes your JSON is actually coming in minified.

{"IPRequestLog":[{"access_key":"test","id":"0ac03844-a374-4237-9172-a7af9122bed2","ip_address":"192.168.1.245","requested_on":"2015-07-28 06:47:48","source_ip":"49.248.183.29"},{"access_key":"test","id":"7b1f5f38-77d1-453e-8a9e-e33f206474ff","ip_address":"192.168.1.240","requested_on":"2015-07-28 06:47:54","source_ip":"49.248.183.29"},{"access_key":"test","id":"83c6724b-2017-42fa-9cba-5c256d8d502e","ip_address":"192.168.1.249","requested_on":"2015-07-28 06:47:51","source_ip":"49.248.183.29"}]}

props.conf:

[json_split]
SHOULD_LINEMERGE=false
LINE_BREAKER=((?<!"),|[\r\n]+)
SEDCMD-remove_prefix=s/{"IPRequestLog":\[//g
SEDCMD-remove_suffix=s/\]}//g
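If you want to sanity-check these patterns before touching a Splunk instance, something like the following Python sketch can help. It only approximates the behaviour: it splits on the LINE_BREAKER pattern and then applies the two substitutions to each piece, on the assumption that the SEDCMD rules run per event after breaking. The function name simulate and the two-record sample are made up for the illustration.

```python
import json
import re

# LINE_BREAKER from the stanza above. Splunk discards the text matched by the
# first capture group; a non-capturing group makes re.split drop it as well.
LINE_BREAKER = re.compile(r'(?:(?<!"),|[\r\n]+)')
# The two SEDCMD substitutions, with the braces escaped for clarity.
REMOVE_PREFIX = re.compile(r'\{"IPRequestLog":\[')
REMOVE_SUFFIX = re.compile(r'\]\}')


def simulate(raw):
    """Approximate the stanza: break the input, then strip the JSON wrapper."""
    events = []
    for chunk in LINE_BREAKER.split(raw):
        chunk = REMOVE_PREFIX.sub("", chunk)
        chunk = REMOVE_SUFFIX.sub("", chunk)
        if chunk:  # skip empty pieces, e.g. after a trailing newline
            events.append(chunk)
    return events


# Minified sample built from two of the records in the question.
records = [
    {"access_key": "test", "id": "0ac03844-a374-4237-9172-a7af9122bed2",
     "ip_address": "192.168.1.245", "requested_on": "2015-07-28 06:47:48",
     "source_ip": "49.248.183.29"},
    {"access_key": "test", "id": "7b1f5f38-77d1-453e-8a9e-e33f206474ff",
     "ip_address": "192.168.1.240", "requested_on": "2015-07-28 06:47:54",
     "source_ip": "49.248.183.29"},
]
raw = json.dumps({"IPRequestLog": records}, separators=(",", ":")) + "\n"

for event in simulate(raw):
    print(event)
```

One thing to watch: a comma that follows a non-string value (a number, true, or null) is also not preceded by a quote, so this pattern would break there as well. It fits this sample because every field value is a quoted string.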



Re: How to split a json array into multiple events with separate timestamps?

Path Finder

Hi,
Thanks for the solution, it works as expected. The only extra events we get are the opening and closing braces of the JSON.
How do we get rid of these?

Thanks
Shahid


Re: How to split a json array into multiple events with separate timestamps?

New Member

@emiller42
Can this be done on the universal forwarder side?
