Getting Data In

How to parse JSON log data?

Path Finder

I created an input with the _json sourcetype and sent httpd access logs to it. The events I receive look like this:

Jul 14 14:35:44 172.16.16.100 1 2015-07-14T14:35:44+03:00 us-.local httpd - - - {"PROGRAM":"httpd","LOGTYPE":"access","ISODATE":"2015-07-14T14:35:44+03:00","HTTP":{"VHOST":"..com","USER_AGENT":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36","STATUS":"200","SIZE":"21","REQUEST_TIME":"756174","REQUEST":"GET /admins/widget/?widget_ReplicationLag_yw1[]= HTTP/1.1","REMOTE_USER":"u.","REMOTE_ADDR":"","REFERER":"https://..com/invoices/index","DATE":"2015-07-14T14:35:44"},"HOST_FROM":"us-.local","HOST":"us-.local","FILE_NAME":"/var/run/syslog-ng/apache.access.fifo"}

How can I parse these logs?

Motivator

You can use index-time transforms to rewrite the event before it's written to the index, but you will lose the prepended syslog data.

In transforms.conf:

[my_data_json_extraction]
SOURCE_KEY = _raw
DEST_KEY = _raw
REGEX = ^([^{]+)({.+})$
FORMAT = $2

In props.conf:

[my_sourcetype]
KV_MODE = json
TRANSFORMS-whatever = my_data_json_extraction

The name of the transforms stanza can be whatever you want; it just needs to be unique. The same goes for the TRANSFORMS-foo setting in props.conf: just make the part after TRANSFORMS- unique.

I highly recommend testing this locally before applying it to production data, as it is destructive. Make sure you capture anything you need from the prepended part of the logs before applying this.

Reference the documentation for props.conf and transforms.conf for details.
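To see what the transform's regex actually does, here is a minimal Python sketch (not Splunk itself) applying the same pattern to a simplified, hypothetical version of the event. Group 2 is what would replace _raw; group 1 is the syslog prefix that gets discarded:

```python
import json
import re

# Hypothetical, shortened version of the syslog-wrapped event.
raw = 'Jul 14 14:35:44 172.16.16.100 httpd - - - {"PROGRAM": "httpd", "LOGTYPE": "access"}'

# Same regex as the transforms.conf stanza: everything before the first
# "{" is captured as group 1, the JSON object as group 2.
match = re.match(r'^([^{]+)({.+})$', raw)
prefix, json_part = match.group(1), match.group(2)

# With FORMAT = $2, only json_part survives as _raw; since it is now
# valid JSON, KV_MODE = json can extract its fields.
event = json.loads(json_part)
print(event["PROGRAM"])  # -> httpd
```

Note the regex anchors require the event to end with `}`; events with trailing garbage after the JSON would not match and would pass through unmodified.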


Path Finder

It's cool!!!! It works! Thank you!

Motivator

Glad this worked for you! Can you accept the answer so others know there's a solution here?

Motivator

So what you have here is not a JSON log event. You have a plaintext log event which happens to contain JSON. That's a BIG difference.

It looks like these are coming through a syslog server, which is prepending data before the JSON blob. If you don't need that data (at least some of it looks redundant), it would help to alter your syslog config for this file so it doesn't prepend the raw text and writes only the JSON portion. If the event is pure JSON, Splunk will parse it automatically.

Failing that, you can handle this at search time:

... | rex "(?P<json_data>{.*})" | spath input=json_data
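The search-time approach does the same two steps as the index-time transform, just on the fly: capture the JSON substring, then parse it into fields. A minimal Python sketch of that logic, on a hypothetical shortened event:

```python
import json
import re

# Hypothetical event: syslog header followed by a nested JSON blob.
raw = 'Jul 14 14:35:44 us-.local httpd - - - {"HTTP": {"STATUS": "200", "SIZE": "21"}}'

# Mirrors `rex "(?P<json_data>{.*})"`: greedy match from the first "{"
# to the last "}".
m = re.search(r'(?P<json_data>{.*})', raw)

# Mirrors `spath input=json_data`: parse the captured string and expose
# its (nested) fields.
fields = json.loads(m.group('json_data'))
print(fields["HTTP"]["STATUS"])  # -> 200
```

Unlike the index-time transform, this keeps _raw intact, so the syslog header remains available in the stored event.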

Path Finder

I understand that it's not pure JSON. I was able to process it as described in the comments above. Is it possible to trim the log at ingestion, before events are created?

Path Finder

I partially understand. I created a new field extraction and am running:

sourcetype=_json | eval _raw = access_log_json | spath

But how can I strip everything before the { at the input stage?
