How to make index-time field extraction work for a key near the end of large JSON events

Path Finder

We are trying to do index-time field extraction on the 'job' field from our JSON log events. If the "job":"123" field appears early in the JSON, this works fine and searches like these succeed:

... job::*
... job::123

However, if the job field occurs after roughly the 4096th character in the event, the searches above fail. Even this does not find the event:

... job=123

Each of our JSON events is on a single line. Is there a setting that extends how far into the event Splunk looks for the job field? Any suggestions?

Our configs are like this:

fields.conf

[job]
INDEXED=true

transforms.conf

[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true

props.conf

[my_json]
KV_MODE = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_PREFIX = \"time\":\"
TRANSFORMS-job = my_job
disabled = false

Contributor

Since it's an index-time extraction, you will need to add this to your transforms.conf stanza:

REPEAT_MATCH = true

For multivalue fields in search-time extractions, use:

MV_ADD = true
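
As a rough sketch of where each of these settings would go (the my_job_search stanza and the REPORT-job class name are hypothetical, added only for illustration): REPEAT_MATCH belongs in the index-time transform that props.conf references through TRANSFORMS-, while MV_ADD belongs in a search-time transform referenced through REPORT-.

transforms.conf

# Index-time transform, referenced by TRANSFORMS-job in props.conf
[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true
REPEAT_MATCH = true

# Hypothetical search-time transform, referenced by REPORT-job in props.conf
[my_job_search]
REGEX = \"job\":\"([^\"]+)\"
FORMAT = job::$1
MV_ADD = true

props.conf

[my_json]
TRANSFORMS-job = my_job
REPORT-job = my_job_search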

Here's a link to the docs: http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf

Hope this solves your problem.

Path Finder

Thanks for responding. I tried

REPEAT_MATCH = true

but it did not make a difference. According to http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf, REPEAT_MATCH is useful in cases "where an unknown number of REGEX matches are expected per event." In our case there is only one match per event/line. The match works when the "job" key is early in the event, but not when it appears after roughly the 4096th character.
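
One setting that may be relevant here, going by the same transforms.conf documentation: LOOKAHEAD controls how many characters into an event Splunk searches for a transform's REGEX, and it defaults to 4096, which lines up with the cutoff described above. A minimal sketch, assuming that limit is the cause; the value 32768 is only an illustrative guess and would need to be at least as large as the longest events:

transforms.conf

[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true
# Assumed fix: raise the REGEX search window above the 4096-character default.
# 32768 is a placeholder; size it to the longest expected event.
LOOKAHEAD = 32768

Because this is an index-time transform, a change like this would only apply to events indexed after it takes effect (typically after restarting the indexing instance); events that are already indexed would not gain the job field.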
