We are trying to do index-time field extraction on the 'job' field from our JSON log events. When the "job":"123" field appears early in the JSON, this works fine and searches like these succeed:
... job::*
... job::123
However, if the job field occurs after roughly the 4096th character in the event, the above searches fail. In fact, even this doesn't find the event:
... job=123
Our JSON events are each on one line. Is there a setting that will extend how far into the event Splunk searches for the job field? Any suggestions?
Our configs are like this:
fields.conf
[job]
INDEXED=true
transforms.conf
[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true
props.conf
[my_json]
KV_MODE = json
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_PREFIX = \"time\":\"
TRANSFORMS-job = my_job
disabled = false
Since it's an index-time extraction, you will need to add this to your transforms.conf stanza:
REPEAT_MATCH = true
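Applied to the stanza from the question, that would look roughly like this. It is only a sketch: it reuses the poster's [my_job] stanza unchanged and appends the one new line.

```ini
# transforms.conf
[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true
# Added line: keep matching REGEX after the first match in the event
REPEAT_MATCH = true
```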
For multivalue fields in search-time extractions, use:
MV_ADD = true
Here's a link to the docs: http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf
Hope this solves your problem.
Thanks for responding. I tried
REPEAT_MATCH = true
but it did not make a difference. According to http://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf, REPEAT_MATCH seems to be useful in cases "where an unknown number of REGEX matches are expected per event." In our case there is only one match per event/line. The match works when the "job" key is early in the event, but not when the "job" key comes after roughly the 4096th character.
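The cutoff at about 4096 characters matches the default of the LOOKAHEAD setting in transforms.conf. The same documentation page describes it as the number of characters into an event that an index-time REGEX searches, defaulting to 4096. Raising it may be what is needed here. A minimal sketch follows, assuming the poster's [my_job] stanza; the value 32768 is an assumption picked for illustration, so size it to your longest events and try it on a test index first.

```ini
# transforms.conf
[my_job]
REGEX = \"job\":\"(?<job>[^\"]+)\"
FORMAT = job::$1
WRITE_META = true
# Assumed value for illustration; the default is 4096 characters
LOOKAHEAD = 32768
```

Because this is an index-time transform, the change would only affect events indexed after the setting is deployed and the indexing tier is restarted. Events already indexed would keep whatever fields they had.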