I have JSON log files that I need to pull into my Splunk instance. They have some trash data at the beginning and end that I plan on removing with SEDCMD. My end goal is to clean up the files using SEDCMD, index them properly (line breaking and timestamps), and auto-parse as much as possible.
The logs are on a system with a UF, which sends to the indexers. I'm getting very confused about INDEXED_EXTRACTIONS and KV_MODE. I thought that I would use INDEXED_EXTRACTIONS in props.conf on the UF, then put everything else I need on my indexers, but the docs state that:
When you forward structured data to an indexer, it is not parsed when it arrives at the indexer, even if you have configured props.conf on that indexer with INDEXED_EXTRACTIONS. Forwarded data skips the following pipelines on the indexer, which precludes any parsing of that data on the indexer...
This leads me to believe that if I use INDEXED_EXTRACTIONS on the UF, it won't apply any of the indexer props...so do I just use INDEXED_EXTRACTIONS on my indexers instead? Or does that only apply if I use one of the pretrained sourcetypes? Some answers I read said to use KV_MODE on the search heads? I'm pretty lost on this one.
I have this written up so far:
inputs.conf ON UF
[monitor://path_to_files]
index = my_json_index
sourcetype = my_custom_sourcetype
props.conf ON IDX
[my_custom_sourcetype]
disabled = false
INDEXED_EXTRACTIONS = JSON
KV_MODE = none
SHOULD_LINEMERGE = false
TRUNCATE = 0
LINE_BREAKER = (,)\{\"type\":\"\w+\",\"id\":\"\d+\",\"eventTime\":\"
TIME_PREFIX = \{\"type\":\"\w+\",\"id\":\"\d+\",\"eventTime\":\"
TIME_FORMAT = %FT%T.%3Q
TZ = UTC
SEDCMD-1_del_header = s/.*\"events\":\[//g
SEDCMD-2_clean_eof = s/\(.*\)\]\}/\1/g
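To sanity-check the regexes outside Splunk, here is a minimal Python sketch of what the stanza above is meant to do. It rests on several assumptions: the sample payload is made up (the real header, footer, and field values are not shown here), Python's `re` only approximates Splunk's regex engine, and the order (line breaking first, then SEDCMD per event) reflects my understanding of the pipeline rather than anything confirmed above. The quotes are left unescaped, and the footer pattern uses an unescaped group `(.*)` because Python treats `\(` as a literal parenthesis; whether the escaped form behaves as intended in Splunk is worth checking.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical payload: header and footer "trash" around a JSON array of events.
raw = (
    '{"source":"demo","events":['
    '{"type":"login","id":"1","eventTime":"2021-03-01T10:00:00.123Z","user":"a"},'
    '{"type":"logout","id":"2","eventTime":"2021-03-01T10:05:00.456Z","user":"a"}'
    ']}'
)

# LINE_BREAKER: the text matched by the first capture group (the comma) is
# discarded, and the next event starts right after it.
LINE_BREAKER = re.compile(r'(,)\{"type":"\w+","id":"\d+","eventTime":"')
TIME_PREFIX = re.compile(r'\{"type":"\w+","id":"\d+","eventTime":"')


def break_events(text):
    """Split the raw text into events the way LINE_BREAKER is intended to."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(text):
        events.append(text[start:m.start(1)])
        start = m.end(1)
    events.append(text[start:])
    return events


def clean(event):
    """Apply the two SEDCMD substitutions to a single event."""
    event = re.sub(r'.*"events":\[', '', event)   # SEDCMD-1_del_header
    event = re.sub(r'(.*)\]\}', r'\1', event)      # SEDCMD-2_clean_eof
    return event


def event_time(event):
    """Read the timestamp that follows TIME_PREFIX (assumed yyyy-mm-ddThh:mm:ss.mmm)."""
    m = TIME_PREFIX.search(event)
    stamp = event[m.end():m.end() + 23]  # 23 characters up to the milliseconds
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


events = [clean(e) for e in break_events(raw)]
for e in events:
    # Each cleaned event should now be a standalone JSON object.
    print(json.loads(e)["type"], event_time(e).isoformat())
```

If every cleaned event parses with `json.loads`, the header and footer removal is doing its job on that sample; it says nothing about where the settings need to live (UF, indexer, or search head).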