Splunk Search

Regex Truncation vs Rex

cdstealer
Contributor

Hi,
I searched and found several tickets regarding my situation, but they all led nowhere. So, my situation...

Unfortunately, we have a few logs that mix formats, e.g. an event starts in plain text and then contains a JSON payload. The events are <4000 chars, so I can't see where the truncation is happening.

I've also tried specifying the following in props/transforms, with no difference:
TRUNCATE = 100000
MAXEVENTS = 500
MAXCHARS = 100000
DEPTH_LIMIT = 5000

I'm probably missing something obvious. 🙂

My current props/transforms for this sourcetype are:

 

[opt:gateway]
KV_MODE = none
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
TRANSFORMS-opt_json2 = optimus_dll1,optimus_dll2
SEDCMD-eol = s/\\r\\n//g
LINE_BREAKER=([\r\n]+)

[optimus_dll1]
REGEX = ^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s+\[\S+\]\s+(?P<LogLevel>[^ ]\w+)\s+(?P<OptimusDLL>[^ ]+)
WRITE_META = true
REPEAT_MATCH = false
FORMAT = LogLevel::$1 OptimusDLL::$2

[optimus_dll2]
REGEX = ^(?:[^$]*)\s-\s(?P<json>.+)
FORMAT = json::$1
WRITE_META = true
REPEAT_MATCH = false
SOURCE_KEY = _raw
DEPTH_LIMIT = 5000

 

optimus_dll1 is extracted as expected, but optimus_dll2 only grabs the first 1000 chars of the match.

Running the same regex inline via rex, on regex101, or from the command line extracts the full field in every case.
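For anyone who wants to reproduce the command-line check, here is a minimal Python sketch of that kind of test. The regex is copied from the [optimus_dll2] stanza; the sample event, its logger name, and the payload contents are made up for illustration and are not taken from the real logs. It only shows that the pattern itself captures well past 1000 characters outside Splunk, and it says nothing about how the index-time transform behaves.

```python
import json
import re
from typing import Optional

# Regex copied from the [optimus_dll2] stanza in the post.
PATTERN = re.compile(r"^(?:[^$]*)\s-\s(?P<json>.+)")


def extract_json_field(event: str) -> Optional[str]:
    """Return the 'json' capture group, or None if the event does not match."""
    match = PATTERN.search(event)
    return match.group("json") if match else None


# Hypothetical event: plain-text prefix followed by a JSON payload.
# The payload is padded so the capture is well over 1000 characters.
payload = json.dumps({"id": 1, "data": "x" * 3000})
event = f"2021-11-01 12:00:00,123 [main] INFO Optimus.Gateway - {payload}"

field = extract_json_field(event)
print(f"payload length: {len(payload)}, captured length: {len(field)}")
```

If the captured length equals the payload length here, the limit is coming from somewhere other than the regex.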

TIA

Steve
