I am ingesting some JSON events, and one of the fields is just a massive, spammy "//0//0//0//0" repeated 15000+ times. I know my regexes work fine on events this long, because I got the extraction working by raising LOOKAHEAD in transforms.conf:
[extractMessage]
REGEX = "original":([\s\S]*?})},"
LOOKAHEAD = 100000
DEST_KEY = _raw
FORMAT = $1
WRITE_META = true
BUT SEDCMD doesn't listen to the LOOKAHEAD defined in transforms.conf, because it has to be called from props.conf, and props.conf has no LOOKAHEAD setting!
So looking at my props.conf:
[host::xx]
SEDCMD-tst = s/(?:a){20,}/yoink/g
I made a bigass file of the letter "a", and counted how many chars were on each event. Then the sedcmd went in and replaced the "a"s with "yoink". Behold....
SEDCMD stops working at 4105 chars. I NEED MORE. How do I expand SEDCMD's reach?
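For anyone who wants to reproduce the test, here is a minimal sketch in Python. It builds events made of the letter "a" at a few lengths around the 4105-char cutoff and applies the same pattern as the SEDCMD with no length limit, so you have a reference to compare against what Splunk indexes. The function names, the file path, and the chosen lengths are my own assumptions for illustration, not anything from Splunk.

```python
import re

# Same pattern as the SEDCMD above: a run of 20 or more "a" becomes "yoink".
PATTERN = re.compile(r"(?:a){20,}")


def make_events(lengths):
    """Return one test event per requested length, each a run of 'a'."""
    return ["a" * n for n in lengths]


def apply_reference_sed(event):
    """Apply the substitution with no length limit, as a reference result."""
    return PATTERN.sub("yoink", event)


def write_test_file(path, lengths):
    """Write the events one per line (hypothetical path, pick your own)."""
    with open(path, "w", encoding="ascii") as fh:
        for event in make_events(lengths):
            fh.write(event + "\n")


if __name__ == "__main__":
    # Lengths chosen around the cutoff reported above (an assumption).
    for n in [19, 1000, 4096, 4105, 5000, 20000]:
        result = apply_reference_sed("a" * n)
        print(n, "->", len(result), "chars after substitution")
```

If an indexed event still contains a long run of "a" where the reference result is just "yoink", the substitution stopped partway through that event.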
I was wondering if you found a solution for this issue? I came across the same issue with the SEDCMD not being able to look ahead long enough. I am trying to truncate out a field from the JSON while keeping the rest intact.
It finally turned out that it was not a sed problem, even though there are sed versions that only support a limited line length. In my case I had to change some other properties:
So the main problem was LOOKAHEAD = 4096, which affects SEDCMD too. Not really intuitive.
Hi
I have the same problem. Long events are truncated by SEDCMD at about 4000 characters.
Any solution?
Got a 320k long event just now. I really don't want to set a global "allow massive json" option in KV, I would rather just strip this data out.