Hi Guys,
I'm trying to ingest an entire HTML file as a single event every time it gets written. The HTML file ALWAYS starts with <p> and always ends with </p>. Any suggestions on how to set up the line breaking?
Based on comments, I think this should work:
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]*)\<p\>
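For context, here is a sketch of how those two settings might sit in a full props.conf stanza. The sourcetype name html_report is made up for illustration, and the TRUNCATE line is my own addition rather than part of the suggestion above: with SHOULD_LINEMERGE = false the whole file becomes one long "line", and the default TRUNCATE limit of 10000 bytes would cut it off.

```ini
# props.conf (sketch; "html_report" is a hypothetical sourcetype name)
[html_report]
# Do not re-merge lines after breaking; LINE_BREAKER alone defines events.
SHOULD_LINEMERGE = false
# Break only where optional newlines are followed by an opening <p>.
# The text matched by the capture group is discarded at the break.
LINE_BREAKER = ([\r\n]*)\<p\>
# Assumption: the file can exceed 10000 bytes, so lift the per-line limit.
# 0 means "never truncate"; a generous fixed byte limit is the safer choice.
TRUNCATE = 0
```

Since the closing tag is </p>, it does not match the <p> in LINE_BREAKER, so the only break falls at the start of the file, provided there are no other <p> tags inside it.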
Just to make sure - it begins with <p> and ends with </p>, right?
Thanks, FrankVI, for the response. Correct, it doesn't have html tags around it, but it functions perfectly fine without them. There are only <p> tags at the beginning and end. A question about your solution: this file gets overwritten 4 times a day. Would setting MAX_EVENTS and SHOULD_LINEMERGE be sufficient to keep each file as a single event, but also create a new event each time the file gets overwritten?
That's a good question; I've never tried that.
But if there are indeed no further <p> elements inside the file, then you could just use that as a line breaker.
Which is odd, because that means it isn't a valid HTML file, as that should at least have <html> tags around it all, right?
Also important to know before being able to answer this: are there any further <p> tags in the middle of the file?
But since you don't want Splunk to break anything, wouldn't it be sufficient to just increase the MAX_EVENTS setting to larger than the expected number of lines and leave SHOULD_LINEMERGE at its default value of true?
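A rough sketch of that line-merging approach, for comparison. The sourcetype name html_report and the MAX_EVENTS value are placeholders, and the two BREAK_ONLY_BEFORE lines are my assumption about what would be needed in practice: by default, line merging also starts a new event whenever a line contains a date, which could split the file if dates appear in the HTML.

```ini
# props.conf (sketch; "html_report" is a hypothetical sourcetype name)
[html_report]
# Keep the default line-merging behaviour.
SHOULD_LINEMERGE = true
# Default is 256 lines per event; set it above the expected line count.
MAX_EVENTS = 100000
# Assumption: stop Splunk from starting a new event at lines with dates...
BREAK_ONLY_BEFORE_DATE = false
# ...and start a new event only at a line beginning with the opening <p>.
BREAK_ONLY_BEFORE = ^<p>
```

This only holds if no line in the middle of the file starts with <p>, which is the same caveat as for the LINE_BREAKER approach.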