We're bringing in a new data source from an application, and the default line breaking is not working correctly. All of these entries end up in a single event, when there should be 8 events. Every new event starts with a ">" character in the first position. I cannot figure out what to enter in the Add Data area of the GUI for line breaks / new events. Any help would be great. Thank you!
EXAMPLE (currently 1 event, should be 8):
You need these settings in props.conf:
[<Your Sourcetype Here>]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+\>)
TIME_PREFIX = ^\w+\s
TIME_FORMAT = %B %d,%Y %H:%M:%S %p %Z
MAX_TIMESTAMP_LOOKAHEAD = 34
If you are doing a sourcetype override, make sure that you use THE ORIGINAL SOURCETYPE VALUE in the stanza header. Deploy this to the first full instance of Splunk that handles the events (usually the HF or indexer tier) and restart all Splunk instances there. Then forward in NEW events (old events will stay broken) and verify that it works by adding _index_earliest=-5m to your search, so that you are only looking at newly indexed events.
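To make the timestamp settings above a little more concrete, here is a rough Python sketch of which part of an event they point at. It is only an approximation of what Splunk does, not its implementation. The sample event is hypothetical (the original example data is not shown in this thread), and `timestamp_window` is a made-up helper name. The sketch assumes the leading ">" has already been discarded, which is what happens with LINE_BREAKER = ([\r\n]+\>) because the ">" sits inside the capturing group.

```python
import re

# Values copied from the props.conf stanza above.
TIME_PREFIX = r"^\w+\s"
MAX_TIMESTAMP_LOOKAHEAD = 34


def timestamp_window(event: str) -> str:
    """Return the slice of the event in which the timestamp is searched for.

    When TIME_PREFIX is set, the lookahead is counted from the end of the
    TIME_PREFIX match, not from the start of the event.
    """
    match = re.search(TIME_PREFIX, event)
    if match is None:
        return ""
    return event[match.end():match.end() + MAX_TIMESTAMP_LOOKAHEAD]


# Hypothetical event, shown as it would look after the ">" was stripped.
event = "Fri July 12,2019 10:15:30 AM EDT first entry"
window = timestamp_window(event)
print(repr(window))
```

With this made-up event, the prefix skips the first word and the 34-character window covers the whole date, time, AM/PM marker and time zone that TIME_FORMAT is meant to describe. If your real events have a longer first word or a longer time zone name, the prefix still works, but the lookahead value may need checking.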
How do you get the lines in your comments/responses to start with 1., 2., 3. etc. on the Splunk Answers site? I've tried several methods... I must be missing something.
Thanks!
Select the text (make sure there is a blank line before it) and then press the 1010101 button in the editor's toolbar.
THANK YOU.
Thank you for the full props.conf. I need to understand all these settings better and know when to use them to successfully ingest data.
Thank you again guys!
My solution is far more efficient, as it takes less than half as long as using the BREAK_* settings.
Hi joesrepsolc,
If I understood correctly, you have a data flow like the one you showed and you want to ingest it through the UI, correct?
I suggest following the documentation at https://docs.splunk.com/Documentation/Splunk/7.3.0/Data/Getstartedwithgettingdatain to better understand how to get data into Splunk.
Anyway, you have to create a custom props.conf and put it on your indexers to parse your flow correctly.
If your sourcetype is "my_sourcetype", in your props.conf you have
[my_sourcetype]
SHOULD_LINEMERGE = True
LINE_BREAKER = ^>
TIME_PREFIX = ^>\w+\s
TIME_FORMAT = %B %d,%Y %H:%M:%S %p %Z
Bye.
Giuseppe
LINE_BREAKER must always contain a capturing group (and I'm not sure whether using ^ even makes sense) and when using LINE_BREAKER you must set SHOULD_LINEMERGE = false.
This should do the trick:
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\>\w+\s+\w+\s+\d+
Make sure to remove any other settings related to breaking (this can be a bit tricky in the GUI sometimes, as BREAK_ONLY_BEFORE keeps popping up).
Note: LINE_BREAKER = ([\r\n]+)\> would probably also be sufficient, but making it a bit more specific, like I did above, prevents the line breaker from triggering on other occurrences of >.
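For anyone who wants to see the difference between the two patterns, here is a small Python sketch that approximates how LINE_BREAKER splits a stream: the text matched by the first capturing group is thrown away and an event boundary is placed there, while anything matched after the group stays at the start of the next event. This is a simulation only, not Splunk's code. The sample data is invented (the original example is not shown in this thread), and `break_events`, `SPECIFIC` and `GENERIC` are names made up for the illustration.

```python
import re
from typing import List

SPECIFIC = r"([\r\n]+)\>\w+\s+\w+\s+\d+"  # the pattern suggested above
GENERIC = r"([\r\n]+)\>"                  # the shorter alternative


def break_events(raw: str, line_breaker: str) -> List[str]:
    """Split raw text roughly the way LINE_BREAKER does."""
    pattern = re.compile(line_breaker)
    if pattern.groups < 1:
        # Mirrors the rule that LINE_BREAKER needs a capturing group.
        raise ValueError("LINE_BREAKER needs at least one capturing group")
    events = []
    start = 0
    for match in pattern.finditer(raw):
        events.append(raw[start:match.start(1)])  # text before the group
        start = match.end(1)                      # group 1 itself is dropped
    events.append(raw[start:])
    return [event for event in events if event]


# Hypothetical data: two real events, one of which contains a
# continuation line that happens to start with ">".
raw = (
    ">Fri July 12,2019 10:15:30 AM EDT first entry\n"
    "continuation line\n"
    ">not a new event\n"
    ">Fri July 12,2019 10:16:02 AM EDT second entry"
)

specific_events = break_events(raw, SPECIFIC)
generic_events = break_events(raw, GENERIC)
print(len(specific_events), len(generic_events))
```

On this invented sample the specific pattern yields 2 events and the shorter one yields 3, because the shorter one also breaks on the ">not a new event" line. In both cases every event still starts with ">", since the ">" is outside the capturing group.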
This worked great! Man, I've really got to put the time in on regex and understand these ingestion settings in props.conf.
The datasets I've handled lately have been very well formatted or had predefined sourcetypes, so I'm out of practice.
I really appreciate the help, everyone.