Solved: Line break help with incoming log data

joesrepsolc · ‎06-14-2019

We're bringing in a new data source from an application, and the default line breaking is not working correctly. All of these entries end up in a single event, when there should be 8 events. Every new event starts with a ">" character in the first position. I cannot figure out what to enter in the Add Data area of the GUI for line breaks / new events. Any help would be great. Thank you!

EXAMPLE (currently 1 event, should be 8):

>Informational June 14, 2019 8:09:59 AM CDT
Transaction "Blah Blah": Job ended
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Directory changed to /
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Filtered directory listing for Directory: /
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Added to the pending run queue
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Timer created for CDTto repeat every 4 hours
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": File moved: \secret.seq
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": The size of the destination file secret.seq was successfully verified
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Primary archiving skipped

FrankVl · ‎06-14-2019

This should do the trick:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\>\w+\s+\w+\s+\d+

Make sure to remove any other settings related to breaking (this can be a bit tricky in the GUI sometimes, as BREAK_ONLY_BEFORE keeps popping up).

Note: LINE_BREAKER = ([\r\n]+)\> would probably also be sufficient, but making it a bit more specific, like I did above, prevents the line breaker from triggering on other occurrences of >.
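
To see why this regex yields 8 events, here is a minimal Python sketch that approximates the breaking behavior described above: the text is cut at each match, and only the first capturing group (the newlines) is discarded, so the ">" stays at the start of each event. This is an illustration under assumptions, not Splunk's implementation: Python's `re` module stands in for Splunk's regex engine, and `split_events` is a hypothetical helper name.

```python
import re

# Sample data copied from the original post.
raw = r'''>Informational June 14, 2019 8:09:59 AM CDT
Transaction "Blah Blah": Job ended
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Directory changed to /
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Filtered directory listing for Directory: /
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Added to the pending run queue
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Timer created for CDTto repeat every 4 hours
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": File moved: \secret.seq
>Informational June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": The size of the destination file secret.seq was successfully verified
>Debug June 14, 2019 8:10:00 AM CDT
Transaction "Blah Blah": Primary archiving skipped
'''

# Same pattern as the LINE_BREAKER setting above.
LINE_BREAKER = re.compile(r'([\r\n]+)\>\w+\s+\w+\s+\d+')


def split_events(text, breaker):
    """Approximate LINE_BREAKER: cut at each match, dropping only group 1."""
    events, start = [], 0
    for m in breaker.finditer(text):
        events.append(text[start:m.start(1)])  # event ends before the newlines
        start = m.end(1)                       # next event starts after them
    events.append(text[start:])
    return [e.strip('\r\n') for e in events if e.strip()]


events = split_events(raw, LINE_BREAKER)
print(len(events))
for e in events:
    print(repr(e.splitlines()[0]))
```

Each resulting event keeps its two lines together, because a newline that is not followed by ">" plus a word, a month, and a day number never matches the pattern.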

woodcock · ‎06-16-2019

You need these settings in props.conf:

[<Your Sourcetype Here>]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+\>)
TIME_PREFIX = ^\w+\s
TIME_FORMAT = %B %d, %Y %I:%M:%S %p %Z
MAX_TIMESTAMP_LOOKAHEAD = 34

If you are doing a sourcetype override/overwrite, make sure that you use THE ORIGINAL SOURCETYPE VALUE. Deploy this to the first full instance of Splunk that handles the events (usually the HF or indexer tier) and restart all Splunk instances there. Then forward in NEW events (old events will stay broken) and verify that it works by adding _index_earliest=-5m to your search, so that you are only looking at newly indexed events.
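
As a rough sanity check of the timestamp settings, the Python sketch below applies the TIME_PREFIX regex to the first line of an event (with the leading ">" already consumed by the line breaker), takes at most MAX_TIMESTAMP_LOOKAHEAD characters after it, and parses the result. It is only an approximation: Python's `re` and `datetime.strptime` stand in for Splunk's own engines, the format string is Python's (with `%I` for the 12-hour clock), and the zone abbreviation is split off because Python handles abbreviations like CDT unreliably.

```python
import re
from datetime import datetime

# First line of an event once LINE_BREAKER = ([\r\n]+\>) has removed the ">".
first_line = "Informational June 14, 2019 8:09:59 AM CDT"

TIME_PREFIX = re.compile(r"^\w+\s")
MAX_TIMESTAMP_LOOKAHEAD = 34

# The timestamp search starts right after the prefix match and looks
# at most MAX_TIMESTAMP_LOOKAHEAD characters ahead.
prefix = TIME_PREFIX.match(first_line)
candidate = first_line[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]

# Split off the zone abbreviation and parse only the date and time.
stamp, zone = candidate.rsplit(" ", 1)
parsed = datetime.strptime(stamp, "%B %d, %Y %I:%M:%S %p")

print(len(candidate), zone, parsed.isoformat())
```

The full timestamp in the sample is 28 characters long, so a lookahead of 34 leaves some room for longer month names and two-digit hours.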

joesrepsolc · ‎06-18-2019

How do you get the lines in your comments/responses to start with 1., 2., 3., etc. on the Splunk Answers site? I've tried several methods... I must be missing something.

Thanks!

FrankVl · ‎06-18-2019

Select the text (make sure there is a blank line before it) and then press the 1010101 button in the editor's toolbar.

joesrepsolc · ‎06-19-2019

THANK YOU.

joesrepsolc · ‎06-18-2019

Thank you for the full props.conf. I need to understand all these settings better and know when to use them to successfully ingest data.

Thank you again guys!

woodcock · ‎06-20-2019

My solution is far more efficient as it takes less than half as long as using the BREAK_* settings.

gcusello · ‎06-14-2019

Hi joesrepsolc,
If I understood correctly, you have a data flow like the one you showed and you want to ingest it through the UI, correct?

I suggest following the documentation at https://docs.splunk.com/Documentation/Splunk/7.3.0/Data/Getstartedwithgettingdatain to better understand how to get data into Splunk.
Anyway, you have to create a custom props.conf and put it on your indexers to correctly parse your flow.

If your sourcetype is "my_sourcetype", your props.conf should contain:
[my_sourcetype]
SHOULD_LINEMERGE = True
LINE_BREAKER = ^>
TIME_PREFIX = \>\w+\s
TIME_FORMAT = %B %d, %Y %I:%M:%S %p %Z

Bye.
Giuseppe

FrankVl · ‎06-14-2019

LINE_BREAKER must always contain a capturing group (and I'm not sure whether using ^ even makes sense) and when using LINE_BREAKER you must set SHOULD_LINEMERGE = false.
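
The capturing-group point can be illustrated with a small Python sketch (Python's `re` module is used here as a stand-in for Splunk's regex engine, so treat it as an approximation). It reports how many capturing groups each candidate pattern has and what the first group would capture on a short sample, which is the text that gets discarded between events.

```python
import re

# Two consecutive lines from the sample data, joined by a newline.
sample = 'Transaction "Blah Blah": Job ended\n>Debug June 14, 2019 8:10:00 AM CDT'

# Candidate LINE_BREAKER values suggested in this thread.
patterns = [r"^>", r"([\r\n]+)\>", r"([\r\n]+\>)"]

report = {}
for p in patterns:
    rx = re.compile(p)
    m = rx.search(sample)
    # rx.groups is the number of capturing groups in the pattern.
    captured = m.group(1) if (m and rx.groups) else None
    report[p] = (rx.groups, captured)

for p, (n_groups, captured) in report.items():
    print(f"{p!r}: groups={n_groups}, group 1 captures {captured!r}")
```

In this sketch, `^>` has no capturing group at all, `([\r\n]+)\>` captures only the newline (so the ">" stays in the event), and `([\r\n]+\>)` captures the newline together with the ">" (so the ">" is dropped as well).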

joesrepsolc · ‎06-18-2019

This worked great! Man, I really got to put the time in on regex and understand these ingestion settings in props.conf.

The datasets I've worked with lately have been very well formatted or had predefined sourcetypes, so I'm out of practice.

So appreciate the help everyone.
