Getting Data In

Line break help with incoming log data

Communicator

We're bringing in a new data source from an application, and the default line breaking is not working correctly. All of these entries end up in a single event, but there should be 8 events. Every new event starts with a ">" character in the first position. I cannot figure out what to enter in the Add Data area of the GUI for line breaks / new events. Any help would be great. Thank you!

EXAMPLE (currently 1 event, should be 8):

  1. >Informational June 14, 2019 8:09:59 AM CDT
  2. Transaction "Blah Blah": Job ended
  3. >Debug June 14, 2019 8:10:00 AM CDT
  4. Transaction "Blah Blah": Directory changed to /
  5. >Debug June 14, 2019 8:10:00 AM CDT
  6. Transaction "Blah Blah": Filtered directory listing for Directory: /
  7. >Informational June 14, 2019 8:10:00 AM CDT
  8. Transaction "Blah Blah": Added to the pending run queue
  9. >Informational June 14, 2019 8:10:00 AM CDT
  10. Transaction "Blah Blah": Timer created for CDTto repeat every 4 hours
  11. >Informational June 14, 2019 8:10:00 AM CDT
  12. Transaction "Blah Blah": File moved: \secret.seq
  13. >Informational June 14, 2019 8:10:00 AM CDT
  14. Transaction "Blah Blah": The size of the destination file secret.seq was successfully verified
  15. >Debug June 14, 2019 8:10:00 AM CDT
  16. Transaction "Blah Blah": Primary archiving skipped
1 Solution

Ultra Champion

This should do the trick:

SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\>\w+\s+\w+\s+\d+

Make sure to remove any other settings related to event breaking. This can be a bit tricky in the GUI, as BREAK_ONLY_BEFORE sometimes keeps popping up.

Note: LINE_BREAKER = ([\r\n]+)\> would probably also be sufficient, but making it a bit more specific, as I did above, prevents the line breaker from triggering on other occurrences of > that follow a line break.
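
A rough way to see what this LINE_BREAKER does to the sample data is to emulate it outside Splunk. The Python sketch below is only an approximation: it assumes that Splunk discards the text matched by the first capturing group and starts the next event right after it. `break_events` is a hypothetical helper, not a Splunk API, and Python's `re` module stands in for Splunk's regex engine.

```python
import re

# Same pattern as the LINE_BREAKER above; the first capturing group holds the
# newline(s) that are thrown away between events.
LINE_BREAKER = re.compile(r"([\r\n]+)\>\w+\s+\w+\s+\d+")

def break_events(raw: str) -> list[str]:
    """Split raw text into events, dropping only what group 1 matched."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # event ends before the newline(s)
        start = m.end(1)                      # next event starts at ">"
    events.append(raw[start:])
    return events

# First few lines of the sample from the question (forum line numbers removed).
sample = "\n".join([
    '>Informational June 14, 2019 8:09:59 AM CDT',
    'Transaction "Blah Blah": Job ended',
    '>Debug June 14, 2019 8:10:00 AM CDT',
    'Transaction "Blah Blah": Directory changed to /',
    '>Informational June 14, 2019 8:10:00 AM CDT',
    'Transaction "Blah Blah": Added to the pending run queue',
])

events = break_events(sample)
for event in events:
    print(repr(event))
```

Each printed event should contain one ">" header line plus the "Transaction" line that follows it, which is the pairing the original question asks for.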

Esteemed Legend

You need these settings in props.conf:

[<Your Sourcetype Here>]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+\>)
TIME_PREFIX = ^\w+\s
TIME_FORMAT = %B %d, %Y %I:%M:%S %p %Z
MAX_TIMESTAMP_LOOKAHEAD = 34

If you are doing a sourcetype override/overwrite, make sure that you use THE ORIGINAL SOURCETYPE VALUE. Deploy this to the first full instance of Splunk that handles the events (usually the HF or indexer tier), restart all Splunk instances there, and forward in NEW events (old events will stay broken). Verify that it works by adding _index_earliest=-5m to your search, so that you are only looking at newly indexed events.
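
The TIME_PREFIX and MAX_TIMESTAMP_LOOKAHEAD values above can be sanity-checked against the sample data. The sketch below is a Python approximation, not Splunk itself. It assumes that the leading ">" has already been dropped because it sits inside the LINE_BREAKER capturing group, and the "September" timestamp is a made-up worst case rather than a line from the log.

```python
import re

TIME_PREFIX = re.compile(r"^\w+\s")  # skips the severity word and one space

# Event text as it would look after the ">" has been dropped (assumption).
event = 'Informational June 14, 2019 8:09:59 AM CDT\nTransaction "Blah Blah": Job ended'

prefix = TIME_PREFIX.match(event)
timestamp = event[prefix.end():].splitlines()[0]
print(timestamp)       # June 14, 2019 8:09:59 AM CDT
print(len(timestamp))  # 28

# Hypothetical longest case: longest month name and a two-digit hour.
longest = "September 14, 2019 10:09:59 AM CDT"
print(len(longest))    # 34
```

Under these assumptions the longest timestamp is 34 characters, which lines up with MAX_TIMESTAMP_LOOKAHEAD = 34 being counted from the end of the TIME_PREFIX match.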

Communicator

How do you get the lines in your comments/responses to start with 1., 2., 3., etc. on the Splunk Answers site? I've tried several methods... I must be missing something.

Thanks!

Ultra Champion

Select the text (make sure there is a blank line before it) and then press the 1010101 button in the editor's toolbar.

Communicator

THANK YOU.

Communicator

Thank you for the full props.conf. I need to understand all these settings better and know when to use them to successfully ingest data.

Thank you again guys!

Esteemed Legend

My solution is far more efficient as it takes less than half as long as using the BREAK_* settings.

SplunkTrust

Hi joesrepsolc,
if I understood correctly, you have a data flow like the one you showed and you want to ingest it through the UI, correct?

I suggest following the documentation at https://docs.splunk.com/Documentation/Splunk/7.3.0/Data/Getstartedwithgettingdatain to better understand how to get data into Splunk.
In any case, you have to create a custom props.conf and put it on your indexers to parse your flow correctly.

If your sourcetype is "my_sourcetype", in your props.conf you have
[my_sourcetype]
SHOULD_LINEMERGE = True
LINE_BREAKER = ^>
TIME_PREFIX = ?>\w+\s
TIME_FORMAT = %B %d, %Y %I:%M:%S %p %Z

Bye.
Giuseppe

Ultra Champion

LINE_BREAKER must always contain a capturing group (and I'm not sure whether using ^ even makes sense) and when using LINE_BREAKER you must set SHOULD_LINEMERGE = false.
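
The capturing-group requirement can be checked mechanically. The Python sketch below uses the `re` module as a stand-in for Splunk's regex engine (an assumption) and simply counts the groups in each candidate pattern. `has_capturing_group` is a hypothetical helper, not something Splunk provides.

```python
import re

def has_capturing_group(pattern: str) -> bool:
    """Return True if the regex defines at least one capturing group."""
    return re.compile(pattern).groups > 0

candidates = [
    r"^>",           # LINE_BREAKER from the props.conf above: no group
    r"([\r\n]+)\>",  # newline(s) captured, followed by a literal >
]

for pattern in candidates:
    print(pattern, has_capturing_group(pattern))
```

Only the second pattern defines a group, so it is the only one of the two that gives the line breaker something to discard between events.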

Communicator

This worked great! Man, I really got to put the time in on regex and understand these ingestion settings in props.conf.

Datasets have been very well formatted/predefined sourcetypes lately. Out of practice.

So appreciate the help everyone.
