
Why is BREAK_ONLY_BEFORE not working as expected for all my events?

arunloganathan
New Member

I am using the following configuration in props.conf. It splits most of the events correctly, but 2 or 3 events are broken in the wrong place. Do I need to set SHOULD_LINEMERGE = false?

[my_source_type]
NO_BINARY_CHECK = true
BREAK_ONLY_BEFORE = ^\d{1,11}\s?,(([^\,]+)?\,?\.?),(([^\,]+)?\,?\.?)
MAX_TIMESTAMP_LOOKAHEAD = 100
TIME_FORMAT = %Y%m%d%H%M%S%6N
TIME_PREFIX = ^(?:[^,\n]*,){7}
disabled = false
pulldown_type = true
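
As a quick sanity check outside Splunk, the TIME_PREFIX and TIME_FORMAT settings above can be tried against a record in Python. This is only a sketch: the record is the first sample event from this post shortened to its first eight fields plus the final one, Python's `%f` is assumed to be a close enough stand-in for Splunk's `%6N`, and `extract_timestamp` is a hypothetical helper, not anything Splunk provides.

```python
import re
from datetime import datetime

# Same pattern as TIME_PREFIX in the stanza above: skip the first seven fields.
TIME_PREFIX = re.compile(r"^(?:[^,\n]*,){7}")
# Assumption: Python's %f (1-6 digit fraction) approximates Splunk's %6N.
TIME_FORMAT = "%Y%m%d%H%M%S%f"

def extract_timestamp(event):
    """Return the timestamp found after TIME_PREFIX, or None if the prefix fails."""
    m = TIME_PREFIX.match(event)
    if m is None:
        return None
    # The timestamp is whatever sits between the prefix and the next comma.
    field = event[m.end():].split(",", 1)[0]
    return datetime.strptime(field, TIME_FORMAT)

# First sample event, shortened (fields 1-8 plus the last field).
sample_event = (
    "07986376244,Mrs,xxxx,40369036,29.06.2016,14:00,21:00,"
    "20160628070106529271,SUCCESS"
)
timestamp = extract_timestamp(sample_event)
print(timestamp)
```

If the prefix or format were wrong for a record, the helper would return None or raise a ValueError, which makes it easy to spot records whose eighth field is not a 20-digit timestamp.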

This is a .dat file, and it has more than 8000 events in a single file.

Sample data

Actual events

07986376244,Mrs,xxxx,40369036,29.06.2016,14:00,21:00,20160628070106529271,/ablive/data/xx/serial/yy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./MessageReminderPM201606280700120000.csv,MessageReminderPM201606280700120000.csv,38,4c7ca670-eddf-4362-8f4b-20ea99007a0b,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1b5a,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1bca,2016-06-28T07:02:23.224Z,2016-06-28T07:02:26.890Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

07941158158,Mr,yyyyy,40360516,29.06.2016,14:00,21:00,20160628070106516893,/ablive/data/xx/serial/yy/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./MessageReminderPM201606280700120000.csv,MessageReminderPM201606280700120000.csv,36,4a140e0f-69e4-44d3-a5ce-dfb186c9a081,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-19c6,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1a2f,2016-06-28T07:02:17.050Z,2016-06-28T07:02:19.816Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS

indexed events

ELIVERY/delivery_messages_inbound/pending/./MessageReminderPM201606280700120000.csv,MessageReminderPM201606280700120000.csv,38,4c7ca670-eddf-4362-8f4b-20ea99007a0b,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1b5a,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1bca,2016-06-28T07:02:23.224Z,2016-06-28T07:02:26.890Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS
xx/serial/JL/DISTRIBUTION/DELIVERY/delivery_messages_inbound/pending/./MessageReminderPM201606280700120000.csv,MessageReminderPM201606280700120000.csv,36,4a140e0f-69e4-44d3-a5ce-dfb186c9a081,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-19c6,225b00fe-26-a633-5e21f14e2-ac168f26_5772129e_37501fc-1a2f,2016-06-28T07:02:17.050Z,2016-06-28T07:02:19.816Z,44,GB,Scheduled,Success,SUCCESS,SUCCESS
07986356244,Mrs,Mason,40369036,29.06.2016,14:00,21:00,20160628070106529271,/ablive/data/xx/serial/yy/DISTRIBUTION/D
07941156158,Mr,Hurley,40360516,29.06.2016,14:00,21:00,20160628070106516893,/ablive/data/

Thanks in advance

0 Karma

somesoni2
SplunkTrust

Give this a try

 [my_source_type]
 NO_BINARY_CHECK = true
 LINE_BREAKER = ([\r\n]+)(\d{1,11}\s?,(([^\,]+)?\,?\.?),(([^\,]+)?\,?\.?))
 MAX_TIMESTAMP_LOOKAHEAD = 20
 TIME_FORMAT = %Y%m%d%H%M%S%6N
 TIME_PREFIX = ^(?:[^,\n]*,){7}
 SHOULD_LINEMERGE = false
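
To see what the LINE_BREAKER pattern above does, here is a rough Python simulation. It assumes the usual LINE_BREAKER behaviour: the text matched by the first capture group is discarded, what precedes it ends one event, and what follows starts the next. The records are the thread's two sample events shortened to nine fields, and `split_events` is a hypothetical helper written for this illustration.

```python
import re

# Python port of the LINE_BREAKER pattern from the stanza above.
LINE_BREAKER = re.compile(
    r"([\r\n]+)(\d{1,11}\s?,(([^\,]+)?\,?\.?),(([^\,]+)?\,?\.?))"
)

def split_events(raw):
    """Split raw text into events, discarding only capture group 1 (the newlines)."""
    events = []
    start = 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # text before the newline run
        start = m.end(1)                      # next event begins after it
    events.append(raw[start:])
    return events

# Shortened versions of the two sample events (fields 1-8 plus the last field).
sample_records = [
    "07986376244,Mrs,xxxx,40369036,29.06.2016,14:00,21:00,20160628070106529271,SUCCESS",
    "07941158158,Mr,yyyyy,40360516,29.06.2016,14:00,21:00,20160628070106516893,SUCCESS",
]
events = split_events("\r\n".join(sample_records))
for event in events:
    print(event)
```

Because a break only happens where a newline is followed by the phone-number-like prefix, a stray newline in the middle of a record would not start a new event in this simulation.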
0 Karma

jkat54
SplunkTrust

BREAK_ONLY_BEFORE only takes effect when SHOULD_LINEMERGE = true. You can set that explicitly, but it's not required, since true is the default.

Sorry for so many versions of this answer... I get confused on this one all the time 😉

Here's the section in props.conf:

http://docs.splunk.com/Documentation/Splunk/6.4.1/Admin/Propsconf#Line_breaking

See if this works:

BREAK_ONLY_BEFORE = \d{1,11}\s?,(([^\,]+)?\,?.?),(([^\,]+)?\,?.?)

Note about BREAK_ONLY_BEFORE
* When set, Splunk creates a new event only if it encounters a new line that matches the regular expression.
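
The rule in that note can be sketched in Python as follows. This is only an illustration of the documented behaviour, not Splunk's implementation: it uses the anchored pattern from the original question, `merge_lines` is a hypothetical helper, and the wrapped record is made up by splitting the first sample event across two lines.

```python
import re

# Anchored BREAK_ONLY_BEFORE pattern from the original question.
BREAK_ONLY_BEFORE = re.compile(r"^\d{1,11}\s?,(([^\,]+)?\,?\.?),(([^\,]+)?\,?\.?)")

def merge_lines(lines):
    """Start a new event only at lines matching the pattern; append the rest."""
    events = []
    for line in lines:
        if BREAK_ONLY_BEFORE.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line  # continuation of the current event
    return events

# Hypothetical input: the first record wrapped onto a second physical line.
lines = [
    "07986376244,Mrs,xxxx,40369036,29.06.2016,14:00,21:00,20160628070106529271,",
    "Scheduled,Success,SUCCESS,SUCCESS",
    "07941158158,Mr,yyyyy,40360516,29.06.2016,14:00,21:00,20160628070106516893,SUCCESS",
]
merged = merge_lines(lines)
print(len(merged))
```

Note that dropping the leading `^` changes the result here: searched anywhere in a line, the pattern would also match later fields such as `40369036,29.06.2016,14:00,`.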

I like the idea of using INDEXED_EXTRACTIONS = CSV instead.

0 Karma

ryanoconnor
Builder

Just to note, it's recommended not to use SHOULD_LINEMERGE = true if you can avoid it. You'll see significant performance gains with SHOULD_LINEMERGE = false, because it skips the line-merging stage of the data pipeline.

0 Karma

sundareshr
Legend

This appears to be a CSV file. Have you tried INDEXED_EXTRACTIONS?

http://docs.splunk.com/Documentation/Splunk/6.4.1/Data/Extractfieldsfromfileswithstructureddata
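
Before relying on CSV extraction, it may be worth checking that every record in the file really has the same number of fields. A minimal Python sketch, where `field_count_outliers` is a hypothetical helper and the input is made up from shortened sample events plus one deliberately truncated record:

```python
import csv
import io
from collections import Counter

def field_count_outliers(text):
    """Return (record_number, field_count) for records that differ from the most common width."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]  # skip blank lines
    if not rows:
        return []
    expected = Counter(len(row) for row in rows).most_common(1)[0][0]
    return [(i, len(row)) for i, row in enumerate(rows, start=1) if len(row) != expected]

# Two shortened sample records and one hypothetical truncated record.
sample_csv = (
    "07986376244,Mrs,xxxx,40369036,29.06.2016,14:00,21:00,20160628070106529271,SUCCESS\n"
    "07941158158,Mr,yyyyy,40360516,29.06.2016,14:00,21:00,20160628070106516893,SUCCESS\n"
    "07941158158,Mr,yyyyy,40360516,29.06.2016\n"
)
outliers = field_count_outliers(sample_csv)
print(outliers)
```

Records are numbered after blank lines are skipped, so the numbers refer to non-empty records rather than physical lines in the file.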

0 Karma

arunloganathan
New Member

I tried INDEXED_EXTRACTIONS = CSV. All events get merged into one single event.

0 Karma

ryanoconnor
Builder

Can you let us know exactly what your props.conf looks like for this sourcetype now?

0 Karma