Hi all,
I have a log sample here that I need to break into events.
I have already tried LINE_BREAKER and BREAK_ONLY_BEFORE:
LINE_BREAKER=\w+\d+\|\w+_\w+_\w+\s+\d+/\d+/\d+\|\d+\|\d+\|\d+\|\d+\|\w+\s+--------------------------------------------------------------------------------
AND
BREAK_ONLY_BEFORE=\w+\d+\|\w+_\w+_\w+\s+\d+/\d+/\d+\|\d+\|\d+\|\d+\|\d+\|\w+\s+--------------------------------------------------------------------------------
My event should break before (for example)
SMSMSMSM|REALITY0|20150325|060128|20150325|061116|Completed
--------------------------------------------------------------------------------
but the regex is not working.
Please refer to the attachment for my sample log.
Before you can linebreak something, you need to know exactly where and when you want a linebreak. If the first thing in a new event is not consistently the same, you need to work out another way to identify those elements reliably. I'm assuming that there is an unlimited number of possible "words" at the beginning of a new event, so the only thing we can do is rely on the pattern that occurs before the 80 - characters (given that there are always exactly 80 of them). Here is my go at that; see if it does what you want at https://regex101.com/ (you can paste the regex and your log there and see it live in action, which is probably better than trying it out with your props.conf right away):
([\r\n]+)\w+\|.*\|\d*\|\d*\|\d*\|\d*\|\w+\n\-{80}
What this does is basically look for a linebreak followed by a word, then optionally anything between some pipes, ended by a word, a newline and 80 - characters. What helped me a lot was this blog post: http://blogs.splunk.com/2014/04/23/its-that-time-again/
Hope this is in the right direction. I do not know how the long part of
PA_COMP|P|MWSI|
PA_CYCLE|P|201503|
PA_LAUFI|P|CV0322|
PA_PORTN|P|08_BG22|
PA_STATR|P||
SO_IDID|S|127137382|
SO_IDID|S|127137384|
SO_IDID|S|127137386|
...
at the beginning is supposed to be indexed; right now it belongs to the event above it.
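For anyone without regex101 at hand, the regex in this reply can also be checked with a few lines of Python. This is only a rough sketch: Python's re module is close to, but not identical to, the PCRE engine Splunk uses, and SAMPLE is a shortened, hypothetical stand-in for the attached log (built from lines quoted in this thread, with LF line endings assumed). The helper name matched_headers is made up for illustration.

```python
import re

# The LINE_BREAKER candidate from the reply above, unchanged.
PATTERN = re.compile(r"([\r\n]+)\w+\|.*\|\d*\|\d*\|\d*\|\d*\|\w+\n\-{80}")

DASHES = "-" * 80

# Shortened, hypothetical stand-in for the log (LF line endings assumed).
SAMPLE = "\n".join([
    "PA_COMP|P|MWSI|",
    "PA_CYCLE|P|201503|",
    "SMSMSMSM|REALITY0|20150325|060128|20150325|061116|Completed",
    DASHES,
    "ABCDEFG|S|03000036|",
    "SMSMSMSM|REALITY0|20150325|061528|20150325|062347|Completed",
    DASHES,
    "ABCDEFG|S|03000211|",
])


def matched_headers(text):
    """Return the header line that follows each matched line break."""
    headers = []
    for m in PATTERN.finditer(text):
        # Group 1 is the line break itself; the header starts right after it.
        header = text[m.end(1):m.end()].split("\n")[0]
        headers.append(header)
    return headers


if __name__ == "__main__":
    for header in matched_headers(SAMPLE):
        print(header)
```

Only the two ...|Completed header lines should be reported; the shorter PA_... and ABCDEFG... lines do not have enough pipe-separated fields to match.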
Hi @jeffland, I will check on this and let you know what happens. Thanks 😄
Hi @jeffland, it is still not working. Have you tried indexing the log file I provided?
Yeah, it works fine for me. Although I have to say, your timestamps are a mess.
But I have found something even prettier:
(\-{80}[\r\n]+)
This makes all those - disappear as well. If this does not work for you, then I suspect there is something wrong with the way you're trying to apply the settings. Did you define a new custom sourcetype?
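To see why the dashes disappear: as far as I understand LINE_BREAKER, the text matched by the first capture group is discarded and the events are cut at that spot. Because this regex puts the whole match inside the group, the effect can be approximated in Python by simply splitting on the pattern. This is a sketch under that assumption, not Splunk's actual implementation; SAMPLE is a shortened, hypothetical stand-in for the log file, and split_on_dashes is a made-up helper name.

```python
import re

DASHES = "-" * 80

# Shortened, hypothetical stand-in for the log (LF line endings assumed).
SAMPLE = "\n".join([
    "PA_COMP|P|MWSI|",
    "SMSMSMSM|REALITY0|20150325|060128|20150325|061116|Completed",
    DASHES,
    "ABCDEFG|S|03000036|",
    "SMSMSMSM|REALITY0|20150325|061528|20150325|062347|Completed",
    DASHES,
    "ABCDEFG|S|03000211|",
]) + "\n"


def split_on_dashes(text):
    """Approximate LINE_BREAKER = (\\-{80}[\\r\\n]+) by splitting on the match."""
    # Same pattern as in LINE_BREAKER, minus the outer capture group:
    # re.split would otherwise return the captured dashes as list items too.
    parts = re.split(r"\-{80}[\r\n]+", text)
    return [p for p in parts if p]


if __name__ == "__main__":
    for number, event in enumerate(split_on_dashes(SAMPLE), start=1):
        print(f"--- event {number} ---")
        print(event, end="")
```

With this breaker, each ...|Completed line ends up as the last line of an event, and the lines following the dashes start the next one.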
Hello @jeffland, I'm trying to define a custom sourcetype when indexing the log file. I wonder why it doesn't work for me.
@jeffland, would you mind posting your props.conf for the sourcetype you used? That would help me a lot in understanding what you did with the line break.
In /etc/system/local/props.conf, I have
[temp_dummy_line]
LINE_BREAKER = (\-{80}[\r\n]+)
SHOULD_LINEMERGE = false
category = Custom
disabled = false
pulldown_type = true
When I import your logfile, I select Custom -> temp_dummy_line from the sourcetype menu, and this gives me these very nice events:
http://postimg.org/image/u1h31evzj/
I don't know how your timestamps work, but I even tried to add the following two lines to the same props.conf stanza:
DATETIME_CONFIG = /etc/temp_linebreak.xml
MAX_TIMESTAMP_LOOKAHEAD = 0
And in the temp_linebreak.xml, I put
<datetime>
<define name="time" extract="hour, minute, second">
<text><![CDATA[20\d{6}\|(\d{2})(\d{2})(\d{2})]]></text>
</define>
<define name="date" extract="year, month, day">
<text><![CDATA[20(\d{2})(\d{2})(\d{2})]]></text>
</define>
<timePatterns>
<use name="time"/>
</timePatterns>
<datePatterns>
<use name="date"/>
</datePatterns>
</datetime>
This may be the wrong interpretation of your timestamps, but at least every event has a timestamp now.
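A quick way to see what those two patterns pick out of a header line is to run them in Python. This is only a sketch: it shows what the capture groups match, not how Splunk combines them into a timestamp, and Python's re module may differ from Splunk's regex engine in edge cases. The helper name extract_parts is made up for illustration.

```python
import re

# The two patterns from temp_linebreak.xml above, unchanged.
TIME_RE = re.compile(r"20\d{6}\|(\d{2})(\d{2})(\d{2})")
DATE_RE = re.compile(r"20(\d{2})(\d{2})(\d{2})")


def extract_parts(line):
    """Return ((hour, minute, second), (year, month, day)) as strings, or None."""
    time_match = TIME_RE.search(line)
    date_match = DATE_RE.search(line)
    if time_match is None or date_match is None:
        return None
    return time_match.groups(), date_match.groups()


if __name__ == "__main__":
    header = "SMSMSMSM|REALITY0|20150325|060128|20150325|061116|Completed"
    print(extract_parts(header))
```

Both patterns match the first date/time pair in the line (20150325|060128 here), so the second pair is ignored, and the year is captured as two digits.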
@jeffland, I will try this. By the way, thank you for the effort you put into working out the timestamps. I'll get back to you in a while.
Hello @jeffland, it works, but there is a misunderstanding between us.
What you meant is this: http://postimg.org/image/fx32ptft5/
What you did is break the events after every line of long dashes ---...---
but what I want my event to be is this: http://postimg.org/image/6ssmefynl/
I enclosed the event I want in a red rectangle.
Please bear with me.
Thank you very, very much.
You're welcome. Any help I can give is training for me.
Ah, so the parts with many lines of PA_NOTIF_... and ABCDEFG_... belong to the event before that. Does this also apply to the first event in your log, i.e. does the long part of PA_COMP... belong to GARETTE...? And what about the first EMEM1... which is not divided from the first PA_COMP... by 80 - characters, does it not belong to the long part of PA_COMP... as well but is indeed also a new event? If the answer is yes to all those questions, then this is your regex:
([\r\n]+)(?:[^|]*\|){6}\w*\n\-{80}
This looks for a linebreak (which will mark your new event), six instances of | with something (or nothing) between them followed by a word (which so far is "Completed" in your data), a newline and 80 - characters.
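The effect of this regex as a LINE_BREAKER can be approximated in Python. The sketch below assumes the documented behaviour that the previous event ends where the first capture group starts, the next event begins where it ends, and the captured text is dropped. It is not Splunk's implementation; SAMPLE is a shortened, hypothetical stand-in for the log, and split_events is a made-up helper name.

```python
import re

# The LINE_BREAKER from the reply above, unchanged.
LINE_BREAKER = re.compile(r"([\r\n]+)(?:[^|]*\|){6}\w*\n\-{80}")

DASHES = "-" * 80

# Shortened, hypothetical stand-in for the log (LF line endings assumed).
SAMPLE = "\n".join([
    "PA_COMP|P|MWSI|",
    "PA_CYCLE|P|201503|",
    "SMSMSMSM|REALITY0|20150325|061528|20150325|062347|Completed",
    DASHES,
    "ABCDEFG|S|03000036|",
    "MASSAKT|P||",
    "SMSMSMSM|REALITY0|20150325|061628|20150325|062401|Completed",
    DASHES,
    "ABCDEFG|S|03000211|",
    "MASSAKT|P||",
])


def split_events(text, breaker=LINE_BREAKER):
    """Cut text into events at group 1 of each match, dropping group 1."""
    events = []
    start = 0
    for m in breaker.finditer(text):
        events.append(text[start:m.start(1)])  # previous event ends here
        start = m.end(1)                       # next event starts here
    events.append(text[start:])
    return [e for e in events if e]


if __name__ == "__main__":
    for number, event in enumerate(split_events(SAMPLE), start=1):
        print(f"--- event {number} ---")
        print(event)
```

On this sample, every ...|Completed header starts a new event that keeps its dashes and the detail lines below it, while the PA_... lines at the top form an event of their own.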
Hope this is it 🙂
Hello @jeffland, I will definitely try this one 🙂
Hello again @jeffland. I used the line breaker you provided, and what I get is this: http://postimg.org/image/ip8wbeti1/
My props.conf:
[jm_dummy]
LINE_BREAKER = ([\r\n]+)(?:[^|]*\|){6}\w*\n\-{80}
SHOULD_LINEMERGE = false
category = Custom
disabled = false
pulldown_type = true
Did you get the same output?
THIS WORKS @jeffland!! 🙂 Amazing! What I used is the regex (?:[^|]*\|){6}\w*
and here's what I got: http://postimg.org/image/j4y1q82fr/full/
Thank you very much. You've been so helpful, @jeffland.
Very good, glad I could help.
I haven't fully understood where in that file you want linebreaks. Exactly before the date inside a line? On the many ---? You should try your regular expressions at https://regex101.com/, they have a nice visualization. Your code for example has unescaped delimiters.
Hi @jeffland, here's a sample:
SMSMSMSM|REALITY0|20150325|061528|20150325|062347|Completed
--------------------------------------------------------------------------------
ABCDEFG|S|03000036|
ABCDEFG|S|03000040|
ABCDEFG|S|03000073|
ABCDEFG|S|03000076|
ABCDEFG|S|03000080|
ABCDEFG|S|03000081|
ABCDEFG|S|03000091|
ABCDEFG|S|03000092|
ABCDEFG|S|03000093|
ABCDEFG|S|03000095|
ABCDEFG|S|03000097|
ABCDEFG|S|03000103|
ABCDEFG|S|03000104|
ABCDEFG|S|03000146|
ABCDEFG|S|03000160|
ABCDEFG|S|03000176|
ABLESGR|P|01|
ANLAGE|S||
BEGABL|S|03/01/2015|03/29/2015
COUNTREQ|P| 0|
EXTNR|P||
GEPLAART|P|01|
GPLARTTS|P||
IGNPREP|P|X|
KARPRFG|P|X|
MASSAKT|P||
SMSMSMSM|REALITY0|20150325|061628|20150325|062401|Completed
--------------------------------------------------------------------------------
ABCDEFG|S|03000211|
ABCDEFG|S|03000212|
ABCDEFG|S|03000215|
ABCDEFG|S|03000219|
ABCDEFG|S|03000220|
ABCDEFG|S|03000245|
ABCDEFG|S|03000256|
ABCDEFG|S|03000258|
ABCDEFG|S|03000283|
ABCDEFG|S|03000325|
ABCDEFG|S|03000360|
ABCDEFG|S|03000362|
ABCDEFG|S|03000370|
ABCDEFG|S|03000371|
ABCDEFG|S|03000600|
ABCDEFG|S|03000620|
ABLESGR|P|01|
ANLAGE|S||
BEGABL|S|03/01/2015|03/29/2015
COUNTREQ|P| 0|
EXTNR|P||
GEPLAART|P|01|
GPLARTTS|P||
IGNPREP|P|X|
KARPRFG|P|X|
MASSAKT|P||
and I want my event to break like this:
ABCDEFG|S|03000211|
ABCDEFG|S|03000212|
ABCDEFG|S|03000215|
ABCDEFG|S|03000219|
ABCDEFG|S|03000220|
ABCDEFG|S|03000245|
ABCDEFG|S|03000256|
ABCDEFG|S|03000258|
ABCDEFG|S|03000283|
ABCDEFG|S|03000325|
ABCDEFG|S|03000360|
ABCDEFG|S|03000362|
ABCDEFG|S|03000370|
ABCDEFG|S|03000371|
ABCDEFG|S|03000600|
ABCDEFG|S|03000620|
ABLESGR|P|01|
ANLAGE|S||
BEGABL|S|03/01/2015|03/29/2015
COUNTREQ|P| 0|
EXTNR|P||
GEPLAART|P|01|
GPLARTTS|P||
IGNPREP|P|X|
KARPRFG|P|X|
MASSAKT|P||
I'm sorry, that didn't make it much clearer. We need to find something that identifies a breakpoint. Is it only on lines like
SMSMSMSM|...|Completed
-----...---
? Or is it also on
EMEM1|...|Completed
-----...---
?
I'm sorry about that, @jeffland. Actually, the breakpoint should only be before
\w+(any word)|.....|\w+(any word also)\s+ -----------...------------
So every time Splunk sees this \w+|...|\w+\s+ -----------...------------
it should break the events. I hope that makes it clearer now. Please do help me, I need to get this done 😞