Hello all,
I have configured the props file NOT to break the event when it encounters a new line with a date. However, sometimes the event is broken at the line containing the date and sometimes it is not. I don't understand the reason for the different behaviors.
Props file:
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=SOMEJUNK
MAX_TIMESTAMP_LOOKAHEAD=450
TIME_PREFIX==\s+\w{3}
TIME_FORMAT=%m/%d/%y %H:%M:%S %Z
File that is being read:
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 43977410
= Tue 01/02/18 15:50:05 BRST
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0
= Tue 01/02/18 15:50:39 BRST
Sometimes I get the full multiline event containing all 12 lines, but sometimes the event is truncated, as in the sample below:
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 35391514
= Tue 01/02/18 15:51:10 BRST
I need the entire log text indexed (12 lines), not only the 5 lines above. I don't know why the event is sometimes broken at the date line.
Thanks and regards,
Danillo Pavan
Hello alemarzu,
Actually, the props file was altered in my last post. The correct SEDCMD settings are:
"SEDCMD-applychange01=s/[\r\n]\s*[A-z]+.+//g"
"SEDCMD-applychange02=s/(**+.)//g"
"SEDCMD-applychange04=s/(+++.)//g"
"SEDCMD-applychange05=s/(==+[\r\n]*)//g"
The "\" was missing in my last post.
As for the debug command, I ran it and got the result below:
ANNOTATE_PUNCT = True
AUTO_KV_JSON = true
BREAK_ONLY_BEFORE = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2}\d{2}:\d{2}:\d{2} [A-Z]+)
BREAK_ONLY_BEFORE_DATE = True
CHARSET = UTF-8
DATETIME_CONFIG =
EXTRACT-endTimeExecution = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\ s+-\w+\s+\w+\w+\w+\s+-\w+\s+\w+\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+\d+:\ d+:\d+\s+\w+\s+=\s+\w+\s+\w+\s+:\s+\d+\s+=\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+=\s+\w+\s+\w+\s+ (\w+)\s+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+(?P[^ ]+)
EXTRACT-jobName = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s +(?P[^ ]+)
EXTRACT-jobNumber = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+ \s+\w+\w+\w+\s+-\w+\s+\w+_\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+(?P\d+)
HEADER_MODE =
LEARN_SOURCETYPE = true
LINE_BREAKER = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2}\d{2}:\d{2}:\d{2} [A-Z]+)
LINE_BREAKER_LOOKBEHIND = 100
MAX_DAYS_AGO = 2000
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 256
MAX_TIMESTAMP_LOOKAHEAD = 128
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
NO_BINARY_CHECK = true
SEDCMD-applychange01 = s/[\r\n]\s*[A-z]+.+//g
SEDCMD-applychange02 = s/(**+.)//g
SEDCMD-applychange04 = s/(+++.)//g
SEDCMD-applychange05 = s/(==+[\r\n]*)//g
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD_LINEMERGE = false
TIME_FORMAT = %m/%d/%y %H:%M:%S %Z
TIME_PREFIX = = [A-Z][a-z]+\s
TRANSFORMS =
TRANSFORMS-set = setNullJob,setParsingJob
TRUNCATE = 1000000
detect_trailing_nulls = false
disabled = false
maxDist = 100
priority =
pulldown_type = true
sourcetype =
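As a side note on the LINE_BREAKER value shown in this output: below is a minimal Python sketch that tests the pattern exactly as it appears above against the last two lines of the sample file. Python's re module is only a stand-in for Splunk's regex engine, and the "escaped" variant is a hypothetical correction (literal parentheses escaped, a space restored between date and time), not the poster's real setting; backslashes may simply have been lost when posting. Also, as far as I understand the props.conf documentation, whatever the first capturing group of LINE_BREAKER matches is discarded from the event, so wrapping the whole pattern in one group is probably not what is wanted.

```python
import re

# Last two lines of the sample file from the question, joined by a newline.
sample_tail = "= User Time (Seconds) : 0\n= Tue 01/02/18 15:50:39 BRST"

# Pattern exactly as it appears in the btool output above.
as_posted = r"(= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2}\d{2}:\d{2}:\d{2} [A-Z]+)"

# Hypothetical corrected variant (an assumption, not the real config):
# literal parentheses escaped, space restored between date and time.
escaped = r"(= User Time \(Seconds\) : \d+\n= \w{3} \d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)"

posted_match = re.search(as_posted, sample_tail)
escaped_match = re.search(escaped, sample_tail)

# Unescaped "(Seconds)" is a capture group, so it cannot match the
# literal parentheses in the log line.
print("as posted matches:", posted_match is not None)
print("escaped matches:  ", escaped_match is not None)
```

If the pattern in the real props.conf looks like the first one, it would never fire on this data, which would leave line breaking to other settings.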
Perhaps you could try:
SHOULD_LINEMERGE = false
MAX_TIMESTAMP_LOOKAHEAD = 40
TIME_PREFIX=\=\s+\w{3}
TIME_FORMAT=%m/%d/%y %H:%M:%S %Z
LINE_BREAKER = (\n)NOBREAKINGPLEASE
This assumes the file is always a single 12-line event, because the settings above will never create a break.
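To illustrate why the settings above never create a break, here is a small Python sketch that looks for the LINE_BREAKER pattern in an excerpt of the sample file. It is a simplified model: Python's re module stands in for Splunk's regex engine, and the excerpt uses only some of the shorter lines from the question.

```python
import re

# Pattern from the suggested LINE_BREAKER setting.
line_breaker = re.compile(r"(\n)NOBREAKINGPLEASE")

# Excerpt of the sample file from the question (shorter lines only).
excerpt = "\n".join([
    "= Job Number: 43977410",
    "= Tue 01/02/18 15:50:05 BRST",
    "= Exit Status : 0",
    "= User Time (Seconds) : 0",
    "= Tue 01/02/18 15:50:39 BRST",
])

# Every match would be a break point; no match means a single event.
break_points = [m.start() for m in line_breaker.finditer(excerpt)]
event_count = len(break_points) + 1

print("break points:", break_points)
print("events:", event_count)
```

Since the literal text NOBREAKINGPLEASE never follows a newline in the data, no break point is found and the whole input stays one event in this model.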
@garethatiag is 100% correct. Once these base configs are applied then it will work correctly.
I try to stay away from the UI onboarding option and just edit props.conf directly.
I tried this config and, as I said before, it works. However, some events are still truncated at the date line. If you run this configuration via Data preview, it works, but for some unknown reason the event is sometimes truncated (~20% of the time).
Can you update your question or post the output of splunk btool props list --debug?
Perhaps also include the transforms.conf, as everyone is just guessing your full configuration at this point...
In one post you mentioned a SEDCMD?
Hello garethatiag, I posted the full log file, props file and transforms file yesterday in some of the posts below.
How can I run this btool props debug command? I really don't know...
Hello garethatiag,
I have included this one as well. It seems to have reduced how often the event is truncated, but it is still happening: around 20% of all events are still truncated at the date line.
This may be important information: when I use the Data preview tool with the same log file and sourcetype, it shows only the correct lines (without truncating them); however, it shows a WARN message:
"Could not use strptime to parse timestamp from ": R3BRP#DECOUPLE_NFE[(0006 01/02/".
"Failed to parse timestamp. Defaulting to file modtime."
If you're using LINE_BREAKER, then the TRUNCATE setting should apply based on the amount of data, so you could increase it to avoid truncation. If this is the case, the splunkd log file should have a WARN or ERROR around the time of the issue.
If you're using BREAK_ONLY_BEFORE_DATE (the default), then the parsing of the date matters.
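On why date parsing matters here: the strptime warning quoted earlier in the thread is consistent with TIME_PREFIX matching too early. A minimal Python sketch, assuming Splunk looks for the timestamp right after the first TIME_PREFIX match (my reading of the props.conf documentation) and using Python's re module as a stand-in for Splunk's regex engine. The day-name prefix is a hypothetical alternative, not a tested setting.

```python
import re

# TIME_PREFIX value from the original props file in the question.
time_prefix = re.compile(r"=\s+\w{3}")

# First line and first date line of the sample file, as they would
# appear near the start of one merged event.
event_start = (
    "= JOB : R3BRP#DECOUPLE_NFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)]"
    ".CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01\n"
    "= Tue 01/02/18 15:50:05 BRST"
)

first = time_prefix.search(event_start)
matched_prefix = first.group(0)
after_prefix = event_start[first.end():]

# Hypothetical tighter prefix (an assumption): only accept a day name.
day_prefix = re.compile(r"=\s+(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)\s+")
day_match = day_prefix.search(event_start)
after_day_prefix = event_start[day_match.end():]

print(repr(matched_prefix))        # prefix matched on the JOB line
print(repr(after_prefix[:35]))     # text the timestamp parser would see
print(repr(after_day_prefix))      # text after the day-name prefix
```

With the original prefix, the first match is on the "= JOB" line, so the text handed to the timestamp parser starts with ": R3BRP#DECOUPLE_NFE[(0006 01/02/", the same text that appears in the warning.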
Look in the _internal index for answers. To get at the issue faster, use the searches below.
This one finds the warnings and errors related to TIME_FORMAT or LINE_BREAKER:
index=_internal source=*splunkd.log component=DataParserVerbose WARN OR ERROR
And this one finds those related to line-breaking issues:
index=_internal source=*splunkd.log component=LineBreakingProcessor WARN OR ERROR
These are the ones related to MAX_EVENTS (256 lines by default) and TRUNCATE (10,000 bytes by default), which are two of the top causes, but there are many others...
A good 2016 Splunk .conf presentation (there are also ones from '13 and '14) is "Jiffy lube quick tune up for your Splunk environment":
https://conf.splunk.com/files/2016/slides/jiffy-lube-quick-tune-up-for-your-splunk-environment.pdf
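For a quick sanity check against the two defaults mentioned above, here is a rough Python sketch that counts the lines and bytes of an event and compares them with MAX_EVENTS and TRUNCATE. It is a simplified model (Splunk applies these limits at different stages of parsing), the function name check_limits is made up for illustration, and the excerpt contains only some of the shorter lines from the question.

```python
MAX_EVENTS_DEFAULT = 256      # lines, default cited above
TRUNCATE_DEFAULT = 10_000     # bytes, default cited above


def check_limits(event_text, max_events=MAX_EVENTS_DEFAULT,
                 truncate=TRUNCATE_DEFAULT):
    """Return line/byte counts and whether each default limit is exceeded."""
    line_count = len(event_text.splitlines())
    byte_count = len(event_text.encode("utf-8"))
    return {
        "lines": line_count,
        "bytes": byte_count,
        "over_max_events": line_count > max_events,
        "over_truncate": byte_count > truncate,
    }


# Excerpt of the sample file from the question (shorter lines only).
excerpt = "\n".join([
    "= Job Number: 43977410",
    "= Tue 01/02/18 15:50:05 BRST",
    "= Exit Status : 0",
    "= User Time (Seconds) : 0",
    "= Tue 01/02/18 15:50:39 BRST",
])

report = check_limits(excerpt)
print(report)
```

A 12-line job log like the one in the question is far below both defaults, so if the sample is representative, these two limits alone would not explain a break at line 5.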
Hello Imaclean, I ran both queries (for the components DataParserVerbose and LineBreakingProcessor) but didn't find anything useful.
For the search: index=_internal source=*splunkd.log component=LineBreakingProcessor
I only found some ERROR entries related to the BREAK_ONLY_BEFORE property that I had configured to read the entire file, but those occurred a few days ago; now this search returns no entries.
"LineBreakingProcessor - Line breaking regex has no capturing groups: somethingjunk"
The query below didn't return any entries:
index=_internal source=*splunkd.log component=DataParserVerbose
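About the "no capturing groups" error quoted above: a small Python sketch that counts capturing groups in the pattern named in the error and in the LINE_BREAKER pattern suggested earlier in the thread. Python's re module is only a stand-in for Splunk's regex engine; the point is just that a line-breaking pattern without parentheses has zero groups.

```python
import re

# Pattern named in the error message: plain text, no parentheses.
junk = re.compile(r"somethingjunk")

# Pattern suggested earlier in the thread: the newline is captured.
with_group = re.compile(r"(\n)NOBREAKINGPLEASE")

# .groups is the number of capturing groups in a compiled pattern.
junk_groups = junk.groups
with_group_groups = with_group.groups

print("somethingjunk groups:", junk_groups)
print("(\\n)NOBREAKINGPLEASE groups:", with_group_groups)
```

So a value like somethingjunk would need to be written with a capturing group, for example around the newline, before it could be used for line breaking.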
The problem is that it is so intermittent: sometimes the entire file is indexed correctly and sometimes it is truncated at a specific line containing a date. I have already used SEDCMD to replace the date format with a string, but even with this replacement the file is sometimes truncated. So strange...
Check the _internal index for sourcetype "splunkd" where you're indexing. Look for 'ERROR' or 'WARN' for that sourcetype.
I have added the property "TRUNCATE = 0" to the props file and it still does not work. Sometimes the file is truncated.
Don't do this..
Hello petercow, I ran the query below:
index=_internal source=*splunkd.log component=LineBreakingProcessor
I only found some ERROR entries related to the BREAK_ONLY_BEFORE property that I had configured to read the entire file, but those occurred a few days ago; now this search returns no entries.
"LineBreakingProcessor - Line breaking regex has no capturing groups: somethingjunk"
The query below didn't return any entries:
index=_internal source=*splunkd.log component=DataParserVerbose
Please let me know if there is any other search command that I could run to try to find out the reasons...
You should use LINE_BREAKER rather than BREAK_ONLY_BEFORE. You should also set SHOULD_LINEMERGE = false
So are you saying you get the regex error when you get the bad line-breaking?
A few days ago I was getting the error I posted, saying that the word I had configured for the "BREAK_ONLY_BEFORE" property was not found. I am no longer seeing that error, but the event is still sometimes truncated incorrectly, as described in the question.