Hello all,
I have configured the props file to NOT break the event when encounters a new line with a date, however, sometimes the event is broken in the line containing the date and sometimes the event is not truncated. I don't understand the reason for different behaviors.
Props file:
SHOULD_LINEMERGE=true
BREAK_ONLY_BEFORE_DATE=false
BREAK_ONLY_BEFORE=SOMEJUNK
MAX_TIMESTAMP_LOOKAHEAD=450
TIME_PREFIX==\s+\w{3}
TIME_FORMAT=%m/%d/%y %H:%M:%S %Z
File that is being read:
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 43977410
= Tue 01/02/18 15:50:05 BRST
** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0
= Tue 01/02/18 15:50:39 BRST
Sometimes I got the multiline event containing the 12 lines, but sometimes the event is truncated like below sample:
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 35391514
= Tue 01/02/18 15:51:10 BRST
I need to have all entire text log indexed (12 lines) and not only the 5 above lines. Dont know why for sometimes the event is broken in the line date.
Thanks and regard,
Danillo Pavan
@danillopavan Can you please paste more events
Original log file:
===============================================================
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/06/18),(0AAAAAAAAAAIROIP)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : XXX 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 47383116
+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS02 -n 172.22.8.248 -p 31111 -r 1971,1971 -s 0AAAAAAAAAAIROIP -d 20180106,1515196800 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.06/O47383116.1710 -j CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01,47383116 -- / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/06-17:11 : IFIPD_DECOUPLE_NFE, 17105901
Sat Jan 6 17:10:57 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPD_DECOUPLE_NFE 17105901
+++ EEWO1006I Job status: IFIPD_DECOUPLE_NFE 17105901 FINISHED
+++ EEWO1061I Job IFIPD_DECOUPLE_NFE with job ID 17105901 was executed on SAP application server XXXXX_YYYYYY_PPPPPPP.
+++ EEWO1048I Retrieving the joblog of a job:: IFIPD_DECOUPLE_NFE , 17105901
WARNING 914 EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 17105901.
WARNING 904 EEWO0904W The program could not copy the joblog to stdout.
WARNING 914 EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 17105901.
+++ EEWO1012I BDC sessions are complete at: 01/06-17:11 : 0
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0
Props file
[logJobsSAP]
TRUNCATE = 10000
MAX_EVENTS = 1000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = junkfood
BREAK_ONLY_BEFORE_DATE = False
DATETIME_CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob
Transform file:
[setNullJob]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setParsingJob]
REGEX = R3BRP#DECOUPLE_NFE
DEST_KEY = queue
FORMAT = indexQueue
About the events, sometimes it is indexing all 33 lines, sometimes it is indexing 27 lines (truncating in line date) and sometimes it is indexing 14 lines(again in line date)...
27 lines:
===============================================================
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/13/18),(0AAAAAAAAAAIT7MZ)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 5178738
+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS01 -n 172.22.8.248 -p 31111 -r 1978,1978 -s 0AAAAAAAAAAIT7MZ -d 20180113,1515801600 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.13/O5178738.0028 -j CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01,5178738 -- / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/13-00:28 : IFIPD_DECOUPLE_NFE, 00281001
Sat Jan 13 00:28:08 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPD_DECOUPLE_NFE 00281001
+++ EEWO1006I Job status: IFIPD_DECOUPLE_NFE 00281001 FINISHED
+++ EEWO1061I Job IFIPD_DECOUPLE_NFE with job ID 00281001 was executed on SAP application server XXXXXXXXXXXXXXX.
+++ EEWO1048I Retrieving the joblog of a job:: IFIPD_DECOUPLE_NFE , 00281001
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 00281001.
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 00281001.
+++ EEWO1012I BDC sessions are complete at: 01/13-00:28 : 0
+++ EEWO1017I The job completed normally at the following date and time: 01/13-00:28
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/13/18),(0AAAAAAAAAAIT7MZ)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 36833578
+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS01 -n 172.22.8.248 -p 31111 -r 1978,1978 -s 0AAAAAAAAAAIT7MZ -d 20180113,1515801600 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.13/O36833578.0029 -j CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01,36833578 -- / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/13-00:29 : IFIPD_DECOUPLE_NFE, 00291400
Sat Jan 13 00:29:11 2018
Hello there @danillopavan
Try something like this.
props.conf
[your_sourcetype]
BREAK_ONLY_BEFORE = (= User Time \(Seconds\) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
DATETIME_CONFIG =
LINE_BREAKER = (= User Time \(Seconds\) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_PREFIX = = [A-Z][a-z]+\s
Hope it helps.
Hello alemarzu,
Tried this configuration however the issue persists. Sometimes it is still truncating the indexed text. If you go via Data preview, it will show correctly the 9 lines. I dont understand why sometimes it is not following the correct way. just as curiosity: whenever the truncate happens, this occurs 2 ~ 3 times in sequence.
I am using also SEDCMD and transform. Dont know if the priorities can impact in something.
Thanks and regards,
Danillo Pavan
How many lines are in your events? I find it hard to believe that your events are greater than 10,000 lines..
Just to clarify:
Around 30~40 lines. Just included this option because if I search the errors in "_internal" I found some entries related to achieving the truncate limit.
The truncate limit is set to protect against long running events that have no linebreaking rules. If you set TRUNCATE=0
you are running the risk of getting massive events which are un-usable.
Perhaps some of those truncated events you mentioned are truncating due to poor linebreaking
I have returned the configuration to TRUNCATE=1000 - still not working
Hello alemarzu,
I just executed the below query and got 22 entries in the last 15 minutes (where I had 3 truncated events and 12 correct events) regarding to the component "DateParserVerbose" with the error "Failed to parse timestamp". Maybe it could explain the intermittence. However if I test the props config via Data Preview, it would not show the timestamp failure...
index=_internal source=splunkd.log AggregatorMiningProcessor OR LineBreakingProcessor OR DateParserVerbose WARN
Hi @danillopavan
Thats weird, I'm running Splunk 7.x.1 and its working, what version are you?
Try adding the TIME_FORMAT to your props.conf perhaps this might help and please could you share your actual props & transforms files ?
KR.
Hello alemarzu. The version is 6.4
Below we have the log file to be read by splunk, the props and tranform files:
LOG FILE:
===============================================================
= JOB : R3BRP#DECOUPLE_NFE[(0006 01/06/18),(0AAAAAAAAAAIROIP)].CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01
= USER : XXX 631/S/*ATHOCO/IBM/AUTOMATION_COORD_HORTOLANDIA/
= JCLFILE : / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
= Job Number: 47383116
+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS02 -n 172.22.8.248 -p 31111 -r 1971,1971 -s 0AAAAAAAAAAIROIP -d 20180106,1515196800 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.06/O47383116.1710 -j CL_S09_IFIPD_DECOUPLE_NFE_R3BRP_01,47383116 -- / -job IFIPD_DECOUPLE_NFE -user FF_PRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/06-17:11 : IFIPD_DECOUPLE_NFE, 17105901
Sat Jan 6 17:10:57 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPD_DECOUPLE_NFE 17105901
+++ EEWO1006I Job status: IFIPD_DECOUPLE_NFE 17105901 FINISHED
+++ EEWO1061I Job IFIPD_DECOUPLE_NFE with job ID 17105901 was executed on SAP application server XXXXX_YYYYYY_PPPPPPP.
+++ EEWO1048I Retrieving the joblog of a job:: IFIPD_DECOUPLE_NFE , 17105901
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 17105901.
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD_DECOUPLE_NFE
Job ID: 17105901.
+++ EEWO1012I BDC sessions are complete at: 01/06-17:11 : 0
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0
Props file:
TRUNCATE = 1000000
BREAK_ONLY_BEFORE = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
DATETIME_CONFIG =
LINE_BREAKER = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TIME_PREFIX = = [A-Z][a-z]+\s
TIME_FORMAT = %m/%d/%y %H:%M:%S %Z
SEDCMD-applychange01=s/[\r\n]\s*[A-z]+.+//g
SEDCMD-applychange02=s/(**+.)//g
SEDCMD-applychange04=s/(+++.)//g
SEDCMD-applychange05=s/(==+[\r\n]*)//g
TRANSFORMS-set= setNullJob,setParsingJob
Transform file
[setNullJob]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setParsingJob]
REGEX = R3BRP#DECOUPLE_NFE
DEST_KEY = queue
FORMAT = indexQueue
From this log file, I need to extract 3 data: start/end time, and the app server.
I would probably suggest not using both LINE_BREAKER and BREAK_ONLY_BEFORE in the same props stanza. As they are to do the same job to a degree (Performance wise use LINE_BREAKER).
Pick one of these as LINE_BREAKER happens within the Parsing Pipeline and BREAK_ONLY_BEFORE (and the other similar ones) happens within the Merging Pipeline. Then SEDCMD happens in the Typing queue..
https://wiki.splunk.com/File:Splunk_EventProcessing_v19_0_standalone.png
@lmaclean
"Copy to clipboard" on data input wizard added it. At the beginning I tought it was human error (mine ofc) but no. Its weird Splunk 7.0.1
Sorry, @alemarzu. I didnt get your orientation..
Hi @alemarzu,
I have removed all the SEDCMD and all others properties just keeping the below configuration and it is still not working. Now of course it is bringing sometimes all the 33 lines (entire file) however sometimes it is being truncate in the date line:
Props:
[sourcetype]
TRUNCATE = 10000
BREAK_ONLY_BEFORE=junkfood
BREAK_ONLY_BEFORE_DATE = False
DATETIME_CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob
I have removed the BREAK_ONLY_BEFORE, but it still truncating the file...now executing the debug command, got the below result:
UTO_KV_JSON = true
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = false
CHARSET = UTF-8
DATETIME_CONFIG =
EXTRACT-endTimeExecution = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+\w+\w+\w+\s+-\w+\s+\w+\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+\d+:\d+:\d+\s+\w+\s+=\s+\w+\s+\w+\s+:\s+\d+\s+=\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+=\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+(?P[^ ]+)
EXTRACT-jobName = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+(?P[^ ]+)
EXTRACT-jobNumber = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+\w+\w+\w+\s+-\w+\s+\w+_\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+(?P\d+)
HEADER_MODE =
LEARN_SOURCETYPE = true
LINE_BREAKER = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
LINE_BREAKER_LOOKBEHIND = 100
MAX_DAYS_AGO = 2000
MAX_DAYS_HENCE = 2
MAX_DIFF_SECS_AGO = 3600
MAX_DIFF_SECS_HENCE = 604800
MAX_EVENTS = 256
MAX_TIMESTAMP_LOOKAHEAD = 128
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
NO_BINARY_CHECK = true
SEDCMD-applychange01 = s/[\r\n]\s*[A-z]+.+//g
SEDCMD-applychange02 = s/(**+.)//g
SEDCMD-applychange04 = s/(+++.)//g
SEDCMD-applychange05 = s/(==+[\r\n]*)//g
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD_LINEMERGE = false
TIME_FORMAT = %m/%d/%y %H:%M:%S %Z
TIME_PREFIX = = [A-Z][a-z]+\s
TRANSFORMS =
TRANSFORMS-set = setNullJob,setParsingJob
TRUNCATE = 10000
detect_trailing_nulls = false
disabled = false
maxDist = 100
priority =
pulldown_type = true
Does the LINE_BREAKER Regex require full regex? Can't remember or not, as if so you might need to change the spaces to "\s" instead.
Also the brackets around the "Seconds" if not a capture group will need to be escaped "\". Forward slash isn't a special character as such doesn't need to be escaped: "\d{2}\/\d{2}"
So if it does require full regex then the below is what it should look like to a degree. If you read the props.conf on LINE_BREAKER you will notice that the Capture Group "()" is what it breaks upon and doesn't ingest, i.e. you shouldn't see the "User Time..." coming up in any of the events
LINE_BREAKER = User\s+Time\s+\(Seconds\)\s+:\s+\d+\s+=\s+\w{3}\s+\d{2}/\d{2}/\d{2}\s+\d{2}:\d{2}:\d{2}\s+[A-Z]+([\r\n]+)
The contents of the first capturing group are discarded, and will not be present in any event. You are telling Splunk that this text comes between lines
Hi lmaclean,
I have removed all the SEDCMD and all others properties just keeping the below configuration and it is still not working. Now of course it is bringing sometimes all the 33 lines (entire file) however sometimes it is being truncate in the date line:
Props:
[sourcetype]
TRUNCATE = 10000
BREAK_ONLY_BEFORE=junkfood
BREAK_ONLY_BEFORE_DATE = False
DATETIME_CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob
@danillopavan
I've tested - again - this configuration and it seems its working fine except for the SEDCMD-applychange04 that I had to edit the regex to s/(\+{3}.)//g
and applychange02 that I dont know what it does.
Perhaps theres some difference between this splunk versions. What I suggest is this.
1- Run this command to debug if there is another attribute messing around your parsing.
splunk cmd btool --debug props list <your_sourcetype> | grep -v "system/<local/default>"
2- If you dont find anything wrong, start over with the process of data input through the wizard. Once you get the event indexed properly then go on with SEDCMD removals.