Getting Data In

Inconsistent linebreaker behavior

Communicator

Hello all,

I have configured the props file to NOT break the event when encounters a new line with a date, however, sometimes the event is broken in the line containing the date and sometimes the event is not truncated. I don't understand the reason for different behaviors.

Props file:
SHOULDLINEMERGE=true
BREAK
ONLYBEFOREDATE=false
BREAKONLYBEFORE=SOMEJUNK
MAXTIMESTAMPLOOKAHEAD=450
TIMEPREFIX==\s+\w{3}
TIME
FORMAT=%m/%d/%y %H:%M:%S %Z

File that is being read:

= JOB : R3BRP#DECOUPLENFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 43977410
= Tue 01/02/18 15:50:05 BRST
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0
= Tue 01/02/18 15:50:39 BRST

Sometimes I got the multiline event containing the 12 lines, but sometimes the event is truncated like below sample:

= JOB : R3BRP#DECOUPLENFE[(0006 01/02/18),(0AAAAAAAAAAIO5BE)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 35391514
= Tue 01/02/18 15:51:10 BRST

I need to have all entire text log indexed (12 lines) and not only the 5 above lines. Dont know why for sometimes the event is broken in the line date.

Thanks and regard,
Danillo Pavan

0 Karma

Communicator

@danillopavan Can you please paste more events

0 Karma

Communicator

Original log file:

===============================================================
= JOB : R3BRP#DECOUPLENFE[(0006 01/06/18),(0AAAAAAAAAAIROIP)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : XXX 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 47383116

= Sat 01/06/18 17:10:57 BRST

+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS02 -n 172.22.8.248 -p 31111 -r 1971,1971 -s 0AAAAAAAAAAIROIP -d 20180106,1515196800 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.06/O47383116.1710 -j CLS09IFIPDDECOUPLENFER3BRP01,47383116 -- / -job IFIPDDECOUPLENFE -user FFPRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/06-17:11 : IFIPD
DECOUPLE_NFE, 17105901

Sat Jan 6 17:10:57 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPDDECOUPLENFE 17105901
+++ EEWO1006I Job status: IFIPDDECOUPLENFE 17105901 FINISHED
+++ EEWO1061I Job IFIPDDECOUPLENFE with job ID 17105901 was executed on SAP application server XXXXXYYYYYYPPPPPPP.

+++ EEWO1048I Retrieving the joblog of a job:: IFIPDDECOUPLENFE , 17105901

WARNING 914 EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPDDECOUPLENFE
Job ID: 17105901.

WARNING 904 EEWO0904W The program could not copy the joblog to stdout.

WARNING 914 EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPDDECOUPLENFE
Job ID: 17105901.
+++ EEWO1012I BDC sessions are complete at: 01/06-17:11 : 0

+++ EEWO1017I The job completed normally at the following date and time: 01/06-17:11

= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0

= Sat 01/06/18 17:11:32 BRST

Props file
[logJobsSAP]
TRUNCATE = 10000
MAXEVENTS = 1000
SHOULD
LINEMERGE = true
BREAKONLYBEFORE = junkfood
BREAKONLYBEFOREDATE = False
DATETIME
CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob

Transform file:
[setNullJob]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setParsingJob]
REGEX = R3BRP#DECOUPLENFE
DEST
KEY = queue
FORMAT = indexQueue

About the events, sometimes it is indexing all 33 lines, sometimes it is indexing 27 lines (truncating in line date) and sometimes it is indexing 14 lines(again in line date)...

27 lines:

===============================================================
= JOB : R3BRP#DECOUPLENFE[(0006 01/13/18),(0AAAAAAAAAAIT7MZ)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 5178738

= Sat 01/13/18 00:28:07 BRST

+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS01 -n 172.22.8.248 -p 31111 -r 1978,1978 -s 0AAAAAAAAAAIT7MZ -d 20180113,1515801600 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.13/O5178738.0028 -j CLS09IFIPDDECOUPLENFER3BRP01,5178738 -- / -job IFIPDDECOUPLENFE -user FFPRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/13-00:28 : IFIPD
DECOUPLENFE, 00281001
Sat Jan 13 00:28:08 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPD
DECOUPLENFE 00281001
+++ EEWO1006I Job status: IFIPD
DECOUPLENFE 00281001 FINISHED
+++ EEWO1061I Job IFIPD
DECOUPLENFE with job ID 00281001 was executed on SAP application server XXXXXXXXXXXXXXX.
+++ EEWO1048I Retrieving the joblog of a job:: IFIPD
DECOUPLENFE , 00281001
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD
DECOUPLENFE
Job ID: 00281001.
*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.
*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPD
DECOUPLE_NFE
Job ID: 00281001.
+++ EEWO1012I BDC sessions are complete at: 01/13-00:28 : 0
+++ EEWO1017I The job completed normally at the following date and time: 01/13-00:28

14 lines

= JOB : R3BRP#DECOUPLENFE[(0006 01/13/18),(0AAAAAAAAAAIT7MZ)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : tws 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 36833578

= Sat 01/13/18 00:29:11 BRST

+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS01 -n 172.22.8.248 -p 31111 -r 1978,1978 -s 0AAAAAAAAAAIT7MZ -d 20180113,1515801600 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.13/O36833578.0029 -j CLS09IFIPDDECOUPLENFER3BRP01,36833578 -- / -job IFIPDDECOUPLENFE -user FFPRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/13-00:29 : IFIPD
DECOUPLE_NFE, 00291400
Sat Jan 13 00:29:11 2018

0 Karma

Motivator

Hello there @danillopavan

Try something like this.
props.conf

 [your_sourcetype]
 BREAK_ONLY_BEFORE = (= User Time \(Seconds\) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
 DATETIME_CONFIG = 
 LINE_BREAKER = (= User Time \(Seconds\) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
 NO_BINARY_CHECK = true
 SHOULD_LINEMERGE = false
 TIME_PREFIX = = [A-Z][a-z]+\s

Hope it helps.

0 Karma

Communicator

Hello alemarzu,

Tried this configuration however the issue persists. Sometimes it is still truncating the indexed text. If you go via Data preview, it will show correctly the 9 lines. I dont understand why sometimes it is not following the correct way. just as curiosity: whenever the truncate happens, this occurs 2 ~ 3 times in sequence.

I am using also SEDCMD and transform. Dont know if the priorities can impact in something.

Thanks and regards,
Danillo Pavan

0 Karma

SplunkTrust
SplunkTrust

How many lines are in your events? I find it hard to believe that your events are greater than 10,000 lines..

0 Karma

Path Finder

Just to clarify:

  • TRUNCATE (Default 10000) is for the size of an event in Bytes.
  • MAX_EVENTS (Default 256) is for max number of lines per event
0 Karma

Communicator

Around 30~40 lines. Just included this option because if I search the errors in "_internal" I found some entries related to achieving the truncate limit.

0 Karma

SplunkTrust
SplunkTrust

The truncate limit is set to protect against long running events that have no linebreaking rules. If you set TRUNCATE=0 you are running the risk of getting massive events which are un-usable.

Perhaps some of those truncated events you mentioned are truncating due to poor linebreaking

0 Karma

Communicator

I have returned the configuration to TRUNCATE=1000 - still not working

0 Karma

Communicator

Hello alemarzu,

I just executed the below query and got 22 entries in the last 15 minutes (where I had 3 truncated events and 12 correct events) regarding to the component "DateParserVerbose" with the error "Failed to parse timestamp". Maybe it could explain the intermittence. However if I test the props config via Data Preview, it would not show the timestamp failure...

index=_internal source=splunkd.log AggregatorMiningProcessor OR LineBreakingProcessor OR DateParserVerbose WARN

0 Karma

Motivator

Hi @danillopavan

Thats weird, I'm running Splunk 7.x.1 and its working, what version are you?

Try adding the TIME_FORMAT to your props.conf perhaps this might help and please could you share your actual props & transforms files ?

KR.

0 Karma

Communicator

Hello alemarzu. The version is 6.4

Below we have the log file to be read by splunk, the props and tranform files:

LOG FILE:

===============================================================
= JOB : R3BRP#DECOUPLENFE[(0006 01/06/18),(0AAAAAAAAAAIROIP)].CLS09IFIPDDECOUPLENFER3BRP01
= USER : XXX 631/S/*ATHOCO/IBM/AUTOMATION
COORDHORTOLANDIA/
= JCLFILE : / -job IFIPD
DECOUPLENFE -user FFPRO1 -i 23154800 -c a
= Job Number: 47383116

= Sat 01/06/18 17:10:57 BRST

+++ IBM Tivoli Workload Scheduler for Applications, method R3BATCH 8.5.0 (patchrev 1 - 16:42:24 Jun 13 2014)
+++ is called with following parameters:
+++ -t LJ -c R3BRP,SAPECCPINST1,ACSXTWS02 -n 172.22.8.248 -p 31111 -r 1971,1971 -s 0AAAAAAAAAAIROIP -d 20180106,1515196800 -l twsuser1 -o /amb/local/tws/sapeccpinst1/TWS/stdlist/2018.01.06/O47383116.1710 -j CLS09IFIPDDECOUPLENFER3BRP01,47383116 -- / -job IFIPDDECOUPLENFE -user FFPRO1 -i 23154800 -c a
+++ EEWO1031I The Tivoli Workload Scheduler home directory was found: ./..
+++ EEWO1027I The RFC connection is established: (1)
+++ EEWO1023I Started the R/3 job at the following date and time: 01/06-17:11 : IFIPD
DECOUPLE_NFE, 17105901

Sat Jan 6 17:10:57 2018
+++ EEWO1007I The job status has been set to EXEC: IFIPDDECOUPLENFE 17105901
+++ EEWO1006I Job status: IFIPDDECOUPLENFE 17105901 FINISHED
+++ EEWO1061I Job IFIPDDECOUPLENFE with job ID 17105901 was executed on SAP application server XXXXXYYYYYYPPPPPPP.

+++ EEWO1048I Retrieving the joblog of a job:: IFIPDDECOUPLENFE , 17105901

*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPDDECOUPLENFE
Job ID: 17105901.

*** WARNING 904 *** EEWO0904W The program could not copy the joblog to stdout.

*** WARNING 914 *** EEWO0914W An internal error has occurred. Either the joblog or the job protocol for the following job does not exist:
Job name: IFIPDDECOUPLENFE
Job ID: 17105901.
+++ EEWO1012I BDC sessions are complete at: 01/06-17:11 : 0

+++ EEWO1017I The job completed normally at the following date and time: 01/06-17:11

= Exit Status : 0
= System Time (Seconds) : 0 Elapsed Time (Minutes) : 0
= User Time (Seconds) : 0

= Sat 01/06/18 17:11:32 BRST

Props file:
TRUNCATE = 1000000
BREAKONLYBEFORE = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
DATETIMECONFIG =
LINE
BREAKER = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
NOBINARYCHECK = true
SHOULDLINEMERGE = false
TIME
PREFIX = = [A-Z][a-z]+\s
TIME_FORMAT = %m/%d/%y %H:%M:%S %Z
SEDCMD-applychange01=s/[\r\n]\s[A-z]+.+//g
SEDCMD-applychange02=s/(**+.
)//g
SEDCMD-applychange04=s/(+++.)//g
SEDCMD-applychange05=s/(==+[\r\n]
)//g
TRANSFORMS-set= setNullJob,setParsingJob

Transform file
[setNullJob]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setParsingJob]
REGEX = R3BRP#DECOUPLENFE
DEST
KEY = queue
FORMAT = indexQueue

From this log file, I need to extract 3 data: start/end time, and the app server.

0 Karma

Path Finder

I would probably suggest not using both LINE_BREAKER and BREAKONLYBEFORE in the same props stanza. As they are to do the same job to a degree (Performance wise use LINE_BREAKER).

Pick one of these as LINEBREAKER happens within the Parsing Pipeline and BREAKONLY_BEFORE (and the other similar ones) happens within the Merging Pipeline. Then SEDCMD happens in the Typing queue..

https://wiki.splunk.com/File:Splunk_EventProcessing_v19_0_standalone.png

Motivator

@lmaclean

"Copy to clipboard" on data input wizard added it. At the beginning I tought it was human error (mine ofc) but no. Its weird Splunk 7.0.1

0 Karma

Communicator

Sorry, @alemarzu. I didnt get your orientation..

0 Karma

Communicator

Hi @alemarzu,

I have removed all the SEDCMD and all others properties just keeping the below configuration and it is still not working. Now of course it is bringing sometimes all the 33 lines (entire file) however sometimes it is being truncate in the date line:

Props:

[sourcetype]
TRUNCATE = 10000
BREAKONLYBEFORE=junkfood
BREAKONLYBEFOREDATE = False
DATETIME
CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob

0 Karma

Communicator

I have removed the BREAKONLYBEFORE, but it still truncating the file...now executing the debug command, got the below result:

UTOKVJSON = true
BREAKONLYBEFORE =
BREAKONLYBEFOREDATE = false
CHARSET = UTF-8
DATETIME
CONFIG =
EXTRACT-endTimeExecution = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+\w+\w+\w+\s+-\w+\s+\w+\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+\d+:\d+:\d+\s+\w+\s+=\s+\w+\s+\w+\s+:\s+\d+\s+=\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+=\s+\w+\s+\w+\s+(\w+)\s+:\s+\d+\s+=\s+\w+\s+\d+/\d+/\d+\s+(?P[^ ]+)
EXTRACT-jobName = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+(?P[^ ]+)
EXTRACT-jobNumber = ^=\s+\w+\s+:\s+\w+\d+\w+#\w+
\w+[(\d+\s+\d+/\d+/\d+),(\d+\w+)].\w+\w+\d+\w+\w+\w+\w+\d+\w+\d+\s+=\s+\w+\s+:\s+\w+\s+\d+/\w+/*\w+/\w+/\w+\w+\w+/\s+=\s+\w+\s+:\s+/\s+-\w+\s+\w+\w+\w+\s+-\w+\s+\w+\w+\d+\s+-\w+\s+\d+\s+-\w+\s+\w+\s+=\s+\w+\s+\w+:\s+(?P\d+)
HEADER
MODE =
LEARNSOURCETYPE = true
LINE
BREAKER = (= User Time (Seconds) : \d+\n= \w{3} \d{2}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} [A-Z]+)
LINEBREAKERLOOKBEHIND = 100
MAXDAYSAGO = 2000
MAXDAYSHENCE = 2
MAXDIFFSECSAGO = 3600
MAX
DIFFSECSHENCE = 604800
MAXEVENTS = 256
MAX
TIMESTAMPLOOKAHEAD = 128
MUST
BREAKAFTER =
MUST
NOTBREAKAFTER =
MUSTNOTBREAKBEFORE =
NO
BINARYCHECK = true
SEDCMD-applychange01 = s/[\r\n]\s[A-z]+.+//g
SEDCMD-applychange02 = s/(**+.
)//g
SEDCMD-applychange04 = s/(+++.)//g
SEDCMD-applychange05 = s/(==+[\r\n]
)//g
SEGMENTATION = indexing
SEGMENTATION-all = full
SEGMENTATION-inner = inner
SEGMENTATION-outer = outer
SEGMENTATION-raw = none
SEGMENTATION-standard = standard
SHOULD
LINEMERGE = false
TIMEFORMAT = %m/%d/%y %H:%M:%S %Z
TIME
PREFIX = = [A-Z][a-z]+\s
TRANSFORMS =
TRANSFORMS-set = setNullJob,setParsingJob
TRUNCATE = 10000
detecttrailingnulls = false
disabled = false
maxDist = 100
priority =
pulldown_type = true

0 Karma

Path Finder

Does the LINE_BREAKER Regex require full regex? Can't remember or not, as if so you might need to change the spaces to "\s" instead.

Also the brackets around the "Seconds" if not a capture group will need to be escaped "\". Forward slash isn't a special character as such doesn't need to be escaped: "\d{2}\/\d{2}"

So if it does require full regex then the below is what it should look like to a degree. If you read the props.conf on LINE_BREAKER you will notice that the Capture Group "()" is what it breaks upon and doesn't ingest, i.e. you shouldn't see the "User Time..." coming up in any of the events

LINE_BREAKER = User\s+Time\s+\(Seconds\)\s+:\s+\d+\s+=\s+\w{3}\s+\d{2}/\d{2}/\d{2}\s+\d{2}:\d{2}:\d{2}\s+[A-Z]+([\r\n]+)

The contents of the first capturing group are discarded, and will not be present in any event. You are telling Splunk that this text comes between lines

0 Karma

Communicator

Hi lmaclean,

I have removed all the SEDCMD and all others properties just keeping the below configuration and it is still not working. Now of course it is bringing sometimes all the 33 lines (entire file) however sometimes it is being truncate in the date line:

Props:

[sourcetype]
TRUNCATE = 10000
BREAKONLYBEFORE=junkfood
BREAKONLYBEFOREDATE = False
DATETIME
CONFIG=CURRENT
TRANSFORMS-set= setNullJob,setParsingJob

0 Karma

Motivator

@danillopavan

I've tested - again - this configuration and it seems its working fine except for the SEDCMD-applychange04 that I had to edit the regex to s/(\+{3}.)//g and applychange02 that I dont know what it does.

Perhaps theres some difference between this splunk versions. What I suggest is this.

1- Run this command to debug if there is another attribute messing around your parsing.

 splunk cmd btool --debug props list <your_sourcetype> | grep -v "system/<local/default>"

2- If you dont find anything wrong, start over with the process of data input through the wizard. Once you get the event indexed properly then go on with SEDCMD removals.

0 Karma