Hello, I am trying to figure out how to edit props.conf so that it splits my events properly. The events are added to a log file, which looks like this:
******************************************************************************
Mon 01/02/2023
09:00 AM
******************************************************************************
The command completed successfully.
1 file(s) copied.
\\share\folder\folder\folder\file
1 file(s) copied.
1 file(s) copied.
******************************************************************************
Tue 01/03/2023
09:00 AM
******************************************************************************
The command completed successfully.
The system cannot find the file specified.
\\share\folder\folder\folder\file
0 file(s) copied.
The system cannot find the file specified.
******************************************************************************
Wed 01/04/2023
09:00 AM
******************************************************************************
The command completed successfully.
1 file(s) copied.
\\share\folder\folder\folder\file
1 file(s) copied.
1 file(s) copied.
******************************************************************************
Thu 01/05/2023
09:00 AM
******************************************************************************
The command completed successfully.
1 file(s) copied.
\\share\folder\folder\folder\file
1 file(s) copied.
1 file(s) copied.
******************************************************************************
I would like my events to look like this:
******************************************************************************
Mon 01/02/2023
09:00 AM
******************************************************************************
The command completed successfully.
1 file(s) copied.
\\share\folder\folder\folder\file
1 file(s) copied.
1 file(s) copied.
It seems like no matter what I try, I can't get Splunk to separate it properly.
The file updates daily. I have been testing my settings by uploading a copy of the text file directly and then configuring Splunk to monitor the file for continuous updates.
Typically the preview for the uploaded file looks somewhat acceptable like this:
Mon 01/02/2023
09:00 AM
******************************************************************************
The command completed successfully.
1 file(s) copied.
\\share\folder\folder\folder\file
1 file(s) copied.
1 file(s) copied.
This output would work; however, I noticed that it consistently cuts off the first line of text. The real problem comes in with the monitoring process.
It tends to split the data in a way that seems almost random, and definitely isn't matching my regex settings.
The date, the asterisks and the text get placed into separate events for reasons I don't understand.
My props.conf settings are displayed below:
[log_file_test]
BREAK_ONLY_BEFORE = \*{78}\s*[a-zA-z]{3}\s\d{2}\/\d{2}\/\d{2}\/\d{4}
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE=1
category=custom
pulldown_type=1
disabled=false
Any clues as to what I might be doing wrong or neglecting?
Hi @xwill13 ,
These are the props settings that I would use for your input:
[log_file_test]
MAX_TIMESTAMP_LOOKAHEAD=0
TIME_PREFIX=^
SHOULD_LINEMERGE=false
LINE_BREAKER=(\*{78}[\r\n]+)[a-zA-Z]+\s\d+/\d+/\d+
TRUNCATE=500
NO_BINARY_CHECK=true
It is usually better to work with LINE_BREAKER and SHOULD_LINEMERGE=false, so that Splunk does not first break events into single lines (using the default line breaker) and then consume resources merging them back together.
The other settings are specified to improve indexing/parsing performance. TIME_FORMAT was left out so that Splunk interprets the timestamp automatically, since the hours and minutes are on a different line from the date.
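If you want to check the pattern outside Splunk, here is a rough Python sketch of how I understand LINE_BREAKER to behave: the text matched by the first capture group is discarded and the next event starts right after it. This only approximates Splunk's behaviour, and split_events and the sample text are made-up names for illustration, not anything Splunk provides. It also shows that the original BREAK_ONLY_BEFORE pattern cannot match the date, because it expects three two-digit groups before the year.

```python
import re

# Same pattern as the LINE_BREAKER setting; group 1 is the part that gets discarded.
LINE_BREAKER = re.compile(r"(\*{78}[\r\n]+)[a-zA-Z]+\s\d+/\d+/\d+")

# Pattern from the original BREAK_ONLY_BEFORE setting, kept as posted.
ORIGINAL = re.compile(r"\*{78}\s*[a-zA-z]{3}\s\d{2}\/\d{2}\/\d{2}\/\d{4}")


def split_events(raw):
    """Hypothetical helper: cut the text at each match, dropping capture group 1."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        before = raw[start:match.start(1)]
        if before.strip():
            events.append(before.rstrip("\r\n"))
        start = match.end(1)  # the next event begins at the date line
    tail = raw[start:]
    if tail.strip():
        events.append(tail.rstrip("\r\n"))
    return events


STARS = "*" * 78
sample = "\n".join([
    STARS, "Mon 01/02/2023", "09:00 AM", STARS,
    "The command completed successfully.", "1 file(s) copied.",
    STARS, "Tue 01/03/2023", "09:00 AM", STARS,
    "The command completed successfully.", "0 file(s) copied.",
    STARS,
]) + "\n"

events = split_events(sample)
for event in events:
    print(event)
    print("-----")

# The original pattern finds nothing in the same text.
print("original pattern matches:", ORIGINAL.search(sample) is not None)
```

Each event starts at the date line, and the asterisk line that follows the time stays inside the event because it is not followed by a date.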
Hope this will help you, have a good day,
Fabrizio
Thanks! This seems to have worked perfectly so far.