We have log files with the following format:
*******************************************************************************
INTERFACE "<interface name>" - LOG SUMMARY RESULTS
*******************************************************************************
JOB RESULTS:
===============================================================================
Name of the job :<job name>
Name of the xml file :<xml path>
Path of the xml file :<file path>
Database alias :<host name>
Job start date & time :2019-05-08 14:43:28
Job end date & time :2019-05-08 14:43:28
Total job execution time (mins) :0.0
Framework status code :400
Framework status message :Error in parsing the XML Configuration file.
-------------------------------------------------------------------------------
Error details :
-------------------------------------------------------------------------------
Error in parsing the XML Configuration file.
<XML file name>.XML (The system cannot find the file specified)
-------------------------------------------------------------------------------
********************************* END OF LOG **********************************
How do I make each log file a distinct event?
Hi @ddrillic ,
I would recommend against doing this. It's bad logging policy, and it leads to problems on the operating system over time if you have to manage groups of log files that constantly grow in number.
My first option would be to pre-process these log files. You could set up a system cron job that merges these log files into a single log file, and then arranges the data so that each original log shows up as an individual event.
However, if your hands are tied, you can always set your line breaker up so that it looks for the end of log entry:
LINE_BREAKER = ([\r\n]+\*+\s+END\s+OF\s+LOG\s+\*+)
SHOULD_LINEMERGE = false
You would do this in props.conf for the sourcetype you need this to happen on.
https://docs.splunk.com/Documentation/Splunk/7.3.0/Admin/Propsconf
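Put together, the settings above might sit in a stanza like the following. This is only a sketch: the sourcetype name `interface_log_summary` is a made-up placeholder, and the stanza has to live on the instance that parses the data (an indexer or heavy forwarder) to take effect.

```ini
# props.conf (sketch; the stanza name is a hypothetical sourcetype)
[interface_log_summary]
# Rely on LINE_BREAKER alone; do not re-merge lines afterwards.
SHOULD_LINEMERGE = false
# Break only where the END OF LOG marker appears, so everything
# before the marker in a file stays together as one event.
LINE_BREAKER = ([\r\n]+\*+\s+END\s+OF\s+LOG\s+\*+)
```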
You'll also need to consider changing initCrcLength or crcSalt. The beginning of your file looks like it will be the same every time. Splunk tracks files based on a CRC of the opening 256 bytes, so it will likely skip new files. "Hey, I've seen this file before. Ignore."
That's a valid point.
crcSalt = <SOURCE>
Should work best in this case.
Thank you @twinspop and @jnudell_2 - crcSalt = <SOURCE> seems to be the right one of the two, because each log file is an event here, and each log file has a distinct name.
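For reference, crcSalt is set on the input rather than in props.conf. A minimal sketch, assuming a monitor input; the path and sourcetype name below are hypothetical placeholders, not values from this thread:

```ini
# inputs.conf (sketch; the monitor path and sourcetype are hypothetical)
[monitor:///opt/interfaces/logs/*.log]
sourcetype = interface_log_summary
# Mix the full source path into the file-tracking CRC, so files with
# identical opening bytes but different names count as distinct files.
crcSalt = <SOURCE>
```

One thing to keep in mind with this setting: because the path becomes part of the check, a file that is renamed or rotated looks new and gets indexed again. That should be harmless here if each log file is written once under its own name.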
We "normally" would do -
LINE_BREAKER = ([\r\n]+)\*+\s+END\s+OF\s+LOG\s+\*+
Does it make any difference how large the capture group is?
Yes.
The capture group I provided discards the line with **** END OF LOG ****. Yours will place **** END OF LOG **** in a new event. So the log file will show up as two events instead of one.
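To make the difference concrete, here is a commented sketch of both variants side by side. The stanza name is hypothetical, and the outcomes in the comments restate the explanation above; they are not output captured from a real indexer.

```ini
# props.conf (sketch; [interface_log_summary] is a hypothetical sourcetype)
[interface_log_summary]
SHOULD_LINEMERGE = false

# Variant A: the whole marker sits inside the first capture group, so the
# newline and the END OF LOG line are both discarded.
# Expected result: one event per file.
LINE_BREAKER = ([\r\n]+\*+\s+END\s+OF\s+LOG\s+\*+)

# Variant B (commented out): only the newline is captured and discarded.
# The END OF LOG line then starts the next event.
# Expected result: two events per file.
# LINE_BREAKER = ([\r\n]+)\*+\s+END\s+OF\s+LOG\s+\*+
```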
@jnudell_2, for the record we ended up using -
LINE_BREAKER = ([\r\n]+\*+\s+END\s+OF\s+LOG\s+\*+\n)
And my colleague suggested adding the \n part - not sure if it changed anything.
I do see the closing line in the Splunk event -
********************************* END OF LOG **********************************
Is it right?
Gorgeous @jnudell_2 - much appreciated.
I wonder now about the TIME_PREFIX. Is this good?
TIME_PREFIX=Job start date \& time :
I would use the following:
TIME_PREFIX = Job start date \& time\s+:
Gorgeous !
It should do the trick 😉
Just be sure to set MAX_TIMESTAMP_LOOKAHEAD as it defaults to 128 chars.
Right @DavidHourani - it will look immediately after the TIME_PREFIX value, so any small integer value will do, right?
Yeah, you can go for 20 🙂
20 it is ; -)
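Collected in one place, the timestamp settings discussed above could look like this. It is a sketch only: the stanza name is a hypothetical sourcetype, and the TIME_FORMAT line is an assumption inferred from the sample log, since nobody in the thread specified it.

```ini
# props.conf (sketch; the stanza name is a hypothetical sourcetype)
[interface_log_summary]
# Start timestamp extraction right after this label.
TIME_PREFIX = Job start date \& time\s+:
# "2019-05-08 14:43:28" is 19 characters, so 20 leaves one to spare.
MAX_TIMESTAMP_LOOKAHEAD = 20
# Assumption based on the sample log; not stated in the thread.
TIME_FORMAT = %Y-%m-%d %H:%M:%S
```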
Makes perfect sense @jnudell_2 - great.