We currently have custom batch jobs running on EC2 instances in AWS, and each of these processes writes one log file. The log name and directory are not fixed, but they are predictable; for example, the name format could be log__PID_YYYYDDMM.
Our main goal is to make these logs available to developers for review without requiring them to log in to production. We would also like to make these logs searchable in Splunk for certain keywords, and to alert based on those keywords as well.
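As a sketch of the keyword-alerting piece: assuming the logs are indexed under names like `batch_logs` and `batch_exec_log` (both placeholders, not anything defined above), a scheduled search along these lines could drive an alert on failed steps:

```
index=batch_logs sourcetype=batch_exec_log "ENDED CODE="
| rex "STEP (?<step>\S+) ENDED CODE=(?<code>\d+)"
| where code != "0"
```

Saving this as an alert with a cron schedule and a "number of results > 0" trigger is the usual pattern; the same approach works for any other keyword you want to watch for.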
An example of the log format would be like this:
############ E X E C L O G ############
13080 20:16:38 NEW818D JOB STARTED CLASS=A NODE=JK
20:16:38 STEP GHHSR01 STARTED
GHHSR01 CPU TIME: 0.010000
20:16:38 STEP GHHSR01 ENDED CODE=0
20:16:38 STEP GHHSR02 STARTED
GHHSR02 CPU TIME: 0.020000
20:16:39 STEP GHHSR02 ENDED CODE=0
20:16:39 STEP SIAB858 STARTED
GHHB858 CPU TIME: 0.000000
20:16:39 STEP GHHB858 ENDED CODE=0
20:16:39 STEP GHHB859 STARTED
GHHB859 CPU TIME: 0.000000
20:16:39 STEP GHHB859 ENDED CODE=0
Does Splunk have the capability to handle logs like this?
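To show that the format above is machine-parseable, here is a small Python sketch of the kind of field extraction Splunk could apply to it (via `rex` in a search or an `EXTRACT-` setting in props.conf). The regex and function name are illustrative assumptions, not existing configuration:

```python
import re

# Pattern for the "HH:MM:SS STEP <name> ENDED CODE=<n>" lines shown
# in the sample log above; these carry the step name and exit code
# you would typically search and alert on.
STEP_ENDED = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) STEP (?P<step>\S+) ENDED CODE=(?P<code>\d+)$"
)

def extract_step_ends(lines):
    """Return (step, exit_code) pairs for every step-completion line."""
    results = []
    for line in lines:
        m = STEP_ENDED.match(line)
        if m:
            results.append((m.group("step"), int(m.group("code"))))
    return results

sample = [
    "20:16:38 STEP GHHSR01 STARTED",
    "GHHSR01 CPU TIME: 0.010000",
    "20:16:38 STEP GHHSR01 ENDED CODE=0",
    "20:16:39 STEP GHHSR02 ENDED CODE=0",
]
print(extract_step_ends(sample))  # [('GHHSR01', 0), ('GHHSR02', 0)]
```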
I was not clear about the directory structure above. The home directory for these logs always stays the same, for example /usr/batch/logs.
Under that is a directory named for the current date in MMDDYYYY format, and the log files mentioned above are created inside those directories. As an example, the path to all of today's files would be: /usr/batch/logs/04112018/*
What we need Splunk to do is monitor the new directory every day and extract all data from all log files inside the daily directories. Is that still possible?
Yes, Splunk can handle these logs, but it would be better if the log path were fixed; otherwise you need to write a monitor stanza for every predictable path, which can be tedious. Also, since the path is not fixed, make sure the name format is not shared with other files, or you risk indexing unwanted data.
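That said, because the daily directories all live under one fixed root, a single monitor stanza can cover them: monitoring a directory recurses into subdirectories by default, and a whitelist can keep unrelated files out. A minimal inputs.conf sketch, assuming the layout described above (the sourcetype name and the whitelist regex are assumptions based on the "log__PID_YYYYDDMM" naming):

```
[monitor:///usr/batch/logs]
sourcetype = batch_exec_log
disabled = false
# Only index files matching the predictable log-file name pattern,
# so other files under the same tree are not picked up.
whitelist = log__\d+_\d{8}$
```

With this in place, new daily directories such as /usr/batch/logs/04112018/ are picked up automatically as they appear, without editing the configuration each day.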