We currently have custom batch jobs running on EC2 instances in AWS, and each of these processes creates one log. The log name and directory are not fixed, but they are predictable. For example, the name format could be log_PIDYYYYDDMM.
Our main goal is to make these logs available to developers for review without them needing to log in to production. We would also like to make these logs searchable in Splunk for certain keywords, and to alert based on those keywords as well.
An example of the log format would be like this:

```
############ E X E C L O G ############ 13080
20:16:38 NEW818D JOB STARTED CLASS=A NODE=JK UID=ghjrib_on
20:16:38 STEP GHHSR01 STARTED
GHHSR01 SORTIN1=/data/ghhb/pvt/master/GH.GH252D.GH266
GHHSR01 SORTIN2=/data/ghhb/pvt/master/GH.GH262D.GH266
GHHSR01 SORTIN3=/data/ghhb/pvt/master/GH.GH266D.GH262
GHHSR01 SORTOUT=/mnt99/tmp/GH.GH818D.GHSR01_13088
GHHSR01 CPU TIME: 0.010000
20:16:38 STEP GHHSR01 ENDED CODE=0
20:16:38 STEP GHHSR02 STARTED
GHHSR02 SORTIN1=/data/ghhb/pvt/master/GH.GH252D.AB272
GHHSR02 SORTIN2=/data/ghhb/pvt/master/GH.GH262D.AB272
GHHSR02 SORTIN3=/data/ghhb/pvt/master/GH.GH266D.AB268
GHHSR02 SORTOUT=/mnt99/tmp/GH.AB818D.GHSR02_13088
GHHSR02 CPU TIME: 0.020000
20:16:39 STEP GHHSR02 ENDED CODE=0
20:16:39 STEP SIAB858 STARTED
GHHB858 SYS010=/data/ghhbb/pvt/master/GH.AB252D.AB259
GHHB858 SYS020=/mnt99/tmp/GH.AB818D.SIAB858_13088
GHHB858 CPU TIME: 0.000000
20:16:39 STEP GHHB858 ENDED CODE=0
20:16:39 STEP GHHB859 STARTED
GHHB859 SYS010=/mnt99/tmp/GH.AB818D.GHSR01_13088
GHHB859 SYS020=/mnt99/tmp/GH.AB818D.GHB859_13088
GHHB859 CPU TIME: 0.000000
20:16:39 STEP GHHB859 ENDED CODE=0
```
Does Splunk have the capability to handle logs like this?
I was not clear about the directory structure above. The home directory for these logs always stays the same, for example /usr/batch/logs.
Under that comes a directory named for the current date in the format MMDDYYYY, and the log files mentioned above are created in those directories. As an example, the path to all of today's files would be: /usr/batch/logs/04112018/*
What we need Splunk to do is monitor the new directory every day and extract all data from all log files inside the daily directories. Is that still possible?
Yes, Splunk can handle these logs. It would be better, though, if the path for the logs were fixed; otherwise you need to write monitor stanzas covering all the predictable paths, which can get tedious. Also, make sure the file name format is not shared with other files, since the path is not fixed; otherwise you risk indexing unwanted data.
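In your case, since the parent directory /usr/batch/logs is fixed, a single monitor stanza on the parent should be enough, because monitor inputs recurse into subdirectories by default. A rough sketch for inputs.conf on the forwarder (the index and sourcetype names here are placeholders I made up; adjust them to your environment):

```
# Monitoring the fixed parent directory recurses into the daily
# MMDDYYYY subdirectories automatically. whitelist is a regex
# matched against the full path, so it limits indexing to files
# whose names start with log_ followed by digits.
[monitor:///usr/batch/logs]
whitelist = log_\d+
index = batch_logs
sourcetype = batch:exec
disabled = false
```

The whitelist is what protects you from picking up other files that happen to land in those directories.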
Also refer to this doc:
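For the keyword alerting part: once the data is indexed, you can save any search as an alert. As one sketch (again assuming the placeholder index name batch_logs), this would find steps from your sample format that ended with a non-zero return code:

```
index=batch_logs "ENDED" NOT "CODE=0"
```

Save the search as an alert, schedule it (or make it real-time), and attach an email or webhook action so it fires whenever the keywords match.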
Let me know if this helps!!