Getting Data In

Can we get custom logs from a batch job running on an AWS EC2 instance into Splunk Cloud for analysis?

New Member

Hello,

We currently have custom batch jobs running on EC2 instances in AWS, and each of these processes creates one log. The log name and directory are not fixed, but they are predictable. For example, the name format could be log_PIDYYYYDDMM.

Our main purpose is to make these logs available to developers for review without requiring them to log in to production. We would also like to make these logs searchable in Splunk for certain keywords, and to alert based on those keywords as well.

An example of the log format would be like this:

############  E X E C   L O G  ############

13080 20:16:38 NEW818D JOB STARTED CLASS=A NODE=JK
                    UID=ghjrib_on
      20:16:38 STEP GHHSR01 STARTED
                    GHHSR01 SORTIN1=/data/ghhb/pvt/master/GH.GH252D.GH266
                    GHHSR01 SORTIN2=/data/ghhb/pvt/master/GH.GH262D.GH266
                    GHHSR01 SORTIN3=/data/ghhb/pvt/master/GH.GH266D.GH262
                    GHHSR01 SORTOUT=/mnt99/tmp/GH.GH818D.GHSR01_13088
                    GHHSR01 CPU TIME: 0.010000
      20:16:38 STEP GHHSR01 ENDED CODE=0
      20:16:38 STEP GHHSR02 STARTED
                    GHHSR02 SORTIN1=/data/ghhb/pvt/master/GH.GH252D.AB272
                    GHHSR02 SORTIN2=/data/ghhb/pvt/master/GH.GH262D.AB272
                    GHHSR02 SORTIN3=/data/ghhb/pvt/master/GH.GH266D.AB268
                    GHHSR02 SORTOUT=/mnt99/tmp/GH.AB818D.GHSR02_13088
                    GHHSR02 CPU TIME: 0.020000
      20:16:39 STEP GHHSR02 ENDED CODE=0
      20:16:39 STEP SIAB858 STARTED
                    GHHB858 SYS010=/data/ghhbb/pvt/master/GH.AB252D.AB259
                    GHHB858 SYS020=/mnt99/tmp/GH.AB818D.SIAB858_13088
                    GHHB858 CPU TIME: 0.000000
      20:16:39 STEP GHHB858 ENDED CODE=0
      20:16:39 STEP GHHB859 STARTED
                    GHHB859 SYS010=/mnt99/tmp/GH.AB818D.GHSR01_13088
                    GHHB859 SYS020=/mnt99/tmp/GH.AB818D.GHB859_13088
                    GHHB859 CPU TIME: 0.000000
      20:16:39 STEP GHHB859 ENDED CODE=0

Does Splunk have the capability to handle logs like this?

0 Karma

New Member

I was not clear on the directory structure above. The home directory for these logs always stays the same, for example /usr/batch/logs.

After that comes a directory named for the current date in the format MMDDYYYY, and the log files mentioned above are created inside those directories. As an example, the path to all of today's files would be: /usr/batch/logs/04112018/*

What we need Splunk to do is monitor the new directory every day and index all data from all log files inside the daily directories. Is that still possible?

0 Karma

Motivator

Hey @hitenv79,

Yes, Splunk can handle these logs, but it would be better if the path for the logs were fixed. Otherwise you need to write a monitor stanza for every predictable path, which can be tedious. Also, since the path is not fixed, the file name format should not match that of any other files, or you risk indexing unwanted data.

Also, refer to this doc:
http://docs.splunk.com/Documentation/Splunk/7.0.3/Data/Monitorfilesanddirectories
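For the daily-directory layout you described, a single monitor stanza with a whitelist can cover every dated subdirectory, so you don't need one stanza per path. You would normally configure this on a Universal Forwarder installed on the EC2 instances, which is the usual way to get file data into Splunk Cloud. A rough sketch — the sourcetype name and the regexes are illustrative assumptions based on the log_PID<date> naming, so adjust them to your actual file names:

```ini
# inputs.conf (on the Universal Forwarder on each EC2 instance)
# Monitor the fixed top level recursively; the whitelist regex
# (an assumption based on the log_PIDYYYYDDMM naming) keeps out
# any unrelated files that land under /usr/batch/logs.
[monitor:///usr/batch/logs]
whitelist = log_\d+$
sourcetype = batch_exec_log
disabled = false

# props.conf (the sourcetype name "batch_exec_log" is illustrative)
# Breaking one event per STEP line is an assumption -- drop
# BREAK_ONLY_BEFORE if you'd rather index each EXEC LOG file
# as a single multi-line event.
[batch_exec_log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = \d{2}:\d{2}:\d{2}\s+STEP
TIME_FORMAT = %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 40
```

Once the data is indexed, a scheduled search on the keywords you care about (for example, searching sourcetype=batch_exec_log for "ENDED" events without "CODE=0") can be saved as an alert to cover the notification requirement.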

Let me know if this helps!!

0 Karma