Solved: splunk with s3 add-on - monitor a s3 directory

tzhang101 · ‎01-24-2014

Hi, I have installed Splunk with the S3 add-on. I can add a whole S3 bucket as a data input, but I can't add a directory inside a bucket (bucket/directory).
I get an error saying no objects were found under the directory, even though the directory contains subdirectories with files in them. How can I work around this? Thanks.

mkinsley_splunk · ‎01-24-2014

To pull an entire directory from S3, the key you specify must end in a "/".

In S3, this is perfectly legal:

/foo/bar
/foo

This would be illegal in a normal file system, where /foo cannot be both a file and a directory. Directories don't actually exist in S3; they are just an illusion for our benefit. Everything is just pairs of (key, object).

By ending your key in a "/", you're telling S3: I want everything that matches key/*
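
To make the flat key model concrete, here is a minimal Python sketch that simulates prefix matching over an in-memory list of keys. It does not call S3 or the add-on; `list_by_prefix` and the sample keys are hypothetical and only illustrate why the trailing "/" changes what matches.

```python
# Simulated S3 bucket: a flat collection of keys, no real directories.
KEYS = [
    "foo",            # an object named "foo"
    "foo/bar",        # an object whose key merely contains a "/"
    "foo/baz/1.log",
    "foobar.log",
]


def list_by_prefix(keys, prefix):
    """Return every key that starts with the given prefix, in sorted order."""
    return sorted(k for k in keys if k.startswith(prefix))


# Without the trailing slash the prefix also matches "foo" and "foobar.log".
print(list_by_prefix(KEYS, "foo"))
# With the trailing slash only the keys "under" foo/ match.
print(list_by_prefix(KEYS, "foo/"))
```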

I'm not sure if this will fix your problem (there could also be a problem in the underlying code), but give it a try.

Thanks

shepdelacrem3 · ‎05-09-2014

The "answer" above is not valid since the S3 add-on does not seem to traverse into subdirectories correctly unless you add an entire bucket as the target.

For example, given a bucket "log-bucket" that contains ELB logs, you can only monitor either the entire bucket with a single input or a single directory/object. When you set up ELB and CloudTrail logging, AWS manages the directory structure and organization of those logs in the S3 bucket you specify.

So in "log-bucket" you will have:
/AWSLogs
/AWSLogs/12345678890 (this is your account number)
/AWSLogs/12345678890/elasticloadbalancing
/AWSLogs/12345678890/elasticloadbalancing/us-east-1

Now you get to the actual log directories organized by date:
/AWSLogs/12345678890/elasticloadbalancing/us-east-1/{YEAR}/{Month}/{Day}/something.log

If you were to put CloudTrail logs in this same bucket they would be in the same dir structure.

/AWSLogs/12345678890/CloudTrail/us-east-1/{YEAR}/{Month}/{Day}/something.log

If you have CloudFront and S3 access logs in this same bucket then you would have more issues when monitoring the entire bucket.

Using an input of s3://log-bucket/AWSLogs/12345678890/CloudTrail/
would give the following error:
Encountered the following error while trying to update: In handler 's3': Invalid configuration specified: No objects found inside s3://log-bucket/AWSLogs/12345678890/CloudTrail/.

In addition to these problems, the S3 add-on's s3.py script does not appear to handle "paging" of buckets properly. If there are more than 1000 objects in a bucket, the script will only ever see the first 1000, because it does not use markers to page through the results.
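
The paging problem can be sketched as follows. This is a simulation, not the add-on's s3.py code: `fake_list_page` is a hypothetical stand-in for a single S3 list request that returns at most `max_keys` keys after a marker, and `list_all_keys` shows the kind of loop that keeps requesting pages until the listing is no longer truncated.

```python
def fake_list_page(all_keys, marker="", max_keys=1000):
    """Hypothetical stand-in for one S3 list request.

    Returns (page, is_truncated): up to max_keys keys that sort
    strictly after marker, and whether more keys remain.
    """
    remaining = [k for k in sorted(all_keys) if k > marker]
    page = remaining[:max_keys]
    return page, len(remaining) > max_keys


def list_all_keys(all_keys, max_keys=1000):
    """Page through the whole listing, passing the last key as the next marker."""
    result = []
    marker = ""
    while True:
        page, truncated = fake_list_page(all_keys, marker, max_keys)
        result.extend(page)
        if not truncated:
            return result
        marker = page[-1]  # resume after the last key seen


bucket = ["logs/%05d.log" % i for i in range(2500)]
first_page, _ = fake_list_page(bucket)
print(len(first_page))              # a single request sees only one page
print(len(list_all_keys(bucket)))   # the marker loop sees every key
```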

See: http://answers.splunk.com/answers/66611/splunk-for-amazon-s3-add-on-not-able-to-fetch-all-logs

sarit_s · ‎05-12-2019

Is it possible to tell Splunk to ignore some subdirectories in the S3 input?
So if I have
/foo/bar/1
/2
/3
will it ignore 3?

thanks

mkinsley_splunk · ‎01-29-2014

It looks to me like the code is passing the stanza name to S3 as-is.

That would mean you can use any valid S3 key.

To test whether you're using a valid S3 key, I really like the AWS CLI utility. It is available as a pip package. You can install it with pip install awscli.

Then, from the command line you can run:

aws s3 ls s3:///

tzhang101 · ‎01-29-2014

Can the key contain a wildcard such as *? Thanks.
