All Apps and Add-ons

Splunk Add-on for Amazon Web Services: How to get a CSV file stored in Amazon S3 to properly split at index-time?

jpvlsmv
Path Finder

I'm having trouble getting a CSV file that I've stored in Amazon S3 to properly split at index-time.

I'm using the Splunk Add-on for AWS, which allows me to define an S3 bucket to monitor. It pulls the data down just fine when a new CSV is uploaded:

[aws_s3://s3_autoruns]
disabled = false
aws_account = Splunk Reader
bucket_name = mybucket
index = jm
initial_scan_datetime = default
interval = 30
max_items = 100000
max_retries = 10
recursion_depth = 3
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$
blacklist = .*
character_set = UTF-16LE

I have in my props.conf a working transform (which changes the Host field to part of the S3 url), so I know this stanza is hitting for this data.

[source::.../autoruns.txt]
TRANSFORMS-s3host = transform-s3-integhost
DATETIME_CONFIG=CURRENT

With this, I get an event per line of the file.

I think I should be able to add to my props.conf:

INDEXED_EXTRACTIONS=CSV
FIELD_NAMES=Time,EntryLocation,Entry,Enabled,Category,Description,Publisher,ImagePath,LaunchString,MD5,SHA-1,SHA-256
FIELD_DELIMITER=,

But when I do that, it does not change anything. I still get one event per line, and no EntryLocation field to search on.

Any thoughts?

Thanks,
--Joe

dmaislin_splunk
Splunk Employee
Splunk Employee

I have run into this similar issue when streaming data via scripted input into Splunk. In the interim, please use the DELIMS option for search time field extractions:

http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/transformsconf

jpvlsmv
Path Finder

If I mirror the S3 bucket to a local directory and monitor it, it splits nicely:
[monitor:///data]
disabled = 0
crcSalt = <SOURCE>
index = jm
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$

--Joe

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...