Splunk Add-on for Amazon Web Services: How to get a CSV file stored in Amazon S3 to properly split at index-time?

jpvlsmv
Path Finder

I'm having trouble getting a CSV file that I've stored in Amazon S3 to properly split at index-time.

I'm using the Splunk Add-on for AWS, which allows me to define an S3 bucket to monitor. It pulls the data down just fine when a new CSV is uploaded:

[aws_s3://s3_autoruns]
disabled = false
aws_account = Splunk Reader
bucket_name = mybucket
index = jm
initial_scan_datetime = default
interval = 30
max_items = 100000
max_retries = 10
recursion_depth = 3
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$
blacklist = .*
character_set = UTF-16LE

My props.conf already has a working transform (it sets the host field to part of the S3 URL), so I know this stanza matches this data.

[source::.../autoruns.txt]
TRANSFORMS-s3host = transform-s3-integhost
DATETIME_CONFIG=CURRENT

With this, I get an event per line of the file.

I think I should be able to add to my props.conf:

INDEXED_EXTRACTIONS=CSV
FIELD_NAMES=Time,EntryLocation,Entry,Enabled,Category,Description,Publisher,ImagePath,LaunchString,MD5,SHA-1,SHA-256
FIELD_DELIMITER=,

But when I do that, nothing changes. I still get one event per line, with no EntryLocation field to search on.

Any thoughts?

Thanks,
--Joe

dmaislin_splunk
Splunk Employee

I have run into a similar issue when streaming data into Splunk via a scripted input. In the interim, please use the DELIMS option in transforms.conf for search-time field extractions:

http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/transformsconf
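
A minimal sketch of what that search-time extraction might look like, assuming the s3_autoruns sourcetype and the column list from the question. The transform stanza name autoruns_csv_fields and the REPORT class name are hypothetical; pick whatever fits your naming scheme. This goes on the search head rather than in the index-time path:

```ini
# props.conf
# Tie the search-time extraction to the sourcetype set by the S3 input.
[s3_autoruns]
REPORT-autoruns_fields = autoruns_csv_fields

# transforms.conf
# Stanza name is an assumption; it only has to match the REPORT value above.
[autoruns_csv_fields]
# Split each event on commas and assign the values to FIELDS in order.
DELIMS = ","
FIELDS = "Time","EntryLocation","Entry","Enabled","Category","Description","Publisher","ImagePath","LaunchString","MD5","SHA-1","SHA-256"
```

Two caveats: the header line of each file will still be indexed as an ordinary event, and Splunk may normalize field names that contain hyphens (SHA-1 may show up as SHA_1), so check the field sidebar before building searches on those names.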

jpvlsmv
Path Finder

If I mirror the S3 bucket to a local directory and monitor it, it splits nicely:
[monitor:///data]
disabled = 0
crcSalt = <SOURCE>
index = jm
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$

--Joe
