Getting Data In

S3 - Multiple directories of data to be ingested into Splunk Cloud

shoaibalimir
Explorer

Hi Community,

I'm exploring ways to ingest data into Splunk Cloud from an Amazon S3 bucket that has multiple directories and multiple files to be ingested into Splunk.

I have assessed the Generic S3, SQS-based S3, and Data Manager inputs for AWS available in Splunk, but I'm not getting the required outcome.

My use case is given below:

There's an S3 bucket named exampledatastore, which contains a directory named statichexcodedefinition, which in turn contains multiple message IDs and dates.

The example S3 structure is given below:

s3://exampledatastore/statichexcodedefinition/{messageId}/functionname/{date}/* - functionnameattribute

The {messageId} and {date} values are dynamic. I have a start date to begin with, but the messageId varies.

Can you please assist me with how to get this data into Splunk?

Many Thanks!


livehybrid
SplunkTrust

Hi @shoaibalimir 

When you assessed these and didn't get the required outcome, what issue did you run into specifically?

Is this a one-time ingestion of historic files already in S3, or do you want to ingest on an ongoing basis (I assume the latter)?

Personally I would avoid Generic S3, as it relies on checkpoint files and can get messy quickly. SQS-based S3 is the way to go here, I believe.

Check out https://splunk.github.io/splunk-add-on-for-amazon-web-services/SQS-basedS3/ for more details on setting up an SQS-based S3 input. It's also worth noting that the dynamic parts of the path shouldn't be a problem. If you have requirements to put the data into specific indexes depending on the dynamic values, you can configure this when you set up the event notification (https://docs.aws.amazon.com/AmazonS3/latest/userguide/enable-event-notifications.html), and you will probably need multiple SQS queues. Alternatively, you could use props/transforms to route to the correct index at ingest time.
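For illustration, here is a minimal boto3 sketch of the event notification side, using the bucket and prefix from this thread; the queue ARN, region, account ID, and notification ID are hypothetical placeholders for your environment. The key point is that prefix filters only match the static leading part of the key, so the dynamic {messageId} and {date} segments don't need to appear in the filter.

    import boto3

    s3 = boto3.client("s3")

    # Note: this call replaces the bucket's existing notification configuration,
    # so include any other configurations you already rely on.
    s3.put_bucket_notification_configuration(
        Bucket="exampledatastore",
        NotificationConfiguration={
            "QueueConfigurations": [
                {
                    "Id": "statichexcodedefinition-to-splunk",  # hypothetical ID
                    "QueueArn": "arn:aws:sqs:us-east-1:123456789012:splunk-ingest",  # hypothetical ARN
                    "Events": ["s3:ObjectCreated:*"],
                    # The prefix covers only the static part of the path; the dynamic
                    # {messageId} and {date} segments fall under it automatically.
                    "Filter": {
                        "Key": {
                            "FilterRules": [
                                {"Name": "prefix", "Value": "statichexcodedefinition/"}
                            ]
                        }
                    },
                }
            ]
        },
    )

The SQS queue also needs an access policy that allows S3 to send messages to it, and the SQS-based S3 input in the add-on then consumes from that queue (see the docs linked above for the full queue setup). If you want different indexes per prefix, repeat the queue configuration with a different prefix and a different queue.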

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing

 

shoaibalimir
Explorer

Hi @livehybrid 

I'll assess again with the SQS-based S3 input; I'll need to ingest both historic data and an ongoing data stream.

From my initial observations, I think I'll either need to use multiple SQS-based S3 inputs or use a Lambda function to funnel the notifications into a single SQS-based S3 input, as in the rough sketch below.
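Something like this minimal Lambda sketch is what I have in mind; the target queue URL is a placeholder, and it assumes the forwarded message keeps the S3 event notification format the add-on expects:

    import json
    import os

    import boto3

    sqs = boto3.client("sqs")

    # Placeholder: URL of the single queue the SQS-based S3 input reads from.
    TARGET_QUEUE_URL = os.environ["TARGET_QUEUE_URL"]


    def handler(event, context):
        # Triggered directly by S3; forward the notification body unchanged so the
        # bucket/key details remain parseable downstream.
        sqs.send_message(
            QueueUrl=TARGET_QUEUE_URL,
            MessageBody=json.dumps(event),
        )
        return {"forwarded_records": len(event.get("Records", []))}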

Please let me know if there's any alternative to this approach.

Thanks!
