Does Splunk have the ability to batch ingest large .csv files?

nick405060
Motivator

Is Splunk capable of batch ingesting large .csv files? It does not seem like it.

For example, the monitor stanza below works:

[monitor:///opt/splunk/var/run/splunk/csv/tenable_reports/*/*.csv]
disabled=false
index=security
sourcetype=csv
ignoreOlderThan=30d

However, when I change monitor to batch, add move_policy=sinkhole, and remove ignoreOlderThan, it breaks: nothing is ingested and no files are purged. Adding initCrcLen=1000000000 makes no difference. If I quickly create a much smaller test CSV in vim, it works. Other users have had similar issues:

https://answers.splunk.com/answers/660982/why-is-the-batch-input-not-indexing-certain-files.html

The batch input runs on my search head; the index resides on the indexer. The same setup works without problems with monitor, or with smaller CSVs.
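
For reference, the batch variant described above would look roughly like the sketch below. It is reconstructed from the description (monitor changed to batch, move_policy=sinkhole added, ignoreOlderThan removed), not copied from a working configuration, and it is not a verified fix. The commented-out initCrcLength line is an assumption: as far as I know, inputs.conf.spec spells the setting initCrcLength rather than initCrcLen and restricts it to a range far below 1000000000, so the value tried above may simply have been ignored.

```ini
# inputs.conf: sketch of the batch variant described in the post
[batch:///opt/splunk/var/run/splunk/csv/tenable_reports/*/*.csv]
disabled = false
index = security
sourcetype = csv
# Batch inputs delete each file after indexing it; sinkhole is the move policy for that
move_policy = sinkhole
# ignoreOlderThan is intentionally left out, as described above
# Assumption: documented spelling is initCrcLength; the value here is only illustrative
# initCrcLength = 1024
```

Because sinkhole removes the source files once they are indexed, it is safer to try a stanza like this against copies of the CSVs first.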

It's a shame Splunk is not a robust enough SIEM to be able to handle batch ingestion of a CSV file 😕

nick405060
Motivator

This is only a partial answer, 15 hours and 100+ reboots later. Sigh.

I had to stop Splunk, delete the app, the fishbucket, and the batch directory all at once, restart, and then move the files back in. If I did not follow these steps exactly, it did not work. Even then, it indexed and deleted only about 500MB of CSVs before it stopped indexing or deleting anything. So I repeated the process with my remaining ~800MB of CSV files, and again it stopped halfway through. The third pass finally finished the batch ingestion.

This doesn't help much, though: the next time I need to deposit a large number of .csv files, I will have to go through this whole process again.

richgalloway
SplunkTrust

How large are the files?

nick405060
Motivator

Up to 250MB.
