Getting Data In

Monitoring a large number of files

joonradley
Path Finder

We have a server that generates 100k log files a day. The logs must be forwarded to an indexer. Due to the critical nature of the server, we can only install a light forwarder. The files only need to be loaded once; ongoing monitoring is not needed.

Using monitor slows the server down to a crawl, and we cannot use BATCH as the data must be preserved. Sadly, we cannot copy the files to another directory for BATCH input.

Tried using fschange, but it does not forward the actual files to the indexer when sendCookData=false.

Any ideas?


brianirwin
Path Finder

With monitor, I think the issue is time_before_close. This setting tells Splunk not to close a file until x seconds after the last write. The default value is 3 seconds, and with only 86,400 seconds in a day, just opening and closing 100K files (100,000 x 3 = 300,000 seconds) uses up more time than you have.

Looking at the manual, it seems that when you override this for monitor in inputs.conf you can only set it to an integer, so even if you go down to 1 you will be in trouble.

You could try setting time_before_close = 1, but with 100K files that is still 100,000 seconds of waiting, so it is going to take longer than you want.
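The arithmetic behind that estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every file is held open one after another for the full time_before_close, which is the simplification used in this post and not a statement about how the forwarder actually schedules files. The function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def seconds_spent_waiting(file_count: int, time_before_close: int) -> int:
    """Total wait if each file is held open in turn for time_before_close seconds."""
    return file_count * time_before_close


def fits_in_a_day(file_count: int, time_before_close: int) -> bool:
    """True if the serial wait for all files fits within 24 hours."""
    return seconds_spent_waiting(file_count, time_before_close) <= SECONDS_PER_DAY


def max_files_per_day(time_before_close: int) -> int:
    """Largest file count whose serial wait still fits within 24 hours."""
    return SECONDS_PER_DAY // time_before_close


if __name__ == "__main__":
    for t in (3, 1):  # the default, and the smallest integer override
        total = seconds_spent_waiting(100_000, t)
        print(f"time_before_close={t}: {total} s needed, "
              f"fits in a day: {fits_in_a_day(100_000, t)}, "
              f"max files/day: {max_files_per_day(t)}")
```

Under that serial assumption, even the smallest integer value leaves room for fewer files per day than the server produces.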

To the earlier point, you may need to tarball the files, or cat x number of them together, and send the result to a separate directory where you sinkhole/batch them, or do anything else that reduces the number of files to be eaten. If nothing else, I think your inode tables will thank you if you can combine some of these files.
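For the "cat x number of files together" idea, something along these lines might work. It is a rough sketch, not a tested production script: the function name, the `*.log` pattern, the bundle naming, and the bundle size of 1000 are all assumptions for illustration. The originals are only read, never moved or deleted, so the source data is preserved and only the combined copies would go to the batch directory.

```python
from pathlib import Path


def combine_logs(source_dir, dest_dir, files_per_bundle=1000, pattern="*.log"):
    """Concatenate small log files into larger bundles in a separate directory.

    Source files are read but left untouched. Returns the bundle paths created.
    """
    src = Path(source_dir)
    dst = Path(dest_dir)
    dst.mkdir(parents=True, exist_ok=True)

    # Sort so that bundles are built in a stable, repeatable order.
    files = sorted(p for p in src.glob(pattern) if p.is_file())

    bundles = []
    for start in range(0, len(files), files_per_bundle):
        chunk = files[start:start + files_per_bundle]
        bundle_path = dst / f"bundle_{start // files_per_bundle:05d}.log"
        with bundle_path.open("wb") as out:
            for f in chunk:
                data = f.read_bytes()  # fine for small logs; stream for big ones
                out.write(data)
                # Keep the last event of one file from running into the
                # first event of the next.
                if data and not data.endswith(b"\n"):
                    out.write(b"\n")
        bundles.append(bundle_path)
    return bundles
```

With 1000 files per bundle, 100K files a day would become 100 files for the input to pick up. Note that combining files loses the original file name as the source of each event, which may or may not matter for your searches.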


eashwar
Communicator

Hello, you have a spelling error!!

sendCookedData = false

I am learning Splunk!! I set up a forwarder and an indexer, and they are working perfectly. The forwarded logs get indexed in the main index, which is the default.

I want to know how to index the data in a custom index.

Thanks in advance.


stefandagerman
Path Finder

How about you create your own topic, given the completely different nature of your question, once you have determined that the Splunk documentation at http://docs.splunk.com/Documentation/Splunk/latest/admin/inputsconf does not provide the answer to your question?

Please don't hijack threads as it is unlikely that you will get a response.

Genti
Splunk Employee

Perhaps you could tarball the files into a .gz and have Splunk monitor that instead.
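A possible way to build such a tarball without touching the originals, using Python's standard tarfile module. This is only a sketch: the function name and the `*.log` pattern are assumptions, and whether the forwarder unpacks the archive the way you want should be checked against the documentation for your Splunk version.

```python
import tarfile
from pathlib import Path


def tarball_logs(source_dir, archive_path, pattern="*.log"):
    """Pack matching files from source_dir into one gzip-compressed tarball.

    Source files are only read. Returns the number of files archived.
    """
    src = Path(source_dir)
    files = sorted(p for p in src.glob(pattern) if p.is_file())

    # "w:gz" creates a new gzip-compressed tar archive.
    with tarfile.open(str(archive_path), "w:gz") as tar:
        for f in files:
            # arcname drops the directory part so the archive holds bare names.
            tar.add(str(f), arcname=f.name)
    return len(files)
```

It is probably safest to write the archive somewhere outside the monitored path and move it in once it is complete, so that a half-written file is never picked up.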
