Getting Data In

Is it the forwarder or indexer that unzips monitored zip files?

reggie_123
Explorer

I understand that Splunk first uncompresses monitored zip files and only then indexes them.
Where does the uncompressing take place: on the universal forwarder or on the indexer? In other words, on which box should I allocate enough disk space and CPU resources for the uncompressing?


woodcock
Esteemed Legend

This is done on the forwarder. Check out this Q&A:
https://answers.splunk.com/answers/91386/the-limitation-of-monitor-files-in-one-directory.html

Most people who put compressed files directly into inputs.conf will eventually see logs like this in splunkd.log:

DateTime INFO TailingProcessor - failed to insert into AQ, retrying...

This will be accompanied by logs like this in metrics.log:

DateTime INFO Metrics - group=queue, name=aq, blocked=true, max_size=10000, filled_count=7, empty_count=0, current_size=10000, largest_size=10000, smallest_size=9996

Here AQ is the queue feeding the ArchiveProcessor, the thread that handles compressed and archived inputs (.gz, .bz2, .Z, .tar, .zip, .tgz). The ArchiveProcessor is single-threaded and handles archives one at a time. A blocked AQ at max_size=10000 means the file-processing code has found more than 10000 archive files, which are then processed in turn. The only hope in this case is that the workload has light periods that let the ArchiveProcessor catch up on the backlog, but this may never happen.
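
To spot this condition without reading metrics.log by eye, something like the following could be used. It is only a minimal sketch in Python: the function names are hypothetical, and it assumes the key=value layout shown in the sample line above rather than any documented Splunk format guarantee.

```python
import re

# Matches key=value pairs such as "name=aq" or "max_size=10000".
# Values end at the next comma or whitespace.
_PAIR = re.compile(r"(\w+)=([^,\s]+)")


def parse_metrics_line(line):
    """Return the key=value fields of one metrics.log line as a dict of strings."""
    return dict(_PAIR.findall(line))


def is_aq_blocked(line):
    """True when the line reports the aq queue (group=queue) as blocked."""
    fields = parse_metrics_line(line)
    return (
        fields.get("group") == "queue"
        and fields.get("name") == "aq"
        and fields.get("blocked") == "true"
    )


# Sample line taken from the post above (timestamp left as a placeholder).
sample = (
    "DateTime INFO Metrics - group=queue, name=aq, blocked=true, "
    "max_size=10000, filled_count=7, empty_count=0, current_size=10000, "
    "largest_size=10000, smallest_size=9996"
)

fields = parse_metrics_line(sample)
print(is_aq_blocked(sample), fields["current_size"], fields["max_size"])
```

Running is_aq_blocked over each line of metrics.log on the forwarder and counting the hits gives a rough picture of how often the archive queue is saturated.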


esix_splunk
Splunk Employee

The original point of ingestion handles decompressing the archive and reading its contents.
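
For sizing that host, one rough approach is to check how much data the monitored zip files expand to. The sketch below uses only Python's standard zipfile module; the function names are hypothetical, it only looks at .zip files directly inside one directory, and it says nothing about how Splunk itself stages decompressed data.

```python
import os
import zipfile


def uncompressed_bytes(zip_path):
    """Sum the uncompressed sizes recorded in a zip's central directory."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def total_uncompressed_bytes(directory):
    """Add up uncompressed_bytes() for every .zip file directly in `directory`."""
    total = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip non-zip files and files that merely carry a .zip extension.
        if name.lower().endswith(".zip") and zipfile.is_zipfile(path):
            total += uncompressed_bytes(path)
    return total
```

Calling total_uncompressed_bytes on the monitored directory gives an upper-end estimate of the raw data the ingestion host has to read, which can be compared against the compressed size on disk.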
