Are there any issues with Splunk reading and indexing gzip files via a universal forwarder?

acidkewpie
Path Finder

Hi,

I've heard warnings against configuring Splunk to read gzipped files, including horror stories of it not always recognizing that a file was gzipped and indexing the compressed raw data instead. I'm looking to piggyback on an existing process that drops a pile of gzipped logs onto a server that already has a universal forwarder installed. I'd rather not write custom scripts to decompress the files to a temp location first if there are no genuine known concerns about how reliably Splunk indexes gzipped files.

esix_splunk
Splunk Employee

Splunk can read zip/gzip files. Do understand that what Splunk does on the back end is:

1) Unarchives
2) Reads the Files
3) Indexes
4) Deletes the unarchived pieces

Additionally, the unzip process is not multithreaded, so you can see a fair amount of latency and CPU time when this is done, especially if you are monitoring a large number of zip files. You also have to be careful about free disk space.
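
If you want to sanity-check a drop directory before pointing the forwarder at it, something like the following Python sketch could help. It is not part of Splunk and makes assumptions: `drop_dir`, `scratch_dir` and the `headroom` factor are hypothetical names chosen for illustration, and `scratch_dir` stands in for wherever the decompressed data ends up, which this thread does not state. It checks that each `*.gz` file really starts with the gzip magic bytes and compares the largest decompressed size against free disk space.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip file (RFC 1952)


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def uncompressed_size(path, chunk_size=1 << 20):
    """Count decompressed bytes by streaming; nothing is written to disk."""
    total = 0
    with gzip.open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            total += len(chunk)
    return total


def preflight(drop_dir, scratch_dir, headroom=2.0):
    """Summarise the *.gz files in drop_dir against free space in scratch_dir.

    headroom is an arbitrary safety factor (an assumption, not a Splunk setting).
    """
    files = sorted(Path(drop_dir).glob("*.gz"))
    not_gzip = [p.name for p in files if not is_gzip(p)]
    sizes = [uncompressed_size(p) for p in files if is_gzip(p)]
    largest = max(sizes, default=0)
    free = shutil.disk_usage(scratch_dir).free
    return {
        "not_gzip": not_gzip,                # named .gz but not gzip data
        "largest_uncompressed": largest,     # bytes
        "total_uncompressed": sum(sizes),    # bytes
        "enough_space": free >= largest * headroom,
    }
```

Calling `preflight("/path/to/drop", "/path/to/scratch")` returns a small dictionary you can log or alert on. Note that it decompresses every file once just to measure it, so it adds its own CPU cost on top of what the forwarder spends.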

acidkewpie
Path Finder

That all sounds reasonable as long as it's reliable. These are daily batch files, so any manageable delay isn't really a problem, and it's done overnight when things are relatively sleepy. Where would the files be decompressed to by default?

Ultimately this is a temp hack until we get a real-time stream of equivalent data, so it looks good all round to me. Thanks.

btt
Path Finder

As far as I understand, there is no need to decompress gzip files before indexing them.
