How does Splunk handle gzip'd logs?

jm_tesla
Engager

Suppose I have `/var/log/nginx/access.log` and then a dozen files in the same directory named like `access.log-<date>.gz`. When Splunk processes the gzip'd files, is it supposed to index them under the `/var/log/nginx/access.log` source? I ask because I've noticed that these gzip files show up when I query:

```
source="/var/log/nginx/access.log*" | stats count by source
```

I'd appreciate a link to docs on this; I couldn't find any. Thanks!

jm_tesla
Engager

Thanks, and that makes sense. I was hoping (expecting, honestly) that Splunk would realize that the gzip'd log files "were really just `access.log` at a previous time". 

It's good to have clarity!

inventsekar
SplunkTrust

Hi @jm_tesla 

>>>the gzip'd log files "were really just `access.log` at a previous time".

Yes, you are right.

The "previous time" will be the file's last modification time, and that will become `_time`.

Each file's name will be assigned to the `source` field.
The sourcetype will be just the file name (with "gzip" removed).

The source will be `filename.gzip\filename1.txt` and `filename.gzip\filename2.txt` (I just verified this on Splunk 9.3.0).

If this answers your question, could you please mark this post as resolved (so that it moves from unanswered to answered, and I get credited with an authored solution as well 😉)? Thanks.


Best Regards

Sekar

inventsekar
SplunkTrust

Hi @jm_tesla 

Do you have any further questions? If not, could you please mark this post as resolved (so that it moves from unanswered to answered, and I get credited with an authored solution as well)? Thanks.


Best Regards

Sekar

inventsekar
SplunkTrust

Hi @jm_tesla 

To make this easy to follow, let's say there are two files, `/var/log/nginx/access.log` and `/var/log/nginx/access1.log`, inside a gzip file.

When you onboard this gzip'd log to Splunk, the Splunk engine will undo the gzip, read both files, and assign:

  • source for the first file as `/var/log/nginx/access.log`
  • source for the second file as `/var/log/nginx/access1.log`

From the documentation: https://docs.splunk.com/Documentation/Splunk/9.3.0/Data/Monitorfilesanddirectories

These archive formats are supported:

  • TAR
  • GZ
  • BZ2
  • TAR.GZ and TGZ
  • TBZ and TBZ2
  • ZIP
  • Z

    Best Regards, Sekar
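
A side note on the two-file example above, in case it helps: a plain `.gz` file wraps a single compressed stream, while an archive format such as `.tar.gz` can carry several named members. The sketch below uses only Python's standard library to show that difference at the file-format level. The log line and member names are made up for illustration, and nothing here calls Splunk or shows how Splunk names the resulting sources.

```python
import gzip
import io
import tarfile

# Hypothetical log content, used only for illustration.
payload = b"127.0.0.1 - - GET /index.html\n"

# A plain .gz file: one compressed byte stream in, the same bytes out.
gz_bytes = gzip.compress(payload)
restored = gzip.decompress(gz_bytes)

# A .tar.gz archive: several named members inside one compressed file.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    for name in ("access.log", "access1.log"):
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

# Read the archive back and list the member names it carries.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    member_names = tar.getnames()

print(restored == payload)
print(member_names)
```

So a rotated file like `access.log-<date>.gz` would normally hold one log, and a multi-file case is more likely to be a TAR, TGZ, or ZIP from the list above.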

richgalloway
SplunkTrust

The gzip'd files are indexed under their own source names. They show up in the query results because their names match the pattern `source="/var/log/nginx/access.log*"`. Remove the asterisk and only the one file will appear.
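
To see why the trailing asterisk pulls in the rotated files, here is a minimal sketch that mimics the wildcard match outside of Splunk. It uses Python's `fnmatchcase` as a stand-in for the `*` wildcard, which is an approximation and not Splunk's actual matcher. The dated file names and the `matching_sources` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical source values, modeled on the file names in the question.
sources = [
    "/var/log/nginx/access.log",
    "/var/log/nginx/access.log-20240101.gz",
    "/var/log/nginx/access.log-20240102.gz",
    "/var/log/nginx/error.log",
]


def matching_sources(pattern, candidates):
    """Return the candidates matching a pattern in which * matches any run of characters."""
    return [s for s in candidates if fnmatchcase(s, pattern)]


# With the trailing asterisk, the live file and both rotated files match.
with_wildcard = matching_sources("/var/log/nginx/access.log*", sources)

# Without it, only the exact source name matches.
exact = matching_sources("/var/log/nginx/access.log", sources)

print(with_wildcard)
print(exact)
```

The first list contains the live file plus the two `.gz` names, and the second contains only `/var/log/nginx/access.log`, which mirrors what the `stats count by source` output should look like with and without the asterisk.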

---
If this reply helps you, Karma would be appreciated.