Getting Data In

How to monitor and index tar.gz files in Splunk?

dantimola
Communicator

I have a tar.gz file and I want to continuously monitor it. I tried to index it in Splunk Enterprise via Settings > Data Inputs > Files & Directories, but when I run a search, Splunk doesn't return any results.
What are the steps to continuously monitor tar.gz files and index them in Splunk? Do I need to write a script that automatically decompresses the tar.gz file so Splunk can index it? Thanks.

0 Karma
1 Solution

richgalloway
SplunkTrust

Splunk won't index compressed files because they look like binaries. A script is one idea. Or you could have Splunk monitor the files before they are tarred.

---
If this reply helps you, an upvote would be appreciated.


Rhin0Crash
Path Finder

If you are trying to monitor a file on a universal forwarder (i.e. tar.gz on a remote system), you can use the GUI to create a forwarder data/file input.

Settings --> Data Inputs --> Forwarded Inputs --> Files & Directories

Once that is complete, go to Forwarder Management and enable the app by editing it and checking the box. The deployment will take a few minutes, but results should start coming in shortly thereafter.

If it doesn't start indexing the data, and you have direct access to the file location, try moving the files out of that location (e.g. from /log to /opt) and back again. The move should trigger indexing.

dantimola
Communicator

Hi, thank you for your answer. I'm using a heavy forwarder to monitor those compressed logs.

0 Karma

dbcase
Motivator

My comment above works for me

0 Karma

ddrillic
Ultra Champion

dantimola
Communicator

I've already done this, but still no logs are being indexed.

0 Karma

dbcase
Motivator

According to the most recent docs, Splunk does index compressed files:

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Monitorfilesanddirectories

How Splunk Enterprise monitors archive files
Archive files (such as a .tar or .zip file) are decompressed before being indexed. The following types of archive files are supported:

 .tar
 .gz
 .bz2
 .tar.gz and .tgz
 .tbz and .tbz2
 .zip
 .z
If you add new data to an existing archive file, the entire file is reindexed, not just the new data. This can result in event duplication.

dbcase
Motivator

I use the universal forwarder to monitor compressed files. I haven't tried it with the GUI, though.

0 Karma

SierraX
Communicator

Normally it should show an error message in $SPLUNK_HOME/var/log/splunk/splunkd.log when it's not reading the file. You can force it with ./splunk restart, or try ./splunk add oneshot and check splunkd.log to see what happens.

Also, what kind of files are in the .tar.gz? Maybe there is something inside that Splunk can't read.
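
To make those checks concrete, here is a small sketch of how one might look inside the archive and then search splunkd.log for mentions of it. The helper names (list_archive_members, show_splunkd_mentions) and the example path are made up for illustration. It assumes SPLUNK_HOME is set and that tar and grep are available. The commented-out oneshot line is the CLI command mentioned above.

```shell
#!/usr/bin/env bash

# List the members of a .tar.gz so you can see what Splunk would find inside.
list_archive_members() {
    tar -tzf "$1"
}

# Show the most recent splunkd.log lines that mention the given path.
# Assumes SPLUNK_HOME points at the Splunk installation.
show_splunkd_mentions() {
    grep -F -- "$1" "${SPLUNK_HOME:?set SPLUNK_HOME first}/var/log/splunk/splunkd.log" | tail -n 20
}

# Example with a hypothetical path:
#   list_archive_members /data/logs/app.tar.gz
#   "$SPLUNK_HOME/bin/splunk" add oneshot /data/logs/app.tar.gz
#   show_splunkd_mentions /data/logs/app.tar.gz
```

If the listing shows only plain-text log files and splunkd.log still reports nothing for the path, the problem is more likely in the input definition than in the archive itself.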

0 Karma

dantimola
Communicator

Inside the .tar.gz is a log file.

0 Karma

dbcase
Motivator

Check your sourcetype as well, does it match the data format?

0 Karma

dantimola
Communicator

How do you monitor a compressed file without using the GUI? Is it done in inputs.conf?

0 Karma

dbcase
Motivator

Yes, you would use inputs.conf

Here is what I do.

In $SPLUNK_HOME/etc/system/local/inputs.conf:

[batch:///var/nfs/SAT_SplunkLogs/weblogic/twc_media4/*.zip]
move_policy = sinkhole
host_segment=5
sourcetype=wls_managedserver
index=twc

0 Karma

dbcase
Motivator

Keep in mind the batch option on the first line. This will ERASE the zip file when Splunk finishes indexing it. If you don't want that, change batch to monitor and delete the move_policy line.

Also, you must restart Splunk for any changes in inputs.conf to take effect.
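
To spell out the batch-to-monitor change, here is a rough sketch that applies those two edits to a scratch copy of the stanza and prints the result. The scratch file and the variable names are only for illustration. In practice you would simply edit inputs.conf by hand and restart.

```shell
#!/usr/bin/env bash

# Scratch copy of the batch stanza from the post above (not the live inputs.conf).
stanza_file="$(mktemp)"
cat > "$stanza_file" <<'EOF'
[batch:///var/nfs/SAT_SplunkLogs/weblogic/twc_media4/*.zip]
move_policy = sinkhole
host_segment=5
sourcetype=wls_managedserver
index=twc
EOF

# Rename the stanza type to monitor and drop move_policy, which belongs to batch inputs.
monitor_stanza="$(sed -e 's|^\[batch://|[monitor://|' -e '/^move_policy/d' "$stanza_file")"
printf '%s\n' "$monitor_stanza"
rm -f "$stanza_file"
```

The printed stanza keeps the same path, host_segment, sourcetype, and index, so the only behavioral difference is that the zip files are left in place after indexing.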

0 Karma

dantimola
Communicator

Yes it is.

0 Karma

richgalloway
SplunkTrust

Splunk won't index compressed files because they look like binaries. A script is one idea. Or you could have Splunk monitor the files before they are tarred.

---
If this reply helps you, an upvote would be appreciated.


brianrowe
Engager

I downvoted this post because the answer provided is incorrect.

0 Karma

ekintulga
Engager

I downvoted this post because Splunk does index compressed files; it just doesn't monitor them in parallel, only sequentially. CPU is the key if you are going to decompress a lot of files, say more than 50k.

0 Karma

dantimola
Communicator

In my case, Splunk Enterprise did not index the compressed file, so we created a bash script to uncompress the data and then proceeded with the indexing.
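
For anyone landing here later, a script along these lines is one way to do the decompress-then-index approach. This is only a rough sketch, not the actual script from this case: the function name extract_new_archives, the three directory arguments, and the marker-file scheme for skipping archives that were already extracted are all assumptions made for illustration. The idea is that Splunk monitors the destination directory with a normal monitor input.

```shell
#!/usr/bin/env bash

# Extract every .tar.gz found in src_dir into its own subdirectory of dest_dir.
# A marker file in state_dir records which archives were already handled,
# so repeated runs (e.g. from cron) do not extract the same archive twice.
extract_new_archives() {
    local src_dir="$1" dest_dir="$2" state_dir="$3"
    local archive name

    mkdir -p "$dest_dir" "$state_dir"

    for archive in "$src_dir"/*.tar.gz; do
        # If the glob matched nothing, the pattern itself is returned; skip it.
        if [ ! -e "$archive" ]; then
            continue
        fi

        name="$(basename "$archive" .tar.gz)"

        # Skip archives that were extracted on an earlier run.
        if [ -e "$state_dir/$name" ]; then
            continue
        fi

        mkdir -p "$dest_dir/$name"
        # Only write the marker when extraction succeeded.
        if tar -xzf "$archive" -C "$dest_dir/$name"; then
            touch "$state_dir/$name"
        fi
    done
}

# Example with hypothetical paths:
#   extract_new_archives /data/incoming /data/extracted /data/extract_state
```

Keeping the marker files outside the monitored directory avoids handing Splunk extra files to look at. Note that an archive whose contents change after the first run is not picked up again by this sketch.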

0 Karma

jcspigler2010
Path Finder

I downvoted this post because Splunk can index compressed files.

0 Karma

nnmiller
Contributor

I downvoted this post because this answer is incorrect. Splunk is capable of monitoring compressed files. There must be some other issue here.

http://docs.splunk.com/Documentation/Splunk/latest/Data/Monitorfilesanddirectories

0 Karma