I have a tar.gz file and I want to continuously monitor it. I tried to index it in Splunk Enterprise via Settings > Data Inputs > Files & Directories, but when I run a search, Splunk doesn't return any results.
What are the steps to continuously monitor tar.gz files so they get indexed in Splunk? Do I need to write a script that automatically decompresses the tar.gz file so Splunk can index it? Thanks.
Splunk won't index compressed files because they look like binaries. A script is one idea. Or you could have Splunk monitor the files before they are tarred.
I downvoted this post because this answer is incorrect. Splunk is capable of monitoring compressed files. There must be some other issue here.
In my case, Splunk Enterprise did not index the compressed file, so we created a bash script to uncompress the data and then proceeded with the indexing.
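A minimal sketch of such a decompression script, assuming archives arrive in one directory and Splunk monitors a second directory for the extracted plain-text files (both paths, and the function name, are illustrative):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: unpack every .tar.gz in $1 into $2, so Splunk can monitor
# the extracted plain-text files instead of the archives themselves.
extract_archives() {
    local archive_dir="$1" extract_dir="$2"
    mkdir -p "$extract_dir"
    local archive
    for archive in "$archive_dir"/*.tar.gz; do
        [ -e "$archive" ] || continue   # glob matched nothing; skip
        tar -xzf "$archive" -C "$extract_dir"
    done
}
```

Run it from cron (or a systemd timer) and point a Splunk monitor input at the extraction directory. Note that re-extracting an unchanged archive rewrites the same files, which can cause Splunk to reindex them.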
I downvoted this post because Splunk does index compressed files; it just processes them sequentially rather than monitoring them in parallel. CPU is the key constraint if you are going to decompress a lot of files, e.g. > 50k.
According to the most recent docs, Splunk does index compressed files:
How Splunk Enterprise monitors archive files

Archive files (such as a .tar or .zip file) are decompressed before being indexed. The following types of archive files are supported:

- .tar
- .gz
- .bz2
- .tar.gz and .tgz
- .tbz and .tbz2
- .zip
- .z

If you add new data to an existing archive file, the entire file is reindexed, not just the new data. This can result in event duplication.
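So a plain monitor input should work on the archives directly. A sketch of the corresponding `inputs.conf` stanza (the path, index name, and sourcetype below are assumptions, not values from the thread):

```ini
# Monitor a directory of archives; Splunk decompresses supported
# archive types before indexing them.
[monitor:///data/archives/*.tar.gz]
index = main
sourcetype = my_archived_logs
disabled = false
```

If searches still return nothing, check the internal logs (e.g. search `index=_internal source=*splunkd.log*` for the monitored path) and verify the Splunk user has read permission on the files.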