Getting Data In

Monitor directory containing zip files

gelica
Communicator

Hi,

I'm trying to monitor a directory which contains zip files. The zip files contain different file types, and I'm only interested in indexing the txt files.
My path would be something like: dir\something.zip\file.txt

I have tried some different monitor approaches, but either nothing gets indexed or all of the files in my zip file are indexed. Here are a few examples of what I have tried in inputs.conf:

[monitor://C:\Users\angeliga\Filer\...]
disabled = false
followTail = 0
sourcetype = my_type
whitelist=*.txt

[monitor://C:\Users\angeliga\Filer\...\*.txt]
disabled = false
followTail = 0
sourcetype = my_type

Does anybody have any idea of what I'm doing wrong?
Thanks!

dantimola
Communicator

splunkd.log

my inputs.conf
[monitor:///home/administrator/Pictures]
disabled = false
host = pgwlogs
index = pgw_logsource
sourcetype = pgw

./splunk list monitor
$SPLUNK_HOME/var/spool/splunk/...stash_new
/home/administrator/Pictures/
/home/administrator/Pictures/OK_USCDB_1_20161108050001.tar.gz
Monitored Files:
$SPLUNK_HOME/etc/splunk.version

0 Karma

grijhwani
Motivator

First let me stress I have not done this, and I am not even completely confident of the file syntax, but I suspect the path you want is something along the lines of:

$SPLUNK_HOME/etc/system/local/inputs.conf

[monitor://C:\Users\angeliga\Filer\...]
disabled = false
whitelist=*.txt
followTail = 0
sourcetype = my_type

$SPLUNK_HOME/etc/system/local/props.conf

[source::C:\Users\angeliga\Filer\...]
TRANSFORMS-set=droprecord,userecord

$SPLUNK_HOME/etc/system/local/transforms.conf

[droprecord]
REGEX=.
DEST_KEY=queue
FORMAT=nullQueue

[userecord]
REGEX={targetmatch}
DEST_KEY=queue
FORMAT=indexQueue

This assumes that, rather than targeting the .txt files within the .zip file, you have a record structure you can target with the "userecord" regex. Certainly, if I were to investigate, this is where I would begin, but I could be entirely and utterly wrong. It is at best an educated guess.

I will be watching with interest to see if there is, in fact, a direct solution to what you want to do.

0 Karma

grijhwani
Motivator

I think you're missing the point. The regex is a pattern match to target the format of the records within the text files, not the text file names. I am assuming that the text files follow some regular format.

The regex matching means that yes the files get processed, but only the matching records will actually be indexed.
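To make the routing idea concrete, here is a rough Python simulation. It is not Splunk code, only a sketch of the drop-everything-then-keep-matches pattern described above, on the assumption that transforms are applied in the order listed and that a later match overwrites the queue set by an earlier one. The `route` function and the date-prefixed record pattern are hypothetical stand-ins for {targetmatch}.

```python
import re

# Hypothetical stand-ins for the two transforms.conf stanzas, in the order
# they are listed in TRANSFORMS-set=droprecord,userecord.
TRANSFORMS = [
    ("droprecord", re.compile(r"."), "nullQueue"),
    # Assumed record format: lines starting with a date such as 2016-11-08.
    ("userecord", re.compile(r"^\d{4}-\d{2}-\d{2} "), "indexQueue"),
]


def route(event, default_queue="indexQueue"):
    """Return the queue an event ends up in after all transforms have run."""
    queue = default_queue
    for _name, pattern, dest in TRANSFORMS:
        if pattern.search(event):
            queue = dest  # a later match overwrites an earlier one
    return queue
```

In this sketch every non-empty event is first sent to nullQueue, and only events that also match the second pattern are sent back to indexQueue. The files are still read and parsed in full; the filtering happens per event, not per file.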

0 Karma

arunsundarm
Engager

Yes, that's right. In some cases we write a regex for host names that likewise scans the records and assigns the host name to the events.

0 Karma

gelica
Communicator

I appreciate your help, but unfortunately I didn't get it to work.

I tried some different options, including setting my "keep-regex" to a specific file name that is in the compressed file. I also tried excluding the whitelist parameter, or sending both droprecord and userecord to nullQueue.

I still get non-txt files indexed. It seems like Splunk doesn't like this approach, and maybe I have to extract the zip files beforehand.
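If extracting beforehand turns out to be the way to go, a minimal sketch of that step might look like the following. It assumes a scheduled Python script is acceptable; the function name `extract_txt_members` and the staging directory layout are hypothetical, and the script only uses the standard library.

```python
import zipfile
from pathlib import Path


def extract_txt_members(zip_dir, staging_dir):
    """Copy only the .txt members of every .zip in zip_dir into staging_dir."""
    zip_dir, staging_dir = Path(zip_dir), Path(staging_dir)
    extracted = []
    for archive in sorted(zip_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.is_dir() or not info.filename.lower().endswith(".txt"):
                    continue
                # One subdirectory per archive keeps same-named members from
                # different archives apart. Only the base name is kept, so
                # members with the same name in different folders of one
                # archive would overwrite each other.
                target = staging_dir / archive.stem / Path(info.filename).name
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_bytes(zf.read(info))
                extracted.append(target)
    return extracted
```

The monitor stanza would then point at the staging directory instead of the directory holding the zip files, so Splunk only ever sees the extracted .txt files.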

0 Karma

gelica
Communicator

Thanks for your suggestion, I will try it and hope it works. 🙂

But I wonder whether this means that all of the files get indexed first and the unwanted files are then sorted out. Or will this in fact index only the files that I want?

0 Karma