How can I monitor a directory for existence of a file without uploading the file into Splunk?

New Member

I have Splunk Enterprise running on a Windows server. All clients are running Windows (a mix of XP and Windows 7) with universal forwarders.

I need to monitor a directory:

c:\pos\TKAgent\Work\Download

for any files with the extension .processing, and be notified when such a file has existed for over 3 hours (without uploading the file into Splunk).
Would this be possible with, say, a monitor input?

Champion

I am not aware of any solutions for this. I have been thinking of making an app to do this though. I'll start working something up; I don't think it will be hard.

Update:

I wrote an app for this. You can download it here: https://splunkbase.splunk.com/app/2776/

This app exposes a new input type that can be configured in the Splunk Manager. To configure it, create a new input in the Manager under Data inputs » File Meta-data. Then enter your path and how often you want it to poll the file system.

The source code is available here: https://github.com/LukeMurphey/splunk-file-info

Engager

This is just what I was looking for. Thanks so much, Luke! Any idea why, after adding the app and pointing it at a particular file, I would not be getting any indexed events? I am assuming that it does not need a Universal Forwarder installed on the server where the file is monitored, and that it accepts UNC paths to the file.

Champion

Currently, this does require Python. Thus, it won't work on a Universal Forwarder (UF). I have a task to make the modular input work with system Python so that you could use a UF if Python was installed on the platform (see http://lukemurphey.net/issues/1068).

Engager

Gotcha. In my case, there is no universal forwarder; however, I don't appear to be getting data/events from the app. Here is what I did:
1. Installed app in splunk, checked to make sure it is enabled
2. Went to data inputs / File Meta-Data, created a new entry, enabled
3. In the entry, specified a UNC path to a file on a server where my splunk service account has access
4. Specified a polling interval of 3m, and a max file size of 5MB (my file is much smaller)
5. Specified a host name it should use and an index

After that, I search for host=XXX (name I gave it earlier) and get nothing back. Is there something I am missing? Using 6.3.0.

Champion

I opened a ticket on this and am looking into it. I believe the problem is that UNC paths are problematic. My early investigation makes it appear that you have to escape the paths because backslashes are a special character in Python. Thus, this path won't work:

\\Server\file.txt

But this will:

\\\\Server\\file.txt

Can you let me know if doubling the backslashes works? My plan is to change the code so that you won't have to double the backslashes but it would be nice to hear that is indeed the root cause.
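To make the escaping hypothesis concrete, here is a small Python sketch of how string literals treat backslashes. It uses only the placeholder path from above and is an illustration, not the app's code. Note that a path typed into a form field or read from a configuration file at runtime is not parsed as a literal, so this behavior may or may not be what affects the input.

```python
# Illustration only: how Python string literals treat backslashes.
# The server and file names are the placeholders from the post above.

unescaped = "\\Server\file.txt"    # "\\" collapses to one backslash, "\f" becomes a form feed
doubled = "\\\\Server\\file.txt"   # every backslash written twice
raw = r"\\Server\file.txt"         # raw literal: backslashes are kept as typed

print(repr(unescaped))   # shows the form feed that replaced "\f"
print(doubled)           # the intended UNC path
print(doubled == raw)    # True
```

The doubled and raw forms produce the same string, while the unescaped literal silently loses a backslash and gains a control character.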

Engager

Hey Luke. I doubled the backslashes, and now it looks something like this:

\\\\SERVERNAME\\D$\\FOLDERNAME\\TEXTFILE.txt

Still no dice on a query for data from the host defined in the app. Is it possibly related to accessing it via an administrative share?

Champion

@JoelCBennett: could you be so kind as to make a new answers post on this (that UNC paths are not working)? I'm making some progress on this issue and I would prefer to chime in on a different post, since this issue is a bit more specific than what is in this post.

Champion

Actually, I'll go ahead and make the new question.

New Member

Thank you guys so much... I'm trying out LukeMurphey's app. I'll let you know how it works out.

Champion

The app is now approved on Splunkbase: https://splunkbase.splunk.com/app/2776/

That one is slightly improved (has a better UI for making/editing the input).

Esteemed Legend

I don't see how this can be done with a monitor input, but you can do it with a scripted input run every hour where you call the following command (or similar; check out tmpwatch):

find . -maxdepth 1 -type f -name "*.processing" -cmin -240 -a -cmin +180

This design will cause each file to generate only 1 event ever, which is what I think you desire.

Explorer

Hi woodcock,
I have a question about the command that you suggested here.
I'm trying to monitor a directory (on Windows) for files with a certain extension, such as .processing, and wanted to know how I can use the above command.

Esteemed Legend

Ah, this is a command for *nix, not Windows. You will have to convert it to PowerShell or some other Windows-based command.
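As one possible conversion, here is a rough Python sketch of the same age-window idea (rather than PowerShell). It assumes a Python interpreter is available on the host, which a Universal Forwarder does not provide by itself. The function name files_in_age_window and its parameters are made up for illustration, and it checks the file modification time, which is a substitution for the status-change time that -cmin uses.

```python
import fnmatch
import os
import time


def files_in_age_window(directory, pattern="*.processing",
                        min_minutes=180, max_minutes=240, now=None):
    """Return names of regular files directly inside `directory` that match
    `pattern` and were last modified more than `min_minutes` but less than
    `max_minutes` ago. Subdirectories are not searched."""
    now = time.time() if now is None else now
    matches = []
    for entry in os.scandir(directory):
        if not entry.is_file() or not fnmatch.fnmatch(entry.name, pattern):
            continue
        # Assumption: modification time stands in for find's -cmin here.
        age_minutes = (now - entry.stat().st_mtime) / 60.0
        if min_minutes < age_minutes < max_minutes:
            matches.append(entry.name)
    return sorted(matches)


if __name__ == "__main__":
    download_dir = r"c:\pos\TKAgent\Work\Download"  # path from the question
    if os.path.isdir(download_dir):
        for name in files_in_age_window(download_dir):
            print(name)
```

Run once an hour, the one-hour window means a file that stays in place should be reported on only one run, matching the intent of the find command.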

Explorer

OK, I wanted to confirm. I'll check that. Thanks for the reply!

Splunk Employee

The simplest solution is to create an input script to run with an interval.
Then the script checks the presence of the file, and outputs any notifications you want to index.

Example pseudocode:

date=current_date_and_timezone
for each $file matching "c:\pos\TKAgent\Work\Download\*.processing"
do
echo "$date file $file found"
done
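The pseudocode above could be fleshed out roughly as follows in Python. This is a sketch under assumptions, not a tested Splunk input: the function name build_events and the file= and age_seconds= fields are invented for illustration, and the age is measured from the last modification time, which may differ from how long the file has actually existed.

```python
import glob
import os
import time


def build_events(directory, pattern="*.processing", now=None):
    """Return one event line per matching file, including its age in seconds."""
    now = time.time() if now is None else now
    stamp = time.strftime("%Y-%m-%d %H:%M:%S %z", time.localtime(now))
    events = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        if os.path.isfile(path):
            # Assumption: age is taken from the modification time.
            age = int(now - os.path.getmtime(path))
            events.append('%s file="%s" age_seconds=%d found' % (stamp, path, age))
    return events


if __name__ == "__main__":
    # A scripted input's standard output is what gets indexed.
    for line in build_events(r"c:\pos\TKAgent\Work\Download"):
        print(line)
```

Only the file name and age are written out, never the file contents. With events like these indexed, a search could filter on age_seconds greater than 10800 to find files older than 3 hours.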

See the docs on scripted inputs:
http://docs.splunk.com/Documentation/Splunk/6.2.3/AdvancedDev/ScriptSetup
http://docs.splunk.com/Documentation/Splunk/6.2.3/AdvancedDev/ScriptWriting

Another option is to use the fschange monitor, but I do not recommend it, as this feature has been deprecated since Splunk 5.*