Splunk Search

Monitor File Count and Age within a folder without indexing

jreagan
New Member

I'm a Splunk newbie and I'm trying to find the best way to use Splunk to monitor an FTP home folder. I don't care about the contents of the files and would prefer not to have their contents in Splunk, since they contain HIPAA data.

I need to monitor how many files are in the folder and how old they are, so I can be alerted if a file is left there for a long time or if the number of files passes a threshold.

1 Solution

dflodstrom
Builder

I think a scripted input would be useful for this task. Take the output of a command like `ls -l`, index it, extract the output into appropriate fields, and Splunk it.
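To make that concrete, here is a minimal sketch of such a script, assuming a Linux host and a hypothetical FTP home folder (`/home/ftp/incoming` is a placeholder; adjust for your environment). It emits one key=value event per run with the file count and the age of the oldest file, which Splunk can extract automatically:

```shell
#!/bin/sh
# Sketch of a scripted input: emit the file count and the age in seconds
# of the oldest file in a directory as key=value pairs for Splunk.
# The path passed at the bottom is an assumption -- point it at your folder.

count_files() {
    dir=$1
    now=$(date +%s)
    count=0
    oldest_age=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        count=$((count + 1))
        # GNU stat first, BSD/macOS stat as fallback
        mtime=$(stat -c %Y "$f" 2>/dev/null || stat -f %m "$f")
        age=$((now - mtime))
        [ "$age" -gt "$oldest_age" ] && oldest_age=$age
    done
    # key=value output lets Splunk extract the fields without extra config
    echo "$(date '+%Y-%m-%d %H:%M:%S') filecount=$count oldest_age_sec=$oldest_age"
}

count_files /home/ftp/incoming   # hypothetical FTP home folder
```

Registered as a scripted input (or run from cron with its output appended to a monitored file), this never reads file contents, so no HIPAA data reaches Splunk.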


daverodgers
Explorer

Just to finish off what I ended up doing: our Splunk environment is Windows-based, not Linux.

So I created this script, a batch file that runs every five minutes as a Windows scheduled task. It runs on the Splunk server, but I use a network path in the batch file so I can count remote directories.

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET count=0
REM Count every file in the (remote) directory
for %%o IN ("\\network\folder\location\*.*") DO (
      echo %%o
      SET /A count=count + 1
)
REM Append a timestamp and the results as key=value pairs
net time \\%computername% |find "Current time" >> c:\count\countfiles.txt
echo dataareaid=UK >> c:\count\countfiles.txt
echo currentcount=%count% >> c:\count\countfiles.txt
ENDLOCAL

This outputs the timestamp (from net time) and the count result to the text file (countfiles.txt). I use the double arrow >> to append the results each time it runs.

I then created a new data input > file input in Splunk that indexes this countfiles.txt file.

Splunk automatically picked up the timestamp and created the correct events for me.

Because I put "currentcount=" in the batch file, Splunk identifies it as a custom field, so I can search on it.

When creating the input, I created a new sourcetype called "filecount".

I monitor several directories, each with its own batch file and resulting count file. I have set up a file input for each of these in Splunk and assigned them all this new sourcetype. That way I can search with sourcetype="filecount" and get back all my file-count results, which I plot on a single chart. In our case each directory relates to a different country.

The full search I use is:

sourcetype="filecount" | timechart max(currentcount) by dataareaid span=5m

This gives me exactly what I needed: a running count over time showing the maximum file count.

We have functions that process these files and move them on. If a function fails, the files aren't moved and the count rises as they stay in the folders. This shows up perfectly on our Splunk chart as a rising count line and alerts us to any issues with the process.
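Building on the timechart above, a saved search along these lines could drive that alert; the threshold of 100 is just a placeholder, so substitute whatever backlog level signals trouble for you:

```
sourcetype="filecount"
| stats latest(currentcount) AS current BY dataareaid
| where current > 100
```

Scheduled every few minutes with an alert condition of "number of results > 0", this fires per country (dataareaid) as soon as a folder backs up.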

The same approach could be used by anyone with an FTP server processing incoming files.

Hope that helps anyone wanting to do a similar thing.


daverodgers
Explorer

Hi guys,

This is exactly what I need to do.

Did this suggested answer work? If so, can someone be more specific about how to go about doing it?

Thanks


dflodstrom
Builder

Hopefully this will help. Create a script that does two things:

  1. Count the number of files in a folder (this is a question for Google/Stack Exchange). Put the command's output into a text file (make sure you're appending to the file).
  2. Determine the age of the files in the folder (another question for Google/Stack Exchange). Put that output into a text file; append it to a new file or to the same file as before.

Create a cron job that executes the script at a given interval.

Your requirements might differ a little; maybe you want to examine several directories, or all of the child directories within a given directory. You can use your scripting skills to format the output so field extraction happens automatically, or just use your Splunk field-extraction skills on the default output.
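For the several-directories case, a sketch like this (the `/srv/ftp/...` paths and log location are assumptions, not part of the original answer) emits one line per directory, tagged so the directory becomes a searchable field:

```shell
#!/bin/sh
# Sketch: count files in several directories and emit one key=value line
# per directory, ready for a Splunk monitor input. Paths are placeholders.

scan_dirs() {
    for dir in "$@"; do
        # tr strips the padding some wc implementations add to the count
        n=$(find "$dir" -maxdepth 1 -type f 2>/dev/null | wc -l | tr -d ' ')
        echo "$(date '+%Y-%m-%d %H:%M:%S') dir=$dir filecount=$n"
    done
}

# From cron you would append the output to the monitored file, e.g.:
#   scan_dirs /srv/ftp/uk /srv/ftp/us >> /var/log/filecounts.log
scan_dirs /srv/ftp/uk /srv/ftp/us
```

A crontab entry such as `*/5 * * * * /usr/local/bin/countfiles.sh` (a hypothetical install path) would mirror the five-minute scheduled task described earlier in the thread.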

Create a monitor input to read the file your script writes, make sure Splunk is extracting your fields, create an alert in Splunk, and you're set! Bonus: syslog your output to another server if you don't want to install a universal forwarder on the server in question.
