Reporting on zero results?

Builder

To identify web content that hasn't been pulled in a while, I thought I would use Splunk, since a) my Apache logs are already in Splunk, and b) I can easily create a scripted input to get a list of files under the various directories. Initially, I'm going to do this for our .cgi and .pl files.

So, I have one index for the standard Apache access logs. I do have a field extraction for this called file. More on that later.

I then created a scripted input that runs once per day to pull a list of files under our content sub-directory (we're talking 13,000+ files). An example of the input looks like this:

09/29/10 15:42:46 -0400,file=actDefaultAccSet.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=liferayLogin.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.ico,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.gif,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=Cps_Doc_Upload_Rules.doc,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ordocs-index.jsp,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=contact_me2.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=orprefs-index.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ppsathanks.html,app_root=public,dir=/cfmx_files/cfmx61/public
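For anyone wanting to reproduce this kind of input, here is a minimal sketch of a script that prints events in the same layout as the sample above. It is not the original script: the function names, the root path in the example call, and the way app_root is passed in are all assumptions made for illustration.

```python
import os
import time

def format_line(timestamp, filename, app_root, directory):
    """Build one event in the same key=value layout as the sample above."""
    return "%s,file=%s,app_root=%s,dir=%s" % (timestamp, filename, app_root, directory)

def list_files(root, app_root, now=None):
    """Walk `root` and return one event line per file found.

    `app_root` is passed in by the caller (an assumption; the original
    script may derive it differently). `now` is an optional epoch time.
    """
    stamp = time.strftime("%m/%d/%y %H:%M:%S %z", time.localtime(now))
    lines = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in sorted(filenames):
            lines.append(format_line(stamp, name, app_root, directory))
    return lines

if __name__ == "__main__":
    # Hypothetical root directory; a scripted input's stdout is what gets indexed.
    for line in list_files("/cfmx_files/cfmx61/public", "public"):
        print(line)
```

Because every line carries file=, app_root= and dir= pairs, the fields should be picked up by automatic key=value extraction without extra configuration.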

I can do a query that looks like this:

index="prod_ohs_logs" [search index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl" | fields file ] | table file | dedup file

This returns only 36 of the 125 .pl / .cgi files out there, which is not what I'm looking for.

Basically, I'm looking to take a list of files from a specific query and check how many of those files are found in the Apache logs, including the ones with zero results.

I've spent a couple of days trying to get this working, and I haven't been able to. Any ideas on how to do this? Is it even possible?

Re: Reporting on zero results?

Splunk Employee

Your best strategy here is to use an OR search, to load data from both prod_ohs_logs and prod_coldfusion_files at the same time and see, for each file, whether it is in one, the other or both of the indexes. For example:

index="prod_ohs_logs" OR (index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl") | chart count by file index
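To see why this surfaces files with no hits, here is a rough Python model of what the chart step does. It is only an illustration, not how Splunk is implemented, and the events and file names are made up. The point is that every file gets a cell for every index, and a missing combination shows up as 0 instead of being dropped.

```python
from collections import Counter

def chart_count_by_file_index(events):
    """Count events per (file, index) pair; absent pairs appear as 0."""
    counts = Counter((e["file"], e["index"]) for e in events)
    files = sorted({e["file"] for e in events})
    indexes = sorted({e["index"] for e in events})
    # One row per file, one column per index, zero-filled where nothing matched.
    return {f: {i: counts[(f, i)] for i in indexes} for f in files}

# Made-up events for illustration only.
events = [
    {"index": "prod_coldfusion_files", "file": "report.pl"},
    {"index": "prod_coldfusion_files", "file": "unused.cgi"},
    {"index": "prod_ohs_logs", "file": "report.pl"},
    {"index": "prod_ohs_logs", "file": "report.pl"},
]

table = chart_count_by_file_index(events)
for name, row in table.items():
    print(name, row)
```

In this toy data, unused.cgi is listed on disk but never requested, so its prod_ohs_logs cell is 0. The subsearch version of the query can never show that row, because it only keeps events that already exist in the Apache index.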

Re: Reporting on zero results?

Builder

Great, it's a starting point. I need to figure out how to list only the files that have 1 as the result under prod_coldfusion_files.

Re: Reporting on zero results?

Splunk Employee

Just add "... | search prod_coldfusion_files=0" to your search.
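The same kind of filter can be pointed at either column of the chart. As a rough Python illustration (not SPL, and with a made-up table), filtering on the prod_ohs_logs column would keep the files that are listed on disk but have no hits in the Apache logs:

```python
def rows_where_zero(table, column):
    """Return the file names whose count in `column` is exactly 0."""
    return [name for name, row in table.items() if row.get(column) == 0]

# Made-up chart output: one row per file, one count per index.
table = {
    "report.pl": {"prod_coldfusion_files": 1, "prod_ohs_logs": 2},
    "unused.cgi": {"prod_coldfusion_files": 1, "prod_ohs_logs": 0},
}

never_requested = rows_where_zero(table, "prod_ohs_logs")
print(never_requested)
```

Which column to test depends on what you want to report: a zero under the Apache index means "on disk, never requested", while a zero under the file-listing index means "requested, but not in the listing".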

Re: Reporting on zero results?

Builder

Pure awesomeness, Stephen. Thank you!
