Splunk Search

Reporting on zero results?

Brian_Osburn
Builder

In order to identify web content that hasn't been pulled in a while, I thought I would use Splunk since a) my Apache logs are in Splunk already, and b) I can easily create a scripted input to get a list of files under the various directories. Initially, I'm going to do this for our .cgi's and .pl files

So, I have one index for the standard Apache access logs. I do have a field extraction for this called file. More on that later.

I then created a scripted input that runs once per day to pull a list of files under our content sub-directory (we're talking 13,000+ files). An example of the input looks like this:

09/29/10 15:42:46 -0400,file=actDefaultAccSet.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=liferayLogin.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.ico,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.gif,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=Cps_Doc_Upload_Rules.doc,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ordocs-index.jsp,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=contact_me2.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=orprefs-index.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ppsathanks.html,app_root=public,dir=/cfmx_files/cfmx61/public

I can do a query that looks like this:

index="prod_ohs_logs" [search index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl" | fields file ] | table file | dedup file

Which only returns 36 out of the 125 .pl / .cgi files out there, which is not exactly what I'm looking for.

Basically, I'm looking to take a list of files from a specific query, check to see how many of those files are found in the Apache logs, including ones with zero results.

I've spent a couple of days trying to get this working, and I haven't been able to. Any ideas on how to do this? Is it even possible?

1 Solution

Stephen_Sorkin
Splunk Employee
Splunk Employee

Your best strategy here is to use an OR search, to load data from both prod_ohs_logs and prod_coldfusion_files at the same time and see, for each file, whether it is in one, the other or both of the indexes. For example:

index="prod_ohs_logs" OR (index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl") | chart count by file index

View solution in original post

Stephen_Sorkin
Splunk Employee
Splunk Employee

Your best strategy here is to use an OR search, to load data from both prod_ohs_logs and prod_coldfusion_files at the same time and see, for each file, whether it is in one, the other or both of the indexes. For example:

index="prod_ohs_logs" OR (index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl") | chart count by file index

Brian_Osburn
Builder

Pure awesomeness Stephen. Thank you!

0 Karma

Stephen_Sorkin
Splunk Employee
Splunk Employee

Just add "... | search prod_condfusion_files=0" to your search.

Brian_Osburn
Builder

Great, it's a starting point. I need to figure out how to only list the files that have 1 as the results under prod_coldfusion_files..

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...