Reporting on zero results?

Brian_Osburn
Builder

To identify web content that hasn't been requested in a while, I thought I would use Splunk, since a) my Apache logs are already in Splunk, and b) I can easily create a scripted input to get a list of files under the various directories. Initially, I'm going to do this for our .cgi and .pl files.

So, I have one index for the standard Apache access logs. I do have a field extraction for this called file. More on that later.

I then created a scripted input that runs once per day to pull a list of files under our content sub-directory (we're talking 13,000+ files). An example of the input looks like this:

09/29/10 15:42:46 -0400,file=actDefaultAccSet.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=liferayLogin.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.ico,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=favicon.gif,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=Cps_Doc_Upload_Rules.doc,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ordocs-index.jsp,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=contact_me2.cfm,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=orprefs-index.html,app_root=public,dir=/cfmx_files/cfmx61/public
09/29/10 15:42:46 -0400,file=ppsathanks.html,app_root=public,dir=/cfmx_files/cfmx61/public

I can do a query that looks like this:

index="prod_ohs_logs" [search index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl" | fields file ] | table file | dedup file

This returns only 36 of the 125 .pl / .cgi files out there, which is not what I'm looking for.

Basically, I'm looking to take a list of files from a specific query, check to see how many of those files are found in the Apache logs, including ones with zero results.
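A likely reason the query above cannot show zero-hit files: a subsearch that ends in `| fields file` is normally expanded into an OR list of `file=...` terms, which then only filters the Apache events. The sketch below shows roughly what the outer search becomes, assuming the default subsearch formatting; the two file names are hypothetical placeholders, not taken from the real listing.

```
index="prod_ohs_logs" ( ( file="hypothetical_a.cgi" ) OR ( file="hypothetical_b.pl" ) )
| table file
| dedup file
```

Every row in the result comes from an Apache event, so a file that was never requested has no event to produce a row. That would explain seeing 36 files rather than 125.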

I've spent a couple of days trying to get this working, and I haven't been able to. Any ideas on how to do this? Is it even possible?

1 Solution

Stephen_Sorkin
Splunk Employee

Your best strategy here is to use an OR search, to load data from both prod_ohs_logs and prod_coldfusion_files at the same time and see, for each file, whether it is in one, the other or both of the indexes. For example:

index="prod_ohs_logs" OR (index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl") | chart count by file index

Brian_Osburn
Builder

Pure awesomeness Stephen. Thank you!

Stephen_Sorkin
Splunk Employee

Just add "... | search prod_coldfusion_files=0" to your search.
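Put together, a filter for the original goal (files that appear in the listing but never in the Apache logs) might look like the sketch below. It assumes that `chart count by file index` names its count columns after the index values and fills missing combinations with 0, and that the `file` field extracted from the Apache logs holds the same bare file name as the listing; neither assumption is confirmed in this thread.

```
index="prod_ohs_logs" OR (index="prod_coldfusion_files" file="*\.cgi" OR file="*\.pl")
| chart count by file index
| search prod_coldfusion_files>0 prod_ohs_logs=0
```

The search time range would need to include at least one run of the daily scripted input, otherwise no file has a nonzero prod_coldfusion_files count. Swapping the last line for `prod_coldfusion_files=0` would instead list files that were requested but are not in the listing.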

Brian_Osburn
Builder

Great, it's a starting point. I need to figure out how to list only the files that have 1 as the result under prod_coldfusion_files.
