Splunk Search

Why is stats count showing a higher count for date_* fields than for other fields?

Gawker
Path Finder

I have a report that provides a summary of key activity by IP.

I wanted to cross check that information against the results using a simple search that would return all the rows for a particular IP on a particular day.

I ran this search:

host="xxxxxx.xxx.xxxxx.local" "xxx.xxx.xxx.xxx"

Instead of getting 804,471 results (as in my report), the query yielded:

819,542 events (3/14/18 12:00:00.000 AM to 3/15/18 12:00:00.000 AM)

So I added stats to the query, as the report does, and ran:

host="xxxxxx.xxx.xxxxx.local" "xxx.xxx.xxx.xxx" | stats count(*)

This search gave me something interesting:

819,542 events (3/14/18 12:00:00.000 AM to 3/15/18 12:00:00.000 AM)

count(apiKey)..........804471 << Correct and matches my report
count(date_hour).....819542 << Matches the total from the non-stats search
count(linecount)......819542 << Matches the total from the non-stats search

As a Splunk neophyte, it looks to me like I am getting phantom rows.

Why do I see different counts?

Thank you.

1 Solution

somesoni2
Revered Legend

The | stats count(<<fieldNameHere>>) function gives you the count of events in which that field has a non-null value. Your report must be using the apiKey field in a way that restricts its results to events with apiKey=some_non_null_value, hence the count of 804471. Your other query counts events regardless of whether the apiKey field is present. That applies both to the event count shown in the Events tab and to | stats count(date_hour), because date_hour is available in each event, hence the higher count.

If you run something like this, you can see all the (extra) events which are missing field apiKey.

host="xxxxxx.xxx.xxxxx.local" "xxx.xxx.xxx.xxx" NOT apiKey=*
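
The counting behavior described above can be reproduced on synthetic data, without touching the real index. The following is only a minimal sketch: it generates five events with makeresults and sets apiKey on the first three only. The field names n, total_events, events_with_apiKey and events_with_n are illustrative and do not come from the original report.

```spl
| makeresults count=5
| streamstats count AS n
| eval apiKey=if(n<=3, "abc", null())
| stats count AS total_events, count(apiKey) AS events_with_apiKey, count(n) AS events_with_n
```

This should return total_events=5, events_with_apiKey=3 and events_with_n=5. The two events where apiKey is null are skipped by count(apiKey) but still counted by a plain count and by count(n), which is the same pattern as apiKey versus date_hour in the question.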


Gawker
Path Finder

Thank you for the excellent insight!

I tried the NOT search as you suggested and the count was the exact delta of the linecount and apiKey numbers.
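
For anyone repeating this check, the three numbers can probably be obtained in a single search instead of two. This is a sketch under the assumption that the same base search and time range are used; the output names total_events, with_apiKey and without_apiKey are made up for illustration.

```spl
host="xxxxxx.xxx.xxxxx.local" "xxx.xxx.xxx.xxx"
| stats count AS total_events, count(apiKey) AS with_apiKey, count(eval(isnull(apiKey))) AS without_apiKey
```

Here without_apiKey should equal total_events minus with_apiKey. With the counts quoted earlier in this thread, that difference is 819,542 - 804,471 = 15,071.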
