Splunk Search

Why am I getting zero results with my stats distinct count search?

Motivator

Hi

I wonder whether someone can help me please.

I'm using the code below to run a search which works fine.

index=main  detail.statusCode=200  | dedup detail.ipAddress | stats  count(eventId) | rename  count(eventId) AS "Registrations"

But I'm told, and I'm not sure whether this is correct, that dedup is quite labour intensive for Splunk to remove duplicate entries, so I changed the code to:

 index=main  detail.statusCode=200  | stats dc(detail.ipAddress) | stats  count(eventId) | rename  count(eventId) AS "Registrations"

The problem I have is that although the parameters for the two are exactly the same, I retrieve more events for the second search, but I have no idea why. The figure shown for the 1st search also returns a stats count of 74, whereas the second returns 0

Could someone tell me please? By default ,does 'stats dc' return another set of records?

Many thanks and kind regards

Chris

0 Karma
1 Solution

Motivator

Hi All, I've gone through the query @pwmcity provided, but unfortunately I couldn't get the figures to match those coming out of the dedup query.

So I continued to work on this, and managed to get this to work by using:

| stats distinct_count(detail.ipAddress) AS "Registrations"

I hope this helps.

Kind Regards
Chris

View solution in original post

0 Karma

Motivator

Hi All, I've gone through the query @pwmcity provided, but unfortunately I couldn't get the figures to match those coming out of the dedup query.

So I continued to work on this, and managed to get this to work by using:

| stats distinct_count(detail.ipAddress) AS "Registrations"

I hope this helps.

Kind Regards
Chris

View solution in original post

0 Karma

Path Finder

this is because you're removing the eventId in the first stats, try :

| stats dc(detail.ipAddress) by eventId | stats count

Also, know that that instead of this:

| stats  count(eventId) | rename  count(eventId) AS "Registrations"

you can do this:

| stats  count(eventId) AS "Registrations"

And as far as performance, just try both and inspect the time taken

Motivator

Hi @pwmcity, many thanks for taking the time to reply to my post and for the solution. Kind regards. Chris

Hi,

you are right, dedup is slow and resource heavy, espacially with a huge amount of events.
The problem with your second search is, if you run the search up to this point: index=main detail.statusCode=200 | stats dc(detail.ipAddress) you do not have the eventId field anymore. Therefore, your following| stats count(eventId) returns 0. So you have to refactor your search, but some infos could be useful. Is there only one eventId per ipAdress? What is the intention of your search? Do you have some sample data?

If i interpret your search correctly, i would say you get the same result (74) with just the following search:
index=main detail.statusCode=200 | stats dc(detail.ipAddress)
But however, this is the distinct count of ipadresses, which might not equal the number of registrations, but this depends on your data.

Greetings

Tom

Motivator

Hi @tom_frotscher, thank you for taking the time to reply to my post and for pointing out the error over the eventId.

Kind regards

Chris

0 Karma

Path Finder

Till the following line your query is good. But you have lost "eventId" in your data due to stats

index=main detail.statusCode=200 | stats dc(detail.ipAddress)

Hence

| stats count(eventId) is returning 0.

So instead of | stats count(eventId) use | stats count

So your final query will look like this

index=main detail.statusCode=200 | stats dc(detail.ipAddress) | stats count | rename count AS "Registrations"

0 Karma