Splunk Search

Possibly losing field values when using "stats ... by field_name"

PeterZhang
New Member

I expected "...| dedup src_ip | table src_ip | sort str(src_ip)" to return the same values as "...| stats count by src_ip | fields - count".

In practice, "...| stats count by src_ip | fields - count" can drop some values of the "src_ip" field.

With "...| search src_ip=59.160.18.202", the related events are found.

With "...| dedup src_ip | table src_ip | sort str(src_ip)", the value "59.160.18.202" appears for the "src_ip" field on the "Statistics" tab.

But with "...| stats count by src_ip | fields - count", the value "59.160.18.202" is missing from the "src_ip" field on the "Statistics" tab.

I'm not sure what the reason is. Any advice would be appreciated, thanks.
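
To make the expectation concrete, here is a minimal Python sketch (not SPL) of what the two pipelines are expected to do to the set of src_ip values. The sample events and the helper names dedup_values and stats_count_by are made up for illustration; the sketch only models the documented behavior that both commands skip events where the field is missing.

```python
from collections import Counter

# Hypothetical sample events; only the src_ip field matters here.
events = [
    {"src_ip": "59.160.18.202"},
    {"src_ip": "8.8.8.8"},
    {"src_ip": "59.160.18.202"},
    {"src_ip": "1.2.3.4"},
    {"dest_ip": "10.0.0.1"},  # no src_ip: ignored by both models
]

def dedup_values(events, field):
    """Model of `dedup <field> | table <field>`: first occurrence of each value."""
    seen = []
    for event in events:
        value = event.get(field)
        if value is not None and value not in seen:
            seen.append(value)
    return seen

def stats_count_by(events, field):
    """Model of `stats count by <field>`: one row per distinct value, with its count."""
    return Counter(e[field] for e in events if e.get(field) is not None)

via_dedup = sorted(dedup_values(events, "src_ip"))
via_stats = sorted(stats_count_by(events, "src_ip"))

# Both approaches should list exactly the same distinct values.
assert via_dedup == via_stats
print(via_dedup)
```

If the real searches disagree on a value such as 59.160.18.202, something other than these basic semantics is at play.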

0 Karma
1 Solution

woodcock
Esteemed Legend

This is absolutely a bug. Open a support case.

0 Karma

PeterZhang
New Member

Thanks, that's what I think as well.
But I don't have a support account to report the bug.

0 Karma

woodcock
Esteemed Legend

If you bought a Splunk license, you are obligated to maintain a support contact as well.

0 Karma

PeterZhang
New Member

I don't have a Splunk license yet. It would be better if Splunk had some way to accept potential bug reports even from users without a license.

0 Karma

woodcock
Esteemed Legend

Let's go line by line:

eventtype=cisco-security-events dest_ip!="255.255.255.255" dest_ip!="0.0.0.0" src_ip="*"

You should add index=... to EVERY search.

| eval isLocalIP=`local-ip-list(src_ip)`

That one is fine.

| where isLocalIP!=1 AND isnotnull(threat_reason) AND threat_reason!="-"

There is redundancy here; this is more efficient:

| where isLocalIP!=1 AND threat_reason!="-"

Now the rest of your stuff makes no sense whatsoever so clearly the commands do not do what you think they do. Together they ensure that count will always have a value of 1.

| dedup src_ip

The command above keeps exactly 1 event for each distinct value of src_ip.

| table src_ip

This is not only unnecessary, but craters your search performance because table is a finalizing command. Remove that line completely and replace it with nothing.

| stats count by src_ip

Because we already eliminated everything but 1 event for each distinct value of src_ip, you might just as well do something silly like this, because you will get the same result:

| eval count = "1"
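
The last point can be illustrated with a small Python model (not SPL). The sample values are invented; the loop stands in for dedup src_ip by keeping the first event per value, and Counter stands in for stats count by src_ip.

```python
from collections import Counter

# Hypothetical events with repeated src_ip values.
events = [
    {"src_ip": "59.160.18.202", "n": 1},
    {"src_ip": "8.8.8.8", "n": 2},
    {"src_ip": "59.160.18.202", "n": 3},
    {"src_ip": "8.8.8.8", "n": 4},
    {"src_ip": "1.2.3.4", "n": 5},
]

# Model of `dedup src_ip`: keep only the first event seen for each value.
first_seen = {}
for event in events:
    first_seen.setdefault(event["src_ip"], event)
deduped = list(first_seen.values())

# Model of `stats count by src_ip` applied after the dedup.
counts = Counter(event["src_ip"] for event in deduped)

# Every count is 1, so the stats step adds no information after dedup.
assert set(counts.values()) == {1}
print(dict(counts))
```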
0 Karma

PeterZhang
New Member

Sorry, the commands above serve no particular purpose beyond exploring SPL.

Yes, the table command is not necessary, but the problem is that without "dedup" before "stats", one of the values of "src_ip" gets lost. It is missing on the "Statistics" tab as well. "dedup" keeps exactly 1 event for each distinct value of "src_ip", but it still doesn't make sense that a value (59.160.18.202) of "src_ip" is lost when using "stats" without "dedup".

I have also attached charts comparing the two cases:
[two screenshots: results with and without dedup]

0 Karma

woodcock
Esteemed Legend

Show me the tabular results without the dedup command in the search.

0 Karma

PeterZhang
New Member

Hi woodcock, I exported the tabular results without and with the dedup command to the following links:
https://drive.google.com/open?id=1otAU7hqSG5VwUQPdJsixe54pYnCqmvdl
https://drive.google.com/open?id=1gvVB-JjFY2CmlQjXhFEMhPGP3YzuZWoS

thanks.

0 Karma

woodcock
Esteemed Legend

Are you using ... | sort anywhere? If so, try | sort 0 ... instead. The 0 makes it unlimited; without it, Splunk limits the output set to 10K rows.

0 Karma

PeterZhang
New Member

Thanks, woodcock, but there is no "... | sort" used anywhere, and the output is fewer than 50 rows.

PS: the full command is as follows:
eventtype=cisco-security-events dest_ip!="255.255.255.255" dest_ip!="0.0.0.0" src_ip="*" | eval isLocalIP=`local-ip-list(src_ip)` | where isLocalIP!=1 AND isnotnull(threat_reason) AND threat_reason!="-" | dedup src_ip | table src_ip | stats count by src_ip

The macro it uses is defined as follows:
case(cidrmatch("10.0.0.0/8", $field$),1,cidrmatch("172.12.0.0/12", $field$),1,cidrmatch("192.168.0.0/16", $field$),1,cidrmatch("169.254.0.0/16", $field$),1,cidrmatch("fe80::/64", $field$),1,cidrmatch("fec0::/10", $field$),1,cidrmatch("fc00::/7", $field$),1,$field$=="0.0.0.0",1,isnotnull($field$),0)
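
As a side check on the macro, the prefixes can be sanity-checked outside Splunk. The sketch below uses Python's standard ipaddress module; the helper names misaligned and is_local are made up for illustration. It flags "172.12.0.0/12" as having host bits set (the RFC 1918 private block is 172.16.0.0/12). How Splunk's cidrmatch treats such a prefix is not verified here, so this is only a hint at something worth checking, not an explanation of the missing value.

```python
import ipaddress

# Prefixes copied from the macro definition above.
MACRO_PREFIXES = [
    "10.0.0.0/8", "172.12.0.0/12", "192.168.0.0/16", "169.254.0.0/16",
    "fe80::/64", "fec0::/10", "fc00::/7",
]

def misaligned(prefix):
    """Return the normalized network if the prefix has host bits set, else None."""
    try:
        ipaddress.ip_network(prefix)  # strict mode rejects host bits
        return None
    except ValueError:
        return ipaddress.ip_network(prefix, strict=False)

def is_local(address):
    """Python model of the macro's range checks (assumption: non-strict prefix parsing)."""
    ip = ipaddress.ip_address(address)
    for prefix in MACRO_PREFIXES:
        net = ipaddress.ip_network(prefix, strict=False)
        if ip.version == net.version and ip in net:
            return True
    return False

for prefix in MACRO_PREFIXES:
    fixed = misaligned(prefix)
    if fixed is not None:
        print(f"{prefix} has host bits set; Python normalizes it to {fixed}")

# The address from this thread is public, so the model classifies it as non-local.
print(is_local("59.160.18.202"))
```

Under this model 59.160.18.202 is not matched by any listed range, so the macro should return 0 for it and the where clause should keep the event.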

0 Karma

johnvr
Path Finder

Rather than running dedup + table, have you tried running stats values(src_ip)? Do you get different results?

0 Karma

PeterZhang
New Member

The value "59.160.18.202" of the "src_ip" field is also lost with both "stats values(src_ip)" and "stats list(src_ip)", unless a preceding "dedup" is added.

0 Karma