Splunk Search

Field values possibly lost when using "stats...by field_name"

PeterZhang
New Member

I expected "...| dedup src_ip | table src_ip | sort str(src_ip)" to return the same result as "...| stats count by src_ip | fields - count".

In practice, "...| stats count by src_ip | fields - count" can drop some values of the "src_ip" field.

With "...| search src_ip=59.160.18.202", the related events are found.

With "...| dedup src_ip | table src_ip | sort str(src_ip)", the value "59.160.18.202" appears for the "src_ip" field on the "Statistics" tab.

With "...| stats count by src_ip | fields - count", however, the value "59.160.18.202" is missing from the "src_ip" field on the "Statistics" tab.
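
A minimal sketch for checking this expectation on synthetic data, assuming `makeresults` is available on the instance. Every sample value except 59.160.18.202 is made up for illustration. Swapping the last two lines for `| dedup src_ip | table src_ip | sort str(src_ip)` should list the same distinct values:

```spl
| makeresults
| eval src_ip=split("59.160.18.202,10.1.1.1,59.160.18.202,8.8.8.8", ",")
| mvexpand src_ip
| stats count by src_ip
| fields - count
```

If the two variants agree on synthetic data but not on the real events, the difference is more likely to come from the events or field extractions than from the commands themselves.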

I'm not sure what the reason is. Any advice would be appreciated, thanks.

0 Karma
1 Solution

woodcock
Esteemed Legend

This is absolutely a bug. Open a support case.

0 Karma

PeterZhang
New Member

Thanks, that's what I think as well.
But I don't have a support account to report the bug.

0 Karma

woodcock
Esteemed Legend

If you bought a Splunk license, you are obligated to maintain a support contract as well.

0 Karma

PeterZhang
New Member

I don't have a Splunk license so far. I think it would be better if Splunk had some way to receive potential bug reports even from users without a Splunk license.

0 Karma

woodcock
Esteemed Legend

Let's go line by line:

eventtype=cisco-security-events dest_ip!="255.255.255.255" dest_ip!="0.0.0.0" src_ip="*"

You should add index=... to EVERY search.

| eval isLocalIP=`local-ip-list(src_ip)`

That one is fine.

| where isLocalIP!=1 AND isnotnull(threat_reason) AND threat_reason!="-"

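There is redundancy here: a null threat_reason already fails the threat_reason!="-" comparison, so the isnotnull() check adds nothing. This is equivalent and more efficient: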

| where isLocalIP!=1 AND threat_reason!="-"

Now the rest of your stuff makes no sense whatsoever so clearly the commands do not do what you think they do. Together they ensure that count will always have a value of 1.

| dedup src_ip

The command above keeps exactly 1 event for each distinct value of src_ip.

| table src_ip

This is not only unnecessary, but craters your search performance because table is a finalizing command. Remove that line completely and replace it with nothing.

| stats count by src_ip

Because we already eliminated everything but 1 event for each distinct value of src_ip, you might just as well do something silly like this, because you will get the same result:

| eval count = "1"
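
Putting the suggestions above together, the search might look like the sketch below. `index=your_index` is a placeholder, since the thread never names the index. Everything else is taken from the lines quoted above, with stats run directly on the filtered events instead of after dedup and table:

```spl
index=your_index eventtype=cisco-security-events dest_ip!="255.255.255.255" dest_ip!="0.0.0.0" src_ip="*"
| eval isLocalIP=`local-ip-list(src_ip)`
| where isLocalIP!=1 AND threat_reason!="-"
| stats count by src_ip
```

In this form, count reflects the real number of matching events per src_ip rather than always being 1.
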
0 Karma

PeterZhang
New Member

Sorry, the commands above don't serve any particular purpose other than exploring SPL.

Yes, the table command is not necessary, but the problem is that without "dedup" before "stats", one of the values of "src_ip" gets lost. By the way, it is missing on the "Statistics" tab as well. "dedup" keeps exactly 1 event for each distinct value of "src_ip", but it still doesn't make sense that a value (59.160.18.202) of "src_ip" gets lost when using "stats" without "dedup".

I have also attached charts comparing the two cases (two images, not reproduced here).

0 Karma

woodcock
Esteemed Legend

Show me the tabular results without the dedup command in the search.

0 Karma

PeterZhang
New Member

Hi woodcock, I exported the tabular results without and with the dedup command to the following links:
https://drive.google.com/open?id=1otAU7hqSG5VwUQPdJsixe54pYnCqmvdl
https://drive.google.com/open?id=1gvVB-JjFY2CmlQjXhFEMhPGP3YzuZWoS

thanks.

0 Karma

woodcock
Esteemed Legend

Are you using ... | sort anywhere? If so, try using | sort 0 ..... The 0 makes it unlimited; without it, Splunk limits the output set to 10K rows.
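
For example, a minimal sketch of an unlimited sort on the field from this thread, where the leading `...` stands for whatever search precedes it:

```spl
... | stats count by src_ip | sort 0 src_ip
```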

0 Karma

PeterZhang
New Member

Thanks, woodcock, but there is no "... | sort" anywhere in the search, and the output is fewer than 50 rows.

PS: the full command is as follows:
eventtype=cisco-security-events dest_ip!="255.255.255.255" dest_ip!="0.0.0.0" src_ip="*" | eval isLocalIP=`local-ip-list(src_ip)` | where isLocalIP!=1 AND isnotnull(threat_reason) AND threat_reason!="-" | dedup src_ip | table src_ip | stats count by src_ip

The macro used in it is defined as follows:
case(cidrmatch("10.0.0.0/8", $field$),1,cidrmatch("172.12.0.0/12", $field$),1,cidrmatch("192.168.0.0/16", $field$),1,cidrmatch("169.254.0.0/16", $field$),1,cidrmatch("fe80::/64", $field$),1,cidrmatch("fec0::/10", $field$),1,cidrmatch("fc00::/7", $field$),1,$field$=="0.0.0.0",1,isnotnull($field$),0)
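
One way to see how this macro classifies the missing address is to evaluate it on a single synthetic event. This is only a sketch: it assumes the macro is available in the app where the search runs, and it makes no claim about what the result will be.

```spl
| makeresults
| eval src_ip="59.160.18.202"
| eval isLocalIP=`local-ip-list(src_ip)`
| table src_ip isLocalIP
```

If isLocalIP comes back as anything other than 0 here, the where clause rather than stats would be the place to look.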

0 Karma

johnvr
Path Finder

Rather than running dedup + table, have you tried running stats values(src_ip)? Do you get different results?
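
A sketch of that comparison, with `...` standing for the base search and filters used earlier in the thread. `dc(src_ip)` is added here only as a convenient way to compare the number of distinct values between the two variants; the output field names are arbitrary:

```spl
... | stats dc(src_ip) AS distinct_ips values(src_ip) AS src_ips

... | dedup src_ip | stats dc(src_ip) AS distinct_ips values(src_ip) AS src_ips
```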

0 Karma

PeterZhang
New Member

The value "59.160.18.202" for the "src_ip" field is also missing from the results of both "stats values(src_ip)" and "stats list(src_ip)" unless a preceding "dedup" is added.

0 Karma