Splunk Stats count discrepancy

tiagofbmm
Influencer

Hello

How could searching in Verbose mode over a strict time range for index=foo host=bar | stats count return a much larger value than the number of events I see?

Even if I search for index=foo host=bar in the same time frame, I get far fewer events than the count reports. What is wrong? How can Splunk count the events for a specific host but then not return them?

Any ideas?

Thanks

P.S.: Please see the attached evidence.

0 Karma

jkat54
SplunkTrust

By much larger, do you mean double, triple, or a factor of x times greater?

If the count is a whole-number multiple of the correct count, then it could be something like having search peers defined on a search head that is also joined to a cluster master of the same search peers.
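One rough way to check this, as a sketch only (index=foo and host=bar are the placeholders from the question, and per_peer_count is just an illustrative name), is to break the count down by the default splunk_server field and compare the peers that show up against the indexers you expect to be searched:

```spl
index=foo host=bar
| stats count AS per_peer_count BY splunk_server
| sort - per_peer_count
```

If one peer contributes a count that looks like a clean multiple of what it should hold, the search head configuration for that peer would be the first place I'd look.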

0 Karma

tiagofbmm
Influencer

No, it is not the case

0 Karma

tiagofbmm
Influencer

And fyi, I also tried to search these events from a search head cluster member and also from the Monitoring Console of the environment that has distributed search instead of belonging to the cluster. No difference at all

0 Karma

woodcock
Esteemed Legend

This can happen if you have somehow overridden the original/indexed value of host. You might have a field extraction for host against your event data. You might be seeing the results of KV_MODE automatically overwriting host. I recently saw a similar thing with an aws:s3 feed. The indexed value for host was the proxy where the S3 buckets were, but the JSON events had a host field encoded within them. A similar thing can happen if you have a calculated field that overrides the host field. Here are some things that you can do:

If you desire to be sure that you are accessing the indexed field for host, then DO NOT use host=foohost, but instead use host::foohost.

You can also run tstats WHERE host=foohost to get a count of the indexed field.

You might look for a field called orig*host. Many times when people deliberately override a value, they save off the old value. To some degree, this behavior can be controlled by using INDEXED and INDEXED_VALUE in fields.conf: https://docs.splunk.com/Documentation/Splunk/latest/Admin/Fieldsconf
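To put the first two suggestions side by side, here is a minimal sketch using the index=foo and host=bar placeholders from the question (the AS names are only illustrative). Each line is a separate search to run over the same time range: the first filters on the search-time value of host, the second on the indexed value, and the third counts from the index-time data with tstats.

```spl
index=foo host=bar | stats count AS search_time_host_count

index=foo host::bar | stats count AS indexed_host_count

| tstats count AS tstats_host_count WHERE index=foo host=bar
```

If the first count differs from the other two, a search-time override of host is a likely suspect. If all three agree with each other but not with the Events tab, the cause is probably elsewhere.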

tiagofbmm
Influencer

It is a good point you mentioned. In this case I already tried to search for the events having their sourcetype without KV extractions, which would avoid the host field override, and the results are the same. The count results I am using are coherent between stats and tstats, so I guess this is some other issue I'm facing. But anyway, good points you mentioned, thanks

woodcock
Esteemed Legend

That's what the up-vote button is for 😆

0 Karma

niketn
Legend

@tiagofbmm what is the behavior of tstats command?

| tstats count where index=foo by host

Also, would there be a possibility to test on 6.6 or 7.1?

Do engage the Splunk Support team if you have a valid Splunk entitlement, as 7.0 had some issues (a search issue with multikv was resolved in Splunk 7.1.1: https://docs.splunk.com/Documentation/Splunk/latest/ReleaseNotes/Fixedissues)

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

tiagofbmm
Influencer

Niket, both tstats and stats show the same results. Both show a greater-than-zero count per host, but filtering to a specific host doesn't return the number of events shown in the count. It is happening for a lot of dbinputs from different connections. Not all of them though 😐

0 Karma

niketn
Legend

@tiagofbmm, if the same dbinputs worked for you before with previous Splunk/DB Connect version, you should work with Splunk Support with your Splunk entitlement. Also add a bug tag to your question here.

Is the issue only with sourcetypes coming from dbinput?

0 Karma

tiagofbmm
Influencer

Yes Niket, only for DB Connect sources. I don't know if a previous version of DB Connect would work fine because I have only used this one. You're right, a bug tag for sure

0 Karma

immortalraghava
Path Finder

There is definitely a bug here in Splunk 7.
This is what I am facing.
For same data, it works with 6.4.5

https://answers.splunk.com/answers/668827/issues-with-splunk-search-behavior-in-version-704.html

And this is not a metadata field.

0 Karma

manish_singh_77
Builder

Strange, I have never come across this issue, could you confirm the time frame that you are using?

0 Karma

tiagofbmm
Influencer

@manish_singh_777 I am using non-relative time ranges, from one specific hour, to an already ended day, to even All Time for this test. The behaviour is consistently wrong, whichever one I specify.

0 Karma

niketn
Legend

@manish_singh_777, since this is just a follow up question I have moved your point from Answer to Comment.

0 Karma

scannon4
Communicator

Does it do this even if you choose All Time? What time frame are you choosing?

0 Karma

tiagofbmm
Influencer

The time frame is irrelevant here; it happens for every time window I use. But this happens only if I filter the results with a metadata field, such as source, host, or sourcetype. If I just run index=foo | stats count, then the count is consistent with the number of events (checking the Events tab and the Statistics tab in Verbose mode)

0 Karma

tiagofbmm
Influencer

It happens for non-relative time windows, such as yesterday, or a specific hour of the day.

0 Karma

scannon4
Communicator

If you do index=foo without the host, do you see events from other hosts?

0 Karma

tiagofbmm
Influencer

The issue is not really about not seeing any events. It's about seeing a really small fraction of the events compared to what the count shows.

0 Karma

scannon4
Communicator

I would use Job Inspector as well to look at every step the search took to make sure nothing weird is going on with an indexer or something. Just a thought.

0 Karma