Splunk Search

Inconsistent number of results

gelica
Communicator

I have a search that returns a different number of results each time I run it, and I can't figure out why.

Here's my search:

sourcetype=my_sourcetype eventname=my_eventname field1=* OR field2=* OR  field3=* | dedup num my_id | eval field123=field1.";".field2.";".field3 | eval field123=split(field123,";") | mvexpand field123 | stats count(field123) AS "#" by field123

I'm running the search over all time and I'm not indexing any more data right now. field123 can take four different values (val1 to val4 below).

Here are two of my results from this search:

Splunk returns 17184 events:
val1   8
val2   5707
val3   10875
val4   594

Splunk returns 17121 events:
val1   8
val2   5685
val3   10850
val4   578

Does anyone know what I'm doing wrong? :S
Thanks


Update:

I tried simplifying my search like this:

sourcetype=my_sourcetype eventname=my_eventname | dedup num my_id 

and I still get inconsistent results. However, I have another, similar eventtype that also has the fields num and my_id, and when I run this search, I get the same number of results every time:

sourcetype=my_sourcetype eventname=my_other_eventname | dedup num my_id
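
One way to cross-check the deduplicated count without relying on dedup is to count the distinct num/my_id combinations with stats. This is only a sketch: it assumes both fields are present on the events of interest, and the output names events, distinct_pairs and total_events are made up for illustration.

```spl
sourcetype=my_sourcetype eventname=my_eventname
| stats count AS events by num my_id
| stats count AS distinct_pairs sum(events) AS total_events
```

If distinct_pairs stays stable across runs while the dedup count does not, that would suggest the problem lies with dedup. If total_events itself changes, the set of events being returned differs between runs.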

the_wolverine
Champion

There's a search that you can run that will show the search run, timing and result counts. You could create a report and submit it to Splunk for investigation:

[ search index=_audit search="*some string that matches what you ran in your query*" | fields search_id | format ] index=_audit | transaction search_id | table apiStartTime,apiEndTime,user,event_count,search,_time

rey123
Path Finder

@the_wolverine , I am facing the same issue as the author. Would you be able to clarify what can be put inside the "" in search="some string that matches what you ran in your query"? Because if I put my entire search query there, I get a 'Search Factory: Unknown search command 'a'' error when trying to run the query. Thanks.
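
A guess at what may be going on, not a confirmed diagnosis: if the pasted query contains double quotes of its own, they can end the quoted string early, and the remainder is then parsed as search syntax. A sketch that avoids this uses only a short, distinctive fragment of the query, with no quotes or pipes, between the wildcards. The fragment below is taken from the search in the original question and is just an example.

```spl
[ search index=_audit search="*dedup num my_id*" | fields search_id | format ] index=_audit
| transaction search_id
| table apiStartTime,apiEndTime,user,event_count,search,_time
```

Any fragment that is unique enough to identify the search should do. Narrowing the time range to when the searches were run keeps the audit lookup small.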


joebensimo
Path Finder

This may be the same issue I'm encountering. In my case, stats is not aggregating properly or consistently: I get multiple result lines for items in the by part of the stats command. I can make the issue go away by running the search over a smaller data set with less variation in the by fields, or by including fewer or different aggregate functions and/or fields on which the aggregate functions are applied. That leads me to believe it is triggered by a combination of factors. I've spent many, many hours unsuccessfully trying to isolate the combination of conditions that triggers this issue.


gelica
Communicator

@lukejadamec I tried this, and Splunk tells me below the search bar that it has found 31254 matching events. But if I look at the fields in the left column and inspect the sourcetype field, for example, Splunk tells me that the field appears in 100% of the results and in 6260/6345/... events, so the inconsistency is still there.

When I run the search without dedup on my other sourcetype I get 31254 vs 6467 results so Splunk still tells me it has found more events than the sourcetype, but no inconsistency in numbers.


lukejadamec
Super Champion

I was thinking you could run the search without the dedup, and that would give you a total count of events and each event would have a timestamp. If the first and last event are always the same then it is not a bracketing problem. If you always get the same number of events without the dedup command, then it is a dedup problem.


gelica
Communicator

@lukejadamec Do you mean that I should check the _time field? Or that I should time the search? If the latter, is there a way to tell Splunk to log that?

Do you think there might be some kind of timeout somewhere? That was my first thought, but I only get about 3-4 different result counts. If a timeout were the cause, I'd expect the number of results to be more random.


lukejadamec
Super Champion

Run your search and record the time of the first and last record, and compare between searches that yield different results.
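
The comparison described above can be sketched as a single search that reports the event count together with the earliest and latest timestamps. The names events, first_event and last_event are illustrative only, and this assumes the base search from the question.

```spl
sourcetype=my_sourcetype eventname=my_eventname
| stats count AS events min(_time) AS first_event max(_time) AS last_event
| convert ctime(first_event) ctime(last_event)
```

Running it several times, once as shown and once with `| dedup num my_id` inserted before the stats, gives the two sets of numbers to compare between runs.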


gelica
Communicator

Every event that has that eventname has those fields. That's why I also tried removing that part from my search in my update, but still inconsistent :S

But either way, shouldn't I get the same number of results when I run the exact same search on the same data, regardless of whether the fields are present?


linu1988
Champion

Do your events have all 3 fields? I mean, does every event have all 3 values? If not, it's kind of expected that you will get inconsistent or odd values wherever an event is missing one of the fields.

If you want to see a consistent value, you have to use:
field1=* AND field2=* AND field3=*

But that may not give you the desired value.


0range
Communicator

Use brackets: (field1=* OR field2=* OR field3=*). As written, your expression is not evaluated the way you expect it to be.


gelica
Communicator

@0range, unfortunately that wasn't the problem, my results are still inconsistent 😕


gelica
Communicator

Oh, really? I thought that Splunk didn't need it when using OR. You mean that Splunk runs my search as (sourcetype=my_sourcetype eventname=my_eventname field1=*) OR ... as it is now?
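
For reference, and worth confirming against the documentation for your Splunk version: the docs for the search command list the evaluation order as parentheses first, then NOT, then OR, then AND (including the implied AND between terms). In eval and where expressions the order of AND and OR is reversed. Assuming that holds here, the original search without brackets should already be grouped like this:

```spl
sourcetype=my_sourcetype eventname=my_eventname (field1=* OR field2=* OR field3=*)
```

and not like the reading asked about above, which would have to be written out explicitly:

```spl
(sourcetype=my_sourcetype eventname=my_eventname field1=*) OR field2=* OR field3=*
```

That would be consistent with the report in this thread that adding the brackets did not change the results.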
