Hi!
Maybe this question is so simple to answer that I did not find any example, so please be kind to me 😉
We use append in our correlation search to check whether a server is in blackout. Unfortunately, we have seen the append return only partial results, which lets an incoming event create an Episode and an Incident.
It happens very seldom, but imagine you set a server into blackout for a week and you run the correlation search every minute. Just one issue on the indexer layer, e.g. a timeout, creates a risk of the event passing through.
Our idea now is to have a saved search feed a lookup instead. That search could even run at a lower frequency, maybe every 5 minutes. But what if that search sees partial results and updates the lookup with partial data?
So, long story short: how can one detect, within a running search, that it is dealing with partial results further down the pipe?
Could this work as an example for a peer timeout?
index=...
| eval sid="$name$"
| search NOT [search index=_internal earliest=-5m latest=now sourcetype=splunk_search_messages message_key="DISPATCHCOMM:PEER_ERROR_TIMEOUT" log_level=ERROR | fields sid]
| outputlookup ...
Any help is appreciated.
I hope I did not mention "partial search" 😉
I share your concern regarding subsearches; maybe appendpipe could help ...
This is the challenge for me: I should not update the lookup when the search is seeing partial results.
It happens very rarely, maybe doing it differently could help.
Sorry, I hoped there is something in SPL that would tell me that the search results are somehow limited.
require is a very good hint!
Splunk has no way of knowing how many results your search should return.
You can, however, "cheat" a bit if you know how many results there should be (or can easily get that number with a subsearch, for example).
<your_search>
| eventstats count
| where count > 10    ``` or 20, or whatever minimum you expect ```
| fields - count
Now your search will return _any_ results only if it produced at least some minimal number of events.
You can then use it to trigger an error with require.
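For example, a minimal sketch of how that count gate plus require could sit in front of your outputlookup (the threshold of 10 and the lookup name blackout_servers.csv are placeholders, adjust them to your setup):
<your_search>
| eventstats count
| where count > 10    ``` keep results only above the expected minimum ```
| require             ``` fail the whole search if nothing is left ```
| fields - count
| outputlookup blackout_servers.csv
Because require fails the whole search, outputlookup never runs and the previous content of the lookup stays untouched.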
The search can return anything between zero and some number, but that is not my question. I cannot judge the quality of the received data by its quantity. It is just important that I receive all the data I'm asking for.
Again, my question is: is there a way to detect whether a search is dealing with partial results? As a user, you get notified when you run the search manually. That is what I want to detect in the search itself.
I will experiment with appendpipe.
As far as I remember, you only get a warning about possibly incomplete results if some of your indexers are down. It has nothing to do with source servers (and that's how I interpret your question - you want to know when one of your source servers isn't sending data).
In case of a downed indexer(s) Splunk is warning you that it might not have all the data it should have. And it makes sense because the missing indexers could have had buckets which have not been replicated yet or might have been replicated but are not searchable. But it deals only with the state of the Splunk infrastructure, not the sources.
Splunk has no way of knowing what "partial" data means in the case of missing sources. There are some apps meant for detecting downed sources, but they don't affect searches running on the data from those sources (although you could add a safeguard based on a technique similar to the require one, driven by a lookup or something).
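As a rough sketch of that safeguard idea, assuming a lookup expected_hosts.csv with a single host column listing every source that must report:
<your_search>
| stats count by host
| append [| inputlookup expected_hosts.csv | eval expected=1 | fields host expected]
| stats sum(count) as events, max(expected) as expected by host
| where expected=1 AND (isnull(events) OR events=0)
Any host this returns is listed as expected but produced no events in the search window, so you could alert on it or use it to block the lookup update.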
In my case it is about:
@PickleRick wrote: In case of a downed indexer(s) Splunk is warning you that it might not have all the data it should have. And it makes sense because the missing indexers could have had buckets which have not been replicated yet or might have been replicated but are not searchable.
I want my search to not store the data in a lookup when Splunk raises this warning. And here I'm stuck.
I don't think there is a way to get this info from within a search. It might be (and probably is) returned as additional status on the search job, but it's not reflected in the search results themselves. Instead of directly looking for incomplete results, you could try to detect a situation in which they could happen, for example by checking cluster health with rest.
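A sketch of that rest idea, assuming the search runs on a search head with distributed search peers configured (field names are from /services/search/distributed/peers as I remember them, so verify them on your version):
| rest /services/search/distributed/peers splunk_server=local
| table peerName status
| where status!="Up"
Any row returned means at least one search peer is not "Up", so a search over the indexers at that moment could be incomplete; you could run this as a precondition or an alert before trusting the lookup update.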
It's not clear what is meant by "partial search" and how Splunk is to know a search returned partial results or just fewer results.
The subsearch idea likely won't work because subsearches execute before the main search and so would be unable to detect errors in the main search.
There is the require command that will abort a query if there are zero results. That may not meet your requirements, however.
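For completeness, the minimal pattern would look something like this (the lookup name is just a placeholder):
<your_search>
| require
| outputlookup server_blackout.csv
If zero results reach require, the search errors out and outputlookup never overwrites the existing lookup; it does not, however, catch the case where some results arrive and others are missing.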