Hi,
We have an issue with an external REST API that works properly 99% of the time, but once in a while it publishes back-dated data.
Background:
We have configured a Splunk Add-on to poll the API every 300 seconds. The HTTP GET request includes a REST API filter (let's say the filter is named "DateCreated") that indicates the time when an event was generated.
If everything worked as expected, this is how we could do the implementation:
However, once in a while the API publishes data whose DateCreated is back-dated by up to several hours, so it does not match our initial implementation and results in missed events. We have also checked, and there is no other filter in the API that we could use to work around this issue in 100% of cases.
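To illustrate why these events are missed, here is a minimal Python sketch. It assumes each poll asks for events whose DateCreated falls within the last 300 seconds; the names poll_window and is_fetched are made up for illustration and are not part of the add-on or the API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: each poll requests only the last 300 seconds of DateCreated values.
POLL_INTERVAL = timedelta(seconds=300)


def poll_window(now, interval=POLL_INTERVAL):
    """Return the (start, end) DateCreated range requested by one poll."""
    return now - interval, now


def is_fetched(date_created, window):
    """True if an event with this DateCreated falls inside the polled window."""
    start, end = window
    return start <= date_created < end


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
window = poll_window(now)

on_time = now - timedelta(minutes=2)    # published with a current timestamp
back_dated = now - timedelta(hours=3)   # published now, but stamped 3 hours ago

print(is_fetched(on_time, window))      # True
print(is_fetched(back_dated, window))   # False: no later poll covers this time
```

Because every later poll only moves the window forward, the back-dated event never falls inside any window.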
Potential solution:
We have been considering a solution where we would run a batch search, e.g. once every 24 hours, fetching all events from the API. This would retrieve all events from the past 24 hours (with high certainty including the back-dated ones) and then process all of them:
Implementing a batch API call feature comes with another problem: it generates duplicate events in the index. We want to keep the index free of duplicate events because of our configured alerting and reporting logic.
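As a rough sketch of the kind of de-duplication we have in mind, assuming events arrive as JSON-like dicts and that a set of already-indexed keys can be persisted between runs in some store: the helper names event_key and filter_new are hypothetical, and hashing the whole event only works if the API returns identical content for the same event on every call.

```python
import hashlib
import json


def event_key(event):
    """Derive a stable key from the event content (assumes no unique ID field)."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def filter_new(events, seen_keys):
    """Return only events not seen before and record their keys in seen_keys."""
    fresh = []
    for event in events:
        key = event_key(event)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(event)
    return fresh


# In practice seen_keys would be loaded from and saved to persistent storage.
seen_keys = set()
a = {"DateCreated": "2024-01-01T11:58:00Z", "msg": "login"}
b = {"DateCreated": "2024-01-01T09:00:00Z", "msg": "back-dated event"}

first_run = filter_new([a], seen_keys)      # regular 300-second poll
batch_run = filter_new([a, b], seen_keys)   # 24-hour batch returns both
print(len(first_run), len(batch_run))       # 1 1
```

Only the back-dated event from the batch run would be written to the index. The set of keys would also need pruning after some retention period so it does not grow without bound.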
To avoid duplicate events, we have been considering two options:
In the end, we want to minimize admin overhead across different Splunk environments that perform the exact same API calls, but for different entities. Since multiple environments perform the same activity, the solution should be easy to deploy and manage across all of them.
Thanks.