Hi,

We have an issue with an external REST API that works properly 99% of the time, but once in a while publishes data as "back dated".

Background: we have configured a Splunk add-on to poll the API every 300 seconds, using a REST API filter (let's say the filter is named "DateCreated") passed as part of the HTTP GET request to indicate when an event was generated. If everything worked as expected, the implementation would be:

Filter: DateCreated > (now - 300s)
Result: returns all events from the last 5-minute period

However, once in a while the API publishes data with a DateCreated that is back dated by up to multiple hours, so those events fall outside the filter window and are missed. We have also investigated the API and found no other filter that would work around this issue in 100% of cases.

Potential solution: we have been considering a batch search, e.g. once every 24 hours, fetching all events from the API. This would retrieve all events from the past 24 hours (with high certainty including the back dated ones) and then process them as follows:

- Ingest events that have not been ingested by past API calls
- Drop the ones that are duplicates (already ingested by past API calls)

Implementing a batch API call, however, comes with another problem: it generates duplicate events in the index. We want to keep the index free of duplicate events because of our configured alerting and reporting logic. To avoid duplicate events, we have been considering two options:

1. The add-on uses a KV store, storing a unique identifier for every event during the original API calls. The batch API call then uses the store for duplicate detection, dropping events that were already ingested. The issue here is that the KV store grows over time and nobody cleans it up.
Is there a good way to clean up the KV store, either periodically or by setting a maximum size so that the oldest records are removed automatically? Optimally the cleanup would be performed by the add-on itself.

2. As part of every batch API call, the add-on performs REST API calls against the Splunk index where the data is already ingested, parsing the unique identifiers and using them to drop duplicates. Does an add-on have permission to perform Splunk REST API calls natively, without additional credentials? If not, what would be the optimal way of creating and storing account information? Is there any example implementation of an add-on calling the Splunk REST API?

Any other potential implementation ideas? In the end, we want to minimize admin overhead across different Splunk environments that perform exactly the same API calls, but for different entities. We have multiple environments performing the same activity, so this should be a solution that can be easily deployed and managed for multiple environments.

Thanks.
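The checkpoint-and-prune logic from option 1 can be sketched outside of Splunk. The following is a minimal, illustrative Python model (the names `CheckpointStore` and `process_batch` are our own, not a Splunk API); a real add-on would back the `seen` mapping with a KV store collection, e.g. via the `/storage/collections/data` REST endpoint or `splunklib`, rather than an in-memory dict. The key point is the TTL-based eviction, which bounds the store's growth:

```python
import time


class CheckpointStore:
    """Illustrative in-memory stand-in for a KV store collection.

    Maps event unique-id -> time the id was recorded, so entries
    older than a TTL can be evicted instead of accumulating forever.
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # event_id -> recorded_at

    def record(self, event_id, now=None):
        self.seen[event_id] = time.time() if now is None else now

    def is_duplicate(self, event_id):
        return event_id in self.seen

    def prune(self, now=None):
        """Drop ids older than the TTL (e.g. 48h for a 24h batch window)."""
        now = time.time() if now is None else now
        cutoff = now - self.ttl
        self.seen = {eid: ts for eid, ts in self.seen.items() if ts >= cutoff}


def process_batch(events, store, now=None):
    """Return only events whose ids have not been seen before,
    recording the new ids so later batches can detect them."""
    store.prune(now=now)  # evict expired ids before dedup checks
    new_events = []
    for event in events:
        if not store.is_duplicate(event["id"]):
            store.record(event["id"], now=now)
            new_events.append(event)
    return new_events
```

To our knowledge there is no built-in max-size or TTL auto-eviction setting for KV store collections, but the same pruning effect can be had from a scheduled search over the collection's lookup definition (`| inputlookup` filtered by a timestamp field, written back with `| outputlookup`), or from the add-on deleting old records through the KV store REST endpoints.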
Thanks. I figured out that it would be doable the way you mentioned, but since the number of fields in the lookup table changes once in a while and the resulting search was hard to maintain, I decided it is better to split the search into two searches and do the dynamic part of the filtering in the second search.
Hi, I have a situation where a lookup table defines search filters that need to be used as part of a search query. The dynamic filter (data_owner_filter) is built from the original search results, and the subsearch filters are defined by the lookup table, where filters can be either inclusive or exclusive. I have tried the following kind of approach, but the problem is that the subsearch cannot reach the value defined as data_owner_filter: <search>
| eval data_owner_filter=mvindex(split(data_owner,"_"),1)
| search ([| inputlookup lookup_table.csv | search static_filter="use_case_1" dynamic_filter=data_owner_filter rule_type="inclusive" | fields fieldx])
| search NOT ([| inputlookup lookup_table.csv | search static_filter="use_case_1" dynamic_filter=data_owner_filter rule_type="exclusive" | fields fieldx])
| table fieldx, fieldy, data_owner

Example of the lookup table (the table can have hundreds of entries):

static_filter | dynamic_filter | rule_type | fieldx
use_case_1    | 001            | inclusive | abc*
use_case_1    | 001            | exclusive | efg*
use_case_1    | 002            | inclusive | bcd*
use_case_1    | 002            | inclusive | abc*
use_case_2    | 002            | inclusive | abc*
use_case_2    | 002            | exclusive | hij*
...

The idea behind the whole approach is to have a single lookup table handling various inclusions and exclusions for data belonging to different data owners (the owner defined by data_owner_filter), while having a single search alert configured per use case (defined by static_filter). Any suggestions on how this could be accomplished?
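To make the intended semantics of the lookup rows concrete, here is a small illustrative Python model (not SPL; `lookup_rows` is a subset of the table above and `keep_event` is a hypothetical helper): an event passes if its fieldx value matches at least one inclusive pattern for its use case and data owner, and no exclusive pattern. This mirrors what the `search (...)` / `search NOT (...)` pair above is trying to express:

```python
from fnmatch import fnmatch

# Subset of the lookup table rows:
# (static_filter, dynamic_filter, rule_type, fieldx pattern)
lookup_rows = [
    ("use_case_1", "001", "inclusive", "abc*"),
    ("use_case_1", "001", "exclusive", "efg*"),
    ("use_case_1", "002", "inclusive", "bcd*"),
    ("use_case_1", "002", "inclusive", "abc*"),
]


def keep_event(use_case, data_owner_filter, fieldx_value):
    """Keep an event if fieldx matches any inclusive pattern for its
    use case + data owner, and matches no exclusive pattern."""
    rules = [(rtype, pat) for sf, df, rtype, pat in lookup_rows
             if sf == use_case and df == data_owner_filter]
    included = any(rtype == "inclusive" and fnmatch(fieldx_value, pat)
                   for rtype, pat in rules)
    excluded = any(rtype == "exclusive" and fnmatch(fieldx_value, pat)
                   for rtype, pat in rules)
    return included and not excluded
```

The reason the SPL attempt above fails is that subsearches run to completion before the outer search and therefore cannot see per-event values such as data_owner_filter; this is why a two-stage approach (as the follow-up in this thread settled on) or generating the filter string up front is typically needed.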