I have set up a REST API call to an application that returns JSON output of all audit log entries. Using the JSONArrayHandler response handler, the events in the output are parsed correctly and indexed in Splunk. However, every time the input runs, it indexes the same events again. Is it possible to set up a checkpoint for this input stream?
Also, each event from the API call has a unique ID and a created_at timestamp. Could I use these data points to stop duplication of events?
I can't determine from the Foreman docs or your JSON response example where you could specify some sort of timestamp or offset cursor in the GET request parameters so that each subsequent request doesn't pull in already indexed audit records.
You would have to do it by the "id", not a date/time.
Another approach would be to use the /api/audits/$id endpoint and get them one at a time and somehow maintain state for the audit ID.
You can achieve what you want by writing a custom response handler.
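As a rough illustration of the id-based checkpoint idea, here is a minimal Python 3 sketch of the filtering logic such a handler could call. Everything in it is an assumption rather than part of the REST input or Foreman: the function names, the checkpoint file location, the guess that the v2 response wraps records in a "results" list, and the guess that each record has a top-level integer "id". Adjust it to the actual JSON you get back.

```python
import json
import os


def load_checkpoint(path):
    """Return the highest audit id already indexed, or 0 if there is no checkpoint yet."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, ValueError):
        return 0


def save_checkpoint(path, last_id):
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(last_id))
    os.replace(tmp, path)


def filter_new_audits(raw_response_output, checkpoint_path):
    """Return only the audit records whose id is above the stored checkpoint.

    Assumption: the payload is either a bare JSON array of records or an
    object with the records under "results", and each record has an
    integer "id" at the top level.
    """
    payload = json.loads(raw_response_output)
    records = payload.get("results", []) if isinstance(payload, dict) else payload

    last_id = load_checkpoint(checkpoint_path)
    new_records = sorted(
        (r for r in records if r["id"] > last_id),
        key=lambda r: r["id"],
    )

    # Advance the checkpoint only when something new was seen.
    if new_records:
        save_checkpoint(checkpoint_path, new_records[-1]["id"])
    return new_records
```

In a custom response handler you would call filter_new_audits on the raw response and emit only the records it returns, one event each. Note that this only stops re-indexing; each poll still downloads whatever the endpoint returns, since the request itself is unchanged.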
Can you provide any detailed information about your REST endpoint, request format, URL examples, etc.?
The endpoint application is Foreman and the API call is for the audit logs.
Endpoint website: http://theforeman.org
The API reference can be found here: http://theforeman.org/api/1.8/index.html. The specific call is GET /api/audits.
Example curl:
curl -k -X GET -H "Accept: application/json" --user splunk:pass https://kadmin-corp.reacher.com/api/v2/audits
The full output of the call can be found here: https://gist.github.com/palmertime/2a175d932d87689067ed
I have tried to hack up a custom response handler without success. Any help would be greatly appreciated.
If you need more info please let me know.