Is it possible to have Splunk index all the data in a file and, when the file changes, remove the previously indexed data from Splunk and replace it with the new data from the file?
I do not need the history of the data; I am just interested in the current status of each test.
The issue is that every time the file is written, every event in it is reindexed even though only one element has changed.
i.e.
Original file content:
"test": "A", "status": "Pass"
"test": "B", "status": "Pending"
New file content (test B has changed from Pending to Failed):
"test": "A", "status": "Pass"
"test": "B", "status": "Failed"
If the entire file is being reindexed and you don't want the history, could you do something like the following, based on your data?
...your search... | head 1
That way it gets only the most recent file/record.
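Note that if each line of the file is indexed as its own event, `head 1` returns only the single most recent event overall. A variant that keeps the latest status per test might look like this (field names are taken from the example above; the index and sourcetype are placeholders):

```
index=main sourcetype=test_results
| stats latest(status) as status by test
```

`| dedup test` would achieve the same effect, since dedup keeps the most recent event for each value of the field.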
I have a similar use case, and there is not a good way to "delete" data from Splunk. What I have done is treat the entire file as a single event. That means when the file updates, you index the entire file over again. Then you can use something like:
index=main sourcetype=foo | stats first(event) as event by host | transform to extract individual events | transaction to group events (if needed)
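Concretely, assuming the whole file is one event and the key/value layout shown in the question, the placeholder steps above might be sketched like this (the rex pattern and field names are assumptions based on the example data, and `head 1` stands in for the `stats first(...)` step to keep only the newest copy of the file):

```
index=main sourcetype=foo
| head 1
| rex max_match=0 "\"test\": \"(?<test>[^\"]+)\", \"status\": \"(?<status>[^\"]+)\""
| eval pair=mvzip(test, status)
| mvexpand pair
| eval test=mvindex(split(pair, ","), 0), status=mvindex(split(pair, ","), 1)
| fields test status
```

`rex max_match=0` pulls every test/status pair out of the single raw event as multivalue fields, `mvzip` pairs them up, and `mvexpand` splits them back into one result row per test.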