Indexing data

dgadjov — Sat, 30 Mar 2013 01:18:41 GMT

Is it possible to have Splunk index all the data in a file and, when the file changes, remove the currently indexed data from Splunk and replace it with the new contents of the file?
I do not need the history of the data; I am only interested in the current status of each test.
The issue is that every time an event is written to the file, the whole file is indexed again, even though only one element has changed.

For example:

Original file content

"test": "A", "status": "Pass"
"test": "B", "status": "Pending"

New file content. Test B has changed from Pending to Failed

"test": "A", "status": "Pass"
"test": "B", "status": "Failed"

Re: Indexing data

ShaneNewman — Sat, 30 Mar 2013 04:26:00 GMT

I have a similar use case, and there is no good way to "delete" the data from Splunk. What I have done is treat the entire file as a single event. This means that when the file updates, the entire file is indexed again. Then you can use something like:

index=main sourcetype=foo | stats first(_raw) as event by host | <extract individual events from event> | <transaction to group events, if needed>
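
To make the outline above more concrete, here is a rough sketch of what the extraction step could look like for the sample data in the question. It assumes the whole file really is indexed as one event and that every line has the exact "test": "...", "status": "..." shape shown there. The field names test, status and pair are my own choices for illustration, and the regular expression would need adjusting for any other layout.

```spl
index=main sourcetype=foo
| stats latest(_raw) as event by host
| rex field=event max_match=0 "\"test\":\s*\"(?<test>[^\"]+)\",\s*\"status\":\s*\"(?<status>[^\"]+)\""
| eval pair=mvzip(test, status, "=")
| mvexpand pair
| eval test=mvindex(split(pair, "="), 0), status=mvindex(split(pair, "="), 1)
| table host test status
```

The stats step keeps only the newest copy of the file per host, rex with max_match=0 pulls every test/status pair out of that copy as multivalue fields, and mvzip plus mvexpand turn the pairs back into one row per test. This breaks if a test name itself contains "=", in which case a different delimiter would be needed.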

Re: Indexing data

Runals — Sun, 31 Mar 2013 03:11:12 GMT

If the entire file is being reindexed and you don't want the history, could you do something like the following, based on your data?

...your search... | head 1

This way the search returns only the most recent file/record, since Splunk returns events newest first.
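
Note that head 1 only helps if the whole file arrives as a single event. If each line is indexed as its own event instead, a sketch along the following lines might work. It assumes the index and sourcetype from the earlier reply (main and foo) and the line format from the question, and the field names test and status are again just illustrative:

```spl
index=main sourcetype=foo
| rex "\"test\":\s*\"(?<test>[^\"]+)\",\s*\"status\":\s*\"(?<status>[^\"]+)\""
| stats latest(status) as status by test
```

This keeps one row per test with its most recently indexed status, so the older "Pending" entry for test B is still in the index but no longer shows up in the results.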