Hi,
I have a CSV file in a folder on my PC that is updated every day.
I always want to use the most up-to-date CSV file.
I tried the monitor option, but when the CSV is updated on the following days, the old data remains in my Splunk index.
Can I use only the latest CSV file rather than the sum of the old ones?
Thanks in advance.
You can clear the contents of the old file using something like: index=<index name> source=<source name> | delete
and then index the new file. It's doable...
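Before deleting, it may help to check what the search actually matches. A minimal sketch using the same placeholders as above: the first search only counts the matching events, and the second removes them. Note that delete requires a role with the delete capability (can_delete by default), and it hides events from searches rather than freeing disk space.

```
index=<index name> source=<source name> | stats count

index=<index name> source=<source name> | delete
```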
I use the same index for every CSV. With the monitor option, when my CSV file is updated I get the new CSV in Splunk, but I also still have the old ones, which partly overlap with the new one.
Using the same index, I want to work only with the latest up-to-date CSV file every day.
Consider writing the CSV to a KV Store collection every day. Each update will replace the existing data, so there will be only one day of data in the store. Then your search just needs to do an inputlookup to get the day's data.
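As a rough sketch of that approach, assume the daily CSV is available to Splunk as a lookup file and a KV Store lookup definition already exists (daily.csv and daily_kv are hypothetical names). The first search, run once a day, overwrites the collection; with its default append=false, outputlookup should replace the previous contents rather than add to them. The second search reads the current day's data back.

```
| inputlookup daily.csv | outputlookup daily_kv

| inputlookup daily_kv
```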
Once data is indexed by Splunk, it cannot be physically removed until it expires; the delete command only hides events from search results.
You probably should modify your query to search for only the most recent data. Post your query and we may be able to help.
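For instance, if the whole file is re-indexed on each update, one possible approach is to filter on index time rather than event time. This is only a sketch, using the placeholders from earlier in the thread and an assumed window of the last 24 hours. The normal search time range still applies, so it needs to be wide enough to cover the event timestamps in the file.

```
index=<index name> source=<source name> _index_earliest=-24h
```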
My problem is that if I upload a new file with the same name as the old one, Splunk adds the new data to the old, so I end up with many duplicate events.
For example: yesterday I uploaded a CSV with 10 events; today I upload the same CSV file but with 11 events (10 identical to yesterday's file and 1 new event).
I want to see only 11 events, not the sum of the two files (21 events).
If you're simply adding new events to the same file, rather than completely replacing the CSV, tell Splunk to monitor that file and it will only read the newest events. That should avoid duplicates.
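A minimal sketch of such a monitor input in inputs.conf follows. The path and index are placeholders to fill in, and the built-in csv source type is an assumption about how the file should be parsed.

```
[monitor://<path to csv file>]
index = <index name>
sourcetype = csv
disabled = false
```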