I have a requirement to get the average of the count of the IPs over the last 90 days. I have thought of 2 approaches to distribute the query overhead across a span of 90 days.
Will schedule a query to run everyday at the end of the day and collect the result in a csv file using outputlookup. Will keep appending the results every day to the existing file. Then use this file to get the 90 day average.
Use summary indexing and schedule a query to run everyday at the end of the day and collect the result in the summary index specifying the search a name. Will keep appending the results every day to this index . Then use this index to get the 90 day average.
Question:
Is there a way to restrict the lookup csv file or the summary index to have just the latest 90 days record. Meaning , I want to purge the rows in the index or the file which are greater than 90 days. How can I do it?
This seems like a perfect candidate for a datamodel and dm acceleration.
Create a datamodel to include the index, fields, etc. that you require, then accelerate it for 90 days.
Splunk will automatically keep the acceleration up to date (always 90 days e.g.) and you'll get quite a performance increase on your searches.
Datamodels:
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Aboutdatamodels
Datamodel acceleration:
https://docs.splunk.com/Documentation/Splunk/8.0.1/Knowledge/Acceleratedatamodels
This seems like a perfect candidate for a datamodel and dm acceleration.
Create a datamodel to include the index, fields, etc. that you require, then accelerate it for 90 days.
Splunk will automatically keep the acceleration up to date (always 90 days e.g.) and you'll get quite a performance increase on your searches.
Datamodels:
https://docs.splunk.com/Documentation/Splunk/8.0.2/Knowledge/Aboutdatamodels
Datamodel acceleration:
https://docs.splunk.com/Documentation/Splunk/8.0.1/Knowledge/Acceleratedatamodels
Cool. Thanks for pointing it out. I did it the old school way and a very raw way.
So here's what I did:
- using a scheduled search query I queried the results for that day
- Appended the rows in the csv(loopup csv file already populated with previous x days data) with the above query results
- Checked if the timestamp of the results is beyond 90 days and then filtered out the results
- Wrote the filtered results back to the csv file so that the csv has the results for the past 90 days.
Then in the dashboard I read this csv and populate the report for the last 90 days.
But I will definitely take a look at your approach. Thanks very much. Appreciate it.
Glad to help!