A few static files are generated within my environment, and when they are, Splunk collects them. The date and time at which the files are generated are not consistent, but I know I will get 4 per month. Some details on my data in case they are useful: the files come from the same host and have the same sourcetype, but they have different source values and, obviously, different _time values.
My question is how to automatically recognize when all 4 files have been indexed, and perform the desired actions.
Specifically, I am trying to run some reports and update some dashboard panels. I need my reports to run once Splunk has indexed all 4 files and not before. The reports don't need to run immediately; it would be fine if they ran "the next day" after all 4 files have been indexed. The reports use table-formatted results (so all fields should still be available within the results). The dashboard panels need to be designed to always use the most recent complete set of data (4 files from the same month, not "the last 4 files").
I should mention that it is possible for a file to be generated correctly and still be empty. That rules out simply running a search and counting the number of source values that show up. In this case the file is still generated and indexed; it just won't contain relevant data.
I only want the report to run once per month, as soon as possible after all 4 files are indexed. I can't just wait until the last day of the month for everything to happen.
Any assistance or ideas on how to accomplish this would be appreciated.
1. Run a scheduled search to check whether the data has been received.
2. If it has been received, call a script.
3. In the script, use curl to do whatever you need, for example run a saved search that generates or updates a lookup.
Steps:
First, create a search that identifies whether your data has been received. It would look something like this: | tstats count where index=_internal by source | eventstats dc(source) as total_source | where count>1 AND total_source>=4
Time range: @mon to now
Cron schedule: based on your requirement, for example once a day or once an hour
Alert condition: number of results is equal to 4
Alert action: select "Run a script" and provide the script name, "sc_enable_other_reports.sh"
Description of the search above: it counts the distinct sources received since the start of the month and keeps only sources with more than one event (you can change both thresholds to suit your data).
Let's say your scheduled saved search "my_search" in the search app sends an email, generates lookups, or whatever else you need. First remove its schedule; the idea here is to trigger it from the command line instead.
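Sample script: /opt/splunk/etc/apps/search/bin/scripts/sc_enable_other_reports.sh (if required, you can add logging and beautify the script) ... curl -k -u admin:changeme https://localhost:8089/services/search/jobs/ --data-urlencode "search=| savedsearch my_search" -d max_count=5000000
Note that the search string sent to this endpoint has to start with either the search command or a leading pipe, which is why savedsearch is prefixed with "| ".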
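If you do want logging in the script, here is a minimal sketch of one way to wrap the curl call. The log path, the environment variable names (LOG_FILE, SPLUNK_URL, SPLUNK_AUTH, CURL_BIN) and the function names are my own placeholders, not anything Splunk defines, so adjust them to your setup.

```shell
#!/bin/sh
# Sketch only: every path, variable name, and credential here is a placeholder.
LOG_FILE="${LOG_FILE:-/tmp/sc_enable_other_reports.log}"
SPLUNK_URL="${SPLUNK_URL:-https://localhost:8089/services/search/jobs}"
SPLUNK_AUTH="${SPLUNK_AUTH:-admin:changeme}"
CURL_BIN="${CURL_BIN:-curl}"

# Append a timestamped line to the log file.
log() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}

# Dispatch a saved search by name ($1) and log whether curl succeeded.
# -f makes curl return a non-zero exit code on HTTP errors.
dispatch_report() {
    if "$CURL_BIN" -k -s -f -u "$SPLUNK_AUTH" "$SPLUNK_URL" \
        --data-urlencode "search=| savedsearch $1" \
        -d max_count=5000000 >/dev/null; then
        log "dispatched $1"
    else
        rc=$?
        log "failed to dispatch $1 (curl exit $rc)"
        return 1
    fi
}
```

With this in place the script body would just be: dispatch_report my_search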
Keep a monthly export log or state file so that you don't send the same report again; all of this can be handled in the script. I used the _internal index and the search app to validate the approach, so substitute your own. I suggest trying this example first and then moving on to your real data. Let us know how it goes.
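For the state file, something along these lines might work. This is only a sketch: STATE_FILE is a hypothetical path, the function names are made up for illustration, and it assumes a month key in YYYY-MM form is enough to identify a run.

```shell
#!/bin/sh
# Sketch only: STATE_FILE is a hypothetical path; use one the Splunk user can write to.
STATE_FILE="${STATE_FILE:-/tmp/sc_enable_other_reports.state}"

# Succeeds (exit 0) if the given month (YYYY-MM) is already recorded.
already_sent() {
    [ -f "$STATE_FILE" ] && grep -qxF "$1" "$STATE_FILE"
}

# Record the month after the report has been dispatched.
mark_sent() {
    printf '%s\n' "$1" >> "$STATE_FILE"
}

# Run a command at most once per month.
# $1 = month key (YYYY-MM); remaining arguments = the command to run.
# The month is recorded only if the command succeeds, so a failed
# dispatch is retried the next time the alert fires.
run_once_per_month() {
    month="$1"
    shift
    if already_sent "$month"; then
        echo "skip: report for $month already sent"
        return 0
    fi
    "$@" && mark_sent "$month"
}
```

In the alert script you would then call it with the current month and the curl command, for example: run_once_per_month "$(date +%Y-%m)" curl -k -u admin:changeme ... (same arguments as in the sample script).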
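See if this gives you ideas. Schedule it to run daily (?). The subsearch looks for the most recent month in which exactly 4 distinct sources were indexed and hands those source values to the outer search:
<index, sourcetype, etc. for the final query> [search <same index and sourcetype> earliest=-1mon@mon | bin span=1mon _time | stats dc(source) as source_count values(source) as source by _time | where source_count=4 | sort - _time | head 1 | mvexpand source | fields source ] | rest of your query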