Any suggestions on techniques (or KPIs) for trending realtime searches within a Splunk environment?
I'm doing some simple daily trend analysis on all the (non-realtime) scheduled searches by looking at
run_time as reported in scheduler.log, but that value isn't helpful for real-time searches, which never really stop. For historical searches, I capture the median
run_time, total events/results returned, and events scanned (to evaluate which searches are hitting the disk the hardest). All stats are stored in a summary index.
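For context, this is roughly what my daily scheduled-search rollup looks like. It's a sketch, not my exact search: the summary index name (search_perf_summary) is made up, and I'm only showing the fields that scheduler.log provides directly (run_time, result_count); the events-scanned figure comes from elsewhere.

```
index=_internal sourcetype=scheduler status=success
| stats median(run_time) AS median_run_time,
        sum(result_count) AS total_results,
        count AS executions
  BY savedsearch_name, app
| collect index=search_perf_summary
```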
My aim in trending this information is to establish a long-term baseline to identify search performance changes and track the biggest resource consumers.
Any ideas on how to do something similar for real-time searches?
Running a REST search against your real-time search job returns a lot of different performance counters and information, including eventCount and diskUsage.
My real-time search is something like this:
eventtype="x-base-check" | ... and I find the information about this search via REST like this:
| rest /services/search/jobs | search eventSearch="*base-check*" isRealTimeSearch="1"
I don't know if this is useful to you or not, but this is what I use to get performance information related to real-time searches.
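If it helps, here is a slightly expanded version of that REST search that pulls out the counters I tend to look at. The field names (runDuration, diskUsage, eventCount, scanCount, resultCount) are what the /services/search/jobs endpoint exposes on my instance; your version may differ:

```
| rest /services/search/jobs
| search isRealTimeSearch=1
| table sid, label, runDuration, diskUsage, eventCount, scanCount, resultCount
```

Note these counters are lifetime totals for the job, so for a real-time search they only ever grow.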
I thought about a REST-based approach, but didn't think it had enough info, and I was hoping to capture performance deltas rather than lifetime stats. After looking more closely, though, I think this probably is the best option. So I've set up a scheduled search to persist a snapshot to a summary index every 15 minutes, and I'll build a daily rollup of that data for long-term analysis. This does two things I hadn't initially anticipated: (1) it lets me capture long-running non-scheduled searches, and (2) it associates the searches with a UI view. Thanks MuS! I may post my full solution once I get everything working.
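For anyone following along, the snapshot/delta approach can be sketched in two pieces. The summary index name (rt_search_perf) is hypothetical, and the exact REST field names may vary. First, the 15-minute snapshot:

```
| rest /services/search/jobs
| search isRealTimeSearch=1
| eval snapshot_time=now()
| table snapshot_time, sid, label, eventCount, scanCount, diskUsage, runDuration
| collect index=rt_search_perf
```

Then, since the REST counters are lifetime totals, the daily rollup subtracts each snapshot from the previous one per job to recover per-interval deltas:

```
index=rt_search_perf
| sort 0 sid, _time
| streamstats current=f last(scanCount) AS prev_scan, last(eventCount) AS prev_events BY sid
| eval scan_delta=scanCount-prev_scan, event_delta=eventCount-prev_events
| stats sum(scan_delta) AS events_scanned, sum(event_delta) AS events_returned BY sid, label
```

A job that restarts mid-day gets a new sid, so deltas reset cleanly at restart boundaries.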