Real Time Search Performance Considerations: Are there any scenarios where real-time searches would be acceptable?

shailesh030 · ‎06-11-2015

I understand that real-time searches in Splunk are very expensive and should be avoided. My question extends what has been asked, and answered to some extent, in the thread below:

http://answers.splunk.com/answers/230777/best-practices-when-dealing-with-real-time-searche.html?utm...

Below are the details of my dashboard:

My application dashboard has 5 tabs, each with 3-4 real-time searches monitoring and measuring events from the past 15 minutes up to now, i.e. earliest=rt-15m and latest=rt.
At any point in time only 1 or 2 users would be viewing the dashboard, and only occasionally, mainly after a scheduled real-time Splunk search raises an alert indicating that the monitored application is suffering degraded performance or a lot of errors. But when users are viewing the dashboard, they want the panels updated in real time, without lag, because the application is mission critical.
Events come in at a very high rate, ranging from an average of 20 transactions/events per second to 200-300 transactions/events per second.

Questions
1. Are there any scenarios where real-time searches would be acceptable in production settings, for example the one I have detailed above? Or is any kind of real-time search (irrespective of number of users, number of searches per dashboard, search window, etc.) a risk to the Splunk environment as a whole?

2. If the answer to #1 indicates that real-time searches are not advisable, then we can run saved searches every 30 seconds or 1 minute. But since the panels need to show the latest statistics, we would have to refresh the forms every 30 seconds or 1 minute anyway. In that case, does scheduling saved searches make any sense? Can we not just leave them as inline searches with a form or panel refresh every 1 minute? Since the window is just 15 minutes and we don't need these searches for any reports (just real-time monitoring), we don't need acceleration, summary indexing, etc.

lguinn2 · ‎06-11-2015

Question 1: When real time searches would be acceptable in production settings --

Ultimately this is a decision that you have to make. In your environment, is this a good use of the available resources? Does your environment have the capacity to run this many real-time searches without significant degradation? You will have to do some performance testing to see...

Question 2: Can we run savedsearches running every 30 seconds or 1 minute --

This is actually a good idea. If you have 2 users looking at your dashboard, each will be running 3-4 real-time (RT) searches, for a total RT search load of 6-8 searches. If instead you run scheduled searches every minute (the scheduler works from cron expressions, so it cannot go down to 30 seconds), the dashboards will pick up the latest results of each scheduled search, and these cached results will be shared by all users. So adding users does not increase the load on the system as dramatically as it does with RT searches on dashboards.
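
As a rough sketch of the scheduled-search approach, a saved search covering the same 15-minute window could look like this in savedsearches.conf. The stanza name, index, and field are hypothetical, not taken from this thread. The schedule is a cron expression, so once a minute is the shortest interval:

```ini
[app_errors_last_15m]
# hypothetical search; substitute your own index and fields
search = index=app_logs log_level=ERROR | timechart span=1m count
# same 15-minute window as the dashboard, but not real-time
dispatch.earliest_time = -15m
dispatch.latest_time = now
enableSched = 1
# run once every minute
cron_schedule = * * * * *
```

A dashboard panel can then load the most recent results of that search by reference, for example with `<search ref="app_errors_last_15m"></search>` in Simple XML, instead of dispatching its own search for every viewer.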

Other thoughts --

Nothing is ever truly "real-time" - events happen, get forwarded to Splunk, and must be parsed and then filtered before they show up in an RT search. There is always some lag. But one of the benefits of an RT search can be that it monitors continuously; it never stops. That takes more resources of course, but perhaps the benefits outweigh the cost.

One way to reduce the cost of real-time searches is to use indexed real-time search. Normally, an RT search monitors an internal Splunk queue for arriving events. This can affect indexing performance. Setting the system to use indexed real-time search changes that - instead of monitoring an internal queue, the RT search monitors the data that is written to disk. This takes fewer resources.

You must set a lag time, so indexed RT searches will have a bit more lag, but they will run continuously like normal RT searches. If you want to use a lot of RT searches, but you can tolerate a little more lag, this may be the best answer for you.
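
For reference, here is a minimal sketch of what that looks like in limits.conf, assuming you want indexed real-time as the system-wide default. The 60-second value is only an illustration of the lag setting; check the limits.conf spec for your version before relying on either setting:

```ini
[realtime]
# run real-time searches against indexed data on disk by default
indexed_realtime_use_by_default = true
# seconds to wait for data to be synced to disk; this is the extra lag
indexed_realtime_disk_sync_delay = 60
```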

HTH

shailesh030 · ‎06-12-2015

Thank you. I think indexed real-time searches are more relevant for our use case, because we can tolerate a little bit of lag while still monitoring continuously.

Based on the explanation, scheduled searches also make sense, but my only issue is the need to refresh panels every 30 seconds, which causes the panels to reload. If the system is slow, this creates a window, albeit small, where the user needs to wait for data to appear on the screen. With real-time searches the user experience is seamless, as the stats refresh without reloading panels.

Is there a way to make panel refresh with scheduled searches as seamless a user experience as real-time?

lguinn2 · ‎06-12-2015

You can set an auto-refresh on dashboard panels - you will need to edit the dashboard XML. It's not hard.
But it still won't quite be realtime, as the panel refresh may not always be well-synchronized with the scheduled searches.

You could use a mix, though, with some panels real-time and some on a periodic refresh.
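
A minimal sketch of a periodically refreshing panel in Simple XML, assuming a Splunk version that supports the `<refresh>` and `<refreshType>` search elements (older versions may need a different mechanism). The index, field, and titles are hypothetical:

```xml
<dashboard>
  <label>Application health (sketch)</label>
  <row>
    <panel>
      <chart>
        <title>Errors, last 15 minutes</title>
        <search>
          <!-- hypothetical index and field; substitute your own -->
          <query>index=app_logs log_level=ERROR | timechart span=1m count</query>
          <earliest>-15m</earliest>
          <latest>now</latest>
          <!-- re-run the search one minute after the previous run completes -->
          <refresh>1m</refresh>
          <refreshType>delay</refreshType>
        </search>
      </chart>
    </panel>
  </row>
</dashboard>
```

With `refreshType` set to `delay` the countdown starts when the previous run finishes, so a slow search does not pile up behind itself; `interval` counts from when the search was dispatched instead.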

shailesh030 · ‎06-15-2015

Thanks, this helps a lot.

shailesh030 · ‎10-05-2015

We are now considering moving away from any real-time search (indexed or regular) due to performance degradation observed in the test and production regions. Given that the dashboards will be used only when the monitored application has issues, I wanted to clarify a few additional questions:

a) Can we not just leave them as inline searches with a form or panel refresh every 1 minute? Since the window is just 15 minutes and we don't need these searches for any reports (just real-time monitoring), we don't need acceleration, summary indexing, etc.
b) If they are left as inline searches, would the large number of events in a 15-minute interval have any impact?
