Splunk Enterprise

Is Splunk the best tool for monitoring and alerting?

chanduira
Explorer

We have developed Data Center Management (IT core infrastructure: racks, UPS, etc.) apps on top of Splunk. They collect metrics such as temperature from more than 1000 endpoints. This works perfectly well for all kinds of analysis, but alerting gives us trouble.

Most alerts here should be real-time: if any device's temperature exceeds the allowed limit, an alert should be generated. But running real-time searches across these devices carries a heavy performance penalty. Most of the time Splunk reaches its real-time search limit and alerts are not generated.

Now my question: is Splunk really a good solution for alerting use cases, or do we need to use some other tool for such a use case?

0 Karma

pkeller
Contributor

As others point out here, instead of running real-time searches, run the search every five minutes, looking back at the events over the last five minutes. That way the search isn't constantly eating resources, and you still get your alerts within 5-6 minutes of the trapped event.
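To make the scheduled-window idea concrete, here is a minimal Python sketch of the logic only, not of Splunk itself. Each run looks at events from the previous five minutes and keeps the ones above a limit. The 35.0 limit, the device names, and the `breaches_in_window` helper are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical values: the 5-minute window comes from the post above;
# the 35.0 limit and the device names are made up for illustration.
WINDOW = timedelta(minutes=5)
LIMIT_C = 35.0

def breaches_in_window(events, run_time, window=WINDOW, limit=LIMIT_C):
    """Return (device, temperature) for events in [run_time - window, run_time) above limit."""
    start = run_time - window
    return [
        (device, temp)
        for ts, device, temp in events
        if start <= ts < run_time and temp > limit
    ]

run_time = datetime(2024, 1, 1, 12, 5)
events = [
    (datetime(2024, 1, 1, 11, 59), "rack-01", 41.0),  # before the window, covered by the previous run
    (datetime(2024, 1, 1, 12, 1), "rack-02", 39.5),   # inside the window, above the limit
    (datetime(2024, 1, 1, 12, 3), "ups-07", 30.0),    # inside the window, below the limit
]
print(breaches_in_window(events, run_time))
```

Because consecutive windows touch but do not overlap, each event is evaluated by exactly one run. An event that arrives just after a run waits roughly one full interval plus the search's own run time, which is where the 5-6 minute figure comes from.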

0 Karma

martin_mueller
SplunkTrust

Consolidate your queries like this:

Create a lookup file that maps device -> owner's email, and add that to your data as a field such as recipient. Group your alerts by recipient. Set the email To: field to $result.recipient$ and send one email per event.
http://docs.splunk.com/Documentation/Splunk/6.5.3/Alert/EmailNotificationTokens
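The lookup-and-group step can be sketched outside Splunk in a few lines of Python. This only illustrates the idea: one pass over all breaching devices, with the owner resolved from a device-to-email mapping. In Splunk the mapping would be the lookup file described above. The addresses, device names, and the `group_by_recipient` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup: device -> owner's email (a lookup file in Splunk).
DEVICE_OWNER = {
    "rack-01": "dc1-owner@example.com",
    "ups-07": "dc1-owner@example.com",
    "rack-22": "dc2-owner@example.com",
}

def group_by_recipient(breaches, lookup=DEVICE_OWNER):
    """Group (device, temperature) breaches by the owner's email address."""
    grouped = defaultdict(list)
    for device, temp in breaches:
        recipient = lookup.get(device)
        if recipient is None:
            continue  # no known owner; a real setup would need a fallback address
        grouped[recipient].append((device, temp))
    return dict(grouped)

breaches = [("rack-01", 41.0), ("rack-22", 38.2), ("ups-07", 44.5)]
for recipient, items in group_by_recipient(breaches).items():
    print(recipient, items)
```

A single search over all devices can therefore still route each alert to the right data center owner, so separate owners do not require separate searches.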

As for real-time, while temperature rise may be critical, do check how fast people actually react to such an email. Set the alert schedule accordingly... for example, if people take 10 minutes to react, it'll be okay to send alerts every five minutes. If people take three minutes to react, send alerts every minute... or other timings based on your environment.

martin_mueller
SplunkTrust

What happens when an alert is generated? Is there some kind of automated response? Are humans getting an email? How fast do those humans react?
Usually, you don't actually need real-time alerting... especially when you're alerting humans with human-like reaction times.

Additionally, you can usually consolidate alerts for all temperature sensors into one alert and thereby significantly cut down processing overhead. Do describe your current alerting and post as many details as you can.

chanduira
Explorer

Thanks for the response.

Alerts are sent as email to users.

Alerts here need to be real-time because a temperature rise is a critical event.

The 1000 devices are spread across 15 data centers, and each data center has a different owner. I need to send alerts for each device to its respective owner, so consolidating the queries won't work.

0 Karma

lycollicott
Motivator

@martin_mueller's suggestion for using a lookup is a great idea and it will allow you to consolidate.
Both he and @brreeves_splunk were correct when they explained the potential problems with real-time alerts. If you really do run it as a real-time search, then consider how many emails it could trigger in just a few seconds:

  • If you are emailing a text message, for example, then the recipients' phones could ring constantly.
  • Can your email system handle a barrage of outgoing emails?
  • Is it really going to help anyone to send so many emails?

Just because a manager says that he wants you to alert in real-time doesn't mean it's a good idea. 🙂

0 Karma

brreeves_splunk
Splunk Employee

What about running every minute, or every 30 seconds? Real-time has the potential to run more than once per second, so even running every 10 seconds would cut it down.
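As a rough back-of-the-envelope comparison, the snippet below counts how many evaluations per hour each schedule implies. It assumes one evaluation per second as a stand-in for real-time (the post says it can be more often), and the `runs_per_hour` helper is hypothetical.

```python
SECONDS_PER_HOUR = 3600

def runs_per_hour(interval_seconds):
    """Number of evaluations per hour for a fixed interval in seconds."""
    return SECONDS_PER_HOUR // interval_seconds

# "1 s" is an assumed stand-in for real-time; the other intervals come from the thread.
for label, interval in [("1 s", 1), ("10 s", 10), ("30 s", 30), ("1 min", 60), ("5 min", 300)]:
    print(f"every {label}: {runs_per_hour(interval)} evaluations per hour")
```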

The best REAL TIME alerts are generated by the temperature sensor itself. So you'd need 1000 new temperature monitors.

0 Karma