Solved: How do you handle "near real-time" alerting in Splunk?

ekenne06 · ‎03-25-2021

We have a lot of operational data coming into Splunk, and certain conditions in that data can indicate a service impact. These conditions then trigger alert actions which update our Remedy and NMS tools.

My question is: how do other Splunkers handle this? I've typically run the alert on a cron schedule every 1 minute, looking back 1 minute. Is this the correct way to do it? I've run into some issues where events get missed because index processing takes a bit more time, and they only get picked up on the next alert run. I really want to utilize the data in Splunk to update help desk tickets, triage service outages, and all of that cool integration stuff, but I want to make sure I'm writing my alerts properly.

gcusello · ‎03-25-2021

Hi @ekenne06,

First, you need to know how long your search takes to run, and choose the time range and schedule accordingly: e.g. if it takes 2 minutes to execute, you can schedule it to run every 5 minutes.

I usually use 5 minutes as the minimum interval for my alerts, because anything shorter is useless: there's no point in scheduling an alert every minute when the reaction time is 5-10 minutes.

The time frame is related to the schedule frequency: if the alert is scheduled every 5 minutes, it searches a time frame of 5 minutes.
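One way to picture how the schedule and the time frame fit together is the sketch below. It is only an illustration in Python, not Splunk code: `search_window`, `interval_s`, and `lag_s` are hypothetical names, and the 60-second lag is an assumed allowance for indexing delay, not a recommended value. Each run is snapped to its schedule boundary and the window is shifted back by the lag, so consecutive runs cover adjacent ranges with no gap or overlap, even when a run starts a few seconds late.

```python
from datetime import datetime, timedelta, timezone


def search_window(run_time, interval_s=300, lag_s=60):
    """Return (earliest, latest) for one scheduled run.

    run_time   -- when the scheduler actually started the search
    interval_s -- schedule frequency in seconds (assumed: 5 minutes)
    lag_s      -- assumed allowance for indexing delay, in seconds
    """
    epoch = int(run_time.timestamp())
    # Snap to the schedule boundary so a late start does not shift the window.
    snapped = epoch - (epoch % interval_s)
    # Hold the window back by the lag so late-indexed events are included.
    latest = datetime.fromtimestamp(snapped - lag_s, tz=run_time.tzinfo)
    earliest = latest - timedelta(seconds=interval_s)
    return earliest, latest


if __name__ == "__main__":
    first = search_window(datetime(2021, 3, 25, 12, 5, 3, tzinfo=timezone.utc))
    second = search_window(datetime(2021, 3, 25, 12, 10, 2, tzinfo=timezone.utc))
    print(first)
    print(second)
```

With these assumed values, a run started at 12:05:03 searches 11:59:00 to 12:04:00, and the next run at 12:10:02 searches 12:04:00 to 12:09:00. In Splunk terms this is roughly what time modifiers such as `earliest=-6m@m latest=-1m@m` express for a 5-minute schedule, provided one minute of lag really is enough for your indexing delay.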

Finally (though it should probably be the first thing!), verify that your hardware resources are sufficient for the volumes you have to manage: if your alert takes too long to execute, there may be a resource problem.

In other words, every search takes a CPU and releases it only when it finishes, so:

How many CPUs do your Indexers have?
How many scheduled searches run on your Indexers?
How many users use your Splunk?

You can use the Monitoring Console to check whether your system is correctly sized.

For this reason, don't use real-time searches: a real-time search takes a CPU and never releases it.
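The CPU reasoning above can be turned into rough arithmetic. The Python sketch below is a back-of-the-envelope estimate under the simplifying assumption that one running search holds one core; `average_busy_cores` and the sample numbers are hypothetical, and real sizing should come from the Monitoring Console rather than from this calculation.

```python
def average_busy_cores(searches):
    """Estimate how many cores are busy on average.

    Each item is (runtime_s, interval_s). Use interval_s=None for a
    real-time search, which is assumed to hold its core permanently.
    """
    total = 0.0
    for runtime_s, interval_s in searches:
        if interval_s is None:
            # A real-time search never releases its core.
            total += 1.0
        else:
            # A run that overruns its interval still occupies at most one core.
            total += min(runtime_s / interval_s, 1.0)
    return total


if __name__ == "__main__":
    workload = [
        (120, 300),   # takes 2 minutes, scheduled every 5 minutes
        (30, 60),     # takes 30 seconds, scheduled every minute
        (0, None),    # one real-time search
    ]
    print(round(average_busy_cores(workload), 2))
```

In this made-up workload the three searches alone keep about 1.9 cores busy on average, with the single real-time search accounting for a full core by itself.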

Ciao.

Giuseppe


Funderburg78 · ‎03-30-2021

Well, if you want to invest the time and money, you may want to look at Splunk Data Stream Processor. It is designed to read the stream and alert even before data is fully indexed. I am sure this is NOT a cheap product. https://www.splunk.com/en_us/software/stream-processing.html

Otherwise, you want to run searches less often or send critical data to a separate indexer. This can be done by adjusting your outputs.conf on all clients except the one with the critical data. That does not work if the critical data could come from a Windows log on any PC, for example when looking for a guest account login; then Data Stream Processor it is. But if you have a specialized app sending the data, you could dedicate an indexer, even a clustered indexer, to that app.

I also strongly suggest using dedicated search heads for NRT or alerting, and probably adjusting base_max_searches or max_searches_per_cpu (see limits.conf.spec) if you have a beefy server, and/or setting the search to a high priority so it takes precedence over other searches and does not get queued.
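To see what tuning those settings changes, here is a small Python sketch of the concurrency arithmetic as I understand it from limits.conf.spec: maximum concurrent historical searches = `max_searches_per_cpu` x number of CPUs + `base_max_searches`, with the scheduler allowed a percentage of that (`max_searches_perc`). The default values used below (6, 1, and 50) and the rounding down are assumptions to verify against the spec file for your Splunk version; `search_concurrency_limits` is a hypothetical helper, not a Splunk API.

```python
def search_concurrency_limits(cpu_count, base_max_searches=6,
                              max_searches_per_cpu=1, max_searches_perc=50):
    """Return (max_concurrent_searches, max_concurrent_scheduled_searches).

    Defaults are assumed from limits.conf.spec; check your own version.
    """
    max_hist = max_searches_per_cpu * cpu_count + base_max_searches
    # The scheduler gets a percentage of the total; rounding down is assumed.
    max_scheduled = max_hist * max_searches_perc // 100
    return max_hist, max_scheduled


if __name__ == "__main__":
    print(search_concurrency_limits(16))
    print(search_concurrency_limits(32, max_searches_per_cpu=2))
```

Under these assumptions a 16-core search head allows 22 concurrent searches, of which 11 can be scheduled ones, so a handful of slow every-minute alerts can fill the scheduler's share quickly and start queuing.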

Remember to Leave some Karma if this helped you!!!

gcusello · ‎04-09-2021

Hi @ekenne06,

Good for you, see you next time!

Ciao and happy splunking.

Giuseppe

P.S.: Karma Points are appreciated 😉
