Alerting

How to do application-specific monitoring and alerting

palc
New Member

Hi, we are an energy trading company and we need to monitor particular services and processes belonging to one of our applications. The application runs on Windows 2008 boxes. Apart from monitoring the application's services, we would also like some kind of remedial action: if a service stops at night, it should be restarted automatically; if it stops during office hours, a ticket should be generated in the Remedy ticketing system, or the relevant team should be notified by email. I also want to monitor how much CPU and RAM a particular application service uses. Can you please guide me on how I can achieve this with Splunk Enterprise? We used the HPOM monitoring tool before, and there these activities were achieved using config files placed on the servers, with HPOM agents passing the information on to the HPOM for Unix central server. Looking forward to your help. Pallab

0 Karma

lmyrefelt
Builder

As a small addition to the good tips already received: as pointed out, this is too big a question, and too dependent on your infrastructure and on other components outside of Splunk, to answer in full.

But this is all doable, more or less.
A good start and source of inspiration could be:

http://www.splunk.com/goto/book, where Chapter 6 covers the most common monitoring and alerting solutions.

This will guide you on how to get the actual data into Splunk:

http://docs.splunk.com/Documentation/Splunk/6.2.1/Data/WhatSplunkcanmonitor

You could start with these apps.

To be installed on your application servers together with the Splunk forwarder:
https://apps.splunk.com/app/742/
To be installed on your central Splunk instance:
https://apps.splunk.com/app/1680/

It might be a good idea to contact Splunk professional services or a good Partner of Splunk.

Good luck!

0 Karma

aakwah
Builder

Hello,

If the logs coming from the application include the required information, such as a "service stopped" event, then we can create a scheduled search that runs, for example, every 30 minutes to check whether services or processes are down. The alert action can then run a script to start the application, or send an email to notify the application admins.

To schedule a search, run the search, choose Save As, then Alert, and then choose whether the action is an email or a script.
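If you go the "run a script" route, the script itself can decide between restarting and notifying, as asked in the question. Below is only a minimal sketch of that decision in Python. The office hours (08:00 to 18:00, weekdays only) and the function names are my own assumptions, not anything Splunk provides, and the script only builds the Windows `sc start` command rather than running it.

```python
from datetime import datetime, time

# Assumed office hours (not from the original question); adjust to your site.
OFFICE_START = time(8, 0)
OFFICE_END = time(18, 0)


def choose_action(now):
    """Return "notify" during weekday office hours, otherwise "restart"."""
    is_weekday = now.weekday() < 5  # Monday=0 ... Friday=4 (weekend handling is an assumption)
    in_office_hours = OFFICE_START <= now.time() < OFFICE_END
    return "notify" if (is_weekday and in_office_hours) else "restart"


def restart_command(service_name):
    """Build the Windows command line that starts a stopped service."""
    return ["sc", "start", service_name]


if __name__ == "__main__":
    # "MyAppService" is a placeholder service name.
    action = choose_action(datetime.now())
    if action == "restart":
        print("would run:", " ".join(restart_command("MyAppService")))
    else:
        print("would raise a ticket or send an email instead")
```

To actually restart the service you would pass the returned list to something like `subprocess.run`, under an account that is allowed to control services on that box.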

To submit a ticket, you can use a POST workflow action: go to Settings, Fields, Workflow actions, then set the link method to POST.
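A workflow action is configured in the UI, but if you would rather raise the ticket from the same alert script, the POST could look roughly like the sketch below. This is an illustration only: the URL and the JSON field names are hypothetical, and the real ones depend entirely on how your Remedy installation accepts incoming requests. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request


def build_ticket_request(url, host, service, status):
    """Build (but do not send) an HTTP POST carrying the service status as JSON."""
    # Field names here are made up for illustration; use whatever your
    # ticketing endpoint actually expects.
    payload = {"host": host, "service": service, "status": status}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical endpoint, host, and service name.
    req = build_ticket_request(
        "https://ticketing.example.invalid/api/tickets",
        "apphost01",
        "MyAppService",
        "stopped",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending it would be a call to `urllib.request.urlopen(req)`, plus whatever authentication your ticketing system requires.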

Likewise, based on the logs received from your Windows systems, you can create searches and alerts that trigger when a certain threshold is reached, and then choose whether the action sends an email or runs a script.

Regards

0 Karma

dolivasoh
Contributor

This question is pretty open-ended. There could be thousands of ways to achieve this in Splunk. Have you sought out professional services to assist in your deployment?

I'd say start with the Windows Infrastructure app and go from there. Otherwise, your use case needs to be researched and planned thoroughly.

0 Karma