Alerting

Query/Alert to detect if a light forwarder stops reporting to deployment server

balbano
Contributor

Hey guys,

Just wondering if anyone knows the best way to keep track of your light forwarders.

The reason I ask is that one of my light forwarders stopped reporting to the deployment server for months because the service was stopped on the server side.

When the service restarted, it basically had to catch up on all the unindexed data and blew through my indexing volume license.

I want to see if there is a way to create an alert if a light forwarder has not reported to the deployment server in the past 24 hours. I assume you would somehow need to do a diff between what's currently checking in vs. yesterday, but I'm not sure.

How do you guys do it? What's the best way?

Let me know your input.

Thanks Guys.

Brian

0 Karma
1 Solution

roryab
Splunk Employee
Splunk Employee

Hi Brian

Check out the Deployment Monitor app, which is bundled with the latest version of Splunk.

Deployment Monitor App

You'll be able to configure alerting for the Forwarder Warnings on the home page of the app.


lguinn2
Legend

Try these searches to see the status of the deployment server and clients:

index=_internal sourcetype=splunkd component=DeploymentMetrics | rename scName as serverClass fqname as install_location hostname as deploymentClient | table _time deploymentClient ip serverClass appName event status reason install_location

index=_internal (component=*deploy* OR feature=*deploy* OR *serverclass*) (sourcetype=splunkd OR sourcetype=splunk_btool)  | sort host _time | table _time host log_level component message

index=_internal sourcetype=splunkd component=Metrics group=ds_connections* | rename ip as deploymentClient mgmt as mgmtPort | table _time deploymentClient mgmtPort utsname dsevent | sort deploymentClient _time

as a starting point. Then turn one or more of them into alerts...
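For instance, one way to turn the first search into a "hasn't phoned home in 24 hours" alert might look like this. This is just a sketch: it assumes the DeploymentMetrics events carry the hostname field renamed above, and the 48-hour search window and 24-hour threshold are example values you would tune yourself.

index=_internal sourcetype=splunkd component=DeploymentMetrics earliest=-48h | rename hostname as deploymentClient | stats max(_time) as lastPhoneHome by deploymentClient | where lastPhoneHome < relative_time(now(), "-24h")

Schedule it as an alert that triggers when the number of results is greater than 0.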

lguinn2
Legend

Here is a search that I use and like (I think I got it from the Deployment Monitor, actually):

index="_internal" source="*metrics.log" group=tcpin_connections | 
eval sourceHost=if(isnull(hostname), sourceHost,hostname) | 
eval connectionType=case(fwdType=="uf","Universal Forwarder", fwdType=="lwf", "Light Weight Forwarder",fwdType=="full", "Splunk Indexer", connectionType=="cooked" or connectionType=="cookedSSL","Splunk Forwarder", connectionType=="raw" or connectionType=="rawSSL","Legacy Forwarder") | 
eval build=if(isnull(build),"n/a",build) | 
eval version=if(isnull(version),"pre 4.2",version) | 
eval guid=if(isnull(guid),sourceHost,guid) | 
eval os=if(isnull(os),"n/a",os)| 
eval arch=if(isnull(arch),"n/a",arch) | 
eval my_splunk_server = splunk_server | 
fields connectionType sourceIp sourceHost sourcePort destPort kb tcp_eps tcp_Kprocessed tcp_KBps my_splunk_server build version os arch | 
eval lastReceived = if(kb>0, _time, null()) | 
stats first(sourceIp) as sourceIp first(connectionType) as connectionType first(sourcePort) as sourcePort first(build) as build first(version) as version first(os) as os first(arch) as arch max(_time) as lastConnected max(lastReceived) as lastReceived sum(kb) as kb avg(tcp_eps) as avg_eps by sourceHost | 
stats first(sourceIp) as sourceIp first(connectionType) as connectionType first(sourcePort) as sourcePort first(build) as build first(version) as version first(os) as os first(arch) as arch max(lastConnected) as lastConnected max(lastReceived) as lastReceived first(kb) as KB first(avg_eps) as eps by sourceHost | 
addinfo | 
eval status = if(isnull(KB) or lastConnected<(info_max_time-900),"missing",if(lastConnected>(lastReceived+300) or KB==0,"quiet","active")) |
 sort sourceHost

If you want to turn it into an alert, you could add the following at the end:

 | where lastConnected < relative_time(now(), "-4h")

Then create an alert that triggers when the number of results is greater than 0, and you will be alerted any time a forwarder has not connected in the last 4 hours. Or you could test the value of the status field, or look at lastReceived instead of lastConnected... There are lots of choices.
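As another sketch, here is what alerting on the computed status field could look like, appended to the end of the search above. Which statuses to flag is your call; "missing" and "quiet" here are just examples.

 | where status="missing" OR status="quiet"

Again, schedule the search and trigger the alert when the number of results is greater than 0.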

balbano
Contributor

I ended up going the Deployment Monitor route, but this was good info. Thanks again lguinn!!

0 Karma

sureshkandi
Explorer

Thanks lguinn, this helps me check whether a host is up and running.

0 Karma

lguinn2
Legend

Oops - this search is checking for when the forwarder connects to the indexer, not to the deployment server. But that is also worth checking.

0 Karma


balbano
Contributor

This is exactly what I needed plus more. Thanks roryab!

0 Karma