Hi All,
I am a bit new to Splunk. In my current project there are 69,000+ universal forwarders. I need to run a test and extract a report of all the universal forwarders that are not connecting to the Deployment Server.
P.S. I don't have access to these universal forwarders, as they are on the client's side.
Could someone help me out with an SPL query, or any other way I can check for this?
Thanks in advance
OK. Do you know how UFs and their deployment works?
You don't connect to the UFs. They connect to your DS and indexers/HFs.
You don't push config from DS. They pull it.
So the only sources of knowledge about the existence of UFs are either their historical logs stored in _internal (the easiest way to check would be
| metadata type=hosts index=_internal
(it will contain _all_ Splunk-related hosts, not just UFs, mind you))
or the forwarder inventory database built in Monitoring Console.
Other than that, you have no way of knowing how many forwarders the customer deployed and which of those are not able to reach your Splunk infrastructure. Where would that information come from?
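To turn the metadata output into a "who went quiet" list, something along these lines might work. It is only a sketch: it relies on the lastTime field that metadata returns (timestamp of the latest event seen for the host), and the 24-hour threshold is an arbitrary assumption, not something Splunk defines.

```
| metadata type=hosts index=_internal
| eval assumed_threshold_hours=24
| eval hours_since_last_event=round((now() - lastTime) / 3600, 1)
| where hours_since_last_event >= assumed_threshold_hours
| eval last_seen=strftime(lastTime, "%Y-%m-%d %H:%M:%S")
| table host last_seen hours_since_last_event
| sort 0 - hours_since_last_event
```

It still lists every Splunk host that ever wrote to _internal, so HFs, indexers and search heads have to be filtered out by hand.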
To get to this "forwarder inventory database built in Monitoring Console",
are these the correct steps?
Yup.
If you want the contents of this database, you can look into the lookup
| inputlookup dmc_forwarder_assets
But remember that it contains all forwarders which have connected to your environment since the database was last cleared. So it might list nodes which were decommissioned ages ago, reinstalled and whatnot.
@PickleRick
I am still not able to figure this out.
I am trying to get a list of all the UFs which are disconnected or not communicating with the Deployment Server.
I am breaking it down this way:
1. List all the existing forwarders for my instance, and
2. List all the forwarders which are establishing a connection with the DS
Hence, Missing/Disconnected UFs = All - (active UFs)
In my Monitoring Console, under Forwarder > Deployment, I see data which shows all the active and missing forwarders. It pulls its data from the lookup "| inputlookup dmc_forwarder_assets", like you said earlier.
So, to get all the forwarders which are in my environment and are active, I am using the query below:
| inputlookup dmc_forwarder_assets
| search forwarder_type="uf" AND status="active"
| dedup hostname
| table hostname
And to get the list of forwarders connecting to the DS, I am using the query below:
index=_internal sourcetype=splunkd component=DC:HandshakeReplyHandler
| dedup host
| table host
In my instance, under the internal logs, only these values of the component field seem relevant:
DC:HandshakeReplyHandler, DC:DeploymentClient, DC:PhonehomeThread and DS_DC_Common
So, as I understand it, I used DC:HandshakeReplyHandler because it gives the message "Handshake done" (which basically means that the UF was able to establish a connection with the DS).
Am I going in the right direction?
Please reply
You might be overthinking the second part a bit. I'd consider a forwarder "active" if it's able to properly send data. If it sends data, it also sends its internal logs. So you can just do
| tstats count where index=_internal by host
One small thing about the first search - the
| dedup hostname
| table hostname
is suboptimal.
OK, it's only working on the contents of a single lookup, so the overall data size isn't that big. But as a general rule, instead of this you should just do
| stats values(hostname) as hostname
optionally followed by
| mvexpand hostname
Firstly, dedup is usually best avoided since it almost always behaves differently from what you intended (unless you're the one percent that really knows what they're doing ;-)).
And secondly, it's centralized and fairly resource-intensive. Stats can use map-reduce, so it only fetches the partial results from the indexers, not the whole event stream.
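Applied to the lookup search from earlier in the thread, the suggestion could look roughly like this. It is a sketch that reuses the forwarder_type, status and hostname fields already used above. It swaps in "stats count by hostname" for values() plus mvexpand, which yields one row per distinct hostname directly and avoids building one very large multivalue field when there are tens of thousands of forwarders.

```
| inputlookup dmc_forwarder_assets
| search forwarder_type="uf" status="active"
| stats count by hostname
| fields hostname
```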
@PickleRick
We have HF as well in my instance, so if I do -
| tstats count where index=_internal by host
| dedup hostname | table hostname
Is it not going to return HF as well?
Well. Yes, it will (except for the fact that the second line of the search is completely unnecessary and will actually make Splunk return no events at all since you have no field named "hostname" after tstats ;-)). This is the way to report _all_ Splunk hosts and forwarders. Usually the easiest trick is to run that and then manually remove known HFs, indexers, SHs and such.
It's quick and dirty but often just gets the job done.
If you want a more sophisticated approach, go for the dmc_forwarder_assets lookup.
BTW, the search based on the DC:Handshake component will only list forwarders that use the DS. If you have forwarders which are not DS-managed, they will not be listed.
As usual - Splunk gives you tons of flexibility but the price is that sometimes getting some info about your Splunk installation can be a bit complicated because of several different ways things can be done or connected.
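For the "all minus active" idea discussed above, the two sources can be combined in one search. The following is a rough sketch under several assumptions: the 24-hour window is arbitrary, "active" means "sent anything to _internal in that window", and the hostname in the lookup is assumed to match the host value in _internal once lowercased (short name vs. FQDN differences would break the match). Appending the lookup to the tstats results, rather than using a subsearch, sidesteps subsearch row limits, which matters at 69,000+ forwarders.

```
| tstats count where index=_internal earliest=-24h by host
| eval host=lower(host), reporting=1
| inputlookup append=true dmc_forwarder_assets
| eval host=coalesce(host, lower(hostname))
| stats max(reporting) as reporting values(forwarder_type) as forwarder_type by host
| search forwarder_type="uf" NOT reporting=*
| table host
```

What remains are hosts that the lookup knows as UFs but that sent no internal logs in the window. Decommissioned or renamed machines will show up here too, for the reason given about the lookup earlier.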
Hi @PickleRick
I was able to get the correct working query.
index=_internal sourcetype=splunkd component=DC:DeploymentClient splunk_server=* "*phonehome*" OR "*handshake*" OR "not_connected"
| stats latest(_time) AS lastTime BY host
| eval age=round((now() - lastTime)/3600, 1)
| where age >=24
| eval lastTime=strftime(lastTime, "%b %d, %Y %H:%M:%S")
| table host age lastTime
| sort 0 - age
Apparently I was overthinking it 😅
One small but important detail.
If you're writing a search like this and you can avoid it, don't ever use terms like "*phonehome*". Splunk really hates wildcards at the beginning of a search term, because it then has to scan every single event to find out whether it matches the term; it cannot use the lexicon.
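A variant of the search above without leading wildcards might look like this. Treat it as a sketch: it assumes that "phonehome" and "handshake" appear at the start of an indexed token in these events (for example after a space or a "/"), which is worth verifying against your own data before relying on it. Trailing wildcards are fine because they can still be resolved against the lexicon.

```
index=_internal sourcetype=splunkd component=DC:DeploymentClient (phonehome* OR handshake* OR not_connected)
| stats latest(_time) AS lastTime BY host
| eval age=round((now() - lastTime)/3600, 1)
| where age >= 24
| eval lastTime=strftime(lastTime, "%b %d, %Y %H:%M:%S")
| table host age lastTime
| sort 0 - age
```

The parentheses do not change the logic, they just make the grouping of the OR terms explicit.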
Understood!
Just one last confirmation. For one host, I observe that every 12 seconds it tries to establish a handshake with the Deployment Server, and finally it says "handshake done".
Does that mean it was finally able to establish a connection with the DS?