Hi All,
I am trying to list all the universal forwarders that are currently disconnected from the deployment server.
What are the possible ways to find out?
I can see the status in the DS UI under Settings > Forwarder Management > Forwarders tab.
It shows a count of around 1600 UFs, 21 of them offline.
I am trying to use a REST query, which lists all the UFs (hosts), but it does not have a status field I can use to filter out the 21 that show as offline under the Forwarders tab of Forwarder Management.
| rest /services/deployment/server/clients
Again, is this the correct approach?
Please help!
PS: I am on Splunk Enterprise (on-prem)
@manchou0709 - Yes, using the REST endpoint /services/deployment/server/clients is the correct approach to list all deployment clients, but it does not expose a direct “status” field like the Forwarder Management UI.
To identify disconnected forwarders, calculate the difference between lastPhoneHomeTime and the current time, then apply a threshold that reflects the expected phone‑home interval (default 60 seconds). In practice, this means adding an eval to your REST query that gets lastPhoneHomeTime as epoch time, computes the age in minutes, and keeps only the clients whose age exceeds the threshold. This approximates the offline count shown in the Forwarder Management UI and is a practical way to identify UFs that are not connected to the Deployment Server.
Try the SPL query below. It lists forwarders that haven’t checked in for more than 10 minutes; you can adjust the threshold.
| rest /services/deployment/server/clients
| eval lastseen=coalesce(tonumber(lastPhoneHomeTime), strptime(lastPhoneHomeTime,"%Y-%m-%dT%H:%M:%S"))
| eval ageInMinutes=(now()-lastseen)/60
| where ageInMinutes > 10
Hope this helps.
Hi,
I am working on a project where we have multiple Splunk on-prem instances (3-4).
I am trying to get a list of all the forwarders that are currently disconnected from the Deployment Server. I cannot log in to the backend, and for one of the Splunk instances the Forwarder Management UI of the Deployment Server doesn't show any details (maybe because it is acting as a load balancer).
So I used the query below to list all the active forwarders -
index=_internal source=metrics.log splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*) group=tcpin_connections fwdType=uf
| stats count by hostname
| fields hostname
and, to list all the forwarders connected to the DS -
index=_internal sourcetype=splunkd component=PubSubSvr "*handshake*" OR "*reply*" splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*)
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)"
| stats count by DeploymentClient
| rename DeploymentClient as host
| fields - count
I have tried multiple ways, such as search A | search NOT [search B], a left join, and so on. Nothing is working.
Could anyone please help me with a query that gives me the list of forwarders that are disconnected from the DS?
You asked pretty much the same question a few days ago. I'm merging those threads.
Hi @manchou0709
How about this? It excludes the forwarders that are seen connecting to the DS.
index=_internal source=metrics.log splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*) group=tcpin_connections fwdType=uf NOT
[ search index=_internal sourcetype=splunkd component=PubSubSvr "*handshake*" OR "*reply*" splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*)
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)"
| stats count by DeploymentClient
| rename DeploymentClient as hostname
| fields hostname ]
| stats count by hostname
| fields hostname
@livehybrid It's not working; it returns 0 events.
I am trying to list those forwarders which are disconnected from DS
@kknairr I am using similar logic, where I am -
1. Calculating the last time a forwarder phoned home/checked in
2. Converting the time since it was last seen into hours
3. Filtering for forwarders that haven’t contacted the DS for 10+ hours → might be offline, blocked, or misconfigured.
| rest /services/deployment/server/clients
| eval lastSeenHours = (now() - lastPhoneHomeTime) / 3600
| where lastSeenHours > 10
| eval lastPhoneHomeReadable = strftime(lastPhoneHomeTime, "%Y-%m-%d %H:%M:%S")
| table hostname ip lastPhoneHomeReadable lastSeenHours
| sort - lastSeenHours
One doubt, though: the hosts in this output show as agent: in error under the Forwarders tab of Forwarder Management, not as agent: offline.
Also, the count for agent: offline keeps changing.
Any thoughts on this?
@manchou0709 The query looks good. The behavior you are seeing comes down to how Splunk’s Forwarder Management UI calculates and labels status versus what you derive in SPL. The REST endpoint gives you raw fields like lastPhoneHomeTime, but it doesn’t expose the "agent: offline" flag. The UI applies its own heartbeat logic and categorization, so a forwarder that misses a check‑in may be marked "in error" rather than "offline", depending on timing and internal thresholds.
That’s why your SPL correctly identifies forwarders that haven’t phoned home for 10+ hours while the UI shows them under "in error", and why the offline count keeps changing as forwarders reconnect or miss intervals. In short, your query logic is fine; what you’re seeing is a difference between how the UI interprets missed check‑ins and how your SPL calculates elapsed time.
If you want closer alignment, match your threshold to the deployment server’s configured phone‑home interval, but you won’t be able to reproduce the exact "offline" label, since that is generated internally by Splunk. Hope that clarifies.
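To illustrate the point about matching the threshold to the phone‑home interval, here is a minimal sketch that buckets clients by how many intervals they have missed. It assumes lastPhoneHomeTime comes back as epoch seconds (as the query above treats it) and that the interval is the default 60 seconds. The phoneHomeSecs, missedIntervals and derivedStatus fields and the bucket boundaries are made up for illustration; they are not the UI's internal logic.

```spl
| rest splunk_server=local /services/deployment/server/clients
| eval phoneHomeSecs=60
| eval missedIntervals=floor((now() - lastPhoneHomeTime) / phoneHomeSecs)
| eval derivedStatus=case(missedIntervals < 2, "ok", missedIntervals < 600, "late", true(), "stale")
| stats count by derivedStatus
```

With a 60-second interval, 600 missed intervals corresponds to the 10-hour cutoff used earlier, so "stale" here should line up with the output of the original query.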
@kknairr
I also observed that this REST query only works when I run it in the UI of my DS, under the Search & Reporting app. However, in my client environment we have multiple (4-5) Splunk on-prem instances whose search heads can communicate with each other.
So if I run this query with -
| rest splunk_server=local /services/deployment/server/clients
on any of the search heads, it does not return any values, and without splunk_server=local it tends to pull hosts that don't belong to that particular environment.
In such a case, what is the best way to get the disconnected hosts?
@manchou0709 The REST endpoint only returns accurate client data when queried directly on the Deployment Server, since that is the authoritative source for forwarder phone‑home data. Running it on other search heads will therefore either give no results or mix in hosts from different environments. The best way to get disconnected hosts is to run the query on the DS itself, write the results to a summary index, and share that index with your other Splunk instances if you need wider visibility.
For this, add the collect command to the end of the existing query on the DS. Refer to the documentation for advanced options if you want to enhance the summary index collection.
| collect index=uf_status
Once that scheduled search is in place on the deployment server, other search heads can simply query the summary index to see the disconnected forwarders without hitting the DS REST endpoint themselves. This ensures you are always working from the authoritative source while still giving visibility across multiple Splunk environments.
Ref: collect | Splunk Enterprise (last updated 2025-07-04T01:27:46.382Z)
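Putting the pieces together, the scheduled search on the DS could look roughly like the sketch below. It assumes the uf_status index already exists and is searchable from the other search heads, and that lastPhoneHomeTime is epoch seconds. It reuses the 10-hour threshold from the earlier query; the lastPhoneHome field name is only illustrative.

```spl
| rest splunk_server=local /services/deployment/server/clients
| eval lastSeenHours=round((now() - lastPhoneHomeTime) / 3600, 2)
| where lastSeenHours > 10
| eval lastPhoneHome=strftime(lastPhoneHomeTime, "%Y-%m-%d %H:%M:%S")
| table hostname ip lastPhoneHome lastSeenHours
| collect index=uf_status
```

On the other search heads, something like this would then read the most recent entries (the -1h window is an assumption and should match how often the search is scheduled):

```spl
index=uf_status earliest=-1h
| dedup hostname
| table hostname ip lastPhoneHome lastSeenHours
```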
Exactly what I have observed. But the issue is that, for some reason, this REST query doesn't return any values for 2 of the Splunk on-prem instances when I run it in the DS UI under Search & Reporting. Also, those two don't show any forwarders phoning home under Forwarder Management (the main DS is acting as an LB, as far as I am aware).
I even tried using the query like this on the search head -
| rest /services/deployment/server/clients splunk_server="dedicated_splunk_server_for_that_instance"
but it doesn't work.
So I took a different approach.
I am querying the _internal index to get all the available forwarders -
index=_internal source=*metrics.log* group=tcpin_connections os=* fwdType=uf
| stats latest(fwdType) AS forwarder_type latest(os) AS os latest(version) AS version by hostname
and to get the hosts connected to the DS -
index=_internal sourcetype=splunkd component=PubSubSvr
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)"
| stats count by host DeploymentClient
| rename host as DeploymentServer
| fields - count
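One way to combine the two searches without a subsearch or join is to pull both event types in a single search and let stats do the comparison. This is only a sketch: it assumes the hostname field from metrics.log and the name captured from the handshake channel refer to the same value (differences in case are handled by lower(), but FQDN vs. short name would still break the match), and the fwd and seenBy field names are made up for illustration.

```spl
index=_internal ((source=*metrics.log* group=tcpin_connections fwdType=uf) OR (sourcetype=splunkd component=PubSubSvr "handshake"))
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)"
| eval seenBy=if(isnotnull(DeploymentClient), "ds", "indexer")
| eval fwd=lower(coalesce(DeploymentClient, hostname))
| stats values(seenBy) AS seenBy by fwd
| where mvcount(seenBy)=1 AND seenBy="indexer"
```

What remains are forwarders that sent data to the indexers in the search time range but never showed a handshake with the DS in that same range, so the time range should comfortably cover the phone-home interval.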
@isoutamo
What I am trying to achieve here is -
I am trying to list all the universal forwarders that are currently disconnected from, or not communicating with, their Deployment Server, using whatever approach works.
P.S. We are using Splunk on-prem instances.
I have access to the UI as well as the backend of the Deployment Server, so any approach would work for me!
Check out the lastPhoneHomeTime field returned by that endpoint. It says when each client last connected to the DS. Use that value to determine if a client is disconnected or not (you can decide how old the value needs to be).