Deployment Architecture

List universal forwarders disconnected from the Deployment Server

manchou0709
Explorer

Hi All,

I am trying to list all the universal forwarders which are currently disconnected from the deployment server.
What are the possible ways to find this out?

In the DS UI, under Settings > Forwarder Management > forwarder tab, it shows a count of around 1600 UFs, with 21 as offline.

I am trying to use a REST query, which lists all the UFs (hosts), but it does not have a status field that I can use to filter out those 21 that show as offline under the forwarder tab of Forwarder Management.

| rest /services/deployment/server/clients

 

Also, is this the correct approach?

Please help !!!

PS: I am on Splunk Enterprise (on-prem).

1 Solution

kknairr
Contributor

@manchou0709 - Yes, using the REST endpoint /services/deployment/server/clients is the correct approach to list all deployment clients, but it does not expose a direct “status” field like the Forwarder Management UI. 

To identify disconnected forwarders, you need to calculate the difference between lastPhoneHomeTime and the current time, then apply a threshold that reflects the expected phone-home interval (the default is 60 seconds). In practice, this means enriching your REST query with an eval to convert lastPhoneHomeTime into epoch time, computing the age in minutes, and then filtering for those that exceed the threshold. This reproduces the offline count shown in the Forwarder Management UI and is a better way to identify UFs that are not connected to the Deployment Server.

Try the SPL query below (you can adjust the threshold); it lists forwarders that haven't checked in for more than 10 minutes:

 

| rest /services/deployment/server/clients 
| eval lastseen=strptime(lastPhoneHomeTime,"%Y-%m-%dT%H:%M:%S")
| eval ageInMinutes=(now()-lastseen)/60
| where ageInMinutes > 10
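For comparison, here is the same threshold check in plain Python (just a sketch; `clients` stands in for the parsed REST rows, with `lastPhoneHomeTime` already converted to epoch seconds, as the eval above does):

```python
import time

PHONE_HOME_THRESHOLD_MIN = 10  # same 10-minute cutoff as the SPL above


def disconnected_clients(clients, now=None, threshold_min=PHONE_HOME_THRESHOLD_MIN):
    """Return clients whose last phone-home is older than threshold_min minutes.

    Each client is a dict with 'hostname' and 'lastPhoneHomeTime' (epoch
    seconds), mirroring the fields used in the SPL above.
    """
    now = time.time() if now is None else now
    stale = []
    for c in clients:
        age_min = (now - c["lastPhoneHomeTime"]) / 60
        if age_min > threshold_min:
            stale.append({**c, "ageInMinutes": age_min})
    return stale
```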

Hope this helps.

If this post addressed your question, you can:

  • Give it karma to show appreciation 👍
  • Mark it as the solution if it solved your issue ✔️
  • Add a comment if you’d like more details ✏️

Acknowledging helpful answers keeps the community strong and motivates contributors to continue sharing their expertise.


manchou0709
Explorer

Hi,

I am working on a project where we have multiple Splunk on-prem instances (mostly 3-4).
I am trying to get a list of all the forwarders which are currently disconnected from the Deployment Server. I cannot log in to the backend, and the Deployment Server's Forwarder Management UI doesn't show any details for one of the Splunk instances (maybe because it is acting as a load balancer).

So I used the below query to list all the active forwarders -

index=_internal source=metrics.log splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*) group=tcpin_connections fwdType=uf
| stats count by hostname
| fields hostname


and, to list all the forwarders connected to the DS:

index=_internal sourcetype=splunkd component=PubSubSvr "*handshake*" OR "*reply*" splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*)
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)"
| stats count by DeploymentClient
| rename DeploymentClient as host
| fields - count

  

I have tried multiple ways: search A NOT [search B], left join, and so on. Nothing is working.

Could anyone please help me with a query that will give me the list of forwarders which are disconnected from the DS?


PickleRick
SplunkTrust
SplunkTrust

You asked pretty much the same question a few days ago. I'm merging those threads.


livehybrid
SplunkTrust
SplunkTrust

Hi @manchou0709 

How about this? This will exclude those which are seen to be connecting.

index=_internal source=metrics.log splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*) group=tcpin_connections fwdType=uf NOT 
    [ search index=_internal sourcetype=splunkd component=PubSubSvr "*handshake*" OR "*reply*" splunk_server IN (ncepnspidxdb*, ncwpnspidxdb*) 
    | rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)" 
    | stats count by DeploymentClient 
    | rename DeploymentClient as hostname 
    | fields hostname ] 
| stats count by hostname 
| fields hostname
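The idea behind the NOT subsearch is a plain set difference: every host seen in tcpin_connections minus every host that appears in a DS handshake reply. A minimal sketch of that logic (hypothetical host lists; the lower-casing guards against hostname capitalisation mismatches, which can silently empty the result):

```python
def forwarders_not_phoning_home(all_forwarders, ds_clients):
    """Set difference: hosts seen sending data (tcpin_connections)
    that never show up in the DS handshake replies."""
    # Compare case-insensitively: hostnames often differ in case between logs.
    ds = {h.lower() for h in ds_clients}
    return sorted(h for h in all_forwarders if h.lower() not in ds)
```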

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing


manchou0709
Explorer

@livehybrid not working, it's giving 0 events.

I am trying to list those forwarders which are disconnected from the DS.


manchou0709
Explorer

@kknairr I am using similar logic, where I am:

1. Calculating the last time a forwarder has phoned home/checked in
2. Converting the last-seen time into hours
3. Filtering for forwarders that haven’t contacted the DS for 10+ hours → might be offline, blocked, or misconfigured

| rest /services/deployment/server/clients
| eval lastSeenHours = (now() - lastPhoneHomeTime) / 3600
| where lastSeenHours > 10
| eval lastPhoneHomeReadable = strftime(lastPhoneHomeTime, "%Y-%m-%d %H:%M:%S")
| table hostname ip lastPhoneHomeReadable lastSeenHours
| sort - lastSeenHours



But one doubt here: the hosts in this output show as "agent: in error" under the forwarder tab of Forwarder Management, not "agent: offline".

Also, the count for "agent: offline" keeps changing.

Any thoughts on it?

kknairr
Contributor

@manchou0709 Query looks good. The behavior you are seeing comes down to how Splunk’s Forwarder Management UI calculates and labels status versus what you are deriving in SPL. The REST endpoint gives you raw fields like lastPhoneHomeTime, but it doesn’t expose the same "agent: offline" flag. The UI applies its own heartbeat logic and categorization, so when a forwarder misses a check-in it may be marked as "in error" rather than "offline", depending on timing and internal thresholds.

That’s why your SPL correctly identifies forwarders that haven’t phoned home for 10+ hours, while the UI shows them under "in error", and the offline count keeps changing as forwarders reconnect or miss intervals. In short, your query logic is fine; what you’re seeing is simply a difference in how the UI interprets missed check-ins versus how your SPL calculates elapsed time.

If you want closer alignment, you might match your threshold to the deployment server’s configured phone-home interval, but you won’t be able to reproduce the exact "offline" label since that’s generated internally by Splunk. Hope it clarifies.


manchou0709
Explorer

@kknairr 

I also observed that this REST query only works when I run it in the UI of my DS, under the Search & Reporting app. However, in my client environment we have multiple (4-5) Splunk on-prem instances whose search heads can communicate with each other.

So if I run this query with -

| rest splunk_server=local /services/deployment/server/clients

on any of the search heads, it doesn't return any value, and without splunk_server=local it tends to pull hosts that don't belong to that particular environment.

In such a case, what is the best way I could get the disconnected hosts?


kknairr
Contributor

@manchou0709 The REST endpoint only returns accurate client data when queried directly on the Deployment Server, since that is the authoritative source for forwarder phone-home data. So, running it on other search heads will either give no results or mix in hosts from different environments. The best way to get disconnected hosts is to run the query on the DS itself and then write those results into a summary index that can be shared across your other Splunk instances if you need wider visibility.

Use the collect command for this purpose at the end of the existing query on the DS. Refer to the documentation for advanced options if you want to enhance the summary-index collection.

| collect index=uf_status

Once that scheduled search is in place on the deployment server, other search heads can simply query the summary index to see the disconnected forwarders without hitting the DS REST endpoint themselves. This ensures you are always working from the authoritative source while still giving visibility across multiple Splunk environments.

Ref: collect | Splunk Enterprise documentation
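If you'd rather pull this off-box instead of relying on search heads, the same endpoint can also be fetched directly over the DS management port (8089 by default) with output_mode=json and post-processed yourself. A sketch of the parsing step, assuming the standard Splunk REST envelope (an `entry` list with the per-client fields under `content`):

```python
def parse_ds_clients(rest_json):
    """Extract (hostname, lastPhoneHomeTime) pairs from the JSON body of
    GET https://<ds>:8089/services/deployment/server/clients?output_mode=json.

    Assumes the usual Splunk REST envelope: an 'entry' list whose items
    carry the per-client fields under 'content'.
    """
    return [
        (e["content"].get("hostname"), e["content"].get("lastPhoneHomeTime"))
        for e in rest_json.get("entry", [])
    ]
```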


manchou0709
Explorer

  @kknairr 


Exactly what I have observed. But the issue is that, for some reason, this REST query doesn't return any value for 2 of the Splunk on-prem instances when I run it in the DS UI under Search & Reporting. Also, those two don't show any forwarders phoning home under Forwarder Management (the main DS is acting as an LB, as far as I am aware).
I even tried using the query like this on a search head:

| rest /services/deployment/server/clients splunk_server="dedicated_splunk_server_for_that_instance"


but it doesn't work.
So, I took a different approach.

I am using an internal index query to get all the available forwarders:

index=_internal source=*metrics.log* group=tcpin_connections os=* fwdType=uf
| stats latest(fwdType) AS forwarder_type latest(os) AS os latest(version) AS version by hostname

and to get the hosts connected to the DS:

index=_internal sourcetype=splunkd component=PubSubSvr 
| rex "\/handshake\/reply\/(?P<DeploymentClient>[^\/]+)" 
| stats count by host DeploymentClient 
| rename host as DeploymentServer 
| fields - count

 


isoutamo
SplunkTrust
SplunkTrust
Can we go one step back, and you tell us what you are trying to achieve? I expect that you are asking this detail-level question because you already have an idea of how to solve your real issue?

manchou0709
Explorer

@isoutamo 
What I am trying to achieve here is:
I am trying to list all the universal forwarders which are currently disconnected from, or not communicating with, their Deployment Server, using whichever way works.
P.S. We are using Splunk on-prem instances.

I have access to the UI as well as backend of the Deployment server. So any approach would work for me!!


isoutamo
SplunkTrust
SplunkTrust
I got this already from your original question, but I am trying to understand why you want to know it. You probably have an issue which you are trying to solve with this information?
E.g. your data collection didn’t work, your UF configurations didn’t update as expected ….


richgalloway
SplunkTrust
SplunkTrust

Check out the lastPhoneHomeTime field returned by that endpoint.  It says when each client last connected to the DS.  Use that value to determine if a client is disconnected or not (you can decide how old the value needs to be).

---
If this reply helps you, Karma would be appreciated.