Deployment Server Log Entries Stopped

oreoshake
Communicator

Somewhat odd behavior: my deployment server stopped generating log entries. I've been using the searches from http://answers.splunk.com/questions/2038/whats-the-best-way-to-monitor-deployment-activity to monitor activity, but the entries have pretty much stopped. Specifically, these searches haven't returned anything for the past week or so:

index="_internal" sourcetype="splunkd" source=*metrics.log Component="DeploymentMetrics"

index=_internal sourcetype=splunkd Component="DeployedApplication"

source="/opt/splunk/var/log/splunk/splunkd_access.log" index=_internal /services/broker/phonehome | rex "/services/broker/phonehome/connection_[\\d|\\.]*_\\d*_(?<src>.*)_(?<client>.*)_" | dedup client

Yet my deployment server still works: I can see new apps being deployed, the forwarders are restarting, etc. These searches were helpful because we have a forwarder deployed on a machine that I cannot access. I can see via the REST interface that the apps that should have been deployed have not been deployed, so I want to make sure it can at least phone home.

Any ideas why my logs went silent?

ampledata
Splunk Employee

Starting in Splunk 4.1, a normal (non-lightweight/LWF) forwarder will not forward events from indexes whose names begin with _ (such as _internal), except for _audit. As such, 'Installing app' events, like those cited above, will not get forwarded to your indexer. This behavior comes from the filters in $SPLUNK_HOME/etc/system/default/outputs.conf (NOTE: Please do not edit files in the /default/ directory).

To override this behavior, add these lines to $SPLUNK_HOME/etc/system/local/outputs.conf:

[tcpout] 
forwardedindex.filter.disable = true 

Lowell
Super Champion

Assuming you're running 4.0, please try these two searches. I have a support case open on the issue of deployment metrics disappearing randomly, and I'm wondering if you are having the same issue.

Search 1: Looking at deployment metrics (reported by the deployment-server)

index="_internal" sourcetype="splunkd" DeploymentMetrics "event=install" "status=ok" | timechart count by appName

Search 2: Looking at deployment installs (reported by the deployment-client)

index="_internal" sourcetype=splunkd DeployedApplication "Installing app:" | rex "app: (?<appName>[\w._-]+?) to location: (?<location>.*)$" | timechart count by appName

I'm curious to see if search 1 works, but search 2 does not. (Note that I have all my _internal events forwarded to a central server, which is our deployment server; so I can run these searches on a single splunk instance.)
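The rex pattern in search 2 can be sanity-checked outside Splunk. Below is a minimal Python sketch, assuming the install message has the shape the pattern implies ("Installing app: <name> to location: <path>"). The sample line and the helper name parse_install are hypothetical, built from the pattern rather than copied from a real splunkd.log. Note that Python's re module spells named groups (?P<name>...), whereas rex uses (?<name>...).

```python
import re

# Same pattern as the rex in search 2, with Python's (?P<name>...) group syntax.
INSTALL_RE = re.compile(
    r"app: (?P<appName>[\w._-]+?) to location: (?P<location>.*)$"
)

def parse_install(line):
    """Return (appName, location) for an install message, or None if it does not match."""
    m = INSTALL_RE.search(line)
    return (m.group("appName"), m.group("location")) if m else None

# Hypothetical message text; a real splunkd.log line may carry extra leading fields.
sample = "DeployedApplication - Installing app: my_app to location: /opt/splunk/etc/apps/my_app"
print(parse_install(sample))
```

If the function returns None for lines taken from your own splunkd.log, the message format probably differs from what the pattern assumes, and search 2 would come back empty for that reason rather than because events are missing.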


Update: oreoshake, since you've been able to confirm that you are also seeing a discrepancy between the deployment server's metrics and the deployment client install logs, would you be willing to provide some sample logs to Splunk support? (Pretty sure that if multiple people report the same issue, there's a better chance it can be found and corrected.)

My open Splunk support case is #43023 (Accuracy of DeploymentMetrics, ref:00D49oyL.5004AS90Y:ref). You should be able to include that info so they can tie the two issues together. Support is looking for sample splunkd.log files (and probably also the metrics.log files) from both a deployment client and server. Right now I don't have any example deployments in my logs, since these get rotated so quickly and it's been a few days since my last deploy.

Lowell
Super Champion

Chris, I'm not sure what you're asking exactly. I don't see any outputs.conf entries in any of my unix app folders. (I'm assuming Neil=oreoshake?) Is there an upgrade that I should be trying? I'm still seeing the same sporadic behavior. (I am in the process of upgrading my forwarders to 4.1.3, so maybe that will make a difference eventually.)

Chris_R_
Splunk Employee

Lowell, how are your DS metrics now? I had a bug open because of both of your cases, but Neil's turned out to be caused by a rogue outputs.conf in the unix app.

Lowell
Super Champion

I reproduced the issue today by publishing a trivial config change and sent the corresponding log files to Splunk support. Hopefully this issue can move forward. Do you find this problem is happening for both Windows and Linux deployment clients? I'm only seeing it on my Linux machines.

oreoshake
Communicator

Will do. I'll reference your case number.

Lowell
Super Champion

Any chance you would be willing to provide some sample logs to Splunk support? I've posted some reference info about the case I have open with them on this problem. (Although in my case, the metrics events seem to be intermittent rather than completely missing, which may be easier to track down.)

oreoshake
Communicator

Still on 4.0.11

The first search stopped returning results around the same time as my other ones. The second search still returns results, probably because its events come from the forwarders themselves.
