Deployment Architecture

Deployment fails from the second attempt onward

Champion

Does anyone have an idea about the following situation?
Once the number of universal forwarders (UFs) exceeds about 268, deployment fails from the second attempt onward.
The initial deployment right after a UF is installed works fine. However, if we change or add settings in the config files and then try to deploy again, the deployment no longer succeeds.

Other information:
The phone-home interval is set to 10 minutes. Log collection works well after the first deployment.
When we search with index=_internal source="splunkd.log" host=XXXX component="Deploy", only the events from the first deployment are returned, even though we have run the second deployment.

UFs 1-268 log:
DeploymentClient - DeploymentClient has been asked to redo-handshake. Resetting to initial state.
UFs 269-398 log:
DeploymentClient - Unable to send handshake message to deployment server. Error status is: not_connected
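
To see which of the two messages a given client is logging, something like the following could be run against a copy of that client's splunkd.log. This is only a sketch: the function name `summarize_handshake` is hypothetical, and it simply counts lines containing the two messages quoted above.

```shell
# Count the two DeploymentClient messages quoted above in log text read from stdin.
# summarize_handshake is a hypothetical helper name, not part of Splunk.
summarize_handshake() {
    awk '
        /DeploymentClient/ && /redo-handshake/ { redo++ }
        /DeploymentClient/ && /not_connected/  { failed++ }
        END { printf "redo_handshake=%d not_connected=%d\n", redo, failed }
    '
}
```

Usage would be along the lines of `summarize_handshake < splunkd.log`, where the path to splunkd.log depends on the installation.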

Deployment server: Windows, Splunk 5.0.5
Deployment clients: Windows and Linux, UF 5.0.5

To add some detail: deployment immediately after UF installation succeeds. The failure shows up when we deploy a configuration change several days later.
Any advice on the cause would be appreciated.

Thank you.

1 Solution

Champion

<Workaround>
We were not able to fix the problem of deployments no longer going through once the number of UFs grows large. We work around it with the following procedure.

Deployment procedure (run every command on the deployment server)

1. Reload the deployment server
splunk reload deploy-server -class [hostname]

2. Remove the app from the deployment client
splunk remove app [appname] -uri https://[hostname]:8089 -auth admin:xxxxxx

3. Restart the deployment client
splunk _internal call /services/server/control/restart -method POST -uri https://[hostname]:8089 -auth admin:xxxxxx
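
When the three steps have to be repeated for many clients, it may help to generate the command lines first and review them before running anything. The sketch below only prints the commands from the post; the function name `print_redeploy_commands` and its argument order are hypothetical, the value passed to -class is whatever the post writes as [hostname] there, and the credentials stay as the placeholder used above.

```shell
# Print (do not run) the three workaround commands for one client.
# Arguments: value for -class, app name, client hostname.
# print_redeploy_commands is a hypothetical helper; admin:xxxxxx is a placeholder.
print_redeploy_commands() {
    class="$1"; app="$2"; host="$3"
    uri="https://${host}:8089"
    echo "splunk reload deploy-server -class ${class}"
    echo "splunk remove app ${app} -uri ${uri} -auth admin:xxxxxx"
    echo "splunk _internal call /services/server/control/restart -method POST -uri ${uri} -auth admin:xxxxxx"
}
```

For example, `print_redeploy_commands uf01 my_app uf01` prints the three lines for a client named uf01 (both names are made up), which can then be checked and pasted into a shell on the deployment server.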


Contributor

Since your deployment server runs on Windows, I am assuming it is a dedicated deployment server. Is that assumption correct?
Do you see any errors in splunkd.log when the issue happens?
What do you see when you run "netstat -oan"?
Also, make sure your deployment server is not itself configured as a deployment client.
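
If the netstat output is long, a small filter can summarize how many connections sit in each state on the management port. This is a sketch under assumptions: it expects the Windows column layout of "netstat -oan" (Proto, Local Address, Foreign Address, State, PID), it assumes the deployment server listens on the default management port 8089, and the function name `count_mgmt_port_states` is hypothetical.

```shell
# Summarize TCP connection states on local port 8089 from "netstat -oan" output on stdin.
# Assumes the Windows column layout: Proto, Local Address, Foreign Address, State, PID.
count_mgmt_port_states() {
    awk '
        $1 == "TCP" && $2 ~ /:8089$/ { states[$4]++ }
        END { for (s in states) printf "%s=%d\n", s, states[s] }
    ' | sort
}
```

The output of "netstat -oan" can be saved to a file on the deployment server and fed to the function with `count_mgmt_port_states < netstat.txt`. A large number of connections stuck in one state around the time deployments stop working would be worth posting here.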
