<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout in Deployment Architecture</title>
    <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460065#M16131</link>
    <description>&lt;P&gt;Hi, you are probably looking for the restart timeout setting on the CM (see &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Admin/Serverconf"&gt;server.conf&lt;/A&gt; and &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Userollingrestart"&gt;Use rolling restart&lt;/A&gt;)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[clustering]
restart_timeout = time_in_sec 
# The default is 60. It is probably a good idea to increase it substantially (to keep the cluster from going into fix-up mode), but still adapt it to the time an indexer usually takes to restart. Use something like 3600 if you really want to avoid timeouts; obviously, if an indexer crashes in the middle of its restart, the crash will then take longer to detect.
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 03 Feb 2020 11:02:09 GMT</pubDate>
    <dc:creator>maraman_splunk</dc:creator>
    <dc:date>2020-02-03T11:02:09Z</dc:date>
    <item>
      <title>Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460064#M16130</link>
      <description>&lt;P&gt;Using Splunk 7.3.3, I initiated a rolling restart from the cluster master (multi-site indexer cluster), and the first indexer began to restart. It then showed batch adding, after which the &lt;STRONG&gt;Indexer Clustering: Master Node&lt;/STRONG&gt; page showed that the indexer had failed to restart:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[Mon Feb  2 12:47:52 2020] Failed to restart peer=&amp;lt;GUID&amp;gt; peer_name=&amp;lt;hostname&amp;gt;. Moving to failed peer group and continuing.
[Mon Feb  2 12:47:52 2020] Failing peer=&amp;lt;GUID&amp;gt; peer_name=&amp;lt;hostname&amp;gt; timed out while trying to restart.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I pinged this indexer from the CM and it responded fine. Connectivity was not an issue before the rolling restart, and network connectivity still appears to be working fine.&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Is there a timeout window or setting I can adjust to better accommodate network latency and give the CM more time to reach the peer?&lt;/LI&gt;
&lt;LI&gt;What does this mean for my rolling restart? Will the remaining peers be restarted, leaving me to restart this one manually?&lt;/LI&gt;
&lt;LI&gt;How can I list this "failed peer group" to see all systems that may fail to restart?&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Mon, 03 Feb 2020 01:34:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460064#M16130</guid>
      <dc:creator>DEAD_BEEF</dc:creator>
      <dc:date>2020-02-03T01:34:38Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460065#M16131</link>
      <description>&lt;P&gt;Hi, you are probably looking for the restart timeout setting on the CM (see &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Admin/Serverconf"&gt;server.conf&lt;/A&gt; and &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Indexer/Userollingrestart"&gt;Use rolling restart&lt;/A&gt;)&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[clustering]
restart_timeout = time_in_sec 
# The default is 60. It is probably a good idea to increase it substantially (to keep the cluster from going into fix-up mode), but still adapt it to the time an indexer usually takes to restart. Use something like 3600 if you really want to avoid timeouts; obviously, if an indexer crashes in the middle of its restart, the crash will then take longer to detect.
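#
# Illustrative sketch only; the value below is an assumption to adapt, not a tested recommendation.
# On the CM's server.conf, if indexers typically need about 15-20 minutes to restart,
# a 30-minute timeout leaves some headroom:
#
#   [clustering]
#   restart_timeout = 1800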
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 03 Feb 2020 11:02:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460065#M16131</guid>
      <dc:creator>maraman_splunk</dc:creator>
      <dc:date>2020-02-03T11:02:09Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460066#M16132</link>
      <description>&lt;P&gt;From the error, if the indexer did restart without manual intervention, I would guess that the restart of the indexer took longer than the restart_timeout defined in the cluster master's server.conf. By default this is set to 60 seconds, and I have seen indexers take &lt;STRONG&gt;much&lt;/STRONG&gt; longer than this to restart.&lt;/P&gt;

&lt;P&gt;Can you see from splunkd.log on the indexer how long the restart actually took? If it's longer than 60 seconds, then you might want to extend your restart_timeout (&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.3.3/Indexer/Userollingrestart#Handle_slow_restarts" target="_blank"&gt;https://docs.splunk.com/Documentation/Splunk/7.3.3/Indexer/Userollingrestart#Handle_slow_restarts&lt;/A&gt;)&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:59:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460066#M16132</guid>
      <dc:creator>hmallett</dc:creator>
      <dc:date>2020-09-30T03:59:31Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460067#M16133</link>
      <description>&lt;P&gt;Most indexers were taking 15-20 mins.  I will try adjusting the &lt;CODE&gt;restart_timeout&lt;/CODE&gt; value but this is the first time I've seen these errors and I have restarted this cluster many times with each taking 15-20 mins just like always.  That's what prompted me to ask about this issue.&lt;/P&gt;

&lt;P&gt;So this setting needs to be changed on the CM's server.conf, not the indexers themselves?&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 08:52:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460067#M16133</guid>
      <dc:creator>DEAD_BEEF</dc:creator>
      <dc:date>2020-02-05T08:52:50Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460068#M16134</link>
      <description>&lt;P&gt;I will try adjusting this. Each idx takes 15-20 mins on average and my current timeout setting is 15 mins, so maybe I should just expand it to 30 mins to be safe?&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 08:53:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460068#M16134</guid>
      <dc:creator>DEAD_BEEF</dc:creator>
      <dc:date>2020-02-05T08:53:37Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460069#M16135</link>
      <description>&lt;P&gt;Yes, it is a CM setting. 30 minutes (1800 s) seems appropriate for your environment.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 09:23:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460069#M16135</guid>
      <dc:creator>maraman_splunk</dc:creator>
      <dc:date>2020-02-05T09:23:12Z</dc:date>
    </item>
    <item>
      <title>Re: Multi-site Indexer rolling restart - indexer fails to restart/timeout</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460070#M16136</link>
      <description>&lt;P&gt;Just finished a rolling restart with no more errors after increasing the timeout to 30 mins. Thank you both for the assistance!&lt;/P&gt;</description>
      <pubDate>Sun, 09 Feb 2020 04:28:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Multi-site-Indexer-rolling-restart-indexer-fails-to-restart/m-p/460070#M16136</guid>
      <dc:creator>DEAD_BEEF</dc:creator>
      <dc:date>2020-02-09T04:28:32Z</dc:date>
    </item>
  </channel>
</rss>

