<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Missing data - Splunk is showing random gaps in the 'indexed data' timeline and safeService warning in Splunkd.log in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98992#M20705</link>
    <description />
    <pubDate>Mon, 28 Sep 2020 09:32:49 GMT</pubDate>
    <dc:creator>yannK</dc:creator>
    <dc:date>2020-09-28T09:32:49Z</dc:date>
    <item>
      <title>Missing data - Splunk is showing random gaps in the 'indexed data' timeline and safeService warning in Splunkd.log</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98990#M20703</link>
      <description>&lt;P&gt;My Splunk instance is constantly indexing data 24*7, but I've noticed some gaps in the indexed data timeline recently.  I have also noticed that data I could search on yesterday is not being returned today.  This doesn't happen consistently, but regularly enough to cause concern.  I looked in splunkd.log and index=_internal to ensure that the buckets have not rotated out of the DB, and also confirmed that the buckets spanning the time period of the gap are present and in good shape.  What else can I do to track down this missing data?&lt;/P&gt;

&lt;P&gt;In splunkd.log I see the following:&lt;/P&gt;

&lt;P&gt;05-07-2011 05:44:45.466 +0000 WARN MetaData - /opt/splunk/var/lib/splunk/apache/db/hot_v1_59/Hosts.data: attempting safeService to attempt to fix up metadata&lt;/P&gt;

&lt;P&gt;My environment consists of 4 indexers running 4.2, 300 UF instances (also 4.2) and a standalone deployment server, also 4.2.  We use the deployment server to manage the configs of all instances.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:32:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98990#M20703</guid>
      <dc:creator>mctester</dc:creator>
      <dc:date>2020-09-28T09:32:44Z</dc:date>
    </item>
    <item>
      <title>Re: Missing data - Splunk is showing random gaps in the 'indexed data' timeline and safeService warning in Splunkd.log</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98991#M20704</link>
      <description>&lt;P&gt;If you have forwarders sending data, you can look for forwarder connectivity within the splunkd.log of both the indexers and forwarders.   I would first check to make sure the forwarder indeed had connectivity during that time.   Are these systems picking up network data or monitoring files?   Some keys to debugging:&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;Figure out exactly what source, sourcetype, or host is missing data.  Use searches to find them.&lt;/LI&gt;
&lt;LI&gt;Compare the actual raw data to the internal indexing volume.   Does indexing volume tail off for a specific forwarder or index?&lt;/LI&gt;
&lt;LI&gt;Search for "index=_internal source=*metrics.log* blocked".   If something is blocked, that might be the problem.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;The above steps are typically enough to figure out if it is a problem getting the data, or indexing the data.&lt;/P&gt;</description>
      <pubDate>Tue, 10 May 2011 18:35:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98991#M20704</guid>
      <dc:creator>Simeon</dc:creator>
      <dc:date>2011-05-10T18:35:37Z</dc:date>
    </item>
    <item>
      <title>Re: Missing data - Splunk is showing random gaps in the 'indexed data' timeline and safeService warning in Splunkd.log</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98992#M20705</link>
      <description>&lt;P&gt;Hi, this may be caused by defect &lt;STRONG&gt;SPL-39127&lt;/STRONG&gt; on Splunk 4.2.0 and 4.2.1.&lt;BR /&gt;
It is triggered by a push from the deployment server that restarts Splunkweb on indexers that are deployment clients (this includes search heads that perform summary indexing). &lt;/P&gt;

&lt;P&gt;If Splunk is still at 4.2, first apply the latest 4.2.1 release, which fixes an associated defect, SPL-38464: in rare cases, concurrent hash table and string length collisions for metadata field values can cause index-level metadata files to grow to very large sizes, up to several gigabytes.&lt;/P&gt;

&lt;P&gt;Reference: &lt;A href="http://splunk.com/base/Documentation/4.2/ReleaseNotes/Knownissues" target="_blank"&gt;http://splunk.com/base/Documentation/4.2/ReleaseNotes/Knownissues&lt;/A&gt;&lt;BR /&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;If you encounter this problem, please file a case with Splunk Support.&lt;BR /&gt;
&lt;A href="http://www.splunk.com/support" target="_blank"&gt;http://www.splunk.com/support&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;To check whether this is the case, look in splunkd.log for a message like:&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;05-07-2011 05:44:45.466 +0000 WARN  MetaData - /opt/splunk/var/lib/splunk/apache/db/hot_v1_59/Hosts.data: attempting safeService to attempt to fix up metadata&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;To find those warnings in the internal logs (and on the indexers, in the case of search peers), you can use this search:&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
index=_internal host="indexer hostname" source=*splunkd.log* safeService | rex " MetaData - (?P&lt;bucket&gt;.*)/" | stats count by bucket splunk_server&lt;BR /&gt;
&lt;/PRE&gt;&lt;/P&gt;
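&lt;P&gt;If the log file is easier to reach than the search UI, the same extraction can be done offline. This is only a minimal sketch: it assumes the warning lines look exactly like the sample quoted above, and the function name count_safe_service is made up for illustration.&lt;/P&gt;

```python
import re
from collections import Counter

# Matches the WARN line quoted above; group 1 is the bucket directory
# (everything up to the last "/" before the metadata file name).
SAFE_SERVICE = re.compile(r" MetaData - (.*)/[^/]+: attempting safeService")


def count_safe_service(lines):
    """Return a Counter mapping bucket directory to number of warnings."""
    counts = Counter()
    for line in lines:
        match = SAFE_SERVICE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "05-07-2011 05:44:45.466 +0000 WARN  MetaData - "
        "/opt/splunk/var/lib/splunk/apache/db/hot_v1_59/Hosts.data: "
        "attempting safeService to attempt to fix up metadata",
    ]
    for bucket, count in count_safe_service(sample).items():
        print(bucket, count)
```

&lt;P&gt;To run it against a real file, pass an open file handle for splunkd.log instead of the sample list.&lt;/P&gt;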

&lt;P&gt;&lt;STRONG&gt;Here is the manual procedure to fix.&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;STRONG&gt;Note:&lt;/STRONG&gt; There are 2 options: run multiple rebuilds in parallel or a single sequential rebuild as detailed below.&lt;/P&gt;

&lt;P&gt;1 - disable deploymentclient to prevent new corruption&lt;BR /&gt;&lt;BR /&gt;
(until the fix for SPL-39127, targeted for the upcoming maintenance release 4.2.2, is available)&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
mv $SPLUNK_HOME/etc/system/local/deploymentclient.conf $SPLUNK_HOME/etc/system/local/deploymentclient.disabled&lt;BR /&gt;
&lt;/PRE&gt;&lt;BR /&gt;
2 - collect the list of the corrupted buckets with&lt;BR /&gt;
&lt;PRE&gt;cd $SPLUNK_HOME/bin &lt;BR /&gt;
./splunk cmd splunkd fsck --mode metadata --all &amp;gt; /tmp/trash&lt;BR /&gt;
&lt;/PRE&gt;&lt;BR /&gt;
The buckets with errors are displayed on the screen,&lt;BR /&gt;
for example: &lt;EM&gt;NEEDS REPAIR: file='/opt/splunk/var/lib/splunk/java/db/db_1303835244_1303775919_106/Hosts.data' code=25 contains recover-padding&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;3 - stop splunk to prevent bucket rotation&lt;/P&gt;

&lt;P&gt;4 - for each of them, rebuild the tsidx files&lt;BR /&gt;
The process is slow; if you have several buckets, it is faster to run several rebuilds in parallel (use &amp;amp; on Linux)&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
./splunk cmd splunkd rebuild /pathtothebucketfolder/&lt;BR /&gt;
&lt;/PRE&gt;&lt;BR /&gt;
For parallel processing&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
./splunk cmd splunkd rebuild /pathtothebucketfolder1/ &amp;amp;&lt;BR /&gt;
./splunk cmd splunkd rebuild /pathtothebucketfolder2/ &amp;amp;&lt;BR /&gt;
etc...&lt;BR /&gt;
&lt;/PRE&gt;&lt;BR /&gt;
&lt;STRONG&gt;OR&lt;/STRONG&gt; run a single command to rebuild all buckets sequentially (this takes longer):&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
./splunk cmd splunkd fsck --mode metadata --all --repair&lt;BR /&gt;
&lt;/PRE&gt;&lt;/P&gt;

&lt;P&gt;5 - check the result with&lt;BR /&gt;
&lt;PRE&gt;&lt;BR /&gt;
./splunk cmd splunkd fsck --mode metadata --all&lt;BR /&gt;
&lt;/PRE&gt;&lt;/P&gt;

&lt;P&gt;6 - restart splunk (it will also apply the modification to the deploymentclient config)&lt;/P&gt;
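&lt;P&gt;Steps 2 and 4 can be scripted when there are many buckets. The following is a rough sketch, not a supported tool. It assumes the fsck output from step 2 has been captured as text and that Splunk is already stopped (step 3). The function names, the worker count and the injectable "run" parameter are hypothetical; only the NEEDS REPAIR line format and the rebuild command come from the steps above.&lt;/P&gt;

```python
import posixpath
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Matches the "NEEDS REPAIR" line format quoted in step 2.
NEEDS_REPAIR = re.compile(r"NEEDS REPAIR: file='([^']+)'")


def buckets_needing_repair(fsck_output):
    """Return the sorted, de-duplicated bucket directories named in fsck output."""
    files = NEEDS_REPAIR.findall(fsck_output)
    return sorted({posixpath.dirname(path) for path in files})


def rebuild_command(splunk_bin, bucket):
    """Build the step 4 rebuild command for one bucket directory."""
    return [splunk_bin, "cmd", "splunkd", "rebuild", bucket]


def rebuild_all(splunk_bin, buckets, workers=2, run=subprocess.call):
    """Run one rebuild per bucket, at most `workers` at a time.

    Returns a dict of bucket directory to exit code. `run` is injectable
    so the function can be dry-run without Splunk installed.
    """
    commands = [rebuild_command(splunk_bin, b) for b in buckets]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(run, commands))
    return dict(zip(buckets, codes))


if __name__ == "__main__":
    sample = (
        "NEEDS REPAIR: file='/opt/splunk/var/lib/splunk/java/db/"
        "db_1303835244_1303775919_106/Hosts.data' code=25 contains recover-padding\n"
    )
    # Dry run: print the commands instead of executing them.
    for bucket in buckets_needing_repair(sample):
        print(" ".join(rebuild_command("./splunk", bucket)))
```

&lt;P&gt;Capping the number of workers keeps the rebuilds from competing for disk I/O, which plain backgrounding with &amp;amp; does not do. Check the result with step 5 afterwards.&lt;/P&gt;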

&lt;P&gt;For further information on splunkd fsck, refer to the Community Wiki:&lt;BR /&gt;&lt;BR /&gt;
&lt;A href="http://www.splunk.com/wiki/Check_and_Repair_Metadata" target="_blank"&gt;http://www.splunk.com/wiki/Check_and_Repair_Metadata&lt;/A&gt;&lt;BR /&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;How to prevent this from happening until 4.2.2 comes out?&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;There are two workarounds to address this.&lt;/P&gt;

&lt;UL&gt;
&lt;LI&gt;&lt;P&gt;The workaround for the associated bug SPL-38464 (setting "inPlaceUpdates = false" as a global parameter in the [default] stanza of indexes.conf) is still a valid one :&lt;BR /&gt;
&lt;PRE&gt;&lt;CODE&gt;[default]&lt;BR /&gt;
inPlaceUpdates = false&lt;/CODE&gt;&lt;/PRE&gt;&lt;BR /&gt;
Since we would always atomically update the metadata files via rename, there is no chance of corruption here. There is a chance, perhaps, of ending up with somewhat invalid metadata info, but not with corruption.  &lt;/P&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;P&gt;Another workaround is to set both "restartSplunkWeb=false" AND "restartSplunkd=false" in the relevant serverclass.conf stanzas on the deployment server to disable restarts. The corruption happens in the splunkweb restart code path, but restarting splunkd also triggers a splunkweb restart.&lt;/P&gt;&lt;/LI&gt;
&lt;/UL&gt;
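&lt;P&gt;To confirm that the workarounds are actually in place, the two files can be checked with a short script. This is a sketch under simplifying assumptions: it treats the .conf files as plain INI text via Python's configparser, which does not model Splunk's configuration layering or inheritance between stanzas, and the function names are invented for illustration.&lt;/P&gt;

```python
import configparser


def parse_conf(text):
    """Parse Splunk-style .conf text with configparser (a simplification)."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(text)
    return parser


def in_place_updates_disabled(indexes_conf_text):
    """True if [default] in indexes.conf sets inPlaceUpdates = false."""
    parser = parse_conf(indexes_conf_text)
    value = parser.get("default", "inPlaceUpdates", fallback="")
    return value.strip().lower() == "false"


def stanzas_allowing_restart(serverclass_conf_text):
    """Return stanza names that do not explicitly set both restart flags to false."""
    parser = parse_conf(serverclass_conf_text)
    risky = []
    for stanza in parser.sections():
        flags = [
            parser.get(stanza, name, fallback="").strip().lower()
            for name in ("restartSplunkWeb", "restartSplunkd")
        ]
        if flags != ["false", "false"]:
            risky.append(stanza)
    return risky


if __name__ == "__main__":
    print(in_place_updates_disabled("[default]\ninPlaceUpdates = false\n"))
```

&lt;P&gt;A stanza reported by stanzas_allowing_restart may still be safe if it inherits the flags from a parent stanza, so treat the list as a starting point for manual review.&lt;/P&gt;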

&lt;P&gt;If applied, these workarounds should be removed once 4.2.2 is installed.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:32:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98992#M20705</guid>
      <dc:creator>yannK</dc:creator>
      <dc:date>2020-09-28T09:32:49Z</dc:date>
    </item>
    <item>
      <title>Re: Missing data - Splunk is showing random gaps in the 'indexed data' timeline and safeService warning in Splunkd.log</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98993#M20706</link>
      <description>&lt;P&gt;If I upgrade to 4.2.2, do I still need to run the rebuild/repair operations?&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jul 2011 20:12:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Missing-data-Splunk-is-showing-random-gaps-in-the-indexed-data/m-p/98993#M20706</guid>
      <dc:creator>tpsplunk</dc:creator>
      <dc:date>2011-07-25T20:12:18Z</dc:date>
    </item>
  </channel>
</rss>

