<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Backup Index 'rawdata' only (exclude 'index files') in Deployment Architecture</title>
    <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53158#M1691</link>
    <description>&lt;P&gt;The minimum you need to back up to be able to restore/rebuild your data is the &lt;EM&gt;index&lt;/EM&gt;&lt;CODE&gt;/db*/rawdata/journal.gz&lt;/CODE&gt; files and the contents of the &lt;EM&gt;index&lt;/EM&gt;&lt;CODE&gt;/db*/rawdata/deletes/&lt;/CODE&gt; directories. Other data, including the tsidx files, can be reconstructed from these, though it will take time and CPU to do so.&lt;/P&gt;

&lt;P&gt;Note that with a "rep factor" higher than the "search factor", the extra non-searchable copies keep only these minimal files as well.&lt;/P&gt;

&lt;P&gt;In addition however to the tsidx files, which can be rebuilt by issuing an index rebuild command, you could also &lt;/P&gt;</description>
    <pubDate>Tue, 03 Sep 2013 03:12:36 GMT</pubDate>
    <dc:creator>gkanapathy</dc:creator>
    <dc:date>2013-09-03T03:12:36Z</dc:date>
    <item>
      <title>Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53156#M1689</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;

&lt;P&gt;If I wanted to back up only the rawdata and exclude the 'index files', is it as easy as excluding *.tsidx, or do I need to do more?&lt;/P&gt;

&lt;P&gt;Assuming that when you restore it, it'll go "oh, I don't have those 'index files', let me rebuild them for you" (if this isn't automatic, and I need to issue a command, that is fine, just tell me what to do! - I figure it would be, as index replication handles the creation of 'index files' by itself...)&lt;/P&gt;

&lt;P&gt;Some context:&lt;BR /&gt;
Our backup guy is telling me my Splunk systems are the largest users of capacity, so I'm seeing what I can do to reduce the backup size. If there is nothing, so be it, but I'd like to know my options.&lt;/P&gt;

&lt;P&gt;I have a clustered environment running Splunk 5.0.4 (4 indexers with rep and search factor of 4), so the chance of a restore being required is very low, but we obviously still need backups.&lt;/P&gt;

&lt;P&gt;I am happy to accept the delay of service restoration while Splunk rebuilds the 'index files'.&lt;/P&gt;

&lt;P&gt;It sounds like it is possible, as hinted at under: &lt;A href="http://docs.splunk.com/Documentation/Splunk/5.0.4/Indexer/Backupindexeddata"&gt;http://docs.splunk.com/Documentation/Splunk/5.0.4/Indexer/Backupindexeddata&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;From the above link "Another thing to consider when designing a cluster backup script is whether you want to back up just the bucket's rawdata or both its rawdata and index files. If the latter, the script must also identify a searchable copy of each bucket."&lt;/P&gt;

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;Carson.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 02:09:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53156#M1689</guid>
      <dc:creator>carsonl</dc:creator>
      <dc:date>2013-09-03T02:09:21Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53157#M1690</link>
      <description>&lt;P&gt;Is there any reason in particular you want/need an index replication AND a search factor of 4? That seems a bit on the excessive side, and there may be more efficient ways to give you the redundancy/resiliency you're after (while keeping storage volumes down).&lt;/P&gt;

&lt;P&gt;Just thought I'd get some more info before I provide a (possible) answer &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 02:35:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53157#M1690</guid>
      <dc:creator>rturk</dc:creator>
      <dc:date>2013-09-03T02:35:45Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53158#M1691</link>
      <description>&lt;P&gt;The minimum you need to back up to be able to restore/rebuild your data is the &lt;EM&gt;index&lt;/EM&gt;&lt;CODE&gt;/db*/rawdata/journal.gz&lt;/CODE&gt; files and the contents of the &lt;EM&gt;index&lt;/EM&gt;&lt;CODE&gt;/db*/rawdata/deletes/&lt;/CODE&gt; directories. Other data, including the tsidx files, can be reconstructed from these, though it will take time and CPU to do so.&lt;/P&gt;
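
&lt;P&gt;As a rough illustration, the two patterns above could be turned into a file list for a backup job. This is only a sketch: the helper name &lt;CODE&gt;minimal_backup_files&lt;/CODE&gt; is hypothetical, it is not a Splunk tool, and it assumes the path you pass in is the directory that holds the &lt;CODE&gt;db*&lt;/CODE&gt; bucket directories.&lt;/P&gt;

```python
from pathlib import Path


def minimal_backup_files(bucket_parent):
    """Return the sorted files matching the two patterns named above.

    bucket_parent is assumed to be the directory that contains the
    db* bucket directories (hypothetical layout for this sketch).
    """
    root = Path(bucket_parent)
    selected = set()
    # Compressed raw data journal of every bucket directory.
    selected.update(root.glob("db*/rawdata/journal.gz"))
    # Every file stored under each bucket's deletes/ directory.
    for deletes_dir in root.glob("db*/rawdata/deletes"):
        selected.update(p for p in deletes_dir.rglob("*") if p.is_file())
    return sorted(selected)
```

&lt;P&gt;The returned paths could then be handed to whatever include list your backup agent accepts; anything not in the list, such as the tsidx files, is left out.&lt;/P&gt;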

&lt;P&gt;Note that with a "rep factor" higher than the "search factor", the extra non-searchable copies keep only these minimal files as well.&lt;/P&gt;

&lt;P&gt;In addition however to the tsidx files, which can be rebuilt by issuing an index rebuild command, you could also &lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:12:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53158#M1691</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2013-09-03T03:12:36Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53159#M1692</link>
      <description>&lt;P&gt;Against Splunk advice, I'm doing replication across the WAN (my WAN link is 600Mbps with ~25ms latency, hence going against their advice). I wanted to ensure that I have 2 searchable copies in each DC, so that everything is okay if there is a link outage + server failure at the same time.&lt;/P&gt;

&lt;P&gt;You're right, I could probably drop the search/rep factor to 3 and still be okay, but disk and processing are still cheap compared to downtime.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:14:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53159#M1692</guid>
      <dc:creator>carsonl</dc:creator>
      <dc:date>2013-09-03T03:14:22Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53160#M1693</link>
      <description>&lt;P&gt;Hopefully you're aware that you can only be guaranteed 2 searchable copies at each of 2 sites &lt;EM&gt;if&lt;/EM&gt; you only have 2 indexer nodes in the cluster at each site, since Splunk replication in the current version is not site-aware. If you have 3 or more nodes at one site, it is possible for 3 or more copies to be at the same site.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:22:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53160#M1693</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2013-09-03T03:22:52Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53161#M1694</link>
      <description>&lt;P&gt;To be clear... excluding *.tsidx will result in those files being recreated... Is that automatic, or only when the rebuild command is run? (So I can update my restore documentation)&lt;/P&gt;

&lt;P&gt;Also, it would be much more reliable to exclude *.tsidx using the backup agent... leaving the other files won't cause any problems? (Other files being: bloomfilter bucket_info.csv Hosts.data merged_lexicon.lex optimize.result Sources.data SourceTypes.data splunk-autogen-params.dat Strings.data)&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 14:42:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53161#M1694</guid>
      <dc:creator>carsonl</dc:creator>
      <dc:date>2020-09-28T14:42:03Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53162#M1695</link>
      <description>&lt;P&gt;Yeah, aware of that. It is even, with 2 in each DC, hence 3 could be okay for me, but for completeness' sake I've chosen 4.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:24:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53162#M1695</guid>
      <dc:creator>carsonl</dc:creator>
      <dc:date>2013-09-03T03:24:39Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53163#M1696</link>
      <description>&lt;P&gt;If you restore back to a cluster that needs to recreate its search factor, then the index files should get rebuilt automatically. But if you restore to a standalone node, you need to execute a rebuild on each bucket. The extra files should not cause any problems.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:27:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53163#M1696</guid>
      <dc:creator>gkanapathy</dc:creator>
      <dc:date>2013-09-03T03:27:43Z</dc:date>
    </item>
    <item>
      <title>Re: Backup Index 'rawdata' only (exclude 'index files')</title>
      <link>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53164#M1697</link>
      <description>&lt;P&gt;Perfect, thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2013 03:29:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Deployment-Architecture/Backup-Index-rawdata-only-exclude-index-files/m-p/53164#M1697</guid>
      <dc:creator>carsonl</dc:creator>
      <dc:date>2013-09-03T03:29:34Z</dc:date>
    </item>
  </channel>
</rss>

