<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What is the best way to estimate frozen storage sizing needs? in Splunk Enterprise</title>
    <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311898#M7266</link>
    <description>&lt;P&gt;Thank you!  Appreciate the info on this....&lt;/P&gt;</description>
    <pubDate>Wed, 06 Sep 2017 18:08:04 GMT</pubDate>
    <dc:creator>vanderaj2</dc:creator>
    <dc:date>2017-09-06T18:08:04Z</dc:date>
    <item>
      <title>What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311893#M7261</link>
      <description>&lt;P&gt;Hello All,&lt;/P&gt;

&lt;P&gt;I'm trying to assess some offline storage needs for archiving old Splunk data.  I'm planning to adjust my retention policy to 90 days for hot-warm-cold (i.e. "online", searchable data) and then have anything older than 90 days sent to NAS as "frozen", to be stored there for 1 year.&lt;/P&gt;

&lt;P&gt;My storage guy is asking how much storage I need on the NAS to cover 1 year of frozen data.  My understanding is that compressed, raw events are what would be sent to frozen, if you specify a frozen path or a script.  &lt;/P&gt;

&lt;P&gt;How does one go about estimating the size of the raw, compressed events?&lt;/P&gt;

&lt;P&gt;I have an indexer cluster consisting of 2 indexers. Should I plan to double whatever the storage estimate is, to account for frozen data coming from 2 indexers?&lt;/P&gt;

&lt;P&gt;Thank you in advance! &lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 19:29:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311893#M7261</guid>
      <dc:creator>vanderaj2</dc:creator>
      <dc:date>2017-08-31T19:29:09Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311894#M7262</link>
      <description>&lt;P&gt;Hey @vanderaj2, Here's some documentation on planning your capacity: &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.6.3/Capacity/Estimateyourstoragerequirements"&gt;http://docs.splunk.com/Documentation/Splunk/6.6.3/Capacity/Estimateyourstoragerequirements&lt;/A&gt;. It says that "typically, the compressed rawdata file is 10% the size of the incoming, pre-indexed raw data. The associated index files range in size from approximately 10% to 110% of the rawdata file. The number of unique terms in the data affect this value. "&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 19:44:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311894#M7262</guid>
      <dc:creator>lfedak_splunk</dc:creator>
      <dc:date>2017-08-31T19:44:02Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311895#M7263</link>
      <description>&lt;P&gt;To get started, plug your numbers in &lt;A href="https://splunk-sizing.appspot.com/"&gt;here&lt;/A&gt; and it will give you your estimated storage needs based on "normal" compression assumptions (journal.gz = 15% of raw).&lt;BR /&gt;
Note that if you are in a cluster, every indexer will freeze its own buckets, so you will have RF*raw on your archive volume. You can create a script that identifies replicated bucket archives and deletes all but one copy to minimize your storage need.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 20:54:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311895#M7263</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2017-08-31T20:54:44Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311896#M7264</link>
      <description>&lt;P&gt;Thank you both for weighing in!  I also have a follow-on question to the Splunk community:&lt;/P&gt;

&lt;P&gt;Does anyone know whether during the compression of the raw data, Splunk does any data deduplication to reduce storage overhead?  Just curious.....&lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 18:32:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311896#M7264</guid>
      <dc:creator>vanderaj2</dc:creator>
      <dc:date>2017-09-05T18:32:17Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311897#M7265</link>
      <description>&lt;P&gt;We don't deduplicate anything. The raw data file (journal.gz) is a ZIP file of zipped 128kb data slices. &lt;/P&gt;</description>
      <pubDate>Tue, 05 Sep 2017 19:37:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311897#M7265</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2017-09-05T19:37:34Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311898#M7266</link>
      <description>&lt;P&gt;Thank you!  Appreciate the info on this....&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2017 18:08:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311898#M7266</guid>
      <dc:creator>vanderaj2</dc:creator>
      <dc:date>2017-09-06T18:08:04Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311899#M7267</link>
      <description>&lt;P&gt;By default Splunk stores the replicated buckets and the searchable copies if coldToFrozenDir is specified.  Therefore you can assume the following equation:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; (Daily Ingestion Volume * 0.35 * search factor) + (Daily ingestion Volume * 0.15 * replication factor) = total storage needed
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Total storage needed / number of peers = storage per peer.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2017 18:13:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311899#M7267</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-09-06T18:13:12Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311900#M7268</link>
      <description>&lt;P&gt;Not quite. Index and metadata files are not frozen, only rawdata (journal.gz) is. &lt;BR /&gt;
So &lt;CODE&gt;(Daily ingestion volume * 0.15 * replication factor) = total storage needed&lt;/CODE&gt; is the best approximation.&lt;BR /&gt;
This can be reduced to just ingestion * 0.15 if replicated buckets are deleted after freezing via a customer-provided script.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2017 18:29:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311900#M7268</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2017-09-06T18:29:09Z</dc:date>
    </item>
    <item>
      <title>Re: What is the best way to estimate frozen storage sizing needs?</title>
      <link>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311901#M7269</link>
      <description>&lt;P&gt;Oh my mistake I'm thinking of the rb and db files which are the raw data as you said...&lt;/P&gt;

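&lt;P&gt;For illustration, plugging assumed numbers (100 GB/day ingested, replication factor 3) into the rawdata-only estimate above:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;100 GB/day * 0.15 * 3 (RF) = 45 GB/day frozen
45 GB/day * 365 days = 16,425 GB (roughly 16 TB) for 1 year of retention
&lt;/CODE&gt;&lt;/PRE&gt;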
&lt;P&gt;So in a cluster, both the replicated copies and the original copies get copied to frozen storage.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2017 19:01:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Enterprise/What-is-the-best-way-to-estimate-frozen-storage-sizing-needs/m-p/311901#M7269</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-09-06T19:01:51Z</dc:date>
    </item>
  </channel>
</rss>