<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to tune index retention? in Monitoring Splunk</title>
    <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646700#M9671</link>
    <description>&lt;P&gt;Yes,&amp;nbsp;&lt;SPAN&gt;the architecture is complex, but not if you break it down.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Forget about all my clusters.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Simple question.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;How can I find out the daily size and compression ratio for an index?&amp;nbsp; I know you said it's&amp;nbsp;(0.15 x RF + 0.35 x SF), but I'm not sure that's right.&amp;nbsp; It may well work, though, given that I need to allow a fudge factor.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Then expand the question to all the indexes on a given set of indexers.&lt;/P&gt;&lt;P&gt;Something like:&lt;/P&gt;&lt;P&gt;What is the size of each index on a given set of indexers for the last 30 days?&amp;nbsp; Then simply divide that answer by 30 to give an approximate daily ingestion size in MB.&amp;nbsp; Then take the&amp;nbsp;&lt;SPAN&gt;(0.15 x RF + 0.35 x SF) to work up a number, and multiply that by a number of days for Hot/Warm and a number of days for Cold, i.e.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;homePath.maxDataSizeMB =&lt;/P&gt;&lt;P&gt;And&lt;/P&gt;&lt;P&gt;coldPath.maxDataSizeMB =&lt;/P&gt;&lt;P&gt;And&lt;/P&gt;&lt;P&gt;maxTotalDataSizeMB =&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 12 Jun 2023 16:40:51 GMT</pubDate>
    <dc:creator>andynewsoncap</dc:creator>
    <dc:date>2023-06-12T16:40:51Z</dc:date>
    <item>
      <title>How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646656#M9664</link>
      <description>&lt;P&gt;Hello. As far as I understand (please correct me if I am wrong), how an index behaves is determined by its conf.&lt;/P&gt;
&lt;P&gt;We have 5 index clusters and over 300 indexes at this stage.&lt;/P&gt;
&lt;P&gt;Our aim is to keep about 9 days in Hot/Warm and about 31 days in Cold. One day we may keep data in Frozen, but we are not there yet.&lt;/P&gt;
&lt;P&gt;As far as I understand, in Splunk this is controlled by working out the size per day of the data, then multiplying that number by 9 and by 31.&lt;/P&gt;
&lt;P&gt;To create&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;homePath.maxDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;And&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;coldPath.maxDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;And&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;maxTotalDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Then finally&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;frozenTimePeriodInSecs =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
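&lt;P&gt;A minimal sketch of that arithmetic in Python, assuming a hypothetical on-disk daily size of 500 MB for one index (an illustrative figure, not a measurement) and the 9-day and 31-day targets above. The function name is made up for this example, setting maxTotalDataSizeMB to the sum of the two tiers is an assumption, and compression and replication are deliberately ignored here:&lt;/P&gt;

```python
SECONDS_PER_DAY = 86400


def retention_settings(daily_mb, hot_warm_days=9, cold_days=31):
    """Size the indexes.conf limits from an on-disk daily volume in MB.

    daily_mb is a hypothetical input; measure the real value per index.
    """
    home_mb = daily_mb * hot_warm_days
    cold_mb = daily_mb * cold_days
    return {
        "homePath.maxDataSizeMB": home_mb,
        "coldPath.maxDataSizeMB": cold_mb,
        # Assumption: total cap is simply hot/warm plus cold.
        "maxTotalDataSizeMB": home_mb + cold_mb,
        # Age limit covering both tiers, expressed in seconds.
        "frozenTimePeriodInSecs": (hot_warm_days + cold_days) * SECONDS_PER_DAY,
    }


# Hypothetical example: 500 MB per day on disk for one index.
for key, value in retention_settings(500).items():
    print(f"{key} = {value}")
```

&lt;P&gt;With the hypothetical 500 MB per day this gives 4500 MB for Hot/Warm, 15500 MB for Cold, 20000 MB in total, and 3456000 seconds (40 days) before data is frozen.&lt;/P&gt;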
&lt;P&gt;To complicate matters further, there is a compression ratio for the raw data, a compression ratio for the TSIDX files, the number of indexers in the cluster, and the replication factor.&lt;/P&gt;
&lt;P&gt;All of which makes for a fun and complicated calculation.&lt;/P&gt;
&lt;P&gt;My question is this.&lt;/P&gt;
&lt;P&gt;Has anyone come up with a way to do this?&lt;/P&gt;
&lt;P&gt;Or has someone at least worked out how to extract a list of indexes per index cluster with the current daily data rate, and likewise the compression ratio of the raw data and the TSIDX per index per index cluster?&lt;/P&gt;
&lt;P&gt;Then maybe it's possible to do some magic in Excel to build out&lt;/P&gt;
&lt;P&gt;Per index&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;homePath.maxDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;coldPath.maxDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;maxTotalDataSizeMB =&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 14:05:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646656#M9664</guid>
      <dc:creator>andynewsoncap</dc:creator>
      <dc:date>2023-06-12T14:05:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646670#M9665</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/235196"&gt;@andynewsoncap&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;it's very difficult to give specific values for your requirements.&lt;/P&gt;&lt;P&gt;The first thing that comes to mind is that you probably have too many indexes to manage: indexes are usually split based on retention and access rights. What is the logic behind your index division?&lt;/P&gt;&lt;P&gt;The maximum size then depends on the number and size of events. You could take an empirical approach: measure the disk space used by a known number of events for each index and extrapolate from that.&lt;/P&gt;&lt;P&gt;For example, if 1 million events use 100 MB for an index in 1 day, you have to calculate:&lt;/P&gt;&lt;P&gt;hot_warm_dimension = 1_day_volume x hot_warm_retention x (0.15 x RF + 0.35 x SF)&lt;/P&gt;&lt;P&gt;cold_dimension = 1_day_volume x cold_retention x (0.15 x RF + 0.35 x SF)&lt;/P&gt;&lt;P&gt;In any case, always allow a contingency of 20% on this space and monitor the real usage.&lt;/P&gt;&lt;P&gt;You also have to consider index replication, which depends on the Replication Factor and on the Search Factor.&lt;/P&gt;&lt;P&gt;In conclusion, you can try a calculation, but to be more certain I suggest engaging Splunk PS or at least a Splunk Certified Architect; this isn't a question for the Community!&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 14:42:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646670#M9665</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2023-06-12T14:42:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646683#M9667</link>
      <description>&lt;P&gt;Thanks.&amp;nbsp; We can't be the only people to ask this.&amp;nbsp; Oh, and we designed it with Splunk PS, and there are very good reasons why we have so many indexes. There are 5 index clusters, so the indexes are spread out at 70-100 per cluster.&lt;/P&gt;&lt;P&gt;I had worked out that I would have to take RF and SF into account.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was hoping someone might know how to pull the size and compression factor using SPL.&lt;/P&gt;&lt;P&gt;If not, next stop is Splunk on-demand, I guess.&lt;/P&gt;&lt;P&gt;Thanks.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 15:49:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646683#M9667</guid>
      <dc:creator>andynewsoncap</dc:creator>
      <dc:date>2023-06-12T15:49:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646687#M9668</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/235196"&gt;@andynewsoncap&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;let me make sure I understand: do you have 5 clusters of indexers, or 5 indexers in one cluster?&lt;/P&gt;&lt;P&gt;Either way, even 70 indexes is a lot in my opinion!&lt;/P&gt;&lt;P&gt;As for the compression factor, the values are the ones I described: 0.15 for raw data and 0.35 for index files.&lt;/P&gt;&lt;P&gt;As for Replication Factor and Search Factor, they depend on the reliability you want:&lt;/P&gt;&lt;P&gt;if you have 5 indexers, how many indexers can be down without losing data?&lt;/P&gt;&lt;P&gt;Replication Factor and Search Factor can be equal to the number of indexers. In that case all the data is on every indexer and your system still has a complete set of data even with 4 indexers down, but at a greater storage cost, so you have to decide on the trade-off between cost and reliability.&lt;/P&gt;&lt;P&gt;You can certainly ask Splunk PS on demand, which I recommend, or ask your Splunk Partner, who should have a Splunk Architect.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 15:58:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646687#M9668</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2023-06-12T15:58:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646695#M9669</link>
      <description>&lt;P&gt;In fact we have 9 index clusters.&lt;/P&gt;&lt;P&gt;2 in US: a 5-node cluster and a 3-node cluster&lt;/P&gt;&lt;P&gt;2 in APAC: a 4-node cluster and a 3-node cluster&lt;/P&gt;&lt;P&gt;2 in LATAM: a 4-node cluster and a 3-node cluster&lt;/P&gt;&lt;P&gt;2 in EMEA: a 5-node cluster and a 3-node cluster&lt;/P&gt;&lt;P&gt;All report to one 9-node search head cluster, which has its own 4-node ITSI indexing cluster.&lt;/P&gt;&lt;P&gt;2 + 2 + 2 + 2 + 1&amp;nbsp;&lt;/P&gt;&lt;P&gt;There are about 40,000 devices feeding it data, let's say 10k / 10k / 10k / 10k.&amp;nbsp;&lt;/P&gt;&lt;P&gt;All are set to 2 / 2 (SF / RF).&lt;/P&gt;&lt;P&gt;Each index cluster has its own set of indexes and index names based on the data it collects for that region, so I can tell whether this is EMEA data or USA data, etc., based on the source indexer.&amp;nbsp; The customer has a habit of spinning up servers in one region but collecting data for another region.&amp;nbsp; Yeah, don't ask.&amp;nbsp; The customer is always right, right? &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp; The reasons are complex and I will not go into them here, but this is how Splunk PS and I designed it 3 years ago, and it has been running for 3 years now.&lt;/P&gt;&lt;P&gt;Now we need to ensure we are keeping the data for an appropriate amount of time, which we are not.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 16:26:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646695#M9669</guid>
      <dc:creator>andynewsoncap</dc:creator>
      <dc:date>2023-06-12T16:26:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646697#M9670</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/235196"&gt;@andynewsoncap&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;given the complexity of your architecture, this isn't a question for the Community; it requires a deep analysis by Splunk PS or a Certified Splunk Architect.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 16:31:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646697#M9670</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2023-06-12T16:31:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646700#M9671</link>
      <description>&lt;P&gt;Yes,&amp;nbsp;&lt;SPAN&gt;the architecture is complex, but not if you break it down.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Forget about all my clusters.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Simple question.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;How can I find out the daily size and compression ratio for an index?&amp;nbsp; I know you said it's&amp;nbsp;(0.15 x RF + 0.35 x SF), but I'm not sure that's right.&amp;nbsp; It may well work, though, given that I need to allow a fudge factor.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Then expand the question to all the indexes on a given set of indexers.&lt;/P&gt;&lt;P&gt;Something like:&lt;/P&gt;&lt;P&gt;What is the size of each index on a given set of indexers for the last 30 days?&amp;nbsp; Then simply divide that answer by 30 to give an approximate daily ingestion size in MB.&amp;nbsp; Then take the&amp;nbsp;&lt;SPAN&gt;(0.15 x RF + 0.35 x SF) to work up a number, and multiply that by a number of days for Hot/Warm and a number of days for Cold, i.e.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;homePath.maxDataSizeMB =&lt;/P&gt;&lt;P&gt;And&lt;/P&gt;&lt;P&gt;coldPath.maxDataSizeMB =&lt;/P&gt;&lt;P&gt;And&lt;/P&gt;&lt;P&gt;maxTotalDataSizeMB =&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 16:40:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646700#M9671</guid>
      <dc:creator>andynewsoncap</dc:creator>
      <dc:date>2023-06-12T16:40:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to tune index retention?</title>
      <link>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646762#M9672</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/235196"&gt;@andynewsoncap&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;you can find the daily size of an index using the License Consumption Report.&lt;/P&gt;&lt;P&gt;The compression factor is usually around 50%, but a more detailed calculation is the one I shared, from the Splunk Architect Course:&lt;/P&gt;&lt;P&gt;total_disk_space = daily_indexing * (0.15 * RF + 0.35 * SF) * retention&lt;/P&gt;&lt;P&gt;Note that this is the total storage required for a cluster; to get the storage required for each indexer, divide by the number of indexers in the cluster.&lt;/P&gt;&lt;P&gt;Ciao.&lt;/P&gt;&lt;P&gt;Giuseppe&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 06:40:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Monitoring-Splunk/How-to-tune-index-retention/m-p/646762#M9672</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2023-06-13T06:40:00Z</dc:date>
    </item>
  </channel>
</rss>

