<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161645#M45687</link>
    <description>&lt;P&gt;Agreed, a delay for data older than 1 month is fine. Is there any impact on indexing/searching within the same month? I guess maxHotBuckets should also be increased, maybe to 5, to cover other hot-to-warm rolling scenarios.&lt;/P&gt;</description>
    <pubDate>Fri, 19 Dec 2014 16:38:40 GMT</pubDate>
    <dc:creator>KomalSharma</dc:creator>
    <dc:date>2014-12-19T16:38:40Z</dc:date>
    <item>
      <title>Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161643#M45685</link>
      <description>&lt;P&gt;I have gone through the documentation and want to check whether a scenario like this will work:&lt;BR /&gt;
- Hold 1 month's data in hot buckets (maxHotSpanSecs=2628000)&lt;BR /&gt;
- Move data from hot to cold directly (maxWarmDBCount=0, frozenTimePeriodInSecs=31536000)&lt;/P&gt;

&lt;P&gt;The colddb is on different, slower storage.&lt;BR /&gt;
Are there any pitfalls to this approach in terms of search and performance?&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Komal&lt;/P&gt;</description>
      <pubDate>Thu, 18 Dec 2014 17:08:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161643#M45685</guid>
      <dc:creator>KomalSharma</dc:creator>
      <dc:date>2014-12-18T17:08:52Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161644#M45686</link>
      <description>&lt;P&gt;You should expect a performance impact when searching the historical data (data older than 1 month), since it will sit on the slower cold storage.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Dec 2014 20:04:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161644#M45686</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2014-12-18T20:04:38Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161645#M45687</link>
      <description>&lt;P&gt;Agreed, a delay for data older than 1 month is fine. Is there any impact on indexing/searching within the same month? I guess maxHotBuckets should also be increased, maybe to 5, to cover other hot-to-warm rolling scenarios.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2014 16:38:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161645#M45687</guid>
      <dc:creator>KomalSharma</dc:creator>
      <dc:date>2014-12-19T16:38:40Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161646#M45688</link>
      <description>&lt;P&gt;How much data do you expect to be present in the hot buckets? Supposing your expected data volume is 10 GB in 1 month, you can try these settings:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yourIndex]
maxDataSize = auto_high_volume
maxHotSpanSecs = 2628000
maxHotBuckets = 1
maxWarmDBCount = 0
frozenTimePeriodInSecs = 31536000
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If the volume is more than 10 GB, you can increase the maxHotBuckets value.&lt;/P&gt;

&lt;P&gt;If the volume is much lower, like 1-2 GB, you can use "maxDataSize = auto" (750 MB per hot bucket) and adjust maxHotBuckets to accommodate your maximum volume.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2014 17:15:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161646#M45688</guid>
      <dc:creator>somesoni2</dc:creator>
      <dc:date>2014-12-19T17:15:03Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161647#M45689</link>
      <description>&lt;P&gt;I would &lt;STRONG&gt;not&lt;/STRONG&gt; keep a month of data in hot buckets. Warm buckets and hot buckets can both be searched very quickly, but hot buckets are open for writing and warm buckets are not. Therefore, warm buckets can be backed up and are less vulnerable to corruption.&lt;/P&gt;

&lt;P&gt;Plus, Splunk sometimes reorganizes hot buckets to optimize search.  I would &lt;EM&gt;never&lt;/EM&gt; set maxHotBuckets=1, unless you can guarantee that your data arrives in strict time sequence. And really, I still wouldn't do it. Also, you do not want your hot buckets to get too large, as this also impacts search speed.&lt;/P&gt;

&lt;P&gt;I would set the bucket size to approximately 1 day's worth of data, but not smaller than 750MB or larger than 10GB. Let's face it, Splunk engineering has years of experience in understanding the range of optimum bucket sizes.&lt;BR /&gt;
Setting &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;maxWarmDBCount = 31
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;means that you will have at most 31 warm buckets, plus your hot buckets, for approximately a month of data in hot/warm. I would also leave &lt;CODE&gt;maxHotBuckets&lt;/CODE&gt; at the default setting of 3. I see no need to set &lt;CODE&gt;maxHotSpanSecs&lt;/CODE&gt;. If you are concerned about disk space usage for hot/warm buckets, set&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;homePath.maxDataSizeMB = XYZ
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;where XYZ is the maximum amount of disk space in MB that you want the hot/warm buckets to use. Splunk will never let hot/warm use more than this, even if that means that you end up with fewer than 31 warm buckets.&lt;/P&gt;
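
&lt;P&gt;A rough sketch combining the suggestions above might look like the stanza below. It assumes a daily volume close to the 750MB "auto" bucket size; the index name and the 40000 MB cap are placeholders chosen for illustration, and the usual homePath/coldPath/thawedPath lines are left out:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yourIndex]
# Hypothetical index name; homePath, coldPath and thawedPath omitted
# Bucket size: "auto" is 750MB, "auto_high_volume" is 10GB on 64-bit systems
maxDataSize = auto
# Keep the default number of hot buckets
maxHotBuckets = 3
# Roughly a month of warm buckets at about one bucket per day
maxWarmDBCount = 31
# Illustrative cap on hot/warm disk usage, in MB
homePath.maxDataSizeMB = 40000
&lt;/CODE&gt;&lt;/PRE&gt;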

&lt;P&gt;Finally, look at the settings for the default index, main. It comes configured to manage a fairly high volume of incoming data. I would start with the same settings for my index and then tune individual settings (such as maxWarmDBCount and homePath.maxDataSizeMB) for the particular situation. Again, leverage the experience of Splunk engineering - they figured out these defaults!&lt;/P&gt;</description>
      <pubDate>Sat, 20 Dec 2014 00:17:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161647#M45689</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2014-12-20T00:17:38Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161648#M45690</link>
      <description>&lt;P&gt;+1! &lt;/P&gt;

&lt;P&gt;Don't use a count-based warm bucket sizing rule; there are lots of reasons why hot buckets might roll to warm before they reach their max size. Instead, use @lguinn's suggestion of a size limit on the homePath.&lt;/P&gt;
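
&lt;P&gt;A minimal sketch of the size-based approach, assuming a hypothetical index name and a 32000 MB budget for hot/warm (both are placeholders to adjust to your own storage):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[yourIndex]
# maxWarmDBCount is left at its default; hot/warm is bounded by size instead
# Placeholder budget in MB; once it is exceeded, the oldest buckets in
# homePath are moved to cold until usage is back under the limit
homePath.maxDataSizeMB = 32000
&lt;/CODE&gt;&lt;/PRE&gt;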

&lt;P&gt;Also, please understand that hot buckets aren't magical. There's nothing about a hot bucket that makes it any different to search than a warm bucket--the former is open for writing, that's all. The only difference between warm and cold is the partition on which they're stored (which may lead to different search performance, but in the common case of "just one big partition for all Splunk data", it does not).&lt;/P&gt;</description>
      <pubDate>Thu, 22 Jan 2015 16:22:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161648#M45690</guid>
      <dc:creator>sowings_splunk</dc:creator>
      <dc:date>2015-01-22T16:22:26Z</dc:date>
    </item>
    <item>
      <title>Re: Are there any search and performance pitfalls with keeping data in hot buckets for 1 month and moving it from hot to cold directly?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161649#M45691</link>
      <description>&lt;P&gt;Hi @lguinn [Splunk],&lt;/P&gt;

&lt;P&gt;Based on your points, if my requirement is 30 days of ACTIVE and 90 days of COLD storage, with the same storage for hot, warm &amp;amp; cold, and assuming an average of 1 GB of indexed data per day for the main index, would the following settings be right?&lt;BR /&gt;
maxDataSize = 1000&lt;BR /&gt;
maxHotBuckets= 3&lt;BR /&gt;
maxWarmDBCount = 31&lt;BR /&gt;
homePath.maxDataSizeMB = 32000 (data size equivalent of 30 days + extra)&lt;BR /&gt;
coldPath.maxDataSizeMB = 90000 (data size equivalent of 90 days)&lt;BR /&gt;
maxTotalDataSizeMB = 122000&lt;BR /&gt;
frozenTimePeriodInSecs = 10368000 (120 days = 30 active + 90 cold)&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
Dev&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 23:07:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Are-there-any-search-and-performance-pitfalls-with-keeping-data/m-p/161649#M45691</guid>
      <dc:creator>damode</dc:creator>
      <dc:date>2017-11-20T23:07:39Z</dc:date>
    </item>
  </channel>
</rss>

