<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Performance / Design recommendations for dimensions in Metrics Index in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318461#M59518</link>
    <description>&lt;P&gt;Yes, this makes sense; these are things I have already done with my own custom sourcetypes for metrics.&lt;/P&gt;

&lt;P&gt;The issue is what Splunk is going to say are the best practices, recommendations, and limitations for dimensions and data sets.&lt;/P&gt;</description>
    <pubDate>Mon, 23 Oct 2017 11:27:04 GMT</pubDate>
    <dc:creator>rjthibod</dc:creator>
    <dc:date>2017-10-23T11:27:04Z</dc:date>
    <item>
      <title>Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318451#M59508</link>
      <description>&lt;P&gt;Does Splunk have any guidelines or limitations on the number of dimensions (i.e., cardinality) that the new Metrics Index supports?&lt;/P&gt;

&lt;P&gt;Are there specific limitations in terms of the number of dimensions or unique values of a single dimension or unique combinations of dimensions for a single measurement?&lt;/P&gt;

&lt;P&gt;I understand that Splunk's searching and indexing performance is always contingent on the hardware / platform. Just wanting to see if there are any hard limits built into the design of the Metrics Index or a configuration threshold, or even better, can Splunk provide some benchmarks about data sets they have tested?&lt;/P&gt;

&lt;P&gt;I have seen other metric stores / time-series databases enforce these kinds of limits (in configuration settings), hence the question.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Oct 2017 14:30:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318451#M59508</guid>
      <dc:creator>rjthibod</dc:creator>
      <dc:date>2017-10-19T14:30:52Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318452#M59509</link>
      <description>&lt;P&gt;There are no published limits, but they've been tested with some pretty high numbers.  Higher numbers reduce performance so try to keep the number of dimensions as low as you can.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Oct 2017 20:10:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318452#M59509</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-10-19T20:10:15Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318453#M59510</link>
      <description>&lt;P&gt;When you say higher numbers, are you referring to the number of dimensions (columns) or to the number of unique values for dimensions?&lt;/P&gt;</description>
      <pubDate>Thu, 19 Oct 2017 21:01:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318453#M59510</guid>
      <dc:creator>rjthibod</dc:creator>
      <dc:date>2017-10-19T21:01:00Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318454#M59511</link>
      <description>&lt;P&gt;My information does not specify.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 08:47:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318454#M59511</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-10-20T08:47:08Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318455#M59512</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Unfortunately, the recordings of the keynote sessions of .conf 2017 are not available yet, and neither are the metrics sessions.&lt;/P&gt;

&lt;P&gt;It looks like you can do 50k+ events per second (eps) per indexer, and this number scales well with the number of indexers (10 indexers, approx. 500k eps).&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.0.0/Metrics/Performance" target="_blank"&gt;https://docs.splunk.com/Documentation/Splunk/7.0.0/Metrics/Performance&lt;/A&gt;&lt;/P&gt;
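
&lt;P&gt;As a back-of-the-envelope sketch in Python, assuming throughput scales linearly with the indexer count (an assumption, not a documented guarantee) and using the approximate 50k eps per indexer figure above; the function name indexers_needed is hypothetical:&lt;/P&gt;

&lt;PRE&gt;
```python
import math

# Approximate per-indexer ingest rate quoted above (events per second).
EPS_PER_INDEXER = 50_000


def indexers_needed(target_eps, eps_per_indexer=EPS_PER_INDEXER):
    """Estimate the indexer count for a target rate, assuming linear scaling."""
    return math.ceil(target_eps / eps_per_indexer)


print(indexers_needed(500_000))
```
&lt;/PRE&gt;

&lt;P&gt;Treat the result as a starting point for testing on your own hardware, not as a sizing rule.&lt;/P&gt;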

&lt;P&gt;A single metric (measurement) needs two fields: _value and metric_name. But without dimensions (every other field!) you can't filter or aggregate it for statistics (and MSTATS relies heavily on dimensions). Host and source are automatically added and available as dimensions.&lt;/P&gt;

&lt;P&gt;Regarding data on disk: you always trade speed versus other things... metrics are stored in TSIDX (take a look at "splunk cmd walklex").&lt;BR /&gt;
It's a little bit like using INDEXED_EXTRACTIONS. You create index-time fields which can be used in a TSTATS query (which is up to 1000x faster than having raw events and doing extractions at search time).&lt;/P&gt;

&lt;P&gt;Caveat: you need more storage.&lt;/P&gt;

&lt;P&gt;HTH,&lt;/P&gt;

&lt;P&gt;Holger&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 16:21:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318455#M59512</guid>
      <dc:creator>hsesterhenn_spl</dc:creator>
      <dc:date>2020-09-29T16:21:45Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318456#M59513</link>
      <description>&lt;P&gt;I was at the keynote, so no need to worry about that - there wasn't anything useful about metrics.&lt;/P&gt;

&lt;P&gt;I understand the basics of indexing, tsidx files, and what fields metrics indexes require - I have already built a custom sourcetype for metrics indexes.&lt;/P&gt;

&lt;P&gt;I am specifically asking about dimensions. Is there a limit or a suggestion from Splunk about how many dimensions (the cardinality of the index), or how many unique sequences/combinations of dimensions, the Metrics index supports? This is a concern in other time-series databases; in fact, some of them provide configuration parameters to limit exactly these things.&lt;/P&gt;

&lt;P&gt;Does Splunk think that more than 5 dimensions in a measurement is going to cause problems with scaling? If I have over 1 million unique values across those 5 dimensions, is that going to cause a significant problem unless I am running on a 16 GB machine? etc...&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 10:45:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318456#M59513</guid>
      <dc:creator>rjthibod</dc:creator>
      <dc:date>2017-10-20T10:45:41Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318457#M59514</link>
      <description>&lt;P&gt;Recordings for the metrics sessions are available at &lt;A href="http://conf.splunk.com/sessions/2017-sessions.html#search=metrics&amp;amp;"&gt;http://conf.splunk.com/sessions/2017-sessions.html#search=metrics&amp;amp;&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 10:53:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318457#M59514</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-10-20T10:53:07Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318458#M59515</link>
      <description>&lt;P&gt;A little over 5 dimensions probably won't matter much.  You can probably go a lot over 5.  As in all things Splunk, however, you should test it on your dev system first.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 11:04:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318458#M59515</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-10-20T11:04:06Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318459#M59516</link>
      <description>&lt;P&gt;Unfortunately most of the interesting ones are currently not available.&lt;/P&gt;</description>
      <pubDate>Sun, 22 Oct 2017 16:42:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318459#M59516</guid>
      <dc:creator>hsesterhenn_spl</dc:creator>
      <dc:date>2017-10-22T16:42:22Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318460#M59517</link>
      <description>&lt;P&gt;Expect more official details in the docs over the next couple of months...&lt;/P&gt;

&lt;P&gt;Let me show you an example I did on my local instance:&lt;/P&gt;

&lt;P&gt;curl -k &lt;A href="https://localhost:8088/services/collector" target="_blank"&gt;https://localhost:8088/services/collector&lt;/A&gt; -H "Authorization: Splunk token-XXXX" -d '{"time": 1503209999.111,"event":"metric","source":"disk","host":"host_99","fields":{"region":"us-west-1","datacenter":"us-west-1a","rack":"63","os":"Ubuntu16.10","arch":"x64","team":"LON","service":"6","service_version":"0","service_environment":"test","path":"/dev/sda1","fstype":"ext3","_value":1099511627776,"metric_name":"total"}}'&lt;/P&gt;

&lt;P&gt;This single measurement results in 11 dimensions. &lt;/P&gt;
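
&lt;P&gt;A minimal Python sketch of how that count comes about, assuming every key under "fields" other than _value and metric_name is a dimension. The helper name dimension_names is hypothetical; host, source and sourcetype sit outside "fields" and are not counted here:&lt;/P&gt;

&lt;PRE&gt;
```python
import json

# Payload copied from the curl example above.
payload = {
    "time": 1503209999.111,
    "event": "metric",
    "source": "disk",
    "host": "host_99",
    "fields": {
        "region": "us-west-1",
        "datacenter": "us-west-1a",
        "rack": "63",
        "os": "Ubuntu16.10",
        "arch": "x64",
        "team": "LON",
        "service": "6",
        "service_version": "0",
        "service_environment": "test",
        "path": "/dev/sda1",
        "fstype": "ext3",
        "_value": 1099511627776,
        "metric_name": "total",
    },
}

# The two fields every measurement needs; they are not dimensions.
RESERVED = {"_value", "metric_name"}


def dimension_names(event):
    """Return the sorted dimension names found under event["fields"]."""
    return sorted(key for key in event["fields"] if key not in RESERVED)


dims = dimension_names(payload)
print(len(dims), dims)

# JSON body equivalent to the string passed to curl with -d.
body = json.dumps(payload)
```
&lt;/PRE&gt;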

&lt;P&gt;See 'splunk cmd walklex ./var/lib/splunk//db/yyyy.tsidx "" | less'&lt;/P&gt;

&lt;P&gt;my needle: &lt;BR /&gt;
0 1  arch::x64&lt;BR /&gt;
1 1  datacenter::us-west-1a&lt;BR /&gt;
2 1  fstype::ext3&lt;BR /&gt;
3 1  host::host_99&lt;BR /&gt;
4 1  metric_name::total&lt;BR /&gt;
5 1  os::Ubuntu16.10&lt;BR /&gt;
6 1  path::/dev/sda1&lt;BR /&gt;
7 1  rack::63&lt;BR /&gt;
8 1  region::us-west-1&lt;BR /&gt;
9 1  service::6&lt;BR /&gt;
10 1  service_environment::test&lt;BR /&gt;
11 1  service_version::0&lt;BR /&gt;
12 1  source::disk&lt;BR /&gt;
13 1  sourcetype::metrics_hse&lt;BR /&gt;
14 1  team::LON&lt;BR /&gt;
15 1 _catalog::total|arch|datacenter|fstype|os|path|rack|region|service|service_environment|service_version|team&lt;BR /&gt;
16 1 _dims::arch&lt;BR /&gt;
17 1 _dims::datacenter&lt;BR /&gt;
18 1 _dims::fstype&lt;BR /&gt;
19 1 _dims::os&lt;BR /&gt;
20 1 _dims::path&lt;BR /&gt;
21 1 _dims::rack&lt;BR /&gt;
22 1 _dims::region&lt;BR /&gt;
23 1 _dims::service&lt;BR /&gt;
24 1 _dims::service_environment&lt;BR /&gt;
25 1 _dims::service_version&lt;BR /&gt;
26 1 _dims::team&lt;BR /&gt;
27 1 _subsecond::.111&lt;BR /&gt;
28 1 arch::x64&lt;BR /&gt;
29 1 datacenter::us-west-1a&lt;BR /&gt;
30 1 fstype::ext3&lt;BR /&gt;
31 1 host::host_99&lt;BR /&gt;
32 1 metric_name::total&lt;BR /&gt;
33 1 os::ubuntu16.10&lt;BR /&gt;
34 1 path::/dev/sda1&lt;BR /&gt;
35 1 rack::63&lt;BR /&gt;
36 1 region::us-west-1&lt;BR /&gt;
37 1 service::6&lt;BR /&gt;
38 1 service_environment::test&lt;BR /&gt;
39 1 service_version::0&lt;BR /&gt;
40 1 source::disk&lt;BR /&gt;
41 1 sourcetype::metrics_hse&lt;BR /&gt;
42 1 team::lon&lt;/P&gt;
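
&lt;P&gt;To pull the dimension names out of a listing like this, here is a small Python sketch. It assumes each line consists of two numeric columns followed by a field::value term, which is how the output above looks; the function name parse_walklex is hypothetical and is not part of Splunk:&lt;/P&gt;

&lt;PRE&gt;
```python
# A few lines copied from the walklex listing above.
sample = """\
6 1  path::/dev/sda1
16 1 _dims::arch
17 1 _dims::datacenter
18 1 _dims::fstype
28 1 arch::x64
29 1 datacenter::us-west-1a
"""


def parse_walklex(text):
    """Yield (field, value) pairs from lines shaped like 'N N field::value'."""
    for line in text.splitlines():
        parts = line.split(None, 2)  # two numeric columns, then the term
        if len(parts) != 3:
            continue
        field, sep, value = parts[2].partition("::")
        if sep:
            yield field, value


pairs = list(parse_walklex(sample))

# The values of the _dims terms are the dimension names.
dims = [value for field, value in pairs if field == "_dims"]
print(dims)
```
&lt;/PRE&gt;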

&lt;P&gt;The "_dims" terms are used by "| mstats". I did this with 1 million data points; search time was approx. 1 sec on a MacBook.&lt;/P&gt;

&lt;P&gt;Does that make more sense to you?&lt;/P&gt;

&lt;P&gt;Holger&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 16:22:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318460#M59517</guid>
      <dc:creator>hsesterhenn_spl</dc:creator>
      <dc:date>2020-09-29T16:22:24Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318461#M59518</link>
      <description>&lt;P&gt;Yes, this makes sense; these are things I have already done with my own custom sourcetypes for metrics.&lt;/P&gt;

&lt;P&gt;The issue is what Splunk is going to say are the best practices, recommendations, and limitations for dimensions and data sets.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Oct 2017 11:27:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318461#M59518</guid>
      <dc:creator>rjthibod</dc:creator>
      <dc:date>2017-10-23T11:27:04Z</dc:date>
    </item>
    <item>
      <title>Re: Performance / Design recommendations for dimensions in Metrics Index</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318462#M59519</link>
      <description>&lt;P&gt;I was able to download 226 recordings.  That's pretty much all of them.  For an easy way to download all of the sessions, try this.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;curl --silent &lt;A href="http://conf.splunk.com/sessions/2017-sessions.html" target="test_blank"&gt;http://conf.splunk.com/sessions/2017-sessions.html&lt;/A&gt; 2&amp;gt;&amp;amp;1 | egrep -i speaker-file | wget -B &lt;A href="http://conf.splunk.com" target="test_blank"&gt;http://conf.splunk.com&lt;/A&gt; -F -i - --continue&lt;/CODE&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Oct 2017 13:04:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Performance-Design-recommendations-for-dimensions-in-Metrics/m-p/318462#M59519</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2017-10-23T13:04:32Z</dc:date>
    </item>
  </channel>
</rss>

