<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Issue with count -- How can I search a large data set without Splunk truncating the data? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303906#M91397</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I would like to query all data over the past year and then use "stats count by some fields" to calculate the counts.&lt;/P&gt;

&lt;P&gt;However, the data set is too large (at least a few million events) and Splunk truncates the data when querying, so the resulting counts are inaccurate.&lt;/P&gt;

&lt;P&gt;Does anyone know a good way to fix it?&lt;/P&gt;

&lt;P&gt;PS. I tried 'sistats' and scheduled a report to run every hour, querying data from the previous year.&lt;BR /&gt;
Ideally, the report would collect data accurately over smaller time intervals and then aggregate the results.&lt;BR /&gt;
However, each hour the report queries the whole previous year inaccurately and then adds up all the counts as the result.&lt;/P&gt;</description>
    <pubDate>Mon, 28 Aug 2017 20:53:14 GMT</pubDate>
    <dc:creator>closeset</dc:creator>
    <dc:date>2017-08-28T20:53:14Z</dc:date>
    <item>
      <title>Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303906#M91397</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I would like to query all data over the past year and then use "stats count by some fields" to calculate the counts.&lt;/P&gt;

&lt;P&gt;However, the data set is too large (at least a few million events) and Splunk truncates the data when querying, so the resulting counts are inaccurate.&lt;/P&gt;

&lt;P&gt;Does anyone know a good way to fix it?&lt;/P&gt;

&lt;P&gt;PS. I tried 'sistats' and scheduled a report to run every hour, querying data from the previous year.&lt;BR /&gt;
Ideally, the report would collect data accurately over smaller time intervals and then aggregate the results.&lt;BR /&gt;
However, each hour the report queries the whole previous year inaccurately and then adds up all the counts as the result.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2017 20:53:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303906#M91397</guid>
      <dc:creator>closeset</dc:creator>
      <dc:date>2017-08-28T20:53:14Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303907#M91398</link>
      <description>&lt;P&gt;Can you provide the original query that ended up being truncated, as well as the query you're using to try summary indexing? Replace any sensitive information. This will help the community answer your question more accurately.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2017 21:17:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303907#M91398</guid>
      <dc:creator>cmerriman</dc:creator>
      <dc:date>2017-08-28T21:17:01Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303908#M91399</link>
      <description>&lt;P&gt;Are you referring to the number of rows getting truncated? &lt;/P&gt;

&lt;P&gt;If so, I had a similar problem a while back where Splunk would truncate anything beyond 50,000 rows, which led to inaccurate results. Luckily, this is a simple fix in &lt;CODE&gt;limits.conf&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;maxresultrows = &amp;lt;integer&amp;gt;
* Configures the maximum number of events are generated by search commands which 
grow the size of your result set (such as multikv) or that create events. Other search commands are explicitly 
controlled in specific stanzas below.
* This limit should not exceed 50000. Setting this limit higher than 50000 causes instability.
* Defaults to 50000. 
&lt;/CODE&gt;&lt;/PRE&gt;
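
&lt;P&gt;For reference, a minimal sketch of how the override might look, assuming the setting lives under the &lt;CODE&gt;[searchresults]&lt;/CODE&gt; stanza (as in the 6.2.1 spec) and goes into a local override file such as &lt;CODE&gt;$SPLUNK_HOME/etc/system/local/limits.conf&lt;/CODE&gt;. The value shown is only a placeholder, so check the spec for your version before changing it:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# limits.conf (local override; stanza name assumed from the 6.2.1 spec)
[searchresults]
# Placeholder value; the spec text above warns against exceeding 50000
maxresultrows = 50000
&lt;/CODE&gt;&lt;/PRE&gt;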

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Limitsconf"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/Limitsconf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2017 21:44:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303908#M91399</guid>
      <dc:creator>skoelpin</dc:creator>
      <dc:date>2017-08-28T21:44:14Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303909#M91400</link>
      <description>&lt;P&gt;Hi, thanks for reminding me. The code is here:&lt;/P&gt;

&lt;P&gt;The search behind the summary indexing report, which is named "my_report_name":&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=my_source event_id=*
| sistats count by event_id field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The search to retrieve the result:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=summary search_name="my_report_name"
| stats count by event_id field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 29 Sep 2020 15:33:07 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303909#M91400</guid>
      <dc:creator>closeset</dc:creator>
      <dc:date>2020-09-29T15:33:07Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303910#M91401</link>
      <description>&lt;P&gt;Thank you skoelpin!&lt;BR /&gt;
This is one possible solution for me. Since increasing the limit might cause some instability, do you happen to know of any other possible methods?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Aug 2017 22:30:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303910#M91401</guid>
      <dc:creator>closeset</dc:creator>
      <dc:date>2017-08-28T22:30:44Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303911#M91402</link>
      <description>&lt;P&gt;Is the number of rows getting truncated after 50k? If so, then this may be your only solution.&lt;/P&gt;

&lt;P&gt;I've increased the limit before and haven't seen any instability issues. I would contact support and get their opinion before trying this in production.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Aug 2017 13:19:52 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303911#M91402</guid>
      <dc:creator>skoelpin</dc:creator>
      <dc:date>2017-08-30T13:19:52Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303912#M91403</link>
      <description>&lt;P&gt;The limit is there to protect your browser from locking up (among other reasons, or at least that's what I believe). When you load that much into memory, things can get funny. "Unstable", even!&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 01:10:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303912#M91403</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-08-31T01:10:30Z</dc:date>
    </item>
    <item>
      <title>Re: Issue with count -- How can I search a large data set without Splunk truncating the data?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303913#M91404</link>
      <description>&lt;P&gt;I think you have several options.&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Create and accelerate a data model.&lt;/LI&gt;
&lt;LI&gt;Create summary indexes, using searches that run every day or more frequently (like every 5-15 minutes), and then use a backfill script.&lt;/LI&gt;
&lt;LI&gt;Do both of the above: create an accelerated data model, then build summary indexes from it using the tstats command, etc.&lt;/LI&gt;
&lt;/OL&gt;
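
&lt;P&gt;As a rough sketch of option 2, reusing the search and report name posted earlier in this thread: the key assumption is that the scheduled report (with summary indexing enabled) covers only the hour that just finished, rather than the whole year, so each event is summarized exactly once. The &lt;CODE&gt;earliest&lt;/CODE&gt; and &lt;CODE&gt;latest&lt;/CODE&gt; values and the hourly schedule are illustrative, not something confirmed for this environment:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype=my_source event_id=* earliest=-1h@h latest=@h
| sistats count by event_id field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The year-long totals could then be read back from the summary index with the matching &lt;CODE&gt;stats&lt;/CODE&gt; command:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=summary search_name="my_report_name" earliest=-1y
| stats count by event_id field1 field2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For option 1, a query against an accelerated data model might look something like the following. &lt;CODE&gt;My_Model&lt;/CODE&gt; and &lt;CODE&gt;My_Root&lt;/CODE&gt; are hypothetical names for the data model and its root dataset, so substitute your own:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| tstats count from datamodel=My_Model by My_Root.event_id My_Root.field1 My_Root.field2
&lt;/CODE&gt;&lt;/PRE&gt;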

&lt;P&gt;Number one is the easiest approach. Number two is a faster approach. Number three is necessary if you need to correlate data from more than one really large data set.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Aug 2017 07:33:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Issue-with-count-How-can-I-search-a-large-data-set-without/m-p/303913#M91404</guid>
      <dc:creator>jkat54</dc:creator>
      <dc:date>2017-08-31T07:33:09Z</dc:date>
    </item>
  </channel>
</rss>

