<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Counting Unique Events Over 30 Days in Reporting</title>
    <link>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559507#M9337</link>
    <description>&lt;P&gt;"But the issue persists where Splunk goes through more than 600,000,000 events first and then it proceeds to dedup the data. My issue is very similar to:"&lt;/P&gt;&lt;P&gt;I guess I'm not sure what you are trying to resolve then. Your search returns 600M matching events and you're running a dedup on the result set, so that is expected behavior. Splunk can't dedup fields before getting the fields/events to evaluate.&lt;BR /&gt;&lt;BR /&gt;Also, tstats are extremely fast so this shouldn't be a performance concern.&lt;/P&gt;</description>
    <pubDate>Wed, 14 Jul 2021 21:53:50 GMT</pubDate>
    <dc:creator>codebuilder</dc:creator>
    <dc:date>2021-07-14T21:53:50Z</dc:date>
    <item>
      <title>Counting Unique Events Over 30 Days</title>
      <link>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559305#M9327</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;How can I improve my Splunk query so that each unique event is counted only once over a 30-day span in which 500,000,000 events match?&lt;/P&gt;&lt;P&gt;This is the query I have so far:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| tstats count WHERE (index=&amp;lt;my_index&amp;gt; sourcetype=json_data earliest=-30d latest=-0h) BY _time span=1mon, host, address, server&lt;/LI-CODE&gt;&lt;P&gt;This query returns approximately 600,000,000 events, but I only need to count each unique event once at the host level. Since I'm using the tstats command first to retrieve data, I made sure that indexes exist on _time, host, address, and server. My problem is that Splunk first retrieves all of the matching events and only then removes the duplicates. Is there a way to retrieve just the unique events by host, address, and server?&lt;/P&gt;&lt;P&gt;For example, a host could have the following events over the past 30 days:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;_time&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;host&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;address&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;server&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;testenv1&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.10&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;testenv1&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.10&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" 
height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;testenv1&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.10&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;testenv2&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.11&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;&amp;nbsp;testenv2&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.11&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07-13 12:55:08&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;&amp;nbsp;testenv2&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.11&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And I want my query to do this:&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;_time&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;host&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;address&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;server&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;testenv1&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.10&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;store1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%" height="25px"&gt;&lt;SPAN&gt;2021-07&lt;/SPAN&gt;&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;&amp;nbsp;testenv2&lt;/TD&gt;&lt;TD width="25%" height="25px"&gt;10.10.10.11&lt;/TD&gt;&lt;TD width="25%" 
height="25px"&gt;store2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is just a sample of my data. In several cases, we have unique hosts that repeat 20,000 times over a hour time span. I need my Splunk query to display this record just once, without having to retreive all other 20,000 events.&lt;/P&gt;&lt;P&gt;I also tried to use disctinct_counts like this, but this still retrieves all of the duplicated events under the Events tab:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| tstats distinct_count WHERE (index=&amp;lt;my_index&amp;gt; sourcetype=json_data earliest=-30d latest=-0h) BY _time span=1mon, host, address, server&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've browsed multiple Splunk threads and I'm just stumped.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jul 2021 20:29:28 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559305#M9327</guid>
      <dc:creator>dodge27</dc:creator>
      <dc:date>2021-07-13T20:29:28Z</dc:date>
    </item>
    <item>
      <title>Re: Counting Unique Events Over 30 Days</title>
      <link>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559490#M9334</link>
      <description>&lt;P&gt;You can use dedup to filter your results.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/8.2.1/SearchReference/Dedup" target="_blank"&gt;https://docs.splunk.com/Documentation/Splunk/8.2.1/SearchReference/Dedup&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jul 2021 19:28:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559490#M9334</guid>
      <dc:creator>codebuilder</dc:creator>
      <dc:date>2021-07-14T19:28:54Z</dc:date>
    </item>
    <item>
      <title>Re: Counting Unique Events Over 30 Days</title>
      <link>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559505#M9336</link>
      <description>&lt;P&gt;With the dedup command, I tried this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| tstats count WHERE index=&amp;lt;my_index&amp;gt; sourcetype=json_data earliest=-30d latest=-0h BY host, address, server
| dedup host, address, server
| where isnotnull(address) AND host!="(none)"
| stats sum BY host, address, server&lt;/LI-CODE&gt;&lt;P&gt;But the issue persists where Splunk goes through more than 600,000,000 events first and then it proceeds to dedup the data. My issue is very similar to:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.splunk.com/t5/Splunk-Search/How-can-I-limit-the-number-of-events-returned-from-a-search/m-p/281240" target="_blank" rel="noopener"&gt;https://community.splunk.com/t5/Splunk-Search/How-can-I-limit-the-number-of-events-returned-from-a-s...&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Somebody else at work mentioned that Splunk has the capability for distributed search jobs but I'm not quite sure how to do that.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jul 2021 21:26:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559505#M9336</guid>
      <dc:creator>dodge27</dc:creator>
      <dc:date>2021-07-14T21:26:26Z</dc:date>
    </item>
    <item>
      <title>Re: Counting Unique Events Over 30 Days</title>
      <link>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559507#M9337</link>
      <description>&lt;P&gt;"But the issue persists where Splunk goes through more than 600,000,000 events first and then it proceeds to dedup the data. My issue is very similar to:"&lt;/P&gt;&lt;P&gt;I guess I'm not sure what you are trying to resolve then. Your search returns 600M matching events and you're running a dedup on the result set, so that is expected behavior. Splunk can't dedup fields before getting the fields/events to evaluate.&lt;BR /&gt;&lt;BR /&gt;Also, tstats are extremely fast so this shouldn't be a performance concern.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jul 2021 21:53:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Reporting/Counting-Unique-Events-Over-30-Days/m-p/559507#M9337</guid>
      <dc:creator>codebuilder</dc:creator>
      <dc:date>2021-07-14T21:53:50Z</dc:date>
    </item>
  </channel>
</rss>

