<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to get count of unique values by group? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/572131#M199393</link>
    <description>&lt;P&gt;Pandas nunique() is used to get a count of unique values. It returns the Number of pandas unique values in a column. Pandas &lt;A href="https://exerror.com/" target="_self"&gt;DataFrame groupby()&lt;/A&gt; method is used to split data of a particular dataset into groups based on some criteria. The groupby() function split the data on any of the axes.&lt;/P&gt;</description>
    <pubDate>Sun, 24 Oct 2021 05:08:22 GMT</pubDate>
    <dc:creator>milanpatel78</dc:creator>
    <dc:date>2021-10-24T05:08:22Z</dc:date>
    <item>
      <title>How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357087#M105638</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;I am working on query to retrieve count of unique host IPs by user and country. The country has to be grouped into Total vs Total Non-US. The final result would be something  like below -&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;UserId,    Total Unique Hosts,    Total Non-US Unique Hosts&lt;/STRONG&gt;&lt;BR /&gt;
user1, 42, 54&lt;BR /&gt;
user2, 23, 95&lt;/P&gt;

&lt;P&gt;So far I have below query which works but its very slow. Is there any better and faster way to achieve desired result ? Thanks&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=customindex sourcetype=custom src
| iplocation allfields=true lang=code HOST | search Country!=US | stats estdc(HOST) as total_non_us by USERID
| join USERID type="left"
    [
       search index=customindex sourcetype=custom src
         | iplocation allfields=true lang=code HOST | search Country=US | stats estdc(HOST) as total_us by USERID
    ] 
| fillnull
| eval total = total_non_us + total_us
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Mar 2018 15:28:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357087#M105638</guid>
      <dc:creator>indusbull</dc:creator>
      <dc:date>2018-03-16T15:28:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357088#M105639</link>
      <description>&lt;P&gt;This should run more efficiently by avoiding the &lt;CODE&gt;join&lt;/CODE&gt; command and duplicate searching:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=customindex sourcetype=custom src
| iplocation allfields=true lang=code HOST 
| eval us_host=if(Country="US", HOST, NULL), non_us_host=if(Country!="US", HOST, NULL)
| stats estdc(us_host) AS total_us, estdc(non_us_host) AS total_non_us BY USERID
| fillnull
| eval total = total_non_us + total_us
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Mar 2018 17:34:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357088#M105639</guid>
      <dc:creator>elliotproebstel</dc:creator>
      <dc:date>2018-03-16T17:34:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357089#M105640</link>
      <description>&lt;P&gt;@elliotproebstel, I would perform stats first and then apply iplocation to aggregated fields.&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Geostats#Usage"&gt;http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Geostats#Usage&lt;/A&gt;&lt;BR /&gt;
&lt;A href="http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Lookup#Optimizing_your_lookup_search"&gt;http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Lookup#Optimizing_your_lookup_search&lt;/A&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=customindex sourcetype=custom src 
| stats count BY USERID HOST 
| iplocation allfields=true lang=code HOST 
| eval us_host=if(Country="US", HOST, NULL), non_us_host=if(Country!="US", HOST, NULL) 
| stats dc(us_host) AS total_us, dc(non_us_host) AS total_non_us BY USERID
| addtotals row=t col=f
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;@indusbull, can you explain why you are trying to use &lt;CODE&gt;estdc()&lt;/CODE&gt; and not &lt;CODE&gt;dc()&lt;/CODE&gt;?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Mar 2018 19:02:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357089#M105640</guid>
      <dc:creator>niketn</dc:creator>
      <dc:date>2018-03-16T19:02:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357090#M105641</link>
      <description>&lt;P&gt;Good point, thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Mar 2018 20:02:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357090#M105641</guid>
      <dc:creator>elliotproebstel</dc:creator>
      <dc:date>2018-03-16T20:02:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357091#M105642</link>
      <description>&lt;P&gt;@niketnilay I was using dc initially but since it was taking long time I decided to try estdc  since splunk doc mentions that estdc can give better performance.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Mar 2018 20:02:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/357091#M105642</guid>
      <dc:creator>indusbull</dc:creator>
      <dc:date>2018-03-16T20:02:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/512932#M143907</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Pandas nunique() is used to get a count of unique values. It returns the Number of&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://appdividend.com/2020/05/25/pandas-unique-pandas-series-unique-in-python/" target="_blank" rel="noopener noreferrer nofollow"&gt;pandas unique values&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;in a column.&amp;nbsp;&amp;nbsp;&lt;A href="https://appdividend.com/2020/06/02/pandas-dataframe-groupby-method-in-python/" target="_blank" rel="noopener"&gt;Pandas DataFrame groupby()&lt;/A&gt; method is used to split data of a particular dataset into groups based on some criteria. The groupby() function split the data on any of the axes.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Aug 2020 06:38:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/512932#M143907</guid>
      <dc:creator>rushabh92</dc:creator>
      <dc:date>2020-08-07T06:38:16Z</dc:date>
    </item>
    <item>
      <title>Re: How to get count of unique values by group?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/572131#M199393</link>
      <description>&lt;P&gt;Pandas nunique() is used to get a count of unique values. It returns the Number of pandas unique values in a column. Pandas &lt;A href="https://exerror.com/" target="_self"&gt;DataFrame groupby()&lt;/A&gt; method is used to split data of a particular dataset into groups based on some criteria. The groupby() function split the data on any of the axes.&lt;/P&gt;</description>
      <pubDate>Sun, 24 Oct 2021 05:08:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-count-of-unique-values-by-group/m-p/572131#M199393</guid>
      <dc:creator>milanpatel78</dc:creator>
      <dc:date>2021-10-24T05:08:22Z</dc:date>
    </item>
  </channel>
</rss>

