<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why am I getting different count results using "chart count by field" versus "chart count(field) by field"? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218736#M64297</link>
    <description>&lt;P&gt;Interesting!&lt;BR /&gt;
Since _time is very accurate, I could use &lt;CODE&gt;count(_time)&lt;/CODE&gt; in place of &lt;CODE&gt;count(_raw)&lt;/CODE&gt;... Thank you.&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 08:55:03 GMT</pubDate>
    <dc:creator>sistemistiposta</dc:creator>
    <dc:date>2020-09-29T08:55:03Z</dc:date>
    <item>
      <title>Why am I getting different count results using "chart count by field" versus "chart count(field) by field"?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218734#M64295</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I have this raw line:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2016-02-25T15:48:09.762479+01:00 03ucas amavis[1369]: (01369-16) run_av (ClamAV-clamd-stream): p005 p003 p002 p001 INFECTED: Sanesecurity.Jurlbl.51aaae.UNOFFICIAL, Sanesecurity.Jurlbl.51aaae.UNOFFICIAL
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If I run this search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;amavis run_av (\(ClamAV-clamd-stream\) OR \(ClamAV-clamd\)) INFECTED "Sanesecurity.Jurlbl.51aaae.UNOFFICIAL" | rex field=_raw "^(?:[^:\n]*:){6}(?:\s+(?P&amp;lt;virus&amp;gt;[A-z.\-\d,\s]+))" | makemv delim=", " virus | chart count by virus
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I see:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;virus=Sanesecurity.Jurlbl.51aaae.UNOFFICIAL
count=2
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If I run this search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;amavis run_av (\(ClamAV-clamd-stream\) OR \(ClamAV-clamd\)) INFECTED "Sanesecurity.Jurlbl.51aaae.UNOFFICIAL" | rex field=_raw "^(?:[^:\n]*:){6}(?:\s+(?P&amp;lt;virus&amp;gt;[A-z.\-\d,\s]+))" | makemv delim=", " virus | chart count(virus) by virus
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I see:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;virus=Sanesecurity.Jurlbl.51aaae.UNOFFICIAL
count=4
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Why do I see &lt;STRONG&gt;count=4&lt;/STRONG&gt; in the second search?&lt;/P&gt;

&lt;P&gt;Thank you very much&lt;BR /&gt;
Best Regards&lt;BR /&gt;
Marco&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 15:52:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218734#M64295</guid>
      <dc:creator>sistemistiposta</dc:creator>
      <dc:date>2016-02-25T15:52:49Z</dc:date>
    </item>
    <item>
      <title>Re: Why am I getting different count results using "chart count by field" versus "chart count(field) by field"?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218735#M64296</link>
      <description>&lt;P&gt;The problem is the &lt;CODE&gt;count(virus)&lt;/CODE&gt;. Many people assume that &lt;CODE&gt;count(foo)&lt;/CODE&gt; is the same as &lt;CODE&gt;count&lt;/CODE&gt;, but it is not.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;count(virus) by foo&lt;/CODE&gt; will count the number of occurrences of the "virus" field for each of the values of foo.&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;count by foo&lt;/CODE&gt;, on the other hand, will count the number of &lt;EM&gt;rows&lt;/EM&gt; for each of the values of foo.&lt;/P&gt;

&lt;P&gt;Since you have multivalue fields going on in your search, you can begin to see the trouble. &lt;CODE&gt;count(virus)&lt;/CODE&gt; will quite happily count each of the multivalue values as its own "occurrence".&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;count(virus) by virus&lt;/CODE&gt; is even more peculiar. This will consider each value of virus separately in the &lt;CODE&gt;by virus&lt;/CODE&gt; part, but then for the &lt;CODE&gt;count(virus)&lt;/CODE&gt; part it has to count up &lt;EM&gt;all&lt;/EM&gt; occurrences that co-occur with that value, including its own value and other multivalue values.&lt;/P&gt;

&lt;P&gt;In short, if virus is guaranteed to be single-value, &lt;CODE&gt;count(virus) by virus&lt;/CODE&gt; does indeed do the same thing as &lt;CODE&gt;count by virus&lt;/CODE&gt;, but as you can see, once multivalue fields enter the picture, it no longer does.&lt;/P&gt;
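
&lt;P&gt;To make the arithmetic concrete, here is a rough Python model of the two counts described above. It is only a sketch of the semantics explained in this answer, not Splunk's implementation: the function name &lt;CODE&gt;chart_counts&lt;/CODE&gt; and the dict-of-lists event layout are made up for illustration, and it assumes the by clause groups an event once per value of the field, duplicates included.&lt;/P&gt;

```python
from collections import Counter


def chart_counts(events, field):
    """Toy model: return (count by field, count(field) by field)."""
    rows = Counter()    # models: chart count by field
    values = Counter()  # models: chart count(field) by field
    for event in events:
        vals = event.get(field, [])
        # Assumption: the by clause puts the event in a group once per value,
        # duplicates included.
        for v in vals:
            rows[v] += 1            # one row per grouping
            values[v] += len(vals)  # every value on the event is an occurrence
    return rows, values


virus = "Sanesecurity.Jurlbl.51aaae.UNOFFICIAL"
event = {"virus": [virus, virus]}  # what makemv yields for the sample line
rows, values = chart_counts([event], "virus")
print(rows[virus], values[virus])  # 2 4
```

&lt;P&gt;Under that assumption the model gives 2 for &lt;CODE&gt;count by virus&lt;/CODE&gt; and 4 for &lt;CODE&gt;count(virus) by virus&lt;/CODE&gt;, matching the numbers in the question. With a single-value field the two counts agree.&lt;/P&gt;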

&lt;P&gt;As a corollary, &lt;CODE&gt;count(_raw) by foo&lt;/CODE&gt; is a little insane, because you're forcing Splunk to look at the _raw field for really no reason.&lt;/P&gt;

&lt;P&gt;Another note: it's sometimes intuitive to think &lt;CODE&gt;count(foo)&lt;/CODE&gt; will count the distinct values of foo, but it won't, so remember that &lt;CODE&gt;distinct_count(foo)&lt;/CODE&gt;, aka its shorthand &lt;CODE&gt;dc(foo)&lt;/CODE&gt;, is a different thing entirely.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 17:34:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218735#M64296</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2016-02-25T17:34:59Z</dc:date>
    </item>
    <item>
      <title>Re: Why am I getting different count results using "chart count by field" versus "chart count(field) by field"?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218736#M64297</link>
      <description>&lt;P&gt;Interesting!&lt;BR /&gt;
Since _time is very accurate, I could use &lt;CODE&gt;count(_time)&lt;/CODE&gt; in place of &lt;CODE&gt;count(_raw)&lt;/CODE&gt;... Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 08:55:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218736#M64297</guid>
      <dc:creator>sistemistiposta</dc:creator>
      <dc:date>2020-09-29T08:55:03Z</dc:date>
    </item>
    <item>
      <title>Re: Why am I getting different count results using "chart count by field" versus "chart count(field) by field"?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218737#M64298</link>
      <description>&lt;P&gt;hehe. No, don't do that either. If ultimately you want to count the rows, just do &lt;CODE&gt;count&lt;/CODE&gt;. It's always more efficient than any &lt;CODE&gt;count(foo)&lt;/CODE&gt;, even _time.&lt;/P&gt;

&lt;P&gt;Plus, if you get in the habit of doing &lt;CODE&gt;count(_time)&lt;/CODE&gt;, it'll betray you later when you have some funky report with more than one transforming command, and for the later transforming commands there is no _time field.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Feb 2016 17:19:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Why-am-I-getting-different-count-results-using-quot-chart-count/m-p/218737#M64298</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2016-02-26T17:19:51Z</dc:date>
    </item>
  </channel>
</rss>

