<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Calculate stats over a percentage of the data in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44723#M10564</link>
    <description>&lt;P&gt;Thanks for the formatting help!&lt;/P&gt;

&lt;P&gt;Yes, I mean the same thing when I say "access" and "login". I have a set of access logs and I want to find the total count of accesses for the top 10% of users per access point.&lt;/P&gt;</description>
    <pubDate>Thu, 17 Feb 2011 08:20:01 GMT</pubDate>
    <dc:creator>gpburgett</dc:creator>
    <dc:date>2011-02-17T08:20:01Z</dc:date>
    <item>
      <title>Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44717#M10558</link>
      <description>&lt;P&gt;I got a challenging request from a customer regarding their access logs. They want to monitor access patterns, by user, across all the access points in their network. In particular, they are interested in stats for the top 10% of users for each access point (for example, the top 10 out of 100 users). I've managed to get most of the info they want using pretty simple searches, but I'm still stumped on this one:&lt;/P&gt;

&lt;P&gt;total number of logins for the top 10% of users by access point &lt;/P&gt;

&lt;P&gt;I've tried some things using subsearches and the perc() function, but the search string gets too complicated or I end up doing something that Splunk doesn't like. Maybe I'm overthinking it.&lt;/P&gt;

&lt;P&gt;Here's my latest failure (the appendcols and where commands cause the problems):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="access_log" |stats count as EVNT by APNUM USERID 
|appendcols [search sourcetype="access_log" | stats count AS EVNTCNT by APNUM USERID
             | stats p90(EVNTCNT) as LIM by APIP| fields APIP LIM ]
| eval USE=if(EVNT&amp;lt;LIM, "NO", "YES")| table APIP, EVNT, LIM, USE| stats sum(EVNT) by APIP| where USE=YES
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 15 Feb 2011 12:37:37 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44717#M10558</guid>
      <dc:creator>gpburgett</dc:creator>
      <dc:date>2011-02-15T12:37:37Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44718#M10559</link>
      <description>&lt;P&gt;It looks like the "where" command can only be used after a search string and not after a function. Correct?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2011 14:21:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44718#M10559</guid>
      <dc:creator>gpburgett</dc:creator>
      <dc:date>2011-02-15T14:21:04Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44719#M10560</link>
      <description>&lt;P&gt;The search string got cut off. Here's the complete search: &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;sourcetype="wims_auth" | stats count as EVNT by APIP MACID
| appendcols [search sourcetype="wims_auth" | stats count AS EVNTCNT by APIP MACID
             | stats p90(EVNTCNT) as LIM by APIP | fields APIP LIM ]
| eval USE=if(EVNT&amp;lt;LIM, "NO", "YES") | table APIP, EVNT, LIM, USE | stats sum(EVNT) by APIP | where USE=YES
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:24:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44719#M10560</guid>
      <dc:creator>gpburgett</dc:creator>
      <dc:date>2020-09-28T09:24:44Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44720#M10561</link>
      <description>&lt;P&gt;Just tried to break up the search string to make it a bit more readable in Answers.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2011 03:06:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44720#M10561</guid>
      <dc:creator>jrodman</dc:creator>
      <dc:date>2011-02-17T03:06:45Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44721#M10562</link>
      <description>&lt;P&gt;Are logins and accesses the same?&lt;BR /&gt;
Are we starting with the set of logins, and wanting to find, for each access point, the top 10% of users and their count of logins?&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2011 03:10:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44721#M10562</guid>
      <dc:creator>jrodman</dc:creator>
      <dc:date>2011-02-17T03:10:24Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44722#M10563</link>
      <description>&lt;P&gt;One note: "| where USE=YES" is going to look for rows where the value of the USE field is equal to the value of a field named YES. If you mean the literal 3-character value YES, you have to put the YES in double quotes, as in USE="YES". The where command differs from search in a few ways, and this is one of them.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2011 03:45:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44722#M10563</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2011-02-17T03:45:45Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44723#M10564</link>
      <description>&lt;P&gt;Thanks for the formatting help!&lt;/P&gt;

&lt;P&gt;Yes, I mean the same thing when I say "access" and "login". I have a set of access logs and I want to find the total count of accesses for the top 10% of users per access point.&lt;/P&gt;</description>
      <pubDate>Thu, 17 Feb 2011 08:20:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44723#M10564</guid>
      <dc:creator>gpburgett</dc:creator>
      <dc:date>2011-02-17T08:20:01Z</dc:date>
    </item>
    <item>
      <title>Re: Calculate stats over a percentage of the data</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44724#M10565</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;sourcetype=wims_auth | stats count as EVNT by APIP MACID | eventstats perc90(EVNT) as cutoff by APIP | where EVNT&amp;gt;=cutoff | stats sum(EVNT) by APIP
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 17 Feb 2011 14:03:08 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Calculate-stats-over-a-percentage-of-the-data/m-p/44724#M10565</guid>
      <dc:creator>steveyz</dc:creator>
      <dc:date>2011-02-17T14:03:08Z</dc:date>
    </item>
  </channel>
</rss>

