<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using head command in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200790#M58220</link>
    <description>&lt;P&gt;Try top instead and use the date_mday built-in field (you can group by date including month and year too, it's just an example):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;YOURQUERY | top limit=10 yourfield by date_mday
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You can then pipe a stats avg after that if you want a daily average for the top 10. &lt;/P&gt;</description>
    <pubDate>Thu, 24 Dec 2015 08:34:21 GMT</pubDate>
    <dc:creator>javiergn</dc:creator>
    <dc:date>2015-12-24T08:34:21Z</dc:date>
    <item>
      <title>Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200789#M58219</link>
      <description>&lt;P&gt;I have a query that uses stats, and I want to use the head command to retrieve a limited number of events for each day. However, head limits the events for the whole query.&lt;/P&gt;

&lt;P&gt;index=myindex "searchQuery"  |  rex "&amp;amp;lt;messageId&amp;amp;gt;(?&amp;lt;myMsgId&amp;gt;[^&amp;amp;lt;]+)"  | rex "refToMessageId&amp;amp;gt;(?&amp;lt;myMsgId&amp;gt;[^&amp;amp;lt;]+)" | rex field=_raw "(?&amp;lt;fldDay&amp;gt;[\d-]{10}).*\s[\s[a-zA-Z0-9-:.]"  stats earliest(_time) AS startTime, latest(_time) AS endTime, count as TotalEvents by fldDay , myMsgId | eval responseTime=endTime-startTime | where TotalEvents = 2 |  stats avg(responseTime) as avgResponseTime by fldDay &lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 08:13:02 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200789#M58219</guid>
      <dc:creator>nidhiagrawal</dc:creator>
      <dc:date>2020-09-29T08:13:02Z</dc:date>
    </item>
    <item>
      <title>Re: Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200790#M58220</link>
      <description>&lt;P&gt;Try top instead and use the date_mday built-in field (you can group by date including month and year too, it's just an example):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;YOURQUERY | top limit=10 yourfield by date_mday
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You can then pipe a stats avg after that if you want a daily average for the top 10. &lt;/P&gt;</description>
      <pubDate>Thu, 24 Dec 2015 08:34:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200790#M58220</guid>
      <dc:creator>javiergn</dc:creator>
      <dc:date>2015-12-24T08:34:21Z</dc:date>
    </item>
    <item>
      <title>Re: Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200791#M58221</link>
      <description>&lt;P&gt;This takes too long and fails even for just a few hours of events. Is there an alternative for queries over longer time spans? My data set is very large, and this approach does not return any meaningful results.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jul 2017 12:10:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200791#M58221</guid>
      <dc:creator>AshimaE</dc:creator>
      <dc:date>2017-07-10T12:10:24Z</dc:date>
    </item>
    <item>
      <title>Re: Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200792#M58222</link>
      <description>&lt;P&gt;You could try:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;...|sort 0 fldDay - avgResponseTime |streamstats count by fldDay|search count&amp;lt;=5
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm not sure whether that will be quicker than the answer @javiergn gave, but it is another method. The sort orders the results by fldDay and, within each day, by descending avgResponseTime. streamstats then gives what is essentially a row count for each event within its fldDay, so you can search for the first 5 (or any other number).&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jul 2017 13:24:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200792#M58222</guid>
      <dc:creator>cmerriman</dc:creator>
      <dc:date>2017-07-10T13:24:24Z</dc:date>
    </item>
    <item>
      <title>Re: Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200793#M58223</link>
      <description>&lt;P&gt;Your code is invalid.  At the very least, there's a pipe missing before the stats command in the middle.&lt;/P&gt;

&lt;P&gt;Please mark your code as code, for instance using the 101 010 button.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Jul 2017 14:48:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200793#M58223</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-07-10T14:48:38Z</dc:date>
    </item>
    <item>
      <title>Re: Using head command</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200794#M58224</link>
      <description>&lt;P&gt;@AshimaE -&lt;/P&gt;

&lt;P&gt;This code assumes that message IDs are relatively unique across the time you will be running the query: &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex "searchQuery" ("&amp;lt;messageId&amp;gt;" OR "&amp;lt;refToMessageId&amp;gt;")
   [search index=myindex "searchQuery" "&amp;lt;messageId&amp;gt;"
   | rex "\&amp;lt;messageId\&amp;gt;(?&amp;lt;myMsgId&amp;gt;[^\&amp;lt;]+)" 
   | where isnotnull(myMsgId)
   | rex field=_raw "(?&amp;lt;fldDay&amp;gt;[\d-]{10}).*\s[\s[a-zA-Z0-9-\:.]" 
   | dedup 5 fldDay
   | table myMsgId
   | format "(" "" "" "" "OR" ")" 
   | rex field=search mode=sed "s/myMsgId=//g"
   ]
| rename COMMENT as "The above subsearch checks only one type of record and grabs the first five MsgIds for each day."
| rename COMMENT as "then only records with that MsgId somewhere in them will be returned from the main search." 

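| rename COMMENT as "Fields extracted inside the subsearch are not passed to the outer search, so extract them again here."
| rex "\&lt;messageId\&gt;(?&lt;myMsgId&gt;[^\&lt;]+)" 
| rex "refToMessageId\&gt;(?&lt;myMsgId&gt;[^\&lt;]+)" 
| rex field=_raw "(?&lt;fldDay&gt;[\d-]{10}).*\s[\s[a-zA-Z0-9-\:.]" 
| stats earliest(_time) AS startTime, latest(_time) AS endTime, count as TotalEvents by fldDay, myMsgId 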
| where TotalEvents = 2 
| eval responseTime=endTime-startTime 
| stats avg(responseTime) as avgResponseTime by fldDay 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;HR /&gt;

&lt;P&gt;However, if you are going to run this kind of search across any great span of time, you should probably consider creating a summary index. Sampling the first five events for each day (which in practice will be the LAST five events, since Splunk retrieves events in reverse chronological order) is a VERY blunt instrument and not at all valid statistically.&lt;/P&gt;
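
&lt;P&gt;As a rough sketch of the summary index suggestion (not a tested implementation), a scheduled search could write one row per completed message pair to a summary index, and the daily report could then read only that index. The index name my_summary and the hourly window are assumptions for illustration; the index would have to be created first, and pairs that straddle two scheduled windows would be missed unless the windows overlap.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myindex "searchQuery" ("&lt;messageId&gt;" OR "&lt;refToMessageId&gt;") earliest=-1h@h latest=@h
| rex "\&lt;messageId\&gt;(?&lt;myMsgId&gt;[^\&lt;]+)"
| rex "refToMessageId\&gt;(?&lt;myMsgId&gt;[^\&lt;]+)"
| rex field=_raw "(?&lt;fldDay&gt;[\d-]{10}).*\s[\s[a-zA-Z0-9-\:.]"
| stats earliest(_time) AS startTime, latest(_time) AS endTime, count AS TotalEvents by fldDay, myMsgId
| where TotalEvents = 2
| eval responseTime=endTime-startTime, _time=startTime
| table _time, fldDay, myMsgId, responseTime
| rename COMMENT as "my_summary is a hypothetical summary index name."
| collect index=my_summary
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The report would then run against the much smaller summary data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=my_summary
| stats avg(responseTime) AS avgResponseTime by fldDay
&lt;/CODE&gt;&lt;/PRE&gt;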

&lt;P&gt;At the very least, I'd consider some kind of sampling protocol, like this. The number 769 can be any reasonably large number that is less than about 15% of your expected daily volume if you want 5 samples, but I happen to prefer odd primes...&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;   [search index=myindex "searchQuery" "&amp;lt;messageId&amp;gt;"
   | eval my1in769sample= random()%769
   | where my1in769sample==0
   | rex "\&amp;lt;messageId\&amp;gt;(?&amp;lt;myMsgId&amp;gt;[^\&amp;lt;]+)" 
   | where isnotnull(myMsgId)
   | rex field=_raw "(?&amp;lt;fldDay&amp;gt;[\d-]{10}).*\s[\s[a-zA-Z0-9-\:.]" 
   | dedup 5 fldDay
   | table myMsgId
   | format "(" "" "" "" "OR" ")" 
   | rex field=search mode=sed "s/myMsgId=//g"
   ]
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 10 Jul 2017 15:00:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-head-command/m-p/200794#M58224</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2017-07-10T15:00:53Z</dc:date>
    </item>
  </channel>
</rss>

