<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477492#M134004</link>
    <description>&lt;P&gt;Hello all,&lt;/P&gt;

&lt;P&gt;I've had this issue in the past but never really spent the time to find a solution as bin is usually "good enough." Take the following data set:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00 -- id = A
12:01:00 -- id = B
12:02:00 -- id = A
12:03:00 -- id = A
12:04:00 -- id = A
12:05:00 -- id = A
12:06:00 -- id = A
12:07:00 -- id = A
12:08:00 -- id = A
12:09:00 -- id = A
12:10:00 -- id = B
12:11:00 -- id = A
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If I did a &lt;CODE&gt;| bin _time span=5m | stats count by _time id&lt;/CODE&gt; I'd get something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00| A : 4 | B : 1
12:05:00| A : 5 | B : 0
12:10:00| A : 1 | B : 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And if I then wanted to find a period where I had 5 or more As, I'd have 1 period. If I wanted a period with 4 or more As, I'd get 2 periods. Here's a very advanced diagram to illustrate the point:&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8159i63FEEF89266402B1/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Counting the way I'm &lt;EM&gt;intending&lt;/EM&gt;, I would have &lt;STRONG&gt;8&lt;/STRONG&gt; periods with A&amp;gt;=4 and &lt;STRONG&gt;4&lt;/STRONG&gt; periods with A&amp;gt;=5.&lt;/P&gt;

&lt;P&gt;I've tried things like &lt;CODE&gt;transaction&lt;/CODE&gt; with &lt;CODE&gt;maxspan&lt;/CODE&gt;, but that doesn't seem to work, and &lt;CODE&gt;streamstats&lt;/CODE&gt; has been giving me issues when I'm using more than one field. &lt;CODE&gt;bin&lt;/CODE&gt; and &lt;CODE&gt;span&lt;/CODE&gt; seem to do the job most of the time, but I feel like a lot of interesting data can be missed simply because the events happen to straddle a bin boundary.&lt;/P&gt;

&lt;P&gt;Worse, if the following happened and I was looking for A&amp;gt;=5, I would get no results!&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00 -- id = B
12:01:00 -- id = B
12:02:00 -- id = B
12:03:00 -- id = A
12:04:00 -- id = A
12:05:00 -- id = A
12:06:00 -- id = A
12:07:00 -- id = A
12:08:00 -- id = B
12:09:00 -- id = B
12:10:00 -- id = B
12:11:00 -- id = B
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 09 Jan 2020 20:27:58 GMT</pubDate>
    <dc:creator>jadamsplunk</dc:creator>
    <dc:date>2020-01-09T20:27:58Z</dc:date>
    <item>
      <title>How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477492#M134004</link>
      <description>&lt;P&gt;Hello all,&lt;/P&gt;

&lt;P&gt;I've had this issue in the past but never really spent the time to find a solution as bin is usually "good enough." Take the following data set:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00 -- id = A
12:01:00 -- id = B
12:02:00 -- id = A
12:03:00 -- id = A
12:04:00 -- id = A
12:05:00 -- id = A
12:06:00 -- id = A
12:07:00 -- id = A
12:08:00 -- id = A
12:09:00 -- id = A
12:10:00 -- id = B
12:11:00 -- id = A
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If I did a &lt;CODE&gt;| bin _time span=5m | stats count by _time id&lt;/CODE&gt; I'd get something like:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00| A : 4 | B : 1
12:05:00| A : 5 | B : 0
12:10:00| A : 1 | B : 1
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And if I then wanted to find a period where I had 5 or more As, I'd have 1 period. If I wanted a period with 4 or more As, I'd get 2 periods. Here's a very advanced diagram to illustrate the point:&lt;/P&gt;

&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/8159i63FEEF89266402B1/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Counting the way I'm &lt;EM&gt;intending&lt;/EM&gt;, I would have &lt;STRONG&gt;8&lt;/STRONG&gt; periods with A&amp;gt;=4 and &lt;STRONG&gt;4&lt;/STRONG&gt; periods with A&amp;gt;=5.&lt;/P&gt;

&lt;P&gt;I've tried things like &lt;CODE&gt;transaction&lt;/CODE&gt; with &lt;CODE&gt;maxspan&lt;/CODE&gt;, but that doesn't seem to work, and &lt;CODE&gt;streamstats&lt;/CODE&gt; has been giving me issues when I'm using more than one field. &lt;CODE&gt;bin&lt;/CODE&gt; and &lt;CODE&gt;span&lt;/CODE&gt; seem to do the job most of the time, but I feel like a lot of interesting data can be missed simply because the events happen to straddle a bin boundary.&lt;/P&gt;

&lt;P&gt;Worse, if the following happened and I was looking for A&amp;gt;=5, I would get no results!&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;12:00:00 -- id = B
12:01:00 -- id = B
12:02:00 -- id = B
12:03:00 -- id = A
12:04:00 -- id = A
12:05:00 -- id = A
12:06:00 -- id = A
12:07:00 -- id = A
12:08:00 -- id = B
12:09:00 -- id = B
12:10:00 -- id = B
12:11:00 -- id = B
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Jan 2020 20:27:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477492#M134004</guid>
      <dc:creator>jadamsplunk</dc:creator>
      <dc:date>2020-01-09T20:27:58Z</dc:date>
    </item>
    <item>
      <title>Re: How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477493#M134005</link>
      <description>&lt;P&gt;Streamstats seems to work the way you are thinking.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;your search&amp;gt; | reverse | streamstats time_window=5m count by id | table _time id count
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I put the &lt;CODE&gt;reverse&lt;/CODE&gt; in to get the events in chronological order. It produces this outcome, which looks like it lines up with your expectations.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;_time   id  count
2020-01-09 12:00:00 A   1
2020-01-09 12:01:00 B   1
2020-01-09 12:02:00 A   2
2020-01-09 12:03:00 A   3
2020-01-09 12:04:00 A   4
2020-01-09 12:05:00 A   4
2020-01-09 12:06:00 A   5
2020-01-09 12:07:00 A   5
2020-01-09 12:08:00 A   5
2020-01-09 12:09:00 A   5
2020-01-09 12:10:00 B   1
2020-01-09 12:11:00 A   4
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Jan 2020 20:54:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477493#M134005</guid>
      <dc:creator>jimodonald</dc:creator>
      <dc:date>2020-01-09T20:54:38Z</dc:date>
    </item>
    <item>
      <title>Re: How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477494#M134006</link>
      <description>&lt;P&gt;That works if you've filtered down to a single entity with those values, but say I have that dataset 3 or 4 times in roughly the same time period for other entities. Those events intermingle with the other data, and I think &lt;CODE&gt;streamstats&lt;/CODE&gt; ends up resetting on basically every event. At the very least, I have not had any success.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2020 21:13:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477494#M134006</guid>
      <dc:creator>jadamsplunk</dc:creator>
      <dc:date>2020-01-09T21:13:34Z</dc:date>
    </item>
    <item>
      <title>Re: How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477495#M134007</link>
      <description>&lt;P&gt;It seems to work pretty consistently for me for both values, but I'm using pretty small samples (~150 events).&lt;/P&gt;
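
&lt;P&gt;If several entities are intermingled in the same results, one thing that might be worth trying is adding the entity field to the &lt;CODE&gt;by&lt;/CODE&gt; clause, so that each entity/id pair gets its own running count. This is only a sketch: &lt;CODE&gt;entity&lt;/CODE&gt; is a made-up field name standing in for whatever identifies your entities, and the threshold of 5 is taken from the example in the question.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;&amp;lt;your search&amp;gt;
| reverse
| streamstats time_window=5m count by entity id
| where id="A" AND count&amp;gt;=5
| table _time entity id count
&lt;/CODE&gt;&lt;/PRE&gt;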

&lt;P&gt;You may need to tweak &lt;CODE&gt;time_window&lt;/CODE&gt; for the counting, but give it a try.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jan 2020 21:34:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477495#M134007</guid>
      <dc:creator>jimodonald</dc:creator>
      <dc:date>2020-01-09T21:34:01Z</dc:date>
    </item>
    <item>
      <title>Re: How do we overcome the shortcomings of "bin" when trying to find X number of events in a certain time period?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477496#M134008</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;| makeresults count=2 
| streamstats count 
| eval _time=relative_time(_time,-1*count."d@d") 
| makecontinuous _time span=1min
| eval id=mvindex(split("ABC",""),random() % 3)
| table _time id 
| streamstats dc(eval(id="A")) as p_test by id 
| streamstats window=5 sum(p_test) as window_num count as tmp 
| streamstats count(eval(tmp="5")) as session 
| table _time id window_num session 
| eventstats count(eval(window_num&amp;gt;=4)) as "A&amp;gt;=4" count(eval(window_num&amp;gt;=5)) as "A&amp;gt;=5"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;From the table, &lt;CODE&gt;time_window&lt;/CODE&gt; does not work.&lt;BR /&gt;
Please set &lt;CODE&gt;_time&lt;/CODE&gt; appropriately.&lt;/P&gt;</description>
      <pubDate>Sun, 12 Apr 2020 03:44:33 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-do-we-overcome-the-shortcomings-of-quot-bin-quot-when-trying/m-p/477496#M134008</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2020-04-12T03:44:33Z</dc:date>
    </item>
  </channel>
</rss>

