<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dedup using time range in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196205#M56574</link>
    <description>&lt;P&gt;This is exactly what the bin command does: &lt;CODE&gt;bin _time span=1m&lt;/CODE&gt; rounds _time down to the start of its minute. I prefer to call it "bin" rather than its alias "bucket", partly for this reason. The two ideas are often conflated because bin is so often seen feeding stats, as in &lt;CODE&gt;| bin _time span=1m | stats count by _time&lt;/CODE&gt;.&lt;/P&gt;</description>
    <pubDate>Thu, 29 Jan 2015 20:49:53 GMT</pubDate>
    <dc:creator>sideview</dc:creator>
    <dc:date>2015-01-29T20:49:53Z</dc:date>
    <item>
      <title>Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196200#M56569</link>
      <description>&lt;P&gt;Hi. I am creating a search and dashboard to display our last ten locked account events. This seems to work well as I have it configured. One of the things I am doing is using the &lt;CODE&gt;dedup&lt;/CODE&gt; command to remove extra occurrences of an event, given that the lockout events often show up on multiple Active Directory domain controllers (outlined in green below). I am using the "Account_Name" and _time values for this purpose. This works well except where the events are on different domain controllers at different times. In this case, I would prefer to &lt;CODE&gt;dedup&lt;/CODE&gt; using a window of time (say 5 seconds), but I cannot find how to do this. Shown in the example below are some entries outlined in red, where they are the same user but at different times, and I would want to be careful to not exclude those events, so a straight &lt;CODE&gt;dedup&lt;/CODE&gt; does not help.&lt;/P&gt;

&lt;P&gt;Code:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;EventCodeDescription="A user account was locked out" Account_Name=* NOT "Guest" Account_Domain=* Caller_Computer_Name=* dvc=* source="WinEventLog:Security" _time=* | eval Account_Name=mvindex(Account_Name,1)  | dedup Account_Name _time  | rename dvc AS "Domain Controller" | rename Account_Domain AS "Domain Name" | rename Caller_Computer_Name AS "Client Host" | rename Account_Name AS "Account Name" | table _time "Account Name" "Client Host" "Domain Controller" "Domain Name" | sort -_time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Thanks.&lt;/P&gt;

&lt;P&gt;EDIT: Apparently, I am not allowed to attach images. Please see the Evernote link below:&lt;/P&gt;

&lt;P&gt;&lt;A href="https://www.evernote.com/shard/s26/sh/9082054d-788a-491f-92c2-66718d443740/cc2de893310866299d291983497de0dc/deep/0/Locked-Accounts---Last-10.png"&gt;https://www.evernote.com/shard/s26/sh/9082054d-788a-491f-92c2-66718d443740/cc2de893310866299d291983497de0dc/deep/0/Locked-Accounts---Last-10.png&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2015 20:28:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196200#M56569</guid>
      <dc:creator>jhillenburg</dc:creator>
      <dc:date>2015-01-20T20:28:25Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196201#M56570</link>
      <description>&lt;P&gt;Take a look at the bucket command:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://docs.splunk.com/Documentation/Splunk/6.2.1/SearchReference/bucket"&gt;http://docs.splunk.com/Documentation/Splunk/6.2.1/SearchReference/bucket&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2015 20:33:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196201#M56570</guid>
      <dc:creator>trsavela</dc:creator>
      <dc:date>2015-01-20T20:33:39Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196202#M56571</link>
      <description>&lt;P&gt;That certainly groups them together, though if an administrator were trying to search for the events in the logs, they would not find a precise match. Having a dedup threshold would achieve the most desirable behavior.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jan 2015 20:41:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196202#M56571</guid>
      <dc:creator>jhillenburg</dc:creator>
      <dc:date>2015-01-20T20:41:34Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196203#M56572</link>
      <description>&lt;P&gt;Using the bin command (aka bucket) and then running &lt;CODE&gt;dedup _time "Domain Controller"&lt;/CODE&gt; is a good solution.&lt;/P&gt;

&lt;P&gt;One problem with using bin here, though, is that in some cases the duplicate events are only 5 seconds apart yet fall on opposite sides of one of the arbitrary bucket boundaries. To take a worst-case scenario, suppose the user gets locked out at 11:59:57 PM on one DC and at 12:00:02 AM on another DC. In that case&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| bin _time span=5sec | dedup _time "Domain Controller" 
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;won't dedup them, and neither will any other reasonable arguments to bin.&lt;/P&gt;
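
&lt;P&gt;A possible way around fixed bucket boundaries, offered as an untested sketch: compare each event with the previous event for the same account and keep it only when the gap exceeds the threshold. The fragment below assumes it takes the place of the dedup step in the original search (before the rename commands), uses a 5 second threshold, and introduces prev_time as a made-up helper field.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| sort 0 Account_Name _time | streamstats current=f last(_time) AS prev_time BY Account_Name | where isnull(prev_time) OR (_time - prev_time) &amp;gt; 5 | fields - prev_time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Since each event is compared with its immediate predecessor, a run of events spaced 5 seconds or less apart collapses to the first one. _time is never rewritten, so the timestamps shown still match the raw logs.&lt;/P&gt;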

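&lt;P&gt;An odd workaround would be the one below, but be aware that it does not actually escape the problem: flooring to 5 seconds, adding 10, and flooring again to 10 seconds gives the same result as plain 10-second buckets shifted forward by 10 seconds, so two events on either side of a 10-second boundary (including the 11:59:57 PM and 12:00:02 AM pair) still land in different buckets.&lt;/P&gt;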

&lt;PRE&gt;&lt;CODE&gt;| bin _time span=5sec | eval _time=_time+10 | bin _time span=10sec | dedup _time "Domain Controller" 
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 20 Jan 2015 22:17:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196203#M56572</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2015-01-20T22:17:03Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196204#M56573</link>
      <description>&lt;P&gt;What about a way, for the purposes of deduplication (though not display), to round to the nearest one minute? I'm really talking about rounding, not bucketing, which are different things.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Jan 2015 21:18:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196204#M56573</guid>
      <dc:creator>jhillenburg</dc:creator>
      <dc:date>2015-01-28T21:18:26Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup using time range</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196205#M56574</link>
      <description>&lt;P&gt;This is exactly what the bin command does: &lt;CODE&gt;bin _time span=1m&lt;/CODE&gt; rounds _time down to the start of its minute. I prefer to call it "bin" rather than its alias "bucket", partly for this reason. The two ideas are often conflated because bin is so often seen feeding stats, as in &lt;CODE&gt;| bin _time span=1m | stats count by _time&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jan 2015 20:49:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Dedup-using-time-range/m-p/196205#M56574</guid>
      <dc:creator>sideview</dc:creator>
      <dc:date>2015-01-29T20:49:53Z</dc:date>
    </item>
  </channel>
</rss>

