<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using dedup to keep the oldest events in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49038#M11748</link>
    <description>&lt;P&gt;Indeed it does! Thanks for the help David, and for confirming that I'm not going crazy.&lt;/P&gt;</description>
    <pubDate>Wed, 27 Jul 2011 18:45:12 GMT</pubDate>
    <dc:creator>acdevlin</dc:creator>
    <dc:date>2011-07-27T18:45:12Z</dc:date>
    <item>
      <title>Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49033#M11743</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;

&lt;P&gt;I know that the "dedup" command returns the most recent values in time. However, I'm currently in a situation where I want to use dedup to only keep the oldest events from my data (example below). I found &lt;A href="http://splunk-base.splunk.com/answers/10932/is-it-possible-to-use-dedup-to-grab-the-oldest-events-rather-than-the-newest"&gt;the following thread&lt;/A&gt; which is identical to my question, but the proposed solution (sorting on +_time) does not seem to work for me.&lt;/P&gt;

&lt;P&gt;What I specifically have are a bunch of client requests to a web server. Each event has an associated &lt;CODE&gt;req_time&lt;/CODE&gt; and a &lt;CODE&gt;session_id&lt;/CODE&gt;; many transactions can share the same &lt;CODE&gt;session_id&lt;/CODE&gt;. What I want to do is call &lt;CODE&gt;'...|dedup session_id'&lt;/CODE&gt; and have only the OLDEST transaction from each individual &lt;CODE&gt;session_id&lt;/CODE&gt; be returned, rather than the NEWEST.&lt;/P&gt;

&lt;P&gt;Any suggestions on how to accomplish this?&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jul 2011 22:44:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49033#M11743</guid>
      <dc:creator>acdevlin</dc:creator>
      <dc:date>2011-07-26T22:44:36Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49034#M11744</link>
      <description>&lt;P&gt;I think you will find that the sortby parameter does this for you.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;YourSearch | dedup session_id sortby +_time
&lt;/CODE&gt;&lt;/PRE&gt;
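
&lt;P&gt;A minimal sketch of an equivalent form, offered as an assumption rather than something tested against this data: dedup keeps the first event it encounters for each field value, so sorting ascending on _time beforehand should also leave the oldest event per session_id. The 0 lifts the default result limit of sort.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;YourSearch | sort 0 +_time | dedup session_id
&lt;/CODE&gt;&lt;/PRE&gt;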

&lt;P&gt;Check out the docs for more ways you can tweak dedup:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://www.splunk.com/base/Documentation/latest/SearchReference/Dedup"&gt;http://www.splunk.com/base/Documentation/latest/SearchReference/Dedup&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2011 00:29:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49034#M11744</guid>
      <dc:creator>David</dc:creator>
      <dc:date>2011-07-27T00:29:43Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49035#M11745</link>
      <description>&lt;P&gt;Thanks for the reply, David.&lt;/P&gt;

&lt;P&gt;I mentioned that I tried this solution in my earlier question. For some reason, it did not work yesterday and only the oldest events were removed. However, it is working this morning to my pleasant surprise. &lt;/P&gt;

&lt;P&gt;Any idea as to why that happened?&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;EDIT: Answered my own question, but I'm still mystified by it. The query which successfully returned the oldest events included some concurrency information that I had been playing around with.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;... | eval timeout=1599 | ... | concurrency duration=timeout | dedup session_id
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The above works. I have no idea why.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2011 16:07:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49035#M11745</guid>
      <dc:creator>acdevlin</dc:creator>
      <dc:date>2011-07-27T16:07:29Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49036#M11746</link>
      <description>&lt;P&gt;I just tried that, and can definitely confirm what you found. If you toss a concurrency before the dedup, it does return the same results as if you had done a sortby +_time. You should be able to override this by doing a sortby -_time, but that search failed for me ("job ... is a zombie and is no longer with us"). This appears to be a bug, where concurrency is doing some sort of work on _time, and breaking dedup.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:45:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49036#M11746</guid>
      <dc:creator>David</dc:creator>
      <dc:date>2020-09-28T09:45:46Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49037#M11747</link>
      <description>&lt;P&gt;Fortunately, if you need to grab the newest events after running a concurrency (or otherwise want to wrest control of your search's fate from the hands of concurrency), you can work around this by creating another time field. I was able to do:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;MySearch | eval MyTime = _time | concurrency duration=duration output=concurrentevents | dedup MyField sortby -MyTime
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;That ran without the same issue. Likewise, +MyTime works.&lt;/P&gt;

&lt;P&gt;Does that get you where you need to be?&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2011 17:33:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49037#M11747</guid>
      <dc:creator>David</dc:creator>
      <dc:date>2011-07-27T17:33:43Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49038#M11748</link>
      <description>&lt;P&gt;Indeed it does! Thanks for the help David, and for confirming that I'm not going crazy.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jul 2011 18:45:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49038#M11748</guid>
      <dc:creator>acdevlin</dc:creator>
      <dc:date>2011-07-27T18:45:12Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49039#M11749</link>
      <description>&lt;P&gt;Maybe the correct approach is:&lt;/P&gt;

&lt;P&gt;Your_search | reverse | dedup ...&lt;/P&gt;</description>
      <pubDate>Tue, 04 Apr 2017 19:41:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49039#M11749</guid>
      <dc:creator>fli</dc:creator>
      <dc:date>2017-04-04T19:41:16Z</dc:date>
    </item>
    <item>
      <title>Re: Using dedup to keep the oldest events</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49040#M11750</link>
      <description>&lt;P&gt;Hi David,&lt;/P&gt;

&lt;P&gt;I am in a similar situation: I need to retrieve results for the latest time instead of the old events. &lt;BR /&gt;
I ran the following search: &lt;BR /&gt;
index=x | eval sorttime=strptime('_time',"%m/%d/%Y %H:%M:%S%p")| sort -sorttime |dedup hostname compName +_time keepempty=true | xyseries hostname compName status&lt;/P&gt;

&lt;P&gt;This should retrieve results for the latest week (the most recent time), but instead it shows the old week's data.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 00:47:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Using-dedup-to-keep-the-oldest-events/m-p/49040#M11750</guid>
      <dc:creator>rashi83</dc:creator>
      <dc:date>2020-09-30T00:47:32Z</dc:date>
    </item>
  </channel>
</rss>

