<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to get the most recent event with specific fields by "dedup" command in indexer cluster condition? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500814#M195100</link>
    <description>&lt;P&gt;Our goal is to get the most recent event for specific fields using the "dedup" command in an indexer cluster.&lt;/P&gt;

&lt;P&gt;We have read about a similar case at the link below, but we are still confused about the usage of &lt;STRONG&gt;dedup&lt;/STRONG&gt;:&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/323510/how-to-keep-all-most-recent-events-for-a-specific.html"&gt;https://answers.splunk.com/answers/323510/how-to-keep-all-most-recent-events-for-a-specific.html&lt;/A&gt;&lt;BR /&gt;
Our case is as follows.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Event sample (index=myIndex)&lt;/STRONG&gt;&lt;BR /&gt;
Conditions:&lt;BR /&gt;
 (1) 1 search head + 2 indexer instances (we use an indexer cluster)&lt;BR /&gt;
 (2) each event has one duplicate record (marked "duplicated event")&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=serverA, result=pass # duplicated event
2019-12-04 12:00:00, machine=serverA, result=pass 
2019-12-04 12:00:00, machine=serverB, result=pass # duplicated event
2019-12-04 12:00:00, machine=serverB, result=pass
2019-12-03 12:00:00, machine=serverA, result=fail # duplicated event
2019-12-03 12:00:00, machine=serverA, result=fail   
2019-12-03 12:00:00, machine=serverB, result=fail # duplicated event
2019-12-03 12:00:00, machine=serverB, result=fail   
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;We want to get each server's most recent result per day, such as:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Target result&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=serverA, result=pass 
2019-12-04 12:00:00, machine=serverB, result=pass
2019-12-03 12:00:00, machine=serverA, result=fail   
2019-12-03 12:00:00, machine=serverB, result=fail   
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;SPL query&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myIndex
| dedup  _time machine
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Question:&lt;BR /&gt;
Does the "&lt;STRONG&gt;dedup&lt;/STRONG&gt;" command always return the most recent event for the specified fields when the events are spread across multiple indexers?&lt;/P&gt;

&lt;P&gt;In our case, if we run the SPL query above under these conditions, will we always get the target result?&lt;/P&gt;</description>
    <pubDate>Wed, 04 Dec 2019 04:07:10 GMT</pubDate>
    <dc:creator>davidgogogo</dc:creator>
    <dc:date>2019-12-04T04:07:10Z</dc:date>
    <item>
      <title>How to get the most recent event with specific fields by "dedup" command in indexer cluster condition?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500814#M195100</link>
      <description>&lt;P&gt;Our goal is to get the most recent event for specific fields using the "dedup" command in an indexer cluster.&lt;/P&gt;

&lt;P&gt;We have read about a similar case at the link below, but we are still confused about the usage of &lt;STRONG&gt;dedup&lt;/STRONG&gt;:&lt;BR /&gt;
&lt;A href="https://answers.splunk.com/answers/323510/how-to-keep-all-most-recent-events-for-a-specific.html"&gt;https://answers.splunk.com/answers/323510/how-to-keep-all-most-recent-events-for-a-specific.html&lt;/A&gt;&lt;BR /&gt;
Our case is as follows.&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Event sample (index=myIndex)&lt;/STRONG&gt;&lt;BR /&gt;
Conditions:&lt;BR /&gt;
 (1) 1 search head + 2 indexer instances (we use an indexer cluster)&lt;BR /&gt;
 (2) each event has one duplicate record (marked "duplicated event")&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=serverA, result=pass # duplicated event
2019-12-04 12:00:00, machine=serverA, result=pass 
2019-12-04 12:00:00, machine=serverB, result=pass # duplicated event
2019-12-04 12:00:00, machine=serverB, result=pass
2019-12-03 12:00:00, machine=serverA, result=fail # duplicated event
2019-12-03 12:00:00, machine=serverA, result=fail   
2019-12-03 12:00:00, machine=serverB, result=fail # duplicated event
2019-12-03 12:00:00, machine=serverB, result=fail   
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;We want to get each server's most recent result per day, such as:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Target result&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=serverA, result=pass 
2019-12-04 12:00:00, machine=serverB, result=pass
2019-12-03 12:00:00, machine=serverA, result=fail   
2019-12-03 12:00:00, machine=serverB, result=fail   
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;SPL query&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myIndex
| dedup  _time machine
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Question:&lt;BR /&gt;
Does the "&lt;STRONG&gt;dedup&lt;/STRONG&gt;" command always return the most recent event for the specified fields when the events are spread across multiple indexers?&lt;/P&gt;

&lt;P&gt;In our case, if we run the SPL query above under these conditions, will we always get the target result?&lt;/P&gt;</description>
      <pubDate>Wed, 04 Dec 2019 04:07:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500814#M195100</guid>
      <dc:creator>davidgogogo</dc:creator>
      <dc:date>2019-12-04T04:07:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to get the most recent event with specific fields by "dedup" command in indexer cluster condition?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500815#M195101</link>
      <description>&lt;P&gt;&lt;CODE&gt;dedup&lt;/CODE&gt; "removes the events that contain an identical combination of values for the fields that you specify". So as long as the events from all of the indexers are being pulled in to your search head (which, judging from your query results, they are), then yes, it will keep just one event per combination of the two fields you specified.&lt;/P&gt;

&lt;P&gt;"Events returned by dedup are based on search order. For historical searches, the most recent events are searched first." So without an explicit sort, &lt;CODE&gt;dedup&lt;/CODE&gt; processes events in descending _time order, because that is the default order in which Splunk returns events for historical (time-based) searches. You can also sort by _time or other fields within &lt;CODE&gt;dedup&lt;/CODE&gt; itself: &lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/dedup#Optional_arguments" target="_blank"&gt;https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/dedup#Optional_arguments&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;dedup _time machine sortby -_time&lt;/CODE&gt; &lt;/P&gt;

&lt;P&gt;This doesn't make much sense in this case, because you're already specifying _time as a field to dedup on, but it's something to keep in mind for the future. That being said, you can also use the &lt;CODE&gt;stats&lt;/CODE&gt; command, which gives you more control over exactly what is passed through and leaves less ambiguity about which event Splunk chose to keep. Example:&lt;/P&gt;

&lt;P&gt;&lt;CODE&gt;stats values(result) by _time, machine&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;will return the unique set of &lt;CODE&gt;result&lt;/CODE&gt; values for each _time/machine pairing. I prefer this because it's very clear exactly what you're doing, and you can also switch &lt;CODE&gt;values&lt;/CODE&gt; to &lt;CODE&gt;list&lt;/CODE&gt; to see which values are duplicated!&lt;/P&gt;
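
&lt;P&gt;If "per day" should mean one event per machine per calendar day rather than per exact timestamp, the following is a minimal sketch of one way to do it with &lt;CODE&gt;dedup&lt;/CODE&gt;. The field name &lt;CODE&gt;day&lt;/CODE&gt; is a hypothetical helper that is not in your data, and the explicit &lt;CODE&gt;sortby -_time&lt;/CODE&gt; is there so the result does not depend on the default search order:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myIndex
| eval day=strftime(_time, "%Y-%m-%d")
| dedup day machine sortby -_time
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With your sample events, where each machine has a single timestamp per day, this should keep the same events as &lt;CODE&gt;dedup _time machine&lt;/CODE&gt;. It only behaves differently when a machine reports more than once on the same day.&lt;/P&gt;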

&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:15:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500815#M195101</guid>
      <dc:creator>aberkow</dc:creator>
      <dc:date>2020-09-30T03:15:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to get the most recent event with specific fields by "dedup" command in indexer cluster condition?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500816#M195102</link>
      <description>&lt;P&gt;Thanks for your answer! It's really helpful.&lt;/P&gt;

&lt;P&gt;We considered using &lt;CODE&gt;stats&lt;/CODE&gt; before, but there are two reasons why we use &lt;CODE&gt;dedup&lt;/CODE&gt; rather than &lt;CODE&gt;stats&lt;/CODE&gt;:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Performance aspect&lt;/STRONG&gt;&lt;BR /&gt;
If the &lt;CODE&gt;dedup&lt;/CODE&gt; command only searches for the first matching event, does that mean its performance will be much better than &lt;CODE&gt;stats&lt;/CODE&gt;?&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;Query complexity&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;If we have to deal with many fields, such as:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
2019-12-04 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
2019-12-03 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
2019-12-03 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;we can use a very simple query to get the most recent result per day per machine:&lt;BR /&gt;
&lt;CODE&gt;| dedup _time machine&lt;/CODE&gt; &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-12-04 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
2019-12-03 12:00:00, machine=A, field1=x1 field2=x2 field2=x2 field3=x3......field100=x100
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;On the other hand, it would be more complicated to use &lt;CODE&gt;stats&lt;/CODE&gt;, because we would have to handle each field, such as:&lt;BR /&gt;
&lt;CODE&gt;| stats latest(field1), latest(field2)... by _time machine&lt;/CODE&gt;&lt;/P&gt;
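
&lt;P&gt;That said, listing every field may not be necessary. A minimal sketch, assuming the wildcard form of &lt;CODE&gt;stats&lt;/CODE&gt; behaves on our version as we understand it (we have not verified this against our data):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=myIndex
| stats latest(*) AS * BY _time machine
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Note that &lt;CODE&gt;stats&lt;/CODE&gt; is a transforming command, so this would return a table of field values rather than the original events that &lt;CODE&gt;dedup&lt;/CODE&gt; keeps.&lt;/P&gt;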

&lt;P&gt;I'm not sure whether our reasoning makes sense. Do you have any advice for this case?&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2019 09:51:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-get-the-most-recent-event-with-specific-fields-by-quot/m-p/500816#M195102</guid>
      <dc:creator>davidgogogo</dc:creator>
      <dc:date>2019-12-05T09:51:11Z</dc:date>
    </item>
  </channel>
</rss>

