<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Help with creating an earliest/latest event table in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423334#M121496</link>
    <description>&lt;P&gt;(Using Splunk 6.1.2 for...reasons)&lt;/P&gt;

&lt;P&gt;Background: We send out a push notification to a third party. The third party sometimes responds quickly with a data pull for that event, and sometimes days later.&lt;/P&gt;

&lt;P&gt;The notifications are automatically extracted as the field  &lt;CODE&gt;"pushID"&lt;/CODE&gt;&lt;BR /&gt;
The data pulls are automatically extracted as the field  &lt;CODE&gt;"pullID"&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;Several notifications and several extractions can occur for each event. To further complicate matters, there can be several irrelevant &lt;CODE&gt;pullID&lt;/CODE&gt; events occurring before the relevant  &lt;CODE&gt;pushID&lt;/CODE&gt; event occurs.&lt;/P&gt;

&lt;P&gt;What I am needing to do is get the  &lt;CODE&gt;pushID&lt;/CODE&gt; event time that occurs before the very first  &lt;CODE&gt;pullID&lt;/CODE&gt; time following, and calculate the delay (in decimal hours - which makes it a tad bit easier).&lt;/P&gt;

&lt;P&gt;Up to this point, I have been using a memory hogging  &lt;CODE&gt;join&lt;/CODE&gt; subsearch on  &lt;CODE&gt;pullID&lt;/CODE&gt; for every stats earliest (&lt;CODE&gt;pushID&lt;/CODE&gt;) and attempting to return the stats latest(&lt;CODE&gt;pullID&lt;/CODE&gt;) result. Unfortunately, this is not returning correct results - and I need help in building something leaner and more accurate.&lt;/P&gt;

&lt;P&gt;Sample Data (all from same host and source):&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Jan 18 21:23:32 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:16:06 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:16:05 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:15:58 web event-server: EVENT_NOTIFICATION for pushID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:14:11 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:03 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:02 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:00 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:12:35 web event-server: EVENT_NOTIFICATION for pushID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 13:11:18 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 13:11:18 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 11:53:58 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Under my current code, the correct  &lt;CODE&gt;pushID&lt;/CODE&gt; time is returning (Jan 15 14:12:35) but the wrong  &lt;CODE&gt;pullID&lt;/CODE&gt; result is returned (Jan 18 21:23:32). Mostly complete code (minus some other evals for other things) below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00 pushID!=NULL
...[other evals and rex here]...
| stats earliest(_time) as pushTime by pushID
| fields pushID pushTime
| eval pullID=pushID 
| join pullID max=0  usetime=true earlier=false
[search host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00
| stats earliest(_time) as pullTime by pullID
| fields pullID pullTime
] 
| eval delayTime=pullTime-pushTime
| where delayTime &amp;gt; 0
...[other evals here]...
| eval delayHours = delayTime/3600
| table pushID pushTime pullID pullTime delayHours
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;OL&gt;
&lt;LI&gt;What am I doing wrong, and how can I fix it?&lt;/LI&gt;
&lt;LI&gt;Is there a much simpler and less expensive way to retrieve the correct results?
(have tried transactions, selfjoins, subsearches without joins, appendcols, etc.)&lt;/LI&gt;
&lt;/OL&gt;</description>
    <pubDate>Wed, 30 Sep 2020 01:34:09 GMT</pubDate>
    <dc:creator>katharsys</dc:creator>
    <dc:date>2020-09-30T01:34:09Z</dc:date>
    <item>
      <title>Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423334#M121496</link>
      <description>&lt;P&gt;(Using Splunk 6.1.2 for...reasons)&lt;/P&gt;

&lt;P&gt;Background: We send out a push notification to a third party. The third party sometimes responds quickly with a data pull for that event, and sometimes days later.&lt;/P&gt;

&lt;P&gt;The notifications are automatically extracted as the field  &lt;CODE&gt;"pushID"&lt;/CODE&gt;&lt;BR /&gt;
The data pulls are automatically extracted as the field  &lt;CODE&gt;"pullID"&lt;/CODE&gt;&lt;/P&gt;

&lt;P&gt;Several notifications and several extractions can occur for each event. To further complicate matters, there can be several irrelevant &lt;CODE&gt;pullID&lt;/CODE&gt; events occurring before the relevant  &lt;CODE&gt;pushID&lt;/CODE&gt; event occurs.&lt;/P&gt;

&lt;P&gt;What I am needing to do is get the  &lt;CODE&gt;pushID&lt;/CODE&gt; event time that occurs before the very first  &lt;CODE&gt;pullID&lt;/CODE&gt; time following, and calculate the delay (in decimal hours - which makes it a tad bit easier).&lt;/P&gt;

&lt;P&gt;Up to this point, I have been using a memory hogging  &lt;CODE&gt;join&lt;/CODE&gt; subsearch on  &lt;CODE&gt;pullID&lt;/CODE&gt; for every stats earliest (&lt;CODE&gt;pushID&lt;/CODE&gt;) and attempting to return the stats latest(&lt;CODE&gt;pullID&lt;/CODE&gt;) result. Unfortunately, this is not returning correct results - and I need help in building something leaner and more accurate.&lt;/P&gt;

&lt;P&gt;Sample Data (all from same host and source):&lt;/P&gt;

&lt;P&gt;&lt;EM&gt;Jan 18 21:23:32 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:16:06 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:16:05 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:15:58 web event-server: EVENT_NOTIFICATION for pushID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:14:11 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:03 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:02 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:13:00 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 15 14:12:35 web event-server: EVENT_NOTIFICATION for pushID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 13:11:18 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 13:11:18 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;BR /&gt;
Jan 14 11:53:58 web event-server: Requesting pull event for pullID: ORG_12345_EVENT12345ABC1&lt;/EM&gt;&lt;/P&gt;

&lt;P&gt;Under my current code, the correct  &lt;CODE&gt;pushID&lt;/CODE&gt; time is returning (Jan 15 14:12:35) but the wrong  &lt;CODE&gt;pullID&lt;/CODE&gt; result is returned (Jan 18 21:23:32). Mostly complete code (minus some other evals for other things) below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00 pushID!=NULL
...[other evals and rex here]...
| stats earliest(_time) as pushTime by pushID
| fields pushID pushTime
| eval pullID=pushID 
| join pullID max=0  usetime=true earlier=false
[search host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00
| stats earliest(_time) as pullTime by pullID
| fields pullID pullTime
] 
| eval delayTime=pullTime-pushTime
| where delayTime &amp;gt; 0
...[other evals here]...
| eval delayHours = delayTime/3600
| table pushID pushTime pullID pullTime delayHours
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;OL&gt;
&lt;LI&gt;What am I doing wrong, and how can I fix it?&lt;/LI&gt;
&lt;LI&gt;Is there a much simpler and less expensive way to retrieve the correct results?
(have tried transactions, selfjoins, subsearches without joins, appendcols, etc.)&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:34:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423334#M121496</guid>
      <dc:creator>katharsys</dc:creator>
      <dc:date>2020-09-30T01:34:09Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423335#M121497</link>
      <description>&lt;P&gt;Hi  katharsys,&lt;BR /&gt;
probably the problems is related to the limit of 50,000 events in subsearches.&lt;BR /&gt;
so you have to use a different approach;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00 pushID!=NULL) OR (host=web source=/var/log/messages earliest=01/14/2019:00:00:00 latest=01/20/2019:24:00:00)
| stats earliest(_time) AS pushTime latest(_time) AS pullTime BY pullID
| eval delayTime=pullTime-pushTime
 | where delayTime &amp;gt; 0
 ...[other evals here]...
 | eval delayHours = delayTime/3600
 | table pushID pushTime pullID pullTime delayHours
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If you need to use other fields insert in | stats also values(your_field) AS your_field&lt;BR /&gt;
In addition I suggest to always use index in your searches.&lt;/P&gt;

&lt;P&gt;Bye.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:34:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423335#M121497</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2020-09-30T01:34:20Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423336#M121498</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am sort of confused by the response. The way you have that written, it is simply getting the earliest pullID timestamp and the latest pullID timestamp... To use my exemplar data above, it would simply return Jan 14 11:53:58 and assign it to the pushTime(?!) and Jan 18 21:23:32 as pullTime, which is the issue I have having trouble with...&lt;/P&gt;

&lt;P&gt;There is no correlation between the latest pushID time and the earliest pullID time that occurs following the pushID time.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 16:19:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423336#M121498</guid>
      <dc:creator>katharsys</dc:creator>
      <dc:date>2019-08-01T16:19:30Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423337#M121499</link>
      <description>&lt;P&gt;hello&lt;/P&gt;

&lt;P&gt;I think i have understood your problem statement. Maybe you can try tweaking your approach, instead of JOIN, use append and then try use transaction to find the time difference between the events.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://docs.splunk.com/Documentation/Splunk/7.3.0/SearchReference/Transaction"&gt;https://docs.splunk.com/Documentation/Splunk/7.3.0/SearchReference/Transaction&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;hopefully this helps find a solution, good luck.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 17:00:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423337#M121499</guid>
      <dc:creator>praphulla1</dc:creator>
      <dc:date>2019-08-01T17:00:05Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423338#M121500</link>
      <description>&lt;P&gt;After scouring more of the Splunk answers pages, I have cobbled together a result that works reasonably well.  Unfortunately, my Karma is too low for me to actually post links to the various answers used, but I will try to add them at the end.&lt;/P&gt;

&lt;P&gt;There is only one condition on this search which I can't seem to figure out - if by some chance, there is a pullID that is incredibly late, it will get knocked out of the results if there is a pullID that occurs before the pushID timestamp. That said, here is the resulting pseudocode:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=main earliest=-1w@w1 latest=@w1 sourcetype=syslog (pushID=* OR pullID=*)
| eval matchID=coalesce(pushID,pullID)
...[Other evals, rex, and dragons be here]...
| eval eventType=if(isnull(pullID),"Push","Pull")
| eval hour-{eventType}=strftime(_time,"%c")
| stats latest(hour-Push) AS PushTime earliest(hour-Pull) as PullTime distinct_count(eventType) AS Type by matchID
| where Type=2
| eval decimalHours=(strptime(PullTime,"%c")-strptime(PushTime,"%c"))/3600
| where decimalHours&amp;gt;0.0084
| sort decimalHours desc
| fields - Type
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;@somesoni2's answer here &lt;A href="https://answers.splunk.com/answers/586269/alternative-to-joins.html#answer-586298"&gt;https://answers.splunk.com/answers/586269/alternative-to-joins.html#answer-586298&lt;/A&gt; for the idea on using distinct_count based on an event type.&lt;/P&gt;

&lt;P&gt;@MuS's answer here &lt;A href="https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-join-append-or-use-of-subsearches.html#answer-129426"&gt;https://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-join-append-or-use-of-subsearches.html#answer-129426&lt;/A&gt; for the how-to on making a dynamic field name.&lt;/P&gt;

&lt;P&gt;EDIT: The where clause for decimalHours was to limit only those pullIDs that were longer than 5 minutes (give or take).&lt;/P&gt;</description>
      <pubDate>Thu, 01 Aug 2019 20:13:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423338#M121500</guid>
      <dc:creator>katharsys</dc:creator>
      <dc:date>2019-08-01T20:13:04Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423339#M121501</link>
      <description>&lt;P&gt;Make sure that you &lt;CODE&gt;UpVote&lt;/CODE&gt; those answers in those links that helped you.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Aug 2019 18:59:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423339#M121501</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-08-04T18:59:48Z</dc:date>
    </item>
    <item>
      <title>Re: Help with creating an earliest/latest event table</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423340#M121502</link>
      <description>&lt;P&gt;Done! Thank you for the reminder to &lt;CODE&gt;UpVote&lt;/CODE&gt; those comments!&lt;/P&gt;</description>
      <pubDate>Mon, 05 Aug 2019 15:24:50 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Help-with-creating-an-earliest-latest-event-table/m-p/423340#M121502</guid>
      <dc:creator>katharsys</dc:creator>
      <dc:date>2019-08-05T15:24:50Z</dc:date>
    </item>
  </channel>
</rss>

