<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Transaction with multiple down events but only one up event results in incorrect duration calculation in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391276#M113945</link>
    <description>&lt;P&gt;hi,&lt;BR /&gt;
have you considered applying something like : eventstats max(Interface Up) as uptime, min(Interface Down) as down | eva duration=uptime-downtime?&lt;BR /&gt;
The issue here is splunk can not know what is the actual time when human engineers are working to fix the issue, so for example 07:22:44 is a downtime event. Now, you know that the human engineers are working in this time , but such information is not available to splunk. The above pseudo code i posted is merely doing max(uptime)- min(downtime) . How do you 'tell' splunk or any other system to ignore the down time at 07:22:44 on an automated basis? Since, you know that the downtime at 07:22:44 should not be considered, the above code will work but how are you going to inform splunk which is the actual downtime to consider as the earliest?&lt;/P&gt;</description>
    <pubDate>Sat, 30 Jun 2018 16:59:01 GMT</pubDate>
    <dc:creator>Sukisen1981</dc:creator>
    <dc:date>2018-06-30T16:59:01Z</dc:date>
    <item>
      <title>Transaction with multiple down events but only one up event results in incorrect duration calculation</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391274#M113943</link>
      <description>&lt;P&gt;I've created a dashboard showing downtime for BGP adjacencies and WAN circuits.&lt;BR /&gt;
It works (almost) perfectly, but recently on one circuit we've seen a down event on a circuit and it's taken a long time to come back up.&lt;BR /&gt;
While troubleshooting the circuit, the engineers have tried several things to bring up the interface, but it has failed several times.&lt;BR /&gt;
This troubleshooting has caused "down" events identical to the "original" down event.&lt;BR /&gt;
I want the dashboard to be able to show the duration between the first "down" event and the recent "up" event, but what the dashboard is showing is very misleading.&lt;/P&gt;

&lt;P&gt;This is the search used:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; index=net &amp;lt;host-ip&amp;gt; AND "%LINK-3-UPDOWN: Interface"
 | rex "%LINK-3-UPDOWN: Interface (?P&amp;lt;Interface_Name&amp;gt;(.+)),"
 | transaction host startswith="Down" endswith="Up" keepevicted=true
 | eval duration=if(duration==0,now()-_time,duration)
 | convert rmunit(duration) as numSecs
 | eval string_dur = tostring(round(numSecs), "duration") 
 | eval formatted_dur = replace(string_dur,"(?:(\d+)\+)?0?(\d+):0?(\d+):0?(\d+)","\1d \2h \3m \4s")
 | eval stringSecs = replace(formatted_dur, "^d (0h (0m )?)?","") 
 | eval InterfaceUpTime = _time + duration
 | table host, Interface_Name, _time, , InterfaceUpTime, stringSecs 
 | rename _time AS "Interface Down", InterfaceUpTime AS "Interface Up", host AS "Reporting Host", Interface_Name AS "Interface", stringSecs AS "Total Downtime"
 | convert timeformat="%a-%d-%b-%Y %H:%M:%S" ctime("Interface Down") ctime("Interface Up")
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;And this is the filtered log output from the device feeding into Splunk:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; 008464: Jun  3 2018 12:50:53.356 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to up
 008466: Jun  3 2018 12:50:53.356 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to up
 008493: Jun  8 2018 10:55:34.676 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to down
 008495: Jun  8 2018 10:55:34.675 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to down
 008527: Jun 12 2018 13:15:29.641 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to down
 008529: Jun 12 2018 13:15:29.644 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to down
 008548: Jun 14 2018 17:35:31.894 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to down
 008549: Jun 14 2018 17:35:31.897 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to down
 008558: Jun 14 2018 17:43:38.631 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to down
 008559: Jun 14 2018 17:43:38.634 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to down
 008573: Jun 15 2018 07:22:05.050 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to up
 008575: Jun 15 2018 07:22:05.050 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to up
 008576: Jun 15 2018 07:22:44.183 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to down
 008579: Jun 15 2018 07:22:44.183 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to down
 008580: Jun 15 2018 07:23:16.648 dst: %LINK-3-UPDOWN: Interface GigabitEthernet0/0/0, changed state to up
 008582: Jun 15 2018 07:23:16.648 dst: %LINK-3-UPDOWN: SIP0/0: Interface GigabitEthernet0/0/0, changed state to up
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;What I'd like to see in the dashboard was the duration between the "Down" event Jun 8 10:55 and until the "Up" event Jun 15 07:23.&lt;/P&gt;

&lt;P&gt;Instead I'm seeing this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; Reporting Host    Interface        Interface Down              Interface Up                Total Downtime
 X.X.X.X    GigabitEthernet0/0/0    Fri-15-Jun-2018 07:22:44    Fri-15-Jun-2018 07:23:17    33s
 X.X.X.X    GigabitEthernet0/0/0    Thu-14-Jun-2018 17:43:39    Fri-15-Jun-2018 07:22:05    13h 38m 26s
 X.X.X.X    GigabitEthernet0/0/0    Thu-14-Jun-2018 17:35:32    Fri-15-Jun-2018 09:06:24    15h 30m 52s
 X.X.X.X    GigabitEthernet0/0/0    Tue-12-Jun-2018 13:15:29    Fri-15-Jun-2018 09:06:24    2d 19h 50m 55s
 X.X.X.X    GigabitEthernet0/0/0    Sun-03-Jun-2018 12:06:57    Sun-03-Jun-2018 12:50:54    43m 57s
 X.X.X.X    GigabitEthernet0/0/0    Sun-03-Jun-2018 11:45:27    Sun-03-Jun-2018 12:04:57    19m 30s
 X.X.X.X    GigabitEthernet0/0/0    Sun-03-Jun-2018 09:03:27    Sun-03-Jun-2018 10:45:27    1h 42m 0s
 X.X.X.X    GigabitEthernet0/0/0    Sat-02-Jun-2018 23:59:05    Sun-03-Jun-2018 00:34:49    35m 44s
 X.X.X.X    GigabitEthernet0/0/0    Wed-23-May-2018 00:37:20    Wed-23-May-2018 00:40:27    3m 7s
 X.X.X.X    GigabitEthernet0/0/0    Fri-08-Jun-2018 10:55:35    Fri-15-Jun-2018 09:06:24    6d 22h 10m 49s
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Any help appreciated &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jun 2018 12:02:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391274#M113943</guid>
      <dc:creator>NJL</dc:creator>
      <dc:date>2018-06-21T12:02:47Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction with multiple down events but only one up event results in incorrect duration calculation</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391275#M113944</link>
      <description>&lt;P&gt;Anybody? &lt;span class="lia-unicode-emoji" title=":grinning_face_with_smiling_eyes:"&gt;😄&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 29 Jun 2018 10:55:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391275#M113944</guid>
      <dc:creator>NJL</dc:creator>
      <dc:date>2018-06-29T10:55:20Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction with multiple down events but only one up event results in incorrect duration calculation</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391276#M113945</link>
      <description>&lt;P&gt;hi,&lt;BR /&gt;
have you considered applying something like : eventstats max(Interface Up) as uptime, min(Interface Down) as down | eva duration=uptime-downtime?&lt;BR /&gt;
The issue here is splunk can not know what is the actual time when human engineers are working to fix the issue, so for example 07:22:44 is a downtime event. Now, you know that the human engineers are working in this time , but such information is not available to splunk. The above pseudo code i posted is merely doing max(uptime)- min(downtime) . How do you 'tell' splunk or any other system to ignore the down time at 07:22:44 on an automated basis? Since, you know that the downtime at 07:22:44 should not be considered, the above code will work but how are you going to inform splunk which is the actual downtime to consider as the earliest?&lt;/P&gt;</description>
      <pubDate>Sat, 30 Jun 2018 16:59:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391276#M113945</guid>
      <dc:creator>Sukisen1981</dc:creator>
      <dc:date>2018-06-30T16:59:01Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction with multiple down events but only one up event results in incorrect duration calculation</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391277#M113946</link>
      <description>&lt;P&gt;&lt;CODE&gt;Transaction&lt;/CODE&gt; is almost never the right answer.  This should get the data you are asking for, and then you can reformat it as needed.  &lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=net &amp;lt;host-ip&amp;gt; AND "%LINK-3-UPDOWN: Interface"
| rex "%LINK-3-UPDOWN: Interface (?P&amp;lt;Interface_Name&amp;gt;(.+)),"

| rename COMMENT as "Find the new status and kill all events without it"
| rename COMMENT as "WARNING - Make sure that that this line properly extracts the status"
| rex "changed state to (?&amp;lt;myStatus&amp;gt;(up|down)"
| where isnotnull(myStatus)

| rename COMMENT as "sort data into order, copy the prior status, kill records that just maintain the prior status"
| sort 0 host _time
| streamstats current=f last(myStatus) as prevStatus by host
| where isnull(prevStatus) OR myStatus &amp;lt;&amp;gt; prevStatus

| rename COMMENT as "reverse the prior order, copy the next later time backwards"
| reverse 
| streamstats current=f last(_time) as nextTime last(myStatus) as nextStatus by host

| rename COMMENT as "calculate duration, fill in for any still down hosts, and kill the up records"
| eval duration= if(isnotnull(nextTime), nextTime - _time, now() - _time) 
| where myStatus == "down"

| rename COMMENT as "sort into desired order"
| sort 0 _time  
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You can identify down interfaces by &lt;CODE&gt;isnull(nextStatus)&lt;/CODE&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Jun 2018 19:10:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391277#M113946</guid>
      <dc:creator>DalJeanis</dc:creator>
      <dc:date>2018-06-30T19:10:30Z</dc:date>
    </item>
    <item>
      <title>Re: Transaction with multiple down events but only one up event results in incorrect duration calculation</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391278#M113947</link>
      <description>&lt;P&gt;Thanks for the elaborate answer!&lt;/P&gt;

&lt;P&gt;It seems to work, so thank you very much for your help!&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jul 2018 10:17:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Transaction-with-multiple-down-events-but-only-one-up-event/m-p/391278#M113947</guid>
      <dc:creator>NJL</dc:creator>
      <dc:date>2018-07-02T10:17:18Z</dc:date>
    </item>
  </channel>
</rss>

