<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Time Skew for when logs are read. in All Apps and Add-ons</title>
    <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427705#M52345</link>
    <description>&lt;P&gt;Also noticed that when the API call is made, the following API call does not follow on from the previous End date/time. In the example below the end date/time was 07/03/2019 17:21:04, but with the next API call the start time was 07/03/2019 17:21:03. If this keeps happening, after a while the timing gap would get larger.&lt;/P&gt;

&lt;P&gt;When TA API call made       Start                                        End&lt;BR /&gt;
07/03/2019 17:45:15     07/03/2019 17:21:03.000 07/03/2019 17:31:03&lt;BR /&gt;
07/03/2019 17:35:15     07/03/2019 17:11:04.000 07/03/2019 17:21:04&lt;/P&gt;</description>
    <pubDate>Tue, 12 Mar 2019 14:27:00 GMT</pubDate>
    <dc:creator>becksyboy</dc:creator>
    <dc:date>2019-03-12T14:27:00Z</dc:date>
    <item>
      <title>Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427704#M52344</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;we deployed the Microsoft Office 365 Reporting Add-on for Splunk five days ago with the following settings:&lt;BR /&gt;
INDEX_CONTINUOUS &lt;BR /&gt;
Interval(secs): 600 &lt;BR /&gt;
Query window size (minutes) 10&lt;BR /&gt;
Delay throttle (minutes) 5&lt;BR /&gt;
Start Time: 2019-03-07T16:01:05 &lt;/P&gt;

&lt;P&gt;On the day of the deployment we where seeing the logs being ingested approx 15 mins behind current time, which is what we expected. Today (5 days later) we are seeing this now extended to 30-35 mins. has anyone experienced this or know of the cause/solution?&lt;/P&gt;

&lt;P&gt;11/03/2019/09:55:07 11/03/2019 09:09    11/03/2019 09:19    &amp;gt;&amp;gt; approx. 35 mins behind current time&lt;BR /&gt;
07/03/2019/16:35:15 07/03/2019 16:11    07/03/2019 16:21    &amp;gt;&amp;gt; approx. 15 mins behind current time&lt;/P&gt;

&lt;P&gt;thanks&lt;/P&gt;</description>
      <pubDate>Mon, 11 Mar 2019 11:26:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427704#M52344</guid>
      <dc:creator>becksyboy</dc:creator>
      <dc:date>2019-03-11T11:26:12Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427705#M52345</link>
      <description>&lt;P&gt;Also noticed that when the API call is made, the following API call does not follow on from the previous End date/time. In the example below the end date/time was 07/03/2019 17:21:04, but with the next API call the start time was 07/03/2019 17:21:03. If this keeps happening, after a while the timing gap would get larger.&lt;/P&gt;

&lt;P&gt;When TA API call made       Start                                        End&lt;BR /&gt;
07/03/2019 17:45:15     07/03/2019 17:21:03.000 07/03/2019 17:31:03&lt;BR /&gt;
07/03/2019 17:35:15     07/03/2019 17:11:04.000 07/03/2019 17:21:04&lt;/P&gt;</description>
      <pubDate>Tue, 12 Mar 2019 14:27:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427705#M52345</guid>
      <dc:creator>becksyboy</dc:creator>
      <dc:date>2019-03-12T14:27:00Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427706#M52346</link>
      <description>&lt;P&gt;You need to be careful of this TA. In a production environment you will find it slowly falls behind your delay throttleuntil it is  hopelessly out of sync.&lt;/P&gt;

&lt;P&gt;A couple of reasons for this.&lt;/P&gt;

&lt;P&gt;It computes the last checkpoint a few different ways:&lt;/P&gt;

&lt;P&gt;1) If messages are downloaded then the last checkpoint is the time of the last message. Sounds sensible, but because it operates on a fixed query window size you will start to fall behind your delay throttle period. Even in a busy environment like yours the last message is maybe a second or so earlier than the actual start_date + window_size. But this means you will be 1 second behind your delay throttle. Repeat this over a number of days and you will fall far behind your delay throttle. This is exactly what you are seeing above. It will never catch up.&lt;/P&gt;

&lt;P&gt;2) If you ever stop the forwarder that this TA runs on you will be in trouble for similar reasons. It simply picks up off at the last checkpoint and applies this + query_window. This means you are will behind your delay throttle and will never catch up.&lt;/P&gt;

&lt;P&gt;To get it to work in a sensible way we had to change how the end_date checkpoint was computed in input_module_ms_o365_message_trace.py:&lt;/P&gt;

&lt;P&gt;end_date = utc_now - datetime.timedelta(minutes=delay_throttle)&lt;/P&gt;

&lt;P&gt;Then if no messages or skipped messages:&lt;/P&gt;

&lt;P&gt;max_date = end_date&lt;/P&gt;

&lt;P&gt;Although I am tempted to put this one as max_date = start_date so you reread the same time period if no messages. Because of MS potential delay.&lt;/P&gt;

&lt;P&gt;Maybe we missed some tunable config params, but we just found it kept falling behind and only way to fix was to update the code to suit our environment.&lt;/P&gt;

&lt;P&gt;-S.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:41:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427706#M52346</guid>
      <dc:creator>swatts1000</dc:creator>
      <dc:date>2020-09-29T23:41:36Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427707#M52347</link>
      <description>&lt;P&gt;To add to this, if we ran this search, we can see the time skew in action:&lt;BR /&gt;
index=_internal host= source="/opt/splunk/var/log/splunk/ta_ms_o365_reporting_ms_o365_message_trace.log" "start_date=" OR "Start date" OR "End Date" OR "Number of messages returned" |rex field=_raw "start_date=(?P\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})" |rex field=_raw "Start\sdate:\s(?P.&lt;EM&gt;?)," | rex field=_raw "End\sdate:\s(?P.&lt;/EM&gt;?)$"| rex field=_raw "Number\sof\smessages\sreturned:\s(?P\d+)" | convert timeformat="%d/%m/%Y/%T %N" ctime(_time) AS last_polled|table last_polled,Number_Of_Messages,checkpoint_good_start, checkpoint_good_end,checkpoint_bad | sort -last_polled&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 23:38:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427707#M52347</guid>
      <dc:creator>becksyboy</dc:creator>
      <dc:date>2020-09-29T23:38:00Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427708#M52348</link>
      <description>&lt;P&gt;Thanks for this. It is exactly this behaviour we see. Would be great if the original authors could fix the TA checkpoint issue.&lt;/P&gt;</description>
      <pubDate>Wed, 08 May 2019 15:39:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427708#M52348</guid>
      <dc:creator>nightflame</dc:creator>
      <dc:date>2019-05-08T15:39:23Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427709#M52349</link>
      <description>&lt;P&gt;I tried this mod to the file in question, but it just started throwing errors so I reverted to the backup copy. Is it just a matter of altering the pyhton script? Or is there more to it than that?&lt;/P&gt;</description>
      <pubDate>Thu, 20 Jun 2019 05:19:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427709#M52349</guid>
      <dc:creator>jeremyhagand61</dc:creator>
      <dc:date>2019-06-20T05:19:30Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427710#M52350</link>
      <description>&lt;P&gt;All we did is modify the python script. Here's what we did:&lt;/P&gt;

&lt;P&gt;In input_module_ms_o365_message_trace.py :&lt;/P&gt;

&lt;P&gt;145      start_date = get_start_date(helper, check_point_key)&lt;BR /&gt;
   146      # end_date = start_date + datetime.timedelta(minutes=query_window_size)&lt;BR /&gt;
   147      # CHANGE&lt;BR /&gt;
   148      utc_now = datetime.datetime.utcnow()&lt;BR /&gt;
   149      # CHANGE - read from last checkpoint up to time now minus delay throttle - this makes more sense!&lt;BR /&gt;
   150      end_date = utc_now - datetime.timedelta(minutes=delay_throttle)&lt;/P&gt;

&lt;P&gt;...&lt;/P&gt;

&lt;P&gt;166          # CHANGE&lt;BR /&gt;
   167          # max_date = start_date + datetime.timedelta(seconds=interval)&lt;BR /&gt;
   168          max_date = end_date&lt;/P&gt;

&lt;P&gt;...&lt;/P&gt;

&lt;P&gt;204      if len(messages) == message_skip_count:&lt;BR /&gt;
   205          # If we skipped all the messages, move the query window forward by the amount of seconds in the interval&lt;BR /&gt;
   206          # CHANGE&lt;BR /&gt;
   207          # max_date = max_date + datetime.timedelta(seconds=interval)&lt;BR /&gt;
   208          max_date = end_date&lt;BR /&gt;
   209          helper.log_debug("Skipped all messages.  Setting max date to %s" % max_date)&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:00:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427710#M52350</guid>
      <dc:creator>swatts1000</dc:creator>
      <dc:date>2020-09-30T01:00:58Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427711#M52351</link>
      <description>&lt;P&gt;Thanks for your reply. However it isn't working for me. I get an ExecProcessor error on this line:&lt;BR /&gt;
end_date = utc_now - datetime.timedelta(minutes=delay_throttle)&lt;/P&gt;

&lt;P&gt;When splunk attempts to run the script.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 01:12:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427711#M52351</guid>
      <dc:creator>jeremyhagand61</dc:creator>
      <dc:date>2020-09-30T01:12:54Z</dc:date>
    </item>
    <item>
      <title>Re: Time Skew for when logs are read.</title>
      <link>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427712#M52352</link>
      <description>&lt;P&gt;Maybe I'll write my own PowerShell modular input.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2019 06:06:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/All-Apps-and-Add-ons/Time-Skew-for-when-logs-are-read/m-p/427712#M52352</guid>
      <dc:creator>jeremyhagand61</dc:creator>
      <dc:date>2019-07-11T06:06:27Z</dc:date>
    </item>
  </channel>
</rss>

