Python script errors(code 255) and Invalid start time errors

becksyboy
Communicator

Hi,

We are currently testing this add-on and are ingesting data successfully, but we are seeing the following errors. Has anyone seen this type of error with this add-on before?

Python script errors(code 255): index=_internal "code 255"
ERROR pid=29779 tid=MainThread file=configuration_check.py:run:164 | status="completed" task="confcheck_script_errors" message="msg="A script exited abnormally" input="/opt/splunk/etc/apps/splunk_ta_o365/bin/splunk_ta_o365_management_activity.py" stanza="splunk_ta_o365_management_activity://Audit_AD" status="exited with code 255""

From one of the logs, /opt/splunk/var/log/splunk/splunk_ta_o365_management_activity_Audit_AD.log:
O365PortalError: 400:{"error":{"code":"AF20055","message":"Date range for requested content is invalid startTime:2019-01-15T12:58:57 endTime:2019-01-15T13:58:57."}}
2019-01-22 12:59:13,409 level=INFO pid=7100 tid=MainThread logger=splunksdc.collector pos=collector.py:run:248 | | message="Modular input exited."

thanks

jeanyvesnolen
Path Finder

I found the issue.

In management_activity.py, at line 119:

        now = self._now()
        end_time = datetime.utcfromtimestamp(now)
        start_time = end_time - timedelta(days=7)

The problem is that, by the time the script has worked through all the ranges returned by

self._normalize_time_range(start_time, end_time)

the startTime of the last range it requests is more than 7 days old.

The workaround is to add a check in portal.py at line 126:

--- a/splunk_ta_o365/bin/splunk_ta_o365/common/portal.py
+++ b/splunk_ta_o365/bin/splunk_ta_o365/common/portal.py
@@ -126,6 +126,11 @@ class O365Subscription(O365Portal):
     def list_available_content(self, session, start_time, end_time):
         for _start_time, _end_time in self._normalize_time_range(start_time, end_time):
             items = list()
+            min_start_date = datetime.utcnow() - timedelta(days=7) + timedelta(minutes=1)
+            _start_time = max(min_start_date, _start_time)
+            if _end_time < min_start_date:
+                yield []
+                continue
             response = self._list_available_content(session, _start_time, _end_time)
             while True:
                 array = response.json()

I used

    datetime.utcnow() - timedelta(days=7) + timedelta(minutes=1)

so that the API never receives an expired startTime, with one minute of margin to absorb any traffic delay.

if _end_time < min_start_date:
    yield []

This is there to make sure the script does not query an invalid range (if endTime is before the 7-day limit, there is no need to query that range).

Hope this helps

_time                   ContentType     endTime             startTime
2019-03-07 11:15:00.105 Audit.Exchange  2019-02-28T12:06:18 2019-02-28T11:16:00
2019-03-07 11:14:52.109 Audit.Exchange  2019-02-28T13:06:18 2019-02-28T12:06:18
2019-03-07 11:14:44.943 Audit.Exchange  2019-02-28T14:06:18 2019-02-28T13:06:18
2019-03-07 11:14:39.220 Audit.Exchange  2019-02-28T15:06:18 2019-02-28T14:06:18
2019-03-07 11:14:31.087 Audit.Exchange  2019-02-28T16:06:18 2019-02-28T15:06:18
2019-03-07 11:14:24.550 Audit.Exchange  2019-02-28T17:06:18 2019-02-28T16:06:18
2019-03-07 11:14:23.513 Audit.Exchange  2019-02-28T18:06:18 2019-02-28T17:06:18
2019-03-07 11:14:23.043 Audit.Exchange  2019-02-28T19:06:18 2019-02-28T18:06:18
2019-03-07 11:14:22.062 Audit.Exchange  2019-02-28T20:06:18 2019-02-28T19:06:18
2019-03-07 11:14:20.283 Audit.Exchange  2019-02-28T21:06:18 2019-02-28T20:06:18
2019-03-07 11:14:17.829 Audit.Exchange  2019-02-28T22:06:18 2019-02-28T21:06:18
2019-03-07 11:14:15.291 Audit.Exchange  2019-02-28T23:06:18 2019-02-28T22:06:18
2019-03-07 11:14:13.837 Audit.Exchange  2019-03-01T00:06:18 2019-02-28T23:06:18
2019-03-07 11:14:11.430 Audit.Exchange  2019-03-01T01:06:18 2019-03-01T00:06:18
2019-03-07 11:14:10.963 Audit.Exchange  2019-03-01T02:06:18 2019-03-01T01:06:18
2019-03-07 11:14:09.427 Audit.Exchange  2019-03-01T03:06:18 2019-03-01T02:06:18
2019-03-07 11:14:09.080 Audit.Exchange  2019-03-01T04:06:18 2019-03-01T03:06:18
2019-03-07 11:14:08.340 Audit.Exchange  2019-03-01T05:06:18 2019-03-01T04:06:18
2019-03-07 11:14:06.699 Audit.Exchange  2019-03-01T06:06:18 2019-03-01T05:06:18
2019-03-07 11:14:04.601 Audit.Exchange  2019-03-01T07:06:18 2019-03-01T06:06:18
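
The clamping idea can also be tried outside the add-on. The following is a minimal sketch, not the add-on's code: `clamp_windows`, `RETENTION`, and the sample windows are hypothetical names and values chosen for illustration, and the 7-day limit with a one-minute margin is taken from the patch above. With `now` set to the time of the first row in the table, the partly expired window is trimmed to the same startTime that row shows.

```python
from datetime import datetime, timedelta

# Assumption: 7-day limit minus the one-minute margin used in the patch.
RETENTION = timedelta(days=7) - timedelta(minutes=1)


def clamp_windows(windows, now):
    """Yield (start, end) pairs restricted to the retrievable period.

    `windows` is any iterable of (start, end) datetime pairs and `now` is
    the current UTC time. Both are hypothetical, not the add-on's API.
    """
    oldest_allowed = now - RETENTION
    for start, end in windows:
        if end < oldest_allowed:
            # The whole window has expired, so skip it entirely.
            continue
        # Trim the start of a partly expired window.
        yield max(start, oldest_allowed), end


now = datetime(2019, 3, 7, 11, 15, 0)
windows = [
    (datetime(2019, 2, 28, 9, 6, 18), datetime(2019, 2, 28, 10, 6, 18)),   # fully expired
    (datetime(2019, 2, 28, 11, 6, 18), datetime(2019, 2, 28, 12, 6, 18)),  # partly expired
    (datetime(2019, 2, 28, 12, 6, 18), datetime(2019, 2, 28, 13, 6, 18)),  # still valid
]
clamped = list(clamp_windows(windows, now))
for start, end in clamped:
    print(start.isoformat(), end.isoformat())
```

Expired windows are skipped with `continue`, so no request would be made for them.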

MuS
SplunkTrust

Hi becksyboy, @tommoore

I had a similar issue where the input stopped unnoticed for more than 2 weeks, and once it was restarted the events were no longer available from the MS API.

It took me some time to troubleshoot the script and the issue, but once I found how and where the checkpoint is accessed, it was easy to manually check and update the checkpoint hidden deep inside this weird REST API / KV store construct.

You can use this command to see the checkpoint:

curl -k https://127.0.0.1:8089/servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Re... -u <username>

And you can use this command to modify the checkpoint:

curl -k --header "Content-Type: application/json" --request POST --data '[ { "state" : "{\"max_date\": \"2018-11-20 18:56:17.772814\"}", "_user" : "nobody", "_key" : "O365_<input name here>_checkpoint"}] ' https://127.0.0.1:8089/servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Re... -u <username>
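
The escaped quotes in that request are easy to get wrong by hand, because the "state" field is a JSON document stored as a string inside another JSON document. Here is a small sketch of building the same body in Python; `build_checkpoint_payload` and the input name `my_input` are hypothetical, and only the field names come from the curl command above.

```python
import json


def build_checkpoint_payload(input_name, max_date):
    """Return the JSON body for the checkpoint POST shown above.

    `input_name` and `max_date` are placeholders supplied by the caller.
    """
    # "state" is itself JSON, serialized to a string before it is embedded,
    # which is what produces the escaped quotes in the curl example.
    state = json.dumps({"max_date": max_date})
    record = {
        "state": state,
        "_user": "nobody",
        "_key": "O365_{}_checkpoint".format(input_name),
    }
    return json.dumps([record])


payload = build_checkpoint_payload("my_input", "2018-11-20 18:56:17.772814")
print(payload)
```

The printed string can then be passed to curl's --data option in place of the hand-written body.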

Hope this helps should you have further issues ...

cheers, MuS

tommoore
Path Finder

@MuS Thanks for the info! Funny thing: I actually did it a little simpler, albeit with a sledgehammer. (PS: I'm using the Splunk Add-on for Microsoft Office 365.)

I edited this file "splunk_ta_o365/bin/splunk_ta_o365/modinputs/management_activity.py"

Changed the timedelta to 1 day 🙂

119c119
<         start_time = end_time - timedelta(days=7)
---
>         start_time = end_time - timedelta(days=1)

Now at least it only asks Azure for 24 hrs worth of data at a time 🙂 You could probably get away with 6 days if you wanted to check back that far.

mannioke
Engager

@tommoore - this is exactly what fixed my issue - thanks

becksyboy
Communicator

Thanks @MuS and @tommoore.

Doesn't this mean for "start_time = end_time - timedelta(days=7)" the start-time will increment forward over time, so you will still pick up new events?

tommoore
Path Finder

Yes.. the end-time is always moving so when it runs it uses the "now" as the end-time, and subtracts the days. As long as you keep it lower than 7 it should complete successfully.
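
To make the sliding window concrete, here is a rough sketch of the calculation, assuming the logic is as simple as the line 119 snippet suggests. `query_window`, `lookback_days`, and `API_LIMIT` are illustrative names, not part of the add-on.

```python
from datetime import datetime, timedelta

API_LIMIT = timedelta(days=7)  # the limit discussed in this thread


def query_window(now, lookback_days):
    """Return (start_time, end_time) for one run of the input."""
    lookback = timedelta(days=lookback_days)
    if lookback >= API_LIMIT:
        raise ValueError("lookback must stay below 7 days")
    end_time = now  # the end of the window is always the current time
    return end_time - lookback, end_time


# Two runs one hour apart: the whole window moves forward by one hour.
first_run = query_window(datetime(2019, 3, 7, 11, 0, 0), lookback_days=1)
next_run = query_window(datetime(2019, 3, 7, 12, 0, 0), lookback_days=1)
print(first_run)
print(next_run)
```

Because both ends move with the clock, new events are still picked up on every run. The lookback only controls how far back each run reaches.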

tommoore
Path Finder

It looks like we now need a way to tell the add-on to only go back X days; otherwise it never completes and just keeps pulling in the same week's worth of data. I've turned off my inputs until this is figured out 😞

tommoore
Path Finder

I just found this post on another thread. It makes sense, because my pulls bomb out exactly 1 week back.

Looks like the limitation is in the O365 Management API that the Splunk app relies on:

https://msdn.microsoft.com/office-365/office-365-management-activity-api-reference

"Content older than 7 days cannot be retrieved."
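
The timestamps in the original post are consistent with that limit. As a quick check, the snippet below compares the rejected startTime with the timestamp of the log line that follows the error. It assumes both timestamps are in the same time zone, which the logs do not confirm.

```python
from datetime import datetime, timedelta

# Timestamps copied from the original post.
requested_start = datetime.strptime("2019-01-15T12:58:57", "%Y-%m-%dT%H:%M:%S")
# Assumption: the log timestamp uses the same time zone as the API's startTime.
logged_at = datetime.strptime("2019-01-22 12:59:13", "%Y-%m-%d %H:%M:%S")

age = logged_at - requested_start
overshoot = age - timedelta(days=7)
print(age)        # 7 days, 0:00:16
print(overshoot)  # 0:00:16
```

Under that assumption, the rejected request was only 16 seconds past the 7-day boundary.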

Esky73
Builder

We are also seeing this issue. I think I'll raise a case with Splunk.

layamba
Explorer

Hi there, were you able to get an answer from Splunk about this?
