
Microsoft Office 365 Reporting Add-on for Splunk is Unstable - Anyone else experiencing this issue?

lucas4394
Path Finder

We are using version 1.1 of the Microsoft Office 365 Reporting Add-on for Splunk, but the app frequently stops pulling data, and the internal logs show no error messages at all. We have to manually disable and re-enable the account from the inputs page before the app starts pulling data again. Has anyone else encountered this issue, and is there a better way to fix it? Thanks.

becksyboy
Communicator
We are on version 1.2.4 of the add-on with Splunk 7.3.7. We applied the suggested fix of adding a \ before the $filter, but we still face two issues.

[1] The connection interval is not being obeyed. We have it set to 10 minutes, but our logs seem to be updated continuously, as if the connection never completes.

2020-12-01 10:21:10,854 DEBUG pid=28567 tid=MainThread file=connectionpool.py:_new_conn:959 | Starting new HTTPS connection (1): reports.office365.com:443
2020-12-01 10:21:10,850 DEBUG pid=28567 tid=MainThread file=base_modinput.py:log_debug:288 | Endpoint URL: https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=413999
2020-12-01 10:21:10,850 DEBUG pid=28567 tid=MainThread file=base_modinput.py:log_debug:288 | _Splunk_ nextLink URL (@odata.nextLink): https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=413999
2020-12-01 10:21:10,850 DEBUG pid=28567 tid=MainThread file=binding.py:new_f:73 | Operation took 0:00:00.022766
2020-12-01 10:21:10,848 DEBUG pid=28567 tid=MainThread file=connectionpool.py:_make_request:437 | https://127.0.0.1:9001 "POST /servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Reporting_checkpointer/batch_save HTTP/1.1" 200 39
2020-12-01 10:21:10,827 DEBUG pid=28567 tid=MainThread file=binding.py:post:750 | POST request to https://127.0.0.1:9001/servicesNS/nobody/TA-MS_O365_Reporting/storage/collections/data/TA_MS_O365_Re... (body: {'body': '[{"_key": "index_continuously_obj_checkpoint", "state": "{\\"max_date\\": \\"2020-12-01 09:54:07.121067\\"}"}]'})
2020-12-01 10:21:10,827 DEBUG pid=28567 tid=MainThread file=base_modinput.py:log_debug:288 | _Splunk_ max date after getting messages: 2020-12-01 09:54:07.121067
2020-12-01 10:21:09,913 DEBUG pid=28567 tid=MainThread file=connectionpool.py:_make_request:437 | https://reports.office365.com:443 "GET /ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=411999 HTTP/1.1" 200 None
2020-12-01 10:21:21,866 DEBUG pid=28567 tid=MainThread file=connectionpool.py:_new_conn:959 | Starting new HTTPS connection (1): reports.office365.com:443

[2] When we do have a successful connection that pulls logs (this happens every 1-2 hours), we get these errors:

HTTPError: 500 Server Error: Internal Server Error for url: https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=999999
2020-12-01 09:53:50,446 ERROR pid=13324 tid=MainThread file=base_modinput.py:log_error:309 | Get error when collecting events.
2020-12-01 09:53:50,440 ERROR pid=13324 tid=MainThread file=base_modinput.py:log_error:309 | HTTP Request error: 500 Server Error: Internal Server Error for url: https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=999999
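
On issue [1], one way to check whether the input is still paging through a single long result set (rather than ignoring the interval) is to track the $skiptoken values in the DEBUG log. The sketch below is only an illustration: `skiptoken_progress` and `page_steps` are hypothetical helper names, and the regex assumes the log line format quoted above.

```python
import re
from datetime import datetime

# Assumes the add-on's DEBUG format: "YYYY-MM-DD HH:MM:SS,mmm ... $skiptoken=N"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) .*?\$skiptoken=(\d+)"
)

def skiptoken_progress(lines):
    """Return (timestamp, skiptoken) pairs, one per distinct token, sorted by time."""
    first_seen = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # line has no timestamp or no $skiptoken
        ts = datetime.strptime(f"{m.group(1)}.{m.group(2)}", "%Y-%m-%d %H:%M:%S.%f")
        token = int(m.group(3))
        # Keep the earliest time each token was logged.
        if token not in first_seen or ts < first_seen[token]:
            first_seen[token] = ts
    return sorted((ts, token) for token, ts in first_seen.items())

def page_steps(progress):
    """Differences between consecutive skiptokens (the apparent page size)."""
    tokens = [token for _, token in progress]
    return [b - a for a, b in zip(tokens, tokens[1:])]

# Three of the lines quoted in this post, copied verbatim.
SAMPLE = [
    '2020-12-01 10:21:10,850 DEBUG pid=28567 tid=MainThread file=base_modinput.py:log_debug:288 | Endpoint URL: https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=413999',
    '2020-12-01 10:21:10,850 DEBUG pid=28567 tid=MainThread file=base_modinput.py:log_debug:288 | _Splunk_ nextLink URL (@odata.nextLink): https://reports.office365.com/ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=413999',
    '2020-12-01 10:21:09,913 DEBUG pid=28567 tid=MainThread file=connectionpool.py:_make_request:437 | https://reports.office365.com:443 "GET /ecp/reportingwebservice/reporting.svc/MessageTrace?$skiptoken=411999 HTTP/1.1" 200 None',
]

if __name__ == "__main__":
    progress = skiptoken_progress(SAMPLE)
    for ts, token in progress:
        print(ts.isoformat(), token)
    print("steps:", page_steps(progress))
```

On the lines quoted above this gives a single step of 2000 in about one second, which would be consistent with the input walking through pages of one large query. That is a reading of the log, not a confirmed diagnosis.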

gordo32
Communicator

I haven't seen this behaviour. Did you change *both* lines that contain $filter?


gordo32
Communicator

This thread has a working solution:

https://community.splunk.com/t5/All-Apps-and-Add-ons/Microsoft-Office-365-Reporting-Add-on-for-Splun...

@poisar opened a case with MS, and adding a \ before the $filter in the script solved the problem for me.


_joe
Communicator

I just wanted to add that we are up to version 1.2.1 now and this TA still appears to be unstable.

For me, most of the problems stem from network and connection issues. My location does have intermittent WAN issues, no denying that. The problem is that, 24 hours later, the TA still has not recovered from them.

I find that disabling and re-enabling the input from the web UI is the smallest action that corrects the problem.

I am considering scripting a restart of Splunk, but that just seems silly. Are other people still having issues with this TA, and what have you found to be the most elegant way to deal with them?
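
A lighter option than restarting Splunk might be to script the same disable/enable bounce through the management REST API. The following is a minimal sketch, not something tested against this TA: `bounce_requests` is a hypothetical helper, the input kind `ms_o365_message_trace` and the input name are assumptions you would need to check in your own inputs.conf, and it assumes token (Bearer) authentication is enabled. Only the app name `TA-MS_O365_Reporting` comes from the logs in this thread.

```python
import urllib.parse
import urllib.request

def bounce_requests(base_url, app, input_kind, input_name, token):
    """Build the two POST requests (disable, then enable) for one data input.

    base_url   -- management URL, e.g. "https://localhost:8089"
    input_kind -- modular input scheme name (assumed, check your inputs.conf)
    input_name -- stanza name of the input (assumed)
    """
    name = urllib.parse.quote(input_name, safe="")
    prefix = f"{base_url}/servicesNS/nobody/{app}/data/inputs/{input_kind}/{name}"
    headers = {"Authorization": f"Bearer {token}"}
    return [
        urllib.request.Request(f"{prefix}/{action}", data=b"", headers=headers, method="POST")
        for action in ("disable", "enable")
    ]

if __name__ == "__main__":
    # Print what would be sent; nothing is sent to Splunk here.
    for req in bounce_requests(
        "https://localhost:8089",      # assumed management port
        "TA-MS_O365_Reporting",
        "ms_o365_message_trace",       # assumed input kind
        "my trace input",              # hypothetical input name
        "REPLACE_WITH_TOKEN",
    ):
        print(req.get_method(), req.full_url)
```

To actually send the requests you would pass each one to `urllib.request.urlopen` (with an SSL context that matches your certificate setup) and check the response before sending the next one.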

 


dharma891
New Member

Same error here as well: HTTP Request error: 500 Server Error: Internal Server Error for url: https://reports.office365.com/ecp/reportingwebservice/reporting.svc


psalm_splunk
Splunk Employee

For the 500 error, you might want to go to this post and up-vote.

https://answers.splunk.com/answers/780097/microsoft-office-365-reporting-add-on-for-splunk-n.html


scannon4
SplunkTrust

So here is an interesting twist with this app (Microsoft Office 365 Reporting Add-on for Splunk). We have been using it for over a year with no issues. All of a sudden, we started getting a 500 Internal Server Error when we pull the data. We contacted Microsoft and were told that this method of pulling down message trace data is not supported by Microsoft and is not guaranteed to work in the future.

So, does anyone else have a reliable method for pulling message trace data into Splunk?


psalm_splunk
Splunk Employee

For the 500 error, you might want to go to this post and up-vote.

https://answers.splunk.com/answers/780097/microsoft-office-365-reporting-add-on-for-splunk-n.html


joshschwarz
Engager

Same issue here.
It seems to correlate with network issues/server reboots.
Haven't been able to pin it down yet. My workaround is to add a new input each time and kill the old one.


madcitygeek
Explorer

I have experienced this as well. We've only been using it for a short time, so I can't speak to frequency yet. I found the python process responsible for the input was stuck on the heavy forwarder. I killed it with a HUP and it restarted properly and back-filled the data.

Next time it happens I'll try to get deeper into where it is stuck. If I can't figure it out, I'll look to build a watchdog of sorts to kill it when stale.
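
A watchdog along those lines could be sketched roughly as follows. This is only an illustration under stated assumptions: `is_stale` and `hup_if_stale` are hypothetical names, the last-activity time is assumed to come from something like the modification time of the add-on's log file, and finding the right PID is left to the caller. SIGHUP is used because that is what worked here; it is only available on Unix-like hosts.

```python
import os
import signal
import time

def is_stale(last_activity, now, max_idle_seconds):
    """True when the input has shown no activity for longer than the threshold."""
    return (now - last_activity) > max_idle_seconds

def hup_if_stale(pid, last_activity, max_idle_seconds, now=None, kill=os.kill):
    """Send SIGHUP to pid if the input looks stuck; return True if a signal was sent.

    last_activity -- epoch seconds of the last sign of life (assumption: e.g.
                     os.path.getmtime() of the add-on's log file)
    kill          -- injectable for testing; defaults to os.kill
    """
    now = time.time() if now is None else now
    if is_stale(last_activity, now, max_idle_seconds):
        kill(pid, signal.SIGHUP)
        return True
    return False

if __name__ == "__main__":
    # Dry run with a fake kill so nothing is actually signalled.
    sent = hup_if_stale(
        pid=12345,                       # hypothetical PID of the stuck python process
        last_activity=time.time() - 3600,
        max_idle_seconds=1800,
        kill=lambda pid, sig: print(f"would send {sig!r} to {pid}"),
    )
    print("signal sent:", sent)
```

Run from cron or a scripted input, the threshold should be comfortably longer than the input's collection interval so that a healthy but idle input is not killed.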

_joe
Communicator

Have you by chance found a solution for this? I've run into this a couple of times with version 1.2.0 on Splunk v8. I bounce the input as described in the original post. The only error I see is:

ERROR pid=21847 tid=MainThread file=base_modinput.py:log_error:309 | HTTP Request error: ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
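
For transient resets like this one, the usual pattern is to retry the request with an increasing delay instead of giving up on the first failure. The sketch below shows that pattern in general terms; it is not how the add-on is written. `call_with_retries` is a hypothetical helper, and the default exception tuple uses Python's built-in `ConnectionError` family. If the request goes through an HTTP library that wraps the reset in its own exception type, that type would have to be passed in `retry_on` instead.

```python
import time

def call_with_retries(fetch, attempts=5, base_delay=2.0, sleep=time.sleep,
                      retry_on=(ConnectionError, TimeoutError)):
    """Call fetch() and retry on connection errors with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    The last failure is re-raised so the caller still sees the error.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))

if __name__ == "__main__":
    state = {"calls": 0}

    def flaky():
        # Simulates two resets followed by a successful response.
        state["calls"] += 1
        if state["calls"] < 3:
            raise ConnectionResetError(104, "Connection reset by peer")
        return "ok"

    # sleep is replaced with print so the demo does not actually wait.
    print(call_with_retries(flaky, sleep=lambda s: print(f"waiting {s}s")))
```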


lucas4394
Path Finder

Yes, this was the same solution placed into my environment as well. Hopefully, Microsoft will fix this issue in the near future. Thanks.
