MSO365 Reporting add on for Splunk intermittently stops sending data

jakewhittet · ‎04-19-2018

I am encountering an error where this add-on intermittently stops sending data. Restarting Splunk on the Heavy Forwarder (where the app is installed) resolves the issue temporarily.

Note that I have version 1.0.0 of the app installed.

On checking the internal logs, the last event seen before I lost data was file=connectionpool.py:_new_conn:809 | Starting new HTTPS connection (1): reports.office365.com

When functioning normally, the next event I would expect to see is file=connectionpool.py:_make_request:400 | https://reports.office365.com:443, but this event was not observed again until after the HF was restarted.
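One possible reading of the two log events above (an assumption, not something confirmed in this thread) is that the HTTPS call blocks forever because no timeout is set on it. The sketch below shows the general idea using only the Python standard library. It is not the add-on's actual code, and the function name fetch_or_give_up and the 60 second default are hypothetical.

```python
import socket
import urllib.error
import urllib.request


def fetch_or_give_up(url, timeout_seconds=60.0):
    """Return the response body, or None if the server does not answer in time.

    Hypothetical helper for illustration only; the add-on itself uses a
    different HTTP stack (the log lines come from connectionpool.py).
    """
    try:
        # The timeout applies to the connect and to each blocking read,
        # so a silent server cannot hold the script open indefinitely.
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except socket.timeout:
        # Read timed out while waiting for the response.
        return None
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return None
        raise  # Any other failure (DNS, HTTP status, TLS) is not hidden.
```

With a timeout in place, a run that cannot reach the endpoint ends and the next scheduled interval gets a chance to start, instead of one process hanging until a restart.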

Under normal operation, /TA-MS_O365_Reporting/bin/ms_o365_message_trace.py is observed once every 5 minutes, as defined by interval=300. During the downtime this process appears to have hung until the HF was restarted, as shown in the graph below.

Has anyone else experienced this issue or could perhaps point me in the direction of a fix? Thanks

macklaud · ‎01-28-2020

In my case I had the same problem with some Azure tenants ("domains"). I fixed it by setting these values in all management_activity inputs.

[splunk_ta_o365_management_activity://AzureActiveDirectory_DOMAIN]
content_type = Audit.AzureActiveDirectory
index = mso365
interval = 300
tenant_name = DOMAIN
disabled = 0
# Raised from the default of 4 to 8; use a multiple of 4.
number_of_threads = 8
request_timeout = 600
token_refresh_window = 3600

Regards.

thambisetty · ‎12-23-2018

I have observed this kind of problem with many TAs in Splunk. The scheduled scripts stop suddenly, and if we disable and re-enable the input it starts getting logs again.

————————————
If this helps, give a like below.

sykuang · ‎11-25-2019

I see this question was from last year. We are running into the exact same issue, where the underlying Python script ms_o365_message_trace.py gets stuck and sticks around, preventing the input from firing and getting new logs (killing the PID for the Python script frees it up again and processing continues afterwards).
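The manual workaround described above (find the stuck script and kill its PID) could be automated roughly as follows. This is a minimal sketch, not part of the add-on. It assumes a Linux host whose ps supports the etimes column (elapsed seconds), and the function names and the 900 second threshold are hypothetical.

```python
import os
import signal
import subprocess


def find_stuck_pids(ps_output, script_name, max_age_seconds):
    """Return PIDs whose command line contains script_name and whose
    elapsed run time exceeds max_age_seconds.

    ps_output is the text printed by: ps -eo pid,etimes,args
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, elapsed seconds, full command
        if len(parts) < 3:
            continue
        pid_text, elapsed_text, command = parts
        if not (pid_text.isdigit() and elapsed_text.isdigit()):
            continue
        if script_name in command and int(elapsed_text) > max_age_seconds:
            stuck.append(int(pid_text))
    return stuck


def terminate_stuck_runs(script_name="ms_o365_message_trace.py",
                         max_age_seconds=900):
    """Send SIGTERM to runs older than the threshold; return their PIDs."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = find_stuck_pids(ps_output, script_name, max_age_seconds)
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
    return pids
```

With interval=300, a threshold of three intervals (900 seconds) leaves room for a slow but healthy run. This only treats the symptom; it does not explain why the script hangs.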

Is there a permanent solution to this issue from the Microsoft Office 365 Reporting Add-on perspective? I see other folks have rolled their own custom solutions, but I want to know if this has been addressed/fixed in the add-on itself.

thambisetty · ‎11-25-2019

Are you using a proxy?

————————————
If this helps, give a like below.

sykuang · ‎11-25-2019

No. We're not using a proxy.

marycordova · ‎08-22-2018

I decided to roll my own O365 ingestion. I feel like the data ingestion is more reliable than anything else out there: https://answers.splunk.com/answers/678660/how-to-get-logs-from-azure-and-o365-into-splunk.html

@marycordova

p_gurav · ‎04-20-2018

Is there any error in _internal log?

jakewhittet · ‎04-22-2018

As Mike has stated below.

MikeElliott · ‎04-20-2018

Hi p_gurav,

I have viewed the logs contained at the below location (Jake is my colleague). There are no error messages contained within the logs.

/opt/splunk/var/log/splunk/ta_ms_o365_reporting_ms_o365_message_trace.log

The last log entry within the above-noted log is:

Starting new HTTPS connection (1): reports.office365.com

There are no further entries within the logs until the HF was restarted, at which point, the first logs of new data were:

 Starting new HTTPS connection (1): 127.0.0.1
 Starting new HTTPS connection (1): 127.0.0.1
 Starting new HTTPS connection (1): 127.0.0.1
 Use HTTP connection pooling

However, this information is expected "normal" activity.
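Since no error is logged, the stall described above only shows up as silence after the new-connection line. A rough way to detect that pattern from the log file is sketched below. The function names are hypothetical, and the rule "idle for more than two intervals" is an assumption chosen for illustration, not something the add-on provides.

```python
import os
import time

# Last entry reported in this thread before the input went quiet.
STALL_MARKER = "Starting new HTTPS connection (1): reports.office365.com"


def last_nonempty_line(path):
    """Return the last non-blank line of a text file, stripped."""
    last = ""
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip():
                last = line.strip()
    return last


def looks_stalled(log_path, interval_seconds=300, now=None):
    """True if the log has been idle for more than two intervals and its
    last entry is the new-connection line with nothing after it."""
    now = time.time() if now is None else now
    idle_seconds = now - os.path.getmtime(log_path)
    return (idle_seconds > 2 * interval_seconds
            and STALL_MARKER in last_nonempty_line(log_path))
```

For example, looks_stalled("/opt/splunk/var/log/splunk/ta_ms_o365_reporting_ms_o365_message_trace.log") could be run from a scheduled job to raise an alert before someone notices the missing data.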
