I am encountering an error where this add-on intermittently stops sending data. The issue was temporarily resolved by restarting Splunk on the Heavy Forwarder (where the app is installed).
Note that I have version 1.0.0 of the app installed.
On checking the internal logs, the last event seen before I lost data was: file=connectionpool.py:_new_conn:809 | Starting new HTTPS connection (1): reports.office365.com
When functioning normally, the next event I would expect to see is: file=connectionpool.py:_make_request:400 | https://reports.office365.com:443
This event was not observed again until after the HF was restarted.
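The log pattern described above (a "Starting new HTTPS connection" event with no following _make_request event) can be checked for mechanically. The following is a minimal sketch, assuming the log lines contain the substrings quoted above; the function name find_unanswered_connection is hypothetical and not part of the add-on.

```python
NEW_CONN = "Starting new HTTPS connection"
REQUEST_DONE = "_make_request"


def find_unanswered_connection(lines, host="reports.office365.com"):
    """Return the index of the last 'new connection' line for `host` that is
    not followed by a _make_request line for the same host, or None if every
    connection was followed by a request."""
    pending = None
    for i, line in enumerate(lines):
        if host not in line:
            continue  # ignore events for other hosts, e.g. 127.0.0.1
        if NEW_CONN in line:
            pending = i       # a connection was opened, request not yet seen
        elif REQUEST_DONE in line:
            pending = None    # the request went out, so nothing is pending
    return pending
```

A non-None result on the tail of the log would suggest the script is stuck at the connection step, which matches the symptom reported here.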
Under normal operation, /TA-MS_O365_Reporting/bin/ms_o365_message_trace.py is observed once every 5 minutes, as defined by interval=300. During the downtime this process appears to have hung until the HF was restarted, as shown in the graph below.
Has anyone else experienced this issue, or could anyone point me in the direction of a fix? Thanks.
In my case I had the same problem with some Azure tenants ("domains"). I fixed it by setting these values in every management_activity input.
[splunk_ta_o365_management_activity://AzureActiveDirectory_DOMAIN]
content_type = Audit.AzureActiveDirectory
index = mso365
interval = 300
tenant_name = DOMAIN
disabled = 0
number_of_threads = 8 --> I raised this from the default of 4 to 8; you have to use multiples of 4.
request_timeout = 600
token_refresh_window = 3600
Regards.
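To illustrate why a request timeout matters for this kind of hang, here is a minimal sketch using only the Python standard library. It is not the add-on's code, and whether request_timeout in the stanza above maps onto a socket timeout like this is an assumption. The function name fetch_status is hypothetical.

```python
import http.client
import socket


def fetch_status(host, port, timeout_seconds):
    """Issue a GET / and return the HTTP status code, or None if the server
    does not answer within timeout_seconds."""
    # The timeout applies to connecting and to each blocking socket read,
    # so an unresponsive peer cannot block the caller indefinitely.
    conn = http.client.HTTPConnection(host, port, timeout=timeout_seconds)
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    except socket.timeout:
        return None
    finally:
        conn.close()
```

Without a timeout, a call like this can wait indefinitely on a peer that accepts the connection but never replies, which would look like a process that hangs right after opening a connection.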
I have observed this kind of problem with many TAs in Splunk. The scheduled scripts stop suddenly, and disabling and re-enabling the input gets the logs flowing again.
I see this question is from last year. We are running into the exact same issue: the underlying Python script ms_o365_message_trace.py gets stuck and lingers, preventing the input from firing and collecting new logs (killing the PID of the Python script frees it up, and processing continues afterwards).
Is there a permanent solution to this issue in the Microsoft Office 365 Reporting Add-on itself? I see other folks have rolled their own custom solutions, but I want to know whether this has been addressed or fixed in the add-on.
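As a stopgap in the spirit of the manual kill described above, a script can be run under a hard deadline so that a stuck run is terminated automatically. This is a minimal sketch, not a fix in the add-on: the wrapper name run_with_deadline is hypothetical, and it assumes you are able to launch the script through a wrapper of your own, which Splunk's input scheduling may not allow directly.

```python
import subprocess
import sys


def run_with_deadline(cmd, deadline_seconds):
    """Run cmd and return its exit code, or None if it exceeded the deadline
    and was killed."""
    try:
        completed = subprocess.run(cmd, timeout=deadline_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising,
        # so no stuck process is left behind.
        return None
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command for illustration; a real wrapper would pass the
    # script's own command line and a deadline below the input interval.
    print(run_with_deadline([sys.executable, "-c", "print('ok')"], 10))
```

Choosing a deadline shorter than the input interval (300 seconds in this thread) would let the next scheduled run start cleanly.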
Are you using a proxy?
No. We're not using a proxy.
I decided to roll my own for O365. I feel like the data ingestion is more reliable than anything else out there: https://answers.splunk.com/answers/678660/how-to-get-logs-from-azure-and-o365-into-splunk.html
Are there any errors in the _internal log?
As Mike has stated below.
Hi p_gurav,
I have viewed the logs at the location below (Jake is my colleague). There are no error messages in the logs.
/opt/splunk/var/log/splunk/ta_ms_o365_reporting_ms_o365_message_trace.log
The last entry in that log is:
Starting new HTTPS connection (1): reports.office365.com
There are no further entries in the log until the HF was restarted, at which point the first new entries were:
Starting new HTTPS connection (1): 127.0.0.1
Starting new HTTPS connection (1): 127.0.0.1
Starting new HTTPS connection (1): 127.0.0.1
Use HTTP connection pooling
However, this information is expected "normal" activity.