
Data stopped coming into Splunk from the Splunk Add-on for Microsoft Cloud Services

mlevsh
Builder

We are running Splunk Enterprise 6.5.3.
On a heavy forwarder, we installed and configured the Splunk Add-on for Microsoft Cloud Services (current version 2.0.3) in the summer of last year and have been using it since.
Yesterday we stopped receiving any data in Splunk from that add-on.
There are a few errors and warnings in the Splunk internal index (samples follow).
We also ran Linux patching on that heavy forwarder server, which required a reboot, right before the data stopped coming.

Any advice on how to approach this issue and possibly fix it would be appreciated.

Here are the patterns of errors and warnings:

Log_level=ERROR, pid=13456, tid=MainThread, file=config.py, func_name=log, code_line_no=51 | UCC Config Module: Fail to load value of "json" - endpoint=account_list, item=O365prod, field=refresh_token File "/export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/bin/ms_o365_account_monitoring.py", line 286,.....

Log_level=WARNING, pid=13456, tid=MainThread, file=config.py, func_name=log, code_line_no=51 | UCC Config Module: Fail to load value of "json" - endpoint=account_list, item=O365prod, field=refresh_token - No JSON object could be decoded File "/export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/bin/ms_o365_account_monitoring.py", line 286, in  main() File "/export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/bin/ms_o365_account_monitoring.py", line 278..

Log_level=WARNING, pid=13456, tid=MainThread, file=config.py, func_name=log, code_line_no=51 | UCC Config Module: Fail to load value of "json" - endpoint=account_list, item=O365prod, field=refresh_token - No JSON object could be decoded File "/export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/bin/ms_o365_account_monitoring.py", line 286, in  main() File "/export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/bin/ms_o365_account_monitoring.py", line 278.....MalformedHeader: WWW-Authenticate

Log_level=ERROR, pid=13456, tid=MainThread, file=ms_o365_account_monitoring.py, func_name=run, code_line_no=146 | Failed to load conf files, reason: .......ConfigException: Fail to load value of "json" - endpoint=account_list, item=O365prod, field=refresh_token....

marycordova
SplunkTrust

Personally, I don't favor any of the apps or TAs for Microsoft Azure or O365 written by either Splunk or Microsoft.

Here is a mechanism I created that has not broken since I implemented it: https://answers.splunk.com/answers/678660/how-to-get-logs-from-azure-and-o365-into-splunk.html

@marycordova

raugugliaro
New Member

I'm having the same issue and I think it has something to do with settings not getting copied properly.

Take a look at this set of messages that keeps occurring over and over again:

2018-01-11 16:42:21,667 +0000 log_level=INFO, pid=15095, tid=MainThread, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed
2018-01-11 16:42:31,345 +0000 log_level=INFO, pid=15095, tid=Thread-4, file=dispatch_engine.py, func_name=_deploy_global_setting, code_line_no=612 | message="Deploy global setting:account_list$$f69356c0-5d8c-41bd-ab4e-2e575c1baff0_SplunkInt to forwarder:localhost success"
2018-01-11 16:42:40,356 +0000 log_level=INFO, pid=15095, tid=MainThread, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed
2018-01-11 16:42:49,987 +0000 log_level=INFO, pid=15095, tid=Thread-5, file=dispatch_engine.py, func_name=_deploy_global_setting, code_line_no=612 | message="Deploy global setting:account_list$$f69356c0-5d8c-41bd-ab4e-2e575c1baff0_SplunkInt to forwarder:localhost success"
2018-01-11 16:42:53,992 +0000 log_level=INFO, pid=15095, tid=MainThread, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed
2018-01-11 16:43:03,566 +0000 log_level=INFO, pid=15095, tid=Thread-2, file=dispatch_engine.py, func_name=_deploy_global_setting, code_line_no=612 | message="Deploy global setting:account_list$$f69356c0-5d8c-41bd-ab4e-2e575c1baff0_SplunkInt to forwarder:localhost success"
2018-01-11 16:43:12,578 +0000 log_level=INFO, pid=15095, tid=MainThread, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed
2018-01-11 16:43:22,170 +0000 log_level=INFO, pid=15095, tid=Thread-3, file=dispatch_engine.py, func_name=_deploy_global_setting, code_line_no=612 | message="Deploy global setting:account_list$$f69356c0-5d8c-41bd-ab4e-2e575c1baff0_SplunkInt to forwarder:localhost success"
2018-01-11 16:43:31,180 +0000 log_level=INFO, pid=15095, tid=MainThread, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed


mlevsh
Builder

@raugugliaro
1) So you also stopped receiving data from the Splunk Add-on for MS CS? When did it start? For us it was 1/7/2018, sometime between 4 and 7 pm Eastern time. I'm trying to see whether it might have been caused by changes on the Microsoft side, in case we both started seeing the issue at the same time.

2) Regarding the message:

2018-01-02 04:15:08,577 +0000 log_level=INFO, pid=5217, tid=Thread-1, file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect /export/opt/splunk/etc/apps/Splunk_TA_microsoft-cloudservices/local/splunk_ta_ms_o365_server_accounts.conf has changed

I checked whether we had similar messages before the data stopped coming, and we did. For example, on Jan 1st, 2018 we had data coming in and the message was already there. So I assume it's a standard message.
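To make that before/after comparison less manual, one option is to count how often a given message appears per day in the add-on log. This is only a sketch that assumes the log line layout seen in the samples in this thread; `LINE_RE` and `count_by_day` are names made up for illustration and are not part of the add-on.

```python
import re
from collections import Counter

# Matches lines shaped like the add-on log samples in this thread, e.g.
# "2018-01-02 04:15:08,577 +0000 log_level=INFO, pid=5217, tid=Thread-1,
#  file=file_monitor.py, func_name=check_changes, code_line_no=48 | Detect ..."
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2},\d{3}) \S+ "
    r"log_level=(?P<level>\w+), pid=(?P<pid>\d+), tid=(?P<tid>[^,]+), "
    r"file=(?P<file>[^,]+), func_name=(?P<func>[^,]+), "
    r"code_line_no=(?P<line>\d+) \| (?P<message>.*)$"
)


def count_by_day(lines, needle):
    """Count log lines per calendar day whose message contains `needle`.

    Lines that do not match the expected layout are skipped silently.
    """
    counts = Counter()
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match and needle in match.group("message"):
            counts[match.group("date")] += 1
    return counts
```

You could feed it an open log file (any iterable of lines works), for example `count_by_day(open(path), "has changed")`, where `path` points at whichever add-on log you are inspecting. If the per-day counts look the same before and after the outage, the message is probably routine and not the cause.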


jaxjohnny2000
Builder

Ours stopped working around 10/30/2018.
We get these messages in splunk_ta_microsoft-cloudservices_account_monitoring.log:

2018-12-06 19:35:22,049 +0000 log_level=INFO, pid=123035, tid=MainThread, file=o365_refresh_token.py, func_name=get_updated_datas, code_line_no=557 | No account for account splunk_prod_o365 needs to be update by client_credentials
2018-12-06 19:35:22,049 +0000 log_level=INFO, pid=123035, tid=MainThread, file=o365_refresh_token.py, func_name=get_updated_datas, code_line_no=557 | No account for account splunk_prod_o365 needs to be update by refresh_token


damien_chillet
Builder

The error logs seem to indicate a problem parsing the UCC config JSON at endpoint "account_list".
Going through the add-on code, it seems to come from a problem parsing the "o365_schema.account_monitor_config.json" file under /bin/splunktamscs/o365_schema.account_monitor_config.json

More specifically, the problem is with the account_list section and the refresh_token value.

You can try looking for a missing comma or missing quotes around "json", for example.
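Instead of eyeballing the file for a missing comma or quote, you could let Python's `json` module point at the first syntax error. This is a rough sketch for Python 3.5 or later (it is not the add-on's own loader); `check_json_text` and `check_json_file` are helper names invented here.

```python
import json


def check_json_text(text):
    """Return (True, parsed_object) for valid JSON, else (False, message).

    The message includes the line and column where parsing failed.
    """
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as exc:
        return False, "line %d, column %d: %s" % (exc.lineno, exc.colno, exc.msg)


def check_json_file(path):
    """Read a file as UTF-8 and run the same check on its contents."""
    with open(path, encoding="utf-8") as handle:
        return check_json_text(handle.read())
```

Calling `check_json_file` on the schema file (on your heavy forwarder it should sit under the add-on's bin/splunktamscs directory) returns either the parsed content or the position of the first error. The same `check_json_text` helper can be applied to a single stored value, such as whatever is saved in the refresh_token field, since the schema declares that field as "json".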

The default content for that config file is (fresh download of version 2.0.3):

{
    "_product": "Splunk_TA_microsoft-office365",
    "_rest_namespace": "splunk_ta_ms_o365",
    "_rest_prefix": "ta_o365_server_",
    "_protocol_version": "1.0",
    "_version": "1.0.0.0",
    "cert_setting": {
        "endpoint": "certificate"
    },
    "api_setting": {
        "endpoint": "#configs/conf-splunk_ta_ms_o365_api_settings",
        "field_types": {
            "*": {
                "api_url": "json",
                "data": "json"
            }
        }
    },
    "ucc_system_setting": {
        "endpoint": "#configs/conf-splunk_ta_ms_o365_server_ucc_system_setting",
        "field_types": {
            "o365_refresh_token": {
                "apis": "json",
                "url": "json"
            }
        }
    },
    "global_setting": {
        "endpoint": "settings",
        "field_types": {
            "proxy": {
                "enable": "bool",
                "dns_passthrough": "bool"
            }
        }
    },
    "account_list": {
        "endpoint": "accounts",
        "field_types": {
            "*": {
                "access_tokens": "json",
                "access_tokens_encrypted": "json",
                "refresh_token": "json"
            }
        }
    },
    "management_api_input_list": {
        "endpoint": "management_api_inputs"
    }
}

Hopefully that helps. If not, can you provide more errors/warnings, if there are any?


mlevsh
Builder

@damien_chillet Thank you for your reply!
The add-on was created by Splunk and it worked until Sunday, 1/7/2018, so I don't think it is a parsing problem.
Unless Microsoft changed something on their side?
We found the following error, starting around the time we stopped receiving data:
SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:676)


raugugliaro
New Member

The error "SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed" might be a problem with how Python is doing SSL verification on your machine. Have you recently updated your Python installation or changed your SSL certificate store?
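A quick way to see whether certificate verification fails from the forwarder itself is to attempt a TLS handshake from a standalone script. The sketch below needs Python 3.7 or later and uses the operating system's trust store, so it exercises the OS and the network path, not the Python bundled with Splunk or the add-on's own certificate settings. The function names are made up for illustration, and which Microsoft host the add-on actually contacts is an assumption you would need to confirm.

```python
import socket
import ssl


def make_context():
    """Build a client context that verifies certificates and host names."""
    return ssl.create_default_context()


def check_tls(host, port=443, timeout=10.0):
    """Try a verified TLS handshake; return (ok, detail).

    On success, detail is the negotiated protocol version.
    On failure, detail describes what went wrong.
    """
    context = make_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Same class of failure as CERTIFICATE_VERIFY_FAILED in the add-on log.
        return False, "certificate verify failed: %s" % exc.verify_message
    except ssl.SSLError as exc:
        return False, "TLS error: %s" % exc
    except OSError as exc:
        # DNS failure, refused connection, timeout, and similar.
        return False, "connection error: %s" % exc
```

For example, `check_tls("login.microsoftonline.com")` (host name assumed, not taken from the add-on) would show whether the handshake verifies from that server. If it fails only after OS patching, the system CA bundle or an intercepting proxy is a likely suspect; if it succeeds while the add-on still fails, the problem is more likely in the certificate store the add-on uses.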


mlevsh
Builder

@raugugliaro, you know, the problem corrected itself; we haven't made any changes. It makes me think that it was something on the Microsoft side.
