We encountered an issue with Splunk SAML authentication when using scripted inputs to enable Splunk Cloud Gateway for mobile.
We have configured SAML with Azure AD for SSO on our existing SHC. As part of the Splunk Cloud Gateway implementation, we performed a few additional steps from the documentation, which recommends adding scripted inputs:
https://docs.splunk.com/Documentation/Gateway/1.9.0/Installation/SAMLauth
After saving the above config, we started noticing issues with SSO auth. The very first SAML request works fine, but subsequent requests start failing with a 404 or 500 response error in the browser.
In the Splunk internal logs, we observed the errors below after a successful Splunk SAML response.
Splunk Query:
index=_internal sourcetype=splunkd OR sourcetype=splunkdconf SAML OR component=AuthenticationManager*
| dedup event_message
03-03-2020 13:45:04.195 -0500 ERROR AuthenticationManagerSAML - authentication extension getUserInfo() failed for user: XXXX
03-03-2020 13:45:03.864 -0500 INFO AuthenticationManagerSAML - Calling getUserInfo() authentication extension for user: XXXX
03-03-2020 13:35:45.919 -0500 WARN Saml - Original response xml =[https://sts.windows.net/xxxxxxxx/https://sts.windows.net/xxxxxxxxxx/cYUrN24ezTvOMwLoKLKVtSTfkWdqCm6JGn71+xli2pg=IR8f70MSdjWFDZW34iR4Zz5SBzZb4xNznxMOE6wZ8QAqbUAsIeit6lt4a4PhS1UMI+xHWAabDptkaLUDI4yuPiiGtmQSMRbA2hb7GshE9JgXCnjxDRVeb4F/TX56PWf6klgp43Jzo1hSdNdsfnA1mYPkIEBZFeTMNLa28na7HBRStdA3SKXjqdcHfqJj9xrEleTgmY1Q71BF3PBFLsNpuMFlCx9eN44/ucSeq+KMDP+yd7HnL+R3eq57qhDB9W8c2vvf16iIblo4V72u5LNHH3Z0GcOtw1PGi25bhAPkRpKlakH60yiWyo+EG3PZyPNsthXAy8GdQPEsK+M7g06p9A==MIIC8DCCAdigAwIBAgIQWyiZvwFOJadJjIQF9L7k4TANBgkqhkiG9w0BAQsFADA0MTIwMAYDVQQDEylNaWNyb3NvZnQgQXp1cmUgRmVkZXJhdGVkIFNTTyBDZXJ0aWZpY2F0ZTAeFw0xOTExMjAxNjM4NTFaFw0yMjExMjAxNjM4NTFaMDQxMjAwBgNVBAMTKU1pY3Jvc29mdCBBenVyZSBGZWRlcmF0ZWQgU1NPIENlcnRpZmljYXRlMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAz9GMbCtuig3ai00TrgVNifrkQp/M4MJwyI8i457idgWEoiQ+8/sKpgBSIZgyO/3+IzPl1MwDe8gbaS1S0i/+FmpoJwBoBL60EuU39foh/KmvkyR7qAOSDK5o2Dme3RYmMAtE26cUY4sR7RmYGQkwsMMiv4iGr14Oyid4WAedOVa+ubGb4h7s6jMw/kdHjRHHPVNE/XNeteb5Aq62YSQqDL1xJMVFWA6jOsuvPAdVCyy072XC/9eCUqbBRQU5+6vuHKa27scoCjMBkxrTf9R/A9Kg7TOn+Nqp22CcYwTmkJ90N8Lh1RLXKS2uLnCbQlfrdZnuIQyd3FrVACdIUW96dwIDAQABMA0GCSqGSIb3DQEBCwUAA4IBAQCCjPTCr+AZcVfAsI8CEzxjxu7Jj8A4t57JPWwX89IiMkt69tN9DebKayEERq8FGp/AdE4AW3jHAQp26cDunPFIaeQ3QeahFRMUZZZzJGj9MJGSih1XQpTuh4bH0rhkZthBIcSfO9nUoo0/ihv4Nnpk4VUNPRrAkuT5qpEhp789culbwujIWy3/OuG7Xjj4FQZ2smcaznRY9BmLICGycI6ZqdAmUf4xxMGbepkDlHQkbrnlgSBpreqmEJvLgNgPSpmvt++vGuoqcv1nlTyC31jo7ltjGCTgp/O/AxBfzN7F3TS6VOJnkOyyF/hD1l+LPXpfDeTgHSfRzHaz6WbvYAm/XXXXXhttps://xxxxxxxxxxxxxxxxxd3813d6f-bf44-41ed-8c50-04e05d31c5dfXXXXXX, XXXXXXXXX-AllAssociates-LDC-D1XXX-EnterWebTech-u1XXX-Splunk-dbconnect-admin-U0Landesk Week 3mule-platform-adminXXX-ENTAndFMIT-ExtLeadership-D1XXX-AllAssociates-ATL-IT-D1XXX-MuleSoftProject-U1XXX-esmssev1cb-u1XXX-StatusPage-Allowed-U1SN-SEC-ITIL-IT-DXXX-DesignatedInsiders-U1Landesk Week DMZLandesk Week 
2XXX-fmHubspanMonitor-u1XXX-ENTAndFMIT-ExtLeadership-U1FM-AllAssociates-IT-D1XXX-esmssev1all-u1XXX-Splunk-powerconnect-power-U0mule-qa-adminLandesk Week FM Prod 1100-IntuneUsers-U0XXX-AllMigratedUsers-D1FM-AllUsers-U1XXX-AnthemMembers-U1SN-AG-IT-FM-EBIZXXX-Splunk-tools-admin-U0XXX-ITAEO-Atlanta-U1XXX-msgfromjd-u1XXX-Splunk-tools-power-U0XXX-AllAssociates-LDC-IT-D1Landesk Week 4XXX-fmitmanagers-u1XXX-AllAssociates-U1XXX-Splunk-Readonly-U0XXX-fmwebsphere-u1mule-viewerXXX-Splunk-FM-ecom-SLT-U0XXX-esmssev2cti-u1mule-designerXXX-esmssev2all-u1XXX-ITArcEngOpr-U1XXX-fmwebsphereadmin-u1XXX-AllAssociates-US-exHI-D1XXX-ManagementTeam-IT-D1XXX-ManagementTeam-IT-U1XXX-ENTAndFMIT-All-D1Landesk Week 1XXX-allgscempatl-u1FM-AllManagersAndAbove-D1SN-AG-IT-FM-MANAGERSXXX-AnthemHSTMembers-U1XXX-ITAEO-Atlanta-D1XXX-ESMSSv1AllbutFM-u1XXX-ENTAndFMIT-LDC-D1XXX-AllMedicalEnrolled-U1XXX-ENTAndFMIT-AEO-D1-U1XXX-esmssev2gsc-u1XXX-VPNUsersGSC-ITCorporateApplications-U1XXX-esmssev1fm-u1XXX-All-IT-U1XXX-cmdbitcpusers-u1XXX-esmssev1can-u1FM-AllManagersofOthers-D1XXX-AllUsers-U1FM-All-IT-U1_HD Supply - AllXXX-Splunk-candelete-admin-U0XXX-ENTAndFMIT-LDC-U1100-Azure-App-XXX-Ava-U0XXX-esmssev2can-u1XXX-AllPeopleManagers-US-exHI-D1XXX-ENTAndFMIT-All-U1100-Azure-License-E3-D0XXX-AllPeopleManagers-D1FM-AllAssociates-D1XXX-Splunk-dbconnect-users-U0XXX-esmssev2fm-u1Landesk Week 5XXX-ENTAndFMIT-AEO-D1XXX-Splunk-Infosec-power-U0mule-dev-adminFM-AllAssociates-USA-D1XXX-AllAssociates-Salaried-D1XXX-gscoicdirects-u1XXX-esmssev1gsc-u1XXX-AllAssociates-USA-D1LANDesk - GSCXXX-esmssev2cb-u1FM-mule-operations-U0https://sts.windows.net/xxx-xxx-xx-xx/urn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransporthttp://schemas.microsoft.com/claims/multipleauthnXXXXXXXXXXXX.XXXX@XXXXXXX.XXX@XXXX.comurn:oasis:names:tc:SAML:2.0:ac:classes:PasswordProtectedTransport]
It seems that azureScripted.py is not able to obtain the relevant token from the API key and query the Azure Graph endpoint for user impersonation. I need help troubleshooting this issue, and would like to know whether any users have a successful implementation of Splunk Cloud Gateway with SAML authentication.
We recently updated to 8.0.2 and encountered the same issue yesterday when trying to implement Cloud Gateway. Oddly, at first we had a period of about 4 hours where it seemed to work and I was able to register and use devices, but after that we began to see the 404 errors.
Are you also using the scripted input options with your existing SAML config? We noticed that the provided Azure SAML script under the authScriptSamples directory doesn't use any OAuth implementation to obtain an access token for querying SAML endpoints. Are you also using Azure SSO for your SAML auth?
We are using azure sso and trying to use the scripted input options, though the only thing that seems to matter is the name of the script. I can literally put a hello world script in there and I can generate tokens until something crashes or the search head is restarted.
We ended up having to disable the scripted input options. Things worked normally for a bit, but then we started randomly getting the 404/500 pages and would have to re-authenticate to azure using the splunk login url. After re-authenticating it would work for a few minutes, then give us errors again.
I think we finally got it working after making a few changes to the script to use OAuth client_credentials to obtain a token for user validation. We also had to switch the name ID format in the SAML config to use the email address as the input for the username in the script. Since our Azure Graph endpoint doesn't return the group display name, we had to use the group ID in our SAML group mapping to map to the relevant role.
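For anyone trying the same approach, here is a rough sketch of what those changes could look like. This is not the actual modified script: the function names, the `GROUP_ID_TO_ROLE` table, and the placeholder group IDs are hypothetical. It assumes the Microsoft identity platform v2.0 token endpoint, the Graph `memberOf` call keyed on the user's email (which only works if the email matches the userPrincipalName), and an app registration that has been granted permission to read group membership. Paging of the Graph response is ignored, and how the result is handed back to Splunk is left out.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
MEMBER_OF_URL = "https://graph.microsoft.com/v1.0/users/{user}/memberOf"

# Hypothetical mapping from Azure AD group object IDs to Splunk roles.
# Replace the placeholder IDs with the object IDs of your own groups.
GROUP_ID_TO_ROLE = {
    "00000000-0000-0000-0000-000000000001": "admin",
    "00000000-0000-0000-0000-000000000002": "power",
}


def build_token_request(tenant_id, client_id, client_secret):
    """Build the OAuth client_credentials token request for Microsoft Graph."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
    }).encode("utf-8")
    return urllib.request.Request(
        TOKEN_URL.format(tenant=tenant_id), data=body, method="POST")


def build_member_of_request(email, access_token):
    """Build the Graph request listing the groups a user belongs to."""
    url = MEMBER_OF_URL.format(user=urllib.parse.quote(email))
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer " + access_token})


def roles_from_groups(graph_response, mapping=GROUP_ID_TO_ROLE):
    """Map group object IDs in a memberOf response to role names."""
    roles = []
    for entry in graph_response.get("value", []):
        role = mapping.get(entry.get("id"))
        if role and role not in roles:
            roles.append(role)
    return roles


def fetch_json(request, timeout=10):
    """Send a request and decode the JSON body (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def get_roles_for_user(email, tenant_id, client_id, client_secret):
    """Token request, then group lookup, then group-ID-to-role mapping."""
    token = fetch_json(
        build_token_request(tenant_id, client_id, client_secret))["access_token"]
    groups = fetch_json(build_member_of_request(email, token))
    return roles_from_groups(groups)
```

Mapping on the group `id` rather than the display name avoids depending on a field that, as noted above, may not come back from Graph.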
@pv063910, we are also trying to implement SAML auth for Splunk Cloud Gateway. I am curious to know the flow of how users are logged into the Splunk instance in the mobile client.
Could you tell me what happens when a user opens the app on mobile?
OR
How do users register on the mobile app with scripted SAML authentication?
Would you be willing to share the changes you made to get this working? It sounds like we're in a very similar situation and I imagine the changes will be similar to what we need to do.
I wanted to add an update to this after some troubleshooting. When our crash occurred, a user in the admin group got the message that he did not have the cloud gateway role, though he should have had it. As soon as he tried to add a device, we began getting the 404 errors. This was after two users had successfully added devices to Cloud Gateway.
To get our search head back online, I renamed the script to 1azureScripted and restarted splunk. When I went back into the SAML configuration, I noticed that the "Script Functions" and "Script Secure Arguments" blocks were empty, although they both showed up in the authentication.conf file. I updated the file and the script function to the original values, which got things working again.
One thing that has me puzzled is that I had to update the azureScripted.py file to make it proxy aware, and in the process I messed up some of the space/tab combos, which resulted in an error, meaning that the script never could've executed successfully anyway.
It seems like we are able to generate tokens as long as something is entered in the script field. If a user has an error during device registration or Splunk is restarted, you will begin to see the 404/500 errors.
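A quick way to catch that kind of space/tab mistake before Splunk ever runs the script is to compile the file without executing it. The helper below is only a sketch with made-up function names. It reports errors according to the interpreter it is run with, so it is most useful when run with the same Python version Splunk uses for the script (Python 3 rejects inconsistent tab/space indentation, Python 2 is more lenient).

```python
def check_script_syntax(source, filename="azureScripted.py"):
    """Compile the source without running it; return an error string or None."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        # IndentationError and TabError are subclasses of SyntaxError.
        return "{}: line {}: {}".format(type(err).__name__, err.lineno, err.msg)
    return None


def check_script_file(path):
    """Read a script from disk and check it with check_script_syntax()."""
    with open(path, "r") as handle:
        return check_script_syntax(handle.read(), filename=path)
```

Calling `check_script_file()` on the edited script returns `None` when it compiles cleanly, or a short description of the first syntax or indentation error otherwise.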