Hello All!
Trying to set up CAC-based auth for Splunk 9.1.1 on Windows Server 2022 for the first time. I have successfully set up LDAP and am able to sign into Splunk using an AD username/password without any issues. When I add the requireClientCert, enableCertBasedUserAuth and certBasedUserAuthMethod settings to web.conf and attempt to access the Splunk GUI, all users are immediately greeted with an 'Unauthorized' message. I've been fighting this for about a week now, and Splunk support hasn't been able to help me pin this down yet. Any assistance would be greatly appreciated.
I've ensured TLS 1.2 registry keys exist in SCHANNEL to Enable TLS 1.2.
Corresponding logs from splunkd.log for the logon attempt are:
09-29-2023 09:02:43.191 -0400 INFO AuthenticationProviderLDAP [12404 TcpChannelThread] - Could not find user=" \x84\x07\xd8\xb6\x05" with strategy="123_LDAP"
09-29-2023 09:02:43.192 -0400 ERROR HTTPAuthManager [12404 TcpChannelThread] - SSO failed - User does not exist: \x84\x07\xd8\xb6\x05
09-29-2023 09:02:43.192 -0400 ERROR UiAuth [12404 TcpChannelThread] - user= \x84\x07\xd8\xb6\x05 action=login status=failure reason=sso-failed useragent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36" clientip=<ip>
09-29-2023 09:03:10.247 -0400 ERROR UiAuth [12404 TcpChannelThread] - SAN OtherName not found for configured OIDs in client certificate
09-29-2023 09:03:10.247 -0400 ERROR UiAuth [12404 TcpChannelThread] - CertBasedUserAuth: error fetching username from client certificate
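When a splunkd.log is large, it can help to pull out only the authentication-related lines before reading them. The following is a minimal Python sketch, assuming the log layout shown in the lines above (timestamp, level, component, thread, message). The function name auth_events and the choice of components to keep are illustrative assumptions, not anything Splunk ships.

```python
import re

# Layout assumed from the splunkd.log lines quoted in this thread:
# "<MM-DD-YYYY HH:MM:SS.mmm +ZZZZ> <LEVEL> <Component> [<thread>] - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>[A-Z]+) +(?P<component>\S+) \[[^\]]*\] - (?P<message>.*)$"
)

# Components that appear in the failed CAC logon above (an assumed, non-exhaustive set).
AUTH_COMPONENTS = {"AuthenticationProviderLDAP", "HTTPAuthManager", "UiAuth"}


def auth_events(lines):
    """Return (timestamp, level, component, message) for auth-related lines only."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("component") in AUTH_COMPONENTS:
            events.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("component"),
                    match.group("message"),
                )
            )
    return events


# Small demonstration using one of the lines from the post.
sample = [
    "09-29-2023 09:03:10.247 -0400 ERROR UiAuth [12404 TcpChannelThread] - "
    "SAN OtherName not found for configured OIDs in client certificate",
]
for ts, level, component, message in auth_events(sample):
    print(ts, level, component, message)
```

To run it against a real file, pass an open file handle (for example, open(path, encoding="utf-8", errors="replace")) to auth_events in place of the sample list.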
authentication.conf:
[splunk_auth]
minPasswordLength = 8
minPasswordUppercase = 0
minPasswordLowercase = 0
minPasswordSpecial = 0
minPasswordDigit = 0
[authentication]
authSettings = 123_LDAP
authType = LDAP
[123_LDAP]
SSLEnabled = 1
anonymous_referrals = 0
bindDN = CN=<Account>,OU=Service Accounts,OU=<Command Accounts>,DC=<Command>,DC=NAVY,DC=MIL
bindDNpassword = <removed>
charset = utf8
emailAttribute = mail
enableRangeRetrieval = 0
groupBaseDN = OU=SPLUNK Groups,OU=Groups,DC=<command>,DC=NAVY,DC=MIL
groupMappingAttribute = dn
groupMemberAttribute = member
groupNameAttribute = cn
host = DC.<Command>.NAVY.MIL
nestedGroups = 1
network_timeout = 20
pagelimit = -1
port = 636
realNameAttribute = displayName
sizelimit = 1000
timelimit = 15
userBaseDN = OU=Users,OU=<Command Accounts>,DC=<Command>,DC=NAVY,DC=MIL
userNameAttribute = userprincipalname
[roleMap_LDAP]
admin = SPLUNK AUDITOR
can_delete = SPLUNK AUDITOR
network = SPLUNK NETWORK
user = SPLUNK AUDITOR;SPLUNK USERS
web.conf
[settings]
enableSplunkWebSSL = true
privKeyPath = $SPLUNK_HOME\etc\auth\dodCerts\splunk2_key.pem
serverCert = $SPLUNK_HOME\etc\auth\dodCerts\splunk2_server.pem
sslPassword = <removed>
requireClientCert = true
sslRootCAPath = $SPLUNK_HOME\etc\auth\dodCerts\DoDRootCA3.pem
enableCertBasedUserAuth=true
SSOMode=permissive
trustedIP = 127.0.0.1
certBasedUserAuthMethod=PIV
server.conf
[sslConfig]
enableSplunkdSSL = true
sslRootCAPath = $SPLUNK_HOME\etc\auth\dodCerts\DoDRootCA3.pem
serverCert = $SPLUNK_HOME\etc\auth\dodCerts\splunk2_server.pem
sslPassword = <removed>
cliVerifyServerName = true
sslVersions = tls1.2
sslVerifyServerCert = true
[general]
serverName = SPKVSPLUNK2
pass4SymmKey = <removed>
trustedIP = 127.0.0.1
I am also digging into this. It is helpful to know that the documentation for newer Splunk versions has moved to a different website: How to prepare TLS certificates for use with the Splunk platform | Splunk Docs
It looks like Splunk changed how some of this works in the 9.4 to 10 releases. That said, the page linked above is a good way to verify that your PEM files are correct.
Currently we are stuck on the:
SAN OtherName not found for configured OIDs in client certificate
CertBasedUserAuth: error fetching username from client certificate
I did get this to work finally just a few weeks ago.
The BLUF reason is the account used to do the LDAP queries did not have an SPN setup that allowed Kerberos delegation to happen.
To accomplish this I did the following:
In an elevated cmd/powershell prompt
setspn -S http/splunkserver.domain domain\ldapaccountname
In Active Directory:
Open the account properties and go to the delegation tab, ensure it is either set to trust for any service (Kerberos only), or trust for specified services and configured appropriately. I selected to trust for any service.
In web.conf under [settings]:
requireClientCert = true
enableCertBasedUserAuth = true
SSOMode = permissive
trustedIP = 127.0.0.1
certBasedUserAuthMethod = PIV
certBasedUserAuthPivOidList = Microsoft Universal Principal Name
allowSsoWithoutChangingServerConf = 1
in server.conf under [sslConfig]:
sslVerifyServerCert = true
sslVerifyServerName = true
enableSplunkdSSL = true
requireClientCert = false
in authentication.conf under the section named for your LDAP strategy:
SSLEnabled = 1
anonymous_referrals = 0
bindDN = CN=Account,OU=OU,OU=OU,DC=DC,DC=DC,DC=DC
bindDNpassword = <encrypted password>
charset = utf8
emailAttribute = mail
enableRangeRetrieval = 0
groupBaseDN = OU=Groups,DC=DC,DC=DC,DC=DC
groupMappingAttribute = dn
groupMemberAttribute = member
groupNameAttribute = cn
host = server.domain
nestedGroups = 0
network_timeout = 20
pagelimit = -1
port = 636
realNameAttribute = cn
sizelimit = 1000
timelimit = 15
userBaseDN = OU=Accounts,DC=DC,DC=DC,DC=DC
userNameAttribute = userprincipalname
in authentication.conf under [authentication]:
authSettings = [name of your LDAP strategy]
authType=LDAP
I'll have to try the ldap service account issue a little later, but I have a hard time believing that's my issue since I'm using the same service account for both old and new servers... but maybe.
Question though, in your server certificate chain, and your private key file, does your private key stay:
-----BEGIN RSA PRIVATE KEY-----
or
-----BEGIN PRIVATE KEY-----
I did confirm that if I bring my exported pfx file over to my old splunk 9.x server's bin folder, and run the openssl commands there to pull out the private key, I'm left with an RSA version of the private key.
If I use that same pfx file into my splunk 10 server's bin folder and run the same commands, I'm left with a non-RSA version of the private key.
I tried injecting both versions into my chain and private.pem files but neither way made any difference. Still "cannot decrypt token" error.
Mine is
-----BEGIN PRIVATE KEY-----
gobbleygook
-----END PRIVATE KEY-----
Are you using port 8443??? I don't see the paths to your privkey/server certs.
[settings]
### START SPLUNK WEB USING HTTPS:8443 ###
enableSplunkWebSSL = 1
httpport = 8443
privKeyPath = $SPLUNK_HOME\etc\auth\DOD.web.certificates\privkey.pem
serverCert = $SPLUNK_HOME\etc\auth\DOD.web.certificates\cert.pem
### TOKEN AUTHENTICATION ###
requireClientCert = true
sslRootCAPath = $SPLUNK_HOME\etc\auth\DOD.web.certificates\dod_chain.pem
enableCertBasedUserAuth = true
SSOMode = permissive
trustedIP = 127.0.0.1
certBasedUserAuthMethod = PIV
certBasedUserAuthPivOidList = Microsoft Universal Principal Name
allowSsoWithoutChangingServerConf = 1
I am not using 8443; we changed it to 443 since it's the only service on the server.
I do have the other items in those conf sections. I just assumed they would already be there, so I only supplied the items that were specifically needed for CAC-based auth to work. For reference, this is what I have:
server.conf
[sslConfig]
serverCert = $SPLUNK_HOME\etc\auth\server.pem
sslRootCAPath = $SPLUNK_HOME\etc\auth\cacert.pem
sslPassword = <encrypted password>
sslVersions = tls1.2
# CAC Based Login
sslVerifyServerCert = true
sslVerifyServerName = true
enableSplunkdSSL = true
requireClientCert = false
server.conf
[general]
serverName = splunkServer
pass4SymmKey = <encrypted password>
# CAC Based Login
trustedIP = 127.0.0.1
web.conf:
[settings]
enableSplunkWebSSL = 1
httpport = 443
serverCert = $SPLUNK_HOME\etc\auth\dodCerts\splunk2_server.pem
privKeyPath = $SPLUNK_HOME\etc\auth\dodCerts\splunk2_key.pem
sslPassword = <encrypted password>
sslRootCAPath = $SPLUNK_HOME\etc\auth\dodCerts\DoDRootCA3.pem
sslVersions = tls1.2
#CAC Based Login Stuff
requireClientCert = true
enableCertBasedUserAuth = true
SSOMode = permissive
trustedIP = 127.0.0.1
certBasedUserAuthMethod = PIV
certBasedUserAuthPivOidList = Microsoft Universal Principal Name
allowSsoWithoutChangingServerConf = 1
How did you create the certs? DoD NPE Portal?
We got it working with just "certBasedUserAuthPivOidList = Microsoft Universal Principal Name".
I shared my configs further up, please have a look
Yes, the certs were created using the NPE portal and then later converted to PEM using the openSSL package contained in Splunk
Same way I've created all of our servers. I think I remember one of your posts earlier on saying that we didn't need the "Smart Card Logon" key usage. We do have that on all of our certs because from day 1, 4 years ago, I wasn't sure if it was needed or not, so I threw it on there. I don't think it's hurting anything because both our production server and this new server have the usage ability. I just removed the OID number from that OidList field and left only Microsoft Universal Principal Name, but it still doesn't work.
Question - is your software V 10? or still 9.x? Since these new ones are virtual, I'm debating about taking a snapshot so I don't lose my progress, but then uninstalling 10 and going back to 9.x to see if it's a problem with the latest version.
Also, is your installation on NIPR or, let's just say "Other"?
The fact that my error message says "Cannot decrypt token" leads me to believe it's something wrong with my certs. But they seem fine as far as I can tell. They are issued by a different intermediate CA than our production system was, but I do have the proper updated chain in the CA cert pem file.
Correct - all I requested from DoD NPE was a vanilla 3-year TLS cert (no extended key usage). Yes, we have this working on 9.x on [X]IPR. Might there be something wrong with your smart card middleware? Not being able to decrypt the token sounds like maybe there's an issue passing the PIN?
ok, if you are on 9, I may just do that uninstall/reinstall. I really don't want to but, it will rule that out as a factor at least.
I don't think it's my middleware because from this same workstation, I can open a second tab in my browser and connect to our old Splunk and it authenticates just fine. I already emailed my Splunk support rep to ask if they have any known issues with CAC auth on version 10, but my rep hasn't responded yet.
My only other thought is something odd I noticed with the private key. And maybe it's the version of openssl that's bundled with Splunk 10 causing this.
When I go through the process of pulling the private key, the last openssl command creates a PEM version that's not password protected.
On all of our old systems (including desktops with the forwarder installed and their private key is used there too), the start is:
-----BEGIN RSA PRIVATE KEY-----
But, when running the commands in this new Splunk 10 openssl, the resulting file just says:
-----BEGIN PRIVATE KEY-----
It doesn't specify "RSA". I kind of doubt this is the problem but I'm going to pull my key over to the previous server to generate the private key there to see if the result is the same.
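For reference, the two headers name two encodings of a private key: "BEGIN RSA PRIVATE KEY" is the traditional PKCS#1 format, while "BEGIN PRIVATE KEY" is unencrypted PKCS#8. OpenSSL 3.x writes PKCS#8 by default where 1.x often wrote PKCS#1, so a difference in the bundled openssl version would explain the change. That is an assumption here, since the thread does not confirm which openssl each Splunk release bundles. A minimal Python sketch for checking what a PEM file actually contains (the function name classify_private_keys is illustrative):

```python
import re

# Well-known PEM labels for private keys and the encoding each one denotes.
PEM_KEY_FORMATS = {
    "RSA PRIVATE KEY": "PKCS#1 (traditional RSA)",
    "PRIVATE KEY": "PKCS#8, unencrypted",
    "ENCRYPTED PRIVATE KEY": "PKCS#8, password-protected",
    "EC PRIVATE KEY": "SEC1 (traditional EC)",
}


def classify_private_keys(pem_text):
    """Return (label, format) for every private-key block found in pem_text.

    Other blocks, such as CERTIFICATE, are ignored.
    """
    labels = re.findall(r"-----BEGIN ([A-Z0-9 ]+)-----", pem_text)
    return [
        (label, PEM_KEY_FORMATS[label])
        for label in labels
        if label in PEM_KEY_FORMATS
    ]


# Demonstration with placeholder key bodies (not real key material).
old_style = "-----BEGIN RSA PRIVATE KEY-----\n...\n-----END RSA PRIVATE KEY-----\n"
new_style = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
print(classify_private_keys(old_style))
print(classify_private_keys(new_style))
```

If the traditional header is wanted from an OpenSSL 3.x build, the openssl rsa command accepts a -traditional option that writes PKCS#1 instead of PKCS#8. Whether Splunk treats the two encodings differently is not established in this thread.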
These are the instructions we followed:
https://lantern.splunk.com/Splunk_Platform/Product_Tips/Administration/Configuring_Splunk_for_Common...
I recall this command not working:
I don't know HOW this worked but we just ran "mv mySplunkWebPrivateKey.key privkey.pem" ...
Should be in the splunkd.log. Here is an example from someone's previous post:
09-29-2023 09:02:43.191 -0400 INFO AuthenticationProviderLDAP [12404 TcpChannelThread] - Could not find user=" \x84\x07\xd8\xb6\x05" with strategy="123_LDAP"
09-29-2023 09:02:43.192 -0400 ERROR HTTPAuthManager [12404 TcpChannelThread] - SSO failed - User does not exist: \x84\x07\xd8\xb6\x05
If you are not seeing failed logins in your splunkd.log, you can try updating the log.cfg or log-local.cfg file to add debugging. This should give you more information in the splunkd.log. The log.cfg/log-local.cfg file is located in the .../splunk/etc directory.
Find "category.AuthenticationProviderLDAP=INFO" and change INFO to DEBUG.
Restart the Splunk service.
This should at least give you the username it is finding. There may be other options you can change to DEBUG to give you more information.
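The edit described above can also be scripted, which is handy when several categories need flipping to DEBUG and back. This is a minimal Python sketch that works on the file contents as a string. The function name set_category_level is hypothetical, and the sample text only assumes the "category.Name=LEVEL" line format mentioned above.

```python
import re


def set_category_level(cfg_text, category, level):
    """Change 'category.<category>=<old>' to the given level.

    Returns (new_text, number_of_lines_changed). Nothing is appended when
    the category is absent, because categories belong to a specific stanza
    and blindly appending could place the line under the wrong one.
    """
    pattern = re.compile(
        rf"^([ \t]*category\.{re.escape(category)}[ \t]*=[ \t]*)\w+[ \t]*$",
        re.MULTILINE,
    )
    # Keep everything up to and including '=' and swap only the level.
    return pattern.subn(lambda m: m.group(1) + level, cfg_text)


# Demonstration on a two-line excerpt in the assumed format.
excerpt = (
    "category.AuthenticationProviderLDAP=INFO\n"
    "category.UiAuth=INFO\n"
)
updated, changed = set_category_level(excerpt, "AuthenticationProviderLDAP", "DEBUG")
print(changed)
print(updated)
```

Read and write the file in text mode so Windows line endings are normalized, keep a backup copy first, and restart the Splunk service afterwards as described above.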
I configured the DEBUG logs and will hopefully have more to go off next week. One thing I thought I'd bring up is some confusion regarding the Certificate Profile, Subject Alternative Names (SANs) and Extended Key Usage we should be specifying when we send our CSRs to the high-side PKI portal.
Certificate Profiles
We have a few certificate profiles to choose from, including: device, domain controller, device/TLS/Application Email, Mini Crypto Key Agreement, Encrypted File System, IPSEC, Mini Crypto Authentication, Robotic Process (RPA/BO), and what seemed to be the most fitting, TLS Server.
Subject Alternative Names (loopback?)
From there, we've been defining the Subject Alternative Name that makes the most sense, the device IP address. However, I'm being told we should be using 127.0.0.1 instead - what's your take on that?
Key Usage/Extended Key Usage Selections
When selecting the TLS Server certificate profile, the default Key Usages are selected:
Key Usage:
digitalSignature
keyEncipherment
Extended Key Usage:
id-kp-serverAuth
I'm being told that we need to include Extended Key Usages for smartCardAuth and possibly id-kp-clientAuth. The problem is that when we select additional EKUs for the TLS Server profile, the portal, instead of automatically approving the CSR, kicks it back with the following error: "An extended key usage was found that requires the certificate application to be queued. EKU for "smarCardLogon" for profile "tlsServer" requires the certificate application to be queued". This MASSIVELY slows down the troubleshooting process, making it quite difficult to troubleshoot iteratively (not Splunk's problem, but I needed to lament the hardship).
I know this is a LOT and truly appreciate everyone's help on this. Some of my peers have been trying to figure this out for over a year! If we can figure this out it'd be a massive win for a whole bunch of burnt-out sys admins ❤️
goal/tl;dr - confirm the minimal, PKI-approved certificate profile, SAN list, and EKU set that Splunk Web needs for CAC / SIPR-token authentication, so we can eliminate certificate-format variables from our troubleshooting. Thanks again!
I'm not sure if anyone has found the exact problem in your situation, but it looks like you may be missing the attribute certBasedUserAuthPivOidList. I do see errors for OID not found in client cert. The default value is Microsoft Universal Principal Name, but you may need to change it. Or try changing certBasedUserAuthMethod from PIV to EDIPI. Hope this helps.
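A quick way to spot a missing setting like this is to compare the [settings] stanza of web.conf against the keys used in the working configs posted in this thread. The sketch below does that in Python. The key list is taken from those posts and is an assumption, not an official list of required settings, and the function name missing_cac_settings is illustrative.

```python
import configparser

# Keys set in the working web.conf examples from this thread (not an official list).
CAC_KEYS = [
    "requireClientCert",
    "enableCertBasedUserAuth",
    "certBasedUserAuthMethod",
    "certBasedUserAuthPivOidList",
    "SSOMode",
    "trustedIP",
    "allowSsoWithoutChangingServerConf",
]


def missing_cac_settings(web_conf_text):
    """Return the CAC_KEYS entries absent from the [settings] stanza."""
    # interpolation=None keeps '$' and '%' literal; strict=False tolerates repeated keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve key case instead of lowercasing
    parser.read_string(web_conf_text)
    if not parser.has_section("settings"):
        return list(CAC_KEYS)
    present = set(parser["settings"].keys())
    return [key for key in CAC_KEYS if key not in present]


# The CAC-related part of the web.conf from the original post.
original_post = """[settings]
enableSplunkWebSSL = true
requireClientCert = true
enableCertBasedUserAuth=true
SSOMode=permissive
trustedIP = 127.0.0.1
certBasedUserAuthMethod=PIV
"""
print(missing_cac_settings(original_post))
```

This only checks one file as text. Splunk merges settings from several locations, so the output of splunk btool web list settings is the better source for the effective values.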
Did you get this figured out? We are currently fighting the same issue.
Were you able to find any resolution to this?