- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
User=unknown in Authentication DataModel

Symptom:
Our authentication datamodel is showing user=Unknown for events that have a username defined in the log.
Example:
2020-02-07 09:31:11,161 - xxx.xxx.xxx.xxxx - INFO [net.shibboleth.idp.authn.duo.impl.ValidateDuoWebResponse:200] - Profile Action ValidateDuoWebResponse: Duo authentication succeeded for 'user1'
2020-02-07 09:31:10,527 - xxx.xxx.xxx.xxx - INFO [net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAP:152] - Profile Action ValidateUsernamePasswordAgainstLDAP: Login by 'user2' succeeded
The fields look ok, except user=unknown (respectively, below):
action = success | app = shibboleth | src_user = user1 | tag = authentication tag = success | user = unknown
action = success | app = shibboleth | src_user = user2 | tag = authentication tag = success | user = unknown
I thought that adding a field alias in the props.conf for this app would do the trick...
But it still seems to display user=unknown for the datamodel...
Here is an eval expression for the datamodel definition:
src_user=if(isnull(src_user) OR src_user="","unknown",src_user), user=if(isnull(user) OR user="","unknown",user)
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Here's a funny interaction...
With the search:
index="shib" host="myhost" signature="INFO [net.shibboleth.idp.authn.duo.impl.ValidateDuoWebResponse"
| where isnull(user)
I get what you would expect... a number of problematic logs where for some reason the user field is not populated..
After some time, I did the search again, and it says No Results Found.
user
100 Values, 99.139% of events
When I ran this search initially, it found ~25% of events with user value....
Is this because it is performing the field extraction at search-time?!
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Everything you have defined in that TA is a searchtime extraction.
As a general rule, you want fields to be searchtime extractions (they are more flexible and fast)
In other words, unless you have a specific need to use index time extractions you are best to avoid them.
If your search is now returning good data for the search fields, I wonder if your data model will also include them correctly populated too?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Doesn't look like it...
| from datamodel Authentication
| search app=shibboleth user=unknown
2/10/20
4:18:13.170 PM
2020-02-10 16:18:13,170 - XXX.XXX.XXX.XXX - INFO [net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAP:152] - Profile Action ValidateUsernamePasswordAgainstLDAP: Login by 'user1' succeeded
action = success successapp = shibboleth dest = unknown signature = INFO [net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAPsource = /var/log/shibboleth-idp/idp-process.log sourcetype = idp-process src = unknown src_user = unknown
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

index="shib"
| where isnull(user)
| search "net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAP:152"
And there it is...
a similar search:
index="shib" "net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAP:152"
user
100 Values, 100% of events
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

The datamodel uses src_user and user for different purposes.
"src_user" is used to refer to the user who initiated the action, where there is an escalation or modification
"user" is the account on which something was modified in escalated events, or where there is no escalation it should be the initiator.
https://docs.splunk.com/Documentation/CIM/4.14.0/User/Authentication
It looks like your shibboleth integration is populating the wrong field, which its why its missing in the datamodel.
Ideally, you want to change the field extraction so that shibboleth populates user (not src_user)
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@nickhillscpl
Thanks for the explanation!
I have this defined in our props:
[idp-process]
EVAL-app = "shibboleth"
...
EXTRACT-user,vendor_action = Login by '(?P<user>\S+)'\s+(?P<vendor_action>.+)
EXTRACT-signature,vendor_action,user = net\.shibboleth\.idp\.authn\..*\] \- (?<signature>.*?)\:.*?\s+(?P<vendor_action>\S+) for '(?P<user>\S+)'
EXTRACT-signature = \.\d{1,3}\s\-\s(?P<signature>[^:]+)\:
EXTRACT-signature,user,vendor_action = net\.shibboleth\.idp\.authn\..*\] \- (?<signature>.*?)\:.*(?<user>No username)\s.*(?<vendor_action>UsernamePasswordContext)
Which seems to address the two use-cases that I have there... seems to return nothing.
2020-02-10 10:13:22,181 - xxx.xxx.xxx.xxx - INFO [net.shibboleth.idp.authn.duo.impl.ValidateDuoWebResponse:200] - Profile Action ValidateDuoWebResponse: Duo authentication succeeded for 'user1'
2020-02-10 10:13:21,414 - xxx.xxx.xxx.xxx - INFO [net.shibboleth.idp.authn.impl.ValidateUsernamePasswordAgainstLDAP:152] - Profile Action ValidateUsernamePasswordAgainstLDAP: Login by 'user1' succeeded
I get neither field user or src_user.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I agree, the extractions and regex looks (and tests) good.
Are you looking at the props.conf on your searchhead?
Do the fields extract correctly if you search your logs?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

They did not appear to extract properly when I search the log directly.
e.g.
index=shib sourcetype=idp-process
Shows events with no user, this is likely why the DataModel shows user=unknown!
This TA is deployed on the Indexer, SH, and Enterprise Security SH
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Is the TA, and its Knowledge objects shared Globally?
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

It would appear so..
TA-Shib -- global, everyone
index=shib -- global, everyone
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Oh, so you are not using this: https://splunkbase.splunk.com/app/4389/ ??
Can you paste the whole props.conf and transforms.conf? (redacted if necessary)
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

We are not using the Shib splunkbase app..
I'm not sure why not.
Here are the props.conf and transforms.conf
Props.conf
[idp-audit]
CHARSET=UTF-8
MAX_TIMESTAMP_LOOKAHEAD=18
#REPORT-idp_audit=REPORT-idp_audit
SHOULD_LINEMERGE=false
disabled=false
TIME_FORMAT=%Y%m%dT%H%M%SZ
TIME_PREFIX=|
TZ=UTC
REPORT-idp-audit = REPORT-idp-audit
##From shib docs field name is 'a'
FIELDALIAS-src_ip = SSOclientIP AS src_ip
FIELDALIAS-src = SSOclientIP AS src
##field name is 'u'
FIELDALIAS-user = SSOprincipalName AS user
FIELDALIAS-url = SSOrelyingPartyID AS url
FIELDALIAS-SSOrelayingPartyID = SSOrelyingPartyID AS SSOrelayingPartyID
[idp-process]
EVAL-app = "shibboleth"
EVAL-action = case(vendor_action="succeeded" OR vendor_action="true","success", vendor_action="failed" OR vendor_action="false","failure")
FIELDALIAS-dest = SSOrelayingPartyID AS dest
EXTRACT-log_level,process,pid = ^(?:[^ \n]* ){3}(?P<log_level>\w+)\s+\[(?P<process>[^\]:]+)\:(?<pid>\d+)\]
EXTRACT-src_ip = \] \- (?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
EXTRACT-user,vendor_action = Login by '(?P<user>\S+)'\s+(?P<vendor_action>.+)
EXTRACT-signature,vendor_action,user = net\.shibboleth\.idp\.authn\..*\] \- (?<signature>.*?)\:.*?\s+(?P<vendor_action>\S+) for '(?P<user>\S+)'
EXTRACT-signature = \.\d{1,3}\s\-\s(?P<signature>[^:]+)\:
EXTRACT-signature,user,vendor_action = net\.shibboleth\.idp\.authn\..*\] \- (?<signature>.*?)\:.*(?<user>No username)\s.*(?<vendor_action>UsernamePasswordContext)
REPORT-idp-process-fields = REPORT-idp-process-fields
##From shib docs field name is 'a'
FIELDALIAS-src_ip = SSOclientIP AS src_ip
FIELDALIAS-src = SSOclientIP AS src
##field name is 'u'
FIELDALIAS-user = SSOprincipalName AS user
FIELDALIAS-url = SSOrelyingPartyID AS url
FIELDALIAS-SSOrelayingPartyID = SSOrelyingPartyID AS SSOrelayingPartyID
[idp-warning]
EXTRACT-log_level,process,pid = ^(?:[^ \n]* ){3}(?P<log_level>\w+)\s+\[(?P<process>[^\]:]+)\:(?<pid>\d+)\]
EXTRACT-src_ip = \] \- (?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
EXTRACT-user,vendor_action = Login by '(?P<user>\S+)'\s+(?P<vendor_action>.+)
EXTRACT-signature = \.\d{1,3}\s\-\s(?P<signature>[^:]+)\:
Transforms.conf:
[REPORT-idp-audit]
DELIMS = "|"
FIELDS = "SSOclientIP","SSOauditEventTime","SSOrequestBinding","SSOrequestId","SSOrelyingPartyID","SSOmessageProfileId","SSOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMethod","SSOreleasedAttributeID","SSOnameIdentifier","SSOassertionID"
[REPORT-idp-process-fields]
DELIMS = "|"
FIELDS = "event_headers","SSOrequestBinding","SSOrequestId","SSOrelayingPartyID","SSOmessageProfileId","SSOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMethod","SSOreleasedAttributeID","SSOnameIdentifier","SSOassertionID"
And apparently /local/props.conf
[idp-audit]
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TIME_FORMAT=%Y%m%dT%H%M%SZ
TIME_PREFIX=\|
TZ=America/New_York
TRUNCATE=99999
FIELDALIAS-src_ip = Client_IP AS src_ip
FIELDALIAS-src = Client_IP AS src
FIELDALIAS-user = Username AS user
[idp-process]
LINE_BREAKER = ([\n\r]+)\d{4}-\d{1,2}-\d{1,2}\s\d+:\d+:\d+
TIME_FORMAT=%Y-%m-%d %H:%M:%S,%3Q
TZ=America/New_York
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TRUNCATE=99999
EXTRACT-src_ip = \|(?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\|
EXTRACT-message = (\]\s-\s)(?P<message>[a-zA-Z0-9:?\s.\(\):\/=%&]+)([\n\r]+)
EXTRACT-reason= --(?P<potential_reason>[a-zA-Z0-9:?\s\(\):\/=%&]+\S)\s+at
EXTRACT-examples = \s+at\s+(?P<example_of__reason>[a-zA-Z0-9:?\s\(\):\/=%&.]+)\s
[idp-warning]
LINE_BREAKER = ([\n\r]+)\d{4}-\d{1,2}-\d{1,2}\s\d+:\d+:\d+
TIME_FORMAT=%Y-%m-%d %H:%M:%S,%3Q
TZ=America/New_York
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TRUNCATE=99999
EXTRACT-src_ip = \|(?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\|
EXTRACT-message = (\]\s-\s)(?P<message>[a-zA-Z0-9:?\s.\(\):\/=%&]+)([\n\r]+)
EXTRACT-reason= --(?P<potential_reason>[a-zA-Z0-9:?\s\(\):\/=%&]+\S)\s+at
EXTRACT-examples = \s+at\s+(?P<example_of_reason>[a-zA-Z0-9:?\s\(\):\/=%&.]+)\s
And local/transforms.conf
[REPORT-idp-audit]
DELIMS = "|"
FIELDS = "Timestamp","Client_IP","Inbound_binding","Inbound_message_ID","Service_provider_name","Profile_ID","Identity_provider_name","Outbound_binding","Outbound_message_ID","Username","AuthenticationContext","Attributes","NameID_value"
#FIELDS = "timestamp","SSOclientIP","SSOauditEventTime","SSOrequestBinding","SSOrequestId","SSOrelyingPartyID","SSOmessageProfileId","SSOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMethod","SSOrelease
dAttributeID","SSOnameIdentifier","SSOassertionID"
[REPORT-idp-process-fields]
DELIMS = "|"
FIELDS = event_headers,SSOclientIP,SSOrequestBinding,SSOrequestId,SSOrelayingPartyID,SSOmessageProfileId,SSOassertingPartyId,SSOresponseBinding,SSOresponseId,SSOprincipalName,SSOauthNMethod,SSOreleasedAttributeID,SSOnameIdentifier,SSOasser
tionID,field15,field16,url,uri_path,field19,signature,vendor_action,http_user_agent,field23
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I cant see anything wroing with your config - moreover, i cant see any definition for src_user!!
My thought process therefore, is where is src_user comming from?
Have a look in your field extractions/aliases/calc.fields for anything which monkeying with src_user which could apply to thet source type.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

It looks like src_user is not defined in the sourcetype=idp-process
Rather, it is coming through from the datamodel definition:
src_user=if(isnull(src_user) OR src_user="","unknown",src_user)
The implementer here also created a separate TA: TA-shib_inputs
Which has additional props/transforms.conf:
**default/props.conf**
[idp-audit]
CHARSET=UTF-8
MAX_TIMESTAMP_LOOKAHEAD=18
#REPORT-idp_audit=REPORT-idp_audit
SHOULD_LINEMERGE=false
disabled=false
TIME_FORMAT=%Y%m%dT%H%M%SZ
TIME_PREFIX=\|
TZ=UTC
REPORT-idp-audit = REPORT-idp-audit
##From shib docs field name is 'a'
FIELDALIAS-src_ip = SSOclientIP AS src_ip
FIELDALIAS-src = SSOclientIP AS src
##field name is 'u'
FIELDALIAS-user = SSOprincipalName AS user
FIELDALIAS-url = SSOrelyingPartyID AS url
FIELDALIAS-SSOrelayingPartyID = SSOrelyingPartyID AS SSOrelayingPartyID
[idp-process]
EVAL-app = "shibboleth"
EVAL-action = case(vendor_action="succeeded","success", vendor_action="failed","failure")
EXTRACT-log_level,process,pid = ^(?:[^ \n]* ){3}(?P\w+)\s+\[(?P[^\]:]+)\:(?\d+)\]
EXTRACT-src_ip = \] \- (?P\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
EXTRACT-user,vendor_action = Login by '(?P\S+)'\s+(?P.+)
EXTRACT-signature = \.\d{1,3}\s\-\s(?P[^:]+)\:
REPORT-idp-process-fields = REPORT-idp-process-fields
##From shib docs field name is 'a'
FIELDALIAS-src_ip = SSOclientIP AS src_ip
FIELDALIAS-src = SSOclientIP AS src
##field name is 'u'
FIELDALIAS-user = SSOprincipalName AS user
FIELDALIAS-url = SSOrelyingPartyID AS url
FIELDALIAS-SSOrelayingPartyID = SSOrelyingPartyID AS SSOrelayingPartyID
**default/transforms.conf:**
[REPORT-idp-audit]
DELIMS = "|"
FIELDS = "SSOclientIP","SSOauditEventTime","SSOrequestBinding","SSOrequestId","SSOrelyingPartyID","SSOmess
ageProfileId","SSOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMetho
d","SSOreleasedAttributeID","SSOnameIdentifier","SSOassertionID"
[REPORT-idp-process-fields]
DELIMS = "|"
FIELDS = "event_headers","SSOrequestBinding","SSOrequestId","SSOrelayingPartyID","SSOmessageProfileId","SS
OassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMethod","SSOreleasedAt
tributeID","SSOnameIdentifier","SSOassertionID"
**local/props.conf**
[idp-audit]
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TIME_FORMAT=%Y%m%dT%H%M%SZ
TIME_PREFIX=\|
TZ=EST
TRUNCATE=99999
FIELDALIAS-src_ip = Client_IP AS src_ip
FIELDALIAS-src = Client_IP AS src
FIELDALIAS-user = Username AS user
[idp-process]
LINE_BREAKER = \d{4}-\d{1,2}-\d{1,2}\s\d+:\d+:\d+
TIME_FORMAT=%Y-%m-%d\s%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TZ=EST
TRUNCATE=99999
EXTRACT-loglevel = \d{3}\s\-\s(?P\w+)\s
EXTRACT-logger = \[(?P.*):\d+\]
EXTRACT-logline = \[.*:(?P\d+)\]
EXTRACT-message = (\]\s-\s)(?P[a-zA-Z0-9:?\s.\(\):\/=%&]+)(\s'|([\n\r]+)\d+)
EXTRACT-key_value = '(?P[a-zA-Z0-9]+)'
EXTRACT-reason= --(?P[a-zA-Z0-9:?\s\(\):\/=%&]+\S)\s+at
EXTRACT-examples = \s+at\s+(?P[a-zA-Z0-9:?\s\(\):\/=%&.]+)\s
[idp-warning]
LINE_BREAKER = \d{4}-\d{1,2}-\d{1,2}\s\d+:\d+:\d+
TIME_FORMAT=%Y-%m-%d\s%H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD=28
SHOULD_LINEMERGE=false
disabled=false
TZ=EST
TRUNCATE=99999
EXTRACT-loglevel = \d{3}\s\-\s(?P\w+)\s
EXTRACT-logger = \[(?P.*):\d+\]
EXTRACT-logline = \[.*:(?P\d+)\
EXTRACT-message = (\]\s-\s)(?P[a-zA-Z0-9:?\s.\(\):\/=%&]+)(\s'|([\n\r]+)\d+)
EXTRACT-key_value = '(?P[a-zA-Z0-9]+)'
EXTRACT-reason= --(?P[a-zA-Z0-9:?\s\(\):\/=%&]+\S)\s+at
EXTRACT-examples = \s+at\s+(?P[a-zA-Z0-9:?\s\(\):\/=%&.]+)\s
**local/transforms.conf**
[REPORT-idp-audit]
DELIMS = "|"
FIELDS = "Client_IP","Timestamp","Inbound_binding","Inbound_message_ID","Service_provider_name","Profile_I
D","Identity_provider_name","Outbound_binding","Outbound_message_ID","Username","AuthenticationContext","A
ttributes","NameID_value"
#FIELDS = "SSOclientIP","SSOauditEventTime","SSOrequestBinding","SSOrequestId","SSOrelyingPartyID","SSOmes
sageProfileId","SSOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMeth
od","SSOreleasedAttributeID","SSOnameIdentifier","SSOassertionID"
#[REPORT-idp-process-fields]
#DELIMS = "|"
#FIELDS = "event_headers","SSOrequestBinding","SSOrequestId","SSOrelayingPartyID","SSOmessageProfileId","S
SOassertingPartyId","SSOresponseBinding","SSOresponseId","SSOprincipalName","SSOauthNMethod","SSOreleasedA
ttributeID","SSOnameIdentifier","SSOassertionID"
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

It looks like src_user is not defined in the sourcetype=idp-process Rather, it is coming through from the datamodel definition:
No, its the other way round - Datamodels take thier fields from extractions, a datamodel can't add fields to a search.
Back in your original question you said that the DM returns:
action = success | app = shibboleth | src_user = user1 | tag = authentication tag = success | user = unknown
action = success | app = shibboleth | src_user = user2 | tag = authentication tag = success | user = unknown
So its getting src_user from somewhere, just not from this TA as far as I can see.
