We are successfully ingesting Websense logs into Splunk, but the user field is recorded in LDAP context and contains spaces. Splunk's default field extraction truncates the user field at the end of the LDAP host, leaving the user context and name unextracted. While I can use a search-time rex to extract that data, I think it would be better to do this at index time so that the Websense app functions as expected and so that we're not custom-defining the full variety of records we may receive. Is there a way to modify the default extraction to delimit this field properly?
A sample record is:
Jul 3 12:41:10 10.10.2.4 vendor=Websense product=Security product_version=7.8.3 action=permitted severity=1 category=109 user=LDAP://addc1.my.ad.dom OU=Users,OU=Office,DC=my,DC=ad,DC=dom/John Mark Sanders src_host=10.22.70.18 src_port=0 dst_host=www.google.com:80 dst_ip=74.125.228.81 dst_port=80 bytes_out=548 bytes_in=0 http_response=0 http_method=GET http_content_type=- http_user_agent=Mozilla/4.0_(compatible;MSIE_8.0;_Windows_NT_6.1;_WOW64;_Trident/4.0;_SLCC2;.NET_CLR_2.0.50727;.NET_CLR_3.5.30729;.NET_CLR_3.0.30729;Media_Center_PC_6.0;.NET4.0C;_InfoPath.3) http_proxy_status_code=0 reason=- disposition=1026 policy=RegularUser role=8 duration=0 url=h_t_t_p://www.google.com:80/images/google_favicon_128.png
The rex delimiters for this would obviously be user= and src_host=, with the user value being the text between them.
I'm relatively new to Splunk transformations, so I apologize if this is obvious.
Thanks in advance.
You could define a Field Extraction (Settings -> Fields) for that sourcetype using this expression:
user=(?<user>.+?)\s+src_host
That'll look for "user=" and capture as few characters as possible until it encounters whitespace followed by "src_host".
Check if there are any issues around the automatically extracted partial user value.
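If you want to sanity-check the expression outside Splunk first, here is a minimal Python sketch that runs it against a shortened copy of the sample record. Two assumptions to note: Python's re module writes named groups as (?P<user>...) rather than Splunk's (?<user>...), and I've added the trailing "=" after src_host to make the match a little stricter. The helper name extract_user is just for illustration.

```python
import re

# Python spells named groups (?P<name>...); Splunk's PCRE accepts (?<name>...).
# The trailing "=" after src_host is an addition for a stricter match.
USER_RE = re.compile(r"user=(?P<user>.+?)\s+src_host=")

# Shortened copy of the sample record from the question.
sample = (
    "Jul 3 12:41:10 10.10.2.4 vendor=Websense product=Security "
    "action=permitted severity=1 category=109 "
    "user=LDAP://addc1.my.ad.dom OU=Users,OU=Office,DC=my,DC=ad,DC=dom/John Mark Sanders "
    "src_host=10.22.70.18 src_port=0 dst_host=www.google.com:80"
)


def extract_user(event):
    """Return the full user value (spaces included), or None if absent."""
    match = USER_RE.search(event)
    return match.group("user") if match else None


user = extract_user(sample)
print(user)
```

The lazy .+? stops at the first whitespace that is followed by src_host=, so the OU path and the full name stay in a single value.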