Re: Why does extracted field search work for certain dates but not some other dates?

gnshah12345 · ‎05-03-2023

I created an extracted field called remote_user. For certain dates, my search returns the field value correctly; for other dates, the same search does not. I checked the events, and the extracted field is malformed on the dates with the problem. The remote_user value should look like "CompanyName John_doe". On the days when the search works, remote_user shows "CompanyName John_doe". On the days when it does not work, the field shows only "CompanyName". How can the same extracted field behave differently on different dates? Any suggestions?

seemanshu · ‎05-03-2023

Hi @gnshah12345 ,

You can use the following regular expression to extract the required "remote_user" field.

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s-\s(?<remote_user>[^\[]+?)\s\[
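A pattern of this kind can be checked outside Splunk before saving it as a field extraction. The following is a minimal Python sketch, not Splunk's own extraction logic. It assumes a shortened copy of the sample event posted in this thread; the helper name `extract_remote_user` and the constants are hypothetical. Note that Python's `re` module writes named groups as `(?P<name>...)`, whereas Splunk's PCRE accepts `(?<name>...)`. The sketch compares a greedy capture with a tighter one that stops before the bracketed timestamp.

```python
import re

# Shortened copy of the sample event posted in this thread.
EVENT = (
    'May 3 11:26:01 linux_1 request-instance SoftCert 10.10.20.30 - '
    'Brew Bar John Doe_123456_UE [03/May/2023:11:25:55.509 -0400] '
    '"GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1" 404 196'
)

IP = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'

# Greedy variant: ".+" runs to the last "[" on the line and keeps the
# space that precedes it.
GREEDY = re.compile(IP + r'\s-\s(?P<remote_user>.+)\[')

# Tighter variant: capture lazily, stopping at the whitespace before
# the first "[".
TIGHT = re.compile(IP + r'\s-\s(?P<remote_user>[^\[]+?)\s\[')


def extract_remote_user(pattern, event):
    """Return the captured remote_user value, or None if nothing matches."""
    match = pattern.search(event)
    return match.group('remote_user') if match else None


greedy_value = extract_remote_user(GREEDY, EVENT)
tight_value = extract_remote_user(TIGHT, EVENT)
print(repr(greedy_value))
print(repr(tight_value))
```

Printing with `repr` makes a trailing space visible, which is easy to miss in a results table but can make an exact-match search on the field fail.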

Kindly upvote, if found helpful.

seemanshu · ‎05-03-2023

Hi @gnshah12345 ,

If the field extraction is based on a user-provided regex, please share it in your reply along with some sample data. That will help in finding the actual cause.

Thanks!

gnshah12345 · ‎05-03-2023

I used a regular expression for the field extraction.

gnshah12345 · ‎05-03-2023

Below is a sample event. The extracted field is highlighted.

May 3 11:26:01 linux_1 request-instance SoftCert 10.10.20.30 - Brew Bar John Doe_123456_UE [03/May/2023:11:25:55.509 -0400] "GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1" 404 196 36580 1 25135 brew.bar.com /rest 749 "OU=123456+CN= Brew Bar John Doe,OU=ny,O=Brew Bar Joint,C=us" cc045c0a-e9a9-11ed-a6e5-0050568916c1 "x509: TLSV12: 30" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"

yuanliu · ‎05-04-2023

The question doesn't seem to be related to dates, unless you can show two different raw events: one for which your regex works as desired and one for which it does not. Additionally, unless you share your regex, there is no way to diagnose the problem.

But ultimately, what is the significance of the string preceding the bracketed date, namely "Brew Bar John Doe_123456_UE"? According to your description, the value you want is "Brew Bar John Doe". If your description is accurate, this is the value of the CN attribute in the embedded LDAP node, except that the embedded message contains a nonstandard delimiter ("+" instead of ",") and some inconvenient spacing. Both can be fixed easily.
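To illustrate the two fixes on the embedded LDAP string, here is a minimal Python sketch. It mirrors the substitutions `s/\+/,/g` and `s/= */=/g` and then splits the result into attributes. The function names `normalize_dn` and `parse_dn` are hypothetical, and splitting on every comma is a simplifying assumption that would break on values containing escaped commas.

```python
import re

# DN string copied from the sample event in this thread.
RAW_DN = "OU=123456+CN= Brew Bar John Doe,OU=ny,O=Brew Bar Joint,C=us"


def normalize_dn(dn):
    """Replace '+' with ',' and drop any spaces that follow '='."""
    dn = dn.replace("+", ",")
    return re.sub(r"= *", "=", dn)


def parse_dn(dn):
    """Split a normalized DN into {attribute: [values]}.

    Attributes such as OU can repeat, so every value is kept in a list.
    """
    fields = {}
    for part in dn.split(","):
        key, _, value = part.partition("=")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


dn_fields = parse_dn(normalize_dn(RAW_DN))
print(dn_fields["CN"][0])
```

With the sample string, OU appears twice ("123456" and "ny"), which is why the sketch keeps a list per attribute rather than a single value.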

Instead of trying to reinvent the regex, I suggest that you use Splunk-supported extractions when applicable. They are more robust. In your case, the log contains a segment in NCSA/Apache access log format. Splunk comes with access-request and access-extractions for such data. For example,

| rex mode=sed "s/\+/,/g s/= */=/g" ``` handle little quirks in data ```
| extract access-request ``` but this is robust ```

This will give you

C	CN	O	OU	file	ink	method	root	uri	uri_domain	uri_path	uri_query	version
us	Brew Bar John Doe	Brew Bar Joint	123456	BROk305031.xml	202305031525554263206	GET	rest	/rest/BROk305031.xml?ink=202305031525554263206		/rest/BROk305031.xml	ink=202305031525554263206	HTTP/1.1

Alternatively, you can use

| rex mode=sed "s/\+/,/g s/= */=/g"
| extract access-extractions

C	CN	O	OU	ink
us	Brew Bar John Doe	Brew Bar Joint	123456	202305031525554263206
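For readers who want to see how the request-related fields in the first result (method, uri, uri_path, uri_query, version, ink) relate to the raw event, here is a rough Python sketch. It is only an illustration built on the standard library, not a description of how Splunk implements access-request; the function name `parse_request` and the regex are assumptions made for this example.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Quoted request portion of the sample event in this thread.
REQUEST = "GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1"

# Method, URI and protocol version, separated by whitespace.
REQUEST_RE = re.compile(r"^(?P<method>\S+)\s+(?P<uri>\S+)\s+(?P<version>\S+)$")


def parse_request(request):
    """Return a dict of request fields, or None if the line does not match."""
    match = REQUEST_RE.match(request)
    if match is None:
        return None
    fields = match.groupdict()
    parts = urlsplit(fields["uri"])
    fields["uri_path"] = parts.path
    fields["uri_query"] = parts.query
    # Each query parameter becomes its own field; only the first value
    # is kept, and existing field names are not overwritten.
    for key, values in parse_qs(parts.query).items():
        fields.setdefault(key, values[0])
    return fields


request_fields = parse_request(REQUEST)
for name in sorted(request_fields):
    print(name, "=", request_fields[name])
```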
