Splunk Search

Why does extracted field search work for certain dates but not some other dates?

gnshah12345
Observer

I created an extracted field called remote_user. For certain dates, my search returns the field value correctly, but for other dates the same search does not. I checked the events, and the extracted field is malformed on the dates with issues. The remote_user value should look like "CompanyName John_doe". On the days when the search works, remote_user shows "CompanyName John_doe". On the dates when it does not work, the field shows only "CompanyName". How can the same extracted field behave differently on different dates? Any suggestions?


seemanshu
Path Finder

Hi @gnshah12345 ,

You can use the following regular expression to extract the "remote_user" field.

\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s-\s(?<remote_user>.+?)\s\[
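A quick way to sanity-check a pattern of this shape outside Splunk is to run it against the sample event posted in this thread. The sketch below is only an approximation: it uses Python's `re` module rather than Splunk's regex engine, the event is shortened after the status fields, and `extract_remote_user` is a hypothetical helper name. The pattern shown requires at least one digit per octet and stops at the space before the bracketed timestamp.

```python
import re

# Sample event from this thread, shortened after the status fields.
EVENT = (
    'May 3 11:26:01 linux_1 request-instance SoftCert 10.10.20.30 - '
    'Brew Bar John Doe_123456_UE [03/May/2023:11:25:55.509 -0400] '
    '"GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1" 404 196'
)

# Python's re module names groups with (?P<name>...); Splunk's rex uses (?<name>...).
PATTERN = re.compile(
    r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s-\s(?P<remote_user>.+?)\s\["
)


def extract_remote_user(event):
    """Return the text between 'IP - ' and the bracketed timestamp, or None."""
    match = PATTERN.search(event)
    return match.group("remote_user") if match else None


print(extract_remote_user(EVENT))
```

Note that a greedy `.+` followed directly by `\[` would also capture the space in front of the bracket, so the extracted value would end with a trailing space.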

 Kindly upvote, if found helpful.

 


seemanshu
Path Finder

Hi @gnshah12345 ,

If the field extraction is based on a user-provided regex, please share it in your reply along with some sample data. That will help in finding the actual cause.

Thanks!

 


gnshah12345
Observer

I used a regular expression for the field extraction.


gnshah12345
Observer

Below is a sample event. The extracted field is highlighted.

May 3 11:26:01 linux_1 request-instance SoftCert 10.10.20.30 - Brew Bar John Doe_123456_UE [03/May/2023:11:25:55.509 -0400] "GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1" 404 196 36580 1 25135 brew.bar.com /rest 749 "OU=123456+CN= Brew Bar John Doe,OU=ny,O=Brew Bar Joint,C=us" cc045c0a-e9a9-11ed-a6e5-0050568916c1 "x509: TLSV12: 30" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"


yuanliu
SplunkTrust

The question doesn't seem to be related to dates, unless you can show two different raw events: one for which your regex works as desired, and one for which it does not. Additionally, unless you share your regex, there is no way to diagnose the problem.

But ultimately, what is the significance of the string preceding the bracketed date, namely "Brew Bar John Doe_123456_UE"? According to your description, the value you want is "Brew Bar John Doe". If your description is accurate, this is the value of the CN attribute in the embedded LDAP node, except that the embedded message uses a nonstandard delimiter ("+" instead of a comma) and some inconvenient spacing. Both can be fixed easily.
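The two clean-ups mentioned above (replacing "+" with "," and dropping the spaces after "=") can be sketched outside Splunk as follows. This is only an illustration in Python on the quoted subject string from the sample event, not what Splunk does internally. The helper names are hypothetical, and keeping the first value for a repeated key such as OU is an assumption.

```python
import re

# Quoted certificate subject copied from the sample event in this thread.
SUBJECT = "OU=123456+CN= Brew Bar John Doe,OU=ny,O=Brew Bar Joint,C=us"


def normalize_subject(subject):
    """Apply the two clean-ups: '+' becomes ',', and spaces after '=' are dropped."""
    subject = re.sub(r"\+", ",", subject)
    return re.sub(r"= *", "=", subject)


def parse_subject(subject):
    """Split a normalized subject into a dict, keeping the first value per key."""
    fields = {}
    for pair in normalize_subject(subject).split(","):
        key, _, value = pair.partition("=")
        fields.setdefault(key, value)  # assumption: first occurrence wins
    return fields


fields = parse_subject(SUBJECT)
print(fields["CN"])
```

With the delimiter and spacing normalized, CN comes out as a clean key=value pair, which is what makes a generic key-value extraction workable on this segment.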

Instead of trying to reinvent a regex, I suggest using Splunk-supported extractions where applicable. They are more robust. In your case, the log contains a segment in NCSA/Apache access log format. Splunk comes with access-request and access-extractions for this format. For example,

 

| rex mode=sed "s/\+/,/g s/= */=/g" ``` handle little quirks in data ```
| extract access-request ``` but this is robust ```

 

This will give you

field       value
C           us
CN          Brew Bar John Doe
O           Brew Bar Joint
OU          123456
file        BROk305031.xml
ink         202305031525554263206
method      GET
root        rest
uri         /rest/BROk305031.xml?ink=202305031525554263206
uri_domain  (empty)
uri_path    /rest/BROk305031.xml
uri_query   ink=202305031525554263206
version     HTTP/1.1

Alternatively, you can use

 

| rex mode=sed "s/\+/,/g s/= */=/g"
| extract access-extractions

 

field       value
C           us
CN          Brew Bar John Doe
O           Brew Bar Joint
OU          123456
ink         202305031525554263206
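To see where request-related fields such as method, uri_path, uri_query, file, root and ink come from, here is a rough Python approximation that splits the request line of the sample event. It is not Splunk's actual access-request logic, only a sketch built on the standard library. `parse_request` is a hypothetical helper, and the way file and root are derived from the path is an assumption chosen to match the values listed above.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

# Request line copied from the sample event in this thread.
REQUEST = "GET /rest/BROk305031.xml?ink=202305031525554263206 HTTP/1.1"


def parse_request(request_line):
    """Split 'METHOD URI VERSION' into fields similar to those listed above."""
    method, uri, version = request_line.split(" ")
    parts = urlsplit(uri)
    fields = {
        "method": method,
        "uri": uri,
        "uri_path": parts.path,
        "uri_query": parts.query,
        "version": version,
        # Assumption: file is the last path segment, root is the first one.
        "file": posixpath.basename(parts.path),
        "root": parts.path.strip("/").split("/")[0],
    }
    # Each query-string parameter becomes its own field (first value only).
    for key, values in parse_qs(parts.query).items():
        fields.setdefault(key, values[0])
    return fields


for name, value in parse_request(REQUEST).items():
    print(name, value)
```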
 