Splunk Search

Search time field extractions with backslash not showing up

sonomauser
Explorer

Hello Splunk Wizards,

I know there are plenty of people who've had similar issues, but I haven't been able to use their resolution for my issue. 

I'm doing a search-time field extraction to capture the login username, which includes a backslash. I believe the regex is correct: (?P<User_Name>(domain\\\\\\S+)), slightly modified for Splunk from what I built in regex101. In the field extraction wizard, it perfectly grabs all sample data (e.g. domain\username).

 

(?P<User_Name>(domain\\\\\\S+))

 

However, this field doesn't show up in search when looking at the exact same sample data. I've run the search in verbose mode and made sure all available fields are showing; it's not there. I've tried using group names I know Splunk isn't already using, with no improvement.

I'm pretty sure it has to do with the backslash, because if I modify the regex to (?P<User_Name>domain\S+), the field extraction shows up in search, but it also contains data that isn't exactly correct.

 

(?P<User_Name>domain\S+)

 

I've tried variations with more and fewer backslashes; none seem to work.

I guess I can live with a sloppy field extraction if that's all I can do, but the first regex really is perfect. 

Any ideas?

1 Solution

PickleRick
Champion

I don't know why you have so many spaces.

When configuring the extraction in props/transforms, you only need to escape the literal backslash character, so you'd need

 

(?<username>domain\\\S+)

 

If you want to extract this field with the rex command, you'll need to escape every backslash once more, so you do indeed end up with 6 backslashes.

Which method are you using?

Oh, and depending on circumstances I'd probably rather go for something like

(?<userdomain>\S+)\\(?<username>\S+)
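The extra escaping layer described above can be illustrated with a minimal Python sketch. It is only an analogy: a non-raw Python string literal stands in for the quoted regex argument of rex, and Python's re module stands in for Splunk's PCRE engine, so Splunk's exact parsing rules may differ. Python only accepts the (?P<name>...) group syntax, so that form is used here. The event string and variable names are hypothetical.

```python
import re

# The regex as the engine should see it (raw string, no extra layer):
# "\\" matches one literal backslash, "\S+" matches the user name.
engine_pattern = r"(?P<username>domain\\\S+)"

# The same regex written inside an ordinary quoted string: the string parser
# consumes one level of escaping, so each of the 3 backslashes is doubled to 6.
quoted_pattern = "(?P<username>domain\\\\\\S+)"

# Both spellings produce the identical pattern text.
same_pattern = engine_pattern == quoted_pattern

# Hypothetical event containing a domain\username value.
event = r"user=domain\username action=login"
match = re.search(quoted_pattern, event)
extracted = match.group("username")
print(same_pattern, extracted)
```

The point of the sketch is that the number of backslashes depends on how many parsers read the text before the regex engine does, not on the regex itself.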


sonomauser
Explorer

Thank you. I was using rex. 

Your suggested regex totally worked and now it's in the Interesting fields. Thank you very much.

I wish I knew enough about regex to tell a decent extraction from a less-than-decent one.

(?<userdomain>\S+)\\(?<username>\S+)

 I will mark this as the solution. 


PickleRick
Champion

It all comes with time 🙂 I've been using regexes for over 20 years now...

But until I started using Splunk, I never really had much use for PCREs; I just used basic and extended regexes.

So don't worry, you'll get the hang of it.


richgalloway
SplunkTrust

It would help immensely to see some data samples from which you are trying to extract the field.

Are you extracting at index-time or search-time?

What exactly does "contains data that isn't exactly correct" mean?


sonomauser
Explorer

Thank you for your assistance, PickleRick's suggested Regex was able to give me the results I was looking for. Poor regex usage on my part. 

However, to answer your questions, this is a Search Time extraction. 

Some sample data would be:

2021-10-20 00:00:11 POST /EWS/Exchange.asmx &CorrelationID=<empty>;&cafeReqId=a0d15c2a-4c65-bd1837e8755d; domain\username 192.168.111.222 Microsoft+Office/16.0+(Windows+NT+10.0;+Microsoft+Outlook+16.0.5149;+Pro) 200

and

2021-10-20 00:00:05 POST /Microsoft-Server-ActiveSync/default.eas Cmd=Ping&User=domain%5username&DeviceId=bc6fbaac6f14f7811cea8d7&DeviceType=Outlook&CorrelationID=<empty>;&cafeReqId=4a35be43-351a-662266fc3188; domain\username 111.222.333.444 Outlook-iOS-Android/1.0 - 200

What I meant by "contains data that isn't exactly correct" is that when using (?P<Test>domain\S+), the extraction would grab "domain\username" from the first example event, but would also grab "domain%5username&DeviceId=bc6fbaac6f14f7811cea8d7&DeviceType=Outlook&CorrelationID=<empty>;&cafeReqId=4a35be43-351a-662266fc3188" from the second example. Almost right, but way too much.
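For readers who want to reproduce the difference outside Splunk, here is a rough Python sketch that runs both patterns against the two sample events above. It assumes Python's re module behaves like Splunk's regex engine for these simple patterns, which is not guaranteed, and it uses the (?P<name>...) group syntax that Python requires. The variable names are invented for the example.

```python
import re

# The two sample events from this post, as raw strings so the backslash is literal.
events = [
    r"2021-10-20 00:00:11 POST /EWS/Exchange.asmx &CorrelationID=<empty>;&cafeReqId=a0d15c2a-4c65-bd1837e8755d; domain\username 192.168.111.222 Microsoft+Office/16.0+(Windows+NT+10.0;+Microsoft+Outlook+16.0.5149;+Pro) 200",
    r"2021-10-20 00:00:05 POST /Microsoft-Server-ActiveSync/default.eas Cmd=Ping&User=domain%5username&DeviceId=bc6fbaac6f14f7811cea8d7&DeviceType=Outlook&CorrelationID=<empty>;&cafeReqId=4a35be43-351a-662266fc3188; domain\username 111.222.333.444 Outlook-iOS-Android/1.0 - 200",
]

# Loose pattern: "domain" followed by any run of non-whitespace characters.
loose = re.compile(r"(?P<Test>domain\S+)")

# Pattern that requires a literal backslash between two non-whitespace runs.
strict = re.compile(r"(?P<userdomain>\S+)\\(?P<username>\S+)")

# First match of each pattern in each event.
loose_hits = [loose.search(e).group("Test") for e in events]
strict_hits = [strict.search(e).groupdict() for e in events]

for loose_hit, strict_hit in zip(loose_hits, strict_hits):
    print(loose_hit, "|", strict_hit)
```

In the second event the loose pattern latches onto the first "domain" it finds, which sits inside the query string, while the backslash requirement forces the other pattern onto the actual login field.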

Clearly a lack of regex understanding on my part. 

Thank you. 


richgalloway
SplunkTrust

If your problem is resolved, then please click an "Accept as Solution" button to help future readers.
