Regex to extract fields with different format

nuaraujo
Path Finder

Hello all,

I need your help building a regex that extracts fields from some messages.

Example 1
USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added.

What I want to extract from this message:
username: user1
operation: UPDATED CUSTOMER (always two words in uppercase)
customer_id: 123456 (always preceded by "-" and ending with ".") (not available in all messages)
comment: Added new user. New user was added

Example 2
USER: user2 ADDED COUNTRY with identifier: Germany

What I want to extract from this message:
username: user2
operation: ADDED COUNTRY (always two words in uppercase)
comment: with identifier: Germany (in this message I do not have customer_id field)

I am using the following regex, which is far from accurate. It works for the first example but not for the second:

 ... | rex field=message    "^USER: (?P<username>.+?) (?P<operation>[A-Z].+?) - (?P<id>.+?)\. (?P<message>.*)"
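
To see why the second message is missed, here is a minimal sketch that runs the same pattern outside Splunk with Python's `re` module. Python accepts the same `(?P<name>...)` syntax, but Splunk uses PCRE, so treat this only as a quick offline check. The names `ORIGINAL`, `EXAMPLE_1`, `EXAMPLE_2` and `try_original` are made up for the illustration.

```python
import re

# The pattern from the question, unchanged (extra spaces after field=message dropped).
ORIGINAL = re.compile(
    r"^USER: (?P<username>.+?) (?P<operation>[A-Z].+?) - (?P<id>.+?)\. (?P<message>.*)"
)

EXAMPLE_1 = "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added."
EXAMPLE_2 = "USER: user2 ADDED COUNTRY with identifier: Germany"


def try_original(message):
    """Return the named groups as a dict, or None when the pattern does not match."""
    match = ORIGINAL.match(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for text in (EXAMPLE_1, EXAMPLE_2):
        print(try_original(text))
```

The pattern requires a literal " - " followed later by ". ", so a message without a customer ID cannot match at all. That part needs to be optional.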

What I am looking for as a final result:
| username | operation        | id     | message                            |
| user1    | UPDATED CUSTOMER | 123456 | Added new user. New user was added |
| user2    | ADDED COUNTRY    |        | with identifier: Germany           |

Can someone help me build a general regex, please?

javiergn
SplunkTrust

Hi @nuaraujo,

Try the following regex instead. I've tested it in my lab with your two examples and it seems to work fine. Note that I am assuming your customer ID is a number, so you might need to tweak the pattern if that's not the case.

^USER: (?P<username>\S+)\s+(?P<operation>[A-Z]+ [A-Z]+)(\s+\-\s+(?P<customerid>\d+)\.)?\s+(?P<message>.*)

Thanks,
J
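
For anyone who wants to check the regex above without a Splunk instance, here is a minimal sketch using Python's `re` module. It assumes Python's engine behaves like Splunk's PCRE for this particular pattern, which should hold for the constructs used here. `ACCEPTED` and `extract` are hypothetical names for the illustration.

```python
import re

# Same regex as in the reply above, split over two lines for readability.
ACCEPTED = re.compile(
    r"^USER: (?P<username>\S+)\s+(?P<operation>[A-Z]+ [A-Z]+)"
    r"(\s+\-\s+(?P<customerid>\d+)\.)?\s+(?P<message>.*)"
)


def extract(message):
    """Return the named groups as a dict, or None when the message does not match."""
    match = ACCEPTED.match(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    samples = [
        "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added.",
        "USER: user2 ADDED COUNTRY with identifier: Germany",
    ]
    for sample in samples:
        print(extract(sample))
```

The optional group `(...)?` around the " - 123456." part is what lets the second message match. In that case `customerid` simply comes back empty (`None` in Python, a missing field in Splunk).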

nuaraujo
Path Finder

Thanks 🙂

lloydknight
Builder

Hello @nuaraujo

Try something like this. The customer ID part is wrapped in an optional group so that messages without an ID still match:

... | rex field=message "USER:\s(?<username>.+?)\s(?<operation>\w+\s\w+)(?:\s-\s(?<id>\d+)\.)?\s(?<message>.*)"

Hope it helps!

p_gurav
Champion

Can you try:

| rex field=message    "^USER: (?P<username>.+?) (?P<operation>\w+) (?P<comment>.*)" | rex field=comment "CUSTOMER - (?P<id>[^\.]+)"

nuaraujo
Path Finder

Thanks, @p_gurav.

Your suggestion is already a good solution. However, could you help me capture two words for "operation"? With your suggestion, I am only getting one. Even so, a BIG THANK YOU for your quick reply.

gcusello
Esteemed Legend

Hi nuaraujo,
try this:

| rex field=message    "^USER: (?P<username>.+?) (?P<operation>[A-Z]+ [A-Z]+) (?<comment>.*)"
| rex field=comment "- (?<id>\d+)\. (?<message>.*)"

This way you get all the fields.

Bye.
Giuseppe
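
The two-step approach above can be sketched offline in Python as follows. This is only an approximation of what two chained `rex` commands do, assuming a non-matching second `rex` simply leaves its fields unset. `STEP_1`, `STEP_2` and `extract_two_step` are hypothetical names, and the `(?<name>...)` groups are written as `(?P<name>...)`, the form Python's `re` accepts on all versions.

```python
import re

# First pass: username, two-word uppercase operation, and everything else as "comment".
STEP_1 = re.compile(
    r"^USER: (?P<username>.+?) (?P<operation>[A-Z]+ [A-Z]+) (?P<comment>.*)"
)
# Second pass, applied to "comment" only: optional "- <id>. <message>" part.
STEP_2 = re.compile(r"- (?P<id>\d+)\. (?P<message>.*)")


def extract_two_step(raw):
    """Mimic two chained rex commands; fields from a non-matching step stay absent."""
    fields = {}
    first = STEP_1.search(raw)
    if first:
        fields.update(first.groupdict())
        second = STEP_2.search(fields["comment"])
        if second:
            fields.update(second.groupdict())
    return fields


if __name__ == "__main__":
    print(extract_two_step(
        "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added."
    ))
    print(extract_two_step("USER: user2 ADDED COUNTRY with identifier: Germany"))
```

One thing to keep in mind with this variant: for messages without a customer ID, the second step does not match, so the free text stays in `comment` rather than being copied into a new `message` value.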
