Solved: How to build a regular expression that will split ...

mstark31 · ‎11-29-2016

I need to use regex to split a field into two parts, delimited by an underscore.

The vast majority of the time, my field (a date/time ID) looks like this, where AB or ABC is a 2 or 3 character identifier.

11232016-0056_ABC 
11232016-0056_AB

I use the following rex command to extract, and it works great.

| rex field=originalField "(?<subField1>.*)\_(?<subField2>.*)"

For example:

originalField = 11232016-0056_ABC
subField1 = 11232016-0056
subField2 = ABC

However, I have a few special cases where originalField = 11232016-0056_ABC_M, where M could be anything alphanumeric following an additional underscore.

When I use the above rex command, I get the following result:

originalField = 11232016-0056_ABC_M
subField1 = 11232016-0056_ABC
subField2 = M

I want to see the following:

originalField = 11232016-0056_ABC_M
subField1 = 11232016-0056 
subField2 =  ABC_M

Basically, I need it to split at the first underscore and ignore all subsequent underscores.

sundareshr · ‎11-29-2016

Try this

.... | rex field=originalField "(?<subField1>[^_]+)_(?<subField2>.+)"

View solution in original post

gdziuba · ‎11-29-2016

This should get you going.

.... | rex field=originalField "(?<subField1>[^_]+)_(?<subField2>.*)"

Use this if you want to keep the underscore at the end of the line in the case that the character is other than an underscore.

 .... | rex field=originalField "(?<subField1>.*?_)(?<subField2>.*)"

sshelly_splunk · ‎11-29-2016

(?P<field1>\S+)_(?P<field2>\w+)

sshelly_splunk · ‎11-29-2016

sorry -too fast on the draw. I didnt see the additional info around possible 2nd "_"'s occurring.
gdziuba's answer works perfectly (or so I think:))

mstark31 · ‎11-29-2016

This still splits on the 2nd underscore.

sundareshr · ‎11-29-2016

Try this

.... | rex field=originalField "(?<subField1>[^_]+)_(?<subField2>.+)"

mstark31 · ‎11-29-2016

This works! Thanks!

mstark31 · ‎12-13-2019

Hello Past mstark31. Current mstark31 thanks you for asking this question 3 years ago.

mstark31 · ‎11-29-2016

| rex field=specimenId "(?<subField1>[^_]+)_(?<subField2>.*)"

Changed + to * to account for cases where _ABC may not exist.

How to build a regular expression that will split a field on the first underscore?

Building Reliable Asset and Identity Frameworks in Splunk ES

Cloud Monitoring Console - Unlocking Greater Visibility in SVC Usage Reporting

Automatic Discovery Part 3: Practical Use Cases

Are you a member of the Splunk Community?

How to build a regular expression that will split a field on the first underscore?

Building Reliable Asset and Identity Frameworks in Splunk ES

Cloud Monitoring Console - Unlocking Greater Visibility in SVC Usage Reporting

Automatic Discovery Part 3: Practical Use Cases