Splunk Search

Regular Expression extract beginning and end of string: What am I missing?

rhenry
Explorer

Hello,

I have a field containing identifiers of the form ABC-1234-56-7890, and I want to pull only the first three letters and the last four digits into a single field. Here is my query so far; I have not figured out how to get from it to the result described above:

| rex field=comment "(?<ABC>ABC\-\d+\-\d+\-\d+)"

The result I want is "ABC-7890".

What am I missing to pull both the beginning and the end of the string described above? Thanks!


yuanliu
SplunkTrust
  • I can't help but notice that your initial regex contains the hard-coded leading string "ABC", which implies that the first group of letters is fixed. If that is the case, you can capture only the end of the string and then prepend the known prefix, like this:

| rex field=comment "\bABC-\S+-(?<ABC>\d+)"
| eval ABC="ABC-" . ABC

  • Another way is to use sed mode to strip whatever you don't need. This example assumes that the leading string is unknown.

| rex field=comment mode=sed "s/.*?(\w+)\S+-(\d+).*/\1-\2/"

(sed mode overwrites comment. If you need to keep its original content, first copy it into a different field such as ABC, then apply rex to that field.)

  • Alternatively, you can apply sed or replace to the ABC field you initially extracted. This example uses replace.

| rex field=comment "(?<ABC>ABC\-\d+\-\d+\-\d+)"
| eval ABC=replace(ABC, "ABC-\d+-\d+-", "ABC-")
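
For anyone who wants to check the three patterns above outside Splunk, here is a minimal Python sketch. It is only an illustration, not part of the SPL: Python's re module is not the engine Splunk uses, and it spells named groups (?P<name>...) instead of (?<name>...). The sample comment string, the group name tail, and the variable names are assumptions made up for the example.

```python
import re

# Hypothetical sample value for the comment field.
comment = "This the location for ABC-1234-56-7890 at this point."

# Approach 1: capture only the trailing digits, then prepend the known prefix.
m = re.search(r"\bABC-\S+-(?P<tail>\d+)", comment)
approach_1 = "ABC-" + m.group("tail")

# Approach 2: sed-style substitution that replaces the whole string with
# the first word run of the matching token plus its last digit run.
approach_2 = re.sub(r".*?(\w+)\S+-(\d+).*", r"\1-\2", comment)

# Approach 3: extract the full token first, then drop the middle groups.
token = re.search(r"ABC-\d+-\d+-\d+", comment).group(0)
approach_3 = re.sub(r"ABC-\d+-\d+-", "ABC-", token)

print(approach_1)  # ABC-7890
print(approach_2)  # ABC-7890
print(approach_3)  # ABC-7890
```

All three give the same result on this sample. Only the second one works without knowing the prefix in advance, and it is also the only one that overwrites the whole input string.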

 


PickleRick
SplunkTrust

Unfortunately, PCRE has no "ignore this part" group. (I would welcome one, too.)

You can, however, capture the beginning and end into separate fields and then create a calculated field that combines them.


gcusello
SplunkTrust

Hi @rhenry,

you could use a regex and an eval:

your_search
| rex "^(?<my_field_1>\w\w\w).*(?<my_field_2>\d\d\d\d)"
| eval my_field=my_field_1."-".my_field_2

you can test the regex at https://regex101.com/r/S7tXqS/1

Ciao.

Giuseppe


rhenry
Explorer

Hey, this does what I am looking for. However, it looks like it only works if ABC-1234-56-7890 is the only string in the field. What if there are additional words before and after it? For example:

"This the location for ABC-1234-56-7890 at this point."

Is there a way to extract just the ABC-1234-56-7890 string from that sentence and, again, keep only its beginning and end? Thanks!


gcusello
SplunkTrust

Hi, please try this:

your_search
| rex "(?<my_field_1>\w\w\w)\S*(?<my_field_2>\d\d\d\d)"
| eval my_field=my_field_1."-".my_field_2

that you can test at https://regex101.com/r/S7tXqS/2
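
As a quick check of the two-capture idea outside Splunk, here is a minimal Python sketch. It reuses the pattern from the rex above with Python's (?P<name>...) group syntax. The function name short_id and the test strings are assumptions made for illustration, and Python's re module is not Splunk's regex engine, so treat it as a sanity check only.

```python
import re

# Same pattern as the rex above, written with Python's named-group syntax.
PATTERN = re.compile(r"(?P<my_field_1>\w\w\w)\S*(?P<my_field_2>\d\d\d\d)")

def short_id(text):
    """Join the two captures of the first matching token, or return None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return m.group("my_field_1") + "-" + m.group("my_field_2")

print(short_id("ABC-1234-56-7890"))                                       # ABC-7890
print(short_id("This the location for ABC-1234-56-7890 at this point."))  # ABC-7890
print(short_id("no identifier here"))                                     # None
```

Note that \w also matches digits and underscores, so this loose pattern would also match a bare seven-digit number such as 1234567. If the identifiers always look like three letters followed by hyphen-separated digit groups, a stricter pattern may be safer.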

Ciao.

Giuseppe
