Splunk Search

Help with regular expression extract and match

mehrdad_2000
Path Finder

I have a log file like this:

11:00:00 jon nginx: A[1234]B[56789] [0.1222]

11:00:00 dan service cloud: C[0078]D[12] F[2]

11:00:00 dan mongo_DB: D[0078]C[12]A[2]

1) Match “nginx:” and “service cloud:” but extract only “nginx” and “service cloud”, without the “:”.

2) Match the whole token, such as A[1234], but extract only the number between the brackets, such as “1234”. (The numbers between brackets vary in length, e.g. N[234] or K[343443], and some contain a decimal separator, like [0.1222].)

Any recommendations?

0 Karma
1 Solution

gcusello
SplunkTrust

Hi @mehrdad_2000,
if I understand correctly, you want to extract the numbers between the brackets, is that right?
If so, try this regex:

\s+\w+\[(?<first_num>\d+)\]\w+\[(?<second_num>\d+)\][^\[]*\[(?<third_num>[^\]]*)\]

that you can test at https://regex101.com/r/vVIUkL/1

Ciao.
Giuseppe
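
To check the pattern outside Splunk, here is a minimal Python sketch run against the three sample lines from the question. Two assumptions to note: Python writes named groups as `(?P<name>...)` rather than the `(?<name>...)` form used above, and the helper name `extract_three` is purely illustrative.

```python
import re

# Same pattern as above, rewritten with Python's (?P<name>...) group syntax.
PATTERN = re.compile(
    r"\s+\w+\[(?P<first_num>\d+)\]"
    r"\w+\[(?P<second_num>\d+)\]"
    r"[^\[]*\[(?P<third_num>[^\]]*)\]"
)

SAMPLES = [
    "11:00:00 jon nginx: A[1234]B[56789] [0.1222]",
    "11:00:00 dan service cloud: C[0078]D[12] F[2]",
    "11:00:00 dan mongo_DB: D[0078]C[12]A[2]",
]


def extract_three(line):
    """Return the three bracketed values as a dict, or None if nothing matches."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


for line in SAMPLES:
    print(extract_three(line))
```

The groups here are positional: `first_num` is whatever sits in the first pair of brackets, regardless of the letter in front of it.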


0 Karma

woodcock
Esteemed Legend

You cannot do both at the same time, but you can do one with each, like this:

... | rex "^\S+\s+\S+\s+(?<service>[^:]+)"
| rex max_match=0 "\[(?<numbers>\d+)\]"
| eval numbers=mvsort(numbers)

And then, depending on what you mean, something like this:

| nomv numbers

Or:

| eval numbers_range = mvindex(numbers, 0) . " - " . mvindex(numbers, -1)
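
For anyone who wants to try the two patterns outside Splunk, a rough Python sketch follows. It is not SPL: `findall` stands in for `max_match=0`, named groups use Python's `(?P<name>...)` syntax, and the `parse` helper and the `DECIMALS` variant (which also accepts values like 0.1222) are assumptions added for illustration.

```python
import re

# Service name: skip the time and user columns, then read up to the colon.
SERVICE = re.compile(r"^\S+\s+\S+\s+(?P<service>[^:]+)")
# Every bracketed integer on the line.
NUMBERS = re.compile(r"\[(\d+)\]")
# Hypothetical variant that also accepts a decimal part such as [0.1222].
DECIMALS = re.compile(r"\[(\d+(?:\.\d+)?)\]")


def parse(line, number_re=NUMBERS):
    """Return (service, list of bracketed numbers) for one log line."""
    service = SERVICE.search(line)
    name = service.group("service") if service else None
    return name, number_re.findall(line)


line = "11:00:00 jon nginx: A[1234]B[56789] [0.1222]"
print(parse(line))            # ('nginx', ['1234', '56789'])
print(parse(line, DECIMALS))  # ('nginx', ['1234', '56789', '0.1222'])
```

Note that the integer-only pattern skips `[0.1222]`, because `\d+` cannot cross the decimal point.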
0 Karma

mehrdad_2000
Path Finder

Thank you for the answer.
But I want all the “A” and “B” ... values grouped in their own columns. The position of A differs from line to line.
This log is unstructured and messy.
I need to extract them wherever they appear in each line and group them.
E.g. first_num holds all A values,
second_num holds all B values, ...

Also, some of them are separated by spaces and others are not. This is random.
E.g. A[324] B[5455]C[55]
D[324] B[5455] A[55]

first_num | second_num
324 | 5455
55 | 5455

0 Karma

gcusello
SplunkTrust

Hi @mehrdad_2000,
if you want to extract the number after A and the number after B regardless of where they appear in each event, you could use two separate regexes, something like this:

your_search
| rex "A\[(?<A_field>\d+)"
| rex "B\[(?<B_field>\d+)"
| table A_field B_field

That you can test at https://regex101.com/r/vVIUkL/2

Ciao.
Giuseppe
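
The same idea can be sketched in Python on the two example lines from the previous post. This is only an illustration under a few assumptions: named groups use Python's `(?P<name>...)` syntax, and `extract_ab` is a hypothetical helper that mimics running the two extractions one after the other.

```python
import re

# One independent pattern per letter, so the order on the line does not matter.
A_RE = re.compile(r"A\[(?P<A_field>\d+)")
B_RE = re.compile(r"B\[(?P<B_field>\d+)")


def extract_ab(line):
    """Run both patterns separately and return one row of the result table."""
    a = A_RE.search(line)
    b = B_RE.search(line)
    return {
        "A_field": a.group("A_field") if a else None,
        "B_field": b.group("B_field") if b else None,
    }


for line in ["A[324] B[5455]C[55]", "D[324] B[5455] A[55]"]:
    print(extract_ab(line))
# {'A_field': '324', 'B_field': '5455'}
# {'A_field': '55', 'B_field': '5455'}
```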

0 Karma

mehrdad_2000
Path Finder

This is exactly what I want, thank you so much.
In field extraction each one works perfectly on its own, but when I write both of them like this:

A\[(?\d+) | B\[(?\d+)

separated by a pipe, it matches all of A but only some of B!

Do you have any idea about this?

0 Karma

woodcock
Esteemed Legend

Of course it doesn't. They are TWO SEPARATE extractions.

0 Karma

mehrdad_2000
Path Finder

Exactly, I thought a single field extraction could extract multiple fields.

0 Karma

woodcock
Esteemed Legend

You need to structure the | correctly. See this example:

| makeresults 
| eval raw="foo=bar, bat=baz"
| makemv raw
| mvexpand raw
| rename raw AS _raw
| rex "(?:foo=(?<foo>\S+))|(?:bat=(?<bat>\S+))"
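
One possible reason an alternation seems to miss fields is that a single match only ever satisfies one branch, and rex keeps just the first match unless max_match is raised. The Python sketch below illustrates that behaviour on a line from this thread; it is not SPL, it uses Python's `(?P<name>...)` group syntax, and the `collect` helper is a hypothetical way of merging several matches.

```python
import re

# Two alternatives in one pattern; each match fills only one of the groups.
COMBINED = re.compile(r"(?:A\[(?P<A_field>\d+))|(?:B\[(?P<B_field>\d+))")

line = "D[324] B[5455] A[55]"

# A single search stops at the first hit, which here is B[5455].
first = COMBINED.search(line).groupdict()
print(first)  # {'A_field': None, 'B_field': '5455'}


def collect(text):
    """Walk every match and keep the first value seen for each group."""
    fields = {}
    for match in COMBINED.finditer(text):
        for name, value in match.groupdict().items():
            if value is not None:
                fields.setdefault(name, value)
    return fields


print(collect(line))  # {'B_field': '5455', 'A_field': '55'}
```

In the makeresults example above, each event contains only one of the two pairs, so one match per event is enough.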
0 Karma

gcusello
SplunkTrust

Hi @mehrdad_2000,
Sorry, but I cannot read your regex. Please use the Code Sample button (1010101), otherwise I cannot help you.

Ciao.
Giuseppe

0 Karma

mehrdad_2000
Path Finder

As @woodcock said, they are TWO SEPARATE extractions.

0 Karma