Solved: How to extract the last part of all the combined URLs

kiru2992 · ‎08-25-2020

Hello Everyone!

I have a field (FieldA) which contains multiple URLs joined together. I would like to have a new field (FieldB) with the list of the last parts of all the URLs.

FieldA: https://...../...../..../123994https://.../....../....../....../123441 https://.../....../....../....../133456

FieldB: 123994 123441 133456

Currently I am using the query below for extraction, but I am only getting '133456', not the list of all the values.

Query:

| rex field=FieldA "\/(?<FieldB>\w+)$"

Can you please help me with an expression for the desired output, or is there a better way of doing the same?

ITWhisperer · ‎08-26-2020

Try this:

| rex max_match=0 field=FieldA "\/(?<FieldB>[^\/]*)(https:|\n|$)"
| mvcombine delim=" " FieldB


to4kawa · ‎08-26-2020

The field extraction of the log is wrong to begin with.
It's faster to extract it from the raw logs.

richgalloway · ‎08-25-2020

Try this

| rex max_match=0 field=FieldA "\/(?<FieldB>\w+)$"
| mvexpand FieldB

---
If this reply helps you, Karma would be appreciated.

kiru2992 · ‎08-25-2020

Hello @richgalloway

I am sorry, but I am still getting only the last number.

richgalloway · ‎08-25-2020

That's probably because of the $ anchor in the regex. Try removing it. You may then find the regex matches other parts of the URLs since '\w+' will match a lot of text. If so, it will be necessary to modify the regex to match only the ends of the URLs.
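To see the effect described above outside Splunk, here is a minimal Python sketch. The sample value and its example.com hosts are made up, and Python's re module stands in for the regex engine behind rex, so treat it as an illustration rather than a Splunk result.

```python
import re

# Hypothetical sample value: two URLs run together, the third follows a newline.
field_a = (
    "https://example.com/a/b/123994"
    "https://example.com/c/d/123441\n"
    "https://example.com/e/f/133456"
)

# With the $ anchor, only the segment at the very end of the string can match.
anchored = re.findall(r"/(\w+)$", field_a)
print(anchored)  # ['133456']

# Without the anchor, every run of word characters after a slash matches.
unanchored = re.findall(r"/(\w+)", field_a)
print(len(unanchored))  # 12
print(unanchored)
```

With the anchor, only the final segment is found. Without it, every path segment after a slash is captured, including '123994https' where two URLs run together, which is why the pattern has to be narrowed to the ends of the URLs.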


kiru2992 · ‎08-25-2020

Hello @richgalloway ,

As you mentioned, after removing '$' the '\w+' picks up other parts of the URLs. Every part of each URL is individually mapped to FieldB, resulting in duplicate entries.

Can you please let me know how to get only the ends of the URLs in a single row?

ITWhisperer · ‎08-26-2020

How are the URLs delimited in FieldA? It looks like in some instances there is a space, but not in others. Try:

| rex max_match=0 field=FieldA "\/(?<FieldB>[^\/]*)( |$)"
| mvexpand FieldB

If there is no space between the URLs in some instances, use:

| rex max_match=0 field=FieldA "\/(?<FieldB>[^\/]*)(https:| |$)"
| mvexpand FieldB

kiru2992 · ‎08-26-2020

Hello @ITWhisperer ,

The first snippet gives only the last part of the last URL.

The second snippet gives only the last part of the first URL.

Can you please let me know how to get the list of last parts of all the URLs?

kiru2992 · ‎08-26-2020

Hello @ITWhisperer ,

I forgot to mention, the URLs are separated by a '\n', not a space.

ITWhisperer · ‎08-26-2020

Try this:

| rex max_match=0 field=FieldA "\/(?<FieldB>[^\/]*)(https:|\n|$)"
| mvcombine delim=" " FieldB
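A rough way to check how this pattern behaves is to run it in Python. This is a sketch under assumptions: the sample value is invented, Python needs the (?P<FieldB>...) spelling for the named group, and the join at the end only stands in for combining the values into one space-separated string. It was not run in Splunk.

```python
import re

# Hypothetical sample value: two URLs run together, the third follows a newline.
field_a = (
    "https://example.com/a/b/123994"
    "https://example.com/c/d/123441\n"
    "https://example.com/e/f/133456"
)

# Same pattern as the rex above, with Python's (?P<name>...) group syntax.
pattern = re.compile(r"/(?P<FieldB>[^/]*)(https:|\n|$)")
values = [m.group("FieldB") for m in pattern.finditer(field_a)]

# [^/]* can also consume a newline, so in this run the value that sits before
# "\nhttps:" keeps a trailing newline; strip() removes it.
cleaned = [v.strip() for v in values]
print(" ".join(cleaned))  # 123994 123441 133456

# Hypothetical tighter variant (not from the thread): exclude whitespace from
# the capture and use a lookahead so the delimiter is not consumed.
tight = re.findall(r"/([^/\s]+?)(?=https:|\s|$)", field_a)
print(tight)  # ['123994', '123441', '133456']
```

The pattern works because the capture stops at a slash, so only a segment that is directly followed by "https:", a newline, or the end of the string can match. Earlier path segments are always followed by another slash and are skipped.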

kiru2992 · ‎08-26-2020

Hello @ITWhisperer ,

Thank you!! It worked like a charm. :)

kiru2992 · ‎09-22-2020

Hello @ITWhisperer ,

Now I would like to have a separate row for each of the extracted values, but I am not able to split the extracted 'FieldB'. Can you please help me with this?

ITWhisperer · ‎09-23-2020

If you want FieldB separated into events, use mvexpand instead of mvcombine:

| rex max_match=0 field=FieldA "\/(?<FieldB>[^\/]*)(https:|\n|$)"
| mvexpand FieldB
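As a loose analogy for the difference between one row per value and one combined value, the following Python sketch treats an event as a dictionary whose FieldB holds a list. The dictionary layout, the host value, and the helper names are hypothetical; they only mirror the idea and are not Splunk APIs.

```python
def expand(event, field):
    """Return one copy of the event per value in the multivalue field."""
    return [{**event, field: value} for value in event[field]]


def join_values(event, field, delim=" "):
    """Return a copy of the event with the values joined into one string."""
    return {**event, field: delim.join(event[field])}


# Hypothetical event with a multivalue FieldB, as left by the rex step.
event = {"host": "web01", "FieldB": ["123994", "123441", "133456"]}

# One row per extracted value; the other fields are repeated on each row.
for row in expand(event, "FieldB"):
    print(row)

# A single row with all values in one space-separated string.
print(join_values(event, "FieldB"))
```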

kiru2992 · ‎09-28-2020

Hello @ITWhisperer ,

Thank you. It worked. :)

richgalloway · ‎08-26-2020

If your problem is resolved, then please click the "Accept as Solution" button to help future readers.

