Getting Data In

regexp in Splunk

laurentiugrama
Explorer

I tried to find a solution in order to parse some URL to obtain the base but it seems that I cannot succeed.

For the between GET/POST and HTTP I want to return the baseurl as in the examples below

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3

 

I added this examples to regex101.com to be easier to find a solution.

https://regex101.com/r/oLXtw8/1/

 

Labels (1)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

(?<method>\S+)\s(?<path>/[^?]*)(\?(?<query>\S*))?

Typing on my phone, didn't verify it 😉

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @laurentiugrama,

sorry, what's the question?

the regex seems to be correct, even if I'd use an easier regex:

[GET|POST]\s(?<URL>.+)\s+(HTTP)

Ciao.

Giuseppe

laurentiugrama
Explorer

Thank you for the response but the expression returns URL not  baseurl

As I said, I trayed to obtain the baseurl from an url

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @laurentiugrama,

sorry but I don't understand: what do you mean with baseurl?

the name of the field or a part of the URL?

If you want the name of the field you can modify the regex:

| rex "[GET|POST]\s(?<baseurl>.+)\s+(HTTP)"

if instead you want a part of the URL, e.g. the first two sections, you could use something like this:

| rex "[GET|POST]\s(?<baseurl>\/\w+\/\w+\/)\s+(HTTP)"

Ciao.

Giuseppe

laurentiugrama
Explorer

In the first post I explained what is the URL and what I want to obtain from regext

for first line the url is /gw/api/aaa/v1/ and the baseurl is /gw/api/aaa/v1

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @laurentiugrama,

ok, sorry for the misunderstanding!

Please try this:

index=your_index
| rex field=URL "[GET|POST]\s(?<baseurl>.+)\s+(HTTP)"
| rex field=baseurl "(?<baseurl1>.+)\/$"
| rex field=baseurl "(?<baseurl2>.+)((\?\w+$)|(\/sadsa.*))"
| eval baseurl=coalesce(baseurl2,baseurl1,baseurl)
| table URL baseurl

Ciao.

Giuseppe

laurentiugrama
Explorer

Your solution it works well.

Do you think that it's possible to have a solution based on only one regexp iteration ?

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi @laurentiugrama,

probably it's possible, I'll try tomorrow (today I'm out!).

Ciao and happy splunking.

Giuseppe

0 Karma
Get Updates on the Splunk Community!

New Case Study Shows the Value of Partnering with Splunk Academic Alliance

The University of Nevada, Las Vegas (UNLV) is another premier research institution helping to shape the next ...

How to Monitor Google Kubernetes Engine (GKE)

We’ve looked at how to integrate Kubernetes environments with Splunk Observability Cloud, but what about ...

Index This | How can you make 45 using only 4?

October 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...