regexp in Splunk

laurentiugrama
Explorer

I am trying to parse URLs to extract the base URL, but I have not been able to get it working.

From the text between GET/POST and HTTP, I want to return the baseurl, as in the examples below:

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3

 

I added these examples to regex101.com to make it easier to find a solution.

https://regex101.com/r/oLXtw8/1/

 


PickleRick
SplunkTrust

(?<method>\S+)\s(?<path>/[^?]*)(\?(?<query>\S*))?

Typing on my phone, didn't verify it 😉
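
A quick way to check the unverified pattern above is to run it over a couple of the sample request lines. The sketch below does that in Python, which is an assumption of convenience: Python's `re` writes named groups as `(?P<name>...)` where Splunk's `rex` uses `(?<name>...)`. The `tightened` variant and the `paths` helper are illustrative additions, not part of the original reply.

```python
import re

# Pattern from the reply above, rewritten with Python's (?P<name>...) group syntax.
original = re.compile(r"(?P<method>\S+)\s(?P<path>/[^?]*)(\?(?P<query>\S*))?")

# Hypothetical tightening: also exclude whitespace so the path stops before " HTTP".
tightened = re.compile(r"(?P<method>\S+)\s(?P<path>/[^?\s]*)(\?(?P<query>\S*))?")

samples = [
    "GET /gw/api/aaa/v1/ HTTP",
    "POST /gw/api/swaggers/v1/asd?dssa HTTP",
]


def paths(pattern, lines):
    """Return the captured path for each line (None when a line does not match)."""
    out = []
    for line in lines:
        m = pattern.search(line)
        out.append(m.group("path") if m else None)
    return out


original_paths = paths(original, samples)
tightened_paths = paths(tightened, samples)

for line, o, t in zip(samples, original_paths, tightened_paths):
    print(f"{line!r}: original={o!r} tightened={t!r}")
```

Because `[^?]` also matches spaces, the original pattern captures through the end of the line when there is no query string. Either variant keeps the trailing slash, so neither returns the requested baseurl on its own.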

gcusello
SplunkTrust

Hi @laurentiugrama,

Sorry, what's the question?

The regex seems to be correct, although I'd use a simpler one:

(?:GET|POST)\s(?<URL>.+)\s+(HTTP)

Ciao.

Giuseppe

laurentiugrama
Explorer

Thank you for the response, but the expression returns the URL, not the baseurl.

As I said, I am trying to obtain the baseurl from a URL.

gcusello
SplunkTrust

Hi @laurentiugrama,

Sorry, but I don't understand: what do you mean by baseurl?

Is it the name of the field or a part of the URL?

If you want it as the name of the field, you can modify the regex:

| rex "(?:GET|POST)\s(?<baseurl>.+)\s+(HTTP)"

If instead you want a part of the URL, e.g. the first two sections, you could use something like this:

| rex "(?:GET|POST)\s(?<baseurl>\/\w+\/\w+\/)\s+(HTTP)"

Ciao.

Giuseppe

laurentiugrama
Explorer

In the first post I explained what the URL is and what I want to obtain from the regex.

For the first line, the URL is /gw/api/aaa/v1/ and the baseurl is /gw/api/aaa/v1:

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3

gcusello
SplunkTrust

Hi @laurentiugrama,

OK, sorry for the misunderstanding!

Please try this:

index=your_index
| rex field=URL "(?:GET|POST)\s(?<baseurl>.+)\s+(HTTP)"
| rex field=baseurl "(?<baseurl1>.+)\/$"
| rex field=baseurl "(?<baseurl2>.+)((\?\w+$)|(\/sadsa.*))"
| eval baseurl=coalesce(baseurl2,baseurl1,baseurl)
| table URL baseurl

Ciao.

Giuseppe
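
To see why the multi-step search above handles the posted samples, here is a rough Python mirror of its three `rex` steps and the final `coalesce`. This is only a sketch under assumptions: the function name `base_url` is hypothetical, named groups use Python's `(?P<name>...)` spelling, and the last sample line is invented to probe the hard-coded `/sadsa` alternative.

```python
import re


def base_url(line):
    """Mimic the three rex steps, then coalesce(baseurl2, baseurl1, baseurl)."""
    # Step 1: everything between the method and " HTTP".
    m = re.search(r"(?:GET|POST)\s(?P<baseurl>.+)\s+HTTP", line)
    if m is None:
        return None
    baseurl = m.group("baseurl")
    # Step 2: drop a single trailing slash, if present.
    m1 = re.search(r"(?P<baseurl1>.+)/$", baseurl)
    # Step 3: drop a query string, or everything from "/sadsa" onward.
    m2 = re.search(r"(?P<baseurl2>.+)(?:\?\w+$|/sadsa.*)", baseurl)
    # coalesce: the first non-null of baseurl2, baseurl1, baseurl wins.
    if m2:
        return m2.group("baseurl2")
    if m1:
        return m1.group("baseurl1")
    return baseurl


posted = [
    "GET /gw/api/aaa/v1/ HTTP",
    "POST /gw/api/swaggers/v1/asd?dssa HTTP",
    "GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP",
]
# Hypothetical line (not from the thread) with a different trailing identifier.
invented = "GET /api/cashAccountOpenings/v3/abc-999 HTTP"

for line in posted + [invented]:
    print(line, "->", base_url(line))
```

The invented line comes back unchanged because the third step only recognizes the literal `/sadsa` prefix, so real data with other identifiers would likely need a more general rule.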

laurentiugrama
Explorer

Your solution works well.

Do you think a solution based on a single regex is possible?

gcusello
SplunkTrust

Hi @laurentiugrama,

It's probably possible; I'll try tomorrow (I'm out today).

Ciao and happy splunking.

Giuseppe
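
One possible single-pattern approach is sketched below in Python, checked against the seven samples from the first post. It rests on a loud assumption that the thread does not confirm: a final path segment is dropped only when it contains a hyphen (as in `sadsa-123312-1312`), which is how the sketch tells an identifier apart from a segment like `asd`. The names `BASEURL` and `extract` are hypothetical, and in `rex` the named group would be written `(?<baseurl>...)` instead of `(?P<baseurl>...)`.

```python
import re

BASEURL = re.compile(
    r"\b(?:GET|POST)\s+"
    r"(?P<baseurl>/[^\s?]*?)"    # path, matched lazily so the optional parts below can apply
    r"(?:/[^/\s?]*-[^/\s?]*)?"   # ASSUMPTION: a final segment containing "-" is an ID to drop
    r"/?"                        # optional trailing slash, left out of the capture
    r"(?:\?\S*)?"                # optional query string, left out of the capture
    r"\s+HTTP"
)


def extract(line):
    """Return the baseurl for a request line, or None if the line does not match."""
    m = BASEURL.search(line)
    return m.group("baseurl") if m else None


# (request line, expected baseurl) pairs taken from the first post.
samples = [
    ("GET /gw/api/aaa/v1/ HTTP", "/gw/api/aaa/v1"),
    ("GET /gw/api/abc/v3 HTTP", "/gw/api/abc/v3"),
    ("POST /gw/api/cba/ HTTP", "/gw/api/cba"),
    ("POST /gw/transactions/swaggers/v2 HTTP", "/gw/transactions/swaggers/v2"),
    ("POST /gw/api/swaggers/v1/asd?dssa HTTP", "/gw/api/swaggers/v1/asd"),
    ("POST /api/swaggers/ HTTP", "/api/swaggers"),
    ("GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP", "/api/cashAccountOpenings/v3"),
]

results = [extract(line) for line, _ in samples]
for (line, expected), got in zip(samples, results):
    print(f"{line} -> {got} (expected {expected})")
```

The lazy quantifier on the path is the key design choice: it lets the engine hand the trailing slash, query string, or hyphenated segment to the optional groups instead of swallowing them into the capture. If real identifiers do not always contain a hyphen, the ID rule would need to change.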
