regexp in Splunk

laurentiugrama
Explorer

I'm trying to parse some URLs to extract their base URL, but I haven't been able to get it working.

For the part between GET/POST and HTTP, I want to return the baseurl, as in the examples below:

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3

I added these examples to regex101.com to make it easier to find a solution.

https://regex101.com/r/oLXtw8/1/

PickleRick
SplunkTrust

(?<method>\S+)\s(?<path>/[^?]*)(\?(?<query>\S*))?

Typing on my phone, didn't verify it 😉
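
Since the pattern above was posted unverified, here is a minimal Python sketch for checking it against two of the sample lines. Python's `re` module spells named groups `(?P<name>...)`, so the pattern is translated accordingly; the helper name `split_request` is made up for illustration. One thing the check brings out: `[^?]*` also matches spaces, so on a line without a query string the path group runs on past the URL.

```python
import re

# Python's re module spells named groups (?P<name>...); the SPL/PCRE form is (?<name>...).
PATTERN = re.compile(r"(?P<method>\S+)\s(?P<path>/[^?]*)(\?(?P<query>\S*))?")

def split_request(line):
    """Return (method, path, query) for a request line, or None if nothing matches."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m.group("method"), m.group("path"), m.group("query")

with_query = split_request("POST /gw/api/swaggers/v1/asd?dssa HTTP")
without_query = split_request("GET /gw/api/aaa/v1/ HTTP")

print(with_query)     # ('POST', '/gw/api/swaggers/v1/asd', 'dssa')
print(without_query)  # ('GET', '/gw/api/aaa/v1/ HTTP', None)
```

Changing `[^?]*` to `[^?\s]*` would probably be enough to stop the path at the first space, although the trailing slash would still be part of the captured value.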


gcusello
SplunkTrust

Hi @laurentiugrama,

Sorry, what's the question?

The regex seems to be correct, though I'd use a simpler one:

(?:GET|POST)\s(?<URL>.+)\s+(HTTP)

Ciao.

Giuseppe

laurentiugrama
Explorer

Thank you for the response, but the expression returns the URL, not the baseurl.

As I said, I'm trying to obtain the baseurl from a URL.


gcusello
SplunkTrust

Hi @laurentiugrama,

Sorry, but I don't understand: what do you mean by baseurl?

The name of the field, or a part of the URL?

If you want the name of the field, you can modify the regex:

| rex "(?:GET|POST)\s(?<baseurl>.+)\s+(HTTP)"

If instead you want a part of the URL, e.g. the first two sections, you could use something like this:

| rex "(?:GET|POST)\s(?<baseurl>\/\w+\/\w+\/)\s+(HTTP)"

Ciao.

Giuseppe
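
One detail worth checking in patterns like these: in PCRE, square brackets define a character class, so `[GET|POST]` matches any single one of the characters G, E, T, |, P, O, S, whereas `(?:GET|POST)` matches either whole word. The rough Python sketch below compares the two spellings; the PUT line is a hypothetical event that does not appear in the thread, and the named group uses Python's `(?P<name>...)` form.

```python
import re

# Python spelling of the named group: (?P<name>...) instead of (?<name>...).
char_class = re.compile(r"[GET|POST]\s(?P<baseurl>.+)\s+(HTTP)")
alternation = re.compile(r"(?:GET|POST)\s(?P<baseurl>.+)\s+(HTTP)")

put_line = "PUT /gw/api/cba/ HTTP"   # hypothetical event, not from the thread
get_line = "GET /gw/api/cba/ HTTP"

# The class matches the single "T" before the space, so PUT slips through.
class_on_put = char_class.search(put_line)
# The group requires the whole word GET or POST, so PUT is rejected.
group_on_put = alternation.search(put_line)
group_on_get = alternation.search(get_line)

print(class_on_put.group("baseurl"))  # /gw/api/cba/
print(group_on_put)                   # None
print(group_on_get.group("baseurl"))  # /gw/api/cba/
```

For the GET and POST samples in this thread both spellings extract the same value, so the difference only shows up when other methods appear in the data.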

laurentiugrama
Explorer

In the first post I explained what the URL is and what I want to obtain from the regex.

For the first line, the URL is /gw/api/aaa/v1/ and the baseurl is /gw/api/aaa/v1

GET /gw/api/aaa/v1/ HTTP - to return /gw/api/aaa/v1
GET /gw/api/abc/v3 HTTP - to return /gw/api/abc/v3
POST /gw/api/cba/ HTTP - to return /gw/api/cba
POST /gw/transactions/swaggers/v2 HTTP - to return /gw/transactions/swaggers/v2
POST /gw/api/swaggers/v1/asd?dssa HTTP - to return /gw/api/swaggers/v1/asd
POST /api/swaggers/ HTTP - to return /api/swaggers
GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP - to return /api/cashAccountOpenings/v3


gcusello
SplunkTrust

Hi @laurentiugrama,

ok, sorry for the misunderstanding!

Please try this:

index=your_index
| rex field=URL "(?:GET|POST)\s(?<baseurl>.+)\s+(HTTP)"
| rex field=baseurl "(?<baseurl1>.+)\/$"
| rex field=baseurl "(?<baseurl2>.+)((\?\w+$)|(\/sadsa.*))"
| eval baseurl=coalesce(baseurl2,baseurl1,baseurl)
| table URL baseurl

Ciao.

Giuseppe
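
To see how the two follow-up rex steps and the coalesce combine, here is a rough Python emulation that starts from the path the first rex extracts. The function names (`coalesce`, `extract`, `three_step_baseurl`) are illustrative only, and a missing SPL field is modeled as `None`. The last call uses a made-up ID to show that the third step is tied to the literal `sadsa` from the sample data.

```python
import re

def coalesce(*values):
    """Return the first value that is not None, similar to SPL's coalesce()."""
    for value in values:
        if value is not None:
            return value
    return None

def extract(pattern, text):
    """Return the first capture group, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(1) if m else None

def three_step_baseurl(path):
    # Step 2 of the search: drop a trailing slash.
    baseurl1 = extract(r"(.+)/$", path)
    # Step 3 of the search: drop a query string, or everything from "/sadsa" on.
    baseurl2 = extract(r"(.+)(?:\?\w+$|/sadsa.*)", path)
    # Step 4 of the search: prefer baseurl2, then baseurl1, then the raw path.
    return coalesce(baseurl2, baseurl1, path)

print(three_step_baseurl("/gw/api/aaa/v1/"))               # /gw/api/aaa/v1
print(three_step_baseurl("/gw/api/swaggers/v1/asd?dssa"))  # /gw/api/swaggers/v1/asd
print(three_step_baseurl("/api/cashAccountOpenings/v3/sadsa-123312-1312"))  # /api/cashAccountOpenings/v3
# Hypothetical ID that does not start with "sadsa": the path comes back unchanged.
print(three_step_baseurl("/api/cashAccountOpenings/v3/abc-1"))  # /api/cashAccountOpenings/v3/abc-1
```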

laurentiugrama
Explorer

Your solution works well.

Do you think it's possible to have a solution based on only one regexp?


gcusello
SplunkTrust

Hi @laurentiugrama,

probably it's possible, I'll try tomorrow (today I'm out!).

Ciao and happy splunking.

Giuseppe
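
As a possible starting point for the single-regexp question, here is a sketch in Python that can be run against the seven samples from the first post. It rests on an assumption the thread does not confirm: that a base URL segment contains only word characters (letters, digits, underscore), so the base ends at a trailing slash, a `?`, a space, or an ID-like segment containing something else such as a hyphen. The names `BASEURL` and `base_url` are illustrative, and the SPL spelling in the comment is untested.

```python
import re

# Assumption: a "base" segment contains only word characters [A-Za-z0-9_],
# and the base ends at a trailing slash, a "?", a space, or an ID-like
# segment containing another character such as "-".
# Untested SPL spelling of the same idea:
#   | rex "(?:GET|POST)\s+(?<baseurl>(?:/\w+)+)(?=[/?\s])"
BASEURL = re.compile(r"(?:GET|POST)\s+(?P<baseurl>(?:/\w+)+)(?=[/?\s])")

# Request lines and expected results copied from the first post.
SAMPLES = {
    "GET /gw/api/aaa/v1/ HTTP": "/gw/api/aaa/v1",
    "GET /gw/api/abc/v3 HTTP": "/gw/api/abc/v3",
    "POST /gw/api/cba/ HTTP": "/gw/api/cba",
    "POST /gw/transactions/swaggers/v2 HTTP": "/gw/transactions/swaggers/v2",
    "POST /gw/api/swaggers/v1/asd?dssa HTTP": "/gw/api/swaggers/v1/asd",
    "POST /api/swaggers/ HTTP": "/api/swaggers",
    "GET /api/cashAccountOpenings/v3/sadsa-123312-1312 HTTP": "/api/cashAccountOpenings/v3",
}

def base_url(line):
    """Return the base URL under the assumption above, or None if nothing matches."""
    m = BASEURL.search(line)
    return m.group("baseurl") if m else None

results = {line: base_url(line) for line in SAMPLES}
for line, got in results.items():
    print(f"{line!r} -> {got}")
```

The lookahead `(?=[/?\s])` forces the match to end on a segment boundary, which is what makes the engine back off from the hyphenated ID segment. A legitimate path segment containing a hyphen or a dot would be cut off the same way, so the character set in `\w+` may need widening for real data.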
