Splunk Search

How to use rex to extract values from URLs into a field?

willamwar
Path Finder

Hello all,

From the following list

http://www.foo.com:80/main.html
http://www.foo.com:80/xe/journal/v1/book/nF1.jpg
http://www.goo.com:80/fiction/journal/07/40

Where the website is foo, find journal and extract /xe/journal/
So there are other websites, and not every target website has journal, and xe can be an number of characters

I have a working regex

 https?:\/\/.+?(?=foo\.com)[^\/]++(\/?[^\/]+\/(?=journal)journal\/?)

My issue is I can't figure out how to get rex to send the data to the 'data' variable.

| rex field=trimurl "https?:\/\/.+?(?=foo\.com)[^\/]++(?<data>.*)(\/?[^\/]+\/(?=journal)journal\/?)" 
0 Karma
1 Solution

gokadroid
Motivator

To get you the data in field data, rex part can be handled as follows:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+){2}\/)"

See here the regex a work

If in field called data you specifically want the keyword journal together with variable number string called xe, where xe is one or more charaters long, like in the form /xe/journal/ then try this:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+)\/journal\/)"

View solution in original post

woodcock
Esteemed Legend

Also, check out URL Toolbox:
https://splunkbase.splunk.com/app/2734/

0 Karma

gokadroid
Motivator

To get you the data in field data, rex part can be handled as follows:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+){2}\/)"

See here the regex a work

If in field called data you specifically want the keyword journal together with variable number string called xe, where xe is one or more charaters long, like in the form /xe/journal/ then try this:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+)\/journal\/)"

gokadroid
Motivator

oh damn...thanks...if you can accept the answer for it to be closed then that will help too...editing the answer as per your need and to correct my mistake..

0 Karma

willamwar
Path Finder

Thank you so much.

Could you fix the spelling of feld --> field so that others don't get an error and have to figure that out.

Get Updates on the Splunk Community!

See just what you’ve been missing | Observability tracks at Splunk University

Looking to sharpen your observability skills so you can better understand how to collect and analyze data from ...

Weezer at .conf25? Say it ain’t so!

Hello Splunkers, The countdown to .conf25 is on-and we've just turned up the volume! We're thrilled to ...

How SC4S Makes Suricata Logs Ingestion Simple

Network security monitoring has become increasingly critical for organizations of all sizes. Splunk has ...