How to use rex to extract values from URLs into a field?

willamwar
Path Finder

Hello all,

Given the following list of URLs:

http://www.foo.com:80/main.html
http://www.foo.com:80/xe/journal/v1/book/nF1.jpg
http://www.goo.com:80/fiction/journal/07/40

Where the website is foo.com, I want to find journal and extract /xe/journal/.
There are other websites in the data, not every target URL contains journal, and xe can be any number of characters.

I have a working regex

 https?:\/\/.+?(?=foo\.com)[^\/]++(\/?[^\/]+\/(?=journal)journal\/?)

My issue is that I can't figure out how to get rex to put the match into the 'data' field. This is what I tried:

| rex field=trimurl "https?:\/\/.+?(?=foo\.com)[^\/]++(?<data>.*)(\/?[^\/]+\/(?=journal)journal\/?)" 

gokadroid
Motivator

To get the value into a field called data, the rex part can be written as follows:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+){2}\/)"

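As a rough offline check of what this pattern captures, here is a minimal Python sketch run against the three URLs from the question. It assumes Python's `re` module behaves like Splunk's PCRE-based rex for this pattern; the only change is the named-group spelling, `(?P<data>...)` instead of `(?<data>...)`. The helper name `extract` is made up for illustration.

```python
import re

# Same pattern as the rex above; only the named group is respelled for Python.
PATTERN = re.compile(r"https?:\/\/.+?(foo\.com)[^\/]+(?P<data>(\/[^\/]+){2}\/)")

# The three sample URLs from the question.
urls = [
    "http://www.foo.com:80/main.html",
    "http://www.foo.com:80/xe/journal/v1/book/nF1.jpg",
    "http://www.goo.com:80/fiction/journal/07/40",
]

def extract(url):
    """Return the 'data' capture, or None when the pattern does not match."""
    # search() matches anywhere in the string, like rex does.
    m = PATTERN.search(url)
    return m.group("data") if m else None

results = {url: extract(url) for url in urls}
for url, data in results.items():
    print(url, "->", data)
```

Only the second URL yields a value (`/xe/journal/`). The first has a single path segment and the third is not on foo.com, so neither matches. Note that this pattern takes the first two path segments whatever they are, so it does not by itself require the word journal.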

If you specifically want the field data to contain the keyword journal together with the variable-length string xe, where xe is one or more characters long, in the form /xe/journal/, then try this:

rex field=trimurl "https?:\/\/.+?(foo\.com)[^\/]+(?<data>(\/[^\/]+)\/journal\/)"
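To see how the journal-specific pattern behaves, here is a small Python sketch under the same assumption as before: Python's `re` stands in for rex, with the named group written as `(?P<data>...)`. The function name `journal_prefix` and the second and third sample URLs are hypothetical and not from the original data.

```python
import re

# Same pattern as the rex above; only the named group is respelled for Python.
JOURNAL = re.compile(r"https?:\/\/.+?(foo\.com)[^\/]+(?P<data>(\/[^\/]+)\/journal\/)")

def journal_prefix(url):
    """Return '/<segment>/journal/' for foo.com URLs, or None if no match."""
    m = JOURNAL.search(url)
    return m.group("data") if m else None

samples = [
    "http://www.foo.com:80/xe/journal/v1/book/nF1.jpg",  # from the question
    "http://www.foo.com:80/abcde/journal/2",             # hypothetical: longer first segment
    "http://www.foo.com:80/a/b/c.html",                  # hypothetical: no journal segment
    "http://www.goo.com:80/fiction/journal/07/40",       # from the question: wrong site
]

extracted = [journal_prefix(u) for u in samples]
for url, data in zip(samples, extracted):
    print(url, "->", data)
```

The first two URLs give `/xe/journal/` and `/abcde/journal/`. The last two give no value, because the pattern requires both foo.com and journal as the second path segment.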

woodcock
Esteemed Legend

Also, check out URL Toolbox:
https://splunkbase.splunk.com/app/2734/

gokadroid
Motivator

Oh damn... thanks. If you can accept the answer so that the question gets closed, that will help too. I'm editing the answer as you asked, to correct my mistake.

willamwar
Path Finder

Thank you so much.

Could you fix the spelling of feld --> field so that others don't get an error and have to figure that out?