Splunk Search

How to extract a field from a GET request?

dmenon84
Path Finder

Hi All - I am having trouble extracting the following field from a GET request.

GET **/TSGene/**images/literature.jpg

I tried the following, but it did not work: \bGET\s+\K\S+(\/[\/[:word:]\-\.\=\&\?]+)\s

I just want to extract the part highlighted above. Thanks in advance!

Thanks,
Deepthi

FeatureCreeep
Path Finder

This should get you what you want:

| rex "\"GET (?P<url>\/.*?[\/ ])" | eval url=trim(url)

This matches whether or not the path contains a second /. If there is no second /, the match ends at the space after the path, so the url has a trailing space; I added a trim to remove it. A fancier regex could probably remove the need for the trim, but this works.

I'm a little confused about what you want to do with POSTs. In your example above you still parsed POSTs, but maybe that was just an oversight. I would suggest filtering them out so that you only process events that contain "GET (including the leading double quote). If you don't filter them out, the "url" field will be NULL for those events, since the regex will not match.
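
To see that behaviour outside Splunk, here is a rough Python sketch of the same pattern. It assumes Python's re module is close enough to the PCRE engine behind rex for this expression; the function name extract_url and the samples list are made up for the illustration, and strip() stands in for the trim.

```python
import re

# Python copy of the rex pattern above; (?P<url>...) is valid in both engines.
GET_URL = re.compile(r'"GET (?P<url>/.*?[/ ])')


def extract_url(event):
    """Return the first path segment of a GET request, or None (like a NULL field)."""
    match = GET_URL.search(event)
    # strip() plays the role of "eval url=trim(url)"
    return match.group("url").strip() if match else None


# Hypothetical test data: raw strings keep the literal \r\n text of the events.
samples = [
    r'"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n',
    r'"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n',
    r'"HTTPS","","GET /favicon.ico HTTP/1.1\r\n',
]

for event in samples:
    print(extract_url(event))
```

The POST event prints None, while the two GET events print /TSGene/ and /favicon.ico.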

skoelpin
SplunkTrust

Try this. Your field name will be GET:

| rex "(?<GET>GET\s\S+\.jpg)"

dmenon84
Path Finder

Sorry if I wasn't clear. I only want the following part extracted: the data between the first two slashes after GET, including the slashes themselves.

Extracted data -

/TSGene/
/TSGene/
/favicon.ico
/TSGene/
/static/
/static/
/orl/

Actual requests -

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n

tiagofbmm
Influencer

Try this one:

| rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/]*)"

dmenon84
Path Finder

Thanks, that works better, but in some cases it also picks up the HTTP that follows the request.

Can this be modified so that only /favicon.ico is extracted from the following?

"HTTPS","","GET /favicon.ico HTTP/1.1\r\n

At the moment it extracts: /favicon.ico HTTP

Thanks in advance!

tiagofbmm
Influencer

Yes, just exclude whitespace in the character class as well:

 | rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/\s]*)"
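
One thing to watch: the character class stops before the second slash, so the capture is /TSGene rather than /TSGene/. If the trailing slash is wanted, an optional \/? at the end of the group is one possible tweak. The rough Python sketch below compares the two, assuming Python's re module as a stand-in for rex. The lookbehind is swapped for (?:POST|GET) because Python only accepts fixed-width lookbehinds; the names AS_POSTED, WITH_SLASH and first_segment are made up for the illustration.

```python
import re

# (?:POST|GET) replaces the lookbehind; the capture group itself is unchanged.
AS_POSTED = re.compile(r'(?:POST|GET)\s(?P<yourfield>/[^/\s]*)')
# Assumed variant: an optional trailing slash is kept when the path has one.
WITH_SLASH = re.compile(r'(?:POST|GET)\s(?P<yourfield>/[^/\s]*/?)')


def first_segment(pattern, event):
    """Return the captured first path segment, or None when nothing matches."""
    match = pattern.search(event)
    return match.group("yourfield") if match else None


# Sample events from this thread; raw strings keep the literal \r\n text.
samples = [
    r'"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n',
    r'"HTTPS","","GET /favicon.ico HTTP/1.1\r\n',
    r'"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n',
]

for event in samples:
    print(first_segment(AS_POSTED, event), first_segment(WITH_SLASH, event))
```

In rex the same tweak would be (?<yourfield>\/[^\/\s]*\/?), which should be verified against real events before relying on it.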

tiagofbmm
Influencer

Can you please paste a full example of the GET request?

dmenon84
Path Finder

Sure - some more samples of GET and POST

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n

Some logs have a version number in between.
