Splunk Search

How to extract a field from a GET request?

dmenon84
Path Finder

Hi all, I am having trouble extracting the following field from a GET request.

GET **/TSGene/**images/literature.jpg

I tried the following, but it did not seem to work: \bGET\s+\K\S+(\/[\/[:word:]\-\.\=\&\?]+)\s

I just want to extract the part highlighted above. Thanks in advance!

Thanks,
Deepthi


FeatureCreeep
Path Finder

This should get you what you want:

| rex "\"GET (?P<url>\/.*?[\/ ])" | eval url=trim(url)

This matches both when a second / follows the first path segment and when there isn't one. If there is no second /, the match ends at the space after the path, so the url has a trailing space; I added a trim to remove it. A fancier regex could probably remove the need for the trim, but this works.
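
For anyone who wants to check the pattern outside Splunk, here is a rough Python sketch of the same idea. It assumes Python's `re` module behaves like Splunk's regex engine for this pattern, uses `str.strip()` in place of `trim()`, and the helper name `extract_url` is made up for illustration. The sample lines are taken from the requests posted in this thread.

```python
import re

# Same pattern as the rex above: a literal '"GET ', then a lazy match
# that stops at the first '/' or space after the leading slash.
URL_RE = re.compile(r'"GET (?P<url>/.*?[/ ])')

def extract_url(raw):
    """Return the first path segment of a GET request, or None if nothing matches."""
    match = URL_RE.search(raw)
    # strip() plays the role of eval url=trim(url)
    return match.group("url").strip() if match else None

samples = [
    r'"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n',
    r'"HTTPS","","GET /favicon.ico HTTP/1.1\r\n',
    r'"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n',
]
for line in samples:
    print(extract_url(line))
```

The POST line yields `None`, which mirrors the NULL field you would see in Splunk for events the regex does not match.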

I'm a little confused about what you want to do with POSTs. In your example above, you still parsed POSTs, but maybe that was just an oversight. I would suggest filtering them out so you are only processing events with "GET " in the event. If you don't filter them out, the "url" field will be NULL for those events since the regex will not match.


skoelpin
SplunkTrust

Try this. Your field name will be GET:

| rex "(?<GET>GET\s\S+\.jpg)"


dmenon84
Path Finder

Sorry if I wasn't clear. I only want the following parts extracted: the data between the first pair of slashes after GET, including the slashes themselves.

Extracted data -

/TSGene/
/TSGene/
/favicon.ico
/TSGene/
/static/
/static/
/orl/

Actual requests -

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n


tiagofbmm
Influencer

Try this one:

| rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/]*)"

dmenon84
Path Finder

Thanks, that works better, but in some cases it picks up the HTTP that follows the request.

Can this be modified to extract like this?

"HTTPS","","GET /favicon.ico HTTP/1.1\r\n -> only /favicon.ico should be extracted.

At this time, it extracts the following -> /favicon.ico HTTP

Thanks in advance !


tiagofbmm
Influencer

Yes, just exclude whitespace in the rex too:

| rex field=_raw "(?<=POST|GET)\s(?<yourfield>\/[^\/\s]*)"
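
A rough way to sanity-check this against the samples in the thread is a small Python script. Two things here are assumptions rather than part of the answer above: Python's `re` rejects a lookbehind whose alternatives differ in length, so the sketch swaps `(?<=POST|GET)` for a plain non-capturing group, and it adds an optional trailing `/` so the result matches the `/TSGene/` form listed in the desired output. The helper name `first_segment` is hypothetical.

```python
import re

# Non-capturing (?:POST|GET) stands in for the lookbehind, which Python's re
# does not accept when the alternatives have different lengths.
# [^/\s]* stops at the next slash or whitespace; /? keeps a trailing slash if present.
SEGMENT_RE = re.compile(r'(?:POST|GET)\s(?P<yourfield>/[^/\s]*/?)')

def first_segment(raw):
    """Return the first path segment after GET or POST, or None if nothing matches."""
    match = SEGMENT_RE.search(raw)
    return match.group("yourfield") if match else None

samples = [
    r'"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n',
    r'"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n',
    r'"HTTPS","","GET /favicon.ico HTTP/1.1\r\n',
    r'"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n',
    r'"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n',
]
for line in samples:
    print(first_segment(line))
```

If the trailing slash is wanted in Splunk as well, appending `\/?` inside the capture group of the rex should have the same effect, though I have not run that variant against real events.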

tiagofbmm
Influencer

Can you please paste a full example of the GET request?


dmenon84
Path Finder

Sure, here are some more samples of GET and POST requests:

"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","gene=5781","GET /TSGene/gene_general.cgi?gene=5781 HTTP/1.1\r\n
"HTTPS","","GET /favicon.ico HTTP/1.1\r\n
"HTTPS","","POST /TSGene/search_result.cgi HTTP/1.1\r\n
"HTTPS","ver=20142803","GET /static/wp-content/plugins/fruitful-shortcodes/includes/shortcodes/js/tabs/easyResponsiveTabs.js?ver=20142803 HTTP/1.1\r\n
"HTTPS","ver=1.11.4","GET /static/wp-includes/js/jquery/ui/slider.min.js?ver=1.11.4 HTTP/1.1\r\n
"HTTPS","","GET /orl/wp-content/themes/utms-orl/images/common/prefooter-bg.jpg HTTP/1.1\r\n

Some logs have a version number in between.
