Getting Data In

Can I loop through URL and http_referrer to find the original request?

bababou
Explorer

Hi everyone,

I'd like to trace the flow from a given final URL back to the original URL the user typed.

In my web proxy logs, I see the following fields:
_time, src_ip, http_referrer, http_method, URL

For example:
003, 1.1.1.1, htp://www.bbb.com/ads.html, GET, htp://www.ccc.com/ccc.html
002, 1.1.1.1, htp://www.aaa.com/, GET, htp://www.bbb.com/ads.html
001, 1.1.1.1, -, GET, htp://www.aaa.com/

What I want to do is, given the final URL (ccc.com/ccc.html), go back in time through the (http_referrer, URL) pairs and find all the URLs up to the original one (aaa.com), which has http_referrer="-".

Sometimes this flow is spread across 10 different requests mixed in with other web traffic, so it is hard to trace by hand.

Programmatically I would do this with a single loop, but I cannot find any looping construct in Splunk.

Can you help me? Thanks.

0 Karma
1 Solution

bababou
Explorer

I solved my problem with an external script:


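import splunk.Intersplunk

# Events piped into the command; Splunk returns them newest first by default,
# which is the order this backward walk relies on.
results, dummyresults, settings = splunk.Intersplunk.getOrganizedResults()

# Read the url="..." option; '-' (no referrer) means there is nothing to trace.
keywords, options = splunk.Intersplunk.getKeywordsAndOptions()
httpref = options.get('url', '-')

newresults = []

for result in results:
    # Stop once we have reached the request that had no referrer.
    if httpref == '-':
        break
    # This event is the next hop back: keep it and follow its referrer.
    if result.get('url') == httpref:
        newresults.append(result)
        httpref = result.get('http_referer', '-')

splunk.Intersplunk.outputResults(newresults)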

And I call it this way:

... | referer url="htp://www.ccc.com/ccc.html" | table _time, http_referer, url
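For anyone who wants to try the logic without a Splunk instance, here is a minimal standalone sketch of the same backward walk. It assumes the events are plain dicts with the field names used in the script (`url`, `http_referer`) and that `_time` sorts correctly as a string; the function name `trace_referrer_chain` and the unrelated `zzz.com` event are made up for the illustration.

```python
from typing import Dict, List


def trace_referrer_chain(events: List[Dict[str, str]], final_url: str) -> List[Dict[str, str]]:
    """Walk back from final_url to the request whose referrer is "-"."""
    # Newest first, mirroring Splunk's default result order.
    ordered = sorted(events, key=lambda e: e["_time"], reverse=True)
    wanted = final_url
    chain = []
    for event in ordered:
        # Stop once the previous hop had no referrer.
        if wanted == "-":
            break
        # Found the next hop back: keep it and follow its referrer.
        if event["url"] == wanted:
            chain.append(event)
            wanted = event.get("http_referer", "-")
    return chain


# Sample data from the question, plus one unrelated (hypothetical) request.
events = [
    {"_time": "001", "src_ip": "1.1.1.1", "http_referer": "-", "url": "htp://www.aaa.com/"},
    {"_time": "002", "src_ip": "1.1.1.1", "http_referer": "htp://www.aaa.com/", "url": "htp://www.bbb.com/ads.html"},
    {"_time": "002", "src_ip": "2.2.2.2", "http_referer": "-", "url": "htp://www.zzz.com/"},
    {"_time": "003", "src_ip": "1.1.1.1", "http_referer": "htp://www.bbb.com/ads.html", "url": "htp://www.ccc.com/ccc.html"},
]

chain = trace_referrer_chain(events, "htp://www.ccc.com/ccc.html")
chain_urls = [event["url"] for event in chain]
print(chain_urls)
```

Because the walk is a single pass over events ordered newest first, each event is visited at most once, so a referrer loop cannot make it spin forever. It does not filter on src_ip, so in real traffic you would probably restrict the base search to one client first.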

0 Karma

somesoni2
Revered Legend

See Splunk's map command, which acts as a looping operator: it runs a search once for each input result.

0 Karma

neerajs_81
Builder

Can someone please explain how to use the map command, or how to search for the original request URL without the external script that was marked as the solution?

0 Karma

scelikok
SplunkTrust

Hi @neerajs_81,

Please try the sample below, which uses the map command:

index="web_proxy" sourcetype="proxy" 
| map search="search index=\"web_proxy\" sourcetype=\"proxy\" http_referrer=\"$URL$\" OR http_referrer=\"-\" | eval finalURL=\"$URL$\"" 
| map search="search index=\"web_proxy\" sourcetype=\"proxy\" http_referrer=\"$http_referrer$\" | eval finalURL=\"$finalURL$\"" 
| search http_referrer="-" 
| dedup _raw 
| rename URL as originalURL 
| table finalURL originalURL

 

If this reply helps you, an upvote and "Accept as Solution" are appreciated.
0 Karma

technoe
Explorer

How is the data indexed? Maybe you could use the first or last functions of stats instead of looping through each one...

0 Karma

bababou
Explorer

Some kind of "transaction" could also be fine, ideally a table with _time and url.

0 Karma

jsie_splunk
Splunk Employee

When you say "interested", how do you want the data expressed? As a single field containing the full path?

0 Karma

bababou
Explorer

What really interests me is the whole path.
In this example: aaa.com -> bbb.com/ads.html -> ccc.com/ccc.html
And not only the first and last requests.
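As a rough illustration of that output, the sketch below turns a list of hops (newest first, the order a backward walk through the referrers would produce) into a single path string. The function name `render_path` is hypothetical, and dropping the scheme is just one possible way to shorten the URLs.

```python
from typing import List


def render_path(urls_newest_first: List[str]) -> str:
    """Return the hops oldest first, joined with ' -> ', without the scheme."""
    hops = []
    for url in reversed(urls_newest_first):
        # Drop everything up to and including "://", if present.
        _, sep, rest = url.partition("://")
        hops.append(rest if sep else url)
    return " -> ".join(hops)


path = render_path([
    "htp://www.ccc.com/ccc.html",
    "htp://www.bbb.com/ads.html",
    "htp://www.aaa.com/",
])
print(path)
```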

0 Karma