Hello,
Given the following access logs generated by the same page:
Input:
http://mydomain1.com/q?L=5000 [ Referer header: http://mydomain2.com/some-page2.html ]
http://mydomain1.com/q?L=6000 [ Referer header: http://mydomain5.com/some-page5.html ]
http://mydomain1.com/q?L=5500 [ Referer header: http://mydomain2.com/some-page2.html ]
Requirement:
I am trying to find the average value of L (greater than 1000 and less than 60001) for the top 5 referers.
Attempted solutions:
I thought about using a subsearch, but couldn't get it to work as expected:
index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | timechart avg(L) span=5m by referer
Would I have to find the top 5 referers in a query, and then use the results of referers from that query as a pivot for another query?! 🙂 I wouldn't know how to get started with that one in Splunk.. I was trying to follow this guide http://www.innovato.com/splunk/SQLSplunk.html but no luck 😕
Any help is appreciated 🙂
Thank you.
-Gokce
Part 3:
And if I execute this subsearch:
index=myindex sourcetype="mysource" [index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer] | timechart avg(L) span=2m by referer
I get an error from Splunk:
Unknown search command 'index'.
Any ideas on how to execute this subsearch?
Thank you for all your help.
-Gokce
Part 2:
If I want to use this in a subsearch, then it looks like I need to prepend the "search" keyword to the query:
search index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer
but then I get a different result set:
252 matching events
referer
1 http://someotherdomain1.com
2 http://somereallyotherdomain1.com
So the "search" keyword in the beginning does not seem correct 😞
Part 1 of comment (due to 600 char limit):
While I am trying to find a solution, I have observed a couple of things. Given this query:
index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer 
I get the expected results, i.e. the top 10 referers by count, and the matching events would be:
113,948 matching events
referer
1   http://domain1.com/page1.html
2   http://domain2.com/page2.html
with a subsearch
index=myindex sourcetype="mysource" L>1000 L<60001 [search index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer]| timechart avg(L) span=5m by referer
/Kristian
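For readers who want to see the logic outside Splunk, here is a minimal Python sketch of what the subsearch query above appears to be doing. It rests on some assumptions: events are plain dicts with hypothetical keys `time` (epoch seconds), `referer`, and `L`, and `top_referers` and `avg_by_bucket` are illustrative helper names, not Splunk APIs. The inner step picks the most frequent referers among events with 1000 < L < 60001, and the outer step averages L per 5-minute bucket for just those referers.

```python
from collections import Counter, defaultdict

def in_range(event):
    # Mirrors the L>1000 L<60001 filter from the query.
    return 1000 < event["L"] < 60001

def top_referers(events, n=5):
    # Inner step (the subsearch): most frequent referers among in-range events.
    counts = Counter(e["referer"] for e in events if in_range(e))
    return [referer for referer, _ in counts.most_common(n)]

def avg_by_bucket(events, referers, span=300):
    # Outer step: average L per (bucket start, referer), keeping only
    # in-range events whose referer was returned by the inner step.
    wanted = set(referers)
    sums = defaultdict(lambda: [0, 0])  # key -> [total of L, event count]
    for e in events:
        if in_range(e) and e["referer"] in wanted:
            key = (e["time"] - e["time"] % span, e["referer"])
            sums[key][0] += e["L"]
            sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

# Made-up sample events, loosely based on the log lines in the question.
events = [
    {"time": 0,   "referer": "http://mydomain2.com/some-page2.html", "L": 5000},
    {"time": 60,  "referer": "http://mydomain5.com/some-page5.html", "L": 6000},
    {"time": 120, "referer": "http://mydomain2.com/some-page2.html", "L": 5500},
    {"time": 400, "referer": "http://mydomain2.com/some-page2.html", "L": 70000},  # out of range
]

top = top_referers(events, n=1)
print(top)
print(avg_by_bucket(events, top))
```

The point of the two steps is that the ranking (by count) and the statistic (average of L over time) are computed separately, which is what the subsearch gives you.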
@gt2013: search is required for the subsearch to work.
@bmacias84: No I normally don't use return for subsearches. fields will normally do quite well (and will produce a set of OR'ed field/value pairs.)
@gt2013, well, the docs for the return command state that you can/should specify the number of results you want. So, if you want to use return, try | return 5 referer I guess.
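To illustrate the point about OR'ed field/value pairs, here is a small Python sketch of how subsearch result rows end up as a filter on the outer search. `expand_subsearch` is a hypothetical helper, and the string it builds is a simplified rendering, not the exact text Splunk generates. The `limit` parameter models the count argument of `return`, which defaults to 1 when omitted; that default would explain a bare `| return referer` yielding a single pair.

```python
def expand_subsearch(rows, field, limit=None):
    # rows: subsearch results as a list of dicts, one dict per result row.
    # limit=None models "| fields referer" (all rows are used);
    # limit=N models "| return N referer" (only the first N rows are used).
    if limit is not None:
        rows = rows[:limit]
    pairs = ['{}="{}"'.format(field, row[field]) for row in rows]
    return "(" + " OR ".join(pairs) + ")"

rows = [
    {"referer": "http://domain1.com/page1.html"},
    {"referer": "http://domain2.com/page2.html"},
    {"referer": "http://domain3.com/page3.html"},
]

print(expand_subsearch(rows, "referer"))           # all three pairs, OR'ed
print(expand_subsearch(rows, "referer", limit=1))  # a single pair, like a bare "return referer"
```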
Looks like return only returns 1 answer:
search index=myindex sourcetype="mysource" | where L>1000 and L<60001 | top 5 referer | return referer
search
1 referer="http://somedomain.com/page1.html"
without return:
    referer count   percent
1   http://domain1.com/page1.html   14  8.187135
2   http://domain2.com/page2.html   12  7.017544
3   http://domain3.com/page3.html   10  5.847953
4   http://domain4.com/page4.html   7   4.093567
5   http://domain5.com/page5.html...
I think we are very close.. 🙂 I am learning a lot from this exercise.. I didn't even know about return command 🙂
Thank you for all your help.
@kristian.kolb, to make your subsearch work, shouldn't you use the return command, which will pass values up from the subsearch? index=myindex sourcetype="mysource" [search index=myindex sourcetype="mysource" | where L>1000 and L<60001 | top 5 referer | return referer]| timechart avg(L) span=5m by referer. The return command should produce a search base like this: index=myindex sourcetype="mysource" AND (referer=<value1> OR referer=<value2> OR referer=<value3> OR referer=<value4> OR referer=<value5>)|.... You might need another where clause before timechart.
This is exactly what I tried to do; however, the query returns a different result set when run as a subsearch:
This query:
index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer
is returning different results when run as (i.e. with the search keyword prepended to it):
search index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer
At first I thought it was the subsearch maxtime, but even when I run it over a short period of time, it definitely returns different results. Please see the 3-part comment above 🙂
Thank you for all your help.
Updated:
What about doing something like this:
... <your search> |bucket _time span=5m | stats avg(L) as myAvg  by referer | sort -myAvg | head 5
That should give you the top 5 referrers based on the average value of 'L'.
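A rough Python sketch of what that pipeline computes, on made-up data: it groups by referer, averages L, sorts in descending order, and keeps the first few rows. `top_by_average` is an illustrative name, and the `(referer, L)` tuples are an assumed input shape. Note that this ranks referers by their average L, not by how often they appear, so a rarely seen referer with a large L can outrank a frequent one.

```python
from collections import defaultdict

def top_by_average(events, n=5):
    # Equivalent in spirit to: stats avg(L) as myAvg by referer | sort -myAvg | head n
    values = defaultdict(list)
    for referer, l_value in events:
        values[referer].append(l_value)
    averages = {referer: sum(v) / len(v) for referer, v in values.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

# (referer, L) pairs; "rare" appears once, "busy" appears three times.
events = [("busy", 2000), ("busy", 3000), ("busy", 4000), ("rare", 50000)]

print(top_by_average(events, n=1))
```

Without the sort step, the final slice would simply keep whichever rows happen to come first, which is why the sort matters before `head`.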
Right. In that case use subsearch as Kristian has in his answer.
Thank you for all your answers, this is awesome.
This is definitely close to what I would like to get, however the query is sorting results by top 5 average values of L and finding the referers.
How would I get the top 5 referers and then find their average values of L over time? This is why I thought I needed the subsearch.
Thanks again, I appreciate all your help.
Kristian - yep, I need the sort. I'll update it.
Would the head 5 ensure that it was the top 5, without c and | sort?
