Splunk Search

Subsearch needed or can't use top :)

Engager

Hello,

Given the following access logs generated by the same page:

Input:

http://mydomain1.com/q?L=5000 [ Referer header: http://mydomain2.com/some-page2.html ]

http://mydomain1.com/q?L=6000 [ Referer header: http://mydomain5.com/some-page5.html ]

http://mydomain1.com/q?L=5500 [ Referer header: http://mydomain2.com/some-page2.html ]

Requirement:

I am trying to find the average value of L (greater than 1000 and less than 60001) for the top 5 referers.

Attempted solutions:

I thought about using a subsearch, but I can't get it to work as expected:

index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | timechart avg(L) span=5m by referer

Would I have to find the top 5 referers in a query, and then use the results of referers from that query as a pivot for another query?! 🙂 I wouldn't know how to get started with that one in Splunk.. I was trying to follow this guide http://www.innovato.com/splunk/SQLSplunk.html but no luck 😕

Any help is appreciated 🙂

Thank you.

-Gokce

0 Karma

Engager

Part 3:
And if I execute this subsearch:

index=myindex sourcetype="mysource" [index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer] | timechart avg(L) span=2m by referer

I get an error from Splunk:

Unknown search command 'index'.

Any ideas on how to execute this subsearch?

Thank you for all your help.

-Gokce
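
For reference, a minimal sketch of the same query with the search keyword added inside the square brackets, which is the change suggested elsewhere in this thread. A subsearch has to begin with a command, so a bare index=... inside the brackets is read as a command named index, hence the error. The index, sourcetype, and field names are taken from the post above. One assumption in this sketch: the L range filter is repeated in the outer search so that the average only covers values in that range.

```spl
index=myindex sourcetype="mysource" L>1000 L<60001 [search index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer] | timechart avg(L) span=2m by referer
```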

0 Karma

Engager

Part 2:
If I want to use this in a subsearch, then it looks like I need to prepend the "search" keyword to the query:

search index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer

but then I get a different result set:

252 matching events
referer
1 http://someotherdomain1.com
2 http://somereallyotherdomain1.com

So the "search" keyword in the beginning does not seem correct 😞
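
A possible explanation, offered as an assumption rather than something confirmed in this thread: the search bar already runs an implicit search command in front of whatever is typed, so an explicit leading search is likely treated as a literal keyword to match in the events. Under that assumption, the query above is effectively run as:

```spl
search search index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer
```

That would account for the much smaller event count, since only events containing the word "search" would match. Inside square brackets there is no implicit command, so the keyword is needed there and does not act as a search term.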

0 Karma

Engager

Part 1 of comment (due to 600 char limit):

While trying to find a solution, I have observed a couple of things. Given this query:

index=myindex sourcetype="mysource" L>1000 L<60001 | stats count by referer | sort 10 -count | fields referer 

I get the expected results, i.e. the top 10 referers by count, and the matching events are:

113,948 matching events
referer
1   http://domain1.com/page1.html
2   http://domain2.com/page2.html
0 Karma

Ultra Champion

with a subsearch

index=myindex sourcetype="mysource" L>1000 L<60001 [search index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer]| timechart avg(L) span=5m by referer

/Kristian

0 Karma

Ultra Champion

@gt2013: search is required for the subsearch to work.

@bmacias84: No I normally don't use return for subsearches. fields will normally do quite well (and will produce a set of OR'ed field/value pairs.)

@gt2013: well, the docs for the return command state that you can/should specify the number of results you want. So, if you want to use return, try | return 5 referer I guess.
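
A minimal sketch of the return variant described above, using only the index, sourcetype, and field names from this thread. With a count of 5, return should emit up to five referer="..." pairs joined by OR, instead of the single pair it emits by default. The second query is a way to inspect what a fields-based subsearch would expand to: run at the search bar without the brackets, the format command collapses the results into a single search field holding the OR'ed pairs.

```spl
index=myindex sourcetype="mysource" L>1000 L<60001 [search index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | return 5 referer] | timechart avg(L) span=5m by referer

index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields referer | format
```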

0 Karma

Engager

Looks like return only returns 1 answer:

search index=myindex sourcetype="mysource" | where L>1000 and L<60001 | top 5 referer | return referer

search

1 referer="http://somedomain.com/page1.html"

without return:
referer count percent
1 http://domain1.com/page1.html 14 8.187135
2 http://domain2.com/page2.html 12 7.017544
3 http://domain3.com/page3.html 10 5.847953
4 http://domain4.com/page4.html 7 4.093567
5 http://domain5.com/page5.html...

I think we are very close.. 🙂 I am learning a lot from this exercise.. I didn't even know about return command 🙂

Thank you for all your help.

0 Karma

Champion

@kristian.kolb, to make your subsearch work, shouldn't you use the return command, which passes values up from the subsearch?

index=myindex sourcetype="mysource" [search index=myindex sourcetype="mysource" | where L>1000 and L<60001 | top 5 referer | return referer] | timechart avg(L) span=5m by referer

The return command should produce a search base like this:

index=myindex sourcetype="mysource" AND (referer=<value1> OR referer=<value2> OR referer=<value3> OR referer=<value4> OR referer=<value5>) | ...

You might need another where clause before timechart.

0 Karma

Engager

This is exactly what I tried to do; however, the query returns a different result set when run as a subsearch:

This query:
index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer

is returning different results when run with the search keyword prepended to it:
search index=myindex sourcetype="mysource" L>1000 L<60001 | top 5 referer | fields + referer

First I thought it's the subsearch maxtime, but when I run it for a short period of time, it's definitely returning different results. Please see 3 part comment above 🙂

Thank you for all your help.

0 Karma

Splunk Employee

Updated:

What about doing something like this:

... <your search> | bucket _time span=5m | stats avg(L) as myAvg by referer | sort -myAvg | head 5

That should give you the top 5 referrers based on the average value of 'L'.

0 Karma

Splunk Employee

Right. In that case use subsearch as Kristian has in his answer.

0 Karma

Engager

Thank you for all your answers, this is awesome.

This is definitely close to what I would like to get, however the query is sorting results by top 5 average values of L and finding the referers.

How would I get the top 5 referers and then find their average values of L over time? This is why I thought I needed the subsearch.

Thanks again, I appreciate all your help.

0 Karma

Splunk Employee

Kristian: yep, I need the sort. I'll update it.

0 Karma

Ultra Champion

Would the head 5 ensure that it was the top 5, without a | sort?

0 Karma