topic Grouping by String and Sorting by Average in Splunk Search

Grouping by String and Sorting by Average

stanbridge — Thu, 24 Sep 2015 00:13:26 GMT

Hi there!

I have run the following search...

index="prop_data" uri=*/property/*/* | stats avg(execution_time) by uri | head 10

Which produces results like...

/testfolder1/property/for-sale-adverts.json 1.142857
/testfolder1/property/10006959/adverts.json 103.000000
/testfolder1/property/10006959/forrent.json 3.000000
/testfolder1/property/10007021/adverts.json 14.000000
/testfolder1/property/10007021/forrent.json 4.000000
/testfolder1/property/10010951/adverts.json 13.000000
/testfolder1/property/10010951/single-ad/15892269.json  18.500000
/testfolder1/property/10010951/single-ad/80817600.json  15.500000
/testfolder1/property/10015532/adverts.json 197.000000
/testfolder1/property/10015532/single-ad/19372287.json  15.000000

Ideally, what I'm actually wanting (broken into dot points for easier reading) is:

the top 10 grouped URIs sorted in descending order by the average execution_time for that "grouped set"...
where those URIs are grouped by: [whatever is between the 3rd and 4th slash that doesn't contain numbers] and [whatever is between the 4th and 5th slash]

So in the output above, there would only be an average execution time for:

for-sale-adverts.json (this is the only "uri" that would be captured by my first grouping)
adverts.json
forrent.json
single-ad

Any help on this one is MUCH appreciated!!!

Cheers,

Chris
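
The grouping rule in the question can be checked outside Splunk before it is turned into a search. The following is a minimal Python sketch, not a Splunk search: it assumes the group key is the first path segment after /property/ once an optional all-digit ID segment is skipped, and it treats the rows from the sample output as if they were individual events. The names EVENTS, GROUP_RE, group_key and top_groups are made up for this illustration.

```python
import re
from collections import defaultdict

# Hypothetical (uri, execution_time) events; the values are borrowed from the
# sample output in the question and treated as single events for illustration.
EVENTS = [
    ("/testfolder1/property/for-sale-adverts.json", 1.142857),
    ("/testfolder1/property/10006959/adverts.json", 103.0),
    ("/testfolder1/property/10006959/forrent.json", 3.0),
    ("/testfolder1/property/10007021/adverts.json", 14.0),
    ("/testfolder1/property/10007021/forrent.json", 4.0),
    ("/testfolder1/property/10010951/adverts.json", 13.0),
    ("/testfolder1/property/10010951/single-ad/15892269.json", 18.5),
    ("/testfolder1/property/10010951/single-ad/80817600.json", 15.5),
    ("/testfolder1/property/10015532/adverts.json", 197.0),
    ("/testfolder1/property/10015532/single-ad/19372287.json", 15.0),
]

# Assumed rule: after /property/, skip an optional all-digit segment,
# then capture the next path segment as the group key.
GROUP_RE = re.compile(r"/property/(?:\d+/)?([^/]+)")


def group_key(uri):
    """Return the group key for a URI, or None if the pattern does not match."""
    match = GROUP_RE.search(uri)
    return match.group(1) if match else None


def top_groups(events, limit=10):
    """Average execution_time per group key, highest average first."""
    times = defaultdict(list)
    for uri, execution_time in events:
        key = group_key(uri)
        if key is not None:
            times[key].append(execution_time)
    averages = {key: sum(vals) / len(vals) for key, vals in times.items()}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


result = top_groups(EVENTS)
for key, avg in result:
    print(f"{key}\t{avg:.6f}")
```

The same pattern should carry over to a rex extraction with a named capture group such as (?<uri_group>...), followed by stats avg(execution_time) by that field and a sort limited to 10 rows, but that has not been run against the poster's data.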

Re: Grouping by String and Sorting by Average

yuanliu — Thu, 24 Sep 2015 00:56:20 GMT

Something like

index="prop_data" uri=*/property/*/*
 | rex field=uri mode=sed "s=(/[^\/]+){2}.+?([^\d/]+).*=\2="
 | stats avg(execution_time) by uri

Re: Grouping by String and Sorting by Average

stanbridge — Thu, 24 Sep 2015 01:45:04 GMT

Thanks yuanliu, but no results unfortunately.

If it helps, here's some standard regex that successfully finds all of the strings I would want to group by...

(?<=\/)(?!.*\/\D)\D[^\/]+

Re: Grouping by String and Sorting by Average

yuanliu — Thu, 24 Sep 2015 06:35:43 GMT

If you have the regex, that should be all you need. All I'm suggesting is to extract that string and group accordingly. I don't get how \D is used in the above, but I can think of another workaround: just get rid of all numerals. Like this?

index="prop_data" uri=*/property/*/*
 | eval uri=replace(uri,".+/property/(\d+/)?","")
 | eval uri=replace(uri,"/\d+(\.json$|/)","")
 | stats avg(execution_time) by uri

Re: Grouping by String and Sorting by Average

stanbridge — Thu, 24 Sep 2015 23:07:00 GMT

Hi Yuanliu!

Sorry for the delayed reply; I'm currently only allowed 2 replies a day. I had this comment ready to go yesterday.

"Actually, I have it!

I just used two separate rex's. One to remove junk from the start of the wanted part of the string and a second one to remove stuff after the wanted part of the string.

Thanks anyway for your help Yuanliu!"

The regex I had above was good for finding the values in the middle of the string but didn't work ideally for Splunk.

Thanks for your suggestions though, very much appreciated!

Cheers,

Chris
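
The final search was not posted in the thread, so the exact expressions Chris used are unknown. As a rough illustration of the two-step idea only (one substitution to strip the front of the URI, a second to strip whatever follows the wanted part), here is a minimal Python sketch. Both patterns and the function name strip_to_group are assumptions made for this example, not the poster's actual rex commands.

```python
import re


def strip_to_group(uri):
    """Reduce a URI to its group key using two separate substitutions."""
    # Step 1 (assumed pattern): drop everything up to and including /property/,
    # plus an all-digit ID segment if one follows directly.
    tail = re.sub(r"^.*/property/(?:\d+/)?", "", uri)
    # Step 2 (assumed pattern): drop everything from the next slash onward.
    return re.sub(r"/.*$", "", tail)


samples = [
    "/testfolder1/property/for-sale-adverts.json",
    "/testfolder1/property/10006959/adverts.json",
    "/testfolder1/property/10007021/forrent.json",
    "/testfolder1/property/10010951/single-ad/15892269.json",
]
keys = [strip_to_group(uri) for uri in samples]
for uri, key in zip(samples, keys):
    print(f"{uri} -> {key}")
```

Splitting the work into two simple substitutions avoids the lookbehind and lookahead in the earlier regex; each step can be checked on its own against a handful of URIs.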