Hi All,
I am looking into using some proxy logs to determine download volume for particular streaming sites and was looking for a way to merge hostnames into one "service".
Consider the SPL:
index=proxy url=*.streaming-site.com
| eval megabytes=round(((bytes_in/1024)/1024),2)
| stats sum(megabytes) as Download_MB by url
| sort -Download_MB
Will likely return multiple rows like:
cdn1.streaming-site.com 180.3
cdn2.streaming-site.com 164.8
www.streaming-site.com 12.3
I am wanting to merge those all into one row of
streaming-site.com 357.4
I have played around with the coalesce function, but this would be unsustainable for sites like Netflix which have dozens of URLs associated with them.
If anyone has any suggestions on how I might combine results with say a wildcard (*), I'd love to hear from you!
The coalesce function selects a field within a single result. To combine (aggregate) multiple results, use the stats command again after modifying the url field.
index=proxy url=*.streaming-site.com
| eval megabytes=round(((bytes_in/1024)/1024),2)
| stats sum(megabytes) as Download_MB by url
| eval url=replace(url, ".*?\.(.*)","\1")
| stats sum(Download_MB) as Download_MB by url
| sort - Download_MB
Thanks for this! 🙂
More tweaking required on my part as some of the subdomains being evaluated have more than 3 levels, but this is a big help in getting me on the right track!
The coalesce function selects a field within a single result. To combine (aggregate) multiple results, use the stats command again after modifying the url field.
index=proxy url=*.streaming-site.com
| eval megabytes=round(((bytes_in/1024)/1024),2)
| stats sum(megabytes) as Download_MB by url
| eval url=replace(url, ".*?\.(.*)","\1")
| stats sum(Download_MB) as Download_MB by url
| sort - Download_MB