Hi,
I am trying to get the top 50 URIs by 404 count, together with the top referer for each URI and that referer's count. For example, if uri1 is the top URI with 5k 404s, its top referer (the one producing the most 404s for uri1) might be referer1 with a count of 1000.
I have used this search, and it gives me counts for URIs and referers, but I am not getting 50 results; I am getting fewer than 50. What is wrong with this query? Any suggestions, please?
sourcetype=access_combined_wcookie status=404 | top 50 uri | eval CNT=count | eval %=percent | join uri [search sourcetype=access_combined_wcookie status=404 | top 50 uri referer | eval referer_count=count | eval referer_percent=percent ] | table CNT,%,uri,referer, count | sort - CNT
That did not work either. It gives a count of 1 for each of them.
Try the updated search.
Give this a try (updated search):
index=_internal sourcetype=*web_access (status=4* OR status=5*) | stats count as ur_count by status, uri, referer | eventstats sum(ur_count) as count by status, uri | sort 0 - status, count | dedup status, uri | sort - status uri count | streamstats count as rank by status, uri | where rank < 51 | sort status -count | fields - rank ur_count | table count status uri referer
That is what I was expecting, but surprisingly the counts are very low. I can send a screenshot if you would like.
The second query (the subsearch inside your join) is computing the top 50 uri-referer combinations. It's not giving you the top referer for each of the top 50 uris, so any top-50 uri that has no pair in that list is dropped by the join.
Try something like this:
sourcetype=access_combined_wcookie status=404 | stats count as ur_count by uri referer | eventstats sum(ur_count) as count by uri | sort 0 - ur_count | dedup uri | sort 0 - count | head 50
You'll get the count per uri in count and the count for that uri's top referer in ur_count.
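If you also want the top referer's share of each uri's 404s (your original search carried a percent column), something along these lines might work. This is an untested sketch that assumes the same sourcetype and field names as the search above; uri_count, referer_count and referer_percent are just names I picked for illustration, not fields Splunk provides.

```
sourcetype=access_combined_wcookie status=404
| stats count as ur_count by uri referer
| eventstats sum(ur_count) as count by uri
| sort 0 - ur_count
| dedup uri
| sort 0 - count
| head 50
| eval referer_percent=round(100 * ur_count / count, 1)
| rename count as uri_count, ur_count as referer_count
| table uri uri_count referer referer_count referer_percent
```

One caveat: stats with a by clause leaves out events that have no referer field at all. If that can happen in your data, adding fillnull value="-" referer before the stats should keep those events in the per-uri totals.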
The pipeline from the first stats onwards doesn't know or care about the status field. As a result, searching for status=4* OR status=5* will give you the top 50 uri values over all matching status codes. As a consequence, the top uri should have a count at least as high with the broader status filter as with status=404.
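If what you actually want is a separate ranking for each status code, one option is to carry status through the by clauses and rank within each status instead of using head. This is an untested sketch under that assumption; rank is just a temporary field name I made up, and it is removed at the end.

```
sourcetype=access_combined_wcookie (status=4* OR status=5*)
| stats count as ur_count by status uri referer
| eventstats sum(ur_count) as count by status uri
| sort 0 - ur_count
| dedup status uri
| sort 0 status, -count
| streamstats count as rank by status
| where rank <= 50
| fields - rank
```

Here count is the total for each status-uri pair and ur_count is the count for that pair's top referer. Each status code can contribute up to 50 rows, so the result can be longer than 50 rows overall.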
Hi, thanks for the reply. This is a much simpler version. One question: instead of status=404, I have (status=4* OR status=5*). When it uses head 50, does it take the first 50 by their count? I am getting very few results when I do it this way, but when I just use status=404, it looks like it gives me the correct numbers.