Thanks @yannK! Hope all is well! Time flies, huh? 2013...**bleep**! I have come from the future to add an example where I applied perc95 to application access logging - a party trick app developers often ask for. I stumbled on this post while working on analyzing some service mesh logging and reading the perc95 docs.

The year is now 2021, I have events from a traffic gateway (Istio - think access_combined type stuff), and I receive access logging events for my "Ingress traffic":

[2021-02-28T13:35:35.921Z] "GET /code/mattymo/docker_addon_builder/-/branches/all?sort=updated_asc HTTP/1.1" 200 - "-" "-" 0 9656 574 570 "185.191.171.6" "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)" "349525cc-6fff-9c55-af95-986cb31bdf70" "mattymo.io" "10.1.74.210:443" outbound|443||gitlab.gitlab.svc.cluster.local - 10.1.74.189:443 185.191.171.6:16156 mattymo.io -

This event then gets parsed to provide me many fields, but the two I'll use here are "duration" and "upstream_cluster". In the event above, for example, duration=574 and upstream_cluster="outbound|443||gitlab.gitlab.svc.cluster.local" (there's a rough sketch of one way to extract these at the bottom of this reply).

As an app developer, performance analyst, SRE... or frankly as anyone who cares, I will invariably want to ask Splunk what my application response times are:

index=k8s pod="istio-ingressgateway*"
| stats count, perc50(duration) AS "Median Duration", perc95(duration) AS "95th Percentile Duration" by cluster_name, upstream_cluster
| sort - "95th Percentile Duration"

This table gets me started with analyzing web traffic and the time it takes to serve my gitlab, ghost, and Splunk apps! I can immediately start to drill into customer requests that take large amounts of time to serve! Here's to 8 more years!
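P.S. Since this post doesn't show how those fields actually get extracted, here is a rough search-time sketch of one way you could pull them out with rex. The regex is purely an assumption keyed to the positional layout of the default Envoy/Istio access log format shown above, and the field names (response_code, bytes_received, bytes_sent, duration, upstream_service_time, upstream_cluster) are just my own labels, so treat it as a starting point rather than production config:

index=k8s pod="istio-ingressgateway*"
| rex field=_raw "^\[[^\]]+\]\s+\"[^\"]*\"\s+(?<response_code>\d+)\s+\S+\s+\"[^\"]*\"\s+\"[^\"]*\"\s+(?<bytes_received>\d+)\s+(?<bytes_sent>\d+)\s+(?<duration>\d+)\s+(?<upstream_service_time>\S+)\s+\"[^\"]*\"\s+\"[^\"]*\"\s+\"[^\"]*\"\s+\"[^\"]*\"\s+\"[^\"]*\"\s+(?<upstream_cluster>\S+)"

In real life you would probably bake this into the sourcetype as a proper search-time extraction instead of repeating the rex in every search, but it is handy for poking around.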
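And since I mentioned drilling in - here is a sketch of one way to surface the individual slow requests, flagging anything above the 95th percentile for its own upstream_cluster (this assumes duration and upstream_cluster are already extracted, and p95_duration is just a name I picked for the calculated field):

index=k8s pod="istio-ingressgateway*"
| eval duration=tonumber(duration)
| eventstats perc95(duration) AS p95_duration by upstream_cluster
| where duration > p95_duration
| table _time, upstream_cluster, duration, p95_duration
| sort - duration

eventstats keeps the per-cluster p95 on every event, so the where clause leaves only the outliers worth clicking into.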