I am pushing latency histogram data to spulk. i.e
under time range : Time range (sec)
5 : 1
2 : 2
1 : 3
above histogram is created and dumped by client from individual entries of latency -> Latency under 1 Sec (5 times), 2 sec (2 times), 3 sec (1 time) : 1,1,1,1,1,2,2,3
If i have these individual records in a field at splunk then it is easy to find out the per99() by just applying this function on that field. But i do not have individual records and have buckets as mentioned above.
What is the best way to calculate the percentile to give the same result as it should give by applying per99() on individual entries (e.g. 1,1,1,1,1,2,2,3) ?
Can you try something like this with your data:
| makeresults | eval ph="0,0,0,0,0,0,0,1,0,34,66,66,64,68,60,79,7374,13812,0,0,0,0,0,0,0,0,0,0" | eval times="0,1,2,3,4,5,6,7,8,9,10,20,30,40,50,60,70,80,90,100,200,300,400,500,600,700,800,900" | makemv ph delim="," | makemv times delim="," | eval z=mvzip(times, ph) | mvexpand z | rex field=z "(?<t>[^\,]+)\,(?<ph>.*)" | table _time t ph | chart values(ph) as ph over t by _time
but i do not have common seperated list of numbers bu i have histogram i want to get this comma seperated list on the basis of count.
E.g. I have
5 : 1 (1, 5 times)
2 : 2 (2, 2 times)
1 : 3 (3, 1 times)
and want 1,1,1,1,2,2,3
| makeresults | eval ph="5 : 1, 2 : 2, 1 : 3" | makemv ph delim="," | mvexpand ph | rex field=ph "(?P\d+) : (?P\d+)" | table _time t ph | chart values(ph) as ph over t by _time