Splunk Search

How to calculate percentrank in Splunk?

LearningGuy
Builder

How to calculate percentrank in Splunk?
I appreciate your help

Below is the expected result:   Percentrank exc and Percentrank inc are excel functions.

StudentScorePercentrank excPercentrank inc
Student 11091%100%
Student 2982%89%
Student 3873%78%
Student 4764%67%
Student 5655%56%
Student 6545%44%
Student 7436%33%
Student 8327%22%
Student 9218%11%
Student 1019%0%
Labels (1)
0 Karma
1 Solution

bowesmana
SplunkTrust
SplunkTrust

This example will calculate those ranks from the base data of Student+Score, which uses eventstats to build the collection of scores (stats list) and then mvfind to find the position in the list and then calculate rank.

| makeresults count=10
| fields - _time
| streamstats c as Score
| eval Student="Student ".(11 - Score)
| table Student Score
``` Above simulates your data ```
``` Generate list of scores and find position in results ```
| sort Score
| eventstats count list(Score) as Scores
| eval pos=mvfind(Scores, "^".Score."$")
``` Now calculate ranks ```
| eval Rank_Inc=round(pos/(count-1)*100, 0)
| eval Rank_Exc=round((pos+1)/(count+1)*100, 0)
| fields - Scores pos count

The bit from sort score is what you want 

View solution in original post

LearningGuy
Builder

Hello,
I tried you suggestion and it was working. I accepted this as a solution.
I have few questions:
1) Is there a way to move "sort" command into eventstats, so we don't have 2 lines?
2) Is it possible to do this calculation without using mvfind?

Thank you so much

0 Karma

bowesmana
SplunkTrust
SplunkTrust

Good questions.

You can't get the sorted order inside eventstats. The list(Score) will put them in the order they are found, so they will not be sorted, and mvsort will not sort numerically, so cannot be used, so the mvfind will not get the correct position. 

The other issue with list(Score) is that it can only cope with 100 values, so it will fail at that point.

As to whether there is an alternate solution, the following is probably a better option as it does not have the limitations of list() and does not require mvfind. It may be more efficient.

| makeresults count=10
| fields - _time
| streamstats c as Score
| eval Student="Student ".(11 - Score)
| table Student Score
``` Above simulates your data ```
``` Generate list of scores and find position in results ```
| sort Score
| streamstats count as pos
| eventstats count 
``` Now calculate ranks ```
| eval Rank_Inc=round((pos-1)/(count-1)*100, 0)
| eval Rank_Exc=round((pos+0)/(count+1)*100, 0)
| fields - Scores pos count

You still have to sort the scores and it uses streamstats to identify position (rather than mvfind).

I think there may be a difference in behaviour when there are multiple students with the same score. Using mvfind would always find the position as the first instance of that score, whereas using streamstats as above it would use the user's position.

However, you could probably solve that issue.

LearningGuy
Builder

@bowesmana 
How do I display both the score and the percentRankInc in a "single value pane"  like below?
Thanks

CloudGuy_0-1700071517459.png

 

0 Karma

bowesmana
SplunkTrust
SplunkTrust

You can't do it with a single value viz, but I suggest using Chris Younger's excellent number display viz - https://splunkbase.splunk.com/app/4537

bowesmana_0-1700084331300.png

 

See this example that shows you how you can do it

<dashboard>
  <label>single</label>
  <row>
    <panel>
      <viz type="number_display_viz.number_display_viz">
        <search>
          <query>| makeresults
| eval title="Score"
| eval value=7
| eval subtitle="67%"
| fields - _time</query>
          <earliest>-15m</earliest>
          <latest>now</latest>
          <sampleRatio>1</sampleRatio>
        </search>
        <option name="drilldown">none</option>
        <option name="number_display_viz.number_display_viz.bordercolor">#ffffff</option>
        <option name="number_display_viz.number_display_viz.bordersize">2</option>
        <option name="number_display_viz.number_display_viz.colorprimary">#000000</option>
        <option name="number_display_viz.number_display_viz.colorprimarymode">auto</option>
        <option name="number_display_viz.number_display_viz.colorsecondary">#000000</option>
        <option name="number_display_viz.number_display_viz.colorsecondarymode">darker1</option>
        <option name="number_display_viz.number_display_viz.max">100</option>
        <option name="number_display_viz.number_display_viz.min">0</option>
        <option name="number_display_viz.number_display_viz.nodatacolor">#0178c7</option>
        <option name="number_display_viz.number_display_viz.padding">10</option>
        <option name="number_display_viz.number_display_viz.pulserate">4</option>
        <option name="number_display_viz.number_display_viz.shadowcolor">#F2F4F5</option>
        <option name="number_display_viz.number_display_viz.shapebordercolor">#FFFFFF</option>
        <option name="number_display_viz.number_display_viz.shapebordercolormode">static</option>
        <option name="number_display_viz.number_display_viz.shapebordersize">1</option>
        <option name="number_display_viz.number_display_viz.shapedropcolor">#ffffff</option>
        <option name="number_display_viz.number_display_viz.shapeshadow">yes</option>
        <option name="number_display_viz.number_display_viz.shapetexture">solid</option>
        <option name="number_display_viz.number_display_viz.sparkHeight">30</option>
        <option name="number_display_viz.number_display_viz.sparkWidth">90</option>
        <option name="number_display_viz.number_display_viz.sparkalign">5</option>
        <option name="number_display_viz.number_display_viz.sparkalignv">70</option>
        <option name="number_display_viz.number_display_viz.sparkcolorfill">#009DD9</option>
        <option name="number_display_viz.number_display_viz.sparkcolorline">#0178c7</option>
        <option name="number_display_viz.number_display_viz.sparkcolormodefill">auto</option>
        <option name="number_display_viz.number_display_viz.sparkcolormodeline">auto</option>
        <option name="number_display_viz.number_display_viz.sparkmin">0</option>
        <option name="number_display_viz.number_display_viz.sparknulls">gaps</option>
        <option name="number_display_viz.number_display_viz.sparkorder">bg</option>
        <option name="number_display_viz.number_display_viz.sparkstyle">area</option>
        <option name="number_display_viz.number_display_viz.spinnerspeedmax">20</option>
        <option name="number_display_viz.number_display_viz.spinnerspeedmin">1</option>
        <option name="number_display_viz.number_display_viz.style">a7</option>
        <option name="number_display_viz.number_display_viz.subtitlealign">center</option>
        <option name="number_display_viz.number_display_viz.subtitlealignv">70</option>
        <option name="number_display_viz.number_display_viz.subtitlecolor">#5C6773</option>
        <option name="number_display_viz.number_display_viz.subtitlecolormode">static</option>
        <option name="number_display_viz.number_display_viz.subtitledrop">yes</option>
        <option name="number_display_viz.number_display_viz.subtitledropcolor">#ffffff</option>
        <option name="number_display_viz.number_display_viz.subtitlesize">40</option>
        <option name="number_display_viz.number_display_viz.textalign">center</option>
        <option name="number_display_viz.number_display_viz.textalignv">50</option>
        <option name="number_display_viz.number_display_viz.textcolor">#000000</option>
        <option name="number_display_viz.number_display_viz.textdrop">yes</option>
        <option name="number_display_viz.number_display_viz.textdropcolor">#ffffff</option>
        <option name="number_display_viz.number_display_viz.textduration">300</option>
        <option name="number_display_viz.number_display_viz.textmode">static</option>
        <option name="number_display_viz.number_display_viz.textprecision">1</option>
        <option name="number_display_viz.number_display_viz.textsize">100</option>
        <option name="number_display_viz.number_display_viz.textunitposition">after</option>
        <option name="number_display_viz.number_display_viz.textunitsize">50</option>
        <option name="number_display_viz.number_display_viz.thickness">50</option>
        <option name="number_display_viz.number_display_viz.thresholdcol1">#1a9035</option>
        <option name="number_display_viz.number_display_viz.thresholdcol2">#d16f18</option>
        <option name="number_display_viz.number_display_viz.thresholdcol3">#b22b32</option>
        <option name="number_display_viz.number_display_viz.thresholdcol4">#ffffff</option>
        <option name="number_display_viz.number_display_viz.thresholdcol5">#ffffff</option>
        <option name="number_display_viz.number_display_viz.thresholdcol6">#ffffff</option>
        <option name="number_display_viz.number_display_viz.thresholdsize">20</option>
        <option name="number_display_viz.number_display_viz.thresholdval2">70</option>
        <option name="number_display_viz.number_display_viz.thresholdval3">90</option>
        <option name="number_display_viz.number_display_viz.titlealign">center</option>
        <option name="number_display_viz.number_display_viz.titlealignv">30</option>
        <option name="number_display_viz.number_display_viz.titlecolor">#5C6773</option>
        <option name="number_display_viz.number_display_viz.titlecolormode">static</option>
        <option name="number_display_viz.number_display_viz.titledrop">yes</option>
        <option name="number_display_viz.number_display_viz.titledropcolor">#ffffff</option>
        <option name="number_display_viz.number_display_viz.titlesize">45</option>
        <option name="refresh.display">progressbar</option>
        <option name="trellis.enabled">0</option>
        <option name="trellis.scales.shared">1</option>
        <option name="trellis.size">medium</option>
      </viz>
    </panel>
  </row>
</dashboard>

 

LearningGuy
Builder

@bowesmana 
1. Should I download and install "Number Display Viz" app first in Splunk before applying your commands?
2. Will it work in Dashboard Studio?  It looks like your commands are XML format

3.  Also, I got this error when applying your solutions since the number of students are more than 100
'list' command: Limit of '100' for values reached. Additional values may have been truncated or ignored.

Please suggest.. I appreciate your help.. Thank you

0 Karma

bowesmana
SplunkTrust
SplunkTrust

and no, it's NOT Dashboard Studio compatible - I steer well clear of DS for the time being as XML is so much more flexible at the current time.

bowesmana
SplunkTrust
SplunkTrust

That was an XML example dashboard and is dependent on you installing the number display viz - otherwise it will not work at all.

As for 100 values issue - I mentioned that would be a problem is you had more than 100 results - one of the other posts showed an alternative solution that did not need to use list(), but used streamstats+eventstats.

Note that it also highlighted a difference in behaviour when having scores of the same value in that percentrank would be different for each of the students with the same score.

 

LearningGuy
Builder

Thank you for your assistance
1) Since I am using DS, do you think it's doable if I just display two numbers on two separate "single value" box?

2)  Is this  the alternative solution?   
Can  you please help translate it with the current case (plus percentile_Inc)?   
https://community.splunk.com/t5/Splunk-Search/Is-there-a-way-to-calculate-the-percentile-of-a-value-...

| stats count by value | sort + value
    | streamstats current=f sum(count) as rank | fillnull rank
    | eventstats sum(count) as total | eval percentile_rank = rank / total * 100

 
3) Can I use perc<percentage>(<value>)  or upperperc(<value>,<percentile>) to solve this ?
https://docs.splunk.com/Documentation/SCS/current/SearchReference/Aggregatefunctions

0 Karma

bowesmana
SplunkTrust
SplunkTrust

With DS you can make two boxes - but you can use a single search and use the same search as a chained search for each box.

My other solution was posted in this thread last Friday - see post containing...

As to whether there is an alternate solution, the following is probably a better option as it does not have the limitations of list() and does not require mvfind. It may be more efficient.

Those links to the Aggregate functions are SPL2, but you can't use percentiles because rank is somewhat different to the percentiles.

LearningGuy
Builder

Hello,
I tried your alternative solution, it worked fine, but if there are 2 similar scores, it doesn't give the same rank for the 2 students as in your first solution.

Your alternative solutions + data

| makeresults format=csv data="Student, Score  
a,100 
b,95
c,84
d,73
e,73
f,54
g,43
h,37
i,22
j,12"
| sort Score
| streamstats count as pos 
| eventstats count
| eval Rank_Inc=round((pos-1)/(count-1)*100, 0)
| eval Rank_Exc=round((pos+1)/(count+1)*100, 0)
| table Student, Score, count, pos, Rank_Exc, Rank_Inc
| sort - Score


I will need to add the following searches by ITWhisperer to give the same rank
| streamstats window=2 range(Score) as range
| eval pos=if(pos=1 OR range != 0, pos, null())
| filldown pos

The final searches are too long, which contains 2 streamstats and 1 eventstats.
Do you know if there's a way to shorten this?   
I appreciate your assistance.  Thanks

Final solution

| makeresults format=csv data="Student, Score  
a,100 
b,95
c,84
d,73
e,73
f,54
g,43
h,37
i,22
j,12"
| sort Score
| streamstats count as pos
| eventstats count
| streamstats window=2 range(Score) as range
| eval pos=if(pos=1 OR range != 0, pos, null())
| filldown pos
| eval Rank_Inc=round((pos-1)/(count-1)*100, 0)
| eval Rank_Exc=round((pos+1)/(count+1)*100, 0)
| table Student, Score, count, pos, Rank_Exc, Rank_Inc
| sort - Score



 

0 Karma

bowesmana
SplunkTrust
SplunkTrust

What do you mean by too long? Too many lines or performance?

You can combine eval statements, i.e.

| eval Rank_Inc=round((pos-1)/(count-1)*100, 0), Rank_Exc=round((pos+1)/(count+1)*100, 0)

 From a performance point of view, as soon as you sort data, it will be running on the search head and both eventstats and streamstats must run on the search head, but your data set should be pretty small at this point regardless of how many students you have, you only need Student and Score on the search head, so if it's performance, make sure you do a fields statement to limit the fields before the sort.

0 Karma

bowesmana
SplunkTrust
SplunkTrust

This example will calculate those ranks from the base data of Student+Score, which uses eventstats to build the collection of scores (stats list) and then mvfind to find the position in the list and then calculate rank.

| makeresults count=10
| fields - _time
| streamstats c as Score
| eval Student="Student ".(11 - Score)
| table Student Score
``` Above simulates your data ```
``` Generate list of scores and find position in results ```
| sort Score
| eventstats count list(Score) as Scores
| eval pos=mvfind(Scores, "^".Score."$")
``` Now calculate ranks ```
| eval Rank_Inc=round(pos/(count-1)*100, 0)
| eval Rank_Exc=round((pos+1)/(count+1)*100, 0)
| fields - Scores pos count

The bit from sort score is what you want 

Get Updates on the Splunk Community!

Index This | Forward, I’m heavy; backward, I’m not. What am I?

April 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

A Guide To Cloud Migration Success

As enterprises’ rapid expansion to the cloud continues, IT leaders are continuously looking for ways to focus ...

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...