Solved: Get top 5 field values (by percent) for multiple f...

jluste · ‎09-12-2012

I have a simple need that I cannot solve. For a generic search of source=whatever filter1 filter2 filterx | I want to see for N number of fields, the top, say, 5 values, by percentage (not count).

For example: Say I am looking at a web storefront and want transaction data. Assuming that all fields are reported 100% of the time, data I am interested in is say, top 5 of the following fields; creditCardBrand, webBrowser, shipToCity, orderHour, and ipAddress. For the last 24 hours, source=transactions successful=True orderStatus=Complete shipped=True |

The results I want need to look something like this:



Top cCardBrand Percent  webBrowser Percent  shipToCity Percent orderHour Percent

1   Visa       35.00    MSIE       42.00    Austin     10.00   21        13.56

2   Mastercard 35.00    Chrome     25.23    Boston     9.85    22        13.01

3   Discover   20.00    FireFox    19.50    New York   9.84    18        11.78

4   Amex       10.00    Safari     13.00    Miami      5.54    5         10.52

5                       Opera      00.27    Denver     3.22    20        4.45

NOTE: All 4 of these fields appear and report these percentages on the right-hand side as selected fields. I am merely trying to select some of the fields and report back the top 5 values (percentages) of each. Calculating percents takes a while, so this is fine to be scheduled to run overnight.

Ayn · ‎09-12-2012

You could run top with no limit, then sort by the percent field and grab the first N items in the list.

... | top 0 creditCardBrand webBrowser shipToCity orderHour ipAddress | sort - percent | head 5

EDIT: OK, I misunderstood your original question. You want to grab the top fields by percentage separately. For this my best advice would be to use appendcols. The caveats I can think of is that you will actually need to spawn one search for each field you want to grab the top values by percentage for, and you need to call the percentage fields different names in order to not have them overwrite each other. So something like this should get you going in the right direction:

... | top limit=5 percentfield=ccpercent creditCardBrand | appendcols [search ... | top limit=5 percentfield=browserpercent webBrowser] | appendcols [search ... | top limit=5 percentfield=citypercent shipToCity] | ...

View solution in original post

Ayn · ‎09-12-2012

You could run top with no limit, then sort by the percent field and grab the first N items in the list.

... | top 0 creditCardBrand webBrowser shipToCity orderHour ipAddress | sort - percent | head 5

EDIT: OK, I misunderstood your original question. You want to grab the top fields by percentage separately. For this my best advice would be to use appendcols. The caveats I can think of is that you will actually need to spawn one search for each field you want to grab the top values by percentage for, and you need to call the percentage fields different names in order to not have them overwrite each other. So something like this should get you going in the right direction:

... | top limit=5 percentfield=ccpercent creditCardBrand | appendcols [search ... | top limit=5 percentfield=browserpercent webBrowser] | appendcols [search ... | top limit=5 percentfield=citypercent shipToCity] | ...

jluste · ‎09-13-2012

Awesome. Perfect. This is a closed 100% completed answer. Thanks Ayn!

Ayn · ‎09-13-2012

You can add | table <yourfields> at the end to force a list of fields with the order you want.

jluste · ‎09-13-2012

This worked! I got the results I was looking for. Only issue is that some of the placement is wonky. Example: the orderHour percentage is 4 columns to the left of orderHour and webBrowser percent comes before webBrowser.

Is there an easy way to force the order be maintained?

Ayn · ‎09-12-2012

Ah, I see. Updating my answer.

jluste · ‎09-12-2012

That only gives me one percentage, which is where all of the vales in that row coincide. Example, 0.249% of my transactions were charged to Visa, ordered on MSIE at 11pm and shipped to Austin. It doesn't treat each field value as a separate entity. I need each value's percentage to the search, which means a percentage after every value.

Get top 5 field values (by percent) for multiple fields in a single search or report

Introducing the 2024 SplunkTrust!

Introducing the 2024 Splunk MVPs!

Splunk Custom Visualizations App End of Life