Splunk Search

Get top 5 field values (by percent) for multiple fields in a single search or report

jluste
Path Finder

I have a simple need that I cannot solve. For a generic search of source=whatever filter1 filter2 filterx | I want to see for N number of fields, the top, say, 5 values, by percentage (not count).

For example: Say I am looking at a web storefront and want transaction data. Assuming that all fields are reported 100% of the time, data I am interested in is say, top 5 of the following fields; creditCardBrand, webBrowser, shipToCity, orderHour, and ipAddress. For the last 24 hours, source=transactions successful=True orderStatus=Complete shipped=True |

The results I want need to look something like this:


Top cCardBrand Percent webBrowser Percent shipToCity Percent orderHour Percent
1 Visa 35.00 MSIE 42.00 Austin 10.00 21 13.56
2 Mastercard 35.00 Chrome 25.23 Boston 9.85 22 13.01
3 Discover 20.00 FireFox 19.50 New York 9.84 18 11.78
4 Amex 10.00 Safari 13.00 Miami 5.54 5 10.52
5 Opera 00.27 Denver 3.22 20 4.45


NOTE: All 4 of these fields appear and report these percentages on the right-hand side as selected fields. I am merely trying to select some of the fields and report back the top 5 values (percentages) of each. Calculating percents takes a while, so this is fine to be scheduled to run overnight.

1 Solution

Ayn
Legend

You could run top with no limit, then sort by the percent field and grab the first N items in the list.

... | top 0 creditCardBrand webBrowser shipToCity orderHour ipAddress | sort - percent | head 5

EDIT: OK, I misunderstood your original question. You want to grab the top fields by percentage separately. For this my best advice would be to use appendcols. The caveats I can think of is that you will actually need to spawn one search for each field you want to grab the top values by percentage for, and you need to call the percentage fields different names in order to not have them overwrite each other. So something like this should get you going in the right direction:

... | top limit=5 percentfield=ccpercent creditCardBrand | appendcols [search ... | top limit=5 percentfield=browserpercent webBrowser] | appendcols [search ... | top limit=5 percentfield=citypercent shipToCity] | ...

View solution in original post

Ayn
Legend

You could run top with no limit, then sort by the percent field and grab the first N items in the list.

... | top 0 creditCardBrand webBrowser shipToCity orderHour ipAddress | sort - percent | head 5

EDIT: OK, I misunderstood your original question. You want to grab the top fields by percentage separately. For this my best advice would be to use appendcols. The caveats I can think of is that you will actually need to spawn one search for each field you want to grab the top values by percentage for, and you need to call the percentage fields different names in order to not have them overwrite each other. So something like this should get you going in the right direction:

... | top limit=5 percentfield=ccpercent creditCardBrand | appendcols [search ... | top limit=5 percentfield=browserpercent webBrowser] | appendcols [search ... | top limit=5 percentfield=citypercent shipToCity] | ...

jluste
Path Finder

Awesome. Perfect. This is a closed 100% completed answer. Thanks Ayn!

0 Karma

Ayn
Legend

You can add | table <yourfields> at the end to force a list of fields with the order you want.

jluste
Path Finder

This worked! I got the results I was looking for. Only issue is that some of the placement is wonky. Example: the orderHour percentage is 4 columns to the left of orderHour and webBrowser percent comes before webBrowser.

Is there an easy way to force the order be maintained?

0 Karma

Ayn
Legend

Ah, I see. Updating my answer.

0 Karma

jluste
Path Finder

That only gives me one percentage, which is where all of the vales in that row coincide. Example, 0.249% of my transactions were charged to Visa, ordered on MSIE at 11pm and shipped to Austin. It doesn't treat each field value as a separate entity. I need each value's percentage to the search, which means a percentage after every value.

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...