I have a pivot set up as follows:
Data source myfirewall
Rows split on source_ip
Columns include bytes_downloaded
and outside_hosts
.
I'd like to only return the top say 5 or 10 outside_hosts
along with their associated bytes_downloaded
. How can I do so? I'm guessing it is somewhere in the manual but I'm getting lost in the terminology.
EDIT to answer @mattness:
Yes, I'm hoping to do this through the GUI. I've written nested searches and the like before and I can go back to doing so here if needed. Hoping not though. I have dozens of varying searches similar to this and it would be extremely useful to allow some of my other coworkers to make their own.
outside_hosts
is a string field containing a domain name or an IP address. source_ip
is a string field containing a source IP address (inside the firewall)bytes_downloaded
is a numeric field representing the number of bytes downloaded in a given sessionEach source_ip
can map to one or more outside_hosts
.
What I need to do is make a chart of the top 10 source_ip
entries based on their sum(bytes_downloaded
). Once that is completed to make the results readable I need a way to only show the top 20 outside_hosts
associated with that given source_ip
.
Using pseudocode for the last bit of logic:
top(sum(`bytes_downloaded` split by `outside_hosts`) ) limit 20.
Ok, I'm a bit unclear about a few things:
column_hosts
field? Is it a string type field? Or is it a numeric field--a count of hosts, perhaps? source_ip
value correspond to a distinct/different outside_hosts
value? If the answer to that last question is 'no' you may want to consider making outside_hosts
the first Split Row element, and move source_ip
to second place.In any case, here's a guess at a solution. Note that this assumes you're using the Pivot Builder interface, not the pivot
command.
outside_hosts
as a second Split Row element. bytes_downloaded
as a Column Value element. When you do this, make sure Value is set to Sum. The resulting column will be labeled Sum of Bytes_Downloaded.outside_hosts
as the attribute. Filter Type: Limit. Limit By: bytes_downloaded
. Limit: Highest 10 sums. This should give you a table that shows the highest 10 outside hosts by sum of bytes downloaded. You can click the top of the Sum of bytes_downloaded column to sort it from highest to lowest.
If this doesn't work for you, I probably need answers to the questions I asked at the top of my reply.
Ok, I'm a bit unclear about a few things:
column_hosts
field? Is it a string type field? Or is it a numeric field--a count of hosts, perhaps? source_ip
value correspond to a distinct/different outside_hosts
value? If the answer to that last question is 'no' you may want to consider making outside_hosts
the first Split Row element, and move source_ip
to second place.In any case, here's a guess at a solution. Note that this assumes you're using the Pivot Builder interface, not the pivot
command.
outside_hosts
as a second Split Row element. bytes_downloaded
as a Column Value element. When you do this, make sure Value is set to Sum. The resulting column will be labeled Sum of Bytes_Downloaded.outside_hosts
as the attribute. Filter Type: Limit. Limit By: bytes_downloaded
. Limit: Highest 10 sums. This should give you a table that shows the highest 10 outside hosts by sum of bytes downloaded. You can click the top of the Sum of bytes_downloaded column to sort it from highest to lowest.
If this doesn't work for you, I probably need answers to the questions I asked at the top of my reply.
Thanks @mattness. I figured that the nested searching might be beyond the pivot GUI at current. If that isn't already on the Splunk roadmap and hopefully close it really aught to be. Half of my searches are set up this way and a way to allow nesting from the pivot interface (as well as cross reference data sources easily) would both be absolutely killer features.
You could set up a third filter (after the time range, and the "Highest 10 Sums" filter) that matches a selected source_ip
value. But it will only return values that are already in the first top 10 list--you won't get 20 values, in other words.
Ok, the second search wasn't part of the original scenario, so that complicates things, as does the fact that each source_ip
can have multiple outside_hosts
. The procedure I outlined should work for the first search, but I don't know at the moment if there's an easy way to fulfill the second "to make results readable" search. You say that you need a way to "only show the top 20 outside_hosts
associated with that given source_ip
." Did you mean to say "a given source_ip
? " If not, how do you arrive at "that" source_ip
value?
Thanks. I've updated my question with (I think) all the relevant answers to your questions. I'm thinking that using a second filter here would apply at the top level, not nested within a first set of criteria.