I've been pulling my hair out trying to do what seems like a basic task:
Given a log of requests with dates and source IP addresses, show the top 10 IPs making requests each day.
In other words, I'm after a graph sorted by date which shows the top 10 SourceIPs for each day and the number of requests each SourceIP made. This seems like an extremely simple task, and yet I'm baffled at how to do it. I've managed to get the top 10 SourceIPs for the whole time range, but I want them plotted against each day, i.e. for each day there should be 10 columns, one for each IP, with the number of requests being the height of the column. It would seem that I just need to search for the top 10 IPs and then graph by date_mday, but that doesn't seem to work.
My current search (for the top 10 over all time) is as follows: eventtype="Request" | top limit=10 SourceIP
Any ideas? I simply can't believe that such a simple function seems utterly impossible to implement.
Edit: It seems I'm after multi-series graphing, so I need it to generate 10 different series and graph them by day.
The obvious search is something like:
eventtype=Request | timechart count by SourceIP limit=10
The problem with this is that it shows the top 10 globally, not the top 10 per day. The problem with "per-day" is that every day could have 10 completely different top SourceIPs, so for a 30-day month you may need up to 300 series.
If you really want to calculate per day, it's something more like:
eventtype=Request | bin span=1d _time | stats count by _time SourceIP | sort - _time count | dedup 10 _time
So this will give you, per-day, the top 10 SourceIP,count pairs (using count). To make this into a chart, you could add:
| timechart span=1d sum(count) by SourceIP limit=1000
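For anyone who finds the pipeline above hard to follow, here is a rough sketch of the same per-day top-N logic in plain Python. It is not Splunk code, and it only illustrates the idea: bucket events by day, count per (day, SourceIP), then keep the N largest counts within each day. The function name `top_ips_per_day`, the sample events, and the ISO timestamp format are all assumptions made for this illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime


def top_ips_per_day(events, n=10):
    """Return {day: [(ip, count), ...]} with at most n entries per day.

    `events` is an iterable of (iso_timestamp, source_ip) pairs; this input
    shape is an assumption for the sketch, not something Splunk produces.
    """
    per_day = defaultdict(Counter)
    for timestamp, ip in events:
        day = datetime.fromisoformat(timestamp).date()  # like: bin span=1d _time
        per_day[day][ip] += 1                           # like: stats count by _time SourceIP
    # like: sort by count descending, then keep the first n rows for each day
    return {day: counts.most_common(n) for day, counts in sorted(per_day.items())}


# Hypothetical sample data
events = [
    ("2024-01-01T09:00:00", "10.0.0.1"),
    ("2024-01-01T09:05:00", "10.0.0.1"),
    ("2024-01-01T10:00:00", "10.0.0.2"),
    ("2024-01-02T08:00:00", "10.0.0.3"),
    ("2024-01-02T08:30:00", "10.0.0.3"),
    ("2024-01-02T11:00:00", "10.0.0.1"),
]

for day, top in top_ips_per_day(events, n=1).items():
    print(day, top)
```

Note that the top IP differs from day to day in this sample, which is why the final chart can end up with many more than N series overall.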
Fantastic. Exactly what I was after, though how a person new to Splunk is supposed to know such a confusing array of commands is beyond me. There need to be some tutorials and a decent help section, not the mish-mash that the current help section is.
Using access logs and extracting just the IP address over time, the answer above does not work and returns no results.
timechart span=1d sum(count) by SourceIP limit=1000
| timechart by SourceIP
does work as expected and outputs the range of IP addresses seen by Apache. However, these are not sorted by the most-visited IP address.