We're using Splunk in a SIEM environment and I have a search that returns all the bad event signatures with a count, sorted by the source department where the bad event signature was picked up.
dept, signature, total 1 virus 32768 1 trojan 30000 1 worm 20000 1 adware 12000 2 virus 48234 2 worm 13000 2 trojan 10000
That is obviously a simplified view of what we have. We have 100's of signatures for bad events per department.
What I'm looking for is taking my results and limiting it to the top 10 signatures per department.
The moment I introduce the 'top' command to my search, I get skewed results. Logic dictates that what I want to do should be easy, but I'm struggling quite a bit.
This is the search I have at the moment (running it for the full previous month):
index="myindex_summary" | fields dept, signature, total | stats sum(total) by signature, dept | table dept, signature, total | top 10 total by signature, dept | sort dept, -total
At the moment I get a lot more than 10 results per dept, but I suspect it's the by clause in the top command that messes it up. Also, I seem to get the correct results if I only do 'top 10 total by dept', but I need the signature in the final search result as well.
Then I have a secondary problem as well. The 'stats' command is obviously limited to 10000 events only, so the 'top' command will only return the top 10 signatures by dept based on those 10000 rows.
Logic then obviously dictates that I do my 'top' before my 'stats' command, but I just can't get it working.
What am I doing wrong?
So at the end of the day, the search I wanted to use, was just too complex if I wanted it to run on the index_summary. I finally created a dashboard view where I basically split up the search results per department.
Now I have a dashboard with 17 tables, each listing the top 10 signatures for the individual department.
I think this search might have worked using a subsearch and searching directly on my main index. At the end of the day, getting the data out of the system for reporting purposes are more important. My method of splitting up the departments into 17 different searches (still using the index summary for speed) just works and we get the data.
Thanks for all the input guys!
Your top command is using total for its "pivot", but I think you want the top number of signatures per dept. Try this:
| fields dept, signature
| top 10 signature by dept
| sort dept, -count
Agreed, but remember that I'm doing this search on a summary index, which includes a total field. I therefore first need a sum of the total field to know which signatures are the top 10, and then somehow I need to pass those top 10 values (for both signature and dept) to the outer search to do stats on. I'm starting to thing what I want to do is too complex to do on a summary index? Maybe I should just do it on the main index?
I don't think a subsearch would help you. According to limits.conf doc page http://docs.splunk.com/Documentation/Splunk/latest/admin/Limitsconf stats should return 50000 rows. Can you check to see if you have a limits.conf setting restricting you to 10k?
I possibly need to consider a subsearch... maybe get the Top 10 dept and signature and then use the results in a subsearch to limit the outer search, upon which I can then do stats. Been trying all morning to implement a subsearch though, no luck so far
While this is a good suggestion, it still leaves me with my secondary problem. The 'dedup' is based on the (limited to) 10000 rows generated by the stats command. I somehow need to do the 'top' or 'dedup 10' commands before the stats, and this is where I'm stuck. Maybe what I'm trying to do isn't possible?