Re: Limit search to top 10 by specific fields

tiny3001 · ‎10-31-2012

Hi everyone

We're using Splunk in a SIEM environment and I have a search that returns all the bad event signatures with a count, sorted by the source department where the bad event signature was picked up.

Like this:

dept     signature     total
1        virus         32768
1        trojan        30000
1        worm          20000
1        adware        12000
2        virus         48234
2        worm          13000
2        trojan        10000

That is obviously a simplified view of what we have. We have hundreds of signatures for bad events per department.

What I'm looking for is a way to take my results and limit them to the top 10 signatures per department.

The moment I introduce the 'top' command to my search, I get skewed results. Logic dictates that what I want to do should be easy, but I'm struggling quite a bit.

This is the search I have at the moment (running it for the full previous month):

index="myindex_summary"
| fields dept, signature, total
| stats sum(total) by signature, dept
| table dept, signature, total
| top 10 total by signature, dept
| sort dept, -total

At the moment I get a lot more than 10 results per dept, but I suspect it's the by clause in the top command that messes it up. Also, I seem to get the correct results if I only do 'top 10 total by dept', but I need the signature in the final search result as well.

Then I have a secondary problem as well. The 'stats' command is obviously limited to 10000 events only, so the 'top' command will only return the top 10 signatures by dept based on those 10000 rows.

Logic then obviously dictates that I do my 'top' before my 'stats' command, but I just can't get it working.

What am I doing wrong?

tiny3001 · ‎11-04-2012

So at the end of the day, the search I wanted to use was just too complex if I wanted it to run on the summary index. I finally created a dashboard view where I basically split up the search results per department.

Now I have a dashboard with 17 tables, each listing the top 10 signatures for the individual department.

I think this search might have worked using a subsearch and searching directly on my main index. At the end of the day, getting the data out of the system for reporting purposes is more important. My method of splitting up the departments into 17 different searches (still using the summary index for speed) just works and we get the data.

Thanks for all the input guys!

alacercogitatus · ‎11-01-2012

Your top command is using total for its "pivot", but I think you want the top number of signatures per dept. Try this:

index="myindex_summary" | fields dept, signature | top 10 signature by dept | sort dept, -count

tiny3001 · ‎11-01-2012

Agreed, but remember that I'm doing this search on a summary index, which includes a total field. I therefore first need a sum of the total field to know which signatures are the top 10, and then somehow I need to pass those top 10 values (for both signature and dept) to the outer search to do stats on. I'm starting to think what I want to do is too complex to do on a summary index. Maybe I should just do it on the main index?

dart · ‎11-01-2012

Try this:
index="myindex_summary" | stats sum(total) as total by signature, dept | sort dept, -total | dedup 10 dept
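
A variant of the same idea, offered as a sketch rather than a tested search: number the rows within each department using streamstats and keep only the first 10. The field name rank is made up for illustration, and the dept, signature and total fields are assumed to be as described in the question. sort 0 is used so that sort does not stop at its default limit of 10000 results.

```
index="myindex_summary"
| stats sum(total) as total by dept, signature
| sort 0 dept, -total
| streamstats count as rank by dept
| where rank <= 10
```

This keeps the rank in the output, which can be handy in a report. If it is not wanted, append | fields - rank.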

tiny3001 · ‎11-02-2012

@dart - It doesn't really matter what the limit is... I'm working with millions of events. The limit of 'top 10 signatures by dept' should be done before I do 'stats'.

dart · ‎11-01-2012

I don't think a subsearch would help you. According to the limits.conf doc page (http://docs.splunk.com/Documentation/Splunk/latest/admin/Limitsconf), stats should return 50000 rows. Can you check whether you have a limits.conf setting restricting you to 10k?

tiny3001 · ‎11-01-2012

I possibly need to consider a subsearch... maybe get the top 10 dept and signature values and then use the results in a subsearch to limit the outer search, upon which I can then do stats. I've been trying all morning to implement a subsearch, though, with no luck so far.

tiny3001 · ‎11-01-2012

While this is a good suggestion, it still leaves me with my secondary problem. The 'dedup' is based on the (limited to) 10000 rows generated by the stats command. I somehow need to do the 'top' or 'dedup 10' commands before the stats, and this is where I'm stuck. Maybe what I'm trying to do isn't possible?

bmcfar000 · ‎01-29-2020

The sort command is capping your results at 10000. Change your sort command to | sort 0 dept, -total.
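
Applied to dart's search from earlier in the thread, that change would look roughly like this (a sketch, assuming the summary events carry the dept, signature and total fields from the original question):

```
index="myindex_summary"
| stats sum(total) as total by signature, dept
| sort 0 dept, -total
| dedup 10 dept
```

The 0 tells sort to return all results instead of the default 10000. dedup 10 dept then keeps the first 10 rows for each dept, which after the sort are the 10 largest totals.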
