Why is my stats by command so slow and how can I speed it up for longer time intervals?

timbCFCA · ‎09-20-2017

I'm working on some statistics related queries. I'm trying to get the security id, date and count of hosts connected to.

index=wineventlog sourcetype="WinEventLog:Security" 4624
| fields host, Security_ID, _time
| bucket _time span=1d
| stats dc(host) by Security_ID, _time

The search works fine until I add Security_ID to the by clause. With no by clause, or when grouping only by _time, it's fast.

I also tried adding dedup Security_ID, _time, host before the stats dc() command, but it didn't improve the overall speed.

It takes well over 10 minutes to complete this search for a week of data, and I'd like to be able to run it for 30, 60, or 90 days. What do I need to do for that to be viable?

lfedak_splunk · ‎09-21-2017

Heya @timbCFCA, if DalJeanis solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!

worshamn · ‎09-20-2017

Seems puzzling. I do see that Security_ID gets used as a source key several times in transforms.conf in the TA, and I wonder if that causes any overhead. Maybe try specifying EventCode=4624 so that it isn't searching through all fields looking for 4624.
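
A sketch of that suggestion, assuming the add-on extracts an EventCode field for this sourcetype (worth confirming in your environment before relying on it). Only the search term changes; the rest is the original query:

```
index=wineventlog sourcetype="WinEventLog:Security" EventCode=4624
| fields host, Security_ID, _time
| bucket _time span=1d
| stats dc(host) by Security_ID, _time
```

Comparing the run time in the Job Inspector against the bare 4624 version should show whether the field filter makes any difference.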

DalJeanis · ‎09-20-2017

How often are you running the search? If you are running it fairly often, then you might consider a summary index so you don't have to re-spin the whole search multiple times a day.
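
A rough sketch of the summary index idea, not a tested setup. It assumes a search scheduled once a day over the previous day, writing to the default summary index; the source name daily_logon_hosts is a made-up label used only to find the rows again later:

```
index=wineventlog sourcetype="WinEventLog:Security" 4624 earliest=-1d@d latest=@d
| bucket _time span=1d
| stats count by Security_ID, host, _time
| collect index=summary source="daily_logon_hosts"
```

The 30, 60 or 90 day report would then read the much smaller summary rows instead of the raw events:

```
index=summary source="daily_logon_hosts"
| bucket _time span=1d
| stats dc(host) by Security_ID, _time
```

The summary keeps one row per Security_ID, host and day rather than a pre-computed distinct count, so the distinct counts can still be recomputed correctly over any longer range.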

Try this -

index=wineventlog sourcetype="WinEventLog:Security" 4624
| fields host, Security_ID, _time
| eval _time=floor(_time/86400)*86400
| dedup host, Security_ID, _time
| stats dc(host) by Security_ID, _time

Explanation -

eval is streaming and distributed, while bin is not. This way, the binning can be done at the individual indexers.

Try it with and without the dedup and see what happens.
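
For comparison, the version without the dedup would look something like this. Since dc(host) already counts each host once per Security_ID and day, the results should be the same and only the run time should differ:

```
index=wineventlog sourcetype="WinEventLog:Security" 4624
| fields host, Security_ID, _time
| eval _time=floor(_time/86400)*86400
| stats dc(host) by Security_ID, _time
```

One caveat: rounding to multiples of 86400 seconds puts the day boundaries at midnight UTC, so if your time zone is not UTC the daily buckets may not line up exactly with what bucket _time span=1d produces.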

timbCFCA · ‎09-25-2017

@DalJeanis not really. It did let me save as an accelerated search which helps.

DalJeanis · ‎09-26-2017

@timbCFCA - Well, that's a decent consolation prize.

Did you try it without the dedup, which is redundant with the dc()?

DalJeanis · ‎09-25-2017

@timbCFCA - Did this change the run time at all?
