I'm trying to do something very similar and to optimize my search.
Over a 24-hour timeframe, I hit the subsearch limits.
Using a smaller 2-hour timeframe, I see the difference is something like 5 IPs after the subsearch filter versus about 1,130,000 IPs using the dc(index) model. After that I split all of those by about 10 different parameters, and the calculations on all of them take forever.
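For context, here is roughly what I mean by the two approaches (index, sourcetype, and field names are the same simplified placeholders I use below, not my real values). The subsearch-filter version restricts the outer search to IPs returned by a subsearch, which is what runs into the subsearch limits:

```
(index=a sourcetype=b)
    [ search index=c sourcetype=d action=e
      | stats count by ipv4
      | fields ipv4 ]
```

The dc(index) model instead pulls both datasets and keeps IPs seen in more than one index, which avoids the subsearch limit but carries every IP through stats:

```
(index=a sourcetype=b) OR (index=c sourcetype=d action=e)
| stats dc(index) as index_count by ipv4
| where index_count > 1
```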
Any suggestions on how to improve this?
My current search format is:
```
(index=a sourcetype=b) OR (index=c sourcetype=d action=e)
| eval ipv4=coalesce(ipv4, pattern)
| eval DISTINCTLOCKOUT=if(statement)
| eval DISTINCTELOCKOUT=if(statement)
| eval impacted_username=if(statement)
| eval whitelisted_username=if(statement)
| eval Date=strftime(_time, "%Y/%m/%d")
| eval failed_name=if(activity_status=="FAILED" AND activity_error!="7577", username, null())
| eval success_name=if(statement)
| eval blocked_name=if(statement)
| stats count as TOTAL_COUNT,
    count(this) as UNBLOCKED_TOTAL,
    count(this) as BLOCKED_COUNT,
    count(this) as WHITELISTED_COUNT,
    dc(this) as UNIQUE_WHITELISTED,
    count(eval(activity_status=="SUCCESS")) as SUCCESS_COUNT,
    count(eval(activity_status=="FAILED" AND activity_error!="7577")) as FAILED_COUNT,
    count(eval(this AND isnull(that))) as IMPACTED_COUNT,
    dc(this) as UNIQUE_IMPACTED,
    count(eval(this OR that)) as LOCKOUT_COUNT,
    count(eval(this OR that OR that)) as ELOCKOUT_COUNT,
    count(eval(this OR that OR that)) as RANDOM_USER_LOGINS,
    dc(username) as UNIQUE_USER_COUNT,
    dc(failed_name) as FAILED_UNIQUE,
    dc(success_name) as SUCCESS_UNIQUE,
    dc(blocked_name) as BLOCKED_UNIQUE,
    min(_time) as FIRST_TIME,
    max(_time) as LAST_TIME,
    min(eval(if(statement))) as BLOCKED_TIME,
    dc(DISTINCTLOCKOUT) as "DISTINCTLOCKOUT_COUNT"
    by ipv4
| some prettyprint stuff
```
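To keep the post readable I've replaced the actual conditions with if(statement) and this/that placeholders. Each one is a simple conditional of roughly this shape (the field name and value here are made-up examples, not my real logic):

```
| eval DISTINCTLOCKOUT=if(activity_error=="50053", username, null())
```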
Thanks in advance for any pointers.