I have two indexes: index=intrusion and index=threat_list. index=intrusion consists of IPS event logs that track dropped traffic on our perimeter IPS, and index=threat_list contains the IPs of known threat sources. We would like to correlate IP addresses hitting our perimeter IPS in index=intrusion with IP addresses in index=threat_list, and then map those results every month.
The threat list has more than 2 million IPs per day and the IPS can receive about 3 million "attacks" per month.
This is what we have come up with so far. The problem is that when reporting on these indexes on a monthly basis, the event counts are so high that searches time out, even though we have raised the limits in limits.conf.
Google Maps App for plotting search results:
index=intrusion | join src [search index=threat_list] | stats count as _geo_count by src | geoip src | search _geo=* | stats sum(_geo_count) as _geo_count by _geo
Can anyone suggest a way to make the search more efficient, or a different method (perhaps using lookup, dedup, stats, etc.), so that we can map the results similar to the image below without timing out?
Taking a look at the first part of your search:
index=intrusion | join src [search index=threat_list] | stats count as _geo_count by src
You're essentially looking for source addresses that appear in both indexes, right?
Consider this alternative:
index=intrusion OR index=threat_list | stats count as _geo_count dc(index) as index_count by src | where index_count==2
That should achieve the same thing without a subsearch.
Edit: The count may be different; my example counts a src appearing once per index as two events. If that's a concern, it should be fixable by replacing count with count(eval(index=="intrusion")).
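To make the counting behavior concrete, here is a small Python sketch (not Splunk code) that mimics what the search computes when count is replaced with count(eval(index=="intrusion")). The function name `correlate` and the sample events are made up for illustration, and it assumes every event carries an `index` and a `src` field and that only the two indexes are searched.

```python
from collections import defaultdict

def correlate(events):
    """Rough model of:
    stats count(eval(index=="intrusion")) as _geo_count dc(index) as index_count by src
    | where index_count==2
    """
    attack_count = defaultdict(int)   # intrusion events per src
    indexes_seen = defaultdict(set)   # distinct indexes per src, like dc(index)
    for ev in events:
        src = ev["src"]
        indexes_seen[src].add(ev["index"])
        if ev["index"] == "intrusion":
            attack_count[src] += 1
    # Keep only sources that appear in both indexes.
    return {src: attack_count[src]
            for src, seen in indexes_seen.items() if len(seen) == 2}

# Hypothetical sample data: one listed IP attacking 30 times,
# one unlisted attacker, and one listed IP that never attacked.
events = (
    [{"index": "intrusion", "src": "203.0.113.5"}] * 30
    + [{"index": "threat_list", "src": "203.0.113.5"}]
    + [{"index": "intrusion", "src": "198.51.100.7"}]
    + [{"index": "threat_list", "src": "192.0.2.9"}]
)
print(correlate(events))  # {'203.0.113.5': 30}
```

Only the source present in both indexes survives, and its count reflects the intrusion events alone rather than intrusion plus threat_list events.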
Thank you for all your help
You can do the exact same thing:
(index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*") OR index=threat_list | ...
The NOT src=something might be applicable to both indexes, but won't do much to speed things up.
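As a rough illustration of how the parentheses scope the filters, the Python sketch below (not Splunk code) keeps every threat_list event but applies the host, action, and src conditions only to intrusion events. The function name `keep` and the sample events are hypothetical.

```python
from fnmatch import fnmatch

def keep(ev):
    """Rough model of:
    (index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*") OR index=threat_list
    """
    if ev["index"] == "threat_list":
        return True  # no extra conditions on the threat list side
    return (
        ev["index"] == "intrusion"
        and ev.get("host") == "192.168.1.20"
        and ev.get("action") == "drop"
        and not fnmatch(ev.get("src", ""), "10.1.*")
    )

# Hypothetical sample events.
sample = [
    {"index": "intrusion", "host": "192.168.1.20", "action": "drop", "src": "203.0.113.5"},
    {"index": "intrusion", "host": "192.168.1.20", "action": "drop", "src": "10.1.2.3"},
    {"index": "intrusion", "host": "192.168.1.20", "action": "allow", "src": "203.0.113.5"},
    {"index": "threat_list", "src": "203.0.113.5"},
]
kept = [ev for ev in sample if keep(ev)]
print(len(kept))  # 2
```

The internal 10.1.* source and the non-drop event are filtered out, while the threat_list event passes through untouched so the later stats step can still match on src.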
I'm trying to do something very similar, and trying to optimize my search.
Over a 24-hour timeframe, I hit the subsearch limits.
Using a smaller 2-hour timeframe, I see about 5 IPs after the subsearch filter vs. about 1,130,000 IPs using the dc(index) model, after which I split all of those by about 10 different parameters, and it takes forever to calculate all of them.
Any suggestions on how to improve this?
My current search format is:
` (index=a sourcetype=b ) OR (index=c sourcetype=d action=e)
| eval ipv4=coalesce(ipv4, pattern)
| eval DISTINCTLOCKOUT=if(statement)
| eval DISTINCTELOCKOUT=if(statement)
| eval impacted_username=if(statement)
| eval whitelisted_username=if(statement)
| eval Date=strftime(_time, "%Y/%m/%d")
| eval failed_name=if(activity_status=="FAILED" AND activity_error!="7577",username,null())
| eval success_name=if(statement)
| eval blocked_name=if(statement)
| stats count as TOTAL_COUNT,
    count(this) as UNBLOCKED_TOTAL,
    count(this) as BLOCKED_COUNT,
    count(this) as WHITELISTED_COUNT,
    dc(this) as UNIQUE_WHITELISTED,
    count(eval(activity_status=="SUCCESS")) as SUCCESS_COUNT,
    count(eval(activity_status=="FAILED" AND activity_error!="7577")) as FAILED_COUNT,
    count(eval(this and isnull(that))) as IMPACTED_COUNT,
    dc(this) as UNIQUE_IMPACTED,
    count(eval(this OR that)) as LOCKOUT_COUNT,
    count(eval(this OR that OR that)) as ELOCKOUT_COUNT,
    count(eval(this OR that OR that)) as RANDOM_USER_LOGINS,
    dc(username) as UNIQUE_USER_COUNT,
    dc(failed_name) as FAILED_UNIQUE,
    dc(success_name) as SUCCESS_UNIQUE,
    dc(blocked_name) as BLOCKED_UNIQUE,
    min(_time) as FIRST_TIME,
    max(_time) as LAST_TIME,
    min(eval(if(statement))) as BLOCKED_TIME,
    dc(DISTINCTLOCKOUT) as "DISTINCTLOCKOUT_COUNT"
    by ipv4
| some prettyprint stuff `
Thanks in advance for any pointers.
OK, I adjusted my view settings and was able to see the field you indicated, but what if I wanted to do some additional filtering? For instance, when we use the join command, our search usually looks something like this:
index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*" | join src [search index=threat_list] | stats count as _geo_count by src
How can I accomplish the same by modifying the search you sent?
The underscored field name may get hidden depending on your view.
Ok, that is perhaps my misunderstanding. I haven't seen _geo_count only index_count. Thank you for the clarification.
It'd give you a row for that with fields src, _geo_count, and index_count, where index_count would indeed be 2 and _geo_count would be 30.
What happens if one IP attacks 30 times in one month? Wouldn't it only tell me that the IP exists in both indexes (index_count==2)?
What's the difference?
This seems to only tell me if the IP occurred in both indexes but not the attack count from each IP on the threat list against my IPS throughout the entire month.
Can you explain a little more about the threat_list index? Does it only contain IPs? Do you need to join the entire threat_list event to the intrusion event when the source IP matches? Honestly, depending on your setup, this sounds like a good candidate situation for using a database lookup instead of joining two indexes. We use a database lookup to accomplish a similar task.
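The lookup idea can be sketched in Python (not Splunk code) as follows: the threat list is loaded once into a key-value table, and each intrusion event is checked against it, so the threat list never has to be searched as events. The function name `count_threat_hits`, the field names, and the sample data are assumptions for illustration, not the actual lookup configuration.

```python
from collections import Counter

def count_threat_hits(intrusion_events, threat_lookup):
    """Count intrusion events per src, keeping only sources found in the lookup.

    threat_lookup maps an IP string to its threat type (hypothetical schema).
    """
    hits = Counter(
        ev["src"] for ev in intrusion_events if ev["src"] in threat_lookup
    )
    # Attach the looked-up threat type to each matching source.
    return {src: (n, threat_lookup[src]) for src, n in hits.items()}

# Hypothetical sample data.
threat_lookup = {"203.0.113.5": "scanning host", "192.0.2.9": "botnet"}
intrusion_events = (
    [{"src": "203.0.113.5"}] * 30 + [{"src": "198.51.100.7"}] * 4
)
print(count_threat_hits(intrusion_events, threat_lookup))
# {'203.0.113.5': (30, 'scanning host')}
```

The per-event check is a constant-time dictionary lookup, which is the main reason a lookup tends to scale better than joining two large event sets.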
Yes, the threat list index currently contains an IP and threat type, for instance src=192.168.x.x threat_type=scanning host. I want to know if an IP on the threat list attacked our IPSs at any time throughout the month, so I think the answer is yes, we have to see the entire threat list.