IP correlation and mapping

splunkingsplun1
Explorer

I have two indexes: index=intrusion and index=threat_list. index=intrusion consists of IPS event logs that track dropped traffic on our perimeter IPS; index=threat_list contains the IPs of known threat sources. We would like to correlate IP addresses hitting our perimeter IPS in index=intrusion with IP addresses in index=threat_list, and then map those results every month.

The threat list has more than 2 million IPs per day and the IPS can receive about 3 million "attacks" per month.

This is what we have come up with so far, but the problem is that when reporting on these indexes on a monthly basis, the event counts are so high that searches time out, even though we have raised their limits in limits.conf.

Google Maps App for plotting search results:

index=intrusion | join src [search index=threat_list] | stats count as _geo_count by src | geoip src | search _geo=* | stats sum(_geo_count) as _geo_count by _geo

Can anyone suggest a way to make the search more efficient so we do not see these timeouts or suggest a different method perhaps using lookup, dedup, stats, etc. that we can then use to map the results similar to the image below and avoid timing out?

alt text


martin_mueller
SplunkTrust

Taking a look at the first part of your search:

index=intrusion | join src [search index=threat_list] | stats count as _geo_count by src

You're essentially looking for source addresses that appear in both indexes, right?
Consider this alternative:

index=intrusion OR index=threat_list | stats count as _geo_count dc(index) as index_count by src | where index_count==2

That should achieve the same thing without a subsearch.

Edit: The count may be different, because my example counts a src appearing once per index as two events. If that's a concern, it should be fixable by replacing count with count(eval(index=="intrusion")).
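
To make the counting behaviour concrete, here is a minimal Python sketch that emulates the stats/where logic with the count(eval(index=="intrusion")) variant on toy data. It is not SPL and not how Splunk executes the search; the function name `correlate`, the event dictionaries, and the sample IPs are all made up for illustration.

```python
from collections import defaultdict

def correlate(events):
    """Emulate, on a list of dicts, the logic of:
    stats count(eval(index=="intrusion")) as _geo_count
          dc(index) as index_count by src | where index_count==2
    """
    hits = defaultdict(int)      # intrusion events per src (_geo_count)
    indexes = defaultdict(set)   # distinct indexes seen per src (dc(index))
    for ev in events:
        src = ev["src"]
        indexes[src].add(ev["index"])
        if ev["index"] == "intrusion":
            hits[src] += 1
    # Keep only sources seen in both indexes (index_count == 2).
    return {src: hits[src] for src, seen in indexes.items() if len(seen) == 2}

# Hypothetical toy data: one IP in both indexes, one only in the IPS logs,
# one only on the threat list.
events = (
    [{"index": "intrusion", "src": "203.0.113.7"}] * 30
    + [{"index": "threat_list", "src": "203.0.113.7"}]
    + [{"index": "intrusion", "src": "198.51.100.9"}] * 4
    + [{"index": "threat_list", "src": "192.0.2.55"}]
)

result = correlate(events)
print(result)  # {'203.0.113.7': 30}
```

With a plain count instead, the first IP would come out as 31, since its threat_list event would be counted too.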


splunkingsplun1
Explorer

Thank you for all your help

martin_mueller
SplunkTrust

You can do the exact same thing:

(index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*") OR index=threat_list | ...

The NOT src=something might be applicable to both indexes, but won't do much to speed things up.
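
As a rough illustration of how the parentheses scope the extra filters, the following Python sketch mimics the base search on toy events: the host, action, and src conditions apply only to intrusion events, while every threat_list event passes through. The helper name `keep_event` and the sample events are hypothetical, and `fnmatch` only approximates Splunk's wildcard matching.

```python
from fnmatch import fnmatch

def keep_event(ev):
    """Return True if the event would pass a base search shaped like:
    (index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*")
    OR index=threat_list
    """
    if ev["index"] == "threat_list":
        return True  # threat_list events are not filtered further
    return (
        ev["index"] == "intrusion"
        and ev.get("host") == "192.168.1.20"
        and ev.get("action") == "drop"
        and not fnmatch(ev.get("src", ""), "10.1.*")
    )

# Hypothetical toy events.
events = [
    {"index": "intrusion", "host": "192.168.1.20", "action": "drop", "src": "203.0.113.7"},
    {"index": "intrusion", "host": "192.168.1.20", "action": "drop", "src": "10.1.4.2"},
    {"index": "intrusion", "host": "192.168.1.20", "action": "allow", "src": "203.0.113.7"},
    {"index": "threat_list", "src": "203.0.113.7"},
]

kept = [ev for ev in events if keep_event(ev)]
for ev in kept:
    print(ev["index"], ev["src"])
```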


rabitoblanco
Path Finder

I'm trying to do something very similar, and trying to optimize my search.

Over the timeframe of 24h, I hit the subsearch limits.

Using a smaller 2-hour timeframe, I see the difference is something like 5 IPs after the subsearch filter vs. about 1,130,000 IPs using the dc(index) model, after which I am splitting all of those by about 10 different parameters, and it takes forever to calculate all of them.

Any suggestions on how to improve this?

My current search format is:

(index=a sourcetype=b) OR (index=c sourcetype=d action=e)
| eval ipv4=coalesce(ipv4, pattern)
| eval DISTINCTLOCKOUT=if(statement)
| eval DISTINCTELOCKOUT=if(statement)
| eval impacted_username=if(statement)
| eval whitelisted_username=if(statement)
| eval Date=strftime(_time, "%Y/%m/%d")
| eval failed_name=if(activity_status=="FAILED" AND activity_error!="7577", username, null())
| eval success_name=if(statement)
| eval blocked_name=if(statement)
| stats count as TOTAL_COUNT,
    count(this) as UNBLOCKED_TOTAL,
    count(this) as BLOCKED_COUNT,
    count(this) as WHITELISTED_COUNT,
    dc(this) as UNIQUE_WHITELISTED,
    count(eval(activity_status=="SUCCESS")) as SUCCESS_COUNT,
    count(eval(activity_status=="FAILED" AND activity_error!="7577")) as FAILED_COUNT,
    count(eval(this and isnull(that))) as IMPACTED_COUNT,
    dc(this) as UNIQUE_IMPACTED,
    count(eval(this OR that)) as LOCKOUT_COUNT,
    count(eval(this OR that OR that)) as ELOCKOUT_COUNT,
    count(eval(this OR that OR that)) as RANDOM_USER_LOGINS,
    dc(username) as UNIQUE_USER_COUNT,
    dc(failed_name) as FAILED_UNIQUE,
    dc(success_name) as SUCCESS_UNIQUE,
    dc(blocked_name) as BLOCKED_UNIQUE,
    min(_time) as FIRST_TIME,
    max(_time) as LAST_TIME,
    min(eval(if(statement))) as BLOCKED_TIME,
    dc(DISTINCTLOCKOUT) as "DISTINCTLOCKOUT_COUNT"
    by ipv4
| some prettyprint stuff

Thanks in advance for any pointers.


splunkingsplun1
Explorer

OK, I adjusted my view settings and was able to see the field you indicated, but what if I wanted to do some additional filtering? For instance, usually when we use the join command, our search would be something like this:

index=intrusion host="192.168.1.20" action=drop NOT src="10.1.*" | join src [search index=threat_list] | stats count as _geo_count by src

How can I accomplish the same by modifying the search you sent?

martin_mueller
SplunkTrust

The underscored field name may get hidden depending on your view.


splunkingsplun1
Explorer

Ok, that is perhaps my misunderstanding. I haven't seen _geo_count only index_count. Thank you for the clarification.

martin_mueller
SplunkTrust

It'd give you a row for that IP with the fields src, _geo_count, and index_count, where index_count would indeed be 2 and _geo_count would be 30.


splunkingsplun1
Explorer

What happens if one IP attacks 30 times in one month? Wouldn't it only tell me that the IP exists in both indexes (index_count==2)?

martin_mueller
SplunkTrust

What's the difference?


splunkingsplun1
Explorer

This seems to only tell me if the IP occurred in both indexes but not the attack count from each IP on the threat list against my IPS throughout the entire month.


gauldridge
Path Finder

Can you explain a little more about the threat_list index? Does it only contain IPs? Do you need to join the entire threat_list event to the intrusion event when the source IP matches? Honestly, depending on your setup, this sounds like a good candidate situation for using a database lookup instead of joining two indexes. We use a database lookup to accomplish a similar task.
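
The idea behind a lookup-style approach can be sketched in Python as follows: load the threat IPs once into a set, then stream the IPS events and count only those whose source matches. This is only an illustration of the concept under the assumption that the threat list fits in memory; the function name `count_threat_hits` and the sample data are hypothetical, and it does not use any Splunk or database API.

```python
from collections import Counter

def count_threat_hits(ips_events, threat_ips):
    """Count IPS events per source IP, keeping only sources on the threat list."""
    threat_set = set(threat_ips)  # O(1) membership checks per event
    return Counter(ev["src"] for ev in ips_events if ev["src"] in threat_set)

# Hypothetical toy data.
threat_ips = ["203.0.113.7", "192.0.2.55"]
ips_events = (
    [{"src": "203.0.113.7"}] * 3
    + [{"src": "198.51.100.9"}] * 2
)

hits = count_threat_hits(ips_events, threat_ips)
print(hits)  # Counter({'203.0.113.7': 3})
```

Each IPS event is checked against the list once, so the work grows with the number of IPS events rather than with the product of both data sets.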


splunkingsplun1
Explorer

Yes, the threat list index currently contains the IP and threat type, for instance src=192.168.x.x threat_type=scanning host. I want to know if an IP on the threat list attacked our IPSs at any time throughout the month, so I think the answer is yes, we have to see the entire threat list.
