Source IPs Communicating with Far More Hosts Than Normal (Assistant: Detect Spikes)

davidmonaghan
Explorer

Hello All

I was wondering if someone could break down what the following search does and what the fields in its final output mean?

This search was taken from the Splunk Security Essentials app...

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)
| bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
| eventstats max(_time) as maxtime 
| stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"
| eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
| eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

gjanders
SplunkTrust

Some of these searches are quite complicated and could do with some comments inside them 🙂

(tag=network tag=communicate) OR (index=pan_logs sourcetype=pan*traffic) OR (index=* sourcetype=opsec) OR (index=* sourcetype=cisco:asa)

This part is simple enough: it uses tags, sourcetypes and indexes to find the relevant events to look at.

 | bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time
 | eventstats max(_time) as maxtime 

Group each event's _time into 1-day buckets, so every event in a given day gets that day's midnight as its _time (Monday 00:00, Tuesday 00:00, et cetera).
Then compute a distinct count of destination IPs for each source IP per day, stored in a field named count.
Finally, eventstats adds a maxtime field to every row, holding the most recent day bucket across all results.
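To make the bucketing step concrete, here is a small Python sketch (not Splunk code) of roughly what those two lines compute. The sample events and the name daily_distinct_dests are made up for illustration, and the sketch assumes UTC day boundaries, whereas Splunk buckets according to the search-time timezone.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400

def daily_distinct_dests(events):
    """Rough analogue of `bucket _time span=1d | stats dc(dest_ip) as count by src_ip, _time`.

    events: iterable of (epoch_seconds, src_ip, dest_ip) tuples (hypothetical shape).
    Returns {(src_ip, day_start_epoch): distinct_dest_count}.
    """
    seen = defaultdict(set)
    for ts, src, dest in events:
        day_start = ts - (ts % SECONDS_PER_DAY)  # snap to 00:00 UTC of that day (assumption)
        seen[(src, day_start)].add(dest)
    return {key: len(dests) for key, dests in seen.items()}

# Made-up sample events: two days of traffic from one source IP.
events = [
    (10, "10.0.0.1", "1.1.1.1"),
    (20, "10.0.0.1", "1.1.1.1"),        # repeated destination, counted once
    (30, "10.0.0.1", "8.8.8.8"),
    (86400 + 5, "10.0.0.1", "9.9.9.9"),
]
counts = daily_distinct_dests(events)

# Analogue of `eventstats max(_time) as maxtime`: the latest day bucket overall.
maxtime = max(day for _, day in counts)
```

Each (src_ip, day) pair ends up as one row, which is why the later stats count can be read as a number of days.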

 | stats count as num_data_samples max(eval(if(_time >= relative_time(maxtime, "-1d@d"), 'count',null))) as "count" avg(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as avg stdev(eval(if(_time<relative_time(maxtime,"-1d@d"),'count',null))) as stdev by "src_ip"

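This runs stats by source IP. It takes a count of rows (one row per day, so num_data_samples is the number of days with data for that source IP) and a max of the count field over rows whose _time is at or after maxtime minus 1 day (snapped to midnight). Because maxtime is already a midnight bucket, that window covers the two most recent day buckets.
It also takes the average and the standard deviation of the count field over the older rows, where _time is earlier than maxtime minus 1 day. Those two values form the baseline.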
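The split between recent days and baseline days can be sketched in Python as follows. This is only an illustration of the logic, not Splunk code: summarize and the sample numbers are hypothetical, the cutoff is approximated as maxtime minus 86400 seconds, and the edge case of fewer than two baseline days returns None here without modelling what Splunk does.

```python
from statistics import mean, stdev

SECONDS_PER_DAY = 86400

def summarize(daily_counts, maxtime):
    """daily_counts: {day_start_epoch: count} for one src_ip (hypothetical shape)."""
    cutoff = maxtime - SECONDS_PER_DAY  # stands in for relative_time(maxtime, "-1d@d")
    recent = [c for day, c in daily_counts.items() if day >= cutoff]
    baseline = [c for day, c in daily_counts.items() if day < cutoff]
    return {
        "num_data_samples": len(daily_counts),        # every day row, recent and old
        "count": max(recent) if recent else None,      # peak of the recent days
        "avg": mean(baseline) if baseline else None,
        # statistics.stdev is the sample standard deviation, like SPL's stdev()
        "stdev": stdev(baseline) if len(baseline) > 1 else None,
    }

# Made-up data: six quiet days, then a spike on the latest day.
days = {d * SECONDS_PER_DAY: c for d, c in enumerate([4, 6, 5, 5, 4, 6, 5, 40])}
summary = summarize(days, maxtime=7 * SECONDS_PER_DAY)
```

With this data the recent window holds the last two days (5 and 40), so count is 40, while the average of 5 comes only from the six older days.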

 | eval lowerBound=(avg-stdev*2), upperBound=(avg+stdev*2)
 | eval isOutlier=if(('count' < lowerBound OR 'count' > upperBound) AND num_data_samples >=7, 1, 0)

This part is fairly straightforward: lowerBound is the average minus 2*stdev and upperBound is the average plus 2*stdev.
Then isOutlier is set to 1 if the count is below lowerBound or above upperBound and there are at least 7 data samples; otherwise it is 0.
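As a quick worked example of that test, here is a minimal Python sketch. The function name is_outlier and the input numbers are made up; the multiplier of 2 and the minimum of 7 samples come from the search itself.

```python
def is_outlier(count, avg, stdev, num_data_samples, k=2, min_samples=7):
    """Return (flag, lower_bound, upper_bound) following the two eval lines."""
    lower = avg - stdev * k
    upper = avg + stdev * k
    flagged = (count < lower or count > upper) and num_data_samples >= min_samples
    return (1 if flagged else 0), lower, upper

# Baseline of 5 destinations per day with stdev 1 gives bounds of 3 and 7.
spike = is_outlier(count=40, avg=5, stdev=1, num_data_samples=8)
normal = is_outlier(count=6, avg=5, stdev=1, num_data_samples=8)
too_few_days = is_outlier(count=40, avg=5, stdev=1, num_data_samples=5)
```

The last call shows why the sample check matters: the same spike is not flagged when the source IP has fewer than 7 days of history.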

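I expected a where clause at the end of this (for example, one keeping only isOutlier=1) but I do not see it, so the search returns every source IP with the flag set either way. Does that make sense or are you more confused? 🙂
Effectively, the query flags source IPs whose daily number of distinct destinations is an outlier compared with their own history.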

davidmonaghan
Explorer

Thanks

That was pretty much my reading once I broke it down.

David
