Splunk Enterprise Security

Create Correlated Notable from PAN Wildfire and Endpoint events

Explorer

We are looking to trigger a notable event when a series of events happens in a short period of time and in a specific order. For instance, a Palo Alto Wildfire hit followed by C2 or other suspicious traffic within 30 seconds. This type of logic could easily be applied to other scenarios or parts of the Cyber Kill Chain, so I would like to come up with a repeatable way of alerting when a series of events happens.

We have successfully created a search using |transaction in a Search, but I'm curious if this is the most efficient way to do this. We dumped this into a correlation search, which has yet to trigger (which might be a good indicator, lol).

sourcetype=pan_threat from!=DMZ to!=DMZ category=spyware OR category=file OR category=malware action!=sinkhole |transaction dest_ip startswith="wildfire" endswith="spyware OR malware" maxevents=2 maxspan=30s

All of this data is available in Data Models (which I am guessing is the most efficient method), however I am not sure how to write the search. Other somewhat related posts on here show multiple |tstats commands in the same search, but when I try this I get an error saying that tstats needs to be the first command (which it is, it's just followed by another).

One resource I have found helpful is Dave Veuve's presentation from .conf: http://conf.splunk.com/session/2015/conf2015_DVeuve_Splunk_SecurityCompliance_SecurityJiujitsuBuildi...

Any input is appreciated!

1 Solution

Splunk Employee

There are two approaches I would use for this. The first is with raw text. Yours is probably just fine, given the relatively small number of events you're likely to see and the extremely short time window. Any alternative that doesn't use transaction would probably be so much harder to write while respecting that very short window that it wouldn't be worthwhile. That said, if I were looking at this, I might want a longer time window to account for malware that delays execution specifically to evade sandboxing. Also, if you are using the sandboxing technology in Wildfire (as opposed to the IOC database), I think the wildfire alert might actually come after the endpoint hit. You could allow for some of this complication by scheduling the following search to run every 15 minutes (note the overlapping time window: employ throttling based on the dest_ip field, or maybe dest_ip plus the wildfire hash, to control excess notifications):

earliest=-30m@m sourcetype=pan_threat from!=DMZ to!=DMZ category=spyware OR category=file OR category=malware action!=sinkhole 
| stats count(eval(searchmatch("wildfire"))) as wildfire count(eval(searchmatch("spyware OR malware"))) as malware values(signature) as signature values(other_useful_context) as other_useful_context [...etc...] by dest_ip 
| where wildfire>0 AND malware>0

(Better yet, use by dest, and maybe even vendor_product so that it is CIM compliant!)
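A sketch of that CIM-friendly variant, assuming your Palo Alto add-on populates the CIM fields dest and vendor_product for these events (that is an assumption on my part, so verify it in your data first). The parentheses around the category terms are only there for readability:

```spl
earliest=-30m@m sourcetype=pan_threat from!=DMZ to!=DMZ (category=spyware OR category=file OR category=malware) action!=sinkhole
| stats count(eval(searchmatch("wildfire"))) as wildfire
        count(eval(searchmatch("spyware OR malware"))) as malware
        values(signature) as signature
        values(vendor_product) as vendor_product
        by dest
| where wildfire>0 AND malware>0
```

With this version, throttling would key on dest rather than dest_ip. Here vendor_product is carried along with values() instead of being added to the by clause, so that a difference in vendor_product between the two event types can't split one host across rows.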

As for accelerating this: the tstats command does need to be first, unless you're using prestats=t and append=t, in which case you can chain several of them. (You can even do eval, rename, etc. between them, but that gets into a little more voodoo, which is hard to feel out on your own.) I would expect both of these event types to show up in the Malware data model, in which case you could do something like the following:

| tstats count from datamodel=Malware where earliest=-30m@m groupby "Malware_Attacks.dest" "Malware_Attacks.vendor_product" 
| stats sum(eval(if('Malware_Attacks.vendor_product' = "Palo Alto Networks Wildfire", count, 0))) as wildfire sum(eval(if('Malware_Attacks.vendor_product' = "Palo Alto Networks Endpoint", count, 0))) as malware by "Malware_Attacks.dest" 
| where wildfire>0 AND malware>0

(Double check in your data that there are no src/dest issues, etc. Let me know if there are, happy to adapt it.) That said, this is one of those searches which you may not actually need to accelerate, because the data volume should be really low, and the search should thus be really fast.
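If the second half of the pattern lives in a different data model (for example, the C2 traffic from the original question), the prestats=t / append=t combination mentioned above might look roughly like the following. Treat it as an untested sketch: using Network_Traffic (root dataset All_Traffic) as the second data model is my assumption, as are the helper field names dest and model.

```spl
| tstats prestats=t count from datamodel=Malware where earliest=-30m@m groupby Malware_Attacks.dest
| tstats prestats=t append=t count from datamodel=Network_Traffic where earliest=-30m@m groupby All_Traffic.dest
| eval dest=coalesce('Malware_Attacks.dest', 'All_Traffic.dest'), model=if(isnotnull('Malware_Attacks.dest'), "malware", "traffic")
| stats count by dest model
| stats sum(eval(if(model="malware", count, 0))) as malware sum(eval(if(model="traffic", count, 0))) as traffic by dest
| where malware>0 AND traffic>0
```

As written, the second tstats counts all traffic, so in practice you would narrow it with extra where conditions for whatever you consider suspicious. The first stats finishes the prestats aggregation per destination and source model, and the second pivots those counts into one row per dest so the final where can require both.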

Does that sound reasonable? Let me know if this response is off base in any way! And thank you for providing a clear question with example searches -- that's really helpful!
