The problem I am having is finding a way to write a rule that will be good enough to find a malicious child-process that is multiple levels deep from the parent process.
For example:
I want to create a Sysmon-based rule that detects when cmd.exe eventually launches an Office process several levels down, for example:
cmd.exe --> powershell.exe --> winword.exe OR powerpnt.exe OR excel.exe
The issue with Sysmon is that I can't figure out a way to walk up or down a chain of processes: ParentProcessGUID is only good for one level. So I tried to write a search that looks for the child process, then looks for the parent process I'm after, and joins them together (possibly bucketing within 1m or 2m, since the executions are usually very close in time). Below is something I thought might work, but for whatever reason it does not pull all hosts together, even when the time is in the right range. If I specify the host name directly in the main search and the subsearch, it works fine, but written as below I don't get the intended result; instead I get other systems except the one I want.
index="wineventlog" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventID=1 EventDescription="Process Create" (process=winword.exe OR process=excel.exe OR process=powerpnt.exe)
| fields process, parent_process_name, _time, host
| join type=inner host max=0
[ search index="wineventlog" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventID=1 EventDescription="Process Create" parent_process_name=cmd.exe
| rename process as other_process, parent_process_name as other_parent
| fields other_process, other_parent, _time, host ]
I figured that passing host into the join command would join the results from the main search (the Office processes) with those from the subsearch (cmd.exe as the parent) on the same host. From there I could build a rule out of it. Is there a better way of doing this? I have been racking my brain over it for a while now. One level deep is not good enough for detection, since a lot of malicious processes execute sub-processes that need to be accounted for as well, and you also don't want a ton of false positives.
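To make the goal concrete: what I'm really after is the transitive check, not a fixed-depth join. A minimal offline Python sketch of that walk (field names assumed to match Sysmon's ProcessGuid/ParentProcessGuid/Image; not SPL, just the logic) would be:

```python
# Hypothetical sketch: given Sysmon EventID 1 records as dicts, walk each
# Office process's ancestry an arbitrary number of levels to see whether
# cmd.exe appears anywhere up the chain.
OFFICE = {"winword.exe", "excel.exe", "powerpnt.exe"}

def office_spawned_by_cmd(events, max_depth=10):
    # Map ProcessGuid -> (Image, ParentProcessGuid) for fast ancestor lookups...
    by_guid = {e["ProcessGuid"]: (e["Image"], e["ParentProcessGuid"]) for e in events}
    hits = []
    for e in events:
        if e["Image"] not in OFFICE:
            continue
        guid, depth = e["ParentProcessGuid"], 0
        while guid in by_guid and depth < max_depth:
            image, next_guid = by_guid[guid]
            if image == "cmd.exe":
                hits.append(e["ProcessGuid"])  # cmd.exe is an ancestor
                break
            guid, depth = next_guid, depth + 1
    return hits

# Toy chain: cmd.exe -> powershell.exe -> winword.exe
events = [
    {"ProcessGuid": "{A}", "ParentProcessGuid": "{ROOT}", "Image": "cmd.exe"},
    {"ProcessGuid": "{B}", "ParentProcessGuid": "{A}", "Image": "powershell.exe"},
    {"ProcessGuid": "{C}", "ParentProcessGuid": "{B}", "Image": "winword.exe"},
]
print(office_spawned_by_cmd(events))  # ["{C}"]
```

The join/subsearch approach approximates this with time buckets; the sketch shows why one level of ParentProcessGUID isn't enough.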
Any help on this would be most appreciated!
import getpass
from datetime import datetime, timedelta
from time import sleep

import splunklib.client as client
import splunklib.results as results

HOST = "localhost"
PORT = 8089
USERNAME = "admin"
PASSWORD = getpass.getpass()

targetIndex = "sysmon"            # The name of the index with the sysmon data...
targetSource = "SysmonEventID1.csv"
targetHost = "importCSV"          # The name of the host with the data from the above index...
targetGuid = "{YOUR-GUID-HERE}"
eventTime = "2019-03-31T15:46:07"

# Bound the search window: 1 minute before the event to 9 minutes after...
targetTime = datetime.strptime(eventTime, "%Y-%m-%dT%H:%M:%S")
targetTime -= timedelta(minutes=1)
earliestTime = targetTime.strftime("%m/%d/%Y:%H:%M:00")
targetTime += timedelta(minutes=10)
latestTime = targetTime.strftime("%m/%d/%Y:%H:%M:00")

searchFragment = "index=%s host=%s earliest=%s latest=%s (ProcessGuid=%s OR ParentProcessGuid=%s) " % (
    targetIndex, targetHost, earliestTime, latestTime, targetGuid, targetGuid)

# Define a Splunk query function (this gets called in each round of the loop below)...
def ExecuteSplunkQuery(splunkSearch):
    kwArgs_normalSearch = {"exec_mode": "normal", "count": 0}  # Don't limit the returned results...
    service = client.connect(
        host=HOST,
        port=PORT,
        app="search",
        username=USERNAME,
        password=PASSWORD)
    # Get the collection of search jobs and create a new job...
    job = service.jobs.create(splunkSearch, **kwArgs_normalSearch)
    # This is a non-blocking search, so poll the job until it is done...
    while not job.is_done():
        print("Waiting for job %s to complete..." % str(job))
        sleep(2)  # Wait 2 seconds and test again...
    # Count the search results. If zero, set the no-results flag...
    resultCount = job["resultCount"]  # This is a dictionary attribute of the job itself...
    statusCode = 1 if resultCount == "0" else 0
    # Read the results out fully before cancelling the job...
    searchResults = [r for r in results.ResultsReader(job.results(count=0)) if isinstance(r, dict)]
    job.cancel()
    return statusCode, searchResults

# Build the descendant query fragment from EventCode=1 sysmon events...
def BuildProcessDescendantSearch(guidList):
    # The argument is the list of ProcessGuids found in the previous round...
    print("Found %i unique ProcessGuids. Finding descendants..." % len(guidList))
    return " OR ".join("ParentProcessGuid=%s" % g for g in guidList)

# This function builds the fragment for the final query...
def BuildProcessQueryString(guidList):
    return " OR ".join("ProcessGuid=%s" % g for g in guidList)

################
# Begin the iterative search. The initial fragment was defined up top...
################
seedQuery = "search EventCode=1 %s | stats count BY ProcessGuid, ParentProcessGuid | fields - count" % searchFragment
descendantCount = 0
parentProcessDictionary = {}  # Maps each unique ProcessGuid to its ParentProcessGuid...
print("Executing seedQuery: %s" % seedQuery)
# statusFlag is set to 1 by ExecuteSplunkQuery when no results are returned...
statusFlag, resultsReader = ExecuteSplunkQuery(seedQuery)
# This is the iteration loop that builds the chain...
while statusFlag != 1:
    processGUIDList = []  # The ProcessGuids found in this generation...
    for result in resultsReader:
        # Note dictionary structure {'ProcessGuid': 'ParentProcessGuid'}
        parentProcessDictionary[result.get('ProcessGuid', 'None')] = result.get('ParentProcessGuid', 'None')
        processGUIDList.append(result.get('ProcessGuid', 'None'))
    newSearchFragment = BuildProcessDescendantSearch(processGUIDList)
    searchQuery = "search index=%s host=%s earliest=%s latest=%s (%s) | stats count BY ProcessGuid, ParentProcessGuid | fields - count" % (
        targetIndex, targetHost, earliestTime, latestTime, newSearchFragment)
    print("New derived query: %s" % searchQuery)
    statusFlag, resultsReader = ExecuteSplunkQuery(searchQuery)
    descendantCount += 1

print("Total descendant process generations found: %i" % descendantCount)
# To construct the final query, use the keys from parentProcessDictionary...
finalSearchFragment = BuildProcessQueryString(list(parentProcessDictionary.keys()))
finalQuery = "search index=%s host=%s earliest=%s latest=%s (%s) | table ParentProcessGuid, ParentCommandLine, EventDescription, ProcessGuid, CommandLine" % (
    targetIndex, targetHost, earliestTime, latestTime, finalSearchFragment)
print("Final sysmon chain query: %s" % finalQuery)
statusFlag, resultsReader = ExecuteSplunkQuery(finalQuery)
for result in resultsReader:
    print(">>> " + str(result))
I can't get to my original work, so I threw this together to show you how to use the Splunk SDK to recursively build descendant process chains from an input ProcessGuid using the Sysmon sourcetype. The final output of the search is an ordered dictionary of all the search fields (all the fields in the table). You will need to grab the values from these dictionaries to do anything further. Also, to integrate it into Splunk you'll need to return the results to the search pipeline using the Intersplunk library, as well as pass auth into the function - too much to digest right now though.
If you want to time-sequence the results, sort by RecordID and UtcTime.
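For the time-sequencing step, a minimal sketch (assuming each result is a dict carrying Sysmon's UtcTime and RecordID fields, as the script's final table would return) could look like:

```python
# Order results chronologically; RecordID breaks ties when two events
# share the same UtcTime (Sysmon UtcTime strings sort lexicographically).
def time_sequence(rows):
    return sorted(rows, key=lambda r: (r["UtcTime"], int(r["RecordID"])))

rows = [
    {"UtcTime": "2019-03-31 15:46:09.100", "RecordID": "12"},
    {"UtcTime": "2019-03-31 15:46:07.500", "RecordID": "10"},
    {"UtcTime": "2019-03-31 15:46:07.500", "RecordID": "9"},
]
print([r["RecordID"] for r in time_sequence(rows)])  # ['9', '10', '12']
```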
Note you must have the splunklib library from the SDK for this to work...
I've done it, and you won't be able to do it with a subsearch. Unfortunately, Splunk doesn't provide any convenient recursive functionality - you're going to need to write a custom command...
I don't have access to the code, but I can walk you through it. You'll need to start digging through the Splunk Python API.
The first thing you'll need to do is pick your target ProcessGuid. I made a workflow action from the event list to call a custom search command that runs in Python, but you can do the same thing (and it's easier) by just using the Python REST API.
It isn't often that an event will have more than one generation of descendant process creates. It's also pretty computationally expensive when they do, as you may see 5+ generations and a few thousand ProcessGuids in the search results. I limit the search by passing the host and the event time, bounding the search to a small window (maybe an hour).
Sorry I can't just paste the Python (it was 200 or so lines), but learn the REST API and how to make a few queries, then just put the recursion in a while loop that tests whether the search returns no more results.
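The while-loop recursion I'm describing can be sketched with the Splunk call stubbed out (`run_search` is a hypothetical stand-in for whatever REST query you build; everything else here is just the fixed-point logic):

```python
# Each round searches for children of the GUIDs in the frontier and stops
# when a round returns no GUIDs we haven't already seen.
def collect_descendants(seed_guid, run_search):
    known, frontier = {seed_guid}, [seed_guid]
    while frontier:
        # run_search would issue e.g. "ParentProcessGuid=<g1> OR ..." via REST
        children = run_search(frontier)
        frontier = [g for g in children if g not in known]
        known.update(frontier)
    return known

# Toy stand-in for the REST search: a static parent -> children table...
tree = {"{A}": ["{B}", "{C}"], "{B}": ["{D}"]}
fake_search = lambda guids: [c for g in guids for c in tree.get(g, [])]
print(sorted(collect_descendants("{A}", fake_search)))  # ['{A}', '{B}', '{C}', '{D}']
```

Swapping `fake_search` for a real job submission (as in the SDK script above) gives you the full chain.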
I'm curious what the Python looks like. I have used Splunk's REST API in the past, but it is much easier to see it in code than to follow all the steps. Could you host it on some file share like Dropbox temporarily so I could download it? I do appreciate the walkthrough, and that is a great solution.
I think any of the solutions posted so far will work best for responding to alerts already in Splunk ES as notables, rather than for generating notables themselves. Building process relationships across everything coming into Sysmon over a 10-minute window, for every possible process on every system, could take longer than desired. Even if I accelerate it into a data model, it may still be too expensive computationally to track if the parent process is fairly common.
I use McAfee Expert Rules to alert on a lot of parent --> child relationships, but they too lack sub-process tracking beyond one level deep. Symantec, however, does track it, and I wish we had it where I work now. I was hoping Sysmon could do this much more easily. I could always play around with it and see.
I admire that you have been able to implement a method to identify child processes of a selected process within the Splunk UI. Here is my REST-based method to list activities logged by Sysmon for a selected process and its children. It sounds like we apply similar methods, and they could be used by the OP to list the kill chain associated with a notable event.