How do I correlate sysmon data for malicious child...

doodoodonk · ‎03-26-2019

I am trying to write a rule that reliably detects a malicious child process several levels below its parent process.

For example:

I want to create a Sysmon rule that detects when cmd.exe launches an Office process through one or more intermediate sub-processes:

cmd.exe --> powershell.exe --> winword.exe OR powerpnt.exe OR excel.exe

The issue with Sysmon is that I cannot figure out how to walk up or down a chain of processes, and ParentProcessGuid is only good for one level. So I tried to write a search that finds the child process, then finds the parent process I am looking for, joins the two together, and possibly buckets them within 1m or 2m, since the executions usually happen at nearly the same time. Below is something I thought might work, but for whatever reason it does not pull in all hosts, even when the time is in the right range. If I specify the host name directly in the main search and the subsearch, it works fine, but if I run it as written below I do not get the intended result; I get other systems instead of the one I want.

index="wineventlog" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventID=1 EventDescription="Process Create" (process=winword.exe OR process=excel.exe OR process=powerpnt.exe) 
| fields process, parent_process_name, _time, host 
| join type=inner host max=0
    [ search index="wineventlog" sourcetype="XmlWinEventLog:Microsoft-Windows-Sysmon/Operational" EventID=1 EventDescription="Process Create" parent_process_name=cmd.exe 
    | rename process as other_process, parent_process_name as other_parent 
    | fields other_process, other_parent, _time, host ]

I figured that passing host to the join command would join the main-search results (the hosts with Office processes) with the subsearch results for the same host that have cmd.exe as the parent. I could then build a rule from the combined result. Is there a better way of doing this? I have been racking my brain over it for a while now. One level deep is not good enough for detection, since a lot of malicious processes launch sub-processes that need to be accounted for as well. I also do not want a ton of false positives.
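To make the multi-level check concrete outside of SPL: if the Event ID 1 records for one host are available as ProcessGuid / ParentProcessGuid pairs, a child's ancestry can be walked one link at a time. This is only a rough sketch in plain Python, not a Splunk rule. The helper names (`image_name`, `has_ancestor`), the sample records, and the shortened GUIDs are made up for illustration, and it assumes the standard Sysmon `ProcessGuid`, `ParentProcessGuid`, and `Image` fields.

```python
OFFICE_IMAGES = {"winword.exe", "excel.exe", "powerpnt.exe"}

def image_name(path):
    """Return the lower-cased file name from a Windows image path."""
    return path.replace("/", "\\").split("\\")[-1].lower()

def has_ancestor(events, process_guid, ancestor_name, max_depth=10):
    """Walk ParentProcessGuid links upward, looking for ancestor_name."""
    by_guid = {e["ProcessGuid"]: e for e in events}
    current = by_guid.get(process_guid)
    seen = set()
    for _ in range(max_depth):
        if current is None:
            return False  # the chain left the logged events
        parent_guid = current["ParentProcessGuid"]
        if parent_guid in seen:
            return False  # guard against a malformed, looping chain
        seen.add(parent_guid)
        parent = by_guid.get(parent_guid)
        if parent is not None and image_name(parent["Image"]) == ancestor_name:
            return True
        current = parent
    return False

# Hypothetical process-create records for a single host.
events = [
    {"ProcessGuid": "{A}", "ParentProcessGuid": "{X}", "Image": r"C:\Windows\System32\cmd.exe"},
    {"ProcessGuid": "{B}", "ParentProcessGuid": "{A}", "Image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"},
    {"ProcessGuid": "{C}", "ParentProcessGuid": "{B}", "Image": r"C:\Program Files\Microsoft Office\WINWORD.EXE"},
    {"ProcessGuid": "{E}", "ParentProcessGuid": "{X}", "Image": r"C:\Windows\explorer.exe"},
    {"ProcessGuid": "{D}", "ParentProcessGuid": "{E}", "Image": r"C:\Program Files\Microsoft Office\EXCEL.EXE"},
]

suspicious = [
    e["ProcessGuid"]
    for e in events
    if image_name(e["Image"]) in OFFICE_IMAGES
    and has_ancestor(events, e["ProcessGuid"], "cmd.exe")
]
print(suspicious)
```

Only the Word process that descends from cmd.exe is flagged; the Excel process started from explorer.exe is not. The `max_depth` cap is an arbitrary choice to keep the walk bounded.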

Any help on this would be most appreciated!

bkilday · ‎03-31-2019

import splunklib.client as client
import splunklib.results as results
from time import sleep
from datetime import datetime, timedelta
import getpass

HOST = "localhost"
USERNAME = "admin"
PORT=8089
PASSWORD = getpass.getpass()
targetIndex = "sysmon" # The name of the index with the sysmon data...
targetSource = "SysmonEventID1.csv"
targetHost = "importCSV" # The name of the host with the data from the above index...
targetGuid = "{YOUR-GUID-HERE}"
eventTime = "2019-03-31T15:46:07"
targetTime = datetime.strptime(eventTime, "%Y-%m-%dT%H:%M:%S")
targetTime -= timedelta(minutes=1)
earliestTime = targetTime.strftime("%m/%d/%Y:%H:%M:00") # Zero-padded %m/%d/%Y:%H:%M:%S, the format Splunk uses for earliest/latest...
targetTime += timedelta(minutes=10)
latestTime = targetTime.strftime("%m/%d/%Y:%H:%M:00")

searchFragment = "index=%s host=%s earliest=%s latest=%s (ProcessGuid=%s OR ParentProcessGuid=%s) " % (
    targetIndex, targetHost, earliestTime, latestTime, targetGuid, targetGuid)

# Define a Splunk Query Function (This gets called in recursion)...
def ExecuteSplunkQuery(splunkSearch):
    searchQuery_normal = splunkSearch # This is the argument that gets passed to the function...
    kwArgs_normalSearch = {"exec_mode": "normal", "count": 0} # Don't limit the returned results...

    service = client.connect(
        host=HOST,
        port=PORT,
        app="search",
        username=USERNAME,
        password=PASSWORD)

    # Get the collection of search jobs and create new job...
    jobs = service.jobs
    job = jobs.create(searchQuery_normal, **kwArgs_normalSearch)

    # We are executing a non-blocking search so we must wait and  poll job to see when done...
    while not job.is_done():
        print("Waiting for job %s to complete..." % job.sid)
        sleep(2) # Wait 2 seconds and test again...

    # Count search results.  If zero, set no results flag...
    resultCount = job["resultCount"] # This is a dictionary attribute of the job itself...

    if (resultCount == "0"):
        statusCode = 1
    else:
        statusCode = 0

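    # Get search results.  Read them into a list before cancelling the job,
    # and skip the diagnostic Message objects that ResultsReader can yield...
    reader = results.ResultsReader(job.results(count=0))
    searchResults = [r for r in reader if isinstance(r, dict)]
    job.cancel()
    return statusCode, searchResults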

# First need to build function to create descendant queries from EventCode=1 sysmon events...
def BuildProcessDescendantSearch(guidList=None):
    # Function argument is a list created from previous searches...
    guidList = guidList or []
    print("Found %i unique processGuid's.  Finding descendants..." % len(guidList))
    # Produces "ParentProcessGuid=a OR ParentProcessGuid=b OR ..."
    return " OR ".join("ParentProcessGuid=%s" % guid for guid in guidList)

# This function will build the final query...
def BuildProcessQueryString(guidList=None):
    guidList = guidList or []
    # Produces "ProcessGuid=a OR ProcessGuid=b OR ..."
    return " OR ".join("ProcessGuid=%s" % guid for guid in guidList)

################
# Begin iteration search.  The initial fragment was defined up top...
################
seedQuery = "search EventCode=1 %s | stats count BY ProcessGuid, ParentProcessGuid | fields - count" % searchFragment

statusFlag = 0 # This is set to 1 in ExecuteSplunkQuery when no results are returned...
descendantCount = 0
parentProcessDictionary = {} # This dictionary will contain the unique Process/ParentProcess Guids

print("Executing seedQuery: %s" % seedQuery)
statusFlag, resultsReader = ExecuteSplunkQuery(seedQuery)

# This is the iteration loop that builds the chain...
while (statusFlag != 1):
    processGUIDList =[] # This will be a list process dictionaries
    for result in resultsReader:
        # Note dictionary structure {'ProcessGuid':'ParentProcessGuid'}
        parentProcessDictionary[result.get('ProcessGuid', 'None')] = result.get('ParentProcessGuid', 'None')
        processGUIDList.append(result.get('ProcessGuid', 'None'))
    newSearchFragment = BuildProcessDescendantSearch(processGUIDList)
    searchQuery = "search index=%s host=%s earliest=%s latest=%s (%s) | stats count BY ProcessGuid, ParentProcessGuid | fields - count" % (targetIndex, targetHost, earliestTime, latestTime, newSearchFragment)
    print("New derived query: %s" % searchQuery)
    statusFlag, resultsReader = ExecuteSplunkQuery(searchQuery)
    descendantCount += 1

print("Total descendant process generations found: %i" % descendantCount)

# To construct final query, use keys from the processDictionary...
processList = list(parentProcessDictionary.keys()) # list() so the keys can be indexed on Python 3 as well...
finalSearchFragment = BuildProcessQueryString(processList)
finalQuery = "search index=%s host=%s earliest=%s latest=%s (%s)  | table ParentProcessGuid, ParentCommandLine, EventDescription, ProcessGuid, CommandLine" % (targetIndex, targetHost, earliestTime, latestTime, finalSearchFragment)

print("Final sysmon chain query: %s" % finalQuery)

statusFlag, resultsReader = ExecuteSplunkQuery(finalQuery)

for result in resultsReader:
    print(">>>" + str(result))

bkilday · ‎03-31-2019

I can't get to my original work, so I threw this together to show you how to use the Splunk SDK to recursively build descendant process chains from an input ProcessGuid using the Sysmon sourcetype. The final output from the search is an ordered dictionary of all the search fields (all the fields in the table) for each result. You will need to pull the values out of these dictionaries to do anything further. Also, to integrate it into Splunk, you'll need to return the results to the search pipeline using the Intersplunk library and pass auth into the function, which is too much to digest right now.

If you want to time-sequence the results, sort by RecordID and UtcTime.
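As a rough illustration of that ordering, assuming the results have already been read into plain dictionaries with string values: the sketch below sorts on UtcTime first and uses RecordID as the tie-breaker. The field name `RecordID` is taken from this post, the `UtcTime` layout (`YYYY-MM-DD HH:MM:SS.mmm`) is the usual Sysmon one, and the sample rows and the `sequence_key` helper are invented for the example.

```python
from datetime import datetime

def sequence_key(row):
    """Sort key: event time first, record number as the tie-breaker."""
    utc = datetime.strptime(row["UtcTime"], "%Y-%m-%d %H:%M:%S.%f")
    return (utc, int(row["RecordID"]))

# Hypothetical rows, shaped like the dictionaries the final search returns.
rows = [
    {"RecordID": "1203", "UtcTime": "2019-03-31 15:46:09.120", "CommandLine": "winword.exe"},
    {"RecordID": "1201", "UtcTime": "2019-03-31 15:46:07.002", "CommandLine": "cmd.exe"},
    {"RecordID": "1202", "UtcTime": "2019-03-31 15:46:07.002", "CommandLine": "powershell.exe"},
]

ordered = sorted(rows, key=sequence_key)
for row in ordered:
    print(row["UtcTime"], row["RecordID"], row["CommandLine"])
```

Converting RecordID to an integer matters here because the values arrive as strings, and string comparison would put "1000" before "999".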

Note you must have the splunklib library from the SDK for this to work...

bkilday · ‎03-27-2019

I've done it, and you won't be able to do it with a subsearch. Unfortunately Splunk doesn't provide any convenient recursive functionality, so you're going to need to write a custom command...

I don't have access to the code, but I can walk you through it. You'll need to start digging through the Splunk Python API.

The first thing you'll need to do is pick your target ProcessGuid. I made a workflow action from the event list that calls a custom search command written in Python, but you can do the same thing (and it's easier) with the Python REST API.

First, focus only on the EventCode=1 process-create events. These are used to build a process chain dictionary, since this Sysmon event carries both the ProcessGuid and the ParentProcessGuid. You already have your targetGuid, so use it to find descendant processes, i.e. find all events where ParentProcessGuid=targetGuid. If any descendants exist you will get one or more results. Add the new ProcessGuids to your dictionary; it will be used later to build your chain for the non-EventCode=1 events. I also use the keys of the dictionary to produce the list of GUIDs for the final search.
Now just repeat. Find new processes whose ParentProcessGuid is in [newTargetGuidList]. Keep iterating until the search returns no new results.
You should now have a master ProcessGuid:ParentProcessGuid dictionary of every process that directly relates to the ProcessGuid you originally targeted. Make a new list of all the ProcessGuids in the dictionary.
Now run the search again, but don't limit it to EventCode=1, so you get the file creates, network connections, etc. Use the list to build your search string (something like ProcessGuid=targetGuidList OR ProcessGuid...). You don't need the ParentProcessGuid in this search, as you already have all the unique GUIDs in the chain.
Use the dictionary to assign the ParentProcessGuid to non-EventCode=1 events, since they don't have the parent in the logged event. This is extremely useful if you go bananas (like me) and create a graph structure to visualize the process chain and want to see which process created which file, registry change, etc.
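The steps above can be sketched without Splunk at all, which may make the loop easier to follow. This is a minimal in-memory version, not my original code: it assumes the EventCode=1 events for the host and time window have already been fetched into a list of dictionaries, where the real thing issues a new search per generation. The function names (`collect_descendants`, `attach_parents`) and the sample GUIDs are made up for the example.

```python
def collect_descendants(create_events, target_guid):
    """Return {ProcessGuid: ParentProcessGuid} for the target and all its descendants."""
    children = {}   # parent GUID -> list of child GUIDs
    parent_of = {}  # child GUID -> parent GUID
    for e in create_events:
        parent_of[e["ProcessGuid"]] = e["ParentProcessGuid"]
        children.setdefault(e["ParentProcessGuid"], []).append(e["ProcessGuid"])

    chain = {}
    if target_guid in parent_of:
        chain[target_guid] = parent_of[target_guid]

    frontier = [target_guid]
    while frontier:  # one pass per generation; stops when nothing new is found
        next_frontier = []
        for guid in frontier:
            for child in children.get(guid, []):
                if child not in chain:
                    chain[child] = guid
                    next_frontier.append(child)
        frontier = next_frontier
    return chain

def attach_parents(other_events, chain):
    """Copy each in-chain non-process-create event, adding its ParentProcessGuid."""
    return [
        dict(e, ParentProcessGuid=chain[e["ProcessGuid"]])
        for e in other_events
        if e["ProcessGuid"] in chain
    ]

# Hypothetical EventCode=1 records and two non-process-create events.
creates = [
    {"ProcessGuid": "{A}", "ParentProcessGuid": "{X}"},
    {"ProcessGuid": "{B}", "ParentProcessGuid": "{A}"},
    {"ProcessGuid": "{C}", "ParentProcessGuid": "{B}"},
    {"ProcessGuid": "{Z}", "ParentProcessGuid": "{Y}"},
]
others = [
    {"EventCode": "3", "ProcessGuid": "{C}"},   # network connection from the chain
    {"EventCode": "11", "ProcessGuid": "{Z}"},  # file create from an unrelated process
]

chain = collect_descendants(creates, "{A}")
annotated = attach_parents(others, chain)
print(chain)
print(annotated)
```

The keys of `chain` are what would go into the final `ProcessGuid=... OR ProcessGuid=...` search, and `attach_parents` is the last step: giving file-create or network events the parent they don't log themselves.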

It isn't often that an event will have more than one descendant process create. It's also pretty computationally expensive when they do, as you may see 5+ generations and a few thousand ProcessGuids in the search results. I limit the search by passing the host and the event time and bounding the search with a small time window (maybe an hour).
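Bounding the search like that could look roughly like this. It is a sketch under assumptions: the one-minute lead-in and sixty-minute tail are arbitrary defaults, the helper names (`search_window`, `bounded_search`) are invented, and the index and host values are placeholders. The time strings use `%m/%d/%Y:%H:%M:%S`, the format Splunk accepts for the `earliest` and `latest` modifiers.

```python
from datetime import datetime, timedelta

SPLUNK_TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"

def search_window(event_time, before_minutes=1, after_minutes=60):
    """Return (earliest, latest) strings bracketing an ISO-style event time."""
    event = datetime.strptime(event_time, "%Y-%m-%dT%H:%M:%S")
    earliest = event - timedelta(minutes=before_minutes)
    latest = event + timedelta(minutes=after_minutes)
    return earliest.strftime(SPLUNK_TIME_FORMAT), latest.strftime(SPLUNK_TIME_FORMAT)

def bounded_search(index, host, event_time):
    """Build the host- and time-bounded prefix shared by every query in the chain."""
    earliest, latest = search_window(event_time)
    return "search index=%s host=%s earliest=%s latest=%s" % (index, host, earliest, latest)

# Placeholder index and host names.
print(bounded_search("sysmon", "workstation01", "2019-03-31T15:46:07"))
```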

Sorry I can't just paste the Python (it was 200 or so lines), but learn the REST API and learn to make a few queries, then put the recursion in a while loop that stops when the search returns no more results.

doodoodonk · ‎03-28-2019

I'm curious what the Python looks like. I have used Splunk's REST API in the past, but it is much easier to see it in code than to follow all the steps. Could you host it on a file share like Dropbox temporarily so I could download it? I do appreciate the walkthrough, and that is a great solution.

I do think that any of the solutions posted so far will work best for responding to alerts that are already notables in Splunk ES, not for generating notables themselves. A rule based on process relationships would have to span everything coming into Sysmon over 10 minutes, and building the chain for every possible process on every system over that window could take longer than desired. Even if I accelerate it into a data model, it may still be too expensive to track if the parent process is fairly common.

I use McAfee expert rules to alert on a lot of parent --> child relationships, but they too lack sub-process tracking beyond one level deep. Symantec, however, does track it, and I wish we had it where I work now. I was hoping Sysmon could do this much more easily. I could always play around with it and see.

dstaulcu · ‎03-27-2019

I admire that you have been able to implement a method to identify child processes of a selected process within the Splunk UI. Here is my REST-based method to list activities logged by Sysmon for a selected process and its children. It sounds like we apply similar methods, and OP could use them in some way to list the kill chain associated with a notable event.

How do I correlate sysmon data for malicious child sub processes several levels deep from the parent?
