Splunk Search

How to exclude results from joined subsearches if the field value appears in a 3rd sourcetype?

Splunk2095
Engager

Hi All,

I ran into a tricky one and can’t wrap my head around it (or if it is even possible).  The use case is as follows:

There are 3 sourcetypes that share a common field “Detection_id”.  This field has a unique value that appears in all 3 sourcetypes.

The first sourcetype is “DetectionName” and the 2nd sourcetype is “ProcessInfo”.  I did a join based on the “Detection_id” field so that way I can see my detection names displayed in a table right next to the process information responsible for the detection from the 2nd sourcetype.  That search works fine.

This is where it gets a bit tricky.  The 3rd sourcetype is called “FalsePositive” meaning it was a detection that was already investigated and considered false positive.  We do not want any events displaying in our table from the first two searches IF the same unique “Detection_id” value also appears in the “FalsePositive” sourcetype.  That way time isn’t wasted scrolling through detections that were already investigated.

Any thoughts on the best way to handle this and/or if it is even possible?

Appreciate any answers or feedback.

Labels (2)
0 Karma
1 Solution

ITWhisperer
SplunkTrust
SplunkTrust

Firstly, try to avoid using join if possible as it is slow and has limitations

Secondly, you could combine your searches into one and then determine which detection_ids are not represented in the false positive sourcetype

sourcetype="DetectionName" OR sourcetype="ProcessInfo" OR sourcetype="FalsePositive"
| eventstats values(sourcetype) as sourcetype by Detection_id
| where sourcetype!="FalsePositive"
| stats values(*) as * by Detection_id

View solution in original post

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Firstly, try to avoid using join if possible as it is slow and has limitations

Secondly, you could combine your searches into one and then determine which detection_ids are not represented in the false positive sourcetype

sourcetype="DetectionName" OR sourcetype="ProcessInfo" OR sourcetype="FalsePositive"
| eventstats values(sourcetype) as sourcetype by Detection_id
| where sourcetype!="FalsePositive"
| stats values(*) as * by Detection_id
0 Karma

Splunk2095
Engager

Thanks for the quick response!  and yes I hate join... making Splunk do 2 searches every time it runs breaks my heart.

Will your example also work if for instance, I want my search to be based on sourcetype="DetectionName" AND detect_name="Bypass UAC" ?  So putting that first one in brackets?

Essentially I will be having a scheduled search for each detection name so events can be grouped and analyzed based on the attack tactic.

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Yes, this should work

(sourcetype="DetectionName" AND detect_name="Bypass UAC") OR sourcetype="ProcessInfo" OR sourcetype="FalsePositive"

You also might want to change the where so that it is look for both valid sourcetypes as well as not having the FalsePositive sourcetype - by not having this extra check you are effectively doing a union i.e. events could come from either or both of the desired sourcetypes.

0 Karma

Splunk2095
Engager

Hmm so the search does run, but it is never producing results.  I tested multiple variations, but this is the jist:

 

index=main (sourcetype="DetectionName" AND detect_name="Bypass UAC") OR sourcetype="ProcessInfo" OR sourcetype="FalsePositive"
| eventstats values(sourcetype) as sourcetype by Detection_id
| where sourcetype!="FalsePositive" AND sourcetype="DetectionName" AND sourcetype="ProcessInfo"
| stats values(ComputerName) as ComputerName values(detect_description) as detect_description values(detect_name) as detect_name values(detect_scenario) as detect_scenario values(Process) as Process values(CommandLine) as CommandLine by Detection_id

When I take away the where clause it runs but of course produces a long list of stuff where the majority are missing key field values (detect_description, detect_name, detect_scenario are all unique fields in the DetectionName sourcetype while Process and CommandLine are unique fields in the ProcessInfo sourcetype).

 

 

 

0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Here is a runanywhere example of this technique working

| makeresults count=20
| fields - _time
| eval sourcetype=mvindex(split("DetectionName,ProcessInfo,FalsePositive",","),random()%3)
| eval Detection_id=random()%10
| eval detect_name=if(sourcetype="DetectionName","Bypass UAC",null())
| eval Process=if(sourcetype="ProcessInfo",mvindex(split("xyz",""),random()%3).mvindex(split("xyz",""),random()%3).mvindex(split("xyz",""),random()%3),null())
| eventstats values(sourcetype) as sourcetype by Detection_id
| where sourcetype!="FalsePositive" AND sourcetype="DetectionName" AND sourcetype="ProcessInfo"
| stats values(detect_name) as detect_name values(Process) as Process values(sourcetype) as sourcetype by Detection_id
0 Karma

Splunk2095
Engager

I got it working!!!!

After specifying the original sourcetypes in the beginning, I did a pipe to fields and specified all fields that were going to be included in some form or another from each of the sourcetypes.

Then I tested by having a wildcard for the detection name and having |where sourcetype="FalsePositive".  I saw 4 results and wrote down the detect_id's.  Then I ran the same search but did |where sourcetype!="FalsePositive" and it displayed even more events, but none with the detect_id that showed up when including "FalsePositive".

Brain is fried hope that makes sense lol.  Either way, thank you so much!  I wouldn't have reached this point without your help.  Also, the search runs so much faster than using join ofc :).

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...