Splunk Search

Alternatives to "join" with two matching event fields.

sknot1454
Explorer

I'm trying to search through one sourcetype called "Windows_System".

There's also a specific field I'm interested in first called "EventCode"

If I do a search only for for sourcetype="Windows_System" EventCode=1, I get the results I want.

There's an executable that I want to be present in the fields when I search for EventCode 2 called "executable.exe"

If I do a search only for sourcetype="Windows_System" EventCode=2 process="executable.exe", I get the results I want.

However, there are fields that show up when EventCode=1 is specified for that aren't present when EventCode=2 is. I want to chart out all the fields I want if both EventCode values are specified.

The good thing is that there's a common field with a unique value between both searches called "GUID" so I can focus on that.

Here's the query so far

sourcetype="Windows_System" EventCode=2 Image="executable.exe"
| join GUID type=outer [search sourcetype="Windows_System" EventCode=1]
| rename SourceHostname as hostname_ip, SourceIp as source_ip, SourcePort as source_port, DestinationHostname as destination_hostname, DestinationIp as destination_ip, DestinationPort as destination_port
| table User, hostname_ip, source_ip, source_port, destination_hostname, destination_ip, destination_port, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

CommandLine, ParentImage, ParentCommandLine are fields that are present ONLY when you specify EventCode1 that aren't when you specify EventCode2. The data in those fields are not showing up when run my main.

I have two questions.

Is the "join" command really necessary or can I accomplish this with a less intensive search like with stats or chart?

Is this search the correct syntax to get the results I want?

Thanks!

0 Karma
1 Solution

sknot1454
Explorer

I know this was a while ago, but I ended up using "Eval GUID=coalesce(GUID,GUID)" and sorting by "GUID" when I tabled out the fields.

Since I know when EventCode=1 is specified, there would always be data in the field CommandLine, and I knew there would always be data for the field "DestinationIp" when EventCode=3 was specified, I just used "CommandLine!="" and DestinationIP="" to filter out null fields.

Works 100000x faster than a join. Queries that would take hours to run with join now take less than 10 minutes

View solution in original post

0 Karma

sknot1454
Explorer

I know this was a while ago, but I ended up using "Eval GUID=coalesce(GUID,GUID)" and sorting by "GUID" when I tabled out the fields.

Since I know when EventCode=1 is specified, there would always be data in the field CommandLine, and I knew there would always be data for the field "DestinationIp" when EventCode=3 was specified, I just used "CommandLine!="" and DestinationIP="" to filter out null fields.

Works 100000x faster than a join. Queries that would take hours to run with join now take less than 10 minutes

0 Karma

roopendra
Engager

Can you please share final version of your query? Thanks

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Joins are expensive and your requirement can easily be done using a stats. Try something like this

sourcetype="Windows_System" (EventCode=2 Image="executable.exe") OR (EventCode=1)
| stats values(process) as process, values(SourceHostname) as hostname_ip, values(SourceIp) as source_ip, values(SourcePort) as source_port, values(DestinationHostname) as destination_hostname, values(DestinationIp) as destination_ip, values(DestinationPort) as destination_port values(CommandLine) as CommandLine, values(ParentImage) as ParentImage, values(ParentCommandLine) as ParentCommandLine by User, Protocol, GUID
| table User, hostname_ip, source_ip, source_port, destination_hostname, destination_ip, destination_port, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

UPDATED SEARCH

sourcetype="Windows_System" (EventCode=2 Image="executable.exe") OR (EventCode=1)
| eval CommandLine=coalesce(CommandLine,null()) | eval ParentImage=coalesce(ParentImage,null()) | eval ParentCommandLine=coalesce(ParentCommandLine
 ,null())   | stats values(process) as process, values(SourceHostname) as hostname_ip, values(SourceIp) as source_ip, values(SourcePort) as source_port, values(DestinationHostname) as destination_hostname, values(DestinationIp) as destination_ip, values(DestinationPort) as destination_port values(CommandLine) as CommandLine, values(ParentImage) as ParentImage, values(ParentCommandLine) as ParentCommandLine by User, Protocol, GUID
    | table User, hostname_ip, source_ip, source_port, destination_hostname, destination_ip, destination_port, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

sknot1454
Explorer

You're definitely right. The stats helped speed up the query.

Unfortunately the CommandLine, ParentImage, ParentCommandLine field values are still blank in the results table.

All of the fields tabled out are common between EventCode1 and EventCode2 except for the three I listed above.

Any idea why they aren't being included?

0 Karma

somesoni2
SplunkTrust
SplunkTrust

See if updated answer works for you.

0 Karma

sknot1454
Explorer

Still not working.

I tabled out the _raw field and it seems only raw data the from EventCode 3 is in the field. There's nothing from EventCode 1.

It seems they aren't joining together which makes sense that I can't see those three fields in my table.

The data is in XML format. Would this be a problem?

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...