Splunk Search

Alternatives to "join" with two matching event fields.

Explorer

I'm trying to search through one sourcetype called "Windows_System".

There's also a specific field I'm interested in first called "EventCode"

If I do a search only for for sourcetype="Windows_System" EventCode=1, I get the results I want.

There's an executable that I want to be present in the fields when I search for EventCode 2 called "executable.exe"

If I do a search only for sourcetype="Windows_System" EventCode=2 process="executable.exe", I get the results I want.

However, there are fields that show up when EventCode=1 is specified for that aren't present when EventCode=2 is. I want to chart out all the fields I want if both EventCode values are specified.

The good thing is that there's a common field with a unique value between both searches called "GUID" so I can focus on that.

Here's the query so far

sourcetype="WindowsSystem" EventCode=2 Image="executable.exe"
| join GUID type=outer [search sourcetype="Windows
System" EventCode=1]
| rename SourceHostname as hostnameip, SourceIp as sourceip, SourcePort as sourceport, DestinationHostname as destinationhostname, DestinationIp as destinationip, DestinationPort as destinationport
| table User, hostnameip, sourceip, sourceport, destinationhostname, destinationip, destinationport, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

CommandLine, ParentImage, ParentCommandLine are fields that are present ONLY when you specify EventCode1 that aren't when you specify EventCode2. The data in those fields are not showing up when run my main.

I have two questions.

Is the "join" command really necessary or can I accomplish this with a less intensive search like with stats or chart?

Is this search the correct syntax to get the results I want?

Thanks!

0 Karma
1 Solution

Explorer

I know this was a while ago, but I ended up using "Eval GUID=coalesce(GUID,GUID)" and sorting by "GUID" when I tabled out the fields.

Since I know when EventCode=1 is specified, there would always be data in the field CommandLine, and I knew there would always be data for the field "DestinationIp" when EventCode=3 was specified, I just used "CommandLine!="" and DestinationIP="" to filter out null fields.

Works 100000x faster than a join. Queries that would take hours to run with join now take less than 10 minutes

View solution in original post

0 Karma

Explorer

I know this was a while ago, but I ended up using "Eval GUID=coalesce(GUID,GUID)" and sorting by "GUID" when I tabled out the fields.

Since I know when EventCode=1 is specified, there would always be data in the field CommandLine, and I knew there would always be data for the field "DestinationIp" when EventCode=3 was specified, I just used "CommandLine!="" and DestinationIP="" to filter out null fields.

Works 100000x faster than a join. Queries that would take hours to run with join now take less than 10 minutes

View solution in original post

0 Karma

Engager

Can you please share final version of your query? Thanks

0 Karma

SplunkTrust
SplunkTrust

Joins are expensive and your requirement can easily be done using a stats. Try something like this

sourcetype="Windows_System" (EventCode=2 Image="executable.exe") OR (EventCode=1)
| stats values(process) as process, values(SourceHostname) as hostname_ip, values(SourceIp) as source_ip, values(SourcePort) as source_port, values(DestinationHostname) as destination_hostname, values(DestinationIp) as destination_ip, values(DestinationPort) as destination_port values(CommandLine) as CommandLine, values(ParentImage) as ParentImage, values(ParentCommandLine) as ParentCommandLine by User, Protocol, GUID
| table User, hostname_ip, source_ip, source_port, destination_hostname, destination_ip, destination_port, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

UPDATED SEARCH

sourcetype="Windows_System" (EventCode=2 Image="executable.exe") OR (EventCode=1)
| eval CommandLine=coalesce(CommandLine,null()) | eval ParentImage=coalesce(ParentImage,null()) | eval ParentCommandLine=coalesce(ParentCommandLine
 ,null())   | stats values(process) as process, values(SourceHostname) as hostname_ip, values(SourceIp) as source_ip, values(SourcePort) as source_port, values(DestinationHostname) as destination_hostname, values(DestinationIp) as destination_ip, values(DestinationPort) as destination_port values(CommandLine) as CommandLine, values(ParentImage) as ParentImage, values(ParentCommandLine) as ParentCommandLine by User, Protocol, GUID
    | table User, hostname_ip, source_ip, source_port, destination_hostname, destination_ip, destination_port, Protocol, GUID, process, CommandLine, ParentImage, ParentCommandLine

Explorer

You're definitely right. The stats helped speed up the query.

Unfortunately the CommandLine, ParentImage, ParentCommandLine field values are still blank in the results table.

All of the fields tabled out are common between EventCode1 and EventCode2 except for the three I listed above.

Any idea why they aren't being included?

0 Karma

SplunkTrust
SplunkTrust

See if updated answer works for you.

0 Karma

Explorer

Still not working.

I tabled out the _raw field and it seems only raw data the from EventCode 3 is in the field. There's nothing from EventCode 1.

It seems they aren't joining together which makes sense that I can't see those three fields in my table.

The data is in XML format. Would this be a problem?

0 Karma