I'm currenty trying to combine data from our firewall and sysmon which is running on a testclient. I want to join the Commandline and the PID of the causing process to the firewall information. That works pretty well but I can then only see the datasets which were sucessfully joined. I want to extend the datasets that have been joined but I also want to keep the ones that couldn't be joined. This is my search:
sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image
How can I extend the datasets that match the join but stell display all the unmatched ones?
With join (which is very expensive BTW), you get option to do left join (keep everything from LHS of join) and can be done like this (modified the join subsearch to include aggregation command)
sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join type=left source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon | stats count by SourceIp SourcePort DestinationPort DestinationoIp | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image
But there may be a better alternative using stats. See if this give your results that you need (guessing it would work as you just doesn't seem to bother about time)
( sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block) OR (sourcetype=nxlog_sysmon)
| eval source_address=coalesce(source_address,SourceIp) | eval source_port=coalesce(source_port,SourcePort)
| eval destination_address=coalesce(destination_address,DestinationIp) | eval destination_port=coalesce(destination_port,DestinationPort)
| stats values(Image) as Image by source_address, destination_address, destination_port
With join (which is very expensive BTW), you get option to do left join (keep everything from LHS of join) and can be done like this (modified the join subsearch to include aggregation command)
sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join type=left source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon | stats count by SourceIp SourcePort DestinationPort DestinationoIp | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image
But there may be a better alternative using stats. See if this give your results that you need (guessing it would work as you just doesn't seem to bother about time)
( sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block) OR (sourcetype=nxlog_sysmon)
| eval source_address=coalesce(source_address,SourceIp) | eval source_port=coalesce(source_port,SourcePort)
| eval destination_address=coalesce(destination_address,DestinationIp) | eval destination_port=coalesce(destination_port,DestinationPort)
| stats values(Image) as Image by source_address, destination_address, destination_port
Thanks for your answer! It does work well and as you guessed correctly time is not a factor in this case. But out of curiosity: Is there any way to get the same result in a more efficient way?
The more effective solution comes with some serious problems. We don't currently have the exact same basis of data in both sourcetypes. When I'm using the stats search the datasets are merged eventhough there often aren't two matchind entries. I only want them to show up if there are two matching entries. I looked at all the documentation I could find about the coalesce method but it does not seem to be possible, is it?
the stats
search created by @somesoni2 is your efficient way to do it 😉
cheers, MuS