Splunk Search

How to join with multiple matches?

Hendrik2509
Engager

Im trying to join the correct source hostname to my Event from where a RDP Connection was innitiated.
Since the Event just provides the Source IP-Address, I want to join the hostname from my summary Index that has hostnames with the IP-Addresses which they have been assigned to over time (1m Bucket)

Unfortunately its not working as expected.

I build the search as following:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address

 

Now what happens is, that the Source_Network_Addresses are getting matched, but it only returns the latest _time value from the summary_index by the matched Network Address for all, which ofc mostly results in a wrong hostname

Why is it not also matching the _time value from the base search with the _time value from the subsearch?

both _time fields are in timestamp format

Thanks for helping me

 

Labels (1)
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Fields that begin with underscore e.g. _time are hidden from field lists unless explicitly included. Try it like this:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left _time Source_Network_Address
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address
0 Karma

PickleRick
SplunkTrust
SplunkTrust

That's not how you do it. Splunk is not a RDBMS 😉

Normal approach to such task is to list all events with overlapping fields on which you want to aggregate events and do some groupping with stats command.

For example - a rough idea if you want to "join" on source_ip:

(<search for RDP events>) OR (<search for summary events>)|  eval summary_source_hostname=if(index="summary",hostname,null())| eval source_ip=if(index="summary",Source_Network_Address,source_ip_field_from_rdp_connection)
| bin _time span=1m
| stats values(summary_source_ip) values(rdp_username) by source_ip _time

 

Hendrik2509
Engager

thanks, I have changed it to this:

 

(<RDP_Event_Search>) OR (<summary_index>)
| bucket span=1m _time
| eval host = if(index=<RDP_Events>,host,null())
| eval Source_Network_Address = if(index=<summary_index>,IP,Source_Network_Address)
| eval RDP_Event_time = if(EventCode=21,_time,null())
| stats values(hostname) as source_hostname values(host) as dest_host values(RDP_Event_time) as RDP_Event_time by Source_Network_Address _time
| where RDP_Event_time=_time
| convert ctime(RDP_Event_time)

 

 

It works now, but the search is quiet slow though.

I created a kvstore from my summary_index and tried a lookup, which turned out to be a lot faster.

<RDP_Event_Search>
| bucket span=1m _time
| lookup <lookup_name> IP as Source_Network_Address _time as _time OUTPUT hostname as source_hostname 
| table _time host source_hostname Source_Network_Address
| where isnotnull(source_hostname)

 

However, changing to a kvstore would result in a continuously grow of plus >1.000.000 rows per day, are there any size limit recommendations for kvstores and lookup-queries?

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Well, yes. Stats will be slower than lookup since lookup works completely differently.

I'm a bit puzzled though why you fool around with RDP_Event_time since it should be the same as _time and you're statsing over it anyway.

0 Karma
Get Updates on the Splunk Community!

Developer Spotlight with Brett Adams

In our third Spotlight feature, we're excited to shine a light on Brett—a Splunk consultant, innovative ...

Index This | What can you do to make 55,555 equal 500?

April 2025 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with this ...

Say goodbye to manually analyzing phishing and malware threats with Splunk Attack ...

In today’s evolving threat landscape, we understand you’re constantly bombarded with phishing and malware ...