Splunk Search

How to join with multiple matches?

Hendrik2509
Engager

Im trying to join the correct source hostname to my Event from where a RDP Connection was innitiated.
Since the Event just provides the Source IP-Address, I want to join the hostname from my summary Index that has hostnames with the IP-Addresses which they have been assigned to over time (1m Bucket)

Unfortunately its not working as expected.

I build the search as following:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address

 

Now what happens is, that the Source_Network_Addresses are getting matched, but it only returns the latest _time value from the summary_index by the matched Network Address for all, which ofc mostly results in a wrong hostname

Why is it not also matching the _time value from the base search with the _time value from the subsearch?

both _time fields are in timestamp format

Thanks for helping me

 

Labels (1)
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Fields that begin with underscore e.g. _time are hidden from field lists unless explicitly included. Try it like this:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left _time Source_Network_Address
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address
0 Karma

PickleRick
SplunkTrust
SplunkTrust

That's not how you do it. Splunk is not a RDBMS 😉

Normal approach to such task is to list all events with overlapping fields on which you want to aggregate events and do some groupping with stats command.

For example - a rough idea if you want to "join" on source_ip:

(<search for RDP events>) OR (<search for summary events>)|  eval summary_source_hostname=if(index="summary",hostname,null())| eval source_ip=if(index="summary",Source_Network_Address,source_ip_field_from_rdp_connection)
| bin _time span=1m
| stats values(summary_source_ip) values(rdp_username) by source_ip _time

 

Hendrik2509
Engager

thanks, I have changed it to this:

 

(<RDP_Event_Search>) OR (<summary_index>)
| bucket span=1m _time
| eval host = if(index=<RDP_Events>,host,null())
| eval Source_Network_Address = if(index=<summary_index>,IP,Source_Network_Address)
| eval RDP_Event_time = if(EventCode=21,_time,null())
| stats values(hostname) as source_hostname values(host) as dest_host values(RDP_Event_time) as RDP_Event_time by Source_Network_Address _time
| where RDP_Event_time=_time
| convert ctime(RDP_Event_time)

 

 

It works now, but the search is quiet slow though.

I created a kvstore from my summary_index and tried a lookup, which turned out to be a lot faster.

<RDP_Event_Search>
| bucket span=1m _time
| lookup <lookup_name> IP as Source_Network_Address _time as _time OUTPUT hostname as source_hostname 
| table _time host source_hostname Source_Network_Address
| where isnotnull(source_hostname)

 

However, changing to a kvstore would result in a continuously grow of plus >1.000.000 rows per day, are there any size limit recommendations for kvstores and lookup-queries?

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Well, yes. Stats will be slower than lookup since lookup works completely differently.

I'm a bit puzzled though why you fool around with RDP_Event_time since it should be the same as _time and you're statsing over it anyway.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...