Splunk Search

How to join with multiple matches?

Hendrik2509
Engager

Im trying to join the correct source hostname to my Event from where a RDP Connection was innitiated.
Since the Event just provides the Source IP-Address, I want to join the hostname from my summary Index that has hostnames with the IP-Addresses which they have been assigned to over time (1m Bucket)

Unfortunately its not working as expected.

I build the search as following:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address

 

Now what happens is, that the Source_Network_Addresses are getting matched, but it only returns the latest _time value from the summary_index by the matched Network Address for all, which ofc mostly results in a wrong hostname

Why is it not also matching the _time value from the base search with the _time value from the subsearch?

both _time fields are in timestamp format

Thanks for helping me

 

Labels (1)
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Fields that begin with underscore e.g. _time are hidden from field lists unless explicitly included. Try it like this:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left _time Source_Network_Address
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address
0 Karma

PickleRick
SplunkTrust
SplunkTrust

That's not how you do it. Splunk is not a RDBMS 😉

Normal approach to such task is to list all events with overlapping fields on which you want to aggregate events and do some groupping with stats command.

For example - a rough idea if you want to "join" on source_ip:

(<search for RDP events>) OR (<search for summary events>)|  eval summary_source_hostname=if(index="summary",hostname,null())| eval source_ip=if(index="summary",Source_Network_Address,source_ip_field_from_rdp_connection)
| bin _time span=1m
| stats values(summary_source_ip) values(rdp_username) by source_ip _time

 

Hendrik2509
Engager

thanks, I have changed it to this:

 

(<RDP_Event_Search>) OR (<summary_index>)
| bucket span=1m _time
| eval host = if(index=<RDP_Events>,host,null())
| eval Source_Network_Address = if(index=<summary_index>,IP,Source_Network_Address)
| eval RDP_Event_time = if(EventCode=21,_time,null())
| stats values(hostname) as source_hostname values(host) as dest_host values(RDP_Event_time) as RDP_Event_time by Source_Network_Address _time
| where RDP_Event_time=_time
| convert ctime(RDP_Event_time)

 

 

It works now, but the search is quiet slow though.

I created a kvstore from my summary_index and tried a lookup, which turned out to be a lot faster.

<RDP_Event_Search>
| bucket span=1m _time
| lookup <lookup_name> IP as Source_Network_Address _time as _time OUTPUT hostname as source_hostname 
| table _time host source_hostname Source_Network_Address
| where isnotnull(source_hostname)

 

However, changing to a kvstore would result in a continuously grow of plus >1.000.000 rows per day, are there any size limit recommendations for kvstores and lookup-queries?

0 Karma

PickleRick
SplunkTrust
SplunkTrust

Well, yes. Stats will be slower than lookup since lookup works completely differently.

I'm a bit puzzled though why you fool around with RDP_Event_time since it should be the same as _time and you're statsing over it anyway.

0 Karma
Get Updates on the Splunk Community!

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

March Community Office Hours Security Series Uncovered!

Hello Splunk Community! In March, Splunk Community Office Hours spotlighted our fabulous Splunk Threat ...

Stay Connected: Your Guide to April Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars in April. This post ...