How to join with multiple matches?

Hendrik2509 · ‎04-02-2022

Im trying to join the correct source hostname to my Event from where a RDP Connection was innitiated.
Since the Event just provides the Source IP-Address, I want to join the hostname from my summary Index that has hostnames with the IP-Addresses which they have been assigned to over time (1m Bucket)

Unfortunately its not working as expected.

I build the search as following:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address

Now what happens is, that the Source_Network_Addresses are getting matched, but it only returns the latest _time value from the summary_index by the matched Network Address for all, which ofc mostly results in a wrong hostname

Why is it not also matching the _time value from the base search with the _time value from the subsearch?

both _time fields are in timestamp format

Thanks for helping me

ITWhisperer · ‎04-02-2022

Fields that begin with underscore e.g. _time are hidden from field lists unless explicitly included. Try it like this:

<search string for RDP Logon Event>
| bucket span=1m _time
| join type=left _time Source_Network_Address
[search index=<summary_index> 
| eval source_host = hostname
| eval Source_Network_Address = IP
| fields _time Source_Network_Address source_host]

| table _time host source_host Source_Network_Address

PickleRick · ‎04-02-2022

That's not how you do it. Splunk is not a RDBMS 😉

Normal approach to such task is to list all events with overlapping fields on which you want to aggregate events and do some groupping with stats command.

For example - a rough idea if you want to "join" on source_ip:

(<search for RDP events>) OR (<search for summary events>)|  eval summary_source_hostname=if(index="summary",hostname,null())| eval source_ip=if(index="summary",Source_Network_Address,source_ip_field_from_rdp_connection)
| bin _time span=1m
| stats values(summary_source_ip) values(rdp_username) by source_ip _time

Hendrik2509 · ‎04-02-2022

thanks, I have changed it to this:

(<RDP_Event_Search>) OR (<summary_index>)
| bucket span=1m _time
| eval host = if(index=<RDP_Events>,host,null())
| eval Source_Network_Address = if(index=<summary_index>,IP,Source_Network_Address)
| eval RDP_Event_time = if(EventCode=21,_time,null())
| stats values(hostname) as source_hostname values(host) as dest_host values(RDP_Event_time) as RDP_Event_time by Source_Network_Address _time
| where RDP_Event_time=_time
| convert ctime(RDP_Event_time)

It works now, but the search is quiet slow though.

I created a kvstore from my summary_index and tried a lookup, which turned out to be a lot faster.

<RDP_Event_Search>
| bucket span=1m _time
| lookup <lookup_name> IP as Source_Network_Address _time as _time OUTPUT hostname as source_hostname 
| table _time host source_hostname Source_Network_Address
| where isnotnull(source_hostname)

However, changing to a kvstore would result in a continuously grow of plus >1.000.000 rows per day, are there any size limit recommendations for kvstores and lookup-queries?

PickleRick · ‎04-03-2022

Well, yes. Stats will be slower than lookup since lookup works completely differently.

I'm a bit puzzled though why you fool around with RDP_Event_time since it should be the same as _time and you're statsing over it anyway.

How to join with multiple matches?

join

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25

Are you a member of the Splunk Community?

How to join with multiple matches?

join

Dashboards: Hiding charts while search is being executed and other uses for tokens

Splunk Observability Cloud's AI Assistant in Action Series: Explaining Metrics and ...

Brains, Bytes, and Boston: Learn from the Best at .conf25