<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Join closest event from different logs in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175558#M50396</link>
    <description>&lt;P&gt;Thank you for your answer. I'll try to be more accurate. I have two types of logs:&lt;BR /&gt;
1) SSH access log (sourcetype="sshd"). Examples:&lt;BR /&gt;
May 12 23:59:59 SRV1_IP sshd[11720]: Accepted publickey for SSH_LOGIN1 from SSH_CLIENT1_IP port 65093 ssh2&lt;BR /&gt;&lt;BR /&gt;
May 12 23:59:59 SRV2_IP sshd[4441]: Accepted publickey for SSH_LOGIN2 from SSH_CLIENT2_IP port 53704 ssh2&lt;BR /&gt;
 You can get needed fields using | rex " (?&amp;lt;srv_ip&amp;gt;\S+) sshd.*? for (?&amp;lt;local_user&amp;gt;\S+) from (?&amp;lt;src_ip&amp;gt;\S+)" as you did.&lt;BR /&gt;
2) Firewall logs (sourcetype="firewall_logs"). Examples:&lt;BR /&gt;
timestamp="1431464386" event="ADD (post-auth)" login="LOGIN1" assigned_ip="CLIENT1_IP" &lt;BR /&gt;
timestamp="1431464446" event="ADD (post-auth)" login="LOGIN2" assigned_ip="CLIENT2_IP"&lt;/P&gt;

&lt;P&gt;Users should use their own logins for ssh authentication. But they can use other ones. In such cases they can use other laptop (with IP for different login), or maybe they are hacked. In both cases login!=ssh_login, but client_ip==ssh_client_ip. So I want to know and alert about such cases. The algorithm is:&lt;BR /&gt;
1) Get SSH events where we can see SSH_CLIENT_IP, SRV_IP and SSH_LOGIN. It is events with accepting connections.&lt;BR /&gt;
2) Then we need to know what LOGIN SSH_CLIENT_IP has. That's why we need to merge every event from clause 1 with firewall event, where SSH_EVENTTIME &amp;gt; FW_EVENTTIME for paticular SSH_CLIEN_IP (which equal to CLIENT_IP from fw logs) and FW_EVENTTIME is the nearest to SSH_EVENTTIME.&lt;BR /&gt;
3) After merging we can get events where SSH_LOGIN!=LOGIN.&lt;/P&gt;</description>
    <pubDate>Mon, 28 Sep 2020 19:53:49 GMT</pubDate>
    <dc:creator>IVV</dc:creator>
    <dc:date>2020-09-28T19:53:49Z</dc:date>
    <item>
      <title>Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175555#M50393</link>
      <description>&lt;P&gt;Hello everyone!&lt;BR /&gt;
The problem: I want to identify users who use SSH with login other than their own. I have two logs:&lt;BR /&gt;
1) SSH log like&lt;BR /&gt;
timestamp="1111120" yyy.yyy.yyy.yyy sshd[1737]: Accepted publickey for user1 from xxx.xxx.xxx.xxx port 41902 ssh2&lt;BR /&gt;
2) Firewall log like&lt;BR /&gt;
timestamp="1111111" login="user1" assigned_ip="xxx.xxx.xxx.xxx" &lt;BR /&gt;
Now it works using the query below&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; sourcetype="sshd" "Accepted"
| rex field=_raw  " (?&amp;amp;lt;srv_ip&amp;amp;gt;\S+) sshd" 
| lookup dnslookup clientip AS srv_ip OUTPUT clienthost as  fqdn 
| rex field=_raw  "for (?&amp;amp;lt;local_user&amp;amp;gt;\S+) from" 
| rex field=_raw  "from (?&amp;amp;lt;src_ip&amp;amp;gt;\S+)"
| join type=inner src_ip [search sourcetype="firewall_logs"  
|  rename assigned_ip as src_ip  
|  stats max(_time) as fwtimestamp  by login,src_ip  
| convert timeformat="%m/%d/%Y %H:%M:%S" ctime(fwtimestamp) as fwtime  
| sort -fwtimestamp]  
| convert timeformat="%m/%d/%Y %H:%M:%S" ctime(accesstimestamp) as accesstime
| fields - accesstimestamp  fwtimestamp 
| where fwtime&amp;amp;lt;accesstime 
| where local_user!="root" 
| where login!=local_user
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;But it misses a lot of events because of using 'stats max'. So I need to compare ssh event to firewall event by closest same IP lease. &lt;BR /&gt;
Example:&lt;BR /&gt;
time=5 ip=ip1 login=man1&lt;BR /&gt;
time=7 ip=ip2 login=man2&lt;BR /&gt;
time=9 ip=ip2 login=man2&lt;BR /&gt;
time=10 ssh_ip=ip_ssh ssh_login=evil ip=ip2&lt;BR /&gt;
time=15 ip=ip2 login=man2&lt;BR /&gt;
Result should be something like this:&lt;BR /&gt;
fw_time=9 ssh_time=10 ip=ip2 login=man2 ssh_ip=ip_ssh ssh_login=evil&lt;/P&gt;

&lt;P&gt;How can I do it?&lt;BR /&gt;
Thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 19:09:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175555#M50393</guid>
      <dc:creator>IVV</dc:creator>
      <dc:date>2020-09-28T19:09:26Z</dc:date>
    </item>
    <item>
      <title>Re: Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175556#M50394</link>
      <description>&lt;P&gt;I can use usetime=true earlier=true to make timebased join. But when I do something like:&lt;BR /&gt;
     sourcetype="sshd" "Accepted"|  rex field=_raw  " (?&amp;lt;srv_ip&amp;gt;\S+) sshd" | lookup dnslookup clientip AS srv_ip OUTPUT clienthost as  fqdn | rex field=_raw  "for (?&amp;lt;local_user&amp;gt;\S+) from" |   rex field=_raw  "from (?&amp;lt;src_ip&amp;gt;\S+)" | stats max(_time) as accesstimestamp  by src_ip,fqdn,local_user  | join type=inner usetime=true earlier=true src_ip [search sourcetype="firewall_logs"  |  rename assigned_ip as src_ip  |  eval fwtimestamp=_time | table fwtimestamp,login,src_ip  | convert timeformat="%m/%d/%Y %H:%M:%S" ctime(fwtimestamp) as fwtime  | sort -fwtimestamp]  |  convert timeformat="%m/%d/%Y %H:%M:%S" ctime(accesstimestamp) as accesstime| fields - accesstimestamp  fwtimestamp |  where local_user!="root" | where login!=local_user&lt;BR /&gt;
I can see results where fwtime&amp;gt;accesstime. But it shouldn't be. What's the reason?&lt;/P&gt;

&lt;P&gt;To make it more lightweight. I use the following query (similar example):&lt;BR /&gt;
sourcetype="web_access" | stats max(_time) as accesstime by src_ip,web_login | join type=inner usetime=true earlier=true [ search sourcetype="fw_logs" | eval fwtime=_time | table fwtime,login,src_ip] | where web_login!=login&lt;BR /&gt;
and I can see results where fwtime&amp;gt;accesstime. That's weird.&lt;/P&gt;

&lt;P&gt;Maybe stats have no information about time?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 19:53:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175556#M50394</guid>
      <dc:creator>IVV</dc:creator>
      <dc:date>2020-09-28T19:53:22Z</dc:date>
    </item>
    <item>
      <title>Re: Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175557#M50395</link>
      <description>&lt;P&gt;Your problems are surely due to the limits on the number of events returned by subsearches.  You can do the join without &lt;CODE&gt;join&lt;/CODE&gt; (and thus without the subsearch and its lmits) and I strongly encourage you to do so.  This search will do the join and enhance event data with the field I think you need:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;(sourcetype="sshd" "Accepted") OR (sourcetype="firewall_logs") | rex " (?&amp;amp;lt;srv_ip&amp;amp;gt;\S+) sshd.*? for (?&amp;amp;lt;local_user&amp;amp;gt;\S+) from (?&amp;amp;lt;src_ip&amp;amp;gt;\S+)" | lookup dnslookup clientip AS srv_ip OUTPUT clienthost AS fqdn | rename assigned_ip as src_ip | eventstats dc(sourcetype) AS sourcetypes by src_ip | search sourcetypes&amp;amp;gt;=2 | eventstats max(_time) AS maxTimeBySTandLogin BY sourcetype, login
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;At this point you only have events which share the same value of &lt;CODE&gt;src_ip&lt;/CODE&gt; (events with any particular &lt;CODE&gt;src_ip&lt;/CODE&gt; that do not show up in each sourcetype are filtered out) and you have enhanced each event to gain a field &lt;CODE&gt;maxTimeBySTandLogin&lt;/CODE&gt; which contains the highest/latest/most-recent time among groups of events which share the same &lt;CODE&gt;sourcetype&lt;/CODE&gt; and &lt;CODE&gt;login&lt;/CODE&gt;.  In other words, within each sourcetype dataset, each event possesses its own copy the (shared) &lt;CODE&gt;max(_time)&lt;/CODE&gt; which was assessed by comparing all events with the same &lt;CODE&gt;login&lt;/CODE&gt;.  From here, you should be able to modify until you get exactly what you would like.  If you can't get there on your own, do explain exactly what the post-join logic is supposed to do (it is not at clear to me because some of your explanation diverges from itself).&lt;/P&gt;</description>
      <pubDate>Tue, 12 May 2015 18:23:32 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175557#M50395</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2015-05-12T18:23:32Z</dc:date>
    </item>
    <item>
      <title>Re: Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175558#M50396</link>
      <description>&lt;P&gt;Thank you for your answer. I'll try to be more accurate. I have two types of logs:&lt;BR /&gt;
1) SSH access log (sourcetype="sshd"). Examples:&lt;BR /&gt;
May 12 23:59:59 SRV1_IP sshd[11720]: Accepted publickey for SSH_LOGIN1 from SSH_CLIENT1_IP port 65093 ssh2&lt;BR /&gt;&lt;BR /&gt;
May 12 23:59:59 SRV2_IP sshd[4441]: Accepted publickey for SSH_LOGIN2 from SSH_CLIENT2_IP port 53704 ssh2&lt;BR /&gt;
 You can get needed fields using | rex " (?&amp;lt;srv_ip&amp;gt;\S+) sshd.*? for (?&amp;lt;local_user&amp;gt;\S+) from (?&amp;lt;src_ip&amp;gt;\S+)" as you did.&lt;BR /&gt;
2) Firewall logs (sourcetype="firewall_logs"). Examples:&lt;BR /&gt;
timestamp="1431464386" event="ADD (post-auth)" login="LOGIN1" assigned_ip="CLIENT1_IP" &lt;BR /&gt;
timestamp="1431464446" event="ADD (post-auth)" login="LOGIN2" assigned_ip="CLIENT2_IP"&lt;/P&gt;

&lt;P&gt;Users should use their own logins for ssh authentication. But they can use other ones. In such cases they can use other laptop (with IP for different login), or maybe they are hacked. In both cases login!=ssh_login, but client_ip==ssh_client_ip. So I want to know and alert about such cases. The algorithm is:&lt;BR /&gt;
1) Get SSH events where we can see SSH_CLIENT_IP, SRV_IP and SSH_LOGIN. It is events with accepting connections.&lt;BR /&gt;
2) Then we need to know what LOGIN SSH_CLIENT_IP has. That's why we need to merge every event from clause 1 with firewall event, where SSH_EVENTTIME &amp;gt; FW_EVENTTIME for paticular SSH_CLIEN_IP (which equal to CLIENT_IP from fw logs) and FW_EVENTTIME is the nearest to SSH_EVENTTIME.&lt;BR /&gt;
3) After merging we can get events where SSH_LOGIN!=LOGIN.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 19:53:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175558#M50396</guid>
      <dc:creator>IVV</dc:creator>
      <dc:date>2020-09-28T19:53:49Z</dc:date>
    </item>
    <item>
      <title>Re: Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175559#M50397</link>
      <description>&lt;P&gt;In the first part of the original query max() function was used only to limit the number of events before joining. It is not the best solution because we can miss some events, but for the sake of simplicity let us not consider ones.&lt;/P&gt;</description>
      <pubDate>Wed, 13 May 2015 15:08:59 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175559#M50397</guid>
      <dc:creator>IVV</dc:creator>
      <dc:date>2015-05-13T15:08:59Z</dc:date>
    </item>
    <item>
      <title>Re: Join closest event from different logs</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175560#M50398</link>
      <description>&lt;P&gt;Did you get this working? I would like to know how it works&lt;/P&gt;</description>
      <pubDate>Tue, 19 Sep 2017 20:19:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Join-closest-event-from-different-logs/m-p/175560#M50398</guid>
      <dc:creator>anuremanan88</dc:creator>
      <dc:date>2017-09-19T20:19:16Z</dc:date>
    </item>
  </channel>
</rss>

