What's the recommended approach for correlating events when the events themselves don't all share a common field? (e.g., there is no single transaction field across all events.)
I'm wondering if there is a common technique here. I have a specific problem in mind that spans 3 different types of events, but there are some other problems that I would like to tackle that would require spanning even more....
Here is an example I've been wrestling with: I'm attempting to correlate a set of RAS (remote dial-up access) logs with activity in our FTP logs. Our FTP file transfer logs contain the IP address (which is assigned by the RAS server), but the most helpful of the RAS logs contain only the RAS com port number, not the IP address. This gets tricky to visualize, so perhaps a little ASCII table will help...
Event Type         User  Com Port  Ip  Stats  Unique Id  EventCode
-----------------  ----  --------  --  -----  ---------  ---------
RAS1 (Login)       X     X         X                     20200
FTP1 (file xfer)                   X          X
RAS2 (Logout)      X     X             X                 20048
RAS3 (Disconnect)        X                               20201
The info I really want is in the "Stats" and "Unique Id" columns, but you can't get from RAS2 to FTP1 directly without an IP address.
My end goal is to get to the unique id information in the "FTP1" event and combine that with the stats info in the "RAS2" event. So what I'm really looking for is a transaction command that sometimes lets me link based on the IP address and other times lets me link based on User and Com Port, and the transaction command doesn't seem to like doing this.
Additional complexities:
(startswith=RAS1 and endswith=RAS2 OR RAS3)
Here is an example that I came up with so far:
host=RAS-SERVER sourcetype=WinEventLog:System "SourceName=RemoteAccess"
("EventCode=20200" OR "EventCode=20048")
| rex field=Message " user (?<user>\S+) .*? port (?<port>[^ .]+)"
| rex " address (?<clientip>\d+\.\d+\.\d+\.\d+)\b"
| rex "\b(?<bytes_sent>[0-9]+)\s+bytes\s+.*?\s+sent.*?\s+(?<bytes_received>[0-9]+) bytes .*? received"
| transaction port user clientip maxspan=2h endswith=("EventCode=20048")
| append
[ search host=FTP-SERVER sourcetype="xferlog" clientip="172.16.*" file_name="/important/path/*" ]
| sort - _time
| transaction clientip maxspan=15m
| table _time, route, route_name, clientip, port, bytes*
This search nearly works. My inner (first) transaction groups together all my RAS-based events, which works well. At that point I have a single event that has all the fields that I want to join with the corresponding FTP events, but then I run into issues with my second transaction because the timestamp of my event is now that of the earliest RAS event, the FTP event comes sometime later, and I no longer have a clear endswith= (because all of the RAS events are rolled into one by the first transaction). So I'm left with a time-frame-based transaction. The "15m" is rather arbitrary: sometimes it works and other times it does not, depending on how busy the server is at any given moment. (I don't think there is a perfect value here; there's too much variation.)
Is there some transaction-like search command that would let me collect fields from common events without actually consolidating them into a single event? Or is there some way to incorporate the duration of the first transaction with the ending of the second-level transaction?
There are a couple of choices here. You can still use your basic search above with the second transaction based on clientip and use a startswith=(EventCode=20200).
The other possibility is to use a time-based lookup. If you record all of the RAS-SERVER transactions to a lookup table, you can match the route, route_name, port, etc. based on the clientip with a search against just the FTP-SERVER logs.
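To make the time-based lookup idea concrete, here is a minimal Python sketch of the matching logic only. It is not SPL and not how Splunk implements lookups; the `RasSession` record, the `find_session` helper, and the sample values are all hypothetical. The assumption is that each completed RAS session has been recorded with its start and end times, so an FTP event can be matched on clientip plus timestamp rather than on an arbitrary maxspan.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class RasSession:
    # One row of the (hypothetical) lookup table built from the RAS events.
    clientip: str
    user: str
    port: str
    start: datetime  # time of the login event (EventCode 20200)
    end: datetime    # time of the logout event (EventCode 20048)
    bytes_sent: int
    bytes_received: int


def find_session(sessions: List[RasSession], clientip: str,
                 when: datetime) -> Optional[RasSession]:
    """Return the session that held `clientip` at time `when`, or None."""
    candidates = [s for s in sessions
                  if s.clientip == clientip and s.start <= when <= s.end]
    # If several sessions overlap, prefer the one that started most recently.
    return max(candidates, key=lambda s: s.start, default=None)


# The RAS server reuses IP addresses, so the same clientip appears twice.
sessions = [
    RasSession("172.16.0.5", "alice", "COM3",
               datetime(2010, 1, 1, 9, 0), datetime(2010, 1, 1, 9, 40), 1000, 2000),
    RasSession("172.16.0.5", "bob", "COM4",
               datetime(2010, 1, 1, 10, 0), datetime(2010, 1, 1, 11, 30), 500, 800),
]

# An FTP transfer from that IP at 10:45 falls inside the second session.
match = find_session(sessions, "172.16.0.5", datetime(2010, 1, 1, 10, 45))
print(match.user, match.port, match.bytes_sent)
```

Because the match uses the recorded session boundaries, a transfer that happens between two sessions on the same IP simply finds no session instead of being attached to the wrong one.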
From the ASCII table that you've provided, using connected=f in the transaction should get rid of the subsearch:
.... | transaction port clientip maxspan=2h endswith=("EventCode=20048") startswith=(EventCode=20200) connected=f
To see why we need to set connected=f, consider the same table in reverse time order (the order in which the transaction command sees the events). We need to set connected to false because no transitive relationship has been established between RAS2 and FTP1 when the events are viewed in reverse chronological order: the two share no field until RAS1, which carries both the port and the IP, is reached.
Event Type         User  Com Port  Ip  Stats  Unique Id  EventCode
-----------------  ----  --------  --  -----  ---------  ---------
RAS3 (Disconnect)        X                               20201
RAS2 (Logout)      X     X             X                 20048
FTP1 (file xfer)                   X          X
RAS1 (Login)       X     X         X                     20200
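The reasoning above can be illustrated with a deliberately simplified Python model of how events might be grouped with and without the connected requirement. This is an assumption-laden toy, not Splunk's actual transaction algorithm: the `group` function, the sample field values, and the rule "join if no field value conflicts, and (when connected) at least one field is shared" are all made up for illustration. RAS3 is left out to keep the example small.

```python
def group(events, fields, connected=True):
    """Group events (dicts) into transactions under a simplified model."""
    transactions = []  # each item: {"values": {field: value}, "events": [names]}
    for ev in events:
        ev_vals = {f: ev[f] for f in fields if f in ev}
        target = None
        for tx in transactions:
            shared = [f for f in ev_vals if f in tx["values"]]
            if any(ev_vals[f] != tx["values"][f] for f in shared):
                continue  # conflicting value: cannot belong to this transaction
            # connected=True additionally demands at least one shared field.
            if shared or not connected:
                target = tx
                break
        if target is None:
            target = {"values": {}, "events": []}
            transactions.append(target)
        target["values"].update(ev_vals)
        target["events"].append(ev["name"])
    return [tx["events"] for tx in transactions]


# Events in reverse chronological order, as in the table above.
# Field values are hypothetical.
events = [
    {"name": "RAS2", "user": "alice", "port": "COM3"},
    {"name": "FTP1", "clientip": "172.16.0.5"},
    {"name": "RAS1", "user": "alice", "port": "COM3", "clientip": "172.16.0.5"},
]
fields = ("port", "clientip")

strict = group(events, fields, connected=True)
loose = group(events, fields, connected=False)
print(strict)
print(loose)
```

In this model, the strict run leaves FTP1 in a transaction of its own because it shares no field with RAS2 at the moment it is seen, while the loose run keeps all three events together.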
Also, I would really like to eliminate the sub-search (and the append search command) if at all possible.
Your startswith option does yield more accurate results. Thanks Stephen! (Do you think it would be possible to eliminate that subsearch?)