My goal is to parse my SFTP logs, match each PID to a user name, and then generate a list of what each user downloaded and how many times they downloaded it.
I've found the two parts of the puzzle, but now I need to put them together. My SFTP logs look like this:
May 12 11:09:17 sftp2 internal-sftp: session opened for local user sftp_user from ip
May 12 11:09:24 sftp2 internal-sftp: open "/home/bla/ubuntu-10.04-alternate-amd64.iso" flags WRITE,CREATE,TRUNCATE mode 0644
May 12 16:16:54 sftp2 internal-sftp: open "/home/boo/UPLOADFILE" flags WRITE,CREATE,TRUNCATE mode 0755
What I would like to do first is match the PID to the user name, which I have found I can do with this:
host=sftpserver source=/var/log/sftp.log | stats first(sftp_action) as sftp_action first(sftp_user) as sftp_user by sftp_pid
Then I would like to create a list by sftp_user of all the files (extracted as sftp_file) that a user has run the "open" action on (the action is extracted as sftp_action). This command does almost what I need, except that it doesn't match the sftp_pid to the sftp_user:
host=sftpserver source=/var/log/sftp.log sftp_action=open | stats count values(sftp_file) by sftp_pid
I keep trying different combinations but have yet to come up with anything that has all of the output I'm looking for.
Sorry, the messed-up line breaks have been fixed, so you should now be able to see the issue better. The user name is not always logged alongside the PID. The first line in the example log I posted contains both the PID and the user name, but that only happens when the user logs in. After that, as demonstrated in the 2nd and 3rd lines, there is only a PID (unique to that user's session). So I can get the PID to match the user name by using:
| stats count values(sftp_file) by sftp_pid
but now I need to see a list of opened items (in lines 2 and 3 of my example) categorized by sftp_user. Does that make more sense now?
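
To spell out the logic I'm after, here is a rough Python sketch rather than a Splunk search. It assumes the events have already been reduced to the extracted fields (sftp_pid, sftp_user, sftp_action, sftp_file). The PID value, the "session" action value, and the function name opens_by_user are made up for illustration; only the file paths and the user name come from my example log.

```python
from collections import Counter, defaultdict

# Hypothetical events, already reduced to the extracted fields.
# The sftp_pid value and the "session" action value are assumptions.
events = [
    {"sftp_pid": "1001", "sftp_action": "session", "sftp_user": "sftp_user"},
    {"sftp_pid": "1001", "sftp_action": "open",
     "sftp_file": "/home/bla/ubuntu-10.04-alternate-amd64.iso"},
    {"sftp_pid": "1001", "sftp_action": "open", "sftp_file": "/home/boo/UPLOADFILE"},
    {"sftp_pid": "1001", "sftp_action": "open", "sftp_file": "/home/boo/UPLOADFILE"},
]


def opens_by_user(events):
    """Return {user: Counter({file: open_count})} for all "open" events."""
    # Pass 1: learn which user owns each PID from any event that names a user
    # (only the login line does, in my logs).
    pid_to_user = {}
    for event in events:
        if "sftp_user" in event:
            pid_to_user.setdefault(event["sftp_pid"], event["sftp_user"])

    # Pass 2: attribute every "open" event to the user behind its PID.
    counts = defaultdict(Counter)
    for event in events:
        if event.get("sftp_action") == "open":
            user = pid_to_user.get(event["sftp_pid"], "unknown")
            counts[user][event["sftp_file"]] += 1
    return counts


report = opens_by_user(events)
for user, files in report.items():
    for path, count in files.items():
        print(user, path, count)
```

In other words, I need the user name from the login event copied onto every other event with the same sftp_pid before the counting by sftp_user happens.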