I have a log (IPs and user name altered):
Time - ID - Command - Argument
2018-05-16T18:06:23.680096Z   225 Connect   Access denied for user 'admin'@'214.76.36.55' (using password: YES)
2018-05-16T18:06:25.389937Z   226 Connect   admin@214.76.36.55 on  using TCP/IP
2018-05-16T18:06:25.390134Z   226 Connect   Access denied for user 'admin'@'218.76.36.55' (using password: YES) 
2018-05-16T18:15:49.136165Z   227 Connect   ruff@205.181.75.173 on  using TCP/IP
2018-05-16T18:15:49.213672Z   227 Query select @@version_comment limit 1
2018-05-16T18:15:52.691468Z   227 Quit  
I'd like to build a query that will show me any two lines that 
Line 1 has a "Connect" and the NEXT line has a "Query"
2018-05-16T18:06:25.390134Z   226 Connect   Access denied for user 'admin'@'218.76.36.55' (using password: YES) 
2018-05-16T18:15:49.136165Z   227 Connect   ruff@205.181.75.173 on  using TCP/IP
Note, the 226 and 227 (ID) appear to be unique for each attempt, i.e. both 226's are related to the same connection, the 227, Connect, query and quit are all part of the same connection. Also, some of the lines are not formatted so nicely.
I'm stumped!! Thanks in advance.
 
					
				
		
If your ID is getting recycled, you'd have to make some assumption about the max time when ID will be recycled. Lets assuming an ID remains unique for a session at least for a day (it's not getting recycled same day), they try something like this
your base search
 | rex "^\S+\s+(?<ID>\S+)\s+(?<Action>\S+)" | eval date=strftime(_time,"%F")
 | stats list(_time) as _raw list(Action) as Action by ID date
 | where isnotnull(mvfind(Action,"Connect")) AND isnotnull(mvfind(Action,"Query"))
 
					
				
		
If your ID is getting recycled, you'd have to make some assumption about the max time when ID will be recycled. Lets assuming an ID remains unique for a session at least for a day (it's not getting recycled same day), they try something like this
your base search
 | rex "^\S+\s+(?<ID>\S+)\s+(?<Action>\S+)" | eval date=strftime(_time,"%F")
 | stats list(_time) as _raw list(Action) as Action by ID date
 | where isnotnull(mvfind(Action,"Connect")) AND isnotnull(mvfind(Action,"Query"))
That seemed to work, spot on. Since you didn't answer this in the normal answer block, I can't upvote or select this as good. If you put it in the answer block, I can give you credit.
Again, thanks!!!
 
					
				
		
Here you go.
I must have pasted wrong, I"m looking for this combo:
2018-05-16T18:15:49.136165Z 227 Connect ruff@205.181.75.173 on using TCP/IP
2018-05-16T18:15:49.213672Z 227 Query select version_comment limit 1
After reviewing all the logs manually, this combo is the only indication of a successful connection and authentication. There are multitudes of "connects" and "query"s, but this exact combo, having the same ID number for both Connect and Query, is what I'm looking for. Extracting the resulting IP and date/time should be easy after finding both lines.
Building on the suggestion above by @somesoni2:
your base search
| rex "^\S+\s+(?<ID>\S+)\s+(?<Action>\S+)\s+((?<User>\w+)@(?<IP>\d+\.\d+\.\d+\.\d+))?"
| stats list(User) AS User list(IP) AS IP list(Action) AS Action BY ID
| where isnotnull(mvfind(Action,"Connect")) AND isnotnull(mvfind(Action,"Query"))
This will give you the User, IP, and Action for each ID - based on the format of those two events here in your comment.
Pretty close as well, Output:
ID  User    IP  Action
952             Quit
                Query
                Connect
Because I only had the last 24 hours selected, I didn't realize the output was actually wrong:
Search Result:
227 ruff 205.181.75.173  Quit
        root 62.164.252.90 Query
<---space                       -> Connect
<---space                       ->Connect
<---space                       ->Connect
RAW Data:
2018-05-16T18:15:52.691468Z 227 Quit
2018-05-16T18:15:49.213672Z 227 Query   select @@version_comment limit 1
2018-05-16T18:15:49.136165Z 227 Connect ruff@205.181.75.173 on using TCP/IP
2018-05-11T18:19:41.688250Z 227 Connect Access denied for user 'root'@'62.164.252.90' (using password: YES)
2018-05-11T18:19:41.688086Z 227 Connect root@62.164.252.90 on mysql using TCP/IP
It looks like the ID is recycled as ID 227 has multiple dates.
Hi ,
I think you can use transaction command with maxspan and get the results,
Can you share a sanitized version of the logs for the connection with ID 952?
 
					
				
		
I guess you can do something like this to combine each connection attempt (ID) like this (assuming fields are not extracted, if they are exclude rex command)
your base search
| rex "^\S+\s+(?<ID>\S+)\s+(?<Action>\S+)"
| stats list(_time) as _raw list(Action) as Action by ID
| where isnotnull(mvfind(Action,"Connect")) AND isnotnull(mvfind(Action,"Query"))
There can be a better/different solution possible based on your final requirement (what your final output should look like).
 
					
				
		
Did you get a chance to test this?
Some,
Very close, it adds a raw field which I can deal with because I do get the information I need, here's the output:
ID  _raw            Action
952 1526559902.256133   Quit
        1526559902.241782   Query
        1526559902.241542   Connect
 
					
				
		
There actually a typo in my query, list(_time) should actually be list(_raw) in line 3. This way you retain the raw log data and can extract any field you want for later.
After correctly expanding the time field (I only had 24 hours selected)and changing the _time to _raw, here's the output:
227 
2018-05-16T18:15:52.691468Z 227 Quit                                            Quit
2018-05-16T18:15:49.213672Z 227 Query   select @@version_comment limit 1                        Query
2018-05-16T18:15:49.136165Z 227 Connect SCADA@205.181.75.173 on using TCP/IP                        Connect
2018-05-11T18:19:41.688250Z 227 Connect Access denied for user 'root'@'62.164.252.90' (using password: YES)     Connect
2018-05-11T18:19:41.688086Z 227 Connect root@62.164.252.90 on mysql using TCP/IP                    Conn
Question - Do you really need to detect the pattern "Line containing 'Connect' followed immediately by line containing 'Query'"? What's the root purpose? Are you trying to find the username of the user running a query? If so, that's a little different (and actually easier).
