Not sure if that title made sense, but hopefully I can explain it better here:
I am receiving sFTP logs from a host, and I was able to manually extract new fields and create several reports (user logins, failed logins, files downloaded, etc.). Now I am attempting to correlate the external/source IP with the username associated with it. The issue is that the external/source IP and the associated username are logged as separate events. When someone logs into the sFTP server, the external/source IP is logged first, and the very next log that gets indexed is the username that just authenticated.
So I was wondering whether I could create a search that uses if-this-then-that logic. For example, if Splunk sees a log with category=a and the very next log shows category=b, then table the external/source IP (from category=a) and the username (from category=b). 'category' is a field I extracted manually.
DalJeanis,
So, re-reading your response, it sounds like the logs would need to have a common field of some sort. Here is the raw log:
2017-08-22 00:01:03.624; [00000C3C] {125} username has successfully authenticated via Password
2017-08-22 00:01:03.610; [00000C3C] {121} username tries Password authentication
2017-08-22 00:01:03.376; [00000C3C] {110} enforcing anti-hammering delay [0.20 secs]
2017-08-22 00:01:03.288; [00000C3C] {120} username requests Password authentication
2017-08-22 00:01:02.207; [000009A8] {000} * x.x.x.x -> 1 active connections
2017-08-22 00:01:02.207; [000009A8] {109} List of currently connected IP and count of per-IP connections:
2017-08-22 00:01:02.113; [000009A8] {112} Optimizing socket configuration for better performance
2017-08-22 00:01:02.113; [000009A8] {111} Incoming connection request from [x.x.x.x]
So, using the above logs, I was trying to see if I could possibly correlate category={000} with category={125}, maybe by time. If the search sees category={000} and category={125} shows up within, say, 5 seconds, table source_ip (from category={000}) and username (from category={125}).
Hi DalJeanis,
Thank you for assisting me with this issue. I tried running the search string you provided, but I get the following error message: "Error in 'streamstats' command: The argument 'prior(mysource)' is invalid."
Here is what I copied and pasted into Splunk:
index=name sourcetype=name_sftp earliest=-1d@d latest=@d| sort 0 _time
| eval mysource = if(category="000",source_ip,null())
| streamstats current=f prior(mysource) as lastsource window=1
| where category="125" AND isnotnull(lastsource)
| table username lastsource
Should I be swapping out "mysource" for maybe the name of the 'interesting field'?
Okay, SPL is not a programming language in that precise way. You can accomplish what you are talking about with the streamstats command. It is MUCH cleaner, however, when there is some field in common.
Your search that gets all the relevant events
| sort 0 _time
| eval mysource = if(category="a",source_ip,null())
| streamstats current=f prior(mysource) as lastsource window=1
| where category="b" AND isnotnull(lastsource)
| table username lastsource
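A note on the search above: `prior` is not one of the statistical functions that streamstats accepts, which would explain an "argument 'prior(mysource)' is invalid" error. The sketch below swaps in `last`, which returns the most recent non-null value seen so far. It also drops `window=1`, on the assumption that other events (for example the {120}, {110} and {121} lines in the sample log) sit between the connection event and the successful authentication. The index, sourcetype, `category` values and `source_ip` field are taken from the thread and may need adjusting to match your own extractions.

```
index=name sourcetype=name_sftp earliest=-1d@d latest=@d
| sort 0 _time
| eval mysource = if(category="000", source_ip, null())
| streamstats current=f last(mysource) as lastsource
| where category="125" AND isnotnull(lastsource)
| table _time username lastsource
```

Because `mysource` is only set on category 000 events, `lastsource` carries the latest connecting IP forward until the next one appears. With no common field to group on, two clients connecting at nearly the same time could still be paired with the wrong username, so treat the result as a best guess rather than a guaranteed match.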
TL;DR:
How do I return results only when an event matching one value of an interesting field is followed by an event matching a different value of that field?
index=x sourcetype=x | if category=a is followed by category=b, table source_ip (from category=a) username (from category=b)
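One way the "a followed by b within a few seconds" idea might be written in real SPL is with the transaction command, which can group events between a start condition and an end condition and cap the group's duration. This is only a sketch: it assumes `category` holds the values "000" and "125" as in the search posted earlier in the thread, that `source_ip` and `username` are already extracted, and that 5 seconds is a suitable window.

```
index=name sourcetype=name_sftp earliest=-1d@d latest=@d
| transaction startswith=eval(category="000") endswith=eval(category="125") maxspan=5s
| table _time source_ip username
```

Each resulting transaction merges the fields of the events it contains, so `source_ip` comes from the category 000 event and `username` from the category 125 event. Since the events share no session field to group by, overlapping logins from different clients inside the same window may end up combined, so it is worth spot-checking the output against the raw logs.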