I am trying to write a query in Splunk that will tell me if any user IDs in my CSV file were used to log into any machines that ARE NOT in my CSV file. I have a CSV file called "MLT_List.csv" with three fields. The second field contains the user IDs and is called "User_ID". The third field is called "Computers". So I want to search Splunk for any successful logins with the user IDs in the CSV that are not logging into any of the computers in the CSV. I don't know if I am being clear or not. Thanks in advance for any help!
Comparing the result of a search to lookup values is a very commonly used practice and you describe it perfectly.
The way I phrased it is actually a clue as to how you perform this.
You have a lookup table with valid hosts and users. (fields host, user)
You have a search that retrieves successful logins and the hosts.
To filter the search to show only those successful logins that occur on hosts NOT in your CSV you do this:
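yourSearch_with_host NOT [ | inputlookup yourlookup | fields host ]
So, for instance, you might say something like this (note the leading pipe inside the brackets; inputlookup is a generating command, so a subsearch that starts with it needs one):
index=blah sourcetype=blah_foo host=* login_status=success NOT [ | inputlookup legit_user_host.csv | fields host ]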
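One thing to watch: the field names returned by the subsearch have to match the field names in your events. Your CSV uses "User_ID" and "Computers", so those need renaming before they can act as filters. Here is a rough sketch, assuming your login events carry the account in a field called user and the machine in a field called host, and that each CSV row holds a single computer name. The index, sourcetype, and login_status names are placeholders carried over from the example above, not anything from your environment:

```
index=blah sourcetype=blah_foo login_status=success
    [ | inputlookup MLT_List.csv | fields User_ID | rename User_ID AS user ]
    NOT [ | inputlookup MLT_List.csv | fields Computers | rename Computers AS host ]
| stats count BY user, host
```

The first subsearch keeps only logins by user IDs in the CSV; the second, with NOT in front, throws away logins to computers in the CSV. What is left should be listed users logging in to unlisted machines. If your data stores the target machine in some other field (dest, ComputerName, etc.), rename to that instead of host.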
To really see what's going on, you'll want to look at the job inspector after it runs.
To make sense of this, you can run each part of the search independently and picture the Venn diagram that the NOT creates. First run
index=blah sourcetype=blah_foo host=* login_status=success
(or any search that gives a list of the successful logins). Then, independently, run
| inputlookup legit_user_host.csv | fields host
Finally, run it all together, and Splunk will give you the results showing when the data in Splunk does NOT match the lookup list.
If you run the same and remove the NOT you get the list that DOES match...
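If you want to see what the subsearch actually hands back to the outer search, you can tack the format command onto the end of the lookup search and run it by itself. This is only a sketch using the same example lookup file name as above:

```
| inputlookup legit_user_host.csv
| fields host
| format
```

You should get back a single field named search containing one big OR expression of host="..." terms, one per row in the lookup. That string is what gets substituted in place of the brackets, so putting NOT in front of it excludes every event whose host appears in the list.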
With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!