Need to exclude field results based on multiple string-matching criteria (OR):
-Not equals to any one of several names
-Not ends with "$"
-Only has A-Z, a-z, "-", ".", "_"
-Not contains any one of several names
Here's my inefficient solution. AdminAccount is the field to query.
| where not (AdminAccount = "Joe" or AdminAccount = "Mike" or AdminAccount = "David" or AdminAccount = "Max" or AdminAccount = "Abe" or AdminAccount = "Peter")
| regex AdminAccount != "\$$"
| where NOT match(AdminAccount,"\d+$")
| where NOT match(AdminAccount,"sql|ssoadmin|local service|internal|snapshots|sharepoint")
Is there a better way to do this? Bonus points if you explain why.
Technically the whole thing could be one big regex for a single filter, like so:
| regex AdminAccount != "^Joe$|^Mike$|^David$|^Max$|^Abe$|^Peter$|\$$|\d+$|sql|ssoadmin|local service|internal|snapshots|sharepoint"
But if readability counts, then maybe switch the first where statement to a search with NOT (the IN operator is handy there, though where has something similar in its in() function) and combine the regex expressions:
| search NOT AdminAccount IN (Joe, Mike, David, Max, Abe, Peter)
| regex AdminAccount != "\$$|\d+$|sql|ssoadmin|local service|internal|snapshots|sharepoint"
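If you want to convince yourself offline that the single big regex and the two-step version exclude the same accounts, a rough Python sketch like the one below can help. It is only a model of the filters, not Splunk itself: the sample account names are made up, the term list uses ssoadmin as in the original question, and the name check here is case-sensitive (like where and regex), whereas the search command matches case-insensitively.

```python
import re

# Names to drop on an exact match (taken from the question).
EXACT_NAMES = {"Joe", "Mike", "David", "Max", "Abe", "Peter"}

# Pattern-based exclusions: ends with "$", ends with digits, or contains a term.
PATTERN_FILTER = re.compile(
    r"\$$|\d+$|sql|ssoadmin|local service|internal|snapshots|sharepoint"
)

# The same logic folded into one expression, with anchored exact names.
COMBINED = re.compile(
    r"^Joe$|^Mike$|^David$|^Max$|^Abe$|^Peter$"
    r"|\$$|\d+$|sql|ssoadmin|local service|internal|snapshots|sharepoint"
)


def keep_two_step(account):
    """Model of: exact-name exclusion followed by a regex exclusion."""
    return account not in EXACT_NAMES and not PATTERN_FILTER.search(account)


def keep_combined(account):
    """Model of: one regex exclusion covering everything."""
    return not COMBINED.search(account)


# Hypothetical sample values, chosen to hit each rule at least once.
SAMPLE = ["Joe", "Joey", "HOST01$", "admin2", "sqlreader", "jane.doe", "svc_backup"]

two_step = [a for a in SAMPLE if keep_two_step(a)]
combined = [a for a in SAMPLE if keep_combined(a)]
print(two_step)
print(combined)
```

Both lists should come out identical. Note that "Joey" survives in both versions because the exact names are anchored with ^ and $ in the combined pattern.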
Is one regex faster/more efficient than multiple regexes, assuming readability doesn't matter?
Well, I'm not certain how regex is handled "under the hood", so to speak. I think nickhillscpl's suggestion of using the Job Inspector is a good way to test it. Logically, though, a single operation ought to be more efficient than multiple ones (unless Splunk is combining them), since you are likely passing the load to the regex engine/module/whatever all at once.
Where you have a long list of things to exclude, you may consider using a lookup.
Create a CSV with something like:
AdminAccount,exclude
Joe,1
Mike,1
David,1
Max,1
*$,1
*sql*,1
etc, etc
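Create a lookup definition for your CSV lookup and set the match type to WILDCARD for the AdminAccount field, so that * in the CSV values is treated as a wildcard. A "contains" entry needs an asterisk on both sides of the term.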
Then run your search, and perform the lookup:
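[my search] | lookup exclude_accounts AdminAccount OUTPUT exclude | where isnull(exclude)
(Events that don't match the lookup have no exclude field at all, so a test like exclude!=1 evaluates to null for them and would drop them too. Checking isnull(exclude) keeps exactly the unmatched events.)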
https://docs.splunk.com/Documentation/Splunk/7.2.4/Knowledge/ConfigureCSVlookups
https://docs.splunk.com/Documentation/Splunk/7.2.4/Knowledge/Addfieldmatchingrulestoyourlookupconfig...
Is a lookup more efficient than the in-search where clause?
That's an excellent question, and not one I have ever seen performance comparisons on. However, small lookups (<10 MB) anecdotally perform very well.
The reason is that the data is loaded once into memory, and events are simply matched based on the field value as they are returned. The single where to exclude them is probably as efficient as it gets.
I would suggest testing both approaches in your environment and using the Job Inspector to see which one works best for your data and environment.
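To illustrate the "load once, match each event" idea, here is a small Python sketch. It is a conceptual model only and says nothing about how Splunk actually implements lookups. The CSV contents are a hypothetical excerpt, and the assumption is that * is the only wildcard character and that matching is case-sensitive.

```python
import csv
import io
import re

# Hypothetical excerpt of the lookup CSV; "*" is assumed to be the only wildcard.
LOOKUP_CSV = """AdminAccount,exclude
Joe,1
Mike,1
*$,1
*sql*,1
"""


def load_patterns(text):
    """Read the CSV once and compile each AdminAccount value into a regex."""
    patterns = []
    for row in csv.DictReader(io.StringIO(text)):
        # Escape every literal piece; each "*" becomes ".*".
        body = ".*".join(re.escape(part) for part in row["AdminAccount"].split("*"))
        patterns.append(re.compile(body))
    return patterns


def is_excluded(account, patterns):
    """True if the whole account name matches any lookup row."""
    return any(p.fullmatch(account) for p in patterns)


# Built one time, then reused for every event that comes back.
PATTERNS = load_patterns(LOOKUP_CSV)

accounts = ["Joe", "HOST01$", "mssqlsvc", "jane.doe"]
kept = [a for a in accounts if not is_excluded(a, PATTERNS)]
print(kept)
```

The point of the sketch is that the expensive part (reading and compiling the list) happens once, and each event then costs only a few in-memory comparisons.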
Will do! thanks