I currently have a summary search running against DNS query data, which I use to produce a reporting chart of the top 50 searches over the past 3 days. As part of this, I go through it every couple of days to see what is showing up in the top 50 that isn't interesting and can be discarded from the chart. This basically amounts to appending another query!="some.domain.y" clause (where "query" is a defined field in my dns eventtype), such as:
sourcetype="dns" query_type="A" query!="some.domain.x" | sitop query limit="50"
At this point though, this has made my search quite long as there's probably upwards of 30 or so "query!=xxxx" statements appended now. To try to make this a little cleaner, I'm wondering if there's a way to leverage a lookup table in order to just read in the domains to ignore via a loop? If not, no big deal... but simply appending to the end of a "domains.ignore" file would be a lot easier than having to edit the search itself daily.
Thanks in advance for any help with this.
Sure, this is pretty easy to do.
Create a CSV-based lookup table with two columns, e.g.:
query,ignored
"some.domain.x","true"
"some.domain.y","true"
...
When you call the lookup, any values for query not in your CSV list will have an empty value for ignored. So, your search becomes:
sourcetype="dns" query_type="A" | lookup domainsToIgnore query OUTPUT ignored | search NOT ignored=true
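To make the filtering logic concrete, here is a rough Python sketch of what the lookup-then-filter step amounts to. It is only an illustration, not how Splunk implements it: the sample events and the helper names load_lookup and filter_events are made up for this example.

```python
import csv
import io

# Hypothetical lookup contents, mirroring the two-column CSV described above.
LOOKUP_CSV = '''query,ignored
"some.domain.x","true"
"some.domain.y","true"
'''

def load_lookup(text):
    """Map each query value to its 'ignored' flag."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["query"]: row["ignored"] for row in reader}

def filter_events(events, lookup):
    """Mimic: | lookup ... OUTPUT ignored | search NOT ignored=true"""
    kept = []
    for event in events:
        # Queries missing from the lookup get no 'ignored' value at all.
        ignored = lookup.get(event["query"])
        if ignored != "true":
            kept.append(event)
    return kept

# Made-up events standing in for DNS records.
events = [
    {"query": "some.domain.x"},
    {"query": "example.org"},
    {"query": "some.domain.y"},
    {"query": "example.org"},
]
kept = filter_events(events, load_lookup(LOOKUP_CSV))
print([e["query"] for e in kept])
```

Only the events whose query is absent from the list survive, which is why adding a row to the CSV is enough to drop a domain from the chart.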
If you need more general information on how to create the lookup table, take a look at: http://www.splunk.com/base/Documentation/latest/Knowledge/Addfieldsfromexternaldatasources
Hmm, well, it's unfortunate that the name of your field is query, otherwise this would work:
sourcetype="dns" query_type=A NOT [inputlookup excluded_queries | fields query]
But the problem is that a field named either query or search is treated specially by subsearch. If you renamed the field in your original sourcetype from query to, say, qry, then this would work:
sourcetype="dns" query_type=A NOT [inputlookup excluded_queries | fields qry]
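For intuition, the subsearch result is substituted into the outer search as a set of OR'd field=value terms, one per row of the lookup. The Python sketch below approximates that expansion; the exact text Splunk generates may differ, and expand_subsearch is a hypothetical helper written only for this illustration.

```python
def expand_subsearch(field, values):
    """Build an OR'd filter, one term per lookup row (approximate format)."""
    terms = ['( {}="{}" )'.format(field, value) for value in values]
    return "( " + " OR ".join(terms) + " )"

# Made-up contents of the excluded_queries lookup.
excluded = ["some.domain.x", "some.domain.y"]
search = 'sourcetype="dns" query_type=A NOT ' + expand_subsearch("qry", excluded)
print(search)
# sourcetype="dns" query_type=A NOT ( ( qry="some.domain.x" ) OR ( qry="some.domain.y" ) )
```

Each row in the lookup becomes one term in the expanded search, so a long ignore list means a long list of terms.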
In 4.1.5, I think you'll be unlikely to have problems, but in earlier versions (including earlier 4.1 releases), you may run into a limit of 100 terms on what a subsearch will return, which would need to be raised in
gkanapathy, was that second search line supposed to be identical to the first? Or was it supposed to end in "| fields qry]" instead? It's fairly simple for me to change that field name if need be (probably to "dnsquery" actually), so I might go your route. In that case, I'm assuming it would be: "| fields dnsquery]" at the end, and the lookup file would just be a list of the domains (one per line)?
The table would be a list of domains, one per line, except the first line would be the column/field name, dns_query. You could use the rename command if it's different, but you might as well make it the same as in your sourcetype.
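If you would rather keep appending to a plain list of domains, a small script can turn that list into the lookup file. This is just a sketch: build_lookup is a hypothetical helper, the sample list is made up, and the dns_query header is an assumption that has to match whatever your field ends up being called.

```python
import csv
import io

def build_lookup(domains, field_name="dns_query"):
    """Return CSV text: a header row with the field name, then one domain per line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([field_name])
    seen = set()
    for line in domains:
        domain = line.strip()
        # Skip blank lines and domains that were already written.
        if domain and domain not in seen:
            seen.add(domain)
            writer.writerow([domain])
    return out.getvalue()

# Made-up ignore list, including a blank line and a repeated entry.
ignore_list = ["some.domain.x", "some.domain.y", "", "some.domain.x"]
lookup_text = build_lookup(ignore_list)
print(lookup_text)
```

Writing lookup_text out as the lookup's CSV file is left to you, since where that file lives depends on your setup.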