Hi,
We want to search for hundreds of hosts at a time. The question is similar to these:
^ Globbing is not good because a full text expansion will not match groups like the one in the title. Tags would number in the hundreds, which becomes difficult to maintain.
http://answers.splunk.com/questions/730/how-to-search-multiple-value-on-the-same-field/734#734
^ This is more promising, but not ideal for a managed installation where clients may use it, as the CSV has to exist in a directory on the server.
What are the alternatives?
You could use the same technique as described in the following answer: http://answers.splunk.com/questions/6856/regular-expression-in-search
Specifying something like host321 - host426 is possible, but a little more complicated:
* [ | metadata type=hosts | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426 | fields host ]
EDIT:
As subsearches are quite limited (default to 100 results), here is a slower, but less limited variant (just as an alternative):
host=host3* OR host=host4* | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426
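The rex and where steps above can be sketched outside Splunk to make the logic easier to check. This is only an illustration in Python, not Splunk code: the host list is made up, and `in_range` is a hypothetical helper name. It extracts the number after the `host` prefix and keeps hosts whose number falls between 321 and 426.

```python
import re

# Hypothetical host names, for illustration only.
hosts = ["host1", "host320", "host321", "host400", "host426", "host427", "web01"]

# Same pattern as the rex above, written with Python's named-group syntax.
HOST_NO = re.compile(r"^host(?P<host_no>\d+)")

def in_range(host, low=321, high=426):
    """Return True if the number after 'host' lies in [low, high]."""
    m = HOST_NO.match(host)
    return m is not None and low <= int(m.group("host_no")) <= high

selected = [h for h in hosts if in_range(h)]
print(selected)  # ['host321', 'host400', 'host426']
```

Hosts that do not match the pattern at all (such as web01 here) are simply dropped, which mirrors what the where clause does when host_no is missing.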
Responding here to get the full formatting - this was solved using a combination of the above, although it's perhaps not as suitable as more advanced pattern matching on the host string.
Create a file (call it grp1) with the desired list of hosts inside. In this case I had a file containing:
box1.*
box2.*
box1500.*
... and so on
You need to get Splunk to index this file without line merging (use a newline as the line breaker), so that each line is indexed as a separate event.
For whatever strange reason,
| fields +host
displays the host field twice and causes a strange artifact with | format, making your string look like:
OR ( host=box1.domain.com host=box1.* ) OR ( host=box1.domain.com host=box1.* )
The | rex overcomes this problem, so the final search string (to search for all hosts you listed in the file) is:
index=mydata [ search source=*grp1* | rex field=_raw "host=(?<host>.*)" | fields + host | format ]
The subsearch returns the results for the host group, the main search provides the data.
At least that limit can be changed!
Subsearches are limited as well... See http://www.splunk.com/base/Documentation/latest/Admin/Limitsconf
Liking this idea very much, but | metadata queries on hosts only return 10000 results :[ (afaik this is a hardcoded limitation) - the use case I'm looking at has in excess of that number. We may have to yield to generating plaintext files with the groups, and getting Splunk to index them so they can be returned with a simple subsearch for the source file with a group listing. Something along those lines...
great idea ziegfried
If the lookup tables and tagging mentioned in the two answers you linked in your question do not work for you, you could define your server groups with wildcards, for example by searching for:
(host=host1* OR host=host2*)
Alternatively, could you define the hosts by a search? Then you could use a subsearch to define your hosts and push them to your desired search. This might be possible if your hosts have some distinct attribute you can search on. If all the desired hosts have a source in common, for example they all index an example.log file, you could craft your search as follows:
search terms [ search source="*example.log" | fields + host]
Subsearch is the way to go I think, just need to find an optimal way of getting the data in there.
I recommend putting in an Enhancement Request for that feature. What about the subsearch?
Globbing/wildcards do not work with the example provided, unfortunately. For example, host1* matches host1, host10, host100, host1000 and everything in between. There's no way to specify host321-host426, for example, using wildcards.
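To illustrate the over-matching, here is a small Python sketch that uses fnmatch as a stand-in for Splunk's wildcard matching. That substitution is an assumption made for illustration (the two are not guaranteed to behave identically), and the host names are invented.

```python
from fnmatch import fnmatchcase

# Hypothetical host names, for illustration only.
hosts = ["host1", "host10", "host100", "host1000",
         "host300", "host321", "host426", "host427"]

# host1* picks up every host whose name starts with "host1".
matched_1 = [h for h in hosts if fnmatchcase(h, "host1*")]
print(matched_1)  # ['host1', 'host10', 'host100', 'host1000']

# Trying to cover host321-host426 with host3* and host4* also pulls in
# hosts outside the range, such as host300 and host427.
matched_34 = [h for h in hosts
              if fnmatchcase(h, "host3*") or fnmatchcase(h, "host4*")]
print(matched_34)  # ['host300', 'host321', 'host426', 'host427']
```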
A full pattern match or range operator [] would suffice, but as I understand it that's currently not possible.
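Until something like a range operator exists, one possible workaround is to expand the range into an explicit OR clause on the client side and paste it into the search. The sketch below is Python, not Splunk code; `host_range_clause` is a hypothetical helper name, and whether a very long clause is practical for hundreds of hosts is not something verified here.

```python
def host_range_clause(prefix, low, high):
    """Build an explicit 'host=... OR host=...' clause for prefix<low>..prefix<high>."""
    if low > high:
        raise ValueError("low must not exceed high")
    terms = (f"host={prefix}{n}" for n in range(low, high + 1))
    return "(" + " OR ".join(terms) + ")"

clause = host_range_clause("host", 321, 323)
print(clause)  # (host=host321 OR host=host322 OR host=host323)
```

For the range in the question, host_range_clause("host", 321, 426) would produce 106 terms.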