Splunk Search

Searching for large groups of hosts (or any other field), e.g. host=box[100-200].domain.com

parallaxed
Path Finder

Hi,

We want to search for hundreds of hosts at a time. The question is similar to these:

http://answers.splunk.com/questions/968/how-can-i-easily-filter-or-limit-my-search-down-to-a-specifi...

^ Globbing does not work here because a wildcard cannot express a numeric range like the one in the title. Tags would number in the hundreds, which becomes difficult to maintain.

http://answers.splunk.com/questions/730/how-to-search-multiple-value-on-the-same-field/734#734

^ This is more promising, but not ideal for a managed installation where clients may use it, as the CSV has to exist in a directory on the server.

What are the alternatives?

0 Karma
1 Solution

ziegfried
Influencer

You could use the same technique as described in the following answer: http://answers.splunk.com/questions/6856/regular-expression-in-search

Specifying something like host321 - host426 is possible, but a little more complicated:

* [ | metadata type=hosts | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426 | fields host ]

EDIT:

As subsearches are quite limited (default to 100 results), here is a slower, but less limited variant (just as an alternative):

host=host3* OR host=host4* | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426
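To make the rex + where step concrete outside of Splunk, here is a minimal Python sketch of the same filtering logic. It only emulates the idea (capture the digits after "host", keep the numbers in range); the function name and the sample host names are made up for illustration.

```python
import re

# Same pattern as the rex in the search: capture the digits after a leading "host".
HOST_RE = re.compile(r"^host(?P<host_no>\d+)")


def hosts_in_range(hosts, low, high):
    """Return the hosts whose numeric suffix lies in [low, high]."""
    selected = []
    for host in hosts:
        match = HOST_RE.match(host)
        if match is None:
            continue  # names without "host<digits>" at the start are dropped
        if low <= int(match.group("host_no")) <= high:
            selected.append(host)
    return selected


# Hypothetical host names, just to show the behaviour.
sample = ["host320", "host321", "host400.domain.com", "host426", "host427", "web01"]
print(hosts_in_range(sample, 321, 426))
# ['host321', 'host400.domain.com', 'host426']
```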

parallaxed
Path Finder

Responding here to get the full formatting - this was solved using a combination of the above, although it's perhaps not as suitable as more advanced pattern matching on the host string.

Create a file (call it grp1) with the desired list of hosts inside, one per line. In this case I had a file containing

host=box1.*
host=box2.*
host=box1500.*

... and so on

You need to get Splunk to index this file without line merging (break events on newlines), so that each line becomes its own event.

For whatever strange reason,

| fields +host 

displays the host field twice and causes a strange artifact with | format, making your search string look like

OR ( host=box1.domain.com host=box1.* ) OR ( host=box1.domain.com host=box1.* )

The | rex overcomes this problem, so the final search string (to search for all hosts you listed in the file) is:

index=mydata [ search source=*grp1* | rex field=_raw "host=(?<host>.*)" | fields + host | format ]

The subsearch returns the host group; the main search provides the data.
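For large numeric ranges the group file does not have to be written by hand. A minimal Python sketch, assuming each line has the form host=<pattern> (which is what the rex in the search above looks for); the function name, the "box" prefix and the 100-200 range are only illustrative.

```python
import tempfile
from pathlib import Path


def write_host_group(path, prefix, first, last):
    """Write one 'host=<prefix><n>.*' line per host number in [first, last]."""
    lines = [f"host={prefix}{n}.*" for n in range(first, last + 1)]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


# Demo: write the group for box100 .. box200 into a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    group_file = Path(scratch) / "grp1"
    written = write_host_group(group_file, "box", 100, 200)
    print(len(written), written[0], written[-1])
    # 101 host=box100.* host=box200.*
```

The trailing ".*" is a literal dot followed by a wildcard, so box1.* covers box1.domain.com but not box10.domain.com.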

0 Karma

parallaxed
Path Finder

At least that limit can be changed!

0 Karma

parallaxed
Path Finder

Liking this idea very much, but | metadata queries on hosts only return 10000 results :[ (afaik this is a hardcoded limitation) - the use case I'm looking at has in excess of that number. We may have to resort to generating plaintext files with the groups, and getting Splunk to index them so they can be returned with a simple subsearch for the source file with a group listing. Something along those lines...

0 Karma

ftk
Motivator

great idea ziegfried

0 Karma

ftk
Motivator

If the lookup tables and tagging mentioned in the two answers you linked in your question do not work for you, you could define your server groups with wildcards, for example by searching for

(host=host1* OR host=host2*)

Alternatively, could you define the hosts by a search? Then you could use a subsearch to define your hosts and push them to your desired search. This might be possible if your hosts have some distinct attribute you can search on. If, for example, all the desired hosts have a source in common (say they all index an example.log file), you could craft your search as follows:

search terms [ search source="*example.log" | fields + host]
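Conceptually, the subsearch runs first and its rows are turned into OR'ed field=value terms that get spliced into the outer search. A rough Python sketch of that expansion follows; the exact string Splunk builds may differ in detail, and the function name and host values are invented for the example.

```python
def format_subsearch(rows):
    """Join rows of field/value pairs into an OR'ed search expression.

    Fields within one row are AND'ed, rows are OR'ed. This mirrors the
    assumed behaviour of the implicit format step; it is not Splunk's code.
    """
    row_terms = []
    for row in rows:
        pairs = " AND ".join(f'{field}="{value}"' for field, value in row.items())
        row_terms.append(f"( {pairs} )")
    return "( " + " OR ".join(row_terms) + " )"


# Hypothetical subsearch output: one row per host that has example.log.
rows = [{"host": "web01"}, {"host": "web02"}]
print("search terms " + format_subsearch(rows))
# search terms ( ( host="web01" ) OR ( host="web02" ) )
```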

parallaxed
Path Finder

Subsearch is the way to go I think, just need to find an optimal way of getting the data in there.

0 Karma

ftk
Motivator

I recommend putting in an Enhancement Request for that feature. What about the subsearch?

0 Karma

parallaxed
Path Finder

Globbing/wildcards do not work with the example provided, unfortunately. For example, host1* matches host1, host10, host100, host1000 and everything in between. There's no way to specify host321-host426 using wildcards.

A full pattern match or range operator [] would suffice, but as I understand it that's currently not possible.
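The over-matching is easy to reproduce outside Splunk. In this Python sketch fnmatchcase is only a stand-in for Splunk's wildcard matching (an assumption, the two are not guaranteed to agree in every corner case), and explicit_range shows the brute-force alternative of spelling out every host in the range. Both helper names are made up.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, hosts):
    """Return the hosts that a shell-style wildcard pattern matches."""
    return [h for h in hosts if fnmatchcase(h, pattern)]


def explicit_range(prefix, first, last):
    """Spell out every host in [first, last] as an OR'ed search expression."""
    terms = [f"host={prefix}{n}" for n in range(first, last + 1)]
    return "(" + " OR ".join(terms) + ")"


hosts = ["host1", "host10", "host100", "host1000", "host2", "host321"]
print(wildcard_matches("host1*", hosts))
# ['host1', 'host10', 'host100', 'host1000']
print(explicit_range("host", 321, 324))
# (host=host321 OR host=host322 OR host=host323 OR host=host324)
```

The full host321-host426 range expands to 106 terms this way, which is why pushing the list in through a subsearch is more practical than typing it out.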

0 Karma