Splunk Search

Searching for large groups of hosts (or any other field), e.g. host=box[100-200].domain.com

parallaxed
Path Finder

Hi,

We want to search for hundreds of hosts at a time. The question is similar to these:

http://answers.splunk.com/questions/968/how-can-i-easily-filter-or-limit-my-search-down-to-a-specifi...

^ Globbing is not good enough, because wildcard expansion cannot match a numeric range like the one in the title. Tags would number in the hundreds, which becomes difficult to maintain.

http://answers.splunk.com/questions/730/how-to-search-multiple-value-on-the-same-field/734#734

^ This is more promising, but not ideal for a managed installation where clients may use it, as the CSV has to exist in a directory on the server.

What are the alternatives?

0 Karma
1 Solution

ziegfried
Influencer

You could use the same technique as described in the following answer: http://answers.splunk.com/questions/6856/regular-expression-in-search

Specifying something like host321 - host426 is possible, but a little more complicated:

* [ | metadata type=hosts | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426 | fields host ]

EDIT:

As subsearches are quite limited (default to 100 results), here is a slower, but less limited variant (just as an alternative):

host=host3* OR host=host4* | rex field=host "^host(?<host_no>\d+)" | where host_no>=321 AND host_no<=426
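For illustration, the same prefilter-then-range idea applied to the range in the question title (box100 through box200) might look like the sketch below. The index name mydata is only a placeholder, and the explicit tonumber() calls are an addition to make the numeric comparison unambiguous; treat this as an untested sketch rather than a verified search.

```spl
index=mydata (host=box1*.domain.com OR host=box200.domain.com)
| rex field=host "^box(?<host_no>\d+)\.domain\.com$"
| where tonumber(host_no)>=100 AND tonumber(host_no)<=200
```

The wildcard terms keep the initial event set small (box1* also matches box1, box15, box1500 and so on), and the where clause then trims it down to the exact range.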


parallaxed
Path Finder

Responding here to get the full formatting - this was solved using a combination of the above, although it's perhaps not as suitable as more advanced pattern matching on the host string.

Create a file (call it grp1) with the desired list of hosts inside. In this case I had a file containing:

box1.*
box2.*
box1500.*

... and so on

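You need to get Splunk to index this file with line merging turned off (SHOULD_LINEMERGE = false in props.conf, keeping the newline as the line breaker), so that each line becomes its own event.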

For whatever strange reason,

| fields + host

displays the host field twice and causes a strange artifact with | format, making your string look like

OR ( host=box1.domain.com host=box1.* ) OR ( host=box1.domain.com host=box1.* )

The | rex overcomes this problem (note that it expects each indexed line to contain host=<pattern>), so the final search string, which searches for all hosts you listed in the file, is:

index=mydata [ search source=*grp1* | rex field=_raw "host=(?<host>.*)" | fields + host | format ]

The subsearch returns the host group; the main search provides the data.
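To make the mechanics concrete: assuming the three patterns listed above are the only events the subsearch returns, and | format is left at its defaults, the outer search should end up roughly equivalent to the following. This is a sketch of the expansion, not output captured from a real instance.

```spl
index=mydata ( ( host="box1.*" ) OR ( host="box2.*" ) OR ( host="box1500.*" ) )
```

In the search language the trailing .* is a literal dot followed by a wildcard, not a regular expression, so box1.* should match box1.domain.com but not box10.domain.com.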

0 Karma

parallaxed
Path Finder

At least that limit can be changed!

0 Karma

parallaxed
Path Finder

Liking this idea very much, but | metadata queries on hosts only return 10000 results :[ (afaik this is a hardcoded limitation), and the use case I'm looking at has in excess of that number. We may have to fall back to generating plaintext files with the groups and getting Splunk to index them, so they can be returned with a simple subsearch on the source file containing a group listing. Something along those lines...
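One possible alternative source for the host list, sketched here under assumptions: on Splunk versions that provide the tstats command, the host names can be taken from the index-time fields instead of | metadata. The index name mydata is a placeholder, and whether this avoids the cap described above in a given deployment is not something verified here. The subsearch result limit still applies to whatever the subsearch returns.

```spl
index=mydata
    [ | tstats count where index=mydata by host
      | rex field=host "^host(?<host_no>\d+)"
      | where tonumber(host_no)>=321 AND tonumber(host_no)<=426
      | fields host ]
```

Because the range filter runs inside the subsearch, only the hosts in the range count against the subsearch limit, not every host in the index.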

0 Karma

ftk
Motivator

great idea ziegfried

0 Karma

ftk
Motivator

If the lookup tables and tagging mentioned in the two answers you linked in your question do not work for you, you could define your server groups with wildcards, such as searching for

(host=host1* OR host=host2*)

Alternatively, could you define the hosts by a search? Then you could use a subsearch to define your hosts and push them to your desired search. This might be possible if your hosts have some distinct attribute you can search on. For example, if all the desired hosts have a source in common (say they all index an example.log file), you could craft your search as follows:

search terms [ search source="*example.log" | fields + host]
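A small variation that may help with the subsearch result limit: as written, the subsearch returns one row per matching event, so a busy host can use up many of the allowed rows. Deduplicating on host first keeps one row per host. This is a sketch, with "search terms" still standing in for the real outer search.

```spl
search terms [ search source="*example.log" | dedup host | fields + host ]
```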

parallaxed
Path Finder

Subsearch is the way to go I think, just need to find an optimal way of getting the data in there.

0 Karma

ftk
Motivator

I recommend putting in an Enhancement Request for that feature. What about the subsearch?

0 Karma

parallaxed
Path Finder

Globbing/wildcards do not work with the example provided, unfortunately. For example, host1* matches host1, host10, host100, host1000 and everything in between. There's no way to specify host321-host426 using wildcards.

A full pattern match or range operator [] would suffice, but as I understand it that's currently not possible.

0 Karma