Splunk Search

How can I use regex with wildcard patterns in a search to capture a host field?

martin_smith
Engager

Can simple regular expressions be used in searches?

I'm trying to capture a fairly simple pattern for the host field. For example, a host name might be T1234SWT0001, and I'd like to capture any device with "T" + "four digits" + "SWT" + "anything". I think the regex would be something along the lines of T\d\d\d\dSWT.

1 Solution

Glenn
Builder

AFAIK you unfortunately can't do regex-style matching in the initial part of the search (i.e. the bit before the first "|" pipe). This is probably because of the way that Splunk searches for "tokens" in the index, using string matching (or substring matching when a non-regex wildcard is used). Splunk only accepts the * wildcard here, see http://docs.splunk.com/Documentation/Splunk/6.3.1/Search/Usethesearchcommand#Keywords.2C_phrases.2C_...
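To illustrate why the * wildcard alone isn't enough here, below is a rough Python sketch, not Splunk itself. It uses fnmatchcase as a stand-in for wildcard matching (an approximation on my part; Splunk's own matching may differ, e.g. in case handling), and the host names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical host names, invented purely for illustration.
hosts = ["T1234SWT0001", "TABCDSWT0001", "T12SWT0001", "X1234SWT0001"]

# "T*SWT*" is about as close as a *-only wildcard can get to
# "T" + four digits + "SWT" + anything. The * cannot say "exactly four digits".
wildcard_hits = [h for h in hosts if fnmatchcase(h, "T*SWT*")]

print(wildcard_hits)
```

The wildcard also lets through TABCDSWT0001 and T12SWT0001, which is why a regex filter is still needed to enforce the four digits.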

So, if you want to match with a regular expression, you need to search for all the data before the pipe, and then filter after the pipe with the regex command. In your case, this would be:

index=myindex your search terms | regex host="^T\d{4}SWT.*"

^ anchors this match to the start of the host value (this assumes that "T" will always be the first letter in the host field; if not, remove the caret "^" from the regex)
T is your literal character "T" match
\d{4} matches exactly four digits (\d)
S is your literal character "S" match
W is your literal character "W" match
T is your literal character "T" match
.* will match zero or more of any character. It is technically not required (and slightly less efficient), as the regex will still match without it.
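If you want to sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for a pattern this simple, and it uses re.search as a stand-in for the regex command. The host names are invented for the example.

```python
import re

# The pattern from the search above, with and without the trailing ".*".
WITH_TAIL = re.compile(r"^T\d{4}SWT.*")
WITHOUT_TAIL = re.compile(r"^T\d{4}SWT")

# Hypothetical host names, invented purely for illustration.
hosts = [
    "T1234SWT0001",   # T + four digits + SWT + something
    "T1234SWT",       # "anything" may also be empty
    "T123SWT0001",    # only three digits
    "T12345SWT0001",  # five digits
    "XT1234SWT0001",  # does not start with T
]

def filter_hosts(pattern, names):
    """Keep the names in which the pattern is found (search, not full match)."""
    return [name for name in names if pattern.search(name)]

with_tail = filter_hosts(WITH_TAIL, hosts)
without_tail = filter_hosts(WITHOUT_TAIL, hosts)

print(with_tail)
print(without_tail)
```

Both lists come out the same, which is the point about the trailing .* being optional. Dropping the ^ would additionally let XT1234SWT0001 through.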

regex command doc: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Regex

