Splunk Search

How can I use regex with wildcard patterns in a search to capture a host field?

martin_smith
Engager

Can simple regular expressions be used in searches?

I'm trying to capture a fairly simple pattern for the host field. For example a host name might be T1234SWT0001 and I'd like to capture any device with "T" + "four digits" + "SWT" + "anything". I think the regex would be something along the lines of T\d\d\d\dSWT.

0 Karma
1 Solution

Glenn
Builder

AFAIK you unfortunately can't do regex style matching in the initial part of the search (ie. the bit before the first "|" pipe). This is probably because of the way that Splunk searches for "tokens" in the index using string (or substring in the case of non-regex wildcard use) matching. Splunk only accepts the * wildcard here, see http://docs.splunk.com/Documentation/Splunk/6.3.1/Search/Usethesearchcommand#Keywords.2C_phrases.2C_...

So, if you want to match with a regular expression, you need to take the approach of searching for all data before the pipe, and then filtering after the pipe with the regex command. In your case, this would be:

index=myindex your search terms | regex host="^T\d{4}SWT.*"

^ anchors this match to the start of the line (this assumes that "T" will always be the first letter in the host field. If not, remove the caret "^" from the regex)
T is your literal character "T" match
\d{4} matches exactly four digits (\d)
S is your literal character "S" match
W is your literal character "W" match
T is your literal character "T" match
.* will match zero or more of any character, and this is technically not required (and slightly less efficient) as it will still match without.

regex command doc: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Regex

View solution in original post

Glenn
Builder

AFAIK you unfortunately can't do regex style matching in the initial part of the search (ie. the bit before the first "|" pipe). This is probably because of the way that Splunk searches for "tokens" in the index using string (or substring in the case of non-regex wildcard use) matching. Splunk only accepts the * wildcard here, see http://docs.splunk.com/Documentation/Splunk/6.3.1/Search/Usethesearchcommand#Keywords.2C_phrases.2C_...

So, if you want to match with a regular expression, you need to take the approach of searching for all data before the pipe, and then filtering after the pipe with the regex command. In your case, this would be:

index=myindex your search terms | regex host="^T\d{4}SWT.*"

^ anchors this match to the start of the line (this assumes that "T" will always be the first letter in the host field. If not, remove the caret "^" from the regex)
T is your literal character "T" match
\d{4} matches exactly four digits (\d)
S is your literal character "S" match
W is your literal character "W" match
T is your literal character "T" match
.* will match zero or more of any character, and this is technically not required (and slightly less efficient) as it will still match without.

regex command doc: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Regex

Get Updates on the Splunk Community!

Using Machine Learning for Hunting Security Threats

REGISTER NOW Seeing the exponential hike in global cyber threat spectrum, organizations are now striving more ...

Security Highlights | November 2022 Newsletter

 November 2022 2022 Gartner Magic Quadrant for SIEM: Splunk Named a Leader for the 9th Year in a RowSplunk is ...

Platform Highlights | November 2022 Newsletter

 November 2022 Skill Up on Splunk with our New Builder Tech Talk SeriesCan you build it? Yes you can! *play ...