Splunk Search

A list of common regular expressions for field extractions?

stefanlasiewski
Contributor

Splunk isn't extracting certain fields from my logs. This includes basic things such as IP addresses.

It seems that I need to build regular expressions so that Splunk will recognize my data better. Here are some things which I need Splunk to recognize:

  1. 1.1.1.1 and 192.168.100.100 are IPv4 addresses. The regex is something like (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
  2. IPv6 addresses. The regex for this is very difficult, which is why I was hoping that Splunk would do this for me and save me time.
  3. 1.1.1.1:8080 is an IP address with a port
  4. foo@example.gov is an email address.

The examples above are extremely common. Is there a list of common regular expressions which I can import into Splunk so that I don't need to experiment with dozens of regular expression strings?
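For illustration only, here is a minimal Python sketch of what patterns for items 1, 3 and 4 might look like. It is not a Splunk-provided list: the names OCTET, IPV4, IPV4_PORT and EMAIL are made up for this example, the sample line is invented, and the email pattern is deliberately loose rather than a full address validator.

```python
import re

# One IPv4 octet, 0-255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# Four octets separated by literal dots.
IPV4_CORE = rf"{OCTET}(?:\.{OCTET}){{3}}"

# The lookarounds stop the pattern from matching inside a longer digit run.
IPV4 = re.compile(rf"(?<!\d)({IPV4_CORE})(?!\d)")
# IPv4 address followed by a colon and a 1-5 digit port.
IPV4_PORT = re.compile(rf"(?<!\d)({IPV4_CORE}):(\d{{1,5}})(?!\d)")
# Loose email shape: local part, '@', then at least two dot-separated labels.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Invented sample line, not taken from a real log.
line = "client 1.1.1.1:8080 mailed foo@example.gov from 192.168.100.100"

ipv4_hits = IPV4.findall(line)
port_hit = IPV4_PORT.search(line).groups()
email_hits = EMAIL.findall(line)

print(ipv4_hits)
print(port_hit)
print(email_hits)
```

Python's re syntax is close to, but not identical to, the PCRE syntax Splunk uses, so the patterns may need small adjustments before being pasted into a field extraction.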


gkanapathy
Splunk Employee

While plenty of regex sites can provide these regexes, they aren't all that useful in most cases. A field extraction is usually defined by absolute position (e.g., the 5th word in the line) or by its location relative to fixed characters (e.g., the string after src_addr= until the next space, or the string starting after <addr> until you see </addr>). So forcing the regex to match the exact thing you're looking for is rarely necessary. Usually, once you have located it, it's sufficient to say "string of non-space characters" (\S*) or "sequence of hex digits and colons" ([0-9a-fA-F:]* or [[:xdigit:]:]*). So typically, it's less important to know how to match or validate the data type itself than to know how to locate it within a log entry. Unfortunately, that depends more on your log format, so a ready-made pattern is less likely to be found in the wild.
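To make the idea of locating a value by its surroundings concrete, here is a small Python sketch of the three cases mentioned above. The log lines are invented for the example, and the names SRC_ADDR, ADDR_TAG, FIFTH_WORD and extract are hypothetical. Python writes named groups as (?P<name>...), whereas Splunk's rex command writes them as (?<name>...).

```python
import re

# Relative to a fixed key: non-space characters after "src_addr=".
SRC_ADDR = re.compile(r"src_addr=(?P<src_addr>\S+)")
# Relative to fixed tags: everything between <addr> and </addr>.
ADDR_TAG = re.compile(r"<addr>(?P<addr>[^<]+)</addr>")
# Absolute position: skip four whitespace-delimited words, capture the fifth.
FIFTH_WORD = re.compile(r"^(?:\S+\s+){4}(?P<word5>\S+)")


def extract(line):
    """Return every field whose locating pattern matches the line."""
    fields = {}
    for pattern in (SRC_ADDR, ADDR_TAG, FIFTH_WORD):
        match = pattern.search(line)
        if match:
            fields.update(match.groupdict())
    return fields


# Invented sample lines, not taken from a real log.
kv_line = "2024-10-07 12:00:01 fw01 ACCEPT eth0 src_addr=2001:db8::1 dst_port=443"
xml_line = "<event><addr>192.168.100.100</addr></event>"

kv_fields = extract(kv_line)
xml_fields = extract(xml_line)
print(kv_fields)
print(xml_fields)
```

None of these patterns knows what an IP address looks like. Each one only knows where the value sits, which is why the same src_addr pattern picks up an IPv4 or an IPv6 address without change.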

stefanlasiewski
Contributor

I was under the impression that fields are not position-based. For example, if I want Splunk to identify an IPv6 field anywhere on the line, I need to use the interactive field extractor to define the IPv6 field based on a regular expression.
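Outside of Splunk, one way to find IPv6 addresses anywhere on a line without writing the full IPv6 regex is to grab loose candidates ("hex digits and colons") and let a parser reject the ones that are not addresses. The Python sketch below assumes that approach. The function name find_ipv6 and the sample line are invented, and it does not handle IPv4-mapped forms such as ::ffff:1.2.3.4.

```python
import ipaddress
import re

# Loose candidate: a run of hex digits and colons that is not glued to
# other letters, digits, colons, or (on the left) a dot.
CANDIDATE = re.compile(r"(?<![0-9A-Za-z:.])[0-9A-Fa-f:]{2,}(?![0-9A-Za-z:])")


def find_ipv6(line):
    """Return the candidates on the line that parse as IPv6 addresses."""
    found = []
    for token in CANDIDATE.findall(line):
        if token.count(":") < 2:
            continue  # plain numbers and hex-looking words
        try:
            ipaddress.IPv6Address(token)
        except ValueError:
            continue  # e.g. a timestamp such as 12:00:01
        found.append(token)
    return found


# Invented sample line, not taken from a real log.
line = "12:00:01 conn from 2001:db8::1 to fe80::1ff:fe23:4567:890a port 443"
addresses = find_ipv6(line)
print(addresses)
```

The regex stays short because validation is delegated to the ipaddress module. A pure-regex field extraction has no such second step, so there the pattern must either be stricter or be anchored to surrounding text.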
