Splunk isn't extracting certain fields from my logs. This includes basic things such as IP addresses.
It seems that I need to build regular expressions so that Splunk will recognize my data better. Here are some things which I need Splunk to recognize:
The examples above are extremely common. Is there a list of common regular expressions which I can import into Splunk so that I don't need to experiment with dozens of regular expression strings?
Plenty of regex sites can provide these patterns, but they aren't all that useful in most cases. A field extraction is usually defined by absolute position (e.g., the 5th word in the line) or by its location relative to fixed characters (e.g., the string after src_addr= until the next space, or the string starting after <addr> and ending at </addr>). So trying to force the regex to match the exact data type you're looking for is rarely necessary. Usually, once you have located the field, it's sufficient to say "string of non-space characters" (\S*) or "sequence of hex digits and colons" ([[:xdigit:]:]+). So it's typically less important to know how to match or validate the data type itself than to know how to locate it within a log entry. Unfortunately, that depends on your log format, so a ready-made pattern is less likely to be found in the wild.
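To illustrate the point outside Splunk, here is a minimal Python sketch of the three kinds of anchoring described above. The log lines are made up for the example, and the group names (fifth_word, src_addr, addr) are just illustrative. Python's re module has no POSIX classes, so [[:xdigit:]:] is written as [0-9A-Fa-f:] here.

```python
import re

# Hypothetical log lines, invented for this example.
KV_LINE = "Jan 12 10:15:01 fw01 action=allow src_addr=192.168.10.5 dst_port=443"
XML_LINE = "<event><addr>fe80::1ff:fe23:4567:890a</addr><status>ok</status></event>"

# Absolute position: skip four whitespace-separated words, capture the fifth.
POSITION_PATTERN = re.compile(r"^(?:\S+\s+){4}(?P<fifth_word>\S+)")

# Relative to fixed characters: everything after "src_addr=" up to the next space.
KV_PATTERN = re.compile(r"src_addr=(?P<src_addr>\S+)")

# Between fixed tags: hex digits and colons between <addr> and </addr>.
XML_PATTERN = re.compile(r"<addr>(?P<addr>[0-9A-Fa-f:]+)</addr>")


def extract(pattern, line):
    """Return the named groups of the first match, or an empty dict."""
    match = pattern.search(line)
    return match.groupdict() if match else {}


position_fields = extract(POSITION_PATTERN, KV_LINE)
kv_fields = extract(KV_PATTERN, KV_LINE)
xml_fields = extract(XML_PATTERN, XML_LINE)

print(position_fields)
print(kv_fields)
print(xml_fields)
```

None of these patterns tries to validate an IP address. The surrounding text does the locating, and the capture group only has to be loose enough to take what is there.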
I was under the impression that fields are not position-based. For example, if I want Splunk to identify an IPv6 field anywhere on the line, I need to use the interactive field extractor to define the IPv6 field based on a regular expression.
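For comparison, the position-independent approach can be sketched in Python as well. This is not how Splunk's field extractor works internally; it only shows the idea of picking out candidates anywhere on a line and then validating them by data type. The sample line and the function name find_ipv6 are hypothetical, and validation is delegated to the standard ipaddress module rather than to one large IPv6 regex.

```python
import ipaddress
import re

# Candidate tokens: runs of hex digits and colons that are not glued to
# other word characters or colons on either side.
CANDIDATE = re.compile(r"(?<![\w:])[0-9A-Fa-f:]+(?![\w:])")


def find_ipv6(line):
    """Return every token on the line that parses as an IPv6 address."""
    found = []
    for token in CANDIDATE.findall(line):
        try:
            ipaddress.IPv6Address(token)
        except ValueError:
            # Not a valid IPv6 address (e.g. a timestamp or a port number).
            continue
        found.append(token)
    return found


# Hypothetical log line, invented for this example.
LINE = "connection from 2001:db8::8a2e:370:7334 port 22 at 10:15:01"
addresses = find_ipv6(LINE)
print(addresses)
```

Note that the loose candidate pattern also picks up the timestamp and the port number, and it is the validation step that rejects them. That extra work is the cost of not anchoring on anything in the log format.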