I've recently noticed the recommendations to move to search-time rather than index-time field extractions. I'm trying to get an idea of exactly how much of the configuration that we've got in place doesn't follow this paradigm. We especially have a lot of DELIMS/FIELDS-based field extractions, and I'm not clear on where we stand with these, especially since there's no obvious way to configure them in the GUI.
I'm assuming when an extract says 'uses transform' as opposed to 'inline' in the GUI then it is an index-time field extraction? Is this the case or am I oversimplifying the distinction?
I've looked over the documentation on search-time field extraction, and http://www.splunk.com/base/Documentation/latest/Knowledge/Addfieldsatsearchtime says:
You can also create and maintain field extractions by making edits directly to props.conf and transforms.conf. If this sounds like your kind of thing--and it may be, especially if you are an old-timey Splunk user, or just prefer working at the configuration file level of things, you can find all the details in "Create and maintain search-time extractions through configuration files," in this manual.
This being said, other documentation at http://www.splunk.com/base/Splexicon:Transform says:
Transforms are always involved in the setup of custom index-time field extractions.
Can somebody please help us clear this up? Thanks!
In general, we recommend search-time extractions rather than index-time extractions. There are relatively few cases where index-time extractions are better, and they come at the cost of brittleness of configuration and an increase in index size (which in turn makes searches slower).
The distinction in the UI of "uses transform" vs. "inline" doesn't have anything to do with search-time vs. index-time. It refers to where the regex itself is stored: in an EXTRACT- line in props.conf (for inline), as opposed to in a REPORT- line that refers to a stanza in transforms.conf (for uses transform).
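To make the inline vs. "uses transform" distinction concrete, here is a minimal sketch of the two search-time forms. The sourcetype name, class names, stanza name, and regexes are all hypothetical, made up for illustration:

```ini
# props.conf
[my_sourcetype]
# "inline": the regex lives directly in props.conf
EXTRACT-user = user=(?<user>\w+)
# "uses transform": props.conf only points at a transforms.conf stanza
REPORT-session = my_session_extraction

# transforms.conf
[my_session_extraction]
REGEX = session_id=(\w+)
FORMAT = session_id::$1
```

Both of these are applied at search time; neither changes what is written to the index.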
Index-time extractions are also set in props.conf and transforms.conf, by means of a TRANSFORMS- line. Again, they should rarely be used. They are appropriate when the heuristic of searching for the value of the field fails (either because the value is ubiquitous outside of cases where the field equals the value, or because the value isn't an indexed token) or when you commonly search for field!=value without other terms to constrain the search.
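For comparison, an index-time extraction might look roughly like the sketch below. Again, the sourcetype, stanza, and field names are hypothetical, and this assumes a simple key=value pattern in the raw event:

```ini
# props.conf
[my_sourcetype]
TRANSFORMS-errcode = my_indexed_errcode

# transforms.conf
[my_indexed_errcode]
REGEX = err_code=(\d+)
FORMAT = err_code::$1
# write the field into the index alongside the event
WRITE_META = true

# fields.conf
[err_code]
# tell search that this field is indexed
INDEXED = true
```

Because the field is written at index time, a change like this only affects data indexed after it is deployed, which is part of why this approach is more brittle.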
Thanks, that's exactly what I was hoping to hear. Now, if we could just get an easy way to configure DELIMS/FIELDS in the UI, I'd be even happier...
Yup, still waiting on the DELIMS/FIELDS UI thing in 2016. And now with Splunk Cloud that's become an even bigger pain because of the lack of access to the .conf files. ;-(
@wmyersas I think it's much more recommended now that Splunk is moving customers to "compute"-based rather than daily-volume billing. Search-time extractions will defo use more compute to load into RAM rather than displaying fields that have already been burned onto the disk.
I will clarify here that DELIMS/FIELDS extractions are search-time extractions, and thus of the preferred type already.
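For anyone who has not set one up by hand, a DELIMS/FIELDS extraction is just a transforms.conf stanza referenced from a REPORT- line, which is what makes it search-time. A minimal sketch, assuming comma-separated events and with hypothetical sourcetype, stanza, and field names:

```ini
# transforms.conf
[my_csv_fields]
DELIMS = ","
FIELDS = "timestamp", "user", "action", "status"

# props.conf
[my_csv_sourcetype]
REPORT-csv = my_csv_fields
```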
There are 2 different "transform" things here. First, there is transforms.conf, which contains transform definitions; the word "transform" occurs only in the file name, not in the contents of the file. That is one thing.
Then there is the TRANSFORMS- definition inside of props.conf that is part of the EXTRACT-/REPORT-/TRANSFORMS- triad. The first two are search-time things that are really the same thing (just that REPORT- definitions will reference transforms defined in transforms.conf, whereas EXTRACT- definitions are inlined completely in props.conf). The last, TRANSFORMS-, is how index-time extractions are configured.
I agree that this is a bit confusing.