Splunk Search

search-time versus index-time field extractions

Path Finder

Hi,

I've recently noticed the recommendation to move to search-time rather than index-time field extractions. I'm trying to get an idea of exactly how much of the configuration we have in place doesn't follow this paradigm. In particular, we have a lot of DELIMS/FIELDS-based field extractions, and I'm not clear on where we stand with these, especially since there's no obvious way to configure them in the GUI.

I'm assuming that when an extraction says 'uses transform' as opposed to 'inline' in the GUI, it is an index-time field extraction. Is this the case, or am I oversimplifying the distinction?

I've looked over the documentation on search-time field extraction, and http://www.splunk.com/base/Documentation/latest/Knowledge/Addfieldsatsearchtime says:

You can also create and maintain field extractions by making edits directly to props.conf and transforms.conf. If this sounds like your kind of thing--and it may be, especially if you are an old-timey Splunk user, or just prefer working at the configuration file level of things, you can find all the details in "Create and maintain search-time extractions through configuration files," in this manual.

This being said, other documentation at http://www.splunk.com/base/Splexicon:Transform says:

Transforms are always involved in the setup of custom index-time field extractions.

Can somebody please help us clear this up? Thanks!

-Frank

1 Solution

Splunk Employee

In general, we recommend search-time extractions rather than index-time extractions. There are relatively few cases where index-time extractions are better, and they come at the cost of brittleness of configuration and an increase in index size (which in turn makes searches slower).

The distinction in the UI of "uses transform" vs. inline doesn't have anything to do with search-time vs index-time. It is referring to where the regex itself is stored: in an EXTRACT- line in props.conf (for inline) as opposed to in a REPORT- line that refers to a stanza in transforms.conf (for uses transform).
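To make the inline versus "uses transform" distinction concrete, here is a minimal sketch of both forms. Both are search-time extractions. The sourcetype name, class names, field names, and regexes are hypothetical and chosen only for illustration.

```
# props.conf
[my_sourcetype]
# Inline: the regex lives directly in props.conf
EXTRACT-status = status=(?<status>\d{3})
# Uses transform: props.conf only points at a transforms.conf stanza
REPORT-user = extract_user

# transforms.conf
[extract_user]
REGEX = user=(\w+)
FORMAT = user::$1
```

In both cases the fields are produced when a search runs, so nothing extra is written to the index.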

Index-time extractions are also set in props.conf and transforms.conf, by means of a TRANSFORMS- line. Again, they should rarely be used. They are appropriate when the heuristic of searching for the value of the field fails (either because the value is ubiquitous outside of cases where the field equals the value, or because the value isn't an indexed token), or when you commonly search for field!=value without other terms to constrain the search.
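For comparison, an index-time extraction might look roughly like the sketch below. The sourcetype, stanza, and field names are hypothetical. The fields.conf entry is included on the assumption that you want the search head to treat the field as indexed.

```
# props.conf
[my_sourcetype]
# TRANSFORMS- (not REPORT- or EXTRACT-) makes this an index-time extraction
TRANSFORMS-session = index_session_id

# transforms.conf
[index_session_id]
REGEX = session=(\w+)
FORMAT = session_id::$1
# Write the extracted field into the index metadata
WRITE_META = true

# fields.conf
[session_id]
INDEXED = true
```

Because this is applied as data is indexed, changing it later does not affect events that are already in the index, which is part of the brittleness mentioned above.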

Esteemed Legend

There are 2 different transform things.

One is transforms.conf, which contains transform definitions; the word transform occurs only in the file name, not in the contents of the file. That is one thing.

Then there is the TRANSFORMS- definition inside of props.conf, which is part of the REPORT-, EXTRACT-, and TRANSFORMS- triad. The first two are search-time settings that do essentially the same thing (REPORT- definitions reference transforms defined in transforms.conf, whereas EXTRACT- definitions are inlined completely in props.conf). The last, TRANSFORMS-, is how index-time extractions are configured.

I agree that this is a bit confusing.

Splunk Employee

I will clarify here that DELIMS/FIELDS extractions are search-time extractions, and thus already of the preferred type.
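As a rough illustration, a DELIMS/FIELDS extraction is typically wired up through a REPORT- line, which is what makes it search-time. The sourcetype, stanza, and field names below are hypothetical, and the example assumes pipe-delimited events.

```
# props.conf
[my_pipe_sourcetype]
REPORT-pipe = pipe_delimited_fields

# transforms.conf
[pipe_delimited_fields]
# Split each event on the pipe character
DELIMS = "|"
# Name the resulting values in order
FIELDS = "event_time", "src_host", "status"
```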

Super Champion

Here is a related discussion (which highlights some additional use-cases for using indexed fields)

Builder

Does this "recommend" still stand nearly 8 years later?

Contributor

@wmyersas I think it's much more recommended now that Splunk is moving to "compute" rather than daily volume type of billing customers. Search time extractions will defo use more compute to load into RAM rather than displaying fields that have already been burned onto the disk.

Path Finder

Thanks, that's exactly what I was hoping to hear. Now, if we could just get an easy way to configure DELIMS/FIELDS in the UI, I'd be even happier...

Super Champion

Yup, still waiting on the DELIMS/FIELDS UI thing in 2016. And now with Splunk Cloud that's become an even bigger pain because of the lack of access to the .conf files. ;-(