Splunk suggests extracting fields at the forwarder for structured data. Why? And what if the log has field names, or has no field names?
I'm confused about whether my license usage is affected by structured field extraction at index time or at the forwarders.
I understand that the Splunk license counts what you index, so if I do indexed field extractions, will those field-value pairs be added to _raw and count toward license usage? Is that correct?
For unstructured data, does Splunk suggest doing the extraction at search time?
Sometimes this is clear to me, and sometimes it is not.
Any advice will be appreciated.
Field extraction settings for structured data must be configured on the forwarder.
If the structured data has a header with field names, those fields are extracted automatically. If not, the FIELD_NAMES attribute can be set in props.conf to supply the field names.
For structured data, all fields in the data are extracted at index time only.
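To make the headerless-CSV case concrete, here is a minimal sketch of what the forwarder-side configuration might look like. The sourcetype name, file path, and field names are made up for illustration; INDEXED_EXTRACTIONS, FIELD_DELIMITER, and FIELD_NAMES are the props.conf settings used for structured data.

```
# props.conf on the Universal Forwarder
# (sourcetype name and field names are hypothetical)
[my_headerless_csv]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ,
# The file has no header row, so the field names are supplied here.
FIELD_NAMES = host_name,status,response_ms

# inputs.conf on the same forwarder
# (the monitored path is hypothetical)
[monitor:///var/log/example/data.csv]
sourcetype = my_headerless_csv
```

If the file did carry a header row, the FIELD_NAMES line could be dropped and the names would be read from the header.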
For unstructured data, it's better to extract fields at search time, as the Splunk documentation says:
"Index-time custom field extraction can degrade performance at both index time and search time. When you add to the number of fields extracted during indexing, the indexing process slows. Later, searches on the index are also slower, because the index has been enlarged by the additional fields, and a search on a larger index takes longer. You can avoid such performance issues by instead relying on search-time field extraction."
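For comparison, a search-time extraction is defined on the search head and adds nothing to the index. A minimal sketch, assuming a hypothetical sourcetype whose events contain text such as status=404 (the sourcetype name and the regex are illustrative only):

```
# props.conf on the search head
# (sourcetype name is hypothetical)
[my_unstructured_log]
# EXTRACT-<class> applies the regex at search time;
# the named capture group becomes the field name.
EXTRACT-status = status=(?<status>\d{3})
```

Because this runs only when a search asks for it, it can be changed or removed later without re-indexing any data.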
I appreciate your efforts here.
It has nothing to do with the license, because you are metered by the length of _raw in bytes.
First, that guidance is overly simplistic to the point of being fallacious; please post a follow-up comment here with the URL where you read it so that I can submit some feedback.
The MAIN reason this advice is wrong is that it will lead people to the very bad and generally WRONG decision to use Heavy Forwarders (which can do every kind of index-time field extraction) instead of Universal Forwarders: https://www.splunk.com/en_us/blog/tips-and-tricks/universal-or-heavy-that-is-the-question.html
Another reason it is wrong is that index-time field extractions consume a significant amount of disk space, often for no actual benefit (nobody is
Also, the only sensible way to do index-time field extractions on a Universal Forwarder is with INDEXED_EXTRACTIONS, which should generally be avoided because it is "all or none".
The only shred of this advice that is true is the universal distributed-architecture rule that, all other considerations being equal (note the inequalities I voiced above), as much as possible should be done at the leaves of the tree.
All your suggestions are good, I really appreciate your effort.
My question is: what does INDEXED_EXTRACTIONS do on a UF? Let's say I have a CSV file with 20 lines and no field names.
If I use INDEXED_EXTRACTIONS on the UF, what exactly does my forwarder send to the indexer? That's all...
My doubt is: if the UF forwards _raw plus the new field values, are they all counted while passing through the indexing pipeline?
There are 4 pipelines (Parsing, Merging, Typing, Index), and the license is metered at the last pipeline, when data is being written to disk. Is that correct?
Can you please elaborate on the precedence of props attributes versus the license meter?