The context is structured sourcetypes such as JSON. First, does the use of TIMESTAMP_FIELDS require INDEXED_EXTRACTIONS? (The Web UI suggests so.)
In "Bug: Duplicate values with INDEXED_EXTRACTION?", @badrinath_itrs referred to an in-depth case study, "The Indexed Extractions vs. Search-Time Extractions Splunk Case Study", which says this about INDEXED_EXTRACTIONS:
To summarize, Indexed Extractions should be used with caution. Splunk gives a pretty fair warning against using them in almost any doc that references Indexed Extractions, including their definition on Splexicon.
Then I realized that for JSON documents whose timestamp field falls beyond the first 128 characters, it is better to set INDEXED_EXTRACTIONS=json in conjunction with TIMESTAMP_FIELDS. (There is an index-time penalty for setting MAX_TIMESTAMP_LOOKAHEAD too large.)
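To check whether a given event is affected, one can measure how far into the raw text the timestamp value sits. The following is only a rough sketch in Python, not anything Splunk provides: the helper names, the field name "timestamp", and the sample events are all hypothetical, and the 128-character limit is the lookahead figure mentioned above.

```python
import json

# Assumption: the 128-character lookahead discussed above.
DEFAULT_LOOKAHEAD = 128


def timestamp_span(raw_event, field="timestamp"):
    """Return (start, end) character offsets of the field's value in the raw text.

    Hypothetical helper. Assumes a top-level string field, a value with no
    characters that JSON would escape, and a key that does not also appear
    earlier in the event as a value.
    """
    value = json.loads(raw_event)[field]
    key_pos = raw_event.index(json.dumps(field))  # position of the quoted key
    start = raw_event.index(value, key_pos)       # first occurrence after the key
    return start, start + len(value)


def within_lookahead(raw_event, field="timestamp", lookahead=DEFAULT_LOOKAHEAD):
    """True if the whole timestamp value lies inside the first `lookahead` characters."""
    _, end = timestamp_span(raw_event, field)
    return end <= lookahead


# Illustrative events: one with the timestamp first, one with it after a long field.
early = '{"timestamp": "2021-03-04T05:06:07Z", "msg": "ok"}'
late = '{"msg": "' + "x" * 200 + '", "timestamp": "2021-03-04T05:06:07Z"}'
```

Running within_lookahead over a sample of real events would show how many of them place the timestamp past the limit, which is the case where TIMESTAMP_FIELDS looked attractive to me.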
INDEXED_EXTRACTIONS=json then causes duplicate values at search time unless KV_MODE is set to none on the search head. Given Splunk's extraordinary search-time capabilities, if I could use TIMESTAMP_FIELDS in conjunction with INDEXED_EXTRACTIONS=none, the problem would be solved without touching KV_MODE. Is this possible?
Secondly, because INDEXED_EXTRACTIONS=json nearly demands the use of KV_MODE=none, wouldn't it be useful for the Web GUI to automatically set KV_MODE=none when the "Indexed Extractions" selector points to a structured sourcetype? The user could still override it in the Advanced view, but the presence of this default would save lots of headaches for people like me.
Hi @yuanliu ,
Were you able to find a solution for this issue? We are also facing the same issue.
Hi @KJ10 ,
I’m a Community Moderator in the Splunk Community.
This question was posted 3 years ago, so it might not get the attention you need for your question to be answered. We recommend that you post a new question so that your issue can get the visibility it deserves. To increase your chances of getting help from the community, follow these guidelines in the Splunk Answers User Manual when creating your post.
Thank you!
I think you've made the case for not using TIMESTAMP_FIELDS when using INDEXED_EXTRACTIONS. That leaves you with TIME_PREFIX as the way to tell Splunk where the timestamp is.
Thanks for the suggestion, @richgalloway. I did briefly look into TIME_PREFIX, but reasoned against it because matching prefix text (even with a regex) in structured data feels awkward. Not only is this less elegant (not so much in aesthetics as in "let the server do what it does best", that is, extract structured data), but it is also more difficult to document, and the regex has to anticipate possible JSON formatting variants, which again is a job the indexer does best.
Maybe I need to take a second look at this assessment.
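To illustrate the formatting-variant concern, here is a small Python sketch. It is not Splunk configuration: Python's re module merely stands in for the regex engine, and both patterns, the field name "timestamp", and the sample events are hypothetical.

```python
import json
import re

# Two hypothetical TIME_PREFIX-style patterns for a field named "timestamp".
NAIVE_PREFIX = re.compile(r'"timestamp":"')           # assumes compact JSON only
TOLERANT_PREFIX = re.compile(r'"timestamp"\s*:\s*"')  # allows whitespace around the colon

# The same event in two equally valid JSON renderings.
compact = '{"id":1,"timestamp":"2021-03-04T05:06:07Z"}'
spaced = '{"id": 1, "timestamp" : "2021-03-04T05:06:07Z"}'


def text_after_prefix(pattern, raw_event, length=20):
    """Return the text right after the prefix match, or None if the prefix is not found."""
    match = pattern.search(raw_event)
    if match is None:
        return None
    return raw_event[match.end():match.end() + length]


def parsed_timestamp(raw_event):
    """Structured alternative: parse the JSON and read the field directly."""
    return json.loads(raw_event)["timestamp"]
```

With these definitions, NAIVE_PREFIX finds the timestamp in compact but returns None for spaced, while TOLERANT_PREFIX and parsed_timestamp handle both. A prefix regex can be made to work, but it has to spell out whitespace tolerance that a JSON parser gives for free.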