TIMESTAMP_FIELDS vs INDEXED_EXTRACTIONS vs KV_MODE

yuanliu
SplunkTrust

The context is structured sourcetypes such as JSON. First, does use of TIMESTAMP_FIELDS require INDEXED_EXTRACTIONS? (The Web UI suggests so.)

In "Bug: Duplicate values with INDEXED_EXTRACTION?", @badrinath_itrs referred to an in-depth case study, "The Indexed Extractions vs. Search-Time Extractions Splunk Case Study", regarding INDEXED_EXTRACTIONS:

To summarize, Indexed Extractions should be used with caution. Splunk gives a pretty fair warning against using them in almost any doc that references Indexed Extractions, including their definition on Splexicon.

Then I realized that for JSON documents whose timestamp field falls beyond the first 128 characters (the default MAX_TIMESTAMP_LOOKAHEAD), it is better to set INDEXED_EXTRACTIONS=json in conjunction with TIMESTAMP_FIELDS. (Setting MAX_TIMESTAMP_LOOKAHEAD too large incurs an index-time penalty.)

INDEXED_EXTRACTIONS=json then causes duplicate values at search time unless KV_MODE is set to none on the search head. Given Splunk's extraordinary search-time capabilities, if I could use TIMESTAMP_FIELDS in conjunction with INDEXED_EXTRACTIONS=none, the problem would be solved without touching KV_MODE. Is this possible?
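
For readers following along, the combination described above might look roughly like the sketch below. The sourcetype name my_json and the field name event_time are hypothetical, and which instance each stanza belongs on depends on the deployment, so treat the placement comments as assumptions rather than a definitive layout.

```ini
# props.conf on the instance that parses the structured data
# (assumption: the forwarder or indexer tier, depending on the deployment)
[my_json]
# Extract JSON fields at index time.
INDEXED_EXTRACTIONS = json
# Read the event time from this (hypothetical) JSON field.
TIMESTAMP_FIELDS = event_time

# props.conf on the search head
[my_json]
# Disable search-time key-value extraction so that fields already
# extracted at index time do not show up twice.
KV_MODE = none
```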

Secondly, because INDEXED_EXTRACTIONS=json nearly demands the use of KV_MODE=none, wouldn't it be useful for the Web GUI to automatically set KV_MODE=none when the "Indexed Extractions" selector points to a structured sourcetype? The user could still override it in the Advanced view, but having this default would save a lot of headaches for people like me.

KJ10
Loves-to-Learn Lots

Hi @yuanliu ,
Were you able to find a solution for this issue? We are facing the same issue.

DanielPi
Moderator

Hi @KJ10 ,

I’m a Community Moderator in the Splunk Community.

This question was posted 3 years ago, so it might not get the attention you need for your question to be answered. We recommend that you post a new question so that your issue can get the visibility it deserves. To increase your chances of getting help from the community, follow the guidelines in the Splunk Answers User Manual when creating your post.

Thank you!

richgalloway
SplunkTrust

I think you've made the case for not using TIMESTAMP_FIELDS when using INDEXED_EXTRACTIONS. That leaves you with TIME_PREFIX as the way to tell Splunk where the timestamp is.
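
A minimal sketch of what a TIME_PREFIX-based configuration could look like, assuming a hypothetical sourcetype my_json whose events carry an ISO 8601 style timestamp in a hypothetical field named event_time. The regex and the time format are illustrative guesses and would need to match the actual data.

```ini
# props.conf (sourcetype name, field name, and timestamp format are assumptions)
[my_json]
# Regex for the text immediately before the timestamp value; \s* tolerates
# optional whitespace around the colon in differently formatted JSON.
TIME_PREFIX = "event_time"\s*:\s*"
# Assumed value shape: 2024-01-31T12:34:56.789+0000
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
# With TIME_PREFIX set, the lookahead is counted from the end of the prefix
# match, so it can stay small even when the field sits deep in the event.
MAX_TIMESTAMP_LOOKAHEAD = 32
# Fields are extracted at search time; no indexed extractions are involved.
KV_MODE = json
```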

yuanliu
SplunkTrust

Thanks for the suggestion, @richgalloway. I did briefly look into TIME_PREFIX, but decided against it because matching a text prefix (even with a regex) in structured data feels awkward. It is less elegant, not so much aesthetically as in the sense of "let the server do what it does best", namely extracting structured data. It is also more difficult to document, and the regex has to anticipate possible JSON formatting variants, which again is a job the indexer does best.

Maybe I need to take a second look at this assessment.
