We are seeing a large discrepancy in field extraction counts between our Prod and Dev environments for sourcetype=xxx.
In Prod, a search returns ~600+ fields. In Dev, the same search returns ~100 fields.
We confirmed that KV_MODE=auto is set on both environments, but Dev still does not extract as many fields.
Raw events in Dev do contain certain keys (e.g., PreStore), but these fields do not consistently appear in the sidebar unless explicitly searched.
Prod has ~58 field extractions defined for this sourcetype, while Dev only has ~6. A large number of the extractions in Prod appear as Private in the UI. We are unclear whether these “Private” extractions are also being applied to other users, or only to the owners.
Questions
How do “Private” field extractions behave — are they ever applied to users other than the owner, or should they only affect the owner’s searches?
Could differences in data verbosity (more key=value pairs in Prod logs) be compounding the discrepancy, even with the same KV_MODE setting?
What is the best way to identify all active field extractions (including private/app-scoped) that are being applied to a sourcetype, so we can reconcile between environments?
How can we ensure consistent field discovery behavior between Dev and Prod?
Steps taken so far
Checked props.conf and transforms.conf on the search app in both environments — only a few extractions found in Dev vs many in Prod.
Verified KV_MODE settings using REST and btool. Confirmed Prod SH shows auto, Dev was updated to auto, but discrepancy remains.
Compared the number of field extractions defined for the sourcetype: 58 in Prod versus 6 in Dev.
btool is your friend
splunk btool props list sourcetype --debug --app=app --user=user
The same goes for transforms.
It shows the effective config read from files in your environment, as applied in the context of a given user and app according to the precedence rules.
I'm not a hundred percent sure whether it picks up the user's private KOs. I suppose it does, but you'd have to double-check it.
1. Yes, user's private KOs are limited to this user only.
2. If you have different data, it might produce different fields (and a different number of them). That should be pretty obvious.
3. Depending on what you mean by "active", probably btool, run in the context of the user and app in question.
4. By keeping the configuration in sync and the same (format of) data? I know that due to compliance reasons dev/test/staging/whatever data might need to be anonymized or otherwise manipulated but it should generally represent the production data. Otherwise there's no point in keeping those environments.
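To reconcile the two environments, one option is to save the btool output for the sourcetype on each search head and compare the search-time setting names. The following is only a rough sketch: the file names, the function names (extraction_keys, missing_in_second) and the list of setting prefixes are my own assumptions, and it only sees what btool reports for the given user and app context.

```shell
#!/bin/sh
# Sketch only. Assumes the btool output was saved beforehand on each search
# head, for example (file names and user are hypothetical):
#   splunk btool props list xxx --debug --app=search --user=someuser > prod_props.txt
#   splunk btool props list xxx --debug --app=search --user=someuser > dev_props.txt

# Print the sorted, unique search-time setting names found in a saved btool
# output file. With --debug each line looks like "<source file> <key> = <value>",
# so the key is the last token before the first " = ".
extraction_keys() {
    awk '{
        i = index($0, " = ")
        if (i == 0) next
        n = split(substr($0, 1, i - 1), parts, " ")
        key = parts[n]
        if (key ~ /^(EXTRACT|REPORT|FIELDALIAS|EVAL|LOOKUP)-/) print key
    }' "$1" | sort -u
}

# Print the setting names present in the first file but absent from the second.
missing_in_second() {
    keys_a=$(mktemp)
    keys_b=$(mktemp)
    extraction_keys "$1" > "$keys_a"
    extraction_keys "$2" > "$keys_b"
    comm -23 "$keys_a" "$keys_b"
    rm -f "$keys_a" "$keys_b"
}
```

Running missing_in_second prod_props.txt dev_props.txt would then list the settings Prod has that Dev lacks. It compares names only, not the regexes behind them, so two identically named extractions with different definitions would still look the same here.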