It seems strange to be asking this question in 2019 for Splunk 7.3.1. I've used Splunk, read the Splunk docs, and read numerous questions and answers here on Splunk Answers, yet I see significant differences among Splunk docs topics and, depending on which topics you read, between the docs and what Splunk allows in practice. Or perhaps the answer differs by context, and I've overlooked some docs that would clarify this for me.
One docs topic states that field names must start with a letter and contain only letters, numbers, and underscores.
The default true value of the CLEAN_KEYS setting in transforms.conf enforces those allowed values for search-time field extractions:
CLEAN_KEYS = [true|false]
NOTE: This setting is only valid for search-time field extractions.
Optional. Controls whether Splunk software "cleans" the keys (field names) it
extracts at search time. "Key cleaning" is the practice of replacing any
non-alphanumeric characters (characters other than those falling between the
a-z, A-Z, or 0-9 ranges) in field names with underscores, as well as the
stripping of leading underscores and 0-9 characters from field names.
Consistent so far.
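To make the "consistent so far" point concrete, here is a minimal Python sketch that models only the wording of the two passages quoted above. It is not Splunk's implementation, and I have not verified it against Splunk's actual behavior; the names `clean_key` and `is_valid_field_name` are hypothetical, introduced purely for illustration.

```python
import re

# The naming rule as quoted above: start with a letter, then only
# letters, numbers, and underscores.
VALID_FIELD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name: str) -> bool:
    """Return True if name satisfies the quoted naming rule."""
    return VALID_FIELD_NAME.fullmatch(name) is not None


def clean_key(name: str) -> str:
    """Hypothetical model of "key cleaning" as worded in the quoted spec text."""
    # Replace every character outside a-z, A-Z, 0-9 with an underscore.
    replaced = re.sub(r"[^A-Za-z0-9]", "_", name)
    # Strip leading underscores and 0-9 characters.
    return re.sub(r"^[_0-9]+", "", replaced)


# Illustrative key names only; none of these come from real data.
for raw in ["server-1", "_internal id", "9lives", "status"]:
    cleaned = clean_key(raw)
    print(f"{raw!r} -> {cleaned!r} (valid: {is_valid_field_name(cleaned)})")
```

Under this reading, any non-empty cleaned key starts with a letter and contains only letters, numbers, and underscores, which is why the two passages agree with each other.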
But then the false value allows other characters:
Add CLEAN_KEYS = false to your transform if you need to extract field
names that include non-alphanumeric characters, or which begin with
underscores or 0-9 characters.
If the expression references a field name that contains non-alphanumeric characters, it needs to be surrounded by single quotation marks. For example, if the field name is server-1 you specify the field name like this new=count+'server-1'.
This statement from the eval command documentation implicitly allows other ("non-alphanumeric") characters in field names.
And the documentation for the rename command offers examples of renaming fields to give them more meaningful names, with spaces (I frequently do this in practice).
Given these apparently conflicting docs, I thought I'd ask this question—which characters does Splunk allow in field names?—in the hope of getting a definitive answer in one place.
Perhaps the answer is different in different contexts: ingesting data versus dynamically (re)naming fields in SPL? (And I'm happy to discuss the meaning of "allow", if necessary.)
Input fields or index-time fields
Input fields are things like indexed_extractions and meta fields.
Index-time fields are created in the pipeline on the heavy forwarder and indexers, from the typing processor.
Because they are defined in props/transforms, the field names have to be simple, without spaces.
A few years ago, with JSON, field names in the "xxx.yyy" or "aaa:bbb:cc" format became possible.
Search-time fields
These are more flexible, because you can use 'my space string field' in SPL to refer to them.
Thank you for bringing this to our attention, and with such valuable details! I have reached out to our Engineering team to get a definitive answer. I believe that you are correct and that there might be a difference between field naming conventions during ingest/extraction and fields referenced in searches.
I have also created a JIRA issue for this and notified the writers who own the manuals that you have identified. I own the Search Reference (the eval command) documentation and agree that the wording is not as clear as it should be.