Which characters does Splunk allow in field names?

It seems very strange to me to be asking this question in 2019 for Splunk 7.3.1, but I've used Splunk, I've read the Splunk docs, I've read numerous questions and answers here on Splunk Answers, and I observe significant differences between different Splunk docs topics and also—depending on which docs topics you read—what Splunk allows in practice. Or perhaps the answer is different in different contexts, and I've overlooked some docs that would clarify this for me.

Some extracts from the Splunk docs (mostly copied from a comment I recently added to the question "Safe characters for field names")...

From Splunk docs / Documentation / Splunk Enterprise / Getting Data In / Create custom fields at index time:

Field name syntax restrictions

You can assign field names as follows:

  • Valid characters for field names are a-z, A-Z, 0-9, or _ .

Similarly, from Splunk docs / Documentation / Splunk Enterprise / Knowledge Manager Manual / Field Extractor: Select Fields step:

Field names must start with a letter and contain only letters, numbers, and underscores.

The default true value of the CLEAN_KEYS setting in transforms.conf enforces those allowed values for search-time field extractions:

CLEAN_KEYS = [true|false]
NOTE: This setting is only valid for search-time field extractions.
Optional. Controls whether Splunk software "cleans" the keys (field names) it
extracts at search time. "Key cleaning" is the practice of replacing any
non-alphanumeric characters (characters other than those falling between the
a-z, A-Z, or 0-9 ranges) in field names with underscores, as well as the
stripping of leading underscores and 0-9 characters from field names.
...
Default: true

Consistent so far.
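
As a minimal sketch of the key cleaning described in that transforms.conf excerpt, here is a Python approximation. The function name clean_key is hypothetical, and the order of the two steps (replace first, then strip) is my assumption, since the docs do not spell it out. This is not Splunk's actual implementation.

```python
import re


def clean_key(name: str) -> str:
    """Approximate the documented CLEAN_KEYS = true behavior (assumption, not Splunk code)."""
    # Step 1: replace every character outside a-z, A-Z, 0-9 with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9]", "_", name)
    # Step 2: strip leading underscores and digits.
    return re.sub(r"^[_0-9]+", "", cleaned)


if __name__ == "__main__":
    for raw in ("server-1", "_9lives", "my field", "a.b:c"):
        print(f"{raw!r} -> {clean_key(raw)!r}")
```

Under these assumptions, a name such as server-1 would come out as server_1, and a name with a leading underscore or digit would lose that prefix.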

But then the false value allows other characters:

Add CLEAN_KEYS = false to your transform if you need to extract field
names that include non-alphanumeric characters, or which begin with
underscores or 0-9 characters.

Splunk docs / Documentation / Splunk Enterprise / Knowledge Manager Manual / About regular expressions with field extraction:

Proper field name syntax
Field names must conform to the field name syntax rules.

  • Valid characters for field names are a-z, A-Z, 0-9, . , :, and _.

adds the period (.) and colon (:).
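
To make the differences between the three quoted rules concrete, here is a small Python sketch that encodes each one as a regular expression and reports which rules accept a given name. The labels and the accepted_by helper are hypothetical names of my own, and the patterns are my reading of the quoted sentences, not anything taken from Splunk itself.

```python
import re

# Each pattern is an interpretation of one docs excerpt quoted above.
DOCUMENTED_RULES = {
    # "Create custom fields at index time": a-z, A-Z, 0-9, or _
    "index_time": re.compile(r"[A-Za-z0-9_]+"),
    # "Field Extractor: Select Fields step": starts with a letter, then letters, numbers, underscores
    "field_extractor": re.compile(r"[A-Za-z][A-Za-z0-9_]*"),
    # "About regular expressions with field extraction": a-z, A-Z, 0-9, ., :, and _
    "regex_extraction": re.compile(r"[A-Za-z0-9_.:]+"),
}


def accepted_by(name: str) -> dict:
    """Return, per documented rule, whether the whole name matches that rule."""
    return {label: bool(pattern.fullmatch(name)) for label, pattern in DOCUMENTED_RULES.items()}


if __name__ == "__main__":
    for candidate in ("status_code", "9lives", "xxx.yyy", "server-1"):
        print(candidate, accepted_by(candidate))
```

A name like xxx.yyy passes only the third rule, and server-1 passes none of them, which is the inconsistency I am asking about.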

Splunk docs / Documentation / Splunk Enterprise / Search Reference / eval:

If the expression references a field name that contains non-alphanumeric characters, it needs to be surrounded by single quotation marks. For example, if the field name is server-1 you specify the field name like this new=count+'server-1'.

implicitly allows other ("non-alphanumeric") characters in field names.
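
For illustration, the quoting rule from the eval docs could be sketched in Python as a helper that builds a field reference for an eval expression. The name eval_field_ref is hypothetical. Treating underscores as safe to leave unquoted, and quoting names that start with a digit, are assumptions on my part; the quoted sentence only mentions "non-alphanumeric characters".

```python
import re

# Assumption: a name made only of letters, digits, and underscores, and not
# starting with a digit, can be written without quotes in an eval expression.
_BARE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def eval_field_ref(name: str) -> str:
    """Return the field name as it might be written inside an eval expression."""
    if _BARE_NAME.fullmatch(name):
        return name
    # Per the eval docs quoted above: surround the name with single quotation marks.
    return f"'{name}'"


if __name__ == "__main__":
    expression = f"new={eval_field_ref('count')}+{eval_field_ref('server-1')}"
    print(expression)
```

With the field names from the docs example, this produces the same expression the docs show: new=count+'server-1'.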

And the documentation for the rename command offers examples of renaming fields to give them more meaningful names, with spaces (I frequently do this in practice).

Given these apparently conflicting docs, I thought I'd ask this question—which characters does Splunk allow in field names?—in the hope of getting a definitive answer in one place.

Perhaps the answer is different in different contexts: ingesting data versus dynamically (re)naming fields in SPL? (And I'm happy to discuss the meaning of "allow", if necessary.)

Splunk Employee

I would put my money on the distinction between

  • Input or index-time fields: these are things like indexed_extractions and meta fields. Index-time processing happens in the pipeline on the heavy forwarder and indexers, from the typing processor. Because these fields are defined in props/transforms, the field names have to be simple, without spaces.

A few years ago, with JSON, field names in the "xxx.yyy" or "aaa:bbb:cc" format became possible.

  • Search-time fields: these are more flexible, as you can use 'my space string field' in SPL to refer to them.

Splunk Employee

Graham -

Thank you for bringing this to our attention, and with such valuable detail! I have reached out to our Engineering team to get a definitive answer. I believe that you are correct and that there might be a difference between field naming conventions during ingest/extraction and fields referenced in searches.

I have also created a JIRA issue for this and notified the writers who own the manuals that you have identified. I own the Search Reference (the eval command) documentation and agree that the wording is not as clear as it should be.

Laura

In case this is useful information for anyone wishing to answer: typically, I forward logs in JSON Lines format to a Splunk TCP data input.
