Splunk Dev

How can you specify additional break characters for the indexing tokenizer?

nathansvlsr
New Member

We have messages in which tabs and other control characters are replaced with escape sequences such as #011 (see the rsyslog EscapeControlCharactersOnReceive setting), and we do not want to turn this setting off. Ideally, we want Splunk to split on #011 in addition to the existing splitting tokens (real tabs, spaces, etc.). When we have log lines like:

#011Testing 123

We are unable to search for "Testing" without using a wildcard or some other substring technique. We would like to be able to search for Testing as if it were a log line without the #011 replacement.

sshelly_splunk
Splunk Employee

Take a look at $SPLUNK_HOME/etc/system/default/segmenters.conf. You can define your own segmenters by creating a file with the same name in the corresponding local/ directory rather than editing the default one. As always, test, test, test before deploying into production :)

You can also check out http://docs.splunk.com/Documentation/Splunk/latest/admin/Segmentersconf
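
To make the effect concrete, here is a rough Python sketch of what adding #011 as an extra breaker would change. This is not Splunk's tokenizer: the breaker list is a small illustrative subset rather than the real default MAJOR list from segmenters.conf, and `split_major` is a hypothetical helper written only for this example.

```python
import re

# Illustrative subset only (an assumption); the real default breaker
# list in segmenters.conf is considerably longer.
DEFAULT_MAJOR = [" ", "\t", "\n", "\r", ",", ";", "|"]


def split_major(text, breakers):
    """Split text on any of the breaker strings, dropping empty pieces."""
    # Try longer breakers first so a multi-character breaker such as
    # "#011" is matched as a whole.
    ordered = sorted(breakers, key=len, reverse=True)
    pattern = "|".join(re.escape(b) for b in ordered)
    return [tok for tok in re.split(pattern, text) if tok]


line = "#011Testing 123"

# Without the extra breaker, "Testing" is stuck to the escape sequence.
before = split_major(line, DEFAULT_MAJOR)

# With "#011" treated as a breaker, "Testing" becomes its own token.
after = split_major(line, DEFAULT_MAJOR + ["#011"])

print(before)  # ['#011Testing', '123']
print(after)   # ['Testing', '123']
```

In the first result "Testing" never appears as a standalone token, which matches the behavior described in the question; once #011 is in the breaker list, it does. Any real change to segmenters.conf should still be checked against the linked documentation and tested on a non-production index first.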
