
How can you specify additional breaking characters for the indexing tokenizer?

nathansvlsr
New Member

We have messages in which tabs and other control characters are replaced with escape sequences such as #011 (see the rsyslog EscapeControlCharactersOnReceive setting), and we do not want to turn this setting off. Ideally, we want Splunk to split on #011 in addition to the existing splitting tokens (real tabs, spaces, etc.). When we have log lines like:

#011Testing 123

We are unable to search for "Testing" without using a wildcard or some other substring technique. We would like to be able to search for Testing as if it were a log line without the #011 replacement.


sshelly_splunk
Splunk Employee

Take a look at $SPLUNK_HOME/etc/system/default/segmenters.conf. You can add your own key/value segmenter settings by creating that file in ./local/. As always, test, test, test before deploying into production :)

You can also check out http://docs.splunk.com/Documentation/Splunk/latest/admin/Segmentersconf
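
To illustrate why adding #011 as a breaking string would make "Testing" searchable as a whole term, here is a minimal Python sketch of breaker-based tokenization. It is only a toy model and not Splunk's actual segmentation algorithm: the `tokenize` function and the `BASE_BREAKERS` list are hypothetical names made up for this example, and the list is a simplified stand-in for the real breakers defined in segmenters.conf.

```python
import re

# Simplified stand-ins for breaking characters (assumption: NOT Splunk's
# actual default list, which lives in segmenters.conf).
BASE_BREAKERS = [" ", "\t", "\n"]


def tokenize(line, breakers):
    """Split a line into tokens on any of the given breaker strings."""
    # Try longer breakers first so a multi-character breaker such as
    # "#011" is matched before any shorter breaker sharing its prefix.
    ordered = sorted(breakers, key=len, reverse=True)
    pattern = "|".join(re.escape(b) for b in ordered)
    # Drop the empty strings that re.split produces at line boundaries
    # or between adjacent breakers.
    return [tok for tok in re.split(pattern, line) if tok]


line = "#011Testing 123"

# Without "#011" as a breaker, it stays glued to the following word.
print(tokenize(line, BASE_BREAKERS))

# With "#011" added, "Testing" becomes a token of its own.
print(tokenize(line, BASE_BREAKERS + ["#011"]))
```

In this toy model the first call keeps "#011Testing" as a single token, which is why an exact search for Testing would miss it, while the second call yields "Testing" on its own. Whether a real segmenter behaves the same way should be verified on a test instance.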
