Splunk Dev

How can you specify additional characters to the indexing tokenizer?

nathansvlsr
New Member

We have messages in which tabs and other control characters are replaced with escape sequences such as #011 (see the rsyslog EscapeControlCharactersOnReceive setting), and we do not want to turn this setting off. Ideally, we want Splunk to split on #011 in addition to the existing splitting tokens (real tabs, spaces, etc.). When we have log lines like:

#011Testing 123

We are unable to search for "Testing" without using a wildcard or some other substring technique. We would like to be able to search for Testing as if it were a log line without the #011 replacement.

sshelly_splunk
Splunk Employee

Take a look at $SPLUNK_HOME/etc/system/default/segmenters.conf. You can add your own segmenter settings by creating a file of the same name in $SPLUNK_HOME/etc/system/local/ rather than editing the default copy. As always, test, test, test before deploying into production :)

You can also check out http://docs.splunk.com/Documentation/Splunk/latest/admin/Segmentersconf
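
As a rough sketch of what that could look like (not a tested configuration): the stanza name rsyslog_escaped and the sourcetype name my_rsyslog_sourcetype below are made up for illustration, and the MAJOR line is deliberately left as a placeholder because the default breaker list differs between Splunk versions. It also assumes that a multi-character breaker such as #011 is honored in the same way as the multi-character URL-encoded entries (for example %20) in the default MAJOR list, which you should verify on a test instance.

```
# $SPLUNK_HOME/etc/system/local/segmenters.conf
# Hypothetical stanza name, not taken from the original post.
[rsyslog_escaped]
# Copy the full MAJOR list from the stanza your data currently uses in
# default/segmenters.conf (commonly [indexing]), then append #011.
MAJOR = <existing MAJOR breakers copied from default> #011

# $SPLUNK_HOME/etc/system/local/props.conf
# Hypothetical sourcetype name; use the one assigned to your rsyslog data.
[my_rsyslog_sourcetype]
# Point index-time segmentation at the custom segmenter defined above.
SEGMENTATION = rsyslog_escaped
```

Because segmentation is applied at index time, a change like this would be expected to affect only events indexed after the change and a restart, not data that is already in the index.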
