Getting Data In

Make Splunk stop learning sourcetypes!

lsolberg
Path Finder

Hi

In this setup, we have separate servers for each tier: universal forwarder -> forwarder -> indexer -> search head.

I am testing adding Linux logs (/var/log) to Splunk, but I don't want to pollute the Splunk indexer with any learned sourcetypes. If Splunk can't figure out the sourcetype based on its rules, the sourcetype should be set to 'linux_logs_unknown'.

We have managed to get rid of all the XXX-too_small entries by putting this in the props.conf on the universal forwarders:

[too_small]
PREFIX_SOURCETYPE = False

But I am still getting sourcetypes such as smbd-5 for source=/var/log/samba/smbd.log,
and sourcetype=wb-DOMAIN.log for source=/var/log/samba/wb-DOMAIN.log.

Note that this problem is not limited to Samba; it applies to everything under /var/log.

I am still somewhat new to Splunk, so please give examples 🙂

woodcock
Esteemed Legend

I assume that the problem isn't really in sourcetyping but rather in forwarding. You need a properly restrictive entry in inputs.conf. By that I mean you should have a whitelist OR a blacklist (i.e. do not just monitor a whole directory, but rather specific files in that directory), and you should prevent Splunk from digging any deeper than that by using the recursive = false directive. Read the "MONITOR:" section here:

http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/inputsconf
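
As a rough sketch of what such a restrictive stanza might look like in inputs.conf on the universal forwarder: the monitored path, the whitelist regex, and the sourcetype name linux_samba are assumptions for illustration only, so adjust them to your own files. Setting an explicit sourcetype on the input should also keep Splunk from guessing one for those files.

```
# inputs.conf (illustrative sketch; path, regex and sourcetype name are assumptions)
[monitor:///var/log/samba]
# Only pick up files whose full path ends in .log
whitelist = \.log$
# Do not descend into subdirectories of the monitored path
recursive = false
# Hypothetical explicit sourcetype, so nothing is left to automatic learning
sourcetype = linux_samba
```

You would repeat a stanza like this for each group of logs you actually want, rather than monitoring all of /var/log with one entry.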

If you really would like to disable learning, edit $SPLUNK_HOME/etc/apps/learned/local/app.conf and make sure it says this:

[install]
state = disabled