Make Splunk stop learning sourcetypes!

lsolberg
Path Finder

Hi

In this setup, we have separate servers for each tier: universal forwarder -> forwarder -> indexer -> search head.

I am testing adding Linux logs (/var/log) to Splunk, but I don't want to pollute the Splunk indexer with -any- learned sourcetypes. If Splunk can't figure out the sourcetype based on its rules, the sourcetype should be set to 'linux_logs_unknown'.

We have managed to get rid of all the XXX-too_small entries by putting this in the props.conf on the universal forwarders:

[too_small]
PREFIX_SOURCETYPE = False

But I am still getting sourcetypes such as smbd-5 for source=/var/log/samba/smbd.log.
And sourcetype=wb-DOMAIN.log for source=/var/log/samba/wb-DOMAIN.log.

Note that this problem is not limited to samba; it affects everything under /var/log.

I am still somewhat new to Splunk, so please give examples 🙂

woodcock
Esteemed Legend

I assume that the problem isn't really in sourcetyping but rather in forwarding. You need a properly restrictive entry in inputs.conf, and by that I mean that you should have a whitelist OR a blacklist (i.e. do not just monitor a whole directory but rather some specific files in that directory) and prevent Splunk from digging any deeper than that by setting recursive = false. Read the "MONITOR:" section here:

http://docs.splunk.com/Documentation/Splunk/6.2.3/admin/inputsconf
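
Since you asked for examples, here is a rough sketch of what I mean for inputs.conf on the universal forwarder. The file paths, the whitelist pattern, and the sourcetype assigned to each input are only assumptions for illustration; swap in whatever you actually want indexed.

```ini
# inputs.conf on the universal forwarder (illustrative values only)

# Monitor individual files and name the sourcetype explicitly.
# With an explicit sourcetype, Splunk has nothing to guess or learn.
[monitor:///var/log/messages]
sourcetype = syslog

[monitor:///var/log/secure]
sourcetype = linux_secure

# Monitor a single directory: only files ending in .log,
# and do not descend into subdirectories.
[monitor:///var/log/samba]
whitelist = \.log$
recursive = false
sourcetype = linux_logs_unknown
```

Keep in mind that whitelist and blacklist are regular expressions matched against the full path of each file, not shell globs. Because every stanza above sets sourcetype itself, names like smbd-5 should not get the chance to appear for those inputs.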

If you really would like to disable learning, edit $SPLUNK_HOME/etc/apps/learned/local/app.conf and make sure it says this:

[install]
state = disabled
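
If you would rather pin sourcetypes by path in props.conf instead of in inputs.conf, a source-based stanza is another option. This is only a sketch; the path pattern is an assumption, and linux_logs_unknown is the fallback name from your question.

```ini
# props.conf on the universal forwarder (illustrative values only)

# Assign a fixed sourcetype to everything matching this source path.
# In a source:: stanza, "*" matches within a single path segment.
[source::/var/log/samba/*.log]
sourcetype = linux_logs_unknown
```

One caveat: a source-based assignment like this takes precedence over Splunk's rule-based recognition, so files matching the stanza always get the sourcetype you set here rather than one Splunk picks for them.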