sourcetype best practices

a212830
Champion

Hi,

I'm looking for some help on sourcetype naming. I have a bunch of log files: some Apache error logs, some Apache access logs, and some custom application error logs. I want to give my customers an easy way to search these logs (there will be dozens of them). Should I use a pre-trained sourcetype? Wouldn't that make it more difficult to search the logs? If I use a custom sourcetype (say "appname"), will Splunk recognize the log file formats?


hexx
Splunk Employee

The concept of sourcetype was introduced so that a metadata field associated with each event would describe the nature of the data, which typically tells us something about its structure rather than its precise origin. "Where is this data coming from?" is a question best answered by the 'host' and 'source' metadata fields. The sourcetype is there to answer a different question: "What kind of data is this?"

For that reason, I would not recommend assigning the same sourcetype to access logs and application logs, for example. You are probably better off using a pre-trained sourcetype whenever one is available, such as 'access_common' or 'access_combined' for HTTPD access logs. This brings the benefit of pre-packaged field extractions, among other things.

Note that most pre-trained sourcetypes are defined in $SPLUNK_HOME/etc/system/default/props.conf.
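As a rough illustration of the advice above, an inputs.conf along these lines would give each kind of log its own sourcetype. This is only a sketch: the file paths and the 'myapp_error' sourcetype name are made up for the example, and I am assuming the stock 'apache_error' pre-trained sourcetype fits your Apache error logs.

```ini
# inputs.conf (sketch; all monitored paths below are hypothetical)

# Apache access logs: reuse a pre-trained sourcetype to get its field extractions
[monitor:///var/log/httpd/access_log]
sourcetype = access_combined

# Apache error logs: assumes the pre-trained 'apache_error' sourcetype matches your format
[monitor:///var/log/httpd/error_log]
sourcetype = apache_error

# Custom application error logs: 'myapp_error' is a hypothetical custom sourcetype
[monitor:///opt/myapp/logs/error.log]
sourcetype = myapp_error
```

With a custom sourcetype, Splunk will not know the log format in advance, so you would likely need your own settings for it in a local props.conf.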

ChrisG
Splunk Employee

I also recommend reading http://docs.splunk.com/Documentation/Splunk/latest/Data/Whysourcetypesmatter and the topics that follow it.

hexx
Splunk Employee

There are many ways to do this, and it really depends on what characterizes the set of events you want your search to return. You can use:

- Wildcards in your search terms
- Eventtypes
- Tags
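To make the eventtype and tag options a bit more concrete, here is a minimal sketch. The eventtype name 'myapp_logs', the tag 'myapp', and the 'myapp_error' sourcetype are all hypothetical; substitute whatever sourcetypes your inputs actually assign.

```ini
# eventtypes.conf (sketch): one eventtype covering every log that belongs to the application.
# 'myapp_logs' and 'myapp_error' are hypothetical names.
[myapp_logs]
search = sourcetype=access_combined OR sourcetype=apache_error OR sourcetype=myapp_error

# tags.conf (sketch): attach the hypothetical tag 'myapp' to that eventtype
[eventtype=myapp_logs]
myapp = enabled
```

Your customers could then search for eventtype=myapp_logs or tag=myapp instead of listing every host or file. The wildcard route needs no configuration at all: a search term such as source=/var/log/httpd/* matches every source under that (again hypothetical) directory.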

a212830
Champion

Thanks, this helps me understand the usage better. Still, if I have dozens of logfiles, across multiple hosts, and I want to search them, how would I easily do that? I don't want to type in each host or logfile - that's a lot of work. Is there an alias, or something like that?
