Community Blog
Mastering Splunk Source Types: A Complete Q&A Guide

hettervik
Builder

The Foundation of Data: A Practitioner's Guide to Splunk Source Types 

Even if you are new to Splunk, you are probably somewhat familiar with source types. They are one of the first things you learn about in your Splunk journey. That said, the concept and its applications can be hard to fully grasp. In this blog post I’ll try to answer some common questions about source types.

Key Takeaways

  • Technical Core: Source types classify logs into defined data structures and standardize how those structures are treated.
  • Efficiency Engine: They are a key component in parsing, field extractions and CIM normalization, the things that make your Splunk fast and friendly.
  • Search Strategy: The source type is a default indexed field, which gives an effective way to optimize query performance and allows the use of commands like tstats.

Technical Definition: What is a Source Type Exactly? 

Technically, a source type is just a default indexed field. Normally the source type is determined at data ingest, but it can be rewritten later in the pipeline as well. “Default” means that every event in Splunk has this field, and “indexed” means that the field is stored in the index (as opposed to being extracted at search time). The picture below shows examples of source types, as shown in Splunk Web (note the other default indexed fields as well, listed on the left).

Run a search in Splunk to see source types (the actual field name being “sourcetype”). They are pre-selected as an interesting field in the GUI.
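As a minimal sketch of such a search, the following counts events per source type. The index (_internal) and the 15-minute window are assumptions, chosen only because Splunk's own internal logs exist in most installations; substitute an index you actually have access to.

```spl
index=_internal earliest=-15m
| stats count by sourcetype
| sort - count
```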

Conceptual Overview: Classifying Your Log Formats 

Conceptually, the source type classifies the data structure of a group of log events. In other words, it specifies the log format. For example, you might have a group of sources (e.g. log files) that send events to Splunk in JSON format. The common source type for all these events should then be JSON. Put differently, the source type tells you what the events are, while the other default fields “source” and “host” tell you where the events come from.

Backend Mechanics: How Splunk Uses Source Types Under the Hood 

Rules for parsing and field extraction are defined per source type. This means that you can use source types to decide how different log streams should be treated. Spending some time on well-defined, logical source types gives a better end-user experience in Splunk, and lets the Splunk platform as a whole use its resources more efficiently.

Configuration Guide: Defining Your Source Types in props.conf 

All source types are defined in the configuration file props.conf. The file can be edited directly, or you can add and edit source types through the Splunk Web UI. What works best depends on your environment and situation. Each source type has its own stanza in props.conf, where the stanza header is the source type name and the settings below it define the parsing rules and field extractions.

Strategy: What Makes a "Well-Defined" Source Type? 

A source type that specifies efficient parsing rules and user-friendly field extractions is a good start. If the field extractions also follow the naming standards defined in the Common Information Model (CIM), which is used for data normalization and data models, you have a quite well-defined source type. Make sure to look into the Splunk Great Eight for super-efficient parsing rules. Note that defining source types requires some manual work, but the payoff makes it worth it.
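To make this concrete, here is a sketch of what such a props.conf stanza could look like. It sets the eight parsing settings commonly referred to as the Great Eight and adds one CIM-style field alias. The source type name (acme:app:json), the timestamp layout and the field name client_ip are all made up for illustration, so adapt every value to your own data before using anything like this.

```ini
# Hypothetical source type; the name, timestamp layout and field names are assumptions.
[acme:app:json]
# Event breaking: one event per line, no line merging.
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TRUNCATE = 10000
# Timestamp recognition, assuming events contain e.g.
# "timestamp": "2026-01-15T10:22:31.123+0000"
TIME_PREFIX = "timestamp"\s*:\s*"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z
MAX_TIMESTAMP_LOOKAHEAD = 30
# Event breaking on universal forwarders.
EVENT_BREAKER_ENABLE = true
EVENT_BREAKER = ([\r\n]+)
# Search-time field extraction and a CIM-style alias.
KV_MODE = json
FIELDALIAS-cim_src = client_ip AS src
```

Setting the line breaking and timestamp rules explicitly means Splunk does not have to guess them for every event, which is where the parsing efficiency comes from.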

Naming Conventions: Building a Consistent Hierarchy 

There is no single consensus on how source types should be named, but having a consistent naming convention within your environment is beneficial. Commonly you’ll see source type names using a colon-separated hierarchy, where you “build” the name from “general” to “specific”. See the screenshot above for examples, like "aws:cloudwatchlogs:vpcflow". One benefit of this system, besides being easy to read, is that you can add a wildcard after any colon to search that subset of source types, for example "aws:cloudwatchlogs:*" or "aws:*".
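A wildcard search over such a hierarchy might look like the sketch below. The index=* filter and the one-hour window are placeholders only; in practice you would name the index that holds your AWS data.

```spl
index=* sourcetype="aws:cloudwatchlogs:*" earliest=-1h
| stats count by sourcetype
```

The result lists every source type under the aws:cloudwatchlogs branch that received events in the time range, which is also a quick way to check that the naming convention is being followed.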

Deployment Logic: When to Split into Different Source Types 

When to split a source type? That’s a good question. To add to the JSON example mentioned earlier, you might have two different systems logging JSON events. Even though the log format is the same (JSON), the fields used by the systems might be totally different. This means that rules for field extraction and normalization need to be customised for each system, and thus, each system needs its own source type. You could define the source type names as something like “json:this” and “json:that”, each with its own field extraction rules. 
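As a rough sketch, the two source types could share the same JSON extraction mode but normalize different original fields to the same CIM field. The original field names username and account_name are invented for this example.

```ini
# Hypothetical stanzas; the original field names are assumptions.
[json:this]
KV_MODE = json
# This system logs the user as "username".
FIELDALIAS-cim_user = username AS user

[json:that]
KV_MODE = json
# This system logs the user as "account_name".
FIELDALIAS-cim_user = account_name AS user
```

With something like this in place, a search on user=... would work across both source types even though the raw events look different.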

When You Should NOT Split Source Types 

If you have different systems logging essentially the same type of events, with only minor variations, it might not be smart to split all these log streams into different source types. If you later want to change a field extraction rule for all of them, you would have to update each source type, one by one, every time. In this case a common source type might be a better option. You’ll have to decide what is best case by case.

Search Optimization: How Users Benefit from Source Types 

Users can specify a source type field in a query to easily find and filter events. For example, by searching for source type “WinEventLog”, they’ll find events from Windows Event Log. Since the source type field is an indexed field, it gives good search efficiency when specified. Also, note that indexed fields allow for the usage of tstats, which can be used to create super-efficient searches (see example below). 

tstats is a highly optimized way to search your data.
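A minimal tstats sketch, using the “WinEventLog” source type from the example above, could look like this. The index=* filter is a placeholder; narrowing it to the actual index is usually faster.

```spl
| tstats count where index=* sourcetype="WinEventLog" by host
```

This returns the same counts as running a regular search on the source type and piping it to stats count by host, but it reads only the indexed fields rather than the raw events, which is why it tends to be much faster.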

You can use tstats on indexed fields, but also on other “terms” that exist in your data (see my other blog post if you are interested). Understanding exactly how this works takes a bit of practice and learning, but combine this knowledge with a set of well-defined source types, and you can create all sorts of Splunk tstats magic.
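One way to search on such a term is the TERM() directive in the tstats where clause. The sketch below counts events containing the indexed term "error" per source type. The term and the index=* filter are only examples, and whether a given string is stored as a single indexed term depends on how your data is segmented.

```spl
| tstats count where index=* TERM(error) by sourcetype
```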

More source type searching tips 

Also, since source type is a special default field, not just any indexed field, you can use the metadata command as well. This command lets you search bucket metadata for source type information, without even touching the logs on disk. Its uses are limited, but when it fits, it gives huge efficiency benefits. One common use of metadata is to efficiently catch stopped data streams, e.g. to see when logs of a certain format (source type) suddenly stop being ingested into Splunk. This could indicate a failure somewhere in the data pipeline that needs to be investigated.

The metadata command can be used to find source types that haven’t sent data to Splunk in more than a day.
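One possible version of such a search is sketched below. It uses recentTime, which the metadata command returns as the most recent time data of that source type was indexed, and compares it with a one-day threshold. The index=* filter and the threshold are assumptions to adjust. The search time range should be wide enough (for example the last 7 days) to include the buckets you care about.

```spl
| metadata type=sourcetypes index=*
| where recentTime < relative_time(now(), "-1d")
| eval last_indexed=strftime(recentTime, "%Y-%m-%d %H:%M:%S")
| table sourcetype totalCount last_indexed
| sort last_indexed
```

A search like this can be saved as an alert, so that a source type going quiet is noticed without anyone having to look.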

Another special use of the source type field is to analyse license usage. Splunk keeps a license usage log, which also has some prebuilt views in the Monitoring Console. This log can be split by source type, meaning that you can use this field to identify which “classes” of sources are using the most of your license. The better and more consistently you’ve defined your source types, the more value you get from this license analysis.
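A sketch of such an analysis is shown below. In license_usage.log, events of type Usage carry the number of bytes in the field b and the source type in the field st. The seven-day window is an arbitrary choice, and the search assumes you can search the _internal index of your license manager.

```spl
index=_internal source=*license_usage.log* type=Usage earliest=-7d@d
| stats sum(b) as bytes by st
| eval GB=round(bytes/1024/1024/1024, 2)
| sort - GB
| rename st as sourcetype
| fields sourcetype GB
```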

About the Author

Martin Hettervik, Senior Consultant and Team Leader at Accelerate Oslo, Splunk MVP 
LinkedIn: https://www.linkedin.com/in/martinhettervik/ 
