When I talk to folks who are new to Splunk, I often struggle to explain the concept of a sourcetype to them. Other basic fields, like host, source, and _time, are more easily understood because they exist outside of Splunk.
Analogies tend to be a great way to convey new concepts. So I'm curious: what analogies for sourcetype have worked for you?
I find humans to be a great analogy. Here's how I explain it:
Splunk headquarters is in downtown San Francisco, California, adjacent to the Embarcadero, along the city's shoreline where thousands of people pass by every day: pedestrians, tourists, runners, families, workers on break, and so on.
All the people on the Embarcadero have their own names and addresses. Some are named John, some have black hair, some even come from the same address, like a family or coworkers from the same company. This is similar to machine data! While the data may not have a name, it likely has a host name, or identifier for the unique asset that created the data. Likewise, the data may not rest its head, but it did originate from an address, or source location.
When you compare and contrast people on the Embarcadero, it makes more sense to compare them by attributes they have in common, such as bikers, runners, or tourists, rather than dealing with each individual's name or address. By organizing the people this way, you can compare and contrast their common attributes effectively.
In Splunk software, you do the same thing with sourcetype. Consider Apache web logs. By referring to all Apache web logs with the same source type name, we can calculate average web request time without having to list every Apache host or source path. It is best to create source types for data that has a similar structure. For example, bicyclists and unicyclists are similar, but they are structured differently. Likewise, Apache web logs and IIS web logs are both web logs, but they are structurally different and have different values worth comparing, so they should each have their own source type.
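To make the grouping idea concrete, here is a minimal Python sketch of what a source type buys you. It is not Splunk code. The events, host names, paths, the `response_ms` field, and the source type labels are all made-up assumptions for illustration. The point is only that grouping by one shared label replaces listing every host or source.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical events: every host, path, label, and the response_ms field
# are invented for this illustration and are not taken from a real system.
events = [
    {"host": "web01", "source": "/var/log/apache2/access.log",
     "sourcetype": "access_combined", "response_ms": 120},
    {"host": "web02", "source": "/var/log/httpd/access_log",
     "sourcetype": "access_combined", "response_ms": 80},
    {"host": "win01", "source": "C:\\logs\\web\\requests.log",
     "sourcetype": "iis", "response_ms": 200},
]


def average_by_sourcetype(events, field):
    """Average a numeric field per source type, ignoring host and source."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["sourcetype"]].append(event[field])
    return {name: mean(values) for name, values in grouped.items()}


averages = average_by_sourcetype(events, "response_ms")
print(averages)
```

Both Apache events come from different hosts and different file paths, yet they land in the same bucket because they share a source type. The IIS event stays separate, which mirrors the bicyclist and unicyclist distinction above.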
To learn more about source types, check out "Why source types matter" in the Getting Data In manual. For those ready to define their own custom source types, you can find naming conventions in the "Source types for add-ons" topic of the Splunk Add-ons manual.