Getting Data In

What is the best analogy for explaining 'sourcetypes'?

sloshburch
Splunk Employee

When I talk to folks who are new to Splunk, I often struggle to explain the concept of a sourcetype to them. Other basic fields, like host, source and _time, are more easily understood because they exist outside of Splunk.

Analogies tend to be a great way to convey new concepts, so I'm curious: what analogies for sourcetype have worked for you?

1 Solution

sloshburch
Splunk Employee

The Splunk Product Best Practices team provided this response. Read more about "How Crowdsourcing is Shaping the Future of Splunk Best Practices".

I find humans to be a great analogy. Here's how I explain it:

Splunk headquarters is in downtown San Francisco, California, adjacent to the Embarcadero, along the city's shoreline where thousands of people pass by every day: pedestrians, tourists, runners, families, workers on break, and so on.

All the people on the Embarcadero have their own names and addresses. Some are named John, some have black hair, some even come from the same address, like a family or coworkers from the same company. This is similar to machine data! While the data may not have a name, it likely has a host name, or identifier for the unique asset that created the data. Likewise, the data may not rest its head, but it did originate from an address, or source location.

When you compare and contrast people on the Embarcadero, it makes more sense to group them by what they have in common, such as being bikers, runners, or tourists, rather than dealing with each individual's name or address. Organizing people this way lets you compare whole groups effectively.

In Splunk software, you do the same thing with sourcetype. Consider Apache web logs. By referring to all Apache web logs by the same source type name, we can calculate the average web request time without having to list every Apache host or source path. It is best to create source types for data that has a similar structure. For example, bicyclists and unicyclists are similar, but they are structured differently. Likewise, Apache web logs and IIS web logs are both web logs, but they are structurally different and have different values worth comparing, so they should each have their own source type.
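For folks who think better in code, here is a rough Python sketch of the same idea. It is not how Splunk works internally; it only shows why a shared label beats listing every host or source. The events, the `response_ms` field, the host names, the paths, and the helper `average_by_sourcetype` are all made up for illustration, and the sourcetype values are just example labels.

```python
from collections import defaultdict

# Hypothetical events: every host, path, and value here is invented.
events = [
    {"host": "web01", "source": "/var/log/apache2/access.log",
     "sourcetype": "access_combined", "response_ms": 120},
    {"host": "web02", "source": "/var/log/httpd/access_log",
     "sourcetype": "access_combined", "response_ms": 80},
    {"host": "win01", "source": "C:\\inetpub\\logs\\u_ex.log",
     "sourcetype": "iis", "response_ms": 200},
]


def average_by_sourcetype(events, field):
    """Average a numeric field per sourcetype, ignoring host and source."""
    totals = defaultdict(lambda: [0, 0])  # sourcetype -> [sum, count]
    for event in events:
        if field in event:
            entry = totals[event["sourcetype"]]
            entry[0] += event[field]
            entry[1] += 1
    return {st: total / count for st, (total, count) in totals.items()}


averages = average_by_sourcetype(events, "response_ms")
print(averages)
```

The two Apache events come from different hosts and different file paths, yet they land in one group because they share a sourcetype. The IIS event stays in its own group, just like the unicyclist.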

To learn more about source types, check out "Why source types matter" within the Getting Data In manual. For those ready to define their own custom source types, discover naming conventions within the "Source types for add-ons" of the Splunk Add-ons manual.
