What is the best way to use sourcetype with HTTP Event Collector to categorize data?

simpkins1958
Contributor

From the HTTP Event Collector setting page:

Source type
The source type is one of the default fields that Splunk assigns to all incoming data. It tells Splunk what kind of data you've got, so that Splunk can format the data intelligently during indexing. *And it's a way to categorize your data, so that you can search it easily.*

We are sending key/value pairs through HTTP Event Collector. We currently use sourcetype to categorize the type of data carried by the key/value pairs. Alternatively, we could add a key that holds the type of data.

Is using sourcetype to categorize data a good practice? Or should we leave sourcetype unset for our HTTP events and put the category in a key/value pair instead?
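
To make the two options concrete, here is a minimal Python sketch of what each payload might look like when sent to the HEC JSON event endpoint. The `sourcetype` and `event` keys are part of the HEC event envelope; the `myapp` prefix, the `category` key name, and the sample data are hypothetical and only there for illustration.

```python
import json


def event_with_sourcetype(category, data):
    """Option A: one sourcetype per category, set in the HEC envelope.

    The "myapp:<category>" naming scheme is a made-up convention.
    """
    return {"sourcetype": f"myapp:{category}", "event": data}


def event_with_category_key(category, data):
    """Option B: a single sourcetype, with the category carried as an
    ordinary key inside the event body (key name is hypothetical)."""
    return {"sourcetype": "myapp", "event": {"category": category, **data}}


if __name__ == "__main__":
    sample = {"order_id": 1001, "status": "shipped"}  # illustrative data
    print(json.dumps(event_with_sourcetype("orders", sample)))
    print(json.dumps(event_with_category_key("orders", sample)))
```

With option A a search would filter on the sourcetype itself; with option B it would filter on the single sourcetype plus the category key.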

1 Solution

gblock_splunk
Splunk Employee

The main value of sourcetype is that you can associate different processing rules with it, and those rules run at index time or search time based on the sourcetype. So in your case, if you think you might want different rules for different categories, then different sourcetypes make sense rather than a single sourcetype. Having a single sourcetype and using a category field, for example, gives you one set of rules for all your data.

If there are no rules at all, then it really doesn't matter which way you go.

sumitnagal
Path Finder

@simpkins1958 would you mind sharing your HTTP Event Collector stream code? We are trying to push data via a stream, and we are not able to set the sourcetype and source. They are taking the default values http-stream-too_small or http-stream.
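
This is not the original poster's code, and it does not reproduce the stream client mentioned above. As a minimal sketch, assuming the JSON event endpoint (`/services/collector/event`) is used, `sourcetype` and `source` can be set per event in the JSON envelope next to `event`. The host name, token, and field values below are placeholders, and `build_hec_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype, source):
    """Build (but do not send) a POST to the HEC JSON event endpoint.

    sourcetype and source travel in the JSON envelope alongside the event.
    """
    body = json.dumps(
        {"event": event, "sourcetype": sourcetype, "source": source}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",  # HEC token auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_hec_request(
        "https://splunk.example.com:8088",  # placeholder host and port
        "00000000-0000-0000-0000-000000000000",  # placeholder token
        {"message": "hello"},
        sourcetype="myapp:orders",  # hypothetical names
        source="order-service",
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

If the metadata is omitted from the envelope, the values fall back to whatever defaults apply to the token or input, which may be what is happening in the case described above.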

s2_splunk
Splunk Employee

I would qualify the accepted answer by adding that sourcetype is an indexed field. If you have a good number of different sourcetypes, filtering on that field will improve search performance compared to filtering on an event-level key/value pair that is extracted at search time.
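
For reference, the HEC JSON envelope also accepts a `fields` object whose entries are indexed with the event, so a category does not have to live either in sourcetype or in a search-time key. A minimal Python sketch, assuming the JSON event endpoint is used; the `category` field name, the `myapp` sourcetype, and the helper name are hypothetical:

```python
import json


def event_with_indexed_category(category, data):
    """Single sourcetype, with the category sent in the envelope's
    "fields" object so it is indexed rather than extracted at search time.

    Values under "fields" should be flat (strings or lists of strings).
    """
    return {
        "sourcetype": "myapp",  # hypothetical sourcetype
        "fields": {"category": category},  # hypothetical field name
        "event": data,
    }


if __name__ == "__main__":
    print(json.dumps(event_with_indexed_category("orders", {"order_id": 1001})))
```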

jrodman
Splunk Employee

It's a common misconception that indexed fields have notably different performance characteristics from text tokens. They don't. We look them up the same way. Indexed fields only behave notably differently when the field name and value together are drastically less common than the value alone.

However, the fields source, sourcetype, and host hold a fairly special place in Splunk and offer much more powerful ways to apply implicit processing by data category, among other things. sourcetype is best thought of as "a type of data", such as the kind of data produced by a particular application or, for complex applications, one type of datastream it produces. It is something for which you can create a rich configuration that automatically extracts further data based on format and structure.

gblock_splunk
Splunk Employee

@ssievert that's a good point!

simpkins1958
Contributor

Thanks. We will be using sourcetypes for our categories.

simpkins1958
Contributor

Thanks Glenn.
