 
					
				
		
From the HTTP Event Collector setting page:
Source type 
The source type is one of the default fields that Splunk assigns to all incoming data. It tells Splunk what kind of data you've got, so that Splunk can format the data intelligently during indexing. *And it's a way to categorize your data, so that you can search it easily.*
We are sending key/value pairs to Splunk via HTTP Event Collector. We currently use sourcetype to categorize the type of data associated with the key/value pairs. We could instead add a key that holds the type of data.
Is using sourcetype to categorize data a good practice? Or should we leave the sourcetype unset for our HTTP events and use a key/value pair instead?
 
		
		
		
		
		
	
			
		
		
			
					
		The main value of sourcetype is that you can associate different processing rules with it, which run at either index time or search time. So in your case, if you think you might want different rules for different categories, then different sourcetypes make sense, rather than a single sourcetype. Having a single sourcetype and using a category field, for example, will allow you to have one set of specific rules for all your data.
If there are no rules, period, then it really doesn't matter which way you go.
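
To make the two options concrete, here is a minimal sketch of what the HTTP Event Collector JSON body could look like in each case. The helper `build_hec_event`, the sourcetype names `myapp:metrics` and `myapp:events`, and the `category` key are hypothetical names chosen for illustration; only the top-level `event`, `sourcetype`, and `source` keys come from the HEC event format.

```python
import json


def build_hec_event(event, sourcetype, source=None):
    """Build the JSON body for one HEC event (hypothetical helper)."""
    payload = {"event": event, "sourcetype": sourcetype}
    if source is not None:
        # "source" is optional metadata; omit it to keep the token's default.
        payload["source"] = source
    return json.dumps(payload, sort_keys=True)


# Option A: one sourcetype per category, so each category can get its own rules.
# "myapp:metrics" is an assumed name, not one taken from the thread.
per_category = build_hec_event({"cpu_pct": 42}, sourcetype="myapp:metrics")

# Option B: a single sourcetype for everything, with the category carried
# as an ordinary key inside the event, so one set of rules covers all data.
single_type = build_hec_event(
    {"category": "metrics", "cpu_pct": 42}, sourcetype="myapp:events"
)

print(per_category)
print(single_type)
```

Either body would then be POSTed to the collector's `/services/collector/event` endpoint with an `Authorization: Splunk <token>` header. Which shape is preferable depends on whether the categories need different processing rules, as described above.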
@simpkins1958 would you mind sharing your httpevent stream code? We are trying to push the code via stream, and we are not able to set the sourcetype and source. They are taking default values such as http-stream-too_small or http-stream.
 
		
		
		
		
		
	
			
		
		
			
					
 
		
		
		
		
		
	
			
		
		
			
					
		I would qualify that by saying that sourcetype is an indexed field, so if you have a good number of different sourcetypes, using that field when searching will improve search performance compared to using an event-level key/value pair that is extracted at search time.
 
					
				
		
 
		
		
		
		
		
	
			
		
		
			
					
		It's a common misconception that indexed fields have notably different performance characteristics from text tokens. They don't. We look them up the same way. Indexed fields only behave notably differently when the field name and value together are drastically less common than the value alone.
However, the source, sourcetype, and host fields hold a fairly special place in Splunk: they offer much more powerful ways to apply implicit processing by data category, among other things. sourcetype is best thought of as "a type of data", such as the kind of data produced by a particular application or, for complex applications, one type of data stream it produces. It is something for which you can create a rich configuration that automatically extracts further data based on its format and structure.
 
		
		
		
		
		
	
			
		
		
			
					
		@ssievert that's a good point!
 
					
				
		
Thanks. We will be using sourcetypes for our categories.
 
					
				
		
Thanks Glenn.
