We have a very simple inputs.conf stanza setup to monitor a file system:
[monitor:|path|]
disabled = false
index = Index1
What I've noticed is that this has resulted in many different sourcetypes for our data. Based on my research, I believe I need to define the sourcetype in inputs.conf.
However, what's confusing me is that the "learned" sourcetypes are defined in props.conf and sourcetypes.conf on the Universal Forwarder rather than on the Indexers.
On the UF there is a sourcetypes.conf stanza:
[||filepath|]
L-//::.*t. = 0.322842
L-//::._t. = 0.101574
_source = |filepath|
_sourcetype = |typename-15|
And a props.conf stanza:
[typename-15]
CHARSET = UTF-8
MAX_TIMESTAMP_LOOKAHEAD = 42
is_valid = True
This leads me to two questions:
I don't know about the entries in sourcetypes.conf and props.conf on your universal forwarders; I don't know why they are there. But I can tell you that you do need sourcetypes, and that you are right about how/where to put them.
First, you are correct: you do not need to "create" a sourcetype. For example, in inputs.conf on your UF:
[monitor:///var/.../apache.log]
index = Index1
sourcetype = access_combined
[monitor:///var/.../app23.log]
index = Index2
sourcetype = cust_transaction
The first sourcetype, access_combined, is built into Splunk. But the sourcetype cust_transaction is one that I just invented by using it in inputs.conf. While I may want to define other characteristics of the cust_transaction sourcetype, Splunk does not require me to do anything more. Events will show up on my indexer and be searchable using "sourcetype=cust_transaction".
You are right about the parsing question as well. If I do want to specify how the cust_transaction data should be parsed (like MAX_TIMESTAMP_LOOKAHEAD), I need to put that on the indexer in props.conf.
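As a rough sketch of what that could look like, here is a props.conf stanza for the indexer. The timestamp format and the values shown are assumptions for illustration only (I am assuming each event begins with a timestamp like 2024-01-31 12:00:00); adjust them to match what your data actually looks like.

```
# props.conf on the indexer (illustrative values, not taken from your data)
[cust_transaction]
# Assumes the timestamp is the first thing in each event
TIME_PREFIX = ^
# Assumed format, e.g. 2024-01-31 12:00:00 (19 characters)
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only look this many characters past TIME_PREFIX for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 19
# Assumes one event per line
SHOULD_LINEMERGE = false
```

The stanza name matches the sourcetype set in inputs.conf on the UF, which is how the indexer knows which parsing rules to apply to those events.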
Finally, here is a very useful page in the documentation: List of pretrained sourcetypes. Always use pretrained sourcetypes when you can, as it means you don't have to define how the data is parsed, etc. Notice that some of these can be automatically detected by Splunk, but some cannot.
Okay, that all makes sense.
I think what's throwing me with the UF's props.conf is that some settings do appear to matter for the input stage of the pipeline:
http://wiki.splunk.com/Where_do_I_configure_my_Splunk_settings%3F
But MAX_TIMESTAMP_LOOKAHEAD didn't seem to make sense there, as it's more of a parsing setting. In the end I think it's a moot point if I'm going to leverage a deployment server and manage props.conf on the indexer for the parsing.
Thanks!