I'm having issues ingesting data correctly: custom sourcetypes defined in Splunk Cloud are completely ignored when set on our Heavy Forwarders. In the web interface of the Splunk Cloud cluster I've defined custom sourcetypes as directed in the docs:
Specify source type for an input
You can assign the source type for data coming from a specific input, such as /var/log/. If you have Splunk Enterprise, you do this in Splunk Web or by editing the inputs.conf configuration file. If you have Splunk Cloud, use Splunk Web to define source types.
And then on a Universal Forwarder I have a file monitor stanza (where the sourcetype name matches the one defined in Splunk Cloud):
[monitor://path\to\file.txt]
index = test_index
...
...
...
sourcetype = <custom sourcetype name>
After ingesting, I checked the received events and it's as if the sourcetype configuration (which I tested successfully with the "add data" wizard) is being totally ignored and Splunk is still trying to automatically identify event breaks and timestamps.
Am I supposed to define the sourcetype somewhere else? It's not particularly clear from the docs. Here is a summary of the data pipeline in place:
On prem windows UF > On prem Heavy Forwarder > Splunk Cloud
Any help would be appreciated!
A UF does not do any parsing; it simply forwards your data to the HF. In your data pipeline the HF parses the data and then forwards it to the indexer for indexing. Extractions tied to a sourcetype can be applied at index time or at search time, and in your pipeline I think you can add the sourcetype definition on the HF so that it is applied before the data is indexed on the indexer.
Thanks for your answer. It turns out the sourcetype stanzas also needed to be placed on the Heavy Forwarders. As soon as I did that and reindexed, the event boundaries started working properly.
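For anyone hitting the same problem, here is a minimal sketch of what a sourcetype stanza in props.conf on the Heavy Forwarder might look like. The stanza name and every value below are hypothetical placeholders, not the settings from this thread; the assumption is one event per line with a leading timestamp such as 2024-01-31 12:00:00. The stanza name has to match the sourcetype set in the UF's inputs.conf.

```ini
# props.conf on the Heavy Forwarder (hypothetical example values)
# The stanza name must match the "sourcetype = ..." value in inputs.conf on the UF.
[my_custom_sourcetype]
# Treat each line as its own event instead of letting Splunk merge lines.
SHOULD_LINEMERGE = false
# Break events on one or more newline characters.
LINE_BREAKER = ([\r\n]+)
# Assumed: the timestamp sits at the very start of each event.
TIME_PREFIX = ^
# Assumed timestamp layout, e.g. 2024-01-31 12:00:00
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only scan the first 19 characters after TIME_PREFIX for the timestamp.
MAX_TIMESTAMP_LOOKAHEAD = 19
```

Line breaking and timestamp recognition happen where the data is first parsed, which in a UF > HF > Splunk Cloud pipeline is the HF. That is why the same definition created only in Splunk Cloud had no effect on these events.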