
OpenTelemetry Configuration & Common Problems

CaitlinHalla
Splunk Employee

Intro

In our previous post, we walked through integrating our Kubernetes environment with Splunk Observability Cloud using the Splunk Distribution of the OpenTelemetry Collector for Kubernetes. In this post, we’ll look at the general Splunk Distribution of the OpenTelemetry Collector and dive into the configuration for a Collector deployed in host (agent) monitoring mode. We’ll walk through the different pieces of the config so you can easily customize and extend your own configuration. We’ll also talk about common configuration problems and how you can avoid them so that you can seamlessly get up and running with your own OpenTelemetry Collector.   

Walkthrough

After you’ve installed the OpenTelemetry Collector for Linux or Windows, you can find its configuration files under the /etc/otel/collector directory on Linux or the \ProgramData\Splunk\OpenTelemetry Collector directory on Windows. You’ll notice several Collector configuration files live under this directory – a gateway_config used to configure Collectors deployed in data forwarding (gateway) mode, an otlp_config_linux configuration file for exporting OpenTelemetry traces to Splunk, configuration files designed for use with AWS ECS tasks, and so on. Because we’re looking at configuring our application’s instrumentation and collecting host and application metrics, we’ll focus on the agent_config.yaml Collector configuration file. When you open up this config, you’ll notice it’s composed of the following blocks:

Extensions

In the extensions block of the Collector config, you’ll find components that extend Collector capabilities. This section defines things like health monitoring, service discovery, data forwarding – anything not directly involved with processing telemetry data. 

[Screenshot: the extensions block of agent_config.yaml]

The Splunk Distribution of the OpenTelemetry Collector defines a few default extensions: 

  • health_check: exposes an HTTP endpoint that can be hit to check the Collector’s availability and uptime
  • http_forwarder: accepts HTTP requests, optionally adds headers, and forwards them on
  • smartagent: provides settings for the bundled SignalFx Smart Agent, whose monitors collect metrics from the host OS
  • zpages: serves zPages, HTTP endpoints for live debugging of different Collector components
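To make that concrete, here’s a trimmed-down sketch of what an extensions block can look like. The endpoints and environment variables are illustrative – check your own agent_config.yaml for the exact values your installer generated:

    extensions:
      # Health check endpoint used to verify the Collector is up
      health_check:
        endpoint: 0.0.0.0:13133
      # Accepts HTTP requests and forwards them on, optionally adding headers
      http_forwarder:
        ingress:
          endpoint: 0.0.0.0:6060
        egress:
          endpoint: "${SPLUNK_API_URL}"
      # Points the bundled SignalFx Smart Agent at its bundle and collectd config
      smartagent:
        bundleDir: "${SPLUNK_BUNDLE_DIR}"
        collectd:
          configDir: "${SPLUNK_COLLECTD_DIR}"
      # zPages for live debugging of Collector components
      zpages: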

Receivers 

Receivers are responsible for getting telemetry data into the Collector. This section of the configuration file is where data sources are configured. 

[Screenshot: the receivers block of agent_config.yaml]

In this example config file, we have several default receivers configured: 

  • fluentforward: receives log data through the Fluentd Forward protocol
  • hostmetrics: collects metrics about the host system itself
  • jaeger: receives traces in Jaeger format
  • otlp: receives metrics, logs, and traces through gRPC or HTTP in OTLP format
  • prometheus/internal: scrapes metrics from the Collector itself – hence the /internal suffix
  • smartagent: legacy SignalFx Smart Agent monitors used to send metric data (soon to be deprecated in favor of native OpenTelemetry receivers)
  • signalfx: receives metrics and logs in SignalFx protobuf format
  • zipkin: receives traces in Zipkin format
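For a feel of how these are declared, here’s a heavily trimmed receivers block. The ports, scrapers, and scrape intervals shown are illustrative rather than a copy of the shipped defaults:

    receivers:
      # Log data over the Fluentd Forward protocol
      fluentforward:
        endpoint: 127.0.0.1:8006
      # Metrics about the host itself
      hostmetrics:
        collection_interval: 10s
        scrapers:
          cpu:
          memory:
          filesystem:
      # Traces in Jaeger and OTLP formats
      jaeger:
        protocols:
          grpc:
            endpoint: 0.0.0.0:14250
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      # The Collector scraping its own metrics – hence /internal
      prometheus/internal:
        config:
          scrape_configs:
            - job_name: otel-collector
              scrape_interval: 10s
              static_configs:
                - targets: ["localhost:8888"]
      # Metrics in SignalFx format and traces in Zipkin format
      signalfx:
        endpoint: 0.0.0.0:9943
      zipkin:
        endpoint: 0.0.0.0:9411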

Processors 

Processors receive telemetry data from the receivers and transform the data based on rules or settings. For example, a processor might filter, drop, rename, or recalculate telemetry data. 

  • batch: tells the Collector to batch data before sending it to the exporters (no parameters are required – defaults are used). This way, not every single piece of telemetry data is sent off to the exporter as it’s processed. 
  • memory_limiter: limits the memory the Collector can use to ensure consistent performance
  • resourcedetection: detects system metadata from the host – region, OS, cloud provider, etc. 
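A pared-down processors block might look something like this – the memory limit and detector list here are illustrative, not the shipped defaults:

    processors:
      # Batch telemetry before handing it to the exporters
      batch:
      # Cap Collector memory usage for predictable performance
      memory_limiter:
        check_interval: 2s
        limit_mib: 512
      # Attach host metadata (cloud provider, region, OS, ...) to telemetry
      resourcedetection:
        detectors: [system, env, gcp, ec2, ecs, azure]
        override: true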

Exporters 

This is the configuration section that defines what backends or destinations telemetry data will be sent off to. 

  • sapm (Splunk APM exporter): exports traces in a single batch to optimize network performance
  • signalfx: sends metrics, events, and traces to Splunk Observability Cloud 
  • splunk_hec: sends logs and metrics to a Splunk HEC endpoint, for example on Splunk Enterprise or Splunk Cloud Platform
  • splunk_hec/profiling: sends telemetry data for AlwaysOn Profiling to a Splunk HEC endpoint
  • otlp: sends data through gRPC using OTLP format, especially useful when deploying in forwarding (gateway) mode
  • debug: writes telemetry data to the console for debugging
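Here’s a thinned-out sketch of what those exporter definitions can look like. The ${SPLUNK_*} environment variables are the ones the Splunk installer typically sets; treat the exact endpoints as illustrative:

    exporters:
      # Traces to Splunk APM
      sapm:
        access_token: "${SPLUNK_ACCESS_TOKEN}"
        endpoint: "https://ingest.${SPLUNK_REALM}.signalfx.com/v2/trace"
      # Metrics and events to Splunk Observability Cloud
      signalfx:
        access_token: "${SPLUNK_ACCESS_TOKEN}"
        api_url: "${SPLUNK_API_URL}"
        ingest_url: "${SPLUNK_INGEST_URL}"
      # Logs (and optionally metrics) to a Splunk HEC endpoint
      splunk_hec:
        token: "${SPLUNK_HEC_TOKEN}"
        endpoint: "${SPLUNK_HEC_URL}"
      # AlwaysOn Profiling data to HEC
      splunk_hec/profiling:
        token: "${SPLUNK_ACCESS_TOKEN}"
        endpoint: "${SPLUNK_INGEST_URL}/v1/log"
      # OTLP over gRPC, e.g. when forwarding to a gateway Collector
      otlp:
        endpoint: "${SPLUNK_GATEWAY_URL}:4317"
        tls:
          insecure: true
      # Print telemetry to the console while troubleshooting
      debug:
        verbosity: detailed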

Service

The service block is where the previously configured components (extensions, receivers, processors, exporters) are enabled and wired together into pipelines.

  • telemetry.metrics: emits metrics about the Collector itself
  • extensions: where all previously defined extensions are enabled for use
  • pipelines: defined for each type of data – metrics, logs, traces – each one connecting receivers, processors, and exporters
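Putting it all together, a simplified service block that wires up the components sketched above might look like this (the exact component lists in your agent_config.yaml will differ):

    service:
      extensions: [health_check, http_forwarder, smartagent, zpages]
      pipelines:
        traces:
          receivers: [jaeger, otlp, zipkin]
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [sapm, signalfx]
        metrics:
          receivers: [hostmetrics, otlp, signalfx]
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [signalfx]
        # The Collector's own metrics, scraped by prometheus/internal
        metrics/internal:
          receivers: [prometheus/internal]
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [signalfx]
        logs:
          receivers: [fluentforward, otlp]
          processors: [memory_limiter, batch, resourcedetection]
          exporters: [splunk_hec]
      telemetry:
        metrics:
          address: "localhost:8888"

A component that’s defined higher up in the file but never referenced here simply never runs – which is exactly the kind of problem covered in the next section.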

Problems

There are a few problems you might run into when configuring your OTel Collector. Common issues are caused by: 

  • Incorrect indentation
  • Receivers configured but not enabled in a pipeline 
  • Receivers enabled in a pipeline but not configured 
  • Receivers, exporters, or processors used in a pipeline that don’t support that pipeline’s data type (metrics, logs, traces)

Indentation is a very common problem. Collector configs are written in YAML, which is indentation-sensitive, so running the file through a YAML linter can help you verify that the indentation is correct. The good news is that the Collector fails fast – if the indentation is wrong, the Collector won’t start, so you can identify and fix the problem right away.
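For example, in the hypothetical snippet below, endpoint has slipped out from under grpc by two spaces, so the Collector will typically refuse to start and complain about an unexpected key instead of quietly misbehaving:

    receivers:
      otlp:
        protocols:
          grpc:
          endpoint: 0.0.0.0:4317    # wrong: now a sibling of grpc instead of its child

    # Correctly indented version
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317  # right: endpoint belongs under grpc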

If you’ve set up your Collector but data isn’t appearing in the backend, there’s a good chance a receiver has been configured but never enabled in a pipeline. After each pipeline component is configured, it must also be enabled in a pipeline under the service block of the config.
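In a hypothetical config like the one below, the zipkin receiver is fully configured, but the traces pipeline never references it, so the Collector never even starts that receiver and no Zipkin data shows up in the backend:

    receivers:
      zipkin:
        endpoint: 0.0.0.0:9411

    service:
      pipelines:
        traces:
          # zipkin is configured above but missing here, so its data never flows
          receivers: [jaeger, otlp]
          processors: [memory_limiter, batch]
          exporters: [sapm]

Adding zipkin to the pipeline’s receivers list is all that’s needed to fix it.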

If a receiver, exporter, or processor in a pipeline doesn’t support that pipeline’s data type, you’ll encounter an ErrDataTypeIsNotSupported error. Check the supported pipeline types of the different Collector components to ensure the data types in your config are supported.
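For instance, the fluentforward receiver only produces logs, so referencing it from a metrics pipeline, as in this hypothetical snippet, will fail with that error:

    service:
      pipelines:
        metrics:
          receivers: [hostmetrics, fluentforward]   # fluentforward doesn't support metrics
          processors: [memory_limiter, batch]
          exporters: [signalfx]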

You can always ensure your Collector is up and running with the health check extension, which is on by default with the Splunk Distribution of the OpenTelemetry Collector. From your Linux host, open http://localhost:13133. If your Collector service is up and running, you’ll see a status of Server available.

You can also monitor all of your Collectors with Splunk Observability Cloud’s built-in dashboard. Data for this dashboard is configured in the metrics/internal section of the configuration file under the Prometheus receiver. 

Wrap up 

To help you through the configuration of your own OTel Collector, we walked through the config file for the Splunk Distribution of the OpenTelemetry Collector and called out potential problems you might run into with the config. If you don’t already have an OpenTelemetry Collector installed and configured, start your Splunk Observability Cloud 14-day free trial and get started with the Splunk Distribution of the OpenTelemetry Collector.

