What’s this OTCA exam?
The Linux Foundation offers the OpenTelemetry Certified Associate (OTCA) credential to confirm expertise in OpenTelemetry and observability practices. The exam might be of interest to you if you know and love observability best practices and are one of the following:
- Software developer
- DevOps and SRE practitioner
- Cloud architect
- Technical professional with a passion for OpenTelemetry
With the OTCA certification, your toolbox will be overflowing with an understanding of:
- Core observability concepts and data types
- OpenTelemetry API and SDK implementations
- Deploying and managing OpenTelemetry Collector pipelines
- Debugging and maintaining observability setups
Sound like something you're into? Let’s break down the certification structure and then get into how to prepare.
OTCA Domains & Competencies
According to The Linux Foundation’s OTCA certification page, an OpenTelemetry Certified Associate understands the basic concepts of observability, and how the OpenTelemetry project enables these concepts. The exam focuses on the major components of OpenTelemetry, how they are used, and the best practices for instrumenting cloud-native applications for observability with OpenTelemetry.
The exam is broken down into four main sections:
- Fundamentals of Observability (18%):
  - Telemetry data
  - Semantic conventions
  - Instrumentation
  - Analysis and outcomes
- The OpenTelemetry API and SDK (46%):
  - Data model
  - Composability and extension
  - Configuration
  - Signals (metrics, logs, traces)
  - SDK pipelines
  - Context propagation
  - Agents
- The OpenTelemetry Collector (26%):
  - Configuration
  - Deployment
  - Scaling
  - Pipelines
  - Transforming data
- Maintaining and Debugging Observability Pipelines (10%):
  - Context propagation
  - Debugging pipelines
  - Error handling
  - Schema management
The details
To pass the test, you’ll need a minimum score of 75%. There are 60 multiple choice questions, and you’ll have 90 minutes to complete the test. The exam is online and proctored, and you’ll need a clean, quiet workspace to be able to take the test. You’ll chat with your proctor to make sure your testing environment meets all the stated requirements.
You’ll be allowed two testing attempts, and both attempts are included in the $250 test cost.
Still sounds good? Let’s jump into what and how to study.
Study guide
Based on my research, my teammate Moss’ study guide, and a splash of AI, here are some key components and areas to focus on when studying for the exam. This is by no means comprehensive, and I strongly encourage you to use it as a jumping-off point in your own studying journey.
Domain 1: Fundamentals of Observability (18%)
- Core Signals: Traces, Metrics, and Logs ("Events" are often part of logs/spans but are not a standalone core signal in the same way).
- The "Why": OpenTelemetry is vendor-agnostic, allowing users to switch backends without re-instrumenting.
- Semantic Conventions: A shared naming scheme (e.g., service.name, http.response.status_code, messaging.system); see the short sketch after this list.
- Schema URL: Included in telemetry to declare which version of semantic conventions is being followed.
- If schemas don’t match:
| Situation | Recommended Tool | Action |
| --- | --- | --- |
| OTel Version Mismatch | schema processor | Set a target version; the processor handles the rest. |
| Custom/Legacy Mapping | transform processor | Write OTTL statements to rename or move attributes. |
| Simple Renaming | attributes processor | Use the upsert or insert actions for basic key changes. |
| Data Cleaning | filter processor | Drop attributes or spans that don't fit the destination schema. |
- Observability Backends: Tools like Jaeger (traces), Zipkin (traces), Prometheus (metrics), Grafana Loki (logs) that receive and store OTel data.
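Here’s a minimal Python sketch of what semantic conventions look like in practice (the span name and attribute values are made up, and it assumes an SDK is configured elsewhere; the attribute keys follow the HTTP semantic conventions):

```python
from opentelemetry import trace

tracer = trace.get_tracer("payments")

# Using shared semantic-convention keys (rather than ad-hoc names like "status")
# means any backend can interpret the data the same way.
with tracer.start_as_current_span("GET /checkout") as span:
    span.set_attribute("http.request.method", "GET")
    span.set_attribute("http.response.status_code", 200)
    span.set_attribute("server.address", "shop.example.com")
```

The opentelemetry-semantic-conventions package also ships these keys as constants if you’d rather not hard-code strings.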
Domain 2: The OpenTelemetry API and SDK (46%)
API vs. SDK
- API: Defines how telemetry is generated. Use only the opentelemetry-api package when writing libraries, so you don't force an SDK dependency on downstream users.
- SDK: The language-specific implementation of the API. It handles the processing and exporting. The SDK is separate from the API and can be swapped at runtime without changing instrumentation code.
- In existing legacy code bases, you can use the API and the SDK together to manage and export the data.
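Here’s a minimal Python sketch of that split (the function and names are hypothetical): the library instruments itself against the API only, and the application wires up the SDK once at startup.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter


# --- Library code: depends only on the opentelemetry-api package ---
def charge_card(amount: float) -> None:
    tracer = trace.get_tracer("payments.library")
    with tracer.start_as_current_span("charge_card") as span:
        span.set_attribute("payment.amount", amount)
        # ... business logic ...


# --- Application code: wires up the SDK once at startup ---
# Swapping the exporter or processor here never touches the library above.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

charge_card(42.00)
provider.shutdown()  # flush the batch processor before exit
```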
Traces and Spans
- Span: Represents a single operation. A SpanID identifies a unique span; a TraceID links spans across services. TraceIDs must be passed to services to create traces that can be viewed in observability tooling (see Context Propagation, below).
- Span Hierarchy: Spans form a parent-child hierarchy. A Child Span is created when a service receives a request from an upstream traced service.
- Span Kinds: Internal, Server, Client, Producer, and Consumer. For messaging:
  - Producer: Sending a message to a broker.
  - Consumer: Processing a message from a broker.
- Span Links: Used to associate spans that are causally related but not in a direct parent-child relationship (e.g., fan-out/batch processing).
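A quick, hypothetical Python sketch of those ideas (assumes an SDK is configured): a parent-child pair, producer/consumer span kinds, and a span link tying a batch-processing span back to the producer.

```python
from opentelemetry import trace
from opentelemetry.trace import Link, SpanKind

tracer = trace.get_tracer("orders.worker")

# Parent-child hierarchy within one service.
with tracer.start_as_current_span("handle-order"):
    with tracer.start_as_current_span("validate-order"):
        pass

# Producer span around publishing a message to a broker.
with tracer.start_as_current_span("publish", kind=SpanKind.PRODUCER) as producer:
    producer_context = producer.get_span_context()

# Consumer span for batch processing: causally related to the producer,
# but not its direct child, so we attach a span link instead.
with tracer.start_as_current_span(
    "process-batch",
    kind=SpanKind.CONSUMER,
    links=[Link(producer_context)],
):
    pass
```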
Metrics Instruments
- Counter: Monotonic (only goes up).
- UpDownCounter: Can increase or decrease (e.g., queue length).
- Gauge: Captures a non-additive value at a point in time (e.g., CPU usage).
- Histogram: Records a distribution of values into "buckets" (e.g., request latency).
- Meter: The API component responsible for creating these instruments. The MeterProvider is the factory for Meters. The Meter is the factory for Instruments (Counters, Histograms, etc.).
- Views: Used in the SDK to rename, aggregate, or filter attributes of metrics before they are exported.
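Here’s a short Python sketch of the MeterProvider → Meter → Instrument chain, plus a View; the instrument and attribute names are hypothetical.

```python
from opentelemetry import metrics
from opentelemetry.metrics import CallbackOptions, Observation
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import View

# View: keep only the http.route attribute on the latency histogram.
latency_view = View(instrument_name="request.latency", attribute_keys={"http.route"})

reader = PeriodicExportingMetricReader(ConsoleMetricExporter())
metrics.set_meter_provider(MeterProvider(metric_readers=[reader], views=[latency_view]))

# MeterProvider is the factory for Meters; the Meter is the factory for instruments.
meter = metrics.get_meter("checkout")

orders = meter.create_counter("orders.placed")                   # monotonic: only goes up
queue_len = meter.create_up_down_counter("queue.length")         # can increase or decrease
latency = meter.create_histogram("request.latency", unit="ms")   # distribution of values


def read_cpu(options: CallbackOptions):
    yield Observation(0.42)  # point-in-time, non-additive value


cpu = meter.create_observable_gauge("cpu.utilization", callbacks=[read_cpu])

orders.add(1, {"payment.method": "card"})
queue_len.add(5)
queue_len.add(-2)
latency.record(123.4, {"http.route": "/checkout", "http.request.method": "GET"})
```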
Context Propagation & Baggage
- Context Propagation: Passing trace/context info across service boundaries (generic for all signals).
- Default propagators:
  - W3C Trace Context: The default standard; uses the traceparent header.
- Baggage: A key-value store that travels with the context across services. Unlike span attributes, baggage is propagated over the wire.
- Propagators: Configured via OTEL_PROPAGATORS (e.g., tracecontext, baggage).
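A small Python sketch of context propagation and baggage between two hypothetical services, using a plain dict as the header carrier (assumes the default tracecontext + baggage propagators):

```python
from opentelemetry import baggage, context, trace
from opentelemetry.propagate import extract, inject
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer("demo")

# --- Service A: outgoing request ---
headers = {}
token = context.attach(baggage.set_baggage("user.tier", "premium"))
with tracer.start_as_current_span("call-backend"):
    inject(headers)  # writes traceparent + baggage headers into the carrier
context.detach(token)

# --- Service B: incoming request ---
incoming_ctx = extract(headers)  # rebuild the remote context from the headers
with tracer.start_as_current_span("handle-request", context=incoming_ctx) as span:
    # The new span becomes a child of the remote parent, and baggage is readable.
    tier = baggage.get_baggage("user.tier", context=incoming_ctx)
    span.set_attribute("user.tier", str(tier))
```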
Resources & Instrumentation
- Resources: Attributes describing the entity producing telemetry (e.g., KubernetesResourceDetector for pod info adds resource attributes like container.id or k8s.pod.uid). If two detectors attempt to set the same attribute, the last detector wins.
- Zero-Code Instrumentation: Collecting data without modifying application code (e.g., Java Agent or OTel Operator).
- Code-Based: Manual instrumentation using the API.
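A minimal Python sketch of attaching a Resource to the SDK (the attribute values here are hypothetical; resource detectors would add things like k8s.pod.uid automatically):

```python
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

# Resource attributes describe the entity producing the telemetry.
# Resource.create() also merges in environment-based values such as
# OTEL_RESOURCE_ATTRIBUTES and OTEL_SERVICE_NAME.
resource = Resource.create(
    {
        "service.name": "checkout",
        "service.version": "1.4.2",
        "deployment.environment": "production",
    }
)

trace.set_tracer_provider(TracerProvider(resource=resource))
```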
Domain 3: The OpenTelemetry Collector (26%)
The Pipeline
- Components: A pipeline consists of Receivers, Processors, and Exporters.
- Connectors: A unique component that acts as an Exporter in one pipeline and a Receiver in another. Connectors are used to "bridge" different types of telemetry data or to route data based on specific logic. They are the primary way to turn one signal (like a Trace) into another (like a Metric) within the Collector.
- Extensions: Components that live outside the data pipeline (e.g., health checks, zpages).
Key Processors
- Batch Processor: Buffers data to reduce overhead. send_batch_size triggers a flush.
- Memory Limiter: Prevents OOM crashes by refusing incoming data when memory limits are hit. Look for the otelcol_processor_refused_spans metric.
- Filter Processor: Used to drop specific data (e.g., only keep HTTP 500 errors).
- Transform Processor: Used to sanitize or normalize attribute values.
- K8s Attributes: Automatically adds pod/container metadata to all signals.
Exporters & Protocols
- OTLP (OpenTelemetry Protocol): the “native language” of OTel. It is a general-purpose telemetry delivery protocol that standardizes how traces, metrics, and logs are encoded and transported between your applications, the OTel Collector, and observability backends (see the sketch after this list).
  - Encoding formats: Binary Protobuf, JSON
  - Supports gRPC (Port 4317) and HTTP (Port 4318)
  - Servers MUST support Gzip compression
  - Exporters MUST use exponential back-off with jitter for retries
  - Used for: standardization, vendor neutrality, efficiency (it uses Protocol Buffers for binary serialization), reliability (built-in features for backpressure signaling and retry logic)
- Prometheus Remote Write: Used to push metrics directly to a Prometheus-compatible backend.
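As a concrete example, here’s a small Python sketch that points the SDK at a Collector’s OTLP/gRPC receiver on the default port 4317 (the endpoint and service names are assumptions):

```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# OTLP over gRPC talks to the Collector's default gRPC port, 4317.
# (The otlp-proto-http exporter targets port 4318 with a /v1/traces path.)
exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout")
with tracer.start_as_current_span("place-order"):
    pass

provider.shutdown()  # flush pending batches before the process exits
```

The same endpoint can also be set without code changes via OTEL_EXPORTER_OTLP_ENDPOINT (see the environment variable table in Domain 4).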
Deployment & Scaling
- Agent: Co-located with the app (Sidecar/DaemonSet). Improves gRPC load balancing.
- Gateway: Centralized instances for a cluster/region.
- Kubernetes: the Collector's default mode is deployment; the available modes are:
| Mode | Kubernetes Resource | Best For... |
| --- | --- | --- |
| deployment | apps/v1.Deployment | Centralized Gateways & stateless processing. |
| daemonset | apps/v1.DaemonSet | Collecting host metrics and logs from all nodes. |
| statefulset | apps/v1.StatefulSet | Tail-sampling and predictable scaling. |
| sidecar | Injected into Pod | Fargate, serverless, or per-app isolation. |
- Kubernetes-specific:
  - resourcedetection Processor: used to detect cluster-level or host-level information (k8s_node detector)
  - Kubeletstats Receiver: pulls node, pod, and container-level metrics (CPU, memory, network) directly from the Kubelet API on each node
  - filelog Receiver: Collects logs from the standard /var/log/pods directory on the host
- Scaling:
  - Routing key: you configure the exporter to hash based on a specific key
    - For tail sampling: you set routing_key: traceID
    - For span metrics, you might set routing_key: service
  - Scale up when the exporter queue reaches 60-70% capacity.
  - Do not scale if the backend is the bottleneck (indicated by otelcol_exporter_send_failed_spans).
  - Target Allocator: a specific component used when you want to scale Prometheus-style scraping. Automatically distributes scrape targets across multiple Collector replicas to prevent duplicate data.
  - Loadbalancing Exporter: Uses DNS to distribute OTLP/gRPC traffic across backend collectors.
  - L7 Load Balancer: Required for distributing gRPC traffic properly.
- The loadbalancing exporter is the “secret sauce” for scaling stateful collectors. It doesn’t use simple round-robin; instead, it uses consistent hashing.
| Component | Role in Scaling |
| --- | --- |
| loadbalancing exporter | Uses consistent hashing to keep related spans together. |
| routing_key: traceID | The setting required to make Tail Sampling work at scale. |
| StatefulSet | The K8s workload used for the processing layer to provide stable identities. |
| Headless Service | Allows the load-balancing exporter to discover individual pod IPs via DNS. |
| Target Allocator | Distributes Prometheus scrape targets across multiple collectors. |
| Two-Layer Design | Layer 1 (Load Balancer) → Layer 2 (Stateful Processor). |
Domain 4: Sampling and Maintenance (10%)
Sampling Types
- Head Sampling: Decision (to sample or to drop) made at the start.
  - AlwaysOn: Keeps 100%.
  - TraceIDRatioBased: Keeps a specific percentage (e.g., 10%).
  - ParentBased: Respects the decision of the upstream parent.
- Tail Sampling: Decision made after the trace is complete. This is stateful and requires careful scaling.
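Here’s a small Python sketch of configuring head sampling in the SDK (the 10% ratio is just an example):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Head sampling: sample ~10% of new root traces, but always respect the
# sampling decision of an upstream parent when one exists.
sampler = ParentBased(root=TraceIdRatioBased(0.1))
trace.set_tracer_provider(TracerProvider(sampler=sampler))

# The same behavior can be selected without recompiling via:
#   OTEL_TRACES_SAMPLER=parentbased_traceidratio
#   OTEL_TRACES_SAMPLER_ARG=0.1
```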
Critical Environment Variables
| Variable | Purpose |
| --- | --- |
| OTEL_SERVICE_NAME | Sets the logical name of the service. |
| OTEL_EXPORTER_OTLP_ENDPOINT | Points the SDK/Agent to the Collector or SaaS backend. |
| OTEL_TRACES_SAMPLER | Changes the active sampler without recompiling. |
| OTEL_EXPORTER_OTLP_HEADERS | Attaches custom HTTP headers to exports. |
| OTEL_JAVAAGENT_ENABLED=false | Disables the Java agent at runtime. |
| OTEL_NODE_RESOURCE_DETECTORS | Used to disable specific detectors in Node.js. |
Debugging & Testing
- Self-Telemetry: Collector metrics are typically exposed (when enabled) on Port 8888.
- If the Collector is dropping data...
  - Monitor internal metrics at Port 8888
  - Enable debug logging
  - Enable the zpages extension to get a live, web-based view of trace and metric buffers
- The Collector’s default behavior depends on where the failure occurred:
| Component | Scenario | Default Action |
| --- | --- | --- |
| Memory Limiter | RAM usage hits the "Hard Limit" | Drop immediately. It refuses to accept any more data from receivers until memory is freed. |
| Exporter Queue | Queue is full (queue_size reached) | Drop immediately. The setting block_on_overflow defaults to false, so new data is rejected rather than making the app wait. |
| Exporter | Transient Error (e.g., 503, 429) | Retry. It uses exponential backoff (starting at 5s, up to 300s total) before finally dropping the data. |
| Exporter | Permanent Error (e.g., 400, 401) | Drop immediately. The Collector assumes the data is invalid and will never be accepted. |
- Unit Testing: Use the InMemorySpanExporter (in Python/other SDKs) to assert that spans are created correctly without a backend.
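For instance, a minimal Python test sketch using InMemorySpanExporter (the function under test is hypothetical):

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

# Capture spans in memory instead of shipping them to a backend.
exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
tracer = provider.get_tracer("tests")


def do_work() -> None:
    with tracer.start_as_current_span("do-work") as span:
        span.set_attribute("work.items", 3)


def test_do_work_creates_span() -> None:
    exporter.clear()
    do_work()
    spans = exporter.get_finished_spans()
    assert len(spans) == 1
    assert spans[0].name == "do-work"
    assert spans[0].attributes["work.items"] == 3
```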
Read the Docs
At the end of the day, the only real way to study for the OTCA exam is to read the docs (that or use OpenTelemetry in a production environment every day and gain experience with every type of OTel scenario).
Hopefully, the above guide helps distill some of the details provided in the docs, but, again, it’s no substitute for digging into the docs yourself.
Need some docs links to get started? Here are some pages to focus on:
Best of luck on your exam!
Additional Resources
Never miss a new post. Check out this short guide on how to subscribe to the blog and get updates.