Manual instrumentation is an invaluable observability tool. In this series, we dig into manual instrumentation to see how we can use it to easily build out observability from scratch. In our first post, Manual Instrumentation with Splunk Observability Cloud: The What and Why, we learned about what manual instrumentation is and why you need it. In this post, we'll look at a real, un-instrumented full-stack application and go step-by-step through the process of manually instrumenting its backend, commit by commit.
Preface: SDK-based Automatic Instrumentation
Before we jump in, we want to call out that while we will be using manual instrumentation to create custom telemetry data, we’ll also be using OpenTelemetry’s SDK-based automatic instrumentation to hook into our web framework, HTTP requests, and database so we don’t have to manually create spans for framework operations.
A true zero-code example of automatic instrumentation would require no code changes, but since zero-code auto-instrumentation doesn’t exist (yet) for our application’s Elixir backend, we’ll use SDK-based automatic instrumentation for framework operations and full manual instrumentation for our business logic instrumentation.
Example Application Overview
GitHub repository: https://github.com/splunk/evangelism-public/tree/main/manual-instrumentation
Our un-instrumented example application is called Worms in Space. It consists of a React frontend with an Elixir backend and provides a full GraphQL API. Worms in Space allows customers (who happen to be worms) to schedule spacewalks in… space.
Space worms can use either the UI or the GraphQL API to schedule their spacewalks.
Here’s what the frontend UI looks like:
And here you can interact with the GraphQL API via GraphiQL:
If you'd like to follow along on GitHub commit-by-commit, check out the GitHub repository; otherwise, you can read on to follow each commit here.
Backend Instrumentation Implementation
Again, our example application is an Elixir backend with a GraphQL API for scheduling spacewalks. Even if you aren't using Elixir or GraphQL, you can still use the following general steps as a guide to instrument your own backend application:
1. Add the OpenTelemetry dependencies
2. Configure the OpenTelemetry SDK and OTLP exporter
3. Initialize the instrumentation components at application startup
4. Add custom spans for your business logic
Let's walk through each of these steps commit-by-commit to instrument our Worms in Space backend.
Step 1: Add OpenTelemetry Dependencies
Git commit: 5f926245
The first thing we need to do within our backend application is add the OpenTelemetry dependencies to our dependency file, in our case our mix.exs file:
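For reference, here's a rough sketch of what those entries could look like in mix.exs (the version constraints below are assumptions, not necessarily the exact ones from the commit):

```elixir
# mix.exs
defp deps do
  [
    # ...existing application dependencies...

    # Core OpenTelemetry SDK, API, and OTLP exporter
    {:opentelemetry, "~> 1.4"},
    {:opentelemetry_api, "~> 1.3"},
    {:opentelemetry_exporter, "~> 1.7"},

    # SDK-based automatic instrumentation for Phoenix, Cowboy, Ecto, and Absinthe
    {:opentelemetry_phoenix, "~> 1.2"},
    {:opentelemetry_cowboy, "~> 0.3"},
    {:opentelemetry_ecto, "~> 1.2"},
    {:opentelemetry_absinthe, "~> 2.0"}
  ]
end
```

A quick mix deps.get afterward pulls everything in.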
The list of dependencies can be found in the OpenTelemetry Language APIs & SDKs doc under the Elixir/Erlang dependencies section. Since we're using the core OpenTelemetry SDK, we need to include OpenTelemetry, the OpenTelemetry API, and the OpenTelemetry exporter. Our Elixir application uses Phoenix for HTTP requests, so to auto-instrument our controllers, plugs, and requests, we'll include the OpenTelemetry dependency for Phoenix. Phoenix uses Cowboy as its default HTTP server, so we'll also include the OpenTelemetry Cowboy dependency so we can trace low-level HTTP requests. Our application uses Ecto for database interactions, so we'll include the OpenTelemetry Ecto dependency, and because GraphQL operations in Elixir are resolved via Absinthe, we'll need the OpenTelemetry Absinthe dependency as well. Finally, the OTLP exporter is required for sending telemetry data to Splunk Observability Cloud.
Step 2: Configure OpenTelemetry SDK and OTLP Exporter
Git commit: e45ee707
Next, we configure the OpenTelemetry SDK in our config/runtime.exs file:
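As a rough sketch, assuming the default batch processor and Splunk's direct ingest endpoint, the relevant portion might look like this (the realm, service name, and token handling are placeholders to adapt to your own environment):

```elixir
# config/runtime.exs
import Config

config :opentelemetry,
  # Identify this service in Splunk Observability Cloud
  resource: [service: [name: "worms-in-space-backend"]],
  # Batch spans for efficient export instead of sending them one at a time
  span_processor: :batch,
  traces_exporter: :otlp

config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  # Direct trace ingest endpoint for your Splunk realm (us0 here is a placeholder)
  otlp_traces_endpoint: "https://ingest.us0.signalfx.com/v2/trace/otlp",
  # Authenticate with a Splunk Observability Cloud access token
  otlp_headers: [{"x-sf-token", System.fetch_env!("SPLUNK_ACCESS_TOKEN")}]
```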
This configuration establishes our service identity, sets up the batch processor for efficient trace export, and configures the OTLP exporter to send traces directly to Splunk Observability Cloud.
Note: in this configuration, we are sending telemetry data directly to our Splunk Observability Cloud backend. In most cases, you would want to set up an OpenTelemetry Collector to collect, process, and export telemetry data to your observability backend. In a subsequent post, we'll look at how we can update our current configuration to use the OpenTelemetry Collector for increased resilience, flexibility, and processing capabilities.
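For a sense of scale, that eventual change is small: with a Collector running alongside the application, the exporter would typically just point at the Collector's local OTLP endpoint instead of Splunk's ingest URL, roughly like this:

```elixir
# config/runtime.exs, hypothetical Collector-based variant (covered in a later post)
config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  # Default OTLP/HTTP port for a locally running OpenTelemetry Collector
  otlp_endpoint: "http://localhost:4318"
```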
Step 3: Initialize Instrumentation Components in Application
Git commit: f7113fec
We next initialize the automatic instrumentations in our application.ex startup code:
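A sketch of what that startup code might look like in the application's start/2 callback (the OTP app name, Repo, and Endpoint modules below are assumptions based on the project name):

```elixir
# lib/worms_in_space/application.ex
defmodule WormsInSpace.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Wire up the SDK-based automatic instrumentation before the supervision tree starts
    :opentelemetry_cowboy.setup()
    OpentelemetryPhoenix.setup(adapter: :cowboy2)
    OpentelemetryEcto.setup([:worms_in_space, :repo])
    OpentelemetryAbsinthe.setup()

    children = [
      WormsInSpace.Repo,
      WormsInSpaceWeb.Endpoint
      # ...other children...
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: WormsInSpace.Supervisor)
  end
end
```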
Notes on this setup:
This setup initializes the instrumentation libraries we added in Step 1 and provides immediate visibility into HTTP requests, database queries, and GraphQL operations without any additional code changes. Every HTTP endpoint, database query, and resolver will automatically generate traces.
Step 4: Add Custom Spans for Business Logic
Git commit: 9278ef62
While SDK-based automatic instrumentation covers the framework layers, we need to add custom spans to capture our application's unique business logic. We do this by setting attributes on spans within our create function heads in our resolvers.ex file, so that every time a spacewalk is created, custom telemetry data with the attributes we specify is exported to Splunk Observability Cloud.
Note: the resolver method is abbreviated here for brevity; the full method and file can be viewed on GitHub in the manual-instrumentation/worms_in_space/lib/worms_in_space_web/api/resolvers/resolvers.ex file.
These custom spans provide deep visibility into our spacewalk scheduling business logic, including success/failure rates, scheduling types, and performance metrics for critical operations. Some of the most common attributes to include in manual instrumentation are things like customer level or user ID, and they can be tailored to your unique business use cases.
Environment Configuration
Finally, we set our environment variables for deployment flexibility:
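The exact variables depend on your deployment tooling, but with the configuration above they would typically include the standard OpenTelemetry resource settings plus the Splunk access token (the values here are placeholders):

```shell
export OTEL_SERVICE_NAME="worms-in-space-backend"
export OTEL_RESOURCE_ATTRIBUTES="deployment.environment=production"
export SPLUNK_ACCESS_TOKEN="<your-splunk-access-token>"
```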
This configuration allows seamless deployment across development, staging, and production environments while maintaining complete trace context and correlation.
What You'll See in Splunk Observability Cloud
With this instrumentation in place in just four commits, we immediately gain visibility into:
- HTTP requests handled by Phoenix and Cowboy
- Ecto database queries
- Absinthe GraphQL operations and resolvers
- Our spacewalk scheduling business logic, via the attributes on our custom spans
Going into Splunk Observability Cloud, we can visit the service overview of our Worms in Space backend in Splunk Application Performance Monitoring. Here we can see valuable service metrics, Service Map, Tag Spotlight, errors, and traces:
If we open the traces from this view, we can see long-running traces:
We can open one of those long-running traces and select a span to view the custom attributes that we manually instrumented earlier:
Manual instrumentation success 🎉.
Wrap Up
In just four commits, we've transformed our un-instrumented Elixir backend into a fully observable service. The combination of automatic instrumentation for framework operations and manual instrumentation via custom spans for business logic provides complete visibility into application performance and behavior.
When issues arise at 2 AM, we'll now know exactly which spacewalk scheduling operation failed, how long database queries are taking, and whether the problem is in our business logic or our infrastructure. This level of observability transforms mysterious production issues into clearly diagnosed problems with obvious solutions, so space worms can seamlessly schedule their spacewalks at any time.
Join us in our next post to follow along, step-by-step, as we manually instrument the React-powered frontend of our Worms in Space application.
Ready to add context to your own backend apps? Start with OpenTelemetry's documentation on Language APIs and SDKs, and use Splunk Observability Cloud's 14-day free trial to easily visualize your telemetry data in one unified backend platform.
Resources
Worms in Space example application on GitHub: https://github.com/splunk/evangelism-public/tree/main/manual-instrumentation
OpenTelemetry Language APIs & SDKs documentation: https://opentelemetry.io/docs/languages/