This blog post is part of an ongoing series on OpenTelemetry.
This post shows how to pre-process logs to add additional metadata, such as source type, index, or host.name.
WARNING: WE ARE DISCUSSING A CURRENTLY UNSUPPORTED CONFIGURATION. When sending data to Splunk Enterprise, we currently only support use of the OpenTelemetry Collector in Kubernetes environments. As always, use of the Collector is fully supported to send data to Splunk Observability Cloud.
The OpenTelemetry project is the second largest project of the Cloud Native Computing Foundation (CNCF). The CNCF is part of the Linux Foundation and, besides OpenTelemetry, also hosts Kubernetes, Jaeger, Prometheus, and Helm, among others. OpenTelemetry defines a model to represent traces, metrics, and logs. Using this model, it offers libraries in different programming languages so folks can collect this data. Just as important, the project delivers an executable named the OpenTelemetry Collector, which receives, processes, and exports data as a pipeline.
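To make the pipeline idea concrete, here is a minimal sketch of a Collector configuration wiring a receiver, a processor, and an exporter together; the component choices and the file path are illustrative, not the ones used later in this post:

receivers:
  filelog:
    include: [ /var/log/example.log ]    # illustrative path
processors:
  batch:
exporters:
  logging:    # prints received data to the console
service:
  pipelines:
    logs:
      receivers: [ filelog ]
      processors: [ batch ]
      exporters: [ logging ]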
The OpenTelemetry Collector uses a component-based architecture, which allows folks to devise their own distribution by picking and choosing which components they want to support. Please see our official documentation to install the Collector.
At Splunk, we maintain our distribution of the OpenTelemetry Collector in this open-source repository. The repository contains our configuration and hardening parameters, as well as examples.
In this example, we are going to build on our previous blog post, in which we ingested logs from a file and sent them to Splunk Enterprise. We will apply a twist by creating three pipelines that read from three different files. Data coming from each of the three files will be associated with a different source type.
Our Splunk HEC exporter reads a specific element of the log record to determine the source type it emits in HEC events. Per the log data model, it uses the com.splunk.sourcetype resource attribute by default. You can override which attribute the exporter looks for with the hec_metadata_to_otel_attrs/sourcetype setting in its configuration.
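To give an idea of what this looks like, here is a minimal sketch of the exporter configuration; the token and endpoint values below are placeholders, not the ones used in this example:

exporters:
  splunk_hec/logs:
    token: "00000000-0000-0000-0000-000000000000"    # placeholder HEC token
    endpoint: "https://splunk:8088/services/collector"    # placeholder endpoint
    index: "logs"
    # The exporter reads the source type from this resource attribute.
    # com.splunk.sourcetype is the default, shown here for clarity.
    hec_metadata_to_otel_attrs:
      sourcetype: "com.splunk.sourcetype"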
In the log data model, a resource groups log records coming from the same origin, so we don't repeat the same metadata for each log entry.
To manipulate resource attributes, we can use a processor named the resource processor.
Here is a typical use of the processor, setting the attribute com.splunk.sourcetype to the value sourcetype1:
resource/one:
  attributes:
    - key: com.splunk.sourcetype
      value: "sourcetype1"
      action: upsert
Our example, then, consists of adding this processor, configured with a different source type each time, to all three pipelines:
processors:
  batch:
  resource/one:
    attributes:
      - key: com.splunk.sourcetype
        value: "sourcetype1"
        action: upsert
  resource/two:
    attributes:
      - key: com.splunk.sourcetype
        value: "sourcetype2"
        action: upsert
  resource/three:
    attributes:
      - key: com.splunk.sourcetype
        value: "sourcetype3"
        action: upsert
service:
  pipelines:
    logs/one:
      receivers: [ filelog/onefile ]
      processors: [ batch, resource/one ]
      exporters: [ splunk_hec/logs ]
    logs/two:
      receivers: [ filelog/twofile ]
      processors: [ batch, resource/two ]
      exporters: [ splunk_hec/logs ]
    logs/three:
      receivers: [ filelog/threefolder ]
      processors: [ batch, resource/three ]
      exporters: [ splunk_hec/logs ]
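The pipelines above reference three filelog receivers defined elsewhere in the configuration. Here is a rough sketch of what their definitions might look like; the file paths are illustrative and may differ from those in the example repository:

receivers:
  filelog/onefile:
    include: [ /output/file1.log ]    # illustrative path
  filelog/twofile:
    include: [ /output/file2.log ]    # illustrative path
  filelog/threefolder:
    include: [ /output/file3.log ]    # illustrative path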
We have put this all together into an example that lives in Splunk's OpenTelemetry Collector GitHub repository. To run this example, you will need at least 4 gigabytes of RAM, as well as git and Docker Desktop installed.
First, check out the repository with git clone https://github.com/signalfx/splunk-otel-collector.git
Using a terminal window, navigate to the folder examples/otel-logs-with-sourcetypes-splunk.
Type:
docker-compose up
This will start the OpenTelemetry Collector, our bash script generating data, and Splunk Enterprise. Your terminal will display output as Splunk starts. Eventually, just as in our last blog post, Splunk will let us know it is ready.
Now, you can open your web browser and navigate to http://localhost:18000. You can log in as admin/changeme. You will be met with a few prompts as this is a new Splunk instance. Make sure to read and acknowledge them, and open the default search application.
In the search box, enter index=logs to start searching for your logs. You will see that the logs have been ingested with separate source types. You can also narrow your search to a single source type, for example with index=logs sourcetype=sourcetype1.
When you have finished exploring this example, you can press Ctrl+C to exit from Docker Compose. Thank you for following along! This concludes our source types example. We have used the OpenTelemetry Collector to read logs from three files, assign each a different source type, and send them to our Splunk Enterprise instance.
If you found this example interesting, feel free to star the repository! Just click the star icon in the top right corner. Any ideas or comments? Please open an issue on the repository.
— Antoine Toulme, Senior Engineering Manager, Blockchain & DLT