
Do ingest-time log-to-metrics transforms still index original events?

rleviseur
Explorer

When configuring ingest-time log-to-metrics conversions via props.conf and transforms.conf, does Splunk still index the original events to a normal event index?

Is it possible to have the same input both logged to a normal index and converted to metrics for indexing into a metrics index?


sowings
Splunk Employee

If you use the CLONE_SOURCETYPE setting, you can still ingest the log data normally and also get metrics out of the same data, sent to a separate metrics index. This requires a bit of work in props.conf and transforms.conf; a sample follows.

Splunk's own logs in $SPLUNK_HOME/var/log/splunk are ingested by a single monitor: stanza, with the sourcetype of the data being set by source:: rules in props.conf. In this case, I was interested in the time spent by the Cluster Master's "cmmaster" service, to track the work it was doing to keep my cluster healthy.

props.conf

This first section clones some events from the metrics log based upon the regex in the clone_cmmaster_service transform.

The transform (shown below) gives the cloned events a new sourcetype, "cluster_master_svc".

[source::.../var/log/splunk/metrics.log(.\d+)?]
TRANSFORMS-cmmaster_metrics = clone_cmmaster_service

These rules define what to do with the events with sourcetype "cluster_master_svc".

[cluster_master_svc]

These settings are included for completeness; the cloned events would already have their timestamps parsed.

ADD_EXTRA_TIME_FIELDS = false
ANNOTATE_PUNCT = false
TIME_PREFIX = ^
TIME_FORMAT = %m-%d-%Y %H:%M:%S.%3N %z
SHOULD_LINEMERGE = false

These transforms (see below) are applied in lexical order of the class name that follows the "TRANSFORMS-" prefix.

First, we use the built-in extraction to find key=value pairs (this works for metrics.log, but may not suit your log data).

TRANSFORMS-0_make_fields = field_extraction

Next we provide the "metric_name" field, which serves as a prefix to any of the measures found in the event.

TRANSFORMS-1_mark_metrics = cluster_service_metrics_metric_name

Finally, make sure these events are sent to a metrics index. The FORMAT of the transform dictates which index.

TRANSFORMS-4_move_index = cluster_metrics_to_index

Now, tell Splunk how to make sense of the fields found in the event, to convert them to metrics.

METRIC-SCHEMA-TRANSFORMS = metric-schema:cluster_service_metrics_multiple

transforms.conf


[clone_cmmaster_service]
REGEX = group=subtask_(?:count|seconds)
CLONE_SOURCETYPE = cluster_master_svc

This transform sets the "metric_name" field for any measurements found in the event.

[cluster_service_metrics_metric_name]

This would capture either "subtask_counts" or "subtask_seconds" from the group= field.

REGEX = group=(subtask_(?:counts|seconds)), name=cmmaster_service
FORMAT = metric_name::$1
WRITE_META = TRUE
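One way to sanity-check the two REGEX settings above before deploying them is to run them against sample lines outside Splunk. The sketch below uses Python's re module as a stand-in for Splunk's regex engine (an assumption that should hold for simple patterns like these), and the sample lines are invented for illustration rather than taken from a real metrics.log. It also shows that the clone regex, although written with "count", still matches "subtask_counts" events because the match is unanchored.

```python
import re

# Patterns copied from the clone_cmmaster_service and
# cluster_service_metrics_metric_name transforms.
CLONE_REGEX = re.compile(r"group=subtask_(?:count|seconds)")
METRIC_NAME_REGEX = re.compile(
    r"group=(subtask_(?:counts|seconds)), name=cmmaster_service"
)

# Hypothetical sample lines, invented for illustration only.
SAMPLE_LINES = [
    "01-02-2024 10:15:30.123 -0500 INFO  Metrics - group=subtask_counts, "
    "name=cmmaster_service, to_fix_total=3, count=5",
    "01-02-2024 10:15:30.123 -0500 INFO  Metrics - group=subtask_seconds, "
    "name=cmmaster_service, to_fix_total=0.004, service=0.010",
    "01-02-2024 10:15:30.123 -0500 INFO  Metrics - group=thruput, "
    "name=index_thruput, kbps=1.5",
]

def classify(line):
    """Return (would_clone, metric_name or None) for one log line."""
    would_clone = CLONE_REGEX.search(line) is not None
    match = METRIC_NAME_REGEX.search(line)
    return would_clone, (match.group(1) if match else None)

results = [classify(line) for line in SAMPLE_LINES]
for cloned, name in results:
    print(cloned, name)
```

The third sample line belongs to a different group, so it is neither cloned nor given a metric_name, which is what the filter is meant to do.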

Send any events with sourcetype=cluster_master_svc to the "cm_metrics" index.

[cluster_metrics_to_index]
SOURCE_KEY = MetaData:Sourcetype
REGEX = cluster_master_svc
DEST_KEY = _MetaData:Index
FORMAT = cm_metrics

This transform declares what to do with the fields found in the event, to convert them to metrics.

[metric-schema:cluster_service_metrics_multiple]

When the metric name is "subtask_counts", "measure" these fields.

METRIC-SCHEMA-MEASURES-subtask_counts = to_fix_data_safety, to_fix_rep_factor, to_fix_search_factor, to_fix_summary, to_fix_total, count

When the metric name is "subtask_seconds", "measure" these fields instead.

METRIC-SCHEMA-MEASURES-subtask_seconds = to_fix_data_safety, to_fix_rep_factor, to_fix_search_factor, to_fix_summary, to_fix_total, service

If no metric_name is set, just capture these fields.

METRIC-SCHEMA-MEASURES = to_fix_data_safety, to_fix_rep_factor, to_fix_search_factor, to_fix_summary, to_fix_total

Because there are multiple measurements in these events, the metric_name field becomes a prefix.

Instead of simply "to_fix_rep_factor" appearing as a metric, it would be either "subtask_seconds.to_fix_rep_factor" or "subtask_counts.to_fix_rep_factor".

With all of these rules in place, metrics.log is ingested as normal, and I also get data points in my metrics index that capture the counts (and seconds taken) of the work required of the cluster master, as reported in the "group=subtask_counts" and "group=subtask_seconds" events.
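The prefixing described above (metric_name plus a dot plus the measure field) can be modeled in a few lines. This is only a simplified sketch of the behavior this post describes, not Splunk's implementation; the function name to_metrics, the fallback for events without a recognized metric_name, and the sample field values are all assumptions made for illustration.

```python
# Measure lists copied from the metric-schema stanza in transforms.conf.
COMMON = [
    "to_fix_data_safety", "to_fix_rep_factor", "to_fix_search_factor",
    "to_fix_summary", "to_fix_total",
]
MEASURES_BY_NAME = {
    "subtask_counts": COMMON + ["count"],
    "subtask_seconds": COMMON + ["service"],
}

def to_metrics(fields):
    """Hypothetical model: map extracted fields to {metric name: value}."""
    prefix = fields.get("metric_name")
    if prefix in MEASURES_BY_NAME:
        wanted = MEASURES_BY_NAME[prefix]
        return {f"{prefix}.{k}": float(fields[k]) for k in wanted if k in fields}
    # No recognized metric_name: fall back to the unprefixed measure list
    # (an assumption about the default case).
    return {k: float(fields[k]) for k in COMMON if k in fields}

# Invented field values, for illustration only.
example = {
    "metric_name": "subtask_seconds",
    "name": "cmmaster_service",
    "to_fix_rep_factor": "0.002",
    "service": "0.010",
}
result = to_metrics(example)
print(result)
```

Fields that are not listed as measures, such as "name" here, are left out of the result; in this model they would stay behind as dimensions rather than become metric values.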
