Events to Metrics Conversion: Does data cloning and routing cause overhead on Splunk?

Poojitha
Communicator

Hi All,

I have a requirement to write metrics data to a metrics index from an existing events index as soon as the data is ingested. The most important consideration is that this must not disturb the existing events index.

The props.conf and transforms.conf of the app created for this are below:

props.conf:

[test:applogs]

TRANSFORMS-clone = clone_only_metrics
NO_BINARY_CHECK = true
DATETIME_CONFIG = NONE

[applogs_clone]

TRANSFORMS-extract = extract_metric_values
TRANSFORMS-routing = route_to_metrics
METRIC-SCHEMA-TRANSFORMS = metric-schema:log_to_metrics
NO_BINARY_CHECK = true
DATETIME_CONFIG = NONE


transforms.conf:

[extract_metric_values]
REGEX = <myregex, which is working fine>
FORMAT = l_time::$1 Id::$2 metric_value::$3 metric_name::$4 pod::$5 namespace::$6 podid::$7 k8s::$8 c_name::$9 docker::$10 c_hash::$11 c_image::$12 extracted_host::$13
WRITE_META = true

[clone_only_metrics]
REGEX = "metricName"
CLONE_SOURCETYPE = applogs_clone
WRITE_META = true

[route_to_metrics]
REGEX = .*
DEST_KEY = _MetaData:Index
FORMAT = new_metrics_index

[metric-schema:log_to_metrics]
METRIC-SCHEMA-MEASURES = metric_value
METRIC-SCHEMA-WHITELIST-DIMS = Id, pod, namespace, podid, c_name, c_hash, c_image, docker, extracted_host

Workflow:
1) test:applogs is the existing sourcetype that holds the event logs.
2) Events of this sourcetype are evaluated for cloning, and logs matching "metricName" are cloned.
3) The cloned sourcetype's pipeline performs field extraction, index routing, and metric schema enforcement, and then the metric is stored.

The volume of application logs ingested is very high (new events every second), so I want to be sure that this implementation does not create overhead on Splunk. I have the following questions:
1) Does CLONE_SOURCETYPE = applogs_clone create a second copy of the data?
2) When I run | mpreview index=new_metrics_index, I see this sourcetype (applogs_clone). When I run just sourcetype=applogs_clone, there are no results. Does this mean that after the metric schema is applied, the original log text in the cloned sourcetype is discarded (so there is no duplicate data), and I should be fine with this implementation?

Please share your thoughts and help me on this.




Poojitha
Communicator

@livehybrid: From your response, I understand that there is no duplication of the original logs (I mean the events cloned from the original sourcetype test:applogs), so I should be fine with this implementation. Or should I use nullQueue to drop this cloned sourcetype? Would that work?


livehybrid
SplunkTrust

Hi @Poojitha 

That's right, you won't get a duplication of the entire event; the only 'duplicate' part is the metrics you have extracted.

You do not need to do anything with nullQueue. If you did, you'd probably end up sending the metric version of your event to nullQueue, which you wouldn't want to do.



livehybrid
SplunkTrust

Hi @Poojitha 

You are correct: the reason you don't see the applogs_clone sourcetype with a regular search is that it has been saved as metrics, so you need to use the metric-specific commands, such as the mpreview command you ran. The rest of the event from the cloned sourcetype is dropped and only the extracted metrics are saved. The original event is untouched.

If you are on an ingest-based license, then you will pay 150 bytes per metric ingested (see https://help.splunk.com/ja-jp/data-management/splunk-enterprise-admin-manual/9.3/configure-splunk-li...
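
To put the 150-byte figure in context, here is a rough back-of-the-envelope sketch in Python. The rate of 1,000 metric events per second is purely a hypothetical number for illustration (the thread does not state the real volume), and the calculation assumes every cloned event produces exactly one metric.

```python
# Rough estimate of daily license usage for the metrics index.
# BYTES_PER_METRIC is the figure quoted in the reply above;
# the ingest rate used below is a hypothetical assumption.
BYTES_PER_METRIC = 150
SECONDS_PER_DAY = 86_400


def daily_license_bytes(metrics_per_second):
    """Return the estimated license bytes consumed per day."""
    return metrics_per_second * SECONDS_PER_DAY * BYTES_PER_METRIC


assumed_rate = 1_000  # hypothetical metrics per second, not from the thread
estimate = daily_license_bytes(assumed_rate)
print(f"{estimate / 1e9:.2f} GB/day")  # prints "12.96 GB/day"
```

Swap in your own measured rate to see whether the extra license usage matters for your environment.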

It's hard for us to determine the load on the Splunk environment when carrying out this 'metricification', because the biggest impact is likely to come from the regex computation. A bad regex could cause unnecessary processing, for example.
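
One way to get a feel for the regex cost before deploying is to time candidate patterns against a sample log line offline. The sketch below does that in Python. The sample line and both patterns are hypothetical, since the real REGEX is not shown in the thread, and Python's re module is not the regex engine Splunk uses, so treat the numbers only as a relative comparison between patterns.

```python
import re
import timeit

# Hypothetical sample event; the real log format is not shown in the thread.
SAMPLE = '2024-01-01T00:00:00Z id=42 "metricName":"latency_ms" value=12.5 pod=web-1'

# Two hypothetical patterns that capture the same fields.
# LOOSE wraps everything in greedy .* and forces extra backtracking.
LOOSE = re.compile(r'.*"metricName":"([^"]+)".*value=([\d.]+).*')
# TIGHT matches only the part of the line that is actually needed.
TIGHT = re.compile(r'"metricName":"([^"]+)" value=([\d.]+)')


def per_event_cost(pattern, line, runs=10_000):
    """Return the mean seconds per search attempt over `runs` repetitions."""
    total = timeit.timeit(lambda: pattern.search(line), number=runs)
    return total / runs


for name, pattern in (("loose", LOOSE), ("tight", TIGHT)):
    cost = per_event_cost(pattern, SAMPLE)
    print(f"{name}: {cost * 1e6:.2f} microseconds per event")
```

Multiplying the per-event cost by your events per second gives a rough idea of how much CPU time the extraction could add, under the assumptions above.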

