Hi All,
I have a requirement where I have to write metrics data to metrics index from existing events index as soon as the data is ingested. Most important consideration is this should not disturb existing events index.
The props.conf and transforms.conf of the app created for this are as below:
Props.conf:
[test:applogs]
TRANSFORMS-clone = clone_only_metrics
NO_BINARY_CHECK = true
DATETIME_CONFIG = NONE
[applogs_clone]
TRANSFORMS-extract = extract_metric_values
TRANSFORMS-routing = route_to_metrics
METRIC-SCHEMA-TRANSFORMS = metric-schema:log_to_metrics
NO_BINARY_CHECK = true
DATETIME_CONFIG = NONE
Transforms.conf:
[extract_metric_values]
REGEX = <my regex, which is working fine>
FORMAT = l_time::$1 Id::$2 metric_value::$3 metric_name::$4 pod::$5 namespace::$6 podid::$7 k8s::$8 c_name::$9 docker::$10 c_hash::$11 c_image::$12 extracted_host::$13
WRITE_META = true
[clone_only_metrics]
REGEX = "metricName"
CLONE_SOURCETYPE = applogs_clone
WRITE_META = true
[route_to_metrics]
REGEX = .*
DEST_KEY = _MetaData:Index
FORMAT = new_metrics_index
[metric-schema:log_to_metrics]
METRIC-SCHEMA-MEASURES = metric_value
METRIC-SCHEMA-WHITELIST-DIMS = Id, pod, namespace, podid, c_name, c_hash, c_image, docker, extracted_host

Workflow:
1) test:applogs is the existing sourcetype that has event logs.
2) This sourcetype is evaluated for cloning and logs matching metricName are cloned.
3) The cloned sourcetype's pipeline then performs field extraction, index routing, and metric schema enforcement, and the metric is stored.
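As a sanity check on the workflow above, you can inspect what actually landed in the metrics index (hypothetical searches, assuming the index and sourcetype names from the config above):

| mcatalog values(metric_name) WHERE index=new_metrics_index

| mpreview index=new_metrics_index target_per_timeseries=5

The first lists the metric names that were created by the schema transform; the second samples raw metric data points per time series.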
The volume of application logs ingested is very large (events arrive every second), so I want to be sure that this implementation does not create overhead on Splunk. So, I have the below questions:
1) Does CLONE_SOURCETYPE = applogs_clone create a second copy of the data?
2) When I run | mpreview index=new_metrics_index, I see this sourcetype (applogs_clone). When I just run sourcetype=applogs_clone, there is no result. Does this mean that after applying the metrics schema, the original logs in the cloned sourcetype are discarded (so there is no duplicate data) and I should be good with this implementation?
Please share your thoughts and help me on this.
@livehybrid: From your response, I understand that there is no duplication of the original logs (I mean the events cloned from the original sourcetype test:applogs) and I should be good with this implementation. Or should I use nullQueue to drop this cloned sourcetype? Does that work?
Hi @Poojitha
That's right, you won't get a duplication of the entire event; the only 'duplicate' part is the metrics you have extracted.
You do not need to do anything with nullQueue. If you did, you'd probably end up nullQueueing the metric version of your event, which you wouldn't want to do.
Hi @Poojitha
You are correct: the reason you don't see the applogs_clone sourcetype with a regular search is that it has been saved as metrics, so you need to use the metric-specific commands. The rest of the event from the cloned sourcetype is dropped and only the extracted metrics are saved. The original event is untouched.
If you are on an ingest-based license, then you will pay 150 bytes per metric ingested (see https://help.splunk.com/ja-jp/data-management/splunk-enterprise-admin-manual/9.3/configure-splunk-li...).
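To put a rough number on that 150-bytes-per-metric rule (the ingest rate here is hypothetical, not measured from your environment): at, say, 1,000 metric data points per second, the metrics license usage would be approximately

1,000 metrics/sec x 86,400 sec/day x 150 bytes = 12,960,000,000 bytes ≈ 12.96 GB/day

so it is worth estimating your actual matched-event rate before rolling this out.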
It's hard for us to determine the load on the Splunk environment when carrying out this 'metricification' because it's largely the regex computation that could cause the biggest impact; a bad regex could cause unnecessary processing, for example.