When I check the contents of my metrics index, I don't see any GPU metrics (checked via | mcatalog values(metric_name) where index=infra_metrics).

My script produces this output:

    metric_name=gpu.utilization value=0 gpu_index=0 gpu_name=NVIDIA_L40S
    metric_name=gpu.memory_used_pct value=0.00 gpu_index=0 gpu_name=NVIDIA_L40S
    metric_name=gpu.temperature value=35 gpu_index=0 gpu_name=NVIDIA_L40S
    metric_name=gpu.power_draw value=39.14 gpu_index=0 gpu_name=NVIDIA_L40S

I have also tried the metric_name:gpu.temperature form. For the sourcetype I have tried both gpu_metrics and gpu:metrics. The splunk user can run gpu_metrics.sh and can run nvidia-smi commands directly, but nothing is ingested or parsed into my index. As far as I can tell I have configured everything correctly, yet the data is simply not there.

This is my setup:

/opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh

    #!/bin/bash
    NVIDIA_SMI=/usr/bin/nvidia-smi

    $NVIDIA_SMI \
      --query-gpu=index,name,utilization.gpu,utilization.memory,memory.total,memory.used,temperature.gpu,power.draw \
      --format=csv,noheader,nounits | while IFS=',' read -r gpu_index gpu_name util_gpu mem_util mem_total mem_used temp power
    do
        gpu_index=$(echo "$gpu_index" | xargs)
        gpu_name=$(echo "$gpu_name" | xargs | tr ' ' '_')
        util_gpu=$(echo "$util_gpu" | xargs)
        mem_total=$(echo "$mem_total" | xargs)
        mem_used=$(echo "$mem_used" | xargs)
        temp=$(echo "$temp" | xargs)
        power=$(echo "$power" | xargs)

        # calculate memory percentage
        mem_used_pct=0
        if [ "$mem_total" -gt 0 ]; then
            mem_used_pct=$(awk "BEGIN {printf \"%.2f\", ($mem_used/$mem_total)*100}")
        fi

        # Proper Splunk metrics format
        echo "metric_name:gpu.utilization _value=$util_gpu gpu_index=$gpu_index gpu_name=$gpu_name"
        echo "metric_name:gpu.memory_used_pct _value=$mem_used_pct gpu_index=$gpu_index gpu_name=$gpu_name"
        echo "metric_name:gpu.temperature _value=$temp gpu_index=$gpu_index gpu_name=$gpu_name"
        echo "metric_name:gpu.power_draw _value=$power gpu_index=$gpu_index gpu_name=$gpu_name"
    done
/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat inputs.conf

    [script:///opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh]
    interval = 60
    index = infra_metrics
    sourcetype = gpu:metrics
    disabled = false

/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat props.conf

    [gpu:metrics]
    DATAMODE = metric
    METRICS_PROTOCOL = true
    LINE_BREAKER = ([\r\n]+)
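To rule out the script's formatting step as the cause, it can be exercised on its own, without a GPU or a forwarder. The sketch below is only an illustration: the function name format_gpu_line is hypothetical, the sample CSV row is made up to resemble what nvidia-smi prints with --format=csv,noheader,nounits, and awk is given its inputs through -v rather than string interpolation. It says nothing about how Splunk parses the resulting lines.

```shell
#!/bin/bash
# Hypothetical helper: turn one CSV row (index, name, utilization.gpu,
# utilization.memory, memory.total, memory.used, temperature.gpu, power.draw)
# into the four metric lines that gpu_metrics.sh echoes.
# The body runs in a subshell, so its variables do not leak to the caller.
format_gpu_line() (
    IFS=',' read -r gpu_index gpu_name util_gpu mem_util mem_total mem_used temp power <<EOF
$1
EOF

    # Trim surrounding whitespace; replace spaces inside the GPU name.
    gpu_index=$(echo "$gpu_index" | xargs)
    gpu_name=$(echo "$gpu_name" | xargs | tr ' ' '_')
    util_gpu=$(echo "$util_gpu" | xargs)
    mem_total=$(echo "$mem_total" | xargs)
    mem_used=$(echo "$mem_used" | xargs)
    temp=$(echo "$temp" | xargs)
    power=$(echo "$power" | xargs)

    # Memory usage as a percentage, guarded against a zero total.
    mem_used_pct=0
    if [ "$mem_total" -gt 0 ]; then
        mem_used_pct=$(awk -v used="$mem_used" -v total="$mem_total" \
            'BEGIN {printf "%.2f", (used/total)*100}')
    fi

    echo "metric_name:gpu.utilization _value=$util_gpu gpu_index=$gpu_index gpu_name=$gpu_name"
    echo "metric_name:gpu.memory_used_pct _value=$mem_used_pct gpu_index=$gpu_index gpu_name=$gpu_name"
    echo "metric_name:gpu.temperature _value=$temp gpu_index=$gpu_index gpu_name=$gpu_name"
    echo "metric_name:gpu.power_draw _value=$power gpu_index=$gpu_index gpu_name=$gpu_name"
)

# Sample row, made up for illustration (not captured from a real GPU).
format_gpu_line "0, NVIDIA L40S, 0, 0, 46068, 0, 35, 39.14"
```

If these lines look right, the script's output format is at least consistent, and the remaining suspects are the input and sourcetype configuration.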