<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: GPU metrics monitoring via nvidia-smi in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759385#M120355</link>
    <description>&lt;P&gt;OK. If you were able to successfully run&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;/opt/splunkforwarder/bin/splunk cmd /opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh&lt;/PRE&gt;&lt;P&gt;and got meaningful results, I'd start by ingesting the data into a normal event index. If that works but ingesting the same data as metrics doesn't, the problem is in how the data is being parsed into the metrics schema.&lt;/P&gt;</description>
    <pubDate>Tue, 17 Mar 2026 18:30:41 GMT</pubDate>
    <dc:creator>PickleRick</dc:creator>
    <dc:date>2026-03-17T18:30:41Z</dc:date>
    <item>
      <title>GPU metrics monitoring via nvidia-smi</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759341#M120349</link>
      <description>&lt;P&gt;Hello. I have the following issue: I can't get Splunk to index GPU data into a metrics index. On the GPU server I have a working forwarder that sends infrastructure data via the Splunk Add-on for Unix and Linux to a metrics index called infra_metrics. Unfortunately, I can't get the GPU metrics indexed into that same index. I am using a script that collects the metrics and is executable by the splunkfwd user:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh&lt;BR /&gt;metric_name:gpu.utilization _value=0 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name:gpu.memory_used_pct _value=0.00 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name:gpu.temperature _value=35 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name:gpu.power_draw _value=38.95 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;I have the following setup:&lt;BR /&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat inputs.conf&lt;BR /&gt;[script:///opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh]&lt;BR /&gt;interval = 60&lt;BR /&gt;index = infra_metrics&lt;BR /&gt;sourcetype = gpu:metrics&lt;BR /&gt;disabled = false&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat props.conf&lt;BR /&gt;[gpu:metrics]&lt;BR /&gt;DATAMODE = metric&lt;BR /&gt;METRICS_PROTOCOL = true&lt;BR /&gt;LINE_BREAKER = ([\r\n]+)&lt;/P&gt;</description>
      <pubDate>Mon, 16 Mar 2026 13:47:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759341#M120349</guid>
      <dc:creator>radko</dc:creator>
      <dc:date>2026-03-16T13:47:44Z</dc:date>
    </item>
    <item>
      <title>Re: GPU metrics monitoring via nvidia-smi</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759366#M120350</link>
      <description>&lt;P&gt;OK. And what actually is your problem here?&lt;/P&gt;&lt;P&gt;Is your script not being run properly?&lt;/P&gt;&lt;P&gt;Does it not produce data?&lt;/P&gt;&lt;P&gt;Is it not getting parsed?&lt;/P&gt;&lt;P&gt;Something else?&lt;/P&gt;&lt;P&gt;What have you already done to debug the issue?&lt;/P&gt;</description>
      <pubDate>Mon, 16 Mar 2026 19:58:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759366#M120350</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2026-03-16T19:58:56Z</dc:date>
    </item>
    <item>
      <title>Re: GPU metrics monitoring via nvidia-smi</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759375#M120353</link>
      <description>&lt;P&gt;When I check the contents of my metrics index, I don't see any GPU values (via&amp;nbsp;| mcatalog values(metric_name) where index=infra_metrics). My script produces this output:&lt;/P&gt;&lt;DIV&gt;metric_name=gpu.utilization value=0 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name=gpu.memory_used_pct value=0.00 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name=gpu.temperature value=35 gpu_index=0 gpu_name=NVIDIA_L40S&lt;BR /&gt;metric_name=gpu.power_draw value=39.14 gpu_index=0 gpu_name=NVIDIA_L40S&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I have also tried naming it&amp;nbsp;metric_name:gpu.temperature. I have set the sourcetype to gpu_metrics and to gpu:metrics (tried both).&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;The Splunk user is able to run gpu_metrics.sh and to run nvidia-smi commands directly, but nothing is ingested or parsed into my index. The data is just not there, even though as far as I can tell I have set everything up correctly.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I have used the following setup:&lt;BR /&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;P&gt;#!/bin/bash&lt;/P&gt;&lt;P&gt;NVIDIA_SMI=/usr/bin/nvidia-smi&lt;/P&gt;&lt;P&gt;$NVIDIA_SMI \&lt;BR /&gt;--query-gpu=index,name,utilization.gpu,utilization.memory,memory.total,memory.used,temperature.gpu,power.draw \&lt;BR /&gt;--format=csv,noheader,nounits | while IFS=',' read -r gpu_index gpu_name util_gpu mem_util mem_total mem_used temp power&lt;BR /&gt;do&lt;BR /&gt;gpu_index=$(echo "$gpu_index" | xargs)&lt;BR /&gt;gpu_name=$(echo "$gpu_name" | xargs | tr ' ' '_')&lt;BR /&gt;util_gpu=$(echo "$util_gpu" | xargs)&lt;BR /&gt;mem_total=$(echo "$mem_total" | xargs)&lt;BR /&gt;mem_used=$(echo "$mem_used" | xargs)&lt;BR /&gt;temp=$(echo "$temp" | xargs)&lt;BR /&gt;power=$(echo "$power" | xargs)&lt;/P&gt;&lt;P&gt;# calculate memory percentage&lt;BR /&gt;mem_used_pct=0&lt;BR /&gt;if [ "$mem_total" -gt 0 ]; then&lt;BR /&gt;mem_used_pct=$(awk "BEGIN {printf \"%.2f\", ($mem_used/$mem_total)*100}")&lt;BR /&gt;fi&lt;/P&gt;&lt;P&gt;# Proper Splunk metrics format&lt;BR /&gt;echo "metric_name:gpu.utilization _value=$util_gpu gpu_index=$gpu_index gpu_name=$gpu_name"&lt;BR /&gt;echo "metric_name:gpu.memory_used_pct _value=$mem_used_pct gpu_index=$gpu_index gpu_name=$gpu_name"&lt;BR /&gt;echo "metric_name:gpu.temperature _value=$temp gpu_index=$gpu_index gpu_name=$gpu_name"&lt;BR /&gt;echo "metric_name:gpu.power_draw _value=$power gpu_index=$gpu_index gpu_name=$gpu_name"&lt;BR /&gt;done&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat inputs.conf&lt;BR /&gt;[script:///opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh]&lt;BR /&gt;interval = 60&lt;BR /&gt;index = infra_metrics&lt;BR /&gt;sourcetype = gpu:metrics&lt;BR /&gt;disabled = false&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/opt/splunkforwarder/etc/apps/gpu_monitor/local# cat props.conf&lt;BR /&gt;[gpu:metrics]&lt;BR /&gt;DATAMODE = metric&lt;BR /&gt;METRICS_PROTOCOL = true&lt;BR /&gt;LINE_BREAKER = ([\r\n]+)&lt;/P&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 17 Mar 2026 07:39:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759375#M120353</guid>
      <dc:creator>radko</dc:creator>
      <dc:date>2026-03-17T07:39:06Z</dc:date>
    </item>
    <item>
      <title>Re: GPU metrics monitoring via nvidia-smi</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759385#M120355</link>
      <description>&lt;P&gt;OK. If you were able to successfully run&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;/opt/splunkforwarder/bin/splunk cmd /opt/splunkforwarder/etc/apps/gpu_monitor/bin/gpu_metrics.sh&lt;/PRE&gt;&lt;P&gt;and got meaningful results, I'd start by ingesting the data into a normal event index. If that works but ingesting the same data as metrics doesn't, the problem is in how the data is being parsed into the metrics schema.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Mar 2026 18:30:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759385#M120355</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2026-03-17T18:30:41Z</dc:date>
    </item>
    <item>
      <title>Re: GPU metrics monitoring via nvidia-smi</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759440#M120368</link>
      <description>&lt;P&gt;Thank you for the solution. I created a normal event index that receives the output from gpu_metrics.sh. I've also set the script to run via a cron job, and I get consistent results.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Mar 2026 14:36:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/GPU-metrics-monitoring-via-nvidia-smi/m-p/759440#M120368</guid>
      <dc:creator>radko</dc:creator>
      <dc:date>2026-03-19T14:36:19Z</dc:date>
    </item>
  </channel>
</rss>

