Dear All,
So I have a Linux script that runs vmstat as a daemon and writes the output every minute to a csv file. Here is some typical output ...
_time, metric_name:vmstat.procs.runwait, metric_name:vmstat.procs.blocking, metric_name:vmstat.memory.swapped, metric_name:vmstat.memory.free, metric_name:vmstat.memory.buffers, metric_name:vmstat.memory.cache, metric_name:vmstat.swap.in, metric_name:vmstat.swap.out, metric_name:vmstat.blocks.read, metric_name:vmstat.blocks.written, metric_name:vmstat.system.interupts, metric_name:vmstat.system.contxtswtch, metric_name:vmstat.cpu.user, metric_name:vmstat.cpu.system, metric_name:vmstat.cpu.idle, metric_name:vmstat.cpu.iowait, metric_name:vmstat.cpu.stolen
1637263961, 11, 0, 301056, 13188244, 52, 1645532, 0, 0, 258, 20, 4, 2, 2, 3, 96, 0, 0
1637264021, 3, 0, 301056, 13193028, 52, 1645648, 0, 0, 0, 37, 1480, 2090, 0, 1, 99, 0, 0
1637264081, 3, 0, 301056, 13193448, 52, 1645724, 0, 0, 0, 13, 700, 1097, 0, 0, 100, 0, 0
1637264141, 3, 0, 301056, 13192100, 52, 1645812, 0, 0, 0, 17, 756, 1154, 0, 0, 100, 0, 0
Now every so often I get an error in the message board like
The metric value=metric_name:vmstat.procs.runwait provided for source=/opt/splunkforwarder/etc/apps/TA-linux-metrics/log/read_vmstat.log, sourcetype=csv, host=foo.bar.baz, index=lnx_os_metrics is not a floating point value. Using a numeric type rather than a string type is recommended to avoid indexing inefficiencies. Ensure the metric value is provided as a floating point number and not as a string. For instance, provide 123.001 rather than "123.001".
This does not happen consistently, and when I look at the file it is perfectly formed. The sample above is from a file that just threw this error.
The stanza from inputs.conf is
[monitor:///opt/splunkforwarder/etc/apps/TA-linux-metrics/log/read_vmstat.log]
index = lnx_os_metrics
sourcetype = csv
I tried sourcetype csv as well as metrics_csv; both give the same result.
What on earth could be going on here?
Thanks, R.
Does your log file contain the header line shown in your example data? I ask because I think the error might be occurring every time the file is 'rolled' (or recreated afresh). Splunk tends to pick up the first line of the file and assume it is data, hence the error: "metric_name:vmstat.procs.runwait" isn't numeric at all.
I experience this intermittently and haven't quite got a solution yet, though I suspect removing the header line from your log file would make the issue go away. Alternatively, if you can't change the format of the log file, you could match and ignore that line with props.conf and transforms.conf on the Heavy Forwarder or Indexer, as described here: https://community.splunk.com/t5/Getting-Data-In/How-to-ignore-first-three-line-of-my-log/m-p/64708
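For what it's worth, the props/transforms route would look roughly like the sketch below. It is untested against this data; the stanza name drop_vmstat_header is made up for illustration, and the regex simply assumes the header line always starts with "_time,". One caveat I'm aware of: if the sourcetype uses indexed CSV extractions, the structured parsing happens on the forwarder, so a transform placed on the indexer may never see the line.

```
# props.conf (on the Heavy Forwarder or Indexer)
# Scoped to the monitored file from the inputs.conf stanza in the question
[source::/opt/splunkforwarder/etc/apps/TA-linux-metrics/log/read_vmstat.log]
TRANSFORMS-drop_vmstat_header = drop_vmstat_header

# transforms.conf
# Hypothetical stanza name; sends any line beginning with "_time," to the null queue
[drop_vmstat_header]
REGEX = ^_time,
DEST_KEY = queue
FORMAT = nullQueue
```

Routing to nullQueue discards the matching line before it is indexed, so a header repeated after a roll would never reach the metrics index.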
Hope this helps.
Eddie
I think you hit the nail on the head. The log was rolling and Splunk did not like the repeated header. I completely rewrote the script so it writes to STDOUT and gives the CSV header as its first line, then only writes lines of numeric data after that.
I call it as a script input with an interval of -1 so it runs at Splunk start, and Splunk restarts it if it should die.
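For anyone landing here later, a scripted input stanza along these lines is what that setup amounts to. This is only a sketch: the script name read_vmstat.sh and its bin/ location are placeholders rather than the real file, and the sourcetype shown is one of the two tried earlier in the thread. It is also worth checking the interval setting against the inputs.conf spec for your version; as far as I can tell from the spec, -1 runs the script once at startup, while 0 is the value that relaunches it as soon as it exits.

```
# inputs.conf (sketch; script name and location are hypothetical)
[script:///opt/splunkforwarder/etc/apps/TA-linux-metrics/bin/read_vmstat.sh]
# -1 as described above; compare with 0 in the spec if automatic relaunch matters
interval = -1
index = lnx_os_metrics
# or csv; both sourcetypes were tried with the file monitor
sourcetype = metrics_csv
```

Because the script's STDOUT is one continuous stream, the header is seen once per script start rather than once per log roll.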
As a bonus, the code is now much simpler than the daemon-and-write-to-log approach I was running before.
Thanks for your input,
R.