Hi,
I've integrated collectd metrics with Splunk 6.x via HEC in the past, but I'm running into issues recently with collectd 5.8.1 on Splunk 7.2.4.2.
I need some clues: I'm getting errors while trying to ingest collectd metrics that I haven't seen before. The errors show up in the search head's "Messages" menu in the black nav bar, but I can't find the error messages themselves in the _internal index.
Error message:
search peer idx-xyzmydomain.com has the following message: Metric value= is not valid for source=collectd_hec_token, sourcetype=httpevent, host=aa.xx.yy.zz, index=linux_metrics. Metric event data with an invalid metric value would not be indexed. Ensure the input metric data is not malformed.
An example of the collectd payload is given below. I'd appreciate any hint as to what is wrong with the format of the data. I followed https://docs.splunk.com/Documentation/Splunk/7.2.4/Metrics/GetMetricsInCollectd verbatim.
[{"values":[0],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"thermal","plugin_instance":"cooling_device1","type":"gauge","type_instance":""},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"HYP"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"PIN"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"NPI"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"PIW"},{"values":[0],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"running"},{"values":[101],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"sleeping"},{"values":[0],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"zombies"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"MCP"},{"values":[0],"dstypes"
:["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"stopped"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"MIS"},{"values":[null],"dstypes":["derive"],"dsnames":["value"],"time":1552624141.826,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"irq","plugin_instance":"","type":"irq","type_instance":"ERR"},{"values":[0],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"paging"},{"values":[4949405696],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"all","type":"ps_data","type_instance":""},{"values":[9698123776],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"all","type":"ps_vm","type_instance":""},{"values":[0],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"","type":"ps_state","type_instance":"blocked"},{"values":[null,null],"dstypes":["derive","derive"],"dsnames":["user","syst"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"all","type":"ps_cputime","type_instance":""},{"values":[503173120],"dstypes":["gauge"],"dsnames":["value"],"time":1552624141.837,"interval":60.000,"host":"myhost.us-east-1a.aws.mydomain.com","plugin":"processes","plugin_instance":"all","type":"ps_rss","type_insta
nce":""}]
I've seen that before. Is the sourcetype defined appropriately to process that into metrics data? I can't recall if the requirements on the HEC sourcetype side are explained in the docs.
You are right. The docs don't mention the sourcetype for either inputs.conf or collectd.conf. sourcetype=collectd_http needs to be specified in one of the two files: in the former under the HEC input stanza, or in the latter as part of the URL in the write_http plugin section. The basic examples only hint at the sourcetype in the curl examples in the document "A beginner's guide to collectd".
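For anyone hitting the same thing, here is a rough Python sketch of what "sourcetype as part of the URL" looks like. It only builds the URL string and the Authorization header that would go into the write_http section. The hostname, the port 8088, the token placeholder, and the helper names are my own assumptions for illustration, not values from my setup or from the docs.

```python
from urllib.parse import urlencode, urlunsplit


def build_hec_raw_url(host, port=8088, sourcetype="collectd_http", index=None):
    """Build a HEC raw-endpoint URL that carries the sourcetype as a query parameter.

    host and port are placeholders; 8088 is assumed here as the HEC port.
    """
    params = {"sourcetype": sourcetype}
    if index:
        # Optional: also pin the target index in the query string.
        params["index"] = index
    return urlunsplit(
        ("https", f"{host}:{port}", "/services/collector/raw", urlencode(params), "")
    )


def build_hec_headers(token):
    """Return the Authorization header HEC expects for a given token."""
    return {"Authorization": f"Splunk {token}"}


# Hypothetical values, for illustration only.
url = build_hec_raw_url("splunk.example.com", index="linux_metrics")
headers = build_hec_headers("00000000-0000-0000-0000-000000000000")
print(url)
print(headers["Authorization"])
```

The printed URL is what I would paste into the URL setting of the write_http plugin. Without the sourcetype parameter (and with no sourcetype on the HEC input stanza), my events arrived as sourcetype=httpevent, which matches the error message above.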
Anyway, I learned my lesson. I had better success using the Splunk App for Infrastructure, which has detailed step-by-step documentation and integrated install scriptlets for all components.