
jbburkes · ‎11-14-2019

Splunk App for Infrastructure data collection on Search Head

Followed:
https://docs.splunk.com/Documentation/InfraApp/2.0.0/Admin/ManualInstalLinuxUF

Environment:
Search Head 7.3.0
Indexer 7.3.0

Setup:
collectd -> localhost udp port 5000 -> indexer (via system/local/outputs.conf)

Issue:
Data flows from collectd to localhost UDP port 5000; I verified this with tcpdump and can see the metric data in the capture. The Search Head forwards the data to the Indexer. The Indexer has the Add-On installed as instructed in the documentation, but I get the following error:

Metric value = unset is not valid for source=5000 sourcetype=em_metrics_udp. Metric event data with an invalid metric value would not be indexed. Ensure the input metric data is not malformed.

Thanks.

Jeremy
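
The tcpdump check described above can also be done with a small UDP listener. This is only a sketch under assumptions: the helper names `open_listener` and `read_datagrams` are made up for illustration, and the demo binds an ephemeral loopback port rather than 5000, because on the real host Splunk's `[udp://5000]` input already owns that port.

```python
import socket


def open_listener(host="127.0.0.1", port=0):
    """Bind a UDP socket; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_datagrams(sock, count, timeout=2.0, bufsize=65535):
    """Collect up to `count` datagrams, giving up after `timeout` seconds of silence."""
    sock.settimeout(timeout)
    received = []
    try:
        while len(received) < count:
            data, _addr = sock.recvfrom(bufsize)
            received.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        pass  # nothing more arrived in time; return what we have
    return received


# Loopback demo: send one datagram to ourselves and read it back.
listener = open_listener()
demo_port = listener.getsockname()[1]
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'{"fields": {"metric_name": "cpu.user", "_value": 1.4}}', ("127.0.0.1", demo_port))
print(read_datagrams(listener, 1))
sender.close()
listener.close()
```

To watch real collectd traffic this way, the Splunk UDP input would have to be stopped first, or collectd pointed at a spare port for the duration of the check.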

jbburkes · ‎11-15-2019

The solution of putting the Add-On on the Search Head itself was correct. Is that because the Search Head is basically acting like a glorified Heavy Forwarder? I am a little confused as to why the Search Head is performing any parsing in this regard. Thanks.

Jeremy

yannK · ‎11-15-2019

SII was for Splunk Light; initially it was not intended to be multi-tenant. SAI is an app for Splunk Enterprise, but the setup UI still assumes that you have a single instance (SH/IDX all in one).

So when you are in an enterprise Splunk deployment, read the docs:
https://docs.splunk.com/Documentation/InfraApp/2.0.0/Install/DistributedDeployment

You do not really want to send data to the SH: it is a bottleneck, and HEC may not scale there.
Instead, move the ingest to the indexers.

You need to create the HEC tokens and inputs on the indexers and install the TA (with sourcetypes and indexes).
Set up a DNS load balancer to have a single address for all your indexers.
Then customize the install script for the client to use the correct token and the indexers' address.
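
The indexer-side HEC flow described above can be sketched in Python. This is a minimal sketch, not the SAI install script: the address `https://indexers.example.com:8088`, the token value, and the function name `build_hec_metric_request` are hypothetical. The `/services/collector` endpoint, the `Authorization: Splunk <token>` header, and the `metric_name` / `_value` fields follow the HEC metrics format. The code only builds the request; actually sending it would need `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_hec_metric_request(base_url, token, host, metric_name, value,
                             timestamp, dimensions=None):
    """Build (but do not send) a POST request carrying one HEC metric event."""
    fields = {"metric_name": metric_name, "_value": value}
    fields.update(dimensions or {})  # extra dimensions, e.g. entity_type
    body = {"time": timestamp, "host": host, "event": "metric", "fields": fields}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical load-balanced indexer address and placeholder token.
request = build_hec_metric_request(
    base_url="https://indexers.example.com:8088",
    token="00000000-0000-0000-0000-000000000000",
    host="XXX",
    metric_name="cpu.user",
    value=1.41780386351553,
    timestamp=1573748903.05,
    dimensions={"entity_type": "linux_host"},
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Pointing `base_url` at the single DNS name for the indexers keeps the client configuration identical on every monitored host.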

jbburkes · ‎11-15-2019

So the answer was to install the Add-On on the Search Head itself, which makes me question my understanding of Splunk data flow. Does the Search Head need the Add-On installed because it is basically acting like a HF? Thanks for your help!

Jeremy

yannK · ‎11-14-2019

We expect you to use the collectd setup script, which sends data over HEC to the indexers (and skips the UF).

The SAI app does not use UDP.

Setup:
collectd -> localhost udp port 5000 -> indexer (via system/local/outputs.conf)

If you send data over UDP, the format may not be recognized, as many transformations are done for the em_metrics sourcetypes.

dagarwal_splunk · ‎11-14-2019

We have UDP support in write_splunk.

jbburkes · ‎11-14-2019

Issue:
Additional error missing from the original post:

Metric name is missing from source...Metric event data without metric name is invalid and would not be indexed. Ensure the input metric data is not malformed

collectd.conf

Config file for collectd(1).

Please read collectd.conf(5) for a list of options.

http://collectd.org/

Global

----------------------------------------------------------------------------

Global settings for the daemon.

Hostname "XXX"
FQDNLookup false
BaseDir "/var/lib/collectd"

PIDFile "/var/run/collectd.pid"

PluginDir "/usr/lib64/collectd"

TypesDB "/usr/share/collectd/types.db"

----------------------------------------------------------------------------

When enabled, plugins are loaded automatically with the default options

when an appropriate block is encountered.

Disabled by default.

----------------------------------------------------------------------------

AutoLoadPlugin false

----------------------------------------------------------------------------

When enabled, internal statistics are collected, using "collectd" as the

plugin name.

Disabled by default.

----------------------------------------------------------------------------

CollectInternalStats false

----------------------------------------------------------------------------

Interval at which to query values. This may be overwritten on a per-plugin

base by using the 'Interval' option of the LoadPlugin block:

Interval 60

----------------------------------------------------------------------------

Interval 60

MaxReadInterval 86400

Timeout 2

ReadThreads 5

WriteThreads 5

Limit the size of the write queue. Default is no limit. Setting up a limit is

recommended for servers handling a high volume of traffic.

WriteQueueLimitHigh 1000000
WriteQueueLimitLow 800000

Logging

----------------------------------------------------------------------------

Plugins which provide logging functions should be loaded first, so log

messages generated when loading or configuring other plugins can be

accessed.

LoadPlugin syslog

LoadPlugin logfile

<LoadPlugin write_splunk>
    FlushInterval 30
</LoadPlugin>

<Plugin write_splunk>
    server 127.0.0.1
    buffersize 9000
    useudp true
    udpport 5000
    #data_type metric
    #Dimension "entity_type:linux_host"
</Plugin>

LoadPlugin section

----------------------------------------------------------------------------

Lines beginning with a single `#' belong to plugins which have been built

but are disabled by default.

Lines beginning with `##' belong to plugins which have not been built due

to missing dependencies or because they have been deactivated explicitly.

LoadPlugin csv

LoadPlugin cpu

LoadPlugin memory

LoadPlugin df

LoadPlugin load

LoadPlugin disk

LoadPlugin interface

LoadPlugin uptime

LoadPlugin processmon

Plugin configuration

----------------------------------------------------------------------------

In this section configuration stubs for each plugin are provided. A description of those options is available in the collectd.conf(5) manual page.

<Plugin logfile>
    LogLevel info
    File "/var/log/collectd.log"
    Timestamp true
    PrintSeverity true
</Plugin>

<Plugin syslog>
    LogLevel info
</Plugin>

<Plugin cpu>
    ReportByCpu false
    ReportByState true
    ValuesPercentage true
</Plugin>

<Plugin memory>
    ValuesAbsolute false
    ValuesPercentage true
</Plugin>

<Plugin df>
    FSType "ext2"
    FSType "ext3"
    FSType "ext4"
    FSType "XFS"
    FSType "rootfs"
    FSType "overlay"
    FSType "hfs"
    FSType "apfs"
    FSType "zfs"
    FSType "ufs"
    ReportByDevice true
    ValuesAbsolute false
    ValuesPercentage true
    IgnoreSelected false
</Plugin>

<Plugin load>
    ReportRelative true
</Plugin>

<Plugin disk>
    Disk ""
    IgnoreSelected true
    UdevNameAttr "DEVNAME"
</Plugin>

<Plugin interface>
    IgnoreSelected true
</Plugin>

inputs.conf
[default]
host = XXX

[em_entity_migration://job]
disabled = 1

[udp://5000]
index = em_metrics
sourcetype = em_metrics_udp
no_appending_timestamp = true

[monitor:///var/log/collectd.log]
disabled = false
index = _internal

collectd tcpdump
{time: 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.user", "metric_type": "cpu", "_value": 1.41780386351553, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.system", "metric_type": "cpu", "_value": 0.293985801111308, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.wait", "metric_type": "cpu", "_value": 0.00312750852246072, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.nice", "metric_type": "cpu", "_value": 0, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.interrupt", "metric_type": "cpu", "_value": 0, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.softirq", "metric_type": "cpu", "_value": 0.00729751988574169, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.steal", "metric_type": "cpu", "_value": 0, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}
{"time": 1573748903.05, "host": "XXX", "fields": {"metric_name": "cpu.idle", "metric_type": "cpu", "_value": 98.277785306965, "entity_type": "linix_host", "kernel_version": "3.10.0-1062.4.1.el7.x86_64", "os": "Red Hat Enterprise Linux Server", "os_version": "7.7 (Maipo)", "ip": "XXX"}}

So my confusion is the metric name and metric value are in the event traffic, so why is the indexer throwing this error?

Thanks for the help.

Jeremy
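
The claim that every event in the capture carries a metric name and value can be checked mechanically. Below is a minimal sketch, assuming the datagram is a run of back-to-back JSON objects as in the capture above; the helper names `split_events` and `missing_fields` are hypothetical, and the sample is trimmed to two events with only the relevant fields. Note that the first object as pasted begins with `{time:` without quotes; a strict JSON parser such as this one would reject that, and it is not known from the thread whether that is a capture artifact.

```python
import json

REQUIRED_FIELDS = ("metric_name", "_value")


def split_events(payload):
    """Yield each JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(payload):
        # Skip any whitespace between objects.
        while pos < len(payload) and payload[pos].isspace():
            pos += 1
        if pos >= len(payload):
            break
        # raw_decode parses one value and returns the index where it ended.
        obj, pos = decoder.raw_decode(payload, pos)
        yield obj


def missing_fields(event):
    """Return the required metric fields absent from event['fields']."""
    fields = event.get("fields", {})
    return [name for name in REQUIRED_FIELDS if name not in fields]


sample = (
    '{"time": 1573748903.05, "host": "XXX", "fields": '
    '{"metric_name": "cpu.user", "_value": 1.41780386351553}}'
    '{"time": 1573748903.05, "host": "XXX", "fields": '
    '{"metric_name": "cpu.nice", "_value": 0}}'
)

for event in split_events(sample):
    print(event["fields"]["metric_name"], missing_fields(event))
```

An empty list for every event means the sender side is fine, which points the investigation at whichever Splunk instance parses the `em_metrics_udp` sourcetype.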

dagarwal_splunk · ‎11-14-2019

You might need the TA on the SH as well if you are using it to forward data instead of a UF.

jbburkes · ‎11-15-2019

This is the answer! Thank you.

dagarwal_splunk · ‎11-14-2019

What is the version of the Splunk Add-on for Infrastructure on your indexers?

jbburkes · ‎11-14-2019

Add-On and App are 2.0.0
