What's the best practice for collecting UNIX OS metrics?

Ultra Champion

There are a few ways to collect UNIX operating system metrics. Which method should I use? Does it depend on the situation?

These are the Splunk data collection apps I'm looking at:
- Splunk Add-on for Unix and Linux
- Splunk Add-on for Linux
- Splunk App for Infrastructure

Re: What's the best practice for collecting UNIX OS metrics?

Ultra Champion

The Splunk Product Best Practices team provided this response. Read more about How Crowdsourcing is Shaping the Future of Splunk Best Practices.

There are several things to consider when choosing which data collection method to use. I'll focus first on the differences among the three apps. From there, consider the users who are consuming the data, the administrative implications, compatibility with premium apps, and how you are currently capturing the data.

Differences

First, let's look at how each solution collects the data.

The Splunk Add-on for Unix and Linux uses scripted inputs, and the data it collects appears in Splunk as event data.

Both the Splunk Add-on for Linux and the Splunk App for Infrastructure use the more lightweight collectd or statsd, and the data they collect appears in Splunk as metrics data. Both collectd and statsd are open source solutions. Some companies have policies against using open source software, which could be a deciding factor against these two options.

To simplify things, I recommend using the Splunk App for Infrastructure instead of the Splunk Add-on for Linux, because the Splunk App for Infrastructure is more feature-rich and provides stronger guidance around installation and the resulting data.

So, that leaves us with just two approaches to consider: the Splunk Add-on for Unix and Linux or the Splunk App for Infrastructure.

Users

Next, consider who's using the data. If those users are familiar with metrics and already have collectd running, then it will be simpler for them to adopt Splunk App for Infrastructure.

Conversely, if the users are long-time UNIX admins, they may find the scripted inputs of Splunk Add-on for Unix and Linux more familiar. That's because they are essentially the output of the same command those UNIX users would run on the terminal.
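To illustrate what "the output of the same command" means, here is a minimal sketch of the idea behind a scripted input: run a command and write its raw output to standard output, where the forwarder picks it up as event data. This is not the add-on's actual code. The function names and the timestamp header format are hypothetical, chosen only for illustration.

```python
import subprocess
import sys
import time


def collect(command):
    """Run a command and return its stdout as a list of lines."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


def emit(command, out=sys.stdout):
    """Write a timestamp header followed by the raw command output.

    The header format is an assumption for this sketch, not the add-on's format.
    Returns the number of output lines written after the header.
    """
    lines = collect(command)
    out.write(f"timestamp={int(time.time())} command={command[0]}\n")
    for line in lines:
        out.write(line + "\n")
    return len(lines)


# Example use on a UNIX host (any command an admin would run by hand works):
#   emit(["uptime"])
```

The point of the sketch is that the event contains the same text an admin would see on the terminal, which is why the resulting data feels familiar to them.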

If the users are not technical at all and really want to use the data without learning much SPL, then they'll appreciate the point-and-click user interface to the metrics data provided by the Splunk App for Infrastructure.

If users intend to use the UNIX performance data along with the performance data from Windows, then either solution will work. The Splunk Add-on for Microsoft Windows collects event data by default, but can be configured to collect that same data as metrics data, so it works well with whichever approach you use for UNIX data.

Administration

Receiving Data: The administration needed to receive data is comparable for the Splunk Add-on for Unix and Linux and the Splunk App for Infrastructure. Both solutions require that you configure an endpoint: either a receiver endpoint for event data, or an HTTP Event Collector endpoint for metrics data. If you already have forwarders deployed, then you have less work to do because a receiver endpoint is already set up.
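For a sense of what an HTTP Event Collector endpoint receives for metrics data, here is a rough sketch that builds one metric data point as JSON. The layout (an "event" of "metric", with "metric_name" and "_value" under "fields") follows my understanding of the HEC metrics format, so verify it against the documentation for your Splunk version. The metric name, host name, and dimension are made up, and build_metric_payload is a hypothetical helper, not part of any Splunk library.

```python
import json
import time


def build_metric_payload(metric_name, value, host, dimensions=None, timestamp=None):
    """Build one metric data point in the JSON shape assumed for HEC metrics."""
    fields = dict(dimensions or {})      # dimensions travel alongside the measurement
    fields["metric_name"] = metric_name  # name of the metric, for example "cpu.idle"
    fields["_value"] = value             # the numeric measurement itself
    return {
        "time": timestamp if timestamp is not None else time.time(),
        "event": "metric",
        "host": host,
        "fields": fields,
    }


# Hypothetical example values.
payload = build_metric_payload(
    "cpu.idle", 87.5, "web01.example.com", {"cpu": "0"}, timestamp=1500000000
)
body = json.dumps(payload)
```

In practice, collectd or statsd produces and sends this data for you, so you normally configure only the HEC token and endpoint rather than write payloads by hand.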

Upgrade: Splunk introduced the metrics feature in Splunk Enterprise 7.0. Therefore, if you are running an older release of Splunk, you'll need to upgrade Splunk Enterprise to use metrics data. If needed, see How to upgrade Splunk Enterprise. By contrast, the event receiver is available in all supported releases of Splunk software, so using event data requires no upgrade.

Universal Forwarder: collectd runs without a Splunk Universal Forwarder installed on the endpoint. Some endpoint owners like that collectd has a smaller installation footprint with less impact on the host's resources. Not having a Universal Forwarder, though, means that you miss out on rich data points, such as those laid out in Source types for the Splunk Add-on for Unix and Linux. To have the best of both worlds, you can use collectd while also running a Universal Forwarder to collect the data points that collectd cannot.

Premium apps

If you use, or intend to use, Splunk IT Service Intelligence or Splunk Enterprise Security, you should go with the event data produced by the Splunk Add-on for Unix and Linux. Both ITSI and Splunk Enterprise Security have features that build upon the data produced by Splunk Add-on for Unix and Linux, but they don't currently make use of the metrics data.

While Splunk IT Service Intelligence's OS module currently uses event data, there is some integration with metrics data for creating entities and alerts. See Integrate the Splunk App for Infrastructure with ITSI for guidance.

How you currently capture data

The way you capture data today may influence how you collect data tomorrow. Here are some things to consider depending on how you currently capture data.

Splunk Add-on for Unix and Linux

If you currently use the Splunk Add-on for Unix and Linux to capture data and are considering switching to the Splunk App for Infrastructure, keep in mind that users may depend on knowledge objects (like dashboards and reports) that are tailored to the event data from the Splunk Add-on for Unix and Linux. You may want to stick with that approach rather than disrupt users, retrofit dashboards and reports, and retrain stakeholders. If you do decide to make the switch to metrics, you can make use of Splunk Service Offerings to help it go smoothly.

Third-party Solutions

If your end users use a third-party solution for collecting OS performance data, they may find any change confusing. In those scenarios, you may elect to continue using the third-party solution to minimize end user confusion. That data may be easier to send to Splunk as event data than as metrics data. Regardless, you can always apply log-to-metrics conversion to convert that event data to metrics data.
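Conceptually, log-to-metrics conversion takes the numeric fields of an event and turns each one into a metric data point, while the remaining fields become dimensions. Splunk does this through its own configuration, not through code like the following. The sketch below only illustrates the idea, and the event format, the "os" prefix, and the function name are all assumptions.

```python
def event_to_metrics(event, prefix="os"):
    """Split a key=value event into metric data points.

    Numeric fields become measurements; non-numeric fields become
    dimensions that are attached to every resulting data point.
    """
    measures, dimensions = {}, {}
    for token in event.split():
        if "=" not in token:
            continue  # ignore tokens that are not key=value pairs
        key, raw = token.split("=", 1)
        try:
            measures[key] = float(raw)
        except ValueError:
            dimensions[key] = raw
    return [
        {"metric_name": f"{prefix}.{key}", "_value": value, **dimensions}
        for key, value in measures.items()
    ]


# Hypothetical event from a third-party collector.
points = event_to_metrics("host=web01 cpu_idle=87.5 mem_used=2048")
```

One event with two numeric fields becomes two metric data points, both of which carry the host dimension.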

Decision time

There's a lot to consider: the differences between the add-ons, your users, administration, premium apps, and how you currently capture data. Let me know if you think of other considerations I didn't mention, or if anything is unclear, so I can keep this updated.

Re: What's the best practice for collecting UNIX OS metrics?

Contributor

Hello, what is the difference in license usage for data ingestion? Will Splunk metrics have a lower footprint than the Splunk Add-on for Unix and Linux?

Re: What's the best practice for collecting UNIX OS metrics?

Ultra Champion

The page How Splunk Enterprise licensing works explains:

For metrics data, each metric event is metered by volume on a scale similar to the scale used for event data. However, this scale is capped at 150 bytes. Metric events with volumes over 150 bytes are metered as if they are only 150 bytes. Metrics data does not use a separate license. Rather, it draws from the same license quota as event data.
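As a rough worked example of that cap, the sketch below compares metered volume for the same set of sizes counted as event data and as metrics data. It assumes that a metric event under the cap is metered at its actual size, which is my reading of "a scale similar to the scale used for event data", not something the quoted page states outright. The sizes are invented for illustration.

```python
METRIC_CAP_BYTES = 150  # cap per metric event, from the quoted licensing page


def metered_metric_bytes(event_sizes):
    """Metered volume for metric events: each one counts for at most 150 bytes."""
    return sum(min(size, METRIC_CAP_BYTES) for size in event_sizes)


def metered_event_bytes(event_sizes):
    """Metered volume for event data: assumed here to be the full size of each event."""
    return sum(event_sizes)


# Hypothetical sizes in bytes for three data points.
sizes = [90, 150, 400]
as_metrics = metered_metric_bytes(sizes)  # 90 + 150 + 150
as_events = metered_event_bytes(sizes)    # 90 + 150 + 400
```

Under these assumptions, the cap only helps when individual data points exceed 150 bytes. It also does not account for one multi-value event becoming several metric data points, which can push the comparison the other way.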

I'll ask around to see if anyone has done a comparison of each sourcetype. My hunch is that they are somewhat comparable, because most event data from the TA is a simple output of values. Of course, over a large enough dataset even the slightest variability can become pronounced.

It's important to remember that metrics will work for the OS metrics but not OS logs. Those will still be event data. So this discussion is only applicable for the OS metrics portion.

Re: What's the best practice for collecting UNIX OS metrics?

Contributor

Thanks for the useful info. Cheers!!

Re: What's the best practice for collecting UNIX OS metrics?

Ultra Champion

I added a note to highlight that statsd and collectd are open source solutions. This may be an issue for some organizations.
