What is the best way to migrate Windows performance monitoring from event-based to metrics-based data?

sloshburch
Splunk Employee

I have a mixed *nix and Windows environment and I'm currently collecting the Windows data with the Splunk Add-on for Microsoft Windows as event data. I want to start using the Splunk App for Infrastructure as a platform-agnostic way to monitor my infrastructure using metrics data.

I have several groups of users who have Windows infrastructure dashboards, reports, and alerts that use log event data from the Splunk Add-on for Microsoft Windows. What's the best way to migrate my Windows performance monitoring from event-based to metrics-based data?

1 Solution

sloshburch
Splunk Employee

The Splunk Product Best Practices team provided this response. Read more about How Crowdsourcing is Shaping the Future of Splunk Best Practices.

Great question.

Since you have dashboards, reports, and alerts (knowledge objects) built from the event-based data, you will need time to reproduce those knowledge objects using metric-based equivalents. This means you need to collect BOTH log event and metrics data formats until you have sufficiently switched over your knowledge objects.

While that sounds easy enough, there are some quirky details to consider:

  • Parallel data ingestion can consume excessive license and storage. Transitioning groups of users methodically mitigates license and storage overages and makes the Splunk platform more efficient.
  • Metrics data cannot be saved in a log event index type. You need a separate metrics index.
  • The input mechanisms for log event and metrics data are nearly equivalent. You need to tweak the inputs.conf file to create subtly different input stanzas.
  • Trending and historical comparison is only available with historical data. Convert your historical log event data to historical metrics format to preserve its value for trending.
  • Premium Apps are not yet compatible with metrics data format. For those apps, you'll still need data indexed in log event format.

Let's explore each of these in more detail...

Mitigate license and storage strain during parallel data ingestion

While you are collecting both data types, each one consumes license and retention storage. To mitigate the impact, you can migrate groups of Windows servers based on the users of the resulting performance data. As @gjanders highlights, you can get a better understanding of the licensing differences between event and metrics data from How Splunk Enterprise licensing works.

For example, rather than generating both data types for the entire infrastructure, you could choose to cut over the IT Ops team only. Once they have switched all their knowledge objects (searches, alerts, dashboards, and so on) to metrics, you can disable the log event-based performance data for their servers and repeat the process with the next group.

Store metrics data in its own index

The default data type for Splunk indexes is the event data type. Event data type indexes can only store event data. Indexes configured with the metric data type can only store metrics data.

To collect data from both data structures, you'll need an index of each type.

Fortunately, it's easy to create a metrics index: just select 'metrics' as the index data type when you create a new index. For more about how to create a metrics index, see Create metrics indexes in the Managing Indexers and Clusters of Indexers manual.
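If you manage indexes through configuration files rather than Splunk Web, a metrics index can be sketched in indexes.conf as shown here. This is a minimal illustration, not the only valid layout: the index name windows_metrics and the paths are assumptions chosen for this example, while datatype = metric is the setting that makes the index a metrics index.

```ini
# indexes.conf (sketch; the index name and paths are hypothetical)
[windows_metrics]
# Without this setting the index defaults to the event data type
datatype = metric
homePath = $SPLUNK_DB/windows_metrics/db
coldPath = $SPLUNK_DB/windows_metrics/colddb
thawedPath = $SPLUNK_DB/windows_metrics/thaweddb
```

Your existing event index stays as it is, so both data types can be collected side by side during the transition.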

To validate that your metrics are getting indexed properly, get familiar with how to search and monitor metrics in the Splunk Enterprise Metrics manual.

Create separate stanzas for log event and metrics data in inputs.conf

The Windows OS generates performance data from the same input type (perfmon) regardless of whether Splunk indexes it as event log or metrics data (see Monitor Windows performance in the Getting Data In manual on Splunk docs for details). In contrast, the Unix OS generates performance data using scripted inputs for event logs, and collectd for metrics.

When the Windows OS perfmon data arrives at Splunk (indexer or heavy forwarder), Splunk requires special configurations to convert the raw data from log event to metrics format. The Splunk Add-on for Infrastructure provides the necessary configuration for free (see the associated documentation for implementation instructions). The Splunk Add-on for Infrastructure applies this transformation to any data that arrives with a source value matching Perfmon:*. This design could cause problems for those who need perfmon event data. See Why does Windows perfmon event data stop working after adding the Splunk Add-on for Infrastructure? for more details.

You must create different inputs.conf stanzas in order to collect the log event and metrics data in parallel. While these stanzas may appear nearly identical, it's critical that they have unique names, because Splunk merges all configurations that share a stanza name. Using a unique name for each data type ensures that Splunk correctly handles the attributes that differ, such as interval, index, and _meta. For example, you could name the event data stanza [perfmon://CPU] and the metrics one [perfmon://CPU Load] or [perfmon://metric-cpu].

The topic Manually configure metrics and log collection for Windows on Splunk App for Infrastructure in the Administer Splunk App for Infrastructure manual provides specific guidance and examples, even if you're not using the Splunk App for Infrastructure. When you do start using the Splunk App for Infrastructure, add the _meta stanza attribute with the dimension entity_type::Windows_Host. This signals which metrics are Windows related. See Add Perfmon objects in inputs.conf in the Administer Splunk App for Infrastructure manual for details.
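As a rough sketch of what parallel collection could look like, the two stanzas below collect the same Processor counters under different stanza names. The stanza names and the entity_type::Windows_Host dimension come from the discussion above; the index names (perfmon, windows_metrics), the counter list, and the interval values are assumptions for illustration only. Check the linked documentation for the full set of attributes your version expects.

```ini
# inputs.conf (sketch; index names, counters, and intervals are hypothetical)

# Existing event-based collection, left in place during the transition
[perfmon://CPU]
object = Processor
counters = % Processor Time; % User Time
instances = *
interval = 10
index = perfmon
disabled = 0

# Metrics-based collection: unique stanza name, so Splunk does not merge
# it with the event stanza above
[perfmon://metric-cpu]
object = Processor
counters = % Processor Time; % User Time
instances = *
interval = 60
# Must point at an index created with the metric data type
index = windows_metrics
# Dimension used by the Splunk App for Infrastructure to identify Windows hosts
_meta = entity_type::Windows_Host
disabled = 0
```

Once a group of users has moved all its knowledge objects to metrics, setting disabled = 1 on the event stanza for that group's servers stops the duplicate ingestion.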

Convert historical log event data to historical metrics data

To preserve the ability to do long-term trending and performance comparisons with your newly converted metrics-based environment, you can convert historical data from event-based to metrics-based. Use the mcollect and meventcollect commands on a search of the event-based historical data to convert it to metrics and save it to a metrics-based index.

As @gjanders points out, at the time of this writing, the mcollect and meventcollect commands have no additional impact on licensing. The license was consumed when the original log event data was indexed. This behavior mirrors that of the collect command and the summary indexing feature.

Continue indexing in event log format for premium apps

At the time of this posting, Splunk Enterprise Security's knowledge objects depend on event data. Splunk IT Service Intelligence includes many modules that use event data, but version 4.1.2 includes capabilities to define KPIs with metrics data.
Refer to the documentation for the most up-to-date and officially supported guidance on metrics use.

Ranazar
Path Finder

I see that in v7.0.0 of the Splunk Add-on for Windows, the addon includes the config necessary to convert perfmon values to metrics (source).

Looks like the default collection method is multikv, but if you change it to single then there are configs in the props.conf and transforms.conf to convert the resulting value to metric form.

0 Karma

dstaulcu
Builder

Keep a careful eye on CPU utilization changes in your IDX tier as you transition. It was a very sad day when we had to scale back our perfmon to metrics deployment to fall within financial constraints.

samprog1816
Explorer

Great work!
Hoping my comments are helpful if someone wants to reduce the licensing cost of perfmon data:

I have experimented with migrating perfmon logs from the standard event format to metrics and to multikv. Both new formats showed better results than the generic event format in different areas, i.e., licensing cost, storage reduction, and search performance.

Metrics: Performs better at search time, but it is more expensive than the regular event format because each metric is measured at a standard 150 bytes when indexed. As a result, the metrics format consumed more license than event data.

MultiKv: mode=multikv is more efficient and granular than regular event data, with around 60% less license consumption than the other two formats, because all the counters are combined in a single log.

After playing with all three formats I've decided to move forward with multikv, but I'm having a hard time understanding how to migrate historic data from the generic format to multikv.
Has anyone tried converting historic perfmon event data into multikv?

0 Karma

woodcock
Esteemed Legend

This is no longer true. As of 7.? Metrics events below 150 bytes are metered at actual size so metrics will now ALWAYS be cheaper on license than non-metrics, especially when stacked.

0 Karma

gjanders
SplunkTrust

Since the original time of writing, licensing for metrics has changed: as of 7.3 and above it is capped at 150 bytes per metric, so you can end up with the same license cost for metrics! (or more/less in some circumstances)

0 Karma

samprog1816
Explorer

As I remember from reading through the docs, after 7.3 licensing for metrics is capped at 100 bytes per event, which seems to be a game changer. But we don't have a short ETA to get to 7.3 or later, which makes me think about converting the standard event format to multikv.

0 Karma

gjanders
SplunkTrust

It is worth mentioning that metrics are licensed differently. Refer to the licensing documentation; at the time of writing, each metric was counted as 150 bytes.

Furthermore, mcollect / meventcollect has no licensing implications within Splunk, as the license usage was calculated when the events were indexed as events. There is no additional charge to translate them into metrics.

0 Karma

sloshburch
Splunk Employee

@gjanders Great points! I just added those details to the answer. Keep your eyes peeled for some sweet sweet karma for your contribution!

sloshburch
Splunk Employee

Updated the answer to account for improvements to ITSI that support metrics!

0 Karma

nickhills
Ultra Champion

Great answer to a great question. Thanks guys!

If my comment helps, please give it a thumbs up!
0 Karma