We are a large Splunk customer. We hit a significant issue when upgrading from 9.1.7 to 9.2.4, and it took a long time to resolve.
We have a large stack with many indexers. Our current operating system is Red Hat 7; we are in the process of migrating to Red Hat 8.
On the upgrade from 9.1.7 to 9.2.4, the indexer cluster that ingests the most data suddenly had its aggregation and parsing queues sitting at 100% full during our peak logging hours. The indexers were not using much more CPU or memory; the queues were simply full.
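If you want to check whether your own queues are backing up, a search along these lines against metrics.log will chart the fill ratio. This is just a sketch; the queue names and size fields are the standard metrics.log ones, so adjust for your environment:

    index=_internal source=*metrics.log* group=queue (name=aggqueue OR name=parsingqueue)
    | eval fill_pct=round(current_size_kb/max_size_kb*100,1)
    | timechart span=5m max(fill_pct) by name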
It turns out that Splunk enabled profiling by default starting in 9.2, specifically CPU time profiling. These settings are controlled in limits.conf: https://docs.splunk.com/Documentation/Splunk/9.2.4/Admin/Limitsconf. There are six new profiling settings, and they are all enabled by default.
In addition, agg_cpu_profiling makes a lot of time-of-day calls. A lot.
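If you want to turn the profiling off, the booleans go in limits.conf. A sketch of what that could look like, assuming the six setting names from the 9.2 limits.conf spec linked above (verify against the docs for your version before pushing this out):

    # $SPLUNK_HOME/etc/system/local/limits.conf on the indexers
    # setting names assumed from the 9.2 limits.conf spec; confirm before deploying
    [default]
    agg_cpu_profiling = false
    regex_cpu_profiling = false
    msp_cpu_profiling = false
    mp_cpu_profiling = false
    lb_cpu_profiling = false
    clb_cpu_profiling = false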
There are several choices of clocksource in Red Hat: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux_for_real_time/7/html/reference_gui...
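You can see which clocksource the kernel is currently using, and which ones are available, from sysfs:

    cat /sys/devices/system/clocksource/clocksource0/current_clocksource
    cat /sys/devices/system/clocksource/clocksource0/available_clocksource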
It turns out that we had set our clock source to "hpet" some years ago. That clocksource, while high precision, is much slower to read than "tsc". Once we switched to tsc, the problem with our aggregation and parsing queues filling to 100% during peak hours was fixed.
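For reference, the switch can be tried at runtime and then made persistent via a kernel boot parameter. These are the standard RHEL mechanisms, but test on a non-production indexer first:

    # runtime change (reverts on reboot)
    echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource

    # persistent across reboots: add clocksource=tsc to the kernel command line
    grubby --update-kernel=ALL --args="clocksource=tsc"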
Even if you don't have the clock source issue, the new default-on profiling is something to be aware of when upgrading to 9.2.