Effect of restarting splunk service on indexers when the indexing pipelines/queues are almost full

splunker12er
Motivator

On my 3 indexers (which are in a cluster), the typing queue and indexing queue sometimes become almost full (>90% or even 100%),
and the indexing rate on those indexers drops (e.g. to 300 KB/sec, where it is normally >3 MB/sec).
After I restart the Splunk service on all my indexers, everything goes back to normal (the indexing rate recovers, the queues are cleared, etc.).

How does restarting the Splunk service actually restore performance in this case?

  • Does the restart actually index the data that was in the (full) queues, i.e. with no data loss?
  • Or does it clear the queues (the data is wiped and never indexed), which improves the indexing rate for new incoming logs, i.e. with data loss?

Is it recommended to restart the indexers (rolling restart) when the queues/pipelines are full?

Thanks.

gjanders
SplunkTrust

In Alerts for Splunk Admins, I have the alert "AllSplunkLevel - Data Loss on shutdown" (available on GitHub), which detects this issue on shutdown of a forwarder. I found that the message "Forcing TcpOutputGroups to shutdown after timeout" results in data loss from a forwarder to an indexer.

I'd suspect that if you see something similar on an indexer with the keyword "forcing", you might have an issue.
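
As a rough illustration of that keyword check, here is a minimal Python sketch that filters log lines for "forcing". The function name `find_forced_shutdown` and the sample lines are made up for the example; only the "Forcing TcpOutputGroups to shutdown after timeout" message text comes from this thread, and whether an indexer logs anything comparable is an assumption you would need to verify in your own splunkd.log.

```python
from typing import Iterable, List


def find_forced_shutdown(lines: Iterable[str], keyword: str = "forcing") -> List[str]:
    """Return the log lines that contain the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


# Illustrative lines only; real splunkd.log entries carry timestamps,
# log levels and component names that are not reproduced here.
sample_log = [
    "... Forcing TcpOutputGroups to shutdown after timeout",
    "... some unrelated shutdown message",
]

hits = find_forced_shutdown(sample_log)
for hit in hits:
    print(hit)
```

To run it against a real file, pass an open file handle instead of the sample list, e.g. `find_forced_shutdown(open(path))`, where `path` points at your own log file.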

Do you see anything about forcing shutdown? The other way to check would be metrics.log just before shutdown: are the queue sizes approximately zero, or close to it, before shutdown?
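
A minimal sketch of that metrics.log check in Python, assuming the queue lines follow the usual `group=queue` layout with `name`, `max_size_kb` and `current_size_kb` fields. The helper name `queue_fill` and the two sample lines are invented for illustration and are not taken from a real system.

```python
import re
from typing import Dict, Optional

# key=value pairs as they appear in metrics.log lines (comma separated)
KV = re.compile(r"(\w+)=([^,\s]+)")


def queue_fill(line: str) -> Optional[Dict[str, object]]:
    """Return queue name and fill percentage for a group=queue line, else None."""
    if "group=queue" not in line:
        return None
    fields = dict(KV.findall(line))
    try:
        max_kb = float(fields["max_size_kb"])
        cur_kb = float(fields["current_size_kb"])
    except (KeyError, ValueError):
        return None  # line lacks the size fields this sketch relies on
    if max_kb <= 0:
        return None
    return {"name": fields.get("name", "?"), "fill_pct": 100.0 * cur_kb / max_kb}


# Made-up sample lines in the assumed format
sample = [
    "03-01-2024 10:15:00.000 +0000 INFO  Metrics - group=queue, name=typingqueue, "
    "max_size_kb=500, current_size_kb=499, current_size=1200, largest_size=1300, smallest_size=900",
    "03-01-2024 10:15:00.000 +0000 INFO  Metrics - group=queue, name=indexqueue, "
    "max_size_kb=500, current_size_kb=0, current_size=0, largest_size=1, smallest_size=0",
]

results = [r for r in (queue_fill(line) for line in sample) if r is not None]
for r in results:
    print(f"{r['name']}: {r['fill_pct']:.1f}% full")
```

If the last few queue entries before the shutdown show fill percentages near zero, the queues had most likely drained; values near 100% right before shutdown would be a reason to look more closely for lost data.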

To answer your question: no, you should fix the root cause rather than restarting.

gcusello
SplunkTrust

Hi splunker12er,
Probably there are some scheduled searches that are very expensive to execute. Remember that every search takes a CPU core and releases it only when it finishes, so if you have many searches with many subsearches, all your CPUs are taken by those searches and you don't have enough CPUs left for indexing. When you restart the indexers, these searches are stopped, so after the restart your indexers are less loaded and able to index correctly.
I suggest using the Monitoring Console to see the CPU load on your servers.

Bye.
Giuseppe

splunker12er
Motivator

I have no scheduled searches in my environment, only ad-hoc searches run by users (and very few of those).

I would just like to know: does Splunk clear/wipe the queues during the restart, which would mean data loss?
Or how does it actually improve the indexing rate after a restart?

Simply put: what happens to the queued data during the restart?

gcusello
SplunkTrust

No, what I said is that after a restart there are no pending searches, so your system can devote its resources mainly to indexing.
Anyway, use the Monitoring Console to monitor system load and queues.
Bye.
Giuseppe
