Deployment Architecture

Search job unexpectedly terminated when running on an SHC

mramiro
Engager

For the past few days, after upgrading the infrastructure from 7.3.2 to the latest GA (8.0.5),  I'm having problems when running ad-hoc searches on an SHC. To give you more context about the Splunk infrastructure I'm talking about, I've described it at the end of the post.

Following is the problem I'm facing:

  • When I connect to the SHC using the VIP and I run whatever search, the system raises the following error after 5-10 seconds. I couldn't find any relevant information by looking at the logs.

mramiro_1-1596003231596.png

  • When I connect directly to any of the Search Heads and I run the same search, it runs smoothly without any problem.

I found the following Known Issues (SPL-192057, SPL-188608) that seem to match this behavior. These are pretty recent though, but I can't find which Splunk versions are affected. 

mramiro_0-1596002119535.png

Did anyone face this before? What do you think I should do?

Splunk Infrastructure

  • 3 Search Heads
    • These SH are in a Search Head Cluster (SHC) configured to distribute the searches on both Indexers
    • Load balancer in front of the SHC
  • 2 Indexers
  • 2 Heavy Forwarders + multiple Universal Forwarders
  • 1 Deployment Server
  • 1 Cluster Master
0 Karma
1 Solution

mramiro
Engager

I've managed to solve the problem. It doesn't seem to be related to the Known Issues I've posted. Although the description was a perfect match.

You may double-check the load balancer configuration. As stated in the official docs (https://docs.splunk.com/Documentation/Splunk/6.6.3/DistSearch/UseSHCwithloadbalancers) :

"Configure the load balancer so that user sessions are "sticky" or "persistent." This ensures that the user remains on a single search head throughout their session."

After double-checking, it seemed that it wasn't configured properly. After applying the changes on the load balancer now it works perfectly.

I hope it helps.

 

View solution in original post

0 Karma

mramiro
Engager

I've managed to solve the problem. It doesn't seem to be related to the Known Issues I've posted. Although the description was a perfect match.

You may double-check the load balancer configuration. As stated in the official docs (https://docs.splunk.com/Documentation/Splunk/6.6.3/DistSearch/UseSHCwithloadbalancers) :

"Configure the load balancer so that user sessions are "sticky" or "persistent." This ensures that the user remains on a single search head throughout their session."

After double-checking, it seemed that it wasn't configured properly. After applying the changes on the load balancer now it works perfectly.

I hope it helps.

 

0 Karma

sanjaynathan
Loves-to-Learn

@mramiro  , May i know which LB layer traffic you are using ? Is it layer 7 or different ?

0 Karma
Get Updates on the Splunk Community!

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...