Fixing 502 errors when front-ending Search Heads with an AWS application load balancer?

akira_splunk
Splunk Employee

We have a Splunk deployment in AWS, with our Search Head Cluster front-ended by an ALB (not ELB). Users frequently get a "502 Bad Gateway" page, which usually goes away after a refresh or two. Has anyone else seen this and figured out how to fix it?

1 Solution

joeydenbroeder
Explorer

I've been able to resolve this issue by disabling HTTP/2 on our ALB. We're running Splunk Enterprise 7.1.0 and were seeing 73286 requests per hour, of which 1205 were ELB 502 errors (about 1.6%). Directly after disabling HTTP/2 on the ALB, we are seeing 0 ELB 502 errors.

Next question is: Why does it break?
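
For reference, the share of failing requests can be recomputed from the two counts quoted above. This is only a minimal sketch; the function name error_rate_percent is made up for illustration.

```python
def error_rate_percent(errors: int, total: int) -> float:
    """Return errors as a percentage of total requests."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Counts quoted in the post: 1205 ELB 502 errors out of 73286 requests per hour.
rate = error_rate_percent(1205, 73286)
print(f"{rate:.2f}%")
```

That works out to roughly 1.6% of requests failing before HTTP/2 was disabled.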


ivohechmann
Explorer

Hi,

I had the same issue. Also check busyKeepAliveIdleTimeout in server.conf. We had a value of 12, which did not match the idle timeout of 60 on the AWS ALB. Setting the value to 65 in server.conf resolved all of our HTTP 502 errors.
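
A quick way to spot this mismatch is to compare the two values directly. The sketch below is only an illustration under a few assumptions: that busyKeepAliveIdleTimeout lives in the [httpServer] stanza of server.conf, that 12 applies when the setting is absent, and that Python's configparser can read the file (Splunk .conf files are INI-like, but not every file will parse cleanly). The sample text and the function name are hypothetical.

```python
import configparser

ALB_IDLE_TIMEOUT_S = 60  # ALB idle timeout mentioned in the post above

# Hypothetical server.conf excerpt; the [httpServer] stanza is an assumption.
SAMPLE_SERVER_CONF = """
[httpServer]
busyKeepAliveIdleTimeout = 12
"""


def keepalive_covers_alb(conf_text: str, alb_idle_timeout_s: int = ALB_IDLE_TIMEOUT_S) -> bool:
    """Return True if Splunk keeps idle connections open at least as long as the ALB."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Fall back to 12 (assumed default) when the stanza or setting is missing.
    keepalive = parser.getint("httpServer", "busyKeepAliveIdleTimeout", fallback=12)
    return keepalive >= alb_idle_timeout_s


print(keepalive_covers_alb(SAMPLE_SERVER_CONF))  # 12 < 60, so the ALB may reuse a closed connection
```

If the backend closes an idle connection before the ALB does, the ALB can send a request down a connection that no longer exists, which surfaces to the user as a 502.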

Regards,

ivo

dillencehsu
Path Finder

This worked.

In the end, I did the following and resolved the 502 errors (including the "Server Error" message shown after a search) with an AWS ALB.

  • Set the same value (60 seconds) for busyKeepAliveIdleTimeout and the ALB connection idle timeout.
  • Disabled HTTP/2 on the ALB.

This worked for search head clusters on both Splunk 7.3.3 and 8.2.8.
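
The two ALB-side changes above can also be scripted. The following is a hedged sketch using boto3, not a tested deployment script: the helper names and the load balancer ARN are placeholders, and it assumes boto3 is installed and AWS credentials are configured. The attribute keys routing.http2.enabled and idle_timeout.timeout_seconds are the load balancer attributes ALB exposes for these settings.

```python
def build_alb_attributes(idle_timeout_s: int = 60, http2_enabled: bool = False) -> list:
    """Build the attribute list for the two ALB settings discussed above."""
    return [
        {"Key": "routing.http2.enabled", "Value": "true" if http2_enabled else "false"},
        {"Key": "idle_timeout.timeout_seconds", "Value": str(idle_timeout_s)},
    ]


def apply_alb_attributes(load_balancer_arn: str, attributes: list) -> dict:
    """Apply the attributes to an ALB (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("elbv2")
    return client.modify_load_balancer_attributes(
        LoadBalancerArn=load_balancer_arn,
        Attributes=attributes,
    )


attrs = build_alb_attributes(idle_timeout_s=60, http2_enabled=False)
print(attrs)
# To apply, pass your own ALB ARN (placeholder shown):
# apply_alb_attributes("arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/app/NAME/ID", attrs)
```

The same settings are available in the console under the load balancer's attributes, so the script is only a convenience when several ALBs need the change.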

 

I also found and verified that the following setup resolves the ALB 502 errors: put an NLB in front, with the ALB as its target group (NLB -> ALB).

  • Keep the default values for busyKeepAliveIdleTimeout and the ALB connection idle timeout.
    There is no need to change any timeout settings.

Ref. https://repost.aws/ja/knowledge-center/alb-static-ip


chrisboy68
Contributor

It's 09/2021 and this solved our 502 issues too. Thank you!


jtrujillo
Path Finder

This worked for me.


akira_splunk
Splunk Employee

My customer has confirmed that this also worked for them. Great find, @joeydenbroeder.


japposadas
Explorer

How do I check this?


itradeclayton
Path Finder

@akira were you able to fix this? I'm having the same issue.


nickdewijer
Explorer

I'm seeing the same situation after migrating to AWS.

We're running Splunk Enterprise 7.0.2.

We have 3 search heads, clustered behind an ALB. We see about 31468 requests per hour, and 217 of those are 5XX errors on the ELB (about 0.7%).

We are also seeing that after we run a (successful) search, once it has finished and settled, a "server error" message appears below the query bar. All results are there and the page works fine, but it is odd.


vguptadevops
New Member

I am having the same issue, except that I cannot get a Splunk Web page to load successfully at all.
I have configured my environment with an ALB and 3 search heads behind it.

The flow is: user browser (HTTPS) ---> ALB listening on 443 ---> forward to a target group that uses protocol HTTPS and port 8000 for the backend servers.

All search heads are configured to use HTTPS, but I get 502 Bad Gateway every time. I enabled access logs on the ALB, and here is what I get.

h2 2018-03-19T17:34:59.318960Z app/Splunk-SearchHead-ELB/d24e3730216c0f34 37.228.224.60:34058 10.11.2.83:8000 -1 -1 -1 502 - 95 208 "GET https://splunk-searchhead-elb-985980458.us-east-1.elb.amazonaws.com:443/en-US/static/@01A10D5DE1BF7B... HTTP/2.0" "Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:us-east-1:542993520366:targetgroup/SHTargetGroup/4afe5809d39a7bac "Root=1-5aaff4c3-87775f57ad096cf7cad703d8" "splunk-searchhead-elb-985980458.us-east-1.elb.amazonaws.com" 
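
For anyone reading entries like this one, a small parser can make the relevant fields easier to see. This is a rough sketch: the field order follows the documented ALB access log layout, the sample line is a shortened stand-in for the entry above (the host and user agent are placeholders), and the helper names are made up.

```python
import shlex

# Leading fields of an ALB access log entry, in order.
FIELDS = [
    "type", "time", "elb", "client", "target",
    "request_processing_time", "target_processing_time", "response_processing_time",
    "elb_status_code", "target_status_code", "received_bytes", "sent_bytes",
    "request", "user_agent",
]

# Shortened, partly hypothetical sample modeled on the entry above.
SAMPLE = (
    "h2 2018-03-19T17:34:59.318960Z app/Splunk-SearchHead-ELB/d24e3730216c0f34 "
    "37.228.224.60:34058 10.11.2.83:8000 -1 -1 -1 502 - 95 208 "
    '"GET https://alb.example.com:443/en-US/static/ HTTP/2.0" "Mozilla/5.0"'
)


def parse_alb_entry(line: str) -> dict:
    """Split a log line on spaces (respecting quotes) and label the leading fields."""
    return dict(zip(FIELDS, shlex.split(line)))


def target_never_answered(entry: dict) -> bool:
    """True when the ALB returned 502 without receiving a status from the target."""
    return entry["elb_status_code"] == "502" and entry["target_status_code"] == "-"


entry = parse_alb_entry(SAMPLE)
print(entry["type"], entry["elb_status_code"], entry["target_status_code"])
print(target_never_answered(entry))
```

In the entry above, the type is h2 (HTTP/2 over TLS), the three processing times are -1, and the target status is "-", which suggests the ALB generated the 502 itself because it never received a response from the search head at 10.11.2.83:8000.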

hardikJsheth
Motivator

Hi @akira,

We have used this setup in our environment and it works fine. I think the issue is with the AWS configuration. You should start by checking your network configuration on AWS.
