Splunk Search

job timeout error - what to do without increasing the receiveTimeout in distsearch.conf

splunk_zen
Builder
06-08-2015 15:41:47.050 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing     https://ip.1:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing     https://ip.2:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing https://ip.4:port/services/streams/search?sh_sid=1433773905.807685)

06-08-2015 15:41:47.056 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx001.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx002.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx003.iggroup.local


06-08-2015 15:41:47.072 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx001.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx002.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx003.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!

06-08-2015 15:42:46.481 INFO  DispatchThread - Download request for search.log from spkidx004.iggroup.local status=200, msg=OK
06-08-2015 15:42:46.522 INFO  DispatchThread - Download request for search.log from spk_bidx001.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.522 ERROR DispatchThread - Failed to download     from 'https://ip_b.1:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.522 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx001.iggroup.local', uri='https://ip_b.14:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.559 INFO  DispatchThread - Download request for search.log from spk_bidx002.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.559 ERROR DispatchThread - Failed to download from 'https://ip_b.3:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.559 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx002.iggroup.local', uri='https://ip_b.27:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.596 INFO  DispatchThread - Download request for search.log from spk_bidx003.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.597 ERROR DispatchThread - Failed to download from 'https://ip_b.4:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.597 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx003.iggroup.local', uri='https://ip_b.45:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.630 INFO  DispatchThread - Download request for search.log from spk_bidx004.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.630 ERROR DispatchThread - Failed to download from 'https://ip_b.2:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.630 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx004.iggroup.local', uri='https://ip_b.57:port', sid='remote_shead001.iggroup.local_1433773905.807685'

The search takes the following form (simplified as much as possible, since the timeout occurs even before the | stats command):

index=a sourcetype=b app.trial ("Complete" OR "Initiated") "finished"

The job inspector shows a runtime of around 603 seconds, which confuses me because other searches I tested kept running for over 1000 seconds.
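
To see at a glance which peers timed out and which ones returned a search.log, the DispatchThread lines above can be summarized with a short script. This is only a minimal sketch: it assumes the log lines are available as plain text, and the function name `summarize` and the two regular expressions are illustrative, matched against the message wording shown in this post rather than any documented Splunk log format.

```python
import re
from collections import defaultdict

# Patterns based on the message wording quoted in this thread (an assumption,
# not a documented format).
TIMEOUT_RE = re.compile(r"Timed out waiting for peer (?P<peer>\S+?)\.\s")
DOWNLOAD_RE = re.compile(
    r"Download request for search\.log from (?P<peer>\S+) status=(?P<status>\d+)"
)


def summarize(lines):
    """Return {peer: {...}} noting timeouts and search.log download status."""
    summary = defaultdict(dict)
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            summary[match.group("peer")]["timed_out"] = True
        match = DOWNLOAD_RE.search(line)
        if match:
            summary[match.group("peer")]["search_log_status"] = int(
                match.group("status")
            )
    return dict(summary)
```

Feeding it the lines of the search head's search.log (for example `summarize(open(path))`, where `path` is wherever that file was saved) gives one entry per peer, which makes it easier to tell whether the timeouts are limited to a subset of indexers.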


lguinn2
Legend

My first thought is that the network connection between the search head and the indexers is slow, flaky, or misconfigured. Based on the messages, I don't think it has anything to do with the search that you are running.
It might also be caused by a flaky or slow DNS service.

I would take a look at these things and make sure the network and DNS are operating properly before you change any settings in distsearch.conf.

Remember that ping, while a useful tool, uses ICMP, which is not the same protocol as an HTTPS/TCP connection. So use it, but a successful ping (or a failed one) does not verify an HTTPS/TCP connection; it only tells you whether the server can be reached.
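
One way to test the actual path rather than ping is to time DNS resolution, the TCP connect, and the TLS handshake from the search head to each indexer. The following is a minimal sketch, not a Splunk tool: `check_peer` is a hypothetical helper, and the port is an assumption, since the logs above only show `port` (8089 is the usual Splunk management port, but check your own configuration). Certificate verification is switched off because the goal here is only to measure reachability and latency.

```python
import socket
import ssl
import time


def check_peer(host, port, timeout=5.0):
    """Time DNS resolution, TCP connect, and TLS handshake for one peer."""
    result = {"host": host, "port": port}

    # Step 1: DNS resolution (slow or failing lookups show up here).
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result
    result["dns_seconds"] = time.monotonic() - start
    address = infos[0][4][:2]  # (ip, port) of the first resolved address

    # Step 2: plain TCP connect to the resolved address.
    start = time.monotonic()
    try:
        sock = socket.create_connection(address, timeout=timeout)
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
        return result
    result["tcp_seconds"] = time.monotonic() - start

    # Step 3: TLS handshake. Verification is disabled on purpose
    # (assumption: only reachability is being tested, not trust).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    start = time.monotonic()
    try:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            result["tls_seconds"] = time.monotonic() - start
            result["tls_version"] = tls.version()
    except (ssl.SSLError, OSError) as exc:
        sock.close()
        result["error"] = f"TLS handshake failed: {exc}"
    return result
```

Running something like `check_peer("spkidx001.iggroup.local", 8089)` from the search head for each indexer, ideally several times, should show whether any one stage is consistently slow or failing. A large `dns_seconds` value would point at the name service, while a slow or failed connect or handshake would point at the network path or the indexer itself.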
