Splunk Search

job timeout error - what to do without increasing the receiveTimeout in distsearch.conf

splunk_zen
Builder
06-08-2015 15:41:47.050 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing https://ip.1:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing https://ip.2:port/services/streams/search?sh_sid=1433773905.807685)
06-08-2015 15:41:47.051 ERROR HttpClientRequest - HTTP client error: Read Timeout (while accessing https://ip.4:port/services/streams/search?sh_sid=1433773905.807685)

06-08-2015 15:41:47.056 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx001.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx002.iggroup.local
06-08-2015 15:41:47.057 WARN  SearchResultParserExecutor - Socket error during transaction. Timeout error. for collector=spkidx003.iggroup.local


06-08-2015 15:41:47.072 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx001.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx002.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!
06-08-2015 15:41:47.075 ERROR DispatchThread - sid:1433773905.807685 Timed out waiting for peer spkidx003.iggroup.local.  If this occurs frequently, receiveTimeout in distsearch.conf may need to be increased. Search results might be incomplete!

06-08-2015 15:42:46.481 INFO  DispatchThread - Download request for search.log from spkidx004.iggroup.local status=200, msg=OK
06-08-2015 15:42:46.522 INFO  DispatchThread - Download request for search.log from spk_bidx001.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.522 ERROR DispatchThread - Failed to download from 'https://ip_b.1:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.522 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx001.iggroup.local', uri='https://ip_b.14:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.559 INFO  DispatchThread - Download request for search.log from spk_bidx002.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.559 ERROR DispatchThread - Failed to download from 'https://ip_b.3:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.559 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx002.iggroup.local', uri='https://ip_b.27:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.596 INFO  DispatchThread - Download request for search.log from spk_bidx003.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.597 ERROR DispatchThread - Failed to download from 'https://ip_b.4:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.597 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx003.iggroup.local', uri='https://ip_b.45:port', sid='remote_shead001.iggroup.local_1433773905.807685'
06-08-2015 15:42:46.630 INFO  DispatchThread - Download request for search.log from spk_bidx004.iggroup.local status=404, msg=Not Found
06-08-2015 15:42:46.630 ERROR DispatchThread - Failed to download from 'https://ip_b.2:port/services/search/jobs/remote_shead001.iggroup.local_1433773905.807685/search.log'
06-08-2015 15:42:46.630 WARN  DispatchThread - Failed to download search.log from remote peer 'spk_bidx004.iggroup.local', uri='https://ip_b.57:port', sid='remote_shead001.iggroup.local_1433773905.807685'

The search takes the following form (simplified as much as possible, since the timeout happens even before the | stats command):

index=a sourcetype=b app.trial ("Complete" OR "Initiated") "finished"

The job inspector shows a runtime of around 603 seconds,
which confuses me because other searches I tested kept running for over 1000 seconds.


lguinn2
Legend

My first thought is that the network connection between the search head and the indexers is slow, flaky, or misconfigured. Based on the messages, I don't think it has anything to do with the search that you are running.
A flaky or slow DNS service might also be the cause.

I would take a look at these things and make sure the network and DNS are operating properly before you change any settings in distsearch.conf.
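As a rough way to check the DNS side, something like the following could be run from the search head. It is only a sketch using the Python standard library: `time_dns_lookup` is a name made up for this example, and the `resolver` parameter exists purely so the function can be exercised without a live DNS server.

```python
import socket
import time


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Time one name lookup and return (elapsed_seconds, addresses).

    addresses is a sorted list of resolved IPs, or None if the
    lookup failed. `resolver` defaults to the system resolver.
    """
    start = time.perf_counter()
    try:
        infos = resolver(hostname, None)
        # Each getaddrinfo entry ends with a sockaddr tuple whose
        # first element is the IP address.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addresses = None
    elapsed = time.perf_counter() - start
    return elapsed, addresses
```

Calling `time_dns_lookup("spkidx001.iggroup.local")` a few times in a row for each peer named in the log should show whether lookups are consistently fast or occasionally stall. Note that the operating system may cache results, so repeated calls can look faster than a cold lookup.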

Remember that ping, while a useful tool, uses ICMP, which is not the same protocol as an HTTPS/TCP connection. So use it, but a successful ping shows only that the server can be reached; it does not verify that an HTTPS/TCP connection works.
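To test the TCP path itself rather than ICMP, a small connect check along these lines may help. Treat it as a sketch: `check_tcp` is a hypothetical helper, and the port is an assumption (the logs above anonymize it as `port`; 8089 is the default splunkd management port, but your deployment may differ). It only confirms that a TCP connection can be opened within the timeout; it does not perform the TLS handshake or an HTTP request.

```python
import socket
import time


def check_tcp(host, port, timeout=5.0):
    """Attempt a TCP connection and report how long it took.

    Returns (ok, elapsed_seconds, error_text). error_text is an
    empty string on success.
    """
    start = time.perf_counter()
    try:
        # create_connection resolves the name and connects, raising
        # OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.perf_counter() - start, ""
    except OSError as exc:
        return False, time.perf_counter() - start, str(exc)
```

For example, `check_tcp("spkidx001.iggroup.local", 8089)` run from the search head, repeated over a few minutes, would show whether connects are slow or intermittently failing even when ping looks fine.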
