Searches return errors like the ones below. The indexer IPs reported seem to change across different attempts of the same search. The searches were run over long time ranges (15-20 days) against an index with a relatively large number of events.
----
3 errors occurred while the search was executing. Therefore, search results might be incomplete.
Unknown error for indexer: <INDEXER_IP1>. Search Results might be incomplete! If this occurs frequently, check on the peer.
Unknown error for indexer: <INDEXER_IP2>. Search Results might be incomplete! If this occurs frequently, check on the peer.
Server error
----
Inspecting the job shows:
----
warn : Socket error during transaction. Socket error: Success
error : Unknown error for indexer: <INDEXER_IP1>. Search Results might be incomplete! If this occurs frequently, check on the peer.
error : Unknown error for indexer: <INDEXER_IP2>. Search Results might be incomplete! If this occurs frequently, check on the peer.
----
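For reference, the same warn/error messages can be pulled from the search head's REST API instead of the Job Inspector UI. A minimal sketch; the host, credentials, and <SID> below are placeholders for my environment:
----
# Fetch job details (including warn/error messages) for a given search id (sid).
# <SH_HOST> and <SID> are placeholders; the sid also appears in search.log below.
curl -k -u admin:changeme "https://<SH_HOST>:8089/services/search/jobs/<SID>"
----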
Related entries from search.log for one of the indexer IPs:
----
07-29-2020 05:46:53.900 INFO TcpOutbound - Received unexpected socket close condition with unprocessed data in RX buffer. Processing remaining bytes=5792 of data in RX buffer. socket_status="Connection closed by peer" paused=1
07-29-2020 05:47:00.543 ERROR HttpClientRequest - HTTP client error=Success while accessing server=http://<INDEXER_IP1>:8089 for request=http://<INDEXER_IP1>:8089/services/streams/search?sh_sid=1596001558.314297_64CB0758-30F3-4D5E-9CC0-DA1DD06754ED.
07-29-2020 05:47:09.734 WARN SearchResultParserExecutor - Socket error during transaction. Socket error: Success for collector=<INDEXER_IP1>
----
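In case it helps anyone reproduce this, the entries above came from the job's dispatch copy of search.log on the search head. A rough sketch, assuming the default dispatch location and with <SID> as a placeholder:
----
# Scan the job's search.log for the socket/HTTP client errors quoted above.
grep -E "TcpOutbound|HttpClientRequest|SearchResultParserExecutor" \
  "$SPLUNK_HOME/var/run/splunk/dispatch/<SID>/search.log"
----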
From another discussion, I saw this may be related to the ulimit value. However, I'm not seeing any ulimit/thread/socket errors in splunkd.log.
The ulimit -n value on the indexers is 1024 (which I believe is the soft limit), but Splunk uses 100K per the startup log:
----
splunkd.log.3:07-08-2020 10:19:48.050 +0000 INFO ulimit - Limit: open files: 100000 files
----
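To double-check that the running splunkd process really has the 100K limit (ulimit -n in a shell only shows that shell's soft limit, not splunkd's), I looked at /proc on the indexer. A Linux-only sketch, assuming the default splunkd.pid location:
----
# /proc/<pid>/limits shows the soft/hard limits of the live splunkd process.
SPLUNKD_PID=$(head -1 "$SPLUNK_HOME/var/run/splunk/splunkd.pid")
grep "Max open files" "/proc/$SPLUNKD_PID/limits"
----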