I'm trying to confirm that replication and searching can happen on one NIC while ingesting happens over a different NIC.
I have the following simple test setup:
3 indexers in a cluster, each with 2 NICs
1 master
1 search-head
1 forwarder sending to all three indexers
The search-head is connected to the master, and under Settings > Distributed search > Search peers, or on the command line, I see all three indexers in the cluster:
splunk list search-server
Server at URI "dsplunk-index-test-01.oit.duke.edu:8089" with status as "Up"
Server at URI "splunk-index-test-01-private.oit.duke.edu:8089" with status as "Up"
Server at URI "splunk-index-test-02-private.oit.duke.edu:8089" with status as "Up"
Server at URI "splunk-index-test-03-private.oit.duke.edu:8089" with status as "Up"
But I only see results from one indexer when I search from the web GUI on the search-head, or from its command line.
This is my command line search: splunk search "index=* | chart count by splunk_server"
I'm using the same search in the web GUI, just everything inside the "".
If I run the command-line search on the indexers individually I get results from the specific search-peer.
If I run the command-line search on the master, I get results from all three search-peers.
splunk_server count
-------------------- -----
splunk-index-test-01 57
splunk-index-test-02 39
splunk-index-test-03 456
If I run the command-line search from the search-head I get one result.
splunk_server count
-------------------- -----
splunk-index-test-01 57
If I had configured the search-head incorrectly to the master, I wouldn't see the search peers in the list search-server command results. Or I wouldn't see any results at all. As it is, it makes no sense that one of the 3 indexers shows and the other two don't. Firewalls are all open to the search-head for both NICs on all 3 indexers. I can telnet to port 8089 from the search-head to both NICs on all 3 boxes.
Here's the snippet from server.conf on the search-head:
[clustering]
master_uri = https://splunk-master-test-01.oit.duke.edu:8089
mode = searchhead
pass4SymmKey = $1$7/FK0zLe7w3j3t4lkTuxrXaNBB9vpccQ==
And from the master:
[clustering]
cluster_label = oit
mode = master
pass4SymmKey = $1$bYZ2q5Vu//5VNuiwljjQlH9xYhGBKA==
replication_factor = 2
search_factor = 1
(pass4SymmKeys have been changed)
show cluster-status shows that everything is up and searchable, all green lights.
How do I get my search-head to believe that it actually should be able to see the other search-peers?
Hi, super late to the thread, but isn't this attributable to search filter restrictions on the user's role?
10-13-2017 14:04:14.725 INFO SearchProcessor - Final search filter= ( ( splunk_server=splunk-index-test-01* ) )
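For anyone wanting to check the same thing in their own search.log, here is a minimal Python sketch that pulls out the "Final search filter" lines. The function name and the regex are my own, written against the log line format quoted above, so treat it as an illustration rather than a Splunk tool.

```python
import re

# Matches lines such as:
#   ... INFO SearchProcessor - Final search filter= ( ( splunk_server=splunk-index-test-01* ) )
FILTER_RE = re.compile(r"SearchProcessor - Final search filter=\s*(.*)$")


def final_search_filters(log_text):
    """Return every 'Final search filter' expression found in search.log text."""
    filters = []
    for line in log_text.splitlines():
        match = FILTER_RE.search(line)
        if match:
            filters.append(match.group(1).strip())
    return filters


# Three lines copied from the search.log in this thread.
sample = """\
10-13-2017 14:04:14.725 INFO SearchProcessor - Building search filter
10-13-2017 14:04:14.725 INFO SearchProcessor - Final search filter= ( ( splunk_server=splunk-index-test-01* ) )
10-13-2017 14:04:14.757 INFO UnifiedSearch - base lispy: [ AND index::* splunk_server::splunk-index-test-01* ]
"""

print(final_search_filters(sample))
```

If the printed filter contains a splunk_server= term that was not typed into the search, the srchFilter setting on the user's roles in authorize.conf is a likely place to look.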
10-13-2017 14:04:14.654 INFO dispatchRunner - initing LicenseMgr in search process: nonPro=0
10-13-2017 14:04:14.655 INFO dispatchRunner - registering build time modules, count=1
10-13-2017 14:04:14.655 INFO dispatchRunner - registering search time components of build time module name=vix
10-13-2017 14:04:14.655 INFO dispatchRunner - Splunkd starting (build aa7d4b1ccb80).
10-13-2017 14:04:14.655 INFO dispatchRunner - System info: Linux, splunk-search-head-test-01, 3.10.0-693.2.2.el7.x86_64, #1 SMP Sat Sep 9 03:55:24 EDT 2017, x86_64.
10-13-2017 14:04:14.656 INFO dispatchRunner - Detected 1 (virtual) CPUs, 1 CPU cores, and 975MB RAM
10-13-2017 14:04:14.656 INFO dispatchRunner - Maximum number of threads (approximate): 487
10-13-2017 14:04:14.656 INFO dispatchRunner - Arguments are: "search" "--id=1507917854.41" "--maxbuckets=0" "--ttl=600" "--maxout=500000" "--maxtime=8640000" "--lookups=1" "--reduce_freq=10" "--user=bryn" "--pro" "--roles=admin:user"
10-13-2017 14:04:14.656 INFO dispatchRunner - Getting search configuration data from: /opt/splunk/etc/modules/parsing/config.xml
10-13-2017 14:04:14.662 INFO BundlesSetup - Setup stats for /opt/splunk/etc: wallclock_elapsed_msec=24, cpu_time_used=0.021666, shared_services_generation=2, shared_services_population=1
10-13-2017 14:04:14.665 WARN AuthorizationManager - Capability 'delete_by_keyword' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.665 WARN AuthorizationManager - Capability 'edit_view_html' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.665 WARN AuthorizationManager - Capability 'list_httpauths' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.665 WARN AuthorizationManager - Capability 'rtsearch' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'delete_by_keyword' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'edit_view_html' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'list_httpauths' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'rtsearch' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'delete_by_keyword' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'edit_view_html' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'rtsearch' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.666 WARN AuthorizationManager - Capability 'schedule_search' had value 'disable' - only 'enabled' is valid. Ignoring...
10-13-2017 14:04:14.667 INFO UserManagerPro - Load authentication: forcing roles="admin, user"
10-13-2017 14:04:14.671 INFO SessionManager - auth tokens will be generated with shpooling shared secret
10-13-2017 14:04:14.671 INFO UserManager - Setting user context: splunk-system-user
10-13-2017 14:04:14.671 INFO UserManager - Done setting user context: NULL -> splunk-system-user
10-13-2017 14:04:14.672 INFO UserManager - Unwound user context: splunk-system-user -> NULL
10-13-2017 14:04:14.672 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.672 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.678 INFO dispatchRunner - search context: user="bryn", app="search", bs-pathname="/opt/splunk/etc"
10-13-2017 14:04:14.685 INFO SearchParser - PARSING: search index=*\n| chart count by splunk_server
10-13-2017 14:04:14.689 INFO ISplunkDispatch - Not running in splunkd. Bundle replication not triggered.
10-13-2017 14:04:14.700 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.700 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.725 INFO SearchProcessor - Building search filter
10-13-2017 14:04:14.725 INFO SearchProcessor - Final search filter= ( ( splunk_server=splunk-index-test-01* ) )
10-13-2017 14:04:14.733 INFO SearchOperator:kv - name=EXTRACT-GUID, can_use_re2=0, regex: (?i)(?!=\w)(?:objectguid|guid)\s*=\s*(?<guid_lookup>[\w\-]+)
10-13-2017 14:04:14.733 INFO SearchOperator:kv - name=EXTRACT-SID, can_use_re2=0, regex: objectSid\s*=\s*(?<sid_lookup>\S+)
10-13-2017 14:04:14.735 INFO SearchOperator:kv - name=ad-kv, can_use_re2=0, regex: (?<_KEY_1>[\w-]+)=(?<_VAL_1>[^\r\n]*)
10-13-2017 14:04:14.737 INFO SearchOperator:kv - name=access-extractions, can_use_re2=0, regex: ^(?P<clientip>\S+)\s++(?P<ident>\S+)\s++(?P<user>\S+)\s++\[(?<req_time>[^\]]*+)\]\s++"\s*+(?P<method>[^\s"]++)?(?:\s++(?<uri>(?:(?<uri_domain>\w++://[^/\s"]++))?+(?<uri_path>(?:/++(?<root>(?:\\"|[^\s\?/"])++)/++)?(?:(?:\\"|[^\s\?/"])*+/++)*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*))?)(?:\s++(?P<version>[^\s"]++))*)?\s*+"\s++(?P<status>\S+)\s++(?P<bytes>\S+)(?:\s++"(?<referer>(?:(?<referer_domain>\w++://[^/\s"]++))?+[^"]*+)"(?:\s++"(?<useragent>[^"]*+)"(?:\s++"(?<cookie>[^"]*+)")?+)?+)?(?P<other>.*)
10-13-2017 14:04:14.738 INFO SearchOperator:kv - name=syslog-extractions, can_use_re2=0, regex: \s([^\s\[]+)(?:\[(\d+)\])?:\s
10-13-2017 14:04:14.739 INFO SearchOperator:kv - name=db2, can_use_re2=0, regex: ([A-Z]+) *: (.*?)(?=\n|$| +[A-Z]+ *:)
10-13-2017 14:04:14.739 INFO SearchOperator:kv - name=EXTRACT-extract_spent, can_use_re2=0, regex: (?<spent>\d+)ms$
10-13-2017 14:04:14.740 INFO SearchOperator:kv - name=EXTRACT-1, can_use_re2=0, regex: (?<_KEY_1>\S+)::(?<_VAL_1>\S+)
10-13-2017 14:04:14.742 INFO SearchOperator:kv - name=bracket-space, can_use_re2=0, regex: \[(\S+) (.*?)\]
10-13-2017 14:04:14.744 INFO SearchOperator:kv - name=sendmail-extractions, can_use_re2=0, regex: sendmail\[(\d+)\]: (\w+):
10-13-2017 14:04:14.744 INFO SearchOperator:kv - name=tcpdump-endpoints, can_use_re2=0, regex: (\d+\.\d+\.\d+\.\d+):(\d+) -> (\d+\.\d+\.\d+\.\d+):(\d+)
10-13-2017 14:04:14.744 INFO SearchOperator:kv - name=colon-kv, can_use_re2=0, regex: (?<= )([A-Za-z]+): ?((0x[A-F\d]+)|\d+)(?= |\n|$)
10-13-2017 14:04:14.752 INFO SearchOperator:kv - name=EXTRACT-severity,logger, can_use_re2=0, regex: .*?(?<severity>[A-Z]+) ((?<logger>[^\s]+) \-)*
10-13-2017 14:04:14.753 INFO SearchOperator:kv - name=EXTRACT-collection,category,object, can_use_re2=0, regex: collection=\"?(?P<collection>[^\"\n]+)\"?\ncategory=\"?(?P<category>[^\"\n]+)\"?\nobject=\"?(?P<object>[^\"\n]+)\"?\n
10-13-2017 14:04:14.754 INFO SearchOperator:kv - name=wel-message, can_use_re2=0, regex: (?sm)^(?<_pre_msg>.+)\nMessage=(?<Message>.+)$
10-13-2017 14:04:14.754 INFO SearchOperator:kv - name=wel-col-kv, can_use_re2=0, regex: \n([^:\n\r]+):[ \t]++([^\n]*)
10-13-2017 14:04:14.755 INFO SearchOperator:kv - name=EXTRACT-useragent, can_use_re2=0, regex: userAgent=(?P<browser>[^ (]+)
10-13-2017 14:04:14.755 INFO SearchOperator:kv - name=splunk-service-extractions, can_use_re2=0, regex: (?i)^(?:[^ ]* ){2}(?P<log_level>[^\s]*)\s+\[(?P<requestid>\w+)]\s+(?P<component>[^ ]+):(?P<line>\d+) - (?P<message>.+)
10-13-2017 14:04:14.755 INFO SearchOperator:kv - name=EXTRACT-fields, can_use_re2=0, regex: (?i)^(?:[^ ]* ){2}(?:[+\-]\d+ )?(?P<log_level>[^ ]*)\s+(?P<component>[^ ]+) - (?P<message>.+)
10-13-2017 14:04:14.755 INFO SearchOperator:kv - name=extract_spent, can_use_re2=0, regex: (?P<spent>\d+)ms$
10-13-2017 14:04:14.756 INFO SearchOperator:kv - name=weblogic-code, can_use_re2=0, regex: <BEA-([0-9]+)>
10-13-2017 14:04:14.756 INFO SearchOperator:kv - name=colon-line, can_use_re2=0, regex: ^(\w+)\s*:[ \t]*(.*?)$
10-13-2017 14:04:14.756 INFO SearchOperator:kv - name=was-trlog-code, can_use_re2=0, regex: ] ([a-fA-F0-9]{8})
10-13-2017 14:04:14.757 INFO UnifiedSearch - base lispy: [ AND index::* splunk_server::splunk-index-test-01* ]
10-13-2017 14:04:14.758 INFO UnifiedSearch - Processed search targeting arguments
10-13-2017 14:04:14.758 INFO SortOperator - maxmem = 209715200
10-13-2017 14:04:14.758 INFO SortOperator - maxmem = 209715200
10-13-2017 14:04:14.758 INFO SearchParser - PARSING: prestats count by splunk_server
10-13-2017 14:04:14.758 INFO SearchParser - PARSING: addinfo type=count label=prereport_events
10-13-2017 14:04:14.758 INFO DispatchThread - BatchMode: allowBatchMode: 1, conf(1): 1, timeline/Status buckets(0):0, realtime(0):0, report pipe empty(0):0, reqTimeOrder(0):0, summarize(0):0, statefulStreaming(0):0
10-13-2017 14:04:14.758 INFO DispatchThread - required fields list to add to remote search = prestats_reserved_*,psrsvd_*,splunk_server
10-13-2017 14:04:14.758 INFO SearchParser - PARSING: fields keepcolorder=t "prestats_reserved_*" "psrsvd_*" "splunk_server"
10-13-2017 14:04:14.763 INFO DispatchThread - Did not find a usable summary_id, setting info._summary_mode=none, not modifying input summary_id=49CAB615-276A-428B-972B-FC67E89AEB46_search_bryn_96102898428831f8
10-13-2017 14:04:14.765 INFO DispatchThread - Did not find a usable summary_id, setting info._summary_mode=none, not modifying input summary_id=49CAB615-276A-428B-972B-FC67E89AEB46_search_bryn_NScf8163cdac44f862
10-13-2017 14:04:14.766 INFO DispatchThread - Allow retry on peer failure
10-13-2017 14:04:14.766 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.766 INFO UserManager - Done setting user context: bryn -> bryn
10-13-2017 14:04:14.766 INFO UserManager - Unwound user context: bryn -> bryn
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Stream search: litsearch ( index=* ) ( ( splunk_server=splunk-index-test-01* ) ) | addinfo type=count label=prereport_events | fields keepcolorder=t "prestats_reserved_*" "psrsvd_*" "splunk_server" | prestats count by splunk_server
10-13-2017 14:04:14.766 INFO ExternalResultProvider - No external result providers are configured
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Default search group:*
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Connecting to peer splunk-index-test-01 connectAll 0 connectToSpecificPeer 1
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Connecting to peer splunk-index-test-02 connectAll 0 connectToSpecificPeer 1
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Connecting to peer splunk-index-test-03 connectAll 0 connectToSpecificPeer 1
10-13-2017 14:04:14.766 INFO DistributedSearchResultCollectionManager - Connecting to peer splunk-search-head-test-01 connectAll 0 connectToSpecificPeer 1
10-13-2017 14:04:14.766 INFO ServerConfig - Using REMOTE_SERVER_NAME=splunk-search-head-test-01
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Checking for localhost key pair
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Public key already exists: /opt/splunk/etc/auth/distServerKeys/trusted.pem
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Reading public key for localhost: /opt/splunk/etc/auth/distServerKeys/trusted.pem
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Finished reading public key for localhost: /opt/splunk/etc/auth/distServerKeys/trusted.pem
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Reading private key for localhost: /opt/splunk/etc/auth/distServerKeys/private.pem
10-13-2017 14:04:14.767 INFO KeyManagerLocalhost - Finished reading private key for localhost: /opt/splunk/etc/auth/distServerKeys/private.pem
10-13-2017 14:04:14.768 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-01 in 0.003000 seconds
10-13-2017 14:04:14.770 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-02 in 0.002000 seconds
10-13-2017 14:04:14.772 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-03 in 0.002000 seconds
10-13-2017 14:04:14.772 INFO DispatchThread - Disk quota = 10485760000
10-13-2017 14:04:14.772 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.772 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.772 INFO SearchParser - PARSING: litsearch ( index=* ) ( ( splunk_server=splunk-index-test-01* ) ) | addinfo type=count label=prereport_events | fields keepcolorder=t "prestats_reserved_*" "psrsvd_*" "splunk_server" | prestats count by splunk_server
10-13-2017 14:04:14.784 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.784 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.785 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.785 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.785 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.785 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.793 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.793 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.793 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.793 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.797 INFO SearchParser - PARSING: typer | tags
10-13-2017 14:04:14.798 INFO FastTyper - found nodes count: comparisons=6, unique_comparisons=5, terms=4, unique_terms=4, phrases=12, unique_phrases=12, total leaves=22
10-13-2017 14:04:14.801 INFO UnifiedSearch - Processed search targeting arguments
10-13-2017 14:04:14.801 INFO LocalCollector - Final required fields list = prestats_reserved_*,psrsvd_*,splunk_server
10-13-2017 14:04:14.801 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:14.801 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:14.801 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:14.801 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:14.801 WARN RetryManager - Peer: splunk-search-head-test-01 not found in offset map.
10-13-2017 14:04:15.108 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.109 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.109 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.109 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.109 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.125 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.128 INFO UserManager - Setting user context: bryn
10-13-2017 14:04:15.128 INFO UserManager - Done setting user context: NULL -> bryn
10-13-2017 14:04:15.128 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.133 INFO DispatchThread - Downloading all remote search.log files took 0.005 seconds
10-13-2017 14:04:15.135 INFO DispatchManager - DispatchManager::dispatchHasFinished(id='1507917854.41', username='bryn')
10-13-2017 14:04:15.136 INFO UserManager - Unwound user context: bryn -> NULL
10-13-2017 14:04:15.136 INFO ShutdownHandler - Shutting down splunkd
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Begin"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_JustBeforeKVStore"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_KVStore"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Thruput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_TcpInput1"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_TcpOutput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_UdpInput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_FifoInput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_WinEventLogInput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_HttpInput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Scheduler"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Tailing"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_SyslogOutput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_HTTPOutput"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_TailingXP"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_PeerManager"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_ArchiveAndOneshot"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_AuditTrailManager"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_AuditTrailQueueServiceThread"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_FSChangeMonitor"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_FSChangeManagerProcessor"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_HttpClientPollingThread"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_AsyncQueuedMessageDispatcherThread"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_OfflineFlusher"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Slave"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_SlaveSearch"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Captain"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Select"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_IdataDO_Collector"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_TcpOutput2"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_IndexerService"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Database1"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_LastIndexerLevel"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_TcpInput2"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_LoadLDAPUsers"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_MetricsManager"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Pipeline"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Queue"
10-13-2017 14:04:15.136 INFO ShutdownHandler - shutting down level "ShutdownLevel_Exec"
10-13-2017 14:04:15.137 INFO ShutdownHandler - shutting down level "ShutdownLevel_CallbackRunner"
10-13-2017 14:04:15.137 INFO ShutdownHandler - shutting down level "ShutdownLevel_HttpClient"
10-13-2017 14:04:15.137 INFO ShutdownHandler - Shutdown complete in 972 microseconds
Looks like your search head connects to all three peers:
10-13-2017 14:04:14.768 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-01 in 0.003000 seconds
10-13-2017 14:04:14.770 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-02 in 0.002000 seconds
10-13-2017 14:04:14.772 INFO DistributedSearchResultCollectionManager - Successfully created search result collector for peer=splunk-index-test-03 in 0.002000 seconds
Your job inspector shows that two of the three peers did not contribute data to the search result set.
There were only 8 buckets searched, how much data do you have at rest and what was your search timeframe? Most importantly: Which indexes are you searching by default (index=*)?
It is entirely possible that all primary buckets for your search resided on a single peer.
That is not the case. When I search individually on each indexer from the command line, I get results. When I search from the command line on the master, I get results for each indexer.
Your config looks good to me.
Try index=_* | chart count by splunk_server. Does it also return results from only one indexer on the search head?
Also check that you don't have a search group configured on the SH with only one indexer (run btool on distsearch.conf to check).
If so, just remove it.
From the config, you are not using multisite mode (in which case you could have affinity between the SH and one indexer), so that is unlikely to be the cause.
As you have multiple NICs, your indexers may be reporting an unreachable search IP to the CM.
On each indexer, I would force the search address in server.conf (look for register_search_address = <IP address>).
I would start by setting this.
I rebuilt the VM with the same config, and it has the same exact symptoms. Of course this means that it was disconnected from the master/cluster and re-attached.
/opt/splunk/bin/splunk list search-server
Server at URI "splunk-index-test-01-private.oit.duke.edu:8089" with status as "Up"
Server at URI "splunk-index-test-02-private.oit.duke.edu:8089" with status as "Up"
Server at URI "splunk-index-test-03-private.oit.duke.edu:8089" with status as "Up"
and
/opt/splunk/bin/splunk search "index=* | stats count by splunk_server"
splunk_server count
-------------------- ------
splunk-index-test-01 727141
Interestingly, the search of internal indexes returned nothing at all:
/opt/splunk/bin/splunk search "index=_* | stats count by splunk_server"
root@splunk-search-head-test-01 /opt/splunk/etc/system/local $
There are no distsearch.conf files anywhere that aren't default:
/opt/splunk/etc $ find ./ -name distsearch.conf
./apps/splunk_archiver/default/distsearch.conf
./apps/splunk_management_console/default/distsearch.conf
./system/default/distsearch.conf
I'm really puzzled by this one.
So am I. When you run your search on the SH UI and then look at the job inspector output, do you see your three indexers listed under dispatch.stream.remote?
Any chance you can post the full search.log from your search?
Same old search:
index=* | chart count by splunk_server
which returned:
splunk-index-test-01 | 18443
From the job inspector:
0.14 dispatch.stream.remote 11 - 50,997
0.14 dispatch.stream.remote.splunk-index-test-01 9 - 43,249
0.00 dispatch.stream.remote.splunk-index-test-02 1 - 3,874
0.00 dispatch.stream.remote.splunk-index-test-03 1 - 3,874
and at the end of the job inspector:
searchProviders
[
"splunk-index-test-01",
"splunk-index-test-02",
"splunk-index-test-03",
"splunk-search-head-test-01"
]
searchTotalBucketsCount 8
searchTotalEliminatedBucketsCount 0
sid 1507917854.41
statusBuckets 0
ttl 600
Additional info search.log search.log( splunk-index-test-01 splunk-index-test-02 splunk-index-test-03 )
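A rough Python sketch for reading those dispatch.stream.remote rows per peer. It assumes the five whitespace-separated columns are duration, component, invocations, input count and output count; that column order is my assumption from the job inspector layout, and parse_remote_rows is a made-up helper name.

```python
def parse_remote_rows(text):
    """Map each peer name to its duration, invocations and output count."""
    prefix = "dispatch.stream.remote."
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip the summary row and anything that is not a per-peer row.
        if len(parts) != 5 or not parts[1].startswith(prefix):
            continue
        peer = parts[1][len(prefix):]
        rows[peer] = {
            "duration": float(parts[0]),
            "invocations": int(parts[2]),
            "output": int(parts[4].replace(",", "")),
        }
    return rows


# Rows copied from the job inspector output in this thread.
inspector = """\
0.14 dispatch.stream.remote 11 - 50,997
0.14 dispatch.stream.remote.splunk-index-test-01 9 - 43,249
0.00 dispatch.stream.remote.splunk-index-test-02 1 - 3,874
0.00 dispatch.stream.remote.splunk-index-test-03 1 - 3,874
"""

rows = parse_remote_rows(inspector)
for peer, stats in rows.items():
    print(peer, stats["invocations"], stats["output"])
```

The per-peer values add up to the dispatch.stream.remote totals (11 invocations, 50,997). Peers 02 and 03 each show a single invocation with an identical small output, which suggests they were contacted but returned little or nothing.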
The search.log is attached to the next answer.
I removed the cluster info from the search-head's server.conf and added the peers individually. I'm still only getting one of them, so it isn't the master.
I probably should have mentioned: this was a problem before I added the 2nd NIC. I redid all of the index-clustering pieces (removed [clustering] from server.conf everywhere and re-ran the cluster-config command everywhere) and re-attached the search-head. The connection problem had not really registered with me until after I'd done all of this, but it was definitely there.
There are no distsearch.conf files anywhere except the defaults, on the search-head, indexers, and master. btool is fine with those.
Any indication in splunkd.log that the search head can successfully connect to all search peers?
Also, have you taken a look at the search job inspector, specifically search.log in the UI?
I can telnet to both NICs on all 3 boxes over port 8089.
I inspected this job:
index=*
This is under normalizedSearch:
litsearch ( index=* ) ( ( splunk_server=splunk-index-test-01* ) ) | fields keepcolorder=t "*" "_bkt" "_cd" "_si" "host" "index" "linecount" "source" "sourcetype" "splunk_server" | remotetl nb=300 et=1507753287.000000 lt=1507839687.000000 remove=true max_count=1000 max_prefetch=100
It looks from this like the search-head believes there is only one search-peer.
Further, if I specify the splunk_server in the search:
index=* splunk_server=splunk-index-test-02*
I get "Search filters specified using splunk_server/splunk_server_group do not match any search peer."
Even though that search peer is listed under Distributed search->search peers. If the search-head was unable to see them, they would not show up there.
splunkd.log shows no connection problems.
I did just find "Not connecting to peer 'splunk-index-test-02' because it has been optimized out. Peername and none of it's search groups [] match the query."
Searching docs.splunk.com for either of these phrases gets me nothing. I do wish that the error log wording showed up in the documentation.
I also searched for DistributedSearchResultCollectionManager.
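Both messages quoted above are consistent with a splunk_server=splunk-index-test-01* filter being ANDed onto every search. Here is a simplified Python model of that idea. It is only an illustration of wildcard matching against peer names, not how Splunk actually implements peer selection, and peers_searched is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def peers_searched(patterns, peers):
    """Return the peers whose name matches every splunk_server pattern."""
    return [p for p in peers if all(fnmatchcase(p, pat) for pat in patterns)]


peers = ["splunk-index-test-01", "splunk-index-test-02", "splunk-index-test-03"]

# Filter seen in normalizedSearch: only peer 01 survives.
print(peers_searched(["splunk-index-test-01*"], peers))

# Same filter plus the splunk_server=splunk-index-test-02* typed into the
# search: no peer satisfies both patterns, so nothing is left to search.
print(peers_searched(["splunk-index-test-01*", "splunk-index-test-02*"], peers))
```

Under this model the second case leaves an empty peer list, which lines up with the "do not match any search peer" error even though all three peers are listed as Up.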
This is just a wild hair idea, but if you are replicating data between three indexers and all the data got replicated onto one indexer, then connecting to the other two wouldn't really be necessary, would it?
Highly unlikely, and wouldn't explain the results when running the search on the cluster master.
That's not how index clustering works.
If I stop splunk on the one that I can see, there is no data at all.
Do you by any chance still have a distsearch.conf file on your search head?
Re-reading your original question, it almost sounds like you have connected it to the cluster master AND configured search peers in the distributed search setup....?
Maybe I am misreading...
Strange. If you haven't already, I would try to remove/re-add the search head to the cluster and see if that helps. I'll dig to see if I can find the REST call to get the list of search peers from the cluster master.
Not sure if this will work from your search head, but worth a try:
|rest /services/cluster/master/peers count=0 splunk_server=splunk-master-test-01
I'm not sure what you mean by removing the search-head from the cluster. It's an index cluster, the search-head is a singleton.
I'd appreciate that REST call.