How to get list of buckets which are having issues in replicating, from API and CLI?

shivanshsingh
Explorer

When my Splunk multi-site indexer cluster comes up, some buckets belonging to _audit and _internal fail to replicate, so the Indexer Clustering dashboard on the Cluster Master shows "Replication Factor not met". I can see the bucket names on the dashboard page by clicking the bucket status button. When I then delete those buckets from the Cluster Master CLI, everything returns to normal and the dashboard says "Rep. factor met".

Instead of using the Splunk dashboard UI, is there a way to get the names of the buckets that have replication issues via the CLI or the REST API?

mzorzi
Splunk Employee

I think there was a problem with copying the regex extraction. The original search should be

| rest /services/cluster/master/buckets splunk_server=*
    | rex field=title "^(?<repl_index>[^\~]+)" 
    | search repl_index="*" standalone=0 frozen=0
    | rename title AS bucketID
    | fields bucketID peers.*.search_state  *site*
    | untable bucketID siteState value
    | rex field=siteState "peers\.(?<peerGUID>[^\.]*)\.(?<siteState>search_state)"
    | rex field=siteState "(?<siteState>primaries_by_site)\.(?<site>\S+)"
    | rex field=siteState "(?<siteState>rep_count_by_site)\.(?<site>\S+)"
    | rex field=siteState "(?<siteState>search_count_by_site)\.(?<site>\S+)"
    | eval peerGUID=if(siteState=="primaries_by_site", value, peerGUID)
    | eval site=if(siteState=="origin_site", value, site)
    | eval value=if(siteState=="search_count_by_site", site + ":" + value, value)
    | eval value=if(siteState=="rep_count_by_site", site + ":" + value, value)
    | join type=outer peerGUID [ rest /services/cluster/master/peers splunk_server=*
                           | fields active_* host* label title status site
                           | eval PeerName= site + ":" + label + ":" + host_port_pair
                           | rename title AS peerGUID
                           | rename site AS peerSite
                           | table peerGUID PeerName peerSite ]
    | eval site=if(siteState=="search_state", peerSite, site)
    | eval value=if(siteState=="primaries_by_site", PeerName + ":For_" + site, value)
    | eval value=if(siteState=="search_state", PeerName + ":" + value, value)
    | fields - PeerName peerGUID peerSite
    | chart values(value) over bucketID by siteState

rphillips_splk
Splunk Employee

This search comes courtesy of my co-worker @Masa

Clustering

Multi-site enabled

Simple version of bucket state by site

| rest /services/cluster/master/buckets
   | rex field=title "^(?<repl_index>[^\~]+)"
   | search repl_index="*" standalone=0 frozen=0
   | rename title AS bucketID
   | fields bucketID  *origin_site* *_by_site*
   | untable bucketID siteState value
   | rex mode=sed field=siteState "s/\./__/"
   | rex mode=sed field=siteState "s/_count_/_/"
   | search NOT siteState=primaries_*
   | xyseries bucketID siteState value
   | fields - search_by_site
   | fillnull
   | eval rep_total= rep_by_site__site1 + rep_by_site__site2 + rep_by_site__site3
   | eval srch_total = search_by_site__site1 + search_by_site__site2 + search_by_site__site3
   | rename constrain_to_origin_site AS constrain
   | rename origin_site AS origin
   | rename rep_by_site__site1 AS rep_site1
   | rename rep_by_site__site2 AS rep_site2
   | rename rep_by_site__site3 AS rep_site3
   | rename search_by_site__site1 AS srch_site1
   | rename search_by_site__site2 AS srch_site2
   | rename search_by_site__site3 AS srch_site3

table output:

bucketID                                         constrain  origin  rep_site1  rep_site2  rep_site3  rep_total  srch_site1  srch_site2  srch_site3  srch_total
_audit~118~FF782A13-8AFB-4617-BCB4-15ED11928DD7  0          site1   2          1          1          4          2           1           1           4
_audit~119~FF782A13-8AFB-4617-BCB4-15ED11928DD7  0          site1   2          1          2          5          2           1           1           4

You can further filter for buckets where the replication or search factor is not met (assuming your replication factor is 4 and your search factor is 3) by appending this to the end of the search:
| search rep_total<4 OR srch_total<3

Note: remove references to site3 in the search if you only have 2 sites in the multi-site cluster
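
For a check outside the Splunk UI, the same filter could be applied to the JSON returned by the /services/cluster/master/buckets endpoint. The following is a minimal Python sketch, not a tested tool. It assumes the response was requested with output_mode=json and that each item of its entry list carries the bucket ID in name and nested rep_count_by_site / search_count_by_site maps in content; that layout is an assumption and should be checked against your own cluster's output. The helper names and the third sample bucket are hypothetical; the first two samples reuse the counts from the table output above.

```python
from typing import Any, Dict, List


def is_set(value: Any) -> bool:
    """Treat 1, "1", True and "true" as set; anything else as unset."""
    return str(value).lower() in ("1", "true")


def site_total(counts: Dict[str, Any]) -> int:
    """Sum per-site copy counts, e.g. {"site1": 2, "site2": 1} gives 3."""
    return sum(int(v) for v in counts.values())


def buckets_below_factor(entries: List[Dict[str, Any]],
                         rep_factor: int,
                         search_factor: int) -> List[str]:
    """Return IDs of buckets whose total copies fall short of either factor."""
    flagged = []
    for entry in entries:
        content = entry.get("content", {})
        # Mirror the SPL filter "standalone=0 frozen=0".
        if is_set(content.get("standalone", 0)) or is_set(content.get("frozen", 0)):
            continue
        rep_total = site_total(content.get("rep_count_by_site", {}))
        srch_total = site_total(content.get("search_count_by_site", {}))
        # Same condition as "| search rep_total<4 OR srch_total<3".
        if rep_total < rep_factor or srch_total < search_factor:
            flagged.append(entry["name"])
    return flagged


GUID = "FF782A13-8AFB-4617-BCB4-15ED11928DD7"
SAMPLE = [
    {"name": f"_audit~118~{GUID}",
     "content": {"standalone": 0, "frozen": 0,
                 "rep_count_by_site": {"site1": 2, "site2": 1, "site3": 1},
                 "search_count_by_site": {"site1": 2, "site2": 1, "site3": 1}}},
    {"name": f"_audit~119~{GUID}",
     "content": {"standalone": 0, "frozen": 0,
                 "rep_count_by_site": {"site1": 2, "site2": 1, "site3": 2},
                 "search_count_by_site": {"site1": 2, "site2": 1, "site3": 1}}},
    # Hypothetical under-replicated bucket, added only for illustration.
    {"name": f"_internal~7~{GUID}",
     "content": {"standalone": 0, "frozen": 0,
                 "rep_count_by_site": {"site1": 1, "site2": 1},
                 "search_count_by_site": {"site1": 1, "site2": 1}}},
]

print(buckets_below_factor(SAMPLE, rep_factor=4, search_factor=3))
```

With the two buckets from the table output both factors are met, so only the hypothetical _internal bucket is reported.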

Clustering

Multi-site enabled

| rest /services/cluster/master/buckets
   | rex field=title "^(?<repl_index>[^\~]+)"
   | search repl_index="*" standalone=0 frozen=0
   | rename title AS bucketID
   | fields bucketID peers.*.search_state  *site*
   | untable bucketID siteState value

   | rex field=siteState "peers\.(?<peerGUID>[^\.]*?)\.(?<siteState>search_state)"
   | rex field=siteState "(?<siteState>primaries_by_site)\.(?<site>\S+)"
   | rex field=siteState "(?<siteState>rep_count_by_site)\.(?<site>\S+)"
   | rex field=siteState "(?<siteState>search_count_by_site)\.(?<site>\S+)"

   | eval peerGUID=if(siteState=="primaries_by_site", value, peerGUID)
   | eval site=if(siteState=="origin_site", value, site)
   | eval value=if(siteState=="search_count_by_site", site + ":" + value, value)
   | eval value=if(siteState=="rep_count_by_site", site + ":" + value, value)

   | join type=outer peerGUID [ rest /services/cluster/master/peers
                          | fields active_* host* label title status site
                          | eval PeerName= site + ":" + label + ":" + host_port_pair
                          | rename title AS peerGUID
                          | rename site AS peerSite
                          | table peerGUID PeerName peerSite ]
   | eval site=if(siteState=="search_state", peerSite, site)
   | eval value=if(siteState=="primaries_by_site", PeerName + ":For_" + site, value)
   | eval value=if(siteState=="search_state", PeerName + ":" + value, value)
   | fields - PeerName peerGUID peerSite
   | chart values(value) over bucketID by siteState

table output (one record per bucket; multivalue cells listed one value per line):

bucketID: _audit~118~FF782A13-8AFB-4617-BCB4-15ED11928DD7
constrain: 0
origin: site1
primaries_by_site:
    site1:centos58-64sup01-620CP:10.140.48.137:55591:For_site1
    site2:centos65-64sup14-620CP:10.140.48.150:55591:For_site2
    site3:centos62-64sup13-620CP:10.140.48.149:55591:For_site3
rep_by_site: site1:2, site2:1, site3:1
srch_by_site: site1:2, site2:1, site3:1
search_state:
    site1:centos58-64sup01-620CP:10.140.48.137:55591:Searchable
    site1:centos65-64sup06-620CP:10.140.48.142:55591:Searchable
    site2:centos65-64sup14-620CP:10.140.48.150:55591:Searchable
    site3:centos62-64sup13-620CP:10.140.48.149:55591:Searchable

bucketID: _audit~119~FF782A13-8AFB-4617-BCB4-15ED11928DD7
constrain: 0
origin: site1
primaries_by_site:
    site1:centos58-64sup01-620CP:10.140.48.137:55591:For_site1
    site2:centos65-64sup14-620CP:10.140.48.150:55591:For_site2
    site3:centos62-64sup13-620CP:10.140.48.149:55591:For_site3
rep_by_site: site1:2, site2:1, site3:2
srch_by_site: site1:2, site2:1, site3:1
search_state:
    site1:centos58-64sup01-620CP:10.140.48.137:55591:Searchable
    site1:centos65-64sup06-620CP:10.140.48.142:55591:Searchable
    site2:centos65-64sup14-620CP:10.140.48.150:55591:Searchable
    site3:centos62-64sup12-620CP:10.140.48.148:55591:Unsearchable
    site3:centos62-64sup13-620CP:10.140.48.149:55591:Searchable
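
The join in the search above swaps each peer GUID for a readable "site:label:host_port_pair" name before attaching the search state. As a rough illustration of that step outside SPL, here is a small Python sketch. It assumes JSON from the /services/cluster/master/peers and /services/cluster/master/buckets endpoints in which name holds the GUID or bucket ID and content holds the fields; the nested peers map, the function names and the GUID-A / GUID-B placeholders are assumptions made for the example, since the thread does not show real GUIDs.

```python
from typing import Any, Dict, List


def peer_names(peer_entries: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map peer GUID to "site:label:host_port_pair", like PeerName in the subsearch."""
    names = {}
    for entry in peer_entries:
        c = entry["content"]
        names[entry["name"]] = f'{c["site"]}:{c["label"]}:{c["host_port_pair"]}'
    return names


def search_states(bucket_entry: Dict[str, Any], names: Dict[str, str]) -> List[str]:
    """Return "PeerName:search_state" for each peer holding the bucket, sorted."""
    peers = bucket_entry["content"].get("peers", {})
    # Unknown GUIDs are kept as-is; this fallback is a choice made for the sketch.
    return sorted(
        f'{names.get(guid, guid)}:{info["search_state"]}'
        for guid, info in peers.items()
    )


# Labels and addresses come from the table output; the GUIDs are placeholders.
PEERS = [
    {"name": "GUID-A",
     "content": {"site": "site1", "label": "centos58-64sup01-620CP",
                 "host_port_pair": "10.140.48.137:55591"}},
    {"name": "GUID-B",
     "content": {"site": "site3", "label": "centos62-64sup12-620CP",
                 "host_port_pair": "10.140.48.148:55591"}},
]
BUCKET = {
    "name": "_audit~119~FF782A13-8AFB-4617-BCB4-15ED11928DD7",
    "content": {"peers": {"GUID-A": {"search_state": "Searchable"},
                          "GUID-B": {"search_state": "Unsearchable"}}},
}

for line in search_states(BUCKET, peer_names(PEERS)):
    print(line)
```

Scanning the result for "Unsearchable" is one way to spot which peer holds the problem copy of a bucket.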

isoutamo
SplunkTrust

It seems that this query got broken when the thread was migrated to the new community platform. Here is a fixed version in case someone else needs it. I needed to check bucket status because it seems that Splunk 8.1.4 (I have heard 8.1.3 as well) has broken replication in cases where only some buckets are left after others have been frozen.

| rest splunk_server=<ADD YOUR CM HERE> /services/cluster/master/buckets 
``` if you know bucket id add it here ```
``` | search title=$bucketIdx$~$bucketNbr$~$bucketGuid$* ```
| rex field=title "^(?<repl_index>[^\~]+)"
| search repl_index="*" standalone=0 frozen=*
| rename title AS bucketID 
| fields bucketID peers.*.search_state *site* 
| untable bucketID siteState value 
| rex field=siteState "peers\.(?<peerGUID>[^\.]*?)\.(?<siteState>search_state)"
| rex field=siteState "(?<siteState>primaries_by_site)\.(?<site>\S+)"
| rex field=siteState "(?<siteState>rep_count_by_site)\.(?<site>\S+)"
| rex field=siteState "(?<siteState>search_count_by_site)\.(?<site>\S+)"
| eval peerGUID=if(siteState=="primaries_by_site", value, peerGUID) 
| eval site=if(siteState=="origin_site", value, site) 
| eval value=if(siteState=="search_count_by_site", site + ":" + value, value) 
| eval value=if(siteState=="rep_count_by_site", site + ":" + value, value) 
| join type=outer peerGUID 
    [ rest splunk_server=<ADD YOUR CM HERE> /services/cluster/master/peers 
    | fields active_* host* label title status site 
    | eval PeerName= site + ":" + label + ":" + host_port_pair 
    | rename title AS peerGUID 
    | rename site AS peerSite 
    | table peerGUID PeerName peerSite ] 
| eval site=if(siteState=="search_state", peerSite, site) 
| eval value=if(siteState=="primaries_by_site", PeerName + ":For_" + site, value) 
| eval value=if(siteState=="search_state", PeerName + ":" + value, value) 
| fields - PeerName peerGUID peerSite 
| chart limit=0 values(value) over bucketID by siteState

You should replace <ADD YOUR CM HERE> with your Cluster Master name.

r. Ismo 

sansay1
Explorer

I thought this was a great query to have, but unfortunately it is dangerous on a cluster with 600 indexers. Every time I ran it, Splunk was killed by the kernel due to "out of memory".

isoutamo
SplunkTrust
You probably have quite a lot of buckets, which means the query needs a lot of memory. Could you limit the buckets the query processes, e.g. restrict it to one index at the beginning of the query?
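
To make the "one index at a time" idea concrete: bucket IDs in this thread have the form index~number~guid, so the index can be read from the part before the first "~" (the first rex in the searches does the same). The Python sketch below narrows a list of bucket IDs to a single index. The names BucketId, parse_bucket_id and only_index are made up for the example, and the _internal ID is hypothetical. Whether narrowing the search this way is enough to avoid the out-of-memory kill on a 600-indexer cluster is not something this thread confirms.

```python
from typing import Iterable, Iterator, NamedTuple


class BucketId(NamedTuple):
    index: str
    number: int
    guid: str


def parse_bucket_id(bucket_id: str) -> BucketId:
    """Split "index~number~guid" into its three parts."""
    index, number, guid = bucket_id.split("~", 2)
    return BucketId(index, int(number), guid)


def only_index(bucket_ids: Iterable[str], index: str) -> Iterator[str]:
    """Yield only the bucket IDs that belong to the given index."""
    for bucket_id in bucket_ids:
        if parse_bucket_id(bucket_id).index == index:
            yield bucket_id


GUID = "FF782A13-8AFB-4617-BCB4-15ED11928DD7"
IDS = [
    f"_audit~118~{GUID}",
    f"_audit~119~{GUID}",
    f"_internal~7~{GUID}",  # hypothetical ID, for illustration only
]

for bucket_id in only_index(IDS, "_audit"):
    print(bucket_id)
```

Because only_index is a generator, it handles one ID at a time rather than building a second full list, which keeps the client side light even when the bucket list is long.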