Indexers BatchAdding problem

m_zandinia
Path Finder

Hi Splunkers.

I have an indexer cluster, and all of a sudden all of the indexers keep going up and down and get stuck in BatchAdding status.

I have 4 indexers.

These are my settings on the cluster master:

[clustering]
cluster_label = IndexerCluster
mode = master
rebalance_threshold = 0.95
replication_factor = 3
search_factor = 2
restart_timeout = 180
service_interval = 90
heartbeat_timeout = 180
cxn_timeout = 300
send_timeout = 300
rcv_timeout = 300
max_peer_build_load = 20
max_peer_rep_load = 50
max_fixup_time_ms = 0
maintenance_mode = false

 

I increased max_peer_build_load to speed up my fixup tasks, but it didn't help.

I've been watching the number of buckets, and it increases very slowly.
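
For reference, the relationship between the factors above and the number of peers can be sanity-checked with a small script. This is only a sketch: `check_clustering` and `peer_count` are names made up for illustration, the stanza is a trimmed copy of the one above, and it assumes the stanza is simple enough for Python's `configparser` (a real server.conf may not be). It encodes two rules: search_factor cannot exceed replication_factor, and replication_factor cannot exceed the number of peers.

```python
import configparser

# Trimmed copy of the [clustering] stanza from this post (illustrative only).
SERVER_CONF = """
[clustering]
mode = master
replication_factor = 3
search_factor = 2
max_peer_build_load = 20
max_peer_rep_load = 50
"""


def check_clustering(conf_text, peer_count):
    """Return a list of warnings about factor / peer-count consistency."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    stanza = parser["clustering"]
    rf = stanza.getint("replication_factor")
    sf = stanza.getint("search_factor")

    warnings = []
    if sf > rf:
        warnings.append("search_factor exceeds replication_factor")
    if rf > peer_count:
        warnings.append("replication_factor exceeds number of peers")
    return warnings


# 4 peers, as in this cluster: no warnings expected.
print(check_clustering(SERVER_CONF, peer_count=4))
# Only 2 peers up (e.g. while the others are crashing): the factor cannot be met.
print(check_clustering(SERVER_CONF, peer_count=2))
```

With all 4 peers up the values are consistent. If peers keep crashing and fewer than 3 are up at a time, replication_factor = 3 cannot be met, which may be part of why fixup progresses so slowly.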

I see these errors in the splunkd.log file on the indexers:

ERROR ProcessTracker - (child_581__Fsck) BucketBuilder - BucketBuilder::error: Event data size is 0. Raw and Meta data may be missing for bucket="/Splunk-Storage/HOT/eventlog-online-index/db_1641702441_1641656220_301"

 

 

WARN  ProcessTracker - (child_601__Fsck)  Fsck - Repair entire bucket, index=eventlog-online-index, tryWarmThenCold=1, bucket=/Splunk-Storage/HOT/eventlog-online-index/db_1641702441_1641656220_301, exists=1, localrc=3, failReason=(entire bucket) Rebuild for bkt='/Splunk-Storage/HOT/eventlog-online-index/db_1641702441_1641656220_301' failed: BucketBuilder::error: Event data size is 0. Raw and Meta data may be missing for bucket="/Splunk-Storage/HOT/eventlog-online-index/db_1641702441_1641656220_301"
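
To see whether it is always the same bucket that fails, the paths can be pulled out of splunkd.log and counted. A rough sketch, assuming the lines look like the two above (the sample lines here are shortened copies of them; `failing_buckets` and the regex are my own and not anything Splunk ships):

```python
import re
from collections import Counter

# Matches bucket="...", bucket=..., and bkt='...' as seen in the lines above.
BUCKET_RE = re.compile(r'''(?:bucket|bkt)=["']?([^"',\s]+)''')

# Shortened copies of the ERROR and WARN lines from this post.
SAMPLE = [
    'ERROR ProcessTracker - (child_581__Fsck) BucketBuilder - '
    'BucketBuilder::error: Event data size is 0. Raw and Meta data may be '
    'missing for bucket="/Splunk-Storage/HOT/eventlog-online-index/'
    'db_1641702441_1641656220_301"',
    'WARN  ProcessTracker - (child_601__Fsck)  Fsck - Repair entire bucket, '
    'index=eventlog-online-index, tryWarmThenCold=1, '
    'bucket=/Splunk-Storage/HOT/eventlog-online-index/'
    'db_1641702441_1641656220_301, exists=1, localrc=3',
]


def failing_buckets(lines):
    """Count, per bucket path, how many Fsck-related log lines mention it."""
    counts = Counter()
    for line in lines:
        if "Fsck" not in line:
            continue
        # A single line may repeat the same path; count it once per line.
        counts.update(set(BUCKET_RE.findall(line)))
    return counts


for path, n in failing_buckets(SAMPLE).most_common():
    print(n, path)
```

On a real indexer the lines would come from the splunkd.log file instead of `SAMPLE`. One path dominating the output would point at a single damaged bucket rather than a cluster-wide problem.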

 

In addition, crash.log files are being generated continuously on my indexers:

Received fatal signal 8 (Floating point exception).
 Cause:
   Integer division by zero at address [0x0000557E03DBB1D9].
 Crashing thread: indexerPipe
 Registers:
    RIP:  [0x0000557E03DBB1D9] _ZN12HotDBManager19computeBucketMapKeyERK15CowPipelineData + 121 (splunkd + 0xEF91D9)
    RDI:  [0x00007F43D73836D0]
    RSI:  [0x00007F43ABDAA72D]
    RBP:  [0x00007F43C022EB40]
    RSP:  [0x00007F43C07FD5A0]
    RAX:  [0x07AC58C70206CAB3]
    RBX:  [0x07AC58C70206CAB3]
    RCX:  [0x0000000000000000]
    RDX:  [0x0000000000000000]
    R8:  [0x00000000000000B8]
    R9:  [0x00007F43C8F3E060]
    R10:  [0x00007F43D73867D0]
    R11:  [0x00007F43D6200080]
    R12:  [0x00007F43D7385E08]
    R13:  [0x00007F43C07FD5F0]
    R14:  [0x00007F43C02148E0]
    R15:  [0x00007F43B6C2B500]
    EFL:  [0x0000000000010246]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0x002B000000000033]
    OLDMASK:  [0x0000000000000000]

 OS: Linux
 Arch: x86-64

 Backtrace (PIC build):
  [0x0000557E03DBB1D9] _ZN12HotDBManager19computeBucketMapKeyERK15CowPipelineData + 121 (splunkd + 0xEF91D9)
  [0x0000557E03DBCFDA] _ZN12HotDBManager15_suitableBucketERK15CowPipelineDatalRblR3Str + 410 (splunkd + 0xEFAFDA)
  [0x0000557E03DBF018] _ZN12HotDBManager10suitableDbERK15CowPipelineDatalRblR3Str + 24 (splunkd + 0xEFD018)
  [0x0000557E03E1AF53] _ZN11IndexWriter11_dbLazyLoadERK15CowPipelineDatall + 131 (splunkd + 0xF58F53)
  [0x0000557E03E1C054] _ZN11IndexWriter14write_internalER15CowPipelineDatalRP8DBBucketb + 308 (splunkd + 0xF5A054)
  [0x0000557E03E1C8D7] _ZN11IndexWriter10write_implER15CowPipelineDatalb + 103 (splunkd + 0xF5A8D7)
  [0x0000557E03E1CC43] _ZN11IndexWriter5writeER15CowPipelineDatal + 19 (splunkd + 0xF5AC43)
  [0x0000557E03E1404F] _ZN14IndexProcessor7executeER15CowPipelineData + 3951 (splunkd + 0xF5204F)
  [0x0000557E0433F585] _ZN9Processor20executeMultiLastStepER18PipelineDataVector + 101 (splunkd + 0x147D585)
  [0x0000557E03B2ABCA] _ZN8Pipeline4mainEv + 1418 (splunkd + 0xC68BCA)
  [0x0000557E048FD9D8] _ZN6Thread8callMainEPv + 120 (splunkd + 0x1A3B9D8)
  [0x00007F43D67D6609] ? (libpthread.so.0 + 0x2609)
  [0x00007F43D66FD263] clone + 67 (libc.so.6 + 0xFD263)
 Linux / indexer1-datacenter / 5.4.0-92-generic / #103-Ubuntu SMP Fri Nov 26 16:13:00 UTC 2021 / x86_64
 /etc/debian_version: bullseye/sid
Last errno: 2
Threads running: 72
Runtime: 8.643140s
argv: [splunkd --under-systemd --systemd-delegate=yes -p 8089 _internal_launch_under_systemd]
Regex JIT enabled

RE2 regex engine enabled

using CLOCK_MONOTONIC
Thread: "indexerPipe", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f43c2118e10:
00000000  00 e7 7f c0 43 7f 00 00                           |....C...|
00000008


x86 CPUID registers:
         0: 00000016 756E6547 6C65746E 49656E69
         1: 00050657 08400800 7FFEFBFF BFEBFBFF
         2: 76036301 00F0B5FF 00000000 00C30000
         3: 00000000 00000000 00000000 00000000
         4: 00000000 00000000 00000000 00000000
         5: 00000040 00000040 00000003 00002020
         6: 00000AF7 00000002 00000009 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000000 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07300404 00000000 00000000 00000603
         B: 00000000 00000000 0000002F 00000008
         C: 00000000 00000000 00000000 00000000
          00000000 00000000 00000000 00000000
         E: 00000000 00000000 00000000 00000000
         F: 00000000 00000000 00000000 00000000
        10: 00000000 00000000 00000000 00000000
        11: 00000000 00000000 00000000 00000000
        12: 00000000 00000000 00000000 00000000
        13: 00000000 00000000 00000000 00000000
        14: 00000000 00000000 00000000 00000000
        15: 00000002 000000F0 00000000 00000000
        16: 00000BB8 00000FA0 00000064 00000000
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000121 2C100800
  80000002: 65746E49 2952286C 6F655820 2952286E
  80000003: 6C6F4720 32362064 20523834 20555043
  80000004: 2E332040 48473030 0000007A 00000000
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 01006040 00000000
  80000007: 00000000 00000000 00000000 00000100
  80000008: 0000302E 00000000 00000000 00000000
terminating...
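
Since the crash logs keep piling up, it may help to reduce each one to a short signature and check whether they all fail in the same place. A minimal sketch, assuming every crash log has the same layout as the one above (`summarize_crash` is a hypothetical helper, and the sample text is a shortened copy of that log):

```python
import re

# Shortened copy of the crash log above.
SAMPLE_CRASH = """Received fatal signal 8 (Floating point exception).
 Cause:
   Integer division by zero at address [0x0000557E03DBB1D9].
 Crashing thread: indexerPipe
 Backtrace (PIC build):
  [0x0000557E03DBB1D9] _ZN12HotDBManager19computeBucketMapKeyERK15CowPipelineData + 121 (splunkd + 0xEF91D9)
  [0x0000557E03DBCFDA] _ZN12HotDBManager15_suitableBucketERK15CowPipelineDatalRblR3Str + 410 (splunkd + 0xEFAFDA)
"""


def summarize_crash(text):
    """Extract signal, cause, crashing thread and top backtrace frame."""
    signal = re.search(r"Received fatal signal (\d+) \(([^)]+)\)", text)
    cause = re.search(r"Cause:\s+(.+)", text)
    thread = re.search(r"Crashing thread: (\S+)", text)
    # First frame after the "Backtrace" header: "[0xADDR] symbol + offset ..."
    frame = re.search(r"Backtrace[^\n]*\n\s*\[0x[0-9A-Fa-f]+\] (\S+)", text)
    return {
        "signal": int(signal.group(1)) if signal else None,
        "signal_name": signal.group(2) if signal else None,
        "cause": cause.group(1).strip() if cause else None,
        "thread": thread.group(1) if thread else None,
        "top_frame": frame.group(1) if frame else None,
    }


print(summarize_crash(SAMPLE_CRASH))
```

If every crash log yields the same thread and top frame (here the mangled name for HotDBManager::computeBucketMapKey), that is a compact detail to include in a support ticket.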

 

My OS is Ubuntu Server 20.04.

Any suggestions?

Can I bring up one indexer outside of the cluster to avoid dropping logs, and then join it to the cluster once the cluster is stable?


emzet
Explorer

If you have an indexer cluster, bucket names should include a GUID, like this:

db_1641702441_1641656220_301_C7FC9055-53C4-4411-99E8-98FF5BA9E5E3

You can find the indexer's GUID in the instance.cfg file.
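
A quick way to check this across many bucket directories is to parse the names. This sketch assumes the usual db_<newest_time>_<oldest_time>_<local_id>[_<guid>] layout (rb_ for replicated copies); `parse_bucket_name` and `BucketName` are names invented for this example:

```python
import re
from typing import NamedTuple, Optional

BUCKET_NAME_RE = re.compile(
    r"^(?P<prefix>db|rb)_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<local_id>\d+)"
    r"(?:_(?P<guid>[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}))?$"
)


class BucketName(NamedTuple):
    prefix: str           # "db" (originating copy) or "rb" (replicated copy)
    newest: int           # newest event time, epoch seconds
    oldest: int           # oldest event time, epoch seconds
    local_id: int
    guid: Optional[str]   # None when the name carries no GUID


def parse_bucket_name(name):
    """Parse a bucket directory name; return None if it does not match."""
    m = BUCKET_NAME_RE.match(name)
    if m is None:
        return None
    return BucketName(
        m["prefix"], int(m["newest"]), int(m["oldest"]),
        int(m["local_id"]), m["guid"],
    )


# The bucket from the error message has no GUID in its name:
print(parse_bucket_name("db_1641702441_1641656220_301"))
# The clustered form shown above does:
print(parse_bucket_name(
    "db_1641702441_1641656220_301_C7FC9055-53C4-4411-99E8-98FF5BA9E5E3"))
```

A name that parses with guid set to None, as the failing bucket does, is in the non-clustered (standalone) form.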


isoutamo
SplunkTrust

As your environment isn't working and seems to be crashing continuously, I suggest opening a ticket with Splunk Support ASAP with priority 1 or 2. They can help you.

r. Ismo
