Deployment Architecture

After upgrading from 6.2 to 6.3, unable to start splunkd on my clustered indexers in a distributed environment.

New Member

I have upgraded my Splunk infrastructure from 6.2 to 6.3:
1) Deployment server: 1
2) Indexer cluster servers: 2
3) Search head cluster servers: 2

The upgrade completed successfully on all nodes, but I am unable to start the Splunk services on either of my indexers.
Please help.

Error:
[root@CDCDSPLKNDX1 bin]# ./splunk start

Splunk> The IT Search Engine.

Checking prerequisites...
Checking http port [8000]: open
Checking mgmt port [8089]: open
Checking appserver port [127.0.0.1:8065]: open
Checking kvstore port [8191]: open
Checking configuration... Done.
Checking critical directories... Done
Checking indexes...
Validated: audit _internal _introspection _thefishbucket collaborationdb firedalerts ghd history linux main os summary unix windows
Done

Bypassing local license checks since this instance is configured with a remote license master.

    Checking filesystem compatibility...  Done
    Checking conf files for problems...
    Done
    Checking default conf files for edits...
    Validating installed files against hashes from '/splunk/splunk/splunk-6.3.0-aa7d4b1ccb80-linux-2.6-x86_64-manifest'
    All installed files intact.
    Done
    Checking replication_port port [8080]: open

All preliminary checks passed.

Starting splunk server daemon (splunkd)...
Done
[ OK ]

Waiting for web server at http://127.0.0.1:8000 to be available
splunkd 3362 was not running.
Stopping splunk helpers...
[ OK ]
Done.
Stopped helpers.
Removing stale pid file... done.

WARNING: web interface does not seem to be available!

Crash file:

[root@CDCDSPLKNDX1 splunk]# cat "crash-2015-11-27-18:26:28.log"
[build aa7d4b1ccb80] 2015-11-27 18:26:28
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 3362 running under UID 0.
Crashing thread: IdataDO_Collector
Registers:
RIP: [0x0000003938C32625] gsignal + 53 (/lib64/libc.so.6)
RDI: [0x0000000000000D22]
RSI: [0x0000000000000D35]
RBP: [0x0000000001893940]
RSP: [0x00007FD620FFC9F8]
RAX: [0x0000000000000000]
RBX: [0x00007FD62A1AA000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0xFEFEFEFEFEFEFEFF]
R9: [0x00007FD62A1FDF60]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x0000000001816C94]
R13: [0x0000000001894130]
R14: [0x0000000000000000]
R15: [0x0000000000000003]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]

OS: Linux
Arch: x86-64

Backtrace:
[0x0000003938C32625] gsignal + 53 (/lib64/libc.so.6)
[0x0000003938C33E05] abort + 373 (/lib64/libc.so.6)
[0x0000003938C2B74E] ? (/lib64/libc.so.6)
[0x0000003938C2B810] __assert_perror_fail + 0 (/lib64/libc.so.6)
[0x0000000000AEA0DD] ? (splunkd)
[0x0000000000AECACD] _ZN22IdataCollectorCallback4tickEv + 157 (splunkd)
[0x0000000000AECBE3] _ZN17IdataDO_Collector4mainEv + 83 (splunkd)
[0x000000000109F0EE] _ZN6Thread8callMainEPv + 62 (splunkd)
[0x00000039394079D1] ? (/lib64/libpthread.so.0)
[0x0000003938CE88FD] clone + 109 (/lib64/libc.so.6)
Linux / CDCDSPLKNDX1 / 2.6.32-504.12.2.el6.x86_64 / #1 SMP Sun Feb 1 12:14:02 EST 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-11-27 18:01:06.097 +0530 splunkd started (build aa7d4b1ccb80)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-11-27 18:03:37.368 +0530 splunkd started (build aa7d4b1ccb80)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-11-27 18:26:27.740 +0530 splunkd started (build aa7d4b1ccb80)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.

/etc/redhat-release: Red Hat Enterprise Linux Server release 6.6 (Santiago)
glibc version: 2.12
glibc release: stable
Last errno: 0
Threads running: 19
Runtime: 0.954489s
argv: [splunkd -p 8089 start]
Thread: "IdataDO_Collector", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7fd622820610:
00000000 00 d7 ff 20 d6 7f 00 00 |... ....|
00000008

x86 CPUID registers:
0: 0000000B 756E6547 6C65746E 49656E69
1: 000106A4 06010800 80982201 0FABFBFF
2: 55035A01 00F0B2E4 00000000 09CA212C
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000003 00000002 00000001 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000000 000000FD 00000006
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 65746E49 2952286C 6F655820 2952286E
80000003: 55504320 20202020 20202020 58202020
80000004: 30353535 20402020 37362E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003028 00000000 00000000 00000000
terminating...

1 Solution

Builder

I had what sounds sort of like the same problem with 6.3.0. Happened on some Splunk servers and not on others. Crashed every few seconds. Splunk support gave me an internal build at the time which was 6.3.0.1 and the problem went away. I haven't lined it up with a specific bug fix, but I'd try 6.3.1 on your servers.

New Member

Thanks mfrost8 for your reply.

The error was caused by a permission problem on the audit db files and by a bad "indexes.conf". Giving splunk:splunk ownership of all the audit db buckets, and tracking down the bad indexes.conf file with "./splunk cmd btool indexes list --debug|more", resolved the issue for us.
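
For anyone hitting the same assertion, here is a rough sketch of those two checks as shell helpers. It is not an official procedure. The /opt/splunk default, the audit path under var/lib/splunk, the "splunk" account name, and both helper function names are assumptions for illustration, so adjust them to your install. The idea that the bad entry shows up as an empty "[]" stanza header is only a guess based on the "! name.empty()" assertion text; nothing in this thread confirms it.

```shell
#!/bin/sh
# Sketch only. SPLUNK_HOME, SPLUNK_USER and the audit path are assumptions;
# set them to match your own installation before running anything.
SPLUNK_HOME="${SPLUNK_HOME:-/opt/splunk}"
SPLUNK_USER="${SPLUNK_USER:-splunk}"

# Hypothetical helper: print every file or directory under $1 that is
# NOT owned by user $2. Empty output means ownership looks consistent.
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}

# Hypothetical helper: read "btool ... --debug" output on stdin and keep
# only lines that end in a stanza header with an empty name, e.g. "[]".
# (That an empty stanza is the culprit is a guess, not a confirmed cause.)
find_empty_stanzas() {
    grep -E '\[[[:space:]]*][[:space:]]*$'
}

# Typical use on an indexer (left commented out on purpose):
# list_foreign_owned "$SPLUNK_HOME/var/lib/splunk/audit" "$SPLUNK_USER"
# chown -R "$SPLUNK_USER:$SPLUNK_USER" "$SPLUNK_HOME/var/lib/splunk/audit"
# "$SPLUNK_HOME/bin/splunk" cmd btool indexes list --debug | find_empty_stanzas
```

If the second helper prints nothing, page through the full btool output as described above; the --debug flag prefixes each setting with the file it came from, which is what lets you trace a bad entry back to a specific indexes.conf.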

New Member

I upgraded to 6.3.1 but am still unable to start the indexers. I am getting the crash below, which looks like the same error.

[root@CDCDSPLKNDX1 splunk]# cat crash-2015-12-01-13:04:32.log
[build f3e41e4b37b2] 2015-12-01 13:04:32
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 17203 running under UID 0.
Crashing thread: IdataDO_Collector
Registers:
RIP: [0x0000003938C32625] gsignal + 53 (/lib64/libc.so.6)
RDI: [0x0000000000004333]
RSI: [0x000000000000434B]
RBP: [0x0000000001893860]
RSP: [0x00007FB3513FE9F8]
RAX: [0x0000000000000000]
RBX: [0x00007FB35B1A2000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0xFEFEFEFEFEFEFEFF]
R9: [0x00007FB35B1F5F60]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x0000000001816AC1]
R13: [0x0000000001894050]
R14: [0x0000000000000000]
R15: [0x0000000000000003]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]

OS: Linux
Arch: x86-64

Backtrace:
[0x0000003938C32625] gsignal + 53 (/lib64/libc.so.6)
[0x0000003938C33E05] abort + 373 (/lib64/libc.so.6)
[0x0000003938C2B74E] ? (/lib64/libc.so.6)
[0x0000003938C2B810] __assert_perror_fail + 0 (/lib64/libc.so.6)
[0x0000000000AE848D] ? (splunkd)
[0x0000000000AEB06D] _ZN22IdataCollectorCallback4tickEv + 157 (splunkd)
[0x0000000000AEB183] _ZN17IdataDO_Collector4mainEv + 83 (splunkd)
[0x000000000109E7BE] _ZN6Thread8callMainEPv + 62 (splunkd)
[0x00000039394079D1] ? (/lib64/libpthread.so.0)
[0x0000003938CE88FD] clone + 109 (/lib64/libc.so.6)
Linux / CDCDSPLKNDX1 / 2.6.32-504.12.2.el6.x86_64 / #1 SMP Sun Feb 1 12:14:02 EST 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-11-30 14:43:00.176 +0530 splunkd started (build f3e41e4b37b2)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-11-30 14:51:07.672 +0530 splunkd started (build f3e41e4b37b2)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.
2015-12-01 13:04:31.533 +0530 splunkd started (build f3e41e4b37b2)
splunkd: /home/build/build-src/ember/src/pipeline/indexer/IdataDO_Collector.cpp:272: void collect__indexes(): Assertion `! name.empty()' failed.

/etc/redhat-release: Red Hat Enterprise Linux Server release 6.6 (Santiago)
glibc version: 2.12
glibc release: stable
Last errno: 0
Threads running: 19
Runtime: 1.084813s
argv: [splunkd -p 8089 start]
Thread: "IdataDO_Collector", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7fb353820610:
00000000 00 f7 3f 51 b3 7f 00 00 |..?Q....|
00000008

x86 CPUID registers:
0: 0000000B 756E6547 6C65746E 49656E69
1: 000106A4 04010800 80982201 0FABFBFF
2: 55035A01 00F0B2E4 00000000 09CA212C
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000003 00000002 00000001 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000000 000000FD 00000004
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 65746E49 2952286C 6F655820 2952286E
80000003: 55504320 20202020 20202020 58202020
80000004: 30353535 20402020 37362E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003028 00000000 00000000 00000000
terminating...
