Hey, I am currently experiencing severe problems with my Splunk installation: splunkd repeatedly crashes right after Splunk starts. Here's the output of the corresponding crash log:
[build 182037] 2013-10-30 23:01:39
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 8918 running under UID 0.
Crashing thread: archivereader
Registers:
RIP: [0x00007F7208496037] gsignal + 55 (/lib/x86_64-linux-gnu/libc.so.6)
RDI: [0x00000000000022D6]
RSI: [0x0000000000002441]
RBP: [0x00007F72085E5578]
RSP: [0x00007F7201FEC008]
RAX: [0x0000000000000000]
RBX: [0x00007F72097EC000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0xFEFEFEFEFEFEFEFF]
R9: [0x00007F720983FF60]
R10: [0x0000000000000008]
R11: [0x0000000000000202]
R12: [0x0000000001299678]
R13: [0x000000000129A300]
R14: [0x00007F720638AB20]
R15: [0x00007F72060743DB]
EFL: [0x0000000000000202]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace:
[0x00007F7208496037] gsignal + 55 (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007F7208499698] abort + 328 (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007F720848EE03] ? (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007F720848EEB2] ? (/lib/x86_64-linux-gnu/libc.so.6)
[0x000000000083AA16] _ZN17ArchiveCrcChecker21seekAndComputeSeekCrcEv + 598 (splunkd)
[0x000000000083D345] _ZN17ArchiveCrcChecker5writeEPKcm + 357 (splunkd)
[0x0000000000AA0717] _ZN14ArchiveContext7processERK8PathnameP13ISourceWriter + 855 (splunkd)
[0x0000000000AA0E95] _ZN14ArchiveContext9readFullyEP13ISourceWriterRb + 1221 (splunkd)
[0x000000000083CFA2] _ZN16ArchiveProcessor20haveReadAsNonArchiveE14FileDescriptorlPK3Str + 578 (splunkd)
[0x000000000083EE53] _ZN16ArchiveProcessor4mainEv + 2755 (splunkd)
[0x0000000000D81A2D] _ZN6Thread8callMainEPv + 61 (splunkd)
[0x00007F720882EF8E] ? (/lib/x86_64-linux-gnu/libpthread.so.0)
[0x00007F7208558E1D] clone + 109 (/lib/x86_64-linux-gnu/libc.so.6)
Linux / ubuntuSplunkHost / 3.8.0-19-generic / #29-Ubuntu SMP Wed Apr 17 18:16:28 UTC 2013 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2013-10-30 23:01:25.780 +0100 splunkd started (build 182037)
Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383170465_1383170464_24/rawdata": No such file or directory
splunkd: /opt/splunk/p4/splunk/branches/6.0.0/src/pipeline/input/ArchiveProcessor.cpp:1044: bool ArchiveCrcChecker::seekAndComputeSeekCrc(): Assertion `(file_offset_t)_seekPtr >= dp->curPos()' failed.
2013-10-30 23:01:34.813 +0100 splunkd started (build 182037)
Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383170486_1383170485_25/rawdata": No such file or directory
splunkd: /opt/splunk/p4/splunk/branches/6.0.0/src/pipeline/input/ArchiveProcessor.cpp:1044: bool ArchiveCrcChecker::seekAndComputeSeekCrc(): Assertion `(file_offset_t)_seekPtr >= dp->curPos()' failed.
/etc/debian_version: wheezy/sid
Last errno: 0
Threads running: 40
argv: [splunkd -p 8089 start]
Thread: "archivereader", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f7206074230:
00000000 00 d7 ff 01 72 7f 00 00 |....r...|
00000008
x86 CPUID registers:
0: 0000000D 756E6547 6C65746E 49656E69
1: 000306A9 00010800 9E982203 0FABFBFF
2: 76035A01 00F0B2FF 00000000 00CA0000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000000 00000000 00000000 00000000
6: 00000077 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300401 0000007F 00000000 00000000
B: 00000000 00000000 0000005D 00000000
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 20202020 20202020 65746E49 2952286C
80000003: 726F4320 4D542865 37692029 3737332D
80000004: 50432030 20402055 30342E33 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003028 00000000 00000000 00000000
terminating...
I already tried repairing/rebuilding indexes and buckets as explained here: http://docs.splunk.com/Documentation/Splunk/6.0/Indexer/HowSplunkstoresindexes but have had no success yet. After every restart, splunkd crashes again. The error message is almost the same, but the line "Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383170486_1383170485_25/rawdata": No such file or directory" keeps changing: the trailing number of the bucket name (_25 here) increases after every crash.
Can anybody help me solve this issue?
Thanks and regards,
Flo
This is a known issue and fixed as of 6.0.1
Upgrading causes crash in "Crashing Thread: archivereader" (SPL-74873)
Same thing here on Ubuntu 13.04 64-bit. Any suggestions?
Here's what helped in my case: http://answers.splunk.com/answers/111222/splunk-crashes-repeatedly-cannot-open-manifest-file
The same thing. Did you manage to solve the problem?
After trying all sorts of index/bucket rebuilds and repairs, we ended up restoring a VM snapshot from a couple of days earlier.
No, I fell back to Splunk 5.0.5.
Same problem here.
I installed splunk-6.0-182037-linux-2.6-amd64.deb on Ubuntu 13.10.
I added a syslog input and searching worked. Then I ran splunk restart and logged in to Splunk Web again, and it failed with the error message above.
Thank you. I did so and tried starting Splunk again, but I still get the same error. Only the corresponding line changed:
Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383210609_1383210597_27/rawdata": No such file or directory
So now the problem lies with the _27 and/or _19 db, for some reason.
Splunk starts normally, but splunkd crashes seconds after starting.
To me it seems that on startup Splunk looks for a db that does not exist, and on the next start it looks for the db with the next higher number (25, 26, 27, etc.).
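The pattern described above (a different bucket named in the "Cannot open manifest file" line after each crash) can be checked quickly against the crash log. Here is a minimal Python sketch, assuming the stderr lines look exactly like the ones quoted in this thread. The function name `missing_buckets` and the `SAMPLE` text are made up for illustration and are not part of Splunk.

```python
import re

# Matches the stderr line quoted in this thread and captures the bucket
# directory name (db_<number>_<number>_<id>) plus its trailing id.
MANIFEST_RE = re.compile(
    r'Cannot open manifest file inside "[^"]*/(db_\d+_\d+_(\d+))/rawdata"'
)


def missing_buckets(log_text):
    """Return (bucket_name, bucket_id) for every manifest error, in log order."""
    return [(m.group(1), int(m.group(2))) for m in MANIFEST_RE.finditer(log_text)]


# Sample input copied from the crash log posted above.
SAMPLE = """\
2013-10-30 23:01:25.780 +0100 splunkd started (build 182037)
Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383170465_1383170464_24/rawdata": No such file or directory
2013-10-30 23:01:34.813 +0100 splunkd started (build 182037)
Cannot open manifest file inside "/opt/splunk/var/lib/splunk/audit/db/db_1383170486_1383170485_25/rawdata": No such file or directory
"""

for name, bucket_id in missing_buckets(SAMPLE):
    print(bucket_id, name)
```

If the ids go up by one per restart, as in the sample, that would match the observation that each start trips over the next bucket.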
Make a copy of the bucket directory db_1383170486_1383170485_25 somewhere outside of the Splunk home directory, then delete db_1383170486_1383170485_25 from the db directory. Does Splunk start afterwards?
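The copy-then-delete step suggested here amounts to moving the bucket directory out of the index. Below is a rough Python sketch of that, assuming Splunk is stopped first and that you have permission to move the files. The helper name `set_aside_bucket` and the backup location are hypothetical, not anything Splunk provides.

```python
import shutil
from pathlib import Path


def set_aside_bucket(bucket_dir, backup_root):
    """Move a bucket directory into backup_root and return its new location.

    Assumption: splunkd is not running while this is called.
    """
    bucket_dir = Path(bucket_dir)
    backup_root = Path(backup_root)
    if not bucket_dir.is_dir():
        raise FileNotFoundError(f"bucket directory not found: {bucket_dir}")
    backup_root.mkdir(parents=True, exist_ok=True)
    target = backup_root / bucket_dir.name
    if target.exists():
        # Refuse to overwrite an earlier backup of the same bucket.
        raise FileExistsError(f"backup already exists: {target}")
    # shutil.move copies and then removes the source when a plain rename
    # is not possible (for example across filesystems).
    shutil.move(str(bucket_dir), str(target))
    return target
```

For the bucket named in the crash log, a call could look like `set_aside_bucket("/opt/splunk/var/lib/splunk/audit/db/db_1383170486_1383170485_25", "/root/splunk-bucket-backup")`, where the second path is only an example of a location outside the Splunk home.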