
How to troubleshoot why splunkd is not running after restarting Splunk?

bagarwal
Path Finder

Hello,

Splunk Web was working fine until I restarted it. After the restart, I found that splunkd is not running. I have tried all the troubleshooting I could think of, without any luck. Can anyone please suggest a fix or help with this? Thanks in advance. The complete details are below:

[build aaff59bb082c] 2016-05-17 09:56:21
Received fatal signal 6 (Aborted).
 Cause:
   Signal sent by PID 17986 running under UID 91408.
 Crashing thread: SplunkdSpecificInitThread
 Registers:
    RIP:  [0x0000003720832625] gsignal + 53 (/lib64/libc.so.6)
    RDI:  [0x0000000000004642]
    RSI:  [0x000000000000464A]
    RBP:  [0x00000000018B2F70]
    RSP:  [0x00007F1B377FC408]
    RAX:  [0x0000000000000000]
    RBX:  [0x00007F1B3A88A000]
    RCX:  [0xFFFFFFFFFFFFFFFF]
    RDX:  [0x0000000000000006]
    R8:  [0xFEFEFEFEFEFEFEFF]
    R9:  [0x00007F1B3B34AF60]
    R10:  [0x0000000000000008]
    R11:  [0x0000000000000206]
    R12:  [0x00000000018B36C0]
    R13:  [0x00000000018B4A00]
    R14:  [0x00007F1B377FC8D0]
    R15:  [0x00007F1B36090200]
    EFL:  [0x0000000000000206]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0x0000000000000033]
    OLDMASK:  [0x0000000000000000]

 OS: Linux
 Arch: x86-64

 Backtrace:
  [0x0000003720832625] gsignal + 53 (/lib64/libc.so.6)
  [0x0000003720833E05] abort + 373 (/lib64/libc.so.6)
  [0x000000372082B74E] ? (/lib64/libc.so.6)
  [0x000000372082B810] __assert_perror_fail + 0 (/lib64/libc.so.6)
  [0x0000000000BBC14A] _ZN14IndexerService35disableIndexesAndReinitGlobalConfigERKN9__gnu_cxx17__normal_iteratorIPK3StrSt6vectorIS2_SaIS2_EEEESA_ + 554 (splunkd)
  [0x0000000000BBCAA5] _ZN14IndexerService18initPerIndexConfigEP9StrVectorb + 309 (splunkd)
  [0x0000000000BBDEAC] _ZN14IndexerService12reloadConfigERK14IndexConfigRef + 428 (splunkd)
  [0x000000000100DDD3] _ZN9EventLoop20internal_runInThreadEP13InThreadActorb + 291 (splunkd)
  [0x0000000000BBB52B] _ZN14IndexerService16loadLatestConfigEP14IndexConfigRef + 411 (splunkd)
  [0x0000000000BBB6C5] _ZN14IndexerService16loadLatestConfigEv + 21 (splunkd)
  [0x0000000000BBB9F2] _ZN14IndexerServiceC2Ev + 770 (splunkd)
  [0x0000000000BBBD61] _ZN14IndexerService14_new_singletonEv + 65 (splunkd)
  [0x00000000009ACEA7] _ZN25SplunkdSpecificInitThread4mainEv + 135 (splunkd)
  [0x00000000010A040E] _ZN6Thread8callMainEPv + 62 (splunkd)
  [0x0000003720C07AA1] ? (/lib64/libpthread.so.0)
  [0x00000037208E893D] clone + 109 (/lib64/libc.so.6)
 Linux / BLRSECTST01 / 2.6.32-573.el6.x86_64 / #1 SMP Thu Jul 23 15:44:03 UTC 2015 / x86_64
 Last few lines of stderr (may contain info on assertion failure, but also could be old):
    2016-05-17 09:46:34.417 +0530 splunkd started (build aaff59bb082c)
    splunkd: /home/build/build-src/ember/src/pipeline/indexer/IndexerService.cpp:923: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable an internal index."' failed.
    2016-05-17 09:56:20.074 +0530 splunkd started (build aaff59bb082c)
    splunkd: /home/build/build-src/ember/src/pipeline/indexer/IndexerService.cpp:923: void IndexerService::disableIndexesAndReinitGlobalConfig(const const_iterator&, const const_iterator&): Assertion `0 && "Cannot disable an internal index."' failed.

 /etc/redhat-release: CentOS release 6.7 (Final)
 glibc version: 2.12
 glibc release: stable
Last errno: 2
Threads running: 14
Runtime: 1.104891s
argv: [splunkd -p 8089 start]
Thread: "SplunkdSpecificInitThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f1b38cceb10:
00000000  00 d7 7f 37 1b 7f 00 00                           |...7....|
00000008

InThreadActor @0x7f1b377fca30: _queuedOn=(nil), ran=Y, wantWake=Y, wantFailIfLoopDone=N
First 128 bytes of InThreadActor object @0x7f1b377fca30:
00000000  f0 3e 8b 01 00 00 00 00  01 00 08 36 1b 7f 00 00  |.>.........6....|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  2e 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000050  00 01 08 36 1b 7f 00 00  c0 01 09 36 1b 7f 00 00  |...6.......6....|
00000060  ff ff ff ff ff ff ff ff  00 01 08 36 1b 7f 00 00  |...........6....|
00000070  68 02 08 36 1b 7f 00 00  18 05 08 36 1b 7f 00 00  |h..6.......6....|
00000080


x86 CPUID registers:
         0: 0000000B 756E6547 6C65746E 49656E69
         1: 00020651 00010800 82B82203 0FABFBFF
         2: 76036301 00F0B2FF 00000000 00CA0000
         3: 00000000 00000000 00000000 00000000
         4: 00000000 00000000 00000000 00000000
         5: 00000000 00000000 00000000 00000000
         6: 00000077 00000002 00000001 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000000 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07300401 0000007F 00000000 00000000
         B: 00000000 00000000 000000CD 00000000
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000001 28100800
  80000002: 20202020 6E492020 286C6574 58202952
  80000003: 286E6F65 43202952 45205550 36322D35
  80000004: 76203036 20402032 30322E32 007A4847
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 01006040 00000000
  80000007: 00000000 00000000 00000000 00000100
  80000008: 00003028 00000000 00000000 00000000
terminating...

Output in the CLI:

./splunk restart
splunkd is not running.                                    [FAILED]

Splunk> Needle. Haystack. Found.

Checking prerequisites...
        Checking http port [8000]: open
        Checking mgmt port [8089]: open
        Checking appserver port [127.0.0.1:8065]: open
        Checking kvstore port [8191]: open
        Checking configuration...  Done.
        Checking critical directories...        Done
        Checking indexes...
                Validated: _audit _internal _introspection _thefishbucket audit_log history main summary
        Done
        Checking filesystem compatibility...  Done
        Checking conf files for problems...
        Done
        Checking default conf files for edits...
        Validating installed files against hashes from '/remote/users1/bagarwal/splunk/splunk-6.3.2-aaff59bb082c-linux-2.6-x86_64-manifest'
        All installed files intact.
        Done
All preliminary checks passed.

Starting splunk server daemon (splunkd)...
Done
                                                           [  OK  ]

Waiting for web server at http://127.0.0.1:8000 to be availablesplunkd 10082 was not running.
Stopping splunk helpers...
                                                           [  OK  ]
Done.
Stopped helpers.
Removing stale pid file... done.


WARNING: web interface does not seem to be available!

phadnett_splunk
Splunk Employee

This might be caused by bucket conflicts. Do you get any results when you run the following search?
index=_internal component=DatabaseDirectoryManager conflicts
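If splunkd will not start, the search above cannot be run, so one option is to scan the log file directly. Below is a minimal Python sketch of that idea, not an official tool. The function name `conflict_lines` is made up for illustration, and it assumes the log is at the default location `$SPLUNK_HOME/var/log/splunk/splunkd.log`; adjust the path if your install differs.

```python
def conflict_lines(lines):
    """Return log lines that mention DatabaseDirectoryManager and 'conflicts'.

    Rough plain-text equivalent of:
        index=_internal component=DatabaseDirectoryManager conflicts
    """
    hits = []
    for line in lines:
        # The component name is matched as-is; the keyword is matched
        # case-insensitively, as a search term would be.
        if "DatabaseDirectoryManager" in line and "conflicts" in line.lower():
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, pass an open file, for example `conflict_lines(open(path))` where `path` points at your splunkd.log.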


jagdish0886
Explorer

I have a similar issue: my Docker container for Splunk exits with the error below on restart (it runs fine as long as I keep it up). Has anyone found a solution for this?


TASK [splunk_common : Start Splunk via cli] ************************************
fatal: [localhost]: FAILED! => {"changed": false, "cmd": ["/opt/splunk/bin/splunk", "start", "--accept-license", "--answer-yes", "--no-prompt"], "delta": "0:05:20.859094
", "end": "2020-04-18 09:15:03.654801", "msg": "non-zero return code", "rc": 1, "start": "2020-04-18 09:09:42.795707", "stderr": "\n\nBypassing local license checks since this
 instance is configured with a remote license master.", "stderr_lines": ["", "", "Bypassing local license checks since this instance is configured with a remote license master
."], "stdout": "splunkd 268 was not running.\nStopping splunk helpers...\n\nDone.\nStopped helpers.\nRemoving stale pid file... done.\n\nSplunk> Winning the War on Error\n\nCh
ecking prerequisites...\n\tChecking http port [8000]: open\n\tChecking mgmt port [8089]: open\n\tChecking appserver port [127.0.0.1:8065]: open\n\tChecking kvstore port [8191]
: open\n\tChecking configuration... Done.\n\tChecking critical directories...\tDone\n\tChecking indexes...\n\t\tValidated: _audit _internal _introspection _telemetry _thefishb
ucket history main summary\n\tDone\n\tChecking filesystem compatibility...  Done\n\tChecking conf files for problems...\n\tDone\n\tChecking default conf files for edits...\n\t
Validating installed files against hashes from '/opt/splunk/splunk-7.3.0-657388c7a488-linux-2.6-x86_64-manifest'\n\tAll installed files intact.\n\tDone\n\tChecking replication
_port port [8050]: open\nAll preliminary checks passed.\n\nStarting splunk server daemon (splunkd)...  \nDone\n\n\nWaiting for web server at http://127.0.0.1:8000 to be availa
ble............................................................................................................................................................................
................................................................................................................................\n\nWARNING: web interface does not seem to be 
available!", "stdout_lines": ["splunkd 268 was not running.", "Stopping splunk helpers...", "", "Done.", "Stopped helpers.", "Removing stale pid file... done.", "", "Splunk> W
inning the War on Error", "", "Checking prerequisites...", "\tChecking http port [8000]: open", "\tChecking mgmt port [8089]: open", "\tChecking appserver port [127.0.0.1:8065
]: open", "\tChecking kvstore port [8191]: open", "\tChecking configuration... Done.", "\tChecking critical directories...\tDone", "\tChecking indexes...", "\t\tValidated: _au
dit _internal _introspection _telemetry _thefishbucket history main summary", "\tDone", "\tChecking filesystem compatibility...  Done", "\tChecking conf files for problems..."
, "\tDone", "\tChecking default conf files for edits...", "\tValidating installed files against hashes from '/opt/splunk/splunk-7.3.0-657388c7a488-linux-2.6-x86_64-manifest'",
 "\tAll installed files intact.", "\tDone", "\tChecking replication_port port [8050]: open", "All preliminary checks passed.", "", "Starting splunk server daemon (splunkd)... 
 ", "Done", "", "", "Waiting for web server at http://127.0.0.1:8000 to be available...........................................................................................
...............................................................................................................................................................................
..................................", "", "WARNING: web interface does not seem to be available!"]}
PLAY RECAP *********************************************************************
localhost                  : ok=18   changed=1    unreachable=0    failed=1    skipped=16   rescued=0    ignored=0


richgalloway
SplunkTrust

@jagdish0886 This question is nearly 4 years old with an accepted answer. If the answer does not work for you then please post a new question describing your problem.

---
If this reply helps you, Karma would be appreciated.

johnansett
Communicator

I just had this issue too. A quick tail of the splunkd log showed a permissions issue with my data model summaries, but I suspect any permissions issue could have caused it. Check the splunkd log first and foremost, and make sure all permissions are correct and that the index directories can be written to by the Splunk runtime user.
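The permissions check can be sketched in a few lines of Python. This is only an illustration: `find_unwritable` is a made-up name, and it has to be run as the same user that runs splunkd, because `os.access` tests the permissions of the current user (running it as root would report everything as writable).

```python
import os


def find_unwritable(root):
    """Return every path at or under root that the current user cannot write to."""
    problems = []
    if not os.access(root, os.W_OK):
        problems.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        # Check subdirectories as well as files: os.walk silently skips
        # directories it cannot list, so they must be tested here.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.W_OK):
                problems.append(path)
    return sorted(problems)
```

Point it at the directory that holds your indexes; an empty list means nothing under that directory is blocked for the current user.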



bagarwal
Path Finder

Thanks for the response. When I searched the splunkd logs, I was able to find the following error:
ERROR DatabaseDirectoryManager - idx=_audit bucket=db_1461821679_1461820976_6 Detected directory manually copied into its database, causing id conflicts

[path1='...../splunk/audit/db/hot_v1_6'
path2='...../splunk/var/lib/splunk/audit/db/db_1461821679_1461820976_6']

Later I renamed hot_v1_6 to hot_v1_7 (changing the ID), and splunkd started running. Good news? Not really, as logs were not being received by the Splunk indexer. I have since reinstalled Splunk and it is now working fine, but that is not what I wanted to do, as reinstalling was the last resort.

If there is a workaround to resolve this, I would be happy to know, as it will be useful in the future.
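For anyone who wants to spot this kind of clash before restarting, here is a rough Python sketch that groups bucket directory names by their trailing ID. The naming patterns (`hot_v1_<id>` and `db_<newest>_<oldest>_<id>`) are an assumption inferred only from the two names in the error above, and `find_id_conflicts` is a made-up helper, so treat it as a starting point rather than a definitive check.

```python
import re
from collections import defaultdict

# Assumed naming, inferred from the error message in this thread:
#   hot_v1_<id>   and   db_<newest>_<oldest>_<id>
_BUCKET_ID = re.compile(r"^(?:hot_v\d+_(\d+)|db_\d+_\d+_(\d+))$")


def find_id_conflicts(names):
    """Map each bucket ID used by more than one directory name to those names."""
    by_id = defaultdict(list)
    for name in names:
        match = _BUCKET_ID.match(name)
        if match:
            bucket_id = int(match.group(1) or match.group(2))
            by_id[bucket_id].append(name)
    # Keep only IDs that appear more than once.
    return {i: sorted(n) for i, n in by_id.items() if len(n) > 1}
```

Feeding it the two names from the error reports ID 6 as shared by both. Since path1 and path2 in the error sit under different directories, the names from both locations would need to be combined into one list (for example with `os.listdir` on each) before calling it.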
