Enabling LDAP - splunkd crash on startup.
Running Splunk standalone (i.e. not clustered as per previous post)
Splunk v4.1.4 (build 82143).
LDAP against Windows Server 2003 Active Directory. The server being queried hosts a global catalog.
ldapsearch tests for both groups & users are successful as per the Splunk docs.
Have set groupBaseFilter to only include (cn=APP-Splunk*) groups (3 exist)
Have set userBaseFilter to only include my account (cn=myname)
splunkd_stderr.log says: src/tcmalloc.cc:353] Attempt to free invalid pointer: 0x1b00010
Last line in splunkd.log says: INFO loader - Instantiated plugin: thruputprocessor
Running on physical box with 8 cores & 16GB RAM. SLES 11 amd64.
Reverting to Splunk (internal) authentication allows Splunk to start cleanly.
Crash log output below.
Received fatal signal 6 (Aborted).
Signal sent by PID 29447 running under UID 0.
Crashing thread: Main Thread
RIP: [0x00007F38F5A9C645] gsignal + 53 (/lib64/libc.so.6)
[0x00007F38F5A9DC33] abort + 387 (/lib64/libc.so.6)
[0x0000000000AC36EF] ? (splunkd)
[0x0000000000AC38A6] _ZN22TCMalloc_CrashReporter12PrintfAndDieEPKcz + 150 (splunkd)
[0x0000000000ABC08B] _ZN123_GLOBAL__N__ZN61FLAG__namespace_do_not_use_directly_use_DECLARE_int64_instead43FLAGS_tcmalloc_large_alloc_report_thresholdE11InvalidFreeEPv + 43 (splunkd)
[0x0000000000DD7D35] tc_free + 453 (splunkd)
[0x00007F38F5B4A10D] __res_iclose + 189 (/lib64/libc.so.6)
[0x00007F38F5B75234] ? (/lib64/libc.so.6)
[0x00007F38F5B751C2] __libc_thread_freeres + 34 (/lib64/libc.so.6)
[0x00007F38F7052083] ? (/lib64/libpthread.so.0)
[0x00007F38F5B3D10D] clone + 109 (/lib64/libc.so.6)
Linux / myserver / 126.96.36.199-0.1-default / #1 SMP 2010-02-22 16:49:47 +0100 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
src/tcmalloc.cc:353] Attempt to free invalid pointer: 0x1b00010
/etc/SuSE-release: SUSE Linux Enterprise Server 11 (x86_64)
glibc version: 2.9
glibc release: stable
Threads running: 14
Thanks jrodman. I've written my own heartbeat cluster resource agent which seems to work OK. I have extended the timeouts for start/stop (from default recommended cluster timings) and it now starts, stops and is monitored correctly. The points you've raised are very valid and I'll now be sure to test it thoroughly with those in mind.
So at this stage I have a working clustered setup with DRBD, a file system, a virtual IP, syslog-ng (a separate instance; I know Splunk supports syslog over UDP out of the box, but I need it for other reasons), and Splunk, which successfully starts, stops and fails over.
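For anyone wanting a starting point, here is a rough sketch of the shape such a resource agent can take. It is not the agent described above, just an illustration of mapping the cluster's start/stop/monitor actions onto the Splunk CLI. The install path `/opt/splunk` is an assumption, as is the idea that `splunk status` exits non-zero when splunkd is down (check this on your version). The names `handle`, `run_splunk` and `main` are made up for the example. A real OCF agent also has to answer `meta-data` and `validate-all`, which are left out here.

```python
import subprocess

# Exit codes from the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Assumption: Splunk is installed on the DRBD-backed file system at this path.
SPLUNK_BIN = "/opt/splunk/bin/splunk"


def run_splunk(subcommand):
    """Run `splunk <subcommand>` and return its exit status."""
    return subprocess.call([SPLUNK_BIN, subcommand])


def handle(action, run=run_splunk):
    """Translate a cluster action into Splunk CLI calls and an OCF exit code."""
    if action == "monitor":
        # Assumption: a non-zero exit from `splunk status` means "not running".
        return OCF_SUCCESS if run("status") == 0 else OCF_NOT_RUNNING
    if action == "start":
        if run("status") == 0:
            return OCF_SUCCESS  # already running; start should be idempotent
        return OCF_SUCCESS if run("start") == 0 else OCF_ERR_GENERIC
    if action == "stop":
        if run("status") != 0:
            return OCF_SUCCESS  # already stopped; stop should be idempotent
        return OCF_SUCCESS if run("stop") == 0 else OCF_ERR_GENERIC
    # meta-data, validate-all and anything else are not covered by this sketch.
    return OCF_ERR_UNIMPLEMENTED


def main(argv):
    """Entry point: the cluster passes the action as the first argument."""
    action = argv[1] if len(argv) > 1 else ""
    return handle(action)
```

To use it as an executable agent, the file would end with `sys.exit(main(sys.argv))` after an `import sys`. The start/stop timeouts themselves are set in the cluster configuration rather than in the agent, which is why they don't appear here.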
I have an environment that is small enough for a simple single-server Splunk setup, but the data itself and access to Splunk are very important, so I have configured a 2-node High Availability SUSE Linux cluster (SLES 11 amd64) with a clustered DRBD storage back-end, file system and virtual IP.
I have installed Splunk into the DRBD storage area so that it can fail over between the 2 cluster nodes. This gives me everything I need except for clustering the Splunk services.
Does anyone, by chance, have an example cluster cib.xml file, or the cib entries that would be applicable for Splunk? I'm assuming it would use a generic-service resource agent, as I could not find any cluster resource agents specific to Splunk.
Just trying to save myself a lot of work doing this from scratch. If no one has this info and I'm successful, I'll be more than happy to post back how it's done.