Crash logs galore :(

jonathan_cooper
Communicator

I'm trying to figure out what these crash logs mean. I get at least one every minute, sometimes several:

[build 149561] 2013-08-02 14:40:02
Received fatal signal 6 (Aborted).
 Cause:
   Signal sent by PID 22532 running under UID 0.
 Crashing thread: dispatch
 Registers:
    RIP:  [0x00000037B5A30285] gsignal + 53 (/lib64/libc.so.6)
    RDI:  [0x0000000000005804]
    RSI:  [0x0000000000005810]
    RBP:  [0x00002AF474800940]
    RSP:  [0x00002AF4747FD568]
    RAX:  [0x0000000000000000]
    RBX:  [0x00002AF4747FD610]
    RCX:  [0xFFFFFFFFFFFFFFFF]
    RDX:  [0x0000000000000006]
    R8:  [0x0000000000000080]
    R9:  [0x0101010101010101]
    R10:  [0x0000000000000008]
    R11:  [0x0000000000000202]
    R12:  [0x00002AF4753BB8D0]
    R13:  [0x00002AF4753BBA78]
    R14:  [0x00002AF4751A1B40]
    R15:  [0x00002AF4747FDB80]
    EFL:  [0x0000000000000202]
    TRAPNO:  [0x0000000000000000]
    ERR:  [0x0000000000000000]
    CSGSFS:  [0x0000000000000033]
    OLDMASK:  [0x0000000000000000]
 OS: Linux
 Arch: x86-64
 Backtrace:
  [0x00000037B5A30285] gsignal + 53 (/lib64/libc.so.6)
  [0x00000037B5A31D30] abort + 272 (/lib64/libc.so.6)
  [0x00000000012EB52E] _ZN9__gnu_cxx27__verbose_terminate_handlerEv + 318 ([splunkd)
  [0x00000000012EB186] _ZN10__cxxabiv111__terminateEPFvvE + 6 ([splunkd)
  [0x00000000012EB1B3] ? ([splunkd)
  [0x00000000012EB0AF] ? ([splunkd)
  [0x0000000000B3FB5A] _ZN15SearchEvaluator10lispyQueryER3StrR7TimevalS3_R9StrVectorRKS2_S7_b + 474 ([splunkd)
  [0x00000000008AF449] _ZN17IndexScopedSearch4initERK7TimevalS2_bP14LookupOperatorP12FieldAliaserP18CalcFieldProcessorPKSt3setI10CMBucketIdSt4lessISA_ESaISA_EE + 633 ([splunkd)
  [0x0000000000898795] _ZN14SearchOperator8evalArgsER17SearchResultsInfo + 9701 ([splunkd)
  [0x0000000000E29E73] _ZN14SearchPipeline8evalArgsER17SearchResultsInfo + 99 ([splunkd)
  [0x00000000008D168F] _ZN22BucketSummaryProcessor8evalArgsER17SearchResultsInfo + 8991 ([splunkd)
  [0x0000000000E29E73] _ZN14SearchPipeline8evalArgsER17SearchResultsInfo + 99 ([splunkd)
  [0x0000000000ED33A1] _ZN14DispatchThread8evaluateEbb + 16097 ([splunkd)
  [0x0000000000ECB7B1] _ZN14DispatchThread8mainImplEv + 4417 ([splunkd)
  [0x0000000000ECE74E] _ZN14DispatchThread4mainEv + 254 ([splunkd)
  [0x0000000000DA2F32] _ZN6Thread8callMainEPv + 66 ([splunkd)
  [0x00000037B620683D] ? (/lib64/libpthread.so.0)
  [0x00000037B5AD4FAD] clone + 109 (/lib64/libc.so.6)
 Linux / ZAS1UXP-0109 / 2.6.18-348.6.1.el5 / #1 SMP Fri Apr 26 09:21:26 EDT 2013 / x86_64
 Last few lines of stderr (may contain info on assertion failure, but also could be old):
    2013-05-02 16:30:01.334 -0400 splunkd started (build 149561)
    2013-05-09 21:04:08.357 -0400 Interrupt signal received
    2013-05-09 21:08:48.739 -0400 splunkd started (build 149561)
    2013-05-23 12:52:57.349 -0400 Interrupt signal received
    2013-05-23 12:54:03.366 -0400 splunkd started (build 149561)
    2013-06-06 20:04:15.593 -0400 Interrupt signal received
    2013-06-06 20:10:14.241 -0400 splunkd started (build 149561)
    2013-06-17 14:14:49.882 -0400 Interrupt signal received
    2013-06-17 14:15:24.634 -0400 splunkd started (build 149561)
    2013-06-17 14:18:05.896 -0400 Interrupt signal received
    2013-06-17 14:19:14.726 -0400 splunkd started (build 149561)
    2013-06-17 14:32:13.302 -0400 Interrupt signal received
    2013-06-17 14:33:25.088 -0400 splunkd started (build 149561)
 /etc/redhat-release: Red Hat Enterprise Linux Server release 5.9 (Tikanga)
 glibc version: 2.5
 glibc release: stable
Threads running: 3
argv: [splunkd -p 8089 restart]
Process renamed: [splunkd pid=25781] splunkd -p 8089 restart [process-runner]
Process renamed: [splunkd pid=25781] search --id=scheduler__admin__ipreputation__RMD526d08b3e8e7938df_at_1375468800_211990 --maxbuckets=0 --ttl=60 --maxout=500000 --maxtime=8640000 --lookups=1 --reduce_freq=10 --user=admin --pro --roles=admin:power:user
Crash log write attempted over the limit of (50kB), skipping.
1 Solution

jonathan_cooper
Communicator

@yannK:

You were correct. I disabled the IP Reputation app and all its saved/scheduled searches, and things have been quiet since. It seems to have a problem with its pre-packaged searches. Thanks for the guidance.

jonathan_cooper
Communicator

@martin_mueller: No, multiple searches

@miteshvohra: Yes

VERSION=5.0.2
BUILD=149561
PRODUCT=splunk
PLATFORM=Linux-x86_64

@yannK:

I will look through SOS again to see if I can pinpoint the cause.

Thanks for all the feedback.

yannK
Splunk Employee

This is clearly a dispatch error, most likely a crash of a search job process: the crashing thread is "dispatch", and the process had been renamed to a search whose ID starts with scheduler__admin__ipreputation. You probably have a scheduled search, maybe one from "ipreputation", that is going nuts, possibly with runaway memory usage. Install the SOS app and turn on the ps_sos.sh (Linux) or ps_sos.ps1 (Windows PowerShell) script to monitor the memory usage of your searches.
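If you want a quick look before SOS is installed, here is a rough sketch of the same idea. It is not the ps_sos.sh script, just a stand-in. It assumes search processes show up in `ps` with a title like the "Process renamed" lines in the crash log ("[splunkd pid=...] search --id=..."); the function name `parse_search_processes` is made up for this example.

```python
import re

# Assumed from the "Process renamed" line in the crash log; not an official format.
SEARCH_ID_RE = re.compile(r"search --id=(\S+)")


def parse_search_processes(ps_output):
    """Parse text from `ps -eo pid,rss,args` and return splunkd search jobs,
    sorted by resident memory (RSS, in kB), largest first."""
    jobs = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header row and anything that does not look like "pid rss args".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, rss_kb, args = int(parts[0]), int(parts[1]), parts[2]
        if "splunkd" not in args:
            continue
        match = SEARCH_ID_RE.search(args)
        if match:
            jobs.append({"pid": pid, "rss_kb": rss_kb, "search_id": match.group(1)})
    return sorted(jobs, key=lambda job: job["rss_kb"], reverse=True)
```

On Linux you could feed it `subprocess.run(["ps", "-eo", "pid,rss,args"], capture_output=True, text=True).stdout` every few seconds and watch whether one search ID keeps growing.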

miteshvohra
Contributor

Are you using the right build for your OS? I mean 32-bit vs. 64-bit Splunk binaries. Just a thought, since many people overlook that.

martin_mueller
SplunkTrust

Do they all reference the same scheduled search, ipreputation?
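One way to check this across many crash logs is to pull the search ID out of each log's "Process renamed" line and count by app. The sketch below assumes scheduled search IDs always look like the one in the posted log (scheduler__<user>__<app>__...) and that user and app names contain no underscores; both are guesses from a single sample, and the function names are made up for this example.

```python
import re
from collections import Counter

# Taken from the "Process renamed: ... search --id=..." line in the crash log.
SEARCH_ID_RE = re.compile(r"search --id=(\S+)")
# Assumed layout of a scheduled search ID: scheduler__<user>__<app>__<rest>.
SCHEDULED_RE = re.compile(r"^scheduler__(?P<user>[^_]+)__(?P<app>[^_]+)__")


def summarize_crash_log(text):
    """Return (search_id, app) for one crash log, or (None, None) if no search ID is found."""
    match = SEARCH_ID_RE.search(text)
    if not match:
        return None, None
    search_id = match.group(1)
    scheduled = SCHEDULED_RE.match(search_id)
    app = scheduled.group("app") if scheduled else None
    return search_id, app


def tally_apps(crash_logs):
    """Count crash logs per app; logs without a recognizable app count as 'unknown'."""
    counts = Counter()
    for text in crash_logs:
        _, app = summarize_crash_log(text)
        counts[app or "unknown"] += 1
    return counts
```

Read each crash log file into a string (wherever your installation writes them), pass the list to `tally_apps`, and see whether one app dominates the counts.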
