Solved: Failures after Upgrading to 6.0

ShaneNewman · ‎11-26-2013

We upgraded to 6.0 yesterday. Since then, splunkd has been crashing at random intervals, and each time we see this in the crash log:

    [build 182037] 2013-11-26 15:01:46
     Access violation, cannot read at address [0x0000000014F00000]
     Exception address: [0x000000000037C5F4]
     Crashing thread: HTTPDispatch
        MxCsr:  [0x0000000000001F80]
        SegDs:  [0x000000000000002B]
        SegEs:  [0x000000000000002B]
        SegFs:  [0x0000000000000053]
        SegGs:  [0x000000000000002B]
        SegSs:  [0x000000000000002B]
        SegCs:  [0x0000000000000033]
        EFlags:  [0x0000000000010206]
        Rsp:  [0x000000000287FA68]
        Rip:  [0x000000000037C5F4] memmove + 628/1280
        Dr0:  [0x0000000000000000]
        Dr1:  [0x0000000000000000]
        Dr2:  [0x0000000000000000]
        Dr3:  [0x0000000000000000]
        Dr6:  [0x0000000000000000]
        Dr7:  [0x0000000000000000]
        Rax:  [0x000000000000000B]
        Rcx:  [0x000000000BFE0518]
        Rdx:  [0x0000000008F1FAE0]
        Rbx:  [0x00000000104FD3D8]
        Rbp:  [0x0000000000003FD8]
        Rsi:  [0x000000000BFE0038]
        Rdi:  [0x0000000000003FD8]
        R8:  [0x0000000000003FD8]
        R9:  [0x00000000000001D7]
        R10:  [0x000000000000000B]
        R11:  [0x000000000BFE0038]
        R12:  [0x0000000000000002]
        R13:  [0x0000000000000002]
        R14:  [0x0000000000000002]
        R15:  [0x0000000000000000]
        DebugControl:  [0xFFFFF80001024D4A]
        LastBranchToRip:  [0x0000000000000000]
        LastBranchFromRip:  [0x0000000000000000]
        LastExceptionToRip:  [0x0000000000000000]
        LastExceptionFromRip:  [0x0000000000000000]

     OS: Windows
     Arch: x86-64

Backtrace:
  [0x000000000037C5F4] memmove + 628/1280
  [0x000000000FBD0371] FINGERPRINT_premain + 10145/33104
  [0x000000000FBCFEC7] FINGERPRINT_premain + 8951/33104
  [0x000000000FBCE4E5] FINGERPRINT_premain + 2325/33104
  [0x000000000FBCF175] FINGERPRINT_premain + 5541/33104
  [0x000000000FBB83E5] COMP_rle + 709/2272
  [0x000000000FBB7FC7] COMP_compress_block + 55/80
  [0x000000000048468A] SSLv3_client_method + 38346/47744
  [0x0000000000480EC7] SSLv3_client_method + 24071/47744
  [0x000000014079D9C9] ?
  [0x00000001407A0C0C] ?
  [0x00000001407A091F] ?
  [0x000000014079CADB] ?
  [0x0000000140026D2C] ?
  [0x00000001406F720E] ?
  [0x0000000140029537] ?
  [0x0000000000363FEF] beginthreadex + 263/284
  [0x0000000000364196] endthreadex + 402/404
  [0x0000000077D6B70A] BaseThreadStart + 58/80
 Crash dump written to: C:\Program Files\Splunk\var\log\splunk\C__Program Files_Splunk_bin_splunkd_exe_crash-2013-11-26-15-01-46.dmp

TRDVWPTRM01 /5.2 Service Pack 2
GetLastError(): 0
Threads running: 113
argv: [Splunkd -p 8089]
Thread: "HTTPDispatch", did_join=0, ready_to_run=Y, main_thread=N
First 4 bytes of Thread token @00000000023BD344:
00000000  88 07 00 00                                       |....|
00000004
First 128 bytes of CompletionPortItem object @000000001203F560:
00000000  18 af 15 41 01 00 00 00  50 b7 34 25 00 00 00 00  |...A....P.4%....|
00000010  38 96 26 41 01 00 00 00  01 00 00 00 00 00 00 00  |8.&A............|
00000020  03 01 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 10 00 00 00 00 00 00  |................|
00000050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000060  40 97 da 11 00 00 00 00  00 00 00 00 00 00 00 00  |@...............|
00000070  00 00 00 00 00 00 00 00  47 45 54 20 2f 73 65 72  |........GET /ser|
00000080
In TcpChannel 000000001203F560, _tcloop=000000000166FC18, no async write data, _data._shouldKill=N, r/w_timeouts=5000/300000, timeout_count=0
SSL: version="SSLv3", state="SSL negotiation finished successfully", cipher="AES256-SHA", compression="zlib compression"
rbuf: ptr=000000001203F5D8, size=0x2000, rptr=0x0, wptr=0x0
TcpChannelAcceptor: pendingAsyncAccept=1, dyingChannels=0, tcloop=000000000166FC18, activeList=000000001F34F230, activeCount=10, idleThreads=0000000013CCF058, idleThreadCount=43, activeThreadCount=9, maxActiveThreadsInTimePeriod=27, idleThreadTrimmerActive=Y, disabledReasons=0
_thread=000000001401B958: commandForThread=0, nextIdle=0000000011FD4370, requestAfterThread=2, _tpfd=000000001203F560writeCorkCount=1, terminateCallback=0000000000000000, ioError=No error, lastError=No error, terminateError=Connection closed by peer
giveCmd @000000001401BA78: _queuedOn=0000000000000000, ran=N, wantWake=N, wantFailIfLoopDone=N, cmd=2, ok=Y, chan=000000001203F560
writeDataAvail @000000001401BAB0: _queuedOn=0000000000000000, ran=N, wantWake=N, wantFailIfLoopDone=N, chan=000000001203F560
wbuf: ptr=000000001401BB18, size=0x8000, rptr=0x0, wptr=0x8000


x86 CPUID registers:
         0: 0000000A 756E6547 6C65746E 49656E69
         1: 000006FB 09040800 0004E3BD BFEBFBFF
         2: 05B0B101 005657F0 00000000 2CB43048
         3: 00000000 00000000 00000000 00000000
         4: 0C000121 01C0003F 0000003F 00000001
         5: 00000040 00000040 00000003 00002220
         6: 00000001 00000002 00000001 00000000
         7: 00000000 00000000 00000000 00000000
         8: 00000400 00000000 00000000 00000000
         9: 00000000 00000000 00000000 00000000
         A: 07280202 00000000 00000000 00000503
  80000000: 80000008 00000000 00000000 00000000
  80000001: 00000000 00000000 00000001 20100800
  80000002: 65746E49 2952286C 6F655820 2952286E
  80000003: 55504320 20202020 20202020 45202020
  80000004: 30333337 20402020 30342E32 007A4847
  80000005: 00000000 00000000 00000000 00000000
  80000006: 00000000 00000000 0C007040 00000000
  80000007: 00000000 00000000 00000000 00000000
  80000008: 00003028 00000000 00000000 00000000
terminating...

Splunk crashes; a restart gets it working for a while, and then it crashes again. Any ideas on what I need to do to correct this?

jordanperks · ‎11-26-2013

There are people here much smarter than me, but I think "argv: [Splunkd -p 8089]" is the source of the crash. 8089 is the default port that splunkd listens on for management traffic. The disableDefaultPort = [true|false] setting is documented here: http://www.splunk.com/base/Documentation/latest/admin/Serverconf. I found that in this thread: http://answers.splunk.com/answers/26236/universal-forwarder-listening-on-port-8089
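
For illustration, here is a minimal sketch of where that setting would go. It assumes the setting belongs under the [httpServer] stanza of server.conf (for example in $SPLUNK_HOME/etc/system/local/server.conf), so check the linked server.conf documentation for your version before relying on it. Setting it to true stops splunkd from listening on the management port, so it only makes sense on an instance that does not need to be reached on 8089, such as the universal forwarder discussed in the linked thread.

```ini
# server.conf (assumed location: $SPLUNK_HOME/etc/system/local/server.conf)
[httpServer]
# true  = do not listen on the splunkd management port (8089 by default)
# false = keep listening on the management port (the default behaviour)
disableDefaultPort = true
```

A restart of splunkd is needed for a change like this to take effect.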

ShaneNewman · ‎11-27-2013

I created a critical ticket with Splunk. It turns out we needed to deploy the pre-release of Splunk. The issue, then, was that Splunk 6.0.0.2 does not work on WS2003R2, and a bug has been filed.
