Good morning,
Our SH cluster has gone down several times and we do not know the cause. Could someone give us some support?
[build 2e75b3406c5b] 2019-03-11 19:36:50
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 10042 running under UID 501.
Crashing thread: TcpChannelThread
Registers:
RIP: [0x00007F20C087F625] gsignal + 53 (libc.so.6 + 0x32625)
RDI: [0x000000000000273A]
RSI: [0x0000000000004266]
RBP: [0x00007F20C3F31C00]
RSP: [0x00007F207B5FBF88]
RAX: [0x0000000000000000]
RBX: [0x00007F20C1E16000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0x0000000000000020]
R9: [0xFEFEFEFEFEFEFF09]
R10: [0x0000000000000008]
R11: [0x0000000000000206]
R12: [0x00007F20C3E80645]
R13: [0x00007F20C400E260]
R14: [0x0000000000000000]
R15: [0x0000000000000000]
EFL: [0x0000000000000206]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace (PIC build):
[0x00007F20C087F625] gsignal + 53 (libc.so.6 + 0x32625)
[0x00007F20C0880E05] abort + 373 (libc.so.6 + 0x33E05)
[0x00007F20C087874E] ? (libc.so.6 + 0x2B74E)
[0x00007F20C0878810] __assert_perror_fail + 0 (libc.so.6 + 0x2B810)
[0x00007F20C2E9D70C] _ZN16SearchResultsMem8deepCopyERKS_P22ArenaStrPoolCopyHelpermm + 572 (splunkd + 0x102470C)
[0x00007F20C2E9D959] _ZN16SearchResultsMem6appendERKS_mm + 105 (splunkd + 0x1024959)
[0x00007F20C2E8D3E6] _ZN18SearchResultsFiles15appendMultiFileERS_b + 454 (splunkd + 0x10143E6)
[0x00007F20C374C5CB] _ZN15AppendProcessor7executeER18SearchResultsFilesR17SearchResultsInfo + 331 (splunkd + 0x18D35CB)
[0x00007F20C36CE62D] _ZN15SearchProcessor16execute_dispatchER18SearchResultsFilesR17SearchResultsInfoRK3Str + 237 (splunkd + 0x185562D)
[0x00007F20C36BEAB7] _ZN14SearchPipeline7executeER18SearchResultsFilesR17SearchResultsInfo + 279 (splunkd + 0x1845AB7)
[0x00007F20C3AD34C7] _ZN22HandleJobsDataProvider19handleResultsGetterEPN14DispatchSearch28GenericOutputResultsAcceptorERK3StrRK18SearchResultsFilesRK10StrSegmentbRK8Pathname + 3911 (splunkd + 0x1C5A4C7)
[0x00007F20C3AD5D60] _ZN22HandleJobsDataProvider31executeEventResultPreviewActionERK3StrRK19SearchJobStatusDataRK13HttpArgumentsP6StrSetRK10StrSegmentb + 144 (splunkd + 0x1C5CD60)
[0x00007F20C3AD5EC7] _ZN22HandleJobsDataProvider17handleStatusQueryERK21UserTimezoneSpecifierP6StrSetRK3StrP24CachedJobStatusReferenceRK10StrSegmentRK10HttpMethodRK13HttpArgumentsRb + 215 (splunkd + 0x1C5CEC7)
[0x00007F20C3AD86A6] _ZN22HandleJobsDataProvider10handleJobsERK21UserTimezoneSpecifier10HttpMethodRK3StrRK10StrSegmentbRK13HttpArguments + 1094 (splunkd + 0x1C5F6A6)
[0x00007F20C3AD97C2] _ZN22HandleJobsDataProvider18handleWithTimezoneERK21UserTimezoneSpecifier + 754 (splunkd + 0x1C607C2)
[0x00007F20C3B06E53] _ZN38DispatchSearchDataProviderWithTimezone21handleWithoutTimezoneEv + 435 (splunkd + 0x1C8DE53)
[0x00007F20C3A91FD9] _ZN26DispatchSearchDataProvider2goEv + 41 (splunkd + 0x1C18FD9)
[0x00007F20C2D53E78] _ZN33ServicesEndpointReplyDataProvider9rawHandleEv + 88 (splunkd + 0xEDAE78)
[0x00007F20C2D492AF] _ZN18RawRestHttpHandler10getPreBodyEP21HttpServerTransaction + 31 (splunkd + 0xED02AF)
[0x00007F20C31F5930] _ZN32HttpThreadedCommunicationHandler11communicateER17TcpSyncDataBuffer + 272 (splunkd + 0x137C930)
[0x00007F20C271F04A] _ZN16TcpChannelThread4mainEv + 218 (splunkd + 0x8A604A)
[0x00007F20C328132F] _ZN6Thread8callMainEPv + 111 (splunkd + 0x140832F)
[0x00007F20C0BE8A51] ? (libpthread.so.0 + 0x7A51)
[0x00007F20C093596D] clone + 109 (libc.so.6 + 0xE896D)
Linux / splunk_searchhead03_cnt / 2.6.32-573.el6.x86_64 / #1 SMP Wed Jul 1 18:23:37 EDT 2015 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
splunkd: /home/build/build-src/nightlight/src/framework/SearchResultsMem.cpp:354: void SearchResultsMem::deepCopy(const SearchResultsMem&, ArenaStrPoolCopyHelper*, size_t, size_t): Assertion `_defaultMvDelim.empty()' failed.
2019-03-11 17:56:59.116 -0300 splunkd started (build 2e75b3406c5b)
splunkd: /home/build/build-src/nightlight/src/framework/SearchResultsMem.cpp:354: void SearchResultsMem::deepCopy(const SearchResultsMem&, ArenaStrPoolCopyHelper*, size_t, size_t): Assertion `_defaultMvDelim.empty()' failed.
splunkd: /home/build/build-src/nightlight/src/framework/SearchResultsMem.cpp:354: void SearchResultsMem::deepCopy(const SearchResultsMem&, ArenaStrPoolCopyHelper*, size_t, size_t): Assertion `_defaultMvDelim.empty()' failed.
/etc/redhat-release: Red Hat Enterprise Linux Server release 6.7 (Santiago)
glibc version: 2.12
glibc release: stable
Last errno: 0
Threads running: 86
Runtime: 5991.658114s
argv: [splunkd -p 8089 start]
Regex JIT disabled due to SELinux
using CLOCK_MONOTONIC
Thread: "TcpChannelThread", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7f20624650d0:
00000000 00 f7 5f 7b 20 7f 00 00 |.._{ ...|
00000008
commandForThread=0, nextIdle=0x7f2098b99900, requestAfterThread=0, _tpfd=0x7f2098a29800, writeCorkCount=0, terminateCallback=(nil), ioError=No error, lastError=No error, terminateError=No error
giveCmd @0x7f2062465230: _queuedOn=(nil), ran=N, wantWake=N, wantFailIfLoopDone=N, cmd=0, ok=Y, chan=0x7f2098aaf800
writeDataAvail @0x7f2062465290: _queuedOn=(nil), ran=N, wantWake=N, wantFailIfLoopDone=N, chan=0x7f2098aaf800
wbuf: ptr=0x7f2062465330, size=0x8000, rptr=0x0, wptr=0x0
HttpListeningConnection: _transactionActive=Y, _haveHadTransaction=Y, _alreadyLoggedTimeout=N
HttpTcpConnection: peer=172.16.70.186, _desiredCompressionLevel=6
RestHttpServerTransaction: _restPath="search/jobs/packetcore__packetcore_UEFDS0VUX0NPUkU__RMD56deb342c100fc05e_1552343796.1088_D75AD645-1FB9-499B-9D7B-E9A513BABFA9/results_preview", namespaced=N, context=-/-, session=[user=packetcore, refcnt=7, touched=1552343810, refreshEligible=1552343895, removed=N, id=e85683b136ed38a6c2d92f1f8b5123ca, created=1552323168, refreshed=1552343670, expires=1552347270, initialLife=3600, createdBy=D75AD645-1FB9-499B-9D7B-E9A513BABFA9, portable, ip=172.16.70.186, csrf=16487147686831632066]
HttpServerTransaction: _state=6, _shouldLog=Y, _startTime=1552343810.555153524851
REQUEST: GET /en-US/splunkd/__raw/services/search/jobs/packetcore__packetcore_UEFDS0VUX0NPUkU__RMD56deb342c100fc05e_1552343796.1088_D75AD645-1FB9-499B-9D7B-E9A513BABFA9/results_preview?output_mode=json&search=search%20alert_state_interf%3E0%20%20id%3D%22cl-ml%22%20%20%20%7Crename%20ciudad%20as%20Comuna%2Cclave%20as%20%22Nombre%20Equipo%3APuerta%22%2C%20estado_interf%20as%20%22Estado%20Puerta%22%2C%20alert_state_interf%20as%20Estado_Puerta%20%7Ceval%20Estado_Limite%3Dalert_lim_interf%20%7Ceval%20%22%25%20Utilizacion%20Limite%20Interfaz%22%3D%20%22IN%3D%22.Porcentaje_interf_int.%22%25%20%20OUT%3D%22.Porcentaje_interf_out.%22%25%22%20%7Ceval%20%22Estado%20Trafico%22%3D%20%22IN%3D%22.round(traf_int%2C2).%22%20gbps%20%20OUT%3D%22.%20round(traf_out%2C2).%22%20gbps%22%20%7Ceval%20Estado_Trafico%3Dalert_traffic_interf%20%20%20%20%20%20%20%20%7Ctable%20datetime%2CComuna%2C%22Nombre%20Equipo%3APuerta%22%2C%20%22Estado%20Puerta%22%2C%20Estado_Puerta%7C%20appendpipe%20%5Bstats%20count%20as%20Sin_resultados%20%7C%20where%20Sin_resultados%3D0%5D%20%7C%20stats%20count&_=1552343796942 HTTP/1.1
Host: 172.18.143.136:8000
Connection: keep-alive
Accept: text/javascript, text/html, application/xml, text/xml, */*
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
X-Requested-With: XMLHttpRequest
Accept-Encoding: gzip, deflate
Accept-Language: es-ES,es;q=0.9
Cookie: mintjs%3Auuid=ec652fbe-e9b7-4813-80da-60f636105a9e; splunkweb_csrf_token_8000=16487147686831632066; session_id_8000=4136116f7bfc06e09ffa3ab8ddc1fa033dad4b2c; token_key=16487147686831632066; experience_id=b6e225a0-05d2-be97-2826-0106c92b3df7; splunkd_8000=akbksU0^HYSJ2ccEREa06hPvyiFToSp4E6fYTOl1ZAevd2d6Ly^CGuTyt6mbtUoyvpoYxSgpsWQj^94F5NH1qSrllIw^B1pqqrHkF5HbtxYrHW6x1PuQuSDPcDMTmyg2pHW39i8djQwcoN2AqmY
_bytesReceived=0, _maximumRequestDataSize=0, _totalBytesExpectedOfRequestData=-1
_bytesLeftInRequestDataChunk=0, _requestTransferEncodingIsChunked=N, _receivingRequestDataForever=N
_needToSetupRequestGunzip=N, _owedConsume=7305804385234272835, _wantSavedRequestData=N
_100continue=0, _expectDisconnect=N, _overrideSourceState=0
POST arguments: {}
REPLY: 200
Set-Cookie: splunkd_8000=akbksU0^HYSJ2ccEREa06hPvyiFToSp4E6fYTOl1ZAevd2d6Ly^CGuTyt6mbtUoyvpoYxSgpsWQj^94F5NH1qSrllIw^B1pqqrHkF5HbtxYrHW6x1PuQuSDPcDMTmyg2pHW39i8djQwcoN2AqmY; Path=/; HttpOnly; Max-Age=3600; Expires=Mon, 11 Mar 2019 23:36:50 GMT
Set-Cookie: splunkweb_csrf_token_8000=16487147686831632066; Path=/; Max-Age=157680000; Expires=Sat, 09 Mar 2024 22:36:50 GMT
ServicesEndpointReplyDataProvider: _setupState=0, _outputMode=2, _explicitOutputMode=Y
GET args: {["search"] = "search alert_state_interf>0 id="cl-ml" |rename ciudad as Comuna,clave as "Nombre Equipo:Puerta", estado_interf as "Estado Puerta", alert_state_interf as Estado_Puerta |eval Estado_Limite=alert_lim_interf |eval "% Utilizacion Limite Interfaz"= "IN=".Porcentaje_interf_int."% OUT=".Porcentaje_interf_out."%" |eval "Estado Trafico"= "IN=".round(traf_int,2)." gbps OUT=". round(traf_out,2)." gbps" |eval Estado_Trafico=alert_traffic_interf |table datetime,Comuna,"Nombre Equipo:Puerta", "Estado Puerta", Estado_Puerta| appendpipe [stats count as Sin_resultados | where Sin_resultados=0] | stats count"}
_allowedMethods={GET,POST,PUT,DELETE,HEAD,OPTIONS}, _preconditionState=0
_wantsSeparateThread=N, _alreadyBuiltHeaders=N, _needToSendBody=Y
_bodyBytesWritten=0, _chunkedState=0, _isLastTransaction=N
_varyBy=0x8, _redirectUrl="", _downloadFilename="", _totalScheduledLength=0
_willSendDataLater=N, _toSendState=0, _toSendSafe=Y
_knowCompleteLength=N, _desiredCompressionLevel=6
_replyIsGzipCompressed=N, _cacheControl=0x0, _maxCacheSeconds=4294967295, _dontIncludeFrameOptions=N
In TcpChannel 0x7f2098a29800, _tcloop=0x7f209baaa288, no async write data, _data._shouldKill=N, r/w_timeouts=5.000/300.000, timeout_count=0
SSL: inactive
rbuf: ptr=0x7f2098a298a0, size=0x2000, rptr=0x0, wptr=0x0
TcpChannelAcceptor: , tcloop=0x7f209baaa288, _disabledReasons=0, _activeCount=9, _inflightSubordinateAccepts=0
HttpListener: ssl=N, _maxActiveConnections=1365, _wellBelowConnectionLimit=Y, _maxThreads=1365
SplunkdHttpListener: PORT: _allowGzip=Y, bind=http://:8000
conf: _sslopt={rootCAPath="", caCertFile="", certFile="", privateKeyFile="", privateKeyPassword_set=N, commonNameToCheck="", altNameToCheck="", allowSslRenegotiation=Y, sslVersions="SSL3,TLS1.0,TLS1.1,TLS1.2", cipherSuite="", ecdhCurves="", useCompression=N, quietShutdown=NdhFile="", shouldVerifyClientCert=N}, _allowSslRenegotiation=Y, _frameOptionsSameOrigin=Y, _strictTransportSecurityHeader=N, _allowBasicAuth=N, _allowCookieAuth=N
conf: _streamInWriteTimeout=5.000, _maxContentLength=524288000, _maxThreads=1365, _maxSockets=1365, _forceHttp10=0
_thread=0x7f20624650c0: commandForThread=0, nextIdle=0x7f2098b99900, requestAfterThread=0, _tpfd=0x7f2098a29800, writeCorkCount=0, terminateCallback=(nil), ioError=No error, lastError=No error, terminateError=No error
giveCmd @0x7f2062465230: _queuedOn=(nil), ran=N, wantWake=N, wantFailIfLoopDone=N, cmd=0, ok=Y, chan=0x7f2098aaf800
writeDataAvail @0x7f2062465290: _queuedOn=(nil), ran=N, wantWake=N, wantFailIfLoopDone=N, chan=0x7f2098aaf800
wbuf: ptr=0x7f2062465330, size=0x8000, rptr=0x0, wptr=0x0
x86 CPUID registers:
0: 0000000B 756E6547 6C65746E 49656E69
1: 000206C2 22200800 029EE3FF BFEBFBFF
2: 55035A01 00F0B2FF 00000000 00CA0000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000040 00000040 00000003 00001120
6: 00000007 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300403 00000004 00000000 00000603
B: 00000000 00000000 0000007D 00000022
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 2C100800
80000002: 65746E49 2952286C 6F655820 2952286E
80000003: 55504320 20202020 20202020 45202020
80000004: 35343635 20402020 30342E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003028 00000000 00000000 00000000
terminating...
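For anyone triaging a similar crash log, the most useful lines are the assertion messages in the stderr section, since they name the source file, line, and failed expression. Below is a minimal Python sketch that pulls them out. It assumes the glibc-style format seen in this log (`program: file:line: function: Assertion ... failed.`); the function name `find_assertions` is made up for illustration and is not part of any Splunk tooling.

```python
import re

# Matches glibc-style assertion messages as they appear in the stderr
# section of the crash log above. The format is an assumption based on
# this log only.
ASSERT_RE = re.compile(
    r"^(?P<program>\S+): (?P<file>\S+):(?P<line>\d+): (?P<func>.+): "
    r"Assertion `(?P<expr>.+)' failed\.$"
)


def find_assertions(log_text):
    """Return unique (file, line, expression) tuples, in order of appearance."""
    seen = []
    for raw in log_text.splitlines():
        match = ASSERT_RE.match(raw.strip())
        if match:
            entry = (match["file"], int(match["line"]), match["expr"])
            if entry not in seen:
                seen.append(entry)
    return seen


if __name__ == "__main__":
    sample = (
        "splunkd: /home/build/build-src/nightlight/src/framework/"
        "SearchResultsMem.cpp:354: void SearchResultsMem::deepCopy("
        "const SearchResultsMem&, ArenaStrPoolCopyHelper*, size_t, size_t): "
        "Assertion `_defaultMvDelim.empty()' failed."
    )
    for path, line, expr in find_assertions(sample):
        print(f"{path}:{line} -> {expr}")
```

Searching for the extracted expression together with the crashing thread name is usually a quicker way to find a matching known issue than searching for the raw backtrace.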
I am answering this question myself because a solution was given on the ticket.
I quote:
In regards to SPL-159979 - Crashing Thread: TcpChannelThread - Post process search using stats fields with null crashes Splunk
This bug was fixed with the 7.1.6 release and the 7.2.4 release as well.
We verified that the 7.2.4.2 release notes document this issue under two case numbers, so we upgraded Splunk, and the service no longer crashes when running a query or viewing a panel.
2018-12-03 SPL-163063, SPL-159979 Crashing Thread: TcpChannelThread - Post process search using stats fields with null crashes Splunk
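To check whether a given search head already includes the fix, the two release numbers quoted above can be compared against the installed version. This is only a sketch in Python: the helper names `parse_version` and `has_fix` are hypothetical, and the quote only covers the 7.1.x and 7.2.x lines, so any other release line returns `None` (unknown) rather than a guess.

```python
# Fixed releases quoted above: 7.1.6 for the 7.1.x line, 7.2.4 for the 7.2.x line.
# Other release lines are not mentioned in the quote, so they are left out.
FIXED_IN = {
    (7, 1): (7, 1, 6),
    (7, 2): (7, 2, 4),
}


def parse_version(text):
    """Turn a dotted version string such as '7.2.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """Return True or False for 7.1.x and 7.2.x, or None when the quote does not say."""
    version = parse_version(version_text)
    fixed = FIXED_IN.get(version[:2])
    if fixed is None:
        return None
    # Tuple comparison is element-wise, so (7, 2, 4, 2) >= (7, 2, 4) holds.
    return version >= fixed


if __name__ == "__main__":
    for candidate in ("7.1.5", "7.1.6", "7.2.3", "7.2.4.2", "7.0.0"):
        print(candidate, has_fix(candidate))
```

The installed version string has to be supplied by hand here; the sketch does not query Splunk itself.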
I hope this information helps someone, as this issue caused us many problems.
Regards
@aecruzp Please accept the answer to help future readers find this solution.