- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
kv store failing again and again After Splunk Upgrade, out of memory errors (6.6.3 --> 7.1.6)
vishaltaneja070
Motivator
02-15-2019
12:25 AM
Here is the mongod.log
2019-02-15T08:12:44.837Z W CONTROL [main] net.ssl.sslCipherConfig is deprecated. It will be removed in a future release.
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] MongoDB starting : pid=8956 port=8191 dbpath=D:\Splunk\var\lib\splunk\kvstore\mongo 64-bit host=XXXXXX
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] db version v3.4.10-splunk
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] git version: 078f28920cb24de0dd479b5ea6c66c644f6326e9
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2p-fips 14 Aug 2018
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] allocator: tcmalloc
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] modules: none
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] build environment:
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] distmod: 2008plus-ssl
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] distarch: x86_64
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] target_arch: x86_64
2019-02-15T08:12:45.266Z I CONTROL [initandlisten] options: { net: { port: 8191, ssl: { PEMKeyFile: "D:\Splunk\etc\auth\server.pem", PEMKeyPassword: "<password>", allowInvalidHostnames: true, mode: "requireSSL", sslCipherConfig: "TLSv1+HIGH:TLSv1.2+HIGH:@STRENGTH" } }, replication: { oplogSizeMB: 4000, replSet: "27E07F05-DA98-4185-B26F-95378779627A" }, security: { javascriptEnabled: false, keyFile: "D:\Splunk\var\lib\splunk\kvstore\mongo\splunk.key" }, setParameter: { enableLocalhostAuthBypass: "0", oplogFetcherMaxFetcherRestarts: "0" }, storage: { dbPath: "D:\Splunk\var\lib\splunk\kvstore\mongo", engine: "mmapv1", mmapv1: { smallFiles: true } }, systemLog: { timeStampFormat: "iso8601-utc" } }
2019-02-15T08:12:45.267Z W - [initandlisten] Detected unclean shutdown - D:\Splunk\var\lib\splunk\kvstore\mongo\mongod.lock is not empty.
2019-02-15T08:12:45.288Z I JOURNAL [initandlisten] journal dir=D:\Splunk\var\lib\splunk\kvstore\mongo\journal
2019-02-15T08:12:45.288Z I JOURNAL [initandlisten] recover begin
2019-02-15T08:12:45.288Z I JOURNAL [initandlisten] info no lsn file in journal/ directory
2019-02-15T08:12:45.288Z I JOURNAL [initandlisten] recover lsn: 0
2019-02-15T08:12:45.288Z I JOURNAL [initandlisten] recover D:\Splunk\var\lib\splunk\kvstore\mongo\journal\j._0
2019-02-15T08:12:45.289Z I JOURNAL [initandlisten] recover applying initial journal section with sequence number 683
2019-02-15T08:12:45.339Z I JOURNAL [initandlisten] recover cleaning up
2019-02-15T08:12:45.339Z I JOURNAL [initandlisten] removeJournalFiles
2019-02-15T08:12:45.339Z I JOURNAL [initandlisten] old journal file will be removed: D:\Splunk\var\lib\splunk\kvstore\mongo\journal\j._0
2019-02-15T08:12:45.340Z I JOURNAL [initandlisten] recover done
2019-02-15T08:12:45.446Z I JOURNAL [durability] Durability thread started
2019-02-15T08:12:45.448Z I JOURNAL [journal writer] Journal writer thread started
2019-02-15T08:12:45.450Z I CONTROL [initandlisten]
2019-02-15T08:12:45.450Z I CONTROL [initandlisten] ** WARNING: No SSL certificate validation can be performed since no CA file has been provided
2019-02-15T08:12:45.450Z I CONTROL [initandlisten] ** Please specify an sslCAFile parameter.
2019-02-15T08:12:46.379Z I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory 'D:/Splunk/var/lib/splunk/kvstore/mongo/diagnostic.data'
2019-02-15T08:12:46.396Z I REPL [initandlisten] Did not find local voted for document at startup.
2019-02-15T08:12:46.400Z I NETWORK [thread1] waiting for connections on port 8191 ssl
2019-02-15T08:12:46.401Z I REPL [replExecDBWorker-2] New replica set config in use: { _id: "27E07F05-DA98-4185-B26F-95378779627A", version: 1, members: [ { _id: 0, host: "127.0.0.1:8191", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: { instance: "27E07F05-DA98-4185-B26F-95378779627A", all: "all" }, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 60000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 } } }
2019-02-15T08:12:46.401Z I REPL [replExecDBWorker-2] This node is 127.0.0.1:8191 in the config
2019-02-15T08:12:46.402Z I REPL [replExecDBWorker-2] transition to STARTUP2
2019-02-15T08:12:46.402Z I REPL [replExecDBWorker-2] Starting replication storage threads
2019-02-15T08:12:46.421Z I REPL [replExecDBWorker-2] Starting replication fetcher thread
2019-02-15T08:12:46.421Z I REPL [replExecDBWorker-2] Starting replication applier thread
2019-02-15T08:12:46.421Z I REPL [replExecDBWorker-2] Starting replication reporter thread
2019-02-15T08:12:46.422Z I REPL [rsSync] transition to RECOVERING
2019-02-15T08:12:46.425Z I REPL [rsSync] transition to SECONDARY
2019-02-15T08:12:46.425Z I REPL [rsSync] transition to PRIMARY
2019-02-15T08:12:47.009Z I FTDC [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
2019-02-15T08:12:47.433Z I REPL [rsSync] transition to primary complete; database writes are now permitted
2019-02-15T08:12:55.777Z I NETWORK [thread1] connection accepted from 127.0.0.1:49369 #1 (1 connection now open)
2019-02-15T08:12:55.790Z I NETWORK [conn1] received client metadata from 127.0.0.1:49369 conn1: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:55.811Z I ACCESS [conn1] Successfully authenticated as principal __system on local
2019-02-15T08:12:55.811Z I - [conn1] end connection 127.0.0.1:49369 (1 connection now open)
2019-02-15T08:12:56.813Z I NETWORK [thread1] connection accepted from 127.0.0.1:49372 #2 (1 connection now open)
2019-02-15T08:12:56.824Z I NETWORK [conn2] received client metadata from 127.0.0.1:49372 conn2: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:56.877Z I ACCESS [conn2] Successfully authenticated as principal __system on local
2019-02-15T08:12:56.879Z I NETWORK [thread1] connection accepted from 127.0.0.1:49374 #3 (2 connections now open)
2019-02-15T08:12:56.881Z I - [conn2] end connection 127.0.0.1:49372 (2 connections now open)
2019-02-15T08:12:56.886Z I NETWORK [conn3] received client metadata from 127.0.0.1:49374 conn3: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:56.907Z I ACCESS [conn3] Successfully authenticated as principal __system on local
2019-02-15T08:12:56.908Z I - [conn3] end connection 127.0.0.1:49374 (1 connection now open)
2019-02-15T08:12:56.916Z I NETWORK [thread1] connection accepted from 127.0.0.1:49375 #4 (1 connection now open)
2019-02-15T08:12:56.952Z I NETWORK [conn4] received client metadata from 127.0.0.1:49375 conn4: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:56.974Z I ACCESS [conn4] Successfully authenticated as principal __system on local
2019-02-15T08:12:56.975Z I - [conn4] end connection 127.0.0.1:49375 (1 connection now open)
2019-02-15T08:12:56.978Z I NETWORK [thread1] connection accepted from 127.0.0.1:49376 #5 (1 connection now open)
2019-02-15T08:12:56.981Z I NETWORK [conn5] received client metadata from 127.0.0.1:49376 conn5: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:56.982Z I NETWORK [thread1] connection accepted from 127.0.0.1:49377 #6 (2 connections now open)
2019-02-15T08:12:57.016Z I ACCESS [conn6] Successfully authenticated as principal __system on local
2019-02-15T08:12:57.076Z I NETWORK [thread1] connection accepted from 127.0.0.1:49378 #7 (3 connections now open)
2019-02-15T08:12:57.086Z I NETWORK [conn7] received client metadata from 127.0.0.1:49378 conn7: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:57.088Z I NETWORK [thread1] connection accepted from 127.0.0.1:49379 #8 (4 connections now open)
2019-02-15T08:12:57.123Z I ACCESS [conn8] Successfully authenticated as principal __system on local
2019-02-15T08:12:58.199Z I NETWORK [thread1] connection accepted from 127.0.0.1:49383 #9 (5 connections now open)
2019-02-15T08:12:58.202Z I NETWORK [conn9] received client metadata from 127.0.0.1:49383 conn9: { driver: { name: "mongoc", version: "1.6.3" }, os: { type: "Windows", name: "Windows", version: "6.3 (9600)", architecture: "x86_64" }, platform: "cfg=0x200c9 CC=MSVC 1900 CFLAGS="-DPY_BIG_ENDIAN=0 /DWIN32 /D_WINDOWS /W3" LDFLAGS="/machine:x64"" }
2019-02-15T08:12:58.225Z I ACCESS [conn9] Successfully authenticated as principal __system on local
2019-02-15T08:12:58.334Z I NETWORK [thread1] connection accepted from 127.0.0.1:49385 #10 (6 connections now open)
2019-02-15T08:12:58.359Z I ACCESS [conn10] Successfully authenticated as principal __system on local
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x1dc493
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x1d926e
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x165c88
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x75d91
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x7c4a82
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x7c4913
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x1619a7
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x1615af
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe ???
2019-02-15T08:13:51.302Z I CONTROL [conn9] mongod.exe index_collator_extension+0x82924d
2019-02-15T08:13:51.302Z I CONTROL [conn9] KERNEL32.DLL BaseThreadInitThunk+0x22
2019-02-15T08:13:51.302Z F - [conn9] out of memory.
Increased the oplogsize setting as well but that also not worked out.
- Mark as New
- Bookmark Message
- Subscribe to Message
- Mute Message
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
maraman_splunk

Splunk Employee
02-15-2019
04:42 AM
Hi,
something may be eating the memory (could be search activity, you may have been living close to the limit and now hit the wall)
look at https://docs.splunk.com/Documentation/Splunk/latest/Troubleshooting/Troubleshootmemoryusage
make sure you also have enough swap so at least it won't crash for temp peak (obviously you don't want to swap all time)
