Hi,
I am currently working on improving the Nmon App, notably the way Splunk generates and indexes the performance monitor data stream.
Until now, the input scripts used in the App have generated various CSV flat files, which are indexed by various monitor inputs.
For improvement and much more flexibility (better data checkpoints, removal of the CSV monitor inputs...), I am working on generating a JSON data stream to be indexed by the indexers.
Things work as expected (the data stream is generated and indexed, and field extraction works perfectly), but I am facing an issue that causes splunkd to crash at random (after anywhere from a few minutes to longer).
In the splunkd log, I can see persistent messages about an "IndexedExtractionsConfig" error:
12-18-2014 01:18:10.084 +0100 INFO TailingProcessor - Archive file='/media/BIGDATA/Splunk_Various/splunk/etc/apps/nmon/var/nmon_repository/guilhem-UX52VS_141218_0106.nmon' has stopped changing, will read it now.
12-18-2014 01:18:10.084 +0100 INFO ArchiveProcessor - handling file=/media/BIGDATA/Splunk_Various/splunk/etc/apps/nmon/var/nmon_repository/guilhem-UX52VS_141218_0106.nmon
12-18-2014 01:18:10.084 +0100 INFO ArchiveProcessor - reading path=/media/BIGDATA/Splunk_Various/splunk/etc/apps/nmon/var/nmon_repository/guilhem-UX52VS_141218_0106.nmon (seek=40245 len=155920)
12-18-2014 01:18:10.916 +0100 ERROR IndexedExtractionsConfig - Tried to set INDEXED_EXTRACTIONS but it already had a value! (was: 8, wanted: 0)
Splunkd crashes after a random time, and produces the following crash log:
[build 245427] 2014-12-18 01:18:11
Received fatal signal 6 (Aborted).
Cause:
Signal sent by PID 17132 running under UID 1000.
Crashing thread: archivereader
Registers:
RIP: [0x00007FC757582BB9] gsignal + 57 (/lib/x86_64-linux-gnu/libc.so.6)
RDI: [0x00000000000042EC]
RSI: [0x0000000000004312]
RBP: [0x00007FC7576CD2B0]
RSP: [0x00007FC74C7EA0D8]
RAX: [0x0000000000000000]
RBX: [0x00007FC7588F5000]
RCX: [0xFFFFFFFFFFFFFFFF]
RDX: [0x0000000000000006]
R8: [0xFEFEFEFEFEFEFEFF]
R9: [0x00007FC758998F60]
R10: [0x0000000000000008]
R11: [0x0000000000000202]
R12: [0x00000000015A2C6B]
R13: [0x00000000015F5200]
R14: [0x00007FC7504C54F8]
R15: [0x00007FC7504C54D0]
EFL: [0x0000000000000202]
TRAPNO: [0x0000000000000000]
ERR: [0x0000000000000000]
CSGSFS: [0x0000000000000033]
OLDMASK: [0x0000000000000000]
OS: Linux
Arch: x86-64
Backtrace:
[0x00007FC757582BB9] gsignal + 57 (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007FC757585FC8] abort + 328 (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007FC75757BA76] ? (/lib/x86_64-linux-gnu/libc.so.6)
[0x00007FC75757BB22] ? (/lib/x86_64-linux-gnu/libc.so.6)
[0x000000000093488C] ? (splunkd)
[0x00000000009325C7] _ZN16ArchiveProcessor14classifyStreamEv + 567 (splunkd)
[0x0000000000932860] _ZN16ArchiveProcessor22awaitingClassificationEPKcm + 128 (splunkd)
[0x0000000000932930] _ZN16ArchiveProcessor5writeEPKvm + 64 (splunkd)
[0x0000000000BF45B8] _ZN14ArchiveContext7processERK8PathnameP13ISourceWriter + 760 (splunkd)
[0x0000000000BF4E15] _ZN14ArchiveContext9readFullyEP13ISourceWriterRb + 1221 (splunkd)
[0x0000000000936582] _ZN16ArchiveProcessor4mainEv + 4082 (splunkd)
[0x0000000000F46A7E] _ZN6Thread8callMainEPv + 62 (splunkd)
[0x00007FC75791A182] ? (/lib/x86_64-linux-gnu/libpthread.so.0)
[0x00007FC757646EFD] clone + 109 (/lib/x86_64-linux-gnu/libc.so.6)
Linux / guilhem-UX52VS / 3.13.0-43-generic / #72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014 / x86_64
Last few lines of stderr (may contain info on assertion failure, but also could be old):
2014-12-18 00:55:43.594 +0100 splunkd started (build 245427)
2014-12-18 00:57:09.989 +0100 Interrupt signal received
2014-12-18 00:57:27.496 +0100 splunkd started (build 245427)
2014-12-18 00:57:59.003 +0100 Interrupt signal received
2014-12-18 01:03:28.865 +0100 splunkd started (build 245427)
splunkd: /home/build/build-src/6.2.1/src/framework/PipelineInputChannel.h:144: void PipelineInputChannel::setIndexedExtractionsDestructive(IndexedExtractionsConfig&, StructuredDataHeaderExtractor*): Assertion `_refcnt == 1' failed.
/etc/debian_version: jessie/sid
Last errno: 0
Threads running: 48
argv: [splunkd -p 8090 restart]
Thread: "archivereader", did_join=0, ready_to_run=Y, main_thread=N
First 8 bytes of Thread token @0x7fc7504c5340:
00000000 00 37 7f 4c c7 7f 00 00 |.7.L....|
00000008
x86 CPUID registers:
0: 0000000D 756E6547 6C65746E 49656E69
1: 000306A9 03100800 7FBAE3BF BFEBFBFF
2: 76035A01 00F0B2FF 00000000 00CA0000
3: 00000000 00000000 00000000 00000000
4: 00000000 00000000 00000000 00000000
5: 00000040 00000040 00000003 00021120
6: 00000077 00000002 00000009 00000000
7: 00000000 00000000 00000000 00000000
8: 00000000 00000000 00000000 00000000
9: 00000000 00000000 00000000 00000000
A: 07300403 00000000 00000000 00000603
B: 00000000 00000000 0000005D 00000003
C: 00000000 00000000 00000000 00000000
D: 00000000 00000000 00000000 00000000
80000000: 80000008 00000000 00000000 00000000
80000001: 00000000 00000000 00000001 28100800
80000002: 20202020 49202020 6C65746E 20295228
80000003: 65726F43 294D5428 2D376920 37333533
80000004: 50432055 20402055 30302E32 007A4847
80000005: 00000000 00000000 00000000 00000000
80000006: 00000000 00000000 01006040 00000000
80000007: 00000000 00000000 00000000 00000100
80000008: 00003024 00000000 00000000 00000000
terminating...
splunkd_stderr.log contains:
2014-12-18 00:55:43.594 +0100 splunkd started (build 245427)
2014-12-18 00:57:09.989 +0100 Interrupt signal received
2014-12-18 00:57:27.496 +0100 splunkd started (build 245427)
2014-12-18 00:57:59.003 +0100 Interrupt signal received
2014-12-18 01:03:28.865 +0100 splunkd started (build 245427)
splunkd: /home/build/build-src/6.2.1/src/framework/PipelineInputChannel.h:144: void PipelineInputChannel::setIndexedExtractionsDestructive(IndexedExtractionsConfig&, StructuredDataHeaderExtractor*): Assertion `_refcnt == 1' failed.
The App uses the "unarchive_cmd" option in props.conf to call a third-party script that converts the raw nmon data into a form Splunk can use; the script is called automatically whenever Splunk handles a matching file.
props.conf
[source::.../*.nmon]
invalid_cause = archive
unarchive_cmd = $SPLUNK_HOME/bin/splunk cmd python $SPLUNK_HOME/etc/apps/nmon/bin/nmon2csv.py
sourcetype = nmon_processing
NO_BINARY_CHECK = true
[nmon_data]
INDEXED_EXTRACTIONS=json
KV_MODE=none
disabled=false
pulldown_type=true
TIME_FORMAT=%d-%m-%Y %H:%M:%S
TIMESTAMP_FIELDS=ZZZZ
inputs.conf
[monitor://$SPLUNK_HOME/etc/apps/nmon/var/nmon_repository/*.nmon]
disabled = false
followTail = 0
recursive = false
index = nmon
sourcetype = nmon_data
crcSalt = <SOURCE>
The nmon converter (currently the Python converter, in a new beta flavor) produces a JSON stream (and only JSON), such as:
{
"Busy": "",
"CPUs": "4",
"Idle_PCT": "84.9",
"Sys_PCT": "3.0",
"User_PCT": "12.1",
"Wait_PCT": "0.0",
"ZZZZ": "17-12-2014 23:19:05",
"hostname": "guilhem-UX52VS",
"interval": "10",
"serialnum": "guilhem-UX52VS",
"snapshots": "1500",
"type": "CPU_ALL"
}
{
"Busy": "",
"CPUs": "4",
"Idle_PCT": "88.5",
"Sys_PCT": "2.1",
"User_PCT": "8.9",
"Wait_PCT": "0.5",
"ZZZZ": "17-12-2014 23:19:15",
"hostname": "guilhem-UX52VS",
"interval": "10",
"serialnum": "guilhem-UX52VS",
"snapshots": "1500",
"type": "CPU_ALL"
}
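For illustration, here is a minimal sketch of how a converter could write records of this shape to stdout. It is not the actual nmon2csv.py code: the helper name emit_record and the 4-space indentation are assumptions. Sorting the keys happens to reproduce the field order seen in the sample above.

```python
import json
import sys


def emit_record(record, stream=sys.stdout):
    """Write one record as a JSON object (hypothetical helper, not from the App)."""
    # sort_keys=True orders keys by code point, so uppercase names such as
    # "Busy" and "ZZZZ" come before lowercase ones such as "hostname".
    stream.write(json.dumps(record, sort_keys=True, indent=4))
    stream.write("\n")


if __name__ == "__main__":
    emit_record({
        "type": "CPU_ALL",
        "hostname": "guilhem-UX52VS",
        "ZZZZ": "17-12-2014 23:19:05",
        "User_PCT": "12.1",
    })
```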
Any help will be greatly appreciated 🙂
Note that the current stable release of the App uses the same mechanism but generates CSV flat files and some processing information on stdout, and neither I nor the people using the App have ever faced this issue.
Thanks!
Guilhem
I couldn't really find an explanation, but using a key=value format (instead of pure JSON) solved my issue.
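As a rough sketch of what that change could look like on the converter side, the snippet below renders one record as a single key="value" line. The helper name to_key_value, the quoting style, and the sorted field order are assumptions for illustration; the exact format the App ended up using is not shown in this thread.

```python
def to_key_value(record):
    """Render a flat dict as one key="value" line (hypothetical helper)."""
    parts = []
    for key in sorted(record):
        # Escape backslashes and double quotes so quoted values stay well formed.
        value = str(record[key]).replace("\\", "\\\\").replace('"', '\\"')
        parts.append('{}="{}"'.format(key, value))
    return " ".join(parts)


sample = {"type": "CPU_ALL", "ZZZZ": "17-12-2014 23:19:05", "User_PCT": "12.1"}
print(to_key_value(sample))
# prints: User_PCT="12.1" ZZZZ="17-12-2014 23:19:05" type="CPU_ALL"
```

With events in this form, the sourcetype would presumably no longer need INDEXED_EXTRACTIONS=json, and fields could be extracted at search time instead.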
Can you please advise what I should do if my Splunk (v7.0.2) complains a lot in splunkd.log about:
05-21-2018 14:01:18.641 +0300 ERROR IndexedExtractionsConfig - Tried to set INDEXED_EXTRACTIONS but it already had a value! (was: 0, wanted: 8)
I have tried enabling the debug logging level for IndexedExtractionsConfig, but got no details. Unfortunately, I cannot simply restart this server in full debug logging mode (it is part of a production environment).
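Since a restart is not an option, one thing that can be done offline is to summarize how often the message appears and with which value pairs. The sketch below is only a log-counting aid and does not explain the cause; the function name count_conflicts is an assumption, and the pattern is built from the message text quoted in this thread.

```python
import re
from collections import Counter

# Matches the error text as it appears in splunkd.log in this thread.
PATTERN = re.compile(
    r"ERROR IndexedExtractionsConfig - Tried to set INDEXED_EXTRACTIONS "
    r"but it already had a value! \(was: (\d+), wanted: (\d+)\)"
)


def count_conflicts(lines):
    """Count (was, wanted) pairs in an iterable of log lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts
```

It can be fed an open file handle, for example for splunkd.log, which normally lives under $SPLUNK_HOME/var/log/splunk. Comparing the timestamps of matching lines with nearby TailingProcessor or ArchiveProcessor entries may help narrow down which input triggers the message.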