Getting Data In

Failing to break multiline events

nl_cape
Explorer

I'm trying to index JVM garbage collection logs, but I can't get event breaking to work. Below are my attempted props.conf and some sample data. In the import preview the event breaks work nicely, but as soon as I actually index some data, I get a single big event. Is the splitting logic different in the preview than during indexing?

Also, if I add my EXTRACT statements to the previewer, it seems to hang, staying at 0% indefinitely.

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
EXTRACT-GarbageCollectionData = (?<Offset>[\d\.]+): \[(?<GcType>(Full )?GC(--)?)\s(?<GcDetails>.+\s+)?\[PSYoungGen: (?<YoungGenSizeBeforeGC>\d+K)->(?<YoungGenSizeAfterGC>\d+K)\((?<MaxYoungGenSize>\d+K)\)\]\s(\[ParOldGen: (?<ParOldGenSizeBeforeGC>\d+K)->(?<ParOldGenSizeAfterGC>\d+K)\((?<MaxParOldGenSize>\d+K)\)\]\s)?(?<TotalSizeBeforeGC>\d+K)->(?<TotalSizeAfterGC>\d+K)\((?<MaxTotalSize>\d+K)\)(\s\[PSPermGen: (?<PSPermGenSizeBeforeGC>\d+K)->(?<PSPermGenSizeAfterGC>\d+K)\((?<MaxPSPermGenSize>\d+K)\)\])?, (?<TotalGcTime>[\d\.]+) secs\]\s\[Times: user.(?<UserTime>[\d\.]+) sys.(?<SysTime>[\d\.]+), real.(?<RealTime>[\d\.]+) secs\]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE=[\d\.]+:
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]

Sample data

Application time: 1.6442740 seconds
60927.496: [GC
Desired survivor size 20381696 bytes, new threshold 1 (max 15)
 [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K), 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs] 
60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] [Times: user=38.46 sys=43.74, real=9.61 secs] 
Total time for which application threads were stopped: 10.0346980 seconds
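
For reference, the grouping these two break rules are meant to produce can be approximated outside Splunk. The sketch below is a simplified Python model, not Splunk's actual line-merging code: it assumes the input is first split on newlines, that a line matching BREAK_ONLY_BEFORE starts a new event, and that a line matching MUST_BREAK_AFTER closes the current one. The helper name merge_lines is hypothetical.

```python
import re

# Patterns copied from the props.conf stanza above.
BREAK_ONLY_BEFORE = re.compile(r"[\d\.]+:")
MUST_BREAK_AFTER = re.compile(
    r"\[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]"
)

SAMPLE = """Application time: 1.6442740 seconds
60927.496: [GC
Desired survivor size 20381696 bytes, new threshold 1 (max 15)
 [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K), 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs]
60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] [Times: user=38.46 sys=43.74, real=9.61 secs]
Total time for which application threads were stopped: 10.0346980 seconds"""


def merge_lines(lines):
    """Group lines into events using the two break rules (simplified model)."""
    events, current, force_break = [], [], False
    for line in lines:
        # A new event starts after a forced break or on a BREAK_ONLY_BEFORE match.
        starts_new = force_break or BREAK_ONLY_BEFORE.search(line) is not None
        if current and starts_new:
            events.append(current)
            current = []
        current.append(line)
        # A MUST_BREAK_AFTER match forces the next line into a new event.
        force_break = MUST_BREAK_AFTER.search(line) is not None
    if current:
        events.append(current)
    return events


events = merge_lines(SAMPLE.splitlines())
for number, event in enumerate(events, 1):
    print(f"event {number}: {len(event)} line(s), starts with {event[0][:20]!r}")
```

Under this model the sample splits into four events, with the three-line GC entry kept together. If the preview shows the same grouping but indexed data does not, the patterns themselves are probably not the problem.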

lguinn2
Legend

I suspect something is wrong with the second EXTRACT, which causes Splunk to fail to process the remainder of the props.conf stanza. I also think there is an easier way to write your EXTRACT. Do you really need both BREAK_ONLY_BEFORE and MUST_BREAK_AFTER? Try this:

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
SHOULD_LINEMERGE = true
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]
REPORT-GarbageCollectionData = GarbageCollectionDataExtract

transforms.conf

[GarbageCollectionDataExtract]
REGEX=([\d\.]+): \[(\(Full \)?GC\(--\)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]
FORMAT = Offset::$1 GcType::$2 GcDetails::$3 YoungGenSizeBeforeGC::$4 YoungGenSizeAfterGC::$5 MaxYoungGenSize::$6 ParOldGenSizeBeforeGC::$7 ParOldGenSizeAfterGC::$8 MaxParOldGenSize::$9 TotalSizeBeforeGC::$10 TotalSizeAfterGC::$11 MaxTotalSize::$12 PSPermGenSizeBeforeGC::$13 PSPermGenSizeAfterGC::$14 MaxPSPermGenSize::$15 TotalGcTime::$16 UserTime::$17 SysTime::$18 RealTime::$19

Now that I have split the field names (?<fieldname>) out of the REGEX, I can see that there are a number of parentheses in addition to those that enclose the field values; some of these have been escaped, but not all of them. These "extra" parentheses are only there for grouping. Escaping them with a \ makes them match literal parentheses, and leaving them as plain groups means they capture too, which shifts the $n numbering used in FORMAT. It certainly looks confusing to me. So now I would edit the REGEX to mark them as non-capturing with (?:...), which keeps the captures in line with $1 through $19:

REGEX=([\d\.]+): \[((?:Full )?GC(?:--)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(?:\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(?:\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]

At least I think this may work. It is really hard to read, especially without a real understanding of the data...
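
One way to check how the capture groups line up with the $n references in FORMAT is to run the pattern outside Splunk against the sample data. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption: the two are close for this pattern but not identical. It compares a variant where the helper groups capture with one where they are marked non-capturing using (?:...). The names FIELDS, NON_CAPTURING, and CAPTURING are introduced here for illustration only.

```python
import re

# Field names in the order used by FORMAT ($1 .. $19).
FIELDS = [
    "Offset", "GcType", "GcDetails",
    "YoungGenSizeBeforeGC", "YoungGenSizeAfterGC", "MaxYoungGenSize",
    "ParOldGenSizeBeforeGC", "ParOldGenSizeAfterGC", "MaxParOldGenSize",
    "TotalSizeBeforeGC", "TotalSizeAfterGC", "MaxTotalSize",
    "PSPermGenSizeBeforeGC", "PSPermGenSizeAfterGC", "MaxPSPermGenSize",
    "TotalGcTime", "UserTime", "SysTime", "RealTime",
]

# Pattern with the four helper groups marked non-capturing via (?:...).
NON_CAPTURING = (
    r"([\d\.]+): \[((?:Full )?GC(?:--)?)\s(.+\s+)?"
    r"\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s"
    r"(?:\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?"
    r"(\d+K)->(\d+K)\((\d+K)\)"
    r"(?:\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?"
    r", ([\d\.]+) secs\]\s"
    r"\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]"
)
# Same pattern with every helper group turned back into a capturing group.
CAPTURING = NON_CAPTURING.replace("(?:", "(")

# The "Full GC" event from the sample data in the question.
FULL_GC = (
    "60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] "
    "[ParOldGen: 2914687K->1570706K(2920448K)] "
    "2924091K->1570706K(5820992K) "
    "[PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] "
    "[Times: user=38.46 sys=43.74, real=9.61 secs]"
)

plain = re.compile(CAPTURING)
fixed = re.compile(NON_CAPTURING)

# Number of capture groups in each variant; FORMAT refers to 19 of them.
print(plain.groups, fixed.groups)

# With helper groups capturing, group 7 is no longer the ParOldGen "before" size.
print(plain.search(FULL_GC).group(7))

# With non-capturing helpers, the groups map one-to-one onto the field names.
fields = dict(zip(FIELDS, fixed.search(FULL_GC).groups()))
print(fields["GcType"], fields["ParOldGenSizeBeforeGC"], fields["RealTime"])
```

The capturing variant has 23 groups against the 19 that FORMAT expects, so every field after GcType would be assigned the wrong value. Named groups in the REGEX itself would avoid the counting altogether.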


nl_cape
Explorer

Yes, the regex got quite unreadable. Not a very nice log format.
I applied the changes you proposed and still have the same issue: if I load the props configuration without the REPORT into the previewer, event breaking works (adding the REPORT makes the previewer break). However, data imported using the generated source type is not split properly.
