Getting Data In

Failing to break multiline events

nl_cape
Explorer

I'm trying to index JVM garbage collection logs, but I can't get event breaking to work. Below are my attempted props.conf and some sample data. In the import preview the events break correctly, but as soon as I actually index some data, I get a single big event. Is the splitting logic in the preview different from the logic used during indexing?

Also, if I add my EXTRACT statements to the previewer, it seems to hang, staying at 0% indefinitely.

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
EXTRACT-GarbageCollectionData = (?<Offset>[\d\.]+): \[(?<GcType>(Full )?GC(--)?)\s(?<GcDetails>.+\s+)?\[PSYoungGen: (?<YoungGenSizeBeforeGC>\d+K)->(?<YoungGenSizeAfterGC>\d+K)\((?<MaxYoungGenSize>\d+K)\)\]\s(\[ParOldGen: (?<ParOldGenSizeBeforeGC>\d+K)->(?<ParOldGenSizeAfterGC>\d+K)\((?<MaxParOldGenSize>\d+K)\)\]\s)?(?<TotalSizeBeforeGC>\d+K)->(?<TotalSizeAfterGC>\d+K)\((?<MaxTotalSize>\d+K)\)(\s\[PSPermGen: (?<PSPermGenSizeBeforeGC>\d+K)->(?<PSPermGenSizeAfterGC>\d+K)\((?<MaxPSPermGenSize>\d+K)\)\])?, (?<TotalGcTime>[\d\.]+) secs\]\s\[Times: user.(?<UserTime>[\d\.]+) sys.(?<SysTime>[\d\.]+), real.(?<RealTime>[\d\.]+) secs\]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE=[\d\.]+:
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]

Sample data

Application time: 1.6442740 seconds
60927.496: [GC
Desired survivor size 20381696 bytes, new threshold 1 (max 15)
 [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K), 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs] 
60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] [Times: user=38.46 sys=43.74, real=9.61 secs] 
Total time for which application threads were stopped: 10.0346980 seconds
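
One way to sanity-check the two breaking patterns outside Splunk is to run them over the sample lines above. The sketch below is a simplified model, not Splunk's actual line-merging implementation: it assumes a new event starts before any line matching BREAK_ONLY_BEFORE and ends after any line matching MUST_BREAK_AFTER, and it uses Python's re module in place of Splunk's regex engine. The function name merge_lines is hypothetical.

```python
import re

# Patterns copied from the props.conf stanza above.
BREAK_ONLY_BEFORE = re.compile(r"[\d\.]+:")
MUST_BREAK_AFTER = re.compile(
    r"\[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]"
)

# Sample data from the post, one log line per string.
SAMPLE_LINES = [
    "Application time: 1.6442740 seconds",
    "60927.496: [GC",
    "Desired survivor size 20381696 bytes, new threshold 1 (max 15)",
    " [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K),"
    " 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs]",
    "60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)]"
    " [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K)"
    " [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs]"
    " [Times: user=38.46 sys=43.74, real=9.61 secs]",
    "Total time for which application threads were stopped: 10.0346980 seconds",
]


def merge_lines(lines):
    """Group lines into events using the simplified break rules (an assumption)."""
    events, current = [], []
    for line in lines:
        # Start a new event before a line that looks like "<offset>:".
        if current and BREAK_ONLY_BEFORE.search(line):
            events.append(current)
            current = []
        current.append(line)
        # Close the event after a line that contains the "[Times: ...]" trailer.
        if MUST_BREAK_AFTER.search(line):
            events.append(current)
            current = []
    if current:
        events.append(current)
    return events


events = merge_lines(SAMPLE_LINES)
for number, event in enumerate(events, 1):
    print(f"event {number}: {len(event)} line(s), starts with {event[0][:20]!r}")
```

Under these assumptions the sample splits into four events of 1, 3, 1 and 1 lines. If the model is close to what Splunk does, that points away from the patterns themselves and toward how or where the stanza is applied at index time.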

lguinn2
Legend

I suspect something is wrong with the second EXTRACT, which is causing Splunk to fail to process the remainder of the props.conf stanza. Also, I think there is an easier way to write your EXTRACT. Do you really need both BREAK_ONLY_BEFORE and MUST_BREAK_AFTER? Try this:

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
SHOULD_LINEMERGE = true
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]
REPORT-GarbageCollectionData = GarbageCollectionDataExtract

transforms.conf

[GarbageCollectionDataExtract]
REGEX=([\d\.]+): \[(\(Full \)?GC\(--\)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]
FORMAT = Offset::$1 GcType::$2 GcDetails::$3 YoungGenSizeBeforeGC::$4 YoungGenSizeAfterGC::$5 MaxYoungGenSize::$6 ParOldGenSizeBeforeGC::$7 ParOldGenSizeAfterGC::$8 MaxParOldGenSize::$9 TotalSizeBeforeGC::$10 TotalSizeAfterGC::$11 MaxTotalSize::$12 PSPermGenSizeBeforeGC::$13 PSPermGenSizeAfterGC::$14 MaxPSPermGenSize::$15 TotalGcTime::$16 UserTime::$17 SysTime::$18 RealTime::$19

Now that I have split out the field names ?<fieldname> from the REGEX, I can see that there are a number of parentheses in addition to those that enclose the field values; some of these have been escaped, but not all of them. Escaping these "extra" parentheses with a \ makes them match literal parentheses, which do not appear in the data, but leaving them as plain groups means they are counted as capture groups too, which shifts the $1 through $19 numbering in FORMAT. It certainly looks confusing to me. I think the cleaner fix is to mark them as non-capturing with (?:...). So now I would edit the REGEX to

REGEX=([\d\.]+): \[((?:Full )?GC(?:--)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(?:\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(?:\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]

At least I think this may work. It is really hard to read, especially without a real understanding of the data...
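
One way to check how the capture groups line up with the $n references in FORMAT is to run the pattern against a sample event outside Splunk. The following is a minimal sketch, assuming Python's re module behaves like Splunk's regex engine for the syntax used here. The pattern is a variant in which the purely structural groupings are written as non-capturing (?:...) groups so that exactly 19 numbered groups remain; that variant and the names GC_REGEX, FIELDS and EVENT are illustrative assumptions, not a tested Splunk configuration.

```python
import re

# Extraction pattern with structural groupings made non-capturing (assumption),
# so group numbers 1..19 correspond to $1..$19 in the FORMAT line.
GC_REGEX = re.compile(
    r"([\d\.]+): \[((?:Full )?GC(?:--)?)\s(.+\s+)?"
    r"\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s"
    r"(?:\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?"
    r"(\d+K)->(\d+K)\((\d+K)\)"
    r"(?:\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?"
    r", ([\d\.]+) secs\]\s"
    r"\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]"
)

# Field names in the order they appear in the FORMAT line.
FIELDS = [
    "Offset", "GcType", "GcDetails",
    "YoungGenSizeBeforeGC", "YoungGenSizeAfterGC", "MaxYoungGenSize",
    "ParOldGenSizeBeforeGC", "ParOldGenSizeAfterGC", "MaxParOldGenSize",
    "TotalSizeBeforeGC", "TotalSizeAfterGC", "MaxTotalSize",
    "PSPermGenSizeBeforeGC", "PSPermGenSizeAfterGC", "MaxPSPermGenSize",
    "TotalGcTime", "UserTime", "SysTime", "RealTime",
]

# The single-line Full GC event from the sample data.
EVENT = (
    "60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] "
    "[ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) "
    "[PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] "
    "[Times: user=38.46 sys=43.74, real=9.61 secs]"
)

match = GC_REGEX.search(EVENT)
fields = dict(zip(FIELDS, match.groups()))

print("numbered groups:", GC_REGEX.groups)
for name, value in fields.items():
    print(f"{name} = {value}")
```

If the group count printed here differs from the highest $n used in FORMAT, the field values end up attached to the wrong names, so this is a cheap check to run before loading the transform.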


nl_cape
Explorer

Yes, the regex got quite unreadable. It's not a very nice log format.
I applied the changes you proposed, and I still have the same issue: if I load the props configuration without the REPORT into the previewer, event breaking works (adding the REPORT makes the previewer fail). However, data imported using the generated source type is still not split properly.
