Failing to break multiline events

nl_cape
Explorer

I'm trying to index JVM garbage collection logs, but I'm having trouble getting the event delimiting to work. Below are my attempted props.conf and some sample data. I've experimented a bit in the import preview, where the event breaks work nicely, but as soon as I actually index some data, I just get a single big event. Is the splitting logic different in the preview than during indexing?

Also, if I add my EXTRACT statements to the previewer, it seems to hang, staying at 0% indefinitely.

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
EXTRACT-GarbageCollectionData = (?<Offset>[\d\.]+): \[(?<GcType>(Full )?GC(--)?)\s(?<GcDetails>.+\s+)?\[PSYoungGen: (?<YoungGenSizeBeforeGC>\d+K)->(?<YoungGenSizeAfterGC>\d+K)\((?<MaxYoungGenSize>\d+K)\)\]\s(\[ParOldGen: (?<ParOldGenSizeBeforeGC>\d+K)->(?<ParOldGenSizeAfterGC>\d+K)\((?<MaxParOldGenSize>\d+K)\)\]\s)?(?<TotalSizeBeforeGC>\d+K)->(?<TotalSizeAfterGC>\d+K)\((?<MaxTotalSize>\d+K)\)(\s\[PSPermGen: (?<PSPermGenSizeBeforeGC>\d+K)->(?<PSPermGenSizeAfterGC>\d+K)\((?<MaxPSPermGenSize>\d+K)\)\])?, (?<TotalGcTime>[\d\.]+) secs\]\s\[Times: user.(?<UserTime>[\d\.]+) sys.(?<SysTime>[\d\.]+), real.(?<RealTime>[\d\.]+) secs\]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE=[\d\.]+:
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]

Sample data

Application time: 1.6442740 seconds
60927.496: [GC
Desired survivor size 20381696 bytes, new threshold 1 (max 15)
 [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K), 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs] 
60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] [Times: user=38.46 sys=43.74, real=9.61 secs] 
Total time for which application threads were stopped: 10.0346980 seconds
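
For reference, the grouping that BREAK_ONLY_BEFORE and MUST_BREAK_AFTER are meant to produce on the sample above can be approximated outside Splunk. This is only a rough Python sketch, not Splunk's actual line-merging code. It assumes the input is first split on newlines, that a line matching BREAK_ONLY_BEFORE starts a new event, and that a line matching MUST_BREAK_AFTER ends the current one. It ignores every other setting (timestamp handling, MAX_EVENTS and so on), and merge_lines is a made-up helper name.

```python
import re

# Patterns copied from the props.conf in the question.
BREAK_ONLY_BEFORE = re.compile(r"[\d\.]+:")
MUST_BREAK_AFTER = re.compile(
    r"\[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]"
)

# Sample data from the question.
SAMPLE = """
Application time: 1.6442740 seconds
60927.496: [GC
Desired survivor size 20381696 bytes, new threshold 1 (max 15)
 [PSYoungGen: 2886991K->9404K(2900544K)] 5798933K->2924091K(5820992K), 0.4201650 secs] [Times: user=0.00 sys=3.85, real=0.42 secs]
60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)] [ParOldGen: 2914687K->1570706K(2920448K)] 2924091K->1570706K(5820992K) [PSPermGen: 371646K->204339K(393216K)], 9.6103740 secs] [Times: user=38.46 sys=43.74, real=9.61 secs]
Total time for which application threads were stopped: 10.0346980 seconds
""".strip()


def merge_lines(text):
    """Group lines into events (hypothetical approximation of line merging)."""
    events, current = [], []
    for line in text.splitlines():
        # Assumed rule: a matching line closes the event collected so far.
        if current and BREAK_ONLY_BEFORE.search(line):
            events.append("\n".join(current))
            current = []
        current.append(line)
        # Assumed rule: a matching line is the last line of its event.
        if MUST_BREAK_AFTER.search(line):
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events


for number, event in enumerate(merge_lines(SAMPLE), start=1):
    print(f"--- event {number} ---")
    print(event)
```

Under these assumptions the sample splits into four events, with the three-line GC entry kept together, which matches what the preview shows.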

lguinn2
Legend

I suspect something is wrong with the second EXTRACT, which is causing Splunk to fail to process the remainder of the props.conf stanza. Also, I think there is an easier way to write your EXTRACT. Do you really need both BREAK_ONLY_BEFORE and MUST_BREAK_AFTER? Try this:

props.conf

[JvmGarbageCollection]
KV_MODE = none
DATETIME_CONFIG = none
EXTRACT-RelativeTime = ^(?<_time>[\d\.]+): 
SHOULD_LINEMERGE = true
MUST_BREAK_AFTER = \[Times: user.[\d\.]+ sys.[\d\.]+, real.[\d\.]+ secs\]
REPORT-GarbageCollectionData = GarbageCollectionDataExtract

transforms.conf

[GarbageCollectionDataExtract]
REGEX=([\d\.]+): \[((Full )?GC(--)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]
FORMAT = Offset::$1 GcType::$2 GcDetails::$3 YoungGenSizeBeforeGC::$4 YoungGenSizeAfterGC::$5 MaxYoungGenSize::$6 ParOldGenSizeBeforeGC::$7 ParOldGenSizeAfterGC::$8 MaxParOldGenSize::$9 TotalSizeBeforeGC::$10 TotalSizeAfterGC::$11 MaxTotalSize::$12 PSPermGenSizeBeforeGC::$13 PSPermGenSizeAfterGC::$14 MaxPSPermGenSize::$15 TotalGcTime::$16 UserTime::$17 SysTime::$18 RealTime::$19

Now that I have split the field names (?<fieldname>) out of the REGEX, I can see a number of parentheses in addition to those that enclose the field values. Some of these have been escaped, but not all of them. These "extra" parentheses must be escaped with a \, or else Splunk probably thinks that you are trying to extract fields within fields. Or something. It certainly looks confusing to me. So now I would edit the REGEX to:

REGEX=([\d\.]+): \[(\(Full \)?GC\(--\)?)\s(.+\s+)?\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]\s(\[ParOldGen: (\d+K)->(\d+K)\((\d+K)\)\]\s)?(\d+K)->(\d+K)\((\d+K)\)(\s\[PSPermGen: (\d+K)->(\d+K)\((\d+K)\)\])?, ([\d\.]+) secs\]\s\[Times: user.([\d\.]+) sys.([\d\.]+), real.([\d\.]+) secs\]

At least I think this may work. It is really hard to read, especially without a real understanding of the data...
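
The effect of those extra parentheses on group numbering can be checked outside Splunk. The sketch below uses Python's re module on a shortened fragment of the regex and one sample line from the question. Python's engine is not the PCRE engine Splunk uses, so treat this as an illustration of how capture groups are counted rather than as proof of Splunk's behavior. The third variant, with (?:...) non-capturing groups, is my own suggestion and is not something proposed in the thread.

```python
import re

# One line from the sample data, shortened to the part the fragments cover.
LINE = "60927.917: [Full GC [PSYoungGen: 9404K->0K(2900544K)]"

# Shared tail of the pattern: the three PSYoungGen sizes.
TAIL = r"\s\[PSYoungGen: (\d+K)->(\d+K)\((\d+K)\)\]"

# 1. Field names stripped, inner groups still capturing.
capturing = re.compile(r"([\d.]+): \[((Full )?GC(--)?)" + TAIL)

# 2. Inner parentheses escaped: they now match literal "(" and ")".
escaped = re.compile(r"([\d.]+): \[(\(Full \)?GC\(--\)?)" + TAIL)

# 3. Inner groups made non-capturing with (?:...).
non_capturing = re.compile(r"([\d.]+): \[((?:Full )?GC(?:--)?)" + TAIL)

capturing_groups = capturing.search(LINE).groups()
escaped_match = escaped.search(LINE)
non_capturing_groups = non_capturing.search(LINE).groups()

print("capturing:    ", capturing_groups)
print("escaped:      ", escaped_match)
print("non-capturing:", non_capturing_groups)
```

In this sketch, variant 1 pushes the PSYoungGen values out to groups 5 to 7, variant 2 does not match the sample line at all, and variant 3 keeps them at groups 3 to 5. The same counting applies to the optional ParOldGen and PSPermGen wrapper groups, so the $n numbers in FORMAT need to be checked against whichever REGEX is used.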


nl_cape
Explorer

Yes, the regex got quite unreadable. Not a very nice log format.
I used the changes you proposed, and I still have the same issue: if I load the props configuration without the REPORT into the previewer, event breaking works (adding the REPORT stops the previewer from working). However, data imported using the generated source type is still not split properly.
