Getting Data In

Logging delay causes a line to get split in two. How to configure line breaking in Splunk to prevent this?

johnraftery
Communicator

Hi,

I'm trying to log Full GC events which look like this in the GC log:

109897.407: [Full GC 109897.407: [CMS: 8807279K->8606400K(31073920K), 11.1930580 secs] 9069543K->8606400K(31418944K), [CMS Perm : 77354K->77350K(262144K)], 11.1931520 secs] [Times: user=11.20 sys=0.00, real=11.19 secs] 

My props.conf has this stanza:

[sun_jvm]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
BREAK_ONLY_BEFORE = ^\d+\.\d+:
category = Custom
disabled = false
pulldown_type = true
EXTRACT-file = (?P<service_name>[^\/\\]*)$ in source
EXTRACT-PAR = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[GC.*\]\s(?P<before_gc>\d+)K->(?P<after_gc>\d+)K\((?P<max_gc>\d+).*real=(?P<par_new_duration>\d+\.\d+)\s
EXTRACT-FGC = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[Full GC.*\]\s(?P<before_gc>\d+)K->(?P<after_gc>\d+)K\((?P<max_gc>\d+)K.*\s(?P<full_gc_duration>\d+\.\d+) secs\]
EXTRACT-CMS = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[CMS-concurrent-sweep:\s.*real=(?P<cms_duration>\d+\.\d+)\s
EXTRACT-CMS-MARK = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s.*\[1\sCMS-initial-mark:[^,]+,\s(?P<cms_init_mark_duration>\d+\.\d+) secs\]
EXTRACT-CMS-REMARK = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s.*\[1\sCMS-remark:[^,]+,\s(?P<cms_remark_duration>\d+\.\d+) secs\]

This works for quick GCs, but sometimes the GC takes a bit of time, so the first part of the line comes a few seconds before the rest. Namely, this appears first:

109897.407: [Full GC 109897.407: [CMS

Then a few seconds later it gets finished.

The trouble is, Splunk interprets this first bit as an event in its own right, and then puts the rest in a second event:

: 8807279K->8606400K(31073920K), 11.1930580 secs] 9069543K->8606400K(31418944K), [CMS Perm : 77354K->77350K(262144K)], 11.1931520 secs] [Times: user=11.20 sys=0.00, real=11.19 secs]

This screws up my graph, because it's no longer recognized as a Full GC.

Is there any way to tell Splunk to wait until the whole line has been written?

Thanks!
John

0 Karma
1 Solution

woodcock
Esteemed Legend

You need to increase your TIME_BEFORE_CLOSE value in inputs.conf:

time_before_close =
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past seconds.
* Defaults to 3.

View solution in original post

woodcock
Esteemed Legend

You need to increase your TIME_BEFORE_CLOSE value in inputs.conf:

time_before_close =
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past seconds.
* Defaults to 3.

johnraftery
Communicator

Ok, so I should modify the stanza in inputs.conf like this?

[monitor://PERF_TEST/logs/.../*]
disabled = false
host = PHASE4 
index = mlc_live
sourcetype = sun_jvm
whitelist = .*gc\.log$
time_before_close = 15
0 Karma

johnraftery
Communicator

Seems to have done the trick, thanks very much!

0 Karma
Get Updates on the Splunk Community!

Index This | What is broken 80% of the time by February?

December 2025 Edition   Hayyy Splunk Education Enthusiasts and the Eternally Curious!    We’re back with this ...

Unlock Faster Time-to-Value on Edge and Ingest Processor with New SPL2 Pipeline ...

Hello Splunk Community,   We're thrilled to share an exciting update that will help you manage your data more ...

Splunk MCP & Agentic AI: Machine Data Without Limits

Discover how the Splunk Model Context Protocol (MCP) Server can revolutionize the way your organization uses ...