Getting Data In

Logging delay causes a line to get split in two. How to configure line breaking in Splunk to prevent this?

johnraftery
Communicator

Hi,

I'm trying to log Full GC events which look like this in the GC log:

109897.407: [Full GC 109897.407: [CMS: 8807279K->8606400K(31073920K), 11.1930580 secs] 9069543K->8606400K(31418944K), [CMS Perm : 77354K->77350K(262144K)], 11.1931520 secs] [Times: user=11.20 sys=0.00, real=11.19 secs] 

My props.conf has this stanza:

[sun_jvm]
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
BREAK_ONLY_BEFORE = ^\d+\.\d+:
category = Custom
disabled = false
pulldown_type = true
EXTRACT-file = (?P<service_name>[^\/\\]*)$ in source
EXTRACT-PAR = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[GC.*\]\s(?P<before_gc>\d+)K->(?P<after_gc>\d+)K\((?P<max_gc>\d+).*real=(?P<par_new_duration>\d+\.\d+)\s
EXTRACT-FGC = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[Full GC.*\]\s(?P<before_gc>\d+)K->(?P<after_gc>\d+)K\((?P<max_gc>\d+)K.*\s(?P<full_gc_duration>\d+\.\d+) secs\]
EXTRACT-CMS = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s\[CMS-concurrent-sweep:\s.*real=(?P<cms_duration>\d+\.\d+)\s
EXTRACT-CMS-MARK = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s.*\[1\sCMS-initial-mark:[^,]+,\s(?P<cms_init_mark_duration>\d+\.\d+) secs\]
EXTRACT-CMS-REMARK = ^(?P<TIME_GC_TIMESTAMP>[^ ]*?)\s?(?P<relative_time>\d+\.\d+):\s.*\[1\sCMS-remark:[^,]+,\s(?P<cms_remark_duration>\d+\.\d+) secs\]

This works for quick GCs, but sometimes the GC takes a bit of time, so the first part of the line comes a few seconds before the rest. Namely, this appears first:

109897.407: [Full GC 109897.407: [CMS

Then a few seconds later it gets finished.

The trouble is, Splunk interprets this first bit as an event in its own right, and then puts the rest in a second event:

: 8807279K->8606400K(31073920K), 11.1930580 secs] 9069543K->8606400K(31418944K), [CMS Perm : 77354K->77350K(262144K)], 11.1931520 secs] [Times: user=11.20 sys=0.00, real=11.19 secs]

This screws up my graph, because it's no longer recognized as a Full GC.

Is there any way to tell Splunk to wait until the whole line has been written?

Thanks!
John

0 Karma
1 Solution

woodcock
Esteemed Legend

You need to increase your TIME_BEFORE_CLOSE value in inputs.conf:

time_before_close =
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past seconds.
* Defaults to 3.

View solution in original post

woodcock
Esteemed Legend

You need to increase your TIME_BEFORE_CLOSE value in inputs.conf:

time_before_close =
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past seconds.
* Defaults to 3.

johnraftery
Communicator

Ok, so I should modify the stanza in inputs.conf like this?

[monitor://PERF_TEST/logs/.../*]
disabled = false
host = PHASE4 
index = mlc_live
sourcetype = sun_jvm
whitelist = .*gc\.log$
time_before_close = 15
0 Karma

johnraftery
Communicator

Seems to have done the trick, thanks very much!

0 Karma
Got questions? Get answers!

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Meet up IRL or virtually!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Get Updates on the Splunk Community!

Splunk Community Badges!

  Hey everyone! Ready to earn some serious bragging rights in the community? Along with our existing badges ...

How to find the worst searches in your Splunk environment and how to fix them

Everyone knows Splunk is a powerful platform for running searches and doing data analytics. Your ...

Share Your Feedback: On Admin Config Service (ACS)!

Help Us Build a Better Admin Config Service Experience (ACS)   We Want Your Feedback on Admin Config Service ...