Line Breaking Issue/Time Stamp for Indextime field extraction

arrowsmith3
Path Finder

I'm having an issue with line breaking at the timestamp for a particular sourcetype.

RAW

2013-03-13T15:32:52.247-0700: 103395.597: [Full GC (System) [PSYoungGen: 192K->0K(20160K)] [ParOldGen: 17487K->16257K(43712K)] 17679K->16257K(63872K) [PSPermGen: 28027K->28027K(47232K)], 0.5712670 secs] [Times: user=0.16 sys=0.00, real=0.57 secs]

Splunk Parsed:

2013-03-14T08:50:15.353-0700: 63009.133: [GC
Desired survivor size 25559040 bytes, new threshold 1 (max 15)
 [PSYoungGen: 53440K->21216K(56960K)] 92645K->60485K(122496K), 0.4307669 secs]

[Times: user=2.27 sys=0.03, real=0.43 secs]

---> Splunk treats this as a new event; it should be merged with the event above.

2013-03-14T13:37:19.653-0700: 80232.893: [GC
Desired survivor size 28311552 bytes, new threshold 1 (max 15)
 [PSYoungGen: 56544K->21216K(60288K)] 95813K->60549K(125824K), 0.4341336 secs]

props.conf

[iccsgclog]
SHOULD_LINEMERGE = true
TRANSFORMS-iccslogs = iccs-fields
REPORT-iccs = slc_details, slc_fields, slc_taxon
MAX_TIMESTAMP_LOOKAHEAD = 40
BREAK_ONLY_BEFORE = \d+-\d+-\d+\w\d+:\d+:\d+.\d+-\d+:
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N

What am I missing?

gkanapathy
Splunk Employee

In addition to what emiller42 says, use this instead for slightly better results:

[iccsgclog]
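# SHOULD_LINEMERGE = true
# Turn off line merging so that LINE_BREAKER alone decides event boundaries.
SHOULD_LINEMERGE = false
# Break only at newlines that are followed by a timestamp.
LINE_BREAKER = ([\r\n]+)(?=\d+-\d+-\d+\w\d+:\d+:\d+\.\d+-\d+:)
TRANSFORMS-iccslogs = iccs-fields
REPORT-iccs = slc_details, slc_fields, slc_taxon
MAX_TIMESTAMP_LOOKAHEAD = 40
# BREAK_ONLY_BEFORE = \d+-\d+-\d+\w\d+:\d+:\d+.\d+-\d+:
# TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N
# %z also parses the UTC offset (-0700) that follows the milliseconds.
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%z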

emiller42
Motivator

The problem is that GC log lines aren't buffered, so there is often a significant delay between when the first part of an event is written and when the second part is written. (On abnormally long collections, an event can be split into even more chunks.) By default, Splunk waits three seconds before it starts treating further data as a new event. There is a setting for that, and since your GC logs actually have timestamps (mine don't), this may help:

In your inputs.conf, try setting the time_before_close parameter to a higher value (the default is 3 seconds).
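
As a rough sketch of that change, assuming the GC log is picked up by a file monitor input: the monitor path below is hypothetical, and the value 10 is only an illustration rather than a recommendation. Note that inputs.conf spells the setting in lowercase, time_before_close.

```
# inputs.conf (on the instance that monitors the file)
# Hypothetical path; replace it with the actual location of the GC log.
[monitor:///opt/app/logs/gc.log]
sourcetype = iccsgclog
# Seconds to wait after reaching end-of-file before closing the file (default: 3).
# 10 is an illustrative value; choose one longer than the longest expected pause between writes.
time_before_close = 10
```

The sourcetype here matches the [iccsgclog] stanza from the question, so the props.conf settings still apply to the same events. A larger value makes Splunk hold the file open longer, which can slightly delay indexing of the last lines written.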

Another option is to not parse GC logs at all. Instead, use something like SPLUNK4JMX to poll the JVM for garbage collection information.
