Getting Data In

Log file with differing message formats

mikelanghorst

I've run across an odd log file from EMC's Data Protection application that is logging two very different log formats into a single file. Example:

2012-03-08 12:06:30,643 INFO Webapp Launcher [Init] Connection to controller at fdpap01.oa.domain.com:3916
2012-03-08 12:06:30,643 INFO Webapp Launcher [Init] Connection to reporter at fdpap01.oa.domain.com:4002
INFO 2560.2564 20120308:123239 service - ServerCtrlHandler(): Service stop signalled - exiting
INFO 2676.2696 20120308:123532 webapp - daemonMain(): Setting memory limit '-Xmx128m'
INFO 2676.2696 20120308:123535 webapp - daemonMain(): DPA Webapp
INFO 2676.2696 20120308:123535 webapp - daemonMain(): (c) 1994-2009 EMC Corporation. All rights reserved.
INFO 2676.2696 20120308:123535 webapp - daemonMain(): Version: 5.0.1 build 4792 on windows
INFO 2676.2696 20120308:123535 webapp - daemonMain(): Logging at level Info
2012-03-08 12:36:01,967 INFO Webapp Launcher [Init] Connection to controller at fdpap01.oa.domain.com:3916
2012-03-08 12:36:01,967 INFO Webapp Launcher [Init] Connection to reporter at fdpap01.oa.domain.com:4002
INFO 2676.2680 20120308:133056 service - ServerCtrlHandler(): Service stop signalled - exiting
INFO 3912.3884 20120308:133135 webapp - daemonMain(): Setting memory limit '-Xmx128m'
INFO 3912.3884 20120308:133135 webapp - daemonMain(): DPA Webapp
INFO 3912.3884 20120308:133135 webapp - daemonMain(): (c) 1994-2009 EMC Corporation. All rights reserved.
INFO 3912.3884 20120308:133135 webapp - daemonMain(): Version: 5.0.1 build 4792 on windows
INFO 3912.3884 20120308:133135 webapp - daemonMain(): Logging at level Info
2012-03-08 13:31:38,752 INFO Webapp Launcher [Init] Connection to controller at fdpap01.oa.domain.com:3916
2012-03-08 13:31:38,752 INFO Webapp Launcher [Init] Connection to reporter at fdpap01.oa.domain.com:4002

Whenever I've had to help Splunk with line breaking and date extraction, the format has been consistent across the entire file: I specified a source or sourcetype and the pattern to break on. I'm unsure how to handle date extraction for this one. For the lines starting with the severity, the third column is the timestamp, and each of these lines should be a separate event. By default, Splunk is currently merging them.
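To make the two layouts concrete, here is a minimal Python sketch that pulls the timestamp out of either kind of line. It is not Splunk configuration and not how Splunk extracts timestamps internally; the names `FORMAT_A`, `FORMAT_B`, and `parse_timestamp` are made up for illustration, and the patterns are inferred only from the sample lines above.

```python
import re
from datetime import datetime

# Format A (assumed from the sample): "2012-03-08 12:06:30,643 INFO ..."
FORMAT_A = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) ")
# Format B (assumed from the sample): "INFO 2560.2564 20120308:123239 service - ..."
FORMAT_B = re.compile(r"^[A-Z]+ \d+\.\d+ (\d{8}:\d{6}) ")


def parse_timestamp(line):
    """Return the datetime found at the start of a line, or None if neither layout matches."""
    m = FORMAT_A.match(line)
    if m:
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        # The three digits after the comma are milliseconds.
        return ts.replace(microsecond=int(m.group(2)) * 1000)
    m = FORMAT_B.match(line)
    if m:
        return datetime.strptime(m.group(1), "%Y%m%d:%H%M%S")
    return None
```

Running it over the file is a quick way to confirm that every line starts a new event and carries its own timestamp; any line that returns None would be a continuation line or a third format.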

Ideas?


hexx
Splunk Employee

If you can be sure that this data source always has exactly one event per line, the simplest way to fix the line breaking is to set the following in props.conf for that source or sourcetype:

SHOULD_LINEMERGE = false

The two time formats might cause a different kind of problem, as Splunk's timestamp extraction heuristics do not cope well with mixed formats in a single source.

Still, it is worth checking how timestamp extraction behaves once the line breaking is fixed. You should probably still add, at a minimum:

MAX_TIMESTAMP_LOOKAHEAD = 37

...to limit how far into each event Splunk looks for a timestamp, which scopes the extraction as tightly as we currently can.
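As a sanity check on that lookahead value, the following Python sketch measures where the timestamp ends in each of the two layouts from the question. It is only an illustration: the patterns are inferred from the sample lines, `timestamp_end_offset` is a hypothetical helper, and it assumes the lookahead is counted in characters from the start of the event.

```python
import re

# Patterns inferred from the sample lines in the question (an assumption).
TIMESTAMP_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"),  # 2012-03-08 12:06:30,643
    re.compile(r"\d{8}:\d{6}"),                                # 20120308:123239
]


def timestamp_end_offset(line):
    """Return the character offset just past the first timestamp found, or None."""
    for pattern in TIMESTAMP_PATTERNS:
        m = pattern.search(line)
        if m:
            return m.end()
    return None


samples = [
    "2012-03-08 12:06:30,643 INFO Webapp Launcher [Init] Connection to controller at fdpap01.oa.domain.com:3916",
    "INFO 2676.2696 20120308:123532 webapp - daemonMain(): Setting memory limit '-Xmx128m'",
]
for line in samples:
    print(timestamp_end_offset(line))
```

For these two samples the timestamps end at offsets 23 and 30, so a lookahead of 37 covers both with some room left if the process and thread IDs in the second layout grow by a few digits.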
