Say I have multiple sources of JBoss logs, such as server.log, geo.log, and feature.log, plus gzipped archives containing earlier versions of those logs. Each log contains entries in log4j format. I want to search all the logs collectively and return only the log entries that contain the word "ERROR", from that word up to the first newline. I want to exclude all log entries that contain the words "INFO", "WARN", "DEBUG", etc. Tested in Regex Coach against the example log entries below, the regex -
ERROR.*?\n
matches the portion of text I want to extract from the errors. Additionally, the regex doesn't match any log entry I want excluded from the search:
2011-05-10 02:01:11,799 [ThreadPool worker thread #21] ERROR com.geodyne.overt.runtime.engine.FlowObjectExecutionTreeNode - deliverException(...)com.geodyne.magma.GeoException: Runtime error in script ("Process: 'GeoDoc GeoCode LAVA Cache' ProcessItem: 'Get GeoCode Location' Type: 'ITEM'" 18:0).Internal Script error: com.geodyneinc.magma.common.util.service.exceptions.BaseSystemException: Error while calling calderaLocation or parsing the response from GeoCode because of HTML response
[ErrorInfo[
featureId=null
featureNumber=null
featureMapId=null
errorType=RECOVERABLE
externalErrorCode=null
message=Error while calling calderaLocation or parsing the response from GeoCode because of HTML response
serviceName=GeoGen
severityType=null
timeStamp=2011-05-09 23:16:18.203
stackTrace=null
]]
2011-05-10 02:01:56,360 [ThreadPool worker thread #22] ERROR com.geodyne.server.ejb.workflow.EJBWorkflowManagerBean - Exception occurred, e = com.geodyne.component.common.workflow.WorkflowProcessItemException: Runtime error in script ("Process: 'EruptionEvasionRetryService' ProcessItem: 'call connector' Type: 'ITEM'" 4:0).Internal Script error: com.geodyneinc.magma.common.util.service.exceptions.BaseSystemException: Error in persisting certificate images into GeoCode 46555calderaLocation: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML><HEAD><TITLE>You are not authorized to view this page</TITLE>
<META HTTP-EQUIV="Content-Type" Content="text/html; charset=Windows-1252">
<STYLE type="text/css">
BODY { font: 8pt/12pt verdana }
H1 { font: 13pt/15pt verdana }
H2 { font: 8pt/12pt verdana }
A:link { color: red }
A:visited { color: maroon }
</STYLE>
</HEAD><BODY><TABLE width=500 border=0 cellspacing=10><TR><TD>
<h1>You are not authorized to view this page</h1>
You do not have permission to view this directory or page using the credentials that you supplied.
<hr>
<p>Please try the following:</p>
<ul>
2011-05-10 02:02:44,865 [ThreadPool worker thread #25] ERROR com.geodyne.magma.script.js.GeoJavaScriptException - GeoJavaScriptException(), nested exception:
2011-05-10 02:02:44,867 [ThreadPool worker thread #25] ERROR com.geodyne.server.ejb.workflow.EJBWorkflowManagerBean - Message: server.ejb.workflow.impl.EJBWorkflowManagerBean.exception Arguments: ExecutionStack(ExecutionJob(worker(componentName = Script), processItemId = 6071, processTiming = N, saveExecutionContextBehaviour = EXECUTION_CONTEXT_DO_NOT_SAVE)), SymbolTable(SymbolTable(...)), sharedData = null
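For illustration, the extraction described above could be sketched in Python roughly as follows. This is a sketch under assumptions, not a Splunk solution: it assumes every entry starts with a timestamp and thread name laid out as in the samples, that archives end in .gz, and the names ENTRY_RE, open_log and iter_errors are made up for the example. It reads line by line, so the text kept for each ERROR entry stops at the first newline, just as with the regex.

```python
import gzip
import re
from datetime import datetime

# Assumed layout, taken from the sample entries:
# "<date> <time>,<millis> [<thread>] ERROR <text up to end of line>"
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[[^\]]*\] ERROR (.*)$"
)


def open_log(path):
    """Open a plain or gzipped log as text (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "rt", encoding="utf-8", errors="replace")


def iter_errors(paths):
    """Yield (timestamp, text) for every ERROR entry in the given files.

    Continuation lines (stack traces, HTML bodies) and entries at other
    levels such as INFO, WARN or DEBUG do not match and are skipped.
    """
    for path in paths:
        with open_log(path) as handle:
            for line in handle:
                match = ENTRY_RE.match(line.rstrip("\r\n"))
                if match:
                    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
                    yield stamp, match.group(2)
```

Calling list(iter_errors(["server.log", "server.log.1.gz"])) (file names are placeholders) would give one tuple per ERROR entry across the current logs and the archives.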
Now say that I want to treat all errors with matching text as belonging to a particular "type." For example, multiple log entries with the text -
2011-05-10 02:01:11,799 [ThreadPool worker thread #21] ERROR com.geodyne.overt.runtime.engine.FlowObjectExecutionTreeNode - deliverException(...)com.geodyne.magma.GeoException: Runtime error in script ("Process: 'GeoDoc GeoCode LAVA Cache' ProcessItem: 'Get GeoCode Location' Type: 'ITEM'" 18:0).Internal Script error: com.geodyneinc.magma.common.util.service.exceptions.BaseSystemException: Error while calling calderaLocation or parsing the response from GeoCode because of HTML response
are a "type." Say there are 30-odd "types" of errors in the logs. I want to count how many errors of each "type" there are for a given time frame, and return the text of the first instance in the log of each of the 10 most frequently seen error "types", with a count of occurrences of each. For example, say there are 200 of the first type, 40 of the second, 12 of the third, and so on. Now, I'd like to graph that information, with a bar for each "type" that shows the error count and error text on mouse-over. Is there a way to do that? I've tried to do it with search-time and index-time field extraction, following suggestions given by kind list participants (links to original questions/answers below), but I can't make them work - no doubt because I've failed to articulate my problem and desired outcome adequately.
http://splunk-base.splunk.com/answers/24165/how-to-report-top-ten-errors-over-a-time-range
http://splunk-base.splunk.com/answers/24268/index-time-field-extraction-and-report-output-problems
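To make the desired outcome concrete, the counting step could be sketched in Python roughly as follows. Again this is only an illustration under assumptions, not a Splunk search: it assumes the entries are already available as (timestamp, text) pairs, that two errors are the same "type" when their text is identical, and the function name top_error_types is made up for the example.

```python
from collections import Counter
from datetime import datetime


def top_error_types(entries, start, stop, n=10):
    """Return the n most frequent error types seen in [start, stop).

    entries is an iterable of (timestamp, text) pairs. Each result is
    (text, count, first_seen), where first_seen is the timestamp of the
    first entry of that type encountered within the time frame.
    Assumption: identical text means identical "type".
    """
    counts = Counter()
    first_seen = {}
    for stamp, text in entries:
        if start <= stamp < stop:
            counts[text] += 1
            # Keep only the first timestamp seen for each type.
            first_seen.setdefault(text, stamp)
    return [(text, count, first_seen[text]) for text, count in counts.most_common(n)]


# Example with made-up entries: two types inside the window, one entry outside it.
sample = [
    (datetime(2011, 5, 10, 2, 1, 11), "FlowObjectExecutionTreeNode - deliverException(...)"),
    (datetime(2011, 5, 10, 2, 1, 56), "EJBWorkflowManagerBean - Exception occurred"),
    (datetime(2011, 5, 10, 2, 2, 44), "FlowObjectExecutionTreeNode - deliverException(...)"),
    (datetime(2011, 5, 12, 9, 0, 0), "FlowObjectExecutionTreeNode - deliverException(...)"),
]
for text, count, first in top_error_types(sample, datetime(2011, 5, 10), datetime(2011, 5, 11)):
    print(count, first, text)
```

Each returned tuple would correspond to one bar in the graph: the count gives the bar height and the text is what should appear on mouse-over.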
My intent is to make a dashboard that has fields for entering the desired start date/timestamp and stop date/timestamp, and a "Top Ten" button that, when clicked, produces the report/graph directly without any further interaction from the user. Sources are the current individual logs and multiple gzipped archives of older logs. Any pointers would be appreciated.