We get unformatted stack traces dumped into the same source type as our event logs. I'd like to extract the date/time and host fields from events identified as stack traces, probably truncate the seconds from the time, and then use the time and host to re-search the logs for matching events to help diagnose application issues.
Could anyone suggest an approach for this? Can one do some kind of join, or a subsearch?
You could try something like this:
sourcetype="whatever" Guid="*" | eval time=_time | search [search sourcetype="whatever" NOT Guid="*" | eval time=strptime(substr(_raw,1,17), "%Y%m%d %H:%M:%S") | rename host AS HostName | fields time,HostName]
Of course, this assumes that the stack-trace events have exactly the same timestamp as the typical log entry you are interested in. It also assumes that all typical events have a value in the Guid field and that none of the stack-trace events have a Guid field.
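If an exact timestamp match is too strict, one variation is to compare at minute resolution, which is the "truncate off the seconds" idea from the question. The following is an untested sketch. The field name `minute` is made up for illustration, and it assumes that `_time` is parsed correctly from the leading timestamp of both kinds of events and that the `HostName` value in the normal events equals the `host` of the stack-trace events.

```
sourcetype="whatever" Guid="*"
| eval minute=strftime(_time, "%Y%m%d %H:%M")
| search
    [ search sourcetype="whatever" NOT Guid="*"
      | eval minute=strftime(_time, "%Y%m%d %H:%M")
      | rename host AS HostName
      | dedup minute HostName
      | fields minute, HostName ]
```

The subsearch returns one minute/HostName pair per stack trace, and the outer search keeps only the normal events that match one of those pairs. A stack trace logged just after a minute boundary would not match an event logged just before it, so this is still only an approximation.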
Maybe something like this (assuming the normal log events and the stack-trace events are at most 5 minutes apart):
sourcetype=yourSourceType | transaction host maxspan=5m startswith="GRID APPLY CHANGES START" endswith="error"
I don't think I follow how this would be set up. I'm really only interested in those transactions during which an exception occurred. I've used transactions before, but I don't see how they apply here.
It seems to me that I need a search that identifies stacktraces and then does some kind of join or subsearch using the host and time.
Since there is a stack trace, the normal end-of-transaction entry, such as [GRID APPLY CHANGES END], is missing.
I can see that the host field matches between these two kinds of log events, so a transaction can be built on it. Have a look at the transaction command:
http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/Transaction
You can define how the grouping should be done, for example based on the maximum duration (span) within which both of these events occur.
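To keep only the groups in which an exception actually occurred, one option is to build the transactions without an end marker and filter them afterwards. This is an untested sketch. It assumes that each stack trace is indexed as a single multi-line event, that it shares the `host` value of the normal events, and that the phrase "unhandled error" from the sample stack trace is a reliable marker.

```
sourcetype=yourSourceType
| transaction host maxspan=5m startswith="GRID APPLY CHANGES START"
| search "unhandled error"
```

Because no endswith is given, the missing [GRID APPLY CHANGES END] entry does not stop a group from forming. The final search then discards every transaction that does not contain a stack trace.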
Typical log entry:
20140805 13:59:22 [PERF] [GRID APPLY CHANGES START] Action=GridApplyChanges, Guid=8c1551d8-1fc2-478e-a425-aa5535690057, PlanId=8df9ab68-3d08-48d5-a5de-a36f00cd68ac, PlanName=MYPlanName, Dept=123, StartPeriod=2015 P1 (FEBRUARY), EndPeriod=2015 P3 (APRIL), NumPeriods=3, EstimatedColumns=25, NumPlanRows=59, RPRows=0, SQAs=37524, SFAs=112572, NumDoors=636, AppliedBy=userid/a123456, AffProcessSize=1.03GB, Build=5.1.6.16392, Env=PRODUCTION, OSArch=64-bit, NetworkConnection=Local Area Connection, IPAddress=11.22.33.44, HostName=a1122334, ConnectionStatus=Connected, PlanMode=Server
Typical stack trace:
20140805 12:01:09 unhandled error from dispatcher, sender:System.Windows.Threading.Dispatcher
System.NullReferenceException: Object reference not set to an instance of an object.
at ........ System.Windows.BroadcastEventHelper.BroadcastEvent(DependencyObject root, RoutedEvent routedEvent)
at ...............
host = A1122334 source = c:\logs\App1\MetricsLog.20140805.8232.log sourcetype = OurSourceType
Typical log entry follows in next message:
You might be able to use the transaction command for this, maybe grouping on host. Could you post some sample event logs and stack-trace logs?