Splunk Search

How to reassemble indexed multiline events that were not split properly using either join or another search-time option?

drodman29
Path Finder

I have multiline events that were split by the default 256-line limit (MAX_EVENTS). While I have read up on how to fix the issue going forward, is there a way to use the data that I already have indexed? I have one event that has linecount=257 and some number of additional "events" (artificially created by the default line limit) with the same timestamp. I would like to join them in a transaction or some other search-time union that would make them useful as the one "logical" event that was intended. However, I can't seem to find any field that would allow me to join them. Any ideas?


drodman29
Path Finder

I'm getting good results with the following:
"SSO authentication API authenticate response" OR "}, {" | transaction host,_time keeporphans=false maxevents=3 startswith="SSO authentication API authenticate response"

The assumption is that the events are sequential, and that the split events get the same timestamp as the logical parent. I empirically determined that I wasn't getting more than 3 split events for my application. The additional implied assumption is that the same host is not logging another JSON-like event (see the OR clause) at exactly the same timestamp as the parent/logical event.
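
To make those assumptions concrete, here is a minimal Python sketch of the same grouping idea outside Splunk. It is not a reimplementation of the transaction command: the event dictionaries, the `reassemble` function, and the `max_pieces` cap are hypothetical names used for illustration, and the cap simply truncates a group rather than mirroring exactly how maxevents closes a transaction.

```python
from itertools import groupby

# Text that identifies the parent (first) piece of a split event.
MARKER = "SSO authentication API authenticate response"


def reassemble(events, marker=MARKER, max_pieces=3):
    """Merge adjacent events that share host and timestamp.

    `events` is a list of dicts with 'host', 'time' and 'raw' keys,
    given in the order the pieces were logged (an assumption).
    """
    merged = []
    # groupby only merges neighbours, which encodes the
    # "split pieces are sequential" assumption.
    for (host, time), group in groupby(
        events, key=lambda e: (e["host"], e["time"])
    ):
        pieces = list(group)
        # Skip groups that do not begin with the parent event,
        # roughly what startswith plus keeporphans=false achieves.
        if marker not in pieces[0]["raw"]:
            continue
        # Simplified stand-in for maxevents: keep the first few pieces.
        pieces = pieces[:max_pieces]
        merged.append(
            {
                "host": host,
                "time": time,
                "raw": "\n".join(p["raw"] for p in pieces),
                "piece_count": len(pieces),
            }
        )
    return merged


if __name__ == "__main__":
    sample = [
        {"host": "web1", "time": "2015-06-09 12:17:58,169",
         "raw": "INFO - SSO authentication API authenticate response:\n{"},
        {"host": "web1", "time": "2015-06-09 12:17:58,169",
         "raw": "}, {\n\"name\" : \"group\""},
    ]
    for event in reassemble(sample):
        print(event["piece_count"], "pieces merged for", event["host"])
```

The sketch fails in the same place the search would: if another JSON-like event from the same host carries exactly the same timestamp, it is merged into the wrong parent.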

lguinn2
Legend

If there is no commonality in the fields, you can't "join" them by anything other than time proximity. (And you wouldn't need

This is unlikely to work accurately. But if you post some example data, we might think up a way to give it a try.

drodman29
Path Finder

Here is some cleaned-up example data. It is a log4j entry with a dumped JSON object embedded in it.
2015-06-09 12:17:58,169 INFO (mymodule.java:64) - SSO authentication API authenticate response:
{
"status" : "AUTHENTICATED",
"inactive" : false,
"login" : "here",
"domain" : "there",
"principal" : "someguid",
"otherPrincipal" : "someotherguid",
"method" : "mymethod",
"hashId" : "longstring",
"clientHost" : "1.1.1.1",
"subtoken" :{
"subsubtoken" :{
... 400 more lines of json blah... listing a variable number of group memberships
}
}
}

I get at least one event with a line count of 257, then one or more additional events depending on the number of lines in the JSON object, which is variable. The break seems to fall at a pseudo-random point within the data structure.
