I have a multithreaded application that writes out intermingled logs and having performance issues searching with transactions, and looking for a simpler way to coalesce events.
simplified mylog.log example:
10:00 thread=50 start
10:01 thread=51 start
10:02 thread=50 dosomething
10:03 thread=51 dosomethingelse
10:03 thread=50 error
10:04 thread=50 end
10:05 thread=51 end
if I search with sourcetype=mylogtype | transaction tid startswith="start" | search error, the result will be correct but it is very slow, especially compared with just "sourcetype=mylog error". I think real issue here is that events are single-line and only joined later, but I don't see how intermingled log files like this can have the lines combined into single events appropriately.
Here you go:
The subsearch between the square brackets retrieves all the 'thread' (i.e., tid) values and generates an OR expression of (thread=val1 OR thread=val2 ...). Given your sample data, after the subsearch runs, the search would look like this:
and the results would be one transaction:
Two more notes:
the final "| search error" is required because it's possible, and actually likely, that thread values are reused else where, and not all thread=50 transactions have errors. Essentially the subsearch gets you a set of candidate events, then they are put into transactions, and finally those transactions that don't have 'error' are filtered out. This will be massively faster because you're only looking at the events that have the 'thread' values that also have errors.
In the 4.2 timeframe there will be a search command to do all this for you (e.g. "| searchtxn mythreaddef error OR warn")
You might want to look into the map
and localize
commands here, especially of the typical number of error events tends to be low. That should let you find the errors first, then you can look for the related events and build your transactions.
One other thing that can help improve the speed of transaction
searches is to tune the values of maxspan
and maxpause
(link). Setting these to lower values can speed up the search, at the cost of losing any events that are outside the maximum time intervals.