As I always say, which search is the best data analytics solution depends on the data. This is true in general, as @PickleRick explained, and it is even more true in an ambiguous case like yours. A discussion last year showed me possibilities that I hadn't known before. But whether they will help your use case depends on a lot of things. So, let me first put out some qualifiers that immediately come to mind. There can be many others.

Does every field of interest appear in every event in which sourcetype_1_primary, sourcetype_2_primary, or sourcetype_3_primary is present?

Are sourcetype_1_primary, sourcetype_2_primary, and sourcetype_3_primary already extracted at search time, i.e., your <initial search> does not have to extract any of them?

The gain from such optimization also depends on how many calculations are performed between the index search and stats.

This is not to say that failing these qualifiers will preclude potential benefits from similar strategies, but the following is based on them.

The idea is to limit search intervals using subsearches. For this to work, of course, the subsearches employed must be extremely light. Hence tstats. Here is a little demonstration.

Original:

index=_introspection component=* earliest=-4h
| stats latest(*) as * by component

With time filters:

index=_introspection component=* earliest=-4h
[| tstats max(_time) as latest where index=_introspection earliest=-4h by component index
| eval earliest = latest - 0.1, latest = latest + 0.1]
| stats latest(*) as * by component

I tested them on a standalone instance on my laptop, so there are few events (only 10 components), and I used 1s shifts instead of 0.1s. Even so, the baseline is extremely unstable, ranging from 0.76s to 1.8s. The biggest gain I saw was from 1.8s to 0.6s. Smaller gains were on the order of 0.75s to 0.68s.

Back to your correlation search. Assuming your <initial search> is a combined search, try something like this:

(sourcetype=sourcetype_1 sourcetype_1_primary=*
[| tstats max(_time) as latest where sourcetype=sourcetype_1 by sourcetype_1_primary
| eval earliest = latest - 0.1, latest = latest + 0.1])
OR (sourcetype=sourcetype_2 sourcetype_2_primary=*
[| tstats max(_time) as latest where sourcetype=sourcetype_2 by sourcetype_2_primary
| eval earliest = latest - 0.1, latest = latest + 0.1])
OR (sourcetype=sourcetype_3 sourcetype_3_primary=*
[| tstats max(_time) as latest where sourcetype=sourcetype_3 by sourcetype_3_primary
| eval earliest = latest - 0.1, latest = latest + 0.1])
| fields _time, xxx, xxx, <pick your required fields>
| eval coalesced_primary_key=coalesce(sourcetype_1_primary, sourcetype_2_primary, sourcetype_3_primary)
| stats latest(*) AS * by coalesced_primary_key
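
If you want to see which time windows a subsearch actually hands to the outer search, one way is to run the subsearch on its own and append format. This is only a sketch using the _introspection demonstration from earlier; the 0.1s shift is the same assumption as above and may need widening for your data.

```
| tstats max(_time) as latest where index=_introspection earliest=-4h by component index
| eval earliest = latest - 0.1, latest = latest + 0.1
| format
```

The result should be a single field named search that holds one parenthesized group per component, each carrying its own earliest and latest. That string is roughly what the outer search receives in place of the bracketed subsearch. The same check can be applied to each sourcetype subsearch, keeping in mind that tstats can only group by fields that are indexed.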