For the comparison of sourcetype=foo vs. log_type=foo, it all depends on how log_type is defined. If the search can infer that log_type=foo requires 'foo' to appear as a keyword in the text of the event, then in many cases the two will be equally efficient: in both cases we can use the bloom filter to rule out buckets that cannot contain a match.
However, in cases where 'foo' is present in your events but is not actually the value of log_type, sourcetype=foo will be more efficient. We can retrieve only the events where sourcetype=foo from the index, whereas events that contain 'foo' but where log_type is not 'foo' have to be retrieved and then post-filtered once the extraction rules that determine the value of log_type have been applied.
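To make the difference concrete, here is a toy sketch in Python. It is not how Splunk is implemented; the events, the index structures, and the function names are all made up for illustration. It only shows the general idea: a metadata lookup returns exactly the matching events, while a search-time field has to fetch every event containing the keyword and then discard the false positives.

```python
import re
from collections import defaultdict

# Hypothetical events as (sourcetype, raw text). Illustrative data only.
EVENTS = [
    ("foo", "log_type=foo user=alice action=login"),
    ("bar", "log_type=bar note=foo was mentioned here"),
    ("bar", "log_type=bar user=bob action=logout"),
    ("foo", "log_type=foo user=carol action=login"),
]


def build_indexes(events):
    """Build a toy keyword index and a toy sourcetype index (event ids)."""
    keyword_index = defaultdict(set)
    sourcetype_index = defaultdict(set)
    for event_id, (sourcetype, raw) in enumerate(events):
        sourcetype_index[sourcetype].add(event_id)
        for token in re.findall(r"[a-z0-9_]+", raw.lower()):
            keyword_index[token].add(event_id)
    return keyword_index, sourcetype_index


def search_by_sourcetype(sourcetype_index, value):
    """Metadata search: the index already holds exactly the matching events."""
    return sorted(sourcetype_index.get(value, set()))


def search_by_field(events, keyword_index, field, value):
    """Search-time field: fetch keyword candidates, then post-filter."""
    candidates = sorted(keyword_index.get(value.lower(), set()))
    pattern = re.compile(rf"\b{re.escape(field)}=(\w+)")
    matches = []
    for event_id in candidates:
        found = pattern.search(events[event_id][1])
        if found and found.group(1) == value:
            matches.append(event_id)
    return candidates, matches


KEYWORD_INDEX, SOURCETYPE_INDEX = build_indexes(EVENTS)
BY_SOURCETYPE = search_by_sourcetype(SOURCETYPE_INDEX, "foo")
CANDIDATES, BY_FIELD = search_by_field(EVENTS, KEYWORD_INDEX, "log_type", "foo")

print("sourcetype=foo events:", BY_SOURCETYPE)
print("log_type=foo candidates:", CANDIDATES, "after post-filter:", BY_FIELD)
```

In this sketch both searches end up with the same events, but the field search had to read one extra event (the one that only mentions 'foo' in a note) before throwing it away. The more often 'foo' shows up outside log_type, the more of that wasted work there is.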
If log_type=foo can come from a wide variety of sources, the field-based search will tend to fall further behind sourcetype=foo in efficiency.
In short, use sourcetype, host, and source when they make sense and the search will run over a significant amount of data or will be saved and reused. However, the difference between searching on source/sourcetype/host and searching on a field is often not large enough to make mangling those fields for your use case worth it.
Typically, even if that kind of mangling did turn out to be worth it, it would be sufficient to use an indexed field instead.