I think that's because Splunk tries to match the 12 against the indexed terms rather than against the raw event. The _raw contains 12ms, which is not segmented into two pieces: it has been indexed as a single term, since it is one word with no major/minor breaking character inside it (ref: segmenters.conf).
This instead could work (but would return more results than expected):
sourcetype=ruby ruby_call_completed=12*
because Splunk should then look for indexed tokens starting with 12 (so 12ms, 123ms, ... would be found in the index).
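To make the difference between the exact and the wildcard lookup concrete, here is a toy Python sketch. It is only a model of the idea, not how Splunk's index is actually implemented: the term list and the lookup helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Toy stand-in for the terms stored in the index (hypothetical data,
# not Splunk's real index structure). Note "12ms" is one single term.
indexed_terms = ["12ms", "123ms", "7ms", "completed"]

def lookup(pattern):
    """Return the indexed terms matching a literal or wildcard pattern."""
    return [term for term in indexed_terms if fnmatchcase(term, pattern)]

exact = lookup("12")    # literal 12: no term is exactly "12"
prefix = lookup("12*")  # wildcard: every term that starts with "12"

print(exact)
print(prefix)
```

The literal lookup finds nothing because no term equals 12, while the wildcard lookup also picks up 123ms, which is why the 12* search returns more results than expected.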
On the other hand, your second example:
sourcetype=ruby | search ruby_call_completed=12
first acts on the indexed data matching sourcetype=ruby, then fields are extracted, THEN the secondary search is executed.
I think this is due to the map-reduce paradigm:
"map" is executed on the distributed servers, and it is just a search for
matching events based on the precomputed index of the logs
"reduce" extracts fields and applies the secondary search, but this is
only executed on the node where the search was first launched.
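The two-step behaviour I am guessing at can be sketched in Python like this. Everything here is an assumption for illustration: the sample events, the extraction regex and the function names are hypothetical, and this is not Splunk's actual search pipeline.

```python
import re

# Hypothetical sample events; the _raw layout is assumed for illustration.
events = [
    {"sourcetype": "ruby", "_raw": "call completed in 12ms"},
    {"sourcetype": "ruby", "_raw": "call completed in 123ms"},
    {"sourcetype": "java", "_raw": "call completed in 12ms"},
]

# Hypothetical field extraction for ruby_call_completed.
FIELD_RE = re.compile(r"completed in (\d+)ms")

def first_pass(events, sourcetype):
    """Step 1: select events using indexed metadata only."""
    return [e for e in events if e["sourcetype"] == sourcetype]

def second_pass(events, value):
    """Step 2: extract the field, then compare the extracted value."""
    matched = []
    for event in events:
        m = FIELD_RE.search(event["_raw"])
        if m and m.group(1) == value:
            matched.append(event)
    return matched

results = second_pass(first_pass(events, "ruby"), "12")
print(results)
```

Here the comparison runs against the extracted value 12, not against the indexed term 12ms, so only the first ruby event is kept.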
However, that's only my two cents...
Paolo