I'm experiencing exactly the same problem, with a similar setup: I extract an indexed field and then remove that text from _raw after indexing (and yes, I have INDEXED_VALUE=false). I am running Splunk 4.1.3.
1. Double quotes in the transform (e.g., FORMAT=fieldname::"$1") preserve extracted field values that contain a space, and I can see the correct values listed in the metadata under the event. But filtering on any of these values fails (e.g., by clicking the value in the event's metadata, or choosing it from the field list on the left, both of which add fieldname="value" to the search). It fails whether or not there's a space in the value.
2. Removing the quotes from the transform (e.g., FORMAT=fieldname::$1) makes searching and filtering work as expected. But extracted field values that should include a space are instead truncated at the space.
What I've noticed that goes beyond the discussion above is that in situation 1 (quotes in the transform), if you end the filter term with a *, e.g., fieldname="value*", the search succeeds. I haven't found any literal character other than * that I can put in that final position and still have the search succeed.
And because it's customary at this point to be asked why one is indexing and modifying _raw:
I want to associate additional metadata (serial#, model#, and software version) with logfile lines and other event text I'm streaming via TCP from a large number of sources. If the ***SPLUNK*** header trick worked for custom indexed fields instead of only source, sourcetype, and host, I would put these values there and we'd be done. Instead, I append the metadata to each log line like this:
***META*** serial=ABCDE model=FGH version="1.1e"
I then have a transform that removes ***META*** and everything after it, which runs after the indexing transform has been invoked. It would be wrong to leave the original line all mangled, so search-time extraction is no good here.
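For readers trying to reproduce this, here is a minimal sketch of the kind of setup described above. It is not the exact configuration in use: the stanza names (meta_serial, meta_strip), the sourcetype name (my_stream), and both regexes are assumptions made for illustration, and only the serial field is shown.

```ini
# transforms.conf (hypothetical stanza names and regexes)

# Index-time extraction of the serial number from the appended metadata.
# Unquoted FORMAT: per the observations above, searching works but
# values are cut at the first space. Quoting it as serial::"$1" keeps
# spaces but is the variant where fieldname="value" searches fail.
[meta_serial]
REGEX = \*\*\*META\*\*\*.*?\bserial=(\S+)
FORMAT = serial::$1
WRITE_META = true

# Rewrite _raw to keep only the text before the ***META*** marker.
[meta_strip]
REGEX = ^(.*?)\s*\*\*\*META\*\*\*.*$
DEST_KEY = _raw
FORMAT = $1

# props.conf (hypothetical sourcetype name)
# Transforms in one class run left to right, so extraction happens
# before the metadata is stripped from _raw.
[my_stream]
TRANSFORMS-meta = meta_serial, meta_strip

# fields.conf
[serial]
INDEXED = true
INDEXED_VALUE = false
```

The ordering in the TRANSFORMS-meta line matters in this sketch: if meta_strip ran first, the extraction regex would have nothing left to match.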
I saw mention elsewhere that the ***SPLUNK*** header feature had fallen out of favor and wasn't being tended to. It would be great if this limitation could be addressed, especially since the metadata would only need to appear once at the top of a logfile stream rather than being bolted onto each line.