kscher, the short answer is that Splunk uses certain characters to decide where a word boundary is, and usually spaces and punctuation work fine. But when unusual characters separate your words, the words won't get indexed the way you need for a search to work. The custom segmentation simply tells Splunk how to split the words so they can be searched for and found.
Extraction works independently of this, because extraction happens after events are found. So it may appear to work even if the field values that get extracted cannot be found in the index.
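To make that concrete, here is a rough Python sketch of the idea. It is not Splunk's actual indexing code: the `index_terms` helper, the `§` separator, the sample event, and both breaker lists are made up for illustration. It just shows how a term can be missing from the index while a regex extraction on the raw event still returns it.

```python
import re

def index_terms(event, breakers):
    """Split an event into lowercase terms on any of the breaker characters.

    Hypothetical stand-in for index-time segmentation, not Splunk's real logic.
    """
    pattern = "[" + re.escape(breakers) + "]+"
    return {term.lower() for term in re.split(pattern, event) if term}

# Made-up event that uses an unusual separator character.
event = "user=alice action§login status§ok"

# Assumed breaker sets; real defaults differ.
default_breakers = " =,;"
custom_breakers = default_breakers + "§"

default_index = index_terms(event, default_breakers)
custom_index = index_terms(event, custom_breakers)

# Extraction runs against the raw event text, so it does not care
# how the event was split into index terms.
extracted = re.search(r"action§(\w+)", event).group(1)

print("login" in default_index)  # False: only "action§login" was indexed
print("login" in custom_index)   # True: the custom breaker split the word out
print(extracted)                 # login
```

With the default breakers the only indexed term is `action§login`, so a search for `login` has nothing to match, even though the extraction happily pulls `login` out of the event once it has been found some other way.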