We recently embarked on a project to migrate our on-prem Splunk instance to Splunk Cloud, and everything has gone well except for one outstanding mystery that is baffling me. We have a search-time field extraction and calculated field defined in props.conf:
EXTRACT-versionextract = <Version>(?<Version>[^<]+)
EVAL-Version = if(match(Version, "[\d\.\w\s]"), Version, "Unknown")
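To make the two settings concrete, here is a quick sketch of what they do, translated into Python's `re` module (the sample event is hypothetical; our real log format may differ):

```python
import re

# Hypothetical raw event body containing the Version tag.
event = "<Product>Foo</Product><Version>9.0.601</Version><Host>bar</Host>"

# Same pattern as EXTRACT-versionextract: capture everything after
# "<Version>" up to the next "<".
m = re.search(r"<Version>(?P<Version>[^<]+)", event)
version = m.group("Version") if m else None

# Same logic as EVAL-Version: keep the value if it contains at least one
# digit, dot, word, or whitespace character, otherwise fall back to "Unknown".
version = version if version and re.search(r"[\d\.\w\s]", version) else "Unknown"
print(version)  # -> 9.0.601
```

So for an event carrying `<Version>9.0.601</Version>` the extraction yields `Version=9.0.601`, and the EVAL leaves it untouched.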
This works great in our on-prem environment, and it mostly works in the cloud: the Version field shows up in the fields sidebar and is usable to do stats by, etc. But if a specific value is chosen to search by, it always returns no results, i.e.:
index=foo Version="9.0.601"
index=foo Version="*1"
index=foo Version="9*"
index=foo Version="*.*"
index=foo | search Version="9.0.601"
all return 0 results, while a plain wildcard search (index=foo Version=*) returns all results in the index.
I also find it interesting that excluding the value (e.g. Version!="9.0.601") properly returns all results except those with Version=9.0.601.
I'm out of ideas and I'm hoping some ninja here can lend me a hand 🙂
You have to tell the search head that these fields are not indexed values (they do not fall between two major/minor breakers) by adding this to fields.conf:

[Version]
INDEXED_VALUE = false
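For reference, on an on-prem search head this would live in a fields.conf inside an app's local directory (the app name below is just a placeholder; in Splunk Cloud you'd typically get it deployed via a self-service app or a support ticket):

```ini
# $SPLUNK_HOME/etc/apps/<your_app>/local/fields.conf
# Stanza name must match the field name exactly (case-sensitive).
[Version]
INDEXED_VALUE = false
```

This stops the search optimizer from requiring the field's value to exist as a token in the index before running the search-time extraction.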
I have seen this link before but had discounted this solution for 3 reasons. Firstly, according to the linked article:
UPDATE: in 4.3 and after search time fields extracted from indexed fields work without any further configuration
this suggests to me it ought not be needed in Splunk Cloud (7.2.6). Secondly, as mentioned in my post above, our implementation worked fine in both our on-prem environments (Splunk 6.6.11 and 7.0.5) without the need to add this clause to fields.conf. Thirdly, the blog post suggests that "To confirm that you're hitting this problem and not something entirely unrelated you can try running the following type of search – which should return the expected results":
search sourcetype=MyEvents MyField=* | search Myfield=ValidValue
my search of
search sourcetype=ValidSourcetype Version=* | search Version="9.0.601"
returns no results.
Are you aware of any potential downsides to asking Splunk to add this to our deployment? I've considered trying it anyway, but I'd like to avoid getting into a game of enacting random changes with unclear potential downsides. Given the objections I've raised, do you still consider it a likely remedy?
Edit: Another thing that makes me suspicious of this being the reason is that the blog post states "When we index an event we tokenize it based on some rules and add those tokens to the inverted index. Without losing generality let's assume for now that we tokenize based on non-alphanumeric characters: ie every word is a token. So let's take the above search example and see what the search asks the index for:
search: sourcetype=MyEvents MyField=ValidValue
ask index: sourcetype=MyEvents AND ValidValue
host/source/sourcetype/index and some other fields are special fields that our index understands – what’s important here is that the search is asking the index for “ValidValue”. Now, if ValidValue is not a token in the index then the search would return nothing and we’d hit the problem described earlier..."
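The blog's simplified model can be sketched as a toy tokenizer (splitting on non-alphanumeric characters, exactly as the quote assumes; real Splunk segmentation is governed by major/minor breakers in segmenters.conf, so this is only an illustration of the blog's reasoning, not of actual indexing):

```python
import re

def tokenize(raw):
    # Blog's simplification: every maximal run of alphanumerics is a token.
    return {t for t in re.split(r"[^0-9A-Za-z]+", raw) if t}

raw_event = "<Version>9.0.601</Version>"  # hypothetical raw event fragment

tokens = tokenize(raw_event)
print(sorted(tokens))        # ['0', '601', '9', 'Version']
print("9.0.601" in tokens)   # False: under this model the full value is never a single token
```

Under that model, asking the index for the literal term "9.0.601" would find nothing, because only the pieces 9, 0, and 601 were indexed.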
Now this doesn't explicitly state as much, but it suggests to me that if I were having this problem and I were to search:
sourcetype=GoodSource AND "9.0.601"
I shouldn't find any results. But with this search I do get all the results I want (as well as some I'd rather not see, where 9.0.601 appears elsewhere in the log). I could be very wrong in this understanding, as I'm pretty new to Splunk, so please let me know if I'm misunderstanding something.