Hello,
I'm having an issue when trying to filter events based on accented characters.
For instance, if I look at the ingested events with index=my_index sourcetype=my_source, I can see the events that have the field value I'm looking for:
... asset_name ...
... D. João ...
If I try to filter the events at search time with index=my_index sourcetype=my_source asset_name="D. João", the "No results found." message is displayed. The same happens if I select the desired field value from the field list on the left.
How can I get this to work?
I've looked for similar questions here on the Splunk Answers forum, but they mainly point to the sourcetype encoding, which I think might not be the issue, since the events seem to be properly encoded.
Thanks in advance!
-- since the events seem to be properly encoded...
What is the encoding?
There is a similar thread: How do I search for accented characters?
Hello ddrillic,
It's set to default.
From my previous research on other threads here on the forum, my understanding was that if the characters are shown properly, the encoding selected by Splunk is correct.
If this is not the case, I might need to manually select the correct one. Would changing the current encoding affect already ingested data, or only the data ingested from now on?
Thanks
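On the point that correctly displayed characters imply a correct encoding: this does not always follow, because the same visible text can be stored as different underlying code points or bytes. The sketch below is in Python rather than SPL, purely to make the bytes visible. It is an illustration of one possible cause, not a confirmed diagnosis for this thread, and the names `composed`, `decomposed` and `describe` are hypothetical.

```python
import unicodedata

# Two strings that both render as "D. João" on screen.
composed = "D. Jo\u00e3o"      # "ã" as a single code point (NFC form)
decomposed = "D. Joa\u0303o"   # "a" followed by a combining tilde (NFD form)


def describe(label, text):
    """Print the code points and UTF-8 bytes behind a displayed string."""
    points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(f"{label}: {text}")
    print(f"  code points: {points}")
    print(f"  utf-8 bytes: {text.encode('utf-8').hex(' ')}")


describe("composed", composed)
describe("decomposed", decomposed)

# A literal comparison fails even though both strings look the same.
print(composed == decomposed)

# Normalizing both sides to the same form makes them comparable again.
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
print(same_after_nfc)

# The same character also maps to different bytes in different charsets.
latin1_bytes = "ã".encode("latin-1")  # one byte
utf8_bytes = "ã".encode("utf-8")      # two bytes
print(latin1_bytes, utf8_bytes)
```

If the stored value and the typed search term differ in this way, an exact field match could return nothing while the events still look correct in the results view. Whether that is what happens here would need to be checked against the raw source data.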
Try something to the effect of | regex ã
Hello CarsonZa,
It works if I use the | regex command. Is this the only way to retrieve the filtered data?
I'm worried about this, since I tried to use the like() function and it didn't evaluate as expected.
index="index_name" source="source_name"
| regex field_name="Carvão"
| eval x=if(like(field_name,"%ã%"),1,0)
| stats count by x
It only counts x=0
Thanks
Create a capture group like so:
| rex field=_raw "\s(?<name>.*)"
| eval x=if(like(name,"%ã%"),1,0)
| stats count by name
Hello CarsonZa,
That works, but I would not want regex to be the final solution, since most of the users who run searches do not have the required knowledge.
I might have to assess the encoding to prevent this from happening.
Thanks