When the Okta Identity Cloud Add-on for Splunk saves Okta log data into Splunk, Japanese characters are stored as Unicode escape sequences and are never unescaped.
Example:
The characters "田中" in the original log are saved in Splunk in escaped form, that is, as "\u7530\u4e2d".
As a result, we cannot find the logs we want by searching with Japanese characters.
Example:
I expect the search below to return logs containing "田中", but nothing is found:
index="okta_logs" 田中
To fix this, I think the source code of the Okta Identity Cloud add-on needs to be modified.
I would like the Okta Identity Cloud Add-on for Splunk to have a function that Unicode-unescapes multi-byte characters before indexing.
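For reference, here is a minimal sketch of what I suspect is happening (I have not confirmed this against the add-on's source code, and the event below is only illustrative): if the events are serialized with Python's json.dumps using its default ensure_ascii=True, every non-ASCII character is written out as a \uXXXX escape before indexing, while ensure_ascii=False keeps the raw UTF-8 text.

import json

# Illustrative event only; not the add-on's actual payload structure.
event = {"actor": {"displayName": "田中"}}

# json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
# which matches what we see stored in the index:
print(json.dumps(event))
# -> {"actor": {"displayName": "\u7530\u4e2d"}}

# With ensure_ascii=False the raw UTF-8 characters are kept, so a search
# for 田中 would match the indexed event text:
print(json.dumps(event, ensure_ascii=False))
# -> {"actor": {"displayName": "田中"}}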
Hello Shomatsuo,
I've created an issue on the GitHub repo for this add-on:
https://github.com/mbegan/Okta-Identity-Cloud-for-Splunk/issues/28
The field definition values are UTF-8 encoded by Splunk, so that part is okay.
However, field definitions are not a solution, because I want to perform a full-text search.
Re-indexing the data into a summary index would also produce UTF-8 encoded text, but it wastes storage space and degrades real-time performance, so it is not a good solution either.