Recognizing Unicode

Communicator

Hi there, I am receiving JSON data via TCP, but I am unable to convert the escaped Unicode values into the correct characters.

For example:

Search string: sourcetype = 123, results =

APPLICATION_NAME:  ABC
ADDRESS: %u0e1b%u0e32%u0e01%u0e41%u0e1e%u0e23%u0e01 

From what I understand, I should add the following to /etc/system/local/props.conf:

[sourcetype::123]
CHARSET=TIS-620

After running | extract reload=T, that should work.

Any ideas? Here's the link to the Unicode table if anyone is interested:

http://www.unicode.org/charts/PDF/U0E00.pdf

0 Karma
1 Solution

Communicator

Found a workaround by having a macro:

| eval ADDRESS=replace(ADDRESS, "%u0e01", "ก")
| eval ADDRESS=replace(ADDRESS, "%u0e02", "ข")

.... Repeat for all 50ish characters

Tedious, but it works. I believe the problem is that the server is not sending the data to me as properly encoded Unicode, which is why the manual replacement is needed.
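
For anyone who can preprocess the data before it reaches Splunk, the character-by-character replacement above can be generalized. The %uXXXX form is the non-standard escaping produced by JavaScript's legacy escape() function, so the field arrives as plain ASCII text, which would explain why a CHARSET setting has nothing to convert. Below is a minimal Python sketch, assuming the sender cannot be changed and that every escape has exactly four hex digits; the function name decode_percent_u is hypothetical and not part of Splunk.

```python
import re

# Matches one %uXXXX escape: a percent sign, the letter "u", four hex digits.
_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def decode_percent_u(text: str) -> str:
    """Replace every %uXXXX escape with the character at that code point.

    Hypothetical helper for illustration. Text that is not a %uXXXX
    escape is left unchanged. Surrogate pairs are not combined.
    """
    return _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    # The ADDRESS value from the question.
    print(decode_percent_u("%u0e1b%u0e32%u0e01%u0e41%u0e1e%u0e23%u0e01"))
```

This handles all Thai characters in one pass instead of one replace per character, but it only helps if there is a step where such a script can run before indexing.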

0 Karma

Builder

Try using the charset ISO-IR-166. After changing the value, restart the Splunk service.

Regards,

0 Karma

Communicator

When I make the change, will it apply to events that are already indexed, or only to new incoming events?

0 Karma

Builder

Only affects new events.

0 Karma

Communicator

Didn't work for us. Is the Unicode supposed to be displayed like that, with a percent sign in front of each code?

0 Karma