Why is the charset of the XML response data from the REST API modular input broken?

leeyounsoo
Path Finder

Hi,
I have a problem with the character set.

I am using the REST API modular input.
However, characters are garbled in the XML data returned in the response from the REST API.
For example:

<?xml version="1.0" encoding="UTF-8"?>
<tag1>
<![CDATA[ìœ ì˜ë¯¸ì˜ 마음은 ì–¸ì œë‚˜ 청춘]]>
</tag1>

I tried the following:

  1. modify props.conf

    [sourcetype]
    CHARSET = UTF-8

  2. modify responsehandler.py

    coding: utf-8
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

    import json
    import datetime

    class MyResponseHandler:

        def __init__(self,**args):
            pass

        def __call__(self, response_object,raw_response_output,response_type,req_args,endpoint):
            cookies = response_object.cookies
            if cookies:
                req_args["cookies"] = cookies
            raw_response_output.encode('utf-8')
            print_xml_stream(raw_response_output)
    

    ....(skip)

What should I do?

asohahn_splunk
Splunk Employee

Your broken text looks like UTF-8 data that was decoded as ISO-8859-1 (Latin-1). Is the first line of your responsehandler.py, "coding: utf-8", just a typo? It should be something like "# -*- coding: utf-8 -*-".
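
To illustrate the kind of mismatch described above, here is a minimal sketch that takes a short Korean string, encodes it as UTF-8, and decodes the bytes with the wrong charset. The sample string is only an illustration. Latin-1 is used because it maps every byte to a character; the sample in the question looks closer to the related Windows-1252 charset, but the mechanism is the same.

```python
# -*- coding: utf-8 -*-

# Illustrative sample text (not taken from the real response).
original = u"청춘"

# These are the bytes a server sending UTF-8 would put on the wire.
utf8_bytes = original.encode("utf-8")

# Decoding UTF-8 bytes as Latin-1 yields one wrong character per byte.
garbled = utf8_bytes.decode("latin-1")

# Undoing the wrong decode and applying the right one recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")
```

If the same round trip repairs a sample of the real output, that suggests the bytes themselves are valid UTF-8 and the wrong decode happens somewhere between the HTTP response and the indexed event.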

leeyounsoo
Path Finder

That was a typo.
The Splunk Answers comment form automatically changes that string.

I wrote it like this:
"# -- coding : utf-8 --"

But the text is still broken.

  • My Python version is 2.7.x
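
One thing worth checking in the handler: `raw_response_output.encode('utf-8')` returns a new string and does not change `raw_response_output`, so that line has no effect on what gets printed. If `response_object` is a `requests.Response` (an assumption, the thread does not confirm it), its `.content` attribute holds the undecoded bytes, and decoding them explicitly avoids relying on a guessed charset. A minimal sketch of that idea follows; `decode_response_body` and `FakeResponse` are hypothetical names used only for illustration.

```python
# -*- coding: utf-8 -*-

def decode_response_body(response_object, encoding="utf-8"):
    """Decode the raw body bytes with an explicit charset.

    Assumes response_object exposes the undecoded bytes as .content,
    as requests.Response does.
    """
    return response_object.content.decode(encoding)


class FakeResponse(object):
    """Hypothetical stand-in for the real response object."""

    def __init__(self, content):
        self.content = content


# Simulate a server that sends UTF-8 encoded XML.
body = u"<tag1><![CDATA[청춘]]></tag1>"
response = FakeResponse(body.encode("utf-8"))

# The result is a unicode string holding the original characters.
decoded = decode_response_body(response)
```

Inside `__call__`, the decoded text could then be passed on in place of `raw_response_output`. Whether the output function expects text or UTF-8 bytes is not stated in the thread, so treat this as a starting point rather than a confirmed fix.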