<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Format unicode characters in json input in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Format-unicode-characters-in-json-input/m-p/400265#M71292</link>
    <description>&lt;P&gt;We're trying to index JSON-formatted logs from Kubernetes pods by stripping the JSON wrapper so that the events look like normal syslog input.&lt;/P&gt;

&lt;P&gt;We're using the following lines in props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SHOULD_LINEMERGE = false
SEDCMD-1_unjsonify = s/{"log":"(?:\\u[0-9]+)?(.*?)\\n","stream.*/\1/g
SEDCMD-2_unescapequotes = s/\\"/"/g
BREAK_ONLY_BEFORE={"logname":
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;We managed to transform the indexed logs from this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"log":"2019-07-18T14:11:48+00:00 kubernetes location1 0.0.0.0 - - [18/Jul/2019:14:11:48 +0000] \"GET /saml2/idp/sso?sp=something.com\u0026RelayState=https://something.something.com/URL/paths HTTP/1.1\" 200 2808 \"-\" \"jmeter\" \"something.something.com\" \"0.0.0.0\" 15982 \"95931E90-49DC-462D-B29F-86AF681A6B3B\"\n","stream":"stdout","time":"2019-07-18T14:11:48.485908193Z"}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-06-13T07:33:53+00:00 kubernetes location1 0.0.0.0
- - [13/Jun/2019:07:33:53 +0000] "POST /saml2/sp/acs/oac.something.com HTTP/1.1" 200 5573 "https://something.else.com/saml2/idp/sso?sp=something.com\u0026RelayState=https%3A%2F%2Fsomething.else.com%2Fadmin%2F" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:67.0) Gecko/20100101 Firefox/67.0" "something.something.com" "0.0.0.0" 9109 "A088E5DB-311C-400E-8AE9-A7B74CA7365C"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, the indexed events still contain undecoded escape sequences, such as the JSON Unicode escape \u0026 and the URL percent-encodings %3A and %2F.&lt;/P&gt;

&lt;P&gt;How can we convert them to the characters they represent?&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Thu, 18 Jul 2019 14:24:01 GMT</pubDate>
    <dc:creator>vstariradev</dc:creator>
    <dc:date>2019-07-18T14:24:01Z</dc:date>
    <item>
      <title>Format unicode characters in json input</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Format-unicode-characters-in-json-input/m-p/400265#M71292</link>
      <description>&lt;P&gt;We're trying to index JSON-formatted logs from Kubernetes pods by stripping the JSON wrapper so that the events look like normal syslog input.&lt;/P&gt;

&lt;P&gt;We're using the following lines in props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SHOULD_LINEMERGE = false
SEDCMD-1_unjsonify = s/{"log":"(?:\\u[0-9]+)?(.*?)\\n","stream.*/\1/g
SEDCMD-2_unescapequotes = s/\\"/"/g
BREAK_ONLY_BEFORE={"logname":
&lt;/CODE&gt;&lt;/PRE&gt;
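&lt;P&gt;For reference, the effect of the two SEDCMD lines can be approximated outside Splunk with the Python sketch below. The function name strip_json_wrapper and the shortened sample event are hypothetical, and Python's re module is only a stand-in for Splunk's sed-style replacement, so edge cases may behave differently.&lt;/P&gt;

```python
import re

# Approximation of SEDCMD-1_unjsonify: keep only the text of the "log" field.
# Assumption: Python's re engine is close enough to Splunk's for this pattern.
UNJSONIFY = re.compile(r'\{"log":"(?:\\u[0-9]+)?(.*?)\\n","stream.*')

# Approximation of SEDCMD-2_unescapequotes: turn backslash-quote into a quote.
UNESCAPE_QUOTES = re.compile(r'\\"')


def strip_json_wrapper(raw_event):
    """Apply both substitutions in the same order as the props.conf lines."""
    text = UNJSONIFY.sub(r'\1', raw_event)
    return UNESCAPE_QUOTES.sub('"', text)


# Hypothetical, shortened event in the same shape as the Kubernetes log line.
raw = r'{"log":"GET \"/sso?sp=a.com\u0026RelayState=x\" 200\n","stream":"stdout","time":"t"}'
print(strip_json_wrapper(raw))
```

&lt;P&gt;In this sketch the wrapper and the escaped quotes are removed, while the \u0026 sequence passes through unchanged, which matches what we see in the indexed events.&lt;/P&gt;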

&lt;P&gt;We managed to transform the indexed logs from this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"log":"2019-07-18T14:11:48+00:00 kubernetes location1 0.0.0.0 - - [18/Jul/2019:14:11:48 +0000] \"GET /saml2/idp/sso?sp=something.com\u0026RelayState=https://something.something.com/URL/paths HTTP/1.1\" 200 2808 \"-\" \"jmeter\" \"something.something.com\" \"0.0.0.0\" 15982 \"95931E90-49DC-462D-B29F-86AF681A6B3B\"\n","stream":"stdout","time":"2019-07-18T14:11:48.485908193Z"}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;to this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;2019-06-13T07:33:53+00:00 kubernetes location1 0.0.0.0
- - [13/Jun/2019:07:33:53 +0000] "POST /saml2/sp/acs/oac.something.com HTTP/1.1" 200 5573 "https://something.else.com/saml2/idp/sso?sp=something.com\u0026RelayState=https%3A%2F%2Fsomething.else.com%2Fadmin%2F" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:67.0) Gecko/20100101 Firefox/67.0" "something.something.com" "0.0.0.0" 9109 "A088E5DB-311C-400E-8AE9-A7B74CA7365C"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;However, the indexed events still contain undecoded escape sequences, such as the JSON Unicode escape \u0026 and the URL percent-encodings %3A and %2F.&lt;/P&gt;

&lt;P&gt;How can we convert them to the characters they represent?&lt;/P&gt;

&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 18 Jul 2019 14:24:01 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Format-unicode-characters-in-json-input/m-p/400265#M71292</guid>
      <dc:creator>vstariradev</dc:creator>
      <dc:date>2019-07-18T14:24:01Z</dc:date>
    </item>
  </channel>
</rss>

