<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Use rex to strip certain characters from fields in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Use-rex-to-strip-certain-characters-from-fields/m-p/135865#M37168</link>
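    <description>&lt;P&gt;Yes, but keep in mind that this is an index-time function, so it changes the data on the way in... permanently. It also runs against _raw rather than a single field, so every matching character in the event is replaced, not only those in Course.&lt;/P&gt;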

&lt;PRE&gt;
[yoursourcetype]
SEDCMD-course = s/(‘|’)/'/g
&lt;/PRE&gt;

&lt;P&gt;You can read about it &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.0/admin/Propsconf"&gt;HERE&lt;/A&gt;, and I have excerpted the relevant part below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-&lt;class&gt; = &lt;sed script&gt;
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit card or social
  security numbers. For more information, search the online documentation for "anonymize
  data."
* Used to specify a sed script which Splunk applies to the _raw field.
* A sed script is a space-separated list of sed commands. Currently the following subset of
  sed commands is supported:
        * replace (s) and character substitution (y).
* Syntax:
        * replace - s/regex/replacement/flags
                * regex is a perl regular expression (optionally containing capturing groups).
                * replacement is a string to replace the regex match. Use \n for backreferences,
                  where "n" is a single digit.
                * flags can be either: g to replace all matches, or a number to replace a specified
                  match.
        * substitute - y/string1/string2/
                * substitutes the string1[i] with string2[i]&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Sun, 10 Nov 2013 08:02:22 GMT</pubDate>
    <dc:creator>rsennett_splunk</dc:creator>
    <dc:date>2013-11-10T08:02:22Z</dc:date>
    <item>
      <title>Use rex to strip certain characters from fields</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Use-rex-to-strip-certain-characters-from-fields/m-p/135864#M37167</link>
      <description>&lt;P&gt;Unicode punctuation characters in the range U+2000 to U+206F seem to make Splunk require Simplified Chinese fonts in exported PDFs, so I want to convert these characters to their ASCII equivalents.&lt;/P&gt;

&lt;P&gt;I can add the following to the search:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;rex field=Course mode=sed "s/(‘|’)/'/g"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;where the characters being replaced are U+2018 and U+2019 and the replacement is 0x27 (the ASCII apostrophe), but I want to put something in props.conf so that it always happens.&lt;/P&gt;

&lt;P&gt;How would I do this?&lt;/P&gt;</description>
      <pubDate>Sun, 10 Nov 2013 07:36:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Use-rex-to-strip-certain-characters-from-fields/m-p/135864#M37167</guid>
      <dc:creator>bowesmana</dc:creator>
      <dc:date>2013-11-10T07:36:09Z</dc:date>
    </item>
    <item>
      <title>Re: Use rex to strip certain characters from fields</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Use-rex-to-strip-certain-characters-from-fields/m-p/135865#M37168</link>
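      <description>&lt;P&gt;Yes, but keep in mind that this is an index-time function, so it changes the data on the way in... permanently. It also runs against _raw rather than a single field, so every matching character in the event is replaced, not only those in Course.&lt;/P&gt;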
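
&lt;P&gt;To get a feel for what the substitution from the question does when it is applied to a whole event, here is a minimal Python sketch. It is not Splunk code; the function name and the sample event are made up for illustration, and it only mimics replacing U+2018 and U+2019 with 0x27 across a string.&lt;/P&gt;

```python
import re

# Matches the same two characters as the rex example in the question:
# U+2018 (left single quote) and U+2019 (right single quote).
CURLY_SINGLE_QUOTES = re.compile("[\u2018\u2019]")


def normalize_quotes(raw_event):
    """Replace every U+2018/U+2019 in the text with the ASCII apostrophe (0x27)."""
    return CURLY_SINGLE_QUOTES.sub("'", raw_event)


# Hypothetical event text; the field names are assumptions for this sketch.
sample = "Course=\u2018Intro to Splunk\u2019 Note=\u2018draft\u2019"
print(normalize_quotes(sample))  # prints: Course='Intro to Splunk' Note='draft'
```

&lt;P&gt;Note that the quotes around both values change, not just the ones in Course, because the whole string is rewritten.&lt;/P&gt;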

&lt;PRE&gt;
[yoursourcetype]
SEDCMD-course = s/(‘|’)/'/g
&lt;/PRE&gt;

&lt;P&gt;You can read about it &lt;A href="http://docs.splunk.com/Documentation/Splunk/6.0/admin/Propsconf"&gt;HERE&lt;/A&gt;, and I have excerpted the relevant part below:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;SEDCMD-&lt;class&gt; = &lt;sed script&gt;
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit card or social
  security numbers. For more information, search the online documentation for "anonymize
  data."
* Used to specify a sed script which Splunk applies to the _raw field.
* A sed script is a space-separated list of sed commands. Currently the following subset of
  sed commands is supported:
        * replace (s) and character substitution (y).
* Syntax:
        * replace - s/regex/replacement/flags
                * regex is a perl regular expression (optionally containing capturing groups).
                * replacement is a string to replace the regex match. Use \n for backreferences,
                  where "n" is a single digit.
                * flags can be either: g to replace all matches, or a number to replace a specified
                  match.
        * substitute - y/string1/string2/
                * substitutes the string1[i] with string2[i]&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 10 Nov 2013 08:02:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Use-rex-to-strip-certain-characters-from-fields/m-p/135865#M37168</guid>
      <dc:creator>rsennett_splunk</dc:creator>
      <dc:date>2013-11-10T08:02:22Z</dc:date>
    </item>
  </channel>
</rss>

