Use rex to strip certain characters from fields

bowesmana — Sun, 10 Nov 2013 07:36:09 GMT

Unicode punctuation characters in the range U+2000 to U+206F seem to make Splunk require Simplified Chinese fonts in exported PDFs, so I want to convert these characters to their ASCII equivalents.

I can add the following to the search command

rex field=Course mode=sed "s/(‘|’)/'/g"

where the characters matched above are U+2018 and U+2019 and they are replaced with 0x27 (the ASCII apostrophe), but I want to put something in props.conf so that it always happens.
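
To make the code points concrete, here is a minimal Python sketch of the same substitution run outside Splunk. The sample Course value is made up for illustration, and Python's re module is only standing in for the regex engine Splunk uses, so treat it as a way to check the pattern rather than as Splunk behaviour.

```python
import re

# Hypothetical sample value for the Course field (not real data).
course = "Beginner\u2019s \u2018Intro\u2019 course"

# Same alternation as the rex sed expression: U+2018 or U+2019 becomes 0x27.
cleaned = re.sub("(\u2018|\u2019)", "'", course)

# Confirm nothing outside ASCII is left in the cleaned value.
is_ascii = all(ord(ch) < 128 for ch in cleaned)

print(cleaned)
print(is_ascii)
```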

How would I do this?

Re: Use rex to strip certain characters from fields

rsennett_splunk — Sun, 10 Nov 2013 08:02:22 GMT

Yes, but keep in mind this is an index-time function, so it will change the data on the way in... permanently.

[yoursourcetype]
SEDCMD-course = s/(‘|’)/'/g

You can read about it in the props.conf documentation; I have excerpted the relevant part below:

SEDCMD-<class> = <sed script>
* Only used at index time.
* Commonly used to anonymize incoming data at index time, such as credit card or social
  security numbers. For more information, search the online documentation for "anonymize
  data."
* Used to specify a sed script which Splunk applies to the _raw field.
* A sed script is a space-separated list of sed commands. Currently the following subset of
  sed commands is supported:
        * replace (s) and character substitution (y).
* Syntax:
        * replace - s/regex/replacement/flags
                * regex is a perl regular expression (optionally containing capturing groups).
                * replacement is a string to replace the regex match. Use \n for backreferences,
                  where "n" is a single digit.
                * flags can be either: g to replace all matches, or a number to replace a specified
                  match.
        * substitute - y/string1/string2/
                * substitutes the string1[i] with string2[i]
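
As a rough illustration of how the two commands differ, the sketch below mimics s/regex/replacement/g and y/string1/string2/ in Python. The raw event text is invented, and re.sub and str.translate are stand-ins chosen for this example; whether Splunk's y command handles multibyte characters the same way is an assumption that should be tested on sample data before indexing anything for real.

```python
import re

# Hypothetical raw event; the quotes are U+2018, U+2019, U+201C and U+201D.
raw = "course=\u201cBeginner\u2019s \u2018Intro\u2019\u201d"

# Like s/(‘|’)/'/g : regex replace, with the g flag meaning every match
# (re.sub replaces all matches by default).
replaced = re.sub("(\u2018|\u2019)", "'", raw)

# Like y/‘’“”/''""/ : character i of string1 maps to character i of string2.
table = str.maketrans("\u2018\u2019\u201c\u201d", "''\"\"")
translated = raw.translate(table)

print(replaced)
print(translated)
```

The s form only touches the two single quotes named in the regex, so the curly double quotes survive; the y form maps all four characters in one pass, which may be the more compact option when many characters in the U+2000 block each have a one-character ASCII equivalent.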
