Splunk Search

Change single quote to double quote

jwhughes58
Communicator

I'm working with a data source that comes in two versions.  In one version the values are double quoted; in the other they are single quoted.  This causes problems because, after extraction, the single-quoted values keep their single quotes while the double-quoted values have no quotes at all.  That throws counts off, since the single-quoted string and the unquoted string are not treated as the same value.  We want to keep the actual raw source, so SEDCMD is not an option, and I've been trying to work out another way to do this.  So far I have this in a search:

search
| eval raw=_raw 
| rex field=raw mode=sed "s/\'/\"/g"
| rex field=raw "\[(?<audit_event>[^\:]+)\:(?<vendor_severity>[^\:]+).+(?<vendor_xml>\<vendor.+\<\/vendor\>)" 

Now I'm trying to convert it to props.conf and transforms.conf.  My props.conf:

EXTRACT-vendor_raw = (?<raw>^.*$)
REPORT-vendor_extract_fields = vendor_replace_single_quotes, vendor_fields
KV_MODE = xml

My transforms.conf

[vendor_replace_single_quotes]

[vendor_fields]
REGEX = \[(?<audit_event>[^\:]+)\:(?<vendor_severity>[^\:]+).+(?<vendor_xml>\<vendor.+\<\/vendor\>)
SOURCE = raw

What I can't figure out is how to do the replace from the search in either props.conf or transforms.conf.  Everything I've found uses SEDCMD.  Any thoughts on this?

TIA,

Joe


yuanliu
SplunkTrust

You can add a calculated field to do this, using replace().  For example,

EVAL-vendor_xml = replace(vendor_xml, "'", "\"")

Note that your extracted fields are available to calculated fields.  As @venkatasri recently reminded me, the props.conf documentation says:

* Splunk software processes calculated fields after field extraction and
  field aliasing but before lookups.
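
To sanity-check the replace() call without touching any configuration, it can be run against a synthetic event.  This is only a sketch: the sample vendor_xml value below is made up for illustration and does not come from the real data source.

```spl
| makeresults
| eval vendor_xml="<vendor><item name='hypothetical-sample' state='open'/></vendor>"
| eval normalized=replace(vendor_xml, "'", "\"")
| table vendor_xml normalized
```

The normalized column should show the same string with every single quote turned into a double quote, while vendor_xml is left as it was.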

 


jwhughes58
Communicator

I don't think the eval will work.  Here is what I have in my props.conf

#
# Index-time operation
#
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
TRUNCATE = 9999

#
# Wed May 19 2021 09:32:13 -07:00
#
MAX_TIMESTAMP_LOOKAHEAD = 111
TIME_FORMAT = %a %b %d %Y %H:%M:%S %z

#
# Search-time operation sequence #1
#
EXTRACT-vendor_raw = (?<raw>^.*$)

#
# Search-time operation sequence #2
#
REPORT-vendor_extract_fields = vendor_replace_single_quotes, vendor_fields

#
# Search-time operation sequence #3
#
KV_MODE = xml

#
# Search-time operation sequence #4
#
# FIELDALIAS
#

#
# Search-time operation sequence #5
#
# EVAL
#

#
# Search-time operation sequence #6
#
# LOOKUP
#

#
# Search-time operation sequence #7
#
# In eventtypes.conf if exists
#

#
# Search-time operation sequence #8
#
# In tags.conf if exists
#

I don't want to replace the single quotes with double quotes in _raw for legal reasons.  I can use eval after the report, but I would have to change every value that is single quoted and I am not guaranteed the names.  I would like to make a backup of _raw, change single quotes to double quotes, and get the name value pairs.


yuanliu
SplunkTrust

You do not need to replace anything in _raw; the sample EVAL is intended to replace quotes in the vendor_xml field only.

Your original post showed the following stanzas in transforms.conf:

[vendor_replace_single_quotes]

[vendor_fields]
REGEX = \[(?<audit_event>[^\:]+)\:(?<vendor_severity>[^\:]+).+(?<vendor_xml>\<vendor.+\<\/vendor\>)
SOURCE = raw

Then, your props.conf contains this automatic extraction, which references the empty vendor_replace_single_quotes transform:

REPORT-vendor_extract_fields = vendor_replace_single_quotes, vendor_fields

Do the following:

  1. In transforms.conf, remove the empty [vendor_replace_single_quotes] stanza.
  2. In props.conf, remove vendor_replace_single_quotes from the REPORT-vendor_extract_fields automatic extraction.
  3. Then add the EVAL:

# Automatically apply transform named "vendor_fields";
# 'vendor_xml' field may contain single or double quotes
REPORT-vendor_extract_fields = vendor_fields
# Replace any single quote in 'vendor_xml' field with double quote
EVAL-vendor_xml = replace(vendor_xml, "'", "\"")

Make sure the settings above are under the sourcetype, source, or host stanza in props.conf that matches your search.  Your vendor_xml field should then no longer contain single quotes, and nothing in _raw is changed.  (The order in which these settings appear in props.conf isn't important.  In fact, it would be better to make these changes through the Web UI and let Splunk order props.conf automatically, although you will lose comments.)
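
One way to confirm the calculated field has taken effect is to count events whose vendor_xml still contains a single quote.  A rough sketch, where your_index and your_sourcetype are placeholders for whatever matches your props.conf stanza:

```spl
index=your_index sourcetype=your_sourcetype
| eval still_single_quoted=if(match(vendor_xml, "'"), "yes", "no")
| stats count by still_single_quoted
```

If the EVAL is being applied, every event that has a vendor_xml field should land in the "no" row.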

Of course, if other extracted fields contain undesired single quotes, you can do the same to them.
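
If the goal is to get name/value pairs out of the normalized XML rather than out of _raw, one search-time option is to run spath against the field.  Whether this fits your data is an assumption on my part; the event below is made up purely for illustration:

```spl
| makeresults
| eval vendor_xml="<vendor><name id='hypothetical-1'>acme</name></vendor>"
| eval vendor_xml=replace(vendor_xml, "'", "\"")
| spath input=vendor_xml
```

spath reads the XML from the field named in input, so the fields it produces come from the double-quoted copy and _raw stays as indexed.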

Hope this helps.
