Splunk Search

How do we strip single quotes from the beginning and end of the values?

danielbb
Motivator

We have an index in which most of the fields have a single quote at the beginning and end of their values. We would like to strip those quotes, first at search time and hopefully later at index time.

How can we do that?

woodcock
Esteemed Legend

Like this for search-time:

... | rex field=YourFieldNameHere mode=sed "s/'|'$//g"

Like this for index-time:

In props.conf

[YourSourcetypeHere]
TRANSFORMS-strip-bounding-quotes = sbq_field_foo, ..., sbq_field_bar

In transforms.conf (multiple similar stanzas):

[sbq_field_foo]
SOURCE_KEY = field_foo
REGEX = '(?<field_foo>[^']*)'
FORMAT = field_foo::$1
WRITE_META = true

mitag
Contributor

This seems to remove all quotes, not just the bounding ones:

... | rex field=YourFieldNameHere mode=sed "s/'|'$//g"

It seems a `^` anchor needs to be added before the first quote?

"s/^'|'$//g"

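To see the difference between the two expressions, a small self-contained search can help. This is only a sketch: the field names `raw_value`, `unanchored`, and `anchored` and the sample string are made up for illustration.

```spl
| makeresults
| eval raw_value="'it's a value'"
| eval unanchored=raw_value, anchored=raw_value
| rex field=unanchored mode=sed "s/'|'$//g"
| rex field=anchored mode=sed "s/^'|'$//g"
| table raw_value unanchored anchored
```

With the unanchored pattern, the first alternative matches every single quote, so the apostrophe inside the value should disappear as well. With `^` in place, only the leading and trailing quotes should be removed.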
 

danielbb
Motivator

Thank you @woodcock.

gcusello
SplunkTrust

Hi danielbb,
you could create a calculated field with a regex like this:

| rex field=your_field "\'(?<your_new_field>[^\']*)\'"

Bye.
Giuseppe

danielbb
Motivator

And at index time, @gcusello?

dmarling
Builder

An eval command that trims the quotes from the field will do that:

| eval fieldname=trim(fieldname, "'")

Sorry, you meant at index time. This is easy to do at search time with a calculated field, but that doesn't answer your question, so I'm converting this to a comment instead of an answer.
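For reference, the calculated-field variant could look roughly like the stanza below. It is a sketch that reuses the placeholder sourcetype name from earlier in the thread and a hypothetical field called `fieldname`, and it still applies at search time only.

```ini
# props.conf on the search head
# Stanza name and field name are placeholders, not taken from real data.
[YourSourcetypeHere]
# Calculated field: overwrite fieldname with its quote-trimmed value
EVAL-fieldname = trim(fieldname, "'")
```

Note that trim() removes every leading and trailing character found in the second argument, so a value wrapped in doubled quotes loses all of them, not just one pair.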

If this comment/answer was helpful, please up vote it. Thank you.

dmarling
Builder

If it's multiple fields, you could also create a macro that cleans them all with foreach:

| foreach *
    [eval "<<FIELD>>"=trim('<<FIELD>>', "'")]
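
A macro wrapping that search fragment might be defined roughly as follows. The macro name `strip_bounding_quotes` is hypothetical, and this is a sketch rather than a tested configuration.

```ini
# macros.conf
# Hypothetical macro name; the definition is the foreach fragment on one line.
[strip_bounding_quotes]
definition = foreach * [eval "<<FIELD>>"=trim('<<FIELD>>', "'")]

# Usage in a search:
#   index=your_index | `strip_bounding_quotes`
```
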

danielbb
Motivator

Really nice - how do we do it at index time?

dmarling
Builder

Sorry, I missed this. Unfortunately, I am not sure how to accomplish this at index time. You could hypothetically strip them from the raw event at index time using this process: https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_with_a_sed_scr...

I strongly caution you, though: you MUST write a sed regex that removes only the single quotes you don't want and leaves the ones you do. If you nuke all of them, you may get some unintended consequences. Do you have some anonymized _raw examples of the data that is coming in with the single-quote-wrapped values? I could take a hack at a sed statement that is hopefully very targeted.
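
As a starting point only, a targeted SEDCMD could look something like the sketch below. It assumes the raw events contain key='value' pairs and that the values themselves never contain a single quote. Neither assumption has been confirmed against real data, and the stanza and class names are placeholders.

```ini
# props.conf on the instance that parses the data (indexer or heavy forwarder)
[YourSourcetypeHere]
# ASSUMPTION: raw events look like key='value'. Rewrites them to key=value.
# Quotes that do not directly follow "=" (e.g. apostrophes in free text) are untouched.
SEDCMD-strip_bounding_quotes = s/='([^']*)'/=\1/g
```

If a value can contain an embedded quote, this pattern would stop at that quote and leave a stray one behind, so it would need tightening once real samples are available.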
