Splunk Search

How do we strip single quotes from the beginning and end of the values?

danielbb
Motivator

We have an index in which most of the fields have a single quote at the beginning and end of their values. We would like to strip these quotes, first at search time and, ideally, later at index time.

How can we do that?

woodcock
Esteemed Legend

Like this for search-time:

... | rex field=YourFieldNameHere mode=sed "s/'|'$//g"

Like this for index-time:

In props.conf

[YourSourcetypeHere]
TRANSFORMS-strip-bounding-quotes = sbq_field_foo, ..., sbq_field_bar

In transforms.conf (multiple similar stanzas):

[sbq_field_foo]
SOURCE_KEY = field_foo
REGEX = '(?<field_foo>[^']*)'
FORMAT = field_foo::$1
WRITE_META = true

mitag
Contributor

This seems to remove all quotes, not just the bounding ones:

... | rex field=YourFieldNameHere mode=sed "s/'|'$//g"

It seems the `^` anchor needs to be added:

"s/^'|'$//g"

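To see the difference between the two patterns outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module handles these two simple patterns the same way Splunk's sed mode does, and the sample value is made up for illustration.

```python
import re

value = "'it's a test'"  # hypothetical field value with an inner quote

# Without the ^ anchor, the first alternative matches every single quote.
unanchored = re.sub(r"'|'$", "", value)

# With ^, only one leading quote and one trailing quote are removed.
anchored = re.sub(r"^'|'$", "", value)

print(unanchored)  # its a test
print(anchored)    # it's a test
```

The inner apostrophe survives only with the anchored pattern.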
 

danielbb
Motivator

Thank you @woodcock.

gcusello
SplunkTrust

Hi danielbb,
you could extract the unquoted value into a new field with a regex like this:

| rex field=your_field "\'(?<your_new_field>[^\']*)\'"

Bye.
Giuseppe

danielbb
Motivator

And index time @gcusello ?

dmarling
Builder

An eval command that trims that from the field will do that:

| eval fieldname=trim(fieldname, "'")

Sorry, you meant at index time. This is easy to do at search time with a calculated field, but that doesn't answer your question, so I'm converting this to a comment instead of an answer.
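One thing worth knowing when choosing between trim and the anchored regex: as far as I understand the eval docs, trim removes every leading and trailing character found in the trim set, not just one. A rough Python sketch of the difference, assuming Python's `str.strip` is a fair stand-in for eval's trim (the function names here are hypothetical):

```python
import re

def strip_bounding_regex(value: str) -> str:
    """Remove at most one leading and one trailing single quote."""
    return re.sub(r"^'|'$", "", value)

def strip_bounding_trim(value: str) -> str:
    """Remove every leading and trailing single quote, like trim with "'"."""
    return value.strip("'")

# Made-up sample values, including doubled quotes and an inner apostrophe.
for sample in ["'abc'", "''abc''", "'it's'", "abc"]:
    print(repr(sample), repr(strip_bounding_regex(sample)), repr(strip_bounding_trim(sample)))
```

For values wrapped in exactly one pair of quotes both approaches agree; they differ only when quotes are doubled at the edges.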

If this comment/answer was helpful, please up vote it. Thank you.

dmarling
Builder

If there are multiple fields, you could also create a macro that cleans them all with foreach:

| foreach *
    [eval "<<FIELD>>"=trim('<<FIELD>>', "'")]
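For anyone who wants to reason about what the foreach does before running it, here is the same idea sketched in Python over a hypothetical event dictionary. The function name and sample fields are assumptions for illustration, not anything Splunk provides.

```python
def strip_quotes_from_fields(event: dict) -> dict:
    """Return a copy of the event with bounding single quotes trimmed from string values."""
    return {
        name: value.strip("'") if isinstance(value, str) else value
        for name, value in event.items()
    }

# Made-up event: two quoted string fields and one numeric field.
event = {"user": "'alice'", "action": "'login'", "count": 3}
print(strip_quotes_from_fields(event))
```

Every string field is trimmed, and non-string values pass through untouched.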

danielbb
Motivator

Really nice - how do we do it at index time?

dmarling
Builder

Sorry, I missed this. Unfortunately, I am not sure how to accomplish this at index time. You could hypothetically strip the quotes from the raw event at index time using this process: https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_with_a_sed_scr...

I strongly caution you, though: you MUST write a sed regex that removes only the single quotes you don't want and leaves the ones you do. If you remove all of them, you may see unintended consequences. Do you have some anonymized _raw examples of the events that arrive with single-quote-wrapped values? I could take a crack at a sed statement that is, hopefully, very targeted.
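To illustrate what "targeted" could mean, here is a rough Python sketch. It assumes, purely for illustration, that the raw events are made of key='value' pairs; the real format is unknown, so the pattern and the sample event are hypothetical and would need testing against actual data before being turned into a sed expression.

```python
import re

# Assumption: a quoted value directly follows key= and is followed by
# whitespace, a comma, or the end of the event.
PAIR = re.compile(r"(\w+)='([^']*)'(?=\s|,|$)")

def strip_value_quotes(raw: str) -> str:
    """Remove the quotes wrapping key='value' pairs and leave other quotes alone."""
    return PAIR.sub(r"\1=\2", raw)

raw = "user='alice' msg='disk full' note=it's fine"  # made-up event
print(strip_value_quotes(raw))  # user=alice msg=disk full note=it's fine
```

The apostrophe in the unquoted text is left alone. Note that unwrapping a value containing spaces, such as the msg value here, may change how key-value extraction splits it later, which is one of the unintended consequences to watch for.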
