Splunk Search

How can I extract a quoted field value that includes a quoted string?

sophy
Splunk Employee

(a question from a customer)

I have a field named string that reads:

string="This is "an extraordinary" event, not to be missed."

Splunk 4.2 extracts this as string="This is ", ignoring the rest of the value.

Is there a way to format the input data or logging so that auto-kv extracts the entire string properly?

The documentation describes how to escape backslash characters ( http://docs.splunk.com/Documentation/Splunk/4.2.3/admin/Propsconf ). Can a similar approach be used to handle quotes? For example:

key="i am having \"fun\" with splunk"

I would prefer to use Splunk's auto-kv extraction to handle this (instead of writing a regex, as discussed in http://splunk-base.splunk.com/answers/23841/extract-with-quoted-values ).


icquintos
New Member

Hi Sophy,

Have you found an answer to your problem yet?

Given your example string, string="This is "an extraordinary" event, not to be missed."

I also want to extract the whole This is "an extraordinary" event, not to be missed. as my value.

I can't work out how to fix this issue. Could you help me?

Thank you very much.


Damien_Dallimor
Ultra Champion

Sophy,

I don't have an exact answer for your particular issue, but here are some of my experiences.

What data input are you using?
You may be able to do something in the scripted input logic or at the source of the logging.

For example:

I encountered a similar issue with an IBM product's JMX attributes...so I wrote a custom output formatter that strips quotes before pushing the data into the Splunk indexer pipeline.

In another situation, with a Java application using the "logback" logging framework, I used the "replace" functionality to strip out quotes.
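
The idea of cleaning the value at the source, before it reaches Splunk, might look roughly like the Python sketch below. This is not the formatter or the logback configuration described above; the function name format_kv and its replacement parameter are hypothetical, and swapping inner double quotes for single quotes (rather than removing them) is an assumption made for illustration.

```python
def format_kv(key: str, value: str, replacement: str = "'") -> str:
    """Render key="value" so the value contains no embedded double quotes.

    Hypothetical helper: inner double quotes are swapped for `replacement`
    (pass "" to strip them entirely), so the only double quotes left in the
    output are the two that delimit the value.
    """
    cleaned = value.replace('"', replacement)
    return f'{key}="{cleaned}"'


line = format_kv("string", 'This is "an extraordinary" event, not to be missed.')
print(line)
# string="This is 'an extraordinary' event, not to be missed."
```

With the inner quotes gone, the value is delimited by exactly one pair of double quotes, which is the shape the question says auto-kv already handles.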

wpf500
Engager

Unless the inner quotes are escaped, there is no way to do this automatically. The extractor cannot know which quote closes the value (unless there is a known delimiter between each key/value pair).

If you do have escaped quotes, I'm not sure how you would do it with auto-kv, but the transform I used is fairly simple:

REGEX = ([^ ]+)="(.+?[^\\])"
FORMAT = $1::$2

Not thoroughly tested, but it seems to do the trick. It ensures that the closing quote is not preceded by a backslash. It won't work if the string ends with an escaped backslash (key="foo \"bar\" \\").
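
To see the limitation concretely, here is a small Python sketch that runs the pattern above against two sample events. It uses Python's re module rather than Splunk, so treat it as an approximation of how the transform behaves. The ESCAPE_AWARE pattern is a hypothetical alternative that is not from this thread and has not been tested inside Splunk; the second field other="x" is invented for the example.

```python
import re

# The pattern from the transform above.
ORIGINAL = re.compile(r'([^ ]+)="(.+?[^\\])"')

# Hypothetical alternative (an assumption, not from the thread): inside the
# quotes, accept either a character that is neither a quote nor a backslash,
# or a backslash followed by any one character. An escaped backslash is then
# consumed as a pair, so the quote after it is seen as the closing quote.
ESCAPE_AWARE = re.compile(r'([^ ]+)="((?:[^"\\]|\\.)*)"')

simple = r'key="i am having \"fun\" with splunk"'
tricky = r'key="foo \"bar\" \\" other="x"'  # value ends with an escaped backslash


def extract(pattern, event):
    """Return a list of (field, value) tuples found in the event."""
    return pattern.findall(event)


for label, pattern in (("original", ORIGINAL), ("escape-aware", ESCAPE_AWARE)):
    for event in (simple, tricky):
        for name, value in extract(pattern, event):
            print(f"{label}: {name} => {value}")
```

On the simple event both patterns capture the full value. On the tricky event the original pattern skips the real closing quote and runs on into the next field, capturing foo \"bar\" \\" other= as the value of key, while the escape-aware pattern returns key and other separately. Unlike the original, the alternative also matches an empty value.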

hexx
Splunk Employee

I don't think auto-kv field extraction can be configured to handle this scenario. If you have clear delimiters between the key/value pairs, you may have better luck using a REPORT defined in transforms.conf with the appropriate DELIMS:


DELIMS =
* NOTE: This attribute is only valid for search-time field extractions.
* Optional. Used in place of REGEX when dealing with delimiter-based field extractions,
where field values (or field/value pairs) are separated by delimiters such as colons,
spaces, line breaks, and so on.
* Sets delimiter characters, first to separate data into field/value pairs, and then to
separate field from value.
* Each individual character in the delimiter string is used as a delimiter to split the event.
* Delimiters must be quoted with " " (use \ to escape).
* When the event contains full delimiter-separated field/value pairs, you enter two sets of
quoted characters for DELIMS:
* The first set of quoted delimiters extracts the field/value pairs.
* The second set of quoted delimiters separates the field name from its corresponding
value.
* When the event only contains delimiter-separated values (no field names) you use just one set
of quoted delimiters to separate the field values. Then you use the FIELDS attribute to
apply field names to the extracted values (see FIELDS, below).
* Alternately, Splunk reads even tokens as field names and odd tokens as field values.
* Splunk consumes consecutive delimiter characters unless you specify a list of field names.
* The following example of DELIMS usage applies to an event where field/value pairs are
separated by '|' symbols and the field names are separated from their corresponding values
by '=' symbols:
[pipe_eq]
DELIMS = "|", "="
* Defaults to "".

Other than that, you will indeed have to use a clever regular expression to accept double quotes in the extracted field value.
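
As a rough illustration of the two-step splitting that the DELIMS description above outlines, here is a Python model of DELIMS = "|", "=". It is only a sketch of the idea, not Splunk's implementation: the function name split_delims is hypothetical, the user=sophy field is invented, splitting each pair only once is an assumption, and the model leaves quotes untouched, so how Splunk itself treats the surrounding quotes would need to be checked on a real instance.

```python
import re


def split_delims(event: str, pair_delims: str, kv_delims: str) -> dict:
    """Rough model of DELIMS = "<pair_delims>", "<kv_delims>" (hypothetical)."""
    fields = {}
    # Each character in pair_delims splits the event into field/value pairs.
    for pair in re.split("[" + re.escape(pair_delims) + "]", event):
        # Each character in kv_delims separates the field name from its value.
        # Assumption: split only once, so the value may contain that character.
        parts = re.split("[" + re.escape(kv_delims) + "]", pair, maxsplit=1)
        if len(parts) == 2 and parts[0]:
            fields[parts[0].strip()] = parts[1].strip()
    return fields


event = 'string="This is "an extraordinary" event, not to be missed."|user=sophy'
for name, value in split_delims(event, "|", "=").items():
    print(f"{name} => {value}")
```

Because the pairs are found by the '|' delimiter rather than by matching quotes, the inner quotes no longer cut the value short in this model.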
