Splunk Enterprise

Why Splunk cannot parse csv file with semicolon as separator?

Marta88
Explorer

Hi,

I am importing a csv file in Splunk Enterprise that has semicolon as field separator but Splunk does not correctly parses it. For instance this field --> SARL "LE RELAIS DU GEVAUDAN";;;"1 is considered as a whole and is not getting splitted.

Do you know which settings should I configure in the file importer wizard in order to import it?

Thank you

Kind regards

Marta

 

Labels (2)
Tags (1)
0 Karma

isoutamo
SplunkTrust
SplunkTrust

Hi

at search time you could use DELIMS with props.conf & transforms.conf

DELIMS = <quoted string list>
* NOTE: This setting is only valid for search-time field extractions.
* IMPORTANT: If a value may contain an embedded unescaped double quote
  character, such as "foo"bar", use REGEX, not DELIMS. An escaped double
  quote (\") is ok. Non-ASCII delimiters also require the use of REGEX.
* Optional. Use DELIMS in place of REGEX when you are working with ASCII-only
  delimiter-based field extractions, where field values (or field/value pairs)
  are separated by delimiters such as colons, spaces, line breaks, and so on.
* Sets delimiter characters, first to separate data into field/value pairs,
  and then to separate field from value.
* Each individual ASCII character in the delimiter string is used as a
  delimiter to split the event.
* Delimiters must be specified within double quotes (eg. DELIMS="|,;").
  Special escape sequences are \t (tab), \n (newline), \r (carriage return),
  \\ (backslash) and \" (double quotes).
* When the event contains full delimiter-separated field/value pairs, you
  enter two sets of quoted characters for DELIMS:
* The first set of quoted delimiters extracts the field/value pairs.
* The second set of quoted delimiters separates the field name from its
  corresponding value.
* When the event only contains delimiter-separated values (no field names),
  use just one set of quoted delimiters to separate the field values. Then use
  the FIELDS setting to apply field names to the extracted values.
  * Alternately, Splunk software reads even tokens as field names and odd
    tokens as field values.
* Splunk software consumes consecutive delimiter characters unless you
  specify a list of field names.
* The following example of DELIMS usage applies to an event where
  field/value pairs are separated by '|' symbols and the field names are
  separated from their corresponding values by '=' symbols:
    [pipe_eq]
    DELIMS = "|", "="
* Default: ""

But on ingesting time you must use REGEX to separate those if needed. Are you sure that you need this on ingest time and search time is not enough?

r. Ismo

0 Karma

Marta88
Explorer

Thank you for your answer. How can I specify a regular expression at ingestion time, in the "add data" wizard?

0 Karma

isoutamo
SplunkTrust
SplunkTrust

This depends in your use case and your environment. If you have Splunk Cloud in use then you can try to use Splunk Edge Processor. That is probably the easiest way to do it? Without Splunk Cloud you can try ingest even or "old way" with props.conf and transforms.conf.

More about this:

Are you absolutely sure that you want extract those fields on index time not on search time?

 

0 Karma
Get Updates on the Splunk Community!

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

Get Inspired! We’ve Got Validation that Your Hard Work is Paying Off

We love our Splunk Community and want you to feel inspired by all your hard work! Eric Fusilero, our VP of ...

What's New in Splunk Enterprise 9.4: Features to Power Your Digital Resilience

Hey Splunky People! We are excited to share the latest updates in Splunk Enterprise 9.4. In this release we ...