Hi,
I am importing a CSV file into Splunk Enterprise that uses a semicolon as the field separator, but Splunk does not parse it correctly. For instance, this field --> SARL "LE RELAIS DU GEVAUDAN";;;"1 is treated as a whole and is not being split.
Do you know which settings I should configure in the file importer wizard in order to import it?
Thank you
Kind regards
Marta
Hi
at search time you could use DELIMS with props.conf & transforms.conf:
DELIMS = <quoted string list>
* NOTE: This setting is only valid for search-time field extractions.
* IMPORTANT: If a value may contain an embedded unescaped double quote character, such as "foo"bar", use REGEX, not DELIMS. An escaped double quote (\") is ok. Non-ASCII delimiters also require the use of REGEX.
* Optional. Use DELIMS in place of REGEX when you are working with ASCII-only delimiter-based field extractions, where field values (or field/value pairs) are separated by delimiters such as colons, spaces, line breaks, and so on.
* Sets delimiter characters, first to separate data into field/value pairs, and then to separate field from value.
* Each individual ASCII character in the delimiter string is used as a delimiter to split the event.
* Delimiters must be specified within double quotes (eg. DELIMS="|,;"). Special escape sequences are \t (tab), \n (newline), \r (carriage return), \\ (backslash) and \" (double quotes).
* When the event contains full delimiter-separated field/value pairs, you enter two sets of quoted characters for DELIMS:
  * The first set of quoted delimiters extracts the field/value pairs.
  * The second set of quoted delimiters separates the field name from its corresponding value.
* When the event only contains delimiter-separated values (no field names), use just one set of quoted delimiters to separate the field values. Then use the FIELDS setting to apply field names to the extracted values.
  * Alternately, Splunk software reads even tokens as field names and odd tokens as field values.
* Splunk software consumes consecutive delimiter characters unless you specify a list of field names.
* The following example of DELIMS usage applies to an event where field/value pairs are separated by '|' symbols and the field names are separated from their corresponding values by '=' symbols:
    [pipe_eq]
    DELIMS = "|", "="
* Default: ""
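To make this concrete, here is a rough search-time sketch for a semicolon-separated file. The sourcetype name (my_semicolon_csv), the transform name, and the field names are all made up for illustration; replace them with your own:

```ini
# transforms.conf -- split each event on semicolons and name the values
[semicolon_fields]
DELIMS = ";"
FIELDS = company, field2, field3, field4

# props.conf -- attach the transform to your sourcetype
[my_semicolon_csv]
REPORT-semicolon = semicolon_fields
```

Note the IMPORTANT caveat in the spec text above: your sample value (SARL "LE RELAIS DU GEVAUDAN") contains embedded double quotes, so depending on how they are quoted in the file you may have to use REGEX instead of DELIMS.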
But at ingest time you must use REGEX to separate those fields, if needed. Are you sure that you need this at ingest time and that search time is not enough?
r. Ismo
Thank you for your answer. How can I specify a regular expression at ingestion time, in the "add data" wizard?
This depends on your use case and your environment. If you have Splunk Cloud in use, then you can try Splunk Edge Processor; that is probably the easiest way to do it. Without Splunk Cloud you can try INGEST_EVAL or the "old way" with props.conf and transforms.conf.
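If you really do want index-time fields, one option for a structured file like yours is Splunk's indexed extractions for CSV-style data, configured in props.conf where the file is first parsed (on the universal forwarder or the indexer). A minimal sketch, assuming the file has a header line and using a hypothetical sourcetype name:

```ini
# props.conf -- index-time structured parsing of a semicolon-separated file
[my_semicolon_csv]
INDEXED_EXTRACTIONS = csv
FIELD_DELIMITER = ;
FIELD_QUOTE = "
```

The "Add data" wizard's "Set Source Type" step exposes roughly the same options under the Delimited settings, where you can pick the semicolon as the field delimiter and save the result as a new sourcetype.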
More about this:
Are you absolutely sure that you want to extract those fields at index time and not at search time?