Splunk Search

CSV file index time field extractions - how to ignore the header? Is there schema on write support?

kiril123
Path Finder

Hello,

I have the following little csv file:

time,interface,utilization
2019-11-03,int_a,100
2019-11-04,int_b,200

As you can see, it contains a header and two rows of data.

I want to perform index time extraction of the fields. I also want to use timestamp from the time column.

This is my props.conf configuration:

DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
category = Custom
pulldown_type = 1
HEADER_FIELD_LINE_NUMBER = 1
disabled = false
FIELD_HEADER_REGEX =
PREAMBLE_REGEX =

No matter what I do, Splunk always indexes the header as well. I don't want that. I have tried the following settings:

  1. PREAMBLE_REGEX - this ignores the header, but then index-time field extractions are not performed, probably because the header is ignored (a chicken-and-egg situation). I can work around this by listing the comma-separated field names manually, but I want schema-on-write support, which Splunk doesn't seem to provide.

  2. HEADER_FIELD_LINE_NUMBER = 1 - I tried this setting, and it made no difference.
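
For reference, the manual workaround mentioned in item 1 might look roughly like the sketch below. It assumes the props.conf settings PREAMBLE_REGEX and FIELD_NAMES behave as described in the props.conf spec for structured data. The stanza name my_csv is hypothetical, and this sketch has not been verified against the sample file.

```ini
# props.conf (sketch; the sourcetype name my_csv is hypothetical)
[my_csv]
INDEXED_EXTRACTIONS = csv
# Treat the header line as preamble so it is not indexed as an event
PREAMBLE_REGEX = ^time,interface,utilization
# Because the header is skipped, supply the column names by hand
FIELD_NAMES = time,interface,utilization
# Take the event timestamp from the time column
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
```

The drawback is the one already noted: the column names are hard-coded, so a change in the file layout requires a config change.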

Does anyone know if it is possible to index csv file fields without the header and without defining column names manually in props.conf?

Thank you,

Kiril

darrenfuller
Contributor

I usually go with a props/transforms/nullQueue approach for these types of situations, where the field names are known:

# props.conf
[782506]
disabled = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d
MAX_TIMESTAMP_LOOKAHEAD = 15
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = csv
TRANSFORMS-01_killheader = Delete_csv_header

with

#transforms.conf
[Delete_csv_header]
disabled = false
REGEX = ^time\,interface\,utilization
DEST_KEY = queue
FORMAT = nullQueue

This resulted in two events, no header, and field extractions indexed.

anwarmian
Communicator

Darrenfuller's answer is good. There are advantages and disadvantages to both index-time and search-time field extraction for a CSV file with a header.

Search time: uses less storage space for indexed data than index-time extraction.
Index time: if a field's position changes, which can happen sometimes, then creating a new report class for search time will override some of the old fields.
Here is a link to a case study on search-time vs. index-time extractions for JSON files:
https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions
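
For comparison, a search-time version of the same extraction could be sketched as below. This is only an illustration, assuming the standard REPORT-<class> setting in props.conf and the DELIMS/FIELDS settings in transforms.conf. The stanza names my_csv and my_csv_fields are hypothetical, and the header line would still need to be dropped separately (for example with the nullQueue transform above).

```ini
# props.conf (sketch; the sourcetype name my_csv is hypothetical)
[my_csv]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d
# Apply the delimiter-based extraction at search time
REPORT-csv_fields = my_csv_fields

# transforms.conf (the stanza name my_csv_fields is hypothetical)
[my_csv_fields]
# Split each event on commas and assign field names by position
DELIMS = ","
FIELDS = "time","interface","utilization"
```

Because the field names are bound to positions here, a change in column order means editing FIELDS, which is the kind of maintenance cost mentioned above.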
