Splunk Search

CSV file index time field extractions - how to ignore the header? Is there schema on write support?

kiril123
Path Finder

Hello,

I have the following little csv file:

time,interface,utilization
2019-11-03,int_a,100
2019-11-04,int_b,200

As you can see, it contains a header and two rows of data.

I want to perform index-time extraction of the fields. I also want to use the timestamp from the time column.

This is my props.conf configuration:

DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
category = Custom
pulldown_type = 1
HEADER_FIELD_LINE_NUMBER = 1
disabled = false
FIELD_HEADER_REGEX =
PREAMBLE_REGEX =

No matter what I do, Splunk always indexes the header as well. I don't want that. I have tried the following settings:

  1. PREAMBLE_REGEX - this ignores the header, but then index-time field extractions are not performed, probably because the header is ignored (a chicken-and-egg situation). I can work around this by listing the comma-separated field names manually, but I want schema-on-write support, which Splunk doesn't seem to provide.

  2. HEADER_FIELD_LINE_NUMBER = 1 - I tried this setting, and it made no difference.
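
For reference, the manual workaround mentioned in item 1 could look roughly like the sketch below. The stanza name my_csv_sourcetype is a placeholder rather than a real sourcetype from this setup, and the PREAMBLE_REGEX pattern assumes the header line always starts with "time,". FIELD_NAMES and PREAMBLE_REGEX are standard props.conf settings for structured data, but this exact combination is untested here.

```
# props.conf (sketch; stanza name is hypothetical)
[my_csv_sourcetype]
INDEXED_EXTRACTIONS = csv
# Skip the header line as a preamble (assumes it starts with "time,")
PREAMBLE_REGEX = ^time,
# Since the header is skipped, the column names are listed by hand
FIELD_NAMES = time,interface,utilization
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
```

The drawback is the one described above: the column names live in props.conf instead of being read from the file, so the configuration has to be updated whenever the file layout changes.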

Does anyone know if it is possible to index csv file fields without the header and without defining column names manually in props.conf?

Thank you,

Kiril


darrenfuller
Contributor

I usually go with a props/transforms/nullQueue approach for these types of situations, where the field names are known:

# props.conf
[782506]
disabled = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d
MAX_TIMESTAMP_LOOKAHEAD = 15
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = csv
TRANSFORMS-01_killheader = Delete_csv_header

with

#transforms.conf
[Delete_csv_header]
disabled = false
REGEX = ^time\,interface\,utilization
DEST_KEY = queue
FORMAT = nullQueue

This resulted in two events, no header, and field extractions indexed.


anwarmian
Communicator

darrenfuller's answer is good. Index-time and search-time field extraction each have advantages and disadvantages for a CSV file with a header.

Search time: the indexed data takes less storage space than with index-time extraction.
Index time: the field names come from the header of each file, so a change in a field's position (which does happen sometimes) is handled. With search-time extraction you would have to create a new REPORT class, which would override some of the old fields.
Here is a link to a case study on search-time vs. index-time extraction for JSON files:
https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions
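
For comparison, a search-time version of the extraction could be sketched as below. The stanza names my_csv_sourcetype and csv_fields_by_position are placeholders, and the field list assumes the three columns from the sample file in that exact order. Because DELIMS/FIELDS assigns names by position, a reordered column would require changing or adding a REPORT class, which is the situation described above.

```
# props.conf (sketch; stanza name is hypothetical)
[my_csv_sourcetype]
REPORT-csv_fields = csv_fields_by_position

# transforms.conf (sketch; stanza name is hypothetical)
[csv_fields_by_position]
# Split each event on commas and name the values by position
DELIMS = ","
FIELDS = "time", "interface", "utilization"
```

Note that this sketch does nothing about the header line itself; it would still be indexed as an event unless it is dropped separately, for example with a nullQueue transform like the one in darrenfuller's answer.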
