CSV file index time field extractions - how to ignore the header? Is there schema on write support?

kiril123
Path Finder

Hello,

I have the following little csv file:

time,interface,utilization
2019-11-03,int_a,100
2019-11-04,int_b,200

As you can see, it contains a header and two rows of data.

I want to perform index time extraction of the fields. I also want to use timestamp from the time column.

This is my props.conf configuration:

DATETIME_CONFIG =
INDEXED_EXTRACTIONS = csv
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
category = Custom
pulldown_type = 1
HEADER_FIELD_LINE_NUMBER = 1
disabled = false
FIELD_HEADER_REGEX =
PREAMBLE_REGEX =

No matter what I do, Splunk always indexes the header as well. I don't want that. I have tried the following settings:

  1. PREAMBLE_REGEX - this ignores the header, but then index-time field extractions are not performed, probably because the header is ignored (a chicken-and-egg situation). I can work around this by listing the comma-separated field names manually, but I want schema-on-write support, which Splunk doesn't seem to provide.

  2. HEADER_FIELD_LINE_NUMBER = 1 - I tried this setting, and it made no difference.
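
For reference, the manual workaround mentioned in item 1 could be sketched roughly as below. This is only an illustration under stated assumptions, not the configuration actually used: the stanza name my_csv_sourcetype is hypothetical, and it assumes the props.conf FIELD_NAMES setting supplies the column names once PREAMBLE_REGEX has skipped the header line.

```ini
# props.conf (sketch; stanza name is hypothetical)
[my_csv_sourcetype]
INDEXED_EXTRACTIONS = csv
# Treat the header line as preamble so it is not indexed as an event
PREAMBLE_REGEX = ^time,interface,utilization
# Assumption: column names listed by hand, since the header is now skipped
FIELD_NAMES = time,interface,utilization
TIMESTAMP_FIELDS = time
TIME_FORMAT = %Y-%m-%d
```

The drawback is the one already noted: the column names are hard-coded, so the stanza has to be updated whenever the CSV layout changes.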

Does anyone know if it is possible to index csv file fields without the header and without defining column names manually in props.conf?

Thank you,

Kiril


darrenfuller
Contributor

I usually go with a props/transforms/nullQueue approach for these types of situations, where the field names are known:

# props.conf
[782506]
disabled = false
TIME_PREFIX = ^
TIME_FORMAT = %Y-%m-%d
MAX_TIMESTAMP_LOOKAHEAD = 15
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
INDEXED_EXTRACTIONS = csv
TRANSFORMS-01_killheader = Delete_csv_header

with

#transforms.conf
[Delete_csv_header]
disabled = false
REGEX = ^time\,interface\,utilization
DEST_KEY = queue
FORMAT = nullQueue

This resulted in two events, no header, and field extractions indexed.


anwarmian
Communicator

Darrenfuller's answer is good. There are advantages and disadvantages to both index-time and search-time field extraction for a CSV file with a header.

Search time: uses less storage space for indexed data than index-time extraction.
Index time: if a field's position changes, which can happen sometimes, then creating a new report class for search time will override some of the old fields.
Here is a link to a case study on search-time vs. index-time extractions for JSON files:
https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions
