Solved: Re: CSV file ingestion not respecting column headings

ericlarsen · ‎05-04-2017

I'm trying to monitor a CSV file (via a UF) with column headings included in the file. I want the column headings to be extracted at search time.

Sample file output:
"Name","DatabaseSize","UsedDatabaseSpace","AvailableNewMailboxSpace","NumMailboxes","TotalItemCount"
"SFG-DB01","306.9 GB (329,503,997,952 bytes)","257.1 GB (276,068,106,240 bytes)","49.77 GB (53,435,891,712 bytes)","223"
"SFG-DB02","350.4 GB (376,212,291,584 bytes)","300.7 GB (322,833,514,496 bytes)","49.71 GB (53,378,777,088 bytes)","362"
"SFG-DB03","308.6 GB (331,383,570,432 bytes)","236.1 GB (253,546,692,608 bytes)","72.49 GB (77,836,877,824 bytes)","151"

inputs.conf:
[monitor://E:\fileName*.csv]
index = test
sourcetype = mySourcetypeLog
ignoreOlderThan = 24h
crcSalt =

props.conf:
[mySourcetypeLog]
SHOULD_LINEMERGE = false
REPORT-getfields = mySourcetypeLog_fields

transforms.conf:
[mySourcetypeLog_fields]
DELIMS=","
FIELDS = "Name","DatabaseSize","UsedDatabaseSpace","AvailableNewMailboxSpace","NumMailboxes","TotalItemCount"

When I run a oneshot, the data is ingested correctly (one event per log record) but the extracted fields are not showing up.

Any help would be appreciated.
Thanks.

adonio · ‎05-05-2017

I recommend following the docs on indexing CSV data here:
http://docs.splunk.com/Documentation/Splunk/6.5.3/Data/Extractfieldsfromfileswithstructureddata
inputs.conf (as you already have):

[monitor://E:\fileName*.csv]
index = test
sourcetype = mySourcetypeLog
ignoreOlderThan = 24h
crcSalt =

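props.conf (on the universal forwarder that monitors the file; per the linked docs, INDEXED_EXTRACTIONS is applied on the forwarder, and indexers do not re-parse forwarded structured data)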

[mySourcetypeLog]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
CHARSET=AUTO
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
description=Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled=false
pulldown_type=true

Screenshots:

The left-hand side of the first screenshot shows the props.conf.
The second screenshot shows all the fields extracted from the header.
Hope it helps.
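
The description line above mentions header and delimiter settings. As a rough sketch only, this is how those could be spelled out explicitly in the same stanza if auto-detection does not pick up the header row. The setting names are from props.conf; the values are assumptions based on the sample file in the question, not something confirmed in this thread:

```ini
# props.conf, same [mySourcetypeLog] stanza as above (values are assumptions)
[mySourcetypeLog]
INDEXED_EXTRACTIONS = csv
# Fields are comma-separated and wrapped in double quotes in the sample file
FIELD_DELIMITER = ,
FIELD_QUOTE = "
# Assumes the column headings sit on the first line of the file
HEADER_FIELD_LINE_NUMBER = 1
```

With INDEXED_EXTRACTIONS = csv these are normally detected automatically, so the explicit lines are likely only needed for files with preamble lines or unusual quoting.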

adonio · ‎05-04-2017

Why do you want the column headings extracted at search time?
Any particular reason?
This doc: http://docs.splunk.com/Documentation/Splunk/6.5.3/Data/Extractfieldsfromfileswithstructureddata
explains in detail the best practices for indexing CSV data, with config samples and data samples to work with.

ericlarsen · ‎05-05-2017

I don't want users to have to create an extracted field for every single field when the field names are already included in the CSV file.

adonio · ‎05-05-2017

When you bring the data in as described in the docs, the users will not have to create fields at all.
Note that your example has 6 fields in the header but values for only 5 of them.
In that case, per the docs, Splunk will not extract the field that has no values.
Also, some values are strings, for example DatabaseSize = "308.6 GB (331,383,570,432 bytes)". You will probably want to extract a numeric field from these values, for example:
field name: DatabaseSizeGB, value: 308.6. There are multiple ways to do it.
Submitting a full answer with screenshots here.
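
One of those ways, sketched here as an assumption rather than the method used in this thread, is a search-time calculated field in props.conf on the search head. It assumes the DatabaseSize field is already extracted and that the value always starts with a number followed by " GB"; values in other units (MB, TB) would not match and the field would come out empty:

```ini
# props.conf on the search head (search-time calculated field, a sketch)
[mySourcetypeLog]
# Keep only the leading number from values like "308.6 GB (331,383,570,432 bytes)"
# and convert it to a number; DatabaseSizeGB is an illustrative field name
EVAL-DatabaseSizeGB = tonumber(replace(DatabaseSize, "^([\d.]+) GB.*$", "\1"))
```

The same pattern could be repeated for UsedDatabaseSpace and AvailableNewMailboxSpace with their own EVAL- lines.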

ericlarsen · ‎05-05-2017

Ignore the sample file. It's just for illustrative purposes.

I was able to get it to work by setting sourcetype = csv in inputs.conf.
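
For reference, a sketch of what that change would look like, assuming the rest of the monitor stanza from the question stays the same (the final file is not shown in this thread, and the crcSalt line is left out because its value does not appear in the post). csv is one of Splunk's pretrained source types, and its default props.conf stanza already sets INDEXED_EXTRACTIONS = csv, which is likely why the header fields start showing up:

```ini
# inputs.conf on the universal forwarder (assumed unchanged apart from sourcetype)
[monitor://E:\fileName*.csv]
index = test
# Built-in source type that carries the structured-data settings
sourcetype = csv
ignoreOlderThan = 24h
```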

adonio · ‎05-05-2017

Great.
Please mark the question as answered and upvote any comments or answers that you think helped with the resolution.
Have a great weekend.
