We have an S3 bucket containing many CSV files, each with different header fields that need to be extracted at index time. The current configs in place for this are:
[aws:s3:csv]
DATETIME_CONFIG = CURRENT
MAX_TIMESTAMP_LOOKAHEAD = 1
KV_MODE = auto
SHOULD_LINEMERGE = false
HEADER_FIELD_LINE_NUMBER = 1
TRUNCATE = 999999
INDEXED_EXTRACTIONS = csv
This is on the heavy forwarder that has the AWS add-on (latest version) installed, as well as on the indexers. I downloaded a sample CSV file from S3 and imported it into Splunk via the UI, and it parses correctly, yet it does not parse correctly when the data comes in through an S3 input set up in the Splunk_TA_aws app (via the UI or the conf file).
It seems that the AWS add-on is causing it to ignore the HEADER_FIELD_LINE_NUMBER = 1 and INDEXED_EXTRACTIONS = csv settings entirely. Is anyone else seeing this, and does anyone have a solution? Search-time extractions are not an option here because the fields change frequently.
You uploaded the CSV using the UI, right? Can you compare the stanzas in the .conf files for the UI input against those for the AWS input? There might be some differences.
Several users have reported that changing the sourcetype name to [aws:s3:csv] sometimes causes an issue; once some of them reverted to using just [aws:s3], things started working.
Can you try the comparison and experiment with the sourcetype?
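As a rough sketch of that experiment, the stanza from the original post could be moved under the stock sourcetype name. This assumes the S3 input is also switched to sourcetype aws:s3 and that the stanza lives in a local props.conf on the heavy forwarder (the exact file location is an assumption). The settings are copied unchanged from the question, and whether the add-on honours them is exactly what this would test.

```
# props.conf (hypothetical placement: the local directory of the app on the heavy forwarder)
# Same settings as in the question, only the stanza name changes
# from [aws:s3:csv] to the add-on's default [aws:s3].
[aws:s3]
DATETIME_CONFIG = CURRENT
MAX_TIMESTAMP_LOOKAHEAD = 1
KV_MODE = auto
SHOULD_LINEMERGE = false
HEADER_FIELD_LINE_NUMBER = 1
TRUNCATE = 999999
INDEXED_EXTRACTIONS = csv
```

If the header line is still indexed as an event after this change, the sourcetype name is probably not the cause.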
@ShaneNewman Did you get a resolution to this? I am seeing the same thing when I run a "Generic S3" input as a custom input for CSV files in an S3 bucket.
The header lines keep getting indexed and the fields are not extracted when I search the data.
I know it has been a while, but did anyone ever get this issue resolved? I am on the newest version of the AWS Add-On and still cannot figure out how to read in data from CSV files with field extractions.