Solved: Any advice on how to resolve multiple CSV header issues?

andrew_burnett · ‎04-14-2023

We are getting multiple errors like this:

Corrupt csv header in CSV file , 2 columns with the same name

However, we have so many CSV files that finding the affected ones by hand will be all but impossible.

Can someone provide advice on how to find them?

woodcock · ‎04-14-2023

Assuming that your OS is Unix/Linux, that your CSV files follow the standard file-naming convention (i.e. *.csv), that each file has its header on the first line, and that the source files still exist, you can use the following CLI commands to identify the problematic files:

find "${SPLUNK_HOME}"/etc/apps/*/lookups -name '*.csv' -exec head -n 1 {} \; | tr ',' '\n' | sort | uniq -d

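This lists every field name that occurs more than once across all of the headers combined, so a duplicated field, e.g. "foo", will be in the output (possibly alongside names that are simply shared by several files). Then take that name and do this to find the file (or a small pile to peek through):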

for FILE in $(find "${SPLUNK_HOME}"/etc/apps/*/lookups -name '*.csv' -exec grep -il foo {} \;); do echo "${FILE}"; head -n 1 "${FILE}" | tr ',' '\n' | sort | uniq -d; done
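
If it helps, the two steps above can be folded into one pass that only ever looks at the first line of each file and prints the file name next to each duplicated field. This is only a rough sketch: the function names dup_headers and scan_lookups are made up for illustration, and it assumes simple headers (no commas or newlines inside a field name) and treats quoted and unquoted names as the same.

```shell
#!/bin/sh
# Print "file: name" for every field name that appears more than once
# in the header (first line) of one CSV file.
# Assumption: no field name contains a comma; CRs and double quotes are stripped.
dup_headers() {
    file=$1
    head -n 1 "$file" | tr -d '\r"' | tr ',' '\n' | sort | uniq -d |
    while IFS= read -r name; do
        printf '%s: %s\n' "$file" "$name"
    done
}

# Run dup_headers on every *.csv file below the given directory.
# Assumption: file names do not contain newlines.
scan_lookups() {
    find "$1" -type f -name '*.csv' | while IFS= read -r f; do
        dup_headers "$f"
    done
}
```

You would call it as something like scan_lookups "${SPLUNK_HOME}/etc/apps", and files with a clean header produce no output at all.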

Here are some other tips:


andrew_burnett · ‎04-14-2023

So with the first command, is every word it brings back a duplicated one?

woodcock · ‎04-14-2023

Exactly.

andrew_burnett · ‎04-14-2023

Well, we are trying to find specific keywords, and I know of one that I'm trying to test. When I run your second command, it pulls in a ton of CSV files. I checked one of them, and the word isn't in the CSV header at all?

andrew_burnett · ‎04-14-2023

Oh, I see it now: the word is in the body of the CSV file. But I'm only concerned with the headers; is that not what the alert means?

woodcock · ‎04-14-2023

Yes. I updated my answer to help better.
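
For anyone hitting the same thing: grep -il matches the keyword anywhere in the file, which is why files with a clean header show up. A header-only lookup could look roughly like the sketch below. The function name files_with_header_field is hypothetical, and it assumes simple headers with no commas inside a field name and an exact, case-sensitive match on the name.

```shell
#!/bin/sh
# List the CSV files below a directory whose header (first line)
# contains the given field name as a whole field.
# Assumption: no field name contains a comma; CRs and double quotes are stripped.
files_with_header_field() {
    dir=$1
    field=$2
    find "$dir" -type f -name '*.csv' | while IFS= read -r f; do
        # -x matches the whole line, -F treats the name as a literal string
        if head -n 1 "$f" | tr -d '\r"' | tr ',' '\n' | grep -qxF -- "$field"; then
            printf '%s\n' "$f"
        fi
    done
}
```

Called as files_with_header_field "${SPLUNK_HOME}/etc/apps" foo, it should skip files where "foo" only occurs in the data rows.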
