Splunk Search

"Corrupt csv header" : how to find the corrupted csv?

Explorer

I find on splunkd.log a lot of warnings as: "Corrupt csv header, contains empty value (col #3)" without any other details.
I have many .csv and many automated lookups in my Splunk. How can I find the corrupted one without opening all of them with a spreadsheet?

Thank you to all in advance

1 Solution

Esteemed Legend

Assuming that your OS is unix/linux, assuming that your CSV files use standard filenaming conventions (i.e. *.csv), assuming that your CSV files are standard with a header on the first line, assuming that the source files still exist, you can use the following CLI commands to identify problematic files:

find . -name "*.csv" -exec echo {} \; -exec grep -n ",," {} \; | grep -B 1 "^1:"

This variation also considers whitespace but may not be compatible with your version of grep:

find . -name "*.csv" -exec echo {} \; -exec grep -Pn ",\s*," {} \; | grep -B 1 "^1:"

View solution in original post

New Member

Hello,

Something similar happened to me, we had an TA Addon arista and the app Arista operating in the Search Head, we found that there was a duplicate word "speed" in a CSV but there was no reference to what lookup it was.

Error

"09-19-2019 22:05:14.045 -0500 WARN SearchResultsCSVSerializer - Corrupt csv header, 2 columns with the same name 'speed' (col #3 and #0, #3 will be ignored)"

Solution - Find the speed word in the csv files of the splunk apps directory.

grep -Rw '/opt/splunk/etc/apps/' -e 'speed' --include=*.csv

Output

/opt/splunk/etc/apps/TA-arista/lookups/interface-speed.csv:speed,"speeddesc",Speed
/opt/splunk/etc/apps/aristanetworks/lookups/interface-speed.csv:speed,"speed
desc",speed

Then delete the app or duplicate csv (In my case delete the app contain csv duplicate)

/opt/splunk/bin/splunk stop
rm -rf /opt/splunk/etc/apps/aristanetworks
/opt/splunk/bin/splunk start

Validate log level (error, warn)

tail -f tail -f /opt/splunk/var/log/splunk/splunkd.log

OR Query SPL

index="internal"
| search log
level!="info"
| stats values(component) count by log_level

🙂

0 Karma

SplunkTrust
SplunkTrust

Sorry to add another answer; but here is how you do it if your not on Linux or do not have OS access on the search head to run @woodcock 's command.

Using the Lookup Editor App https://splunkbase.splunk.com/app/1724/ you can check the lookup files and see the error straight away!
I uploaded two lookup files called missingoneheaderfield.csv which is missing one header field somewhere in the header and another lookup file called missingendheaderfield.csv which - surprise, surprise - misses one field at the end of the header.

Running the app and looking at the lookup files you can spot and fix the errors very easy.

missingoneheaderfield.csv
alt text

missingendheaderfield.csv
alt text

Hope this will help anyone out there to get rid of this annoying error

cheers, MuS

SplunkTrust
SplunkTrust

another quick and easy solution if you know which lookup it is - use Splunk search commands to clean it:

 | inputlookup lookupName | outputlookup lookupName

cheers, MuS

Esteemed Legend

This may drop rows or mis-align fields with headers, depending on the type of problem. I would rather hand-manage the cleanup.

0 Karma

SplunkTrust
SplunkTrust

This is true, if the problematic header misses a field some where in the header and not at the end - but then you have other problems and were missing those fields anyway.

0 Karma

Explorer

Just spent hours trying this; it was an error in the header BUT it was at the END of the header...that is the first line ended with a comma ",". Grep for this also:

find . -name "*.csv" -exec echo {} \; -exec grep -Pn ",$" {} \; | grep -B 1 "^1:"

Looks for a comma at the END of the line in the first line.

SplunkTrust
SplunkTrust

and to make it match a missing field somewhere in the header AND at the end run this find

find . -name "*.csv" -exec echo {} \; -exec grep -Pn ',$|,,' {} \; | grep -B 1 "^1:"

Still have to figure out how this can be done by Splunk Admin without CLI OS access though......

0 Karma

Esteemed Legend

Assuming that your OS is unix/linux, assuming that your CSV files use standard filenaming conventions (i.e. *.csv), assuming that your CSV files are standard with a header on the first line, assuming that the source files still exist, you can use the following CLI commands to identify problematic files:

find . -name "*.csv" -exec echo {} \; -exec grep -n ",," {} \; | grep -B 1 "^1:"

This variation also considers whitespace but may not be compatible with your version of grep:

find . -name "*.csv" -exec echo {} \; -exec grep -Pn ",\s*," {} \; | grep -B 1 "^1:"

View solution in original post

Explorer

good way! finded! thanks!

0 Karma

Esteemed Legend

This finds everything in 1 search:

find . -name "*.csv" -exec echo {} \; -exec egrep -Pn "^\s*,|,\s*,|,\s*$" {} \; | grep -B 1 "^1:"
0 Karma