Hi Splunkers!
I have an issue with Splunk 6.3.1 and the indexed data from a CSV file.
My CSV file (separated by semicolons) has some empty fields, which I don't want to index in Splunk as null, like so:
NAME;SURNAME;ID;BIRTHPLACE;BIRTHDATE;COUNTRY;POSTCODE;ADDRESS
NAME1;SURNAME1;ID1;;BIRTH1;;;ADDRESS1
NAME2;SURNAME2;ID2;;BIRTH2;;;ADDRESS2
NAME3;SURNAME3;ID3;;BIRTH3;;;ADDRESS3
NAME4;SURNAME4;ID4;;BIRTH4;;;ADDRESS4
So I defined the following property in my props.conf to replace the null values with a placeholder of my choosing:
SEDCMD-replaceblanks=s/^(?=;)|(?<=;)(?=;|$)/-------/g
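To check what that substitution does to a single line, here is a minimal sketch in Python. It assumes Python's `re` module treats these lookarounds the same way Splunk's PCRE-based SEDCMD does; the function name `replace_blanks` is made up for illustration.

```python
import re

# Same pattern and replacement as the SEDCMD above: s/<pattern>/-------/g
# Assumption: Python's re engine matches these lookarounds like Splunk's PCRE.
PATTERN = re.compile(r"^(?=;)|(?<=;)(?=;|$)")
PLACEHOLDER = "-------"


def replace_blanks(line: str) -> str:
    """Fill every empty semicolon-delimited field with the placeholder."""
    return PATTERN.sub(PLACEHOLDER, line)


converted = replace_blanks("NAME1;SURNAME1;ID1;;BIRTH1;;;ADDRESS1")
print(converted)
```

The pattern matches only zero-width positions (line start before a `;`, or between a `;` and the next `;` or the line end), so existing values are left untouched.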
I indexed my data, and I can see it correctly in _raw and in the source data, like this:
But when I want to work with the data (compute stats or concatenate strings), Splunk tells me that the values are empty, even though they are stored with my custom placeholder values! I made a simple table to show the content, and this is the result...
As you can see, the data is indexed correctly, but Splunk does not read it properly.
Do you know why I'm having this trouble? Is this a bug?
Thank you very much!
Not indexing empty fields is the documented behavior, so I think the value should be set at search time if necessary.
https://docs.splunk.com/Documentation/Splunk/6.3.1/Data/Extractfieldsfromfileswithstructureddata
Only header fields containing data are indexed
When Splunk software extracts header fields from structured data files, it only extracts those fields where data is present in at least one row. If the header field has no data in any row, it is skipped (that is, not indexed). Take, for example, the following csv file:
Thank you for your answer @HiroshiSatoh, but I don't understand your explanation.
In my props.conf I'm using the SEDCMD-replaceblanks setting to substitute those blanks with the characters "-------".
I'm doing that at index time and, as you can see in the screenshots, the data is being indexed with those characters (you can see that in the content of the _raw field and also in the source data). However, even though the data is indexed with "-------", Splunk doesn't seem to recognize those characters and treats those fields as empty values.
Maybe it is a bug in that version? I don't know how Splunk behaves in more recent versions in this scenario.
What I suggested is a workaround. Please contact the vendor to find out whether it is a bug.
I think the field is excluded because of the processing order: the header-based field extraction runs before the SEDCMD substitution.
https://wiki.splunk.com/Community:HowIndexingWorks
I think that you should extract the fields again at search time, using the converted _raw.
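To illustrate the idea, here is a toy model in Python. It is not Splunk's actual pipeline: the order (header-based extraction first, substitution second) is only the assumption made in this reply, and the helper `extract` is hypothetical.

```python
import re

# Header and sample row taken from the CSV in the question.
HEADER = "NAME;SURNAME;ID;BIRTHPLACE;BIRTHDATE;COUNTRY;POSTCODE;ADDRESS".split(";")
ORIGINAL = "NAME1;SURNAME1;ID1;;BIRTH1;;;ADDRESS1"


def extract(line):
    """Hypothetical helper: map header names to the values of one row."""
    return dict(zip(HEADER, line.split(";")))


# Assumed step 1: header-based extraction sees the ORIGINAL line,
# so the blank fields are still empty at this point.
indexed_fields = extract(ORIGINAL)

# Assumed step 2: the SEDCMD-style substitution rewrites _raw afterwards.
raw = re.sub(r"^(?=;)|(?<=;)(?=;|$)", "-------", ORIGINAL)

# Workaround: extract the fields again from the converted _raw.
reextracted = extract(raw)

print(indexed_fields["COUNTRY"])  # empty string
print(reextracted["COUNTRY"])     # placeholder
```

Under that assumption, _raw shows the placeholder while the fields extracted earlier stay empty, which would match the symptoms described in the question.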
Thank you for the explanation, I will do the extraction later.
Regards!