<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic CSV empty quoted field extraction problem in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493472#M194405</link>
    <description>&lt;P&gt;Hi there,&lt;/P&gt;

&lt;P&gt;I have the following CSV file:&lt;/P&gt;

&lt;P&gt;"CLM_TIMESTAMP","CLM_DATE","CLM_NUMBER"&lt;BR /&gt;
"1569301200","24/09/2019 00:00:00","389721519283162"&lt;BR /&gt;
"1569301400","24/09/2019 00:00:00",""&lt;BR /&gt;
"1569301600","24/09/2019 00:00:00",""&lt;/P&gt;

&lt;P&gt;When the file is forwarded and indexed, an empty CLM_NUMBER ("") is indexed as 'CLM_NUMBER = ', so the field statistics report that 100% of events have the field. I only want the field to exist in events where it actually has a value, with empty double quotes treated as null.&lt;/P&gt;

&lt;P&gt;Any ideas on how to solve this?&lt;/P&gt;

&lt;P&gt;I have tried applying SEDCMD to the source to change _raw, but even if I replace the whole _raw with "I have changed the raw", all of the CSV fields still keep the values from the CSV file.&lt;/P&gt;</description>
    <pubDate>Wed, 30 Sep 2020 02:23:49 GMT</pubDate>
    <dc:creator>cajose3pepe</dc:creator>
    <dc:date>2020-09-30T02:23:49Z</dc:date>
    <item>
      <title>CSV empty quoted field extraction problem</title>
      <link>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493472#M194405</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;

&lt;P&gt;I have the following CSV file:&lt;/P&gt;

&lt;P&gt;"CLM_TIMESTAMP","CLM_DATE","CLM_NUMBER"&lt;BR /&gt;
"1569301200","24/09/2019 00:00:00","389721519283162"&lt;BR /&gt;
"1569301400","24/09/2019 00:00:00",""&lt;BR /&gt;
"1569301600","24/09/2019 00:00:00",""&lt;/P&gt;

&lt;P&gt;When the file is forwarded and indexed, an empty CLM_NUMBER ("") is indexed as 'CLM_NUMBER = ', so the field statistics report that 100% of events have the field. I only want the field to exist in events where it actually has a value, with empty double quotes treated as null.&lt;/P&gt;

&lt;P&gt;Any ideas on how to solve this?&lt;/P&gt;

&lt;P&gt;I have tried applying SEDCMD to the source to change _raw, but even if I replace the whole _raw with "I have changed the raw", all of the CSV fields still keep the values from the CSV file.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 02:23:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493472#M194405</guid>
      <dc:creator>cajose3pepe</dc:creator>
      <dc:date>2020-09-30T02:23:49Z</dc:date>
    </item>
    <item>
      <title>Re: CSV empty quoted field extraction problem</title>
      <link>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493473#M194406</link>
      <description>&lt;P&gt;Hi&lt;BR /&gt;
try something like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;| makeresults | eval CLM_TIMESTAMP="1569301200", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="389721519283162"
| append [ | makeresults | eval CLM_TIMESTAMP="1569301400", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="" ]
| append [ | makeresults | eval CLM_TIMESTAMP="1569301600", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="" ]
| eval CLM_NUMBER=if(tonumber(CLM_NUMBER)&amp;gt;0,CLM_NUMBER,null())
| stats count BY CLM_NUMBER
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Bye.&lt;BR /&gt;
Giuseppe&lt;/P&gt;</description>
      <pubDate>Mon, 07 Oct 2019 12:56:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493473#M194406</guid>
      <dc:creator>gcusello</dc:creator>
      <dc:date>2019-10-07T12:56:54Z</dc:date>
    </item>
    <item>
      <title>Re: CSV empty quoted field extraction problem</title>
      <link>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493474#M194407</link>
      <description>&lt;P&gt;Many thanks for your answer, Giuseppe. I forgot to mention that I need this to happen at index time.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Oct 2019 15:35:16 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493474#M194407</guid>
      <dc:creator>cajose3pepe</dc:creator>
      <dc:date>2019-10-07T15:35:16Z</dc:date>
    </item>
    <item>
      <title>Re: CSV empty quoted field extraction problem</title>
      <link>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493475#M194408</link>
      <description>&lt;P&gt;I have found a "solution" that works for me:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;PROPS.CONF&lt;/STRONG&gt;&lt;BR /&gt;
[my_sourcetype]&lt;BR /&gt;
CHARSET = ISO-8859-1&lt;BR /&gt;
TZ = America/Sao_Paulo&lt;BR /&gt;
# Take the first field as the timestamp&lt;BR /&gt;
TIME_PREFIX = \"&lt;BR /&gt;
TIME_FORMAT=%s&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD=10&lt;BR /&gt;
SHOULD_LINEMERGE = false&lt;BR /&gt;
disabled = false&lt;BR /&gt;
pulldown_type = true&lt;BR /&gt;
KV_MODE = none&lt;BR /&gt;
NO_BINARY_CHECK = true&lt;BR /&gt;
# Avoid indexing the header line&lt;BR /&gt;
PREAMBLE_REGEX = .*CLM_NUMBER"&lt;BR /&gt;
# Rewrite each event as key="value" pairs&lt;BR /&gt;
SEDCMD-changeeventformat1 = s/(\"[^\"]*\"),(\"[^\"]*\"),(\"[^\"]*\")/clm_timestamp=\1 clm_date=\2 clm_number=\3/g&lt;BR /&gt;
# Delete empty fields&lt;BR /&gt;
SEDCMD-changeeventformat2 = s/ \w+=\"\"//g&lt;/P&gt;
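
&lt;P&gt;For anyone who wants to check what the two SEDCMD rules above do to an event, here is a rough Python sketch. It is not Splunk code: it only mimics the two substitutions with Python's re module, it assumes the rules are applied in the order shown, and the name rewrite_event is made up for the example.&lt;/P&gt;

```python
import re

# Python stand-ins for the two SEDCMD rules (an approximation, not Splunk itself).
# Rule 1: turn the three quoted CSV columns into key="value" pairs.
REFORMAT = re.compile(r'("[^"]*"),("[^"]*"),("[^"]*")')
# Rule 2: drop any key whose value is an empty pair of double quotes.
DROP_EMPTY = re.compile(r' \w+=""')


def rewrite_event(raw):
    """Apply rule 1, then rule 2, to a single raw CSV line (hypothetical helper)."""
    keyed = REFORMAT.sub(r'clm_timestamp=\1 clm_date=\2 clm_number=\3', raw)
    return DROP_EMPTY.sub('', keyed)


if __name__ == "__main__":
    samples = (
        '"1569301200","24/09/2019 00:00:00","389721519283162"',
        '"1569301400","24/09/2019 00:00:00",""',
    )
    for line in samples:
        print(rewrite_event(line))
```

&lt;P&gt;An event with an empty CLM_NUMBER comes out with only clm_timestamp and clm_date, so the clm_number key never appears in it.&lt;/P&gt;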

&lt;P&gt;It is not a beautiful solution, but after hours of trying it is the only one I have found.&lt;/P&gt;

&lt;P&gt;I hope it is helpful for others.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 02:24:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/CSV-empty-quoted-field-extraction-problem/m-p/493475#M194408</guid>
      <dc:creator>cajose3pepe</dc:creator>
      <dc:date>2020-09-30T02:24:05Z</dc:date>
    </item>
  </channel>
</rss>

