<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract fields from a CSV file that has commas in the fields? in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278438#M84066</link>
    <description>&lt;P&gt;Thx ssievert, you helped me out here! &lt;BR /&gt;
Works like a charm.&lt;/P&gt;</description>
    <pubDate>Tue, 16 Feb 2016 19:09:24 GMT</pubDate>
    <dc:creator>renems</dc:creator>
    <dc:date>2016-02-16T19:09:24Z</dc:date>
    <item>
      <title>How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278428#M84056</link>
      <description>&lt;P&gt;Hi There!&lt;/P&gt;

&lt;P&gt;I have an issue with a field extraction. I have a Windows CSV file in which several fields contain commas themselves, which forces the extraction to break at the wrong positions. &lt;BR /&gt;
Is there any clever way I could fix this? Unfortunately, I'm not in a position to edit the source.&lt;/P&gt;

&lt;P&gt;Here's an example of the so-called CSV:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"ssrv0052,true,,,,,,,unix,,,""Sunday, February 7, 2016 8:05:15 PM CET"",Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,IPv4,,,,,,,,,,,,false,,,5de7125b74af5305076f03648a900aea,false,""Sunday, February 7, 2016 8:05:15 PM CET"",""Sunday, February 7, 2016 8:05:15 PM CET"",,ssrv0052,,,false,,,,,,,,,,,,[server],,,,,,,,,,,,,,,Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Take a close look at the date field. It shows the issue I have.&lt;/P&gt;

&lt;P&gt;Thank you in advance!&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 16:26:14 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278428#M84056</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-10T16:26:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278429#M84057</link>
      <description>&lt;P&gt;Can you provide us with the regular expression that is breaking? Are you trying to create a field for each piece of text between the commas?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 16:28:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278429#M84057</guid>
      <dc:creator>skoelpin</dc:creator>
      <dc:date>2016-02-10T16:28:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278430#M84058</link>
      <description>&lt;P&gt;Most certainly, brace yourself:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[Computer]:Display Label,[Computer]:Allow CI Update,[Computer]:BiosAssetTag,[Computer]:BiosDate,[Computer]:BiosSerialNumber,[Computer]:BiosSource,[Computer]:BiosUuid,[Computer]:BiosVersion,[Computer]:CI Type,[Computer]:CalculatedLocation,[Computer]:ChassisType,[Computer]:Create Time,[Computer]:Created By,[Computer]:DefaultGatewayIpAddress,[Computer]:DefaultGatewayIpAddressType,[Computer]:Description,[Computer]:DiscoveredContact,[Computer]:DiscoveredDescription,[Computer]:DiscoveredLocation,[Computer]:DiscoveredModel,[Computer]:DiscoveredOsName,[Computer]:DiscoveredOsVendor,[Computer]:DiscoveredOsVersion,[Computer]:DiscoveredVendor,[Computer]:DnsServers,[Computer]:DomainName,[Computer]:Enable Aging,[Computer]:ExtendedNodeFamily,[Computer]:ExtendedOsFamily,[Computer]:Global Id,[Computer]:Is Candidate For Deletion,[Computer]:Last Access Time,[Computer]:LastModifiedTime,[Computer]:MemorySize,[Computer]:Name,[Computer]:NetBiosName,[Computer]:Node Boot Time,[Computer]:Node Is Complete,[Computer]:Node Is Route,[Computer]:Node Is Virtual,[Computer]:Node Key,[Computer]:Node NNM UID,[Computer]:Node Operating System Installation type,[Computer]:Node Operating System Release,[Computer]:Node Operating System accuracy,[Computer]:Node Server Type,[Computer]:Node is Desktop,[Computer]:NodeFamily,[Computer]:NodeModel,[Computer]:NodeRole,[Computer]:Note,[Computer]:OS Architecture,[Computer]:Origin,[Computer]:OsDescription,[Computer]:OsFamily,[Computer]:OsVendor,[Computer]:PAE Enabled,[Computer]:PrimaryDnsName,[Computer]:ProcessorFamily,[Computer]:SerialNumber,[Computer]:SnmpSysName,[Computer]:SwapMemorySize,[Computer]:SysObjecttId,[Computer]:UdUniqueId,[Computer]:Updated By,[Computer]:User Label,[Computer]:Vendor,[RunningSoftware]:Display Label,[RunningSoftware]:Allow CI Update,[RunningSoftware]:CI Type,[RunningSoftware]:Create Time,[RunningSoftware]:Created By,[RunningSoftware]:Description,[RunningSoftware]:Enable Aging,[RunningSoftware]:Global 
Id,[RunningSoftware]:Is Candidate For Deletion,[RunningSoftware]:Last Access Time,[RunningSoftware]:LastModifiedTime,[RunningSoftware]:Name,[RunningSoftware]:Note,[RunningSoftware]:Origin,[RunningSoftware]:Updated By,[RunningSoftware]:User Label,[RunningSoftware]:Vendor,[RunningSoftware]:Application Category,[RunningSoftware]:Application IP,[RunningSoftware]:Application IP Routing Domain,[RunningSoftware]:Application IP Type,[RunningSoftware]:Application Installed Path,[RunningSoftware]:Application Listening Port Number,[RunningSoftware]:Application Timeout,[RunningSoftware]:Application Username,[RunningSoftware]:Application Version Description,[RunningSoftware]:Container name,[RunningSoftware]:DiscoveredProductName,[RunningSoftware]:ProductName,[RunningSoftware]:StartupTime,[RunningSoftware]:Version
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 10 Feb 2016 17:18:27 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278430#M84058</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-10T17:18:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278431#M84059</link>
      <description>&lt;P&gt;This looks like the CSV file itself. Can you provide us with the regular expression that is parsing out the fields?&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 17:30:31 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278431#M84059</guid>
      <dc:creator>skoelpin</dc:creator>
      <dc:date>2016-02-10T17:30:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278432#M84060</link>
      <description>&lt;P&gt;Looks like the values that have commas in them are quoted, so you shouldn't have that issue. Either specifying the list of fields, in the order they occur in the file, with FIELDS= in props.conf, or relying on a header row in the file itself together with CHECK_FOR_HEADER=true, should work without issues.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 08:43:26 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278432#M84060</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2020-09-29T08:43:26Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278433#M84061</link>
      <description>&lt;P&gt;For the date field you are right, but looking at the original data, I can distinguish 9 "blocks" that are set within quotes:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"ssrv0052,true,,,,,,,unix,,,"
"Sunday, February 7, 2016 8:05:15 PM CET"
",Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,IPv4,,,,,,,,,,,,false,,,5de7125b74af5305076f03648a900aea,false,"
"Sunday, February 7, 2016 8:05:15 PM CET"
","
"Sunday, February 7, 2016 8:05:15 PM CET"
",,ssrv0052,,,false,,,,,,,,,,,,[server],,,,,,,,,,,,,,,Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;If I understand you correctly, that would mean that I create 9 pretty useless fields, right?&lt;/P&gt;

&lt;P&gt;What I really would like is a way to distinguish all original fields, any thoughts on that?&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 19:45:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278433#M84061</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-11T19:45:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278434#M84062</link>
      <description>&lt;P&gt;I'm sorry, I misread your question. I haven't created a regex yet, not the first thing I should put up on my resume &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;
Until now I have only tried extracting by delimiter in the GUI. Also see my reply to ssievert: naming the fields in props.conf doesn't lead to the desired fields. Any help appreciated!&lt;/P&gt;</description>
      <pubDate>Thu, 11 Feb 2016 19:49:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278434#M84062</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-11T19:49:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278435#M84063</link>
      <description>&lt;P&gt;Hi all, let me rephrase my question:&lt;/P&gt;

&lt;P&gt;I have a data input that I cannot modify at the source. It was supposed to be a CSV, but it doesn't really comply.&lt;/P&gt;

&lt;P&gt;This is an example line/event from the input:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;"ssrv0052,true,,,,,,,unix,,,""Sunday, February 7, 2016 8:05:15 PM CET"",Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,IPv4,,,,,,,,,,,,false,,,5de7125b74af5305076f03648a900aea,false,""Sunday, February 7, 2016 8:05:15 PM CET"",""Sunday, February 7, 2016 8:05:15 PM CET"",,ssrv0052,,,false,,,,,,,,,,,,[server],,,,,,,,,,,,,,,Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I'm afraid my only resort for extracting the fields is to use a regex, because the input is divided into 9 blocks within quotes. However, the data contains 40+ fields. These are the blocks I can distinguish in the data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; "ssrv0052,true,,,,,,,unix,,,"
 "Sunday, February 7, 2016 8:05:15 PM CET"
 ",Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,IPv4,,,,,,,,,,,,false,,,5de7125b74af5305076f03648a900aea,false,"
 "Sunday, February 7, 2016 8:05:15 PM CET"
 ","
 "Sunday, February 7, 2016 8:05:15 PM CET"
 ",,ssrv0052,,,false,,,,,,,,,,,,[server],,,,,,,,,,,,,,,Integratie Service Manager : DS_Integratie Service Manager _A02 - SM9 Linux-Unix-Windows Servers - Full,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Who can help me out?&lt;/P&gt;</description>
      <pubDate>Fri, 12 Feb 2016 09:09:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278435#M84063</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-12T09:09:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278436#M84064</link>
      <description>&lt;P&gt;Well, if you can't change the source and get it cleaned up (and you should probably still try to do that), you'll have to do some cleanup using props/transforms. Try this:&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;props.conf&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[dirtycsv]
SEDCMD-cleanupMess = s/^\"//g s/\"$//g s/\"\"/"/g
REPORT-cleanFields = cleancsv_fields 
SHOULD_LINEMERGE=false
NO_BINARY_CHECK = true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;STRONG&gt;transforms.conf&lt;/STRONG&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[cleancsv_fields]
DELIMS = ","
FIELDS = f01,f02,f03,f04,f05,f06,f07,f08,f09,f10,f11,f12,f13,f14,f15,f16,f17,f18,f19,f20,f21,f22,f23,f24,f25,f26,f27,f28,f29,f30,f31,f32,f33,f34,f35,f36,f37,f38,f39,f40
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Change your FIELDS list to the field names you want to assign to each of the fields in the - now structurally cleaned up - CSV file and you should be on a happier path.&lt;/P&gt;
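
&lt;P&gt;As a rough illustration of that renaming, the stanza could start out like the sketch below. The field names are made up for this example (loosely based on the first five column headers posted earlier in the thread), and the list is deliberately cut short; a real FIELDS list would have to continue, in column order, for every column in the file:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[cleancsv_fields]
DELIMS = ","
# Hypothetical names for the first five columns only; extend in column order
FIELDS = display_label,allow_ci_update,bios_asset_tag,bios_date,bios_serial_number
&lt;/CODE&gt;&lt;/PRE&gt;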

&lt;P&gt;The SEDCMD above contains a space-separated list of three SED replacements:&lt;/P&gt;

&lt;OL&gt;
&lt;LI&gt;Remove double quote at beginning of line&lt;/LI&gt;
&lt;LI&gt;Remove double quote at end of line&lt;/LI&gt;
&lt;LI&gt;Replace the doubled double quotes around the timestamp fields with a single double quote&lt;/LI&gt;
&lt;/OL&gt;
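
&lt;P&gt;An untested variation, in case it reads more easily: the same three replacements could presumably be written as separate SEDCMD classes, one per step. This is only a sketch. The class names are invented, the backslashes before the quotes and the g flags on the two anchored patterns are dropped on the assumption that they are not needed, and the order in which Splunk applies several SEDCMD classes is not verified here (for the sample event the result does not depend on it):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[dirtycsv]
# 1. Remove the double quote at the beginning of the line
SEDCMD-a_stripLeadingQuote = s/^"//
# 2. Remove the double quote at the end of the line
SEDCMD-b_stripTrailingQuote = s/"$//
# 3. Turn every doubled double quote into a single double quote
SEDCMD-c_collapseQuotes = s/""/"/g
&lt;/CODE&gt;&lt;/PRE&gt;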

&lt;P&gt;There may be a more efficient way to do this with RegEx, but this should work. The most efficient way would be to clean up the source before it is indexed into Splunk.... &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;I tested this on my local 6.3.x install:&lt;BR /&gt;
&lt;span class="lia-inline-image-display-wrapper" image-alt="alt text"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/1042i32348261647EFC40/image-size/large?v=v2&amp;amp;px=999" role="button" title="alt text" alt="alt text" /&gt;&lt;/span&gt; &lt;/P&gt;

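&lt;P&gt;The SEDCMD setting in props.conf needs to be on your first parsing tier, either indexer or heavy forwarder, not on a universal forwarder. The REPORT- extraction and its transforms.conf stanza are applied at search time, so in a distributed setup they also need to be available on the search head.&lt;/P&gt;</description>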
      <pubDate>Sun, 14 Feb 2016 04:43:36 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278436#M84064</guid>
      <dc:creator>s2_splunk</dc:creator>
      <dc:date>2016-02-14T04:43:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278437#M84065</link>
      <description>&lt;P&gt;Wow ssievert,&lt;/P&gt;

&lt;P&gt;Thanks for the effort you put in! I'm gonna try this as soon as possible, and report back!&lt;/P&gt;</description>
      <pubDate>Mon, 15 Feb 2016 08:25:04 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278437#M84065</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-15T08:25:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract fields from a CSV file that has commas in the fields?</title>
      <link>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278438#M84066</link>
      <description>&lt;P&gt;Thx ssievert, you helped me out here! &lt;BR /&gt;
Works like a charm.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2016 19:09:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/How-to-extract-fields-from-a-CSV-file-that-has-commas-in-the/m-p/278438#M84066</guid>
      <dc:creator>renems</dc:creator>
      <dc:date>2016-02-16T19:09:24Z</dc:date>
    </item>
  </channel>
</rss>

