<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Parsing Fields Properly in Knowledge Management</title>
    <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644685#M9487</link>
    <description>&lt;P&gt;Please share some sanitized example events for us to test with.&amp;nbsp; Are you trying to parse the fields at search time or index time?&amp;nbsp; If the former, please share the SPL you're using; otherwise, share the relevant props.conf stanza.&lt;/P&gt;</description>
    <pubDate>Fri, 26 May 2023 00:16:24 GMT</pubDate>
    <dc:creator>richgalloway</dc:creator>
    <dc:date>2023-05-26T00:16:24Z</dc:date>
    <item>
      <title>How to parse fields properly?</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644684#M9486</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I am trying to get a field extraction working and have written regexes that the field extractor seems to accept. The raw logs are a list of quote-enclosed fields separated by commas:&lt;BR /&gt;&lt;BR /&gt;"field1","field2","field3",...&lt;BR /&gt;&lt;BR /&gt;Certain fields can have multiple values, in which case the values are separated only by commas and the quotes enclose the entire list of values. For example:&lt;BR /&gt;&lt;BR /&gt;"field1","field2","field3value1,field3value2,field3value3",...&lt;BR /&gt;&lt;BR /&gt;To complicate matters, a value can contain multiple words separated by other characters, such as "Software/Technology" or "Business and Industry", so the entire field may look something like this:&lt;BR /&gt;&lt;BR /&gt;"Software/Technology,Business Services,Application,Business and Industry,Computers and Internet"&lt;BR /&gt;&lt;BR /&gt;That field needs to be extracted and displayed exactly as shown.&amp;nbsp;The regexes I have attempted for this are as follows:&lt;BR /&gt;&lt;BR /&gt;"(?&amp;lt;categories&amp;gt;[^\"]+|)&lt;BR /&gt;"(?&amp;lt;categories_again&amp;gt;[\w\s\/\,]+|)&lt;/P&gt;
&lt;P&gt;The field extractor, the rex command, and regex101 all accept both of these extractions, and they work exactly as expected there. When I search, however, each word within the field appears as its own independent value, which is not what I need:&lt;/P&gt;
&lt;P&gt;Software&lt;BR /&gt;Technology&lt;BR /&gt;Business&lt;BR /&gt;Services&lt;BR /&gt;Application&lt;BR /&gt;and&lt;BR /&gt;Industry&lt;/P&gt;
&lt;P&gt;At this point I'm out of ideas as to regex modifications or other work-arounds that can be applied to fix this. Has anyone else encountered this problem and if so, were you able to fix it and how? Otherwise I think I have to bring this to Splunk support.&lt;BR /&gt;&lt;BR /&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Fri, 26 May 2023 12:21:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644684#M9486</guid>
      <dc:creator>Charlie5</dc:creator>
      <dc:date>2023-05-26T12:21:18Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644685#M9487</link>
      <description>&lt;P&gt;Please share some sanitized example events for us to test with.&amp;nbsp; Are you trying to parse the fields at search time or index time?&amp;nbsp; If the former, please share the SPL you're using; otherwise, share the relevant props.conf stanza.&lt;/P&gt;</description>
      <pubDate>Fri, 26 May 2023 00:16:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644685#M9487</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2023-05-26T00:16:24Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644695#M9490</link>
      <description>&lt;P&gt;It is not entirely clear what your expected results are. For example, are you looking for the extraction to produce a multi-value field like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Software/Technology
Business Services
Application
Business and Industry
Computers and Internet&lt;/LI-CODE&gt;&lt;P&gt;or a single field like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Software/Technology,Business Services,Application,Business and Industry,Computers and Internet&lt;/LI-CODE&gt;&lt;P&gt;or in the more generic case a multi-value field like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;field1
field2
field3value1,field3value2,field3value3&lt;/LI-CODE&gt;&lt;P&gt;or is this three fields&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;field1&lt;/LI-CODE&gt;&lt;LI-CODE lang="markup"&gt;field2&lt;/LI-CODE&gt;&lt;LI-CODE lang="markup"&gt;field3value1,field3value2,field3value3&lt;/LI-CODE&gt;&lt;P&gt;or, in the case of the last field&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;field3value1
field3value2
field3value3&lt;/LI-CODE&gt;</description>
      <pubDate>Fri, 26 May 2023 05:52:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644695#M9490</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2023-05-26T05:52:15Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644797#M9495</link>
      <description>&lt;P&gt;Thanks for the responses thus far; they are much appreciated. Here are some sanitized examples of logs:&lt;BR /&gt;&lt;BR /&gt;"2023-04-25 13:14:27","QZ-NewYork_DMZ","QZ-NewYork_DMZ","80.20.59.143","80.20.59.143","Allowed","28 (AAAA)","NOERROR","webdefence.global.whitespider.com","Software/Technology,Application,Computers and Internet","Networks","Networks",""&lt;/P&gt;&lt;P&gt;"2022-10-23 11:34:59","Charlie Five (cfive@workplace.com)","Charlie Five (cfive@workplace.com),QZ-NewYork_Verizon_VPN_NAT,QZ-845310891334","172.32.5.8","8.8.8.8","Allowed","1 (A)","NOERROR","outlook.office365.com","Software/Technology,Webmail,Business Services,Organizational Email,Application,Web-based Email,Online Document Sharing and Collaboration","AD Users","AD Users,Networks,Anyconnect Roaming Client",""&lt;/P&gt;&lt;P&gt;In the first example, I would want the values for the categories field to be as follows; each line represents one complete field value as it would display in a search:&lt;BR /&gt;&lt;BR /&gt;Software/Technology&lt;BR /&gt;Application&lt;BR /&gt;Computers and Internet&lt;/P&gt;&lt;P&gt;Alternatively, the entire string exactly as it appears in the log would also suffice:&lt;BR /&gt;&lt;BR /&gt;Software/Technology,Application,Computers and Internet&lt;/P&gt;&lt;P&gt;The same applies to the second example. Here I show the values as if I had clicked on the field in the event drop-down and selected "view events"; this is what would be added to the search bar:&lt;BR /&gt;&lt;BR /&gt;categories="Software/Technology,Webmail,Business Services,Organizational Email,Application,Web-based Email,Online Document Sharing and Collaboration"&lt;BR /&gt;&lt;BR /&gt;Or (I'll show only one here for the sake of brevity):&lt;/P&gt;&lt;P&gt;categories="Online Document Sharing and Collaboration"&lt;/P&gt;&lt;P&gt;I hope this helps, and thank you again for your assistance.&lt;/P&gt;</description>
      <pubDate>Fri, 26 May 2023 23:30:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644797#M9495</guid>
      <dc:creator>Charlie5</dc:creator>
      <dc:date>2023-05-26T23:30:48Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644804#M9496</link>
      <description>&lt;P&gt;Are you trying to parse the fields at search time or index time?&amp;nbsp; If the former, please share the SPL you're using; otherwise, share the relevant props.conf stanza.&lt;/P&gt;</description>
      <pubDate>Sat, 27 May 2023 00:34:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644804#M9496</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2023-05-27T00:34:11Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644805#M9497</link>
      <description>&lt;P&gt;Depending on whether the final field is important, you could do something like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex max_match=0 "(?&amp;lt;field&amp;gt;\"[^\"]*\"),?"
| eval categories=split(trim(mvindex(field,9),"\""),",")&lt;/LI-CODE&gt;</description>
      <pubDate>Sat, 27 May 2023 06:18:12 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/644805#M9497</guid>
      <dc:creator>ITWhisperer</dc:creator>
      <dc:date>2023-05-27T06:18:12Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/645161#M9499</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/213957"&gt;@richgalloway&lt;/a&gt;&amp;nbsp;Search time. Here is the SPL for manual extraction:&lt;BR /&gt;&lt;BR /&gt;index=my_index sourcetype=proxy_sourcetype&lt;BR /&gt;| rex field=_raw "^(\"([^\"]+)\",){9}\"(?&amp;lt;categories&amp;gt;[^\"]+)\""&lt;/P&gt;</description>
      <pubDate>Tue, 30 May 2023 21:40:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/645161#M9499</guid>
      <dc:creator>Charlie5</dc:creator>
      <dc:date>2023-05-30T21:40:22Z</dc:date>
    </item>
    <item>
      <title>Re: Parsing Fields Properly</title>
      <link>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/645172#M9500</link>
      <description>&lt;P&gt;I think you're most of the way there.&amp;nbsp; To separate the categories, use the &lt;FONT face="courier new,courier"&gt;split&lt;/FONT&gt; function.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;index=my_index sourcetype=proxy_sourcetype
| rex field=_raw "^(\"([^\"]+)\",){9}\"(?&amp;lt;categories&amp;gt;[^\"]+)\""
| eval categories=split(categories,",")&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 31 May 2023 02:02:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Knowledge-Management/How-to-parse-fields-properly/m-p/645172#M9500</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2023-05-31T02:02:51Z</dc:date>
    </item>
  </channel>
</rss>

