<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: fields.conf TOKENIZER breaks my event completely in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581305#M202524</link>
    <description>&lt;P&gt;The spath code only illustrates how to clean up the data. Key-value pairs in Combo can be extracted with the &lt;FONT face="courier new,courier"&gt;extract&lt;/FONT&gt; command (aka kv).&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| spath
| rename Event.EventData.Data{@*} as EventData*, Event.EventData.Data as EventDataData ``` most eval functions cannot handle {} notation ```
| eval EventDataName=mvmap(EventDataName, case(EventDataName == "SubjectUnix", "SubjectUnix &amp;lt;Uid:" . EventDataUid . ", Gid:" . EventDataGid . ", Local:" . EventDataLocal . "&amp;gt;", EventDataName == "SubjectIP", "SubjectIP&amp;lt;" . EventDataIPVersion . "&amp;gt;", true(), EventDataName)) ``` application-specific mapping ```
| eval Combo = mvzip(EventDataName, EventDataData, "=\"")
| rename Combo as _raw
| rex mode=sed "s/$/\"/"
| kv kvdelim="=" ``` extract key-value pairs from Combo ```
| fields - Event*, _raw
| makemv delim=";" DesiredAccess
| makemv delim=";" Attributes
| makemv delim=";" HandleID
| makemv delim=";" ObjectName&lt;/LI-CODE&gt;&lt;P&gt;Sample output looks like this:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;AccessList&lt;/TD&gt;&lt;TD&gt;AccessMask&lt;/TD&gt;&lt;TD&gt;Attributes&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;DesiredAccess&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;HandleID&lt;/TD&gt;&lt;TD&gt;ObjectName&lt;/TD&gt;&lt;TD&gt;ObjectServer&lt;/TD&gt;&lt;TD&gt;ObjectType&lt;/TD&gt;&lt;TD&gt;SubjectDomainName&lt;/TD&gt;&lt;TD&gt;SubjectUserIsLocal&lt;/TD&gt;&lt;TD&gt;SubjectUserName&lt;/TD&gt;&lt;TD&gt;SubjectUserSid&lt;/TD&gt;&lt;TD&gt;_time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;%%4416 %%4423&lt;/TD&gt;&lt;TD&gt;81&lt;/TD&gt;&lt;TD&gt;Open a directory&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;Read Data&lt;/DIV&gt;&lt;DIV class=""&gt;List Directory&lt;/DIV&gt;&lt;DIV class=""&gt;Read Attributes&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;00000000000444&lt;/DIV&gt;&lt;DIV class=""&gt;00&lt;/DIV&gt;&lt;DIV class=""&gt;002a62a7&lt;/DIV&gt;&lt;DIV class=""&gt;0d3d88a4&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;(Shares)&lt;/DIV&gt;&lt;DIV class=""&gt;/LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;Security&lt;/TD&gt;&lt;TD&gt;Directory&lt;/TD&gt;&lt;TD&gt;ACCOUNTS&lt;/TD&gt;&lt;TD&gt;false&lt;/TD&gt;&lt;TD&gt;davidsmith&lt;/TD&gt;&lt;TD&gt;S-1-5-21-3579272529-1234567890-2280984729-123456&lt;/TD&gt;&lt;TD&gt;2022-01-12T15:42:41&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The main point is that structured data are best handled with conformant, tested code. In addition, complex, custom index-time extraction makes maintenance difficult. Search-time prowess is Splunk's core strength. Why not use it?&lt;/P&gt;&lt;P&gt;Meanwhile, the error in fields.conf is that TOKENIZER does not accept extra characters outside the token itself. 
This should work:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[DesiredAccess]
TOKENIZER = (\b[^;]+)&lt;/LI-CODE&gt;</description>
    <pubDate>Mon, 17 Jan 2022 10:40:09 GMT</pubDate>
    <dc:creator>yuanliu</dc:creator>
    <dc:date>2022-01-17T10:40:09Z</dc:date>
    <item>
      <title>fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580831#M202353</link>
      <description>&lt;P&gt;I'm trying to get a new sourcetype (NetApp user-level audit logs, exported as XML) to work, and I think my fields.conf tokenizer is breaking things. But I'm not really sure how, or why, or what to do about it.&lt;/P&gt;&lt;P&gt;The raw data is XML, but I'm not using KV_MODE=xml because that doesn't properly handle all the attributes. So, I've got a bunch of custom regular expressions, the true backbone of all enterprise software. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Here's a single sample event (but you can probably disregard most of it, it's just here for completeness):&lt;/P&gt;
&lt;PRE&gt;&amp;lt;Event&amp;gt;&amp;lt;System&amp;gt;&amp;lt;Provider Name="NetApp-Security-Auditing" Guid="{guid-edited}"/&amp;gt;&amp;lt;EventID&amp;gt;4656&amp;lt;/EventID&amp;gt;&amp;lt;EventName&amp;gt;Open Object&amp;lt;/EventName&amp;gt;&amp;lt;Version&amp;gt;101.3&amp;lt;/Version&amp;gt;&amp;lt;Source&amp;gt;CIFS&amp;lt;/Source&amp;gt;&amp;lt;Level&amp;gt;0&amp;lt;/Level&amp;gt;&amp;lt;Opcode&amp;gt;0&amp;lt;/Opcode&amp;gt;&amp;lt;Keywords&amp;gt;0x8020000000000000&amp;lt;/Keywords&amp;gt;&amp;lt;Result&amp;gt;Audit Success&amp;lt;/Result&amp;gt;&amp;lt;TimeCreated SystemTime="2022-01-12T15:42:41.096809000Z"/&amp;gt;&amp;lt;Correlation/&amp;gt;&amp;lt;Channel&amp;gt;Security&amp;lt;/Channel&amp;gt;&amp;lt;Computer&amp;gt;server-name-edited&amp;lt;/Computer&amp;gt;&amp;lt;ComputerUUID&amp;gt;guid-edited&amp;lt;/ComputerUUID&amp;gt;&amp;lt;Security/&amp;gt;&amp;lt;/System&amp;gt;&amp;lt;EventData&amp;gt;&amp;lt;Data Name="SubjectIP" IPVersion="4"&amp;gt;1.2.3.4&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUnix" Uid="1234" Gid="1234" Local="false"&amp;gt;&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUserSid"&amp;gt;S-1-5-21-3579272529-1234567890-2280984729-123456&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUserIsLocal"&amp;gt;false&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectDomainName"&amp;gt;ACCOUNTS&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUserName"&amp;gt;davidsmith&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectServer"&amp;gt;Security&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectType"&amp;gt;Directory&amp;lt;/Data&amp;gt;&amp;lt;Data Name="HandleID"&amp;gt;00000000000444;00;002a62a7;0d3d88a4&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectName"&amp;gt;(Shares);/LogTestActivity/dsmith/wordpress-shared/plugins-shared&amp;lt;/Data&amp;gt;&amp;lt;Data Name="AccessList"&amp;gt;%%4416 %%4423 &amp;lt;/Data&amp;gt;&amp;lt;Data Name="AccessMask"&amp;gt;81&amp;lt;/Data&amp;gt;&amp;lt;Data Name="DesiredAccess"&amp;gt;Read Data; List Directory; Read Attributes; &amp;lt;/Data&amp;gt;&amp;lt;Data Name="Attributes"&amp;gt;Open a directory; &amp;lt;/Data&amp;gt;&amp;lt;/EventData&amp;gt;&amp;lt;/Event&amp;gt;&lt;/PRE&gt;
&lt;P&gt;My custom app's props.conf has a couple dozen lines like this, for each element I want to be able to search on:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;EXTRACT-DesiredAccess = &amp;lt;Data Name="DesiredAccess"&amp;gt;(?&amp;lt;DesiredAccess&amp;gt;.*?)&amp;lt;\/Data&amp;gt;&lt;BR /&gt;EXTRACT-HandleID = &amp;lt;Data Name="HandleID"&amp;gt;(?&amp;lt;HandleID&amp;gt;.*?)&amp;lt;\/Data&amp;gt;&lt;BR /&gt;EXTRACT-InformationRequested = &amp;lt;Data Name="InformationRequested"&amp;gt;(?&amp;lt;InformationRequested&amp;gt;.*?)&amp;lt;\/Data&amp;gt;&lt;/SPAN&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;This works as you'd expect, except for a couple of fields where they're composites. This is most noticeable in the DesiredAccess element, which in our example looks like:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;&amp;lt;Data Name="DesiredAccess"&amp;gt;Read Data; List Directory; Read Attributes; &amp;lt;/Data&amp;gt;&lt;/PRE&gt;
&lt;P&gt;Thus you get a single field with "Read Data; List Directory; Read Attributes; " and if you only need to look for, say, "List Directory," you have to get clever with your searches.&lt;/P&gt;&lt;P&gt;So, I added a fields.conf file with this in it:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;[DesiredAccess]&lt;BR /&gt;TOKENIZER = \s?(.*?);&lt;/SPAN&gt;&lt;/PRE&gt;
&lt;P&gt;When I paste the 'raw' contents of that field, and that regex, into a tool like regex101.com, it works and returns the expected results. Similarly, it also works if I remove it from fields.conf, and put it in as a makemv command:&lt;/P&gt;&lt;PRE&gt;&lt;SPAN&gt;index=nonprod_pe | makemv tokenizer="\s?(.*?);" DesiredAccess&lt;/SPAN&gt;&lt;/PRE&gt;
&lt;P&gt;With the TOKENIZER element in fields.conf, the DesiredAccess attribute just doesn't populate, period. So I assume it's the problem.&lt;/P&gt;&lt;P&gt;(Since this is in an app, the app's metadata does contain explicit "export = system" lines for both [props] and [fields]. And the app is on indexers and search heads. Probably doesn't need to be in both places, but hey I'm still learning...)&lt;/P&gt;&lt;P&gt;So, what am I doing wrong with my fields.conf tokenizer, that's caused it to fail completely to identify any elements?&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jan 2022 17:33:18 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580831#M202353</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-12T17:33:18Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580894#M202380</link>
      <description>&lt;P&gt;It is inadvisable to handle structured data with custom regex because that approach is fraught with pitfalls. It is better to focus on why &lt;SPAN&gt;KV_MODE=xml "doesn't properly handle all the attributes." Generally speaking, there is no reason why a vendor's tested built-in function cannot handle conformant data.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Can you illustrate, with sanitized data, where the indexer or spath isn't handling it correctly?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 07:20:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580894#M202380</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-01-13T07:20:21Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580957#M202397</link>
      <description>&lt;P&gt;&lt;FONT face="courier new,courier"&gt;KV_MODE=xml&lt;/FONT&gt; doesn't handle most of the &amp;lt;Data Name="fieldname"&amp;gt;value&amp;lt;/Data&amp;gt; events in the way that I would hope/expect. You'll get an attribute named literally "Name" but not something named "fieldname" with a value of "value". The most egregious example in terms of practicality is probably:&lt;/P&gt;&lt;PRE&gt;&amp;lt;Data Name="SubjectUserName"&amp;gt;davidsmith&amp;lt;/Data&amp;gt;&lt;/PRE&gt;&lt;P&gt;In the above, I would like an event attribute named "SubjectUserName" with a value of "davidsmith". (Yes, I want user names in my audit logs...) But neither &lt;FONT face="courier new,courier"&gt;KV_MODE=xml&lt;/FONT&gt; nor &lt;FONT face="courier new,courier"&gt;|xmlkv&lt;/FONT&gt; in a search handles this case properly. 
(Or at least "the way I want them to," which may or may not be "properly.")&lt;/P&gt;&lt;P&gt;NetApp's particular flavor of XML has been an issue for years:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-index-NetApp-CIFS-logs-in-XML-format/td-p/264546" target="_blank"&gt;https://community.splunk.com/t5/Getting-Data-In/How-to-configure-Splunk-to-index-NetApp-CIFS-logs-in-XML-format/td-p/264546&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.splunk.com/t5/Dashboards-Visualizations/Parsing-oddly-formatted-XML-NetApp-log/m-p/367181" target="_blank"&gt;https://community.splunk.com/t5/Dashboards-Visualizations/Parsing-oddly-formatted-XML-NetApp-log/m-p/367181&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Not that this is relevant, because the specific elements I'm asking about in this topic, such as DesiredAccess, aren't parsed properly either. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I'm primarily interested in understanding why my fields.conf tokenizers aren't working, not so much in debugging Splunk's internal XML parser.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 14:23:17 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/580957#M202397</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-13T14:23:17Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581185#M202480</link>
      <description>&lt;P&gt;KV_MODE=xml is perhaps the wrong option for this problem. On the other hand, the &lt;FONT face="courier new,courier"&gt;spath&lt;/FONT&gt; command can put attributes into field names with the {@attrib} notation, so you don't get a field name like "&lt;EM&gt;Name&lt;/EM&gt;"; instead, you get a scalar facsimile of the vectorial attribute space, like &lt;EM&gt;Event.EventData.Data{@Name}&lt;/EM&gt;, &lt;EM&gt;Event.System.Provider{@Name}&lt;/EM&gt;, and so on. Like any reduction of dimensions, &lt;FONT face="courier new,courier"&gt;spath&lt;/FONT&gt; ends up losing some information. (Another problem, which I consider a bug, is that &lt;FONT face="courier new,courier"&gt;spath&lt;/FONT&gt; does not handle empty values correctly.) But because XML follows an application-specific DTD, you can usually compensate with application-specific handling, like the following:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex mode=sed "s/&amp;gt;&amp;lt;\/Data/&amp;gt;()&amp;lt;\//g" ``` compensate for spath's inability to handle empty values ```
| spath
| rename Event.EventData.Data{@*} as EventData*, Event.EventData.Data as EventDataData ``` most eval functions cannot handle {} notation ```
| eval EventDataName=mvmap(EventDataName, case(EventDataName == "SubjectUnix", "SubjectUnix &amp;lt;Uid:" . EventDataUid . ", Gid:" . EventDataGid . ", Local:" . EventDataLocal . "&amp;gt;", EventDataName == "SubjectIP", "SubjectIP&amp;lt;" . EventDataIPVersion . "&amp;gt;", true(), EventDataName)) ``` application-specific mapping ```
| eval Combo = mvzip(EventDataName, EventDataData, "=")&lt;/LI-CODE&gt;&lt;P&gt;(See the inline comments.) Output from your sample data is:&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="400.359375px" height="25px"&gt;&lt;DIV class=""&gt;Combo&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="171.296875px" height="25px"&gt;Event.System.Channel&lt;/TD&gt;&lt;TD width="184.5px" height="25px"&gt;Event.System.Computer&lt;/TD&gt;&lt;TD width="221.734375px" height="25px"&gt;Event.System.ComputerUUID&lt;/TD&gt;&lt;TD width="168.09375px" height="25px"&gt;Event.System.EventID&lt;/TD&gt;&lt;TD width="195.203125px" height="25px"&gt;Event.System.EventName&lt;/TD&gt;&lt;TD width="183.25px" height="25px"&gt;Event.System.Keywords&lt;/TD&gt;&lt;TD width="149.765625px" height="25px"&gt;Event.System.Level&lt;/TD&gt;&lt;TD width="169.15625px" height="25px"&gt;Event.System.Opcode&lt;/TD&gt;&lt;TD width="234.5625px" height="25px"&gt;Event.System.Provider{@Guid}&lt;/TD&gt;&lt;TD width="243.5625px" height="25px"&gt;Event.System.Provider{@Name}&lt;/TD&gt;&lt;TD width="157.5625px" height="25px"&gt;Event.System.Result&lt;/TD&gt;&lt;TD width="163.453125px" height="25px"&gt;Event.System.Source&lt;/TD&gt;&lt;TD width="321.9375px" height="25px"&gt;Event.System.TimeCreated{@SystemTime}&lt;/TD&gt;&lt;TD width="164.765625px" height="25px"&gt;Event.System.Version&lt;/TD&gt;&lt;TD width="337.828125px" height="25px"&gt;&lt;DIV class=""&gt;EventDataData&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="110.453125px" height="25px"&gt;EventDataGid&lt;/TD&gt;&lt;TD width="153.375px" height="25px"&gt;EventDataIPVersion&lt;/TD&gt;&lt;TD width="124.703125px" height="25px"&gt;EventDataLocal&lt;/TD&gt;&lt;TD width="164.796875px" height="25px"&gt;&lt;DIV class=""&gt;EventDataName&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="110.046875px" height="25px"&gt;EventDataUid&lt;/TD&gt;&lt;TD 
width="524.671875px" height="25px"&gt;_raw&lt;/TD&gt;&lt;TD width="103.4375px" height="25px"&gt;_time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="400.359375px" height="575px"&gt;&lt;DIV class=""&gt;SubjectIP&amp;lt;4&amp;gt;=1.2.3.4&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUnix &amp;lt;Uid:1234, Gid:1234, Local:false&amp;gt;=()&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserSid=S-1-5-21-3579272529-1234567890-2280984729-123456&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserIsLocal=false&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectDomainName=ACCOUNTS&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserName=davidsmith&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectServer=Security&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectType=Directory&lt;/DIV&gt;&lt;DIV class=""&gt;HandleID=00000000000444;00;002a62a7;0d3d88a4&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectName=(Shares);/LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;DIV class=""&gt;AccessList=%%4416 %%4423&lt;/DIV&gt;&lt;DIV class=""&gt;AccessMask=81&lt;/DIV&gt;&lt;DIV class=""&gt;DesiredAccess=Read Data; List Directory; Read Attributes;&lt;/DIV&gt;&lt;DIV class=""&gt;Attributes=Open a directory;&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="171.296875px" height="575px"&gt;Security&lt;/TD&gt;&lt;TD width="184.5px" height="575px"&gt;server-name-edited&lt;/TD&gt;&lt;TD width="221.734375px" height="575px"&gt;guid-edited&lt;/TD&gt;&lt;TD width="168.09375px" height="575px"&gt;4656&lt;/TD&gt;&lt;TD width="195.203125px" height="575px"&gt;Open Object&lt;/TD&gt;&lt;TD width="183.25px" height="575px"&gt;0x8020000000000000&lt;/TD&gt;&lt;TD width="149.765625px" height="575px"&gt;0&lt;/TD&gt;&lt;TD width="169.15625px" height="575px"&gt;0&lt;/TD&gt;&lt;TD width="234.5625px" height="575px"&gt;{guid-edited}&lt;/TD&gt;&lt;TD width="243.5625px" height="575px"&gt;NetApp-Security-Auditing&lt;/TD&gt;&lt;TD width="157.5625px" height="575px"&gt;Audit Success&lt;/TD&gt;&lt;TD width="163.453125px" height="575px"&gt;CIFS&lt;/TD&gt;&lt;TD width="321.9375px" 
height="575px"&gt;2022-01-12T15:42:41.096809000Z&lt;/TD&gt;&lt;TD width="164.765625px" height="575px"&gt;101.3&lt;/TD&gt;&lt;TD width="337.828125px" height="575px"&gt;&lt;DIV class=""&gt;1.2.3.4&lt;/DIV&gt;&lt;DIV class=""&gt;()&lt;/DIV&gt;&lt;DIV class=""&gt;S-1-5-21-3579272529-1234567890-2280984729-123456&lt;/DIV&gt;&lt;DIV class=""&gt;false&lt;/DIV&gt;&lt;DIV class=""&gt;ACCOUNTS&lt;/DIV&gt;&lt;DIV class=""&gt;davidsmith&lt;/DIV&gt;&lt;DIV class=""&gt;Security&lt;/DIV&gt;&lt;DIV class=""&gt;Directory&lt;/DIV&gt;&lt;DIV class=""&gt;00000000000444;00;002a62a7;0d3d88a4&lt;/DIV&gt;&lt;DIV class=""&gt;(Shares);/LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;DIV class=""&gt;%%4416 %%4423&lt;/DIV&gt;&lt;DIV class=""&gt;81&lt;/DIV&gt;&lt;DIV class=""&gt;Read Data; List Directory; Read Attributes;&lt;/DIV&gt;&lt;DIV class=""&gt;Open a directory;&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="110.453125px" height="575px"&gt;1234&lt;/TD&gt;&lt;TD width="153.375px" height="575px"&gt;4&lt;/TD&gt;&lt;TD width="124.703125px" height="575px"&gt;false&lt;/TD&gt;&lt;TD width="164.796875px" height="575px"&gt;&lt;DIV class=""&gt;SubjectIP&amp;lt;4&amp;gt;&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUnix &amp;lt;Uid:1234, Gid:1234, Local:false&amp;gt;&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserSid&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserIsLocal&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectDomainName&lt;/DIV&gt;&lt;DIV class=""&gt;SubjectUserName&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectServer&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectType&lt;/DIV&gt;&lt;DIV class=""&gt;HandleID&lt;/DIV&gt;&lt;DIV class=""&gt;ObjectName&lt;/DIV&gt;&lt;DIV class=""&gt;AccessList&lt;/DIV&gt;&lt;DIV class=""&gt;AccessMask&lt;/DIV&gt;&lt;DIV class=""&gt;DesiredAccess&lt;/DIV&gt;&lt;DIV class=""&gt;Attributes&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="110.046875px" height="575px"&gt;1234&lt;/TD&gt;&lt;TD width="524.671875px" height="575px"&gt;&amp;lt;Event&amp;gt;&amp;lt;System&amp;gt;&amp;lt;Provider 
Name="NetApp-Security-Auditing" Guid="{guid-edited}"/&amp;gt;&amp;lt;EventID&amp;gt;4656&amp;lt;/EventID&amp;gt;&amp;lt;EventName&amp;gt;Open Object&amp;lt;/EventName&amp;gt;&amp;lt;Version&amp;gt;101.3&amp;lt;/Version&amp;gt;&amp;lt;Source&amp;gt;CIFS&amp;lt;/Source&amp;gt;&amp;lt;Level&amp;gt;0&amp;lt;/Level&amp;gt;&amp;lt;Opcode&amp;gt;0&amp;lt;/Opcode&amp;gt;&amp;lt;Keywords&amp;gt;0x8020000000000000&amp;lt;/Keywords&amp;gt;&amp;lt;Result&amp;gt;Audit Success&amp;lt;/Result&amp;gt;&amp;lt;TimeCreated SystemTime="2022-01-12T15:42:41.096809000Z"/&amp;gt;&amp;lt;Correlation/&amp;gt;&amp;lt;Channel&amp;gt;Security&amp;lt;/Channel&amp;gt;&amp;lt;Computer&amp;gt;server-name-edited&amp;lt;/Computer&amp;gt;&amp;lt;ComputerUUID&amp;gt;guid-edited&amp;lt;/ComputerUUID&amp;gt;&amp;lt;Security/&amp;gt;&amp;lt;/System&amp;gt;&amp;lt;EventData&amp;gt;&amp;lt;Data Name="SubjectIP" IPVersion="4"&amp;gt;1.2.3.4&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUnix" Uid="1234" Gid="1234" Local="false"&amp;gt;()&amp;lt;/&amp;gt;&amp;lt;Data Name="SubjectUserSid"&amp;gt;S-1-5-21-3579272529-1234567890-2280984729-123456&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUserIsLocal"&amp;gt;false&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectDomainName"&amp;gt;ACCOUNTS&amp;lt;/Data&amp;gt;&amp;lt;Data Name="SubjectUserName"&amp;gt;davidsmith&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectServer"&amp;gt;Security&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectType"&amp;gt;Directory&amp;lt;/Data&amp;gt;&amp;lt;Data Name="HandleID"&amp;gt;00000000000444;00;002a62a7;0d3d88a4&amp;lt;/Data&amp;gt;&amp;lt;Data Name="ObjectName"&amp;gt;(Shares);/LogTestActivity/dsmith/wordpress-shared/plugins-shared&amp;lt;/Data&amp;gt;&amp;lt;Data Name="AccessList"&amp;gt;%%4416 %%4423 &amp;lt;/Data&amp;gt;&amp;lt;Data Name="AccessMask"&amp;gt;81&amp;lt;/Data&amp;gt;&amp;lt;Data Name="DesiredAccess"&amp;gt;Read Data; List Directory; Read Attributes; &amp;lt;/Data&amp;gt;&amp;lt;Data Name="Attributes"&amp;gt;Open a 
directory; &amp;lt;/Data&amp;gt;&amp;lt;/EventData&amp;gt;&amp;lt;/Event&amp;gt;&lt;/TD&gt;&lt;TD width="103.4375px" height="575px"&gt;2022-01-12T15:42:41&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;In the above, the &lt;U&gt;Combo&lt;/U&gt; field is a scalar representation of the &amp;lt;Event&amp;gt;&amp;lt;EventData&amp;gt;&amp;lt;Data&amp;gt; entities, using Event.EventData.Data{@Name} as the primary attribute. As you can see, &lt;EM&gt;SubjectUserName=davidsmith&lt;/EM&gt; is one of the values in &lt;U&gt;Combo&lt;/U&gt;.&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 15 Jan 2022 08:20:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581185#M202480</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-01-15T08:20:56Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581210#M202487</link>
      <description>&lt;P&gt;Which is great but doesn't address the question I'm asking. Note that the DesiredAccess attribute still is shown as a single text item, and isn't being tokenized into its individual components.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Assume for the sake of this discussion that I'm going to stick with regexes for now. I have working regular expressions for the fields I care about, and as long as I don't also have a tokenizer for those fields, the field extraction works. But when I add fields.conf the fields named therein aren't extracted, period. Any suggestions on what I'm doing wrong there?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 15 Jan 2022 16:33:40 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581210#M202487</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-15T16:33:40Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581305#M202524</link>
      <description>&lt;P&gt;The spath code is just to illustrate how to clean up. &amp;nbsp;Key-value pairs in Combo can be extracted using the &lt;FONT face="courier new,courier"&gt;extract&lt;/FONT&gt; command (aka kv).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| spath
| rename Event.EventData.Data{@*} as EventData*, Event.EventData.Data as EventDataData ``` most eval functions cannot handle {} notation ```
| eval EventDataName=mvmap(EventDataName, case(EventDataName == "SubjectUnix", "SubjectUnix &amp;lt;Uid:" . EventDataUid . ", Gid:" . EventDataGid . ", Local:" . EventDataLocal . "&amp;gt;", EventDataName == "SubjectIP", "SubjectIP&amp;lt;" . EventDataIPVersion . "&amp;gt;", true(), EventDataName)) ``` application-specific mapping ```
| eval Combo = mvzip(EventDataName, EventDataData, "=\"")
| rename Combo as _raw
| rex mode=sed "s/$/\"/"
| kv kvdelim="=" ``` extract key-value pairs from Combo ```
| fields - Event*, _raw
| makemv delim=";" DesiredAccess
| makemv delim=";" Attributes
| makemv delim=";" HandleID
| makemv delim=";" ObjectName&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sample output looks like this:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;AccessList&lt;/TD&gt;&lt;TD&gt;AccessMask&lt;/TD&gt;&lt;TD&gt;Attributes&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;DesiredAccess&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;HandleID&lt;/TD&gt;&lt;TD&gt;ObjectName&lt;/TD&gt;&lt;TD&gt;ObjectServer&lt;/TD&gt;&lt;TD&gt;ObjectType&lt;/TD&gt;&lt;TD&gt;SubjectDomainName&lt;/TD&gt;&lt;TD&gt;SubjectUserIsLocal&lt;/TD&gt;&lt;TD&gt;SubjectUserName&lt;/TD&gt;&lt;TD&gt;SubjectUserSid&lt;/TD&gt;&lt;TD&gt;_time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;%%4416 %%4423&lt;/TD&gt;&lt;TD&gt;81&lt;/TD&gt;&lt;TD&gt;Open a directory&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;Read Data&lt;/DIV&gt;&lt;DIV class=""&gt;List Directory&lt;/DIV&gt;&lt;DIV class=""&gt;Read Attributes&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;00000000000444&lt;/DIV&gt;&lt;DIV class=""&gt;00&lt;/DIV&gt;&lt;DIV class=""&gt;002a62a7&lt;/DIV&gt;&lt;DIV class=""&gt;0d3d88a4&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;(Shares)&lt;/DIV&gt;&lt;DIV class=""&gt;/LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;Security&lt;/TD&gt;&lt;TD&gt;Directory&lt;/TD&gt;&lt;TD&gt;ACCOUNTS&lt;/TD&gt;&lt;TD&gt;false&lt;/TD&gt;&lt;TD&gt;davidsmith&lt;/TD&gt;&lt;TD&gt;S-1-5-21-3579272529-1234567890-2280984729-123456&lt;/TD&gt;&lt;TD&gt;2022-01-12T15:42:41&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;The main point is that structured data are best handled with conformant, tested code. &amp;nbsp;In addition, complex, custom index-time extraction makes maintenance difficult. &amp;nbsp;Search-time prowess is Splunk's core strength. &amp;nbsp;Why not use it?&lt;/P&gt;&lt;P&gt;Meanwhile, the error in fields.conf is that TOKENIZER does not accept extra characters outside the token itself. 
This should work:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[DesiredAccess]
TOKENIZER = (\b[^;]+)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jan 2022 10:40:09 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581305#M202524</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-01-17T10:40:09Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581495#M202584</link>
      <description>&lt;P&gt;Well, at least that updated tokenizer breaks things in a different way...&amp;nbsp;&lt;/P&gt;&lt;P&gt;I edited the fields.conf I'm pushing out to my search heads thusly:&lt;/P&gt;&lt;PRE&gt;[DesiredAccess]&lt;BR /&gt;# TOKENIZER = \s?(.*?);&lt;BR /&gt;TOKENIZER = (\b[^;]+)&lt;/PRE&gt;&lt;P&gt;The contents of the fields so tokenized (is that a word?) at least show up when I expand a given search result now. They're a single line, with the semicolons removed. (I highlighted multiple lines because there are actually about a half-dozen such fields that I'm extracting, I limited it to a single instance for this thread because the solution for one should be identical to all the others.)&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Screenshot of a single event, with the improperly-extracted fields highlighted." style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/17593i4DC318D8E87C4B09/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Screenshot 2022-01-18 100703.png" alt="Screenshot of a single event, with the improperly-extracted fields highlighted." /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Screenshot of a single event, with the improperly-extracted fields highlighted.&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Your regex works correctly in online tools like regex101.com, but then again so did mine. (Yours is cleaner and faster, though, so thank you for that.) I wish Splunk had more and better examples of how to use TOKENIZER in the docs.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Dumb Newbie Question of the day: The fields are split correctly if I remove the tokenizer, and add &lt;FONT face="courier new,courier"&gt;| makemv delim=";" FieldNameHere&lt;/FONT&gt; to a search. Is there a way to add that to a config file? (i.e. 
"every time you search this sourcetype, do this" or similar) Part of my goal here is to make life easier for users that aren't deeply familiar with Splunk field commands, and asking these users to add a half-dozen makemv commands to every search isn't exactly convenient for anyone involved.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jan 2022 16:23:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/581495#M202584</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-18T16:23:41Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582058#M202750</link>
      <description>&lt;P&gt;I replaced the tokenizer for my desired fields with&lt;/P&gt;&lt;P&gt;TOKENIZER = (\s?(.*?);)&lt;/P&gt;&lt;P&gt;It's close enough for my case. The tokenized values still include the trailing semicolon, but I can live with that for now. (I tried (\s?(.*?)); but then all the values came back as empty strings.)&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jan 2022 17:15:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582058#M202750</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-21T17:15:44Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582070#M202752</link>
      <description>&lt;P&gt;Have you tried using transforms? You might want to give this a try:&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;transforms.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[extract_xml_data_atribute_as_field]
REGEX=&amp;lt;Data Name="([^"]*)"[^&amp;gt;]*&amp;gt;([^&amp;lt;]*)
FORMAT=$1::$2

[extract_xml_data_values_list_as_mv]
SOURCE_KEY = DesiredAccess
REGEX = (?&amp;lt;DesiredAccessList&amp;gt;[^;]*);
MV_ADD=true&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;props.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[&amp;lt;your_sourcetype&amp;gt;]
REPORT-xml_data_to_field = extract_xml_data_atribute_as_field, extract_xml_data_values_list_as_mv&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jan 2022 17:57:06 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582070#M202752</guid>
      <dc:creator>diogofgm</dc:creator>
      <dc:date>2022-01-21T17:57:06Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582082#M202758</link>
      <description>&lt;P&gt;What benefits would there be to a transforms.conf approach over fields.conf? I'm still fairly new to Splunk, and definitely new to this sort of data massaging, so I don't deeply understand the pros and cons of each.&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jan 2022 19:01:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582082#M202758</guid>
      <dc:creator>dsmith</dc:creator>
      <dc:date>2022-01-21T19:01:43Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582100#M202768</link>
      <description>&lt;P&gt;In your last reply you said that the solution you ended up with was close enough. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; This solution is better than close enough. It also saves you the extra makemv command in your search, because with this transform the field is already extracted as a multivalue field.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jan 2022 00:09:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582100#M202768</guid>
      <dc:creator>diogofgm</dc:creator>
      <dc:date>2022-01-22T00:09:43Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582115#M202775</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;The contents of the fields so tokenized (is that a word?) at least show up when I expand a given search result now. They're a single line, with the semicolons removed. (I highlighted multiple lines because there are actually about a half-dozen such fields that I'm extracting, I limited it to a single instance for this thread because the solution for one should be identical to all the others.)&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-center" image-alt="Screenshot of a single event, with the improperly-extracted fields highlighted." style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/17593i4DC318D8E87C4B09/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Screenshot 2022-01-18 100703.png" alt="Screenshot of a single event, with the improperly-extracted fields highlighted." /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Screenshot of a single event, with the improperly-extracted fields highlighted.&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;Maybe you can elaborate on "breaks things in a different way..." &amp;nbsp;You are correct that the values look to be on a single line &lt;STRONG&gt;IF&lt;/STRONG&gt; you just expand the event view. &amp;nbsp;But that look by itself doesn't mean much. &amp;nbsp;Based on your original question, your intention is to break DesiredAccess, etc., into a multivalue field instead of a single semicolon-separated string. &amp;nbsp;The proposed TOKENIZER does exactly that. &amp;nbsp;How is this broken? &amp;nbsp;You can count the number of values in DesiredAccess like this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| eval AccessCount=mvcount(DesiredAccess)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You'll see that the count is &amp;gt; 1. 
&amp;nbsp;I ingested your sample data, then used the following props.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[xml-too_small]
EXTRACT-DesiredAccess = &amp;lt;Data Name="DesiredAccess"&amp;gt;(?&amp;lt;DesiredAccess&amp;gt;.*?)&amp;lt;\/Data&amp;gt;
EXTRACT-HandleID = &amp;lt;Data Name="HandleID"&amp;gt;(?&amp;lt;HandleID&amp;gt;.*?)&amp;lt;\/Data&amp;gt;
EXTRACT-InformationRequested = &amp;lt;Data Name="InformationRequested"&amp;gt;(?&amp;lt;InformationRequested&amp;gt;.*?)&amp;lt;\/Data&amp;gt;
EXTRACT-Attributes = &amp;lt;Data Name="Attributes"&amp;gt;(?&amp;lt;Attributes&amp;gt;[^&amp;lt;]*)&amp;lt;\/Data&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and fields.conf&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[DesiredAccess]
TOKENIZER = (\b[^;]+)
[ObjectName]
TOKENIZER = (\b[^;]+)
[InformationRequested]
TOKENIZER = (\b[^;]+)
[Attributes]
TOKENIZER = (\b[^;]+)
[HandleID]
TOKENIZER = (\b[^;]+)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I perform this search&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;index="tests" source="netapptest.xml"
| table DesiredAccess Attributes HandleID ObjectName
| eval AccessCount=mvcount(DesiredAccess)
| eval ObjectCount=mvcount(ObjectName)
| eval HandleCount=mvcount(HandleID)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;it gives&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;DIV class=""&gt;DesiredAccess&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;Attributes&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;HandleID&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;ObjectName&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;AccessCount&lt;/TD&gt;&lt;TD&gt;HandleCount&lt;/TD&gt;&lt;TD&gt;ObjectCount&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;DIV class=""&gt;Read Data&lt;/DIV&gt;&lt;DIV class=""&gt;List Directory&lt;/DIV&gt;&lt;DIV class=""&gt;Read Attributes&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;Open a directory&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;00000000000444&lt;/DIV&gt;&lt;DIV class=""&gt;00&lt;/DIV&gt;&lt;DIV class=""&gt;002a62a7&lt;/DIV&gt;&lt;DIV class=""&gt;0d3d88a4&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;&lt;DIV class=""&gt;Shares)&lt;/DIV&gt;&lt;DIV class=""&gt;LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;So, even though the expanded event view displays these fields in a single line, they are really multivalue fields now; DesiredAccess, for example, is made of 3 distinct values. (Do not test this in verbose mode. &amp;nbsp;That mode can interact strangely.) &amp;nbsp;This is exactly what TOKENIZER does, and I believe that this is what you originally wanted.&lt;/P&gt;&lt;P&gt;The "clipping" of the opening parenthesis in ObjectName highlights the reason why I strongly recommend using vendor-provided commands like spath. 
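&lt;/P&gt;&lt;P&gt;The clipping can be looked at in isolation with a rough, untested sketch like the one below. &amp;nbsp;It assumes that the tokenizer option of &lt;FONT face="courier new,courier"&gt;makemv&lt;/FONT&gt; applies the regex the same way TOKENIZER in fields.conf does, which is an unverified assumption; viaTokenizer, viaDelim, TokenizerCount and DelimCount are made-up field names. &amp;nbsp;\b only matches at a word boundary, and there is none in front of "(" at the start of the value or in front of "/" after the semicolon, so each token can only start at the first word character after them. &amp;nbsp;If the assumption holds, viaTokenizer should lose those leading characters while viaDelim keeps them.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults
| eval viaTokenizer="(Shares);/LogTestActivity/dsmith/wordpress-shared/plugins-shared", viaDelim=viaTokenizer ``` hypothetical field names, same sample value in both ```
| makemv tokenizer="(\b[^;]+)" viaTokenizer ``` same regex as the TOKENIZER above ```
| makemv delim=";" viaDelim ``` plain split on the semicolon ```
| eval TokenizerCount=mvcount(viaTokenizer), DelimCount=mvcount(viaDelim)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;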
&amp;nbsp; You can fine-tune that TOKENIZER to get around this one problem, but there may be other data values that break it.&lt;/P&gt;&lt;P&gt;So, I refined the spath method to eliminate glitches when there are multiple attributes in one property:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex mode=sed "s/&amp;gt;&amp;lt;\/Data/&amp;gt;()&amp;lt;\//g" ``` compensate for spath's inability to handle empty values ```
| spath
| rename Event.EventData.Data{@*} as EventData*, Event.EventData.Data as EventDataData ``` most eval functions cannot handle {} notation ```
| eval EventDataName=mvmap(EventDataName, case(EventDataName == "SubjectUnix", "SubjectUnix &amp;lt;Uid:" . EventDataUid . ", Gid:" . EventDataGid . ", Local:" . EventDataLocal . "&amp;gt;", EventDataName == "SubjectIP", "SubjectIP&amp;lt;" . EventDataIPVersion . "&amp;gt;", true(), EventDataName)) ``` application-specific mapping ```
| eval Combo = mvzip(EventDataName, EventDataData, "=\"")
| eval Combo = mvmap(Combo, replace(Combo, "&amp;lt;(.+)&amp;gt;=\"", "=\"&amp;lt;\1&amp;gt;")) ``` handle multi-attribute properties ```
| rename Combo as _raw
| rex mode=sed "s/$/\"/"
| kv kvdelim="=" ``` extract key-value pairs from Combo ```
| fields - Event*, _raw
| makemv delim=";" DesiredAccess
| makemv delim=";" Attributes
| makemv delim=";" HandleID
| makemv delim=";" ObjectName&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(Note: The above will not work correctly when that custom TOKENIZER exists.) &amp;nbsp;This is a lot more generic in terms of which parts of XML turn into fields. &amp;nbsp;The output of the above for your sample data is&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="74.8125px" height="25px"&gt;_time&lt;/TD&gt;&lt;TD width="89.046875px" height="25px"&gt;AccessList&lt;/TD&gt;&lt;TD width="101.25px" height="25px"&gt;AccessMask&lt;/TD&gt;&lt;TD width="83.265625px" height="25px"&gt;Attributes&lt;/TD&gt;&lt;TD width="118.609375px" height="25px"&gt;&lt;DIV class=""&gt;DesiredAccess&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="145.40625px" height="25px"&gt;&lt;DIV class=""&gt;HandleID&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="274.28125px" height="25px"&gt;&lt;DIV class=""&gt;ObjectName&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="106.953125px" height="25px"&gt;ObjectServer&lt;/TD&gt;&lt;TD width="94.65625px" height="25px"&gt;ObjectType&lt;/TD&gt;&lt;TD width="164.796875px" height="25px"&gt;SubjectDomainName&lt;/TD&gt;&lt;TD width="87.515625px" height="25px"&gt;SubjectIP&lt;/TD&gt;&lt;TD width="98.71875px" height="25px"&gt;SubjectUnix&lt;/TD&gt;&lt;TD width="150.6875px" height="25px"&gt;SubjectUserIsLocal&lt;/TD&gt;&lt;TD width="90.453125px" height="25px"&gt;SubjectUserName&lt;/TD&gt;&lt;TD width="112.484375px" height="25px"&gt;SubjectUserSid&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="74.8125px" height="113px"&gt;2022-01-21 19:38:25&lt;/TD&gt;&lt;TD width="89.046875px" height="113px"&gt;%%4416 %%4423&lt;/TD&gt;&lt;TD width="101.25px" height="113px"&gt;81&lt;/TD&gt;&lt;TD width="83.265625px" height="113px"&gt;Open a directory&lt;/TD&gt;&lt;TD width="118.609375px" 
height="113px"&gt;&lt;DIV class=""&gt;Read Data&lt;/DIV&gt;&lt;DIV class=""&gt;List Directory&lt;/DIV&gt;&lt;DIV class=""&gt;Read Attributes&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="145.40625px" height="113px"&gt;&lt;DIV class=""&gt;00000000000444&lt;/DIV&gt;&lt;DIV class=""&gt;00&lt;/DIV&gt;&lt;DIV class=""&gt;002a62a7&lt;/DIV&gt;&lt;DIV class=""&gt;0d3d88a4&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="274.28125px" height="113px"&gt;&lt;DIV class=""&gt;(Shares)&lt;/DIV&gt;&lt;DIV class=""&gt;/LogTestActivity/dsmith/wordpress-shared/plugins-shared&lt;/DIV&gt;&lt;/TD&gt;&lt;TD width="106.953125px" height="113px"&gt;Security&lt;/TD&gt;&lt;TD width="94.65625px" height="113px"&gt;Directory&lt;/TD&gt;&lt;TD width="164.796875px" height="113px"&gt;ACCOUNTS&lt;/TD&gt;&lt;TD width="87.515625px" height="113px"&gt;&amp;lt;4&amp;gt;1.2.3.4&lt;/TD&gt;&lt;TD width="98.71875px" height="113px"&gt;&amp;lt;Uid:1234, Gid:1234, Local:false&amp;gt;()&lt;/TD&gt;&lt;TD width="150.6875px" height="113px"&gt;false&lt;/TD&gt;&lt;TD width="90.453125px" height="113px"&gt;davidsmith&lt;/TD&gt;&lt;TD width="112.484375px" height="113px"&gt;S-1-5-21-3579272529-1234567890-2280984729-123456&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;DesiredAccess, ObjectName, etc., are now multivalued, the first value of ObjectName no longer loses its opening parenthesis, and SubjectIP and SubjectUnix now show their attributes (such as IPVersion) embedded in the value.&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 22 Jan 2022 04:20:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582115#M202775</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-01-22T04:20:38Z</dc:date>
    </item>
    <item>
      <title>Re: fields.conf TOKENIZER breaks my event completely</title>
      <link>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582143#M202785</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/33901"&gt;@yuanliu&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;&lt;P&gt;KV_MODE=xml is perhaps the wrong option for this problem. &amp;nbsp;On the other hand, &lt;FONT face="courier new,courier"&gt;spath&lt;/FONT&gt; command&lt;/P&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;I didn't look deeply enough. &amp;nbsp;In fact, KV_MODE=xml performs spath just as explicit SPL does. &amp;nbsp;It could have worked were it not for the lack of a placeholder value when Event.EventData.Data contains empty values. &amp;nbsp;With explicit spath, I work around this by applying&amp;nbsp;"s/&amp;gt;&amp;lt;\/Data/&amp;gt;()&amp;lt;\//g" with rex mode=sed before running spath. &amp;nbsp;But there is no way to apply that fix to the implicit extraction.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Jan 2022 11:23:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/fields-conf-TOKENIZER-breaks-my-event-completely/m-p/582143#M202785</guid>
      <dc:creator>yuanliu</dc:creator>
      <dc:date>2022-01-22T11:23:38Z</dc:date>
    </item>
  </channel>
</rss>

