<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Speeding up XML searches in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25424#M177572</link>
    <description>&lt;P&gt;I don't think there is any way to parse XML prior to indexing.&lt;/P&gt;

&lt;P&gt;However, here is what might make your search run more quickly: does the value "mmsgw1.xxx.com" appear anywhere in the text other than in ActivityLogRecord_Common-ServerID?  If this value does NOT appear anywhere else, then you could simply search&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=test1 sourcetype="mmsgw" mmsgw1.xxx.com
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;for example.  Once you have retrieved only the needed events, you can then apply xmlkvrecursive.  This will probably be much faster if the initial search eliminates a significant number of events.  And perhaps you don't need to parse the XML at all -- you only need to create the fields from the XML if you want to run searches, statistics or reports on the fields that xmlkvrecursive creates.&lt;/P&gt;

&lt;P&gt;Based on the example event, the following search is also valid&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=test1 sourcetype=mmsgw serverid="mmsgw1.xxx.com" userid="999999999@h1.xxx.com"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and also doesn't appear to require the XML to be parsed.  But of course, I don't really understand the data...&lt;/P&gt;</description>
    <pubDate>Thu, 09 Jun 2011 00:24:34 GMT</pubDate>
    <dc:creator>lguinn2</dc:creator>
    <dc:date>2011-06-09T00:24:34Z</dc:date>
    <item>
      <title>Speeding up XML searches</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25423#M177571</link>
      <description>&lt;P&gt;It seems I need to use 'xmlkvrecursive' to properly parse XML log files where the tags may contain many attributes. However, this parsing happens at search time, so the search is very slow.&lt;BR /&gt;
How can I parse (using xmlkvrecursive or similar) at index time, then search on the tag or attribute names via the indexes?&lt;/P&gt;

&lt;P&gt;Example of current search:&lt;/P&gt;

&lt;P&gt;index=test1 sourcetype="mmsgw" | xmlkvrecursive | search ActivityLogRecord_Common-ServerID="mmsgw1.xxx.com" ActivityLogRecord_Common-UserID="&lt;A href="mailto:999999999@h1.xxx.com" target="_blank"&gt;999999999@h1.xxx.com&lt;/A&gt;"&lt;/P&gt;

&lt;P&gt;A single data record typically looks like:&lt;/P&gt;

&lt;P&gt;&lt;ACTIVITYLOGRECORD&gt;&lt;BR /&gt;
    &lt;COMMON servicename="Voice Writer" starttimestamp="Thu, 09 Jun 2011 08:55:27 +1000" endtimestamp="Thu, 09 Jun 2011 08:56:12 +10:00" serverid="mmsgw1.xxx.com" userid="999999999@h1.xxx.com" mainaction="Voice Writer Main" userterminal="Handheld"&gt;&lt;BR /&gt;
        &lt;MAINACTIONRESULT status="1" description=""&gt;&lt;BR /&gt;
        &lt;/MAINACTIONRESULT&gt;&lt;BR /&gt;
    &lt;/COMMON&gt;&lt;BR /&gt;
    &lt;SERVICESPECIFIC&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="COS" value="default"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="MessageType" value="voice-message"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="MessageReceiveTime" value="Thu, 9 Jun 2011 08:55:27 +1000 (EST)"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="SenderID" value="437771747"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="RFC822MsgId" value="&amp;lt;15630469.38497162.1307573727523.JavaMail.vxvuser@vm-asu1.KCBK.h1.xxx.com&amp;gt;"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="SingleAttMessage" value="1"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineReferenceNumber" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="DSNAction" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="DSNStatus" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="DirectMessageLink" value="0499999999"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineTextLength" value="83"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="TriggerSndTimeStamp" value="Thu, 09 Jun 2011 08:55:27 +1000"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="ReceiveDSNTimeStamp" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineTriggerRcvTimeStamp1" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineDSNSndTimeStamp" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="ReceiveResponseTimeStamp" value="Thu, 09 Jun 2011 08:56:11 +1000"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineTriggerRcvtimeStamp2" value="Wed, 08 Jun 2011 22:55:59 +0000"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineResponseSndTimeStamp" value="Wed, 8 Jun 2011 22:56:05 +0000 (GMT)"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="SNAPTimeStamp" value="Thu, 09 Jun 2011 08:56:12 +1000"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="ServiceLevel" value="0"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="DeliveryMethod" value="SMS"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="MMSContent" value="Text"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="Destination" value="61499999999"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="OperationMode" value="Copy"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="MsgSyncEnabled" value="false"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="StatusCode" value="Err-Succ"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="LongMessage" value="0"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="EngineLanguage" value="en-AU"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="CLI Restricted" value="0"&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;ATTRIBUTE name="MM7MsgID" value=""&gt;&lt;BR /&gt;
        &lt;/ATTRIBUTE&gt;&lt;BR /&gt;
        &lt;MESSAGEOPTIONS importance="Normal" sensitivity=""&gt;&lt;BR /&gt;
        &lt;/MESSAGEOPTIONS&gt;&lt;BR /&gt;
        &lt;MESSAGEVOLUME seconds="7"&gt;&lt;BR /&gt;
        &lt;/MESSAGEVOLUME&gt;&lt;BR /&gt;
        &lt;TRANSCODINGATTACHMENTLIST&gt;&lt;BR /&gt;
            &lt;TRANSCODINGPROPERTIES formatbefore="sbc" formatafter="G711Alawwav" sizebefore="14208" sizeafter="56620"&gt;&lt;/TRANSCODINGPROPERTIES&gt;&lt;BR /&gt;
        &lt;/TRANSCODINGATTACHMENTLIST&gt;&lt;BR /&gt;
        &lt;SUBACTIONRESULTLIST&gt;&lt;BR /&gt;
            &lt;SUBACTION name="ProfileFetch" status="1" description="MsgHdrs" timestamp="Thu, 09 Jun 2011 08:55:27 +1000"&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
            &lt;SUBACTION name="Audio Transcoding" status="1" description="Success:sbc=&amp;gt;G711Alawwav&amp;quot; TimeStamp=&amp;quot;Thu, 09 Jun 2011 08:55:27 +1000&amp;quot; /&amp;gt;&amp;lt;br&amp;gt;
            &amp;lt;SubAction Name=" sendenginetrigger=""&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
            &lt;SUBACTION name="HTTP-Put" status="1" description="Success:EngTrigger" timestamp="Thu, 09 Jun 2011 08:55:27 +1000"&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
            &lt;SUBACTION name="ReceiveEngineResponse" status="1" description="Success:Err-Succ" timestamp="Thu, 09 Jun 2011 08:56:11 +1000"&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
            &lt;SUBACTION name="HTTP-Delete" status="1" description="Success:AftEngine" timestamp="Thu, 09 Jun 2011 08:56:12 +1000"&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
            &lt;SUBACTION name="TriggerSMS" status="1" description="Success:No-DR-Required" timestamp="Thu, 09 Jun 2011 08:56:12 +1000"&gt;&lt;/SUBACTION&gt;&lt;BR /&gt;
        &lt;/SUBACTIONRESULTLIST&gt;&lt;BR /&gt;
    &lt;/SERVICESPECIFIC&gt;&lt;BR /&gt;
&lt;/ACTIVITYLOGRECORD&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 09:39:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25423#M177571</guid>
      <dc:creator>bhiley</dc:creator>
      <dc:date>2020-09-28T09:39:43Z</dc:date>
    </item>
    <item>
      <title>Re: Speeding up XML searches</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25424#M177572</link>
      <description>&lt;P&gt;I don't think there is any way to parse XML prior to indexing.&lt;/P&gt;

&lt;P&gt;However, here is what might make your search run more quickly: does the value "mmsgw1.xxx.com" appear anywhere in the text other than in ActivityLogRecord_Common-ServerID?  If this value does NOT appear anywhere else, then you could simply search&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=test1 sourcetype="mmsgw" mmsgw1.xxx.com
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;for example.  Once you have retrieved only the needed events, you can then apply xmlkvrecursive.  This will probably be much faster if the initial search eliminates a significant number of events.  And perhaps you don't need to parse the XML at all -- you only need to create the fields from the XML if you want to run searches, statistics or reports on the fields that xmlkvrecursive creates.&lt;/P&gt;
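&lt;P&gt;A rough sketch of that two-stage approach, combining the search above with the commands from the question. It assumes the field name ActivityLogRecord_Common-ServerID from the original search and has not been tested against this data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=test1 sourcetype="mmsgw" mmsgw1.xxx.com
| xmlkvrecursive
| search ActivityLogRecord_Common-ServerID="mmsgw1.xxx.com"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;The first line narrows the events using the raw term, so xmlkvrecursive only has to parse the events that remain.  The final search re-checks the extracted field in case the term also occurs somewhere else in an event.&lt;/P&gt;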

&lt;P&gt;Based on the example event, the following search is also valid&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;index=test1 sourcetype=mmsgw serverid="mmsgw1.xxx.com" userid="999999999@h1.xxx.com"
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;and also doesn't appear to require the XML to be parsed.  But of course, I don't really understand the data...&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2011 00:24:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25424#M177572</guid>
      <dc:creator>lguinn2</dc:creator>
      <dc:date>2011-06-09T00:24:34Z</dc:date>
    </item>
    <item>
      <title>Re: Speeding up XML searches</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25425#M177573</link>
      <description>&lt;P&gt;It seems slightly counterintuitive to me to index before you parse, but then I'm a newbie with Splunk. Your method certainly speeds up the search hugely. Many thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2011 00:44:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Speeding-up-XML-searches/m-p/25425#M177573</guid>
      <dc:creator>bhiley</dc:creator>
      <dc:date>2011-06-09T00:44:20Z</dc:date>
    </item>
  </channel>
</rss>

