<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: getting xml data into splunk consistently in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651997#M110718</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/256709"&gt;@Strangertinz&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Really!&amp;nbsp; It is being parsed correctly?&amp;nbsp; I have no idea how it could be, based on your sample data and the props.conf shown.&amp;nbsp; Using&amp;nbsp;&lt;SPAN&gt;LINE_BREAKER=(&amp;lt;log_entry&amp;gt;) with&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;SHOULD_LINEMERGE=FALSE would rip the&amp;nbsp;&amp;lt;log_entry&amp;gt; line out of the XML, which would break the XML structure.&amp;nbsp; This might explain why the data appears clumped, as timestamp extraction would only work on events that contain the log_time value.&amp;nbsp; Events without timestamps would have to fall back on other sources, such as the modification time of the source file.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;If you use the hidden _indextime metadata (you need to rename the field to see it, e.g. "rename _indextime AS indextime"), this will give you the time (in epoch seconds) the data was ingested by Splunk (written to the index), and you can check whether the event time and index time match or vary wildly.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 26 Jul 2023 04:28:34 GMT</pubDate>
    <dc:creator>yeahnah</dc:creator>
    <dc:date>2023-07-26T04:28:34Z</dc:date>
    <item>
      <title>Getting xml data into Splunk consistently?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651813#M110695</link>
      <description>&lt;P&gt;I am having trouble ingesting my data into Splunk consistently. I have an&amp;nbsp;&lt;SPAN&gt;XML log file that is constantly being written to (about 100 entries per minute). However, when I search for the data in Splunk I only see sporadic results: I see results for 10 minutes, then nothing for the next 20, and so on.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I have my inputs and props config below.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;inputs config:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;[monitor:///var/log/sample_xml_file.xml]&lt;BR /&gt;disabled = false&lt;BR /&gt;index = sample_xml_index&lt;BR /&gt;sourcetype= sample_xml_st&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;props.conf:&lt;/P&gt;
&lt;P&gt;---------------------&lt;/P&gt;
&lt;P&gt;[ sample_xml_st ]&lt;BR /&gt;CHARSET=UTF-8&lt;BR /&gt;KV_MODE=xml&lt;BR /&gt;LINE_BREAKER=(&amp;lt;log_entry&amp;gt;)&lt;BR /&gt;NO_BINARY_CHECK=true&lt;BR /&gt;SHOULD_LINEMERGE=FALSE&lt;BR /&gt;TIME_FORMAT=%Y%m%d-%H:%M:%S&lt;BR /&gt;TIME_PREFIX=&amp;lt;log_time&amp;gt;&lt;BR /&gt;TRUNCATE=0&lt;BR /&gt;description=describing props config&lt;BR /&gt;disabled=false&lt;BR /&gt;pulldown_type=1&lt;BR /&gt;TZ=-05:00&lt;BR /&gt;&lt;BR /&gt;---------------------&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Sample xml log:&lt;/P&gt;
&lt;P&gt;&amp;lt;?xml version="1.0" encoding="utf-8" ?&amp;gt;&lt;BR /&gt;&amp;lt;log&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;log_entry&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;log_time&amp;gt;20230724-05:42:00&amp;lt;/log_time&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;description&amp;gt;some random data 1&amp;lt;/description&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/log_entry&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;log_entry&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;log_time&amp;gt;20230724-05:43:00&amp;lt;/log_time&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;description&amp;gt;some random data 2&amp;lt;/description&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/log_entry&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;log_entry&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;log_time&amp;gt;20230724-05:43:20&amp;lt;/log_time&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;lt;description&amp;gt;some random data 3&amp;lt;/description&amp;gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;lt;/log_entry&amp;gt;&lt;BR /&gt;&amp;lt;/log&amp;gt;&lt;BR /&gt;&lt;BR /&gt;This XML log file is constantly being appended to with new log_entry elements.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2023 16:58:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651813#M110695</guid>
      <dc:creator>Strangertinz</dc:creator>
      <dc:date>2023-07-25T16:58:10Z</dc:date>
    </item>
    <item>
      <title>Re: getting xml data into splunk consistently</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651815#M110696</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/256709"&gt;@Strangertinz&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;To break the events correctly, the LINE_BREAKER value would be&amp;nbsp;([\r\n]+)&amp;lt;\?xml.&amp;nbsp; The newlines matched by the regex capture group define the line break and are not ingested.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;So, something like this should work on the heavy forwarder or parsing tier.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[sample_xml_st]
SHOULD_LINEMERGE=false
LINE_BREAKER=([\r\n]+)&amp;lt;\?xml
TIME_PREFIX=&amp;lt;log_time&amp;gt;
TIME_FORMAT=%Y%m%d-%H:%M:%S
TRUNCATE=0&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;Hope this helps&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2023 02:11:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651815#M110696</guid>
      <dc:creator>yeahnah</dc:creator>
      <dc:date>2023-07-25T02:11:20Z</dc:date>
    </item>
    <item>
      <title>Re: getting xml data into splunk consistently</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651938#M110707</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/158935"&gt;@yeahnah&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am able to parse the data correctly; my issue is that Splunk receives the data only sporadically.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 25 Jul 2023 15:39:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651938#M110707</guid>
      <dc:creator>Strangertinz</dc:creator>
      <dc:date>2023-07-25T15:39:51Z</dc:date>
    </item>
    <item>
      <title>Re: getting xml data into splunk consistently</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651997#M110718</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/256709"&gt;@Strangertinz&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;Really!&amp;nbsp; It is being parsed correctly?&amp;nbsp; I have no idea how it could be, based on your sample data and the props.conf shown.&amp;nbsp; Using&amp;nbsp;&lt;SPAN&gt;LINE_BREAKER=(&amp;lt;log_entry&amp;gt;) with&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;SHOULD_LINEMERGE=FALSE would rip the&amp;nbsp;&amp;lt;log_entry&amp;gt; line out of the XML, which would break the XML structure.&amp;nbsp; This might explain why the data appears clumped, as timestamp extraction would only work on events that contain the log_time value.&amp;nbsp; Events without timestamps would have to fall back on other sources, such as the modification time of the source file.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;If you use the hidden _indextime metadata (you need to rename the field to see it, e.g. "rename _indextime AS indextime"), this will give you the time (in epoch seconds) the data was ingested by Splunk (written to the index), and you can check whether the event time and index time match or vary wildly.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Jul 2023 04:28:34 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Getting-xml-data-into-Splunk-consistently/m-p/651997#M110718</guid>
      <dc:creator>yeahnah</dc:creator>
      <dc:date>2023-07-26T04:28:34Z</dc:date>
    </item>
  </channel>
</rss>

