Dashboards & Visualizations

Why does Splunk truncate logs with xml content?

Explorer

Hello,

I am trying to index events like the one below (containing XML content); however, Splunk truncates them:

2014-06-30 19:10:35,637 INFO  [RandomApp] (pool-11-thread-6 - ajp-0.0.0.0-12209-2 E9944F6B7A094983782B7D2CF1FA2050.node1876 POST /sahok/randor.jsp 5623432387 L CA_EN) DEBUG response XML = <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<message>
    <payload>
          <placeOrderResponse requestId="23434-ou234-983483-p234i34o" id="93993493">
            <order merchantOrderId="299209344" FKOrderId="E4X2309098340338">
                <frDisposition>Orange</frDisposition>
            </order>
        </placeOrderResponse>
    </payload>
</message>

For some reason Splunk randomly ignores everything after the opening <payload> tag, e.g.:

2014-06-30 19:10:35,637 INFO  [RandomApp] (pool-11-thread-6 - ajp-0.0.0.0-12209-2 E9944F6B7A094983782B7D2CF1FA2050.node1876 POST /sahok/randor.jsp 5623432387 L CA_EN) DEBUG response XML = <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<message>
    <payload>

Could someone please help me resolve this issue? The source file also contains other types of events, but those are parsed by Splunk without any problems.

We are using Splunk version 4.2.3.

Thanks

1 Solution

SplunkTrust

Consider setting these in props.conf:

[your_sourcetype]
MAX_TIMESTAMP_LOOKAHEAD=25
NO_BINARY_CHECK=1
TIME_FORMAT=%Y-%m-%d %H:%M:%S,%3N
TIME_PREFIX=^

That'll teach Splunk where your timestamp sits and what format it's in, preventing other timestamp lookalikes from being picked up. It should speed things up a little as well 🙂
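As a quick sanity check that the format string matches the sample event, here is a small Python sketch. Python's %f (which accepts the three millisecond digits) stands in for Splunk's %3N; the input is the 23-character timestamp prefix of the log line above, which separates seconds from milliseconds with a comma:

```python
from datetime import datetime

# Timestamp prefix copied from the sample event (first 23 characters)
raw = "2014-06-30 19:10:35,637"

# %f is Python's stand-in for Splunk's %3N; the rest mirrors TIME_FORMAT
parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S,%f")
print(parsed.isoformat())  # 2014-06-30T19:10:35.637000
```

MAX_TIMESTAMP_LOOKAHEAD=25 comfortably covers those 23 characters, so Splunk stops looking for timestamps right after the prefix.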


Explorer

Thanks Martin, this is very useful. We don't have anything older than 1998 (the sample above is a dummy sample), but I will look into whether this is the case and update my findings here. I tried indexing the same data with Splunk 6 in a sandbox, and it works seamlessly without this issue; I guess the Splunk team may have fixed this in the latest versions.



SplunkTrust

Interpreted as a Unix epoch timestamp (seconds since 1970-01-01), 93993493 comes out to 1972-12-23. Depending on Splunk's automatic timestamp discovery, this may be interpreted as a timestamp, causing a new event to start there. I've just checked: apparently 900000000 is the lowest value Splunk will find automatically, which works out to 1998... do you have IDs that large in your data?
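The epoch arithmetic can be checked with a short Python sketch, using the two values from the posts above:

```python
from datetime import datetime, timezone

# id="93993493" read as seconds since the Unix epoch (1970-01-01 UTC)
order_id_as_date = datetime.fromtimestamp(93993493, tz=timezone.utc).date()
print(order_id_as_date)  # 1972-12-23

# the apparent lower bound for Splunk's automatic epoch discovery
cutoff_as_date = datetime.fromtimestamp(900000000, tz=timezone.utc).date()
print(cutoff_as_date)    # 1998-07-09
```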

If you don't have sourcetype-specific configuration in props.conf for these events, Splunk will use its default settings and look for every timestamp it can possibly find. Your merchantOrderId looks suspicious too...


Explorer

Thanks Martin for your quick response; we do not have an entry for this sourcetype in our props.conf. One more quick question: how did you calculate that Splunk would identify id="93993493" as 1972?


SplunkTrust

The attribute values in the <placeOrderResponse> tag look awfully similar to timestamps; my guess is Splunk saw a timestamp and started a new event there. Look for events outside your expected time range to confirm. That particular id="93993493" would be 1972, so Splunk should log a message about an event falling outside the acceptable time window for newly indexed events; check the _internal index for those.

What does your props.conf look like for that sourcetype?