Getting Data In

Why are my Scripted Input XML events truncated to 4 lines?

jercra
Explorer

I have scripted input that's calling a simple rest service to get a list of messages. No matter what settings I put in props.conf, I can only see 4 lines of XML on the indexer.

The XML result looks like:

<messages_list>
 <audit>
    <code>30</code>
    <created_at>2017-03-20T13:54:31-06:00</created_at>
    <message>Node ip-172-31-28-26 activated</message>
    <id>4</id>
    <node>02</node>
  </audit>
  <audit>
    <code>31</code>
    <created_at>2017-03-20T13:50:08-06:00</created_at>
    <message>Node ip-172-31-28-26 is deactivated</message>
    <id>3</id>
    <node>02</node>
  </audit>
  <audit>
    <code>30</code>
    <created_at>2017-03-20T13:49:44-06:00</created_at>
    <message>Node ip-172-31-28-26 activated</message>
    <id>2</id>
    <node>02</node>
  </audit>
  <audit>
    <code>32</code>
    <created_at>2017-03-20T13:47:07-06:00</created_at>
    <message>Node ip-172-31-28-26 added to cluster</message>
    <id>1</id>
    <node>02</node>
  </audit>
</messages_list>

The only thing that shows up in Splunk is:

 <messages_list>
 <audit>
    <code>30</code>

I've set TRUNCATE to 1000000 and to 0
I've set a multitude of linebreaking (Before, after, linebreak, linemerge=true|false, etc.)

Nothing changes. It's always only 4 lines.
TCPDump shows the data making it to the indexer.

Current props.conf:
[live_messages]
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE=0
MAX_EVENTS = 10000
pulldown_type = 1
KV_MODE=xml
LINE_BREAKER = (<\/messages_list>)
DATETIME_CONFIG = CURRENT

0 Karma
1 Solution

somesoni2
Revered Legend

Pretty strange. Would you mind trying with this?

[live_messages]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\<messages_list\>)
TRUNCATE=0
MAX_EVENTS = 10000
KV_MODE=xml
DATETIME_CONFIG = CURRENT

I'm assuming live_messages is the sourcetype that you assign to your scripted input in inputs.conf.

View solution in original post

0 Karma

somesoni2
Revered Legend

Pretty strange. Would you mind trying with this?

[live_messages]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)(?=\<messages_list\>)
TRUNCATE=0
MAX_EVENTS = 10000
KV_MODE=xml
DATETIME_CONFIG = CURRENT

I'm assuming live_messages is the sourcetype that you assign to your scripted input in inputs.conf.

0 Karma

jercra
Explorer

Yeah, the sourcetype is specified in the app's inputs.conf.

Your change seems to have done the trick. Very odd. Thanks. Add it as an answer and I will mark it as such.

0 Karma

somesoni2
Revered Legend

Can you share your props.conf for this sourcetype? Where is this script runs, UF or full Splunk Enterprise instance?

0 Karma

jercra
Explorer

I've gone through many, many iterations of props.conf. My current iteration looks like:
[live_messages]
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TRUNCATE=0
MAX_EVENTS = 10000
pulldown_type = 1
KV_MODE=xml
LINE_BREAKER = (<\/messages_list>)
DATETIME_CONFIG = CURRENT

The script runs on a forwarder and is calling a webservice on localhost. It's a really simple service that just returns a single XML document.

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...