Getting Data In

Why are the events being cut off at 257 lines in XML data?

mwcooley
Explorer

Hi,

I have XML data that can run to 500+ lines, but Splunk is truncating events at 257 lines. I've been trying combinations of LINE_BREAKER and BREAK_ONLY_BEFORE, but no luck. I'm not sure if it's my regex, my config files, or something else.

thanks,
mike

I defined the stanza in inputs.conf:
[monitor:///app/freeswitch/cdrs/*.xml]
sourcetype = conf_cdr_xml

Here's my props.conf:
[conf_cdr_xml]
KV_MODE = xml
SHOULD_LINEMERGE = false
BREAK_ONLY_BEFORE = \<\/cdr\>
MAX_EVENTS = 100000
TRUNCATE=100000
NO_BINARY_CHECK = true
pulldown_type = true

And here is an example event:

    <cdr>
      <conference>
        <name>5551231234-1234567</name>
        <hostname>test@test.net</hostname>
        <rate>8000</rate>
        <interval>20</interval>
        <start_time type="UNIX-epoch">1521040386</start_time>
        <end_time endconference_forced="false" type="UNIX-epoch">1521040388</end_time>
        <members>
          <member type="caller">
            <join_time type="UNIX-epoch">1521040386</join_time>
            <leave_time type="UNIX-epoch">1521040388</leave_time>
            <flags>
              <is_moderator>true</is_moderator>
              <end_conference>true</end_conference>
              <was_kicked>false</was_kicked>
              <is_ghost>false</is_ghost>
            </flags>
            <caller_profile>
              <username>5553214321</username>
              <dialplan>XML</dialplan>
              <caller_id_name>DEMO SITE</caller_id_name>
              <caller_id_number>5553214321</caller_id_number>
              <callee_id_name></callee_id_name>
              <callee_id_number></callee_id_number>
              <ani>5553214321</ani>
              <aniii></aniii>
              <network_addr>10.1.1.165</network_addr>
              <rdnis></rdnis>
              <destination_number>5551231234;conf=555;mod;tone=NO_SOUNDS</destination_number>
              <uuid>2dccfdde-279a-11e8-99a6-5903ab961f76</uuid>
              <source>mod_sofia</source>
              <context>public</context>
              <chan_name>sofia/internal/5553214321@10.1.1.125</chan_name>
            </caller_profile>
          </member>
          <member type="caller">
            <join_time type="UNIX-epoch">1521040386</join_time>
            <leave_time type="UNIX-epoch">1521040388</leave_time>
            <flags>
              <is_moderator>true</is_moderator>
              <end_conference>true</end_conference>
              <was_kicked>false</was_kicked>
              <is_ghost>false</is_ghost>
            </flags>
            <caller_profile>
              <username>5553214321</username>
              <dialplan>XML</dialplan>
              <caller_id_name>DEMO SITE</caller_id_name>
              <caller_id_number>5553214321</caller_id_number>
              <callee_id_name></callee_id_name>
              <callee_id_number></callee_id_number>
              <ani>5553214321</ani>
              <aniii></aniii>
              <network_addr>10.1.1.165</network_addr>
              <rdnis></rdnis>
              <destination_number>5551231234;conf=555;mod;tone=NO_SOUNDS</destination_number>
              <uuid>2dccfdde-279a-11e8-99a6-5903ab961f76</uuid>
              <source>mod_sofia</source>
              <context>public</context>
              <chan_name>sofia/internal/5553214321@10.1.1.125</chan_name>
            </caller_profile>
          </member>
        </members>
        <rejected></rejected>
      </conference>
    </cdr>
1 Solution

tiagofbmm
Influencer

The Universal Forwarder doesn't have those parsing capabilities at all, so what leaves your UF is just blocks of raw ("uncooked") data, not parsed events.

You must put these props configurations on a full Splunk instance, either your Heavy Forwarder or your Indexer.

Remember that data only goes through the parsing pipeline once, so I think the HF is the solution here.
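For reference, a minimal sketch of how the pieces might be split. The paths and settings mirror the question; the assumption is that the UF only monitors the files, while all line-breaking and parsing settings live on the HF (or indexer):

```ini
# inputs.conf — stays on the Universal Forwarder
[monitor:///app/freeswitch/cdrs/*.xml]
sourcetype = conf_cdr_xml

# props.conf — goes on the Heavy Forwarder (or the indexer, if the HF
# only passes data through); this is where parsing settings take effect
[conf_cdr_xml]
SHOULD_LINEMERGE = false
TRUNCATE = 100000
KV_MODE = xml
```

The key point is that a props.conf placed on the UF is simply ignored for these settings, which is why the original config appeared to do nothing.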


keishamtcs
Engager

Hi Mike,

I am also facing the same issue. Were you able to fix yours?

Regards


mwcooley
Explorer

Hi @keishamtcs ,

I added the props.conf file to the indexer as suggested in the answer by @tiagofbmm. Well, he actually suggested the heavy forwarder, but that's all controlled at the corporate level, so they chose to add it to the indexer instead.


niketn
Legend

@mwcooley, with SHOULD_LINEMERGE turned on (i.e. true), the event break at 257 lines is not an indication of hitting MAX_EVENTS. Rather, it indicates that Splunk is unable to identify a timestamp in the lines it has parsed. Since the timestamp in your log is a Unix epoch timestamp, you should add timestamp-extraction configs to props.conf, i.e.

TIME_FORMAT=%s
TIME_PREFIX=\<start_time type=\"UNIX-epoch\"\>
MAX_TIMESTAMP_LOOKAHEAD=10

Further, try the following LINE_BREAKER:

LINE_BREAKER=[\>\s]((?=\<cdr\>))

These configs are also required (note that your posted props.conf currently sets SHOULD_LINEMERGE = false):

SHOULD_LINEMERGE=true
KV_MODE = xml

Please try this out and confirm.
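Putting these suggestions together with the stanza from the question, the combined props.conf might look like the sketch below. The LINE_BREAKER and TIME_PREFIX patterns are taken verbatim from the advice above, not independently verified here:

```ini
[conf_cdr_xml]
SHOULD_LINEMERGE = true
KV_MODE = xml
MAX_EVENTS = 100000
TRUNCATE = 100000
# Timestamp extraction: <start_time> holds a Unix epoch value,
# so %s with a 10-character lookahead covers it
TIME_PREFIX = \<start_time type=\"UNIX-epoch\"\>
TIME_FORMAT = %s
MAX_TIMESTAMP_LOOKAHEAD = 10
# Break the raw stream just before each opening <cdr> tag
LINE_BREAKER = [\>\s]((?=\<cdr\>))
```

With the timestamp configs in place, Splunk no longer has to guess a timestamp per merged line, which is what the 257-line cutoff was symptomatic of.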

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"


mwcooley
Explorer

Thanks, @tiagofbmm. Looks like you are correct. I've opened a ticket with the Splunk group and will update once I've confirmed.


mwcooley
Explorer

Yep, just as you stated. Thanks again.


tiagofbmm
Influencer

You're welcome.


cmerriman
Super Champion
[conf_cdr_xml]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
BREAK_ONLY_BEFORE=\<\/cdr\>
MAX_EVENTS=100000
disabled=false

I think something like this should work.


mwcooley
Explorer

Hi,

Didn't work. I've been reading more, and now I'm wondering if my events are being truncated further down the pipeline. I'm doing all this work on a universal forwarder; the next hop is a heavy forwarder, then the indexer. Could one of those be truncating my events?


mwcooley
Explorer

And, in case you're wondering why I don't just check there: it's a corporate instance of Splunk, and I have access to neither the HF nor the indexer. I did reach out to that group as well.


mwcooley
Explorer

After a bit more reading, I think I should be using LINE_BREAKER instead of BREAK_ONLY_BEFORE. I tried substituting LINE_BREAKER for BREAK_ONLY_BEFORE in the props.conf example above. Still no luck.
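One detail worth noting when swapping the settings: unlike BREAK_ONLY_BEFORE, LINE_BREAKER must contain a capturing group (Splunk breaks the stream at the group and discards whatever the group matched), so renaming the setting while keeping the `\<\/cdr\>` pattern would not produce event breaks. A sketch of the usual shape, assuming each event should start at an opening <cdr> tag:

```ini
[conf_cdr_xml]
SHOULD_LINEMERGE = false
# The capturing group is required: the matched whitespace is discarded,
# and each new event begins at the <cdr> that follows it
LINE_BREAKER = (\s+)<cdr>
```

Even with a correct pattern, this only helps if the stanza is read by an instance that actually runs the parsing pipeline, as the accepted answer explains.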
