Question on Line Breaker for XML log

kkossery
Communicator

Hi Experts,

We are trying to ingest an XML log file into Splunk. It contains the following data:

2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<is_first_filling>false</is_first_filling>
<system_id>8323</system_id>
<updated_by>username@email.com</updated_by>
<created_by>username@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4913</soa_cable_id>
<accounting_period_id>1008</accounting_period_id>
<gross_recipts>8233</gross_recipts>
<contact_person_id>1095</contact_person_id>
<user_name>user@email.com</user_name>
<form_type_id>1079</form_type_id>
<form_type_value>short</form_type_value>
2016-05-26 10:14:58 | R.R.services.section_D.save.saveSection_D | [I] | <?xml version="1.0"?>
<community>
  <soa_city_id/>
  <soa_cable_id/>
  <is_first_community>true</is_first_community>
 ..
 ..

We would like to extract each block as its own event: the first block (saveSection_B_C), then the second block (saveSection_A_K_N), and so on. We think that line-breaking the file at these block boundaries would let us build a report in Splunk.
Let us know if this makes sense.


woodcock
Esteemed Legend

Try this:

[YourSourcetypeHere]
BREAK_ONLY_BEFORE_DATE = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S
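
To sanity-check the idea outside Splunk, here is a minimal Python sketch of the same behaviour: start a new event at every line that begins with a timestamp, then parse that timestamp with the format string above. This is only an offline approximation, not how Splunk implements line merging; the regex and the helper names `split_events` and `event_time` are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed shape of an event's first line: "YYYY-MM-DD HH:MM:SS | ..."
TIMESTAMP_AT_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \|")
# Same strptime directives as the TIME_FORMAT setting above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def split_events(text):
    """Group lines into events, starting a new event at each timestamped line."""
    events = []
    for line in text.splitlines():
        if TIMESTAMP_AT_START.match(line) or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return ["\n".join(lines) for lines in events]


def event_time(event):
    """Parse the leading 19-character timestamp of an event."""
    return datetime.strptime(event[:19], TIME_FORMAT)


# Shortened excerpt of the log posted in the question.
SAMPLE = """\
2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<created_by>username@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4913</soa_cable_id>
<user_name>user@email.com</user_name>
"""

events = split_events(SAMPLE)
for ev in events:
    print(event_time(ev), "-", len(ev.splitlines()), "lines")
```

Each saveSection block ends up as one event carrying its own timestamp, which is the result the stanza above is aiming for.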

kkossery
Communicator

Thanks! I have a variation on the question above, because we found that breaking on the timestamps further complicates extraction. Given the sample below, can we force Splunk to ignore the timestamps and select the whole block (start to end) as a single event? We would then run xmlkv | table soa_cable_id etc. to build a report.

2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<is_first_filling>false</is_first_filling>
<system_id>8323</system_id>
<updated_by>user@email.com</updated_by>
<created_by>user@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<accounting_period_id>1008</accounting_period_id>
<gross_recipts>8233</gross_recipts>
<contact_person_id>1095</contact_person_id>
<user_name>user@email.com</user_name>
<form_type_id>1079</form_type_id>
<form_type_value>short</form_type_value>
2016-05-26 10:14:58 | R.R.services.section_D.save.saveSection_D | [I] | <?xml version="1.0"?>
<community>
  <soa_city_id/>
  <soa_cable_id/>
  <is_first_community>true</is_first_community>
  <soa_city_name>vienna</soa_city_name>
  <county></county>
  <state_id>1001</state_id>
  <state>
    <state_id/>
    <state_name/>
    <country>
      <country_id/>
      <country_name/>
      <updated_by/>
      <updated_date/>
      <created_by/>
      <created_date/>
    </country>
    <country_id/>
    <updated_by/>
    <updated_date/>
    <created_by/>
    <created_date/>
  </state>
  <created_by/>
</community>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
2016-05-26 10:15:03 | R.R.services.section_E.save.saveSection_E | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
<secTransmissionTV>
  <res_svc_first_set_no_subscribers>8</res_svc_first_set_no_subscribers>
  <res_svc_first_set_no_rate>3243</res_svc_first_set_no_rate>
  <res_svc_addl_sets_no_subscribers></res_svc_addl_sets_no_subscribers>
  <res_svc_addl_sets_no_rate></res_svc_addl_sets_no_rate>
 <res_fm_no_subscribers></res_fm_no_subscribers>
  <res_fm_no_rate></res_fm_no_rate>
  <motel_hotel_subscribers></motel_hotel_subscribers>
  <motel_hotel_rate></motel_hotel_rate>
  <commercial_subscribers></commercial_subscribers>
  <commercial_rate></commercial_rate>
  <converter_subscribers/>
  <converter_rate/>
  <converter_resid_subscribers></converter_resid_subscribers>
  <converter_resid_rate></converter_resid_rate>
  <converter_non_resid_subscriber></converter_non_resid_subscriber>
  <converter_non_resid_rate></converter_non_resid_rate>
  <updated_by/>
  <updated_date/>
  <created_by/>
  <created_date/>
</secTransmissionTV>
2016-05-26 10:15:05 | R.R.services.section_F.save.saveSection_F | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
<otherSecTransmissionTV>
  <soa_cable_id/>
  <cont_svcs_pay_cable></cont_svcs_pay_cable>
  <cont_svcs_pay_cablepay_cable_addl_channel></cont_svcs_pay_cablepay_cable_addl_channel>
  <cont_svcs_fire_protection></cont_svcs_fire_protection>
  <cont_svcs_burglar_protection></cont_svcs_burglar_protection>
  <install_resi_first_set></install_resi_first_set>
  <install_resi_addl_set></install_resi_addl_set>
  <install_resi_fm_radio></install_resi_fm_radio>
  <install_resi_converter></install_resi_converter>
  <install_non_resi_motel></install_non_resi_motel>
  <install_non_resi_commercial></install_non_resi_commercial>
  <install_non_resi_pay_cable></install_non_resi_pay_cable>
  <install_non_resi_pay_cable_addl_channel></install_non_resi_pay_cable_addl_channel>
  <install_non_resi_fire_protection></install_non_resi_fire_protection>
  <install_non_resi_burglar_protection></install_non_resi_burglar_protection>
  <othr_svcs_reconnect></othr_svcs_reconnect>
  <othr_svcs_disconnect></othr_svcs_disconnect>
  <othr_svcs_outlet_relocation></othr_svcs_outlet_relocation>
  <othr_svcs_move_to_new_addr></othr_svcs_move_to_new_addr>
  <updated_by/>
  <updated_date/>
  <created_by/>
  <created_date/>
</otherSecTransmissionTV>
2016-05-26 10:15:19 | R.R.services.section_I.save.saveSection_I | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<user_name>user@email.com</user_name>
<non_network_substitute_channels>false</non_network_substitute_channels>
2016-05-26 10:15:21 | R.R.services.section_L.save.saveSection_L | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<user_name>user@email.com</user_name>
<royaltyFee>
  <soa_cable_id>4917</soa_cable_id>
  <accounting_period_id>1008</accounting_period_id>
  <gross_receipts>8233.00</gross_receipts>
  <fee>67.0</fee>
  <interest_amount>0.00</interest_amount>
  <filing_fee>15.00</filing_fee>
  <total_royalty_fee>67.00</total_royalty_fee>
  <form_type_id>1079</form_type_id>
  <is_partial_distant></is_partial_distant>
  <has_distant></has_distant>
  <submitted_date/>
  <shortform>
    <block1_1>52.00</block1_1>
    <block1_2/>
    <block2_1/>
    <block2_3/>
    <block2_5/>
    <block2_6/>
    <block2_7/>
    <block2_8/>
    <block2_9/>
    <block3_2/>
    <block3_3/>
    <block3_4/>
    <block3_5/>
    <block3_6/>
    <block3_7/>
  </shortform>
  <longform/>
  <interestAssesment>
    <late_payment_amount>52.00</late_payment_amount>
    <interest_rate></interest_rate>
    <no_of_days_late>0</no_of_days_late>
    <sectionQ_line2>0.00</sectionQ_line2>
    <sectionQ_line3>0.00</sectionQ_line3>
    <interest_amount>0.00</interest_amount>
  </interestAssesment>
</royaltyFee>
2016-05-26 10:15:25 | R.R.services.section_M.save.saveSection_M | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<total_number_of_channels>1</total_number_of_channels>
<activated_channel_count>2</activated_channel_count>
<user_name>user@email.com</user_name>
2016-05-26 10:15:25 | R.R.services.section_M.save.saveSection_M | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<total_number_of_channels>1</total_number_of_channels>
<activated_channel_count>2</activated_channel_count>
<user_name>user@email.com</user_name>
2016-05-26 10:15:27 | R.R.services.section_P.save.saveSection_P | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<is_secondary_transmission_satellite>false</is_secondary_transmission_satellite>
<satellite_gross_recipts></satellite_gross_recipts>
<user_name>user@email.com</user_name>
2016-05-26 10:15:34 | R.R.services.section_O.save.saveSection_O | [I] | <?xml version="1.0"?>
<certified>
  <certified_by_type_id>1041</certified_by_type_id>
  <user_id>1095</user_id>
  <user_name>user@email.com</user_name>
  <soa_cable_id>4917</soa_cable_id>
</certified>

woodcock
Esteemed Legend

Whatever you can do with the merged block can surely also be done with properly line-broken events. If this answer breaks the lines properly, let's work from there.


kkossery
Communicator

Thanks for your response. In your opinion, what is the best way to break this XML file so that we can construct a table with the following columns?

[Date]              [SOA_Cable_ID]            [Created_by]

The problem is that the XML contains multiple tags for soa_cable_id, dates, etc., so we cannot work out how to build the report.


woodcock
Esteemed Legend

If the events are properly line-broken, then you can get the first value with SOA_Cable_ID=mvindex(SOA_Cable_ID,0), and likewise for the other fields.
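
As a rough illustration of the "take the first value" idea, here is a small Python sketch that collects every value per tag within one event and then picks index 0, which is what mvindex(field,0) does for a multivalue field. The regex-based extraction is only a stand-in for what xmlkv would produce, and the helper names `fields_of` and `first` are hypothetical.

```python
import re

# Matches simple leaf elements such as <soa_cable_id>4917</soa_cable_id>.
# Empty tags like <soa_cable_id/> are deliberately not matched.
LEAF = re.compile(r"<(\w+)>([^<]*)</\1>")


def fields_of(event):
    """Collect every non-empty leaf value per tag, in order of appearance."""
    fields = {}
    for tag, value in LEAF.findall(event):
        if value:
            fields.setdefault(tag, []).append(value)
    return fields


def first(fields, name):
    """Return the first value of a field (like mvindex(field, 0)), or None."""
    values = fields.get(name)
    return values[0] if values else None


# Shortened excerpt of the saveSection_L block posted above.
EVENT = """\
2016-05-26 10:15:21 | R.R.services.section_L.save.saveSection_L | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<user_name>user@email.com</user_name>
<royaltyFee>
  <soa_cable_id>4917</soa_cable_id>
  <interest_amount>0.00</interest_amount>
  <interestAssesment>
    <interest_amount>0.00</interest_amount>
  </interestAssesment>
</royaltyFee>
"""

fields = fields_of(EVENT)
# One table row: [Date] [SOA_Cable_ID] [Created_by]
row = (EVENT[:19], first(fields, "soa_cable_id"), first(fields, "created_by"))
print(row)
```

Note that this particular block has no populated created_by tag, so that column comes back empty for it; not every block carries every field the table asks for.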


alemarzu
Motivator

Passing a value that never matches to the TIME_PREFIX parameter could work: Splunk won't be able to find the timestamp, so you get one big event stamped with the current time.

With the given sample data, this worked for me:

[<SOURCETYPE NAME>]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
disabled=false
TIME_PREFIX=asdasd

Hope it helps.


kkossery
Communicator

Thanks! This is a good idea. I will follow this if nothing else works.


somesoni2
Revered Legend

Not sure I understand the requirement completely. Do you want the saveSection_B_C and saveSection_A_K_N blocks to appear as one event?


kkossery
Communicator

Thank you for your response. We are now looking at selecting the whole block of XML code instead of breaking it by time stamp. See my response below to woodcock. Thank you again.
