Hi Experts,
We are trying to ingest an XML log file into Splunk. It contains the following data:
2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<is_first_filling>false</is_first_filling>
<system_id>8323</system_id>
<updated_by>username@email.com</updated_by>
<created_by>username@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4913</soa_cable_id>
<accounting_period_id>1008</accounting_period_id>
<gross_recipts>8233</gross_recipts>
<contact_person_id>1095</contact_person_id>
<user_name>user@email.com</user_name>
<form_type_id>1079</form_type_id>
<form_type_value>short</form_type_value>
2016-05-26 10:14:58 | R.R.services.section_D.save.saveSection_D | [I] | <?xml version="1.0"?>
<community>
<soa_city_id/>
<soa_cable_id/>
<is_first_community>true</is_first_community>
..
..
We would like to extract the first block (saveSection_B_C), then the second block (saveSection_A_K_N), and so on. We think that line breaking at the start of each of these blocks will give us a way to build a report in Splunk.
Let us know if this makes sense.
Try this:
[YourSourcetypeHere]
BREAK_ONLY_BEFORE_DATE = true
SHOULD_LINEMERGE = true
TIME_FORMAT = %Y-%m-%d %H:%M:%S
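To make the intent of these settings concrete, here is a minimal Python sketch of the idea: start a new event whenever a line begins with a timestamp, and attach every other line to the current event. It is only an illustration outside Splunk, not how Splunk implements line merging. The helper names (`starts_with_timestamp`, `split_events`) are made up for this example, and the sample is a shortened copy of the data in the question.

```python
from datetime import datetime

# strptime-style format of the timestamp at the start of each block
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def starts_with_timestamp(line):
    """Return the parsed datetime if the line begins with a timestamp, else None."""
    try:
        # "2016-05-26 10:14:37" is 19 characters long
        return datetime.strptime(line[:19], TIME_FORMAT)
    except ValueError:
        return None


def split_events(text):
    """Group lines into events, starting a new event at each timestamped line."""
    events = []
    for line in text.splitlines():
        if starts_with_timestamp(line) is not None or not events:
            events.append([line])
        else:
            events[-1].append(line)
    return ["\n".join(lines) for lines in events]


sample = """2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<created_by>username@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4913</soa_cable_id>
<user_name>user@email.com</user_name>"""

for event in split_events(sample):
    lines = event.splitlines()
    print(lines[0].split(" | ")[1], "->", len(lines), "lines")
```

Each saveSection block ends up as its own event, which is the shape the later field extraction needs.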
Thanks! I have a variation on the above question, because we found that breaking on timestamps further complicates the extraction. For the data below, can we force Splunk to ignore the timestamps and select the whole block (start to end)? We would then run xmlkv | table soa_cable_id etc. to make a report.
2016-05-26 10:14:37 | R.R.services.section_B_C.save.saveSection_B_C | [I] | <?xml version="1.0"?>
<subsidary_id>1014</subsidary_id>
<is_first_filling>false</is_first_filling>
<system_id>8323</system_id>
<updated_by>user@email.com</updated_by>
<created_by>user@email.com</created_by>
2016-05-26 10:14:49 | R.R.services.section_A_K_N.save.saveSection_A_K_N | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<accounting_period_id>1008</accounting_period_id>
<gross_recipts>8233</gross_recipts>
<contact_person_id>1095</contact_person_id>
<user_name>user@email.com</user_name>
<form_type_id>1079</form_type_id>
<form_type_value>short</form_type_value>
2016-05-26 10:14:58 | R.R.services.section_D.save.saveSection_D | [I] | <?xml version="1.0"?>
<community>
<soa_city_id/>
<soa_cable_id/>
<is_first_community>true</is_first_community>
<soa_city_name>vienna</soa_city_name>
<county></county>
<state_id>1001</state_id>
<state>
<state_id/>
<state_name/>
<country>
<country_id/>
<country_name/>
<updated_by/>
<updated_date/>
<created_by/>
<created_date/>
</country>
<country_id/>
<updated_by/>
<updated_date/>
<created_by/>
<created_date/>
</state>
<created_by/>
</community>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
2016-05-26 10:15:03 | R.R.services.section_E.save.saveSection_E | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
<secTransmissionTV>
<res_svc_first_set_no_subscribers>8</res_svc_first_set_no_subscribers>
<res_svc_first_set_no_rate>3243</res_svc_first_set_no_rate>
<res_svc_addl_sets_no_subscribers></res_svc_addl_sets_no_subscribers>
<res_svc_addl_sets_no_rate></res_svc_addl_sets_no_rate>
<res_fm_no_subscribers></res_fm_no_subscribers>
<res_fm_no_rate></res_fm_no_rate>
<motel_hotel_subscribers></motel_hotel_subscribers>
<motel_hotel_rate></motel_hotel_rate>
<commercial_subscribers></commercial_subscribers>
<commercial_rate></commercial_rate>
<converter_subscribers/>
<converter_rate/>
<converter_resid_subscribers></converter_resid_subscribers>
<converter_resid_rate></converter_resid_rate>
<converter_non_resid_subscriber></converter_non_resid_subscriber>
<converter_non_resid_rate></converter_non_resid_rate>
<updated_by/>
<updated_date/>
<created_by/>
<created_date/>
</secTransmissionTV>
2016-05-26 10:15:05 | R.R.services.section_F.save.saveSection_F | [I] | <?xml version="1.0"?>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>
<otherSecTransmissionTV>
<soa_cable_id/>
<cont_svcs_pay_cable></cont_svcs_pay_cable>
<cont_svcs_pay_cablepay_cable_addl_channel></cont_svcs_pay_cablepay_cable_addl_channel>
<cont_svcs_fire_protection></cont_svcs_fire_protection>
<cont_svcs_burglar_protection></cont_svcs_burglar_protection>
<install_resi_first_set></install_resi_first_set>
<install_resi_addl_set></install_resi_addl_set>
<install_resi_fm_radio></install_resi_fm_radio>
<install_resi_converter></install_resi_converter>
<install_non_resi_motel></install_non_resi_motel>
<install_non_resi_commercial></install_non_resi_commercial>
<install_non_resi_pay_cable></install_non_resi_pay_cable>
<install_non_resi_pay_cable_addl_channel></install_non_resi_pay_cable_addl_channel>
<install_non_resi_fire_protection></install_non_resi_fire_protection>
<install_non_resi_burglar_protection></install_non_resi_burglar_protection>
<othr_svcs_reconnect></othr_svcs_reconnect>
<othr_svcs_disconnect></othr_svcs_disconnect>
<othr_svcs_outlet_relocation></othr_svcs_outlet_relocation>
<othr_svcs_move_to_new_addr></othr_svcs_move_to_new_addr>
<updated_by/>
<updated_date/>
<created_by/>
<created_date/>
</otherSecTransmissionTV>
2016-05-26 10:15:19 | R.R.services.section_I.save.saveSection_I | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<user_name>user@email.com</user_name>
<non_network_substitute_channels>false</non_network_substitute_channels>
2016-05-26 10:15:21 | R.R.services.section_L.save.saveSection_L | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<user_name>user@email.com</user_name>
<royaltyFee>
<soa_cable_id>4917</soa_cable_id>
<accounting_period_id>1008</accounting_period_id>
<gross_receipts>8233.00</gross_receipts>
<fee>67.0</fee>
<interest_amount>0.00</interest_amount>
<filing_fee>15.00</filing_fee>
<total_royalty_fee>67.00</total_royalty_fee>
<form_type_id>1079</form_type_id>
<is_partial_distant></is_partial_distant>
<has_distant></has_distant>
<submitted_date/>
<shortform>
<block1_1>52.00</block1_1>
<block1_2/>
<block2_1/>
<block2_3/>
<block2_5/>
<block2_6/>
<block2_7/>
<block2_8/>
<block2_9/>
<block3_2/>
<block3_3/>
<block3_4/>
<block3_5/>
<block3_6/>
<block3_7/>
</shortform>
<longform/>
<interestAssesment>
<late_payment_amount>52.00</late_payment_amount>
<interest_rate></interest_rate>
<no_of_days_late>0</no_of_days_late>
<sectionQ_line2>0.00</sectionQ_line2>
<sectionQ_line3>0.00</sectionQ_line3>
<interest_amount>0.00</interest_amount>
</interestAssesment>
</royaltyFee>
2016-05-26 10:15:25 | R.R.services.section_M.save.saveSection_M | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<total_number_of_channels>1</total_number_of_channels>
<activated_channel_count>2</activated_channel_count>
<user_name>user@email.com</user_name>
2016-05-26 10:15:25 | R.R.services.section_M.save.saveSection_M | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<total_number_of_channels>1</total_number_of_channels>
<activated_channel_count>2</activated_channel_count>
<user_name>user@email.com</user_name>
2016-05-26 10:15:27 | R.R.services.section_P.save.saveSection_P | [I] | <?xml version="1.0"?>
<soa_user_id>4917</soa_user_id>
<is_secondary_transmission_satellite>false</is_secondary_transmission_satellite>
<satellite_gross_recipts></satellite_gross_recipts>
<user_name>user@email.com</user_name>
2016-05-26 10:15:34 | R.R.services.section_O.save.saveSection_O | [I] | <?xml version="1.0"?>
<certified>
<certified_by_type_id>1041</certified_by_type_id>
<user_id>1095</user_id>
<user_name>user@email.com</user_name>
<soa_cable_id>4917</soa_cable_id>
</certified>
Whatever you are doing with merged events can surely also be done with properly line-broken events. If this answer breaks the lines properly, let's work from there.
Thanks for your response. In your opinion what is the best way to break this XML file so that we can construct a table with
[Date] [SOA_Cable_ID] [Created_by]
The problem with the XML is that there are multiple tags for soa_cable_id, dates, etc., so we cannot work out how to create the report.
If the events are properly line-broken, then you can get the first value with SOA_Cable_ID=mvindex(SOA_Cable_ID,0), and likewise for the other fields.
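As a rough illustration of the "take the first value" idea, the Python sketch below builds one [Date] [SOA_Cable_ID] [Created_by] row from a single line-broken event. It runs outside Splunk and makes its own assumptions: self-closing and empty tags are skipped, so "first" here means the first non-empty value, and the helper names (`first_values`, `table_row`) are hypothetical. The sample is a shortened copy of the saveSection_D block from the question.

```python
import re

# Matches <tag>value</tag>; self-closing tags such as <soa_cable_id/> do not match
PAIR = re.compile(r"<(\w+)>([^<]+)</\1>")


def first_values(event):
    """Map each tag to the first non-empty value it has in the event."""
    values = {}
    for tag, value in PAIR.findall(event):
        if value.strip():
            # setdefault keeps the first value and ignores later repeats
            values.setdefault(tag, value.strip())
    return values


def table_row(event):
    """Return (date, soa_cable_id, created_by) for one event."""
    header = event.splitlines()[0]
    date = header.split(" | ")[0]
    fields = first_values(event)
    return (date, fields.get("soa_cable_id"), fields.get("created_by"))


sample_event = """2016-05-26 10:14:58 | R.R.services.section_D.save.saveSection_D | [I] | <?xml version="1.0"?>
<community>
<soa_city_id/>
<soa_cable_id/>
<is_first_community>true</is_first_community>
<created_by/>
</community>
<soa_cable_id>4917</soa_cable_id>
<user_name>user@email.com</user_name>"""

print(table_row(sample_event))
```

In this block created_by only appears as an empty tag, so that column comes back as None. Blocks like this one would need a fallback such as user_name if the report must show who saved the section.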
Passing a bogus value to the TIME_PREFIX parameter could work: Splunk won't be able to find the timestamp, so you get one big event with the current timestamp.
With the given sample data, this worked for me.
[<SOURCETYPE NAME>]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
disabled=false
TIME_PREFIX=asdasd
Hope it helps.
Thanks! This is a good idea. I will follow this if nothing else works.
I'm not sure I understand the requirement completely. Do you want the blocks for saveSection_B_C and saveSection_A_K_N to appear as one event?
Thank you for your response. We are now looking at selecting the whole block of XML code instead of breaking it by time stamp. See my response below to woodcock. Thank you again.