I'm using Splunk to index an XML file. Is there a way to have Splunk automatically extract the key-value pairs for each event (as xmlkv does) for every search? I don't want users to have to type | xmlkv in the search bar each time. I see that props.conf has a KV_MODE setting, but none of its documented values indicate XML extraction.

Edited for version 4.3:
As of version 4.3, while the original answer below still works, you can also use this props.conf setting:
KV_MODE = xml
This performs spath-style extraction on the events.
Maybe. As it turns out, the xmlkv command does not do real XML parsing; it just applies a regular expression, which Splunk configuration can probably handle better than the xmlkv command itself. (See $SPLUNK_HOME/etc/apps/search/bin/xmlkv.py.)
Just define a search-time extraction for your sourcetype (or source or whatever) in props.conf:
[mysourcetype]
REPORT-xmlkv = xmlkv-alternative
and in transforms.conf:
[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
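To see what this transform captures, the same regular expression can be tried outside Splunk. The following is only a minimal Python sketch, not Splunk's own implementation: the function name extract_pairs and the small sample event are made up for illustration, and it assumes Python's re module treats this pattern the same way Splunk's regex engine does.

```python
import re

# Same pattern as the REGEX line of the xmlkv-alternative stanza.
XMLKV_REGEX = re.compile(r"<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>")

# Small illustrative event (hypothetical, nagios-style XML).
SAMPLE_EVENT = """<nagios>
<info>
    <created>1299121157</created>
    <version>3.2.1</version>
    <update_available>1</update_available>
</info>
<programstatus>
    <nagios_pid>15961</nagios_pid>
    <global_host_event_handler></global_host_event_handler>
</programstatus>
</nagios>"""


def extract_pairs(event):
    """Return (name, value) pairs, mirroring FORMAT = $1::$2."""
    return [(m.group(1), m.group(2)) for m in XMLKV_REGEX.finditer(event)]


if __name__ == "__main__":
    for name, value in extract_pairs(SAMPLE_EVENT):
        print(f"{name}={value!r}")
```

Only leaf elements produce a pair. A container such as <info> is skipped because the value group [^<]* cannot span the nested tags inside it.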
Try this:
LINE_BREAKER = ([\r\n]{2})
hey gkanapathy 🙂
I used your mad skillz regex in my transforms.conf but it negates the line breaker in my props.conf 😞
Any ideas on how to ensure the line breaker still works in this example?
props.conf:
[nagiosstatus]
MAX_EVENTS = 500000
TIME_PREFIX = \<created\>
MAX_TIMESTAMP_LOOKAHEAD = 500
SHOULD_LINEMERGE = false
LINE_BREAKER = (\n\n)
REPORT-xmlkv = xmlkv-alternative
transforms.conf:
[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
Sample XML log:
<nagios>
<info>
    <created>1299121157</created>
    <version>3.2.1</version>
    <last_update_check>1299108670</last_update_check>
    <update_available>1</update_available>
    <last_version>3.2.1</last_version>
    <new_version>3.2.3</new_version>
</info>
<programstatus>
    <modified_host_attributes>1</modified_host_attributes>
    <modified_service_attributes>1</modified_service_attributes>
    <nagios_pid>15961</nagios_pid>
    <daemon_mode>1</daemon_mode>
    <program_start>1299103468</program_start>
    <last_command_check>1299121108</last_command_check>
    <last_log_rotation>0</last_log_rotation>
    <enable_notifications>1</enable_notifications>
    <active_service_checks_enabled>1</active_service_checks_enabled>
    <passive_service_checks_enabled>1</passive_service_checks_enabled>
    <active_host_checks_enabled>1</active_host_checks_enabled>
    <passive_host_checks_enabled>1</passive_host_checks_enabled>
    <enable_event_handlers>1</enable_event_handlers>
    <obsess_over_services>0</obsess_over_services>
    <obsess_over_hosts>0</obsess_over_hosts>
    <check_service_freshness>1</check_service_freshness>
    <check_host_freshness>0</check_host_freshness>
    <enable_flap_detection>0</enable_flap_detection>
    <enable_failure_prediction>1</enable_failure_prediction>
    <process_performance_data>1</process_performance_data>
    <global_host_event_handler></global_host_event_handler>
    <global_service_event_handler></global_service_event_handler>
    <next_comment_id>94586</next_comment_id>
    <next_downtime_id>35813</next_downtime_id>
    <next_event_id>1185528</next_event_id>
    <next_problem_id>532761</next_problem_id>
    <next_notification_id>1337020</next_notification_id>
    <total_external_command_buffer_slots>4096</total_external_command_buffer_slots>
    <used_external_command_buffer_slots>11</used_external_command_buffer_slots>
    <high_external_command_buffer_slots>128</high_external_command_buffer_slots>
    <active_scheduled_host_check_stats>21,132,401</active_scheduled_host_check_stats>
    <active_ondemand_host_check_stats>33,278,834</active_ondemand_host_check_stats>
    <passive_host_check_stats>0,0,0</passive_host_check_stats>
</programstatus>
</nagios>
Thanks in advance,
Luke 🙂

This answer is still helpful 12 years later. Thanks, @gkanapathy!

As of version 4.3, you can use this setting in props.conf:
KV_MODE = xml
It performs spath-style extraction.

Very Nice 🙂
Worked perfectly! Thanks!
Nice trick. You could also add MV_ADD = True to your xmlkv-alternative stanza if you want to capture repeating XML elements as a multi-value field, for example if your XML represents a list of items. This is something that you can't do with the default xmlkv command. Pretty cool.
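The effect of MV_ADD on repeated elements can be pictured with a small Python sketch. It only imitates the idea of appending every match to a multi-value field and is not how Splunk implements it; the function name extract_multivalue, the item element, and the sample event are invented for illustration.

```python
import re
from collections import defaultdict

# Same pattern as the REGEX line of the xmlkv-alternative stanza.
XMLKV_REGEX = re.compile(r"<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>")


def extract_multivalue(event):
    """Collect every value per element name, like a multi-value field."""
    fields = defaultdict(list)
    for match in XMLKV_REGEX.finditer(event):
        fields[match.group(1)].append(match.group(2))
    return dict(fields)


# Hypothetical event with a repeating <item> element.
event = "<order><id>42</id><item>apple</item><item>pear</item></order>"
print(extract_multivalue(event))
```

Here item ends up holding both values, which is the behaviour MV_ADD is meant to provide for repeated matches.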
