Splunk XML Parsing

craigwilkinson
Path Finder

Hey Guys,

Can someone help me with line breaking and field extraction for XML output from the Splunk REST API, which is fed back into Splunk?
I'd like to split the events on <entry>. Any ideas or help would be greatly appreciated!

<?xml version="1.0" encoding="UTF-8"?>
<!--This is to override browser formatting; see server.conf[httpServer] to disable. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .-->
<?xml-stylesheet type="text/xml" href="/static/atom.xsl"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:s="http://dev.splunk.com/ns/rest" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>tcpout-server</title>
  <id>https://22.33.444.55:8089/servicesNS/nobody/system/data/outputs/tcp/server</id>
  <updated>2017-03-29T15:51:40+11:00</updated>
  <generator build="f44afce176d0" version="6.3.3"/>
  <author>
    <name>Splunk</name>
  </author>
  <link href="/servicesNS/nobody/system/data/outputs/tcp/server/_new" rel="create"/>
  <link href="/servicesNS/nobody/system/data/outputs/tcp/server/_reload" rel="_reload"/>
  <link href="/servicesNS/nobody/system/data/outputs/tcp/server/_acl" rel="_acl"/>
  <opensearch:totalResults>2</opensearch:totalResults>
  <opensearch:itemsPerPage>30</opensearch:itemsPerPage>
  <opensearch:startIndex>0</opensearch:startIndex>
  <s:messages/>
  <entry>
    <title>33.44.555.69:9997</title>
    <id>https://22.33.444.55:8089/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997</id>
    <updated>2017-03-29T15:51:40+11:00</updated>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997" rel="alternate"/>
    <author>
      <name>nobody</name>
    </author>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997" rel="list"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997/_reload" rel="_reload"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997" rel="edit"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997" rel="remove"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997/allconnections" rel="allconnections"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/33.44.555.69%3A9997/disable" rel="disable"/>
    <content type="text/xml">
      <s:dict>
        <s:key name="destHost">33.44.555.69</s:key>
        <s:key name="destIp">33.44.555.69</s:key>
        <s:key name="destPort">9997</s:key>
        <s:key name="eai:acl">
          <s:dict>
            <s:key name="app">system</s:key>
            <s:key name="can_change_perms">1</s:key>
            <s:key name="can_list">1</s:key>
            <s:key name="can_share_app">1</s:key>
            <s:key name="can_share_global">1</s:key>
            <s:key name="can_share_user">0</s:key>
            <s:key name="can_write">1</s:key>
            <s:key name="modifiable">1</s:key>
            <s:key name="owner">nobody</s:key>
            <s:key name="perms">
              <s:dict>
                <s:key name="read">
                  <s:list>
                    <s:item>*</s:item>
                  </s:list>
                </s:key>
                <s:key name="write">
                  <s:list>
                    <s:item>*</s:item>
                  </s:list>
                </s:key>
              </s:dict>
            </s:key>
            <s:key name="removable">1</s:key>
            <s:key name="sharing">system</s:key>
          </s:dict>
        </s:key>
        <s:key name="method">autobalance</s:key>
        <s:key name="sourcePort">8089</s:key>
        <s:key name="ssl">1</s:key>
        <s:key name="sslCertPath">sslcerpathhere</s:key>
        <s:key name="sslCommonNameToCheck">sllcommonnamehere</s:key>
        <s:key name="sslPassword">sslpasswordhere</s:key>
        <s:key name="sslRootCAPath">sslpathhere</s:key>
        <s:key name="sslVerifyServerCert">1</s:key>
        <s:key name="status">connect_done</s:key>
      </s:dict>
    </content>
  </entry>
  <entry>
    <title>11.22.333.44:9997</title>
    <id>https://22.33.444.55:8089/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997</id>
    <updated>2017-03-29T15:51:40+11:00</updated>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997" rel="alternate"/>
    <author>
      <name>nobody</name>
    </author>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997" rel="list"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997/_reload" rel="_reload"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997" rel="edit"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997" rel="remove"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997/allconnections" rel="allconnections"/>
    <link href="/servicesNS/nobody/system/data/outputs/tcp/server/11.22.333.44%3A9997/disable" rel="disable"/>
    <content type="text/xml">
      <s:dict>
        <s:key name="destHost">11.22.333.44</s:key>
        <s:key name="destIp">11.22.333.44</s:key>
        <s:key name="destPort">9997</s:key>
        <s:key name="eai:acl">
          <s:dict>
            <s:key name="app">system</s:key>
            <s:key name="can_change_perms">1</s:key>
            <s:key name="can_list">1</s:key>
            <s:key name="can_share_app">1</s:key>
            <s:key name="can_share_global">1</s:key>
            <s:key name="can_share_user">0</s:key>
            <s:key name="can_write">1</s:key>
            <s:key name="modifiable">1</s:key>
            <s:key name="owner">nobody</s:key>
            <s:key name="perms">
              <s:dict>
                <s:key name="read">
                  <s:list>
                    <s:item>*</s:item>
                  </s:list>
                </s:key>
                <s:key name="write">
                  <s:list>
                    <s:item>*</s:item>
                  </s:list>
                </s:key>
              </s:dict>
            </s:key>
            <s:key name="removable">1</s:key>
            <s:key name="sharing">system</s:key>
          </s:dict>
        </s:key>
        <s:key name="method">autobalance</s:key>
        <s:key name="sourcePort">8089</s:key>
        <s:key name="ssl">1</s:key>
        <s:key name="sslCertPath">sslcerpathhere</s:key>
        <s:key name="sslCommonNameToCheck">sllcommonnamehere</s:key>
        <s:key name="sslPassword">sslpasswordhere</s:key>
        <s:key name="sslRootCAPath">sslpathhere</s:key>
        <s:key name="sslVerifyServerCert">1</s:key>
        <s:key name="status">connect_done</s:key>
      </s:dict>
    </content>
  </entry>
</feed>

-Craig

beatus
Communicator

Craig,
You can change the output mode of your REST request by appending "?output_mode=json". There are a few output modes to choose from, and you may like the resulting field extractions better. That said, here's what I'd do for this data:

props.conf:

[xml:data]
KV_MODE = xml
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
CHARSET=UTF-8
disabled=false
LINE_BREAKER=([\r\n\s]+)\<entry\>
TIME_PREFIX=\<updated\>
TIME_FORMAT=%FT%T%:z
MAX_TIMESTAMP_LOOKAHEAD=30
TRUNCATE = 99999
TRANSFORMS-remove_header = remove_header

transforms.conf:

[remove_header]
REGEX = This\sis\sto\soverride\sbrowser\sformatting
DEST_KEY=queue
FORMAT=nullQueue

To explain this a bit: we set the critical props.conf settings so that events are broken and timestamped correctly. Next, the transform in transforms.conf catches the extra data at the top of the feed (that header piece) and routes it to the nullQueue, which discards it. Lastly, "KV_MODE = xml" gets some field extractions working at search time.
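
As a rough way to see what those settings do, the sketch below imitates them in Python on a trimmed-down copy of the feed. It assumes the line breaker is meant to break before each <entry> (the capture group is the text Splunk discards between events) and that the transform drops any event containing the header comment. The names `break_events` and `drop_header` and the sample string are made up for illustration; this is not how Splunk implements it internally.

```python
import re

# Assumed intent of LINE_BREAKER: whitespace (group 1) followed by <entry>.
LINE_BREAKER = re.compile(r"([\r\n\s]+)<entry>")
# Same pattern as REGEX in the remove_header transform.
HEADER = re.compile(r"This\sis\sto\soverride\sbrowser\sformatting")

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!--This is to override browser formatting; see server.conf[httpServer] to disable.-->
<feed>
  <title>tcpout-server</title>
  <entry>
    <title>33.44.555.69:9997</title>
    <updated>2017-03-29T15:51:40+11:00</updated>
  </entry>
  <entry>
    <title>11.22.333.44:9997</title>
    <updated>2017-03-29T15:51:40+11:00</updated>
  </entry>
</feed>
"""

def break_events(raw):
    """Split raw text wherever group 1 matches; the group text itself is dropped."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # text before the group ends an event
        start = m.end(1)                      # text after the group starts the next one
    events.append(raw[start:])
    return events

def drop_header(events):
    """Imitate the nullQueue transform: discard events that match HEADER."""
    return [e for e in events if not HEADER.search(e)]

events = drop_header(break_events(SAMPLE))
for e in events:
    print(repr(e[:40]))
```

With this sample, the header chunk becomes its own event and is dropped, leaving one event per <entry>. Note that the last event still carries the closing </feed> tag.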

The end result in Splunk: fields are extracted automatically from the XML data (granted, the field names kinda stink). See the screenshot.

craigwilkinson
Path Finder

Legend. Thank you so much!

woodcock
Esteemed Legend

Like this:

LINE_BREAKER = [\r\n\s]+<\/entry>([\r\n\s]+)<entry>[\r\n\s]+
SHOULD_LINEMERGE = false
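
To see where this pattern puts the event boundaries, here is a small Python sketch. It relies on the LINE_BREAKER behavior that only the first capture group is discarded: matched text before the group stays on the previous event, and matched text after it starts the next one. The helper name `break_events` and the tiny sample string are illustrative assumptions.

```python
import re

# Only the whitespace between </entry> and <entry> is in group 1.
LINE_BREAKER = re.compile(r"[\r\n\s]+<\/entry>([\r\n\s]+)<entry>[\r\n\s]+")

SAMPLE = (
    "<feed>\n"
    "  <entry>\n"
    "    <title>a</title>\n"
    "  </entry>\n"
    "  <entry>\n"
    "    <title>b</title>\n"
    "  </entry>\n"
    "</feed>\n"
)

def break_events(raw):
    """Cut the text at each group-1 match, dropping only the group itself."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return events

events = break_events(SAMPLE)
for e in events:
    print(repr(e))
```

Because this pattern only matches between two consecutive entries, the feed header stays attached to the first event and the closing </feed> to the last one.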

craigwilkinson
Path Finder

Awesome thanks!

Any help with the field extraction? Just 1-2 examples would be great. I can figure out the others 🙂

woodcock
Esteemed Legend

Sorry, I didn't catch that at first. Just feed your event into spath:

https://docs.splunk.com/Documentation/Splunk/6.5.2/SearchReference/Spath
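
For a feel of which name/value pairs are there to extract, the sketch below pulls the simple `<s:key name="...">value</s:key>` pairs out of one <entry> event with a regular expression in Python. It is a stand-in to illustrate the shape of the data, not what spath does internally; the function name `extract_keys` and the trimmed sample event are assumptions.

```python
import re

# Matches leaf keys only: <s:key name="NAME">VALUE</s:key> with no nested tags.
KEY_RE = re.compile(r'<s:key name="([^"]+)">([^<]*)</s:key>')

EVENT = """<entry>
    <title>33.44.555.69:9997</title>
    <content type="text/xml">
      <s:dict>
        <s:key name="destHost">33.44.555.69</s:key>
        <s:key name="destPort">9997</s:key>
        <s:key name="eai:acl">
          <s:dict>
            <s:key name="app">system</s:key>
          </s:dict>
        </s:key>
        <s:key name="method">autobalance</s:key>
        <s:key name="status">connect_done</s:key>
      </s:dict>
    </content>
  </entry>"""

def extract_keys(event):
    """Return a dict of leaf s:key names to their text values."""
    return dict(KEY_RE.findall(event))

fields = extract_keys(EVENT)
for name, value in fields.items():
    print(name, "=", value)
```

Container keys such as eai:acl, whose value is another <s:dict>, are not captured themselves; their leaf children (here, app) come out flattened alongside the top-level keys.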
