XML Extraction

jedatt01
Builder

I have a data source that reads in events in XML format. Could someone please help me build a props.conf that will extract all fields and show the events in a tree view? A sample event is below.

Mon Apr 28 16:45:57 EDT 2014 name="TOPIC_msg_received" event_id="ID:404040" msg_dest="SplunkTopic" msg_body="<?xml version="1.0" encoding="utf-8"?><ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>    <Header>      <LogEventTypeCode>ERROR</LogEventTypeCode>      <LogSeverityCode>CRITICAL</LogSeverityCode>      <LogEventDateTime>9999-12-31T23:59:59.9999999-05:00</LogEventDateTime>    </Header>    <SourceInformation>      <EAPMId>2</EAPMId>      <HostMachineName>HostMachineName3</HostMachineName>      <HostEnvironmentName>HostEnvironmentName3</HostEnvironmentName>      <ComponentId>ComponentId3</ComponentId>      <ComponentName>ComponentName3</ComponentName>      <ApplicationEventCorrelationId>ApplicationEventCorrelationId3</ApplicationEventCorrelationId>      <UserId>UserId3</UserId>      <UserSrc>UserSrc3</UserSrc>      <BusinessDomainId>BusinessDomainId3</BusinessDomainId>      <BusinessDomainName>BusinessDomainName3</BusinessDomainName>    </SourceInformation>    <ErrorInformation>      <ErrorCode>ErrorCode3</ErrorCode>      <ErrorDescription>ErrorDescription3</ErrorDescription><DetailedErrorInformation>DetailedErrorInformation3</DetailedErrorInformation>    </ErrorInformation>    <DetailedLogInformation>anyType</DetailedLogInformation>   </ELLogInputMessage></ELLogInputLayout>"

somesoni2
SplunkTrust

Try this.

props.conf

[yourSourceType]
NO_BINARY_CHECK = 1
TIME_FORMAT = %a %b %d %H:%M:%S %Z %Y
pulldown_type = 1
REPORT-xmlkv = xmlkv-alternative

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
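
To see what this transform would pull out, here is a minimal Python sketch that runs the same pattern (written without the redundant backslashes) over a shortened copy of the sample event. The names XMLKV and sample are made up for the illustration, and Python's re module stands in for Splunk's regex engine, so treat it as an approximation of the search-time behavior rather than a definitive check.

```python
import re

# Same pattern as the REGEX line in [xmlkv-alternative]:
# group 1 is the tag name, group 2 is the text up to the matching close tag.
XMLKV = re.compile(r"<([^\s>]*)[^>]*>([^<]*)</\1>")

# Shortened copy of the sample event from the question.
sample = (
    'name="TOPIC_msg_received" msg_body="<?xml version="1.0" encoding="utf-8"?>'
    '<ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>    <Header>'
    '      <LogEventTypeCode>ERROR</LogEventTypeCode>'
    '      <LogSeverityCode>CRITICAL</LogSeverityCode>    </Header>'
    '    <ErrorInformation>      <ErrorCode>ErrorCode3</ErrorCode>    </ErrorInformation>'
    '   </ELLogInputMessage></ELLogInputLayout>"'
)

# FORMAT = $1::$2 means "field name from group 1, value from group 2".
fields = dict(XMLKV.findall(sample))

for name, value in fields.items():
    print(f"{name} = {value}")
```

Only leaf elements come out as fields. Container elements such as Header or ErrorInformation hold child tags rather than plain text, so the pattern skips them.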


landen99
Motivator

I like this solution using transforms.conf

[views_std]
MV_ADD = 1
REGEX = \<(\w+[^\n\/\>]+)\/?\>([^\<\n][^\<]*)\<
FORMAT = $1::$2
CLEAN_KEYS = true

[views_param]
MV_ADD = 1
REGEX = \<(\w+ [^\n\/\>]+)\/?\>
FORMAT = param::$1
CLEAN_KEYS = true

[views_option]
MV_ADD = 1
SOURCE_KEY = param
REGEX = (\w+(?: \w+)*)="(?!host|source|sourcetype|index|splunk_server)(\w+)"
FORMAT = $1::$2
CLEAN_KEYS = true

poddraj
Explorer

Hi,

I have a .log file that contains a mix of system logs and, in between, the request and response XMLs of actual transactions in my application. I want only those XML transactions to be shown in Splunk, not the other logging information. Is there a way to achieve this?

jedatt01
Builder

When you have an input that is true JSON or XML, you can view the events in a tree structure where you click the + sign to see nested information.

rahulroy_splunk
Path Finder

Could you be more specific about what you mean by pull down view?

jedatt01
Builder

That extracted all the fields, thanks! Wish there was a way to get a pull down view but this will do the trick if that's not possible.

gfuente
Motivator

Hello

Have you tried KV_MODE = xml in props.conf?

From docs:

KV_MODE = [none|auto|multi|json|xml]
* Used for search-time field extractions only.
* Specifies the field/value extraction mode for the data.
* Set KV_MODE to one of the following:
    * none: if you want no field/value extraction to take place.
    * auto: extracts field/value pairs separated by equal signs.
    * multi: invokes the multikv search command to expand a tabular event into multiple events.
    * xml: automatically extracts fields from XML data.
    * json: automatically extracts fields from JSON data.
* Setting to 'none' can ensure that one or more user-created regexes are not overridden by
  automatic field/value extraction for a particular host, source, or source type, and also
  increases search performance.
* Defaults to auto.
* The 'xml' and 'json' modes will not extract any fields when used on data that isn't of the correct format (JSON or XML).

jedatt01
Builder

Yes, I tried using KV_MODE = XML, but it is not picking up the fields. Is individual field extraction the only way?

somesoni2
SplunkTrust

This may not work because the event is not pure XML (it's a combination of key-value pairs with embedded XML). You might have to extract all the XML fields using the field extractor.
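
The "not pure XML" point can be illustrated outside Splunk with a rough Python sketch. It assumes the XML is the value of msg_body and runs to the final quote of the event; the helper names is_pure_xml and leaf_fields are hypothetical, and the standard-library parser is only a stand-in for whatever Splunk does internally. The whole event fails to parse as XML, while the isolated msg_body parses fine.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the sample event: key="value" pairs followed by embedded XML.
event = (
    'Mon Apr 28 16:45:57 EDT 2014 name="TOPIC_msg_received" event_id="ID:404040" '
    'msg_dest="SplunkTopic" msg_body="<?xml version="1.0" encoding="utf-8"?>'
    '<ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>'
    '<Header><LogEventTypeCode>ERROR</LogEventTypeCode></Header>'
    '<ErrorInformation><ErrorCode>ErrorCode3</ErrorCode></ErrorInformation>'
    '</ELLogInputMessage></ELLogInputLayout>"'
)

def is_pure_xml(text):
    """Return True only if the whole string parses as one XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def leaf_fields(xml_text):
    """Map each leaf element's name (namespace stripped) to its text."""
    fields = {}
    for elem in ET.fromstring(xml_text).iter():
        if len(elem) == 0:  # no child elements, so this is a leaf
            fields[elem.tag.split("}", 1)[-1]] = (elem.text or "").strip()
    return fields

# The leading timestamp and key="value" pairs make the full event invalid XML.
whole_event_is_xml = is_pure_xml(event)

# Assumption: the XML is the msg_body value and ends at the event's last quote.
# The optional XML declaration is skipped so only the root element is parsed.
match = re.search(r'msg_body="(?:<\?xml[^>]*\?>)?(.*)"\s*$', event, re.DOTALL)
fields = leaf_fields(match.group(1))

print(whole_event_is_xml)
print(fields)
```

In Splunk itself, a comparable search-time step might be to pull msg_body into its own field and pass that field to spath, but that is a suggestion to try rather than something confirmed in this thread.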
