XML Extraction

jedatt01
Builder

I have a data source that reads in events in XML format. Could someone please help me build a props.conf that will extract all fields and show the events in a tree view? Sample event below:

Mon Apr 28 16:45:57 EDT 2014 name="TOPIC_msg_received" event_id="ID:404040" msg_dest="SplunkTopic" msg_body="<?xml version="1.0" encoding="utf-8"?><ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>    <Header>      <LogEventTypeCode>ERROR</LogEventTypeCode>      <LogSeverityCode>CRITICAL</LogSeverityCode>      <LogEventDateTime>9999-12-31T23:59:59.9999999-05:00</LogEventDateTime>    </Header>    <SourceInformation>      <EAPMId>2</EAPMId>      <HostMachineName>HostMachineName3</HostMachineName>      <HostEnvironmentName>HostEnvironmentName3</HostEnvironmentName>      <ComponentId>ComponentId3</ComponentId>      <ComponentName>ComponentName3</ComponentName>      <ApplicationEventCorrelationId>ApplicationEventCorrelationId3</ApplicationEventCorrelationId>      <UserId>UserId3</UserId>      <UserSrc>UserSrc3</UserSrc>      <BusinessDomainId>BusinessDomainId3</BusinessDomainId>      <BusinessDomainName>BusinessDomainName3</BusinessDomainName>    </SourceInformation>    <ErrorInformation>      <ErrorCode>ErrorCode3</ErrorCode>      <ErrorDescription>ErrorDescription3</ErrorDescription><DetailedErrorInformation>DetailedErrorInformation3</DetailedErrorInformation>    </ErrorInformation>    <DetailedLogInformation>anyType</DetailedLogInformation>   </ELLogInputMessage></ELLogInputLayout>"
1 Solution

somesoni2
SplunkTrust

Try this.

props.conf

[yourSourceType]
NO_BINARY_CHECK = 1
TIME_FORMAT = %a %b %d %H:%M:%S %Z %Y
pulldown_type = 1
REPORT-xmlkv = xmlkv-alternative

transforms.conf

[xmlkv-alternative]
REGEX = <([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>
FORMAT = $1::$2
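
To see what the xmlkv-alternative REGEX captures, here is a minimal Python sketch that runs the same pattern over a shortened copy of the sample event. It is only an approximation: Python's `re` module stands in for Splunk's regex engine, the event is trimmed for brevity, and the names `XMLKV`, `event`, and `fields` are illustrative, not part of any Splunk API.

```python
import re

# The REGEX value from the [xmlkv-alternative] stanza, copied verbatim.
XMLKV = re.compile(r"<([^\s\>]*)[^\>]*\>([^<]*)\<\/\1\>")

# A shortened version of the sample event from the question (assumption:
# trimmed to three leaf elements to keep the example small).
event = (
    'Mon Apr 28 16:45:57 EDT 2014 name="TOPIC_msg_received" msg_body="'
    '<?xml version="1.0" encoding="utf-8"?>'
    '<ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>'
    "<Header><LogEventTypeCode>ERROR</LogEventTypeCode>"
    "<LogSeverityCode>CRITICAL</LogSeverityCode></Header>"
    "<SourceInformation><EAPMId>2</EAPMId></SourceInformation>"
    '</ELLogInputMessage></ELLogInputLayout>"'
)

# Group 1 is the tag name, group 2 is the text up to the matching close tag.
# This mirrors FORMAT = $1::$2 (field name :: field value).
fields = dict(XMLKV.findall(event))

for name, value in fields.items():
    print(f"{name} = {value}")
```

Because the value group `[^<]*` cannot contain a `<`, only elements with plain text content match. Container elements such as Header and SourceInformation produce no field of their own.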

landen99
Motivator

I like this solution using transforms.conf:

[views_std]
MV_ADD = 1
REGEX = \<(\w+[^\n\/\>]+)\/?\>([^\<\n][^\<]*)\<
FORMAT = $1::$2
CLEAN_KEYS = true

[views_param]
MV_ADD = 1
REGEX = \<(\w+ [^\n\/\>]+)\/?\>
FORMAT = param::$1
CLEAN_KEYS = true

[views_option]
MV_ADD = 1
SOURCE_KEY = param
REGEX = (\w+(?: \w+)*)="(?!host|source|sourcetype|index|splunk_server)(\w+)"
FORMAT = $1::$2
CLEAN_KEYS = true

poddraj
Explorer

Hi,

I have a .log file that mixes system logs with the request and response XMLs of actual transactions in my application. I want only those XML transactions to be shown in Splunk, not the other logging information. Is there a way to achieve this?

jedatt01
Builder

When you have an input that is true JSON or XML, you can view the events in a tree structure where you hit the + sign to see nested information.

rahulroy_splunk
Path Finder

Could you be more specific about what you mean by pull down view?

jedatt01
Builder

That extracted all the fields, thanks! I wish there were a way to get a pull down view, but this will do the trick if that's not possible.

gfuente
Motivator

Hello

Have you tried KV_MODE = XML in props.conf?

From docs:

KV_MODE = [none|auto|multi|json|xml]
* Used for search-time field extractions only.
* Specifies the field/value extraction mode for the data.
* Set KV_MODE to one of the following:
    * none: if you want no field/value extraction to take place.
    * auto: extracts field/value pairs separated by equal signs.
    * multi: invokes the multikv search command to expand a tabular event into multiple events.
    * xml: automatically extracts fields from XML data.
    * json: automatically extracts fields from JSON data.
* Setting to 'none' can ensure that one or more user-created regexes are not overridden by
  automatic field/value extraction for a particular host, source, or source type, and also
  increases search performance.
* Defaults to auto.
* The 'xml' and 'json' modes will not extract any fields when used on data that isn't of the correct format (JSON or XML).

jedatt01
Builder

Yes, I tried using KV_MODE = XML but it is not picking up the fields. Is individual field extraction the only way to do this?

somesoni2
SplunkTrust

This may not work because the event is not pure XML (it's a combination of key-value pairs with embedded XML). You might have to extract all the XML fields using the field extractor.
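
The point about the event not being pure XML can be illustrated outside Splunk with a minimal Python sketch. It assumes a shortened copy of the sample event, and uses the standard library's `xml.etree.ElementTree` as a stand-in for an XML-aware extractor; it is not how Splunk implements KV_MODE. The names `event`, `whole_event_is_xml`, and `fields` are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the sample event (assumption: trimmed to three leaf elements).
event = (
    'Mon Apr 28 16:45:57 EDT 2014 name="TOPIC_msg_received" msg_body="'
    '<?xml version="1.0" encoding="utf-8"?>'
    '<ELLogInputLayout xmlns="http://www.test.com/1"><ELLogInputMessage>'
    "<Header><LogEventTypeCode>ERROR</LogEventTypeCode>"
    "<LogSeverityCode>CRITICAL</LogSeverityCode></Header>"
    "<SourceInformation><EAPMId>2</EAPMId></SourceInformation>"
    '</ELLogInputMessage></ELLogInputLayout>"'
)

# 1) The event as a whole is not well-formed XML: it starts with a timestamp
#    and key="value" pairs, so an XML parser rejects it.
try:
    ET.fromstring(event)
    whole_event_is_xml = True
except ET.ParseError:
    whole_event_is_xml = False

# 2) Isolate the XML payload inside msg_body="..." and parse only that part.
match = re.search(r'msg_body="(<.*>)"', event, re.DOTALL)
root = ET.fromstring(match.group(1).encode("utf-8"))

# Collect leaf elements, dropping the "{namespace}" prefix ElementTree adds.
fields = {
    elem.tag.split("}", 1)[-1]: (elem.text or "").strip()
    for elem in root.iter()
    if len(elem) == 0
}

print(whole_event_is_xml)
print(fields)
```

The sketch suggests why an XML-only extraction mode finds nothing on the raw event, while a regex-based extraction or one that first isolates the payload can still recover the fields.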
