Why am I unable to apply proper parsing on an XML field tag with my current props.conf?

OMohi
Path Finder

Hi Everyone:

I am having trouble getting an XML tag parsed correctly. I want each event to start at the <v1:BusinessEventRequest> tag and to break after the closing </v1:BusinessEventRequest> tag.

Here is a sample from the log file:

Thu Aug 06 11:47:02 EDT 2015 name="QUEUE_msg_received" event_id="ID:414d51204d514942513031202020202055bdd46020387541" msg_dest="QA.EA.ELOG.BUSINESSEVENT1" msg_body="<?xml version="1.0" encoding="UTF-8"?><v1:BusinessEventRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v1="http://schemas.humana.com/Infrastructure/Utility/Logging/BusinessEventRequest/V1.1">;
<v1:BusinessEvent><v1:BusinessEventMetaData>
<v1:BusinessEventTypeCode>BUSINESS_EVENT</v1:BusinessEventTypeCode>
<v1:BusinessEventDateTime>2015-08-06T12:00:47Z</v1:BusinessEventDateTime>
</v1:BusinessEventMetaData><v1:SourceApplicationInformation>
<v1:EAPMId>11111</v1:EAPMId><v1:HostMachineName>MQIBQ01</v1:HostMachineName>
<v1:HostEnvironmentName>QA</v1:HostEnvironmentName><v1:AppEventCorrelationId/>
<v1:Component><v1:ComponentId/><v1:ComponentName/></v1:Component>
</v1:SourceApplicationInformation><v1:BusinessProcessInformation><v1:ProcessName/>
<v1:EventModelXSDPath/><EventInformation><mstns:BAMEvent
xmlns:mstns="http://enrollmentservices.humana.com/Schema/BAMSchema/v1.0">
<mstns:EventSource>FileIntake</mstns:EventSource>
<mstns:Activity>FileIntakeActivity</mstns:Activity><mstns:EventTransactionId>40efe7da-4ef2-46b6-bea6-911a74db898e</mstns:EventTransactionId>
<mstns:EventCorrelationID>354805729</mstns:EventCorrelationID><mstns:Milestone>
<mstns:MilestoneEvent>File upload requested</mstns:MilestoneEvent>
<mstns:MilestoneState>Begin</mstns:MilestoneState><mstns:DataElements><mstns:FileName/>
<mstns:FileSize>9008</mstns:FileSize><mstns:AdditionalInfo>File upload requested</mstns:AdditionalInfo></mstns:DataElements></mstns:Milestone></mstns:BAMEvent>
</EventInformation></v1:BusinessProcessInformation></v1:BusinessEvent>
</v1:BusinessEventRequest>"

Here is my props.conf file:

[mq_business_nonprod]
DATETIME_CONFIG = CURRENT
BREAK_ONLY_BEFORE = "<v1:BusinessEventRequest
SHOULD_LINEMERGE = true
MUST_BREAK_AFTER = </v1:BusinessEventRequest>"
TRUNCATE = 1000000
disabled = false
pulldown_type = true
NO_BINARY_CHECK = 1
KV_MODE = xml

Am I missing something? Please advise.

1 Solution

OMohi
Path Finder

I was able to strip the non-XML data out of the events with the following props. It worked fine. Thank you all for your input:

[sourcetype]
TIME_PREFIX =
SHOULD_LINEMERGE = true
MAX_TIMESTAMP_LOOKAHEAD = 150
TRUNCATE = 1000000
disabled = false
pulldown_type = true
NO_BINARY_CHECK = 1
SEDCMD-stripnonxml-1=s/^.*msg_body="//
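
As a rough illustration of what the substitution in SEDCMD-stripnonxml-1 does to an event, here is a small Python sketch. It is only a model of the regex, not how Splunk applies SEDCMD internally. The event is a shortened stand-in for the sample log, and the names SED_PATTERN and strip_non_xml are made up for this example.

```python
import re

# Same pattern as the SEDCMD above: everything up to and including msg_body="
SED_PATTERN = re.compile(r'^.*msg_body="')


def strip_non_xml(event: str) -> str:
    """Apply the substitution once, like sed's s/// without the g flag."""
    return SED_PATTERN.sub("", event, count=1)


# Shortened, hypothetical stand-in for the sample event (not the full log entry)
event = (
    'Thu Aug 06 11:47:02 EDT 2015 name="QUEUE_msg_received" '
    'msg_dest="QA.EA.ELOG.BUSINESSEVENT1" '
    'msg_body="<?xml version="1.0" encoding="UTF-8"?><v1:BusinessEventRequest>\n'
    '</v1:BusinessEventRequest>"'
)

stripped = strip_non_xml(event)
print(stripped)
```

In this sketch the timestamp and key/value prefix are removed, while the trailing double quote after the closing tag is left in place, because the pattern only touches the start of the event.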

woodcock
Esteemed Legend

The problem is your first double-quote; try this:

BREAK_ONLY_BEFORE = <v1:BusinessEventRequest
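
To see why the leading double quote matters, the two patterns can be checked against the first line of the sample event. This is a minimal sketch that uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption that holds for literal patterns like these. The line is shortened from the sample log, and the variable names are hypothetical.

```python
import re

# First line of the sample event, shortened; note that the tag follows "?>",
# not a double quote.
first_line = (
    'Thu Aug 06 11:47:02 EDT 2015 name="QUEUE_msg_received" '
    'msg_body="<?xml version="1.0" encoding="UTF-8"?>'
    '<v1:BusinessEventRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
)

with_quote = re.compile(r'"<v1:BusinessEventRequest')    # pattern from the question
without_quote = re.compile(r'<v1:BusinessEventRequest')  # pattern suggested above

matches_with_quote = with_quote.search(first_line) is not None
matches_without_quote = without_quote.search(first_line) is not None

print(matches_with_quote)     # False
print(matches_without_quote)  # True
```

In the sample, the XML declaration sits between msg_body=" and the opening tag, so a pattern that requires a quote immediately before the tag never matches.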

Be sure to restart the Splunk instances on your Indexers (and/or Heavy Forwarders) after making this change. This is required.

Also, I strongly advise against DATETIME_CONFIG = CURRENT if you have a timestamp in your event. Doing so is asking for trouble.

somesoni2
Revered Legend

The log entries that you're trying to parse are not true XML, because they contain a non-XML portion at the start. Do you think we can get rid of that (if there is no useful information in it)? Once the events are formatted as proper XML, you can configure XML parsing and event breaking, and it should work.
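
As a minimal sketch of the point about well-formed XML, the snippet below parses a shortened msg_body with Python's standard library once the non-XML prefix is gone. The payload is a trimmed, hypothetical stand-in for the sample event, parse_body is a made-up helper name, and dropping the trailing double quote is an assumption about what is left over from msg_body="...".

```python
import xml.etree.ElementTree as ET

V1_NS = "http://schemas.humana.com/Infrastructure/Utility/Logging/BusinessEventRequest/V1.1"

# Shortened stand-in for msg_body after the leading non-XML text is removed
raw = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    f'<v1:BusinessEventRequest xmlns:v1="{V1_NS}">'
    '<v1:BusinessEvent><v1:BusinessEventMetaData>'
    '<v1:BusinessEventTypeCode>BUSINESS_EVENT</v1:BusinessEventTypeCode>'
    '</v1:BusinessEventMetaData></v1:BusinessEvent>'
    '</v1:BusinessEventRequest>"'
)


def parse_body(text: str) -> ET.Element:
    """Drop the trailing double quote, then parse the rest as XML."""
    cleaned = text.rstrip().rstrip('"')
    return ET.fromstring(cleaned.encode("utf-8"))


root = parse_body(raw)
type_code = root.find(".//v1:BusinessEventTypeCode", {"v1": V1_NS}).text
print(type_code)  # BUSINESS_EVENT
```

With the prefix still attached, the same parser would reject the text, which is the sense in which the raw entries are not true XML.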

frobinson_splun
Splunk Employee

As a start, I'd suggest taking a look at this recent post here on Answers:
http://answers.splunk.com/answers/201898/how-to-configure-splunk-to-read-xml-files-correctl.html

I took a look at the props.conf spec file for setting up parsing, and I believe you may have a conflict between the two line break rules here. You could try including only one of them to see if this fixes the behavior you're seeing.

Specifically, the two rules are (from the props.conf spec: http://docs.splunk.com/Documentation/Splunk/6.2.4/Admin/Propsconf):

BREAK_ONLY_BEFORE =
* When set, Splunk creates a new event only if it encounters a new line that matches the
regular expression.
* Defaults to empty.

MUST_BREAK_AFTER =
* When set and the regular expression matches the current line, Splunk creates a new event for
the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
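
To make the two rules quoted above more concrete, here is a deliberately simplified Python model of line merging. It is a sketch of the described behavior only, not Splunk's implementation: it ignores other settings such as BREAK_ONLY_BEFORE_DATE and MAX_EVENTS, the function name merge_lines is made up, and the input lines (including the second timestamp) are hypothetical.

```python
import re
from typing import List, Optional


def merge_lines(
    lines: List[str],
    break_only_before: Optional[str] = None,
    must_break_after: Optional[str] = None,
) -> List[str]:
    """Group input lines into events using the two rules, in simplified form."""
    before = re.compile(break_only_before) if break_only_before else None
    after = re.compile(must_break_after) if must_break_after else None
    events: List[str] = []
    current: List[str] = []
    force_break = False  # set when the previous line matched must_break_after
    for line in lines:
        starts_new = force_break or (before is not None and bool(before.search(line)))
        if current and starts_new:
            events.append("\n".join(current))
            current = []
        current.append(line)
        force_break = after is not None and bool(after.search(line))
    if current:
        events.append("\n".join(current))
    return events


# Two shortened, hypothetical log entries of three lines each
lines = [
    'Thu Aug 06 11:47:02 EDT 2015 msg_body="<?xml version="1.0"?><v1:BusinessEventRequest>',
    "<v1:BusinessEvent/>",
    '</v1:BusinessEventRequest>"',
    'Thu Aug 06 11:47:05 EDT 2015 msg_body="<?xml version="1.0"?><v1:BusinessEventRequest>',
    "<v1:BusinessEvent/>",
    '</v1:BusinessEventRequest>"',
]

only_before = merge_lines(lines, break_only_before=r"<v1:BusinessEventRequest")
only_after = merge_lines(lines, must_break_after=r'</v1:BusinessEventRequest>"')
quoted_before = merge_lines(lines, break_only_before=r'"<v1:BusinessEventRequest')

print(len(only_before), len(only_after), len(quoted_before))  # 2 2 1
```

In this model either rule on its own splits the input into two events, while a BREAK_ONLY_BEFORE pattern that never matches leaves everything merged into one, so trying one rule at a time is a reasonable way to narrow the problem down.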


Let me know if this helps--we can continue troubleshooting if not 🙂

Best,
@frobinson_splunk

frobinson_splun
Splunk Employee

Hi @OMohi,
I'm a tech writer here at Splunk. I work on the Simple XML docs, and I'd like to help with your question. I'll reply shortly with some more information!

Best,
@frobinson_splunk
