Hi Everyone:
I am unable to get event breaking to work for a log that embeds an XML payload. I want each event to start at the line containing the tag <v1:BusinessEventRequest> and to break after </v1:BusinessEventRequest>.
Here is a sample from the log file:
Thu Aug 06 11:47:02 EDT 2015 name="QUEUE_msg_received" event_id="ID:414d51204d514942513031202020202055bdd46020387541" msg_dest="QA.EA.ELOG.BUSINESSEVENT1" msg_body="<?xml version="1.0" encoding="UTF-8"?><v1:BusinessEventRequest xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:v1="http://schemas.humana.com/Infrastructure/Utility/Logging/BusinessEventRequest/V1.1">;
<v1:BusinessEvent><v1:BusinessEventMetaData>
<v1:BusinessEventTypeCode>BUSINESS_EVENT</v1:BusinessEventTypeCode>
<v1:BusinessEventDateTime>2015-08-06T12:00:47Z</v1:BusinessEventDateTime>
</v1:BusinessEventMetaData><v1:SourceApplicationInformation>
<v1:EAPMId>11111</v1:EAPMId><v1:HostMachineName>MQIBQ01</v1:HostMachineName>
<v1:HostEnvironmentName>QA</v1:HostEnvironmentName><v1:AppEventCorrelationId/>
<v1:Component><v1:ComponentId/><v1:ComponentName/></v1:Component>
</v1:SourceApplicationInformation><v1:BusinessProcessInformation><v1:ProcessName/>
<v1:EventModelXSDPath/><EventInformation><mstns:BAMEvent
xmlns:mstns="http://enrollmentservices.humana.com/Schema/BAMSchema/v1.0">
<mstns:EventSource>FileIntake</mstns:EventSource>
<mstns:Activity>FileIntakeActivity</mstns:Activity><mstns:EventTransactionId>40efe7da-4ef2-46b6-bea6-911a74db898e</mstns:EventTransactionId>
<mstns:EventCorrelationID>354805729</mstns:EventCorrelationID><mstns:Milestone>
<mstns:MilestoneEvent>File upload requested</mstns:MilestoneEvent>
<mstns:MilestoneState>Begin</mstns:MilestoneState><mstns:DataElements><mstns:FileName/>
<mstns:FileSize>9008</mstns:FileSize><mstns:AdditionalInfo>File upload requested</mstns:AdditionalInfo></mstns:DataElements></mstns:Milestone></mstns:BAMEvent>
</EventInformation></v1:BusinessProcessInformation></v1:BusinessEvent>
</v1:BusinessEventRequest>"
Here is my props.conf file:
[mq_business_nonprod]
DATETIME_CONFIG = CURRENT
BREAK_ONLY_BEFORE = "<v1:BusinessEventRequest
SHOULD_LINEMERGE = true
MUST_BREAK_AFTER = </v1:BusinessEventRequest>"
TRUNCATE = 1000000
disabled = false
pulldown_type = true
NO_BINARY_CHECK = 1
KV_MODE = xml
Am I missing something? Please advise.
I was able to strip the non-XML data from the events by using the following props. It worked fine. Thank you all for your input:
[sourcetype]
TIME_PREFIX =
SHOULD_LINEMERGE = true
MAX_TIMESTAMP_LOOKAHEAD = 150
TRUNCATE = 1000000
disabled = false
pulldown_type = true
NO_BINARY_CHECK = 1
SEDCMD-stripnonxml-1=s/^.*msg_body="//
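One possible follow-up, offered only as a sketch: after the prefix is stripped, the sample event still ends with the closing double-quote of msg_body (the last line is </v1:BusinessEventRequest>"), so the remaining text may still not be well-formed XML. A second SEDCMD could remove it. The class name stripnonxml-2 and the regex are my own assumptions, not something tested in this thread:

```ini
[sourcetype]
# Existing rule from the post above: drop everything up to and including msg_body="
SEDCMD-stripnonxml-1 = s/^.*msg_body="//
# Hypothetical second rule: drop the trailing double-quote (and any trailing whitespace)
SEDCMD-stripnonxml-2 = s/"\s*$//
```

Check the result against a few real events before relying on it.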
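The problem is the leading double-quote in your BREAK_ONLY_BEFORE value. It is treated as part of the regular expression, and in your sample no double-quote immediately precedes <v1:BusinessEventRequest, so the pattern never matches. Try this: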
BREAK_ONLY_BEFORE = <v1:BusinessEventRequest
Be sure to restart the Splunk instances on your Indexers (and/or Heavy Forwarders). This is required.
Also, I strongly advise against DATETIME_CONFIG = CURRENT if you have a timestamp in your event. With CURRENT, events are stamped with the indexer's system time rather than the time recorded in the log. You are really looking for trouble doing this.
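As a rough sketch of the alternative, the timestamp at the start of the sample event (Thu Aug 06 11:47:02 EDT 2015) could be extracted explicitly. The TIME_FORMAT string and the lookahead value below are my reading of that one sample line, so treat them as assumptions and verify them against your data:

```ini
[mq_business_nonprod]
# Use these in place of DATETIME_CONFIG = CURRENT
# The timestamp sits at the very start of the event
TIME_PREFIX = ^
# Assumed format for "Thu Aug 06 11:47:02 EDT 2015"
TIME_FORMAT = %a %b %d %H:%M:%S %Z %Y
# The sample timestamp is 28 characters long
MAX_TIMESTAMP_LOOKAHEAD = 30
```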
The log entries that you're trying to parse are not true XML, as they contain a non-XML portion at the start. Do you think we can get rid of that (if there is no useful information in it)? Once the events are proper XML, you can configure XML parsing and event breaking, and it should work.
As a start, I'd suggest taking a look at this recent post here on Answers:
http://answers.splunk.com/answers/201898/how-to-configure-splunk-to-read-xml-files-correctl.html
You might need to check the settings for breaks in props.conf as you have multiple line break rules here.
Let me know if this helps--we can continue troubleshooting if not 🙂
Best,
@frobinson_splunk
I took a look at the props.conf spec file for setting up parsing, and I believe you may have a conflict between the two line break rules here. You could try including only one of them to see if this fixes the behavior you're seeing.
BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event only if it encounters a new line that matches the regular expression.
* Defaults to empty.
MUST_BREAK_AFTER = <regular expression>
* When set and the regular expression matches the current line, Splunk creates a new event for the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
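A minimal sketch of the single-rule test, assuming the stanza name from the question. Only the start-of-event rule is kept, without the surrounding double-quote. The BREAK_ONLY_BEFORE_DATE line is my own addition: it is a real props.conf setting that defaults to true, and turning it off may be worth trying so that the date inside the XML body (2015-08-06T12:00:47Z) is not itself treated as an event boundary.

```ini
[mq_business_nonprod]
SHOULD_LINEMERGE = true
# Single break rule: start a new event at the line containing the opening tag
BREAK_ONLY_BEFORE = <v1:BusinessEventRequest
# Assumption: disable date-based breaking while testing
BREAK_ONLY_BEFORE_DATE = false
# MUST_BREAK_AFTER is intentionally left out for this test
TRUNCATE = 1000000
KV_MODE = xml
```

If events break correctly with one rule, the second rule can be added back to see whether it reintroduces the problem.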
Let me know if this helps--we can continue troubleshooting if not 🙂
Best,
@frobinson_splunk
Hi @OMohi,
I'm a tech writer here at Splunk. I work on the Simple XML docs and I'd like to help with your question. I'll reply shortly with some more information!
Best,
@frobinson_splunk