Dashboards & Visualizations

DB Connect: How to format a field containing XML data?

tmuthuk
Path Finder

I am extracting data using db connect from one of our tables. The issue is one of the fields has XML data and the data is not getting indexed properly and the value for the field is breaking into multiple and indexed as multiple events.

The query which i run:

select APPLICATIONID,
EXCEPTIONID,
EXCEPTIONCODE,
COMPONENTNAME,
HOSTNAME,
TRANSACTIONDOMAIN,
TRANSACTIONID,
TRANSACTIONTYPE,
FILENAME,
INTERFACEID,
TO_CHAR(TIME_STAMP,'YYYY-MM-DD HH24:MI:SS') "TIME_STAMP",
MESSAGE,
TRANSACTIONDATA,
TRANSACTIONDATAAFTER,
NOTIFICATIONCHNL,
REPLYDESTINATION,
STACKTRACE,
STATUS,
RESOLUTIONDESCRIPTION,
DATAENCODING,

CORRELATIONID,
RENDERID,
TO_CHAR(RESOLVEDELAYTIME,'YYYY-MM-DD HH24:MI:SS') "RESOLVEDELAYTIME",
EVENTTYPE,
JMSPROPERTIES,
CUSTOM
from cle_up.exceptionrec

Expected data in index :

2014-12-19 14:39:40 applicationid=Product_MDM_DExtract exceptionid=4608670 exceptioncode=SYS-CIM-001 componentname=ProductMDMExternalFeeder-ProductMDMExternalFeederProcessArchive hostname=plm00385.kohls.com transactiondomain= transactionid=000784672453882-164783-US transactiontype= filename= interfaceid= message="Error in response from CIM" transactiondata="10.5.80.46Member8STATE_CHANGE22014-12-19 14:39:34.116141902157411611580129729372tkcdh0634484UPC_VENDOR_COUNTRY000784672453882164783-US109322645CONFIRMEDY" transactiondataafter= notificationchnl= replydestination=UP.MDM.DAILYDELTA.FEEDER.RECEIVE.V1 stacktrace="Job-4976630 Error in [BusinessProcesses/Processes/CIMWSCall.process/Retrieve the data from cim] at com.tibco.plugin.soap.SOAPSendReceiveActivity.postEval(Unknown Source) at com.tibco.pe.core.TaskImpl.eval(Unknown Source) at com.tibco.pe.core.Job.a(Unknown Source) at com.tibco.pe.core.Job.k(Unknown Source) at com.tibco.pe.core.JobDispatcher$JobCourier.a(Unknown Source) at com.tibco.pe.core.JobDispatcher$JobCourier.run(Unknown Source)
This is the response from CIM: " status=Notified resolutiondescription= dataencoding= correlationid= renderid= resolvedelaytime="2014-12-19 14:40:04" eventtype= jmsproperties= custom=

Actual indexed data :

12/19/14 2:39:40.000 PM 2014-12-19 14:39:40 applicationid=Product_MDM_DExtract exceptionid=4608670 exceptioncode=SYS-CIM-001 componentname=ProductMDMExternalFeeder-ProductMDMExternalFeederProcessArchive hostname=plm00385.kohls.com transactiondomain= transactionid=000784672453882-164783-US transactiontype= filename= interfaceid= message="Error in response from CIM" transactiondata="

12/19/14 2:39:40.000 PM ns:RecordNotificationEventns:IPAddress10.5.80.46/ns:IPAddressns:NodeIDMember8/ns:NodeIDns:ActionSTATE_CHANGE/ns:Actionns:Version2/ns:Versionns:ActionDate2014-12-19 14:39:34.116/ns:ActionDatens:ActionDateLong1419021574116/ns:ActionDateLongns:UserID115801/ns:UserIDns:EventID29729372/ns:EventIDns:UserNametkcdh06/ns:UserNamens:RepositoryID34484/ns:RepositoryIDns:RepositoryNameUPC_VENDOR_COUNTRY/ns:RepositoryNamens:RecordID000784672453882/ns:RecordIDns:RecordIDExt164783-US/ns:RecordIDExtns:RecordKey109322645/ns:RecordKeyns:StateCONFIRMED/ns:Statens:IsActiveY/ns:IsActive/ns:RecordNotificationEvent/ns:NotificationEvent" transactiondataafter= notificationchnl= replydestination=UP.MDM.DAILYDELTA.FEEDER.RECEIVE.V1 stacktrace="Job-4976630 Error in [BusinessProcesses/Processes/CIMWSCall.process/Retrieve the data from cim]

12/19/14 2:39:40.000 PM at com.tibco.plugin.soap.SOAPSendReceiveActivity.postEval(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.TaskImpl.eval(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.Job.a(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.Job.k(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.JobDispatcher$JobCourier.a(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.JobDispatcher$JobCourier.run(Unknown Source)

12/19/14 2:39:40.000 PM This is the response from CIM: " status=Notified resolutiondescription= dataencoding= correlationid= renderid= resolvedelaytime="2014-12-19 14:40:04" eventtype= jmsproperties= custom=

transactiondata field is getting split into multiple events.

Is there any way i can index the XML data with out breaking into multiple events?

Tags (2)
0 Karma
1 Solution

mbenwell
Communicator

Does the xml have line feed characters in it? Splunk may be using the line feeds to break up events.

What does the data look like if you run the query from the Query view in the DB Connect app? If it looks correct, then I'd suggest the issue is the line breaking.

You might need to adjust the line breaking parameters for your sourcetype in props.conf (ref. here http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf). SHOULD_LINEMERGE should be true (default). You could grab a sample (put into a txt file) of what the correct output should look like and use the data inputs wizard to test/fix line breaking issues.

View solution in original post

0 Karma

mbenwell
Communicator

Does the xml have line feed characters in it? Splunk may be using the line feeds to break up events.

What does the data look like if you run the query from the Query view in the DB Connect app? If it looks correct, then I'd suggest the issue is the line breaking.

You might need to adjust the line breaking parameters for your sourcetype in props.conf (ref. here http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf). SHOULD_LINEMERGE should be true (default). You could grab a sample (put into a txt file) of what the correct output should look like and use the data inputs wizard to test/fix line breaking issues.

0 Karma

tmuthuk
Path Finder

Hi mbenwell,

Thanks a lot for your suggestion.

The data looks clean , when run the query from query view. I also tried changing the line breaking parameter SHOULD_LINEMERGE should be true . But it is still not indexing properly.

0 Karma

mbenwell
Communicator

What else is configured from the sourcetype you're assigning it? Did you restart Splunkd after modifying the sourcetype? line breaking is one of those things that needs a restart

Something else to check, do you have a distributed deployment? Where did you apply the props.conf changes? Line breaking should be performed on the system that you are running DB connect on (expecting DB Connect is on a search head) assuming you are sending 'cooked' data to the indexer, which is the default for tcpout (in outputs.conf). If it is not sending cooked data, then linebreaking will need to be done at the indexer.

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...