Dashboards & Visualizations

DB Connect: How to format a field containing XML data?

tmuthuk
Path Finder

I am extracting data using db connect from one of our tables. The issue is one of the fields has XML data and the data is not getting indexed properly and the value for the field is breaking into multiple and indexed as multiple events.

The query which i run:

select APPLICATIONID,
EXCEPTIONID,
EXCEPTIONCODE,
COMPONENTNAME,
HOSTNAME,
TRANSACTIONDOMAIN,
TRANSACTIONID,
TRANSACTIONTYPE,
FILENAME,
INTERFACEID,
TO_CHAR(TIME_STAMP,'YYYY-MM-DD HH24:MI:SS') "TIME_STAMP",
MESSAGE,
TRANSACTIONDATA,
TRANSACTIONDATAAFTER,
NOTIFICATIONCHNL,
REPLYDESTINATION,
STACKTRACE,
STATUS,
RESOLUTIONDESCRIPTION,
DATAENCODING,

CORRELATIONID,
RENDERID,
TO_CHAR(RESOLVEDELAYTIME,'YYYY-MM-DD HH24:MI:SS') "RESOLVEDELAYTIME",
EVENTTYPE,
JMSPROPERTIES,
CUSTOM
from cle_up.exceptionrec

Expected data in index :

2014-12-19 14:39:40 applicationid=Product_MDM_DExtract exceptionid=4608670 exceptioncode=SYS-CIM-001 componentname=ProductMDMExternalFeeder-ProductMDMExternalFeederProcessArchive hostname=plm00385.kohls.com transactiondomain= transactionid=000784672453882-164783-US transactiontype= filename= interfaceid= message="Error in response from CIM" transactiondata="10.5.80.46Member8STATE_CHANGE22014-12-19 14:39:34.116141902157411611580129729372tkcdh0634484UPC_VENDOR_COUNTRY000784672453882164783-US109322645CONFIRMEDY" transactiondataafter= notificationchnl= replydestination=UP.MDM.DAILYDELTA.FEEDER.RECEIVE.V1 stacktrace="Job-4976630 Error in [BusinessProcesses/Processes/CIMWSCall.process/Retrieve the data from cim] at com.tibco.plugin.soap.SOAPSendReceiveActivity.postEval(Unknown Source) at com.tibco.pe.core.TaskImpl.eval(Unknown Source) at com.tibco.pe.core.Job.a(Unknown Source) at com.tibco.pe.core.Job.k(Unknown Source) at com.tibco.pe.core.JobDispatcher$JobCourier.a(Unknown Source) at com.tibco.pe.core.JobDispatcher$JobCourier.run(Unknown Source)
This is the response from CIM: " status=Notified resolutiondescription= dataencoding= correlationid= renderid= resolvedelaytime="2014-12-19 14:40:04" eventtype= jmsproperties= custom=

Actual indexed data :

12/19/14 2:39:40.000 PM 2014-12-19 14:39:40 applicationid=Product_MDM_DExtract exceptionid=4608670 exceptioncode=SYS-CIM-001 componentname=ProductMDMExternalFeeder-ProductMDMExternalFeederProcessArchive hostname=plm00385.kohls.com transactiondomain= transactionid=000784672453882-164783-US transactiontype= filename= interfaceid= message="Error in response from CIM" transactiondata="

12/19/14 2:39:40.000 PM ns:RecordNotificationEventns:IPAddress10.5.80.46/ns:IPAddressns:NodeIDMember8/ns:NodeIDns:ActionSTATE_CHANGE/ns:Actionns:Version2/ns:Versionns:ActionDate2014-12-19 14:39:34.116/ns:ActionDatens:ActionDateLong1419021574116/ns:ActionDateLongns:UserID115801/ns:UserIDns:EventID29729372/ns:EventIDns:UserNametkcdh06/ns:UserNamens:RepositoryID34484/ns:RepositoryIDns:RepositoryNameUPC_VENDOR_COUNTRY/ns:RepositoryNamens:RecordID000784672453882/ns:RecordIDns:RecordIDExt164783-US/ns:RecordIDExtns:RecordKey109322645/ns:RecordKeyns:StateCONFIRMED/ns:Statens:IsActiveY/ns:IsActive/ns:RecordNotificationEvent/ns:NotificationEvent" transactiondataafter= notificationchnl= replydestination=UP.MDM.DAILYDELTA.FEEDER.RECEIVE.V1 stacktrace="Job-4976630 Error in [BusinessProcesses/Processes/CIMWSCall.process/Retrieve the data from cim]

12/19/14 2:39:40.000 PM at com.tibco.plugin.soap.SOAPSendReceiveActivity.postEval(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.TaskImpl.eval(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.Job.a(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.Job.k(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.JobDispatcher$JobCourier.a(Unknown Source)

12/19/14 2:39:40.000 PM at com.tibco.pe.core.JobDispatcher$JobCourier.run(Unknown Source)

12/19/14 2:39:40.000 PM This is the response from CIM: " status=Notified resolutiondescription= dataencoding= correlationid= renderid= resolvedelaytime="2014-12-19 14:40:04" eventtype= jmsproperties= custom=

transactiondata field is getting split into multiple events.

Is there any way i can index the XML data with out breaking into multiple events?

Tags (2)
0 Karma
1 Solution

mbenwell
Communicator

Does the xml have line feed characters in it? Splunk may be using the line feeds to break up events.

What does the data look like if you run the query from the Query view in the DB Connect app? If it looks correct, then I'd suggest the issue is the line breaking.

You might need to adjust the line breaking parameters for your sourcetype in props.conf (ref. here http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf). SHOULD_LINEMERGE should be true (default). You could grab a sample (put into a txt file) of what the correct output should look like and use the data inputs wizard to test/fix line breaking issues.

View solution in original post

0 Karma

mbenwell
Communicator

Does the xml have line feed characters in it? Splunk may be using the line feeds to break up events.

What does the data look like if you run the query from the Query view in the DB Connect app? If it looks correct, then I'd suggest the issue is the line breaking.

You might need to adjust the line breaking parameters for your sourcetype in props.conf (ref. here http://docs.splunk.com/Documentation/Splunk/6.2.1/admin/Propsconf). SHOULD_LINEMERGE should be true (default). You could grab a sample (put into a txt file) of what the correct output should look like and use the data inputs wizard to test/fix line breaking issues.

0 Karma

tmuthuk
Path Finder

Hi mbenwell,

Thanks a lot for your suggestion.

The data looks clean , when run the query from query view. I also tried changing the line breaking parameter SHOULD_LINEMERGE should be true . But it is still not indexing properly.

0 Karma

mbenwell
Communicator

What else is configured from the sourcetype you're assigning it? Did you restart Splunkd after modifying the sourcetype? line breaking is one of those things that needs a restart

Something else to check, do you have a distributed deployment? Where did you apply the props.conf changes? Line breaking should be performed on the system that you are running DB connect on (expecting DB Connect is on a search head) assuming you are sending 'cooked' data to the indexer, which is the default for tcpout (in outputs.conf). If it is not sending cooked data, then linebreaking will need to be done at the indexer.

0 Karma
Get Updates on the Splunk Community!

Enterprise Security Content Update (ESCU) | New Releases

In December, the Splunk Threat Research Team had 1 release of new security content via the Enterprise Security ...

Why am I not seeing the finding in Splunk Enterprise Security Analyst Queue?

(This is the first of a series of 2 blogs). Splunk Enterprise Security is a fantastic tool that offers robust ...

Index This | What are the 12 Days of Splunk-mas?

December 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...