I'm a Splunk newbie, so feel free to challenge any of my assumptions.
I'm tasked with integrating our proprietary product's event/alert database.
I believe the correct approach (in a simple case) is to install DB Connect and a forwarder (universal or heavy?) on the database server host, and Splunk Enterprise as an indexer/search head on a "query/reporting" host.
The difficulty I'm encountering is that at least one table has a column containing XML. This XML describes a variable list of additional fields that depends on the event/alert type (similar to a Windows event log), and some of those additional fields include text with commas, which breaks the CSV processing.
These fields should be searchable and selectable on the search head, but I'm not sure what the best approach to processing them is.
I started to look into custom search commands to transform the SQL Server record into an appropriate form:
A CSV representation seems to be a problem, not just because of delimiter characters in the field text, but because producing the header row requires processing all records first to determine the full set of additional fields.
One option is to convert the data into a "key=value;" representation.
Can I define a custom source type to handle the data?
I expect the answer is probably a combination of these approaches.
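To make the "key=value;" option concrete, here is a minimal sketch of what a custom search command or scripted input could do to the XML column. The element and attribute names (`field`, `name`) are hypothetical stand-ins for whatever your product's XML actually uses; quoting the values sidesteps the comma problem, and no header row is needed because each event carries its own keys:

```python
# Sketch: flatten an XML "details" column into a key="value"; string that
# Splunk's key=value extraction can handle. Element/attribute names are
# hypothetical -- adapt them to your product's actual XML schema.
import xml.etree.ElementTree as ET

def xml_to_kv(xml_text):
    """Turn <field name="...">value</field> children into key="value"; pairs."""
    root = ET.fromstring(xml_text)
    pairs = []
    for child in root:
        name = child.get("name", child.tag)
        # Quote each value so embedded commas are harmless; swap double
        # quotes for single quotes so the quoting itself can't break.
        value = (child.text or "").replace('"', "'")
        pairs.append('%s="%s"' % (name, value))
    return "; ".join(pairs)

sample = ('<event><field name="user">alice</field>'
          '<field name="msg">disk full, volume C:</field></event>')
print(xml_to_kv(sample))
# user="alice"; msg="disk full, volume C:"
```

Because every event lists its own field names, the variable field set per event/alert type stops being a problem.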
BTW, I installed Splunk Enterprise and Splunk DB Connect, but even with a reduced set of records, I violated the daily limits on the demo license. Advice on avoiding this would be helpful.
If your DB admins aren't really able to help you with this,
you should try to get that XML field onto a single line, using tostring() in the query as required.
When you see your field in Splunk, you can easily swap it into the _raw variable and then use all of Splunk's built-in methods for XML structures, e.g. xmlkv (via an XML add-on, I believe) or spath (a default command):
<search that returns one row from the table> | eval temp=_raw | eval _raw=xmlfield | xmlkv
In one of my cases, the XML was also HTML-encoded. Just before the search above,
you would need to replace the HTML-encoded characters using Splunk's sed, e.g. replacing "&lt;" with "<".
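For reference, the entity replacement that sed expression performs is the standard HTML-unescaping shown below in Python (the sample string is hypothetical); `html.unescape` handles &lt;, &gt;, &amp;, &quot; and friends in one call:

```python
# Sketch: the same HTML-entity replacement the Splunk sed step performs,
# shown with Python's standard library for clarity.
import html

encoded = "&lt;event&gt;&lt;field name=&quot;user&quot;&gt;alice&lt;/field&gt;&lt;/event&gt;"
print(html.unescape(encoded))
# <event><field name="user">alice</field></event>
```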
I would create a custom view inside the source database on the table data that you need where it is transformed into a more coherent and easily-digestible format. Exploit the expertise of your DBAs; get somebody else to do the work. Splunk doesn't have to do all of the heavy lifting.
You only need to test with one or a few events. Limit your queries/SELECTs in DB Connect and you'll stay under the demo license limit. There's no need to hit it with everything. Limit the timeframe, or get very specific, or both. Or manually create a table in your database with only a few hand-picked entries, and then configure DB Connect to hit that table. You're going to need a way to test, so that needs to be figured out.
The rest of your question might be solvable in props.conf: https://answers.splunk.com/answers/187195/how-to-add-and-parse-xml-data-in-splunk.html
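For instance, a sourcetype stanza along these lines (stanza name hypothetical, as a sketch only) tells Splunk to extract fields from XML automatically at search time:

```
[my_product:alerts]
KV_MODE = xml
TRUNCATE = 0
```

KV_MODE = xml enables automatic search-time XML field extraction, and TRUNCATE = 0 keeps long XML events from being cut off at the default line-length limit.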