Splunk Dev

Splunk DB Connect Output formatting values - what do they mean?

baerrach
Path Finder

http://docs.splunk.com/Documentation/DBX/latest/DeployDBX/Configuredatabasemonitoring does not describe what the different Output formats are doing.

Output formatting

These settings determine how the results are converted into a text-based format Splunk can index.

Formats

    * Key-Value based
    * Multiline Key-Value based
    * CSV
    * Template based
    * Timestamp output 

Initially I chose Key-Value based assuming that DB Connect would create the Key from the Column name and the Value from the data in the column.

I then noticed that varchar fields with new lines were being truncated.
To fix this I had to:

  • Delete all my Database Inputs
  • Stop Splunk
  • run splunk clean eventdata
  • Delete the persistent state for the Database Inputs (see How to reset tail.rising)
  • Start Splunk
  • Create all my Database Inputs - this time with "Multi-line Key-Value format"

So my question is: What do the different values of Output Format mean and can anyone provide guidance on when to select the different values? (Ideally this should be in the Documentation - Configuring Database Monitoring

Tags (1)
1 Solution

ziegfried
Influencer

1) Key-Value format

DB Connect formats the database results by generating key-value pairs in the form of <Column-Name>=<Column-Value>. The value is quoted if it contains certain characters. Those key value pairs are concatenated (seperated by a space). If output.timestamp is enabled, then the timestamp value will be printed at the beginning of the line.

2013-04-05T15:00:23.000 ID=4711 username=wuzi123 message="User has been disabled"

The sourcetype dbmon:kv contains the necessary index- and search-time settings to deal with this format. It can either be used directly (by simply omitting the sourcetype for the input) or by copying it's settings.

[dbmon:kv]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

2) Multiline Key-Value format

Essentially the same as the regular key-value format with the exception that the key-value pairs are seperated by a newline and there is no quoting of values.

2013-04-05T15:00:23.000
ID=4711
username=wuzi123
message=User has been disabled

The sourcetype dbmon:mkv contains the necessary index- and search-time settings.

[dbmon:mkv]
KV_MODE = none
REPORT-mkv = dbx-mkv
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
LINE_BREAKER_LOOKBEHIND = 10000

3) CSV

The values of the database result are format as comma separated values (regular CSV with quoting):

2013-04-05T15:00:23.000,"4711","wuzi123","User has been disabled"

The field extraction for this format has to be done manually. The best option here is a transforms.conf stanza with FIELDS and DELIMS.

3a) CSV with Headers

Does the same as CSV, but it will write the a header line into each file that is being generated containing a comma-separated list of the columns names from the database result.

4) Template

This format is the most flexible one, as it allows you to generate any format in the output.template option. The template should contain replacement tokens in the form of $<Column-Name>$. Those tokens are being replaced with the corresponding column value.

Example Template:

An even occurred at $timestamp$. User $user$ ID $ID$: $message$

would produce

An even occurred at 2013-04-05T15:00:23.000. User wuzi123 ID 4711: User has been disabled

DB Connect ships with a props.conf stanza that handles line-breaking for this format. The settings should be copied the sourcetype used for the database input.

[source::...tpl_*.dbmonevt]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
HEADER_MODE = firstline

View solution in original post

ziegfried
Influencer

1) Key-Value format

DB Connect formats the database results by generating key-value pairs in the form of <Column-Name>=<Column-Value>. The value is quoted if it contains certain characters. Those key value pairs are concatenated (seperated by a space). If output.timestamp is enabled, then the timestamp value will be printed at the beginning of the line.

2013-04-05T15:00:23.000 ID=4711 username=wuzi123 message="User has been disabled"

The sourcetype dbmon:kv contains the necessary index- and search-time settings to deal with this format. It can either be used directly (by simply omitting the sourcetype for the input) or by copying it's settings.

[dbmon:kv]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)

2) Multiline Key-Value format

Essentially the same as the regular key-value format with the exception that the key-value pairs are seperated by a newline and there is no quoting of values.

2013-04-05T15:00:23.000
ID=4711
username=wuzi123
message=User has been disabled

The sourcetype dbmon:mkv contains the necessary index- and search-time settings.

[dbmon:mkv]
KV_MODE = none
REPORT-mkv = dbx-mkv
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
LINE_BREAKER_LOOKBEHIND = 10000

3) CSV

The values of the database result are format as comma separated values (regular CSV with quoting):

2013-04-05T15:00:23.000,"4711","wuzi123","User has been disabled"

The field extraction for this format has to be done manually. The best option here is a transforms.conf stanza with FIELDS and DELIMS.

3a) CSV with Headers

Does the same as CSV, but it will write the a header line into each file that is being generated containing a comma-separated list of the columns names from the database result.

4) Template

This format is the most flexible one, as it allows you to generate any format in the output.template option. The template should contain replacement tokens in the form of $<Column-Name>$. Those tokens are being replaced with the corresponding column value.

Example Template:

An even occurred at $timestamp$. User $user$ ID $ID$: $message$

would produce

An even occurred at 2013-04-05T15:00:23.000. User wuzi123 ID 4711: User has been disabled

DB Connect ships with a props.conf stanza that handles line-breaking for this format. The settings should be copied the sourcetype used for the database input.

[source::...tpl_*.dbmonevt]
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]---91827349873-dbx-end-of-event---[\r\n])
HEADER_MODE = firstline

anwarmian
Communicator

Clear explanations with nice examples. Thanks!

richgalloway
SplunkTrust
SplunkTrust

When using the CSV format, how does one link the inputs.conf and transforms.conf files? Should the respective files have identical stanza names?

---
If this reply helps you, Karma would be appreciated.
0 Karma

baerrach
Path Finder

Any updates to the answer when the column values contain newlines, i.e. from the original question "I then noticed that varchar fields with new lines were being truncated."

For me only Multiline Key-Value based worked

0 Karma

jhambrick
New Member

Did you ever find a solution to the issue where column values contain new lines?

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...