Splunk DB Connect 2: missing events

lkeli_spl
Engager

Hey all,

I've searched for an answer to this but cannot see one, so apologies if this has been answered before.

I am using DB Connect 2 to pull a large volume of data (about 60,000 events every 30 minutes from one database) from a variety of Oracle databases into indexes. I noticed that not all events are indexed. When I check the health tab in DB Connect, everything seems OK.

My observations:
1. When I run the query preview in the DB Connect Operations tab, I can verify that the data is there; it is making it into the index.
2. When I use dbxquery in a search, all events from the database are returned correctly.
3. I checked the _internal index and did not find any errors.
4. Decreasing the "Fetch Size" parameter (from 5000 to 800, then to 300) seems to reduce the number of missed events, but still not all data is indexed.
5. The indexer often runs out of free swap, even though there is free RAM. Could this be the problem?

Any help in where I can look to troubleshoot would be appreciated.
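One way to narrow down where the loss happens is to diff the two result sets from observations 1 and 2: export the key column both from a dbxquery search and from a search over the index, then compare the two. A minimal sketch in Python, assuming both searches have been exported to CSV and that the key column is called ID (the file names and column name are hypothetical):

```python
# Sketch: find which database rows never reached the index.
# Assumes two CSV exports, e.g. from "| dbxquery ... | table ID"
# and from "index=cft_docum | table ID". File names and the "ID"
# column are placeholders for your real key column.

import csv

def load_ids(path, column="ID"):
    """Read one column of a CSV export into a set of IDs."""
    with open(path, newline="") as f:
        return {row[column] for row in csv.DictReader(f)}

def missing_ids(db_csv, index_csv):
    """IDs present in the database result but absent from the index."""
    return sorted(load_ids(db_csv) - load_ids(index_csv))
```

Once you know which specific rows are missing, you can check whether they cluster around checkpoint boundaries or around particular timestamps.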

We have 1 indexer and several search heads:
Splunk Enterprise Server 6.5.2
Linux, 47.1 GB Physical Memory, 12 CPU Cores

inputs.conf
[mi_input://**]
connection = ...
enable_query_wrapping = 1
index = cft_docum
input_timestamp_column_fullname = (001) NULL.TIME.TIMESTAMP
input_timestamp_column_name = TIME
interval = 60
max_rows = 5000000
mode = tail
output_timestamp_format = yyyy-MM-dd HH:mm:ss
query = SELECT /*+ opt_param('db_file_multiblock_read_count',1) */ ...\
FROM ...\
WHERE t1.TIME > sysdate-1/24\
AND t1.STATE_ID = t2.ID\
AND t1.CLASS_ID = t2.CLASS_ID\
AND t1.OBJ_ID = t3.ID\
AND t1.OBJ_ID = t4.ID
source = //...
sourcetype = ...
tail_rising_column_fullname = (001) NULL.TIME.TIMESTAMP
tail_rising_column_name = TIME
ui_query_catalog = NULL
ui_query_mode = advanced
disabled = 0
auto_disable = false
tail_rising_column_checkpoint_value = 1563454881499
fetch_size = 800

landen99
Motivator

If the SQL query takes too long, the input will fail in DB Connect. The time DB Connect requires grows geometrically with the number of events (the amount of data).

Pulling the data directly with Python instead is much more efficient, much faster, and much more reliable.

If you must use DB Connect, decrease the amount of data pulled at a time and increase the frequency of the input runs.
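The chunked-pull idea can be sketched with Python's DB-API `fetchmany`. This is a minimal illustration using an in-memory SQLite table as a stand-in; against Oracle you would open the connection with an Oracle driver such as cx_Oracle instead, and the table and column names here are made up:

```python
import sqlite3  # stand-in for an Oracle driver such as cx_Oracle

def pull_in_chunks(conn, query, params=(), chunk_size=800):
    """Run the query once and yield rows in small batches, so no
    single fetch has to hold the whole result set in memory."""
    cur = conn.cursor()
    cur.execute(query, params)
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        yield from rows

# Demo against a throwaway in-memory table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, time TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i, "2019-07-18 12:00:00") for i in range(2500)])
rows = list(pull_in_chunks(conn, "SELECT id, time FROM events",
                           chunk_size=800))
```

The same loop shape works with any DB-API 2.0 driver; only the `connect()` call and the SQL dialect change.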
