
Splunk DB Connect 2: missing events

lkeli_spl
Engager

Hey all,

I've searched for an answer to this but cannot see one, so apologies if this has been answered before.

I am using DB Connect 2 to pull large volumes of data (about 60000 events in 30 minutes from one database) from a variety of Oracle databases into indexes. I noticed that not all events are indexed. When I check the Health tab in DB Connect, everything seems OK.

My observations:
1. In the DB Connect Operations tab, the query preview shows the data, and data is making it into the index.
2. When I use dbquery in a search, all events from the database are returned correctly.
3. I checked my _internal index and did not find any errors.
4. Decreasing the "Fetch Size" parameter (from 5000 to 800, then to 300) seems to reduce the number of missed events, but still not all data is indexed.
5. The indexer often has no free swap space, although there is free RAM. Maybe this is the problem?

Any help on where to look to troubleshoot this would be appreciated.

We have 1 indexer and several search heads:
Splunk Enterprise Server 6.5.2
Linux, 47.1 GB Physical Memory, 12 CPU Cores

inputs.conf
[mi_input://**]
connection = ...
enable_query_wrapping = 1
index = cft_docum
input_timestamp_column_fullname = (001) NULL.TIME.TIMESTAMP
input_timestamp_column_name = TIME
interval = 60
max_rows = 5000000
mode = tail
output_timestamp_format = yyyy-MM-dd HH:mm:ss
query = SELECT /*+ opt_param('db_file_multiblock_read_count',1) */ ...
FROM ...
WHERE t1.TIME > sysdate-1/24\
AND t1.STATE_ID = t2.ID\
AND t1.CLASS_ID = t2.CLASS_ID\
AND t1.OBJ_ID = t3.ID\
AND t1.OBJ_ID = t4.ID
source = //...
sourcetype = ...
tail_rising_column_fullname = (001) NULL.TIME.TIMESTAMP
tail_rising_column_name = TIME
ui_query_catalog = NULL
ui_query_mode = advanced
disabled = 0
auto_disable = false
tail_rising_column_checkpoint_value = 1563454881499
fetch_size = 800
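
For context on the `mode = tail` settings above: a tail input remembers the largest rising-column value it has seen and, on the next run, asks only for rows above that checkpoint. The sketch below is a plain-Python simulation of that idea, not DB Connect's actual code; the strict `>` comparison, the `tail_run` helper, and the sample rows are all assumptions made for illustration. It shows one way rows could be skipped when the rising column is a timestamp: a row that becomes visible late, with a timestamp at or below the checkpoint, is never selected.

```python
# Simulation of a rising-column (tail) input. This is NOT DB Connect code;
# the strict ">" comparison and all data below are illustrative assumptions.

def tail_run(visible_rows, checkpoint):
    """Return (rows with time above checkpoint, updated checkpoint)."""
    new_rows = sorted(row for row in visible_rows if row[0] > checkpoint)
    if new_rows:
        checkpoint = new_rows[-1][0]  # highest rising-column value seen
    return new_rows, checkpoint

# Rows are (time, id). Run 1 sees rows a and b.
run1_visible = [(100, "a"), (105, "b")]
indexed, checkpoint = tail_run(run1_visible, checkpoint=0)

# Before run 2, three more rows become visible:
#   c was committed late with an older time (103),
#   d shares the same time as b (105),
#   e is genuinely newer (110).
run2_visible = run1_visible + [(103, "c"), (105, "d"), (110, "e")]
new_rows, checkpoint = tail_run(run2_visible, checkpoint)
indexed += new_rows

# Rows that exist in the source but were never picked up by any run.
missing = sorted(set(run2_visible) - set(indexed))
print("indexed:", indexed)
print("missing:", missing)
```

In this simulation rows c and d are never picked up, even though a one-off query over the same table would return all five rows. Whether this applies to the input above depends on how rows are written to the source table, so treat it as something to rule out rather than a diagnosis.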

landen99
Motivator

If the SQL query takes too long, then the input will fail in DB Connect. The amount of time required for DB Connect grows geometrically with the number of events (amount of data).

Pulling the data directly using Python instead is much more efficient, much faster, and much more reliable.
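
As a rough illustration of what such a direct pull might look like, here is a minimal sketch. It uses the standard-library `sqlite3` module with an in-memory table as a stand-in for the Oracle source so that it runs anywhere; the table name `events`, its columns, and the helpers `pull_batches` and `format_event` are all hypothetical. It also assumes the table has a unique, increasing `id` to use as the checkpoint. A real script would use an Oracle driver, persist the checkpoint between runs, and forward the lines to Splunk, none of which is shown.

```python
import sqlite3

def pull_batches(conn, checkpoint, batch_size=800):
    """Yield lists of (id, time, state) rows whose id is above checkpoint."""
    cur = conn.cursor()
    cur.execute(
        "SELECT id, time, state FROM events WHERE id > ? ORDER BY id",
        (checkpoint,),
    )
    while True:
        batch = cur.fetchmany(batch_size)  # bounded memory per round trip
        if not batch:
            break
        yield batch

def format_event(row):
    """Render one row as a timestamp-first key=value line."""
    event_id, time, state = row
    return f'{time} id={event_id} state="{state}"'

# Stand-in data: 2000 rows in an in-memory SQLite table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, time TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO events (id, time, state) VALUES (?, ?, ?)",
    [(i, f"2019-07-18 12:00:{i % 60:02d}", "OK") for i in range(1, 2001)],
)

checkpoint = 0
lines = []
for batch in pull_batches(conn, checkpoint):
    lines.extend(format_event(row) for row in batch)
    checkpoint = batch[-1][0]  # advance only after the batch is handled

print(len(lines), "events pulled; checkpoint =", checkpoint)
```

Advancing the checkpoint only after a batch has been handled means a failure mid-run causes a re-read of that batch rather than a gap.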

If you must use DB Connect, decrease the amount of data pulled at a time, and increase the frequency of the input runs.
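
To put rough numbers on that trade-off, the sketch below uses the figures from the question (about 60000 events in 30 minutes) and assumes events arrive at a steady rate, which real traffic may not. The `rows_per_run` helper is hypothetical and only does the arithmetic.

```python
def rows_per_run(events, window_seconds, interval_seconds):
    """Average number of new rows each input run has to fetch,
    assuming events arrive at a uniform rate (an assumption)."""
    return events * interval_seconds / window_seconds

# 60000 events per 30 minutes, for a few candidate input intervals (seconds).
for interval in (300, 60, 30, 10):
    rows = rows_per_run(60000, 30 * 60, interval)
    print(f"interval={interval:>3}s  about {round(rows)} rows per run")
```

At the `interval = 60` from the question this works out to about 2000 new rows per run; halving the interval halves the batch each run has to move.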
