All Apps and Add-ons

DB Connect 2 inefficient querying

jsilverbears
Path Finder

When creating an input I could not get the Choose and Preview Table (Step 2 of 4) to fetch any data. It just said Executing Query with a spinner.

I had the following options selected:
Catalog: db_name
Schema: NULL
Table: table1
Max rows: 100

I took a look at what processes were running against the database directly and I saw the following query:
SELECT * FROM (SELECT * FROM db_name.table1) t ORDER BY id ASC

This shows no limit and that table has several gigs. I had to put in a date limit in the actual query (going into advanced query mode) to get the query results to finish. I also noted that there is a checkpoint value so that Splunk DB Connect 2 knows where it left off but when I check the actual query being created as it is pulling data after making it an input MySQL shows:
SELECT * FROM (SELECT * FROM db_name.table1 WHERE date > '2013-03-28 00:00:00') t ORDER BY id ASC

There is no id > anything when it is fetching more data as an input. I am worried that this input's query is going to start stalling after a few days. It seems that Splunk DB Connect 2 is doing all the limiting after it gets all the data from the database table. Is there any way to increase the efficiency of the queries Splunk DB Connect 2 is creating automatically or am I missing something here?

Tags (1)

mabramov
Explorer

Try disabling query wrapping. Fixed it for me on MySQL 5.6.
link text

0 Karma

dlisicre
Explorer

Hi jsilverbears, did you manage to solve this issue? Same is happening on my side, making queries that should take miliseconds take 20seconds.

0 Karma

jsilverbears
Path Finder

Sort of. I pull data under a rolling window. So, for me, I pull the data in the last 24 hours with the rising column. This limits the data that it has to go through consistently.

I also pull the data every 2 minutes. Now this means that it pulls all the data every time but the internal check keeps only the new data coming in.

The additional reason I pull for 24 hours is just a safety issue. If something goes wrong and I can't pull data, this gives us a deadline to fix it.

Your limits may vary for your application but that's how I did it.

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...