I have an question/Issue with the use of rising option in DB Connect. I'm using Splunk Ent v 9.0x and DB connect 3.9.0
Im trying to understand how DB connect works with the rising option. My issue is that when i run it, the DB server is increasing the memory and swapping on the disk in order to return the results. The database im trying to read has more than100Million entries.
What i did with the configuration of the connection was:
1. When i read the data from the DB to get the results the server responds quickly.
2. When i add the
select * from DB where ID > ? ORDER By ID ASC
the server times out, so i increased the timeout value and i eventually got the option to add the ID number i wanted to track from and proceed to the next window.
3. In the "MAX Rows to Retrieve" field i added 50000 and left the Fetch Size default "300". Execution policy was every 10minutes.
i was monitoring the server database and noticed that each request splunk db connect makes, produces issues on the server. Looks like the "order by ID ASC" makes the DB server to sort all the table, which increases the memory consumption and disk swapping. Returns ALL (i guess) the results and splunk ingests only 50.000 events as specified in the "max rows to retrieve"
i checked also the queries splunk run on the DB side, and the query string didn't show any limit , for example "SELECT * FROM DB WHERE ID > ? order by ID ASC Limit 50000;
ID field is indexed.
Is Splunk DB connect always asking the Database server to sort ALL the records ? How can i import the data, without making performance issue on the DB server? Can i add the limit option in the config file for example ?
EDIT:
IDK if DB connect would work if i put at the end of the query in the db_inputs.conf <SELECT * FROM DB WHERE ID > ? order by ID ASC Limit 50000>. Any thoughts?
Or i could create batch jobs to slowly import data till today and then use the "default" rising string? The question here is if the ORDER bY asc would still make performance issues to the DB server?
... View more