Splunk DB Connect: How to index 1000 separate database queries?

jplumsdaine22
Influencer

I need to index a query on about 1000 separate databases. I'm guessing I'm going to run into issues at that scale. Has anybody successfully run dbmon-tail with about 1000 connections before? Or has anyone worked with indexing database queries with another method at this scale?

jplumsdaine22
Influencer

For those interested, we are currently running 310 DB inputs using dbx2 (DB Connect 2) and intend to scale out to several thousand shortly.

A few issues to watch for if you're trying to do the same thing:

  1. You must run this on a heavy forwarder!
  2. Make sure the splunk user is allowed enough processes and open file handles. You may need to raise the ulimit values for that user, and possibly adjust system-wide limits with sysctl.
  3. Keep a close eye on the heavy forwarder's exec queue. As I understand it, the JVM running the database queries may drop data if the exec queue is blocked, so make sure the heavy forwarder has enough horsepower.
  4. Spread the load: assign different intervals to your inputs. Unfortunately dbx2 isn't very smart about connection management, so you have to stagger the connections manually.
  5. dbx2 keeps the rising column checkpoint in the input stanza in inputs.conf. If you generate inputs.conf programmatically (which you should be doing at this scale), you can accidentally overwrite the tail_rising_column_checkpoint_value key, which will duplicate your data. To avoid this, create the input stanzas in a separate app and leave the tail_rising_column_checkpoint_value key out of them. dbx2 will create splunk_app_db_connect/local/inputs.conf and write tail_rising_column_checkpoint_value there.
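
To illustrate points 4 and 5, here is a minimal Python sketch of how the stanzas could be generated with staggered intervals and without the checkpoint key. Everything except tail_rising_column_checkpoint_value and interval is an assumption on my part: the mi_input:// stanza prefix, the connection, mode, query and tail_rising_column_name keys, the database names, and the interval numbers are placeholders, so check them against the inputs.conf.spec shipped with your DB Connect version before using anything like this.

```python
# Sketch only: key names and the stanza prefix are assumptions, not verified
# against any particular DB Connect release.

CHECKPOINT_KEY = "tail_rising_column_checkpoint_value"


def spread_interval(index, base_seconds=300, step_seconds=7, buckets=20):
    """Give each input a slightly different polling interval so that
    the queries do not all fire at the same moment."""
    return base_seconds + (index % buckets) * step_seconds


def build_inputs_conf(databases, query, rising_column,
                      stanza_prefix="mi_input://"):
    """Return inputs.conf text with one stanza per database.

    The checkpoint key is never written, so regenerating this file
    cannot reset the value that dbx2 maintains in its own app.
    """
    lines = []
    for index, db in enumerate(databases):
        settings = {
            "connection": db,                          # assumed key name
            "mode": "tail",                            # assumed key name
            "query": query,                            # assumed key name
            "tail_rising_column_name": rising_column,  # assumed key name
            "interval": str(spread_interval(index)),
        }
        # Guard against someone adding the checkpoint key later on.
        if CHECKPOINT_KEY in settings:
            raise ValueError("do not generate " + CHECKPOINT_KEY)
        lines.append("[" + stanza_prefix + db + "]")
        for key, value in settings.items():
            lines.append(key + " = " + value)
        lines.append("")  # blank line between stanzas
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical database names, for illustration only.
    names = ["db%04d" % n for n in range(1, 4)]
    print(build_inputs_conf(names, "SELECT * FROM events", "id"))
```

The generated text would go into inputs.conf in your own separate app rather than into splunk_app_db_connect, so that the checkpoint dbx2 writes to splunk_app_db_connect/local/inputs.conf is left alone when you regenerate. Using slightly different interval lengths is only one way to stagger the inputs; tune the numbers to your own load.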

ppablo
Community Manager

Thanks for coming back to revisit your post to close it out. Also, welcome back 🙂 been some time since we've seen ya around!

ppablo
Community Manager

Hi @jplumsdaine22

Just to clarify, did you mean dbxquery instead of dbquery since you tagged DB Connect 2?

jplumsdaine22
Influencer

Ah actually I meant dbmon-tail!

ppablo
Community Manager

Oh alrighty, glad we caught that ;D I hope ya find an answer soon!
