
How come the number of events is different between the database and the search?

dingguibin1
New Member

Hello.

Recently I ran into a problem: the number of events differs between the database and the search.

I confirmed that the number of events was correct when I queried the database directly (picture 1 in the attachment).

However, when I searched the events in DB Connect, the number of events was much bigger than in the DB.

I checked further and found that some events in the database had been imported into Splunk more than once (some twice, some three times).

So the number of events in Splunk is bigger than in the database.

Furthermore, when Splunk first connected to the database, the number of events in Splunk was correct.

But after several days, the number became wrong. Could someone tell me the reason?

Thank you so much!


woodcock
Esteemed Legend

What version of DBConnect? Some versions have a hard-coded limit in db*query.py that you must comment out (yes, I am serious).
Also see here:

https://answers.splunk.com/answers/593476/splunk-db-connect-how-do-you-increase-the-maximum.html


DalJeanis
Legend

So, there are a number of things that could be wrong here.

First, check the configuration of your "rising column". You should have the input set up to bring in only new records, based on some field that is known to track whether records are new: a "date added" timestamp, an auto-incrementing ID, or whatever your schema offers.
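As a rough sketch of what that looks like in DB Connect 3.x, a rising-column input lives in db_inputs.conf along these lines. The stanza name, connection, table, and column below are placeholders, and parameter names can differ between DB Connect versions, so verify against the db_inputs.conf.spec shipped with your version:

    [my_rising_input]
    connection = my_connection
    mode = rising
    # The ? is the checkpoint placeholder; DB Connect substitutes the
    # highest value of the rising column it has already indexed.
    query = SELECT * FROM events WHERE id > ? ORDER BY id ASC
    tail_rising_column_name = id
    interval = 300
    index = main
    sourcetype = db:events

One thing to watch: if the rising column is not strictly increasing and unique (for example, a low-resolution timestamp that repeats across rows), the checkpoint can land mid-batch and rows can be re-read on the next run, which produces exactly the kind of duplicates described above.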

Second, if records in the underlying database are being updated rather than merely added, there is an architectural issue on the Splunk side: an updated, changed record is a new event as far as Splunk is concerned. That means you need to develop your queries to dedup the results on the record key, whatever that might be.
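A minimal sketch, assuming the record key is a field named record_id (substitute whatever your key actually is, along with your own index and sourcetype):

    index=my_db_index sourcetype=db:events
    | dedup record_id

Because a search returns the most recent events first, dedup keeps the latest copy of each record and drops the older ones.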

Alternatively, you might decide to bring in all the records on a schedule (such as every 24 hours) and not read records in Splunk older than that.
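For example, if the batch import runs every 24 hours, constraining the search window to a single import cycle keeps earlier copies out of scope (index and sourcetype are placeholders again):

    index=my_db_index sourcetype=db:events earliest=-24h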

Alternatively... and this is NOT recommended, but you can potentially do it, depending on your use case... you can set up a periodic job to purge the older copy of any duplicates.
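A sketch of such a job, again assuming record_id is the key. Run it without the final delete first to confirm it matches only the older copies, because delete is irreversible for search purposes; it also requires the can_delete role, and it only hides events from search rather than reclaiming disk space:

    index=my_db_index sourcetype=db:events
    | streamstats count AS copy BY record_id
    | where copy > 1
    | delete

streamstats counts events in search order (most recent first), so copy=1 is the newest copy of each record and everything with copy > 1 is an older duplicate.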


p_gurav
Champion

Could you please share your input configuration? Is it a batch input or a rising input?
