just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ > ?}}*

johann_brajer · ‎12-22-2014

Hi experts,

I am trying to find out why i am missing some DB entries in my splunklogs from my "postgresql" DB via DBConnect (DBX) "dbmon-tail" ?
thing is that i can only use a timestamp as "Rising Column" but for some entries the timestamp seems to be the same duplicate timestamps - for the same time more entries).
my DB uses epochtime as timestamp.

Could it be that i have a problem with my rising Culomn and therefore missing entries...as i have more entries at exact the same time in seconds?

When i am trying to do a query of the DB table i am getting all the entries...
only if i to "dbmon-tail" i miss them....

my sql-query Config in splunk looks like (of course these query does not make much sense but i cleared some entries):

*select
scheduled_job_run.started_at, #timeColumn – in EpochTime
scheduled_job_run.last_filename from
job_run_stat,
scheduled_job_run, where
job_run_stat.job_id=scheduled_job_run.job_id AND scheduled_job_run.started_at >= (now() - '3 hour'::INTERVAL)

just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ > ?}}*

and my DBX Config looks like: Im Config-File habe ich folgendes geändert:

[dbmon-tail://x.x.x.x/Test] host = hostname interval = auto output.format
= kv output.timestamp = 1 output.timestamp.format = yyyy-MM-dd HH:mm:ss output.timestamp.parse.format
= yyyy-MM-dd HH:mm:ss query = select \r\n to_char(scheduled_job_run.started_at,'YYYY-MM-DD HH24:MI:SS') log_date,\r\n scheduled_job_run.started_at,\r\n scheduled_job_run.last_filename \r\n from \r\n job_run_stat,\r\n scheduled_job_run,\r\n where \r\n scheduled_job_run.started_at >= (now()
- '3 hour'::INTERVAL) \r\n {{AND $rising_column$ > ?}} sourcetype = sourcetype_name table = job_run_stat_interval tail.rising.column = log_date

is there a way to guarantee the query of all entries of DB??

Hope you have some good hints for me 🙂

Regards jo

martin_mueller · ‎12-26-2014

Using a rising column that can have duplicates (more precisely, a column that is not strictly monotonically rising) is quite problematic. You're probably filtering WHERE $rising_column$ > ?, so if you get one row with value X it'll ignore other rows with value X that are inserted later.

The best way to solve this is to have a strictly monotonically rising column. Talk to your DBA if there's stuff you may not know about that can be used, or talk to your developers if they can add such a column.

A neat way to get around this whole mess is to introduce a delay/offset into your query. If you know the values inserted into the table are at most T seconds old then you could add a condition to the WHERE that only looks at rows older than T seconds. This way you will not look at timestamps that have been partially written, ie timestamps that might get a later row inserted with the same timestamp.

DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps?

just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ > ?}}*

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard

Are you a member of the Splunk Community?

DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps?

just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ > ?}}*

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard