Archive

DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps?

New Member

Hi experts,

I am trying to find out why i am missing some DB entries in my splunklogs from my "postgresql" DB via DBConnect (DBX) "dbmon-tail" ?
thing is that i can only use a timestamp as "Rising Column" but for some entries the timestamp seems to be the same duplicate timestamps - for the same time more entries).
my DB uses epochtime as timestamp.

Could it be that i have a problem with my rising Culomn and therefore missing entries...as i have more entries at exact the same time in seconds?

When i am trying to do a query of the DB table i am getting all the entries...
only if i to "dbmon-tail" i miss them....

my sql-query Config in splunk looks like (of course these query does not make much sense but i cleared some entries):


*select
scheduledjobrun.startedat, #timeColumn – in EpochTime
scheduled
jobrun.lastfilename from
jobrunstat,
scheduledjobrun, where
jobrunstat.jobid=scheduledjobrun.jobid AND scheduledjobrun.started_at >= (now() - '3 hour'::INTERVAL)

just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ > ?}}*


and my DBX Config looks like: Im Config-File habe ich folgendes geändert:


[dbmon-tail://x.x.x.x/Test] host = hostname interval = auto output.format
= kv output.timestamp = 1 output.timestamp.format = yyyy-MM-dd HH:mm:ss output.timestamp.parse.format
= yyyy-MM-dd HH:mm:ss query = select \r\n tochar(scheduledjobrun.startedat,'YYYY-MM-DD HH24:MI:SS') logdate,\r\n scheduledjobrun.startedat,\r\n scheduledjobrun.lastfilename \r\n from \r\n jobrunstat,\r\n scheduledjobrun,\r\n where \r\n scheduledjobrun.startedat >= (now()
- '3 hour'::INTERVAL) \r\n {{AND $risingcolumn$ > ?}} sourcetype = sourcetypename table = jobrunstatinterval tail.rising.column = logdate


is there a way to guarantee the query of all entries of DB??

Hope you have some good hints for me 🙂

Regards jo

0 Karma

SplunkTrust
SplunkTrust

Using a rising column that can have duplicates (more precisely, a column that is not strictly monotonically rising) is quite problematic. You're probably filtering WHERE $rising_column$ > ?, so if you get one row with value X it'll ignore other rows with value X that are inserted later.

The best way to solve this is to have a strictly monotonically rising column. Talk to your DBA if there's stuff you may not know about that can be used, or talk to your developers if they can add such a column.

A neat way to get around this whole mess is to introduce a delay/offset into your query. If you know the values inserted into the table are at most T seconds old then you could add a condition to the WHERE that only looks at rows older than T seconds. This way you will not look at timestamps that have been partially written, ie timestamps that might get a later row inserted with the same timestamp.

0 Karma