<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/DB-Connect-Why-am-I-missing-some-DB-entries-via-dbmon-tail-using/m-p/168373#M98205</link>
    <description>&lt;P&gt;Using a rising column that can have duplicates (more precisely, a column that is not strictly monotonically rising) is quite problematic. You're probably filtering &lt;CODE&gt;WHERE $rising_column$ &amp;gt; ?&lt;/CODE&gt;, so if you get one row with value X it'll ignore other rows with value X that are inserted later.&lt;/P&gt;

&lt;P&gt;The best way to solve this is to have a strictly monotonically rising column. Talk to your DBA if there's stuff you may not know about that can be used, or talk to your developers if they can add such a column.&lt;/P&gt;

&lt;P&gt;A neat way to get around this whole mess is to introduce a delay/offset into your query. If you know the values inserted into the table are at most T seconds old then you could add a condition to the WHERE that only looks at rows older than T seconds. This way you will not look at timestamps that have been partially written, ie timestamps that might get a later row inserted with the same timestamp.&lt;/P&gt;</description>
    <pubDate>Sat, 27 Dec 2014 02:54:38 GMT</pubDate>
    <dc:creator>martin_mueller</dc:creator>
    <dc:date>2014-12-27T02:54:38Z</dc:date>
    <item>
      <title>DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/DB-Connect-Why-am-I-missing-some-DB-entries-via-dbmon-tail-using/m-p/168372#M98204</link>
      <description>&lt;P&gt;Hi experts,&lt;/P&gt;

&lt;P&gt;I am trying to find out why i am missing some DB entries in my splunklogs from my "postgresql" DB via DBConnect (DBX) "dbmon-tail" ? &lt;BR /&gt;
thing is that i can only use a timestamp as "Rising Column" but for some entries the timestamp seems to be the same duplicate timestamps - for the same time more entries).&lt;BR /&gt;
my DB uses epochtime as timestamp.&lt;/P&gt;

&lt;P&gt;Could it be that i have a problem with my rising Culomn and therefore missing entries...as i have more entries at exact the same time in seconds?&lt;/P&gt;

&lt;P&gt;When i am trying to do a query of the DB table i am getting all the entries...&lt;BR /&gt;
only if i to "dbmon-tail" i miss them....&lt;/P&gt;

&lt;P&gt;my sql-query Config in splunk looks like (of course these query does not make much sense but i cleared some entries):&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;*select &lt;BR /&gt;
    scheduled_job_run.started_at,                                                                   #timeColumn – in EpochTime &lt;BR /&gt;
    scheduled_job_run.last_filename  from &lt;BR /&gt;
           job_run_stat,&lt;BR /&gt;
           scheduled_job_run, where &lt;BR /&gt;
    job_run_stat.job_id=scheduled_job_run.job_id AND    scheduled_job_run.started_at &amp;gt;= (now() - '3 hour'::INTERVAL)          &lt;/P&gt;

&lt;H1&gt;just a security to not query the whole DB in case of a long "query-break" {{AND $rising_column$ &amp;gt; ?}}*&lt;/H1&gt;

&lt;HR /&gt;

&lt;P&gt;and my DBX Config looks like: Im Config-File habe ich folgendes geändert:&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;&lt;EM&gt;[dbmon-tail://x.x.x.x/Test] host = hostname interval = auto output.format&lt;BR /&gt;
= kv output.timestamp = 1 output.timestamp.format = yyyy-MM-dd HH:mm:ss output.timestamp.parse.format&lt;BR /&gt;
= yyyy-MM-dd HH:mm:ss query = select \r\n     to_char(scheduled_job_run.started_at,'YYYY-MM-DD HH24:MI:SS') log_date,\r\n    scheduled_job_run.started_at,\r\n   scheduled_job_run.last_filename \r\n from \r\n           job_run_stat,\r\n  scheduled_job_run,\r\n where \r\n      scheduled_job_run.started_at &amp;gt;= (now()&lt;BR /&gt;
- '3 hour'::INTERVAL)  \r\n {{AND $rising_column$ &amp;gt; ?}} sourcetype = sourcetype_name table = job_run_stat_interval tail.rising.column = log_date&lt;/EM&gt;&lt;/P&gt;

&lt;HR /&gt;

&lt;P&gt;is there a way to guarantee the query of all entries of DB??&lt;/P&gt;

&lt;P&gt;Hope you have some good hints for me &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;

&lt;P&gt;Regards jo &lt;/P&gt;</description>
      <pubDate>Mon, 28 Sep 2020 18:32:20 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/DB-Connect-Why-am-I-missing-some-DB-entries-via-dbmon-tail-using/m-p/168372#M98204</guid>
      <dc:creator>johann_brajer</dc:creator>
      <dc:date>2020-09-28T18:32:20Z</dc:date>
    </item>
    <item>
      <title>Re: DB Connect: Why am I missing some DB entries via dbmon-tail using a rising column with duplicate timestamps?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/DB-Connect-Why-am-I-missing-some-DB-entries-via-dbmon-tail-using/m-p/168373#M98205</link>
      <description>&lt;P&gt;Using a rising column that can have duplicates (more precisely, a column that is not strictly monotonically rising) is quite problematic. You're probably filtering &lt;CODE&gt;WHERE $rising_column$ &amp;gt; ?&lt;/CODE&gt;, so if you get one row with value X it'll ignore other rows with value X that are inserted later.&lt;/P&gt;

&lt;P&gt;The best way to solve this is to have a strictly monotonically rising column. Talk to your DBA if there's stuff you may not know about that can be used, or talk to your developers if they can add such a column.&lt;/P&gt;

&lt;P&gt;A neat way to get around this whole mess is to introduce a delay/offset into your query. If you know the values inserted into the table are at most T seconds old then you could add a condition to the WHERE that only looks at rows older than T seconds. This way you will not look at timestamps that have been partially written, ie timestamps that might get a later row inserted with the same timestamp.&lt;/P&gt;</description>
      <pubDate>Sat, 27 Dec 2014 02:54:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/DB-Connect-Why-am-I-missing-some-DB-entries-via-dbmon-tail-using/m-p/168373#M98205</guid>
      <dc:creator>martin_mueller</dc:creator>
      <dc:date>2014-12-27T02:54:38Z</dc:date>
    </item>
  </channel>
</rss>

