Database ETL data timestamp overlap from upstream ETL issue

Explorer

We are pulling data from a database via DB Connect 3. The data is loaded into the database by an upstream ETL integration, and the table has a field called updated_on. This field is our best indication of when a record last changed, so it looked like a good choice for both the rising column and the event timestamp. However, the ETL process stamps hundreds of records with the same time, which results in sub-second ordering warnings when we search this data.

Is there a suggested way to handle this timestamp overlap when we have no sub-second data to go by? Or should we just ignore the sub-second order warnings?

Error: Events might not be returned in sub-second order due to search memory limits. See search.log for more information. Increase the value of the following limits.conf setting: [search]:max_rawsize_perchunk.


Re: Database ETL data timestamp overlap from upstream ETL issue

SplunkTrust

If you're doing any searches that require timeline or first/last type calculations, then this can be a problem. If that's not the case, you can ignore the warning.
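To illustrate why ties matter for first/last calculations, here is a minimal Python sketch. It does not reproduce how Splunk orders events internally; the event tuples and the `latest()` helper are made up for illustration. It only shows that when several records share one timestamp, "the latest record" depends on arrival order rather than on the data itself.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # (updated_on, payload); hypothetical shape


def latest(events: List[Event]) -> Event:
    """Pick the event with the greatest timestamp.

    max() returns the first maximal element it encounters, so ties
    are resolved by input order, not by anything in the data.
    """
    return max(events, key=lambda e: e[0])


# Three records stamped with the same time by the (hypothetical) ETL run.
batch = [
    ("2020-01-01 00:00:00", "row-a"),
    ("2020-01-01 00:00:00", "row-b"),
    ("2020-01-01 00:00:00", "row-c"),
]

# Same data, different arrival order, different "latest" record.
print(latest(batch))
print(latest(list(reversed(batch))))
```

If your searches never depend on which of the tied records comes first or last, this ambiguity is harmless.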

You could also look for a column in the database that is an auto-incrementing number and append it to the timestamp column in your SQL query, then use the combined value as the rising column. You could even generate a counter in the SQL query itself and append that instead.

Something like this may help:
https://database.guide/6-ways-to-concatenate-string-and-number-sql-server/
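As a rough sketch of the concatenation idea, the Python snippet below uses an in-memory SQLite database as a stand-in for the real one. The table name `etl_records`, the `id` and `payload` columns, and the `composite_rising_rows()` helper are all assumptions for illustration; only `updated_on` comes from the question. The `||` operator and `printf()` are SQLite syntax, so SQL Server or another database would need its own concatenation and padding functions. Zero-padding the number keeps string comparison consistent with numeric order.

```python
import sqlite3


def composite_rising_rows(conn: sqlite3.Connection, checkpoint: str):
    """Return rows whose composite key sorts after the checkpoint, oldest first."""
    sql = """
        SELECT rising_key, id, updated_on, payload
        FROM (
            -- Hypothetical composite key: timestamp plus zero-padded id.
            SELECT updated_on || '-' || printf('%010d', id) AS rising_key,
                   id, updated_on, payload
            FROM etl_records
        )
        WHERE rising_key > ?
        ORDER BY rising_key ASC
    """
    return conn.execute(sql, (checkpoint,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE etl_records ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "updated_on TEXT NOT NULL, "
    "payload TEXT)"
)
# Three rows share one ETL timestamp; a fourth arrives a day later.
conn.executemany(
    "INSERT INTO etl_records (updated_on, payload) VALUES (?, ?)",
    [
        ("2020-01-01 00:00:00", "a"),
        ("2020-01-01 00:00:00", "b"),
        ("2020-01-01 00:00:00", "c"),
        ("2020-01-02 00:00:00", "d"),
    ],
)

# An empty checkpoint sorts before every key, so the first run reads everything.
first_run = composite_rising_rows(conn, "")
keys = [row[0] for row in first_run]

# Resuming from the second key picks up only the rows after it,
# even though the third row has the same updated_on value.
resumed = composite_rising_rows(conn, keys[1])
print(keys)
print([row[1] for row in resumed])
```

The `WHERE rising_key > ? ORDER BY rising_key ASC` shape is meant to resemble a rising-column input query with a checkpoint placeholder; check the DB Connect documentation for the exact form your version expects. Note that this only works if the ETL never stamps a row with a timestamp at or before the current checkpoint.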

Otherwise you'll need a DBA to add a rising column to the table (maybe).
