Issue with Missing Data Using DB Connect to Ingest Snowflake View

jay3828
Loves-to-Learn

Hello Splunk Community,

We are currently using Splunk Enterprise 9.1.5 and DB Connect 3.7 to collect data from a Snowflake database view. The view returns data correctly when queried directly via SQL.

Here are the specifics of our setup and the issue we're encountering:

  • Data Collection Interval: Every 11 minutes
  • Data Volume: Approximately 75,000 to 80,000 events per day, with peak times around 7 AM to 9 AM CST and 2 PM to 4 PM CST (approximately 20,000 events during these periods)
  • Unique Identifier: The data contains a unique ID column generated by a sequence that increments by 1
  • Timestamp Column: The table includes a STARTDATE column, which is a Timestamp_NTZ (no timezone) in UTC time

Our DB Connect configuration is as follows:
  • Rising Column: ID
  • Metadata: _time is set to the STARTDATE field

The issue we're facing is that Splunk is not ingesting all the data; approximately 30% of the data is missing. The ID column has been verified to be unique, so we suspect that the STARTDATE might be causing the issue. Although each event has a unique ID, the STARTDATE may not be unique since multiple events can occur simultaneously in our large environment.

Has anyone encountered a similar issue, or does anyone have suggestions on how to address this problem? Any insights would be greatly appreciated.

Thank you!

jay3828
Loves-to-Learn

We identified the issue. STARTDATE is a TIMESTAMP_NTZ (no time zone) column, so its values are in UTC, but the config was set to the Eastern time zone. Once that was adjusted, it worked perfectly. A simple misconfiguration, though it took a while to identify. Thanks for your input.
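For anyone hitting the same thing, here is a minimal Python sketch of why the wrong time zone matters. It is not DB Connect code, only an illustration of the arithmetic. The STARTDATE value and the helper name epoch_if_interpreted_as are made up for the example, and it assumes the IANA time zone database is available on the machine running it.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def epoch_if_interpreted_as(naive_ts: datetime, tz_name: str) -> float:
    """Epoch seconds obtained when a zone-less timestamp is read as
    wall-clock time in the given zone (hypothetical helper)."""
    return naive_ts.replace(tzinfo=ZoneInfo(tz_name)).timestamp()


# Hypothetical STARTDATE value: stored without a zone, but meant as UTC.
startdate = datetime(2024, 7, 15, 14, 0, 0)

# Correct reading: the value is UTC.
as_utc = startdate.replace(tzinfo=timezone.utc).timestamp()

# Misconfigured reading: the same digits treated as Eastern time.
as_eastern = epoch_if_interpreted_as(startdate, "America/New_York")

# Positive result means the event is stamped later than it really happened.
shift_hours = (as_eastern - as_utc) / 3600
print(shift_hours)  # 4.0 in July (EDT); a January date gives 5.0 (EST)
```

With that shift, _time lands several hours in the future, so a search whose time range ends at "now" will not return the most recent events even though they were indexed. That can look like missing data.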


dural_yyz
Motivator

Missing data immediately makes me think of two things, one of which is much easier to find and fix.

1) Bad time ingestions

index=<your_index>
| eval latency=abs(_indextime-_time)
| table _time _indextime latency
| sort - latency
| head 15

Try sorting both descending (sort - latency) and ascending (sort + latency). This helps surface any events ingested with a bad time format or time zone, which makes the data appear to be missing.

2) Skipping events

You would need to dig through the internal logs (index=_internal) of your heavy forwarder (HF) to check for full queues or max transmit violations.

 

Give that a start.
