
Data Quality Monitoring

What is DataBridge Data Quality?

DataBridge monitors data quality for your data warehouse at two points: in flight (as events are ingested) and at rest (after data lands in your warehouse). Unlike pure event-tracking tools, which validate data only during ingestion, DataBridge also monitors your warehouse tables continuously, so quality problems that appear after ingestion are caught as well.

Two Layers of Quality Assurance

1. In-Flight Validation (During Ingestion)

  • Real-time schema validation against your JSON schemas
  • Type checking and required field validation
  • Invalid events automatically routed to dead-letter queues
  • Prevents bad data from reaching your warehouse
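
As a rough illustration of the in-flight checks listed above, the sketch below validates events against a small hand-written schema and routes failures to a dead-letter list. It is not DataBridge's implementation: the schema layout and the names `SCHEMA`, `validate_event`, and `route_events` are hypothetical, and only the Python standard library is used.

```python
from typing import Any

# Hypothetical schema: field name -> (expected Python type, required?)
SCHEMA: dict[str, tuple[type, bool]] = {
    "event_id": (str, True),
    "user_id": (int, True),
    "amount": (float, False),
}


def validate_event(event: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    for field, (expected_type, required) in SCHEMA.items():
        if field not in event:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(event[field], expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    return errors


def route_events(events: list[dict[str, Any]]):
    """Split events into (accepted, dead_letter) based on validation."""
    accepted, dead_letter = [], []
    for event in events:
        errors = validate_event(event)
        if errors:
            # Keep the errors next to the event so it can be inspected or replayed later.
            dead_letter.append({"event": event, "errors": errors})
        else:
            accepted.append(event)
    return accepted, dead_letter
```

Storing the error list alongside each rejected event is a design choice of this sketch: it keeps the reason for rejection available when the dead-letter queue is reviewed.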

2. At-Rest Monitoring (In Your Warehouse)

  • Continuous profiling and validation of warehouse tables
  • Automated anomaly detection and alerting
  • Schema drift detection
  • Data freshness and completeness monitoring
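
Schema drift detection can be thought of as comparing an expected column-to-type mapping with the one currently observed in the warehouse. The sketch below shows that comparison under stated assumptions: `detect_schema_drift` is a hypothetical helper, and the type names in the usage example are illustrative strings, not output from DataBridge.

```python
def detect_schema_drift(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list]:
    """Compare two {column: type} snapshots and report the differences."""
    added = sorted(set(actual) - set(expected))
    removed = sorted(set(expected) - set(actual))
    # Columns present in both snapshots whose declared type differs.
    type_changed = sorted(
        (col, expected[col], actual[col])
        for col in set(expected) & set(actual)
        if expected[col] != actual[col]
    )
    return {"added": added, "removed": removed, "type_changed": type_changed}


# Illustrative snapshots (hypothetical table and types).
expected = {"id": "Int64", "email": "String", "created_at": "DateTime"}
actual = {"id": "Int64", "email": "Nullable(String)", "signup_source": "String"}
drift = detect_schema_drift(expected, actual)
```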

Key Features

📊 Comprehensive Data Profiling

Automatically profile tables to understand their structure, distribution, and quality characteristics:

  • Column Statistics: Null/blank counts, cardinality, unique values
  • Distribution Metrics: Min/Max/Avg values, standard deviation, percentiles
  • Most Frequent Values: Top values by occurrence
  • Row Sampling: Preview actual data for context
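
To make the profiling metrics above concrete, here is a minimal sketch that computes several of them for a single column held in memory. It is an illustration only, not how DataBridge profiles tables: `profile_column` is a hypothetical function, and a real profiler would typically push these aggregations down to the database.

```python
import statistics
from collections import Counter


def profile_column(values: list) -> dict:
    """Compute basic profile metrics for one column given as a list of values."""
    non_null = [v for v in values if v is not None]
    blanks = sum(1 for v in non_null if isinstance(v, str) and v.strip() == "")
    profile = {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "blank_count": blanks,
        "cardinality": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }
    # Distribution metrics only make sense for numeric values (bool is excluded).
    numeric = [v for v in non_null if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if numeric:
        profile["min"] = min(numeric)
        profile["max"] = max(numeric)
        profile["avg"] = statistics.fmean(numeric)
        # Sample standard deviation needs at least two data points.
        profile["stddev"] = statistics.stdev(numeric) if len(numeric) > 1 else 0.0
    return profile
```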

✅ Flexible Validation Rules

Define data quality checks at multiple levels:

  • Schema-Level Checks: Column presence, order, and structure
  • Table-Level Checks: Row counts, custom SQL queries
  • Column-Level Checks: Null checks, uniqueness, ranges, freshness
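
One way to picture table-level and column-level checks is as SQL queries that each return a count of violating rows. The sketch below assumes that framing; the `CHECKS` mapping, the `run_checks` helper, and the `orders` table are hypothetical, and SQLite stands in for a warehouse only so the example is self-contained. It does not reflect the check format used by DataBridge or dbqctl.

```python
import sqlite3

# SQLite is a stand-in for the warehouse so the example runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ada", 20.0), (2, "bob", 35.5), (2, None, -4.0)],
)

# Hypothetical check definitions: each query returns the number of violations.
CHECKS = {
    "table: at least one row": "SELECT CASE WHEN COUNT(*) >= 1 THEN 0 ELSE 1 END FROM orders",
    "column id: unique": "SELECT COUNT(*) - COUNT(DISTINCT id) FROM orders",
    "column customer: not null": "SELECT COUNT(*) FROM orders WHERE customer IS NULL",
    "column amount: range >= 0": "SELECT COUNT(*) FROM orders WHERE amount < 0",
}


def run_checks(connection: sqlite3.Connection, checks: dict[str, str]) -> dict[str, int]:
    """Run each check and return {name: violation_count}; 0 means the check passed."""
    results = {}
    for name, sql in checks.items():
        (violations,) = connection.execute(sql).fetchone()
        results[name] = violations
    return results


results = run_checks(conn, CHECKS)
failed = [name for name, count in results.items() if count > 0]
```

Expressing every check as a violation count keeps the pass/fail rule uniform across levels, which makes it easy to add a custom SQL check alongside the built-in ones.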

🔔 Real-Time Alerts

Get notified immediately when data quality issues are detected:

  • Email, Slack, or webhook notifications
  • Custom thresholds for any quality metric
  • Volume anomaly detection
  • Schema change alerts
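
Volume anomaly detection with a custom threshold can be sketched as a z-score test on daily row counts. This is one simple technique chosen for illustration, not necessarily the method DataBridge uses; `volume_anomaly` and its `z_threshold` parameter are hypothetical names.

```python
import statistics


def volume_anomaly(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag today's row count if it deviates from the historical mean by more
    than z_threshold sample standard deviations.

    `history` must contain at least two data points.
    """
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat history: any change at all counts as an anomaly.
        return today != mean
    return abs(today - mean) / stdev > z_threshold


# Illustrative daily row counts (made-up numbers).
history = [1000, 1020, 980, 1010, 990]
normal_day = volume_anomaly(history, 1005)
sudden_drop = volume_anomaly(history, 400)
```

A result of `True` would be the point at which an email, Slack, or webhook notification is sent.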

🎯 Multiple Validation Options

Cloud Version (Managed Service)

  • Visual UI for defining and managing checks
  • Scheduled validation runs
  • Team collaboration features
  • Built-in alerting and reporting

Community Version (Open Source CLI)

  • Free, open-source tool: dbqctl
  • Run checks from YAML configuration
  • Integrate into CI/CD pipelines
  • Perfect for automated workflows

Use Cases

Event-Heavy Applications

Validate that high-volume event data maintains quality standards:

  • Gaming apps: Verify player action completeness
  • IoT platforms: Monitor sensor data accuracy
  • FinTech: Ensure transaction data integrity

Data Engineering Teams

  • Catch schema drift before it breaks downstream pipelines
  • Monitor data freshness for time-sensitive workflows
  • Validate ETL pipeline outputs automatically
  • Maintain data contracts across teams

Analytics & Business Intelligence

  • Ensure report accuracy with continuous quality checks
  • Detect anomalies in business metrics
  • Validate assumptions about data distributions
  • Prevent bad data from affecting decisions

Supported Databases

DataBridge supports quality monitoring for:

  • ClickHouse - High-performance analytics
  • PostgreSQL - General-purpose relational database
  • MySQL - Popular open-source database

Available Quality Dimensions

DataBridge monitors these critical data quality dimensions:

  • Completeness: Track null and blank value counts
  • Uniqueness: Ensure unique identifiers are truly unique
  • Freshness: Monitor data recency and staleness
  • Validity: Check values fall within expected ranges
  • Consistency: Detect schema drift and type changes
  • Volume: Identify unexpected spikes or drops
  • Accuracy: Validate data meets business rules
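
Of the dimensions above, freshness is the easiest to express in a few lines: compare the newest timestamp in a table with the current time. The sketch below assumes timezone-aware timestamps; `is_stale` and `max_age` are hypothetical names, and the timestamps in the usage example are made up.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(latest: datetime, max_age: timedelta, now: Optional[datetime] = None) -> bool:
    """Return True if the newest record is older than max_age.

    `latest` would typically come from a query such as MAX(created_at).
    """
    now = now or datetime.now(timezone.utc)
    return now - latest > max_age


# Fixed "now" so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = is_stale(datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc), timedelta(hours=1), now)
stale = is_stale(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), timedelta(hours=1), now)
```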

Getting Started

Choose your preferred way to implement data quality monitoring: the Cloud Version (managed service) or the Community Version (open-source CLI, dbqctl).

Next Steps