Data Stores Overview

Data Stores are the destinations where DataBridge delivers your validated, transformed events. After events pass through the ingestion layer, schema validation and optional transformation functions, they are written to one or more data stores for analytics, reporting and downstream processing.

Supported Destinations

Destination | Type                    | Data Quality at Rest | Schema Derivation | Best For
ClickHouse  | Column-oriented OLAP    |                      |                   | High-volume analytics, real-time aggregations
PostgreSQL  | Relational (OLTP/OLAP)  |                      |                   | General-purpose analytics, operational dashboards
MySQL       | Relational (OLTP)       |                      |                   | Application databases, lightweight analytics
Webhooks    | HTTP callback           |                      |                   | Forwarding events to external APIs
Blackhole   | Testing utility         |                      |                   | Pipeline testing without storing data

Connection Modes

Every database destination supports two connection modes:

Cloud Mode

DataBridge connects directly to your database from the cloud. You provide connection credentials (host, port, username, password) through the dashboard and DataBridge stores them securely.

  • Connection credentials are stored encrypted in DataBridge Cloud
  • You can test the connection from the dashboard before saving
  • Requires your database to be reachable from the internet, or to have the DataBridge IP addresses allowlisted

DataBridge Agent Mode

Coming soon: Agent Mode currently supports only data-quality-at-rest monitoring and checks through the self-hosted agent. Data ingestion via Agent Mode is not yet available.

The DataBridge Agent runs in your infrastructure and connects to your database locally. No credentials leave your network.

  • Credentials are configured in the agent's local config.yaml file
  • You only provide a Connection Alias in the dashboard - a label that maps to a connection in the agent's config
  • Ideal for databases behind firewalls or VPNs
  • The agent communicates with DataBridge Cloud and reports data quality results so that they are available for dashboards and alerts

Tip: Choose Agent mode when your database is not publicly accessible or when you want credentials to remain entirely within your infrastructure.

Table Naming Convention

DataBridge creates tables automatically in your destination using this naming pattern:

dbridge_{namespace}_{event_name}_{version}

For example, an event named Purchase Completed in namespace com.acme at version 1-0-0 produces:

dbridge_com_acme_purchase_completed_1_0_0

All names are converted to lowercase snake_case.
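
The naming pattern above can be sketched in a few lines of Python. This is only an illustration, not DataBridge's actual implementation: the helper names `to_snake` and `table_name` are hypothetical, and the normalization rule (lowercase, then replace every run of non-alphanumeric characters with a single underscore) is an assumption inferred from the single example shown.

```python
import re


def to_snake(part: str) -> str:
    """Lowercase a name and collapse each non-alphanumeric run into one underscore.

    Assumed rule: inferred from the documented example, not from DataBridge itself.
    """
    return re.sub(r"[^a-z0-9]+", "_", part.lower()).strip("_")


def table_name(namespace: str, event_name: str, version: str,
               prefix: str = "dbridge") -> str:
    """Build dbridge_{namespace}_{event_name}_{version} from the three identifiers."""
    parts = [prefix, to_snake(namespace), to_snake(event_name), to_snake(version)]
    return "_".join(parts)


print(table_name("com.acme", "Purchase Completed", "1-0-0"))
# prints: dbridge_com_acme_purchase_completed_1_0_0
```

How names in other styles (for example camelCase) are normalized is not specified here, so the sketch makes no attempt to split them.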

Schema-Driven Table Structure

When you define a JSON Schema for an event, DataBridge uses it to derive the destination table's column layout. Each top-level property in the schema becomes a dedicated, properly typed column in the destination table instead of being stored inside a single JSON blob.

How It Works

  1. You create a JSON Schema for your event (e.g., com.acme/purchase_completed/1-0-0)
  2. When the pipeline processes events for that schema, it looks up the registered schema by namespace, name and version
  3. The schema's properties are mapped to database-native column types
  4. DataBridge creates the table with one column per property, plus some metadata columns (e.g. ref_url_id, event_id and _extra)

Example

Given this JSON Schema:

{
  "properties": {
    "user_id": { "type": "string" },
    "amount": { "type": "number" },
    "is_premium": { "type": "boolean" },
    "purchased_at": { "type": "string", "format": "date-time" },
    "metadata": { "type": "object" }
  },
  "required": ["user_id", "amount"]
}

DataBridge creates a table like this (PostgreSQL example):

Column       | Type
ref_url_id   | TEXT
event_id     | TEXT
user_id      | TEXT
amount       | DOUBLE PRECISION
is_premium   | BOOLEAN
purchased_at | TIMESTAMPTZ
metadata     | JSONB

Type Mapping Summary

JSON Schema Type           | PostgreSQL       | ClickHouse    | MySQL
string                     | TEXT             | String        | TEXT
string (format: date-time) | TIMESTAMPTZ      | DateTime64(3) | DATETIME(3)
integer                    | BIGINT           | Int64         | BIGINT
number                     | DOUBLE PRECISION | Float64       | DOUBLE
boolean                    | BOOLEAN          | Bool          | BOOLEAN
object                     | JSONB            | String        | JSON
array                      | JSONB            | String        | JSON

See each destination's documentation for the full type mapping details.
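
To make the summary concrete, the sketch below encodes it as a lookup table and applies it to the example schema from earlier on this page. It is a minimal illustration, not DataBridge's code: `TYPE_MAP`, `column_type` and the destination keys (`postgresql`, `clickhouse`, `mysql`) are hypothetical names, and only the types listed in the summary are covered.

```python
# Lookup table transcribed from the type mapping summary.
# The "string:date-time" key is an illustrative convention, not a DataBridge identifier.
TYPE_MAP = {
    "string":           {"postgresql": "TEXT", "clickhouse": "String", "mysql": "TEXT"},
    "string:date-time": {"postgresql": "TIMESTAMPTZ", "clickhouse": "DateTime64(3)", "mysql": "DATETIME(3)"},
    "integer":          {"postgresql": "BIGINT", "clickhouse": "Int64", "mysql": "BIGINT"},
    "number":           {"postgresql": "DOUBLE PRECISION", "clickhouse": "Float64", "mysql": "DOUBLE"},
    "boolean":          {"postgresql": "BOOLEAN", "clickhouse": "Bool", "mysql": "BOOLEAN"},
    "object":           {"postgresql": "JSONB", "clickhouse": "String", "mysql": "JSON"},
    "array":            {"postgresql": "JSONB", "clickhouse": "String", "mysql": "JSON"},
}


def column_type(prop: dict, destination: str) -> str:
    """Return the column type for one schema property on one destination."""
    key = prop["type"]
    # A string with format date-time maps to a timestamp type, not to text.
    if key == "string" and prop.get("format") == "date-time":
        key = "string:date-time"
    return TYPE_MAP[key][destination]  # raises KeyError for types not in the summary


properties = {
    "user_id": {"type": "string"},
    "amount": {"type": "number"},
    "is_premium": {"type": "boolean"},
    "purchased_at": {"type": "string", "format": "date-time"},
    "metadata": {"type": "object"},
}

for name, prop in properties.items():
    print(name, column_type(prop, "postgresql"))
```

For PostgreSQL this yields the same property column types as the example table: TEXT, DOUBLE PRECISION, BOOLEAN, TIMESTAMPTZ and JSONB.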

Fallback Behavior

If no JSON Schema is registered for an event, DataBridge falls back to a simple three-column table layout: ref_url_id, event_id and payload (a JSON blob containing the entire event).
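
The choice between the two layouts can be sketched as a single function. This is a hedged illustration only: `table_columns` is a hypothetical helper, and the column order (in particular placing `_extra` last) is an assumption, since this page does not specify it.

```python
from typing import Optional


def table_columns(schema: Optional[dict]) -> list[str]:
    """Return the column names for an event table.

    With no registered schema, the whole event lands in a single payload column.
    With a schema, each top-level property gets its own column.
    """
    if schema is None:
        # Fallback layout described above.
        return ["ref_url_id", "event_id", "payload"]
    # Schema-driven layout; the position of _extra is an assumption.
    return ["ref_url_id", "event_id", *schema.get("properties", {}), "_extra"]


print(table_columns(None))
print(table_columns({"properties": {"user_id": {"type": "string"},
                                    "amount": {"type": "number"}}}))
```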

Tip: Define JSON Schemas for your events to get properly typed columns in your destination tables. This enables efficient SQL queries, indexing and storage optimization without any additional configuration.

Multi-Destination Delivery

A single pipeline can deliver events to multiple data stores simultaneously. Common patterns:

  • Production + staging warehouses from the same event stream
  • ClickHouse for analytics + PostgreSQL for dashboards
  • Different transformations applied per destination
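
Conceptually, multi-destination delivery is a fan-out: each event is written to every configured destination, optionally after a destination-specific transformation. The sketch below models that idea with in-memory stand-ins. Everything in it (`MemoryStore`, `deliver`, `drop_metadata`, the route list) is hypothetical and is not the DataBridge API. It also writes to destinations one after another, whereas a real pipeline may deliver concurrently.

```python
from typing import Callable, Optional

Event = dict
Transform = Callable[[Event], Event]


class MemoryStore:
    """Stand-in destination that keeps delivered events in a list."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: list[Event] = []

    def write(self, event: Event) -> None:
        self.rows.append(event)


def deliver(event: Event, routes: list[tuple[MemoryStore, Optional[Transform]]]) -> None:
    """Write one event to every destination, applying that route's transform first."""
    for store, transform in routes:
        copy = dict(event)  # shallow copy so one route's transform cannot affect another
        store.write(transform(copy) if transform else copy)


def drop_metadata(event: Event) -> Event:
    """Example per-destination transformation: omit the metadata field."""
    return {k: v for k, v in event.items() if k != "metadata"}


analytics = MemoryStore("clickhouse")
dashboards = MemoryStore("postgresql")
routes = [(analytics, None), (dashboards, drop_metadata)]

deliver({"user_id": "u1", "amount": 9.5, "metadata": {"source": "web"}}, routes)
print(analytics.rows)   # full event
print(dashboards.rows)  # event without the metadata field
```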

Next Steps