Bronze Layer
The bronze layer is Rocky’s config-driven replication within the warehouse. No SQL files needed — Rocky discovers what tables are available, generates the SQL, and copies data from the ingestion catalog into structured target catalogs/schemas.
The flow
Section titled “The flow”rocky discover → rocky plan → rocky run- Discover — Finds what schemas and tables are available for processing. For
fivetranadapters, calls the Fivetran REST API to list connectors and enabled tables. Forduckdbadapters, queriesinformation_schema. Formanualadapters, reads inline schema/table definitions. Discovery is metadata-only — it identifies what exists, it does not extract data. - Plan — Parses source schema names, resolves target catalogs/schemas, generates SQL statements. Shows what will happen without executing.
- Run — Executes the plan: creates catalogs/schemas, copies data, runs quality checks, updates watermarks.
Schema pattern parsing
Section titled “Schema pattern parsing”Source schemas follow a naming convention. Rocky parses these into structured components using a configurable pattern:
src__acme__us_west__shopify│ │ │ ││ │ │ └── source (connector name)│ │ └── regions (variable-length)│ └── tenant└── prefix (stripped)The pattern is defined under the pipeline source in rocky.toml:
[pipeline.bronze.source.schema_pattern]prefix = "src__"separator = "__"components = ["tenant", "regions...", "source"]Given src__acme__us_west__shopify, Rocky extracts:
tenant = "acme"regions = ["us_west"]source = "shopify"
Target mapping
Section titled “Target mapping”Templates on the pipeline target determine where data lands:
[pipeline.bronze.target]adapter = "prod"catalog_template = "warehouse"schema_template = "stage__{source}"Using the parsed components:
warehouseis a static catalog name (no variable substitution)stage__{source}resolves tostage__shopify
So fivetran_catalog.src__shopify.orders is copied to warehouse.stage__shopify.orders.
For multi-tenant setups where each tenant gets its own catalog, see Schema Patterns for the {tenant}_warehouse + components = ["tenant", "regions...", "source"] pattern.
Auto-creation
Section titled “Auto-creation”When auto_create_catalogs = true and auto_create_schemas = true, Rocky creates target catalogs and schemas before copying data:
CREATE CATALOG IF NOT EXISTS warehouse;CREATE SCHEMA IF NOT EXISTS warehouse.stage__shopify;Catalogs are tagged (e.g., managed_by = "rocky") so Rocky can later discover which catalogs it manages.
Incremental strategy
Section titled “Incremental strategy”On the first run (no watermark), Rocky performs a full refresh. On subsequent runs, it only copies rows where the timestamp column exceeds the last known watermark:
INSERT INTO acme_warehouse.staging__us_west__shopify.ordersSELECT *, CAST(NULL AS STRING) AS _loaded_byFROM fivetran_catalog.src__acme__us_west__shopify.ordersWHERE _fivetran_synced > ( SELECT COALESCE(MAX(_fivetran_synced), TIMESTAMP '1970-01-01') FROM acme_warehouse.staging__us_west__shopify.orders)The _fivetran_synced column is Fivetran’s built-in timestamp that records when each row was synced. Rocky uses it as the watermark column by default (configurable via timestamp_column).
If schema drift is detected (column type mismatch between source and target), Rocky falls back to a full refresh: it drops the target table and recreates it.
Metadata columns
Section titled “Metadata columns”Rocky can add metadata columns to replicated tables. They are declared on the pipeline alongside strategy and timestamp_column:
[pipeline.bronze]type = "replication"strategy = "incremental"timestamp_column = "_fivetran_synced"metadata_columns = [ { name = "_loaded_by", type = "STRING", value = "NULL" }]These are appended to the SELECT: SELECT *, CAST(NULL AS STRING) AS _loaded_by.
Filtering
Section titled “Filtering”Scope execution to a specific tenant:
rocky run --config rocky.toml --filter tenant=acmeThis processes only schemas where the parsed tenant component matches acme.
Comparison to dbt
Section titled “Comparison to dbt”In dbt, you write one staging model per source table:
-- models/staging/shopify/stg_orders.sqlSELECT * FROM {{ source('shopify', 'orders') }}Multiply that by every table, every source, every tenant. For a multi-tenant setup with 50 connectors and 20 tables each, that’s 1,000 SQL files that all look the same.
In Rocky, the entire bronze layer is config-driven. Zero SQL files. The rocky.toml configuration handles all source-to-target mapping, and Rocky generates the appropriate SQL at runtime.