Quickstart

This walks through a Rocky pipeline that replicates Fivetran-landed sources into Databricks.

1. Scaffold the project.
```
rocky init my-pipeline
cd my-pipeline
```
This scaffolds a runnable DuckDB starter project: a rocky.toml, a models/ directory with _defaults.toml and a sample stg_orders model, and a seeds/seed.sql. Step 2 replaces the generated rocky.toml with the Fivetran → Databricks config below.

2. Configure a source, a target, and the pipeline that wires them together in rocky.toml.

```
[adapter.fivetran]
type = "fivetran"
kind = "discovery"
destination_id = "${FIVETRAN_DESTINATION_ID}"
api_key = "${FIVETRAN_API_KEY}"
api_secret = "${FIVETRAN_API_SECRET}"

[adapter.prod]
type = "databricks"
host = "${DATABRICKS_HOST}"
http_path = "${DATABRICKS_HTTP_PATH}"
token = "${DATABRICKS_TOKEN}"

[pipeline.bronze]
type = "replication"
strategy = "incremental"
timestamp_column = "_fivetran_synced"

[pipeline.bronze.source]
adapter = "prod"

[pipeline.bronze.source.discovery]
adapter = "fivetran"

[pipeline.bronze.source.schema_pattern]
prefix = "src__"
separator = "__"
components = ["source"]

[pipeline.bronze.target]
adapter = "prod"
catalog_template = "warehouse"
schema_template = "stage__{source}"

[pipeline.bronze.target.governance]
auto_create_catalogs = true
auto_create_schemas = true

[pipeline.bronze.checks]
enabled = true
row_count = true
column_match = true
freshness = { threshold_seconds = 86400 }

[state]
backend = "local"
```

[adapter.*] blocks define connections; [pipeline.*] blocks tie them together. Select between pipelines with --pipeline NAME. Export the referenced environment variables (DATABRICKS_HOST, FIVETRAN_API_KEY, and the rest) before running Rocky.
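Because the config only references the credentials by name, a missing export may not surface until a command actually reaches Fivetran or Databricks. As a minimal sketch, a pre-flight function can check the six variables named in the config above; `check_rocky_env` is a hypothetical helper, not part of Rocky.

```shell
# Hypothetical pre-flight helper (not a Rocky command): report every variable
# referenced from rocky.toml that is unset or empty in the current shell.
check_rocky_env() {
  missing=0
  for name in FIVETRAN_DESTINATION_ID FIVETRAN_API_KEY FIVETRAN_API_SECRET \
              DATABRICKS_HOST DATABRICKS_HTTP_PATH DATABRICKS_TOKEN; do
    # Indirect lookup; safe here because the names are fixed literals.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_rocky_env` before the first Rocky command returns non-zero and names each missing variable on stderr.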

3. Validate the config.
```
rocky validate
```
Checks config syntax and adapter wiring. It does not call external APIs.
4. Discover the sources.
```
rocky -o table discover
```
Calls the Fivetran API and lists connectors matching the schema pattern.
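To make the matching concrete, the sketch below mirrors how the quickstart config reads: a schema qualifies when it starts with the `src__` prefix, the remainder becomes `{source}`, and the target is `warehouse.stage__{source}`. The function `target_schema` and the schema name `src__shopify` are hypothetical, and this illustrates the single-component case only, not Rocky's actual parser.

```shell
# Illustrative only: map a source schema name to its target schema,
# following prefix = "src__", catalog_template = "warehouse" and
# schema_template = "stage__{source}" from the config above.
target_schema() {
  name=$1
  case "$name" in
    src__?*)
      # Strip the prefix; what remains is treated as the {source} component.
      printf 'warehouse.stage__%s\n' "${name#src__}"
      ;;
    *)
      # Schemas without the prefix do not match the pattern.
      return 1
      ;;
  esac
}
```

Under these assumptions, `target_schema src__shopify` prints `warehouse.stage__shopify`, while a schema such as `public` is rejected with a non-zero status.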
5. Build a plan.
```
plan_id=$(rocky plan --filter tenant=acme --output json | jq -r .plan_id)
```
Compiles the pipeline, runs drift detection, and records a deterministic plan keyed by plan_id. Inspect the SQL, drift actions, and checks before you commit to a run.
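Since the plan command captures `plan_id` through a pipeline, a failed plan can leave the variable empty, and `jq -r` prints the literal string `null` when the key is absent. A small guard, sketched here as the hypothetical helper `require_plan_id`, avoids passing such a value on to the apply step.

```shell
# Hypothetical guard (not a Rocky command): succeed only when the captured
# plan_id looks usable. jq -r emits "null" for a missing key, and the
# variable is empty when the pipeline produced no output at all.
require_plan_id() {
  case "${1:-}" in
    "" | null)
      echo "no plan_id returned; not applying" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

A typical use would be `require_plan_id "$plan_id" && rocky apply "$plan_id"`, so the apply only runs with a real identifier.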
6. Apply the plan.
```
rocky apply "$plan_id"
```
Executes the plan: discover, create catalogs and schemas, apply drift, copy data, run checks. Outputs a versioned JSON result with materializations, check results, drift actions, and permissions.
7. Inspect the state.
```
rocky state
```
Shows the stored watermarks for every table.

Plan and apply, or just run

For local iteration and automation, rocky run does the same work as the rocky plan + rocky apply two-step (steps 5–6) in one command:

```
rocky run --filter tenant=acme
```

Either path resumes from the last checkpoint after a failure with --resume-latest:

```
rocky run --filter tenant=acme --resume-latest
```
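In automation, the two commands above can be chained so that a failed run is retried once from its checkpoint. The wrapper below is a sketch under that assumption; `run_with_resume` is a hypothetical name, and it uses only the `rocky run` invocation and `--resume-latest` flag shown above.

```shell
# Hypothetical wrapper: run once, and if that fails, retry a single time
# from the last checkpoint. Arguments are forwarded unchanged to rocky run.
run_with_resume() {
  rocky run "$@" || rocky run "$@" --resume-latest
}
```

For example, `run_with_resume --filter tenant=acme` issues the plain run first and falls back to the resume form only on a non-zero exit. A single retry is a deliberate limit: a failure caused by bad config or credentials will fail again on resume, and is better surfaced than looped on.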

Next steps

Playground: the credential-free DuckDB version of this flow.
Schema patterns: customize source-to-target mapping.
Silver layer: add transformation models on top of the bronze copy.
Data quality checks: assertions that run inline with the load.
Dagster integration: run Rocky as Dagster assets.