Skip to content

Quickstart

This walks through a Rocky pipeline that replicates Fivetran-landed sources into Databricks. No credentials? The Playground guide runs the same flow against a local DuckDB file.

Terminal window
rocky init my-pipeline
cd my-pipeline

Creates rocky.toml and an empty models/ directory.

Edit rocky.toml to declare a Fivetran source, a Databricks target, and a pipeline that wires them together.

[adapter.fivetran]
type = "fivetran"
destination_id = "${FIVETRAN_DESTINATION_ID}"
api_key = "${FIVETRAN_API_KEY}"
api_secret = "${FIVETRAN_API_SECRET}"
[adapter.prod]
type = "databricks"
host = "${DATABRICKS_HOST}"
http_path = "${DATABRICKS_HTTP_PATH}"
token = "${DATABRICKS_TOKEN}"
[pipeline.bronze]
type = "replication"
strategy = "incremental"
timestamp_column = "_fivetran_synced"
[pipeline.bronze.source]
adapter = "fivetran"
[pipeline.bronze.source.schema_pattern]
prefix = "src__"
separator = "__"
components = ["source"]
[pipeline.bronze.target]
adapter = "prod"
catalog_template = "warehouse"
schema_template = "stage__{source}"
[pipeline.bronze.target.governance]
auto_create_catalogs = true
auto_create_schemas = true
[pipeline.bronze.checks]
enabled = true
row_count = true
column_match = true
freshness = { threshold_seconds = 86400 }
[state]
backend = "local"

[adapter.*] blocks define connections. [pipeline.*] blocks tie them together. Select between pipelines with --pipeline NAME.

Export the referenced environment variables before running Rocky (DATABRICKS_HOST, FIVETRAN_API_KEY, etc.).

Terminal window
rocky validate

Checks config syntax and adapter wiring. Does not call external APIs.

Terminal window
rocky -o table discover

Calls the Fivetran API and lists connectors matching the schema pattern.

Terminal window
plan_id=$(rocky plan --filter tenant=acme --output json | jq -r .plan_id)

Compiles the pipeline, runs drift detection, and records a deterministic plan keyed by plan_id. Inspect it (SQL, drift actions, checks) before committing to a run.

Terminal window
rocky apply "$plan_id"

Executes the plan: discover → create catalogs/schemas → apply drift → copy data → run checks. Outputs a versioned JSON result with materializations, check results, drift actions, and permissions.

Resume from the last checkpoint after a failure. The resume flag currently lives on the legacy rocky run alias (which continues to work alongside rocky plan + rocky apply):

Terminal window
rocky run --filter tenant=acme --resume-latest
Terminal window
rocky state

Shows stored watermarks for every table.