Quickstart
This walks through a Rocky pipeline that replicates Fivetran-landed sources into Databricks. No credentials? The Playground guide runs the same flow against a local DuckDB file.
1. Scaffold
```shell
rocky init my-pipeline
cd my-pipeline
```

Creates rocky.toml and an empty models/ directory.
2. Configure
Edit rocky.toml to declare a Fivetran source, a Databricks target, and a pipeline that wires them together.
```toml
[adapter.fivetran]
type = "fivetran"
destination_id = "${FIVETRAN_DESTINATION_ID}"
api_key = "${FIVETRAN_API_KEY}"
api_secret = "${FIVETRAN_API_SECRET}"

[adapter.prod]
type = "databricks"
host = "${DATABRICKS_HOST}"
http_path = "${DATABRICKS_HTTP_PATH}"
token = "${DATABRICKS_TOKEN}"

[pipeline.bronze]
type = "replication"
strategy = "incremental"
timestamp_column = "_fivetran_synced"

[pipeline.bronze.source]
adapter = "fivetran"

[pipeline.bronze.source.schema_pattern]
prefix = "src__"
separator = "__"
components = ["source"]

[pipeline.bronze.target]
adapter = "prod"
catalog_template = "warehouse"
schema_template = "stage__{source}"

[pipeline.bronze.target.governance]
auto_create_catalogs = true
auto_create_schemas = true

[pipeline.bronze.checks]
enabled = true
row_count = true
column_match = true
freshness = { threshold_seconds = 86400 }

[state]
backend = "local"
```

[adapter.*] blocks define connections. [pipeline.*] blocks tie them together. Select between pipelines with --pipeline NAME.
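To make the schema_pattern and schema_template settings concrete, here is a small shell sketch of the mapping they appear to describe. It assumes that a source schema named src__<source> has its prefix stripped and lands in stage__<source> inside the warehouse catalog. The helper name map_schema and the example schema src__shopify are hypothetical, and Rocky's actual resolution logic may differ.

```shell
# Illustration only: mimics the assumed mapping
#   prefix = "src__"  ->  schema_template = "stage__{source}"
# map_schema is a hypothetical helper, not part of Rocky.
map_schema() {
  local schema="$1" prefix="src__"
  # Reject schemas that do not start with the prefix or have nothing after it.
  case "$schema" in
    "$prefix"?*) ;;
    *) return 1 ;;
  esac
  local source="${schema#"$prefix"}"   # strip the prefix to obtain {source}
  printf 'warehouse.stage__%s\n' "$source"
}
```

Under these assumptions, `map_schema src__shopify` prints `warehouse.stage__shopify`, while a schema without the src__ prefix is skipped (non-zero exit status).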
Export the referenced environment variables before running Rocky (DATABRICKS_HOST, FIVETRAN_API_KEY, etc.).
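A preflight check can catch a missing variable before any Rocky command runs. The following is a minimal sketch: the variable names come from the rocky.toml above, while check_env is a hypothetical helper and not something Rocky provides.

```shell
# Variables referenced by the rocky.toml in this guide.
REQUIRED_VARS="FIVETRAN_DESTINATION_ID FIVETRAN_API_KEY FIVETRAN_API_SECRET DATABRICKS_HOST DATABRICKS_HTTP_PATH DATABRICKS_TOKEN"

# check_env (hypothetical helper): reports every unset or empty variable on
# stderr and returns non-zero if at least one is missing.
check_env() {
  local missing=0 name value
  for name in $REQUIRED_VARS; do
    # Indirect lookup; names are fixed constants, so eval is safe here.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical use would be `check_env && rocky validate`, so validation only starts once every variable is present.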
3. Validate
```shell
rocky validate
```

Checks config syntax and adapter wiring. Does not call external APIs.
4. Discover
```shell
rocky -o table discover
```

Calls the Fivetran API and lists connectors matching the schema pattern.
5. Plan
```shell
plan_id=$(rocky plan --filter tenant=acme --output json | jq -r .plan_id)
```

Compiles the pipeline, runs drift detection, and records a deterministic plan keyed by plan_id. Inspect it (SQL, drift actions, checks) before committing to a run.
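Because the plan ID is captured through a pipe, a failed plan can leave plan_id empty, and jq -r prints the literal string null when the .plan_id key is absent. A small guard, sketched here with the hypothetical helper is_valid_plan_id, avoids passing such a value on to the next step.

```shell
# is_valid_plan_id (hypothetical helper): rejects an empty value and the
# literal "null" that jq -r emits when the requested key is missing.
is_valid_plan_id() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Usage, with plan_id captured as shown in this step:
#   is_valid_plan_id "$plan_id" || { echo "no plan_id returned" >&2; exit 1; }
```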
6. Apply
```shell
rocky apply "$plan_id"
```

Executes the plan: discover → create catalogs/schemas → apply drift → copy data → run checks. Outputs a versioned JSON result with materializations, check results, drift actions, and permissions.
Resume from the last checkpoint after a failure. The resume flag currently lives on the legacy rocky run alias (which continues to work alongside rocky plan + rocky apply):
```shell
rocky run --filter tenant=acme --resume-latest
```

7. Inspect state

```shell
rocky state
```

Shows stored watermarks for every table.
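Steps 3 through 7 can be chained into one function that stops at the first failure. This is a sketch under stated assumptions: run_quickstart is a hypothetical wrapper, it uses only the commands shown in this guide, and it assumes each rocky command exits non-zero on error.

```shell
# run_quickstart (hypothetical wrapper): validate, discover, plan, apply,
# then show state. Stops at the first step that fails.
run_quickstart() {
  local plan_id
  rocky validate || return 1
  rocky -o table discover || return 1
  plan_id=$(rocky plan --filter tenant=acme --output json | jq -r .plan_id) || return 1
  # jq -r prints "null" when .plan_id is missing; treat that as a failed plan.
  if [ -z "$plan_id" ] || [ "$plan_id" = "null" ]; then
    echo "no plan_id returned" >&2
    return 1
  fi
  rocky apply "$plan_id" || return 1
  rocky state
}
```

Keeping plan and apply as separate calls inside the wrapper preserves the point where the plan can be inspected; in an interactive session it may be preferable to run them by hand instead.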
Next steps
- Playground: credential-free DuckDB version
- Schema patterns: customize source-to-target mapping
- Silver layer: add transformation models
- Data quality checks
- Dagster integration
- See the combo in action: POC #17 (trace + cost + replay against the same run_id) at examples/playground/pocs/06-developer-experience/17-trace-replay-cost-combo