# CLI Reference
Rocky provides a single binary with subcommands for the full pipeline lifecycle. Commands are organized into categories:
- **Core Pipeline:** `init`, `validate`, `discover`, `plan`, `run`, `state`
- **Modeling:** `compile`, `lineage`, `test`, `ci`
- **Data:** `seed`, `snapshot`, `docs`
- **AI:** `ai`, `ai-sync`, `ai-explain`, `ai-test`
- **Development:** `playground`, `shell`, `watch`, `fmt`, `list`, `serve`, `lsp`, `import-dbt`, `init-adapter`, `hooks`, `validate-migration`, `test-adapter`
- **Administration:** `history`, `metrics`, `optimize`, `compact`, `profile-storage`, `archive`
- **Diagnostics:** `doctor`, `compare`
See the dedicated command reference pages for detailed documentation of each category.
## Global Flags

These flags apply to all commands.
| Flag | Short | Default | Description |
|---|---|---|---|
| `--config <PATH>` | `-c` | `rocky.toml` | Path to the pipeline configuration file. |
| `--output <FORMAT>` | `-o` | `json` | Output format. Accepted values: `json`, `table`. |
| `--state-path <PATH>` | | `.rocky-state.redb` | Path to the embedded state store used for watermarks. |
```bash
# Example: use a custom config and table output
rocky -c pipelines/prod.toml -o table discover
```

## Commands
### rocky init

Scaffolds a new Rocky project in the target directory.
```bash
rocky init [path]
```

Arguments:
| Argument | Default | Description |
|---|---|---|
| `path` | `.` (current directory) | Directory where the project will be created. |
Behavior:
- Creates a starter `rocky.toml` with placeholder values.
- Creates a `models/` directory for SQL model files.
- Fails with an error if `rocky.toml` already exists in the target directory.
Example:
```bash
# Scaffold in current directory
rocky init

# Scaffold in a new directory
rocky init my-pipeline
```

### rocky validate

Checks the pipeline configuration for correctness without connecting to any external APIs.
```bash
rocky validate
```

Checks performed:
| Check | Description |
|---|---|
| TOML syntax | The config file parses without errors as v2 (named adapters + named pipelines). |
| Adapters | Each [adapter.NAME] is a recognized type (databricks, snowflake, duckdb, fivetran, manual) with the required fields populated. |
| Pipelines | Each [pipeline.NAME] references existing adapters for source, target, and (optional) discovery, and its schema_pattern parses. |
| DAG validation | If models/ exists, loads all models and checks for dependency cycles. |
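The DAG validation check above amounts to cycle detection over model dependencies. A minimal sketch of such a check in Python (illustrative only, not Rocky's actual implementation):

```python
def find_cycle(deps):
    """Depth-first walk over a {model: [dependencies]} map.

    Returns the first dependency cycle found as a list of model names,
    or None when the graph is acyclic.
    """
    visiting, done = set(), set()

    def visit(node, path):
        if node in visiting:
            # We re-entered a node still on the current path: cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        visiting.add(node)
        for dep in deps.get(node, []):
            cycle = visit(dep, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for model in deps:
        cycle = visit(model, [])
        if cycle:
            return cycle
    return None
```

A validator built this way can fail fast with the offending chain instead of just reporting "cycle detected".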
Output:
Each check prints `ok` or `!!` followed by a short description. A non-zero exit code is returned if any check fails.

```
ok  Config syntax valid (v2 format)
ok  adapter.fivetran: fivetran
ok  adapter.prod: databricks (auth configured)
ok  pipeline.bronze: schema pattern parseable
ok  pipeline.bronze: replication / incremental -> warehouse / stage__{source}
```

### rocky discover
Lists available connectors and their tables from the configured source.

```bash
rocky discover [--pipeline NAME]
```

Flags:
| Flag | Description |
|---|---|
| `--pipeline <NAME>` | Pipeline name. Required when more than one `[pipeline.NAME]` is defined. |
Behavior:
- For `fivetran` adapters, calls the Fivetran REST API to list connectors and their enabled tables. For `duckdb` adapters, queries `information_schema.{schemata,tables}`. For `manual` adapters, reads inline schema/table definitions.
- This is a metadata-only operation: it identifies which schemas and tables exist; it does not extract or move data.
- Parses each source schema name using the pipeline's `schema_pattern` to extract structured components (tenant, regions, source, etc.).
- Returns structured data about every discovered source and its tables.
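The `schema_pattern` step can be pictured as compiling the pattern into a regular expression with named groups. The sketch below is illustrative only: it assumes a simple `{name}` placeholder syntax and treats every component, including `regions`, as a single string, which may differ from Rocky's actual pattern language.

```python
import re

def pattern_to_regex(pattern):
    """Turn e.g. "{tenant}__{region}__{source}" into a named-group regex."""
    parts = re.split(r"(\{\w+\})", pattern)
    out = []
    for part in parts:
        if part.startswith("{") and part.endswith("}"):
            out.append(f"(?P<{part[1:-1]}>[A-Za-z0-9_]+)")  # capture component
        else:
            out.append(re.escape(part))  # literal separator such as "__"
    return re.compile("^" + "".join(out) + "$")

def parse_schema(pattern, name):
    """Return the component dict for a schema name, or None on mismatch."""
    m = pattern_to_regex(pattern).match(name)
    return m.groupdict() if m else None
```

For example, `parse_schema("{tenant}__{region}__{source}", "acme__us_west__shopify")` splits the name back into its tenant, region, and source parts.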
JSON output:
```json
{
  "version": "0.1.0",
  "command": "discover",
  "sources": [
    {
      "id": "connector_abc123",
      "components": { "tenant": "acme", "regions": ["us_west"], "source": "shopify" },
      "source_type": "fivetran",
      "last_sync_at": "2026-03-30T10:00:00Z",
      "tables": [{ "name": "orders", "row_count": null }]
    }
  ]
}
```

Table output:
```
connector_id      | components                 | tables
──────────────────┼────────────────────────────┼───────
connector_abc123  | acme / us_west / shopify   | 12
connector_def456  | acme / eu_central / stripe | 8
```

### rocky plan
Generates the SQL statements Rocky would execute, without actually running them. Useful for auditing and previewing changes before a run.

```bash
rocky plan --filter <key=value> [--pipeline NAME]
```

Flags:
| Flag | Required | Description |
|---|---|---|
| `--filter <key=value>` | Yes | Filter sources by component. Example: `--filter tenant=acme`. |
| `--pipeline <NAME>` | No | Pipeline name. Required when more than one pipeline is defined. |
Behavior:
- Runs discovery and drift detection.
- Generates all SQL statements (catalog creation, schema creation, incremental copy, permission grants) and returns them without execution.
JSON output:
```json
{
  "version": "0.1.0",
  "command": "plan",
  "filter": "tenant=acme",
  "statements": [
    { "purpose": "create_catalog", "target": "acme_warehouse", "sql": "CREATE CATALOG IF NOT EXISTS acme_warehouse" },
    { "purpose": "create_schema", "target": "acme_warehouse.staging__us_west__shopify", "sql": "..." },
    { "purpose": "incremental_copy", "target": "acme_warehouse.staging__us_west__shopify.orders", "sql": "..." }
  ]
}
```

### rocky run
Executes the full pipeline end-to-end.

```bash
rocky run --filter <key=value> [flags]
```

Flags:
| Flag | Required | Description |
|---|---|---|
| `--filter <key=value>` | Yes | Filter sources by component. Example: `--filter tenant=acme`. |
| `--pipeline <NAME>` | No | Pipeline name. Required when more than one pipeline is defined. |
| `--governance-override <JSON>` | No | Additional governance config as inline JSON or `@file.json`, merged with defaults. |
| `--models <PATH>` | No | Models directory for transformation execution. |
| `--all` | No | Execute both replication and compiled models. |
| `--resume <RUN_ID>` | No | Resume a specific previous run from its last checkpoint. |
| `--resume-latest` | No | Resume the most recent failed run from its last checkpoint. |
| `--shadow` | No | Run in shadow mode: write to shadow targets instead of production. |
| `--shadow-suffix <SUFFIX>` | No | Suffix appended to table names in shadow mode (default `_rocky_shadow`). |
| `--shadow-schema <NAME>` | No | Override schema for shadow tables (mutually exclusive with `--shadow-suffix`). |
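The two shadow flags compose as follows: `--shadow-schema` redirects writes to a different schema while keeping the table name, whereas `--shadow-suffix` renames the table within its original schema. A hypothetical sketch of the resulting target name (the real naming logic is internal to Rocky):

```python
def shadow_target(catalog, schema, table,
                  suffix="_rocky_shadow", shadow_schema=None):
    """Derive the fully qualified shadow target for a production table."""
    if shadow_schema is not None:
        # --shadow-schema: same table name, different schema.
        return f"{catalog}.{shadow_schema}.{table}"
    # --shadow-suffix (default _rocky_shadow): rename table in place.
    return f"{catalog}.{schema}.{table}{suffix}"
```

This matches the naming seen later under `rocky compare`, where `warehouse.staging.orders` is compared against `warehouse.staging.orders_rocky_shadow`.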
Pipeline stages (in order):

1. **Discover** — enumerate sources and tables from the configured source adapter.
2. **Governance setup** (sequential, per matching catalog/schema):
   - Create catalog (if `auto_create_catalogs = true`)
   - Apply catalog tags (`ALTER CATALOG SET TAGS`)
   - Bind workspaces (Unity Catalog bindings API, if `governance.isolation` configured)
   - Apply catalog-level grants (`GRANT ... ON CATALOG`)
   - Create schema (if `auto_create_schemas = true`)
   - Apply schema tags (`ALTER SCHEMA SET TAGS`)
   - Apply schema-level grants (`GRANT ... ON SCHEMA`)
3. **Parallel table processing** — for each table concurrently (up to `execution.concurrency`):
   - Drift detection (compare column types between source and target)
   - Copy data (incremental or full refresh SQL)
   - Apply table tags
   - Update watermark in state store
4. **Batched checks** — row count, column match, freshness (batched with `UNION ALL` for efficiency)
5. **Retry** — failed tables retried sequentially (configurable via `execution.table_retries`)
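The parallel-then-sequential-retry shape of stages 3 and 5 can be sketched as below; `run_tables`, `copy_fn`, and the defaults are illustrative names under assumed semantics, not Rocky internals:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tables(tables, copy_fn, concurrency=16, retries=1):
    """Process tables with bounded concurrency, then retry failures
    sequentially up to `retries` times (cf. execution.table_retries).
    Returns the tables that still failed after all retries."""
    failed = []
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(copy_fn, t): t for t in tables}
        for fut, table in futures.items():
            try:
                fut.result()  # re-raises any exception from the worker
            except Exception:
                failed.append(table)
    for _ in range(retries):
        still_failed = []
        for table in failed:  # retries run one at a time
            try:
                copy_fn(table)
            except Exception:
                still_failed.append(table)
        failed = still_failed
        if not failed:
            break
    return failed
```

The key property is that a transient failure on one table never blocks the others: the parallel pass completes first, and only the failures are revisited.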
JSON output:
```json
{
  "version": "0.1.0",
  "command": "run",
  "filter": "tenant=acme",
  "duration_ms": 45200,
  "tables_copied": 20,
  "materializations": [
    {
      "asset_key": ["fivetran", "acme", "us_west", "shopify", "orders"],
      "rows_copied": null,
      "duration_ms": 2300,
      "metadata": { "strategy": "incremental", "watermark": "2026-03-30T10:00:00Z" }
    }
  ],
  "check_results": [],
  "permissions": { "grants_added": 3, "grants_revoked": 0 },
  "drift": { "tables_checked": 20, "tables_drifted": 1, "actions_taken": [] }
}
```

### rocky doctor
Runs aggregate health checks on your Rocky project: config validation, state store health, adapter connectivity, pipeline consistency, and state sync status.

```bash
rocky doctor
```

Checks performed:
| Check | Description |
|---|---|
| Config | Parses rocky.toml, validates adapters and pipelines |
| State | Verifies the state store is readable and not corrupted |
| Adapters | Tests connectivity to configured adapters |
| Pipelines | Validates schema patterns, templates, and governance config |
| State Sync | Checks remote state backends (S3, Valkey) if configured |
| Auth | Pings each warehouse and discovery adapter to verify credentials and connectivity |
JSON output:
```json
{
  "version": "0.1.0",
  "command": "doctor",
  "checks": [
    { "name": "config", "status": "ok", "message": "rocky.toml valid" },
    { "name": "state", "status": "ok", "message": "state store readable" },
    { "name": "adapters", "status": "warn", "message": "adapter.fivetran: API key not set" }
  ],
  "overall": "warn"
}
```

Run a specific check:
```bash
rocky doctor --check auth
```

### rocky list
Inspect project contents: pipelines, adapters, models, sources, and dependency relationships.

```bash
rocky list pipelines          # List all pipeline definitions
rocky list adapters           # List all adapter configurations
rocky list models             # List all transformation models
rocky list sources            # List replication source configurations
rocky list deps <model>       # Show what a model depends on
rocky list consumers <model>  # Show what depends on a model
```

All subcommands support `--output json` (via the parent `-o json` flag) for machine-readable output.
Example (table format):
```
$ rocky -o table list pipelines
NAME        TYPE         TARGET   SOURCE   DEPENDS ON
playground  replication  default  default  -
```

Example (JSON format):

```json
{
  "version": "0.3.0",
  "command": "list_pipelines",
  "pipelines": [
    {
      "name": "playground",
      "pipeline_type": "replication",
      "target_adapter": "default",
      "source_adapter": "default",
      "depends_on": [],
      "concurrency": 16
    }
  ]
}
```

### rocky seed
Load static reference data from CSV files into the target warehouse. Rocky's equivalent of dbt's `dbt seed`.

```bash
rocky seed                         # Load all seeds from seeds/
rocky seed --seeds data/seeds/     # Custom seeds directory
rocky seed --filter name=dim_date  # Load specific seed
```

Seeds are `.csv` files in the `seeds/` directory. Rocky infers column types (STRING, BIGINT, DOUBLE, BOOLEAN, TIMESTAMP) from the data and creates/replaces the target tables. Optional `.toml` sidecars can override inferred types.
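Type inference of this kind typically tries progressively looser parsers on each column's values. A rough sketch, assuming a widening order and accepted formats that may differ from Rocky's actual rules:

```python
from datetime import datetime

def infer_type(values):
    """Infer a warehouse column type from a column of CSV strings.
    Empty strings are treated as nulls and ignored."""
    values = [v for v in values if v != ""]

    def all_parse(fn):
        try:
            for v in values:
                fn(v)
            return True
        except ValueError:
            return False

    if values and all(v.lower() in ("true", "false") for v in values):
        return "BOOLEAN"
    if all_parse(int):
        return "BIGINT"
    if all_parse(float):
        return "DOUBLE"
    if all_parse(datetime.fromisoformat):
        return "TIMESTAMP"
    return "STRING"  # fallback when nothing stricter fits
```

Note the ordering matters: every BIGINT column also parses as DOUBLE, so the strictest type must be tried first. Sidecar overrides exist precisely for cases inference gets wrong (e.g. a `date_key` that looks numeric).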
Sidecar example (seeds/dim_date.toml):
```toml
[target]
catalog = "warehouse"
schema = "reference"
table = "dim_date"

[[columns]]
name = "date_key"
data_type = "DATE"
```

JSON output:
```json
{
  "version": "0.3.0",
  "command": "seed",
  "tables": [
    { "name": "dim_date", "rows_loaded": 365, "columns": 4 }
  ]
}
```

### rocky compare
Compare shadow tables against production tables. Used after `rocky run --shadow` to validate results before promoting shadow data to production.

```bash
rocky compare --filter <key=value> [flags]
```

Flags:
| Flag | Required | Description |
|---|---|---|
| `--filter <key=value>` | Yes | Filter tables by component. |
| `--pipeline <NAME>` | No | Pipeline name. |
| `--shadow-suffix <SUFFIX>` | No | Shadow table suffix (default `_rocky_shadow`). |
| `--shadow-schema <NAME>` | No | Override schema for shadow tables. |
JSON output:
```json
{
  "version": "0.1.0",
  "command": "compare",
  "comparisons": [
    {
      "table": "warehouse.staging.orders",
      "shadow_table": "warehouse.staging.orders_rocky_shadow",
      "row_count_match": true,
      "schema_match": true,
      "production_rows": 15000,
      "shadow_rows": 15000
    }
  ]
}
```

### rocky state
Displays stored watermarks from the embedded state file.

```bash
rocky state
```

Behavior:

- Reads the redb state store (default: `.rocky-state.redb`).
- Lists every tracked table with its last watermark value and the timestamp it was recorded.
JSON output:
```json
{
  "version": "0.1.0",
  "command": "state",
  "watermarks": [
    {
      "table": "acme_warehouse.staging__us_west__shopify.orders",
      "last_value": "2026-03-30T10:00:00Z",
      "updated_at": "2026-03-30T10:01:32Z"
    }
  ]
}
```

Table output:

```
table                                               | last_value           | updated_at
────────────────────────────────────────────────────┼──────────────────────┼──────────────────────
acme_warehouse.staging__us_west__shopify.orders     | 2026-03-30T10:00:00Z | 2026-03-30T10:01:32Z
acme_warehouse.staging__us_west__shopify.customers  | 2026-03-30T09:55:00Z | 2026-03-30T10:01:32Z
```

### rocky snapshot
Execute an SCD Type 2 snapshot pipeline. Generates and runs MERGE SQL that tracks historical changes to a source table, maintaining `valid_from`, `valid_to`, `is_current`, and `snapshot_id` columns in the target history table.

```bash
rocky snapshot                          # Run the snapshot pipeline
rocky snapshot --dry-run                # Preview generated SQL without executing
rocky snapshot --pipeline customers_scd # Select a specific pipeline
```

Flags:
| Flag | Description |
|---|---|
| `--pipeline <NAME>` | Pipeline name. Required when more than one pipeline is defined. |
| `--dry-run` | Show generated SQL without executing. |
Pipeline config (`rocky.toml`):

```toml
[pipeline.customers_history]
type = "snapshot"
unique_key = ["customer_id"]
updated_at = "updated_at"
invalidate_hard_deletes = true

[pipeline.customers_history.source]
adapter = "prod"
catalog = "main"
schema = "raw"
table = "customers"

[pipeline.customers_history.target]
adapter = "prod"
catalog = "warehouse"
schema = "history"
table = "customers_history"
```

Strategies:
- **Timestamp** — detects changes by comparing the `updated_at` column between source and target. Efficient when the source maintains a reliable last-modified timestamp.
- **Check** — detects changes by comparing specified columns between source and target. Used when there is no reliable timestamp.
Generated SQL steps:

1. **Initial load** — `CREATE TABLE IF NOT EXISTS` with SCD2 columns added
2. **Close changed rows** — MERGE that sets `valid_to` and `is_current = FALSE`
3. **Insert new versions** — INSERT for rows that were just closed
4. **Invalidate hard deletes** (optional) — UPDATE rows missing from source
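The row-level semantics these steps implement (timestamp strategy) can be sketched in plain Python. `apply_snapshot` is an illustrative name operating on dicts rather than warehouse tables; the generated MERGE/INSERT SQL performs the equivalent set operations:

```python
def apply_snapshot(history, row, key="customer_id", updated_at="updated_at"):
    """Fold one source row into an SCD2 history list."""
    current = next((h for h in history
                    if h[key] == row[key] and h["is_current"]), None)
    if current and current[updated_at] == row[updated_at]:
        return history  # unchanged row: nothing to do
    if current:
        # Step 2: close the old version.
        current["is_current"] = False
        current["valid_to"] = row[updated_at]
    # Step 3 (or initial load): insert the new current version.
    history.append({**row, "valid_from": row[updated_at],
                    "valid_to": None, "is_current": True})
    return history
```

After two snapshots of the same customer with different `updated_at` values, the history holds the closed first version (with `valid_to` set) and an open current version.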
JSON output:
```json
{
  "version": "0.3.0",
  "command": "snapshot",
  "pipeline": "customers_history",
  "source": "main.raw.customers",
  "target": "warehouse.history.customers_history",
  "dry_run": false,
  "steps_total": 4,
  "steps_ok": 4,
  "steps": [
    { "step": "initial_load", "sql": "...", "status": "ok", "duration_ms": 12 },
    { "step": "merge_1", "sql": "...", "status": "ok", "duration_ms": 45 }
  ],
  "duration_ms": 120
}
```

### rocky docs
Generate project documentation as a single-page HTML catalog. Discovers models from the models directory and renders them with metadata, dependencies, and tests.

```bash
rocky docs                                          # Generate to docs/catalog.html
rocky docs --models models/ --output site/api.html  # Custom paths
```

Flags:
| Flag | Default | Description |
|---|---|---|
| `--models <PATH>` | `models` | Models directory to scan. |
| `--output <PATH>` | `docs/catalog.html` | Output HTML file path. |
Behavior:
- Loads all `.sql` and `.rocky` model files with their TOML sidecars.
- Extracts: name, description (from `intent`), target table, strategy, dependencies, tests.
- Renders a self-contained HTML page with dark theme, search, and model cards.
- No external dependencies — the HTML is fully self-contained.
JSON output:
```json
{
  "version": "0.3.0",
  "command": "docs",
  "output_path": "docs/catalog.html",
  "models_count": 12,
  "pipelines_count": 2,
  "duration_ms": 15
}
```

### rocky shell
Interactive SQL shell against the configured warehouse. Supports multi-line queries, `.tables` and `.schema` meta-commands, and command history.

```bash
rocky shell                 # Use default adapter
rocky shell --pipeline prod # Use a specific pipeline's adapter
```

Flags:
| Flag | Description |
|---|---|
| `--pipeline <NAME>` | Pipeline name to select the warehouse adapter. |
Meta-commands:
| Command | Description |
|---|---|
| `.tables` | List tables in the current catalog/schema. |
| `.schema <table>` | Describe columns for a table. |
| `.quit` / `.exit` | Exit the shell. |
Multi-line queries are supported: end a statement with `;` to execute it.
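The multi-line behavior can be pictured as a read-until-semicolon loop. This is an illustrative sketch, not the shell's actual implementation:

```python
def collect_statement(lines):
    """Accumulate input lines until one ends with ';', then return the
    buffered statement. Returns None if the statement is incomplete
    (the shell would keep showing a continuation prompt)."""
    buf = []
    for line in lines:
        buf.append(line)
        if line.rstrip().endswith(";"):
            return " ".join(buf)
    return None
```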
### rocky watch

Watch the models directory for file changes and auto-recompile. Useful during development to get instant feedback on model changes.
```bash
rocky watch                        # Watch models/ directory
rocky watch --models src/models/   # Custom directory
rocky watch --contracts contracts/ # Include contracts
```

Flags:
| Flag | Default | Description |
|---|---|---|
| `--models <PATH>` | `models` | Models directory to watch. |
| `--contracts <PATH>` | | Contracts directory (optional). |
Behavior:
- Uses filesystem notifications (platform-native) to detect changes.
- Debounces rapid changes (waits for writes to settle before recompiling).
- Runs `compile` on each change and reports diagnostics to the terminal.
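Debouncing of this kind waits for a quiet window after the last change before recompiling. A deterministic sketch of that logic (the actual window length is an internal detail of Rocky):

```python
def recompile_times(events, quiet=500):
    """Given file-change timestamps in milliseconds, return the times at
    which a debounced recompile would fire: only once no further change
    has arrived for `quiet` milliseconds."""
    fires = []
    for i, t in enumerate(events):
        nxt = events[i + 1] if i + 1 < len(events) else None
        if nxt is None or nxt - t >= quiet:
            fires.append(t + quiet)  # the quiet window elapsed
    return fires
```

A burst of saves at 0 ms, 100 ms, and 200 ms collapses into a single recompile once the window after the last event expires, rather than three recompiles.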
### rocky fmt

Format `.rocky` DSL files. Normalizes indentation, trims trailing whitespace, and enforces consistent style.
```bash
rocky fmt          # Format all .rocky files in current directory
rocky fmt models/  # Format a specific directory
rocky fmt --check  # Check mode: exit non-zero if any file needs formatting
```

Flags:
| Flag | Description |
|---|---|
| `--check` | Check mode for CI — exits non-zero if any file would be reformatted. |
Arguments:
| Argument | Default | Description |
|---|---|---|
| `paths` | `.` | Files or directories to format. |
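The `--check` contract amounts to "format, compare, and report instead of write". A minimal sketch, where the normalizations shown (trimming trailing whitespace, ensuring a final newline) are examples rather than Rocky's full rule set:

```python
def format_rocky(text):
    """Apply example formatting rules: strip trailing whitespace per line
    and guarantee exactly one trailing newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines) + "\n"

def check(files):
    """Check mode: return the process exit code. Non-zero means at least
    one file would be reformatted; nothing is written to disk."""
    dirty = [path for path, text in files.items()
             if format_rocky(text) != text]
    return 1 if dirty else 0
```

Running the check in CI then fails the build on unformatted files while leaving the working tree untouched, matching the behavior of `--check` flags in formatters like `rustfmt` and `gofmt`-style tools.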