JSON Output
Rocky’s JSON output is the interface contract between Rocky and orchestrators such as Dagster. The schema is versioned so that consumers can detect breaking changes. Every command that supports --output json is backed by a typed Rust struct deriving JsonSchema, with autogenerated Pydantic and TypeScript bindings.
Schema Version
Every JSON response includes a top-level version field that tracks the Rocky engine release. It’s set to env!("CARGO_PKG_VERSION") at compile time, so rocky --output json always reports the version of the binary producing the output. Examples on this page pin a representative version string for readability; your actual output will reflect whichever engine version you have installed.
Compatibility contract:
- Additive changes (new fields) ship in minor releases and are backward compatible.
- Field removals or renames are breaking and only happen in a major release.
- Orchestrators should parse defensively and ignore unknown fields.
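
A minimal sketch of what "parse defensively" could look like on the consumer side. The names `SUPPORTED_MAJOR`, `KNOWN_FIELDS`, and `parse_envelope` are illustrative assumptions, not part of Rocky or its generated bindings; the idea is only to reject an unexpected major version and silently drop fields the consumer does not know.

```python
import json

# Assumption: the major version this consumer was written against.
SUPPORTED_MAJOR = 1

# Assumption: the subset of top-level fields this consumer actually reads.
KNOWN_FIELDS = ("version", "command", "sources", "failed_sources")


def parse_envelope(raw: str) -> dict:
    """Parse a Rocky JSON response, keeping only the fields we understand."""
    doc = json.loads(raw)
    major = int(doc["version"].split(".")[0])
    if major != SUPPORTED_MAJOR:
        # Removals and renames only ship in major releases, so stop here.
        raise ValueError(f"unsupported major version: {doc['version']}")
    # Fields added in later minor releases are dropped rather than rejected.
    return {key: doc[key] for key in KNOWN_FIELDS if key in doc}


raw = '{"version": "1.6.0", "command": "discover", "sources": [], "future_field": 1}'
print(parse_envelope(raw))
```

In practice the autogenerated Pydantic models would usually replace the hand-written field list; the sketch only illustrates the two rules in the contract.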
For machine-readable schemas, the canonical source is schemas/*.schema.json in the repo, exported via rocky export-schemas. The Pydantic (dagster) and TypeScript (vscode) bindings are autogenerated from the same schemas — see the codegen pipeline.
Asset Key Format
Throughout the output, asset_key arrays follow the format:

    [source_type, ...component_values, table_name]

For example, a Fivetran source with tenant acme, region us_west, connector shopify, and table orders produces:

    ["fivetran", "acme", "us_west", "shopify", "orders"]

This key is designed to map directly to orchestrator asset definitions (e.g., Dagster’s AssetKey).
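
The key layout can be sketched in a few lines of Python. `build_asset_key` and `split_asset_key` are hypothetical helper names used only for illustration; they assume the component values are already in the order the schema pattern defines.

```python
def build_asset_key(source_type: str, component_values: list[str], table_name: str) -> list[str]:
    """Assemble [source_type, ...component_values, table_name]."""
    return [source_type, *component_values, table_name]


def split_asset_key(asset_key: list[str]) -> tuple[str, list[str], str]:
    """Inverse of build_asset_key: first element, middle elements, last element."""
    source_type, *component_values, table_name = asset_key
    return source_type, component_values, table_name


key = build_asset_key("fivetran", ["acme", "us_west", "shopify"], "orders")
print(key)
```

An orchestrator integration would typically pass the resulting list straight to its own asset-key type.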
Core Pipeline Commands
rocky discover
Returns all discovered sources and their tables.

    {
      "version": "1.6.0",
      "command": "discover",
      "sources": [
        {
          "id": "connector_abc123",
          "components": { "tenant": "acme", "regions": ["us_west"], "source": "shopify" },
          "source_type": "fivetran",
          "last_sync_at": "2026-03-30T10:00:00Z",
          "tables": [
            { "name": "orders", "row_count": null },
            { "name": "customers", "row_count": null },
            { "name": "products", "row_count": null }
          ]
        }
      ],
      "failed_sources": [
        {
          "id": "connector_flaky456",
          "schema": "src__acme__us_west__hubspot",
          "source_type": "fivetran",
          "error_class": "transient",
          "message": "schema fetch failed: 503 Service Unavailable"
        }
      ],
      "checks": { "freshness": { "threshold_seconds": 86400 } }
    }

Field reference:
| Field | Type | Description |
|---|---|---|
sources[].id | string | Connector identifier from the source system. |
sources[].components | object | Parsed schema pattern components. |
sources[].source_type | string | Source type ("fivetran" or "manual"). |
sources[].last_sync_at | string or null | ISO 8601 timestamp of the last successful sync. Null if unknown. |
sources[].tables | array | List of tables in this source. |
sources[].tables[].name | string | Table name. |
sources[].tables[].row_count | integer or null | Row count if available, otherwise null. |
failed_sources | array or absent | Sources the adapter attempted to fetch metadata for and failed on. Absent when empty. Distinct from sources (succeeded) and excluded_tables (filtered post-success). |
failed_sources[].id | string | Connector or namespace identifier. |
failed_sources[].schema | string | Source schema string (when known). |
failed_sources[].source_type | string | Source type ("fivetran", "iceberg", etc.). |
failed_sources[].error_class | string | One of "transient", "timeout", "rate_limit", "auth", "unknown". Lets consumers branch on operating-mode without parsing message. |
failed_sources[].message | string | Free-form error detail for human inspection. |
checks | object or absent | Pipeline-level check configuration, when [checks] is declared in rocky.toml. |
checks.freshness.threshold_seconds | integer | Freshness threshold in seconds. |
Consumers diffing successive discover snapshots must treat ids that appear in failed_sources but not in sources as “unknown state, do not delete” — that’s the contract that distinguishes a fetch failure from a deletion. Available since engine 1.17.4.
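
The snapshot-diff rule can be sketched as follows. `classify_missing_sources` is a hypothetical helper, not a Rocky API; it assumes both arguments are parsed `rocky discover` outputs and that `failed_sources` may be absent.

```python
def classify_missing_sources(previous: dict, current: dict) -> dict:
    """Split ids that vanished from `sources` into deleted vs. unknown."""
    prev_ids = {s["id"] for s in previous.get("sources", [])}
    curr_ids = {s["id"] for s in current.get("sources", [])}
    # failed_sources is absent when empty, so default to an empty list.
    failed_ids = {f["id"] for f in current.get("failed_sources", [])}

    missing = prev_ids - curr_ids
    return {
        # Gone from sources and not reported as failed: treat as deleted.
        "deleted": sorted(missing - failed_ids),
        # Gone from sources but listed in failed_sources: unknown, keep it.
        "unknown": sorted(missing & failed_ids),
    }


previous = {"sources": [{"id": "a"}, {"id": "b"}, {"id": "c"}]}
current = {"sources": [{"id": "a"}], "failed_sources": [{"id": "b"}]}
print(classify_missing_sources(previous, current))
```

Only the ids under "deleted" are candidates for removal downstream; ids under "unknown" should be retried on the next snapshot.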
rocky run
Note: as of engine v1.33, the canonical form is rocky plan followed by rocky apply <plan-id>. rocky run continues to work and is now an alias; the JSON output shape below is the same on both apply and run (the command field reflects which verb was invoked). Suppress the alias’s stderr deprecation notice with ROCKY_SUPPRESS_DEPRECATION=1.
Returns a complete summary of the pipeline execution.

    {
      "version": "1.6.0",
      "command": "run",
      "pipeline_type": "replication",
      "filter": "tenant=acme",
      "duration_ms": 45200,
      "tables_copied": 20,
      "tables_failed": 0,
      "materializations": [
        {
          "asset_key": ["fivetran", "acme", "us_west", "shopify", "orders"],
          "rows_copied": null,
          "duration_ms": 2300,
          "metadata": {
            "strategy": "incremental",
            "watermark": "2026-03-30T10:00:00Z",
            "target_table_full_name": "acme_warehouse.staging__us_west__shopify.orders",
            "sql_hash": "a1b2c3d4e5f67890",
            "column_count": 12,
            "compile_time_ms": 8
          }
        }
      ],
      "check_results": [
        {
          "asset_key": ["fivetran", "acme", "us_west", "shopify", "orders"],
          "checks": [
            { "name": "row_count", "passed": true, "source_count": 15000, "target_count": 15000 },
            { "name": "column_match", "passed": true, "missing": [], "extra": [] },
            { "name": "freshness", "passed": true, "lag_seconds": 300, "threshold_seconds": 86400 }
          ]
        }
      ],
      "permissions": { "grants_added": 3, "grants_revoked": 0, "catalogs_created": 1, "schemas_created": 2 },
      "drift": {
        "tables_checked": 20,
        "tables_drifted": 1,
        "actions_taken": [
          {
            "table": "acme_warehouse.staging__us_west__shopify.line_items",
            "action": "drop_and_recreate",
            "reason": "column 'status' changed STRING -> INT"
          }
        ]
      },
      "execution": { "concurrency": 8, "tables_processed": 20, "tables_failed": 0 },
      "metrics": {
        "tables_processed": 20,
        "tables_failed": 0,
        "statements_executed": 45,
        "retries_attempted": 1,
        "retries_succeeded": 1,
        "anomalies_detected": 0,
        "table_duration_p50_ms": 1200,
        "table_duration_p95_ms": 4500,
        "table_duration_max_ms": 8200,
        "query_duration_p50_ms": 800,
        "query_duration_p95_ms": 3200,
        "query_duration_max_ms": 7100
      },
      "errors": [],
      "anomalies": []
    }

Top-level fields:
| Field | Type | Description |
|---|---|---|
pipeline_type | string or absent | Pipeline type executed (e.g., "replication"). |
filter | string or null | The filter applied to this run, if any. |
duration_ms | integer | Total pipeline execution time in milliseconds. |
tables_copied | integer | Number of tables that were copied (full or incremental). |
tables_failed | integer | Number of tables that failed during processing. |
tables_skipped | integer | Number of tables skipped (omitted when 0). |
resumed_from | string or absent | Run ID this run resumed from, if --resume was used. |
shadow | boolean | True when running in shadow mode (omitted when false). |
errors | array | Error details for tables that failed. Each entry has asset_key and error. |
execution | object | Concurrency and throughput summary. |
metrics | object or null | Counters and percentile histograms for the run. |
anomalies | array | Row count anomalies detected by historical baseline comparison. |
partition_summaries | array | Per-model partition execution summaries (present for time_interval models). |
cost_summary | object or absent | Per-run cost rollup: total_usd (float or null), by_adapter (map of adapter → USD). Present when at least one adapter reports cost data. See [budget] for how cost limits are enforced. |
budget_breaches | array | Populated when [budget] limits tripped. Each entry has limit_type ("max_usd" / "max_duration_ms" / "max_bytes_scanned"), limit, and actual (both floats). Empty array when within budget or no limits configured. |
materializations[]:
| Field | Type | Description |
|---|---|---|
asset_key | array of strings | Unique asset identifier. |
rows_copied | integer or null | Number of rows inserted. Null if the warehouse does not report this. |
duration_ms | integer | Time spent copying this table in milliseconds. |
metadata.strategy | string | Replication strategy used ("incremental" or "full_refresh"). |
metadata.watermark | string or null | The watermark value after this copy. Null for full refresh. |
metadata.target_table_full_name | string or absent | Fully-qualified target table (catalog.schema.table). |
metadata.sql_hash | string or absent | 16-char hex hash of the generated SQL. |
metadata.column_count | integer or absent | Number of columns in the materialized table. |
metadata.compile_time_ms | integer or absent | Compile time in milliseconds for derived models. |
metadata.cost_usd | float or absent | Estimated cost for this materialization in USD. Rolls up into cost_summary.total_usd at the run level. |
job_ids | array of strings | Warehouse-side job IDs for the statements this materialization issued, accumulated alongside bytes_scanned / bytes_written. Lets orchestrators cross-check rocky-reported figures against the warehouse console (bq show -j, Snowflake query history, Databricks SQL warehouse history). Empty [] for adapters that don’t surface a job id. Available since engine 1.21.0. |
partition | object or absent | Partition window info for time_interval materializations. |
Cross-checking BigQuery cost against bq show -j
job_ids lets operators reconcile rocky’s reported bytes_scanned against the figure BigQuery’s own job statistics return, which is useful for confirming that the cost numbers rocky run reports match the GCP console before they show up on the bill.

    # Capture the first job id from a run.
    rocky run --config rocky.toml --output json \
      | jq -r '.materializations[].job_ids[]' \
      | head -1
    # → bquxjob_5f3c4e2a_19a1b6d3e21

    # Fetch the same job with the bq CLI and read the billed bytes.
    bq show -j --location=EU --format=prettyjson bquxjob_5f3c4e2a_19a1b6d3e21 \
      | jq '.statistics.query.totalBytesBilled'
    # → "10485760"

The number returned by bq show -j is the same value the BigQuery console displays under “Bytes billed” and matches materializations[].bytes_scanned in rocky’s JSON output. --location must match the dataset’s region (EU, US, us-east1, …); BigQuery jobs are region-scoped, and bq show -j returns Not found: Job if the location is wrong.
check_results[]:
| Field | Type | Description |
|---|---|---|
asset_key | array of strings | The table this check applies to. |
checks[].name | string | Check name: "row_count", "column_match", or "freshness". |
checks[].passed | boolean | Whether the check passed. |
Additional fields vary by check type:
- row_count: source_count (integer), target_count (integer)
- column_match: missing (list of column names missing from target), extra (list of unexpected columns in target)
- freshness: lag_seconds (integer), threshold_seconds (integer)
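
As an illustration of branching on the per-check fields, the sketch below walks check_results and renders one line per failed check. `describe_check` and `failed_checks` are hypothetical helpers; the fallback branch exists because consumers should tolerate check names they do not recognize.

```python
def describe_check(check: dict) -> str:
    """Render the type-specific fields of a single check entry."""
    name = check["name"]
    if name == "row_count":
        return f"row_count: source={check['source_count']} target={check['target_count']}"
    if name == "column_match":
        return f"column_match: missing={check['missing']} extra={check['extra']}"
    if name == "freshness":
        return f"freshness: lag={check['lag_seconds']}s threshold={check['threshold_seconds']}s"
    return name  # unknown check type: fall back to the bare name


def failed_checks(run_output: dict) -> list[tuple[str, str]]:
    """Return (asset, description) pairs for every check that did not pass."""
    failures = []
    for result in run_output.get("check_results", []):
        asset = "/".join(result["asset_key"])
        for check in result.get("checks", []):
            if not check["passed"]:
                failures.append((asset, describe_check(check)))
    return failures


sample = {
    "check_results": [
        {
            "asset_key": ["fivetran", "acme", "us_west", "shopify", "orders"],
            "checks": [
                {"name": "row_count", "passed": False, "source_count": 15000, "target_count": 14990},
                {"name": "column_match", "passed": True, "missing": [], "extra": []},
            ],
        }
    ]
}
print(failed_checks(sample))
```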
permissions:
| Field | Type | Description |
|---|---|---|
grants_added | integer | Number of GRANT statements executed. |
grants_revoked | integer | Number of REVOKE statements executed. |
catalogs_created | integer | Number of catalogs created during this run. |
schemas_created | integer | Number of schemas created during this run. |
drift:
| Field | Type | Description |
|---|---|---|
tables_checked | integer | Total tables inspected for schema drift. |
tables_drifted | integer | Number of tables where drift was detected. |
actions_taken[].table | string | Fully qualified table name. |
actions_taken[].action | string | Action taken (e.g., "drop_and_recreate"). |
actions_taken[].reason | string | Human-readable explanation of the drift. |
rocky plan
Returns the SQL statements that would be executed, without running them.

    {
      "version": "1.6.0",
      "command": "plan",
      "filter": "tenant=acme",
      "statements": [
        {
          "purpose": "create_catalog",
          "target": "acme_warehouse",
          "sql": "CREATE CATALOG IF NOT EXISTS acme_warehouse"
        },
        {
          "purpose": "create_schema",
          "target": "acme_warehouse.staging__us_west__shopify",
          "sql": "CREATE SCHEMA IF NOT EXISTS acme_warehouse.staging__us_west__shopify"
        },
        {
          "purpose": "incremental_copy",
          "target": "acme_warehouse.staging__us_west__shopify.orders",
          "sql": "INSERT INTO acme_warehouse.staging__us_west__shopify.orders SELECT *, CAST(NULL AS STRING) AS _loaded_by FROM raw_catalog.src__acme__us_west__shopify.orders WHERE _fivetran_synced > (...)"
        }
      ]
    }

statements[]:
| Field | Type | Description |
|---|---|---|
purpose | string | What this statement does: "create_catalog", "create_schema", "incremental_copy", "full_refresh", "grant", "revoke", "tag_catalog", "tag_schema". |
target | string | The fully qualified object this statement operates on. |
sql | string | The exact SQL that would be executed. |
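
A reviewer might want a quick summary of a plan before applying it. The sketch below groups statements by purpose; `count_by_purpose` and `targets_for` are hypothetical helper names, and the input is assumed to be a parsed `rocky plan` output.

```python
from collections import Counter


def count_by_purpose(plan_output: dict) -> dict:
    """Count planned statements per purpose value."""
    return dict(Counter(s["purpose"] for s in plan_output.get("statements", [])))


def targets_for(plan_output: dict, purpose: str) -> list[str]:
    """List the targets of all statements with the given purpose, in plan order."""
    return [s["target"] for s in plan_output.get("statements", []) if s["purpose"] == purpose]


plan = {
    "statements": [
        {"purpose": "create_catalog", "target": "acme_warehouse", "sql": "..."},
        {"purpose": "create_schema", "target": "acme_warehouse.staging__us_west__shopify", "sql": "..."},
        {"purpose": "incremental_copy", "target": "acme_warehouse.staging__us_west__shopify.orders", "sql": "..."},
    ]
}
print(count_by_purpose(plan))
```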
rocky state
Returns stored watermarks from the embedded state file.

    {
      "version": "1.6.0",
      "command": "state",
      "watermarks": [
        {
          "table": "acme_warehouse.staging__us_west__shopify.orders",
          "last_value": "2026-03-30T10:00:00Z",
          "updated_at": "2026-03-30T10:01:32Z"
        },
        {
          "table": "acme_warehouse.staging__us_west__shopify.customers",
          "last_value": "2026-03-30T09:55:00Z",
          "updated_at": "2026-03-30T10:01:32Z"
        }
      ]
    }

watermarks[]:
| Field | Type | Description |
|---|---|---|
table | string | Fully qualified target table name. |
last_value | string | The last watermark value (ISO 8601 timestamp). |
updated_at | string | When this watermark was last written (ISO 8601 timestamp). |
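
One possible use of the watermark list is flagging tables whose watermark has fallen behind. The sketch below is an assumption about how a consumer might do that; `parse_ts` and `stale_tables` are hypothetical names, and the staleness threshold is the caller's own policy rather than anything Rocky defines.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-03-30T10:00:00Z."""
    # datetime.fromisoformat only accepts a trailing "Z" on Python 3.11+,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stale_tables(state_output: dict, now: datetime, max_age: timedelta) -> list[str]:
    """Return tables whose last watermark is older than max_age."""
    return [
        w["table"]
        for w in state_output.get("watermarks", [])
        if now - parse_ts(w["last_value"]) > max_age
    ]


state = {
    "watermarks": [
        {"table": "acme_warehouse.staging__us_west__shopify.orders", "last_value": "2026-03-30T10:00:00Z"},
        {"table": "acme_warehouse.staging__us_west__shopify.customers", "last_value": "2026-03-30T09:55:00Z"},
    ]
}
now = datetime(2026, 3, 30, 12, 0, 0, tzinfo=timezone.utc)
print(stale_tables(state, now, timedelta(hours=2)))
```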
rocky doctor
Aggregate health checks across config, state, adapters, and pipelines.

    {
      "command": "doctor",
      "overall": "healthy",
      "checks": [
        { "name": "config", "status": "healthy", "message": "Config syntax valid", "duration_ms": 2 },
        { "name": "state", "status": "healthy", "message": "State store healthy (12 watermarks)", "duration_ms": 5 },
        { "name": "adapters", "status": "healthy", "message": "All adapters reachable", "duration_ms": 340 }
      ],
      "suggestions": []
    }

Field reference:
| Field | Type | Description |
|---|---|---|
overall | string | Aggregate status: "healthy", "warning", or "critical". |
checks[].name | string | Check category: "config", "state", "adapters", "pipelines", "state_sync", "state_rw", "auth", or "auth/<adapter>" (per-adapter auth result). |
checks[].status | string | "healthy", "warning", or "critical". |
checks[].message | string | Human-readable result. |
checks[].duration_ms | integer | Time spent on this check in milliseconds. |
checks[].details | array of [key, value] string pairs | Optional per-check context (config path, state file size, adapter type + credential signal, pipeline kind, state backend). Populated only when --verbose is passed; omitted from the envelope entirely otherwise (skip_serializing_if = "Vec::is_empty"). |
suggestions | array of strings | Actionable fix suggestions for any non-healthy checks. |
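
A wrapper script could reduce the doctor output to the checks that need attention. The sketch below is illustrative only: `unhealthy_checks`, `severity_rank`, and the numeric ranking are assumptions of this example, not exit codes or behavior defined by Rocky.

```python
# Assumption: an ordering the wrapper uses to compare statuses.
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def unhealthy_checks(doctor_output: dict) -> list[str]:
    """Return 'name: message' lines for every non-healthy check."""
    return [
        f"{c['name']}: {c['message']}"
        for c in doctor_output.get("checks", [])
        if c["status"] != "healthy"
    ]


def severity_rank(doctor_output: dict) -> int:
    """Map the aggregate status to a number; unknown statuses rank as critical."""
    return SEVERITY.get(doctor_output["overall"], 2)


doctor = {
    "command": "doctor",
    "overall": "warning",
    "checks": [
        {"name": "config", "status": "healthy", "message": "Config syntax valid", "duration_ms": 2},
        {"name": "auth/databricks", "status": "warning", "message": "token expires soon", "duration_ms": 40},
    ],
    "suggestions": [],
}
print(unhealthy_checks(doctor), severity_rank(doctor))
```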
Drift (inline within rocky apply)
There is no standalone rocky drift CLI command today. Drift detection runs inline during rocky apply (or the legacy rocky run alias) and surfaces through the top-level drift field of the run JSON output (already shown under rocky run above).
Field reference for drift:
| Field | Type | Description |
|---|---|---|
drift.tables_checked | integer | Total tables inspected for schema drift. |
drift.tables_drifted | integer | Number of tables where drift was detected. |
drift.actions_taken[].table | string | Fully qualified table name. |
drift.actions_taken[].action | string | Action taken. Currently emits "drop_and_recreate" (unsafe type mismatch) or "add_column" (column added to source). Safe widenings (AlterColumnTypes in rocky-core) don’t yet surface an action string through the run path. |
drift.actions_taken[].reason | string | Human-readable explanation. |
rocky compare
Compare shadow tables against production targets after a shadow run.

    {
      "version": "1.6.0",
      "command": "compare",
      "filter": "tenant=acme",
      "tables_compared": 20,
      "tables_passed": 19,
      "tables_warned": 1,
      "tables_failed": 0,
      "results": [
        {
          "production_table": "acme_warehouse.staging__us_west__shopify.orders",
          "shadow_table": "acme_warehouse.staging__us_west__shopify.orders_rocky_shadow",
          "row_count_match": true,
          "production_count": 15000,
          "shadow_count": 15000,
          "row_count_diff_pct": 0.0,
          "schema_match": true,
          "schema_diffs": [],
          "verdict": "pass"
        }
      ],
      "overall_verdict": "pass"
    }

Field reference:
| Field | Type | Description |
|---|---|---|
filter | string | Filter applied to select tables. |
tables_compared | integer | Total tables compared. |
tables_passed | integer | Tables with no differences. |
tables_warned | integer | Tables with minor differences (within threshold). |
tables_failed | integer | Tables with significant differences. |
results[].production_table | string | Fully qualified production table name. |
results[].shadow_table | string | Fully qualified shadow table name. |
results[].row_count_match | boolean | Whether row counts match. |
results[].production_count | integer | Row count in production table. |
results[].shadow_count | integer | Row count in shadow table. |
results[].row_count_diff_pct | number | Row count difference as a percentage. |
results[].schema_match | boolean | Whether schemas match. |
results[].schema_diffs | array of strings | Column-level schema differences. |
results[].verdict | string | "pass", "warn", or "fail". |
overall_verdict | string | Aggregate verdict across all tables. |
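
A deployment gate built on this output might look like the sketch below. `tables_needing_review` and `safe_to_promote` are hypothetical names, and whether a "warn" verdict should block is a policy choice assumed here, not something the compare command prescribes.

```python
def tables_needing_review(compare_output: dict) -> list[str]:
    """Return production tables whose verdict is anything other than 'pass'."""
    return [
        r["production_table"]
        for r in compare_output.get("results", [])
        if r["verdict"] != "pass"
    ]


def safe_to_promote(compare_output: dict, allow_warn: bool = False) -> bool:
    """Gate on the aggregate verdict; optionally tolerate 'warn'."""
    allowed = {"pass", "warn"} if allow_warn else {"pass"}
    return compare_output["overall_verdict"] in allowed


compare = {
    "results": [
        {"production_table": "wh.staging.orders", "verdict": "pass"},
        {"production_table": "wh.staging.customers", "verdict": "warn"},
    ],
    "overall_verdict": "warn",
}
print(tables_needing_review(compare), safe_to_promote(compare))
```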
Modeling Commands
rocky compile
Compile models, resolve dependencies, type-check SQL, and build the semantic graph.

    {
      "version": "1.6.0",
      "command": "compile",
      "models": 14,
      "execution_layers": 4,
      "diagnostics": [],
      "has_errors": false,
      "compile_timings": { "parse_ms": 12, "resolve_ms": 8, "typecheck_ms": 18, "total_ms": 42 },
      "models_detail": [
        {
          "name": "fct_revenue",
          "strategy": { "type": "incremental", "timestamp_column": "updated_at" },
          "target": { "catalog": "acme_warehouse", "schema": "analytics", "table": "fct_revenue" },
          "freshness": { "warn_after_seconds": 3600, "error_after_seconds": 86400 },
          "contract_source": "auto"
        }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
models | integer | Number of models compiled. |
execution_layers | integer | Number of layers in the dependency DAG. |
diagnostics | array | Compiler diagnostics (errors and warnings). |
has_errors | boolean | Whether any compilation errors were found. |
compile_timings | object | Per-phase timing breakdown in milliseconds. |
models_detail | array | Per-model details (strategy, target, freshness). Empty when no models exist. |
models_detail[].name | string | Model name. |
models_detail[].strategy | object | Materialization strategy configuration. |
models_detail[].target | object | Target table coordinates (catalog, schema, table). |
models_detail[].freshness | object or absent | Per-model freshness expectation, when declared. |
models_detail[].contract_source | string or absent | "auto" or "explicit", absent when no contract. |
rocky lineage <model>
Show model-level lineage with columns, upstream/downstream models, and column-level edges.

    {
      "version": "1.6.0",
      "command": "lineage",
      "model": "fct_revenue",
      "columns": [
        { "name": "customer_id" },
        { "name": "revenue_month" },
        { "name": "net_revenue" }
      ],
      "upstream": ["stg_orders", "stg_refunds"],
      "downstream": ["rpt_monthly_summary"],
      "edges": [
        {
          "source": { "model": "stg_orders", "column": "customer_id" },
          "target": { "model": "fct_revenue", "column": "customer_id" },
          "transform": "direct"
        },
        {
          "source": { "model": "stg_orders", "column": "total_amount" },
          "target": { "model": "fct_revenue", "column": "net_revenue" },
          "transform": "expression"
        }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
model | string | The model being analyzed. |
columns | array | Columns in the model’s output schema. |
upstream | array of strings | Models this model depends on. |
downstream | array of strings | Models that depend on this model. |
edges[].source | object | Source column (model + column). |
edges[].target | object | Target column (model + column). |
edges[].transform | string | Transform kind: "direct", "cast", "expression", etc. |
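
To show how the edges list can be consumed, the sketch below indexes edges by target column so a UI or check can look up where each column comes from. `upstream_columns` is a hypothetical helper and the "model.column" key format is an assumption of this example.

```python
from collections import defaultdict


def upstream_columns(lineage_output: dict) -> dict:
    """Map 'model.column' targets to their (model, column, transform) sources."""
    mapping = defaultdict(list)
    for edge in lineage_output.get("edges", []):
        src, tgt = edge["source"], edge["target"]
        key = f"{tgt['model']}.{tgt['column']}"
        mapping[key].append((src["model"], src["column"], edge["transform"]))
    return dict(mapping)


lineage = {
    "model": "fct_revenue",
    "edges": [
        {
            "source": {"model": "stg_orders", "column": "customer_id"},
            "target": {"model": "fct_revenue", "column": "customer_id"},
            "transform": "direct",
        },
        {
            "source": {"model": "stg_orders", "column": "total_amount"},
            "target": {"model": "fct_revenue", "column": "net_revenue"},
            "transform": "expression",
        },
    ],
}
print(upstream_columns(lineage))
```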
rocky lineage <model> --column <col>
Trace a single column through its lineage chain. Default direction is upstream (sources). Pass --downstream to walk consumers instead.

    {
      "version": "1.11.0",
      "command": "lineage",
      "model": "fct_revenue",
      "column": "net_revenue",
      "direction": "upstream",
      "trace": [
        {
          "source": { "model": "stg_orders", "column": "total_amount" },
          "target": { "model": "fct_revenue", "column": "net_revenue" },
          "transform": "expression"
        },
        {
          "source": { "model": "stg_refunds", "column": "refund_amount" },
          "target": { "model": "fct_revenue", "column": "net_revenue" },
          "transform": "expression"
        }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
model | string | The model containing the traced column. |
column | string | The column being traced. |
direction | string | "upstream" (default) or "downstream". Set by the --downstream flag. |
trace[].source | object | Source column (model + column). |
trace[].target | object | Target column (model + column). |
trace[].transform | string | Transform kind: "direct", "cast", "expression", etc. |
rocky test
Run local model tests via DuckDB.

    {
      "version": "1.6.0",
      "command": "test",
      "total": 14,
      "passed": 12,
      "failed": 2,
      "failures": [
        { "name": "fct_orders::not_null_order_id", "error": "found 3 null values" },
        { "name": "fct_orders::unique_order_id", "error": "found 7 duplicates" }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
total | integer | Total number of tests run. |
passed | integer | Number of tests that passed. |
failed | integer | Number of tests that failed. |
failures[].name | string | Qualified test name. |
failures[].error | string | Error message describing the failure. |
rocky ci
Combined compile + test output for CI/CD pipelines.

    {
      "version": "1.6.0",
      "command": "ci",
      "compile_ok": true,
      "tests_ok": true,
      "models_compiled": 14,
      "tests_passed": 14,
      "tests_failed": 0,
      "exit_code": 0,
      "diagnostics": [],
      "failures": []
    }

Field reference:
| Field | Type | Description |
|---|---|---|
compile_ok | boolean | Whether compilation succeeded without errors. |
tests_ok | boolean | Whether all tests passed. |
models_compiled | integer | Number of models compiled. |
tests_passed | integer | Number of tests that passed. |
tests_failed | integer | Number of tests that failed. |
exit_code | integer | Process exit code (0 = success). |
diagnostics | array | Compiler diagnostics (errors and warnings). |
failures | array | Test failures (same shape as rocky test failures). |
Administration Commands
rocky history
Show recent pipeline run history.

    {
      "version": "1.6.0",
      "command": "history",
      "runs": [
        {
          "run_id": "run_20260401_143022",
          "started_at": "2026-04-01T14:30:22Z",
          "status": "success",
          "trigger": "cli",
          "models_executed": 20
        },
        {
          "run_id": "run_20260401_080015",
          "started_at": "2026-04-01T08:00:15Z",
          "status": "partial",
          "trigger": "dagster",
          "models_executed": 19
        }
      ],
      "count": 2
    }

Field reference:
| Field | Type | Description |
|---|---|---|
runs[].run_id | string | Unique run identifier. |
runs[].started_at | string | ISO 8601 timestamp when the run started. |
runs[].status | string | Run status ("success", "partial", "failed"). |
runs[].trigger | string | What triggered the run ("cli", "dagster", "scheduler"). |
runs[].models_executed | integer | Number of models executed in this run. |
count | integer | Total number of runs returned. |
rocky history --model <name>
Show execution history for a specific model.

    {
      "version": "1.6.0",
      "command": "history",
      "model": "fct_revenue",
      "executions": [
        {
          "started_at": "2026-04-01T14:30:22Z",
          "duration_ms": 2300,
          "rows_affected": 1482,
          "status": "success",
          "sql_hash": "a1b2c3d4e5f67890"
        },
        {
          "started_at": "2026-03-31T14:30:05Z",
          "duration_ms": 8900,
          "rows_affected": null,
          "status": "success",
          "sql_hash": "f8e7d6c5b4a39210"
        }
      ],
      "count": 2
    }

Field reference:
| Field | Type | Description |
|---|---|---|
model | string | Model name. |
executions[].started_at | string | ISO 8601 timestamp. |
executions[].duration_ms | integer | Execution time in milliseconds. |
executions[].rows_affected | integer or null | Rows affected, if reported by the warehouse. |
executions[].status | string | Execution status. |
executions[].sql_hash | string | Hash of the SQL that was executed. |
count | integer | Total number of executions returned. |
rocky replay <run_id|latest>
Per-model dump of a recorded run: SQL hash, row counts, bytes, and timings captured at execution time.

    {
      "version": "1.11.0",
      "command": "replay",
      "run_id": "run_20260420_143022",
      "status": "success",
      "trigger": "manual",
      "started_at": "2026-04-20T14:30:22Z",
      "finished_at": "2026-04-20T14:31:07Z",
      "config_hash": "cfg_a3f2b1c4",
      "models": [
        {
          "model_name": "fct_revenue",
          "status": "success",
          "started_at": "2026-04-20T14:30:24Z",
          "finished_at": "2026-04-20T14:31:07Z",
          "duration_ms": 43000,
          "sql_hash": "hash_b4e3c2d5",
          "rows_affected": 8900,
          "bytes_scanned": 20971520,
          "bytes_written": 4194304
        }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
run_id | string | Run identifier from the state store. |
status | string | "success", "partial_failure", or "failure". |
trigger | string | What triggered the run: "manual", "sensor", "schedule", "ci". |
config_hash | string | Hash of the rocky.toml effective at run time. |
models[].sql_hash | string | Stable across runs where the compiled SQL is identical. |
models[].bytes_scanned / bytes_written | integer or null | Per-model byte counts, when the adapter reports them. |
rocky trace <run_id|latest>
Same run, laid out as a Gantt timeline with per-model start offsets and concurrency-lane assignments.

    {
      "version": "1.11.0",
      "command": "trace",
      "run_id": "run_20260420_143022",
      "status": "success",
      "trigger": "manual",
      "started_at": "2026-04-20T14:30:22Z",
      "finished_at": "2026-04-20T14:31:07Z",
      "run_duration_ms": 45000,
      "lane_count": 2,
      "models": [
        {
          "model_name": "stg_orders",
          "status": "success",
          "start_offset_ms": 0,
          "duration_ms": 1200,
          "sql_hash": "hash_a3f2b1c4",
          "lane": 0,
          "rows_affected": 150000,
          "bytes_scanned": 41943040,
          "bytes_written": 20971520
        }
      ]
    }

Field reference:
| Field | Type | Description |
|---|---|---|
run_duration_ms | integer | Wall time from started_at to finished_at. |
lane_count | integer | Observed maximum concurrency (greedy first-fit over start offsets). |
models[].start_offset_ms | integer | Milliseconds between the run start and this model’s start. |
models[].lane | integer | Concurrency lane index; models sharing a lane do not overlap in time. |
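
For readers unfamiliar with greedy first-fit, the sketch below shows one way lanes could be derived from start offsets and durations. It illustrates the general technique only and is not Rocky's implementation; in particular, `assign_lanes` is a hypothetical name, and treating a lane as free when its previous model ends exactly at the next start is an assumption.

```python
def assign_lanes(models: list[dict]) -> tuple[list[int], int]:
    """Greedy first-fit: give each model the lowest-numbered lane that is free."""
    lane_free_at: list[int] = []  # lane_free_at[i] = end time of the last model in lane i
    lanes = [0] * len(models)
    # Visit models in order of start offset (stable for ties).
    order = sorted(range(len(models)), key=lambda i: models[i]["start_offset_ms"])
    for i in order:
        start = models[i]["start_offset_ms"]
        end = start + models[i]["duration_ms"]
        for lane, free_at in enumerate(lane_free_at):
            if free_at <= start:  # first lane whose previous model has finished
                lane_free_at[lane] = end
                lanes[i] = lane
                break
        else:
            # No existing lane is free, so open a new one.
            lane_free_at.append(end)
            lanes[i] = len(lane_free_at) - 1
    return lanes, len(lane_free_at)


timeline = [
    {"model_name": "a", "start_offset_ms": 0, "duration_ms": 1200},
    {"model_name": "b", "start_offset_ms": 0, "duration_ms": 500},
    {"model_name": "c", "start_offset_ms": 600, "duration_ms": 300},
    {"model_name": "d", "start_offset_ms": 1200, "duration_ms": 100},
]
print(assign_lanes(timeline))
```

The second return value corresponds to the idea behind lane_count: the number of lanes the greedy pass had to open.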
rocky branch create / branch show / branch list / branch delete
Branches use three distinct output shapes. See rocky branch for command-level usage.

    {
      "version": "1.11.0",
      "command": "branch create",
      "branch": {
        "name": "fix-price",
        "schema_prefix": "branch__fix-price",
        "created_by": "hugo",
        "created_at": "2026-04-20T14:22:11+00:00",
        "description": "testing reprice migration"
      }
    }

    {
      "version": "1.11.0",
      "command": "branch list",
      "total": 2,
      "branches": [
        {
          "name": "fix-price",
          "schema_prefix": "branch__fix-price",
          "created_by": "hugo",
          "created_at": "2026-04-20T14:22:11+00:00",
          "description": "testing reprice migration"
        }
      ]
    }

    {
      "version": "1.11.0",
      "command": "branch delete",
      "name": "fix-price",
      "removed": true
    }

removed is false when the branch didn’t exist. Deletion does not drop warehouse tables; that’s a separate cleanup step.
rocky branch approve
Writes a content-addressed approval artifact that binds the approver’s git identity to a branch’s current state hash. The artifact path is returned so callers can stage it for review.

    {
      "version": "1.31.0",
      "command": "branch approve",
      "artifact": {
        "version": "1",
        "branch": "fix-price",
        "branch_state_hash": "sha256:9a4f…",
        "approver": { "name": "Hugo Correia", "email": "hugo@example.com" },
        "signed_at": "2026-05-13T10:14:22Z",
        "message": "ready to promote",
        "signature": { "algorithm": "git_identity", "value": "…" }
      },
      "artifact_path": "./.rocky/approvals/fix-price/01HXZ…json"
    }

signature.algorithm reflects whichever signing identity Rocky could resolve at sign time. message is omitted when --message is not passed.
rocky branch promote
Note: as of engine v1.33, the canonical form is rocky plan promote <name> followed by rocky apply <plan-id> (or rocky branch promote <name> --plan <plan-id>). The bare rocky branch promote <name> form continues to work and is now an alias; it emits a one-line [deprecated] notice to stderr that can be silenced with ROCKY_SUPPRESS_DEPRECATION=1. The JSON output shape below is identical across all three forms.
Promotes a branch’s tables to their production targets. Emits the approval and semantic-breaking-change gate decisions in audit, the per-target SQL outcomes in targets, and a top-level success flag.

    {
      "version": "1.31.0",
      "command": "branch promote",
      "branch": "fix-price",
      "branch_state_hash": "sha256:9a4f…",
      "approvals_used": [ /* ApprovalArtifact, see `branch approve` above */ ],
      "approvals_rejected": [],
      "breaking_changes": [
        {
          "change": {
            "kind": "column_dropped",
            "model": "analytics.marts.fct_orders",
            "column": "legacy_status",
            "data_type": "STRING"
          },
          "severity": "breaking"
        }
      ],
      "targets": [
        {
          "target": "analytics.marts.fct_orders",
          "source": "analytics.branch__fix-price.fct_orders",
          "statement": "CREATE OR REPLACE TABLE analytics.marts.fct_orders AS SELECT * FROM analytics.branch__fix-price.fct_orders",
          "succeeded": true
        }
      ],
      "audit": [
        {
          "kind": "promote_started",
          "at": "2026-05-13T10:20:00Z",
          "actor": { "name": "Hugo Correia", "email": "hugo@example.com" },
          "branch": "fix-price",
          "branch_state_hash": "sha256:9a4f…"
        },
        {
          "kind": "breaking_changes_allowed",
          "at": "2026-05-13T10:20:00Z",
          "actor": { "name": "Hugo Correia", "email": "hugo@example.com" },
          "branch": "fix-price",
          "branch_state_hash": "sha256:9a4f…",
          "breaking_changes": [ /* same shape as top-level `breaking_changes` */ ]
        },
        {
          "kind": "promote_completed",
          "at": "2026-05-13T10:20:02Z",
          "actor": { "name": "Hugo Correia", "email": "hugo@example.com" },
          "branch": "fix-price",
          "branch_state_hash": "sha256:9a4f…"
        }
      ],
      "success": true
    }

Top-level fields:
| Field | Type | Description |
|---|---|---|
branch | string | Branch name being promoted. |
branch_state_hash | string | Content-addressed hash of the branch’s current state — the same value approval artifacts are bound to. |
approvals_used | array | Approval artifacts that satisfied the [branch.approval] gate at promote time. Empty when the gate was disabled or skipped. |
approvals_rejected | array | Approval artifacts loaded from disk that failed verification, with the reason for rejection. Surfaced even on success so operators can clean up stale artifacts. |
breaking_changes | array or absent | Semantic findings produced by the pre-promote gate. Absent when the gate was skipped (compile failure on either side). Empty array when the gate ran and found no breaking changes. When present and non-empty, either the promote was blocked or --allow-breaking was set; check audit to see which. |
targets | array | One entry per managed target the promote attempted, in dispatch order. Each carries target, source, statement, succeeded, and an optional error. |
audit | array | Audit-trail events emitted during this invocation, in order. |
success | boolean | True when every target’s SQL succeeded. |
AuditEvent.kind values:
| Kind | Meaning |
|---|---|
promote_started | Promote began. |
promote_completed | Promote finished successfully (every target’s SQL succeeded). |
promote_failed | A target’s SQL failed; subsequent targets did not run. |
approval_skipped | The approval gate was bypassed via --skip-approval or the ROCKY_BRANCH_APPROVAL_SKIP env-var override. The reason field records which. |
breaking_changes_blocked | The semantic breaking-change gate blocked the promote. breaking_changes carries the findings that triggered the block. Exit code is nonzero. |
breaking_changes_allowed | The gate detected one or more breaking-severity findings but the operator overrode via --allow-breaking. breaking_changes carries the findings. |
breaking_changes_gate_skipped | The gate could not run (e.g. base ref failed to compile under the current Rocky version). Fail-open: the gate is skipped and the promote proceeds, but reason records why so the bypass is auditable. |
breaking_changes is attached to breaking_changes_blocked and breaking_changes_allowed events; it is absent on every other kind. The top-level breaking_changes and the per-event breaking_changes carry the same BreakingFinding shape.
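An orchestrator usually wants a single outcome per promote rather than a list of events. The following is a minimal Python sketch, assuming the payload has already been parsed from the JSON output. The function name `promote_outcome` and the outcome labels it returns are hypothetical; only the `kind` strings come from the table above.

```python
# Kinds that mean a gate was bypassed or overridden (from the AuditEvent.kind table).
OVERRIDE_KINDS = {
    "approval_skipped",
    "breaking_changes_allowed",
    "breaking_changes_gate_skipped",
}


def promote_outcome(payload: dict) -> str:
    """Collapse the audit trail into one illustrative outcome label."""
    kinds = [event["kind"] for event in payload.get("audit", [])]
    if "breaking_changes_blocked" in kinds:
        return "blocked"
    if "promote_failed" in kinds:
        return "failed"
    if "promote_completed" in kinds and payload.get("success"):
        # Flag runs that succeeded only because a gate was skipped or overridden.
        if OVERRIDE_KINDS & set(kinds):
            return "completed_with_override"
        return "completed"
    return "unknown"


# Trimmed version of the example above: only the fields the function reads.
example = {
    "success": True,
    "audit": [
        {"kind": "promote_started"},
        {"kind": "breaking_changes_allowed"},
        {"kind": "promote_completed"},
    ],
}
print(promote_outcome(example))  # completed_with_override
```

Checking `breaking_changes_blocked` first reflects the table's note that a blocked promote exits nonzero; the ordering of the remaining checks is a design choice of this sketch, not documented behaviour.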
BreakingFinding shape (also emitted by rocky ci-diff --semantic):
{ "change": { "kind": "column_type_changed", "model": "analytics.marts.fct_orders", "column": "amount_cents", "old_type": "BIGINT", "new_type": "INT", "narrowing": true }, "severity": "breaking"}
change.kind is a tagged discriminator. Common variants: model_removed, model_added, column_dropped, column_added, column_type_changed, column_nullability_changed, column_reordered, materialization_strategy_changed, materialization_key_changed, replication_columns_changed, partition_by_changed, target_renamed, source_changed, column_mask_changed, lakehouse_format_changed, sql_body_changed. Each variant carries the minimum identifying context (model + before/after values) needed to render a useful message without re-loading the IR. severity is one of breaking, warning, info.
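Because `breaking_changes` can be absent (gate skipped), empty (gate ran, nothing found), or populated, consumers need to keep the three cases apart. Here is a small Python sketch of one way to do that. The helper names `breaking_findings` and `describe` are hypothetical, and the second sample finding is illustrative: the exact fields of the `column_added` variant are an assumption.

```python
from typing import Optional


def breaking_findings(payload: dict) -> Optional[list]:
    """Return None when the gate was skipped, else the breaking-severity findings."""
    findings = payload.get("breaking_changes")
    if findings is None:
        return None  # field absent: the gate did not run
    return [f for f in findings if f.get("severity") == "breaking"]


def describe(finding: dict) -> str:
    """Render a one-line message from the fields every example variant carries."""
    change = finding["change"]
    subject = change.get("model", "?")
    if "column" in change:
        subject += "." + change["column"]
    return f"[{finding['severity']}] {change['kind']}: {subject}"


sample = {
    "breaking_changes": [
        {
            "change": {
                "kind": "column_type_changed",
                "model": "analytics.marts.fct_orders",
                "column": "amount_cents",
                "old_type": "BIGINT",
                "new_type": "INT",
                "narrowing": True,
            },
            "severity": "breaking",
        },
        # Illustrative only: field names for this variant are assumed.
        {
            "change": {"kind": "column_added", "model": "analytics.marts.fct_orders", "column": "currency"},
            "severity": "info",
        },
    ]
}

for finding in breaking_findings(sample):
    print(describe(finding))
# [breaking] column_type_changed: analytics.marts.fct_orders.amount_cents
```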
rocky metrics <model>
Quality metrics for a model, including snapshots, alerts, and column trends.
{ "version": "1.6.0", "command": "metrics", "model": "fct_revenue", "snapshots": [ { "run_id": "run_20260401_143022", "timestamp": "2026-04-01T14:30:22Z", "row_count": 148203, "freshness_lag_seconds": 300, "null_rates": { "customer_id": 0.0, "net_revenue": 0.0 } } ], "count": 1, "alerts": [ { "type": "anomaly", "severity": "warning", "message": "null rate increased from 0.0% to 2.3% in last run", "run_id": "run_20260401_143022", "column": "discount_pct" } ], "column": "net_revenue", "column_trend": [ { "run_id": "run_20260401_143022", "timestamp": "2026-04-01T14:30:22Z", "null_rate": 0.0, "row_count": 148203 } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
model | string | Model name. |
snapshots[].run_id | string | Run that produced this snapshot. |
snapshots[].timestamp | string | ISO 8601 timestamp. |
snapshots[].row_count | integer | Row count at snapshot time. |
snapshots[].freshness_lag_seconds | integer or absent | Freshness lag in seconds. |
snapshots[].null_rates | object | Map of column name to null rate (0.0 to 1.0). |
count | integer | Number of snapshots returned. |
alerts | array | Quality alerts (present when --alerts is used). |
alerts[].type | string | Alert type (e.g., "anomaly"). |
alerts[].severity | string | "warning" or "critical". |
alerts[].message | string | Human-readable alert description. |
alerts[].run_id | string | Run that triggered the alert. |
alerts[].column | string or absent | Column involved, if applicable. |
column | string or absent | Column filter, when --column is used. |
column_trend | array | Column-level trend data (present when --column is used). |
message | string or absent | Explanatory message when no data is available. |
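As an illustration of consuming the snapshot fields, the Python sketch below picks the most recent snapshot and reports columns whose null rate exceeds a caller-supplied threshold. The function name, the threshold, and the sample values are all made up for the example; the only assumption about the payload is that `timestamp` is an ISO 8601 string as documented above.

```python
from datetime import datetime


def latest_null_rate_breaches(payload: dict, threshold: float) -> dict:
    """Return {column: null_rate} for the newest snapshot where rate > threshold."""
    snapshots = payload.get("snapshots", [])
    if not snapshots:
        return {}  # e.g. the `message`-only response when no data is available

    def parsed(snapshot: dict) -> datetime:
        # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalize it.
        return datetime.fromisoformat(snapshot["timestamp"].replace("Z", "+00:00"))

    latest = max(snapshots, key=parsed)
    return {col: rate for col, rate in latest.get("null_rates", {}).items() if rate > threshold}


# Made-up sample values in the documented shape.
metrics = {
    "snapshots": [
        {"run_id": "run_a", "timestamp": "2026-03-31T14:30:22Z",
         "null_rates": {"customer_id": 0.0, "discount_pct": 0.0}},
        {"run_id": "run_b", "timestamp": "2026-04-01T14:30:22Z",
         "null_rates": {"customer_id": 0.0, "discount_pct": 0.023}},
    ]
}
print(latest_null_rate_breaches(metrics, threshold=0.01))  # {'discount_pct': 0.023}
```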
rocky optimize
Materialization strategy recommendations based on execution history and cost model.
{ "version": "1.6.0", "command": "optimize", "recommendations": [ { "model_name": "dim_customers", "current_strategy": "table", "recommended_strategy": "incremental", "estimated_monthly_savings": 42.50, "reasoning": "Only 0.3% row change rate between runs. Switching to incremental saves ~17s per run." } ], "total_models_analyzed": 1}
Field reference:
| Field | Type | Description |
|---|---|---|
recommendations[].model_name | string | Model name. |
recommendations[].current_strategy | string | Current materialization strategy. |
recommendations[].recommended_strategy | string | Recommended materialization strategy. |
recommendations[].estimated_monthly_savings | number | Estimated monthly cost savings in dollars. |
recommendations[].reasoning | string | Human-readable explanation. |
total_models_analyzed | integer | Number of models analyzed. |
message | string or absent | Explanation when no recommendations are available (e.g., insufficient history). |
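A consumer might rank the recommendations and total the estimated savings. The Python sketch below shows one way, under the assumption that `recommendations` may be empty or missing when only `message` is returned. The function name and the second sample recommendation are hypothetical.

```python
def summarize_recommendations(payload: dict) -> tuple:
    """Return (recommendations sorted by savings, total savings, optional message)."""
    recs = sorted(
        payload.get("recommendations", []),
        key=lambda r: r["estimated_monthly_savings"],
        reverse=True,
    )
    total = sum(r["estimated_monthly_savings"] for r in recs)
    return recs, total, payload.get("message")


optimize = {
    "recommendations": [
        # Hypothetical second entry, added so the sort has something to do.
        {"model_name": "stg_events", "current_strategy": "table",
         "recommended_strategy": "incremental", "estimated_monthly_savings": 7.25,
         "reasoning": "illustrative"},
        {"model_name": "dim_customers", "current_strategy": "table",
         "recommended_strategy": "incremental", "estimated_monthly_savings": 42.50,
         "reasoning": "Only 0.3% row change rate between runs."},
    ],
    "total_models_analyzed": 2,
}

ranked, total, message = summarize_recommendations(optimize)
print(ranked[0]["model_name"], total)  # dim_customers 49.75
```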
rocky compact
Generate OPTIMIZE and VACUUM SQL for Delta table compaction.
{ "version": "1.6.0", "command": "compact", "model": "acme_warehouse.staging__us_west__shopify.orders", "dry_run": true, "target_size_mb": 256, "statements": [ { "purpose": "optimize", "sql": "OPTIMIZE acme_warehouse.staging__us_west__shopify.orders" }, { "purpose": "vacuum", "sql": "VACUUM acme_warehouse.staging__us_west__shopify.orders" } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
model | string or absent | Target table in catalog.schema.table format. Set when invoked as rocky compact <fqn>; absent under --catalog and --measure-dedup. |
catalog | string or absent | Catalog identifier (lowercased to match the managed-table resolver). Set only on rocky compact --catalog <name>. |
scope | string or absent | "catalog" for the catalog-scoped path; absent for single-model invocations to keep their envelope byte-stable. |
dry_run | boolean | Whether this was a dry run. |
target_size_mb | integer | Target file size in megabytes. |
statements[].purpose | string | Statement purpose ("optimize", "vacuum"). |
statements[].sql | string | The SQL statement. Catalog-scoped invocations carry the concatenation of every per-table bundle here. |
tables | object or absent | Per-table breakdown keyed by fully-qualified table name. Each entry has the same statements shape as the flat list. Present only on --catalog invocations. |
totals | object or absent | Aggregate counts: table_count and statement_count. Present only on --catalog invocations. |
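Since the single-model and catalog-scoped envelopes differ, a consumer may want to normalize both into one per-table mapping. The Python sketch below is one possible approach. The function name is hypothetical, the catalog-scoped sample uses made-up table names, and the exact shape of each `tables` entry is an assumption: the code accepts either an object with a `statements` key or a bare list.

```python
def statements_by_table(payload: dict) -> dict:
    """Map fully-qualified table name -> list of {purpose, sql} statements."""
    if payload.get("scope") == "catalog":
        normalized = {}
        for fqn, entry in payload.get("tables", {}).items():
            # Assumption: an entry is {"statements": [...]} or the list itself.
            normalized[fqn] = entry["statements"] if isinstance(entry, dict) else entry
        return normalized
    # Single-model path: `model` is documented as absent under --measure-dedup.
    return {payload.get("model", "<unknown>"): payload.get("statements", [])}


single = {
    "model": "acme_warehouse.staging__us_west__shopify.orders",
    "statements": [
        {"purpose": "optimize", "sql": "OPTIMIZE acme_warehouse.staging__us_west__shopify.orders"},
    ],
}
# Hypothetical catalog-scoped payload.
catalog = {
    "catalog": "acme_warehouse",
    "scope": "catalog",
    "tables": {
        "acme_warehouse.s.t1": {"statements": [{"purpose": "optimize", "sql": "OPTIMIZE acme_warehouse.s.t1"}]},
        "acme_warehouse.s.t2": {"statements": [{"purpose": "vacuum", "sql": "VACUUM acme_warehouse.s.t2"}]},
    },
}

print(list(statements_by_table(single)))   # ['acme_warehouse.staging__us_west__shopify.orders']
print(list(statements_by_table(catalog)))  # ['acme_warehouse.s.t1', 'acme_warehouse.s.t2']
```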
rocky archive
Generate or execute DELETE and VACUUM SQL for archiving old data.
{ "version": "1.6.0", "command": "archive", "model": "acme_warehouse.staging__us_west__shopify.events", "older_than": "90d", "older_than_days": 90, "dry_run": true, "statements": [ { "purpose": "delete", "sql": "DELETE FROM acme_warehouse.staging__us_west__shopify.events WHERE _fivetran_synced < '2026-01-10'" }, { "purpose": "vacuum", "sql": "VACUUM acme_warehouse.staging__us_west__shopify.events" } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
model | string or absent | Target table, if filtered to a specific model. Absent under --catalog. |
catalog | string or absent | Catalog identifier. Set only on rocky archive --catalog <name>. |
scope | string or absent | "catalog" for the catalog-scoped path; absent for single-model invocations. |
older_than | string | Age threshold as specified (e.g., "90d", "6m", "1y"). |
older_than_days | integer | Age threshold normalized to days. |
dry_run | boolean | Whether this was a dry run. |
statements[].purpose | string | Statement purpose ("delete", "vacuum"). |
statements[].sql | string | The SQL statement. Catalog-scoped invocations carry every per-table statement here. |
tables | object or absent | Per-table breakdown keyed by fully-qualified table name. Present only on --catalog invocations. |
totals | object or absent | Aggregate counts: table_count and statement_count. Present only on --catalog invocations. |
rocky profile-storage
Profile storage layout and generate encoding recommendations.
{ "version": "1.6.0", "command": "profile-storage", "model": "acme_warehouse.staging__us_west__shopify.orders", "profile_sql": "SELECT column_name, data_type, ... FROM information_schema.columns WHERE ...", "recommendations": [ { "column": "status", "data_type": "STRING", "estimated_cardinality": "5", "recommended_encoding": "TINYINT", "reasoning": "Only 5 distinct values. Integer encoding reduces storage by ~60%." }, { "column": "customer_notes", "data_type": "STRING", "estimated_cardinality": "98200", "recommended_encoding": "ZSTD", "reasoning": "High cardinality text column. Compression reduces storage significantly." } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
model | string | Target table in catalog.schema.table format. |
profile_sql | string | The SQL used to profile the table. |
recommendations[].column | string | Column name. |
recommendations[].data_type | string | Current data type. |
recommendations[].estimated_cardinality | string | Estimated number of distinct values. |
recommendations[].recommended_encoding | string | Recommended encoding or type. |
recommendations[].reasoning | string | Human-readable explanation. |
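Note that `estimated_cardinality` is a string, so numeric comparisons need an explicit conversion. A short Python sketch follows, with a hypothetical function name and cutoff. It assumes the string is a plain integer when it is numeric at all, and skips values that do not parse.

```python
def low_cardinality_columns(payload: dict, max_distinct: int = 100) -> list:
    """Return (column, cardinality, recommended_encoding) for low-cardinality columns."""
    hits = []
    for rec in payload.get("recommendations", []):
        try:
            cardinality = int(rec["estimated_cardinality"])
        except ValueError:
            continue  # non-numeric estimate: skip rather than guess
        if cardinality <= max_distinct:
            hits.append((rec["column"], cardinality, rec["recommended_encoding"]))
    return hits


profile = {
    "recommendations": [
        {"column": "status", "data_type": "STRING",
         "estimated_cardinality": "5", "recommended_encoding": "TINYINT"},
        {"column": "customer_notes", "data_type": "STRING",
         "estimated_cardinality": "98200", "recommended_encoding": "ZSTD"},
    ]
}
print(low_cardinality_columns(profile))  # [('status', 5, 'TINYINT')]
```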
Development Commands
rocky import-dbt
Import a dbt project and convert it to Rocky models.
{ "version": "1.6.0", "command": "import-dbt", "import_method": "directory", "project_name": "acme_analytics", "dbt_version": "1.7.0", "imported": 24, "warnings": 2, "failed": 0, "sources_found": 6, "sources_mapped": 6, "tests_found": 18, "tests_converted": 15, "tests_converted_custom": 2, "tests_skipped": 1, "macros_detected": 3, "imported_models": ["stg_orders", "stg_customers", "fct_revenue"], "warning_details": [ { "model": "stg_payments", "category": "macro", "message": "custom Jinja macro 'cents_to_dollars' not translated", "suggestion": "Replace with a Rocky SQL expression" } ], "failed_details": [], "report": {}}
Field reference:
| Field | Type | Description |
|---|---|---|
import_method | string | Import method used ("directory", "manifest"). |
project_name | string or absent | dbt project name from dbt_project.yml. |
dbt_version | string or absent | dbt version detected. |
imported | integer | Number of models successfully imported. |
warnings | integer | Number of import warnings. |
failed | integer | Number of models that failed to import. |
sources_found | integer | Number of dbt sources found. |
sources_mapped | integer | Number of sources successfully mapped. |
tests_found | integer | Number of dbt tests found. |
tests_converted | integer | Number of tests converted to Rocky format. |
tests_converted_custom | integer | Number of custom tests converted. |
tests_skipped | integer | Number of tests skipped. |
macros_detected | integer | Number of custom macros detected. |
imported_models | array of strings | Names of successfully imported models. |
warning_details[].model | string | Model with the warning. |
warning_details[].category | string | Warning category. |
warning_details[].message | string | Warning message. |
warning_details[].suggestion | string or absent | Suggested fix. |
failed_details[].name | string | Name of the failed model. |
failed_details[].reason | string | Reason for the failure. |
report | object | Free-form per-model migration report. |
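In a CI step, the import summary can be reduced to a pass/fail flag plus a few details worth printing. The Python sketch below is one way to do that. The function name `import_gate` and the policy "fail when any model failed to import" are assumptions of the example, not Rocky behaviour.

```python
from collections import Counter


def import_gate(payload: dict) -> tuple:
    """Return (ok, warnings per category, formatted failure lines)."""
    ok = payload["failed"] == 0
    by_category = Counter(w["category"] for w in payload.get("warning_details", []))
    failures = [f"{d['name']}: {d['reason']}" for d in payload.get("failed_details", [])]
    return ok, dict(by_category), failures


# Trimmed version of the example above.
imported = {
    "imported": 24,
    "warnings": 2,
    "failed": 0,
    "warning_details": [
        {"model": "stg_payments", "category": "macro",
         "message": "custom Jinja macro 'cents_to_dollars' not translated",
         "suggestion": "Replace with a Rocky SQL expression"},
    ],
    "failed_details": [],
}
print(import_gate(imported))  # (True, {'macro': 1}, [])
```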
rocky validate-migration
Compare a dbt project against its Rocky import to verify correctness.
{ "version": "1.6.0", "command": "validate-migration", "project_name": "acme_analytics", "dbt_version": "1.7.0", "models_imported": 24, "models_failed": 0, "total_tests": 18, "total_contracts": 12, "total_warnings": 1, "validations": [ { "model": "stg_customers", "present_in_dbt": true, "present_in_rocky": true, "compile_ok": true, "dbt_description": "Customer staging model", "rocky_intent": "Customer dimension with contact details", "test_count": 3, "contracts_generated": 1, "warnings": [] }, { "model": "dim_products", "present_in_dbt": true, "present_in_rocky": true, "compile_ok": true, "test_count": 2, "contracts_generated": 1, "warnings": ["column order differs from dbt"] } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
project_name | string or absent | dbt project name. |
dbt_version | string or absent | dbt version detected. |
models_imported | integer | Total models compared. |
models_failed | integer | Models that failed validation. |
total_tests | integer | Total test assertions found. |
total_contracts | integer | Total data contracts generated. |
total_warnings | integer | Total warnings across all models. |
validations[].model | string | Model name. |
validations[].present_in_dbt | boolean | Whether the model exists in the dbt project. |
validations[].present_in_rocky | boolean | Whether the model exists in the Rocky project. |
validations[].compile_ok | boolean | Whether the Rocky model compiles successfully. |
validations[].dbt_description | string or absent | Description from dbt YAML config. |
validations[].rocky_intent | string or absent | Intent from Rocky TOML config. |
validations[].test_count | integer | Number of tests for this model. |
validations[].contracts_generated | integer | Number of contracts generated. |
validations[].warnings | array of strings | Validation warnings. |
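To turn the per-model validations into an actionable list, a consumer can collect every model that is missing on one side, fails to compile, or carries warnings. The Python sketch below does this. The function name and the problem labels are hypothetical; the fields read are the ones in the table above.

```python
def models_needing_attention(payload: dict) -> dict:
    """Map model name -> list of problems; clean models are omitted."""
    flagged = {}
    for v in payload.get("validations", []):
        problems = []
        if not v["present_in_dbt"]:
            problems.append("missing in dbt")
        if not v["present_in_rocky"]:
            problems.append("missing in Rocky")
        if not v["compile_ok"]:
            problems.append("does not compile")
        problems.extend(v.get("warnings", []))
        if problems:
            flagged[v["model"]] = problems
    return flagged


# Trimmed version of the example above.
validation = {
    "validations": [
        {"model": "stg_customers", "present_in_dbt": True, "present_in_rocky": True,
         "compile_ok": True, "warnings": []},
        {"model": "dim_products", "present_in_dbt": True, "present_in_rocky": True,
         "compile_ok": True, "warnings": ["column order differs from dbt"]},
    ]
}
print(models_needing_attention(validation))
# {'dim_products': ['column order differs from dbt']}
```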
rocky test-adapter
Run conformance tests against a warehouse adapter.
{ "adapter": "duckdb", "sdk_version": "0.1.0", "tests_run": 26, "tests_passed": 22, "tests_failed": 0, "tests_skipped": 4, "results": [ { "name": "create_table", "category": "ddl", "status": "passed", "duration_ms": 12 }, { "name": "workspace_bindings", "category": "governance", "status": "skipped", "message": "not supported by duckdb adapter", "duration_ms": 0 } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
adapter | string | Adapter name being tested. |
sdk_version | string | Adapter SDK version. |
tests_run | integer | Total tests executed. |
tests_passed | integer | Tests that passed. |
tests_failed | integer | Tests that failed. |
tests_skipped | integer | Tests skipped (unsupported features). |
results[].name | string | Test name. |
results[].category | string | Test category ("connection", "ddl", "dml", "query", "types", "dialect", "governance", "discovery", "batch_checks"). |
results[].status | string | "passed", "failed", or "skipped". |
results[].message | string or absent | Error or skip reason. |
results[].duration_ms | integer | Test duration in milliseconds. |
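For a conformance report, the interesting entries are usually the failures and the reasons behind skips. The Python sketch below extracts both from `results`. The function name is hypothetical, and the sample is the abbreviated example above, so its `results` list is shorter than `tests_run`.

```python
def conformance_summary(payload: dict) -> dict:
    """Collect failed test names and skip reasons, plus an overall pass flag."""
    results = payload.get("results", [])
    return {
        "ok": payload["tests_failed"] == 0,
        "failed": [r["name"] for r in results if r["status"] == "failed"],
        # `message` is optional, so fall back to an empty reason.
        "skipped": {r["name"]: r.get("message", "") for r in results if r["status"] == "skipped"},
    }


report = {
    "adapter": "duckdb",
    "tests_run": 26,
    "tests_passed": 22,
    "tests_failed": 0,
    "tests_skipped": 4,
    "results": [
        {"name": "create_table", "category": "ddl", "status": "passed", "duration_ms": 12},
        {"name": "workspace_bindings", "category": "governance", "status": "skipped",
         "message": "not supported by duckdb adapter", "duration_ms": 0},
    ],
}
print(conformance_summary(report))
# {'ok': True, 'failed': [], 'skipped': {'workspace_bindings': 'not supported by duckdb adapter'}}
```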
rocky hooks list
List all configured lifecycle hooks.
{ "hooks": [ { "event": "on_pipeline_start", "command": "scripts/notify.sh", "timeout_ms": 5000, "on_failure": "warn", "env_keys": ["SLACK_WEBHOOK_URL"] }, { "event": "on_materialize_error", "command": "scripts/alert.sh", "timeout_ms": 10000, "on_failure": "error", "env_keys": ["PAGERDUTY_API_KEY"] } ], "total": 2}
Field reference:
| Field | Type | Description |
|---|---|---|
hooks[].event | string | Hook event name (e.g., "on_pipeline_start", "on_materialize_error"). |
hooks[].command | string | Shell command to execute. |
hooks[].timeout_ms | integer | Timeout in milliseconds. |
hooks[].on_failure | string | Failure behavior: "warn" or "error". |
hooks[].env_keys | array of strings | Environment variable keys passed to the hook. |
total | integer | Total number of configured hooks. |
rocky hooks test <event>
Fire a synthetic test event to validate hook scripts.
{ "event": "pipeline_start", "status": "continue", "result": "HookResult::Continue"}
Field reference:
| Field | Type | Description |
|---|---|---|
event | string | The event that was tested. |
status | string | Result: "continue", "abort", or "no_hooks". |
message | string or absent | Explanation when no hooks are configured. |
result | string or absent | Debug rendering of the hook result. |
AI Commands
rocky ai "<intent>"
Generate a model from a natural language description.
{ "version": "1.6.0", "command": "ai", "intent": "monthly revenue by customer, joining orders and refunds", "format": "rocky", "name": "fct_monthly_revenue_by_customer", "source": "SELECT\n o.customer_id,\n DATE_TRUNC('month', o.order_date) AS revenue_month,\n SUM(o.total_amount) - COALESCE(SUM(r.refund_amount), 0) AS net_revenue\nFROM ref('stg_orders') o\nLEFT JOIN ref('stg_refunds') r ON o.order_id = r.order_id\nGROUP BY 1, 2", "attempts": 1}
Field reference:
| Field | Type | Description |
|---|---|---|
intent | string | The natural language intent provided. |
format | string | Output format ("rocky" or "sql"). |
name | string | Generated model name. |
source | string | Generated SQL source code. |
attempts | integer | Number of generation attempts (retries on validation failure). |
rocky ai-sync
Detect schema changes and propose intent-guided model updates.
{ "version": "1.6.0", "command": "ai-sync", "proposals": [ { "model": "fct_revenue", "intent": "Add discount_pct to revenue calculation", "diff": "- SUM(o.total_amount) AS net_revenue\n+ SUM(o.total_amount * (1 - o.discount_pct)) AS net_revenue", "proposed_source": "SELECT ..." } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
proposals[].model | string | Model name. |
proposals[].intent | string | Description of the proposed change. |
proposals[].diff | string | Unified diff of the proposed SQL change. |
proposals[].proposed_source | string | Full proposed SQL source. |
rocky ai-explain
Generate natural language intent descriptions from existing model SQL.
{ "version": "1.6.0", "command": "ai-explain", "explanations": [ { "model": "fct_revenue", "intent": "Calculates monthly net revenue per customer by joining orders with refunds.", "saved": true }, { "model": "dim_customers", "intent": "Customer dimension combining profile data with computed lifetime metrics.", "saved": false } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
explanations[].model | string | Model name. |
explanations[].intent | string | Generated intent description. |
explanations[].saved | boolean | Whether the intent was saved to the model’s TOML config. |
rocky ai-test
Generate test assertions from model intent and SQL logic.
{ "version": "1.6.0", "command": "ai-test", "results": [ { "model": "fct_revenue", "tests": [ { "name": "net_revenue_is_not_negative", "description": "Net revenue should never be negative after refunds", "sql": "SELECT COUNT(*) FROM ref('fct_revenue') WHERE net_revenue < 0" }, { "name": "customer_id_not_null", "description": "Every revenue row must have a customer" } ], "saved": false } ]}
Field reference:
| Field | Type | Description |
|---|---|---|
results[].model | string | Model name. |
results[].tests[].name | string | Test assertion name. |
results[].tests[].description | string | Human-readable description. |
results[].tests[].sql | string or absent | SQL assertion query, when applicable. |
results[].saved | boolean | Whether the tests were saved to disk. |
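Because `sql` is optional on each generated test, a consumer that wants to execute assertions has to filter for it first. The Python sketch below separates the tests that carry SQL from the description-only ones. The function name is hypothetical and the sample is the example above.

```python
def split_generated_tests(payload: dict) -> tuple:
    """Return (tests with SQL, description-only tests) as (model, name) pairs."""
    with_sql, without_sql = [], []
    for result in payload.get("results", []):
        for test in result.get("tests", []):
            bucket = with_sql if test.get("sql") else without_sql
            bucket.append((result["model"], test["name"]))
    return with_sql, without_sql


ai_test = {
    "results": [
        {
            "model": "fct_revenue",
            "tests": [
                {"name": "net_revenue_is_not_negative",
                 "description": "Net revenue should never be negative after refunds",
                 "sql": "SELECT COUNT(*) FROM ref('fct_revenue') WHERE net_revenue < 0"},
                {"name": "customer_id_not_null",
                 "description": "Every revenue row must have a customer"},
            ],
            "saved": False,
        }
    ]
}
runnable, descriptive = split_generated_tests(ai_test)
print(runnable)     # [('fct_revenue', 'net_revenue_is_not_negative')]
print(descriptive)  # [('fct_revenue', 'customer_id_not_null')]
```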