Introduction
dagster-rocky is a Python package that bridges Rocky’s Rust binary with Dagster orchestration. It lets you manage SQL transformations with Rocky while leveraging Dagster for scheduling, retries, alerting, and its asset-centric UI.
What it provides
- RockyResource: A Dagster ConfigurableResource that wraps the Rocky CLI with 25+ methods covering discovery, execution, compilation, lineage, testing, AI-powered model generation, observability, and diagnostics. Three execution modes for rocky run: buffered (run), live-streaming (run_streaming), and full Dagster Pipes (run_pipes).
- RockyDagsterTranslator: Controls how Rocky sources and tables map to Dagster asset keys, groups, tags, and metadata.
- load_rocky_assets(): Calls rocky discover and returns a list of Dagster AssetSpec objects, one per enabled table.
- emit_check_results() / emit_materializations(): Convert Rocky’s check and materialization results into Dagster events that appear in the UI.
- RockyComponent: A state-backed Dagster component that caches discovery output, avoiding API calls on every code location reload.
Architecture
The integration follows a simple pattern:
- Dagster calls the rocky binary via subprocess (e.g., rocky discover --output json).
- Rocky executes against your warehouse and sources, returning structured JSON.
- dagster-rocky parses that JSON into Pydantic models.
- The models are translated into Dagster events (asset materializations, check results, etc.).
Rocky handles the SQL transformation layer: DAG resolution, incremental logic, SQL generation, schema drift detection, and permission reconciliation. Dagster handles everything around it: scheduling, retries, alerting, lineage visualization, and operational monitoring.
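The subprocess-to-JSON-to-models flow described in this section can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: it uses standard-library dataclasses in place of Pydantic models so that it runs without extra dependencies, the payload shape (a tables list with source, name, and enabled fields) is invented for the example, and a small Python one-liner stands in for rocky discover --output json. The names DiscoveredTable, run_json_command, parse_tables, and to_asset_keys are all hypothetical.

```python
import json
import subprocess
import sys
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DiscoveredTable:
    # Hypothetical fields; the real discover output schema may differ.
    source: str
    name: str
    enabled: bool = True


def run_json_command(argv: List[str]) -> Dict:
    """Step 1-2: run a CLI command and parse its stdout as JSON."""
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


def parse_tables(payload: Dict) -> List[DiscoveredTable]:
    """Step 3: turn the raw JSON payload into typed records."""
    return [DiscoveredTable(**row) for row in payload.get("tables", [])]


def to_asset_keys(tables: List[DiscoveredTable]) -> List[List[str]]:
    """Step 4 (simplified): map enabled tables to asset-key-like paths."""
    return [[t.source, t.name] for t in tables if t.enabled]


# Stand-in for the rocky binary: a Python child process that echoes a
# made-up payload, so the sketch runs without rocky installed.
FAKE_PAYLOAD = {
    "tables": [
        {"source": "shop", "name": "orders", "enabled": True},
        {"source": "shop", "name": "legacy_orders", "enabled": False},
    ]
}
FAKE_DISCOVER = [
    sys.executable,
    "-c",
    "import sys; print(sys.argv[1])",
    json.dumps(FAKE_PAYLOAD),
]

payload = run_json_command(FAKE_DISCOVER)
tables = parse_tables(payload)
asset_keys = to_asset_keys(tables)
```

In a real deployment the argv would point at the rocky binary instead of the stand-in, and the final step would produce Dagster events rather than plain lists. Passing check=True means a non-zero exit from the child process raises an exception instead of yielding a partial payload.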
Requirements
- dagster >= 1.13.0
- pydantic >= 2.0
- pygments >= 2.20.0
- The rocky binary must be available on PATH (or configured via binary_path). For deployment, you can vendor the binary under a vendor/ directory and point binary_path to it.
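Before wiring the resource into a deployment, it can help to confirm that the binary is actually reachable. The helper below is a hypothetical pre-flight check, not part of dagster-rocky: it assumes only that an explicit binary_path should take precedence over a PATH lookup, which mirrors the requirement above but may not match the package's exact resolution logic.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_binary(binary_path: Optional[str] = None, name: str = "rocky") -> str:
    """Return a path to the binary, preferring an explicit binary_path.

    Only existence is checked here; whether the file is executable is
    left to the eventual subprocess call.
    """
    if binary_path is not None:
        candidate = Path(binary_path).expanduser()
        if candidate.is_file():
            return str(candidate.resolve())
        raise FileNotFoundError(
            f"binary_path does not point to a file: {binary_path}"
        )

    # No explicit path given: fall back to searching PATH.
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(
            f"{name!r} not found on PATH; set binary_path instead"
        )
    return found
```

With a vendored binary, a call such as resolve_binary("vendor/rocky") would return its resolved path or fail early with a clear error, rather than failing later inside a Dagster run.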
CLI methods on the resource
RockyResource exposes one Python method per Rocky CLI command. The full set includes:
- Core Pipeline: discover, plan, run, run_streaming, run_pipes, state, resume_run
- Modeling: compile, lineage, test, ci
- AI: ai, ai_sync, ai_explain, ai_test
- Observability: history, metrics, optimize
- Diagnostics: doctor, validate_migration, test_adapter
- Hooks: hooks_list, hooks_test
See the RockyResource page for full method signatures and details.