Building a Custom Adapter
Rocky talks to warehouses through a small set of traits in the rocky-adapter-sdk crate. Implementing those traits gives you a working warehouse adapter, the same way rocky-databricks, rocky-snowflake, rocky-bigquery, rocky-trino, and rocky-duckdb are wired today.
This guide walks a Rust developer from “I want a ClickHouse adapter” to a compiling skeleton with passing tests in roughly fifteen minutes. The runnable skeleton lives at examples/playground/pocs/07-adapters/06-rust-native-adapter-skeleton/ and is shaped after ClickHouse, but the same shape works for Redshift, StarRocks, MotherDuck, or any SQL warehouse Rocky doesn’t ship in-tree. (Trino is in-tree as of engine v1.28.0; see the rocky-trino crate.)
When to reach for the SDK
The adapter SDK is the right tool when:
- The warehouse you need is not in the in-tree adapter list (Databricks, Snowflake, BigQuery, Trino, DuckDB).
- You need a forked variant of an existing adapter (e.g. Databricks Serverless on top of rocky-databricks).
- You want to embed Rocky in a tool that owns its own warehouse client and would rather wrap it than spawn rocky as a subprocess.
If your warehouse already ships in-tree, use it directly via [adapter] in rocky.toml. If you want adapters in a non-Rust language (Python, Go, Node), see the process adapter protocol: JSON-RPC over stdio. The POC at pocs/07-adapters/04-custom-process-adapter/ walks that pattern.
The trait surface
Public traits live in engine/crates/rocky-adapter-sdk/src/traits.rs. The two you must implement for any warehouse adapter are WarehouseAdapter and SqlDialect. The rest are opt-in by capability.
| Trait | Required? | What it does | Key methods (full surface in rocky-adapter-sdk/src/traits.rs) |
|---|---|---|---|
| WarehouseAdapter | yes | Execute SQL against the warehouse | dialect, execute_statement, execute_query, describe_table, table_exists, close |
| SqlDialect | yes | Generate warehouse-specific SQL | name, format_table_ref, create_table_as, insert_into, merge_into, describe_table_sql, drop_table_sql, create_catalog_sql, create_schema_sql, row_hash_expr, tablesample_clause, select_clause, watermark_where, insert_overwrite_partition |
| DiscoveryAdapter | no | Enumerate connectors / tables in a source system | discover |
| GovernanceAdapter | no | Tags, grants, catalog/schema lifecycle | set_tags, get_grants, apply_grants, revoke_grants |
| BatchCheckAdapter | no | Batched data-quality queries | batch_row_counts, batch_freshness |
| LoaderAdapter | no | File ingestion (CSV, Parquet, JSONL) | load, supported_formats |
| TypeMapper | no | Cross-warehouse type normalization | normalize_type, types_compatible |
Each opt-in trait is gated by a flag in AdapterCapabilities. Set the flag, implement the trait, and Rocky’s planner picks up the new behavior automatically.
When each method is called
- execute_statement: every DDL / DML statement Rocky generates: CREATE TABLE, INSERT INTO, MERGE INTO, ALTER TABLE, DROP TABLE, partition replace.
- execute_query: EXPLAIN, DESCRIBE, row-count assertions, the rocky compile-time SELECT 1 connectivity check.
- describe_table: drift detection (rocky drift), contract validation, the column-list step before generating an incremental insert.
- table_exists: full-refresh-vs-create branching at the start of a materialization.
- dialect() methods: every SQL string Rocky emits is composed by a dialect call. Identifier validation lives here.
Worked example: a ClickHouse-shaped skeleton
The POC at examples/playground/pocs/07-adapters/06-rust-native-adapter-skeleton/ is a compiling, tested starter. To run it:
    git clone https://github.com/rocky-data/rocky.git
    cd rocky/examples/playground/pocs/07-adapters/06-rust-native-adapter-skeleton
    ./run.sh

This runs cargo check, the unit tests, and a demo binary that prints the SQL the adapter would have sent to a real warehouse.
Crate layout
    adapter/
    ├── Cargo.toml        # Path-dep on rocky-adapter-sdk; standalone (not in workspace)
    ├── src/lib.rs        # SkeletonAdapter, SkeletonDialect, MockBackend, tests
    └── examples/demo.rs  # End-to-end driver

Adapter struct + manifest
    pub struct SkeletonAdapter {
        backend: Arc<dyn Backend>,
        dialect: SkeletonDialect,
    }

    impl SkeletonAdapter {
        pub fn manifest() -> AdapterManifest {
            AdapterManifest {
                name: "skeleton".into(),
                version: env!("CARGO_PKG_VERSION").into(),
                sdk_version: SDK_VERSION.into(),
                dialect: "skeleton".into(),
                capabilities: AdapterCapabilities {
                    warehouse: true,
                    discovery: false,
                    governance: false,
                    batch_checks: false,
                    create_catalog: false, // ClickHouse has no catalogs
                    create_schema: true,   // ClickHouse calls these "databases"
                    merge: false,          // No MERGE; use incremental instead
                    tablesample: true,
                    file_load: false,
                },
                auth_methods: vec!["basic".into(), "token".into()],
                config_schema: serde_json::json!({ /* ... */ }),
            }
        }
    }

Capability flags are not cosmetic; they gate behavior. merge: false makes Rocky’s planner refuse strategy = "merge" configs against this adapter at validate time rather than failing mid-run. create_catalog: false makes auto_create_catalogs = true surface a clear “warehouse doesn’t support catalogs” error instead of emitting broken SQL.
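To make the gating concrete, here is a minimal std-only sketch of a validate-time check. It is not Rocky's planner code: `Capabilities`, `PipelineConfig`, and `validate_against` are hypothetical names invented for illustration, and the error strings are assumptions.

```rust
/// Hypothetical, trimmed-down stand-in for the SDK's `AdapterCapabilities`.
#[derive(Debug, Clone, Copy)]
pub struct Capabilities {
    pub merge: bool,
    pub create_catalog: bool,
}

/// Hypothetical slice of a pipeline config whose options depend on capabilities.
#[derive(Debug, Clone)]
pub struct PipelineConfig {
    pub strategy: String,
    pub auto_create_catalogs: bool,
}

/// Collect every capability mismatch up front, so the user sees all of them
/// at validate time and no SQL is sent to the warehouse.
pub fn validate_against(caps: Capabilities, cfg: &PipelineConfig) -> Vec<String> {
    let mut errors = Vec::new();
    if cfg.strategy == "merge" && !caps.merge {
        errors.push("strategy = \"merge\" requires the merge capability".to_string());
    }
    if cfg.auto_create_catalogs && !caps.create_catalog {
        errors.push("warehouse doesn't support catalogs".to_string());
    }
    errors
}
```

With both flags off, a config that asks for `strategy = "merge"` and `auto_create_catalogs = true` yields two errors, while an incremental config with catalogs disabled yields none.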
Backend abstraction
The skeleton hides the actual warehouse client behind a small Backend trait so tests can substitute an in-memory mock:
    #[async_trait]
    pub trait Backend: Send + Sync {
        async fn execute(&self, sql: &str) -> AdapterResult<()>;
        async fn query(&self, sql: &str) -> AdapterResult<QueryResult>;
        async fn describe(&self, table: &TableRef) -> AdapterResult<Vec<ColumnInfo>>;
        async fn exists(&self, table: &TableRef) -> AdapterResult<bool>;
    }

The production impl wraps clickhouse::Client (or reqwest::Client for warehouses without a typed driver). The test impl is MockBackend: a HashMap plus a statement log so tests can assert on the SQL the dialect produced.
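The "HashMap plus a statement log" idea can be sketched without any async runtime. The version below is a simplified, synchronous stand-in and not the skeleton's actual MockBackend: the two-method `Backend` trait, the `String` error type, and the `with_table` helper are assumptions made to keep the example std-only.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Synchronous, simplified stand-in for the skeleton's async `Backend` trait.
pub trait Backend: Send + Sync {
    fn execute(&self, sql: &str) -> Result<(), String>;
    fn exists(&self, table: &str) -> Result<bool, String>;
}

/// In-memory mock: a table map plus a log of every statement received.
#[derive(Default)]
pub struct MockBackend {
    tables: Mutex<HashMap<String, Vec<String>>>, // table name -> column names
    log: Mutex<Vec<String>>,
}

impl MockBackend {
    pub fn new() -> Self {
        Self::default()
    }

    /// Pre-register a table so `exists` reports it (hypothetical helper).
    pub fn with_table(self, name: &str, columns: &[&str]) -> Self {
        self.tables.lock().unwrap().insert(
            name.to_string(),
            columns.iter().map(|c| c.to_string()).collect(),
        );
        self
    }

    /// Snapshot of the statements seen so far, in arrival order.
    pub fn statement_log(&self) -> Vec<String> {
        self.log.lock().unwrap().clone()
    }
}

impl Backend for MockBackend {
    fn execute(&self, sql: &str) -> Result<(), String> {
        // Record the statement instead of sending it anywhere.
        self.log.lock().unwrap().push(sql.to_string());
        Ok(())
    }

    fn exists(&self, table: &str) -> Result<bool, String> {
        Ok(self.tables.lock().unwrap().contains_key(table))
    }
}
```

Interior mutability through `Mutex` lets the mock sit behind a shared `Arc` while still recording statements, which is what allows a test to keep one handle and give the adapter another.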
Dialect implementation
SqlDialect is where most adapter divergence lives. The skeleton’s format_table_ref shows the two patterns you almost always need: drop arguments your warehouse doesn’t have, and validate every identifier you splice into SQL.
    fn format_table_ref(
        &self,
        _catalog: &str, // ClickHouse has no catalogs; drop on the floor
        schema: &str,
        table: &str,
    ) -> AdapterResult<String> {
        validate_ident(schema)?;
        validate_ident(table)?;
        Ok(format!("`{schema}`.`{table}`"))
    }

Methods worth thinking carefully about:
- merge_into: return AdapterError::not_supported("merge_into") if your warehouse has no MERGE. Rocky’s planner sees the capability flag and won’t generate merge plans, but a defensive impl still helps if someone bypasses the planner.
- insert_overwrite_partition: returns Vec<String> because some warehouses need a multi-statement transaction (Snowflake’s BEGIN; DELETE; INSERT; COMMIT). The runtime executes them in order and rolls back on partial failure.
- row_hash_expr: used for change detection. ClickHouse uses sipHash128(tuple(...)); if you want cross-warehouse comparable hashes, look at how rocky-bigquery and rocky-snowflake agree on a stable encoding.
- watermark_where: the standard incremental filter (col > (SELECT max(col) FROM target)). Validate timestamp_col before splicing.
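Two of the methods above can be sketched as plain functions. This is an illustration under stated assumptions, not the SDK's trait signatures: the parameter lists, the `String` error type, and this `validate_ident` body are hypothetical, and `target` is assumed to be a table reference that format_table_ref has already validated and quoted.

```rust
/// Allow only ASCII letters, digits, and underscores (an assumed rule;
/// substitute your warehouse's identifier grammar).
fn validate_ident(ident: &str) -> Result<(), String> {
    let ok = !ident.is_empty()
        && ident.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if ok {
        Ok(())
    } else {
        Err(format!("invalid identifier: {ident:?}"))
    }
}

/// Incremental filter: keep rows newer than the target's high-water mark.
/// `target` must already be a validated, formatted table reference.
pub fn watermark_where(timestamp_col: &str, target: &str) -> Result<String, String> {
    validate_ident(timestamp_col)?;
    Ok(format!(
        "{timestamp_col} > (SELECT max({timestamp_col}) FROM {target})"
    ))
}

/// Partition replace as an ordered list of statements, for warehouses that
/// need an explicit transaction around DELETE + INSERT.
pub fn insert_overwrite_partition(
    target: &str,
    partition_filter: &str,
    select_sql: &str,
) -> Vec<String> {
    vec![
        "BEGIN".to_string(),
        format!("DELETE FROM {target} WHERE {partition_filter}"),
        format!("INSERT INTO {target} {select_sql}"),
        "COMMIT".to_string(),
    ]
}
```

Returning the statements as a `Vec<String>` keeps the dialect free of I/O: it only describes the work, and the caller decides how to execute it and what to do when one statement fails.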
Auth and connection management
The SDK ships an optional AuthProvider trait that composes the Authorization header with any extra mandatory headers your warehouse requires, such as a user-identity header alongside a bearer token. A StaticAuthProvider covers the fixed-credential case. Adapters with bespoke needs can still wire their own; the two patterns worth copying are in-tree:
- engine/crates/rocky-databricks/src/auth.rs: PAT-first, OAuth M2M fallback. Reads ${DATABRICKS_TOKEN}; if absent, falls through to the client_credentials flow with ${DATABRICKS_CLIENT_ID} / ${DATABRICKS_CLIENT_SECRET}. The auto-detection logic is roughly twenty lines.
- engine/crates/rocky-snowflake/src/auth.rs: multi-method priority: pre-supplied OAuth bearer wins, then RS256 key-pair JWT, then password. Each method reads from a distinct ${SNOWFLAKE_*} variable so config files never carry secrets.
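The PAT-first, OAuth-fallback priority can be sketched as a pure function over an already-resolved config. This is not the code in rocky-databricks: `AuthConfig`, `AuthMethod`, and `resolve_auth` are hypothetical names, and treating an empty string as absent is an assumption.

```rust
/// Hypothetical resolved config: `${VAR}` references were substituted when
/// the config file was parsed, so nothing here reads the environment.
#[derive(Debug, Default, Clone)]
pub struct AuthConfig {
    pub token: Option<String>,
    pub client_id: Option<String>,
    pub client_secret: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AuthMethod {
    /// Personal access token sent as a bearer credential.
    Pat { token: String },
    /// OAuth machine-to-machine `client_credentials` flow.
    OAuthM2m { client_id: String, client_secret: String },
}

/// Pick the first usable method in priority order: PAT, then OAuth M2M.
pub fn resolve_auth(cfg: &AuthConfig) -> Result<AuthMethod, String> {
    // Treat empty strings as absent (an assumption about unset variables).
    let non_empty = |v: &Option<String>| v.clone().filter(|s| !s.is_empty());

    if let Some(token) = non_empty(&cfg.token) {
        return Ok(AuthMethod::Pat { token });
    }
    match (non_empty(&cfg.client_id), non_empty(&cfg.client_secret)) {
        (Some(client_id), Some(client_secret)) => {
            Ok(AuthMethod::OAuthM2m { client_id, client_secret })
        }
        _ => Err("no usable credentials: set a token or a client id/secret pair".to_string()),
    }
}
```

Because the function takes the config by reference and never touches process state, tests can exercise every branch in parallel without colliding on environment variables.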
Two rules that apply to every adapter regardless of method:
- Read credentials at config-parse time, not at adapter-construct time. Rocky substitutes ${VAR} references when parsing rocky.toml. Pull the resolved string out of SkeletonConfig; do not re-read env vars from the adapter constructor or tests will collide on shared state.
- Pool HTTP clients in the adapter struct. reqwest::Client is internally Arc-counted and cheap to clone; construct it once in SkeletonAdapter::new and clone the handle into every request. Don’t construct a new client per call.
For retry and rate-limiting, look at rocky-adapter-sdk/src/throttle.rs (the AIMD adaptive concurrency helper) and rocky-databricks/src/connector.rs for an is_transient / is_rate_limit retry loop.
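For readers unfamiliar with AIMD (additive increase, multiplicative decrease), the core idea fits in a few lines. The sketch below is a generic illustration and not the contents of throttle.rs: `AimdLimit`, its method names, the +1 step, and the halving factor are all assumptions.

```rust
/// Additive-increase / multiplicative-decrease concurrency limit.
#[derive(Debug, Clone)]
pub struct AimdLimit {
    current: usize,
    min: usize,
    max: usize,
}

impl AimdLimit {
    /// Bounds are normalised so that 1 <= min <= max and current is in range.
    pub fn new(initial: usize, min: usize, max: usize) -> Self {
        let min = min.max(1);
        let max = max.max(min);
        Self { current: initial.clamp(min, max), min, max }
    }

    /// Number of requests allowed in flight right now.
    pub fn current(&self) -> usize {
        self.current
    }

    /// A request succeeded: grow by one slot, up to `max`.
    pub fn on_success(&mut self) {
        self.current = (self.current + 1).min(self.max);
    }

    /// The warehouse signalled rate limiting: halve, but never below `min`.
    pub fn on_rate_limit(&mut self) {
        self.current = (self.current / 2).max(self.min);
    }
}
```

Slow growth and fast back-off is the design choice: the limit creeps toward the warehouse's real capacity and retreats quickly when a rate-limit response shows it overshot.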
Testing your adapter
Two test layers, neither of which needs a live warehouse.
Unit tests with a mock backend
The skeleton’s tests assert on the SQL the dialect generated:
    #[tokio::test]
    async fn execute_statement_round_trips_to_backend() {
        let backend = Arc::new(MockBackend::new());
        let adapter = SkeletonAdapter::new(backend.clone());

        adapter
            .execute_statement("CREATE TABLE foo (id Int64) ENGINE=Memory")
            .await
            .unwrap();

        let log = backend.statement_log().await;
        assert!(log[0].contains("CREATE TABLE foo"));
    }

This style covers everything except real network behavior.
Wiremock for HTTP-backed adapters
For adapters that talk to a REST API, the in-tree pattern is wiremock-based; see how rocky-fivetran/src/client.rs is tested. You stand up a MockServer per test, register handlers using matchers such as path("/v1/connectors"), and assert your adapter sends the right verbs against the right paths. CI runs without a real Fivetran account.
The conformance harness
rocky_adapter_sdk::conformance::run_conformance(&manifest, Some(adapter.dialect())) returns a ConformanceResult describing which tests apply (based on declared capabilities) and which were skipped. When a live dialect is supplied, the harness exercises one real trait call, SqlDialect::format_table_ref, as the first incremental step toward live execution. Pass None when no live adapter is available (for example, rocky test-adapter --builtin <name>, which validates the test plan without a warehouse), and dialect-category checks are reported as skipped rather than run against a stub. The remaining checks are still plan entries rather than warehouse calls, so treat the result as a checklist of behaviors your unit tests should cover while broader trait execution lands in future SDK releases.
Distributing your adapter
The honest answer for now: fork and merge. The adapter registry is statically registered at compile time; there is no dynamic plugin system today. To ship a new adapter:
- Fork rocky-data/rocky.
- Drop your crate into engine/crates/rocky-<name>/.
- Add it to engine/Cargo.toml workspace members and the CLI’s adapter dispatch.
- Open a PR upstream. The SDK pins the trait shape so the diff stays small, usually a few hundred lines of crate plus one wiring line in the CLI.
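Static registration usually boils down to a match on the configured adapter type. The sketch below shows that general shape only; it is not the Rocky CLI's dispatch code, and the `Adapter` trait, the two unit structs, and `build_adapter` are hypothetical.

```rust
/// Hypothetical minimal adapter interface, for illustration only.
pub trait Adapter {
    fn name(&self) -> &'static str;
}

pub struct DuckDb;
pub struct ClickHouse;

impl Adapter for DuckDb {
    fn name(&self) -> &'static str {
        "duckdb"
    }
}

impl Adapter for ClickHouse {
    fn name(&self) -> &'static str {
        "clickhouse"
    }
}

/// Compile-time registry: every adapter is a match arm, so adding one means
/// adding a crate dependency plus a single arm here.
pub fn build_adapter(kind: &str) -> Result<Box<dyn Adapter>, String> {
    match kind {
        "duckdb" => Ok(Box::new(DuckDb)),
        "clickhouse" => Ok(Box::new(ClickHouse)), // the new wiring line
        other => Err(format!("unknown adapter type: {other}")),
    }
}
```

The trade-off is visible in the last arm: an adapter that was not compiled in cannot be selected at runtime, which is why shipping one requires a fork.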
Two looser paths if upstreaming isn’t an option yet:
- Vendor the crate. Keep your fork private, ship Rocky internally with your adapter linked in. The in-tree adapters use this same model; they’re just upstreamed.
- Process adapter (any language). If you want to escape Rust entirely, the JSON-RPC stdio protocol in rocky-adapter-sdk/src/process.rs works today; see pocs/07-adapters/04-custom-process-adapter/ for a working Python adapter against SQLite.
A dynamic registration path (declarative config + crates.io discovery) is on the roadmap but unscheduled. Until it lands, the SDK’s job is to keep the trait surface stable enough that your fork is forward-compatible.
Gotchas worth knowing about
These are real and surface during implementation. Flagging them up front so you don’t lose half a day debugging.
- The SDK trait surface now mirrors most of the in-tree one. rocky-adapter-sdk/src/traits.rs is the public, marketed contract; rocky-core/src/traits.rs is what the in-tree adapters use. The SDK gained default-impl methods for execute_statement_with_stats (and ExecutionStats), ping, explain (and ExplainResult), is_experimental, warehouse_name, and list_tables, so out-of-tree adapters get the same shape in-tree adapters use. A few methods still differ pending cross-crate type unification (fetch_arrow_batch, clone_table_for_branch, and the merge_into signature), plus a handful of duplicated types (TableRef, ColumnInfo, Grant, MetadataColumn). Target the SDK surface and treat those as not-yet-stable.
- Identifier validation is not optional. Anything you splice into SQL must pass [A-Za-z0-9_]+ (or your warehouse’s equivalent). The skeleton’s validate_ident shows the pattern. SQL-injection-bearing string literals were the subject of a real CVE-class fix; don’t reinvent that hole.
- The catalog field in TableRef is always present. Warehouses without catalogs (ClickHouse, Postgres, MySQL) get an empty string. Your dialect’s format_table_ref is responsible for dropping it.
- AdapterError is intentionally type-erased. Use AdapterError::msg(...) for ad-hoc errors, AdapterError::new(my_err) to wrap an std::error::Error, and AdapterError::not_supported("method_name") for capabilities your warehouse doesn’t have. Don’t reach for thiserror inside the trait impl; the SDK boxes everything.
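If "type-erased" is unfamiliar, the sketch below shows one way such an error can be built on a boxed trait object. It is a stand-in written for this guide, not the SDK's definition: the field layout, the trait bounds, and the "not supported: ..." message format are assumptions; only the three constructor names come from the text above.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for the SDK's `AdapterError`: one concrete type
/// that can carry any underlying error behind a box.
#[derive(Debug)]
pub struct AdapterError {
    inner: Box<dyn Error + Send + Sync + 'static>,
}

impl AdapterError {
    /// Ad-hoc error built from a message.
    pub fn msg(message: impl Into<String>) -> Self {
        let text: String = message.into();
        Self { inner: text.into() }
    }

    /// Wrap any concrete error, erasing its type.
    pub fn new<E>(err: E) -> Self
    where
        E: Error + Send + Sync + 'static,
    {
        Self { inner: Box::new(err) }
    }

    /// Error for a trait method the warehouse cannot implement.
    pub fn not_supported(method: &str) -> Self {
        Self::msg(format!("not supported: {method}"))
    }
}

impl fmt::Display for AdapterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Delegate to whatever error is stored inside the box.
        write!(f, "{}", self.inner)
    }
}

impl Error for AdapterError {}
```

Callers see a single error type regardless of which driver produced the failure, which is why a per-adapter error enum adds little inside the trait impl.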
Next steps
- Browse the skeleton POC source; adapter/src/lib.rs is meant to be read top-to-bottom.
- Read the adapter concepts page for the architecture overview.
- Look at the in-tree adapters in engine/crates/rocky-{databricks,snowflake,bigquery,duckdb,fivetran}/ for production patterns.
- File an issue on github.com/rocky-data/rocky if a trait method is missing what you need; the SDK is still young and feedback shapes the roadmap.