The Rocky DSL
Rocky’s DSL is a pipeline-oriented alternative to SQL for transformation models. .rocky files lower to standard SQL before execution — the warehouse only sees SQL.
Why a DSL
Section titled “Why a DSL”A SELECT with multiple CTEs, window functions, and nested CASE expressions gets hard to follow. The DSL offers a top-to-bottom pipeline: each step transforms the rows flowing through, and the final shape is obvious from scanning the file.
Pipeline syntax
Section titled “Pipeline syntax”A Rocky file is a sequence of pipeline steps. Data flows from the top step downward:
from orderswhere status == "completed"derive { total: amount * quantity}group customer_id { revenue: sum(total), order_count: count()}sort revenue desctake 100This lowers to:
SELECT customer_id, SUM(amount * quantity) AS revenue, COUNT() AS order_countFROM ordersWHERE status = 'completed'GROUP BY customer_idORDER BY revenue DESCLIMIT 100Constructs
Section titled “Constructs”Every pipeline starts with from. It accepts a model name, a qualified table reference, or a table with an alias:
from ordersfrom catalog.schema.ordersfrom orders as oFilters rows. Multiple where steps combine with AND. A where step after group becomes a HAVING clause:
from orderswhere amount > 0where status != "cancelled"group customer_id { total: sum(amount)}where total > 1000 -- this becomes HAVINGGroups rows by one or more keys and defines aggregations in a block:
from ordersgroup customer_id { revenue: sum(amount), avg_order: avg(amount), first_order: min(order_date), cnt: count()}derive
Section titled “derive”Adds computed columns without removing existing ones:
from ordersderive { total: amount * quantity, is_large: amount > 1000}select
Section titled “select”Chooses which columns to keep. Replaces the current column set:
from ordersselect { id, customer_id, total }Supports * and qualified references:
from orders as ojoin customers as c on customer_id { keep c.name}select { o.id, c.name }Joins another model by key columns, with an optional keep clause to pull in specific columns:
from orders as ojoin customers as c on customer_id { keep c.name, c.email}This lowers to:
SELECT *, c.name, c.emailFROM orders oJOIN customers AS c ON o.customer_id = c.customer_idOrders results. Use desc for descending:
from orderssort order_date descLimits the number of rows:
from orderssort amount desctake 10distinct
Section titled “distinct”Deduplicates rows:
from ordersselect { customer_id }distinctPattern matching that lowers to CASE WHEN:
from ordersderive { tier: match amount { > 10000 => "enterprise", > 1000 => "mid-market", _ => "smb" }}Lowers to:
CASE WHEN amount > 10000 THEN 'enterprise' WHEN amount > 1000 THEN 'mid-market' ELSE 'smb' END AS tierreplicate
Section titled “replicate”Shorthand for SELECT * — used in bronze-layer models that pass data through unchanged:
from source.fivetran.ordersreplicateDate literals
Section titled “Date literals”The @ prefix creates date literals without quoting:
from orderswhere order_date >= @2025-01-01Lowers to WHERE order_date >= DATE '2025-01-01'.
Timestamps are also supported: @2025-01-01T00:00:00Z.
NULL-safe operators
Section titled “NULL-safe operators”Rocky’s != operator compiles to SQL’s IS DISTINCT FROM, not the standard !=. This means comparisons involving NULL behave intuitively:
| Rocky | SQL | NULL behavior |
|---|---|---|
a == b | a = b | NULL = NULL is NULL (standard SQL) |
a != b | a IS DISTINCT FROM b | NULL IS DISTINCT FROM NULL is FALSE |
This eliminates an entire class of bugs where != 'value' silently excludes NULL rows.
IS NULL and IS NOT NULL work as expected:
from orderswhere email is not nullWindow functions
Section titled “Window functions”Window functions use an over clause with partition, sort, and optional frame:
from ordersderive { rn: row_number() over (partition customer_id, sort -order_date), running_total: sum(amount) over (partition customer_id, sort order_date, rows unbounded..current), prev_amount: lag(amount, 1) over (sort order_date)}The - prefix on a sort column means descending. Frame bounds use .. syntax: unbounded..current, 3..current.
Comments
Section titled “Comments”Line comments use --, same as SQL:
from orders-- Filter to completed orders onlywhere status == "completed"How lowering works
Section titled “How lowering works”DSL files go through two phases before the compiler sees them:
- Parse — tokens → typed AST (one variant per pipeline step).
- Lower — AST → single SQL string, walked step by step with an accumulating clause context (
FROM, joins,WHERE,SELECT,GROUP BY,HAVING,ORDER BY,LIMIT).
The lowered SQL then flows into the compiler for type checking and dependency resolution — exactly as if you’d written it by hand. There’s no runtime indirection; .rocky and .sql reach the warehouse identically.
For the full language grammar, see the Rocky language spec.