
Performance Benchmarks

Rocky 0.3.0 (post-optimization) compiles 10,000 models in 1.00 second: 34x faster than dbt-core and 38x faster than dbt-fusion, while using 4-7x less memory.

| Metric | Rocky | dbt-core 1.11.8 | dbt-fusion 2.0.0 |
| --- | --- | --- | --- |
| Compile | 1.00 s | 34.62 s (34x) | 38.43 s (38x) |
| Peak memory | 147 MB | 629 MB (4.3x) | 1,063 MB (7.2x) |
| Lineage | 0.84 s | 35.36 s (42x) | N/A |
| Startup | 14 ms | 896 ms (64x) | 12 ms |
| DAG resolution | 0.36 s | 9.12 s (25x) | 2.73 s (7x) |
| Warm compile (1 file) | 0.72 s | 33.12 s (46x) | 37.16 s (52x) |
| SQL generation | 200 ms | N/A | N/A |
| Config validation | 15 ms | 2,187 ms (146x) | 1,473 ms (98x) |

Benchmarked on Apple Silicon (12-core, 36 GB RAM) with a synthetic 4-layer medallion DAG. 3 iterations per benchmark, mean reported. Full methodology in examples/playground/benchmarks/REPORT_CURRENT.md.

Rocky compiles and traces lineage in 1.84 seconds. dbt-core takes 70 seconds. For a 10-engineer team running 5 PR iterations/day each:

| Tool | Annual developer wait time | Cost at $75/hr |
| --- | --- | --- |
| Rocky | 9.3 hours | $700 |
| dbt-core | 354.8 hours | $26,608 |
| dbt-fusion | 194.8 hours | $14,610 |
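The wait-time figures above can be reproduced to within rounding with a short sketch. Assumptions I am making here (not stated explicitly in the source): 365 iteration days per year, and per-iteration waits of 1.84 s for Rocky (compile + lineage), 70 s for dbt-core, and 38.43 s for dbt-fusion (compile only).

```python
ENGINEERS = 10
ITERATIONS_PER_DAY = 5
DAYS_PER_YEAR = 365  # assumption: the table appears to use 365, not working days
RATE = 75  # $/hr

iterations = ENGINEERS * ITERATIONS_PER_DAY * DAYS_PER_YEAR  # 18,250 PR iterations/year

# Per-iteration wait in seconds (assumption: Rocky = compile + lineage).
per_iteration = {"Rocky": 1.84, "dbt-core": 70.0, "dbt-fusion": 38.43}

wait_hours = {tool: iterations * s / 3600 for tool, s in per_iteration.items()}
cost = {tool: h * RATE for tool, h in wait_hours.items()}

for tool in per_iteration:
    print(f"{tool:10s} {wait_hours[tool]:7.1f} h/yr  ${cost[tool]:,.0f}")
```

Computed hours (9.3, 354.9, 194.8) match the table; the dollar figures differ from it by a few dollars because the table rounds hours before multiplying by the rate.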

Rocky enables 60-second Dagster sensor intervals. dbt-core needs 2-5 minute intervals because its 35-second compile consumes most of the cycle. Rocky uses under 1% of a 2-minute sensor cycle for compilation; dbt-core uses 29%.
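The cycle-share claim is simple arithmetic over the compile times in the benchmark table; a quick check, assuming a 120-second (2-minute) sensor cycle:

```python
# Fraction of a sensor cycle consumed by compilation.
def cycle_share(compile_seconds: float, interval_seconds: float) -> float:
    return compile_seconds / interval_seconds

rocky_share = cycle_share(1.00, 120)      # about 0.8%, i.e. under 1%
dbt_core_share = cycle_share(34.62, 120)  # about 29%
print(f"Rocky: {rocky_share:.1%}, dbt-core: {dbt_core_share:.1%}")
```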

| Scale | Rocky | dbt-core | dbt-fusion |
| --- | --- | --- | --- |
| Fits in 512 MB pod | 10k | No | No |
| Fits in 1 GB pod | 10k | 10k | No |
| Needs 2 GB pod | Never | Never | 10k |

Rocky’s warm compile (after changing 1 file) takes 0.72 seconds — 28% faster than a cold compile. dbt-core and dbt-fusion show no warm compile benefit (their partial parse doesn’t help because the bottleneck is SQL generation, not parsing).

Rocky compiles linearly. Per-model cost is flat at ~100 µs from 1k to 50k models. Verified across prior benchmark rounds:

| Tool | 10k (measured) | 50k (extrapolated) |
| --- | --- | --- |
| Rocky | 1.00 s | ~5.0 s |
| dbt-core | 34.62 s | ~173 s |
| dbt-fusion | 38.43 s | ~192 s |

Memory also scales linearly:

| Tool | 10k (measured) | 50k (extrapolated) |
| --- | --- | --- |
| Rocky | 147 MB | ~735 MB |
| dbt-core | 629 MB | ~3,145 MB |
| dbt-fusion | 1,063 MB | ~5,315 MB |
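The extrapolation used in both tables is the same linear model: per-model cost measured at 10k is held constant and scaled by model count. A minimal sketch:

```python
# Linear-scaling extrapolation: per-model cost measured at 10k models
# is assumed constant, so time and memory scale with model count.
MEASURED_10K = {  # tool: (compile seconds, peak MB) at 10k models
    "Rocky": (1.00, 147),
    "dbt-core": (34.62, 629),
    "dbt-fusion": (38.43, 1_063),
}

def extrapolate(tool: str, n_models: int) -> tuple[float, float]:
    seconds, mb = MEASURED_10K[tool]
    factor = n_models / 10_000
    return seconds * factor, mb * factor

for tool in MEASURED_10K:
    secs, mb = extrapolate(tool, 50_000)
    print(f"{tool:10s} @ 50k: ~{secs:.0f} s, ~{mb:,.0f} MB")
```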

At 50k models, Rocky stays well under 1 GB. dbt-fusion would need ~5 GB — exceeding standard EKS pod limits.

Annual Cost Model (10k models, 10 engineers)


Same 5-minute sensor interval, Databricks SQL Classic with auto-stop, EKS Fargate:

| Cost component | Rocky | dbt-core | dbt-fusion |
| --- | --- | --- | --- |
| Shared infrastructure | $2,847 | $2,847 | $2,847 |
| **Tool-dependent costs** | | | |
| Fargate (orchestration + CI) | $25 | $50 | $99 |
| Databricks idle burn | $21 | $894 | $988 |
| Developer wait time (CI) | $700 | $26,608 | $14,610 |
| Tool subtotal | $749 | $27,353 | $15,338 |
| Grand total | $3,596 | $30,200 | $18,185 |

Rocky saves $26,604/year vs dbt-core regardless of Databricks compute strategy. At 25 engineers, savings reach $65k/year.
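The grand totals combine as shared infrastructure plus the tool-dependent subtotal; a quick arithmetic check (subtotal rows taken as given from the table):

```python
# Grand total = shared infrastructure + tool-dependent subtotal.
SHARED_INFRA = 2_847
TOOL_SUBTOTAL = {"Rocky": 749, "dbt-core": 27_353, "dbt-fusion": 15_338}

grand_total = {tool: SHARED_INFRA + sub for tool, sub in TOOL_SUBTOTAL.items()}
savings_vs_dbt_core = grand_total["dbt-core"] - grand_total["Rocky"]

print(grand_total)          # {'Rocky': 3596, 'dbt-core': 30200, 'dbt-fusion': 18185}
print(savings_vs_dbt_core)  # 26604
```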

To reproduce the benchmarks:

```shell
cd examples/playground/benchmarks
# Build Rocky release binary
cd ../../../engine && cargo build --release && cd -
# Set up Python env
python3 -m venv .venv
.venv/bin/pip install dbt-core dbt-duckdb psutil matplotlib
# Generate 10k-model project
python generate_dbt_project.py --scale 10000 --output-dir .
# Run full suite
python run_benchmark.py \
  --scale 10000 --iterations 3 \
  --tool all --benchmark-type all \
  --rocky-bin ../../../engine/target/release/rocky
# Generate charts
python visualize.py results/benchmark_*.json
```
| Version | Compile (10k) | Per-model | Peak RSS |
| --- | --- | --- | --- |
| Rocky 0.1.0 | 1.33 s | 133 µs | 116 MB |
| Rocky 0.3.0 | 1.20 s | 120 µs | 125 MB |
| Rocky 0.3.0 (optimized) | 1.00 s | 100 µs | 147 MB |

Cumulative: 25% faster compile since v0.1.0. The memory increase (116 → 147 MB) is a deliberate tradeoff: caching and pre-allocation spend ~31 MB to make warm compiles 39% faster.