balances fwd fill asof test #9171

tomfutago · 2026-01-02T13:54:36Z

Thank you for contributing to Spellbook 🪄

Please open the PR in draft and mark as ready when you want to request a review.

Description:

[...]

tomfutago · 2026-01-05T15:10:18Z

asof join test results summary

test descriptions

| test | strategy | description |
| --- | --- | --- |
| original | `lead()` + range join | uses `lead()` window function to compute `next_update_day`, then expands with range join |
| test1 | asof + cross join | cross join (address × token × days), then asof left join to find latest balance |
| test2 | asof for `next_update_day` + `utils.days` | asof self-join to find next balance (replaces lead), expand with `inner join utils.days` |
| test3 | asof for `next_update_day` + range join | asof self-join to find next balance (replaces lead), expand with original `left join days` range pattern |
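
For readers less familiar with the baseline, here is a minimal sketch of the `lead()` + range join forward fill that the "original" row describes. It runs against an in-memory SQLite database through Python's `sqlite3` so it is self-contained; it is not the Spellbook macro, the table and column names (`balance_updates`, `days`, `day`) and the sample rows are hypothetical, and it uses an inner join rather than the `left join days` form mentioned above. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample data: two balance updates for one (address, token) pair.
UPDATES = [
    ("0xabc", "0xtoken", "2026-01-01", 10.0),
    ("0xabc", "0xtoken", "2026-01-04", 7.0),
]
# Hypothetical day spine standing in for utils.days (ISO strings sort by date).
DAYS = [(f"2026-01-0{d}",) for d in range(1, 6)]

FORWARD_FILL_SQL = """
with with_next as (
    select
        address, token_address, day, balance,
        -- day of the next update for the same pair, null for the latest one
        lead(day) over (
            partition by address, token_address order by day
        ) as next_update_day
    from balance_updates
)
select d.day, b.address, b.token_address, b.balance
from with_next b
join days d
    -- range join: each update covers [day, next_update_day)
    on d.day >= b.day
    and (b.next_update_day is null or d.day < b.next_update_day)
order by b.address, b.token_address, d.day
"""


def forward_fill():
    """Return one row per (day, address, token) with the carried balance."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table balance_updates "
        "(address text, token_address text, day text, balance real)"
    )
    conn.execute("create table days (day text)")
    conn.executemany("insert into balance_updates values (?, ?, ?, ?)", UPDATES)
    conn.executemany("insert into days values (?)", DAYS)
    rows = conn.execute(FORWARD_FILL_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in forward_fill():
        print(row)
```

test2 and test3 keep this expansion step and only swap the `lead()` computation of `next_update_day` for an asof self-join.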

initial run (full refresh)

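| chain | rows | original | test1 | test2 | test3 |
| --- | --- | --- | --- | --- | --- |
| arbitrum | 3.4b | 797s | 2418s ❌ (3.0×) | 784s ✅ | 763s ✅ |
| base | 3.2b | 735s | 1596s ❌ (2.2×) | 714s ✅ | 694s ✅ |
| linea | 686m | 155s | 271s ❌ (1.7×) | 147s ✅ | 135s ✅ |
| scroll | 487m | 106s | 183s ❌ (1.7×) | 84s ✅ | 70s ✅ |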

winner: test3, about 4-34% faster than original on full refresh (763s vs 797s on arbitrum, 70s vs 106s on scroll).

incremental run

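| chain | rows | original | test1 | test2 | test3 |
| --- | --- | --- | --- | --- | --- |
| arbitrum | 19.6m | 93s | 108s | 107s | 86s ✅ |
| base | 33.3m | 137s | 131s ✅ | 136s | 136s |
| linea | 2.6m | 11s | 11s | 10s | 10s ✅ |
| scroll | 1.9m | n/a | 14s | 15s | 8s ✅ |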

winner: test3, fastest or tied on three of four chains on incremental runs (test1 is 5s faster on base).

conclusions

test1 (cross join): ❌ do not use. the cross join causes a massive cardinality explosion and runs 1.7-3.0× slower than original on full refresh, worst on the larger chains.
test2 (asof + utils.days inner join): ✅ slight improvement over original. good alternative.
test3 (asof + range join): ✅ best performer. combines asof for next_update_day calculation with original range join pattern. about 4-34% faster on full refresh, and fastest or tied on incremental for three of four chains.

recommendation: test3 is the optimal asof implementation. it replaces only the lead() window function with asof self-join while keeping the proven range join expansion pattern.

jeff-dude · 2026-01-06T00:28:17Z

follow-up question:
was base chosen as one of the larger chains, so we can see performance across chains of various sizes?
would be good to confirm it's consistent across smaller + larger chains

tomfutago · 2026-01-06T09:32:20Z

yep, it's to compare 2 big chains (arbitrum + base) vs 2 small ones (linea + scroll)

just a few more notes:

test1 is meant to be the closest "translation" of the original logic into asof-style logic, but the performance gain from the asof join is largely cancelled out by the utils.days cross join (which was meant to replace the current range join on day to next_update_day)
test2-3 are performing better but not as much as i hoped - not sure if asof self-join is the best approach here

hints welcome 🙏

0xRobin

@tomfutago I had a quick look. Can you add another approach that uses just one asof join?
At a high level it should look like this:

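with balance_updates as (
  -- placeholder filters: address, token, and incremental window
  select * from balances_source
  where address_filter
    and token_filter
    and incremental_filter
),

-- distinct (address, token) pairs that have at least one update
asof_subjects as (
  select distinct
    address
    ,token_address
  from balance_updates
),

-- one row per pair per day in the window
asof_spine as (
  select
    d.timestamp
    ,s.address
    ,s.token_address
  from asof_subjects s
  cross join (select timestamp from utils.days where <incremental window>) d
)

-- for each spine row, pick the latest balance update at or before that day
select
  sp.timestamp
  ,sp.address
  ,sp.token_address
  ,b.balance
from asof_spine sp
asof join balance_updates b
  on sp.address = b.address
  and sp.token_address = b.token_address
  and sp.timestamp >= b.balance_updated_at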
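
To make the intended result of the spine plus asof join concrete, the following is a small sketch that emulates it on an engine without asof joins. It uses SQLite through Python's `sqlite3`, and a correlated subquery stands in for the asof join by picking the latest update at or before each spine day. This is an illustration of the semantics only, not DuneSQL syntax; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: two (address, token) pairs with sparse updates.
UPDATES = [
    ("0xabc", "0xtoken", "2026-01-02", 10.0),
    ("0xabc", "0xtoken", "2026-01-04", 7.0),
    ("0xdef", "0xtoken", "2026-01-03", 5.0),
]
# Hypothetical day spine standing in for utils.days.
DAYS = [(f"2026-01-0{d}",) for d in range(1, 6)]

ASOF_SQL = """
with asof_subjects as (
    select distinct address, token_address from balance_updates
),
asof_spine as (
    -- pairs x days: this is the cardinality the cross join pays for
    select d.day, s.address, s.token_address
    from asof_subjects s
    cross join days d
)
select
    sp.day, sp.address, sp.token_address,
    (
        -- asof emulation: latest update at or before the spine day
        select b.balance
        from balance_updates b
        where b.address = sp.address
          and b.token_address = sp.token_address
          and b.day <= sp.day
        order by b.day desc
        limit 1
    ) as balance
from asof_spine sp
order by sp.address, sp.token_address, sp.day
"""


def asof_fill():
    """Return the spine with the most recent balance attached to each day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table balance_updates "
        "(address text, token_address text, day text, balance real)"
    )
    conn.execute("create table days (day text)")
    conn.executemany("insert into balance_updates values (?, ?, ?, ?)", UPDATES)
    conn.executemany("insert into days values (?)", DAYS)
    rows = conn.execute(ASOF_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in asof_fill():
        print(row)
```

Spine days that precede a pair's first update come back with a null balance here, which is the gap the lookback for earlier balances is meant to close.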

There's also an issue that you're not accounting for balance updates that occurred before the start of the incremental window (and carrying those into the current window), but I would keep that as a follow-up. I have some examples of how to do that as well somewhere.

tomfutago · 2026-01-06T16:37:10Z

thanks @0xRobin i might be missing something but your suggested approach looks like my test1, is it not?

0xRobin · 2026-01-06T17:35:56Z

ah yes you are right! @tomfutago
Sorry I conflated test1 with the original.

I think all approaches are missing the historical lookback for incremental updates though, can you add that as well?
Equivalent to this in the original:
https://github.com/duneanalytics/spellbook/blob/8019858e5349b5ddcf1f8d8f571484b920d85fd0/dbt_macros/shared/balances_incremental_subset_daily.sql#L85C1-L104C16

I wonder if for approach 1 it would be better to do it in 2 stages, one to construct the cross join spine and the final one to asof join with the balance updates.
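
As a rough illustration of the historical lookback, the sketch below seeds each (address, token) pair with its latest update from before the incremental window and forward fills that balance into the window. It is an assumption about the general idea, not a copy of the linked macro: the table names, column names, `window_start` parameter, and sample rows are all hypothetical, and it runs on SQLite through Python's `sqlite3` rather than on Dune.

```python
import sqlite3

# Hypothetical sample data: some updates fall before the incremental window.
UPDATES = [
    ("0xabc", "0xtoken", "2025-12-10", 3.0),
    ("0xabc", "0xtoken", "2025-12-20", 10.0),
    ("0xabc", "0xtoken", "2026-01-04", 7.0),
    ("0xdef", "0xtoken", "2025-12-30", 5.0),
]
DAYS = [(f"2026-01-0{d}",) for d in range(1, 6)]

LOOKBACK_SQL = """
with seed as (
    -- latest update per pair strictly before the window
    select address, token_address, day, balance
    from (
        select *, row_number() over (
            partition by address, token_address order by day desc
        ) as rn
        from balance_updates
        where day < :window_start
    )
    where rn = 1
),
combined as (
    select address, token_address, day, balance from seed
    union all
    select address, token_address, day, balance
    from balance_updates
    where day >= :window_start
),
with_next as (
    select *, lead(day) over (
        partition by address, token_address order by day
    ) as next_update_day
    from combined
)
select d.day, c.address, c.token_address, c.balance
from with_next c
join days d
    on d.day >= c.day
    and (c.next_update_day is null or d.day < c.next_update_day)
where d.day >= :window_start  -- only emit days inside the window
order by c.address, c.token_address, d.day
"""


def fill_window(window_start):
    """Forward fill the window, carrying in the last balance before it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table balance_updates "
        "(address text, token_address text, day text, balance real)"
    )
    conn.execute("create table days (day text)")
    conn.executemany("insert into balance_updates values (?, ?, ?, ?)", UPDATES)
    conn.executemany("insert into days values (?)", DAYS)
    rows = conn.execute(LOOKBACK_SQL, {"window_start": window_start}).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in fill_window("2026-01-03"):
        print(row)
```

Without the `seed` step, `0xabc` would have no balance on 2026-01-03 and `0xdef` would drop out of the window entirely, since neither has an update on or before that day inside the window.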

tomfutago · 2026-01-07T13:02:21Z

updated - test3 version still seems to be winning:

initial run (full refresh)

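| chain | rows | original | test1 | test2 | test3 |
| --- | --- | --- | --- | --- | --- |
| arbitrum | 3.4b | 864s | 2980s ❌ (3.4×) | 848s ✅ | 855s ✅ |
| base | 3.25b | 1000s | 2047s ❌ (2×) | 1002s | 988s ✅ |
| linea | 688m | 183s | 268s ❌ (1.5×) | 169s ✅ | 73s ✅✅ |
| scroll | 488m | 52s | 90s ❌ (1.7×) | 46s ✅ | 44s ✅ |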

winner: test3 - up to 2.5× faster than original (linea), with smaller gains of roughly 1-15% on the other chains.

incremental run

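| chain | rows | original | test1 | test2 | test3 |
| --- | --- | --- | --- | --- | --- |
| arbitrum | 19.7m | 94s | 95s | 100s | 98s |
| base | 33.4m | 161s | 149s ✅ | 165s | 156s |
| linea | 2.6m | 23s | 22s | 17s ✅ | 16s ✅ |
| scroll | 1.9m | 17s | 18s | 17s | 17s |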

winner: test3 - slight edge on incremental (fastest on linea, within 7s of the best elsewhere), test1 also competitive now.

feat: balances fwd fill asof test

51675a5

github-actions bot added the WIP and dbt: tokens labels Jan 2, 2026

tomfutago and others added 9 commits January 2, 2026 14:29

test: optimise & add base for bigger test dataset

4d23271

fix: reuse existing selection for changed balances, trigger base re-run

8ca0629

test: another attempt

a65c4f7

test: 3 variations to compare - just linea & scroll

b50251c

fix: daily sequence

b0ea822

fix: dedupe source

759d856

fix: add alias ref

0557281

test: include arbitrum & base

eea0e38

Merge branch 'main' into balances-fwd-fill-asof-test

ac4cd6d

tomfutago requested a review from a team January 5, 2026 15:10

0xRobin reviewed Jan 6, 2026

fix: optimise incremental logic

fe10a39
