CAP 81 draft #1869
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,241 @@ | ||
| ## Preamble | ||
|
|
||
| ``` | ||
| CAP: 0081 | ||
| Title: TTL-Ordered Eviction | ||
| Working Group: | ||
| Owner: Garand Tyson <@SirTyson> | ||
| Authors: Garand Tyson <@SirTyson> | ||
| Consulted: | ||
| Status: Draft | ||
| Created: 2026-01-28 | ||
| Discussion: https://github.com/orgs/stellar/discussions/1868 | ||
| Protocol version: 26 | ||
| ``` | ||
|
|
||
| ## Simple Summary | ||
|
|
||
| This CAP changes the eviction order of Soroban entries from | ||
| bucket-file-position order to TTL order. Entries are evicted in | ||
| `(liveUntilLedgerSeq, LedgerKey)` order, lowest TTL first, with LedgerKey as a | ||
| tiebreaker. This makes eviction order depend solely on intrinsic entry | ||
| properties rather than BucketList file layout. | ||
|
|
||
| ## Working Group | ||
|
|
||
| As specified in the Preamble. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The current eviction mechanism (CAP-0046-12) scans bucket files on disk to find | ||
| expired entries. This approach has two fundamental issues: | ||
|
|
||
| 1. **Performance**: The current eviction process requires unnecessary disk IO. | ||
| While only Soroban entries are evictable, we scan all entry types. Soroban | ||
| entries themselves are also always stored in-memory, but this is not | ||
| leveraged for the eviction scan. By changing eviction order, all eviction | ||
| scans can be done on in-memory state, increasing the rate of eviction and | ||
| reducing resource consumption. | ||
|
|
||
| 2. **Non-intuitive ordering**: Eviction order is determined by an entry's | ||
| position in the BucketList structure. This is very complex, implementation | ||
| specific, and led to a correctness bug in Protocol 23. The ordering proposed | ||
| here depends only on entry state and should lead to a much simpler | ||
| implementation. | ||
|
|
||
| ### Goals Alignment | ||
|
|
||
| The proposal aligns with Stellar’s goals of scalability, resilience, and | ||
| performance by reducing disk utilization and simplifying the protocol. | ||
|
|
||
| ## Abstract | ||
|
|
||
| This CAP modifies the eviction mechanism to: | ||
|
|
||
| 1. **Order eviction by TTL**: Temporary and persistent entries are evicted | ||
| separately, each with their own ordering. Within each entry type, entries | ||
| are ordered by `(liveUntilLedgerSeq, LedgerKey)` — the entry with the lowest | ||
| `liveUntilLedgerSeq` is evicted first. For entries with the same TTL, | ||
| `LedgerKey` ordering provides a deterministic tiebreaker. Temporary and | ||
| persistent entries do not compete with each other for eviction order. | ||
|
|
||
| 2. **Separate limits for temporary and persistent entries**: Temporary entries | ||
| have their own eviction limit (`maxTempEntriesToEvict`) separate from the | ||
| persistent entry archival limit (`maxPersistentEntriesToArchive`). Each | ||
| limit is applied independently per ledger. | ||
|
|
||
| 3. **Remove disk I/O from eviction**: Since all Soroban entries are stored in | ||
| memory, eviction scans can be performed without disk IO. | ||
|
|
||
| 4. **Remove complex background eviction scan implementation**: With the new | ||
| ordering, eviction scans can be done on the apply thread during ledger | ||
| close, greatly simplifying implementation. | ||
|
|
||
| ## Specification | ||
|
|
||
| ### XDR changes | ||
|
|
||
| This CAP introduces a new network config setting to separately limit temporary | ||
| entry eviction: | ||
|
|
||
| ```diff mddiffcheck.ignore=true | ||
| enum ConfigSettingID | ||
| { | ||
| CONFIG_SETTING_CONTRACT_MAX_SIZE_BYTES = 0, | ||
| CONFIG_SETTING_CONTRACT_COMPUTE_V0 = 1, | ||
| // ... existing settings ... | ||
| CONFIG_SETTING_CONTRACT_PARALLEL_COMPUTE_V0 = 12, | ||
| CONFIG_SETTING_CONTRACT_LEDGER_COST_V0 = 13, | ||
| - CONFIG_SETTING_SCP_TIMING = 16 | ||
| + CONFIG_SETTING_SCP_TIMING = 16, | ||
| + CONFIG_SETTING_MAX_TEMP_ENTRIES_TO_EVICT = 17 | ||
| }; | ||
|
|
||
| union ConfigSettingEntry switch (ConfigSettingID configSettingID) | ||
| { | ||
| // ... existing cases ... | ||
| + case CONFIG_SETTING_MAX_TEMP_ENTRIES_TO_EVICT: | ||
| + uint32 maxTempEntriesToEvict; | ||
| }; | ||
| ``` | ||
|
|
||
|
||
| Additionally, `maxEntriesToArchive` will be renamed and `evictionScanSize` will be replaced by `maxPersistentBytesToArchive`: | ||
|
|
||
| ```diff mddiffcheck.ignore=true | ||
| struct StateArchivalSettings | ||
| { | ||
| uint32 maxEntryTTL; | ||
| uint32 minTemporaryTTL; | ||
| uint32 minPersistentTTL; | ||
| int64 persistentRentRateDenominator; | ||
| int64 tempRentRateDenominator; | ||
| - uint32 maxEntriesToArchive; | ||
| + uint32 maxPersistentEntriesToArchive; | ||
| uint32 bucketListSizeWindowSampleSize; | ||
| uint32 bucketListWindowSamplePeriod; | ||
| - uint32 evictionScanSize; | ||
| + uint32 maxPersistentBytesToArchive; | ||
| uint64 startingEvictionScanLevel; | ||
| }; | ||
| ``` | ||
|
|
||
| `EvictionIterator` and `startingEvictionScanLevel` will be deprecated and no | ||
| longer used. | ||
|
|
||
| ### Semantics | ||
|
|
||
| #### Eviction Ordering | ||
|
|
||
| Entries eligible for eviction (entries where | ||
| `liveUntilLedgerSeq < currentLedgerSeq`) are evicted in the following order: | ||
|
|
||
| 1. **Primary sort**: `liveUntilLedgerSeq` ascending, where entries expiring | ||
| soonest are evicted first | ||
| 2. **Secondary sort**: `LedgerKey` ordering, as a deterministic tiebreaker for entries with the same `liveUntilLedgerSeq` | ||
|
|
||
| #### Eviction Algorithm | ||
|
|
||
| On each ledger close, eviction proceeds separately for temporary and persistent | ||
| entries: | ||
|
|
||
| **Temporary Entry Eviction:** | ||
|
|
||
| 1. Identify expired temporary entries: all `TEMPORARY` Soroban entries where | ||
| `liveUntilLedgerSeq < currentLedgerSeq` | ||
| 2. Sort by `(liveUntilLedgerSeq, LedgerKey)` | ||
| 3. Evict the first `maxTempEntriesToEvict` entries in this order | ||
|
|
||
| **Persistent Entry Archival:** | ||
|
|
||
| 1. Identify expired persistent entries: all `PERSISTENT` Soroban entries where | ||
| `liveUntilLedgerSeq < currentLedgerSeq` | ||
| 2. Sort by `(liveUntilLedgerSeq, LedgerKey)` | ||
| 3. Archive entries in this order until we have archived | ||
| `maxPersistentBytesToArchive` bytes or `maxPersistentEntriesToArchive` | ||
| entries, whichever limit is reached first | ||
|
|
||
| Both limits are applied independently per ledger. A ledger may evict up to | ||
| `maxTempEntriesToEvict` temporary entries and archive up to | ||
| `maxPersistentEntriesToArchive` persistent entries. Eviction occurs after | ||
| applying all transactions from a given ledger. | ||
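The three steps above (filter, sort, truncate) can be sketched as follows. This is a minimal illustration, not the stellar-core implementation: `ExpirableEntry`, `evictsBefore`, and `selectForEviction` are hypothetical names, a `std::string` stands in for the XDR `LedgerKey`, and the sketch assumes the byte limit stops archival before the limit would be exceeded (the text above does not say whether the entry that crosses the limit is included). For temporary entries, which have no byte limit, the caller would pass `UINT64_MAX` as `maxBytes`.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a Soroban entry. The real LedgerKey is an XDR
// union; a string is used here purely for illustration.
struct ExpirableEntry
{
    std::string key;             // stand-in for LedgerKey
    uint32_t liveUntilLedgerSeq; // TTL of the entry
    uint32_t sizeBytes;          // assumed serialized size of the entry
};

// Orders entries by (liveUntilLedgerSeq, LedgerKey), lowest TTL first.
inline bool
evictsBefore(ExpirableEntry const& a, ExpirableEntry const& b)
{
    return std::tie(a.liveUntilLedgerSeq, a.key) <
           std::tie(b.liveUntilLedgerSeq, b.key);
}

// Returns the expired entries in eviction order, truncated by both limits.
inline std::vector<ExpirableEntry>
selectForEviction(std::vector<ExpirableEntry> entries,
                  uint32_t currentLedgerSeq, uint32_t maxEntries,
                  uint64_t maxBytes)
{
    // Step 1: keep only entries with liveUntilLedgerSeq < currentLedgerSeq.
    entries.erase(std::remove_if(entries.begin(), entries.end(),
                                 [&](ExpirableEntry const& e) {
                                     return e.liveUntilLedgerSeq >=
                                            currentLedgerSeq;
                                 }),
                  entries.end());

    // Step 2: sort by (liveUntilLedgerSeq, LedgerKey).
    std::sort(entries.begin(), entries.end(), evictsBefore);

    // Step 3: take entries until either limit is reached.
    // Assumption: stop before the byte limit would be exceeded.
    std::vector<ExpirableEntry> selected;
    uint64_t bytes = 0;
    for (auto const& e : entries)
    {
        if (selected.size() >= maxEntries || bytes + e.sizeBytes > maxBytes)
        {
            break;
        }
        bytes += e.sizeBytes;
        selected.push_back(e);
    }
    return selected;
}
```

Because the result depends only on the entries and the two limits, any two nodes holding the same state would select the same entries regardless of how their BucketList files are laid out.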
|
|
||
| #### Initial Settings | ||
|
|
||
| - `maxPersistentEntriesToArchive` initial value: 1000 | ||
| - `maxTempEntriesToEvict` initial value: 1000 | ||
|
|
||
| Currently, `maxEntriesToArchive` is set to 1000. For simplicity, this value | ||
| will be kept for both limits. | ||
|
|
||
| - `maxPersistentBytesToArchive` initial value: 286720 | ||
|
|
||
| `maxPersistentBytesToArchive` in practice meters the maximum number of bytes we | ||
| can write to the hot archive. While `evictionScanSize` implicitly bounded this | ||
| before, its value was not set with this in mind. We will adopt the current | ||
| `ledgerMaxWriteBytes` value. This represents the number of bytes we can safely | ||
| write to the Live BucketList. Given that the Hot Archive BucketList write is | ||
| independent and can be easily parallelized if need be, we can use the same | ||
| limit initially. | ||
|
|
||
| ## Implementation | ||
|
|
||
| Maintaining the order of entries to evict can be done efficiently by leveraging | ||
| the in-memory soroban cache. Initially, a naive approach can be used. Later on, | ||
| performance can be improved with a more complex solution without a protocol | ||
| change. | ||
|
|
||
| ### Naive implementation | ||
|
|
||
| The simplest implementation would maintain two global sorted indexes covering | ||
| all Soroban entries, one for `PERSISTENT` entries and one for `TEMPORARY` entries: | ||
|
|
||
| ```cpp | ||
| struct EvictionKey | ||
| { | ||
| std::shared_ptr<LedgerEntry const> entry; | ||
| uint32_t liveUntilLedgerSeq; | ||
| }; | ||
|
|
||
| std::set<EvictionKey> evictionIndex; // Sorted by (liveUntilLedgerSeq, LedgerKey) | ||
| ``` | ||
|
|
||
| At current state sizes, this additional index requires about 48 MB (`entry` is | ||
| just a pointer to the LedgerEntry already allocated from the data cache). This | ||
| overhead should be acceptable, even with significant state growth. Maintaining | ||
| this set should be simple, as we can do updates along with the atomic commits | ||
| we currently make to the BucketList/in-memory soroban cache. | ||
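A `std::set` needs a strict weak ordering to keep its elements sorted, which the snippet above leaves implicit. The following sketch shows one way such a comparator and a TTL bump could look. All names here (`SimpleEntry`, `IndexKey`, `IndexKeyLess`, `EvictionIndex`, `bumpTTL`) are hypothetical, and a string again stands in for `LedgerKey`; the real ordering on `LedgerKey` is defined by stellar-core and is not reproduced here.

```cpp
#include <cstdint>
#include <memory>
#include <set>
#include <string>
#include <tuple>

// Hypothetical simplified entry; only the key matters for ordering.
struct SimpleEntry
{
    std::string key; // stand-in for LedgerKey
};

// Mirrors the shape of EvictionKey: a pointer to the cached entry plus TTL.
struct IndexKey
{
    std::shared_ptr<SimpleEntry const> entry;
    uint32_t liveUntilLedgerSeq;
};

// Strict weak ordering on (liveUntilLedgerSeq, key).
struct IndexKeyLess
{
    bool
    operator()(IndexKey const& a, IndexKey const& b) const
    {
        return std::tie(a.liveUntilLedgerSeq, a.entry->key) <
               std::tie(b.liveUntilLedgerSeq, b.entry->key);
    }
};

using EvictionIndex = std::set<IndexKey, IndexKeyLess>;

// Elements of a std::set are immutable, so a TTL bump is modeled as an
// erase followed by a re-insert. Both operations are O(log n).
inline void
bumpTTL(EvictionIndex& index, IndexKey const& old, uint32_t newLiveUntil)
{
    auto it = index.find(old);
    if (it == index.end())
    {
        return; // entry is not indexed; nothing to update
    }
    IndexKey updated{it->entry, newLiveUntil};
    index.erase(it);
    index.insert(updated);
}
```

With this layout, the next entry to evict is always `index.begin()`, and one index instance would be kept per durability type.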
|
|
||
| ### Future Optimizations | ||
|
|
||
| While the memory overhead of this list is fairly small, there may be some | ||
| runtime issues with O(log n) operations on TTL bumps, eviction, entry creation, | ||
| etc. Should this become an issue, one option is to make updates to the | ||
| in-memory soroban cache and the `evictionIndex` in parallel with BucketList | ||
| writes. These are independent operations that do not race on any data, and disk | ||
| IO would dominate in-memory updates even in an ordered set. | ||
|
|
||
| Additionally, it is not necessary to keep an ordered list of all entries. The | ||
| maximum number of entries that can be evicted is bounded by network config | ||
| settings. This allows us to maintain an ordered list of a subset of entries | ||
| that are already eligible for eviction, or are just about to become eligible. | ||
| This list can be prepared and maintained outside of the ledgerClose path. While | ||
| a TTL can be increased such that an entry in the list must be "skipped," TTLs | ||
| can never decrease such that a new entry "jumps the line" and invalidates the | ||
| previously constructed list. | ||
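The "skip" behavior described above can be sketched as follows, under stated assumptions. `Candidate`, `drainCandidates`, and the `currentTTLs` map are hypothetical: the candidate list is assumed to be prepared ahead of time in `(liveUntilLedgerSeq, LedgerKey)` order, and the map stands in for the authoritative in-memory state. As a simplification, a candidate whose TTL changed since the snapshot is skipped rather than re-queued.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical snapshot of an entry taken when the list was prepared.
struct Candidate
{
    std::string key;              // stand-in for LedgerKey
    uint32_t liveUntilAtSnapshot; // TTL observed when the list was built
};

// Walks a pre-sorted candidate list and returns the keys to evict.
// `currentTTLs` maps key -> current liveUntilLedgerSeq (authoritative).
inline std::vector<std::string>
drainCandidates(std::vector<Candidate> const& candidates,
                std::map<std::string, uint32_t> const& currentTTLs,
                uint32_t currentLedgerSeq, uint32_t maxToEvict)
{
    std::vector<std::string> evicted;
    for (auto const& c : candidates)
    {
        if (evicted.size() >= maxToEvict)
        {
            break;
        }
        auto it = currentTTLs.find(c.key);
        // Skip candidates that no longer exist or whose TTL was bumped
        // after the snapshot; their position in the list is stale.
        if (it == currentTTLs.end() || it->second != c.liveUntilAtSnapshot)
        {
            continue;
        }
        // The list is sorted by TTL, so the first unexpired, unchanged
        // candidate means no later candidate is eligible either.
        if (it->second >= currentLedgerSeq)
        {
            break;
        }
        evicted.push_back(c.key);
    }
    return evicted;
}
```

Skipping is sufficient for correctness of the order because TTLs only increase: an entry outside the list can never acquire a lower TTL than one already in it.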
|
|
||
| For initial rollout, the naive solution is likely good enough. | ||
|
|
||
| ## Design Rationale | ||
|
|
||
| To evict a temporary entry, we must write a `DEADENTRY` to the live BucketList, | ||
| which is approximately `sizeof(LedgerKey)`. This size is bounded and relatively | ||
| small, so there is no need for an explicit byte limit when evicting temporary | ||
| entries. | ||
|
|
||
| To evict a persistent entry, we must write a `DEADENTRY` to the live BucketList | ||
| and an `ARCHIVED` entry to the Hot Archive. The size of the `ARCHIVED` entry is | ||
| `sizeof(LedgerEntry)`, which can be very significant and is not properly bounded | ||
| by an entry count limit alone. For this reason, a byte-based limit is | ||
| necessary. While an entry-count-based limit is not required for safe core | ||
| operation, it still seems like a valuable limit to prevent overwhelming | ||
| downstream consumers. | ||