Simplify forest and field batch codec versioning logic#26552
Simplify forest and field batch codec versioning logic#26552CraigMacomber wants to merge 1 commit intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR simplifies forest and field batch codec versioning by removing forest-summary per-version format types and centralizing version/schema handling, while also eliminating the need to pass a FieldBatchCodec into the forest summarizer/codec.
Changes:
- Replace
FormatV1/FormatV2forest-summary schemas with a singleFormatCommonschema/type. - Move
FieldBatchCodecconstruction into the forest summary codec (and update call sites/tests accordingly). - Export a reusable
versionFieldschema from the versioned codec utilities and use it in forest-summary formatting.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/dds/tree/src/test/feature-libraries/forest-summary/forestSummarizerCodec.spec.ts | Update tests to use FormatCommon and new forest codec build shape (no injected field batch codec). |
| packages/dds/tree/src/test/feature-libraries/forest-summary/forestSummarizer.spec.ts | Update summarizer tests for new ForestSummarizer constructor and FormatCommon usage. |
| packages/dds/tree/src/test/feature-libraries/chunked-forest/chunkEncodingEndToEnd.spec.ts | Update end-to-end tests for the updated ForestSummarizer signature and FormatCommon casts. |
| packages/dds/tree/src/shared-tree/sharedTree.ts | Update runtime construction of ForestSummarizer to remove the fieldBatchCodec argument. |
| packages/dds/tree/src/feature-libraries/forest-summary/formatV1.ts | Delete per-version forest summary format (v1). |
| packages/dds/tree/src/feature-libraries/forest-summary/formatV2.ts | Delete per-version forest summary format (v2). |
| packages/dds/tree/src/feature-libraries/forest-summary/formatCommon.ts | Define a single forest-summary FormatCommon schema and switch to shared versionField. |
| packages/dds/tree/src/feature-libraries/forest-summary/forestSummarizer.ts | Remove FieldBatchCodec dependency from constructor; build forest codec directly from options. |
| packages/dds/tree/src/feature-libraries/forest-summary/codec.ts | Build FieldBatchCodec internally and simplify forest codec options typing. |
| packages/dds/tree/src/codec/versioned/index.ts | Re-export versionField. |
| packages/dds/tree/src/codec/versioned/format.ts | Introduce versionField and build Versioned from it. |
| packages/dds/tree/src/codec/index.ts | Export versionField from the codec public surface. |
| packages/dds/tree/src/codec/codec.ts | Type-only import adjustment for TypeBox types. |
| >; | ||
| export const FormatCommon = Type.Object( | ||
| { | ||
| ...versionField, |
There was a problem hiding this comment.
The forest summary schema now spreads versionField, which allows version to be a string. Forest summary format versions appear to be a numeric strictEnum (ForestFormatVersion), and the previous schema constrained version to a numeric literal per version. If string versions are not actually supported for forest summaries, consider keeping this schema strict (e.g., version: Type.Number() or a union of the known numeric literals) to catch malformed data earlier via schema validation and avoid shifting failures into the later “Unsupported version …” path.
| ...versionField, | |
| ...versionField, | |
| version: Type.Union([ | |
| Type.Literal(ForestFormatVersion.v1), | |
| Type.Literal(ForestFormatVersion.v2), | |
| ]), |
| // Incremental summary is supported from ForestFormatVersion.v2 and ForestSummaryFormatVersion.v3 onwards. | ||
| // Note that even if this is true, it is possible for the | ||
| // FieldBatchCodec will not use incremental encoding (for example if using its v1 formats which does not support it). | ||
| const enableIncrementalSummary = | ||
| forestFormatWriteVersion >= ForestFormatVersion.v2 && | ||
| summaryFormatWriteVersion >= ForestSummaryFormatVersion.v3 && | ||
| encoderContext.encodeType === TreeCompressionStrategy.CompressedIncremental; |
There was a problem hiding this comment.
The comment says incremental summary is supported from ForestFormatVersion.v2 onward, but the gating logic no longer checks the forest format write version. Either update the comment to match the actual conditions being enforced, or reintroduce the forest-format check if it’s still required for correctness/compatibility.
| // Performance: Since multiple places (including multiple versions of this codec) use the field batch codec, | ||
| // we may end up with multiple copies of it, including compiling its format validation multiple times. | ||
| // This is not ideal, but is not too bad as it is a small fixed number of copies and thus should not be too expensive. | ||
| // If this becomes problematic a cache could be added for options to codec instances somewhere. | ||
| const fieldBatchCodec = fieldBatchCodecBuilder.build(options); | ||
| const formatSchema = FormatCommon; |
There was a problem hiding this comment.
makeForestSummarizerCodec builds a new fieldBatchCodec instance for each forest codec version during forestCodecBuilder.build(options). Because ClientVersionDispatchingCodecBuilder.build evaluates all registered versions, this ends up building (and schema-compiling) the FieldBatch codec multiple times per SharedTree/ForestSummarizer construction, which is a measurable startup/perf regression risk. Consider memoizing fieldBatchCodecBuilder.build(options) (e.g., in a WeakMap keyed by the options object or a stable key derived from options) or restructuring so the FieldBatch codec is built once per forestCodecBuilder.build call and reused across the v1/v2 forest codecs.
Description
Simplify forest and field batch codec versioning logic.
Reviewer Guidance
The review process is outlined on this wiki page.