Codebook [[overrides]] Behavior Spec
Summary
Add support for glob-scoped configuration overrides in codebook.toml. Overrides allow users to customize spell-checking behavior for specific file paths without needing separate config files. Each override block matches files by glob pattern and can replace or append to base settings.
Config Shape
Base settings (unchanged)
Top-level keys remain the same and define the default behavior for all files:
dictionaries = ["en_us"]
words = ["codebook", "rustc"]
flag_words = ["todo", "fixme"]
ignore_paths = ["target/**/*"]
ignore_patterns = ["\\b[ATCG]+\\b"]
use_global = true
Override blocks
Zero or more [[overrides]] blocks may follow. Each block contains a required paths field and any combination of setting fields:
[[overrides]]
paths = ["**/*.md", "docs/**/*"]
extra_words = ["frontmatter"]
dictionaries = ["en_us", "en_gb"]
Override fields
| Field | Type | Semantics |
|---|---|---|
| paths | string[] | Required. Glob patterns to match against. |
| dictionaries | string[] | Replace the base dictionaries list. |
| words | string[] | Replace the base words list. |
| flag_words | string[] | Replace the base flag_words list. |
| ignore_patterns | string[] | Replace the base ignore_patterns list. |
| extra_dictionaries | string[] | Append to the resolved dictionaries list. |
| extra_words | string[] | Append to the resolved words list. |
| extra_flag_words | string[] | Append to the resolved flag_words list. |
| extra_ignore_patterns | string[] | Append to the resolved ignore_patterns list. |
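As an illustration of the shape described by the field list, one possible in-memory model is sketched below in Python. The class name OverrideBlock is hypothetical and not part of Codebook. The sketch assumes that an absent replace field is represented as None (meaning "leave the resolved value alone"), while an absent append field is simply an empty list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OverrideBlock:
    """Hypothetical model of one [[overrides]] table (not Codebook's real type)."""

    paths: List[str]  # required: constructing a block without it raises TypeError
    # Replace fields: None means "absent", so the resolved value stays unchanged.
    dictionaries: Optional[List[str]] = None
    words: Optional[List[str]] = None
    flag_words: Optional[List[str]] = None
    ignore_patterns: Optional[List[str]] = None
    # Append fields: an empty list appends nothing.
    extra_dictionaries: List[str] = field(default_factory=list)
    extra_words: List[str] = field(default_factory=list)
    extra_flag_words: List[str] = field(default_factory=list)
    extra_ignore_patterns: List[str] = field(default_factory=list)


# The override block shown earlier, written as the dict a TOML parser would yield.
block = OverrideBlock(**{
    "paths": ["**/*.md", "docs/**/*"],
    "extra_words": ["frontmatter"],
    "dictionaries": ["en_us", "en_gb"],
})
print(block.words)        # None
print(block.extra_words)  # ['frontmatter']
```

Keeping None distinct from an empty list lets a block such as words = [] clear the list, which is what replace semantics would imply, although the spec does not call that case out explicitly.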
Path Matching
Glob format
Glob patterns in paths use the same syntax as ignore_paths:
- `*` matches any sequence of non-separator characters.
- `**` matches zero or more directories.
- `?` matches any single non-separator character.
- `{a,b}` matches either a or b.
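To make the four constructs concrete, here is a minimal Python sketch that translates such a glob into a regular expression. It is an illustration of the semantics listed above, not Codebook's matcher: the function names glob_to_regex and glob_match are hypothetical, and a real glob library may differ on edge cases such as character classes or escaping.

```python
import re


def glob_to_regex(pattern: str) -> str:
    """Translate a glob using *, **, ? and {a,b} into a regex (sketch only)."""
    out = []
    i = 0
    brace_depth = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            out.append("(?:[^/]+/)*")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")           # bare **: anything, separators included
            i += 2
        elif ch == "*":
            out.append("[^/]*")        # any run of non-separator characters
            i += 1
        elif ch == "?":
            out.append("[^/]")         # exactly one non-separator character
            i += 1
        elif ch == "{":
            out.append("(?:")          # start of an alternation group
            brace_depth += 1
            i += 1
        elif ch == "}" and brace_depth:
            out.append(")")
            brace_depth -= 1
            i += 1
        elif ch == "," and brace_depth:
            out.append("|")            # separator between alternatives
            i += 1
        else:
            out.append(re.escape(ch))  # everything else is literal
            i += 1
    return "".join(out)


def glob_match(pattern: str, path: str) -> bool:
    """Return True when the whole path matches the glob."""
    return re.fullmatch(glob_to_regex(pattern), path) is not None


print(glob_match("**/*.md", "README.md"))               # True
print(glob_match("*.md", "docs/guide.md"))              # False
print(glob_match("**/*.{md,mdx}", "notes/page.mdx"))    # True
```

Note that in this sketch `**/` can match zero directories, so `**/*.md` also covers Markdown files in the project root.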
What paths are matched against
Patterns are matched against the file path relative to the project root (the directory containing codebook.toml). Paths use forward slashes regardless of OS.
For example, given a project root of /home/user/myproject, the file /home/user/myproject/src/lib.rs is matched as src/lib.rs.
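The normalization step can be sketched with Python's pathlib. The helper name relative_match_path and its windows flag are hypothetical and exist only for this illustration. The sketch assumes the file lives under the project root and lets relative_to raise ValueError otherwise.

```python
from pathlib import PurePosixPath, PureWindowsPath


def relative_match_path(project_root: str, file_path: str, windows: bool = False) -> str:
    """Return the path that glob patterns are matched against (sketch only).

    The result is relative to the project root and always uses forward slashes.
    """
    flavour = PureWindowsPath if windows else PurePosixPath
    relative = flavour(file_path).relative_to(flavour(project_root))
    return relative.as_posix()


print(relative_match_path("/home/user/myproject", "/home/user/myproject/src/lib.rs"))
# src/lib.rs
print(relative_match_path(r"C:\work\proj", r"C:\work\proj\src\lib.rs", windows=True))
# src/lib.rs
```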
Match behavior
A file matches an override block if it matches any pattern in that block's paths array. This is an OR relationship:
[[overrides]]
paths = ["**/*.md", "**/*.txt"] # matches .md OR .txt files
extra_words = ["prose"]
Resolution Order
When Codebook opens a file, it resolves the effective configuration by applying overrides in declaration order:
- Start with base config. All top-level settings form the initial resolved config.
- Apply global config. If use_global = true, merge the global config (with global settings as the base, project settings overriding).
- Walk [[overrides]] in order. For each override block, top to bottom:
  - Check if the current file matches any pattern in paths.
  - If it matches, apply the override (see "Merge Semantics" below).
  - If it doesn't match, skip the block.
- The final resolved config is used for spell checking.
All matching overrides are applied, not just the first match. Later overrides take precedence over earlier ones if they set the same field.
Example
words = ["base"]
[[overrides]]
paths = ["**/*.md"]
extra_words = ["markdown"]
[[overrides]]
paths = ["docs/**/*"]
extra_words = ["documentation"]
For the file docs/guide.md:
- Both overrides match.
- First override applies: words = ["base", "markdown"]
- Second override applies: words = ["base", "markdown", "documentation"]
- Final words: ["base", "markdown", "documentation"]
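The walk in this example can be reproduced with a short Python sketch. It only tracks the words list, and both resolve_words and glob_match are hypothetical names. The matcher is a simplified stand-in that handles `*`, `**` and `?` but not `{a,b}`, so it should not be read as Codebook's actual glob engine.

```python
import re


def glob_match(pattern: str, path: str) -> bool:
    """Simplified stand-in matcher (assumption): supports *, ** and ? only."""
    rx = re.escape(pattern)
    rx = rx.replace(r"\*\*/", "(?:[^/]+/)*")  # **/ -> zero or more directories
    rx = rx.replace(r"\*\*", ".*")            # bare ** -> anything
    rx = rx.replace(r"\*", "[^/]*")           # * -> stays within one segment
    rx = rx.replace(r"\?", "[^/]")            # ? -> one non-separator character
    return re.fullmatch(rx, path) is not None


def resolve_words(base_words, overrides, rel_path):
    """Apply every matching override block in declaration order."""
    words = list(base_words)
    for block in overrides:
        # OR relationship: one matching pattern is enough.
        if not any(glob_match(p, rel_path) for p in block["paths"]):
            continue  # no pattern matched: skip the block
        if "words" in block:
            words = list(block["words"])            # replace
        words.extend(block.get("extra_words", []))  # then append
    return words


overrides = [
    {"paths": ["**/*.md"], "extra_words": ["markdown"]},
    {"paths": ["docs/**/*"], "extra_words": ["documentation"]},
]
print(resolve_words(["base"], overrides, "docs/guide.md"))
# ['base', 'markdown', 'documentation']
print(resolve_words(["base"], overrides, "README.md"))
# ['base', 'markdown']
```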
Merge Semantics
When an override matches a file, its fields are applied to the current resolved config as follows:
Replace fields (dictionaries, words, flag_words, ignore_patterns)
The resolved list is fully replaced with the value from the override. The base (or previously resolved) value is discarded.
words = ["alpha", "beta"]
[[overrides]]
paths = ["**/*.md"]
words = ["gamma"]
# Resolved words for .md files: ["gamma"]
Append fields (extra_dictionaries, extra_words, extra_flag_words, extra_ignore_patterns)
The values are appended to the current resolved list. Duplicates are preserved (deduplication is not performed at config resolution time).
words = ["alpha", "beta"]
[[overrides]]
paths = ["**/*.md"]
extra_words = ["gamma"]
# Resolved words for .md files: ["alpha", "beta", "gamma"]
Combining replace and append in the same override
If both words and extra_words are set in the same override block, the replace is applied first, then the append is applied on top:
words = ["alpha", "beta"]
[[overrides]]
paths = ["**/*.md"]
words = ["gamma"]
extra_words = ["delta"]
# Resolved words for .md files: ["gamma", "delta"]
This is well-defined but unusual. It is not an error.
Fields not present in an override
If a field is absent from an override block, the resolved value for that field is unchanged. Overrides are sparse — they only affect what they explicitly set.
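The merge rules in this section fit in one small function. The following Python sketch is an illustration under the assumption that the resolved config is a dict of lists. The names apply_override and LIST_FIELDS are hypothetical.

```python
LIST_FIELDS = ("dictionaries", "words", "flag_words", "ignore_patterns")


def apply_override(resolved: dict, block: dict) -> dict:
    """Return a new resolved config with one matching override block applied."""
    out = dict(resolved)
    for name in LIST_FIELDS:
        if name in block:
            current = list(block[name])          # replace: prior value discarded
        else:
            current = list(out.get(name, []))    # absent: value carried over
        current.extend(block.get("extra_" + name, []))  # append, duplicates kept
        out[name] = current
    return out


base = {"words": ["alpha", "beta"], "dictionaries": ["en_us"]}

replaced = apply_override(base, {"paths": ["**/*.md"], "words": ["gamma"]})
print(replaced["words"])  # ['gamma']

appended = apply_override(base, {"paths": ["**/*.md"], "extra_words": ["gamma"]})
print(appended["words"])  # ['alpha', 'beta', 'gamma']

both = apply_override(
    base, {"paths": ["**/*.md"], "words": ["gamma"], "extra_words": ["delta"]}
)
print(both["words"])         # ['gamma', 'delta']
print(both["dictionaries"])  # ['en_us']
```

Returning a fresh dict instead of mutating the input keeps the base config reusable for the next file, which is one reasonable design choice among several.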
Interaction with ignore_paths
The top-level ignore_paths is evaluated before overrides. If a file matches ignore_paths, it is not checked and no overrides are evaluated for it.
Overrides cannot "un-ignore" a file that was excluded by ignore_paths.
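A minimal Python sketch of this gate follows. It tracks only words and extra_words to stay short. The names words_for and glob_match are hypothetical, and the matcher is a simplified stand-in (`*`, `**` and `?` only), not Codebook's glob engine.

```python
import re


def glob_match(pattern: str, path: str) -> bool:
    """Simplified stand-in matcher (assumption): supports *, ** and ? only."""
    rx = re.escape(pattern)
    rx = rx.replace(r"\*\*/", "(?:[^/]+/)*")
    rx = rx.replace(r"\*\*", ".*")
    rx = rx.replace(r"\*", "[^/]*")
    rx = rx.replace(r"\?", "[^/]")
    return re.fullmatch(rx, path) is not None


def words_for(rel_path: str, config: dict):
    """Return the resolved words, or None when the file is ignored."""
    if any(glob_match(p, rel_path) for p in config.get("ignore_paths", [])):
        return None  # ignored: the overrides below are never consulted
    words = list(config.get("words", []))
    for block in config.get("overrides", []):
        if any(glob_match(p, rel_path) for p in block["paths"]):
            words.extend(block.get("extra_words", []))
    return words


config = {
    "words": ["codebook"],
    "ignore_paths": ["target/**/*"],
    "overrides": [{"paths": ["**/*.rs"], "extra_words": ["hack"]}],
}
print(words_for("target/debug/build.rs", config))  # None
print(words_for("src/lib.rs", config))             # ['codebook', 'hack']
```

The first call shows that an override matching `**/*.rs` has no effect on a file under target/, because the ignore check returns before any override is looked at.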
Interaction with Global Config
When use_global = true, the resolution order is:
- Global config base settings.
- Project config base settings override global (replace semantics for all fields).
- Global config [[overrides]] are applied (in order).
- Project config [[overrides]] are applied (in order).
This means project overrides always take final precedence.
If use_global = false, global config (including its overrides) is ignored entirely.
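The layering can be sketched as a function that produces the base settings and the ordered list of override blocks to walk. This is a Python illustration only. The name layer_configs is hypothetical, and it assumes that a field missing from the project config falls back to the global value, which is one reading of "replace semantics for all fields".

```python
FIELDS = ("dictionaries", "words", "flag_words", "ignore_patterns")


def layer_configs(global_cfg: dict, project_cfg: dict):
    """Return (base_settings, override_blocks_in_application_order)."""
    if not project_cfg["use_global"]:
        # Global config, including its overrides, is ignored entirely.
        base = {f: list(project_cfg.get(f, [])) for f in FIELDS}
        return base, list(project_cfg.get("overrides", []))

    base = {}
    for f in FIELDS:
        # A project field, when present, replaces the global one wholesale.
        source = project_cfg if f in project_cfg else global_cfg
        base[f] = list(source.get(f, []))
    # Global overrides first, project overrides last, so project blocks win.
    blocks = list(global_cfg.get("overrides", [])) + list(project_cfg.get("overrides", []))
    return base, blocks


global_cfg = {
    "words": ["globalword"],
    "flag_words": ["todo"],
    "overrides": [{"paths": ["**/*.md"], "extra_words": ["fromglobal"]}],
}
project_cfg = {
    "use_global": True,
    "words": ["projectword"],
    "overrides": [{"paths": ["**/*.md"], "extra_words": ["fromproject"]}],
}

base, blocks = layer_configs(global_cfg, project_cfg)
print(base["words"])       # ['projectword']
print(base["flag_words"])  # ['todo']
print([b["extra_words"] for b in blocks])  # [['fromglobal'], ['fromproject']]
```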
Validation and Errors
Warnings (config loads, invalid overrides are skipped)
- paths is missing from an [[overrides]] block: block is skipped.
- paths is empty (paths = []): block is skipped.
- paths contains an invalid glob pattern: block is skipped.
- An unrecognized field is present in an [[overrides]] block: block is skipped.
- An [[overrides]] block has paths but no other fields (no-op override).
- A pattern in paths doesn't match any files in the project (optional, may be expensive to check).
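The checks above could be implemented roughly as follows. This Python sketch is an assumption-laden illustration: validate_override and is_valid_glob are hypothetical names, the glob check only verifies balanced braces as a stand-in for a real parser, and the optional "matches no files" warning is left out because it needs a walk of the project tree.

```python
KNOWN_FIELDS = {
    "paths",
    "dictionaries", "words", "flag_words", "ignore_patterns",
    "extra_dictionaries", "extra_words", "extra_flag_words", "extra_ignore_patterns",
}


def is_valid_glob(pattern: str) -> bool:
    """Stand-in validity check (assumption): braces must be balanced."""
    depth = 0
    for ch in pattern:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def validate_override(block: dict):
    """Return (keep_block, warnings). The config still loads in every case."""
    paths = block.get("paths")
    if paths is None:
        return False, ["paths is missing; block skipped"]
    if not paths:
        return False, ["paths is empty; block skipped"]
    bad = [p for p in paths if not is_valid_glob(p)]
    if bad:
        return False, [f"invalid glob pattern {bad[0]!r}; block skipped"]
    unknown = sorted(set(block) - KNOWN_FIELDS)
    if unknown:
        return False, [f"unrecognized field {unknown[0]!r}; block skipped"]
    if set(block) == {"paths"}:
        return True, ["block has paths but no other fields (no-op override)"]
    return True, []


print(validate_override({"extra_words": ["x"]}))
# (False, ['paths is missing; block skipped'])
print(validate_override({"paths": ["**/*.md"], "extra_words": ["x"]}))
# (True, [])
```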
Full Example
# Base configuration
dictionaries = ["en_us"]
words = ["codebook", "rustc", "serde"]
flag_words = ["todo", "fixme"]
ignore_paths = ["target/**/*", ".git/**/*"]
ignore_patterns = ["\\b[0-9a-fA-F]{40}\\b"] # ignore git SHAs
use_global = true
# Markdown files: add en_gb dictionary, allow prose-specific words
[[overrides]]
paths = ["**/*.md", "**/*.mdx"]
extra_dictionaries = ["en_gb"]
extra_words = ["frontmatter", "callout", "codeblock"]
# Rust files: flag additional words
[[overrides]]
paths = ["**/*.rs"]
extra_flag_words = ["hack", "unwrap", "xxx"]
extra_ignore_patterns = ["r#\".*\"#"]
# Test files: more relaxed, allow test-specific jargon
[[overrides]]
paths = ["**/tests/**/*", "**/*_test.*", "**/*.test.*"]
extra_words = ["mock", "stub", "fixture", "parameterized"]
# Docs: replace the whole dictionary set for a multilingual project
[[overrides]]
paths = ["docs/de/**/*"]
dictionaries = ["de"]
extra_words = ["codebook"]
Non-Goals (out of scope for v1)
- Negated globs (e.g., !**/*.test.md). Users can work around this with ordering.
- remove_words or subtraction semantics. Can be added later if needed.
- Per-override ignore_paths. Overrides scope settings to paths, but don't add new path exclusions.
- Inheritance between overrides. Each override resolves against the base config plus all prior matching overrides, not against a named parent.
- Regex-based path matching. Globs are sufficient and safer.