virtiofs: Fix issue with getdents #2661
base: main
Conversation
Pull request overview
This pull request fixes a critical bug in virtiofs directory enumeration where entries could be skipped when files were deleted between getdents calls. It implements a caching mechanism with stable offsets and also fixes a reference counting issue with hard links.
Changes:
- Implements a sliding-window cache (32 entries) for directory enumeration with stable offsets that survive file deletions
- Adds an inc_lookup() method to VirtioFsInode for incrementing the lookup count without path updates
- Fixes hard link reference counting by incrementing the lookup count when returning an existing inode
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vm/devices/virtio/virtiofs/src/file.rs | Implements DirEntryCursor with stable offset caching, refactors read_dir to use caching mechanism, adds comprehensive unit tests |
| vm/devices/virtio/virtiofs/src/inode.rs | Adds inc_lookup() method for manual lookup count increment |
| vm/devices/virtio/virtiofs/src/lib.rs | Fixes hard link handling to increment lookup count for existing inodes |
| vm/devices/virtio/virtiofs/Cargo.toml | Adds arrayvec dependency for fixed-size cache |
| Cargo.lock | Updates lock file for arrayvec dependency |
Pull request overview
Copilot reviewed 4 out of 5 changed files in this pull request and generated no new comments.
This PR modifies files containing unsafe code. For more on why we check whole files, instead of just diffs, check out the Rustonomicon.
    // Cache this entry.
    self.entries.push(CachedDirEntry {
        offset: next_offset,
why are the offsets all +1 from the real one?
I was trying to preserve the getdents semantics, where the offset is the value to pass back to resume enumeration after that entry, rather than the entry's 0-based index. I saw it being done here, so I wanted to maintain consistency.
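To make the convention concrete, here is a minimal sketch (hypothetical CachedDirEntry fields, not the PR's actual code): each cached entry stores the offset at which enumeration should resume after that entry, i.e. its 0-based index plus one, mirroring how getdents reports d_off.

```rust
/// Hypothetical cached entry: `offset` is the value the guest passes back to
/// resume enumeration *after* this entry, i.e. the entry's 0-based index + 1.
struct CachedDirEntry {
    offset: u64,
    name: String,
}

fn cache_entries(names: &[&str]) -> Vec<CachedDirEntry> {
    names
        .iter()
        .enumerate()
        .map(|(i, name)| CachedDirEntry {
            // Offset 0 means "start of directory", so the first entry's
            // resume offset is 1, matching getdents d_off semantics.
            offset: i as u64 + 1,
            name: name.to_string(),
        })
        .collect()
}

fn main() {
    let entries = cache_entries(&["a", "b", "c"]);
    // Resuming at offset 1 skips "a" and yields "b" and "c".
    for e in entries.iter().filter(|e| e.offset > 1) {
        println!("{} at offset {}", e.name, e.offset);
    }
}
```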
Pull request overview
Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.
    }
    // Serve entries from cache.
    loop {
        let entries: Vec<_> = self.cursor.entries_from(requested_offset).to_vec();
Copilot AI, Feb 2, 2026
The call to to_vec creates an unnecessary clone of all cached directory entries. This could be optimized by restructuring the code to avoid the need for cloning. For example, you could check needs_more and refill before getting the entries slice, or use indices instead of cloning the slice.
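One way to apply this suggestion, sketched below with hypothetical cursor types (the PR's DirEntryCursor API may differ), is to refill the cache before borrowing the slice of cached entries, so the immutable borrow never overlaps the mutation and no to_vec() clone is needed.

```rust
/// Hypothetical cursor over cached directory entries; the real
/// DirEntryCursor in the PR may differ.
struct Cursor {
    /// (stable resume offset, file name)
    entries: Vec<(u64, String)>,
}

impl Cursor {
    /// Cached entries after the requested resume offset.
    fn entries_from(&self, offset: u64) -> &[(u64, String)] {
        let start = self.entries.partition_point(|(o, _)| *o <= offset);
        &self.entries[start..]
    }

    /// Whether the cache must be refilled to satisfy this resume offset.
    fn needs_more(&self, offset: u64) -> bool {
        self.entries_from(offset).is_empty()
    }

    /// Placeholder for re-enumerating the host directory.
    fn refill(&mut self) {}
}

fn serve(cursor: &mut Cursor, requested_offset: u64) {
    // Refill *before* taking the slice, so the immutable borrow of the
    // cached entries never overlaps the mutation and no to_vec() clone
    // is needed to satisfy the borrow checker.
    if cursor.needs_more(requested_offset) {
        cursor.refill();
    }
    for (off, name) in cursor.entries_from(requested_offset) {
        println!("serving {name} at offset {off}");
    }
}

fn main() {
    let mut cursor = Cursor {
        entries: vec![(1, "a".into()), (2, "b".into()), (3, "c".into())],
    };
    serve(&mut cursor, 2);
}
```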
    // Increment the lookup count since we're returning an entry for this inode.
    // The kernel will send a forget for this entry later.
    target_inode.inc_lookup();
Copilot AI, Feb 2, 2026
This change appears to fix a reference counting bug for hard links, which is unrelated to the getdents caching issue described in the PR. While the fix looks correct (hard links need to increment the lookup count when returning an existing inode), it would be clearer if this were either mentioned in the PR description or submitted as a separate fix.
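For context, here is a minimal sketch of the lookup-count bookkeeping the fix relies on (hypothetical Inode type; the real VirtioFsInode tracks more state): every LOOKUP reply, including one that returns an already-known inode for a hard link, must be balanced by a later FORGET from the kernel, so returning an existing inode without bumping the count would let it be released too early.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical inode; the real VirtioFsInode tracks more state.
struct Inode {
    lookup_count: AtomicU64,
}

impl Inode {
    /// Increment the lookup count without touching any cached path state.
    fn inc_lookup(&self) {
        self.lookup_count.fetch_add(1, Ordering::AcqRel);
    }

    /// FORGET from the guest decrements the count; the inode may only be
    /// released once every outstanding lookup has been forgotten.
    fn forget(&self, n: u64) -> bool {
        self.lookup_count.fetch_sub(n, Ordering::AcqRel) == n
    }
}

fn lookup_hard_link(existing: &Inode) {
    // Returning an existing inode still hands the guest a new reference,
    // so the count must be bumped; the kernel sends a matching FORGET later.
    existing.inc_lookup();
}

fn main() {
    let inode = Inode { lookup_count: AtomicU64::new(1) }; // first LOOKUP
    lookup_hard_link(&inode); // second LOOKUP via a hard link
    assert!(!inode.forget(1)); // one reference still outstanding
    assert!(inode.forget(1)); // now the inode can be released
}
```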
    /// kernel request. By caching 64 entries (2x the kernel buffer), we reduce
    /// the number of repeated guest filesystem enumeration calls that would
    /// be needed when the guest requests small buffers while deleting entries.
    const CACHE_MAX_ENTRIES: usize = 64;
Copilot AI, Feb 2, 2026
The PR description states that the solution caches "32 dirent entries", but CACHE_MAX_ENTRIES is set to 64. While the comment explains that 64 entries equals 2x the kernel buffer (which is reasonable), this discrepancy between the PR description and the implementation should be clarified or corrected.
In WSL we have the following test case, which has been failing when run against our virtiofs implementation: https://github.com/microsoft/WSL/blob/116f8d6bf5257b9c5ddef141d4ae6946ef38a4e3/test/linux/unit_tests/lxtfs.c#L807.
The test case enumerates a large directory with getdents64 while deleting the entries it has already read. It fails in the following manner:
1. Initial state: 500 files (positions 0-499).
2. First getdents64 call: reads entries 0-99, returns offset 100.
3. The guest deletes the files at positions 0-99.
4. Second getdents64 call with offset 100: Windows seeks to position 100, but that is now what was originally position 200 (since 100 files were deleted).
5. Result: files 100-199 are skipped entirely (illustrated by the sketch below).
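As a standalone illustration of that position-based skip (using the numbers from the description above; this is not the PR's code):

```rust
/// Sketch of the failure mode with position-based offsets: after deleting the
/// first `read` entries, seeking to position `read` in the live directory
/// lands on what was originally entry `2 * read`.
fn main() {
    let total = 500;
    let read = 100;

    // The host directory after files 0-99 are deleted: entries 100..500.
    let remaining: Vec<usize> = (read..total).collect();

    // A position-based resume at offset 100 indexes into the *shifted* list...
    let resumed_at = remaining[read];
    assert_eq!(resumed_at, 200);

    // ...so the original entries 100-199 are never returned to the guest.
    println!("resumed at original entry {resumed_at}; entries 100-199 skipped");
}
```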
The solution introduced in this PR is to cache 32 dirent entries (which is roughly 4kB worth of data matching what Linux requests). Each entry is assigned a stable offset that doesn't change as files are deleted.
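A minimal sketch of the stable-offset idea (hypothetical names, and a VecDeque in place of the arrayvec-backed cache the PR adds): offsets are assigned once when an entry is cached and are never recomputed from the host directory, so deletions cannot shift them.

```rust
use std::collections::VecDeque;

/// Hypothetical sliding-window cache: each entry keeps the offset it was
/// assigned when first enumerated, so deletions in the host directory do
/// not shift the offsets the guest already holds.
struct DirCache {
    /// (stable resume offset, file name)
    entries: VecDeque<(u64, String)>,
    next_offset: u64,
    max_entries: usize,
}

impl DirCache {
    fn new(max_entries: usize) -> Self {
        Self {
            entries: VecDeque::new(),
            next_offset: 1,
            max_entries,
        }
    }

    /// Cache a newly enumerated entry, evicting the oldest if the window is full.
    fn push(&mut self, name: &str) -> u64 {
        if self.entries.len() == self.max_entries {
            self.entries.pop_front();
        }
        let offset = self.next_offset;
        self.next_offset += 1;
        self.entries.push_back((offset, name.to_string()));
        offset
    }

    /// Resume enumeration after the given offset; stable across deletions.
    fn entries_after(&self, offset: u64) -> impl Iterator<Item = &(u64, String)> {
        self.entries.iter().filter(move |(o, _)| *o > offset)
    }
}

fn main() {
    let mut cache = DirCache::new(32);
    for name in ["file0", "file1", "file2", "file3"] {
        cache.push(name);
    }
    // Even if file0 and file1 are deleted on the host, offset 2 still
    // resumes at file2, because offsets come from the cache rather than
    // from reseeking the (now shifted) host directory.
    for (off, name) in cache.entries_after(2) {
        println!("{name} at offset {off}");
    }
}
```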