Skip to content

Fix use-after-free in IOCP socket service shutdown#131

Open
vinniefalco wants to merge 6 commits intocppalliance:developfrom
vinniefalco:develop
Open

Fix use-after-free in IOCP socket service shutdown#131
vinniefalco wants to merge 6 commits intocppalliance:developfrom
vinniefalco:develop

Conversation

@vinniefalco
Copy link
Member

@vinniefalco vinniefalco commented Feb 11, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Improved resource cleanup for socket connections and TLS operations to prevent memory leaks.
    • Enhanced lock failure handling in OpenSSL and WolfSSL TLS streams with early error propagation.
    • Refined connection completion paths to ensure proper resource release under edge cases.
  • Performance

    • Optimized scheduler operation queue processing for better throughput.
    • Streamlined executor dispatch target resolution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR updates Windows IOCP socket wrapper lifecycle management, completion handling patterns, and TLS stream lock acquisition. Changes defer wrapper cleanup to destructor, isolate operation queue processing, add explicit null checks, and implement error handling for lock failures.

Changes

Cohort / File(s) Summary
IOCP Socket & Scheduler Management
src/corosio/src/detail/iocp/sockets.cpp, src/corosio/src/detail/iocp/scheduler.cpp
Shifted socket/acceptor wrapper cleanup from shutdown to destructor; restructured operation queue handling in do_one to isolate completed operations in local scope before passing to post_deferred_completions; updated restart to reset stop event flag; added early-exit handling when accept operation completes without owner.
TLS Stream Lock Acquisition
src/openssl/src/openssl_stream.cpp, src/wolfssl/src/wolfssl_stream.cpp
Changed lock acquisition from single return value to destructured pair [lec, guard]; added early-exit patterns on lock failure (non-zero lec); wolfssl updates propagate errors across read/write/renegotiation/shutdown paths with current_op_ reset.
Utility Improvements
src/corosio/src/detail/dispatch_coro.hpp, src/corosio/src/detail/intrusive.hpp
Explicit null comparison in executor target check; defensive linkage clearing (nullptr assignment) on popped intrusive list/queue nodes to prevent stale pointer misuse.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • sgerbino

Poem

🐰 Wrappers now rest in the destructor's care,
Operations queue-dance through morning air,
Locks check their status before they proceed,
Lists scrub their pointers of lingering greed,
Safe IOCP journeys with deferred cleanup freed! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main objective of the changeset: fixing use-after-free issues in IOCP socket service shutdown, which is the primary focus of the substantial refactoring in sockets.cpp and scheduler.cpp.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into develop

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/corosio/src/detail/iocp/sockets.cpp (1)

662-677: Double lock acquisition for create_impl — consider consolidating.

create_impl acquires mutex_ twice: once to add the internal to socket_list_ (line 666-668) and once to add the wrapper to socket_wrapper_list_ (line 672-674). Between the two locks, new win_socket_impl(...) runs unlocked, which is fine for correctness. But a single lock scope covering both insertions would be simpler and avoid the small window where the internal is registered but the wrapper isn't.

Same pattern exists in create_acceptor_impl (lines 804-814).

♻️ Suggested consolidation
 win_socket_impl&
 win_sockets::
 create_impl()
 {
     auto internal = std::make_shared<win_socket_impl_internal>(*this);
-
-    {
-        std::lock_guard<win_mutex> lock(mutex_);
-        socket_list_.push_back(internal.get());
-    }
-
     auto* wrapper = new win_socket_impl(std::move(internal));
 
     {
         std::lock_guard<win_mutex> lock(mutex_);
+        socket_list_.push_back(wrapper->get_internal());
         socket_wrapper_list_.push_back(wrapper);
     }
 
     return *wrapper;
 }

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Feb 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.97%. Comparing base (03fef8f) to head (4713aa8).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #131      +/-   ##
===========================================
- Coverage    80.98%   80.97%   -0.01%     
===========================================
  Files           64       64              
  Lines         5721     5724       +3     
===========================================
+ Hits          4633     4635       +2     
- Misses        1088     1089       +1     
Files with missing lines Coverage Δ
src/corosio/src/detail/dispatch_coro.hpp 100.00% <100.00%> (ø)
src/corosio/src/detail/intrusive.hpp 98.27% <100.00%> (+0.09%) ⬆️
src/openssl/src/openssl_stream.cpp 77.31% <ø> (ø)
src/wolfssl/src/wolfssl_stream.cpp 77.25% <ø> (ø)

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 03fef8f...4713aa8. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cppalliance-bot
Copy link

cppalliance-bot commented Feb 11, 2026

An automated preview of the documentation is available at https://131.corosio.prtest3.cppalliance.org/index.html

If more commits are pushed to the pull request, the docs will rebuild at the same URL.

2026-02-14 03:00:40 UTC

@cppalliance-bot
Copy link

cppalliance-bot commented Feb 11, 2026

GCOVR code coverage report https://131.corosio.prtest3.cppalliance.org/gcovr/index.html
LCOV code coverage report https://131.corosio.prtest3.cppalliance.org/genhtml/index.html
Coverage Diff Report https://131.corosio.prtest3.cppalliance.org/diff-report/index.html

Build time: 2026-02-14 03:09:13 UTC

@vinniefalco vinniefalco changed the title Add when_all async_event test for io_context Fix use-after-free in IOCP socket service shutdown Feb 12, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/corosio/src/detail/iocp/sockets.cpp`:
- Around line 620-633: The destructor currently deletes nodes returned by
socket_wrapper_list_.pop_front() and acceptor_wrapper_list_.pop_front(), but
those popped nodes keep stale next_/prev_ pointers that later cause list
corruption when their destructor calls unregister_impl() -> remove(); fix by
ensuring pop_front() clears the popped node's linkage (set node->next_ =
node->prev_ = nullptr) so deleted nodes are clearly unlinked, or alternatively
harden remove() in intrusive.hpp to detect and ignore already-unlinked nodes
(e.g., verify a node is linked to this list before dereferencing prev_/next_);
target the pop_front implementation and/or remove/unregister_impl functions to
implement this defensive reset/check.

@vinniefalco vinniefalco force-pushed the develop branch 2 times, most recently from 16f8d83 to fcf1067 Compare February 13, 2026 13:35
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/wolfssl/src/wolfssl_stream.cpp`:
- Around line 874-876: In do_shutdown, the SHUTDOWN_NOT_DONE paths that call
auto [lec, guard] = co_await io_cm_.scoped_lock() currently break without
propagating the lock error (lec) into ec, so callers can observe success despite
a lock failure; update both occurrences (the scoped_lock() branches around the
SHUTDOWN_NOT_DONE logic) to assign ec = lec before the break to match the
WOLFSSL_SUCCESS path, or if the discard is intentional, add a clear comment in
do_shutdown explaining why lock errors are suppressed there; reference the
scoped_lock call, the lec variable, and the ec return code when making the
change.

During execution_context destruction, win_sockets::shutdown() was
eagerly deleting socket wrapper objects while the scheduler still
had pending IOCP operations referencing them. When win_scheduler::
shutdown() later drained those operations and destroyed coroutine
frames, the tcp_socket destructors inside those frames would call
release() on already-freed wrappers, corrupting the heap.

The fix follows the same pattern Asio uses in
win_iocp_socket_service_base::base_shutdown(): shutdown() now only
closes socket handles to force pending I/O to complete; wrapper
deletion is deferred to ~win_sockets(), which runs after the
scheduler has finished draining all outstanding operations.

Also fixes accept_op's destroy path to properly release its
pre-allocated accepted_socket and peer_wrapper before destroying
the coroutine frame.
post_deferred_completions() was called with completed_ops_ by
reference. When PostQueuedCompletionStatus failed, the recovery
path did completed_ops_.splice(ops) where ops aliased
completed_ops_, corrupting the intrusive queue and permanently
losing operations. Leaked ops never decrement outstanding_work_
so run() hangs.

Drain completed_ops_ into a local queue before calling
post_deferred_completions, matching the epoll/kqueue backends.
Reset stop_event_posted_ flag in restart() so a stopped IOCP
scheduler can run again. Set linger(true, 0) on server sockets
in accept churn benchmarks to avoid TIME_WAIT accumulation.
…tch_coro

Remove the duplicate outstanding_work_ zero-check from do_one()
which could race with run()'s own shutdown logic and cause premature
exits. Add explicit nullptr comparison in dispatch_coro to silence
compiler warnings. Improve GQCS timeout comment.
Adapt openssl_stream and wolfssl_stream to the new scoped_lock()
signature that returns (error_code, guard) via structured bindings,
propagating lock-acquisition errors instead of ignoring them. Update
tcp_server to use the renamed get_current_frame_allocator().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants