Add trace ID generation and propagation middleware#98
Add trace ID generation and propagation middleware#98Suhaibinator wants to merge 8 commits intomainfrom
Conversation
Suhaibinator
commented
Dec 1, 2025
- Use stack-allocated buffer in generateUUID() instead of heap allocation
- Replace slice reslicing with index-based batch consumption to avoid slice header garbage
- Replace time.Sleep with runtime.Gosched() for more efficient CPU yielding in background worker
- Use stack-allocated buffer in generateUUID() instead of heap allocation - Replace slice reslicing with index-based batch consumption to avoid slice header garbage - Replace time.Sleep with runtime.Gosched() for more efficient CPU yielding in background worker
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| case g.idChan <- generateUUID(): | ||
| // Successfully added a new UUID | ||
| default: | ||
| // Channel is full, sleep longer to save CPU | ||
| time.Sleep(1 * time.Millisecond) // 1ms instead of 10ms | ||
| // Channel is full, yield to scheduler | ||
| runtime.Gosched() |
There was a problem hiding this comment.
Idle generator now busy-spins on full buffer
When idChan is already full (the normal steady state), the background loop hits the default arm and only calls runtime.Gosched(). Gosched merely yields and immediately re-enters the loop, so the goroutine tight-spins and burns a CPU core instead of blocking; the previous code slept for 1 ms here to avoid thrashing. This regression will peg CPU whenever traffic is light and the buffer stays full. Consider restoring a blocking sleep/backoff when the channel is full.
Useful? React with 👍 / 👎.
The previous change replaced time.Sleep with runtime.Gosched in the "channel full" case, which causes a tight spin loop burning CPU when traffic is light. Restore the 1ms sleep for the full channel case while keeping Gosched for the active refill path where responsiveness matters.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #98 +/- ##
==========================================
- Coverage 97.06% 93.84% -3.22%
==========================================
Files 18 18
Lines 2382 1819 -563
==========================================
- Hits 2312 1707 -605
- Misses 57 95 +38
- Partials 13 17 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Add tests to cover `case <-g.stop: return` in both batch fill and
normal fill select statements
- Remove unreachable `default: goto doneAdding` case from batch fill
loop (the `if len == size { break }` check prevents this from ever
triggering since we're the only producer)
- Update TestIDGeneratorStopDuringNormalFill to actively drain while
keeping channel above batch threshold
- Add TestIDGeneratorBatchFillChannelFull to verify batch fill behavior
Fix data race in timeout middleware by: 1. Using atomic CompareAndSwap in writeJSONError to ensure only one goroutine (timeout or handler) writes the response 2. Holding the mutex during the entire write operation to prevent concurrent access to the response writer 3. Waiting for the handler goroutine to finish in the timeout case before returning, preventing races between test reads and handler writes The race occurred when: - Handler won the CAS race and started writing - Timeout's writeJSONError returned early (CAS failed) - timeoutMiddleware returned without waiting for handler - Test read rr.Code while handler was still writing
There was a problem hiding this comment.
Pull request overview
This PR adds trace ID generation and propagation middleware with several performance optimizations including stack-allocated buffers in UUID generation, index-based batch consumption to reduce garbage, and replacement of time.Sleep with runtime.Gosched() for more efficient CPU yielding.
Key Changes
- Refactored timeout middleware error handling to use writeJSONError with atomic write claiming
- Added optimized batch UUID generation with index-based consumption instead of slice reslicing
- Replaced fixed sleep intervals with runtime.Gosched() for more responsive scheduler yielding
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/router/router.go | Simplified timeout error handling by delegating synchronization to writeJSONError; added atomic CAS logic to writeJSONError to prevent race conditions |
| pkg/router/register_generic_route_test.go | Changed test from immediate fallthrough to explicit timeout case for better test clarity |
| pkg/middleware/trace_stop_test.go | Added comprehensive test coverage for stop behavior in different scenarios (batch fill, normal fill, channel full) |
| pkg/middleware/trace.go | Optimized UUID generation with stack-allocated buffer, replaced slice reslicing with index-based batch consumption, replaced time.Sleep with runtime.Gosched() |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| batchIndex++ | ||
| } | ||
| if len(g.idChan) == g.size { | ||
| break | ||
| } | ||
| } |
There was a problem hiding this comment.
The batch fill select statement is missing a default case. When the channel is full (or nearly full), the g.idChan <- batchUUIDs[batchIndex] case will block indefinitely, preventing the loop from exiting even when len(g.idChan) == g.size is true. This can cause the goroutine to hang.
Add a default case to handle when the channel is full:
select {
case <-g.stop:
return
case g.idChan <- batchUUIDs[batchIndex]:
batchIndex++
default:
// Channel is full, exit batch fill loop
goto exitBatchFill
}Then add a label after the loop: exitBatchFill:
| batchIndex++ | |
| } | |
| if len(g.idChan) == g.size { | |
| break | |
| } | |
| } | |
| batchIndex++ | |
| default: | |
| // Channel is full, exit batch fill loop | |
| goto exitBatchFill | |
| } | |
| if len(g.idChan) == g.size { | |
| break | |
| } | |
| } | |
| exitBatchFill: |
pkg/middleware/trace_stop_test.go
Outdated
| // TestIDGeneratorBatchFillChannelFull tests the default case in batch fill loop. | ||
| // This covers `default: goto doneAdding` when channel fills during batch insertion. |
There was a problem hiding this comment.
The test comment claims to cover default: goto doneAdding when the channel fills during batch insertion, but the actual implementation at lines 86-93 of trace.go does not have a default case in the batch fill select statement. This test will not behave as documented because the code it claims to test doesn't exist.
Either:
- Remove this test since the corresponding code doesn't exist, or
- Update the implementation to add the missing
defaultcase as documented