Skip to content

Add trace ID generation and propagation middleware#98

Open
Suhaibinator wants to merge 8 commits intomainfrom
claude/add-trace-id-middleware-01CeCEjewz3TG69EUELhUG6V
Open

Add trace ID generation and propagation middleware#98
Suhaibinator wants to merge 8 commits intomainfrom
claude/add-trace-id-middleware-01CeCEjewz3TG69EUELhUG6V

Conversation

@Suhaibinator
Copy link
Owner

  • Use stack-allocated buffer in generateUUID() instead of heap allocation
  • Replace slice reslicing with index-based batch consumption to avoid slice header garbage
  • Replace time.Sleep with runtime.Gosched() for more efficient CPU yielding in background worker

- Use stack-allocated buffer in generateUUID() instead of heap allocation
- Replace slice reslicing with index-based batch consumption to avoid
  slice header garbage
- Replace time.Sleep with runtime.Gosched() for more efficient CPU
  yielding in background worker
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 104 to 113
case g.idChan <- generateUUID():
// Successfully added a new UUID
default:
// Channel is full, sleep longer to save CPU
time.Sleep(1 * time.Millisecond) // 1ms instead of 10ms
// Channel is full, yield to scheduler
runtime.Gosched()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Idle generator now busy-spins on full buffer

When idChan is already full (the normal steady state), the background loop hits the default arm and only calls runtime.Gosched(). Gosched merely yields and immediately re-enters the loop, so the goroutine tight-spins and burns a CPU core instead of blocking; the previous code slept for 1 ms here to avoid thrashing. This regression will peg CPU whenever traffic is light and the buffer stays full. Consider restoring a blocking sleep/backoff when the channel is full.

Useful? React with 👍 / 👎.

The previous change replaced time.Sleep with runtime.Gosched in the
"channel full" case, which causes a tight spin loop burning CPU when
traffic is light. Restore the 1ms sleep for the full channel case while
keeping Gosched for the active refill path where responsiveness matters.
@codecov
Copy link

codecov bot commented Dec 1, 2025

Codecov Report

❌ Patch coverage is 53.38983% with 55 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.84%. Comparing base (519a7e3) to head (101c807).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
pkg/router/router.go 46.31% 46 Missing and 5 partials ⚠️
pkg/metrics/handler_method.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #98      +/-   ##
==========================================
- Coverage   97.06%   93.84%   -3.22%     
==========================================
  Files          18       18              
  Lines        2382     1819     -563     
==========================================
- Hits         2312     1707     -605     
- Misses         57       95      +38     
- Partials       13       17       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Suhaibinator and others added 3 commits December 1, 2025 07:50
- Add tests to cover `case <-g.stop: return` in both batch fill and
  normal fill select statements
- Remove unreachable `default: goto doneAdding` case from batch fill
  loop (the `if len == size { break }` check prevents this from ever
  triggering since we're the only producer)
- Update TestIDGeneratorStopDuringNormalFill to actively drain while
  keeping channel above batch threshold
- Add TestIDGeneratorBatchFillChannelFull to verify batch fill behavior
Fix data race in timeout middleware by:
1. Using atomic CompareAndSwap in writeJSONError to ensure only one
   goroutine (timeout or handler) writes the response
2. Holding the mutex during the entire write operation to prevent
   concurrent access to the response writer
3. Waiting for the handler goroutine to finish in the timeout case
   before returning, preventing races between test reads and handler
   writes

The race occurred when:
- Handler won the CAS race and started writing
- Timeout's writeJSONError returned early (CAS failed)
- timeoutMiddleware returned without waiting for handler
- Test read rr.Code while handler was still writing
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds trace ID generation and propagation middleware with several performance optimizations including stack-allocated buffers in UUID generation, index-based batch consumption to reduce garbage, and replacement of time.Sleep with runtime.Gosched() for more efficient CPU yielding.

Key Changes

  • Refactored timeout middleware error handling to use writeJSONError with atomic write claiming
  • Added optimized batch UUID generation with index-based consumption instead of slice reslicing
  • Replaced fixed sleep intervals with runtime.Gosched() for more responsive scheduler yielding

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
pkg/router/router.go Simplified timeout error handling by delegating synchronization to writeJSONError; added atomic CAS logic to writeJSONError to prevent race conditions
pkg/router/register_generic_route_test.go Changed test from immediate fallthrough to explicit timeout case for better test clarity
pkg/middleware/trace_stop_test.go Added comprehensive test coverage for stop behavior in different scenarios (batch fill, normal fill, channel full)
pkg/middleware/trace.go Optimized UUID generation with stack-allocated buffer, replaced slice reslicing with index-based batch consumption, replaced time.Sleep with runtime.Gosched()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 92 to 97
batchIndex++
}
if len(g.idChan) == g.size {
break
}
}
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The batch fill select statement is missing a default case. When the channel is full (or nearly full), the g.idChan <- batchUUIDs[batchIndex] case will block indefinitely, preventing the loop from exiting even when len(g.idChan) == g.size is true. This can cause the goroutine to hang.

Add a default case to handle when the channel is full:

select {
case <-g.stop:
    return
case g.idChan <- batchUUIDs[batchIndex]:
    batchIndex++
default:
    // Channel is full, exit batch fill loop
    goto exitBatchFill
}

Then add a label after the loop: exitBatchFill:

Suggested change
batchIndex++
}
if len(g.idChan) == g.size {
break
}
}
batchIndex++
default:
// Channel is full, exit batch fill loop
goto exitBatchFill
}
if len(g.idChan) == g.size {
break
}
}
exitBatchFill:

Copilot uses AI. Check for mistakes.
Comment on lines 108 to 109
// TestIDGeneratorBatchFillChannelFull tests the default case in batch fill loop.
// This covers `default: goto doneAdding` when channel fills during batch insertion.
Copy link

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test comment claims to cover default: goto doneAdding when the channel fills during batch insertion, but the actual implementation at lines 86-93 of trace.go does not have a default case in the batch fill select statement. This test will not behave as documented because the code it claims to test doesn't exist.

Either:

  1. Remove this test since the corresponding code doesn't exist, or
  2. Update the implementation to add the missing default case as documented

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants