feat: add observability for security agent #58
Merged
jeanduplessis merged 14 commits into main on Feb 9, 2026
Addresses Finding #15 (HIGH: No Operational Observability) from the security agent production readiness review. Lays out a 5-phase plan covering correlation IDs, structured logging, LLM call timing/token tracking, cron heartbeats, sync metrics, pipeline instrumentation, and degradation detection — all using existing codebase infrastructure (emitApiMetrics, sentryLogger, Sentry spans, BetterStack heartbeats). https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
…kflows

Implements all 5 phases of the observability plan (Finding #15):

Phase 1 - Correlation ID & Structured Logging:
- Generate correlationId (UUID) at analysis start, thread through all tiers
- Store correlationId in SecurityFindingAnalysis JSONB for queryability
- Replace ~76 console.log/error calls with sentryLogger (dual console+Sentry)
- Wrap startSecurityAnalysis in Sentry withScope for tag propagation

Phase 2 - LLM Call Timing & Token Tracking:
- Wrap triage and extraction LLM calls in Sentry startSpan (op: ai.inference)
- Extract token usage from sendProxiedChatCompletion responses
- Emit metrics via emitApiMetrics with mode security-agent-triage/extraction
- Track input/output tokens as span attributes

Phase 3 - Cron Heartbeats & Sync Metrics:
- Add BetterStack heartbeat support to both cron jobs (env-configurable URLs)
- Send /fail heartbeat on sync errors
- Add per-repository sync timing in syncDependabotAlertsForRepo
- Track GitHub API rate limits via x-ratelimit-remaining headers

Phase 4 - Pipeline Timing & R2 Retry Instrumentation:
- Wrap processAnalysisStream in Sentry span (op: ai.pipeline)
- Track stream duration, R2 retry attempts, and retry wait time
- Log tier transition timing (Tier 1 duration)
- Record stream outcome status on span attributes

Phase 5 - Outcome Distribution & Degradation Detection:
- Add Sentry breadcrumbs for triage/extraction outcomes with isFallback flag
- Track auto-dismiss decisions with correlationId and source
- Add stale analysis anomaly detection (warn when count > threshold)
- Log bulk auto-dismiss summaries

https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
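To make Phases 1 and 2 concrete, here is a minimal sketch of one correlation ID being generated at analysis start and attached to the timing and token metrics of each LLM call. It does not use Sentry or emitApiMetrics; `timedLlmCall`, `runAnalysis`, `MetricsSink`, and the stubbed LLM results are hypothetical names and values chosen for illustration, and the two mode strings are one reading of "security-agent-triage/extraction".

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical shapes for illustration; the PR's real types are not shown here.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

interface LlmResult<T> {
  value: T;
  usage: TokenUsage;
}

interface CallMetrics extends TokenUsage {
  correlationId: string;
  mode: string;
  durationMs: number;
}

type MetricsSink = (metrics: CallMetrics) => void;

// Times one LLM call and reports its duration and token counts,
// tagged with the correlation ID of the analysis it belongs to.
async function timedLlmCall<T>(
  correlationId: string,
  mode: string,
  call: () => Promise<LlmResult<T>>,
  emit: MetricsSink,
): Promise<T> {
  const start = performance.now();
  const result = await call();
  emit({ correlationId, mode, durationMs: performance.now() - start, ...result.usage });
  return result.value;
}

// Generates one ID at analysis start and reuses it for every tier.
// The two LLM calls are stubbed with fixed results.
async function runAnalysis(emit: MetricsSink): Promise<string> {
  const correlationId = randomUUID();
  await timedLlmCall(
    correlationId,
    "security-agent-triage",
    async () => ({ value: "needs-sandbox", usage: { inputTokens: 120, outputTokens: 30 } }),
    emit,
  );
  await timedLlmCall(
    correlationId,
    "security-agent-extraction",
    async () => ({ value: "exploitable", usage: { inputTokens: 900, outputTokens: 75 } }),
    emit,
  );
  return correlationId;
}
```

Passing the sink in as a parameter keeps the sketch independent of any particular metrics backend; in the PR the equivalent role is played by emitApiMetrics and Sentry span attributes.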
…ementation

- Fix withScope propagation: move withScope inside processAnalysisStream, where background work actually runs, instead of startSecurityAnalysis
- Fix span exception handling: move try/catch inside startSpan callback so span attributes are available on error paths
- Refactor triage, extraction, and auto-dismiss to use options objects instead of growing positional argument lists
- Guard emitApiMetrics calls with O11Y_KILO_GATEWAY_CLIENT_SECRET check to prevent sending metrics with empty client secret
- Derive toolsUsed from actual LLM response tool_calls instead of hardcoding before validation
- Remove unused warn variable in triage-service
- Add try/catch and failure heartbeat to cleanup-stale-analyses cron
- Use consistent performance.now() in sync-service runFullSync
- Use 'cron' source tag for auth warnings in cron routes for consistent Sentry alert routing

https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
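The span exception-handling fix can be illustrated with a small stand-in for a tracing span. This is a sketch, not the Sentry API: `FakeSpan`, this local `startSpan`, `finishedSpans`, the attribute names, and the fallback return value are all assumptions made for the example. The point it shows is that a try/catch placed inside the callback still has the span in scope when the LLM call throws.

```typescript
type AttrValue = string | number | boolean;

// Minimal stand-in for a tracing span (hypothetical, not Sentry's Span type).
class FakeSpan {
  readonly attributes: Record<string, AttrValue> = {};
  constructor(readonly name: string) {}
  setAttribute(key: string, value: AttrValue): void {
    this.attributes[key] = value;
  }
}

// Collects spans once their callback has settled, so they can be inspected.
const finishedSpans: FakeSpan[] = [];

// Stand-in span runner: creates a span, runs the callback, then records the span.
async function startSpan<T>(name: string, callback: (span: FakeSpan) => Promise<T>): Promise<T> {
  const span = new FakeSpan(name);
  try {
    return await callback(span);
  } finally {
    finishedSpans.push(span);
  }
}

// The try/catch lives inside the callback, so the error path can still
// annotate the span before a fallback result is returned.
async function triage(callLlm: () => Promise<string>): Promise<string> {
  return startSpan("security-agent-triage", async (span) => {
    try {
      const outcome = await callLlm();
      span.setAttribute("triage.status", "ok");
      return outcome;
    } catch {
      span.setAttribute("triage.status", "error");
      span.setAttribute("triage.is_fallback", true);
      return "fallback"; // assumed fallback value for this sketch
    }
  });
}
```

With the try/catch wrapped around `startSpan` instead, the catch block would have no reference to the span, and the error outcome could not be recorded on it.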
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Files Reviewed (10 files)
The observability refactor removed console.error statements from parseTriageResult and parseExtractionResult without replacing them, losing visibility into which field validation failed and what the invalid value was. Restore logging using sentryLogger (logError) so failures surface in both console and Sentry. https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
…raction

The observability refactor removed console.log/console.error calls for response validation failures (no choice, no tool call, unexpected tool) and success logging (triage/extraction complete) without replacing them. Restore using sentryLogger so these events surface in both console and Sentry.
https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
The observability refactor replaced the truncated reasoning excerpts with a redundant source field. Restore the reasoning.slice(0, 100) so dismiss logs show *why* the finding was dismissed without needing to look up the full analysis. https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
Restore console statements that were removed without replacement:

analysis-service.ts:
- R2 message fetch debug info (messageCount, lastFewTypes)
- Which message type was selected (completion_result, text, fallback) with messageIndex and contentLength

sync-service.ts:
- Alert count after GitHub fetch
- Finding count after parsing

These are useful for diagnosing pipeline issues (e.g. why an analysis returned no result, or how many alerts a repo actually has).
https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
Three logError calls passed raw `error` as a positional arg instead of
a structured object. sentryLogger puts args into `extra.args[]`, so raw
errors end up as `args[0]` with no key — losing context in Sentry.
Consistently use `{ error }` (and other relevant fields) so Sentry
extra data has named keys.
https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
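The difference between the two call styles can be sketched with a stand-in logger that mimics the behavior described above, where extra arguments are collected into `extra.args[]`. `makeLogError`, `CapturedEvent`, the message text, and the `repo` field are hypothetical; the real sentryLogger may differ in its details.

```typescript
// Hypothetical event shape, modelled on the extra.args[] behavior described in the commit.
interface CapturedEvent {
  message: string;
  extra: { args: unknown[] };
}

// Stand-in logger factory: every argument after the message lands in extra.args.
function makeLogError(sink: CapturedEvent[]) {
  return (message: string, ...args: unknown[]): void => {
    sink.push({ message, extra: { args } });
  };
}

const events: CapturedEvent[] = [];
const logError = makeLogError(events);
const error = new Error("rate limited");

// Before: the raw error is a positional argument, so it is only reachable as args[0].
logError("Sync failed", error);

// After: a structured object gives each value a named key under args[0].
logError("Sync failed", { error, repo: "example/repo" });
```

In the second event, `extra.args[0]` has the keys `error` and `repo`, which is what makes the values identifiable when reading the event later.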
The heartbeat fetch calls are awaited — if BetterStack is slow or unreachable, the cron handler stalls until the platform kills it. Add AbortSignal.timeout(5000) so heartbeats are truly best-effort and never block the response. https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
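A minimal sketch of a best-effort heartbeat with a timeout, assuming a runtime where `fetch` and `AbortSignal.timeout` are available (for example, recent Node.js versions). `sendHeartbeat`, `FetchLike`, and the injectable `fetchImpl` and `timeoutMs` parameters are hypothetical and exist so the behavior can be exercised without a network call; the `/fail` suffix follows the failure heartbeat mentioned earlier in this PR.

```typescript
// Narrow fetch signature so a fake can be injected (hypothetical helper type).
type FetchLike = (url: string, init?: { signal?: AbortSignal }) => Promise<unknown>;

// Sends a heartbeat and never throws: returns true if the request completed,
// false if no URL is configured, the request failed, or the timeout aborted it.
async function sendHeartbeat(
  url: string | undefined,
  failed: boolean,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 5000,
): Promise<boolean> {
  if (!url) return false; // heartbeat not configured
  try {
    await fetchImpl(failed ? `${url}/fail` : url, {
      signal: AbortSignal.timeout(timeoutMs), // aborts the request after timeoutMs
    });
    return true;
  } catch {
    return false; // best-effort: swallow network errors and timeouts
  }
}
```

A cron handler could then `await sendHeartbeat(url, false)` at the end of a successful run and know the call is bounded by the timeout rather than by the platform's kill deadline.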
The options-object refactor dropped the @param documentation from triageSecurityFinding, extractSandboxAnalysis, and maybeAutoDismissAnalysis. Restore them using options.field notation. https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
…ar access

Centralizes SECURITY_SYNC_BETTERSTACK_HEARTBEAT_URL and SECURITY_CLEANUP_BETTERSTACK_HEARTBEAT_URL in @/lib/config.server instead of reading process.env directly in route files.
https://claude.ai/code/session_01H6HahwjayzdFFZXbpE9Hg7
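One way such centralization might look, as a sketch only: the two environment variable names come from the commit above, while `loadHeartbeatConfig`, `HeartbeatConfig`, `readOptional`, and the field names are hypothetical and not taken from `@/lib/config.server`.

```typescript
// Hypothetical config shape; field names are illustrative.
interface HeartbeatConfig {
  syncHeartbeatUrl: string | undefined;
  cleanupHeartbeatUrl: string | undefined;
}

type Env = Record<string, string | undefined>;

// Treats missing, empty, and whitespace-only values as "not configured".
function readOptional(env: Env, name: string): string | undefined {
  const value = env[name]?.trim();
  return value ? value : undefined;
}

// Single place where the heartbeat URLs are read from the environment.
function loadHeartbeatConfig(env: Env = process.env): HeartbeatConfig {
  return {
    syncHeartbeatUrl: readOptional(env, "SECURITY_SYNC_BETTERSTACK_HEARTBEAT_URL"),
    cleanupHeartbeatUrl: readOptional(env, "SECURITY_CLEANUP_BETTERSTACK_HEARTBEAT_URL"),
  };
}
```

Route files would then import the resolved values instead of touching `process.env`, which keeps the variable names and the "unset means disabled" rule in one module.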
eshurakov approved these changes on Feb 9, 2026
Addresses the lack of operational observability in the security agent.