feat: add repo staleness detection and refresh across MCP, API, and CLI #18

Open
eddiefleurent wants to merge 5 commits into HKUDS:main from eddiefleurent:feature/repo-staleness-detection

eddiefleurent commented Feb 28, 2026

Summary

  • Indexed repositories were treated as immutable forever — no git pull, no commit comparison, no TTL. The only way to get fresh data was manually deleting index metadata via delete_repo_metadata and re-triggering indexing.
  • Added lightweight staleness detection (git fetch + SHA comparison) and an explicit refresh flow (git pull + re-index) with feature parity across all three interfaces (MCP, REST API, CLI).
  • The UX approach is non-blocking: code_qa appends a tasteful freshness warning when repos are outdated, nudging the user to call refresh_repo — queries still work against the existing index.
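The detection step can be sketched as a pure comparison of three full SHAs. This is a minimal illustration and not the PR's implementation: the function name `evaluate_freshness` and the exact rule used for `stale` are assumptions. The field names mirror the JSON response shown under "Tested on", and the handling of a missing `indexed_commit` follows the last item of the test plan.

```python
from typing import Optional


def evaluate_freshness(
    indexed_commit: Optional[str],
    current_commit: str,
    remote_commit: str,
) -> dict:
    """Summarize whether an index is outdated by comparing full commit SHAs.

    Hypothetical helper: the rule for `stale` below is an assumption.
    """
    # The remote has updates when its SHA differs from the local HEAD,
    # for example after a `git fetch` brought in new commits.
    has_remote_updates = remote_commit != current_commit

    if indexed_commit is None:
        # Legacy index with no recorded commit: there is nothing to compare
        # against, so report "not stale" instead of failing.
        stale = False
    else:
        stale = indexed_commit != current_commit or has_remote_updates

    return {
        "stale": stale,
        "indexed_commit": indexed_commit,
        "current_commit": current_commit,
        "remote_commit": remote_commit,
        "has_remote_updates": has_remote_updates,
    }
```

Keeping the comparison free of git calls makes the decision easy to unit-test; the fetch and the SHA lookups would happen in the caller.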

Changes

| File | Change |
| --- | --- |
| fastcode/loader.py | New methods: get_head_commit(), check_for_updates() (git fetch + SHA compare), pull_updates() (git pull); fixed hexsha[:8] → hexsha to prevent stale false-positives |
| fastcode/vector_store.py | save() now persists indexed_commit SHA in metadata pickle; new get_indexed_commit() reader |
| fastcode/main.py | index_repository() and load_multiple_repositories() pass commit SHA to save(); new check_repo_for_updates() and refresh_repository() orchestration methods |
| mcp_server.py | _ensure_repos_ready() checks freshness on indexed repos and collects warnings; code_qa appends "Repository freshness" section when stale; new tools: check_repo_freshness, refresh_repo |
| api.py | New endpoints: POST /check-repo-freshness, POST /refresh-repo |
| web_app.py | New endpoints: POST /api/check-repo-freshness, POST /api/refresh-repo |
| main.py (CLI) | New commands: check-freshness --repos <names>, refresh <repo_name> |

Test plan

  • Index a repo via MCP code_qa, then push a commit to the remote — subsequent code_qa call should show a freshness warning in the response
  • Call check_repo_freshness tool — should report the repo as OUTDATED with commit SHAs
  • Call refresh_repo tool — should pull new commits and re-index, reporting old/new commit
  • Call refresh_repo again immediately — reports already up-to-date, reindexed: false, no redundant re-index
  • Verify POST /check-repo-freshness and POST /refresh-repo return correct JSON via REST API
  • Verify check-freshness and refresh CLI commands work end-to-end
  • Verify old indexes (without indexed_commit in metadata) still load and work — staleness check reports indexed_commit: null and stale: false rather than crashing

Bug found and fixed during testing

loader.py was storing hexsha[:8] (short SHA) at index time, but all comparison code uses full hexsha. This caused indexed_commit != current_commit to always be true, reporting every repo as stale even when fully up-to-date. Fixed in the final commit by removing the [:8] slice.
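The mismatch is easy to reproduce in isolation. The snippet below is an illustrative sketch, not code from loader.py: `looks_stale` is a hypothetical stand-in for the comparison, and the SHA is the one from the test run reported below.

```python
full_sha = "2e929df112e7746af2fdfb5b6491f3c19b24015a"

# Before the fix: only the first 8 characters were stored at index time.
stored_before_fix = full_sha[:8]
# After the fix: the full 40-character SHA is stored.
stored_after_fix = full_sha


def looks_stale(indexed_commit: str, current_commit: str) -> bool:
    """Hypothetical stand-in for the equality check used for staleness."""
    return indexed_commit != current_commit


# An 8-character string can never equal a 40-character one, so the repo is
# flagged as stale even though nothing changed.
print(looks_stale(stored_before_fix, full_sha))  # True

# With the full SHA stored, an unchanged repo compares equal.
print(looks_stale(stored_after_fix, full_sha))  # False
```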

Tested on

Self-hosted FastCode instance running as systemd services. Confirmed against a live repo:

POST /check-repo-freshness  (after fresh index)
{
  "stale": false,
  "indexed_commit": "2e929df112e7746af2fdfb5b6491f3c19b24015a",
  "current_commit": "2e929df112e7746af2fdfb5b6491f3c19b24015a",
  "remote_commit": "2e929df112e7746af2fdfb5b6491f3c19b24015a",
  "has_remote_updates": false
}

eddiefleurent and others added 5 commits February 28, 2026 07:50
…s kwarg

- FastMCP defaults to 127.0.0.1 which prevents remote connections when
  running SSE transport. Default to 0.0.0.0 and allow override via
  FASTMCP_HOST / FASTMCP_PORT env vars.
- mcp.run() does not accept sse_params kwarg (raises TypeError). Remove it.
  The --port arg was always a no-op since FastMCP is instantiated at module
  level before args are parsed; use FASTMCP_PORT env var instead.

Indexed repositories were treated as immutable: no git pull, no commit
comparison, no way to detect upstream changes. The only refresh path was
manually deleting index metadata.

This adds lightweight staleness detection (git fetch + SHA comparison)
and an explicit refresh flow (git pull + re-index) with feature parity
across all three interfaces (MCP, REST API, CLI).

MCP: code_qa now appends a freshness warning when repos are outdated,
nudging the user to call the new refresh_repo tool. Also adds
check_repo_freshness for explicit checks.

API: POST /check-repo-freshness and POST /refresh-repo endpoints
in both api.py and web_app.py.

CLI: `check-freshness` and `refresh` commands.

Core: VectorStore persists the indexed commit SHA in the metadata
pickle. RepositoryLoader gains get_head_commit, check_for_updates,
and pull_updates methods. FastCode gains check_repo_for_updates
and refresh_repository orchestration methods.

Made-with: Cursor

hexsha[:8] was stored at index time, but the full hexsha was used everywhere
else, causing the equality check to always fail and report repos as stale.

Python evaluates type annotations at import time, so Pydantic models must
be defined before the route functions that reference them. Moving the two
classes up to the models section fixes the NameError on startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Move repo_info, available_repositories, and loaded_repositories fields
from the RefreshRepoRequest model to the StatusResponse model for more
appropriate schema organization.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
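
The startup NameError described in the commit notes above (models defined after the route functions that reference them) can be reproduced without any web framework. This is a hedged sketch: `route` is a hypothetical stand-in for a decorator that reads a handler's annotations at registration time, and a plain class stands in for the Pydantic model. Only the name `RefreshRepoRequest` comes from the PR.

```python
def route(func):
    """Hypothetical stand-in for a framework decorator that reads the
    handler's annotations as soon as the handler is registered."""
    _ = func.__annotations__
    return func


# The model is defined after the handler that names it in an annotation.
MODEL_AFTER_ROUTE = """
@route
def refresh_repo(request: RefreshRepoRequest):
    return request

class RefreshRepoRequest:
    pass
"""

# The model is defined first, as in the fix.
MODEL_BEFORE_ROUTE = """
class RefreshRepoRequest:
    pass

@route
def refresh_repo(request: RefreshRepoRequest):
    return request
"""


def module_loads(source: str) -> bool:
    """Execute the source as if it were a module; report whether it loads."""
    namespace = {"route": route}
    try:
        exec(source, namespace)
    except NameError:
        return False
    return True


print(module_loads(MODEL_AFTER_ROUTE))   # False
print(module_loads(MODEL_BEFORE_ROUTE))  # True
```

Defining the models ahead of the handlers, as the commit does, avoids the failure without changing how annotations are written.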