adds daily task to update global bot leaderboard, refactors command for convenience #3933

Open

lsabor wants to merge 17 commits into main from feat/global-bot-leaderboard-daily-updating

Conversation


@lsabor lsabor commented Dec 14, 2025

closes #3808

This task takes ~20 minutes to run, so it might be worth considering optimizing the approach.
The only step that takes a long time is the head-to-head scoring for popular questions. The simplest solution would be to store all of the head-to-head scores at question resolution in a new table. This would take the task from upwards of 30 minutes (and occasional memory issues) down to 1-2 minutes.
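For illustration, a minimal sketch of what such a table could look like (the model, field names, and app labels are hypothetical, not part of this PR):

# Hypothetical schema for persisting head-to-head scores at resolution time;
# names are illustrative only.
from django.db import models


class HeadToHeadScore(models.Model):
    user1 = models.ForeignKey("users.User", models.CASCADE, related_name="+")
    user2 = models.ForeignKey("users.User", models.CASCADE, related_name="+")
    question = models.ForeignKey("questions.Question", models.CASCADE)
    score = models.FloatField()
    coverage = models.FloatField()
    created_at = models.DateTimeField(auto_now_add=True)  # written once at resolution

    class Meta:
        unique_together = ("user1", "user2", "question")

The daily task would then read precomputed rows instead of recomputing pairwise scores.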

Summary by CodeRabbit

  • New Features
    • Global Bot Leaderboard now automatically updates daily at 5:00 AM UTC.
  • Improvements
    • Incremental caching and partial-update processing for faster, more efficient leaderboard refreshes.
    • Enhanced reliability and error handling during leaderboard generation.
    • Additional processing telemetry and statistical evaluations to improve monitoring and diagnostics.

lsabor (Contributor, author) commented:

The only change here is factoring out the handle action so it can be called from a task.

@lsabor lsabor requested a review from hlbmtc December 14, 2025 20:10
bootstrap_iterations = 30

# SETUP: users to evaluate & questions
print("Initializing...")

Prints typically won't be visible in Django command handlers, so I suggest using logging instead
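A minimal sketch of the suggested pattern (the message text is illustrative):

import logging

logger = logging.getLogger(__name__)

# replaces print("Initializing...") so output reaches the configured log handlers
logger.info("Initializing global bot leaderboard update...")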

Comment on lines 614 to 616
metadata__bot_details__metac_bot=True,
metadata__bot_details__include_in_calculations=True,
metadata__bot_details__display_in_leaderboard=True,

It looks like this might be a slow operation since we don't index the User.metadata column
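If the lookup turns out to be a bottleneck, one option on Postgres is a GIN index on the JSONField; a sketch, assuming metadata is a JSONField (the model fragment is illustrative only):

from django.contrib.postgres.indexes import GinIndex
from django.db import models


class User(models.Model):  # fragment for illustration only
    metadata = models.JSONField(default=dict)

    class Meta:
        indexes = [
            # A jsonb_ops GIN index accelerates containment (__contains) lookups;
            # the chained key-path filters above would need rewriting to benefit,
            # e.g. filter(metadata__contains={"bot_details": {"metac_bot": True}}).
            GinIndex(fields=["metadata"]),
        ]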

Comment on lines 631 to 637
related_posts__post__default_project_id__in=[
3349, # aib q3 2024
32506, # aib q4 2024
32627, # aib q1 2025
32721, # aib q2 2025
32813, # aib fall 2025
]

Is there a way to refactor this so we don’t hardcode these values here?
At the very least, I’d move these IDs into constants.py.
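A sketch of that refactor (the constant name is illustrative):

# scoring/constants.py
AIB_TOURNAMENT_PROJECT_IDS = [
    3349,  # aib q3 2024
    32506,  # aib q4 2024
    32627,  # aib q1 2025
    32721,  # aib q2 2025
    32813,  # aib fall 2025
]

# illustrative usage at the call site
from scoring.constants import AIB_TOURNAMENT_PROJECT_IDS

questions = questions.filter(
    related_posts__post__default_project_id__in=AIB_TOURNAMENT_PROJECT_IDS
)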

.exclude(related_posts__post__default_project__slug__startswith="minibench")

As I mentioned here, I don't like this slug approach

#3925 (comment)


coderabbitai bot commented Jan 23, 2026

📝 Walkthrough

Adds a daily cron job to run a new runner that updates the Global Bot Leaderboard; the leaderboard update logic was refactored from a Django Command to a callable runner with caching and incremental processing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Cron Job Scheduling: misc/management/commands/cron.py | Imports update_gobal_bot_leaderboard from scoring.jobs and registers a daily cron job at 0 5 * * *, wrapping the call with close_old_connections and assigning job id update_gobal_bot_leaderboard. |
| Job Orchestration Layer: scoring/jobs.py | Adds update_gobal_bot_leaderboard(), which queries for the "Global Bot Leaderboard", logs if missing, and calls run_update_global_bot_leaderboard() inside exception handling. |
| Leaderboard Update Logic: scoring/management/commands/update_global_bot_leaderboard.py | Replaces the Command-driven flow with run_update_global_bot_leaderboard(cache_use="partial"); reworks the gather_data() signature and caching behavior (full/partial CSV caching), adds per-question caching/skipping, timestamped score rows, incremental CSV persistence, expanded preprocessing and filtering, and additional statistical instrumentation. |

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler as Cron Scheduler
    participant Conn as close_old_connections
    participant Jobs as Jobs Layer
    participant Update as run_update_global_bot_leaderboard
    participant DB as Leaderboard DB

    Scheduler->>Scheduler: Trigger at 05:00 UTC
    Scheduler->>Conn: invoke wrapper(update_gobal_bot_leaderboard)
    Conn->>Jobs: execute update_gobal_bot_leaderboard()
    Jobs->>DB: query "Global Bot Leaderboard"
    DB-->>Jobs: leaderboard object or None
    alt leaderboard found
        Jobs->>Update: call run_update_global_bot_leaderboard(cache_use)
        Update->>Update: gather_data() (CSV cache read/write)
        Update->>Update: compute per-question scores, timestamps, incremental cache
        Update->>DB: update leaderboard entries
        DB-->>Update: confirmation
        Update-->>Jobs: success
    else not found
        Jobs-->>Jobs: log warning and exit
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 At dawn I nibble lines of code,
Five AM bells wake the leaderboard road,
Cached crumbs kept, new scores take flight,
Bots hop up ranks by morning light!

🚥 Pre-merge checks | ✅ 4 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and accurately summarizes the main changes: adding a daily task to update the global bot leaderboard and refactoring the command. |
| Linked Issues check | ✅ Passed | The PR implements the primary objective from #3808 by scheduling a daily cron job (0 5 * * *) and refactoring the command with caching logic to address performance concerns. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the daily updating and command refactoring objectives; no unrelated modifications detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch feat/global-bot-leaderboard-daily-updating


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@misc/management/commands/cron.py`:
- Around lines 212-218: The two scheduled jobs update_custom_leaderboards and update_gobal_bot_leaderboard are both added via scheduler.add_job using CronTrigger.from_crontab("0 5 * * *") and will run concurrently; change one cron expression to a staggered time (e.g., set update_gobal_bot_leaderboard to "0 6 * * *") so the long-running update_gobal_bot_leaderboard does not overlap with update_custom_leaderboards, keeping id, max_instances, and replace_existing handling the same.

In `@scoring/jobs.py`:
- Line 19: Rename the misspelled function update_gobal_bot_leaderboard to
update_global_bot_leaderboard and update every reference to it (including the
cron job ID and any imports) so names stay consistent; specifically change the
import and usage in misc/management/commands/cron.py to import
update_global_bot_leaderboard and update the cron registration/ID that currently
uses the typo, then run a quick search for update_gobal_bot_leaderboard to
replace remaining occurrences.

In `@scoring/management/commands/update_global_bot_leaderboard.py`:
- Around line 805-821: The loop over ordered_skills is causing an N+1 DB hit via
User.objects.get(id=uid); fix by collecting all integer uids used in that loop
(where uid is not a str) and doing a single query like
User.objects.filter(id__in=collected_ids) to build a mapping uid_to_username,
then inside the loop replace User.objects.get(id=uid).username with
uid_to_username.get(uid, "<missing>") (keeping the existing behavior for str
uids where username = uid) so no per-iteration queries occur; update references
in the same function where unevaluated and player_stats are used to read from
the prebuilt mapping instead of calling User.objects.get.
- Around line 655-665: The per-user loop triggers an N+1 query by calling
Score.objects.filter(user=user,
question__in=question_list).distinct("question_id").count() for each user;
instead run a single aggregated query like Score.objects.filter(user__in=users,
question__in=question_list).values("user").annotate(count=Count("question",
distinct=True)) to get distinct question counts per user, then populate
scored_question_counts from that result (referencing scored_question_counts,
users, question_list and the Score model) rather than counting inside the loop.
- Around line 726-731: The current use of Leaderboard.objects.get_or_create(...)
with multiple fields risks creating duplicates when non-lookup fields change;
change the logic to call Leaderboard.objects.get_or_create(name="Global Bot
Leaderboard") to perform lookup by name only, then set or update the remaining
attributes (project=Project.objects.get(type=Project.ProjectTypes.SITE_MAIN),
score_type=LeaderboardScoreTypes.MANUAL,
bot_status=Project.BotLeaderboardStatus.BOTS_ONLY) on the returned leaderboard
instance and call save() so existing records are updated instead of creating
duplicates; ensure you still handle the created flag appropriately from
Leaderboard.objects.get_or_create.
- Around line 743-747: The loop performs N+1 queries by calling
User.objects.get(id=uid) for each integer uid; instead, before iterating gather
all integer uids used in that loop, fetch users in bulk via
User.objects.filter(id__in=ids) (or User.objects.in_bulk(ids)) in the
update_global_bot_leaderboard logic, build a mapping from id to User, and
replace the per-iteration User.objects.get call with a lookup like
users_by_id[uid] (handle missing users gracefully); this removes the
per-iteration DB hit while preserving the existing checks on
user.metadata["bot_details"] and display_in_leaderboard.
- Around line 618-648: The Post.default_project foreign key lacks an index and
is used heavily in joins (see filters/excludes on
related_posts__post__default_project__* in update_global_bot_leaderboard), so
add db_index=True to the ForeignKey declaration for default_project on the Post
model, create and commit the generated migration (makemigrations), and run/apply
the migration (migrate) to ensure the DB has the new index; this will speed up
joins against Project.default_permission and related filters.
♻️ Duplicate comments (4)
scoring/management/commands/update_global_bot_leaderboard.py (4)

144-180: Hardcoded project IDs create maintenance burden and coupling.

These hardcoded project IDs (lines 146-152, 155-159) and the mapping logic are fragile - any new AIB project requires code changes. This was already flagged in a previous review.


627-633: Hardcoded AIB project IDs duplicated from earlier in the file.

These IDs (3349, 32506, 32627, 32721, 32813) duplicate lines 146-152. This was flagged in a previous review - consider extracting to constants.


639-639: Slug-based exclusion approach was flagged in previous review.

The .exclude(related_posts__post__default_project__slug__startswith="minibench") approach was already discussed as problematic.


778-892: Debug/test output in production code.

The extensive print statements for results display (lines 779-835) and statistical tests (lines 843-892) appear to be debug/development artifacts. In a cron job context, these won't be visible and add execution overhead. The previous review flagged that prints won't be visible in Django command handlers.

Consider using logging at DEBUG level or removing these diagnostic outputs for production runs.

🧹 Nitpick comments (4)
scoring/management/commands/update_global_bot_leaderboard.py (2)

107-141: Default parameter change from False to True may have unintended side effects.

Changing cache: bool = True means all callers now default to caching behavior. This could cause stale data issues if the cache file (HtH_score_data.csv) exists but contains outdated information. The code also writes to a file in the current working directory, which may vary depending on execution context (cron job vs manual invocation).

Consider:

  1. Using an absolute path or Django's MEDIA_ROOT/temp directory for cache files
  2. Adding cache invalidation logic (e.g., timestamp-based expiry)
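A sketch combining both suggestions (SCORE_CACHE_DIR is a hypothetical setting, not an existing one):

import time
from pathlib import Path

from django.conf import settings

CACHE_TTL_SECONDS = 7 * 24 * 3600  # example TTL: re-process anything older than a week


def cache_path(name: str = "HtH_score_data.csv") -> Path:
    # resolve a deterministic directory instead of the worker's CWD
    base = Path(getattr(settings, "SCORE_CACHE_DIR", settings.BASE_DIR)) / "score_cache"
    base.mkdir(parents=True, exist_ok=True)
    return base / name


def cache_is_fresh(path: Path) -> bool:
    # whole-file TTL; a per-row timestamp would allow finer-grained invalidation
    return path.exists() and (time.time() - path.stat().st_mtime) < CACHE_TTL_SECONDS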

710-718: Consider adding strict=True to zip() for safety.

Static analysis flagged line 714's zip() call without strict=. While the lists should be the same length by construction, adding strict=True would catch bugs if they diverge.

♻️ Proposed fix
-    for u1id, u2id, qid in zip(user1_ids, user2_ids, question_ids):
+    for u1id, u2id, qid in zip(user1_ids, user2_ids, question_ids, strict=True):
scoring/jobs.py (2)

20-25: Existence check queries the leaderboard but doesn't use the result.

The function fetches global_bot_leaderboard to check existence but never uses it - run_update_global_bot_leaderboard() will perform its own get_or_create. This is defensive but slightly wasteful. Consider either:

  1. Removing the check since run_update_global_bot_leaderboard handles creation
  2. Passing the leaderboard instance to avoid duplicate queries

26-29: Use logger.exception() instead of logger.error() to capture stack trace.

Per static analysis, logger.exception() automatically includes the traceback, which is essential for debugging failures in a cron job.

♻️ Proposed fix
     try:
         run_update_global_bot_leaderboard()
     except Exception as e:
-        logger.error(f"Error updating Global Bot Leaderboard: {e}")
+        logger.exception("Error updating Global Bot Leaderboard")
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c2c0e37 and 55af6c6.

📒 Files selected for processing (3)
  • misc/management/commands/cron.py
  • scoring/jobs.py
  • scoring/management/commands/update_global_bot_leaderboard.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-15T19:29:58.940Z
Learnt from: hlbmtc
Repo: Metaculus/metaculus PR: 4075
File: authentication/urls.py:24-26
Timestamp: 2026-01-15T19:29:58.940Z
Learning: In this codebase, DRF is configured to use IsAuthenticated as the default in REST_FRAMEWORK['DEFAULT_PERMISSION_CLASSES'] within metaculus_web/settings.py. Therefore, explicit permission_classes([IsAuthenticated]) decorators are unnecessary on DRF views unless a view needs to override the default. When reviewing Python files, verify that views relying on the default are not redundantly decorated, and flag cases where permissions are being over-specified or when a non-default permission is explicitly required.

Applied to files:

  • scoring/management/commands/update_global_bot_leaderboard.py
  • misc/management/commands/cron.py
  • scoring/jobs.py
🧬 Code graph analysis (3)
scoring/management/commands/update_global_bot_leaderboard.py (3)
scoring/models.py (3)
  • Score (17-56)
  • Leaderboard (97-280)
  • LeaderboardEntry (302-354)
projects/models.py (4)
  • Project (212-495)
  • BotLeaderboardStatus (248-252)
  • save (404-439)
  • save (603-608)
scoring/constants.py (1)
  • LeaderboardScoreTypes (17-57)
misc/management/commands/cron.py (1)
scoring/jobs.py (1)
  • update_gobal_bot_leaderboard (19-29)
scoring/jobs.py (2)
scoring/management/commands/update_global_bot_leaderboard.py (1)
  • run_update_global_bot_leaderboard (603-892)
scoring/models.py (1)
  • Leaderboard (97-280)
🪛 Ruff (0.14.13)
scoring/management/commands/update_global_bot_leaderboard.py

714-714: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


900-900: Unused method argument: args

(ARG002)


900-900: Unused method argument: options

(ARG002)

scoring/jobs.py

28-28: Do not catch blind exception: Exception

(BLE001)


29-29: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build Docker Image / Build Docker Image
  • GitHub Check: Backend Checks
  • GitHub Check: integration-tests
🔇 Additional comments (5)
scoring/management/commands/update_global_bot_leaderboard.py (3)

189-191: Early return for cached questions improves performance.

This optimization correctly skips questions that are already in the cache, reducing redundant processing time.


603-614: Refactoring to a standalone function is a good pattern for cron integration.

Exposing run_update_global_bot_leaderboard() as a standalone callable makes it reusable from both the Django management command and the scheduled job. The baseline player ID (236038) is still hardcoded though - consider making it configurable or documenting why this specific user is the baseline.
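A sketch of making the baseline configurable (the setting name is hypothetical):

from django.conf import settings

# falls back to the currently hardcoded baseline user
BASELINE_PLAYER_ID = getattr(settings, "GLOBAL_BOT_LEADERBOARD_BASELINE_ID", 236038)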


895-901: Command class correctly delegates to the runner function.

The thin wrapper pattern is appropriate. The unused args and options parameters flagged by static analysis are standard Django Command signatures and can be safely ignored.

misc/management/commands/cron.py (2)

29-29: Import inherits the typo from scoring/jobs.py.

Update this import if/when the function name is corrected in scoring/jobs.py.


212-218: Job configuration follows established patterns correctly.

The job uses close_old_connections wrapper, max_instances=1, and replace_existing=True consistent with other scoring jobs. Good implementation.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +212 to +218
scheduler.add_job(
    close_old_connections(update_gobal_bot_leaderboard),
    trigger=CronTrigger.from_crontab("0 5 * * *"),  # Every day at 05:00 UTC
    id="update_gobal_bot_leaderboard",
    max_instances=1,
    replace_existing=True,
)

⚠️ Potential issue | 🟡 Minor

Scheduling conflict: both update_custom_leaderboards and update_gobal_bot_leaderboard run at 05:00 UTC.

Both jobs are scheduled at the same time (lines 207 and 214). Since update_gobal_bot_leaderboard takes ~20 minutes as noted in the PR description, running them concurrently could cause resource contention. Consider staggering the times.

♻️ Suggested fix - stagger to 06:00 UTC
         scheduler.add_job(
             close_old_connections(update_gobal_bot_leaderboard),
-            trigger=CronTrigger.from_crontab("0 5 * * *"),  # Every day at 05:00 UTC
+            trigger=CronTrigger.from_crontab("0 6 * * *"),  # Every day at 06:00 UTC
             id="update_gobal_bot_leaderboard",
             max_instances=1,
             replace_existing=True,
         )

logger = logging.getLogger(__name__)


def update_gobal_bot_leaderboard():

⚠️ Potential issue | 🟡 Minor

Typo in function name: update_gobal_bot_leaderboard should be update_global_bot_leaderboard.

The function name has a typo ("gobal" vs "global"). This typo propagates to the cron job ID and import. Fix this before it becomes a maintenance headache.

♻️ Proposed fix
-def update_gobal_bot_leaderboard():
+def update_global_bot_leaderboard():

Also update the import in misc/management/commands/cron.py accordingly.


Comment on lines 618 to 648
questions: QuerySet[Question] = (
    Question.objects.filter(
        Q(
            related_posts__post__default_project__default_permission__in=[
                "viewer",
                "forecaster",
            ]
        )
        | Q(
            related_posts__post__default_project_id__in=[
                3349,  # aib q3 2024
                32506,  # aib q4 2024
                32627,  # aib q1 2025
                32721,  # aib q2 2025
                32813,  # aib fall 2025
            ]
        ),
        related_posts__post__curation_status=Post.CurationStatus.APPROVED,
        resolution__isnull=False,
        scheduled_close_time__lte=timezone.now(),
    )
    .exclude(related_posts__post__default_project__slug__startswith="minibench")
    .exclude(resolution__in=UnsuccessfulResolutionType)
    .filter(Exists(user_forecast_exists))
    .prefetch_related(  # only prefetch forecasts from those users
        Prefetch(
            "user_forecasts", queryset=Forecast.objects.filter(author__in=users)
        )
    )
    .order_by("id")
    .distinct("id")
)

###############
# make sure they have at least 100 resolved questions
print("initialize list")
question_list = list(questions)
print("Filtering users.")
scored_question_counts: dict[int, int] = defaultdict(int)
c = users.count()
i = 0
for user in users:
    i += 1
    print(i, "/", c, end="\r")
    scored_question_counts[user.id] = (
        Score.objects.filter(user=user, question__in=question_list)
        .distinct("question_id")
        .count()
    )
excluded_ids = [
    uid for uid, count in scored_question_counts.items() if count < 100
]
users = users.exclude(id__in=excluded_ids)
###############
print("Initializing... DONE")

# Gather head to head scores
user1_ids, user2_ids, question_ids, scores, weights = gather_data(
    users, questions
)

# choose baseline player if not already chosen
if not baseline_player:
    baseline_player = max(
        set(user1_ids) | set(user2_ids), key=(user1_ids + user2_ids).count
    )
# get variance of average scores (used in rescaling)
avg_scores = get_avg_scores(user1_ids, user2_ids, scores, weights)
var_avg_scores = (
    np.var(np.array(list(avg_scores.values()))) if len(avg_scores) > 1 else 0
)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Verification scripts were run against posts/models.py, questions/models.py, projects/models.py, and the quoted query to check index coverage for curation_status, default_project, default_permission, and the related_posts relation (script transcripts omitted).

Add database index on Post.default_project FK for query join performance.

The query filters on multiple join paths through related_posts__post__default_project__ including default_permission and default_project_id, plus traverses this FK in exclude clauses. While Post.curation_status has db_index=True and Project.default_permission is indexed, the critical Post.default_project ForeignKey field itself lacks an explicit index. Since this job processes many questions and takes ~20 minutes, adding db_index=True to the default_project FK definition will improve join performance on these high-cardinality traversals.


Comment on lines 655 to 665
scored_question_counts: dict[int, int] = defaultdict(int)
c = users.count()
i = 0
for user in users:
    i += 1
    print(i, "/", c, end="\r")
    scored_question_counts[user.id] = (
        Score.objects.filter(user=user, question__in=question_list)
        .distinct("question_id")
        .count()
    )

⚠️ Potential issue | 🟠 Major

N+1 query pattern: counting scores per user in a loop.

This loop executes a separate Score.objects.filter(...).count() query for each user, creating an N+1 query problem. With many bot users, this compounds the already slow execution time.

♻️ Suggested optimization using annotation
-    scored_question_counts: dict[int, int] = defaultdict(int)
-    c = users.count()
-    i = 0
-    for user in users:
-        i += 1
-        print(i, "/", c, end="\r")
-        scored_question_counts[user.id] = (
-            Score.objects.filter(user=user, question__in=question_list)
-            .distinct("question_id")
-            .count()
-        )
+    from django.db.models import Count
+    scored_question_counts = dict(
+        Score.objects.filter(user__in=users, question__in=question_list)
+        .values("user_id")
+        .annotate(q_count=Count("question_id", distinct=True))
+        .values_list("user_id", "q_count")
+    )

Comment on lines +726 to +731
leaderboard, _ = Leaderboard.objects.get_or_create(
    name="Global Bot Leaderboard",
    project=Project.objects.get(type=Project.ProjectTypes.SITE_MAIN),
    score_type=LeaderboardScoreTypes.MANUAL,
    bot_status=Project.BotLeaderboardStatus.BOTS_ONLY,
)

⚠️ Potential issue | 🟡 Minor

get_or_create may create duplicate leaderboards if other fields vary.

Using get_or_create with name, project, score_type, and bot_status means if any of these fields change, a new leaderboard is created rather than updating the existing one. Consider using get_or_create with only name as the lookup, then updating other fields explicitly.

♻️ Safer approach
-    leaderboard, _ = Leaderboard.objects.get_or_create(
-        name="Global Bot Leaderboard",
-        project=Project.objects.get(type=Project.ProjectTypes.SITE_MAIN),
-        score_type=LeaderboardScoreTypes.MANUAL,
-        bot_status=Project.BotLeaderboardStatus.BOTS_ONLY,
-    )
+    leaderboard, created = Leaderboard.objects.get_or_create(
+        name="Global Bot Leaderboard",
+        defaults={
+            "project": Project.objects.get(type=Project.ProjectTypes.SITE_MAIN),
+            "score_type": LeaderboardScoreTypes.MANUAL,
+            "bot_status": Project.BotLeaderboardStatus.BOTS_ONLY,
+        },
+    )

Comment on lines 743 to 747
if isinstance(uid, int):
    user = User.objects.get(id=uid)
    bot_details = user.metadata["bot_details"]
    if not bot_details.get("display_in_leaderboard"):
        excluded = True

⚠️ Potential issue | 🟠 Major

N+1 query: fetching User inside loop.

User.objects.get(id=uid) is called inside the loop for every integer uid. This creates unnecessary database queries.

♻️ Prefetch users before the loop
+    user_map = {u.id: u for u in User.objects.filter(id__in=[uid for uid, _ in ordered_skills if isinstance(uid, int)])}
     for uid, skill in ordered_skills:
         contribution_count = len(player_stats[uid][1])

         excluded = False
         if isinstance(uid, int):
-            user = User.objects.get(id=uid)
+            user = user_map.get(uid)
+            if not user:
+                continue
             bot_details = user.metadata["bot_details"]

Comment on lines 805 to 821
for uid, skill in ordered_skills:
    if isinstance(uid, str):
        username = uid
    else:
        username = User.objects.get(id=uid).username
    unevaluated.remove(uid)
    lower = ci_lower.get(uid, 0)
    upper = ci_upper.get(uid, 0)
    print(
        "| 2.5% "
        "| Skill "
        "| 97.5% "
        "| Match "
        "| Quest. "
        "| ID "
        "| Username "
        f"| {round(lower, 2):>6} "
        f"| {round(skill, 2):>6} "
        f"| {round(upper, 2):>6} "
        f"| {player_stats[uid][0]:>6} "
        f"| {len(player_stats[uid][1]):>6} "
        f"| {uid if isinstance(uid, int) else '':>6} "
        f"| {username}"
    )
⚠️ Potential issue | 🟡 Minor

Another N+1 query pattern in display loop.

User.objects.get(id=uid) is called again inside this display loop. Same issue as noted above - prefetch all needed users beforehand.



github-actions bot commented Jan 23, 2026

🚀 Preview Environment

Your preview environment is ready!

| Resource | Details |
| --- | --- |
| 🌐 Preview URL | https://metaculus-pr-3933-feat-global-bot-leaderboard-da-preview.mtcl.cc |
| 📦 Docker Image | ghcr.io/metaculus/metaculus:feat-global-bot-leaderboard-daily-updating-a0c679e |
| 🗄️ PostgreSQL | NeonDB branch preview/pr-3933-feat-global-bot-leaderboard-da |
| Redis | Fly Redis mtc-redis-pr-3933-feat-global-bot-leaderboard-da |

Details

  • Commit: 72071a1bf9110a4947f336a0123440f53a869a7b
  • Branch: feat/global-bot-leaderboard-daily-updating
  • Fly App: metaculus-pr-3933-feat-global-bot-leaderboard-da

ℹ️ Preview Environment Info

Isolation:

  • PostgreSQL and Redis are fully isolated from production
  • Each PR gets its own database branch and Redis instance
  • Changes pushed to this PR will trigger a new deployment

Limitations:

  • Background workers and cron jobs are not deployed in preview environments
  • If you need to test background jobs, use Heroku staging environments

Cleanup:

  • This preview will be automatically destroyed when the PR is closed

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@scoring/jobs.py`:
- Around line 20-30: The guard in update_gobal_bot_leaderboard is ineffective
because run_update_global_bot_leaderboard internally calls
Leaderboard.objects.get_or_create, so the pre-check is a no-op; decide one
contract and make both call sites match: either remove the existence check in
update_gobal_bot_leaderboard and let run_update_global_bot_leaderboard handle
creation, or change run_update_global_bot_leaderboard to accept a Leaderboard
instance (pass the queried global_bot_leaderboard) and remove get_or_create
there. Also replace logger.error with logger.exception in
update_gobal_bot_leaderboard to preserve the traceback when catching exceptions
from run_update_global_bot_leaderboard.

In `@scoring/management/commands/update_global_bot_leaderboard.py`:
- Around line 144-157: The CSV partial cache (HtH_score_data.csv) never expires,
so update the read/write logic around cache_use, cached_question_ids and
question_ids to include a cache-busting field and TTL check: when writing rows
add a header/column (e.g., cache_version or written_timestamp) and when reading
in the block that builds cached_question_ids skip or ignore rows whose
cache_version doesn't match the current version or whose written_timestamp is
older than a configurable TTL (re-queue those question IDs into question_ids for
reprocessing); ensure any writer path updates the version/timestamp and that the
reader reconstructs cached_question_ids only from valid, non-stale rows.
- Around line 131-151: The code currently uses a relative
Path("HtH_score_data.csv") (seen as csv_path) in both the reader and the cache
creation branch (cache_use == "partial"); change this to resolve a deterministic
directory from configuration (e.g., a Django setting or env var like
SCORE_CACHE_DIR or settings.MEDIA_ROOT/BASE_DIR var), then build the file path
as Path(config_dir) / "HtH_score_data.csv", ensure the directory exists
(mkdir(parents=True, exist_ok=True)) before opening the file, and update both
the reader and writer sites to use that csv_path so the cache is persisted to a
stable location across deployments and instances.
- Around line 874-877: Remove the stray breakpoint() inside the loop in
update_global_bot_leaderboard.py (the block where skip is set and the code does
"if skip: breakpoint(); continue") because it can hang the cron job; replace the
breakpoint() with a non-blocking action such as logging a warning or debug
message that includes identifying info about the bot/match (use the module
logger or logging.getLogger(__name__) and include variables that indicate why
skip was true), then keep the continue to skip processing.
- Around line 59-62: In the update_global_bot_leaderboard management command
replace the leftover breakpoint() in the coverage == 0 branch with a
non-blocking failure path: remove the debugger call and emit a clear log message
(use the module/class logger or the command's self.stdout/self.stderr) that
includes context (e.g., bot identifier and coverage value), then return None (or
otherwise exit the function cleanly) so the daily cron job does not hang; update
the branch around the coverage variable in the function handling leaderboard
updates accordingly.
- Line 302: Add a null guard before calling .timestamp() on
question.actual_close_time to avoid AttributeError: check if
question.actual_close_time is not None (e.g., if question.actual_close_time:)
before using question.actual_close_time.timestamp() in both places where
timestamps.append(question.actual_close_time.timestamp()) appears; update the
code paths that build the timestamps list (the occurrences that append to
timestamps) to skip or handle questions with actual_close_time == None.
🧹 Nitpick comments (5)
scoring/management/commands/update_global_bot_leaderboard.py (5)

122-129: _deserialize_user is defined identically twice.

Extract it to module scope to avoid duplication.

Proposed fix
+def _deserialize_user(value: str) -> int | str:
+    value = value.strip()
+    if not value:
+        return value
+    try:
+        return int(value)
+    except ValueError:
+        return value
+
+
 def gather_data(

Then remove both inner definitions and reference the module-level function.

Also applies to: 333-340


711-790: ~80 lines of commented-out code should be removed or tracked as a TODO/issue.

This commented block adds noise and makes the file harder to maintain. If this logic is needed later, it can be recovered from version control.


1067-1117: Statistical diagnostics run unconditionally on every cron invocation.

Shapiro-Wilk, Anderson-Darling, and KS tests are computed and printed to stdout on every daily run. This is development-time analysis that adds compute overhead and log noise in production. Consider gating behind a verbose flag or moving to logger.debug.


249-265: Commented-out query and TODO without tracking.

Lines 249–251 have a commented-out query, and Line 264 has a TODO to grab the aggregate at spot scoring time. These should be tracked as issues so they don't get forgotten.

Would you like me to open issues to track these TODOs?


661-704: print() statements throughout — consider using logging for the cron path.

Since run_update_global_bot_leaderboard is now called from a scheduled cron job (via scoring/jobs.py), all print() output goes to stdout and may not appear in structured logs. For the management command path, self.stdout.write is idiomatic; for the cron path, logger.info/logger.debug would be more appropriate. Consider accepting a logger or verbosity parameter.
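One possible shape for that, as a sketch (the logger parameter is an assumption, not the current signature):

import logging

module_logger = logging.getLogger(__name__)


def run_update_global_bot_leaderboard(
    cache_use: str = "partial",
    logger: logging.Logger = module_logger,
) -> None:
    # the cron path gets structured logs; the management command could pass
    # a logger whose handler writes to self.stdout if console output is wanted
    logger.info("Initializing...")
    ...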

Comment on lines +20 to +30
def update_gobal_bot_leaderboard():
    global_bot_leaderboard = Leaderboard.objects.filter(
        name="Global Bot Leaderboard",
    ).first()
    if not global_bot_leaderboard:
        logger.warning("Global Bot Leaderboard not found.")
        return
    try:
        run_update_global_bot_leaderboard()
    except Exception as e:
        logger.error(f"Error updating Global Bot Leaderboard: {e}")

⚠️ Potential issue | 🟡 Minor

Guard is ineffective and logger.error loses the traceback.

Two issues:

  1. The existence check on Lines 21-26 guards against a missing leaderboard, but run_update_global_bot_leaderboard() internally calls Leaderboard.objects.get_or_create(...) (line 951 of the command file), so it will create the leaderboard anyway—making this guard a no-op.
  2. Use logger.exception instead of logger.error to preserve the full traceback in logs.

Proposed fix
Proposed fix
 def update_gobal_bot_leaderboard():
-    global_bot_leaderboard = Leaderboard.objects.filter(
-        name="Global Bot Leaderboard",
-    ).first()
-    if not global_bot_leaderboard:
-        logger.warning("Global Bot Leaderboard not found.")
-        return
     try:
         run_update_global_bot_leaderboard()
     except Exception as e:
-        logger.error(f"Error updating Global Bot Leaderboard: {e}")
+        logger.exception(f"Error updating Global Bot Leaderboard: {e}")

If the guard is intentional (i.e., the job should only run when the leaderboard already exists), remove the get_or_create from the runner and pass the leaderboard instance in, so the two call-sites agree on the contract.

🧰 Tools
🪛 Ruff (0.14.14)

[warning] 29-29: Do not catch blind exception: Exception

(BLE001)


[warning] 30-30: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


Comment on lines 59 to 62
if coverage == 0:
    # investigate!
    breakpoint()
    return None

⚠️ Potential issue | 🔴 Critical

breakpoint() left in production code — will hang the daily cron job.

When coverage == 0, this halts execution waiting for a debugger. Since this runs as an unattended daily task, it will block the process indefinitely.

Proposed fix
         if coverage == 0:
-            # investigate!
-            breakpoint()
+            # No overlapping coverage between the two forecasters
             return None
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 61-61: Trace found: breakpoint used

(T100)


Comment on lines +131 to +151
csv_path = Path("HtH_score_data.csv")
with csv_path.open("r") as input_file:
    reader = csv.DictReader(input_file)
    for row in reader:
        user1_ids.append(_deserialize_user(row["user1"]))
        user2_ids.append(_deserialize_user(row["user2"]))
        question_ids.append(int(row["questionid"]))
        scores.append(float(row["score"]))
        coverages.append(float(row["coverage"]))
        timestamps.append(float(row["timestamp"]))

return (user1_ids, user2_ids, question_ids, scores, coverages, timestamps)

if cache_use == "partial":
    csv_path = Path("HtH_score_data.csv")
    if csv_path.exists():
        userset = set([str(u.id) for u in users]) | {
            "Pro Aggregate",
            "Community Aggregate",
        }
        import csv

        def _deserialize_user(value: str) -> int | str:
            value = value.strip()
            if not value:
                return value
            try:
                return int(value)
            except ValueError:
                return value

        user1_ids: list[int | str] = []
        user2_ids: list[int | str] = []
        question_ids: list[int] = []
        scores: list[float] = []
        coverages: list[float] = []
        with csv_path.open() as input_file:
            reader = csv.DictReader(input_file)
            for row in reader:
                if (row["user1"] in userset) and (row["user2"] in userset):
                    user1_ids.append(_deserialize_user(row["user1"]))
                    user2_ids.append(_deserialize_user(row["user2"]))
                    question_ids.append(int(row["questionid"]))
                    scores.append(float(row["score"]))
                    coverages.append(float(row["coverage"]))
        return (user1_ids, user2_ids, question_ids, scores, coverages)
    if not csv_path.exists():
        with csv_path.open("w") as output_file:
            writer = csv.writer(output_file)
            writer.writerow(
                ["user1", "user2", "questionid", "score", "coverage", "timestamp"]
            )

⚠️ Potential issue | 🟠 Major

CSV cache uses a relative path — unreliable in production deployments.

Path("HtH_score_data.csv") resolves relative to the worker's CWD, which is non-deterministic across environments (containers, cron workers, etc.). The file will be lost on container restarts and could conflict if multiple instances run concurrently.

Consider using a well-known directory (e.g., from Django settings or an env var), or persisting to the database/object storage as the PR description itself suggests.


Comment on lines +144 to +157
if cache_use == "partial":
    csv_path = Path("HtH_score_data.csv")
    if csv_path.exists():
        userset = set([str(u.id) for u in users]) | {
            "Pro Aggregate",
            "Community Aggregate",
        }
        import csv

        def _deserialize_user(value: str) -> int | str:
            value = value.strip()
            if not value:
                return value
            try:
                return int(value)
            except ValueError:
                return value

        user1_ids: list[int | str] = []
        user2_ids: list[int | str] = []
        question_ids: list[int] = []
        scores: list[float] = []
        coverages: list[float] = []
        with csv_path.open() as input_file:
            reader = csv.DictReader(input_file)
            for row in reader:
                if (row["user1"] in userset) and (row["user2"] in userset):
                    user1_ids.append(_deserialize_user(row["user1"]))
                    user2_ids.append(_deserialize_user(row["user2"]))
                    question_ids.append(int(row["questionid"]))
                    scores.append(float(row["score"]))
                    coverages.append(float(row["coverage"]))
        return (user1_ids, user2_ids, question_ids, scores, coverages)
    if not csv_path.exists():
        with csv_path.open("w") as output_file:
            writer = csv.writer(output_file)
            writer.writerow(
                ["user1", "user2", "questionid", "score", "coverage", "timestamp"]
            )
    with csv_path.open("r") as input_file:
        reader = csv.DictReader(input_file)
        for row in reader:
            question_ids.append(int(row["questionid"]))
    cached_question_ids = set(question_ids)
    question_ids = []

⚠️ Potential issue | 🟠 Major

CSV cache has no invalidation — stale data persists indefinitely.

The partial cache appends new question results but never invalidates existing entries. If a question's resolution is corrected, scoring logic changes, or user mappings are updated, the cached data becomes stale with no way to detect or refresh it. The only recovery is manually deleting the CSV file.

Consider adding a cache-busting mechanism (e.g., a version/hash header in the CSV, or a TTL-based approach that re-processes questions older than N days).

Also applies to: 319-351
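A sketch of the version-header idea (the constant name and scheme are illustrative):

import csv
from pathlib import Path

# bump whenever scoring logic or resolutions change so old rows are re-processed
CACHE_VERSION = "1"


def read_fresh_cached_question_ids(csv_path: Path) -> set[int]:
    cached: set[int] = set()
    with csv_path.open() as f:
        for row in csv.DictReader(f):
            if row.get("cache_version") == CACHE_VERSION:
                cached.add(int(row["questionid"]))
    return cached  # rows with a stale version fall through and get re-queued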


question_ids.append(q)
scores.append(u1s)
coverages.append(cov)
timestamps.append(question.actual_close_time.timestamp())

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Verification scripts were run against questions/models.py and the quoted command to confirm that actual_close_time is declared DateTimeField(null=True) and that the query filters on resolution__isnull=False without also requiring actual_close_time__isnull=False (script transcripts omitted).


Add null guard for actual_close_time before calling .timestamp().

The model defines actual_close_time as DateTimeField(null=True), and the query filters on resolution__isnull=False without also checking actual_close_time__isnull=False. A resolved question can have actual_close_time=None, causing both line 42 and line 302 to raise AttributeError when calling .timestamp() on None.

Add a null check at both locations before using actual_close_time.
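A minimal sketch of the guard:

# skip questions that resolved without a recorded close time
if question.actual_close_time is None:
    continue
timestamps.append(question.actual_close_time.timestamp())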


Comment on lines +874 to +877
skip = True
if skip:
    breakpoint()
    continue
⚠️ Potential issue | 🔴 Critical

Second breakpoint() left in production code — same hang risk.

This will freeze the cron job whenever a bot's model is older than 1 year for a given match.

Proposed fix
         if skip:
-            breakpoint()
             continue
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 876-876: Trace found: breakpoint used

(T100)




Development

Successfully merging this pull request may close these issues.

Global Bot Leaderboard - daily updating

2 participants