Skip to content

Require project_id on list endpoints to prevent full table scans#1133

Draft
mihow wants to merge 1 commit intomainfrom
fix/require-project-filter-on-list
Draft

Require project_id on list endpoints to prevent full table scans#1133
mihow wants to merge 1 commit intomainfrom
fix/require-project-filter-on-list

Conversation

@mihow
Copy link
Collaborator

@mihow mihow commented Feb 16, 2026

Summary

  • Adds require_project_for_list = True as the default in ProjectMixin, so all list endpoints return 400 if no project_id is provided. This prevents unfiltered queries that cause full table scans (e.g. COUNT(*) on classifications took 3.6 min per request).
  • Sets require_project = True on SummaryView and removes the unfiltered code path.
  • Opts out global entities: TaxonViewSet, TaxaListViewSet, TagViewSet.

Context

On 2026-02-16, Facebook's crawler (meta-externalagent) hit /api/v2/classifications/ 13K+ times in a day without a project filter, causing recurring latency spikes (P99 > 2 min, throughput drops from 2700 to 140 req/15min). The crawler is now blocked at the nginx layer (ami-devops@5a29503), but unfiltered list queries from any client remain a risk.

Changes

File Change
ami/base/views.py ProjectMixin.require_project_for_list = True default; get_active_project() enforces on action == "list"
ami/main/api/views.py TaxonViewSet, TaxaListViewSet, TagViewSet: opt out with require_project_for_list = False
ami/main/api/views.py SummaryView: require_project = True, remove unfiltered else branch

Behavior

  • GET /api/v2/classifications/ → 400 {"project_id": ["This field is required."]}
  • GET /api/v2/classifications/?project_id=1 → 200 (filtered, fast)
  • GET /api/v2/classifications/123/ → 200 (detail view, unaffected)
  • GET /api/v2/taxa/ → 200 (opted out, global taxonomy)
  • GET /api/v2/status/summary/ → 400 (now requires project_id)

Test plan

  • Verify list endpoints return 400 without project_id
  • Verify list endpoints return 200 with valid project_id
  • Verify detail/retrieve endpoints still work without project_id
  • Verify opted-out endpoints (taxa, tags, taxa-lists) still list without project_id
  • Verify summary endpoint requires project_id
  • Check frontend still works (it should already pass project_id on all list requests)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Taxonomy, taxa lists, and tags are now accessible globally across all projects.
  • Changes

    • Summary view now requires an active project context for data filtering.

Add `require_project_for_list = True` as the default in ProjectMixin.
All ViewSets using the mixin now return 400 on list actions if no
project_id query parameter is provided, preventing unfiltered queries
that cause full table scans and multi-minute response times.

Opt out global entities that aren't project-scoped:
- TaxonViewSet (taxonomy is global)
- TaxaListViewSet (taxa lists are global)
- TagViewSet (includes global tags)

Also set require_project = True on SummaryView and remove the
unfiltered else branch that returned all-projects counts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 16, 2026 22:27
@netlify
Copy link

netlify bot commented Feb 16, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit eae316f
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/699399e083cc7c0008ea2f35

@netlify
Copy link

netlify bot commented Feb 16, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit eae316f
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/699399e0629d170008bd8e31

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 16, 2026

📝 Walkthrough

Walkthrough

This change introduces a new require_project_for_list flag to ProjectMixin that conditionally requires a project context specifically for list actions. Global resources like taxonomies and tags opt out of this requirement, while SummaryView is configured to always require a project. Affected list endpoints can now operate without an explicit project when the flag is disabled.

Changes

Cohort / File(s) Summary
Base View Infrastructure
ami/base/views.py
Added require_project_for_list boolean attribute to ProjectMixin and refactored the logic in get_active_project to combine both require_project and the new flag with action type checking. Updated error handling to raise Http404 when a required project is not found, and updated docstring.
API Viewsets & Views
ami/main/api/views.py
Set require_project_for_list = False on TaxonViewSet, TaxaListViewSet, and TagViewSet to allow global list access. Added require_project = True to SummaryView and simplified its data construction logic by removing the per-project conditional branch.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A flag hops into view, so neat and so keen,
Lists need not projects where globals convene,
Tags and taxa roam freely, no borders in sight,
While summaries stand firm, requiring the light,
Sweet logic refined, oh what a delight! ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: requiring project_id on list endpoints to prevent full table scans, which aligns with the substantial code changes.
Description check ✅ Passed The description includes a clear summary, comprehensive list of changes, detailed context explaining the incident and motivation, behavior examples, and a test plan—covering most required template sections.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/require-project-filter-on-list

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the API against expensive unfiltered list queries by making project_id required by default for list actions in ProjectMixin, and by removing the unfiltered code path for the status summary endpoint, while explicitly opting out global resources (taxa, taxa lists, tags).

Changes:

  • Default ProjectMixin.require_project_for_list = True and compute “required” based on action == "list".
  • Make SummaryView require a project and delete the unfiltered aggregation branch.
  • Opt out TaxonViewSet, TaxaListViewSet, and TagViewSet from the new list requirement.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
ami/base/views.py Adds list-action project enforcement logic to ProjectMixin.get_active_project() via require_project_for_list.
ami/main/api/views.py Opts out global list endpoints and makes SummaryView require project_id, removing the unfiltered summary path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +67 to +68
require_project = False # Project required for all actions
require_project_for_list = True # Project required for list actions
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inline comment on require_project is misleading: the attribute is False (project optional), but the comment says it's required for all actions. Please update the comment (or the default) so it matches the actual behavior to avoid confusion when configuring views.

Suggested change
require_project = False # Project required for all actions
require_project_for_list = True # Project required for list actions
require_project = False # If True, project is required for all actions; otherwise project is optional
require_project_for_list = True # If True, project is required for list actions

Copilot uses AI. Check for mistakes.
Comment on lines 72 to 85
def get_active_project(self) -> Project | None:
"""
Instance method wrapper around the standalone get_active_project function.
Raises Http404 if project_id is required but not provided or the project does not exist.

Enforcement logic:
- If require_project is True, project_id is required for all actions.
- If require_project_for_list is True (default), project_id is required for list actions.
- Missing project_id when required raises ValidationError (400).
- Valid project_id pointing to nonexistent project raises Http404 (404).
"""
action = getattr(self, "action", None)
required = self.require_project or (self.require_project_for_list and action == "list")

project = get_active_project(
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProjectMixin.get_active_project() docstring/behavior suggests require_project_for_list=True enforces project_id for list actions, but that enforcement only happens if the view actually calls self.get_active_project() during request handling. For example, ProjectViewSet inherits ProjectMixin but its list/get_queryset path never calls get_active_project(), so no 400 will be raised. Consider either (a) clarifying the docstring to state this is enforced only when get_active_project() is invoked, or (b) moving the enforcement into a hook that always runs for list actions (e.g., initial() / get_queryset() in the mixin) if the intent is truly "all list endpoints".

Copilot uses AI. Check for mistakes.
Comment on lines 1682 to 1687
class SummaryView(GenericAPIView, ProjectMixin):
permission_classes = [IsActiveStaffOrReadOnly]
require_project = True # Unfiltered summary queries are too expensive

@extend_schema(parameters=[project_id_doc_param])
def get(self, request):
Copy link

Copilot AI Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SummaryView now requires project_id (require_project = True), but the OpenAPI parameter used here (project_id_doc_param) is defined with required=False, so the generated docs will still show the query param as optional. Consider overriding the schema for this endpoint with a required=True parameter (or introducing a separate required_project_id_doc_param) so the docs match the new 400 behavior.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.


In `@ami/base/views.py`:
- Around line 82-94: Update the get_active_project function docstring to match
the actual parameter name: change any occurrences of "require_project" to
"required" so the docstring accurately documents the boolean parameter used by
get_active_project; verify the description refers to the purpose of the
"required" flag and adjust any examples or parameter lists to use the exact
identifier "required".

In `@ami/main/api/views.py`:
- Around line 1682-1684: ProjectViewSet is broken by the new default
require_project_for_list = True causing
LimitOffsetPaginationWithPermissions._get_project() to call
get_active_project(required=True) and raise Http404 for root-level project
listing; to fix, explicitly set require_project_for_list = False on the
ProjectViewSet class (same pattern as TaxonViewSet/TaxaListViewSet/TagViewSet)
so list actions do not force project extraction, and ensure pagination via
LimitOffsetPaginationWithPermissions._get_project() and get_paginated_response()
will skip required project resolution for this view.
🧹 Nitpick comments (1)
🤖 Fix all nitpicks with AI agents
Verify each finding against the current code and only fix it if needed.


In `@ami/base/views.py`:
- Around line 82-94: Update the get_active_project function docstring to match
the actual parameter name: change any occurrences of "require_project" to
"required" so the docstring accurately documents the boolean parameter used by
get_active_project; verify the description refers to the purpose of the
"required" flag and adjust any examples or parameter lists to use the exact
identifier "required".
ami/base/views.py (1)

82-94: Logic is sound; minor docstring param-name mismatch on the standalone function.

The computed required flag and the Http404 for "valid ID but nonexistent project" are correct. One nit: the docstring for the standalone get_active_project (line 23) refers to the parameter as require_project, but the actual parameter name is required.

Fix docstring param name
     Args:
         request: The DRF request object
         kwargs: URL kwargs (optional, used to extract pk from URL paths)
-        require_project: Whether project_id is required (raises validation error if True and not found)
+        required: Whether project_id is required (raises validation error if True and not found)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/base/views.py` around lines 82 - 94, Update the get_active_project
function docstring to match the actual parameter name: change any occurrences of
"require_project" to "required" so the docstring accurately documents the
boolean parameter used by get_active_project; verify the description refers to
the purpose of the "required" flag and adjust any examples or parameter lists to
use the exact identifier "required".

Comment on lines 1682 to +1684
class SummaryView(GenericAPIView, ProjectMixin):
permission_classes = [IsActiveStaffOrReadOnly]
require_project = True # Unfiltered summary queries are too expensive
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if ProjectViewSet or its parent classes call get_active_project in any list-related code path
echo "=== ProjectViewSet definition ==="
ast-grep --pattern 'class ProjectViewSet($_$$) {
  $$$
}'

echo ""
echo "=== Search for get_active_project calls in views.py ==="
rg -n 'get_active_project' ami/main/api/views.py

echo ""
echo "=== Check if pagination or other mixins call get_active_project ==="
rg -n 'get_active_project' ami/base/

Repository: RolnickLab/antenna

Length of output: 1472


🏁 Script executed:

#!/bin/bash
echo "=== Find ProjectViewSet class definition ==="
rg -n 'class ProjectViewSet' ami/main/api/views.py -A 5

echo ""
echo "=== Examine pagination code that calls get_active_project ==="
sed -n '25,35p' ami/base/pagination.py

echo ""
echo "=== Look for require_project_for_list in views.py ==="
rg -n 'require_project' ami/main/api/views.py | head -20

echo ""
echo "=== Check ProjectViewSet's parent classes and methods ==="
rg -n 'class ProjectViewSet' ami/main/api/views.py -A 20

Repository: RolnickLab/antenna

Length of output: 2070


🏁 Script executed:

#!/bin/bash
echo "=== Check DefaultViewSet definition ==="
rg -n 'class DefaultViewSet' ami/main/api/views.py -A 10

echo ""
echo "=== Check ProjectMixin and its get_active_project method ==="
rg -n 'class ProjectMixin' ami/base/views.py -A 20

echo ""
echo "=== Look at ProjectPagination ==="
rg -n 'class ProjectPagination' ami/base/pagination.py -B 2 -A 15

echo ""
echo "=== Check the full pagination _get_project method ==="
sed -n '27,35p' ami/base/pagination.py

echo ""
echo "=== Search for require_project usage pattern in the code ==="
rg -n 'require_project' ami/base/views.py

Repository: RolnickLab/antenna

Length of output: 2993


🏁 Script executed:

#!/bin/bash
echo "=== Look for ProjectViewSet URL routing ==="
rg -n 'ProjectViewSet' ami/main/api/ | grep -E '(router|urlpatterns|register)'

echo ""
echo "=== Check URL patterns file ==="
fd -e py 'url' ami/main/api/ | head -5

echo ""
echo "=== Check the get_active_project full implementation ==="
sed -n '72,100p' ami/base/views.py

echo ""
echo "=== Look at how require_project_for_list is used ==="
sed -n '67,95p' ami/base/views.py

Repository: RolnickLab/antenna

Length of output: 2710


🏁 Script executed:

#!/bin/bash
echo "=== Check how ProjectViewSet is registered in router ==="
fd -type f -name '*.py' ami/main/api/ -exec grep -l 'ProjectViewSet' {} \;

echo ""
echo "=== Look for router registration ==="
fd -type f -name 'urls.py' ami/ 2>/dev/null

echo ""
echo "=== Check if there's an api router file ==="
find ami/main/api -name '*.py' -type f | head -10

Repository: RolnickLab/antenna

Length of output: 463


🏁 Script executed:

#!/bin/bash
echo "=== Check if ProjectViewSet sets require_project or require_project_for_list ==="
sed -n '143,170p' ami/main/api/views.py

echo ""
echo "=== Check if there are any Django URL patterns ==="
rg -n 'path|url' ami/main/api/views.py | head -10

echo ""
echo "=== Check the ami/main/urls.py ==="
cat ami/main/urls.py

echo ""
echo "=== Check the api/schemas file for routing ==="
rg -n 'ProjectViewSet' ami/main/api/schemas.py

Repository: RolnickLab/antenna

Length of output: 1554


🏁 Script executed:

#!/bin/bash
echo "=== Find urls.py files ==="
find ami -name 'urls.py' -type f

echo ""
echo "=== Check if ProjectViewSet overrides require_project_for_list ==="
sed -n '143,180p' ami/main/api/views.py

echo ""
echo "=== Look for RestRouter in schemas ==="
rg -n 'Router\|ProjectViewSet' ami/main/api/schemas.py -A 2

echo ""
echo "=== Check if ProjectViewSet calls list action ==="
sed -n '143,200p' ami/main/api/views.py | grep -E 'def list|require_project'

Repository: RolnickLab/antenna

Length of output: 1544


🏁 Script executed:

#!/bin/bash
echo "=== Check if ProjectViewSet is at project-level endpoint or root endpoint ==="
rg -n 'register.*ProjectViewSet' ami/ -r '' 2>/dev/null || echo "Not found in rg"

echo ""
echo "=== Look at how DRF routes work - check if ProjectViewSet is nested under project ==="
cat ami/main/api/schemas.py | head -50

echo ""
echo "=== Check pagination more carefully - when does _get_project get called? ==="
sed -n '1,40p' ami/base/pagination.py

echo ""
echo "=== Look for list method in ProjectViewSet or DefaultViewSet ==="
sed -n '103,180p' ami/main/api/views.py

Repository: RolnickLab/antenna

Length of output: 4246


🏁 Script executed:

#!/bin/bash
echo "=== Check the get_active_project logic when required=True ==="
sed -n '12,60p' ami/base/views.py

echo ""
echo "=== Check if ProjectViewSet overrides require_project_for_list anywhere ==="
sed -n '143,250p' ami/main/api/views.py | grep -A 5 -B 5 'require_project'

echo ""
echo "=== Confirm how the kwargs are used for project_id extraction ==="
sed -n '12,50p' ami/base/views.py | head -40

Repository: RolnickLab/antenna

Length of output: 3557


🏁 Script executed:

#!/bin/bash
echo "=== Check if ProjectViewSet needs to override require_project_for_list ==="
echo "Looking at the actual behavior - does ProjectViewSet list need project_id?"
echo ""
echo "Show ProjectViewSet context again with more lines:"
sed -n '143,180p' ami/main/api/views.py

echo ""
echo "=== Check if there are any tests for ProjectViewSet list ==="
find . -name '*test*' -type f -name '*.py' -exec grep -l 'ProjectViewSet' {} \; 2>/dev/null | head -3

Repository: RolnickLab/antenna

Length of output: 1604


🏁 Script executed:

#!/bin/bash
echo "=== Look at similar root-level viewsets that set require_project_for_list = False ==="
rg -B 5 'require_project_for_list = False' ami/main/api/views.py

echo ""
echo "=== Understand the difference - where are these viewsets used? ==="
rg -B 15 'require_project_for_list = False' ami/main/api/views.py | grep 'class ' | head -5

echo ""
echo "=== Check if ProjectViewSet should also override this ==="
echo "ProjectViewSet doesn't override require_project_for_list, so it inherits the default."
echo "Let's verify pagination actually gets called during list:"
sed -n '10,15p' ami/base/pagination.py

Repository: RolnickLab/antenna

Length of output: 1495


ProjectViewSet is broken by the new require_project_for_list = True default and requires explicit override.

While ProjectViewSet's get_queryset() doesn't directly call get_active_project(), the pagination code does—LimitOffsetPaginationWithPermissions._get_project() explicitly calls it during get_paginated_response(). During a list action with the new default, required=True is passed to get_active_project(). Since ProjectViewSet is a root-level endpoint (not nested under a project_id), the project extraction fails and raises Http404.

ProjectViewSet should set require_project_for_list = False (like TaxonViewSet, TaxaListViewSet, and TagViewSet do) since it lists all projects globally, not per-project resources.

🧰 Tools
🪛 Ruff (0.15.0)

[warning] 1683-1683: Mutable class attributes should be annotated with typing.ClassVar

(RUF012)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/main/api/views.py` around lines 1682 - 1684, ProjectViewSet is broken by
the new default require_project_for_list = True causing
LimitOffsetPaginationWithPermissions._get_project() to call
get_active_project(required=True) and raise Http404 for root-level project
listing; to fix, explicitly set require_project_for_list = False on the
ProjectViewSet class (same pattern as TaxonViewSet/TaxaListViewSet/TagViewSet)
so list actions do not force project extraction, and ensure pagination via
LimitOffsetPaginationWithPermissions._get_project() and get_paginated_response()
will skip required project resolution for this view.

@mihow mihow marked this pull request as draft February 16, 2026 22:34
@mihow
Copy link
Collaborator Author

mihow commented Feb 18, 2026

Interaction with #1094 (Configurable taxa lists)

PR #1094 currently sets require_project = True on TaxaListViewSet. This PR adds require_project_for_list = False to the same viewset, which is the correct approach since taxa lists use M2M to projects (global resources, same as TaxonViewSet and TagViewSet).

When merging after #1094: There will be a minor conflict on TaxaListViewSet. Resolve by keeping require_project = True (for write operations) and adding require_project_for_list = False (to allow global listing).

Recommended merge order: #1094 first, then this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments