
@dor-forer dor-forer commented Jan 8, 2026

Describe the changes in the pull request

This PR fixes FP16 SVS tests' precision and assertion issues:

  • Fix FP16 overflow in batch iterator tests: Scale vector indices by 0.1 to avoid FP16 overflow in L2 distance calculations.
  • Changed the ARM CI machine to one that includes the SVE optimizations

Tests updated:
svs_bulk_vectors_add_delete_test: Fixed size assertions and added verification of complete deletion
svs_reindexing_same_vector: Fixed assertions for index size after deletions and reinsertions
svs_reindexing_same_vector_different_id: Same lazy deletion fixes
test_delete_vector: Fixed single vector deletion assertion
TieredBatchIteratorBasic: Added scale factor to avoid FP16 overflow
KNNSearch: Use indexLabelCount() instead of indexSize() for SVS index assertions

Which issues this PR fixes

  1. MOD-13429

Main objects this PR modified

  1. ...
  2. ...

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes


Copilot AI left a comment

Pull request overview

This PR addresses FP16 precision overflow issues in vector similarity search tests by introducing scale factors to prevent distance calculations from exceeding FP16's maximum value (~65504). The changes ensure test vectors remain within safe numerical ranges while preserving test correctness.

Key changes:

  • Added scale factor (0.1f) to SVS tiered index batch iterator tests to prevent overflow when n=1000
  • Added scale factor (0.01f) to FP16 override and batch iterator tests to prevent overflow when n=250
  • Upgraded ARM CI infrastructure from t4g.medium to r8g.xlarge for better performance

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File descriptions:
  • tests/unit/test_svs_fp16.cpp: Introduced a 0.1f scale factor to prevent FP16 overflow in tiered batch iterator tests, with documentation
  • tests/unit/test_fp16.cpp: Added a 0.01f scale factor to test_override and test_batch_iterator_basic, plus tolerance-based comparison for FP16 precision loss
  • .github/workflows/arm.yml: Upgraded the EC2 instance type to r8g.xlarge for improved CI performance


size_t n = 100;
size_t new_n = 250;
// Scale factor to avoid FP16 overflow. FP16 max value is 65504, and L2² = dim × diff².
// With scale=0.4 and max diff=250: L2² = 4 × (250×0.4)² = 40000 < 65504.

Copilot AI Jan 8, 2026

The comment states "With scale=0.4 and max diff=250" but the actual scale factor used is 0.01f, not 0.4. The math in the comment should be updated to reflect the actual scale value. With scale=0.01 and new_n=250, the max distance would be 4 × (250×0.01)² = 4 × 6.25 = 25, which is well below the FP16 max.

Suggested change
// With scale=0.4 and max diff=250: L2² = 4 × (250×0.4)² = 40000 < 65504.
// With scale=0.01 and max diff=250: L2² = 4 × (250×0.01)² = 25 < 65504.

void FP16Test::test_batch_iterator_basic(params_t params) {
size_t n = params.initialCapacity;
// Scale factor to avoid FP16 overflow. FP16 max value is 65504, and L2² = dim × diff².
// With scale=0.4 and max diff=250: L2² = 4 × (250×0.4)² = 40000 < 65504.

Copilot AI Jan 8, 2026

The comment states "With scale=0.4 and max diff=250" but the actual scale factor used is 0.01f, not 0.4. The math in the comment should be updated to reflect the actual scale value. With scale=0.01 and n=250, the max distance would be 4 × (250×0.01)² = 4 × 6.25 = 25.

Suggested change
// With scale=0.4 and max diff=250: L2² = 4 × (250×0.4)² = 40000 < 65504.
// With scale=0.01 and max diff=250: L2² = 4 × (250×0.01)² = 4 × 6.25 = 25 < 65504.

float exp_score = 4 * diff * diff;
ASSERT_EQ(score, exp_score) << "id: " << id << " score: " << score;
// Use tolerance-based comparison due to FP16 precision loss in SVE accumulation.
// FP16 has ~3 decimal digits of precision, so we allow ~0.1% relative tolerance.

Copilot AI Jan 8, 2026

The comment states "~0.1% relative tolerance" but the code uses 0.002f which equals 0.2%. The comment should be updated to match the actual tolerance value used in the calculation.

Suggested change
// FP16 has ~3 decimal digits of precision, so we allow ~0.1% relative tolerance.
// FP16 has ~3 decimal digits of precision, so we allow ~0.2% relative tolerance.


codecov bot commented Jan 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.18%. Comparing base (f6df960) to head (e45e854).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #884      +/-   ##
==========================================
+ Coverage   97.13%   97.18%   +0.04%     
==========================================
  Files         129      130       +1     
  Lines        7614     7767     +153     
==========================================
+ Hits         7396     7548     +152     
- Misses        218      219       +1     

