MB-62182: fast merge of vector indexes#47

Open
Thejas-bhat wants to merge 11 commits into bleve from merge_from

Conversation

Member

@Thejas-bhat Thejas-bhat commented Mar 5, 2025

  • The existing merge_from API can merge two IVF indexes, provided they have an identical centroid layout.
  • To avoid re-training in a segment architecture, a template index can first be created from a random sample of the data when its size is known and fixed.
  • This template can be copied between segments, and the inverted lists can then be merged without affecting the learned context of the data distribution. The merge also has to cover the direct map, which stores the exact location of each key (vector) within the inverted lists.
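
The merge described in the bullets above can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in this PR: `ToyIVF`, `pack_location`, `location_list` and `location_offset` are hypothetical names, the caller supplies the list number instead of running a nearest-centroid search, and ids are assumed to be sequential so the direct map can be a plain array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Pack a (list number, offset within list) pair into one 64-bit location.
inline std::uint64_t pack_location(std::uint64_t list_no, std::uint64_t offset) {
    return (list_no << 32) | offset;
}
inline std::uint64_t location_list(std::uint64_t lo) { return lo >> 32; }
inline std::uint64_t location_offset(std::uint64_t lo) { return lo & 0xffffffffULL; }

// Toy stand-in for an IVF index (hypothetical, NOT the faiss types).
struct ToyIVF {
    std::vector<std::vector<float>> centroids;     // copied from the template
    std::vector<std::vector<std::int64_t>> lists;  // inverted lists of ids
    std::vector<std::uint64_t> direct_map;         // id -> packed location

    explicit ToyIVF(std::vector<std::vector<float>> c)
        : centroids(std::move(c)), lists(centroids.size()) {}

    // Add one vector that the caller has already assigned to list_no.
    void add(std::size_t list_no) {
        assert(list_no < lists.size());
        const auto id = static_cast<std::int64_t>(direct_map.size());
        lists[list_no].push_back(id);
        direct_map.push_back(pack_location(list_no, lists[list_no].size() - 1));
    }

    // Move all entries of `other` into this index without re-training.
    void merge_from(ToyIVF& other) {
        // Only valid when both indexes share the same centroid layout.
        assert(centroids == other.centroids);
        const auto add_id = static_cast<std::int64_t>(direct_map.size());

        // Remember list sizes before appending; offsets shift by these.
        std::vector<std::size_t> base(lists.size());
        for (std::size_t l = 0; l < lists.size(); ++l) base[l] = lists[l].size();

        // Append the inverted lists, shifting ids by add_id.
        for (std::size_t l = 0; l < lists.size(); ++l)
            for (std::int64_t id : other.lists[l]) lists[l].push_back(id + add_id);

        // Append the direct map, shifting each offset by the old list size.
        for (std::uint64_t lo : other.direct_map) {
            const std::uint64_t l = location_list(lo);
            direct_map.push_back(pack_location(l, base[l] + location_offset(lo)));
        }

        // The source is emptied, since its entries now live in this index.
        for (auto& list : other.lists) list.clear();
        other.direct_map.clear();
    }
};
```

In this sketch the centroids are never touched during the merge, which is why the two indexes must start from the same template. The direct map needs the per-list base sizes because an entry that sat at offset `o` in the source list ends up at `base + o` in the destination list.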

@Thejas-bhat Thejas-bhat changed the title Draft: making a IndexIVF's merge_from API a little more flexible MB-62182: fast merge of vector indexes Jan 29, 2026
@Thejas-bhat Thejas-bhat moved this from Todo to In Progress in Fast Merge Jan 30, 2026
@Thejas-bhat Thejas-bhat marked this pull request as ready for review February 19, 2026 00:15
CATCH_AND_HANDLE
}

int faiss_Set_coarse_quantizers(FaissIndex* index, FaissIndex* srcIndex) {
Member

Consider giving this an SQ-specific name, since we will probably look at PQ as well.

index_ivf->quantizer = faiss::clone_index(reinterpret_cast<const faiss::Index*>(src_index->quantizer));
index_ivf->is_trained = true;

faiss::IndexIVFScalarQuantizer* index_ivsc_src = reinterpret_cast<faiss::IndexIVFScalarQuantizer*>(srcIndex);
Member

Since IndexIVFScalarQuantizer inherits from IndexIVF, you can just do the cast to IndexIVFScalarQuantizer and then copy both the quantizer and the sq.
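
The single-cast suggestion can be sketched with stand-in types. This is a minimal illustration, assuming only the inheritance relationship named in the comment: the `Toy*` types and `set_coarse_quantizers` are hypothetical and are not the faiss classes or the function in this PR, and the checked `dynamic_cast` is a choice made for the sketch.

```cpp
#include <cassert>
#include <memory>

// Stand-in types that mirror only the inheritance named in the review.
struct ToyQuantizer {
    int nlist = 0;
    virtual ~ToyQuantizer() = default;
};
struct ToyScalarQuantizer {
    int bits = 0;
};
struct ToyIndex {
    virtual ~ToyIndex() = default;
};
struct ToyIndexIVF : ToyIndex {
    std::shared_ptr<ToyQuantizer> quantizer;  // coarse quantizer
    bool is_trained = false;
};
struct ToyIndexIVFScalarQuantizer : ToyIndexIVF {
    ToyScalarQuantizer sq;  // scalar quantizer state
};

// Returns 0 on success and -1 when either index has the wrong type.
int set_coarse_quantizers(ToyIndex* dst, const ToyIndex* src) {
    // One cast to the derived type gives access to the base-class
    // quantizer and to the derived-class sq at the same time.
    auto* d = dynamic_cast<ToyIndexIVFScalarQuantizer*>(dst);
    auto* s = dynamic_cast<const ToyIndexIVFScalarQuantizer*>(src);
    if (d == nullptr || s == nullptr || !s->quantizer) {
        return -1;
    }
    d->quantizer = std::make_shared<ToyQuantizer>(*s->quantizer);  // deep copy
    d->sq = s->sq;
    d->is_trained = true;
    return 0;
}
```

Because the derived type exposes the members of its base, a second cast to the base type is not needed to reach `quantizer`.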

"can only merge indexes of the same type");

// merging only the direct map type array and no map
bool merge_direct_map_cond = (this->direct_map.type == DirectMap::Array && this->direct_map.type == DirectMap::Array)||
Member

This must be:

(this->direct_map.type == DirectMap::Array && other->direct_map.type == DirectMap::Array)
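
A small sketch of why the self-comparison matters, covering only the Array clause of the condition. The names `MapType`, `both_array_buggy` and `both_array` are hypothetical; the enumerators are assumed to mirror the three direct map types.

```cpp
#include <cassert>

// Assumed to mirror the direct map types: none, array, hashtable.
enum class MapType { NoMap, Array, Hashtable };

// As written in the diff: the left operand is compared with itself,
// so the type of the other index is never checked.
constexpr bool both_array_buggy(MapType self, MapType /*other*/) {
    return self == MapType::Array && self == MapType::Array;
}

// As suggested in the review: both indexes must use the array type.
constexpr bool both_array(MapType self, MapType other) {
    return self == MapType::Array && other == MapType::Array;
}

// The buggy form accepts a mismatched pair; the fixed form rejects it.
static_assert(both_array_buggy(MapType::Array, MapType::Hashtable),
              "self-comparison ignores the other index");
static_assert(!both_array(MapType::Array, MapType::Hashtable),
              "mismatched direct map types must not be merged as arrays");
```

With the self-comparison, an index with an array direct map would take the array merge path even when the other index keeps a hashtable or no map at all.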

