Add discard_spikes to curation, and update to v3 #4287
Open
chrishalcrow wants to merge 9 commits into SpikeInterface:main
…spikeinterface into add-discard-spikes
Member
Awesome Chris! Quick comment before diving into the PR:
I don't think that clean and split should be allowed. We have rules so that a unit can either be removed, merged, split, and I would add discard. Does it simplify the new unit id logic?
alejoe91 reviewed Jan 6, 2026
Comment on lines +291 to +297

    if len(discard_spikes_unit_ids) > 0:
        ids_to_remove = []
        for new_id_set in new_ids:
            if new_id_set[0] in discard_spikes_unit_ids or new_id_set[1] in discard_spikes_unit_ids:
                ids_to_remove.append(new_id_set[0])

        curated_sorting_or_analyzer = curated_sorting_or_analyzer.remove_units(ids_to_remove)
alejoe91 reviewed Jan 6, 2026

    # decide if unit is a simple discard, a simple split or a discard and split
    just_discard = False
    discard_and_split = False
Member
Yeah I think this should not be allowed. A unit can be either split or cleaned, not both! This should simplify the logic a lot!
Would close #4261
Docs: https://spikeinterface--4287.org.readthedocs.build/en/4287/modules/curation.html
(Note: PR looks big but mostly tests+docs - main changes are updating the id strategy in `sorting_tools.py` (~100 lines) and updating the `apply_merges` function (~50 lines))
This PR allows you to remove spikes from a unit during curation by specifying the following in your curation json file:
This new feature means that the curation format gets bumped to v3!
You can discard at the same time as merging and splitting.
Tricky bit 1
How to apply it to an analyzer: I decided to discard spikes at the same time as splitting. During the splitting step, we re-wrangle the discarded spikes into another split unit (call these "discard units") and keep track of each discard unit id. We then remove the full discard units after the splitting. This lets us reuse the existing splitting machinery (including splitting the extensions etc.) for discards - nice!
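To make the idea concrete, here is a minimal toy sketch of "discard as a split followed by a removal". It is not the code in this PR: `split_unit`, `discard_spikes` and the dict-of-lists "sorting" are hypothetical stand-ins, and it assumes discards are given as positions within the unit's own spike train.

```python
def split_unit(units, unit_id, groups, new_ids):
    """Toy split: replace `unit_id` by one new unit per group of spike positions."""
    spikes = units.pop(unit_id)
    for new_id, positions in zip(new_ids, groups):
        units[new_id] = [spikes[p] for p in positions]
    return units


def discard_spikes(units, unit_id, discard_positions, discard_unit_id):
    """Toy discard built on top of the split machinery (hypothetical helper)."""
    discard = set(discard_positions)
    keep = [p for p in range(len(units[unit_id])) if p not in discard]
    # Step 1: express the discard as a two-way split. The kept part retains
    # the original id, the discarded spikes go into a temporary "discard unit".
    split_unit(units, unit_id, [keep, sorted(discard)], [unit_id, discard_unit_id])
    # Step 2: remove the whole discard unit once the split is done.
    del units[discard_unit_id]
    return units


# Toy data: unit id -> spike times
units = {1: [10, 20, 30, 40], 2: [15, 25]}
cleaned = discard_spikes(units, unit_id=1, discard_positions=[1, 3], discard_unit_id="1_dirty")
```

The point of the design is that step 1 is an ordinary split, so anything that already knows how to follow a split needs no discard-specific code.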
Tricky bit 2
Much of the complexity is now related to the id strategy (ughh - why do we give the user THREE new id strategies!!!). It's difficult because we want a cleaned unit to retain its id. This must hold even if there are other units which are just split, and the user is using the "append" or "split" new_unit_id_strategy.
So suppose the user does the following:
Unit 1: clean
Unit 2: split
Unit 3: clean and split
With strategy append, we want
Unit 1 -> Unit 1 + Unit 1 dirty
Unit 2 -> Unit 4 + Unit 5
Unit 3 -> Unit 6 + Unit 7 + Unit 3 dirty
We do this by slotting the dirty units into places we know to split later. For the "append" strategy, I chose to give "Unit 1 dirty" the last possible unit id, and to give "Unit 3 dirty" the id of Unit 3.
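The example above can be sketched as a small id-planning function. This is only one possible reading of the "append" case, not the implementation in `sorting_tools.py`: `plan_ids_append` and its `actions` argument are hypothetical, unit ids are assumed to be integers, and "last possible unit id" is interpreted here as the first id after every appended split id.

```python
def plan_ids_append(unit_ids, actions):
    """Plan ids for a toy "append" strategy.

    actions maps unit_id -> (n_split_parts, has_discard), where
    n_split_parts == 1 means the unit is not split.
    Returns unit_id -> (ids of the kept parts, id of the temporary dirty unit or None).
    """
    next_id = max(unit_ids) + 1
    plan = {}
    clean_only = []
    for unit_id in unit_ids:
        if unit_id not in actions:
            continue
        n_parts, has_discard = actions[unit_id]
        if n_parts > 1:
            # Split parts get fresh appended ids, so the original id is free
            # and can hold the dirty spikes until they are removed.
            kept_ids = list(range(next_id, next_id + n_parts))
            next_id += n_parts
            plan[unit_id] = (kept_ids, unit_id if has_discard else None)
        elif has_discard:
            clean_only.append(unit_id)
    # A unit that is only cleaned keeps its id; its dirty unit is parked
    # after all appended ids so it cannot collide with a real new unit.
    for unit_id in clean_only:
        plan[unit_id] = ([unit_id], next_id)
        next_id += 1
    return plan


# Unit 1: clean, Unit 2: split in two, Unit 3: clean and split in two
plan = plan_ids_append([1, 2, 3], {1: (1, True), 2: (2, False), 3: (2, True)})
```

Under these assumptions the plan reproduces the mapping listed above: Unit 1 stays Unit 1, Unit 2 becomes Units 4 and 5, and Unit 3 becomes Units 6 and 7 with its dirty spikes parked under id 3.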
Other stuff
We have to do merges after splitting+discarding. This is because the spike indices change after merging. To avoid wrangling spike indices (gross!) we just do discarding first.
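A toy illustration of why the order matters, assuming (as a simplification) that discards are given as positions within a unit's spike train and that a merged train is just the sorted union of its parts. The names are illustrative only:

```python
def merge(train_a, train_b):
    """Toy merge: the merged spike train is the sorted union of both trains."""
    return sorted(train_a + train_b)


unit_a = [10, 30, 50]   # spike times of unit A
unit_b = [20, 40]       # spike times of unit B
discard_in_a = [1]      # position 1 in unit A, i.e. the spike at t=30

# Discard first, then merge: the positions still refer to unit A.
a_clean = [t for p, t in enumerate(unit_a) if p not in discard_in_a]
discard_then_merge = merge(a_clean, unit_b)

# Merge first: position 1 now points at a spike that came from unit B,
# so the same index removes the wrong spike unless it is remapped.
merged = merge(unit_a, unit_b)
merge_then_discard = [t for p, t in enumerate(merged) if p not in discard_in_a]
```

Here `discard_then_merge` drops the intended spike at t=30, while `merge_then_discard` drops the spike at t=20 instead.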
Tests to do: