feat: Free gpu space after each inference run #493
hh-space-invader wants to merge 24 commits into gpu
This PR introduces two improvements to memory management in ONNX Runtime:

Mitigating Memory Fragmentation:
- memory.enable_memory_arena_shrinkage forces cleanup of the memory arena after each run, reducing fragmentation at the cost of some performance.

Optimizing Memory Allocation Strategy:
- kNextPowerOfTwo, the default arena extend strategy, can lead to excessive memory consumption (e.g., 1GB → 2GB → 4GB, etc.).
- kSameAsRequested ensures that only the necessary memory is allocated, preventing unnecessary over-allocation.

All Submissions:
New Feature Submissions:
pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?
New models submission:
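
For reference, the two settings named in the PR description can be sketched as below. This is a minimal illustration, not the diff from this PR: the helper names (`cuda_provider`, `shrinkage_run_config`, `run_with_cleanup`) are hypothetical, and the sketch assumes the model runs on the CUDA execution provider on a single device. The arena extend strategy is passed as a CUDA provider option when the session is created, while arena shrinkage is requested per run through a `RunOptions` config entry.

```python
from typing import Any, Dict, Tuple

# Run-config key named in the PR description.
ARENA_SHRINKAGE_KEY = "memory.enable_memory_arena_shrinkage"


def cuda_provider(device_id: int = 0) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: CUDA provider entry using kSameAsRequested,
    so the arena grows by the requested size instead of powers of two."""
    return (
        "CUDAExecutionProvider",
        {"device_id": device_id, "arena_extend_strategy": "kSameAsRequested"},
    )


def shrinkage_run_config(device_id: int = 0) -> Dict[str, str]:
    """Hypothetical helper: run-config entry asking ONNX Runtime to shrink
    the GPU memory arena of the given device at the end of a run."""
    return {ARENA_SHRINKAGE_KEY: f"gpu:{device_id}"}


def run_with_cleanup(model_path: str, inputs: Dict[str, Any], device_id: int = 0):
    """Hypothetical helper: run one inference with both settings applied."""
    # Imported lazily so the two helpers above need no dependencies.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=[cuda_provider(device_id)])
    run_options = ort.RunOptions()
    for key, value in shrinkage_run_config(device_id).items():
        run_options.add_run_config_entry(key, value)
    # None requests all model outputs.
    return session.run(None, inputs, run_options=run_options)
```

A real integration would likely create the session once and reuse it, passing fresh `RunOptions` only to the `run` call. Shrinking the arena on every run trades some throughput for a lower resident GPU footprint, which matches the trade-off stated in the description.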