feat: deduplicate shared URL downloads across test suites #338

Open
dabrain34 wants to merge 2 commits into fluendo:master from dabrain34:dab_duplication_download

dabrain34 commented Feb 26, 2026

Introduce a centralized DownloadManager that ensures each URL is downloaded at most once, eliminating duplicate downloads both across test suites and within a single test suite.

  • Add DownloadManager class in utils.py with download-once caching and centralized archive cleanup
  • Refactor TestSuite.download() to use pre-downloaded archives from the manager across all three download paths
  • Use a thread pool to download concurrently and make DownloadManager thread-safe so duplicate URLs are still fetched only once.
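
The download-once behaviour described above could be sketched roughly as follows. This is not the code from this PR: `DownloadOnceCache`, `download_all`, and the injected `fetch` callable are hypothetical names chosen for illustration, and the sketch assumes that `fetch` takes a URL and returns the local path of the downloaded archive. The first thread to request a URL performs the fetch; any other thread asking for the same URL waits for that result instead of downloading again.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


class DownloadOnceCache:
    """Hypothetical sketch: fetch each URL at most once, even under concurrency."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # `fetch` is an assumed callable: URL in, local archive path out.
        self._fetch = fetch
        self._lock = threading.Lock()
        self._results: Dict[str, Future] = {}

    def get(self, url: str) -> str:
        # Under the lock, decide whether this thread owns the download.
        with self._lock:
            future = self._results.get(url)
            owner = future is None
            if owner:
                future = Future()
                self._results[url] = future
        if owner:
            # Only the owning thread fetches; the lock is released so
            # downloads of different URLs can proceed in parallel.
            try:
                future.set_result(self._fetch(url))
            except BaseException as exc:
                # Propagate the failure to every waiter instead of hanging.
                future.set_exception(exc)
        # Owners return immediately; other threads block until the owner is done.
        return future.result()


def download_all(urls: Iterable[str], cache: DownloadOnceCache,
                 workers: int = 4) -> List[str]:
    """Resolve all URLs through the cache using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cache.get, urls))
```

With this shape, a suite whose test vectors all point at the same archive triggers one fetch for that archive, and the remaining vectors reuse the cached path. Archive cleanup, which the PR centralizes in the manager, is left out of the sketch.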

This considerably speeds up the download of AV1-ARGON*, which previously re-downloaded the 6GB archive for every test vector.

Fixes #309


Development

Successfully merging this pull request may close these issues.

Downloading the AV1 test suites results in downloading multiple times a 6GB archive

1 participant