Making testing requirements clearer #673

Draft
PawelPlesniak wants to merge 11 commits into develop from PawelPlesniak/ClearerPRTemplate

Conversation

@PawelPlesniak (Collaborator) commented Nov 11, 2025

Description

Changes the structure of the PR template to prioritize testing and introduces the testing requirements from other repos as a field.

No tests or further checks have been run, as this is a template change and does not affect the core code.

Type of change

  • Documentation (non-breaking change that adds or improves the documentation)
  • New feature (non-breaking change which adds functionality)
  • Optimization (non-breaking, back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (whatever its nature)

Key checklist

  • All tests pass (e.g. python -m pytest)
  • Pre-commit hooks run successfully (e.g. pre-commit run --all-files)

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added, or an issue has been opened to add them in the future.
    (Indicate issue here: # (issue))

@PawelPlesniak (Collaborator, Author)

@jamesturner246 @Aurashk @miruuna this is the PR to clarify what tests should be performed before PRs are merged. It updates the PR template so the testing requirements are clearer; this is something I would value feedback on. The tests that should be performed are listed here (a rough sketch of the corresponding commands is at the end of this comment):

  • unit tests with `pytest`
  • regression tests with `minimal_system_quick_test`
  • integration tests with `daqsystemtest_integtest_bundle`, for anyone with access to either the NP0x cluster or CVMFS

Looking forward to feedback from you
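
For orientation, here is a minimal sketch of what these layers look like as commands, assuming an already configured dunedaq work area. Only the `pytest` and `pre-commit` invocations are taken from the PR template; the exact way to launch `minimal_system_quick_test` and `daqsystemtest_integtest_bundle` depends on your setup, so treat those parts as placeholders and consult the daqsystemtest instructions.

```bash
# Sketch only: assumes a configured dunedaq work area (AlmaLinux 9 or a machine with CVMFS).

# 1. Unit tests and pre-commit hooks (from the PR template's key checklist)
python -m pytest
pre-commit run --all-files

# 2. Regression tests: minimal_system_quick_test
#    Launch per the daqsystemtest instructions for your work area.

# 3. Integration tests: daqsystemtest_integtest_bundle
#    Requires the NP0x cluster or a machine with CVMFS; expect roughly 30 minutes.
```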

@Aurashk (Contributor) commented Nov 21, 2025

> @jamesturner246 @Aurashk @miruuna this is the PR to clarify what tests should be performed before PRs are merged. It updates the PR template so the testing requirements are clearer; this is something I would value feedback on. The tests that should be performed are:
>
> * unit tests with `pytest`
> * regression tests with `minimal_system_quick_test`
> * integration tests with `daqsystemtest_integtest_bundle`, for anyone with access to either the NP0x cluster or CVMFS
>
> Looking forward to feedback from you

Thanks Pawel, that's very helpful.
Just wondering about the integration tests: is it enough to run daqsystemtest_integtest_bundle on any machine with CVMFS active, so there is no requirement to test on specific hardware?

@PawelPlesniak (Collaborator, Author)

There are no hardware restrictions, but this code has been written to run on AlmaLinux 9. If I remember correctly, @jamesturner246 got CVMFS set up on his laptop and should be able to run these tests locally.

@miruuna (Contributor) commented Nov 24, 2025

This looks good and it will help standardize the workflow. However, running daqsystemtest_integtest_bundle on my machine (Ubuntu 24.04.3 LTS) always seems to take a very long time (>30 min). I am not sure whether that is down to my machine being slow, but it would definitely slow down the workflow quite a bit in my case.

@PawelPlesniak (Collaborator, Author) commented Nov 24, 2025

> This looks good and it will help standardize the workflow. However, running daqsystemtest_integtest_bundle on my machine (Ubuntu 24.04.3 LTS) always seems to take a very long time (>30 min). I am not sure whether that is down to my machine being slow, but it would definitely slow down the workflow quite a bit in my case.

Thanks Miruna. Yes, this test is expected to take a long time, but it should only have to be run when major changes are made to the core codebase, prior to merging a PR. I will make this clearer in the testing requirements list.

@bieryAtFnal (Contributor)

In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.

  1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page [here](https://github.com/DUNE-DAQ/daqconf/wiki/Setting-up-a-fddaq%E2%80%90v5.5.0-development-area) in section 1.vi.
  2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in `daqsystemtest`. These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test. If such checking should be a priority, I will work on it - please let me know. (For reference, somewhat recently we added such checking to the regression tests in the `dfmodules` package. Now, the tests in that package will skip tests that require more resources than the current computer has available. This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.) A rough sketch of this kind of pre-flight check is shown below.
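
For illustration, a pre-flight check along these lines could gate a regression test on the resources of the current host. This is only a sketch with made-up thresholds and plain shell commands; it is not the actual checking that was added in the dfmodules package.

```bash
# Hypothetical pre-flight resource check before launching a regression test.
# Thresholds are illustrative only.
MIN_CORES=8
MIN_FREE_MEM_GB=16
MIN_FREE_DISK_GB=50

cores=$(nproc)
free_mem_gb=$(awk '/MemAvailable/ {printf "%d", $2/1024/1024}' /proc/meminfo)
free_disk_gb=$(df --output=avail -BG . | tail -1 | tr -dc '0-9')

if (( cores < MIN_CORES )) || (( free_mem_gb < MIN_FREE_MEM_GB )) || (( free_disk_gb < MIN_FREE_DISK_GB )); then
    echo "Skipping test: insufficient resources (cores=$cores, mem=${free_mem_gb}G, disk=${free_disk_gb}G)"
    exit 0
fi

echo "Resources look sufficient; launching the regression test..."
```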

@Aurashk (Contributor) commented Nov 25, 2025

> In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.
>
> 1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page [here](https://github.com/DUNE-DAQ/daqconf/wiki/Setting-up-a-fddaq%E2%80%90v5.5.0-development-area) in section 1.vi.
>
> 2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in `daqsystemtest`. These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test. If such checking should be a priority, I will work on it - please let me know. (For reference, somewhat recently we added such checking to the regression tests in the `dfmodules` package. Now, the tests in that package will skip tests that require more resources than the current computer has available. This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.)

Thanks Kurt, that's very helpful; it's possible that point 2 explains some test failures I was seeing locally the last time I tried. Without these checks for computational resources, is it totally unpredictable what will happen in the tests, or is there a common point of failure? Also, one more general thing that might be useful to know: what makes the tests run long? Is it a timed simulation of everything working meaningfully together, or is it doing a lot of computational work?

@Aurashk (Contributor) commented Nov 25, 2025

Also, another thing came to mind. What's the situation with this MSQT test in the CI: https://github.com/DUNE-DAQ/drunc/actions/workflows/run_mqst.yml? It seems like it was abandoned some time earlier in the year, judging by the Actions runs. Is this something we want to get working again?

@PawelPlesniak (Collaborator, Author)

> In case it is useful, I will contribute a couple of notes regarding where we can run the daqsystemtest regression tests.
>
> 1. The system software packages that are needed on a computer in order to run the DAQ generally are mentioned on the "software area instructions" page [here](https://github.com/DUNE-DAQ/daqconf/wiki/Setting-up-a-fddaq%E2%80%90v5.5.0-development-area) in section 1.vi.
>
> 2. As @PawelPlesniak knows, we don't yet have sufficient-computer-resource checks in all of the existing regression tests, including those in `daqsystemtest`. These checks attempt to ensure that the current computer has enough CPU/memory/free disk to handle a given regression test. If such checking should be a priority, I will work on it - please let me know. (For reference, somewhat recently we added such checking to the regression tests in the `dfmodules` package. Now, the tests in that package will skip tests that require more resources than the current computer has available. This helps to avoid confusing errors when we run a test on a system that is underpowered for the test.)
>
> Thanks Kurt, that's very helpful; it's possible that point 2 explains some test failures I was seeing locally the last time I tried. Without these checks for computational resources, is it totally unpredictable what will happen in the tests, or is there a common point of failure? Also, one more general thing that might be useful to know: what makes the tests run long? Is it a timed simulation of everything working meaningfully together, or is it doing a lot of computational work?

When a session is running on a host with insufficient resources, it will likely throw errors reporting the number of missing/empty data products. The tests take time because we run a variety of configurations with many runs - there are 9 tests, and some of them have multiple configurations. Supposing each test takes about 3 minutes, that gives you the approximate half hour of running time.

@PawelPlesniak (Collaborator, Author)

> Also, another thing came to mind. What's the situation with this MSQT test in the CI: https://github.com/DUNE-DAQ/drunc/actions/workflows/run_mqst.yml? It seems like it was abandoned some time earlier in the year, judging by the Actions runs. Is this something we want to get working again?

This is being held back by the development of the Subprocess process manager, in this PR.

@bieryAtFnal (Contributor)

In principle, the time that each regression test takes to run is dominated by the amount of time spent waiting in each FSM state (e.g. trigger-enabled), which is however much time the writer of the test chose. Of course, if lots of failures happen and/or a process stalls or crashes, run control transitions can take longer than usual (e.g. some of the "stop-run" transitions), and those might add a noticeable amount of extra time.

@jamesturner246 (Contributor) commented Dec 15, 2025

Hi all. As discussed in the last meeting, I think a special cluster account just for testing PRs would be invaluable for this workflow.

Something we could perhaps hook into CI -- e.g. manually (or even automatically, though that may be too noisy) trigger the full integration test suite on the cluster once the PR is marked ready for review.

@PawelPlesniak marked this pull request as a draft on January 14, 2026, 11:02
@PawelPlesniak added the enhancement (New feature or request) label on Jan 22, 2026
@PawelPlesniak (Collaborator, Author)

@jamesturner246 @Aurashk @miruuna @bieryAtFnal
I have attempted to make the testing and review policy as clear as possible. If you could all review it to make sure we're on the same page, we can make the testing less of a bottleneck. If anything is unclear or you would like to suggest a change, please indicate so here. Thanks.
The list of role-based developers is now on the drunc wiki.

@bieryAtFnal (Contributor)

Hi @PawelPlesniak, the updated template looks reasonable to me. I've made a note to myself to revisit the template once the global bundle script is generally available. When I do that, I will update the template to reference the new script (dunedaq_integtest_bundle.sh).

@Aurashk (Contributor) commented Feb 6, 2026

Looking nice and very useful, thanks @PawelPlesniak. I have a couple of small suggestions.

  • Explicitly say here that daqsystemtest_integtest_bundle.sh needs to be run on specific clusters (I think that's the case).
  • Add two checkboxes for announcing major changes (possibly with an example of a major change) on the Slack channel, i.e. 'Does not require discussion with the wider dunedaq' or 'Has been discussed and approved with the wider dunedaq.' I'd put them just before the final review stage, or in the review stage.

@PawelPlesniak (Collaborator, Author)

@emmuhamm, can you give a final review?

@emmuhamm (Contributor) commented Feb 6, 2026

Hi @PawelPlesniak, thanks for this :D. Since it's a Markdown-based PR, I'll just note my thoughts here.


Comments in template

I see a lot of comments in the template itself. Have you considered exposing the template so that it's visible in the docs?

See the template docs.

Nightlies

I think the template should ask the developer to list which nightly was used for testing.

As a reviewer, to test a PR I would normally check out a specific nightly, clone the relevant branch(es), and then run the tests. If the nightly is not specified, the checked-out PR branch may no longer be up to date with the rest of the nightlies.

  • For example, the current PR's head is from 1 Feb, but the rest of the stack is from 14 Feb, so there's a big mismatch.

Naturally, the best thing would be to keep PRs continuously up to date with the current nightly via regular rebases/merges. Actually, would this be worth adding to the template as a 'thing to do'?
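
For example, a minimal sketch of keeping a PR branch current with develop might look like the following (the branch and remote names are taken from this PR and are otherwise assumptions; adapt them when testing against a specific nightly):

```bash
# Sketch only: branch/remote names assumed from this PR.
git fetch origin
git checkout PawelPlesniak/ClearerPRTemplate   # the PR branch under review
git rebase origin/develop                      # or: git merge origin/develop
git push --force-with-lease                    # needed after a rebase rewrites history
```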

Verbosity

Right now the initial checklist is quite verbose. As a developer, I would prefer a clear set of checkboxes to work through for each task (e.g. preparing the PR for review, or reviewing someone else's).

Maybe consider either having a separate docs page with all this info (not personally preferred), or restructuring this section and making use of the "details" functionality in GitHub Flavored Markdown, e.g.:

<details>
<summary>Title of collapsible</summary>
some extra information!
</details>

In particular, the parts I found quite verbose:

  • initial checklist prior to marking this as "ready for review"
  • Notifications (probably best collapsed somewhere)

Linting

Should linting be part of the checklist?

Merge responsibilities

> Once the requested reviewer is satisfied with the PR, the PR can be merged.

It still isn't clear who does the final merging. Is it the reviewer, the assignee, or the author?


Labels

enhancement (New feature or request)
