Skip to content

Conversation

@tlm
Copy link
Member

@tlm tlm commented Dec 10, 2025

This commit re-implements storage pool creation from the facade down to the state layer. An attempt was made to sort of do this at the start of the domain but it left a lot to be desired on the implementation front.

The domain it self still has several rough edges but they will get fixed in follow up PR's as other facade calls are wired in. The below is the set of notable changes in this PR and why they were performed.

Storage Pool Related Changes

Provider Type type

Introduced a new type to the storage domain for representing the type of a provider. Previous to this our code was referring to the internal package type offering similar functionality. Long term the internal type will go away but we don't need to or want to have the internal package sprawl out. This is now a domain concern and so it makes sense for the provider type to be owned and controlled by the domain.

Provider Type type Validation

We rely on the client passing us the provider type value in certain operations. For example when creating storage pools, until now we have no exercised the value we have been given. The client could pass us an excessively long string or Unicode code points that we don't support. In order to be more sane, fail early and stamp out denial of service issues a new validator was introduced in this PR to check that the value falls within the certain permissible boundaries that we expect. This now gives confidence to the code that the memory occupied by the client is safe for further processing.

Return Params Error For Unauthorised Out of The Facade

When asserting if a caller to the facade is authorised for a given storage action we were directly returning the error received from the authorizer out of the facade func. This then relied on the facade catch all handler to finally deal with the error and return a correct params based error to the client.

The problem with this is you can't reasonably write good tests for the facade to assert the authorisation logic that is has. Instead of relying on how the facades get wired up with can be fraught with danger on change the facade now returns the correct params error for unauthorized making it clear what has gone wrong to the caller.

This allows for testing of the facade to assert permission access over time and make sure that internal changes to the facade do not break authz.

New Error Type For Incorrect Storage Provider Attribute Values

I have introduced a new error type StoragePoolAttributeInvalid to the storage domain so that when creating storage pools and validating the supplied config with the storage provider each component can safely communicate upwards what about the config is invalid.

This error type isn't in use in this PR except outside of tests. It required one more pass through to update the storage providers themselves. But in affect what this allows is for the validation of storage config to have errors passed up to the facade where they can be written in a user facing way to convey the problem. Without this we always going to be writing user facing errors in environs with no context on how we ended up here.

Removal Of Storage Pool Attribute Type

The initial implementation of the domain had a specialised type to represent the config attributes of a storage pool. The reality was this type wasn't adding any value to the problem and a map[string]any is more then sufficient to represent the raw information.

Storage Pool UUID creation

Storage pool UUID creation now happens in the service layer and not that of the state layer. A new internal type representing the create args for a storage pool was made to help pass this information along. The service func now also returns the UUID of the new pool so we can move away from using the StoragePool name as the identifier.

No Upsert Operation On Storage Pool Creation

This PR removes the potentially dangerous upsert operation when create storage pools. Storage pool creation is now just an insert with a check on UUID and storage pool name for uniqueness.

Storage Pool Attributes Batch Inserted

Storage pool attributes are now batch inserted with the transaction instead of being one by one potentially leading to a costly operation.

Drive By Changes

To do this work I tried to move the storage domain towards a better direction and touch up some of the initial implementations problems it had. These include:

  • Removed excessive validation of storage pool data out of the state layer. The state layer was trying to do validation on user input. This MUST be happening at the service level and shouldn't make it this far.
  • Domain errors for storage where lacking a consistent pattern. For example referring to storage pools as pools when the rest of the modelling is using storage pool. Removing Error suffix from the end of error consts as this was already implied based on the import package.
  • Added tests to the storage facade for asserting permissions of already implemented facade methods. What we had already was not covered. New tests are more exhaustive and reflect the contracts the facade maintains with a client.
  • Removed prescriptive setting up of state from storage facade base test. We don't want tests that establish a certain state which all sub tests inherit. This leads to confusing tests relying on behaviour that is easily missed. The test should tell a story about the pre and post.
  • Renamed the pool list test suite for the storage facade to poolListSuite as this follows the pattern of having different suites in the facade for testing isolated and related components together instead of an all in test suite that is hard to grok the larger it gets.

Future PR

  • At the moment a storage provider can and will add extra attributes to a storage pool to support operations. These extra attributes are dynamic and not tracked by the controller. To avoid changes in behaviour and be repeatable we need to capture these defaults in the storage pool on creation.

Checklist

  • Code style: imports ordered, good names, simple structure, etc
  • Comments saying why design decisions were made
  • Go unit tests, with comments saying what you're testing
  • [ ] Integration tests, with comments saying what you're testing
  • [ ] doc.go added or updated in changed packages

QA steps

There is a bit to cover for this PR......

QA 1

Demonstrate that Juju 3.6 has some gremlins that needed to be addressed specifically by this PR. That is the

  1. Bootstrap a Kubernetes controller. Provider substrate doesn't matter for this test.
  2. Add a new model
  3. Add a new storage pool to the model with juju create-storage-pool 'foo?00' kubernetes

This demonstrates with juju storage-pools that we ended up with a bug in our validation regex for storage pool names that allow '?' characters. Based on historical digging of PR's this should have never been the case.

We will have to allow this bad pattern during model migration (handled in future PR). But as of this PR we won't be allowing this anymore for new storage pools. Potentially because we now have RI in the storage model Juju 4.0 could support renaming the storage pool as it is just a display component. We will have to investigate what that may mean for resources deployed in the cloud under the storage pool and if changing the name would affect those.

In Juju4.0

  1. Bootstrap a new controller (any substrate will do)
  2. Add a new model
  3. Add a new storage pool to the model with juju create-storage-pool 'foo?00' kubernetes

We expect to see this now fail with something along the lines of ERROR storage pool name is either not set or is not a valid name.

QA 2

Create a new storage pool in the model and confirm the database representation is correct. We also want to make sure that the origin_id is set 0 meaning user defined.

  1. Bootstrap a new controller, add a model
  2. Create a new storage pool juju create-storage-pool foo tmpfs
  3. juju ssh -m controller 0
  4. juju_db_repl
  5. .switch model-test
  6. select * from storage_pool;

QA 3

Bootstrap a new controller with defined storage pools

  1. juju bootstrap localhost --storage-pool=name=test --storage-pool=type=tmpfs
  2. juju switch controller
  3. juju stoage-pools

Confirm the bootstrapped storage pool shows up in the list

Documentation changes

N/A

Links

Jira card: JUJU-8890

@tlm tlm added the do not merge Even if a PR has been approved, do not merge the PR! label Dec 10, 2025
@jujubot jujubot added the 4.0 label Dec 10, 2025
@tlm tlm force-pushed the juju-storage-pool-facade branch from 26d4534 to fc7945b Compare December 11, 2025 05:36
@tlm
Copy link
Member Author

tlm commented Dec 11, 2025

/build

tlm added 22 commits December 15, 2025 04:35
This commit introduces a new type for the storage domain to represent a
storage provider type. There exists a value in the internal package but
we are looking to get off of it.

The value contained in the internal package also has strings attached
that we don't want to take on.

This was done as part of getting creating storage pools working again
via the facade.
This commit integrates the provider type value of the domain and removes
the type for storage pool attributes from the create storage pool func
on the storage domain.

This was done to bring the interface inline with out expectations of
service funcs. The custom type for attributes as not adding and value
and just masking the poor design of the attribute system.
This commit introduces a new storage pool name validator for the storage
domain. The new validator takes what Juju had already been using in the
internal package and modifies it to be both a bit more stricter and also
more liberal in some respects.

The old validator that we had introduced a copy and past bug from other
regex that ended up allowing question mark characters in storage pool
names. This was never the intented purpose. The old validator also had
no upper bounds on the maximum length of a storage pool name.

Because of this unbound property is possible to have a denial of service
path for storage pool names that could overwhelm parts of the system. To
be strict and secure we need to establish some sort of upper bound. It
was decided to use 128 runes.

The validator also now supports unicode letter classes instead of just
ASCII. This allows Juju to start moving to a more language agnostic
implementation supporting all types of users.

Because a change in validation of storage pool names has been introduced
it now makes us ask how will we support existing storage pools that get
model migrated into the controller. The simple answer is we will reject
them through validation and force the user to update the names in the
model before migrating.

This is the preferred approach as we don't need to carry transition
logic forever and gets the problem dealth with as soon as possible.
This commit introduces a new internal type to the storage domain for
creating new storage pools. The new type will allow the removal of the
package level one and supports the service being able to create the new
uuid instead of the state layer.
The storage domain had types defined for the storage pool origin. I have
moved these types into the storage pool file so they are grouped
together and renamed the test file that existed.

This should make the grouping of values more logicial for the developer
to understand.
This commit adds the storage pool origin to the create args of a new
storage pool.
Introduces a new func on storage provider type for validating provider
type values. This allows for user input values to be asserted for
correctness before use.

Includes tests to assert the logic of valid for a provider type.
Moves the origin value for storage pools to the storage pool file at the
root of the domain.
This commit re-instates storage pool creation via the domain service.
The new implementation is now safe for the values supplied and performs
the proper checks required.

I have taken the oppurtunity to also bolster up the contracts on offer
and make the func as reliable as possible. While here the service also
now both generates the storage pool uuid and returns it out of the func.

Test for storage pool creation was also done taking the oppurtuntiy to
clean up the existing testing that existed.
This commit implements the state logic for creating new storage pool and
the contract required by the serivce layer.

While here I had to fix a bunch of other tests for parts of storage pool
that were either wrong or poorly implemented. I have decided to just
ignore the rest of the tests for the moment and comment them out.

They need to be fixed and just add to the bulk of this commit. Most of
the logic they have as well is wrong.
Updates the model migration package for the storage domain to use the
new create storage pool interface. This code is not being used today and
will change when model migration is done. This commit is just about
passing the tests for the moment until the real work is performed.
Updates the bootstrap worker to use the new service interface for
creating storage pools.
Removes the use of internal pool name validation from storage domain
services.
Storage pool errors did not have a uniform style and also included the
word error itself when implied by the package name. This commit fixes up
and makes consistent domain storage errors for storage pools.
Makes the importing of domain storage packages consistent across storage
pool files.
This commit re-implements creating storage pools in the apiserver facade
again. I made the decision to split out storage pool related facade
methods into their own seperate file. This us allows us to think about
the client storage facade in terms of the individual components that
comprise storage.

I also cleaned up the testing in this package by making suites for the
individual parts of the facade and not relying on a contrived default
state from the base suite.
This change breaks apart the authz checks of the storage facade so that
error messages from auth are not leaked outwards and also so that a
correct params error gets returned.

This allows correct testing of the facade for authz
This commit adds tests asserting the verious permission compositions we
expect to handle when creating storage pools. It shows that a model user
with admin or write permissions is required.
Adds missing tests to the client storage facade for checking permissions
when listing storage pools.
tlm added 6 commits December 15, 2025 04:37
Adds missing authz tests for detatch storage.
Storage pool name regex validator required a minimum of two runes when
it should have been one. This was caught by test coverage.
This commit removes const keys for defining storage pools at bootstrap
time to the core package. They don't belong in domain as they are
outside if it's scope. Ideally they could also be pushed further down
into the cli.
This commit cleans up the bootstrap worker for creating storage pools.
Have removed the use of internal storage for representing storage pool
and introduced a new type to the package. Removed validation logic that
is done by the domain and was just double handling.
Removes unused default storage pool code from the storage domain.
@tlm tlm force-pushed the juju-storage-pool-facade branch from b12f700 to 1c3650e Compare December 15, 2025 04:37
Updates the error value used for storage to match the name change after
rebasing on latest.
@adisazhar123 adisazhar123 self-requested a review December 16, 2025 04:06
@tlm tlm removed the do not merge Even if a PR has been approved, do not merge the PR! label Dec 16, 2025
@manadart
Copy link
Member

This exceeds the size that would anticipate effective review. Any way to break it up?

@tlm tlm mentioned this pull request Dec 17, 2025
3 tasks
@tlm tlm force-pushed the juju-storage-pool-facade branch from 84b0a8e to a37559a Compare December 17, 2025 05:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants