Conversation
majosm
left a comment
There was a problem hiding this comment.
Looks good to me. I threw in a few optional suggestions.
| src_bdry_nodes, | ||
| src_grp, tol): | ||
| ambient_dim, nelements, ntgt_unit_nodes = tgt_bdry_nodes.shape | ||
|
|
There was a problem hiding this comment.
Maybe check whether ntgt_unit_nodes == nsrc_unit_nodes first before going on to the pairwise comparison stuff?
There was a problem hiding this comment.
I like the thinking, but the criterion your propose is actually too restrictive. The comparison thing is also supposed to work in the (ideally, common, see #225) case of face restrictions where the face nodes are a subset of the volume nodes.
| dist_vecs = (tgt_bdry_nodes.reshape(ambient_dim, nelements, -1, 1) | ||
| - src_bdry_nodes.reshape(ambient_dim, nelements, 1, -1)) |
There was a problem hiding this comment.
I tried to think of a faster way to do this, but I wasn't able to come up with anything that didn't involve either SpatialBinaryTreeBucket (which might not be faster most of the time) or rand(). 🙂
There was a problem hiding this comment.
Yeah... it is clearly an O(nunit_nodes^2) algorithm, my hope is just that numpy crushes it. Anything with a Python loop will be way slower. I am a bit worried about the size of that temporary. If it's too big,we can always do a Python loop for one of those axes.
| - src_bdry_nodes.reshape(ambient_dim, nelements, 1, -1)) | ||
|
|
||
| # shape: (nelements, num_tgt_nodes, num_source_nodes) | ||
| is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol |
There was a problem hiding this comment.
| is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol | |
| is_close = np.sum(dist_vecs**2, axis=0) < tol**2 |
? 🤷♂️
There was a problem hiding this comment.
74e33b9. Used la.norm after all... more descriptive.
| idx = np.empty_like(is_close, dtype=np.int32) | ||
| idx[:] = np.arange(src_grp.nunit_dofs).reshape(1, 1, -1) | ||
| source_indices = idx[is_close].reshape(nelements, ntgt_unit_nodes) |
There was a problem hiding this comment.
| idx = np.empty_like(is_close, dtype=np.int32) | |
| idx[:] = np.arange(src_grp.nunit_dofs).reshape(1, 1, -1) | |
| source_indices = idx[is_close].reshape(nelements, ntgt_unit_nodes) | |
| source_indices = np.where(is_close)[-1].reshape(nelements, ntgt_unit_nodes) |
(I think?)
There was a problem hiding this comment.
Whoa, thanks! You can probably tell that my solution took me a while to construct. This is much better. 74e33b9
| matched_src_bdry_nodes = src_bdry_nodes[ | ||
| :, np.arange(nelements).reshape(-1, 1), source_indices] | ||
| dist_vecs = tgt_bdry_nodes - matched_src_bdry_nodes | ||
| is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol |
There was a problem hiding this comment.
| is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol | |
| is_close = np.sum(dist_vecs**2, axis=0) < tol**2 |
?
There was a problem hiding this comment.
74e33b9. Used la.norm after all... more descriptive.
…ing nodes exactly before going to Gauss-Newton
|
@majosm Thanks for taking a look! |
While it doesn't actually solve the problem described in #105 (comment), this should work around the main impact of #105: The DG examples no longer use matvecs to interpolate onto the boundary. This should help speed up larger-scale DG runs somewhat.
Draft because:
test_mesh_multiple_groups: Test inhomogeneous polynomial degree), because otherwise the Gauss-Newton code path would likely be untested.test_bdry_restriction_is_permutationto remove thexfails. All the tests should then be passing: cbbca40Post-merge:
@majosm Could you review this? I know it'll likely have a little bit of conflict with #204, but
_make_cross_face_batcheswas getting crazy long. Hopefully the conflicts won't be bad: the diff doesn't touch any of the actual functionality, it just shuffles function arguments around.cc @nchristensen @matthiasdiener