_make_cross_face_batches: refactor into multiple functions, try matching nodes exactly before going to Gauss-Newton #226

Merged

inducer merged 2 commits into main from node-matching

Jun 17, 2021

Owner

inducer commented Jun 16, 2021

While it doesn't actually solve the problem described in #105 (comment), this should work around the main impact of #105: The DG examples no longer use matvecs to interpolate onto the boundary. This should help speed up larger-scale DG runs somewhat.
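For readers wondering why this matters: when the boundary nodes coincide with a subset of the volume nodes, the interpolation matrix is a row selection, and applying it is equivalent to plain indexing. A minimal sketch of that equivalence follows; the node counts and the names `from_indices`, `interp_mat` and `vol_values` are made up for illustration and are not taken from meshmode.

```python
import numpy as np

rng = np.random.default_rng(0)
nvol_nodes = 6

# Hypothetical: the three face nodes coincide with volume nodes 4, 0, 2.
from_indices = np.array([4, 0, 2])

# The dense interpolation matrix is then a row selection: one 1 per row.
interp_mat = np.zeros((len(from_indices), nvol_nodes))
interp_mat[np.arange(len(from_indices)), from_indices] = 1.0

vol_values = rng.standard_normal(nvol_nodes)

via_matvec = interp_mat @ vol_values     # O(nface * nvol) multiply-adds
via_indexing = vol_values[from_indices]  # O(nface) copies, no arithmetic
```

Both results agree, but the indexed version does no floating-point work, which is presumably where the speedup for larger runs would come from.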

Draft because:

Should merge after #225 (Test that connections that should be permutations actually *are* permutations), specifically 3d2f376 (test_mesh_multiple_groups: Test inhomogeneous polynomial degree), because otherwise the Gauss-Newton code path would likely be untested.
After #225 is pulled into this, should update test_bdry_restriction_is_permutation to remove the xfails. All the tests should then be passing: cbbca40

Post-merge:

Update #105 (Test that connections that should be are actually permutations) as described in #105 (comment): Reproduce two minima in _make_cross_face_batch Gauss-Newton #228.

@majosm Could you review this? I know it'll likely have a little bit of conflict with #204, but _make_cross_face_batches was getting crazy long. Hopefully the conflicts won't be bad: the diff doesn't touch any of the actual functionality, it just shuffles function arguments around.

cc @nchristensen @matthiasdiener

inducer mentioned this pull request

Test that connections that should be are actually permutations #105

Closed

inducer force-pushed the node-matching branch from a37e3e0 to cbbca40 Compare

June 17, 2021 00:03

inducer marked this pull request as ready for review

June 17, 2021 00:03

majosm approved these changes

Collaborator

majosm left a comment

Looks good to me. I threw in a few optional suggestions.

meshmode/discretization/connection/opposite_face.py

+                      src_bdry_nodes,
+                      src_grp, tol):
+                  ambient_dim, nelements, ntgt_unit_nodes = tgt_bdry_nodes.shape

Collaborator

majosm Jun 16, 2021

Maybe check whether ntgt_unit_nodes == nsrc_unit_nodes first before going on to the pairwise comparison stuff?

Owner Author

inducer Jun 17, 2021

I like the thinking, but the criterion you propose is actually too restrictive. The comparison is also supposed to work in the (ideally common, see #225) case of face restrictions, where the face nodes are a subset of the volume nodes.
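To illustrate the subset case, here is a small sketch in which the target (face) nodes are a strict subset of the source (volume) nodes, so the node counts differ, yet every target node still has exactly one matching source node. The 1D node coordinates are invented for the example; only the array layout `(ambient_dim, nelements, nnodes)` follows the diff.

```python
import numpy as np

# Two 1D elements with four (hypothetical) volume nodes each.
src_nodes = np.array([[[0.0, 0.25, 0.75, 1.0],
                       [1.0, 1.25, 1.75, 2.0]]])  # shape (1, 2, 4)

# "Face" nodes: the two endpoints of each element, a subset of the above.
tgt_nodes = src_nodes[:, :, [0, 3]]               # shape (1, 2, 2)

tol = 1e-12
dist = np.linalg.norm(
        tgt_nodes[:, :, :, np.newaxis] - src_nodes[:, :, np.newaxis, :],
        axis=0)                                   # shape (2, 2, 4)
is_close = dist < tol

# Exactly one source match per target node, despite differing node counts.
every_tgt_matched = bool(np.all(is_close.sum(axis=-1) == 1))
same_node_count = tgt_nodes.shape[-1] == src_nodes.shape[-1]
```

Here `every_tgt_matched` is true while `same_node_count` is false, so an up-front equality check on the node counts would skip a case the matching handles fine.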

meshmode/discretization/connection/opposite_face.py

Comment on lines +100 to +101

		dist_vecs = (tgt_bdry_nodes.reshape(ambient_dim, nelements, -1, 1)
		- src_bdry_nodes.reshape(ambient_dim, nelements, 1, -1))

Collaborator

majosm Jun 16, 2021

I tried to think of a faster way to do this, but I wasn't able to come up with anything that didn't involve either SpatialBinaryTreeBucket (which might not be faster most of the time) or rand(). 🙂

Owner Author

inducer Jun 17, 2021

Yeah... it is clearly an O(nunit_nodes^2) algorithm, my hope is just that numpy crushes it. Anything with a Python loop will be way slower. I am a bit worried about the size of that temporary. If it's too big, we can always do a Python loop for one of those axes.
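A rough sketch of that fallback, for reference: the fully broadcast version builds a temporary of shape `(ambient_dim, nelements, ntgt, nsrc)`, while looping over the target-node axis keeps each temporary at `(ambient_dim, nelements, nsrc)`. The function names and the choice of which axis to loop over are assumptions for illustration, not what the PR does.

```python
import numpy as np


def is_close_full(tgt, src, tol):
    # tgt: (ambient_dim, nelements, ntgt); src: (ambient_dim, nelements, nsrc)
    # Temporary has shape (ambient_dim, nelements, ntgt, nsrc).
    dist_vecs = tgt[:, :, :, np.newaxis] - src[:, :, np.newaxis, :]
    return np.linalg.norm(dist_vecs, axis=0) < tol


def is_close_looped(tgt, src, tol):
    # Same result, but each temporary only has shape
    # (ambient_dim, nelements, nsrc): one target node at a time.
    _, nelements, ntgt = tgt.shape
    nsrc = src.shape[-1]
    result = np.empty((nelements, ntgt, nsrc), dtype=bool)
    for itgt in range(ntgt):
        dist_vecs = tgt[:, :, itgt, np.newaxis] - src
        result[:, itgt, :] = np.linalg.norm(dist_vecs, axis=0) < tol
    return result


rng = np.random.default_rng(0)
src = rng.standard_normal((3, 5, 7))
tgt = src[:, :, rng.permutation(7)]  # targets are shuffled copies of sources

full = is_close_full(tgt, src, 1e-12)
looped = is_close_looped(tgt, src, 1e-12)
```

The loop runs only `ntgt` times, so the Python overhead should stay small relative to the vectorized work inside each iteration.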

meshmode/discretization/connection/opposite_face.py Outdated

+                          - src_bdry_nodes.reshape(ambient_dim, nelements, 1, -1))
+                  # shape: (nelements, num_tgt_nodes, num_source_nodes)
+                  is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol

Collaborator

majosm Jun 16, 2021

Suggested change

-                 is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol
+                 is_close = np.sum(dist_vecs**2, axis=0) < tol**2

? 🤷‍♂️

Owner Author

inducer Jun 17, 2021

74e33b9. Used la.norm after all... more descriptive.
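For comparison, the three formulations discussed in this thread can be checked against each other on made-up data. This sketch assumes `la` refers to `numpy.linalg` and that `tol` is nonnegative (otherwise squaring it would change the meaning of the comparison).

```python
import numpy as np
import numpy.linalg as la

rng = np.random.default_rng(1)
# Made-up distance vectors, shape (ambient_dim, nelements, ntgt, nsrc).
dist_vecs = rng.uniform(-1, 1, size=(3, 4, 5, 5))
tol = 0.5

via_sqrt = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol  # original
via_squares = np.sum(dist_vecs**2, axis=0) < tol**2     # suggested, no sqrt
via_norm = la.norm(dist_vecs, axis=0) < tol             # 2-norm along axis 0
```

All three produce the same mask away from exact ties at the tolerance; the squared variant saves a square root per entry, and the `la.norm` variant states the intent most directly.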

meshmode/discretization/connection/opposite_face.py Outdated

Comment on lines 111 to 113

+                  idx = np.empty_like(is_close, dtype=np.int32)
+                  idx[:] = np.arange(src_grp.nunit_dofs).reshape(1, 1, -1)
+                  source_indices = idx[is_close].reshape(nelements, ntgt_unit_nodes)

Collaborator

majosm Jun 17, 2021

Suggested change

-                 idx = np.empty_like(is_close, dtype=np.int32)
-                 idx[:] = np.arange(src_grp.nunit_dofs).reshape(1, 1, -1)
-                 source_indices = idx[is_close].reshape(nelements, ntgt_unit_nodes)
+                 source_indices = np.where(is_close)[-1].reshape(nelements, ntgt_unit_nodes)

(I think?)

Owner Author

inducer Jun 17, 2021

Whoa, thanks! You can probably tell that my solution took me a while to construct. This is much better. 74e33b9
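A quick check that the two variants agree, using a small hand-built `is_close` mask. The per-element permutations in `perm` are invented for the example, and `nsrc` stands in for `src_grp.nunit_dofs`.

```python
import numpy as np

nelements, ntgt, nsrc = 2, 3, 3

# Hypothetical matches: target node i of element e pairs with source perm[e, i].
perm = np.array([[2, 0, 1],
                 [1, 2, 0]])
is_close = np.zeros((nelements, ntgt, nsrc), dtype=bool)
is_close[np.arange(nelements)[:, np.newaxis], np.arange(ntgt), perm] = True

# Original approach: broadcast source indices into a full array, then mask.
idx = np.empty_like(is_close, dtype=np.int32)
idx[:] = np.arange(nsrc).reshape(1, 1, -1)
via_idx = idx[is_close].reshape(nelements, ntgt)

# Suggested approach: np.where returns one index array per axis; the last
# one holds the source-node index of every True entry, in row-major order.
via_where = np.where(is_close)[-1].reshape(nelements, ntgt)
```

Both variants rely on there being exactly one `True` per (element, target node) pair; with zero or several matches, the `reshape` would fail or misalign.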

meshmode/discretization/connection/opposite_face.py Outdated

+                  matched_src_bdry_nodes = src_bdry_nodes[
+                          :, np.arange(nelements).reshape(-1, 1), source_indices]
+                  dist_vecs = tgt_bdry_nodes - matched_src_bdry_nodes
+                  is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol

Collaborator

majosm Jun 17, 2021

Suggested change

-                 is_close = np.sqrt(np.sum(dist_vecs**2, axis=0)) < tol
+                 is_close = np.sum(dist_vecs**2, axis=0) < tol**2

?

Owner Author

inducer Jun 17, 2021

74e33b9. Used la.norm after all... more descriptive.
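The gather in the hunk above uses NumPy advanced indexing: the `(nelements, 1)` element index broadcasts against the `(nelements, ntgt)` array of source indices, so each element picks its own matched nodes. A small sketch with random data, checked against an explicit per-element loop (the sizes and the random permutations are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
ambient_dim, nelements, nnodes = 2, 3, 4
src_bdry_nodes = rng.standard_normal((ambient_dim, nelements, nnodes))

# Hypothetical per-element matches, one permutation per element.
source_indices = np.array(
        [rng.permutation(nnodes) for _ in range(nelements)])

# Vectorized gather, as in the diff: result has shape
# (ambient_dim, nelements, nnodes).
matched_src_bdry_nodes = src_bdry_nodes[
        :, np.arange(nelements).reshape(-1, 1), source_indices]

# Reference: the same gather written as a loop over elements.
expected = np.empty_like(src_bdry_nodes)
for iel in range(nelements):
    expected[:, iel, :] = src_bdry_nodes[:, iel, source_indices[iel]]

# The follow-up check from the diff: distances to the targets are below tol.
tol = 1e-12
all_matched = bool(np.all(
    np.linalg.norm(expected - matched_src_bdry_nodes, axis=0) < tol))
```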

inducer added 2 commits

June 16, 2021 21:15


_make_cross_face_batches: refactor into multiple functions, try matching nodes exactly before going to Gauss-Newton

74e33b9

Remove conditionals from test_bdry_restriction_is_permutation

8b191a9

inducer force-pushed the node-matching branch from cbbca40 to 8b191a9 Compare

June 17, 2021 02:16

Owner Author

inducer commented Jun 17, 2021

@majosm Thanks for taking a look!

inducer enabled auto-merge (rebase)

June 17, 2021 02:18

inducer merged commit e662ec0 into main

inducer deleted the node-matching branch

June 17, 2021 02:54
