- nx_nodes
C++ Type: unsigned int
Controllable: No
Description: Number of compute nodes in the X direction
- nx_procs
C++ Type: unsigned int
Controllable: No
Description: Number of processors in the X direction within each node
HierarchicalGridPartitioner
Partitions a mesh into sub-partitions for each computational node then into partitions within that node. All partitions are made using a regular grid.
Description
The HierarchicalGridPartitioner is a two-level partitioner similar to GridPartitioner. The idea is to first partition the domain with a coarse grid that has one cell per computational node (the nodes in your cluster that you are going to use), then to use a finer-grained grid within each of those partitions to partition for each processor within that computational node.
This type of scheme keeps neighboring partitions on the same node wherever possible, which reduces off-node (network) communication during large simulations.
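The two-level idea can be sketched in a few lines of Python. This is only an illustration, not the MOOSE or libMesh implementation: the function name `hierarchical_rank`, the assumption that the mesh dimensions divide evenly by the grid sizes, and the assumption that ranks are numbered contiguously within each node are all hypothetical choices made for the sketch.

```python
def hierarchical_rank(i, j, nx_elem, ny_elem, nx_nodes, ny_nodes, nx_procs, ny_procs):
    """Return (node, rank) for element (i, j) of an nx_elem x ny_elem structured mesh.

    Assumes every dimension divides evenly (illustration only).
    """
    # Coarse level: find the compute node whose block contains the element.
    node_w = nx_elem // nx_nodes
    node_h = ny_elem // ny_nodes
    node = (j // node_h) * nx_nodes + (i // node_w)

    # Fine level: find the processor block inside that node's block.
    proc_w = node_w // nx_procs
    proc_h = node_h // ny_procs
    local = ((j % node_h) // proc_h) * nx_procs + (i % node_w) // proc_w

    # Assumed rank layout: all ranks of one node are numbered contiguously.
    rank = node * (nx_procs * ny_procs) + local
    return node, rank


# 128x128 mesh, 2x2 nodes, 4x4 processors per node
print(hierarchical_rank(64, 0, 128, 128, 2, 2, 4, 4))  # (1, 16)
```

With these assumptions every one of the 64 ranks owns a 16x16 block of elements, and the 16 ranks of a node together cover one 64x64 quadrant of the mesh.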
Example
An example is the best way to explain what's going on. The mesh in Figure 1 is the mesh we want to partition. It has 128x128 elements (16,384 total). We're going to run on a cluster using 4 computational nodes, each of which has 16 processors (64 processors total).
As shown in Figure 2, GridPartitioner gives a decent partitioning of this mesh when we specify the partitioning grid to be 8x8.
Now, the "problem" with this partitioning is that it will do quite a bit of off-node communication. Consider what's on the second node (the third and fourth rows from the bottom). In total there will be 2x128=256 element faces with off-node neighbors (everything above and below those two rows). In this case that isn't even that bad. If we were using even more processors (say 128), each node would have a long "strip" of partitions on it, where each partition would communicate both above and below it.
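The 256-face figure can be checked with a short brute-force count. The sketch below assumes, for illustration, that the 8x8 grid numbers its partitions row by row from the bottom and that each group of 16 consecutive ranks lands on the same node; `flat_node` and `off_node_faces` are hypothetical helper names, not MOOSE functions.

```python
N = 128              # elements per side of the mesh
GRID = 8             # 8x8 partitioning grid
PROCS_PER_NODE = 16  # 4 nodes x 16 processors = 64 ranks


def flat_node(i, j):
    """Node owning element (i, j) under the flat 8x8 grid (assumed row-major ranks)."""
    part = N // GRID  # each partition is 16x16 elements
    rank = (j // part) * GRID + (i // part)
    return rank // PROCS_PER_NODE


def off_node_faces(node):
    """Count element faces on `node` whose neighbor element lives on another node."""
    count = 0
    for j in range(N):
        for i in range(N):
            if flat_node(i, j) != node:
                continue
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                # Faces on the domain boundary have no neighbor at all.
                if 0 <= ni < N and 0 <= nj < N and flat_node(ni, nj) != node:
                    count += 1
    return count


print(off_node_faces(1))  # second node: 256
```

Under these assumptions the first and last nodes touch the domain boundary on one side and so have 128 such faces each, while the two middle nodes have 256.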
To fix this we can use the HierarchicalGridPartitioner with the syntax shown in Listing 1 (the listing is a small test input; for this example the values would be nx_procs = 4 and ny_procs = 4 on the 128x128 mesh). By telling it that we are going to have a 2x2 arrangement of nodes and then a 4x4 arrangement of processors on each node, we get the processor assignment shown in Figure 3. Now each node only has 128 element faces with off-node neighbors. Even better, there are a number of partitions that will not do any off-node communication at all (including the four interior ones in the middle of each node). Even in this small example, better partitioning has cut the off-node communication for a node like the second one in half, from 256 faces to 128. In a much larger run with larger numbers of processors per computational node (say 36, 40, 64 or even 128, which we'll see soon) this can make an even bigger difference.
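The same kind of brute-force check works for the hierarchical layout. Again this is a sketch under assumptions, not the partitioner's actual code: `owner` is a hypothetical helper that places each node on one 64x64 quadrant and each processor on one 16x16 block inside it.

```python
N = 128
NX_NODES = NY_NODES = 2
NX_PROCS = NY_PROCS = 4
NODE_SIDE = N // NX_NODES          # 64 elements per node block side
PROC_SIDE = NODE_SIDE // NX_PROCS  # 16 elements per processor block side


def owner(i, j):
    """Return (node, local processor index) owning element (i, j)."""
    node = (j // NODE_SIDE) * NX_NODES + (i // NODE_SIDE)
    local = ((j % NODE_SIDE) // PROC_SIDE) * NX_PROCS + (i % NODE_SIDE) // PROC_SIDE
    return node, local


faces_per_node = {}
talkative = set()  # partitions with at least one off-node face neighbor
for j in range(N):
    for i in range(N):
        node, local = owner(i, j)
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < N and 0 <= nj < N and owner(ni, nj)[0] != node:
                faces_per_node[node] = faces_per_node.get(node, 0) + 1
                talkative.add((node, local))

# Partitions per node that never talk to another node.
quiet_per_node = {
    node: NX_PROCS * NY_PROCS - sum(1 for n, _ in talkative if n == node)
    for node in range(NX_NODES * NY_NODES)
}

print(faces_per_node)  # 128 off-node faces on every node
print(quiet_per_node)  # 9 partitions per node with no off-node face neighbors
```

When only face neighbors are counted, the partitions without off-node neighbors are the four interior ones plus those that sit along the outer domain boundary away from the other nodes, which is why the count here comes out at 9 per node rather than 4.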
[Mesh]
  type = GeneratedMesh
  dim = 2
  nx = 8
  ny = 8
  [Partitioner]
    type = HierarchicalGridPartitioner
    nx_nodes = 2
    ny_nodes = 2
    nx_procs = 2
    ny_procs = 2
  []
[]
(../../../SoftwareDownloads/moose/test/tests/partitioners/hierarchical_grid_partitioner/hierarchical_grid_partitioner.i)
Input Parameters
- ny_nodes
Default: 0
C++ Type: unsigned int
Controllable: No
Description: Number of compute nodes in the Y direction
- ny_procs
Default: 0
C++ Type: unsigned int
Controllable: No
Description: Number of processors in the Y direction within each node
- nz_nodes
Default: 0
C++ Type: unsigned int
Controllable: No
Description: Number of compute nodes in the Z direction
- nz_procs
Default: 0
C++ Type: unsigned int
Controllable: No
Description: Number of processors in the Z direction within each node
Optional Parameters
- control_tags
C++ Type: std::vector<std::string>
Controllable: No
Description: Adds user-defined labels for accessing object parameters via control logic.
- enable
Default: True
C++ Type: bool
Controllable: No
Description: Set the enabled status of the MooseObject.