Back to Christian Bell homepage
Keywords: RDMA, Memory Registration, GASNet, Global Address Space Languages.
Firehose is a distributed memory registration strategy for supporting Remote DMA (RDMA) operations over pinning-based networks. RDMA, in the field of High-Performance Computing, is an extension of user-level networking in which user processes read and write data directly to the network. The concept itself is not new: it first appeared in research with the U-Net project and has since been adopted by many High-Performance Computing vendors in various forms.
Approaches taken to register memory for Remote DMA operations divide the available technologies into two categories:
In a single sentence, Firehose is a memory registration mechanism that establishes a maximum amount of locally pinnable memory and distributes the management of a fixed fraction of this memory to each node in a parallel job. For example, if at most M pages of physical memory can be registered on node n_i, every node can assume that it can independently manage p pages on node n_i, where p = M / nodes. As such, up to p mappings (or firehoses) can be maintained into n_i's address space, where a firehose is a mapping that guarantees that the underlying physical pages are pinned, so Remote DMA through it can operate fully one-sided and synchronization-free (data is free to pour through a firehose).
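The static partition above can be sketched in a few lines. This is only an illustration of the quota arithmetic, not GASNet code; the names (`FirehoseTable`, `has_capacity`) are invented for the example:

```python
# Sketch of Firehose's static partition of pinnable pages (illustrative values).
M = 4096          # assumed: at most M pages can be pinned on node n_i
NODES = 8         # assumed: number of nodes in the parallel job

# Each remote node may independently manage p firehoses into n_i's memory.
p = M // NODES    # p = M / nodes

class FirehoseTable:
    """Per-peer table of firehoses (pinned-page mappings) into one remote node."""
    def __init__(self, quota):
        self.quota = quota
        self.mappings = set()   # remote page numbers currently mapped

    def has_capacity(self):
        return len(self.mappings) < self.quota

table = FirehoseTable(p)    # a node's view of its quota into n_i
```

With M = 4096 and 8 nodes, each peer independently owns p = 512 firehoses into n_i, and no coordination is needed until a peer exhausts its own quota.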
A node that requires more firehose mappings than the p it is statically allotted to every other node must reuse some of its unused mappings, or move one of its existing but stale firehoses. A firehose is active while data is pouring through it, and inactive otherwise.
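The reuse-or-move policy can be sketched as follows. This is a simplified model under assumed names (`acquire`, `release` and a FIFO of stale firehoses are not the actual GASNet interface); a real implementation would also issue the round-trip needed to move a firehose:

```python
# Illustrative sketch of reusing a stale (inactive) firehose once the static
# quota p is exhausted.
class FirehoseSet:
    def __init__(self, quota):
        self.quota = quota
        self.active = set()    # pages with data currently pouring through
        self.inactive = []     # stale firehoses, candidates for moving (FIFO)

    def acquire(self, page):
        if page in self.active or page in self.inactive:
            # Page already mapped: registration is free.
            if page in self.inactive:
                self.inactive.remove(page)
            self.active.add(page)
            return "hit"
        if len(self.active) + len(self.inactive) < self.quota:
            self.active.add(page)              # quota not yet exhausted
            return "new"
        if self.inactive:
            victim = self.inactive.pop(0)      # move a stale firehose
            self.active.add(page)
            return ("moved", victim)
        raise RuntimeError("all firehoses active; must wait for one to drain")

    def release(self, page):
        # Data stopped pouring: the firehose goes stale but stays mapped.
        self.active.discard(page)
        self.inactive.append(page)
```

Note that releasing a firehose does not unmap it; a later `acquire` of the same page is a free hit, which is exactly the temporal-locality bet Firehose makes.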
Naturally, constructive interference works to Firehose's advantage: when more than one remote node happens to reference the same set of pages, the registration operation becomes essentially free. Consider the live Firehose snapshot shown below:
Live Firehose snapshot of node B's memory space, with nodes A and C each mapping
5 of their 8 firehoses into B's memory space. Node B keeps a reference count
for every one of its pinned pages: pages with a zero refcount are pinned but
unreferenced by any firehose, while pages with a non-zero refcount are
referenced by one or more firehoses.
A few optimizations are possible over the base case shown above. For one, once a reference count reaches zero, the underlying page can remain pinned and be deregistered lazily, or only when the local node is running out of pages it can pin. Also, statically partitioning per-node firehoses at startup assumes that each node will require the same number of firehoses to every other node, something that doesn't hold for all applications (e.g. nearest-neighbour and/or boundary-type computations). Because of lazy deregistration, these applications may incur extra roundtrips to move firehoses, but the registration cost will not necessarily be paid again.
To be continued...