Introduction
NetIO-next is an event-driven communication library for RDMA networks developed within the ATLAS FELIX project. NetIO-next implements a publish–subscribe messaging pattern and is capable of data coalescence. NetIO-next is connection-oriented and is based on messages rather than byte streams.
NetIO-next is built on libfabric, a low-level communication library that abstracts diverse networking technologies. Each supported technology is called a fabric provider; among these is verbs, a wrapper around the ibverbs Linux library. Ibverbs allows programs to use RDMA-capable hardware from userspace; it supports the RDMA standards RoCE (RDMA over Converged Ethernet), InfiniBand and iWARP, all compliant with the InfiniBand architecture specification. For debugging purposes libfabric includes two TCP/IP providers: sockets and tcp. NetIO-next defaults to verbs when RDMA-capable hardware is selected and to sockets otherwise. The sockets provider emulates the RDMA stack and is not designed for high performance. The tcp provider is not supported by NetIO-next [1].
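
The default provider choice described above can be summarised in a few lines. The following Python sketch is illustrative only: `default_provider` and `check_provider` are hypothetical helper names, not part of NetIO-next or libfabric, and the caller is assumed to already know whether the selected hardware is RDMA-capable.

```python
# Providers that this page lists as usable with NetIO-next.
# "tcp" exists in libfabric but is not supported by NetIO-next.
SUPPORTED_PROVIDERS = ("verbs", "sockets")


def default_provider(rdma_capable: bool) -> str:
    """Return the provider name following the default rule stated above."""
    return "verbs" if rdma_capable else "sockets"


def check_provider(name: str) -> str:
    """Return the name unchanged if it is supported, otherwise raise ValueError."""
    if name not in SUPPORTED_PROVIDERS:
        raise ValueError(f"provider {name!r} is not supported by NetIO-next")
    return name
```

In this sketch, `default_provider(False)` yields the sockets provider, which is suitable for debugging but not for performance measurements.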
Important
NetIO-next is not thread-safe; the FELIX API felix-client-thread is.
Recommended Setup
In order to use RDMA, the host computer must be equipped with a capable network card such as an Nvidia ConnectX (from ConnectX-3 onwards) or an Intel Ethernet Network Adapter X722 or E810. For Nvidia cards and a Linux kernel older than 5, it is recommended to install the Nvidia MLNX_OFED drivers, which include an update of ibverbs. MLNX_OFED drivers are recompiled for various kernel versions by the FELIX developer team. Setting the following environment variables is strongly recommended. All but RDMAV_FORK_SAFE are set automatically in FELIX software; RDMAV_FORK_SAFE has to be set in the environment of SW ROD because CERN ROOT libraries use fork().
Variable | Value | Reason |
---|---|---|
RDMAV_FORK_SAFE [2] | any | Allows the application to use fork(). |
FI_VERBS_TX_IOV_LIMIT | 30 | Max length of TX IOV vector, see Unbuffered RDMA Communication. |
FI_VERBS_RX_IOV_LIMIT | 30 | Max length of RX IOV vector, see Unbuffered RDMA Communication. |
FI_VERBS_TX_SIZE | 1024 | Max number of buffers per socket, see Buffered RDMA Communication. |
FI_VERBS_RX_SIZE | 1024 | Max number of buffers per socket, see Buffered RDMA Communication. |
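
For applications launched outside FELIX software, the recommended variables can be prepared before the process starts. The Python sketch below is one possible way to do this, not the FELIX mechanism: `rdma_environment` is a hypothetical helper, and the value `"1"` for RDMAV_FORK_SAFE is an arbitrary choice, since any value is accepted. Variables already present in the environment are left untouched.

```python
import os

# Recommended values taken from the table above.
RECOMMENDED = {
    "FI_VERBS_TX_IOV_LIMIT": "30",
    "FI_VERBS_RX_IOV_LIMIT": "30",
    "FI_VERBS_TX_SIZE": "1024",
    "FI_VERBS_RX_SIZE": "1024",
}


def rdma_environment(base=None):
    """Return a copy of `base` (default: os.environ) with the recommended
    variables filled in wherever they are not already set."""
    env = dict(os.environ if base is None else base)
    # Any value enables fork() support; "1" is an arbitrary choice.
    env.setdefault("RDMAV_FORK_SAFE", "1")
    for name, value in RECOMMENDED.items():
        env.setdefault(name, value)
    return env


if __name__ == "__main__":
    # Print the resulting settings as shell export statements.
    env = rdma_environment()
    for name in ["RDMAV_FORK_SAFE", *RECOMMENDED]:
        print(f"export {name}={env[name]}")
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the application from a launcher script.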
Distribution, Use, API
NetIO-next is part of the FELIX software distribution and can be compiled within the FELIX framework or as an ATLAS TDAQ release package. FELIX users are not supposed to interface with NetIO-next directly but should rather use the felix-client API.
References
Event-driven RDMA network communication in the ATLAS DAQ system with NetIO. 24th International Conference on Computing in High Energy and Nuclear Physics. J. Schumacher [Slides]
Utilizing HPC Network Technologies in High Energy Physics Experiments. IEEE 25th Annual Symposium on High-Performance Interconnects. J. Schumacher [Paper, Slides]