Introduction
============
NetIO-next is an event-driven communication library for RDMA networks
developed within the ATLAS FELIX project. NetIO-next implements a
publish–subscribe messaging pattern and is capable of data coalescence.
NetIO-next is connection-oriented and is based on messages rather than byte
streams.
NetIO-next is built on `libfabric `_,
a low-level communication library that abstracts diverse networking technologies.
Each supported technology is called a *fabric provider*; among these is
**verbs**, a wrapper around the `ibverbs `_
Linux library. Ibverbs is the library that allows programs to utilise
RDMA-capable hardware from userspace; it supports the RDMA standards RoCE
(RDMA over Converged Ethernet), InfiniBand and iWARP, all compliant with the
`InfiniBand architecture specification `_.
For debugging purposes libfabric includes two TCP/IP providers: **sockets** and
**tcp**. NetIO-next defaults to verbs when RDMA-capable hardware is selected
and to sockets otherwise. The sockets provider emulates the RDMA stack and is
not designed for high performance. The tcp provider is not supported by NetIO-next [#]_.
.. important:: NetIO-next is not thread-safe; the FELIX API `felix-client-thread` is thread-safe.
.. [#] *TCP* does not support `FI_PROGRESS_AUTO `_.
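The default provider choice described above can be summarised in a few lines. This is only an illustrative sketch: NetIO-next is a C library, and the names ``choose_provider``, ``check_provider`` and ``SUPPORTED_PROVIDERS`` are hypothetical and not part of its API.

```python
# Illustrative only: these names are hypothetical, not NetIO-next API.
SUPPORTED_PROVIDERS = ("verbs", "sockets")


def choose_provider(rdma_capable: bool) -> str:
    """Mirror the documented default: verbs on RDMA hardware, sockets otherwise."""
    return "verbs" if rdma_capable else "sockets"


def check_provider(name: str) -> str:
    """Reject providers the text lists as unsupported, such as tcp."""
    if name not in SUPPORTED_PROVIDERS:
        raise ValueError(f"provider {name!r} is not supported by NetIO-next")
    return name


if __name__ == "__main__":
    print(choose_provider(rdma_capable=True))
    print(choose_provider(rdma_capable=False))
```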
Recommended Setup
-----------------
To use RDMA, the host computer must be equipped with an RDMA-capable
network card such as Nvidia ConnectX (from ConnectX-3 onwards), or Intel
Ethernet Network Adapter X722 or E810.
For Nvidia cards and a Linux kernel older than version 5, it is recommended to
install the Nvidia `MLNX_OFED drivers `_
that include an update of ibverbs.
MLNX_OFED drivers are recompiled for various kernel versions by the FELIX
developer team.
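The kernel condition above can be checked programmatically. The following is a minimal sketch, assuming the kernel release string starts with the major version number (as in ``4.18.0-305.el8.x86_64``); the helper names ``kernel_major`` and ``mlnx_ofed_recommended`` are hypothetical, and whether an Nvidia card is present must be supplied by the caller.

```python
import platform


def kernel_major(release: str) -> int:
    """Extract the major version from a release string such as '4.18.0-305.el8.x86_64'."""
    return int(release.split(".", 1)[0])


def mlnx_ofed_recommended(release: str, has_nvidia_nic: bool) -> bool:
    """Apply the rule from the text: Nvidia card and kernel older than version 5."""
    return has_nvidia_nic and kernel_major(release) < 5


if __name__ == "__main__":
    release = platform.release()  # kernel release of the running host
    # has_nvidia_nic is assumed True here purely for demonstration.
    print(release, mlnx_ofed_recommended(release, has_nvidia_nic=True))
```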
Setting the following environment variables is strongly recommended.
All but RDMAV_FORK_SAFE are automatically set in FELIX software; RDMAV_FORK_SAFE
has to be set in the environment of SW ROD because CERN ROOT libraries use fork().
.. list-table:: Recommended environment variables.
   :widths: 15 5 80
   :header-rows: 1

   * - Variable
     - Value
     - Reason
   * - RDMAV_FORK_SAFE [#]_
     - any
     - Allows the application to use fork().
   * - FI_VERBS_TX_IOV_LIMIT
     - 30
     - Max length of TX IOV vector, see :ref:`UnbufferedCommunication`.
   * - FI_VERBS_RX_IOV_LIMIT
     - 30
     - Max length of RX IOV vector, see :ref:`UnbufferedCommunication`.
   * - FI_VERBS_TX_SIZE
     - 1024
     - Max number of buffers per socket, see :ref:`BufferedCommunication`.
   * - FI_VERBS_RX_SIZE
     - 1024
     - Max number of buffers per socket, see :ref:`BufferedCommunication`.

.. [#] see `https://www.rdmamojo.com/2012/05/24/ibv_fork_init `_.
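For applications launched outside the FELIX software, the recommended values can be prepared before the process starts. The sketch below assumes the variables must be present in the environment before the RDMA libraries are initialised; the helper ``build_env`` and the constant ``RECOMMENDED_ENV`` are hypothetical names, and ``"1"`` is just one possible choice for RDMAV_FORK_SAFE, which accepts any value.

```python
import os

# Values taken from the table of recommended environment variables.
RECOMMENDED_ENV = {
    "RDMAV_FORK_SAFE": "1",  # any value works; "1" is an arbitrary choice
    "FI_VERBS_TX_IOV_LIMIT": "30",
    "FI_VERBS_RX_IOV_LIMIT": "30",
    "FI_VERBS_TX_SIZE": "1024",
    "FI_VERBS_RX_SIZE": "1024",
}


def build_env(base=None):
    """Return a copy of base (default: os.environ) with recommended values added.

    Variables already present in base are left untouched.
    """
    env = dict(os.environ if base is None else base)
    for name, value in RECOMMENDED_ENV.items():
        env.setdefault(name, value)
    return env


if __name__ == "__main__":
    env = build_env()
    for name in RECOMMENDED_ENV:
        print(f"{name}={env[name]}")
```

The resulting dictionary can be passed as the ``env`` argument of ``subprocess.run`` when starting the application, so that the settings are in place from the first instruction of the child process.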
Distribution, Use, API
----------------------
NetIO-next is part of the `FELIX software distribution `_
and can be compiled within the FELIX framework or as an ATLAS TDAQ release
package. FELIX users are not expected to interface with NetIO-next directly but
should rather use the `felix-client `_
API.
References
----------
*Event-driven RDMA network communication in the ATLAS DAQ system with NetIO.*
24th International Conference on Computing in High Energy and Nuclear Physics.
J. Schumacher
[`Slides `_]
*Utilizing HPC Network Technologies in High Energy Physics Experiments.*
IEEE 25th Annual Symposium on High-Performance Interconnects.
J. Schumacher
[`Paper `_,
`Slides `_]