Debian Infiniband HOWTO
Purpose
InfiniBand technology is supported by open source components and drivers.
Unfortunately, the well-known OFED, a software stack from the OpenFabrics Alliance, focuses on RPM-based distributions. This HOWTO shows a way to create a working InfiniBand setup on Debian with IP-over-InfiniBand and iSCSI-over-InfiniBand.
Prerequisites
First of all, you need a kernel that supports the InfiniBand hardware you want to use.
In our case we use a Mellanox HCA:
hades:~# lspci
...
06:00.0 InfiniBand: Mellanox Technologies MT25208 InfiniHost III Ex (Tavor compatibility mode) (rev 20)
...
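On a machine with many PCI devices it can help to filter the lspci output. The following is a minimal sketch, assuming the adapter's lspci line contains the word InfiniBand as in the output above; some adapters may be listed under a different class name. The function name list_ib_devices is purely illustrative.

```shell
#!/bin/sh
# list_ib_devices: hypothetical helper that keeps only the InfiniBand
# entries from lspci-style output read on standard input.
list_ib_devices() {
    grep -i 'infiniband'
}

# Typical use on a real machine (lspci comes from the pciutils package):
#   lspci | list_ib_devices
```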
The recent kernel 2.6.26 lists the following supported devices:
Mellanox HCA
QLogic InfiniPath
Ammasso 1100 HCA
Chelsio RDMA Driver
Mellanox ConnectX HCA
NetEffect RNIC Driver
Checking for the kernel drivers
# dmesg | grep ib
...
ib_mthca: Mellanox InfiniBand HCA driver v0.08 (February 14, 2006)
ib_mthca: Initializing 0000:06:00.0
...
The Mellanox card has been found and the device driver ib_mthca has been loaded.
For a Mellanox ConnectX card you need both the mlx4_core and the mlx4_ib kernel modules. The first is loaded automatically by udev; the second should be added to /etc/modules so that it is loaded at boot time.
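Adding the module to /etc/modules can be done by hand or with a small helper that avoids duplicate entries. This is only a sketch: the function name ensure_module and its optional file argument are assumptions made for illustration.

```shell
#!/bin/sh
# ensure_module MODULE [FILE]
# Append MODULE to FILE (default: /etc/modules) unless a line consisting of
# exactly that name is already present. Hypothetical helper, not a Debian tool.
ensure_module() {
    module=$1
    file=${2:-/etc/modules}
    # -q: quiet, -x: match the whole line, -F: treat the name as a fixed string
    if ! grep -qxF -- "$module" "$file" 2>/dev/null; then
        printf '%s\n' "$module" >> "$file"
    fi
}

# On the real system (as root):
#   ensure_module mlx4_ib
#   modprobe mlx4_ib    # load the module now instead of waiting for a reboot
```

Calling the helper twice leaves a single mlx4_ib line in the file, so it is safe to rerun.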
Querying the driver parameters
The driver registers a device of class infiniband in sysfs:
# ll /sys/class/infiniband/mthca0/
board_id fw_ver hw_rev
node_guid ports/ sys_image_guid
device/ hca_type node_desc
node_type subsystem/ uevent
The individual ports can be queried:
hades:~# cat /sys/class/infiniband/mthca0/ports/1/state
2: INIT
hades:~# cat /sys/class/infiniband/mthca0/ports/1/rate
20 Gb/sec (4X DDR)
hades:~# cat /sys/class/infiniband/mthca0/ports/1/phys_state
5: LinkUp
The first port is in the INIT state. The data rate is 20 Gbit/s and the physical link is up.
hades:~# cat /sys/class/infiniband/mthca0/ports/2/phys_state
2: Polling
The link at port 2 is down. That is fine, since no cable is attached.
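Instead of reading each file separately, the same sysfs attributes can be collected for every port in one go. The following is a sketch under the assumption that all devices follow the layout shown above (a ports/ directory with state, phys_state and rate files); the function name ib_port_summary is made up for this example.

```shell
#!/bin/sh
# ib_port_summary [ROOT]
# Print one line per port with its logical state, physical state and rate.
# ROOT defaults to /sys/class/infiniband. Hypothetical helper.
ib_port_summary() {
    root=${1:-/sys/class/infiniband}
    for port in "$root"/*/ports/*; do
        # Skip the unexpanded glob pattern when no device is present
        [ -d "$port" ] || continue
        dev=$(basename "$(dirname "$(dirname "$port")")")
        num=$(basename "$port")
        printf '%s port %s: state=%s phys_state=%s rate=%s\n' \
            "$dev" "$num" \
            "$(cat "$port/state")" \
            "$(cat "$port/phys_state")" \
            "$(cat "$port/rate")"
    done
}

# Usage on a machine with an HCA:
#   ib_port_summary
```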
Bringing the card from INIT to ACTIVE state
In our configuration we have two InfiniBand cards connected directly via cable.
Although the physical layer shows a LinkUp state, the logical layer is still disconnected,
showing the INIT state at both ends.
If the cards were connected to a managed switch, both would show the ACTIVE state.
The reason is that such a switch usually runs an InfiniBand subnet manager. Without a switch there is no subnet manager, and without a subnet manager there is no subnet.
Before buying a costly switch, however, there is another solution.
One may use a software subnet manager such as opensm. http://www.openfabrics.org/downloads/management/
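Whether a subnet manager has taken over the fabric can be read from the port state file, which changes from "2: INIT" to "4: ACTIVE". The sketch below polls that file; the function name wait_for_active and the default timeout of 30 seconds are assumptions chosen for illustration, and the opensm invocation in the comment assumes that -B starts it as a background daemon.

```shell
#!/bin/sh
# wait_for_active STATE_FILE [TIMEOUT_SECONDS]
# Poll a sysfs port state file until it reports ACTIVE or the timeout
# (default: 30 seconds) expires. Returns 0 on ACTIVE, 1 on timeout.
wait_for_active() {
    state_file=$1
    remaining=${2:-30}
    while :; do
        # The file contains a line such as "2: INIT" or "4: ACTIVE"
        case "$(cat "$state_file" 2>/dev/null)" in
            *ACTIVE*) return 0 ;;
        esac
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Possible use once a subnet manager is installed on one of the hosts:
#   opensm -B    # assumption: -B runs opensm as a daemon
#   wait_for_active /sys/class/infiniband/mthca0/ports/1/state 60
```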
Building opensm
Download the sources from http://www.openfabrics.org/downloads/management/
opensm-3.2.2.tar.gz
libibumad-1.2.1.tar.gz
libibcommon-1.1.1.tar.gz
To build opensm you will need the following Debian packages:
# aptitude install checkinstall flex btyacc build-essential
Now untar the source code archives:
# tar -xzf opensm-3.2.2.tar.gz
# tar -xzf libibumad-1.2.1.tar.gz
# tar -xzf libibcommon-1.1.1.tar.gz
Build libibcommon first. Use the checkinstall utility to build and install a Debian package:
# cd libibcommon-1.1.1
# ./configure
# make
# checkinstall -D
Follow the same recipe to build libibumad-1.2.1 and then opensm-3.2.2, in that order.
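The repeated build steps can be wrapped in a small loop. This is a sketch only: the names run, build_ib_stack and DRY_RUN are invented for this example, and checkinstall will still ask its usual questions for each package.

```shell
#!/bin/sh
# run CMD [ARGS...]
# Execute the command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_ib_stack DIR...
# Build and package each source directory in the order given, using the
# same steps as for libibcommon. Stops at the first failure.
build_ib_stack() {
    for dir in "$@"; do
        # Subshell: the cd does not affect the caller's working directory
        (
            run cd "$dir" &&
            run ./configure &&
            run make &&
            run checkinstall -D
        ) || return 1
    done
}

# Usage (as root, in the directory holding the unpacked sources):
#   build_ib_stack libibcommon-1.1.1 libibumad-1.2.1 opensm-3.2.2
```

Setting DRY_RUN=1 before the call prints the commands without running them, which is a cheap way to check the order first.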
Setting the IP-over-infiniband mode
iSCSI over Infiniband
On server Hades:
tgtadm --lld iscsi --op new --mode target --tid 1 -T hades.disk
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /dev/vg1/test
tgtadm --lld iscsi --op bind --mode target --tid 1 -I 10.2.0.19
Do not forget to add these commands to /etc/init.d/storage_export, since targets created with tgtadm are not kept across a restart.
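One possible shape for /etc/init.d/storage_export is sketched below. It simply wraps the three commands above in a function; the function names, the TGTADM override variable and the stop action (which deletes target 1 again) are assumptions added for illustration and are not taken from the original setup.

```shell
#!/bin/sh
# Sketch of the export logic for /etc/init.d/storage_export.
# TGTADM may be overridden, e.g. with "echo", to preview the commands.
TGTADM=${TGTADM:-tgtadm}

storage_export_start() {
    # Create target 1, attach the logical volume as LUN 1,
    # then allow the initiator at 10.2.0.19 to connect
    $TGTADM --lld iscsi --op new --mode target --tid 1 -T hades.disk &&
    $TGTADM --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 -b /dev/vg1/test &&
    $TGTADM --lld iscsi --op bind --mode target --tid 1 -I 10.2.0.19
}

storage_export_stop() {
    # Remove target 1 again
    $TGTADM --lld iscsi --op delete --mode target --tid 1
}

# In the init script proper, dispatch on the first argument, e.g.:
#   case "$1" in
#       start) storage_export_start ;;
#       stop)  storage_export_stop ;;
#   esac
```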
On server Poseidon: