Hi Michael,
You don't have to add aliased interfaces on the device level. You can
use the routing mechanism of lnet (see e.g.
http://wiki.lustre.org/manual/LustreManual20_HTML/ConfiguringLNET.html
and
http://wiki.lustre.org/manual/LustreManual20_HTML/ManagingLNET.html).
You would have a fabric containing your lustre servers and some routers
for cluster A and some routers for cluster B.
The routers are nothing more than servers equipped with two infiniband
cards (or dual port cards), with one port in the server-fabric, the
other port in the cluster fabric.
You have to enable forwarding on the routers. On the servers you add the
routes to the lnets in the clusters and on the clients you have to add
the routes towards the server fabric.
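A rough sketch of what that could look like in the lnet module options
(e.g. in /etc/modprobe.d/lustre.conf), using the subnets from your mail.
The net names o2ib0/o2ib1/o2ib2, the interface names on the routers and
the router addresses ending in .251/.252 are only my assumptions for
illustration; each stanza goes on a different group of hosts:

```conf
# Router for cluster A (assumed: ib0 = 192.168.6.251 in the server fabric,
# ib1 = 192.168.2.251 in the cluster A fabric)
options lnet networks="o2ib0(ib0),o2ib1(ib1)" forwarding="enabled"

# Router for cluster B (assumed: ib0 = 192.168.6.252 in the server fabric,
# ib1 = 192.168.4.252 in the cluster B fabric)
options lnet networks="o2ib0(ib0),o2ib2(ib1)" forwarding="enabled"

# Lustre servers: local net o2ib0, routes to both cluster nets
# via the routers' server-fabric NIDs
options lnet networks="o2ib0(ib0)" routes="o2ib1 192.168.6.251@o2ib0; o2ib2 192.168.6.252@o2ib0"

# Clients in cluster A: local net o2ib1, route to the server net
# via the router's cluster-side NID
options lnet networks="o2ib1(ib0)" routes="o2ib0 192.168.2.251@o2ib1"

# Clients in cluster B: local net o2ib2, route to the server net
options lnet networks="o2ib2(ib0)" routes="o2ib0 192.168.4.252@o2ib2"
```

With something like this loaded, a client in cluster A should be able to
reach the server with "lctl ping 192.168.6.250@o2ib0", and the server
would see that client under its own address on o2ib1.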
lnet in conjunction with ofed 2 uses the ip addresses (together with the
lnet id) as a name for the client. It doesn't actually use ipoib. When
the routes are set up correctly (back and forth) you can lctl ping the
clients from the servers and vice versa, whereas ip pings cannot pass
through the routers (which can be seen as an lnet connection from one
fabric to the other one). Using lnet routing you shouldn't run into
address conflicts on the ip level, because there is no routing between
the networks on the level of the ip stack.
kind regards,
Martin
On 05/30/2014 03:59 PM, Michael Di Domenico wrote:
From my understanding of the lustre manual, I'm not sure what I want
to do is possible, so I figured i'd ask
I have several clusters on different subnets
clusterA eth0 192.168.1.0/24 ib0 192.168.2.0/24
clusterB eth0 192.168.3.0/24 ib0 192.168.4.0/24
...etc...
what i want to do is connect lustre onto a separate network and use
only infiniband for the lustre communication
lustreA eth0 192.168.5.0/24 ib0 192.168.6.0/24
if i add
ib0:1 192.168.2.250/32
ib0:2 192.168.4.250/32
to the lustreA server I can talk via ipoib. however as i understand
it (probably incorrectly) lustreA would appear in lnet as
o2ib0(ib0) 192.168.6.250
o2ib1(ib0:1) 192.168.2.250
o2ib2(ib0:2) 192.168.4.250
however the clients would appear as
clusterA o2ib0(ib0) 192.168.2.250
clusterB o2ib0(ib0) 192.168.4.250
which as i understand it, creates a conflict in the o2ib devices where
o2ib0 on the client will not match o2ib2(ib0:2) on the lustre server
Is there a way to accomplish this or a better way overall?