Re: [HPDD-discuss] [lustre-discuss] Does Lustre support RoCE?
by Oucharek, Doug S
The note regarding MOFED 4 not supported by Lustre: I’m working on it. MOFED 4 did not drop support of Lustre, but did make API/behaviour changes which Lustre has not fully adapted to yet. The ball is in the Lustre community’s court on this one now.
Doug
On May 11, 2017, at 8:47 AM, Simon Guilbault <simon.guilbault(a)calculquebec.ca> wrote:
Hi, your lnet.conf looks fine. I tested LNet with RoCE v2 a while back on a pair of servers using ConnectX-4 cards with a single 25Gb interface, and RDMA was working with CentOS 7.3, the stock RHEL OFED and Lustre 2.9. The only setting I had to use in Lustre's config was this one:
options lnet networks=o2ib(ens2)
Without any tuning, the LNet self-test performance was about the same (1.9 GB/s) as with TCP, but the CPU utilisation was a lot lower with RDMA than with TCP (3% vs 65% of a core).
According to the notes I took back then, Lustre needed to be recompiled with MLNX OFED 3.4, and MLNX OFED 4 dropped support for Lustre according to its release notes.
Ref 965588
https://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_R...
https://www.mellanox.com/related-docs/prod_software/Mellanox_OFED_Linux_R...
On Thu, May 11, 2017 at 11:34 AM, Indivar Nair <indivar.nair(a)techterra.in> wrote:
So I should add something like this in lnet.conf:
options lnet networks=o2ib0(p4p1)
That's it, right?
Regards,
Indivar Nair
On Thu, May 11, 2017 at 8:39 PM, Dilger, Andreas <andreas.dilger(a)intel.com> wrote:
If you have RoCE cards and configure them with OFED, and configure Lustre to use o2iblnd then it should use RDMA for those interfaces. The fact that they are RoCE cards is hidden below OFED.
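
To illustrate the configuration being discussed, here is a minimal Python sketch that builds the module options line quoted in this thread. The helper name `lnet_networks_option` is hypothetical and exists only for this example; the thread does not say where lnet.conf lives, so writing the line to a file is left out. The only thing taken from the thread is the format `options lnet networks=<o2ib net>(<interface>)`.

```python
def lnet_networks_option(net, interface):
    """Build an lnet options line in the form quoted in this thread.

    `net` is an o2ib network name such as "o2ib" or "o2ib0";
    `interface` is the Ethernet interface the RoCE card exposes.
    """
    # The thread selects o2iblnd (and so RDMA) by using an o2ib network name.
    if not net.startswith("o2ib"):
        raise ValueError("expected an o2ib network name, got %r" % net)
    if not interface:
        raise ValueError("interface name must not be empty")
    return "options lnet networks=%s(%s)" % (net, interface)


# The two lines that appear in this thread:
print(lnet_networks_option("o2ib0", "p4p1"))
print(lnet_networks_option("o2ib", "ens2"))
```

Once the module is loaded with such a line, `lctl list_nids` should report a NID on the o2ib network rather than on tcp.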
Cheers, Andreas
> On May 11, 2017, at 08:36, Indivar Nair <indivar.nair(a)techterra.in> wrote:
>
> Hi ...,
>
> I have read in different forums and blogs that Lustre supports RoCE.
> But I can't find any documentation on it.
>
> I have a Lustre setup with 6 OSS and 2 SMB/NFS Gateways.
> They are all interconnected using a Mellanox SN2700 100G switch and Mellanox ConnectX-4 100G NICs.
> I have installed the Mellanox OFED drivers, but I can't find a way to tell Lustre / LNET to use RoCE.
>
> How do I go about it?
>
> Regards,
>
>
> Indivar Nair
>
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss(a)lists.lustre.org<mailto:lustre-discuss@lists.lustre.org>
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
3 years, 9 months
Re: [HPDD-discuss] [lustre-discuss] Does Lustre support RoCE? (Oucharek, Doug S)
by Chris Hunter
Are there any references on which MOFED versions Lustre supports (or
tests)?
I note that LND code patches seem to reference mainline kernel versions,
and some Lustre scope statements reference an OFA OFED version.
regards,
Chris Hunter
> Date: Sat, 13 May 2017 00:52:11 +0000
> From: "Oucharek, Doug S"
> Subject: Re: [HPDD-discuss] [lustre-discuss] Does Lustre support RoCE?
> I've been able to determine what is causing the dump_cqe failures, but not why it is happening now (all of a sudden).
>
> In Lustre, we pass an IOV of fragments to be RDMA'ed over IB. The fragments need to be page aligned except that the first fragment does not have to start on a page boundary and the last fragment does not have to end on a page boundary.
>
> When we set up the DMA addresses for remote RDMA, we mask off the fragments so the addresses are all on a page boundary. I guess the original authors believed that all DMA addresses needed to be page aligned for IB hardware. The mlx5 code (MOFED 4 specific?) does not like that we are not using the actual start address and is rejecting it in the form of a dump_cqe error.
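
The alignment rule and the masking step described above can be sketched in a few lines of Python. This is only an illustration, not the o2iblnd code: the names `PAGE_SIZE`, `mask_to_page` and `check_fragments` are hypothetical, a 4 KiB page is assumed, and the addresses are made up.

```python
PAGE_SIZE = 4096              # assumed page size for this sketch
PAGE_MASK = ~(PAGE_SIZE - 1)  # clears the offset-within-page bits


def mask_to_page(addr):
    """Round an address down to the start of its page."""
    return addr & PAGE_MASK


def check_fragments(frags):
    """Check a list of (address, length) fragments against the rule above.

    Every fragment except the first must start on a page boundary, and
    every fragment except the last must end on one.
    """
    last = len(frags) - 1
    for i, (addr, length) in enumerate(frags):
        if i != 0 and addr % PAGE_SIZE != 0:
            return False
        if i != last and (addr + length) % PAGE_SIZE != 0:
            return False
    return True


# First fragment starts 0x200 bytes into a page; last fragment ends mid-page.
frags = [(0x10200, 0xE00), (0x20000, 0x1000), (0x30000, 0x800)]

print(check_fragments(frags))          # the list satisfies the rule
print(hex(frags[0][0]))                # actual start address of fragment 0
print(hex(mask_to_page(frags[0][0])))  # masked address, 0x200 bytes lower
```

The gap between the last two printed values is the mismatch described in the message: the masked address is page aligned, but it is no longer the address where the data actually starts.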
>
> This code does not seem to be a problem with MOFED 3.x so has something changed? Has a page alignment restriction been removed? I really cannot just turn off this alignment operation as I have no idea what will break elsewhere in the world of OFED/MOFED.
>
> Could use some insight from people who understand IB hardware/firmware.
>
> Doug
>
> On May 11, 2017, at 11:26 AM, Indivar Nair <indivar.nair(a)techterra.in> wrote:
>
> Thanks for the advice.
> I had a hunch that the development will take time.
>
> Regards,
>
>
> Indivar Nair
>
> On Thu, May 11, 2017 at 11:28 PM, Oucharek, Doug S <doug.s.oucharek(a)intel.com> wrote:
> As I write this, I am banging my head against this wall trying to figure it out. It is related to the new memory region registration process used by mlx5 cards. I could really use the help of any Mellanox/RDMA experts out there. The API has virtually no documentation and without the source code for MOFED 4, I am really unable to do much more than guess at what is going on.
>
> So, expect this to take a long time to resolve and stick with MOFED 3.x.
>
> Doug
>
> On May 11, 2017, at 10:29 AM, Indivar Nair <indivar.nair(a)techterra.in> wrote:
>
> Thanks a lot, Michael, Andreas, Simon, Doug,
> I have already installed MLNX OFED 4 :-(
> I will now have to undo it and install the earlier version.
>
> Roughly, by when would the support for MLNX OFED 4 be available?
>
> Regards,
>
>
> Indivar Nair
>
3 years, 9 months
Does Lustre support RoCE?
by Indivar Nair
Hi ...,
I have read in different forums and blogs that Lustre supports RoCE.
But I can't find any documentation on it.
I have a Lustre setup with 6 OSS and 2 SMB/NFS Gateways.
They are all interconnected using a Mellanox SN2700 100G switch and Mellanox
ConnectX-4 100G NICs.
I have installed the Mellanox OFED drivers, but I can't find a way to tell
Lustre / LNET to use RoCE.
How do I go about it?
Regards,
Indivar Nair
3 years, 9 months
OpenSFS Handbook - Now Available
by OpenSFS Administration
Hello OpenSFS Community,
As a follow-up to our original message, we wanted to thank everyone who
contributed to the review and feedback of the OpenSFS Handbook
<http://cdn.opensfs.org/wp-content/uploads/2014/02/OpenSFSHandbook.pdf> .
The feedback received has been incorporated and the first revision of the
OpenSFS Handbook is now published on the OpenSFS website
<http://opensfs.org/policies/> . This document will be revised periodically
as needed or as feedback is received.
Best regards,
OpenSFS Administration
From: Participants [mailto:participants-bounces@lists.opensfs.org] On Behalf
Of OpenSFS Administration
Sent: Monday, April 10, 2017 1:47 PM
To: lwg(a)lists.opensfs.org; hpdd-discuss(a)lists.01.org;
lustre-discuss(a)lists.lustre.org; discuss(a)lists.opensfs.org;
participants(a)lists.opensfs.org
Subject: [Participants] Review Requested: OpenSFS Handbook
Dear OpenSFS Community,
We have drafted a Handbook summarizing the organization of OpenSFS, its
Bylaws, policies, guidelines, and other general information.
The Handbook is online as a Google Doc and anyone with the link can suggest
revisions or comment. We would greatly appreciate your feedback and
contributions to the Handbook, so please take a look and share your input.
Anonymous feedback is allowed, but if you comment while logged in to Google
we'll be able to respond to you directly.
https://docs.google.com/document/d/1mD_in2HuLRw99ZlKfpkz6kLcEeqi93tDZD5v8lwtmyY/edit?usp=sharing
Comments are requested by Friday, May 5th so feedback can be reviewed and
discussed prior to the LUG conference <http://opensfs.org/lug-2017/> .
Sincerely,
Your OpenSFS Board <http://opensfs.org/board-members/>
3 years, 9 months