Hello Dan,
On 02/02/14 17:25, Dan Tascione wrote:
Hi Cédric,
I had a few hopefully easy questions about your Ubuntu setup, if you have the time to
answer them.
Our server side is Lustre 2.4.2 on CentOS 6.4 (installed with the Whamcloud RPMs). These
nodes all seem to be operating fine.
We have 2.4.2 on Ubuntu 12.04 with a 2.6.32 kernel (which our partner, Q-Leap GmbH, set up
and maintains).
Our client side is currently Ubuntu 12.04. I've tried:
- Compiling Lustre client from the git tree (both 2.4.2 and master)
Haven't even tried it (being quite certain it would fail)
- Building the 3.13 kernel from Ubuntu, with the Lustre modules enabled
Unfortunately, in all my tests, the Ubuntu nodes regularly panic or just freeze outright
after anywhere from 2 to 24 hours of operation.
For the in-kernel Lustre client to work (on kernel 3.12 for sure, and I think also on
3.13), you *must* at least add the patches addressing:
- https://jira.hpdd.intel.com/browse/LU-4127
- https://jira.hpdd.intel.com/browse/LU-4157
For your Ubuntu clients, are you using the 3.12.8 that comes from Ubuntu, or from
kernel.org?
We started with an "apt-get source" in an Ubuntu/Trusty VM, at a time when its kernel
was 3.12.0-7.15 (corresponding to 3.12.4 upstream).
We then applied all incremental patches from https://www.kernel.org/ to "rebase"
that kernel to 3.12.9.
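
As a rough illustration of that rebase step (a sketch only, not the exact procedure used here), the incremental patches needed to go from 3.12.4 to 3.12.9 can be enumerated as below. The `patch-<from>-<to>.xz` file naming and the `incr/` directory are assumptions about how kernel.org publishes incremental patches, and the function name is hypothetical.

```python
# Assumed location of incremental stable patches for 3.x kernels on kernel.org.
BASE_URL = "https://www.kernel.org/pub/linux/kernel/v3.x/incr/"


def incremental_patch_names(series, start, end):
    """List the incremental patch files taking <series>.<start> to <series>.<end>.

    Each patch moves the tree forward by exactly one point release, so they
    must be applied in the returned order.
    """
    if end < start:
        raise ValueError("end must not be lower than start")
    return [f"patch-{series}.{n}-{n + 1}.xz" for n in range(start, end)]


if __name__ == "__main__":
    # 3.12.4 (the Ubuntu 3.12.0-7.15 base) up to 3.12.9
    for name in incremental_patch_names("3.12", 4, 9):
        print(BASE_URL + name)
```

Each downloaded patch would then be decompressed and applied in sequence with `patch -p1` from the top of the kernel source tree.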
It looks like you are just using the Lustre version that comes with the 3.12.8 kernel,
and not the version from the Lustre source tree, is that correct?
Yes, absolutely.
The Lustre source tree still targets kernel 2.6.32 (or the like). As such, it is not
suited for recent kernels :-(
We started with stock in-kernel Lustre client from Ubuntu/Trusty 3.12.0-7.15, with patches
for:
- https://jira.hpdd.intel.com/browse/LU-4127 (*required*)
- https://jira.hpdd.intel.com/browse/LU-4157 (*required*)
- https://jira.hpdd.intel.com/browse/LU-4231 (for NFS re-export)
- https://jira.hpdd.intel.com/browse/LU-4400 (for NFS re-export)
BUT, as we stumbled on other minor bugs:
- https://jira.hpdd.intel.com/browse/LU-4209
- https://jira.hpdd.intel.com/browse/LU-4520
- https://jira.hpdd.intel.com/browse/LU-4530
we decided to pull the in-kernel Lustre client from the most recent kernel source; see
https://jira.hpdd.intel.com/browse/LU-4530 for a discussion of what that might be.
Thus, we pull the in-kernel Lustre client from:
- https://github.com/verygreen/linux/tree/lustre-next
(which incorporates a few of the patches mentioned above, plus many others)
And added the patches for the not-yet-integrated issues:
- https://jira.hpdd.intel.com/browse/LU-4231
- https://jira.hpdd.intel.com/browse/LU-4530
- https://jira.hpdd.intel.com/browse/LU-4520 (<-> 4152 <-> 4398 <-> 4429); this one is
  still unresolved, as it requires server-side patches
- others that I thought might help with our LU-4520
Are your clients all InfiniBand, or are they Ethernet? We're using Ethernet here for
the clients, and I am wondering if that's interacting badly somehow.
All clients are Ethernet
You mentioned "3.14rc1~patched" below, but I wasn't sure what this version
number referred to?
At the time, it was the "staging-3.14rc1" branch of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git,
but it is now no longer valid. You are better off pulling from
https://github.com/verygreen/linux/tree/lustre-next
Thanks,
Dan
Best,
Cédric
*From:* HPDD-discuss [mailto:hpdd-discuss-bounces@ml01.01.org] *On Behalf Of* Cédric Dufour - Idiap Research Institute
*Sent:* Friday, January 24, 2014 7:17 AM
*To:* Lustre (HPDD-discuss)
*Subject:* [HPDD-discuss] 'lustre-dkms' (skeleton) package for Debian/Ubuntu available
Hello all,
Newly subscribed to the list, I've been going through the archives and seen some
questions about Lustre client support on recent versions of Debian/Ubuntu distributions.
We have addressed that issue by:
- building a custom kernel with Lustre client *disabled*, based on Ubuntu's latest
available kernel + latest stable patchsets, 3.12.8 for us so far (PS: './debian/rules
editconfigs' to disable Lustre)
- having a separate (easily upgrade-able) 'lustre-dkms' package based on Lustre
in-kernel client code + our patches, 3.14rc1~patched for us so far
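
As a quick sanity check for the first step, the sketch below tests whether the in-kernel Lustre client is really disabled in a kernel configuration. It assumes the staging client is controlled by the `CONFIG_LUSTRE_FS` symbol and that the running kernel's configuration is available under `/boot/config-<release>`; the function name is hypothetical.

```python
import platform


def lustre_client_disabled(config_text, symbol="CONFIG_LUSTRE_FS"):
    """Return True when `symbol` is not enabled (=y or =m) in a kernel .config.

    A symbol that is absent, or only present as '# ... is not set',
    counts as disabled.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1] not in ("y", "m")
    return True


if __name__ == "__main__":
    # Assumed location of the running kernel's configuration on Debian/Ubuntu.
    path = "/boot/config-" + platform.release()
    with open(path) as handle:
        disabled = lustre_client_disabled(handle.read())
    print("in-kernel Lustre client disabled:", disabled)
```

If the check reports the client as still enabled, the in-kernel modules could shadow or conflict with the ones built by the separate 'lustre-dkms' package.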
We use that 3.12.8 kernel + lustre-dkms (3.14rc1~patched) package without any problem
on:
- Ubuntu/Quantal (~100 workstations and computation nodes)
- Debian/Wheezy with the few libc (>= 2.14) dependencies pulled from Debian/Testing
(a few servers requiring Lustre access)
- (hopefully Ubuntu/Trusty 14.04 in a few weeks)
(against a Lustre 2.4.2 cluster running on kernel 2.6.32)
I have put the required resources in a tarball at
http://www.idiap.ch/~cdufour/download/lustre-dkms.tar.bz2 . It contains the
skeleton directory and HOWO.TXT file, which should get those of you who are interested in
following the same path going.
Hope it helps.
Best regards,
Cédric
--
*Cédric Dufour @ Idiap Research Institute*