Building Lustre client 1.8 on CentOS 6.5, follow-up
by Kurt Strosahl
All,
So after some digging online I found a patch that I was able to apply to get around the issue. I mounted the Lustre file system and it does seem to be working; however, I ran across an odd error.
Namely, it seems that one of the modules wasn't created (or, if it was created, wasn't properly installed); instead it shows up as a broken symlink pointing to itself:
/lib/modules/2.6.32-431.el6.x86_64/weak-updates/kernel/net/lustre/libcfs.ko.
The file does exist under updates/kernel/net.
I'm not sure what I've done wrong, or whether the answer is just to create the link myself.
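If it is just a matter of recreating the link by hand, I assume it would be something along these lines, using the paths above (untested):
# recreate the weak-updates symlink so it points at the module that was actually installed
ln -sf /lib/modules/2.6.32-431.el6.x86_64/updates/kernel/net/libcfs.ko \
    /lib/modules/2.6.32-431.el6.x86_64/weak-updates/kernel/net/lustre/libcfs.ko
# rebuild the module dependency information for that kernel
depmod -a 2.6.32-431.el6.x86_64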
w/r,
Kurt J. Strosahl
System Administrator
Scientific Computing Group, Thomas Jefferson National Accelerator Facility
7 years, 6 months
Building Lustre client 1.8 on CentOS 6.5
by Kurt Strosahl
All,
I'm attempting to get a Lustre client up and running on a CentOS 6.5 system, and the build seems to be failing due to a change made in the kernel:
<system> uname -sr
Linux 2.6.32-431.el6.x86_64
When I try to run rpmbuild --rebuild, I get the following:
/root/rpmbuild/BUILD/lustre-1.8.9/lustre/llite/dir.c: In function 'll_dir_ioctl':
/root/rpmbuild/BUILD/lustre-1.8.9/lustre/llite/dir.c:1288: error: assignment from incompatible pointer type
/root/rpmbuild/BUILD/lustre-1.8.9/lustre/llite/dir.c:1360: error: passing argument 1 of 'putname' from incompatible pointer type
/usr/src/kernels/2.6.32-431.el6.x86_64/include/linux/fs.h:2170: note: expected 'struct filename *' but argument is of type 'char *'
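For what it's worth, the header the compiler points at does show the backported struct filename API; the prototype below is my recollection of the upstream one, so treat it as illustrative:
# confirm that this kernel's putname() takes a struct filename *
grep -n 'putname' /usr/src/kernels/2.6.32-431.el6.x86_64/include/linux/fs.h
# expect something like: extern void putname(struct filename *name);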
Are there any workarounds for this?
w/r,
Kurt
7 years, 6 months
file placement on OSTs with one OST disabled
by Kaizaad Bilimorya
OSSs & MDS
==========
Lustre 2.5.3
CentOS 6.5 kernel 2.6.32-431.23.3.el6_lustre.x86_64
Clients
=======
Lustre 1.8.x and 2.5.3
We had to disable a crashed OST, so on our combined MDS/MGS we did a
lctl conf_param lundwork-OST000e.osc.active=0
and that seems to have worked fine.
We now have an issue where the OST file allocation algorithm no longer seems to
be working. It was working fine before the OST crash (using the default
system striping parameters).
We notice that when we move (rsync) large amounts of data to this
Lustre filesystem, it only uses the OSTs that come before the failed one (as
listed in "lfs df -h"). So we inactivated those OSTs once they started to get
full. Now it seems that the next OST in the list (after the failed one) is the
only one being hit, until we deactivate that one as well.
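(In case it matters, the way we take each full OST out of allocation is along these lines; the OST index and device number here are illustrative, not our real ones:
# find the MDS-side osc/osp device number for the OST that is filling up
lctl dl | grep OST000f
# stop new object allocation on that OST; temporary, not persistent across remounts
lctl --device 15 deactivate
)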
thanks
-k
7 years, 6 months
How to unmount all clients
by Jérôme BECOT
Hi,
Is there a command to run on the MGS to properly instruct all clients to
unmount a filesystem?
Thanks
--
Jérôme BECOT
Systems and Network Administrator
Molécules à visée Thérapeutique par des approches in Silico (MTi)
Univ Paris Diderot, UMRS973 Inserm
Case 013
Bât. Lamarck A, porte 412
35, rue Hélène Brion 75205 Paris Cedex 13
France
Tel : 01 57 27 83 82
7 years, 6 months
Re: [HPDD-discuss] [Lustre-discuss] Lustre build and install on Ubuntu
by Dilger, Andreas
Nathan, would you and/or Mandar be willing to incorporate the below steps into the "make debs" build target of the Lustre master branch?
While we don't maintain Debian or Ubuntu builds ourselves, there does seem to be increasing interest in this, and I hate to see people rediscover the same problems and struggle to fix them each time.
At one point I also thought that there was a Lustre client in the Debian unstable repo; is that no longer the case? Even so, that doesn't remove the reason to get "make debs" working in the master branch, since there will always be people who want to try the latest code instead of waiting for the Debian/Ubuntu repo to be updated.
Cheers, Andreas
> On Nov 5, 2014, at 7:25, "Grodowitz, Nathan T." <grodowitznt(a)ornl.gov> wrote:
>
> Hello Mandar,
>
> I recently did some work with 2.6 as a client on ubuntu 12.04. These are some of the steps that were taken below:
>
> apt-get source module-assistant
> cd module-assistant
>
> # Get the patches from the following pages
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702648
> # save detection.patch
> http://debian.2.n7.nabble.com/Bug-697269-FTBS-with-uapi-patched-module-as...
> # save parsing.patch
>
> patch < detection.patch
> patch < parsing.patch
>
> sudo apt-get build-dep module-assistant
> dpkg-buildpackage -rfakeroot -us -uc
> cd ../
> sudo dpkg -i module-assistant-0.14.deb
>
> Then you can take a look at LU-1706 <https://jira.hpdd.intel.com/browse/LU-1706>, which explains build script patching that must take place, even on newer versions of Ubuntu.
>
> Once that is done, follow these steps to build the client. For your build, you will want to build the server as well.
>
> cd lustre-release
> git checkout v2_6_0
> sh ./autogen.sh
> ./configure --disable-server --disable-iokit --disable-tests
> make debs
> cd debs
> sudo dpkg -i lustre-client-modules-3.11.0-15-generic_2.6.0-1_amd64.deb lustre-utils_2.6.0-1_amd64.deb
>
>
> Hopefully that helps.
>
> Nathan Grodowitz
>
>
> On Nov 5, 2014, at 8:10 AM, Mandar Joshi <mandar.joshi(a)calsoftinc.com> wrote:
>
> Hi,
> I have a few questions about building and installing a Lustre server on Ubuntu. We have a running Lustre setup on RHEL and are now trying to build and install on Ubuntu.
>
> 1. Can the latest Lustre (2.6+) server be built on Ubuntu with the same steps as for RHEL, or are the steps different for Ubuntu?
> 2. Has anyone tried this before, and is the Lustre 2.6.x server supported on Ubuntu?
> 3. In another experiment, I took my working RHEL RPMs (patched kernel & Lustre) and converted them to .deb packages using "alien". I could install all of the packages and then boot into my Lustre-patched kernel. I am able to mkfs.lustre my MGS (and even mount it), but mkfs.lustre for the MDT fails.
> a. First it gave errors like "Invalid filesystem option set", so I installed e2fsprogs-1.42.7.wc1-7.el6.x86_64.rpm and e2fsprogs-libs-1.42.7.wc1-7.el6.x86_64.rpm (which I had used on RHEL) via "alien".
> b. It now gives another error: "Filesystem has unsupported feature(s) while setting up superblock". I found a similar issue at https://www.mail-archive.com/pkg-lustre-maintainers@lists.alioth.debian... . I am NOT getting the message "Your mke2fs.conf file does not define the ldiskfs filesystem type." I still tried the solution suggested in the next mail, which did not work, and the links for ldiskfsprogs are not working either. Do I need some other version of e2fsprogs?
> Can anyone shed light on this?
>
> Thanks,
> Mandar Joshi
7 years, 6 months
Join OpenSFS at SC'14
by OpenSFS Administration
Hello Lustre community members,
We hope you're planning to join us for this year's Supercomputing (SC'14)
conference <http://sc14.supercomputing.org/>, which is being held in New
Orleans, November 17-20th. OpenSFS will be hosting a booth, as well as
several sessions on the Lustre® file system. As a member of the Lustre
community, we hope you will mark your calendars to attend the following
events:
* SC'14 BoF Session: The Lustre® Community - At the Heart of HPC and
Big Data
o Tuesday, November 18th: 12:15-1:15pm in Room(s) 275-76-77
o Speakers include: Eric Barton, Intel; Chris Morrone, Lawrence Livermore
National Laboratory; Nathan Rutman, Seagate Technology; Galen Shipman, Oak
Ridge National Laboratory; Hugo Falter, EOFS
* OpenSFS Community Update
o Wednesday, November 19th: 1:30-2:00pm at the Seagate booth (#3239)
o Steve Simms, OpenSFS Community Board member, Indiana University
* OpenSFS booth party - stop by to join us for drinks and meet
others in the community!
o Wednesday, November 19th: 4:00-6:00pm at the OpenSFS booth (#2524)
* Daily Technical Sessions: Have questions about Lustre features, or
want to know how others are using Lustre? OpenSFS will be hosting dedicated
sessions throughout SC'14 to "talk shop" with a Lustre expert. Feel free to
join us during the following times to learn more about Lustre:
o Tuesday, November 18th: 2:30-3:00pm (Lustre tutorial)
o Wednesday, November 19th: 11:00-11:30am (Lustre tutorial), 1:00-2:00pm
(Lustre expert Q&A), 2:00-2:30pm (Lustre tutorial), 4:00-5:00pm (Lustre
expert Q&A)
o Thursday, November 20th: 11:00-11:30am (Lustre tutorial), 1:00-2:00pm
(Lustre expert Q&A)
Lastly, we encourage you to spread the word about the SC events planned for
the Lustre community! More information is available on our web site
<http://www.opensfs.org/events/sc14/>. We have also prepared the
attached social media/website content for your marketing teams to use in
promoting SC'14.
We look forward to seeing you at SC'14 and encourage you to take advantage
of the opportunities to join these events!
Sincerely,
OpenSFS Administration
Open Scalable File Systems, Inc. is a strong and growing nonprofit
organization dedicated to the success of the Lustre® file system. OpenSFS
was founded in 2010 to advance Lustre development, ensuring it remains
vendor-neutral, open, and free <http://lustre.opensfs.org/download-lustre/>.
Since its inception, OpenSFS has been responsible for advancing the
Lustre file system and delivering new releases
<http://lustre.opensfs.org/community-lustre-roadmap/> on behalf
of the open source community. Through working groups, events, and ongoing
funding initiatives, OpenSFS harnesses the power of collaborative
development to fuel innovation and growth of the Lustre file system
worldwide.
All trademarks are the property of their respective owners. For more
information visit opensfs.org.
7 years, 6 months
ZFS + Lustre + Pacemaker question
by Brian Musson
Hi everybody! This is my first post to this mailing-list.
I am setting up a simple 5-node Lustre storage system for experimental use in my basement. I am using RHEL 6.6 and Lustre server 2.5.4. The backend is a shared iSCSI storage resource; the iSCSI targets serve as the OSTs and are allowed to mount on each of the OSSes. At this time, I have a fully functional Lustre configuration: four OSTs, four OSS nodes, and one dedicated MDS/MGS.
Pacemaker is up and fully functional, too. That is, I have fencing configured with STONITH, and "pcs status" shows me everything is healthy. I can also simulate a failure and mount the OST on the fail-node. No problems there. I just can't figure out the automatic process.
While researching how to configure OSTs as a cluster resource, I have found many examples that explain how to set up "ldiskfs" for failover. However, from what I can tell, there is no comparably detailed documentation for the ZFS portion of it all.
How I understand it:
When Pacemaker detects an unstable node, fencing is performed on the offending node, while at the same time the resource (let's say OST2) is moved to the fail-node. The fail-node then mounts OST2 and serves it.
The examples outlined here for use with ocf:heartbeat:Filesystem seem straightforward.
primitive resMyOST ocf:heartbeat:Filesystem \
meta target-role="stopped" \
operations $id="resMyOST-operations" \
op monitor interval="120" timeout="60" \
op start interval="0" timeout="300" \
op stop interval="0" timeout="300" \
params device="device" directory="directory" fstype="lustre"
… How does this work for ZFS + Lustre? From what I can tell, it will try to mount the OST into /mnt/lustre/foreign/ost2 on the fail-node and into /mnt/lustre/local/ost2 on the primary OSS, so the static configuration for "directory" doesn't seem like it will work here. This resource example appears to be missing some additional steps for the ZFS piece.
Does somebody have a working example from pacemaker about how to facilitate the ZFS OST as a resource for the cluster? Documentation is just as good.
It seems overly complicated for what I need it to do:
"if Pacemaker detects a problem with an OSS, shoot it in the head and perform 'service lustre start <failed_ost>' on the standby node."
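To make that concrete, here is the rough shape I imagine in pcs terms, assuming the stock ocf:heartbeat:ZFS agent from resource-agents handles the zpool import/export; the pool, dataset, and mount point names are made up, and I haven't tested this:
# import the OST's zpool on whichever node runs the resource
pcs resource create ost2-pool ocf:heartbeat:ZFS pool=ostpool2 \
    op start timeout=300 op stop timeout=300
# then mount the Lustre OST dataset from that pool
pcs resource create ost2-fs ocf:heartbeat:Filesystem \
    device=ostpool2/ost2 directory=/mnt/lustre/ost2 fstype=lustre \
    op monitor interval=120 timeout=60
# keep them together and ordered: pool first, then the mount
pcs resource group add ost2-group ost2-pool ost2-fs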
Thanks everybody in advance for taking time out of your busy day to answer my questions. Any suggestions are welcome.
Regards,
Brian
7 years, 6 months
Lustre build and install on Ubuntu
by Mandar Joshi
Hi,
I have a few questions about building and installing a Lustre
server on Ubuntu. We have a running Lustre setup on RHEL and are now trying
to build and install on Ubuntu.
1. Can the latest Lustre (2.6+) server be built on Ubuntu with the
same steps as for RHEL, or are the steps different for Ubuntu?
2. Has anyone tried this before, and is the Lustre 2.6.x server
supported on Ubuntu?
3. In another experiment, I took my working RHEL RPMs (patched kernel
& Lustre) and converted them to .deb packages using "alien". I could install
all of the packages and then boot into my Lustre-patched kernel. I am able
to mkfs.lustre my MGS (and even mount it), but mkfs.lustre for the MDT
fails.
a. First it gave errors like "Invalid filesystem option set", so I
installed e2fsprogs-1.42.7.wc1-7.el6.x86_64.rpm and
e2fsprogs-libs-1.42.7.wc1-7.el6.x86_64.rpm (which I had used on RHEL)
via "alien".
b. It now gives another error: "Filesystem has unsupported feature(s)
while setting up superblock". I found a similar issue at
https://www.mail-archive.com/pkg-lustre-maintainers@lists.alioth.debian.org/msg00263.html .
I am NOT getting the message "Your mke2fs.conf file does not define the
ldiskfs filesystem type." I still tried the solution suggested in the
next mail, which did not work, and the links for ldiskfsprogs are not
working either. Do I need some other version of e2fsprogs?
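In case it is relevant, this is how I have been checking which e2fsprogs the system is actually using; my assumption is that the Lustre-patched build should report a ".wc" suffix in its version:
# print the e2fsprogs version; the Whamcloud build shows something like 1.42.7.wc1
mke2fs -V
# list which e2fsprogs packages dpkg thinks are installed
dpkg -l | grep e2fsprogs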
Can anyone shed light on this?
Thanks,
Mandar Joshi
7 years, 6 months
Reconfigure networking on a running cluster
by Jérôme BECOT
Hi,
I've set up a little cluster (1 MGS/MDS and 2 OSSes) with Lustre 2.6 on a
private network (eth0/10.0.1.X), with some clients (~10), and everything
was pretty easy to do (except the pain of making Debian clients work
with newer kernels).
I wanted to add another network via eth1/172.27.7.X, but I couldn't find
clear information on how to do this. I did:
- unmount all the clients
- unmount OSTs and MDT
- unload lustre modules on both OSSs and the MDS with lustre_rmmod
- configured /etc/modprobe.d/lustre.conf for the new interfaces
("options lnet networks=tcp0(eth0),tcp1(eth1)")
If I restart the cluster, Lustre restarts in the same configuration,
except that lctl list_nids now outputs my networks correctly on all nodes.
But at this point the cluster is still served on the 10.0.1.X network only.
What is the process to tell Lustre to serve on eth1 as well?
I tried to follow the "Changing a Server NID" section of the manual, as
advised in an archived post, and ran:
- Unmount the clients.
- Unmount the MDT.
- Unmount all OSTs.
- If the MGS and MDS share a partition, start the MGS only: mount -t
lustre /dev/sdb -o nosvc /mdt
- Run the replace_nids command on the MGS: lctl replace_nids
10.0.1.60@tcp0,172.27.7.100@tcp1
- If the MGS and MDS share a partition, stop the MGS : umount /mdt
Then, when I tried to start the cluster again, I was unable to mount any
client. I had set up the clients' lustre.conf as well, and before these
changes I could mount a client on an OSS node (I know it's bad, but for
testing purposes it's doable), whereas after these changes I couldn't.
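In case it's useful, this is how I plan to verify basic LNet connectivity from a client once I can test again (NIDs as configured above):
# ping the MGS/MDS on the original network
lctl ping 10.0.1.60@tcp0
# ping the MGS/MDS on the new network
lctl ping 172.27.7.100@tcp1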
Did I miss something?
I can fetch the old logs if needed, because I simply rebuilt the Lustre
partition by running mkfs.lustre with the
--mgsnode=10.0.1.60,172.27.7.100 and --reformat options. It's running now,
but I'd like to know how to change the network setup without reformatting.
Thanks
--
Jérôme BECOT
Systems and Network Administrator
Molécules à visée Thérapeutique par des approches in Silico (MTi)
Univ Paris Diderot, UMRS973 Inserm
Case 013
Bât. Lamarck A, porte 412
35, rue Hélène Brion 75205 Paris Cedex 13
France
Tel : 01 57 27 83 82
7 years, 6 months