I have a running Lustre 2.5.0 + ZFS setup on top of CentOS 6.4 (using the kernels
available on the public Whamcloud site). My clients are on CentOS 6.5
(a minor version difference; I recompiled the client sources with the options
specified on the Whamcloud site).
Now I have some problems. I cannot judge how serious they are, as the only
symptoms I observe are slow responses on ls, rm and tar; apart from that
it works great. I also export it over NFS, which sometimes hangs the client
on which it is exported, but I expect this is related to how many
service threads I have running on my servers (old machines).
However, my OSSes (I have two) keep writing these messages to the system log:
xxxxxxxxxxxxxxxxxxxxxx kernel: SPL: Showing stack for process 3264
xxxxxxxxxxxxxxxxxxxxxx kernel: Pid: 3264, comm: txg_sync Tainted:
P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1
xxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa01595a7>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0161337>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0163b13>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0160f8f>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa02926f0>] ? spa_deadman+0x0/0x120
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa016432b>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0164612>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa028259a>] ? spa_sync+0x1fa/0xa80
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810a2431>] ? ktime_get_ts+0xb1/0xf0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295707>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810560a9>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295400>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162478>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162410>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81096a36>] ? kthread+0x96/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Does anyone know whether this is a serious problem or just cosmetic? Is there
any way to solve it? Any hints?
I'd like to announce the availability of Lester, the Lustre lister. It
is available from github at https://github.com/ORNL-TechInt/lester
Lester is an extension of (n)e2scan for generating lists of files (and
potentially their attributes) from a ext2/ext3/ext4/ldiskfs filesystem.
We primarily use it for generating a purge candidate list, but it is
also useful for generating a list of files affected by an OST outage or
providing a name for an inode.
For example, to list files that have not been accessed in two weeks and
put the output in ne2scan format in $OUTFILE:
touch -d 'now - 2 weeks' /tmp/flag
lester -A fslist -a before=/tmp/flag -o $OUTFILE $BLOCKDEV
To do the same thing, but generate a full listing of the filesystem in
touch -d 'now - 2 weeks' /tmp/flag
lester -A fslist -a before=/tmp/flag -a genhit=$UNACCESSED_LIST \
    -o $FULL_LIST $BLOCKDEV
To name inodes to stdout (when not using Lustre 2.4's LINKEA):
lester -A namei -a $INODE1 -a $INODE2 ... $BLOCKDEV
To get a list of files with objects on OSTs 999 and 1000:
lester -A lsost -a 999 -a 1000 -o $OUTFILE $BLOCKDEV
To get a list of options and actions, use 'lester -h'; to get a list of
options for a given action, use 'lester -A $ACTION -a help'.
Lester uses its own AIO-based IO engine by default, which is usually
much faster than the default Unix engine for large filesystems on
high-performance devices. The number of requests in flight, request
size, cache size, and read-ahead settings for various phases of the scan
are all configurable. I recommend experimenting with the settings to
find a balance between speed and resource usage for your situation.
More information about the gains we've seen in testing prototype versions
of Lester is in my LUG 2011 presentation,
National Center for Computational Science
Oak Ridge National Laboratory
Your web site shows that the latest el6/RPMS/x86_64 e2fsprogs
version is 1.42.7-wc2-7.
However, the latest tag in the git repo seems to be v1.42.7.wc1.
Which commit does 1.42.7-wc2-7 correspond to?
I am still struggling with the GET message on the LND level. After
adding a test to the lnet selftest, there is one GET going through the
LND, and after that nothing happens for some time, until this error message
is dumped to the console:
add test RPC failed on 12345-1@ex: Unknown error 18446744073709551506
Is there any way to find out what caused this error? That would be a
great help in finding what the LND does wrong.
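One observation that may help: the number in the message looks like a negative errno printed as an unsigned 64-bit integer. The sketch below reinterprets it as a signed value. That the selftest code prints error codes this way is an assumption, not something confirmed from the LNet source, and the helper names are only illustrative. On Linux, errno 110 is ETIMEDOUT, so if the assumption holds, the request timed out rather than being rejected outright.

```python
import errno
import os


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def describe(value):
    """Return (signed value, errno name, errno text) for a printed error number.

    Assumes the number is a negated errno; the name lookup uses the errno
    table of the machine running this script, which may differ from the
    table on the Lustre node.
    """
    signed = as_signed64(value)
    code = abs(signed)
    return signed, errno.errorcode.get(code, "unknown"), os.strerror(code)


if __name__ == "__main__":
    # The value reported by "add test RPC failed on 12345-1@ex"
    print(describe(18446744073709551506))
```

Running the script on a Linux machine gives the errno name that matches the kernel's numbering; on other systems the numeric value is still correct but the name may not be.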
I consulted the different log files like /var/log/dmesg and
/var/log/messages. On my system, there is no file log-lustre under /tmp.
So, I do not know how to investigate this error further. Any help on
this matter would be very much appreciated!
Thanks and kind regards
Based on my reading: shutting down the file system, including unmounting the clients, MDT/MGS, and OSTs, and then performing a writeconf would clear this permanently removed OST.
Question: I believe there is no issue in continuing to run the file system without shutting it down until the next scheduled maintenance, and then running writeconf to clear the deleted OSTs from the list on the MDS. Is that right?
Amit H. Kumar
I upgraded all servers from 2.1.6 to 2.4.2 following the docs.
This is a combined MGS/MDS.
tunefs.lustre --mdt --quota on MDT
tunefs.lustre --ost --quota on OSTs
lctl conf_param lustre.quota.mdt=ug
lctl conf_param lustre.quota.ost=ug
After remounting everything I'm getting:
[root@mds1 ~]# lctl get_param osd-*.*.quota_slave.info
target name: lustre-MDT0000
pool ID: 0
quota enabled: ug
conn to master: not setup yet
space acct: ug
user uptodate: glb,slv,reint
group uptodate: glb,slv,reint
Any suggestions on getting 'conn to master' to change to 'setup'?
The filesystem itself behaves normally.
I tried it in a VM on a fresh test fs with just MGS/MDS and the conn to
master was immediately "setup" even before adding any OSSes.
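For checking this across many targets, the quota_slave.info output shown above can be parsed with a short script. This is a minimal sketch that assumes every target's block starts with a "target name:" line and uses "key: value" lines as in the output above; the function names are hypothetical and the sample text is copied from this message.

```python
SAMPLE = """\
target name: lustre-MDT0000
pool ID: 0
quota enabled: ug
conn to master: not setup yet
space acct: ug
user uptodate: glb,slv,reint
group uptodate: glb,slv,reint
"""


def parse_quota_slave_info(text):
    """Split quota_slave.info output into one dict per target."""
    targets = []
    current = None
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip parameter-name lines and blank lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "target name":
            current = {}
            targets.append(current)
        if current is not None:
            current[key] = value
    return targets


def targets_not_connected(text):
    """Names of targets whose 'conn to master' field is anything but 'setup'."""
    return [
        t["target name"]
        for t in parse_quota_slave_info(text)
        if t.get("conn to master") != "setup"
    ]


if __name__ == "__main__":
    print(targets_not_connected(SAMPLE))
```

The text would normally come from running the lctl get_param command on each server and passing its output to targets_not_connected.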
Is it possible to have a Lustre client working on this kernel with DKMS or otherwise:
root@steamos:/home/desktop/git/lustre-release# uname -a
Linux steamos 3.10-3-amd64 #1 SMP Debian 3.10.11-1st1 (2014-01-21) x86_64 GNU/Linux
Call for Presentations: Lustre User Group 2014
Miami Marriott Biscayne Bay, April 8-10
Are you interested in presenting your company's best practices, case
studies, or business experiences to more than 200 Lustre-focused attendees?
Would you like to share specific Lustre development challenges, new
development features, project updates, or cutting-edge implementations? Then
we want to hear from you!
Don't miss your chance to submit a presentation for the 12th annual Lustre®
User Group (LUG) conference on April 8-10 at the
Miami Marriott Biscayne Bay in Miami, Florida! All LUG presentations will be
30 minutes, with a total of 25 speaking opportunities available. We
encourage you to review the agendas and caliber of presentations at
<http://www.opensfs.org/past-events/> past LUG events.
Submission deadline is January 31!
To submit an abstract:
* Visit the <https://www.easychair.org/conferences/?conf=lug2014> LUG
2014 Call for Presentations Submission website
* You will need to create a user account on Easy Chair (our online
submission platform) if you don't already have one
* After providing your user details via Easy Chair, you will be
prompted to complete the online submission process
* An abstract is only required for the submission process;
presentation materials will be requested once abstracts are reviewed and
<http://www.opensfs.org/events/lug14/> LUG 2014 will also feature panel
discussions where leading developers, vendors, and users of Lustre debate
future requirements, explore upcoming enhancements, and share real world
best practices. If you'd like to suggest a topic for a panel discussion - or
be part of a panel, please contact <mailto:firstname.lastname@example.org>
Registration will open in February and additional event logistics are
included on the LUG <http://www.opensfs.org/events/lug14/> event page. Be
sure to check back regularly, as speakers and session highlights will also
be posted over the coming weeks.
Please contact <mailto:email@example.com> admin(a)opensfs.org if you have any
OpenSFS LUG Planning Committee
3855 SW 153rd Drive Beaverton, OR 97006 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: <https://twitter.com/opensfs> @OpenSFS
Email: <mailto:firstname.lastname@example.org> admin(a)opensfs.org | Website:
<http://www.opensfs.org/lug-2014-sponsorship/> Click here to learn how to
become a 2014 LUG Sponsor
Open Scalable File Systems, Inc. was founded in 2010 to advance Lustre
development, ensuring it remains vendor-neutral, open, and free. Since its
inception, OpenSFS has been responsible for advancing the Lustre file system
and delivering new releases on behalf of the open source community. Through
working groups, events, and ongoing funding initiatives, OpenSFS harnesses
the power of collaborative development to fuel innovation and growth of the
Lustre file system worldwide.