I have a running Lustre 2.5.0 + ZFS setup on top of CentOS 6.4 (the kernels
available on the public Whamcloud site). My clients are on CentOS 6.5
(a minor version difference; I recompiled the client sources with the options
specified on the Whamcloud site).
But now I have some problems. I cannot judge how serious they are, as the only
symptoms I observe are slow responses on ls, rm and tar; apart from that
it works great. I also export it over NFS, which sometimes hangs the client
on which it is exported, but I expect this is related to how many
service threads I have running on my servers (old machines).
But my OSSes (I have two) keep spitting out these messages into the system log:
xxxxxxxxxxxxxxxxxxxxxx kernel: SPL: Showing stack for process 3264
xxxxxxxxxxxxxxxxxxxxxx kernel: Pid: 3264, comm: txg_sync Tainted:
P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1
xxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa01595a7>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0161337>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0163b13>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0160f8f>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa02926f0>] ? spa_deadman+0x0/0x120
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa016432b>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0164612>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa028259a>] ? spa_sync+0x1fa/0xa80
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810a2431>] ? ktime_get_ts+0xb1/0xf0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295707>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810560a9>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295400>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162478>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162410>] ?
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81096a36>] ? kthread+0x96/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Does anyone know whether this is a serious problem or just cosmetic? Is there
any way to solve this? Any hints?
On 2014/03/10, 6:01 AM, "E.S. Rosenberg" <esr(a)cs.huji.ac.il> wrote:
On Tue, Mar 4, 2014 at 4:04 PM, Simmons, James A.
>>If yes, you most likely stumbled on https://jira.hpdd.intel.com/browse/LU-4209
>We are running the in-kernel client, kernel 3.13.x on Debian machines
>The lfs utility in use is from the lustre 2.4.2 branch
>I see that a patch was submitted to the linux-kernel team. Does anyone have a link to that patch? I'd like to either patch our kernel or see what version of the kernel we should be building; if it made it into 3.14-rcX then we'll build that...
Correct me if I'm wrong but it looks like that patch hit a dead end...
Andreas, did any fsdevs ever voice an opinion on generic/lustre flags?
The patch was resubmitted, and Al Viro had relatively little to say about it (i.e. he didn't completely hate it, which is good). I've since sent an updated patch to Greg again.
Any chance it'll be in 3.14?
The patch is currently in the staging tree; I'm not sure which release it will be in.
Lustre Software Architect
Intel High Performance Data Division
I've got a Lustre 2.4.2 file system built on CentOS 6.4 MDS/OSS/client (test only) systems, all using the same OS version and kernel 2.6.32-358.23.2.
I need to load the Lustre client on several compute systems running SLES11 SP2; the support matrix shows this is doable. How do I enable a SLES11 SP2 client, which has a different kernel build? I believe this requires recompiling Lustre on the client systems against the correct kernel source; what are the steps?
We at Cray recently noticed that on CentOS 6.4/5 servers even with
SELinux disabled there is still a noticeable amount of memory in use:
cat /proc/slabinfo | grep selinux
selinux_inode_security 7578 7579 72 53 1 : tunables 120 60 0 : slabdata 143 143 0
That's from an idle server, and it's not much usage (7579 objects, 72
bytes per object, ~ 0.5 MB of memory), but on an active server, there
can be millions of objects, leading to a few hundred MB of memory usage.
As far as I can tell, disabling selinux is required for Lustre servers
to function. When the kernel is built with SELinux disabled in the
.config, this memory usage goes away.
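The slab arithmetic quoted above can be reproduced with a short script. This is only a sketch: it assumes the slabinfo 2.x field order (name, active objects, total objects, object size, objects per slab, pages per slab, then the tunables and slabdata groups) and a 4096-byte page; the function name slab_usage is made up for illustration.

```python
# Sketch: estimate memory held by one slab cache from a /proc/slabinfo line.
# Assumes the slabinfo 2.x field order; PAGE_SIZE is an assumption.
PAGE_SIZE = 4096  # assumed x86_64 page size in bytes


def slab_usage(line):
    """Return (object_bytes, slab_bytes) for one slabinfo line."""
    fields = line.split()
    num_objs = int(fields[2])        # total allocated objects
    obj_size = int(fields[3])        # bytes per object
    pages_per_slab = int(fields[5])  # pages backing each slab
    # After the "slabdata" keyword: active_slabs, num_slabs, sharedavail.
    num_slabs = int(fields[fields.index("slabdata") + 2])
    object_bytes = num_objs * obj_size
    slab_bytes = num_slabs * pages_per_slab * PAGE_SIZE
    return object_bytes, slab_bytes


sample = ("selinux_inode_security 7578 7579 72 53 1 : tunables 120 60 0 "
          ": slabdata 143 143 0")
obj_bytes, slab_bytes = slab_usage(sample)
print(f"objects: {obj_bytes / 2**20:.2f} MiB, slabs: {slab_bytes / 2**20:.2f} MiB")
```

The second figure counts whole pages held by the cache, so it is slightly larger than the per-object total; either way the idle server lands at roughly half a megabyte.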
I'd like to propose changing the Lustre-provided RHEL kernel config file
so that SELinux is not built into the kernel.
This won't affect clients unless they're running the patched kernel -
only the patched Lustre server kernels are changed by this, and as far
as I know, servers always have SELinux disabled.
I wanted to float this to a broad audience and get any objections before
creating an LU and pushing a patch to Gerrit. Does anyone have a reason
not to do this?
- Patrick Farrell
Patch, in essence:
@@ -4411,15 +4411,7 @@
# CONFIG_SECURITY_ROOTPLUG is not set
-# CONFIG_SECURITY_SELINUX_POLICYDB_VERSION_MAX is not set
+# CONFIG_SECURITY_SELINUX is not set
# CONFIG_SECURITY_SMACK is not set
# CONFIG_SECURITY_TOMOYO is not set
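To check whether a given kernel config actually ends up with SELinux compiled out, something like the following could be run against the .config text. It is a rough sketch; config_state is a hypothetical helper, not part of any Lustre or kernel tooling, and the sample text only mirrors the lines from the patch above.

```python
# Sketch: report how a kernel .config sets a given option.
def config_state(config_text, option):
    """Return the option's value, 'not set', or 'absent'."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "not set"
        # Match "OPTION=" exactly so longer names with the same prefix
        # (e.g. ..._SELINUX_BOOTPARAM) are not picked up by mistake.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return "absent"


sample = """\
# CONFIG_SECURITY_ROOTPLUG is not set
# CONFIG_SECURITY_SELINUX is not set
# CONFIG_SECURITY_SMACK is not set
"""
print(config_state(sample, "CONFIG_SECURITY_SELINUX"))
```

On a real system the text would come from the kernel build directory's .config (or wherever the distribution installs it), which is left to the reader here.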
Lustre User Group 2014
Online Registration Closes Today
Today, March 31, 2014 is the last day to register online for the 12th Annual
Lustre® User Group (LUG) Conference <http://www.opensfs.org/lug14/>! Don't
miss the opportunity to participate in this exclusive event on April 8-10 in
With over 180 people already registered from over 50 different companies
worldwide, LUG offers you the unique opportunity to network with industry
leaders, end-users, developers, and vendors of Lustre.
The event will feature more than 30 Lustre-focused sessions, including the
first-annual State of the Lustre Community, as well as a number of
opportunities for attendees to actively participate in industry dialogue on
best practices and emerging technologies, including:
* Networking Reception: Join us for an evening reception to network
with fellow attendees.
* Poster Exhibition: Participate in the first LUG poster exhibition,
and learn about the OpenSFS Benchmarking Working Group, Cooperative work
with SAMBA on Lustre, Correlation of File and Batch System Activities in the
HRSK-II Project, and more!
* OpenSFS Working Group Meetings: Attendees are invited to join the
OpenSFS Working Group meetings following LUG on April 10, regardless of
their OpenSFS membership status.
<https://opensfs.wufoo.com/forms/lug-conference-2014/> Register Today!
We encourage you to review the <http://www.opensfs.org/lug-2014-agenda/>
LUG 2014 Agenda to learn more about the networking opportunities, poster
exhibition, and sessions at this year's conference.
We look forward to seeing you at LUG! If you have any questions, please feel
free to contact admin(a)opensfs.org.
OpenSFS LUG Planning Committee
3855 SW 153rd Drive Beaverton, OR 97006 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: <https://twitter.com/opensfs> @OpenSFS
Email: admin(a)opensfs.org | Website:
<http://www.opensfs.org/lug-2014-sponsorship/> Click here to learn how to
become a LUG Sponsor
Open Scalable File Systems, Inc. was founded in 2010 to advance Lustre
development, ensuring it remains vendor-neutral, open, and free. Since its
inception, OpenSFS has been responsible for advancing the Lustre file system
and delivering new releases on behalf of the open source community. Through
working groups, events, and ongoing funding initiatives, OpenSFS harnesses
the power of collaborative development to fuel innovation and growth of the
Lustre file system worldwide.
Is building the Lustre 2.1.6 client supported on SLES 11 SP2?
I have installed OFED-3.5-2 and the RPMs, but I then get an error
during the configure step of the Lustre client:
hous0162:/usr/src/lustre-2.1.6 # ./configure
checking for /boot/kernel.h... no
checking for /var/adm/running-kernel.h... no
checking for /usr/src/linux-3.0.13-0.27-obj/x86_64/default/.config... yes
checking for /usr/src/linux-3.0.13-0.27-obj/x86_64/default/include/linux/autoconf.h...
configure: error: Run make config in /usr/src/linux-3.0.13-0.27.
I have already run make config in the Linux source tree, and the .config file exists:
hous0162:/usr/src/linux-3.0.13-0.27 # ls -la .config
-rw-r--r-- 1 root root 76743 Mar 31 10:29 .config
Any suggestions in getting this to work would be appreciated!
Also, is it suggested to use mellanox ofed or are we ok with the ofed-3.5-2?
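One thing that may be worth checking for the configure error above: the script stops while looking for include/linux/autoconf.h, but as far as I recall kernels from 2.6.33 onwards generate include/generated/autoconf.h instead. A small sketch to see which header the build tree actually has (find_autoconf is a made-up helper; whether this Lustre release's configure script handles the newer layout is a guess on my part, not something confirmed here):

```python
import os

# Header locations relative to a kernel build directory. Older kernels used
# include/linux/autoconf.h; newer ones generate include/generated/autoconf.h.
CANDIDATES = ("include/linux/autoconf.h", "include/generated/autoconf.h")


def find_autoconf(build_dir):
    """Return the candidate autoconf.h paths that exist under build_dir."""
    return [rel for rel in CANDIDATES
            if os.path.isfile(os.path.join(build_dir, rel))]


# Build directory taken from the configure output quoted above.
obj_dir = "/usr/src/linux-3.0.13-0.27-obj/x86_64/default"
print(find_autoconf(obj_dir) or "no autoconf.h found")
```

If only the include/generated copy shows up, the "Run make config" message is probably misleading and the real issue is the header location rather than a missing .config.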
I am building some monitoring scripts for our Lustre setup and I am having trouble wrapping my head around these stats lines:
read_bytes 100195530 samples [bytes] 0 1048576 54085118202030
write_bytes 178494565 samples [bytes] 1 1048576 166914498969357
What does each of the numbers mean?
CAEN Advanced Computing
XSEDE Campus Champion
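For the stats question above, a minimal parsing sketch, assuming the usual Lustre stats layout of name, sample count, the literal word "samples", unit, minimum, maximum and running sum. The StatLine class and parse_stat function are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    name: str
    samples: int   # number of operations recorded
    unit: str      # e.g. "bytes"
    minimum: int   # smallest single operation
    maximum: int   # largest single operation
    total: int     # sum over all operations


def parse_stat(line):
    """Parse one 'name N samples [unit] min max sum' line (assumed layout)."""
    f = line.split()
    return StatLine(name=f[0], samples=int(f[1]), unit=f[3].strip("[]"),
                    minimum=int(f[4]), maximum=int(f[5]), total=int(f[6]))


lines = [
    "read_bytes 100195530 samples [bytes] 0 1048576 54085118202030",
    "write_bytes 178494565 samples [bytes] 1 1048576 166914498969357",
]
for text in lines:
    s = parse_stat(text)
    mean = s.total / s.samples if s.samples else 0.0
    print(f"{s.name}: {s.samples} ops, mean {mean:.0f} {s.unit}/op, "
          f"max {s.maximum}, total {s.total}")
```

Under that reading, read_bytes shows 100195530 read operations ranging from 0 to 1048576 bytes each (1 MiB), totalling 54085118202030 bytes. The counters are cumulative, so a monitoring script would normally sample them periodically and work with the differences.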