lustre 2.5.0 + zfs
by luka leskovec
Hello all,
I have a running Lustre 2.5.0 + ZFS setup on top of CentOS 6.4 (the kernels
available on the public Whamcloud site). My clients are on CentOS 6.5
(a minor version difference; I recompiled the client sources with the options
specified on the Whamcloud site).
But now I have some problems. I cannot judge how serious they are, as the only
problems I observe are slow responses on ls, rm and tar; apart from that
it works great. I also export it over NFS, which sometimes hangs the client
on which it is exported, but I expect this is related to how many
service threads I have running on my servers (old machines).
My OSSes (I have two) keep writing these messages to the system
log:
xxxxxxxxxxxxxxxxxxxxxx kernel: SPL: Showing stack for process 3264
xxxxxxxxxxxxxxxxxxxxxx kernel: Pid: 3264, comm: txg_sync Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1
xxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa01595a7>] ? spl_debug_dumpstack+0x27/0x40 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0161337>] ? kmem_alloc_debug+0x437/0x4c0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0163b13>] ? task_alloc+0x1d3/0x380 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0160f8f>] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa02926f0>] ? spa_deadman+0x0/0x120 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa016432b>] ? taskq_dispatch_delay+0x19b/0x2a0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0164612>] ? taskq_cancel_id+0x102/0x1e0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa028259a>] ? spa_sync+0x1fa/0xa80 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810a2431>] ? ktime_get_ts+0xb1/0xf0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295707>] ? txg_sync_thread+0x307/0x590 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810560a9>] ? set_user_nice+0xc9/0x130
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295400>] ? txg_sync_thread+0x0/0x590 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162478>] ? thread_generic_wrapper+0x68/0x80 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162410>] ? thread_generic_wrapper+0x0/0x80 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81096a36>] ? kthread+0x96/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Does anyone know whether this is a serious problem or just cosmetic? Is there
any way to solve it? Any hints?
best regards,
Luka Leskovec
6 years, 9 months
Lester, the Lustre lister
by David Dillow
I'd like to announce the availability of Lester, the Lustre lister. It
is available from github at https://github.com/ORNL-TechInt/lester
Lester is an extension of (n)e2scan for generating lists of files (and
potentially their attributes) from an ext2/ext3/ext4/ldiskfs filesystem.
We primarily use it for generating a purge candidate list, but it is
also useful for generating a list of files affected by an OST outage or
providing a name for an inode.
For example, to list files that have not been accessed in two weeks and
put the output in ne2scan format in $OUTFILE:
touch -d 'now - 2 weeks' /tmp/flag
lester -A fslist -a before=/tmp/flag -o $OUTFILE $BLOCKDEV
To do the same thing, but generate a full listing of the filesystem in
parallel:
touch -d 'now - 2 weeks' /tmp/flag
lester -A fslist -a before=/tmp/flag -a genhit=$UNACCESSED_LIST \
-o $FULL_LIST $BLOCKDEV
To name inodes to stdout (when not using Lustre 2.4's LINKEA):
lester -A namei -a $INODE1 -a $INODE2 ... $BLOCKDEV
To get a list of files with objects on OSTs 999 and 1000:
lester -A lsost -a 999 -a 1000 -o $OUTFILE $BLOCKDEV
To get a list of options and actions, use 'lester -h'; to get a list of
options for a given action, use 'lester -A $ACTION -a help'.
Lester uses its own AIO-based IO engine by default, which is usually
much faster than the default Unix engine for large filesystems on
high-performance devices. The number of requests in flight, request
size, cache size, and read-ahead settings for various phases of the scan
are all configurable. I recommend experimenting with the settings to
find a balance between speed and resource usage for your situation.
More information about the gains we've seen in testing a prototype version
of Lester is in my LUG 2011 presentation,
http://www.opensfs.org/wp-content/uploads/2012/12/500-530_David_Dillow_LU...
--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
7 years
The e2fsprogs git tree tagging is out of date
by Jay Lan
Hi,
Your web site shows that the latest el6/RPMS/x86_64 e2fsprogs
version is 1.42.7-wc2-7.
However, the latest tag in the git repo appears to be v1.42.7.wc1.
Which commit is 1.42.7-wc2-7 tagged to?
Thanks,
Jay
7 years
Problems with HSM
by Valvanuz Fernandez
Hello:
I've installed the Lustre Feature Release (2.5) from the Whamcloud website
on my CentOS 6 servers and followed the documentation to configure HSM.
It seems that the coordinator and the agent are running properly, but
when I try to archive a file from a Lustre client it fails. As I was not
sure whether there was a problem with the agent, I've instantiated 2 agents
on 2 different machines (one on the MDS, which also acts as a Lustre
client). The backend of the first agent is an NFS filesystem and the
backend of the second is XFS.
The error I get when I try to archive a file from any of my clients is
the following:
[root@wn021 ~]# lfs hsm_archive /lustre/collectl.conf
Cannot send HSM request (use of /lustre/collectl.conf): Invalid argument
Below, I've attached the output of the commands I used to test
the configuration and the strace output of "lfs hsm_archive".
Could somebody help me? Thanks in advance.
Valvanuz
[root@wn024 ~]# lctl get_param mdt.lustrefs-MDT0000.hsm_control
mdt.lustrefs-MDT0000.hsm_control=enabled
[root@wn024 ~]# lctl get_param -n mdt.lustrefs-MDT0000.hsm.agents
uuid=62a6a7ae-1c24-1380-e960-443ee68c4e80 archive_id=1
requests=[current:0 ok:0 errors:0]
uuid=5f5d5627-7fa6-1de0-6823-7d80db8d482d archive_id=2
requests=[current:0 ok:0 errors:0]
[root@wn031 ~]# ps -ef|grep hsm
root 20437 1 0 Jan22 ? 00:00:00 lhsmtool_posix --daemon
--hsm-root /localtmp/lustrehsm --archive=2 /lustre
[root@wn024 ~]# ps -ef|grep hsm
root 11750 1 0 Jan22 ? 00:00:00 lhsmtool_posix --daemon
--hsm-root /oceano/gmeteo/WORK/valva/lustre --archive=1 /lustre
[root@wn021 ~]# lfs hsm_archive /lustre/collectl.conf
Cannot send HSM request (use of /lustre/collectl.conf): Invalid argument
[root@wn021 ~]# lfs hsm_state /lustre/collectl.conf
/lustre/collectl.conf: (0x00000000)
[root@wn021 ~]# strace lfs hsm_archive /lustre/collectl.conf
execve("/usr/bin/lfs", ["lfs", "hsm_archive", "/lustre/collectl.conf"],
[/* 69 vars */]) = 0
brk(0) = 0xa43000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258d9000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or
directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=54707, ...}) = 0
mmap(NULL, 54707, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f90258cb000
close(3) = 0
open("/lib64/libpthread.so.0", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\\\0\0\0\0\0\0"..., 832)
= 832
fstat(3, {st_mode=S_IFREG|0755, st_size=142464, ...}) = 0
mmap(NULL, 2212768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f902549e000
mprotect(0x7f90254b5000, 2097152, PROT_NONE) = 0
mmap(0x7f90256b5000, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17000) = 0x7f90256b5000
mmap(0x7f90256b7000, 13216, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f90256b7000
close(3) = 0
open("/lib64/libreadline.so.6", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0PE\1\0\0\0\0\0"..., 832)
= 832
fstat(3, {st_mode=S_IFREG|0755, st_size=269592, ...}) = 0
mmap(NULL, 2370056, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f902525b000
mprotect(0x7f9025295000, 2097152, PROT_NONE) = 0
mmap(0x7f9025495000, 32768, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3a000) = 0x7f9025495000
mmap(0x7f902549d000, 2568, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f902549d000
close(3) = 0
open("/lib64/libncurses.so.5", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000j\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=140096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258ca000
mmap(NULL, 2235624, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f9025039000
mprotect(0x7f902505b000, 2093056, PROT_NONE) = 0
mmap(0x7f902525a000, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x21000) = 0x7f902525a000
close(3) = 0
open("/lib64/libkeyutils.so.1", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\v\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=10192, ...}) = 0
mmap(NULL, 2105424, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f9024e36000
mprotect(0x7f9024e38000, 2093056, PROT_NONE) = 0
mmap(0x7f9025037000, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x7f9025037000
close(3) = 0
open("/lib64/libc.so.6", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\355\1\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1916568, ...}) = 0
mmap(NULL, 3745960, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f9024aa3000
mprotect(0x7f9024c2d000, 2093056, PROT_NONE) = 0
mmap(0x7f9024e2c000, 20480, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x189000) = 0x7f9024e2c000
mmap(0x7f9024e31000, 18600, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f9024e31000
close(3) = 0
open("/lib64/libtinfo.so.5", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\310\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=135896, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258c9000
mmap(NULL, 2232320, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f9024882000
mprotect(0x7f902489f000, 2097152, PROT_NONE) = 0
mmap(0x7f9024a9f000, 16384, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d000) = 0x7f9024a9f000
close(3) = 0
open("/lib64/libdl.so.2", O_RDONLY) = 3
read(3,
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\r\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=19536, ...}) = 0
mmap(NULL, 2109696, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f902467e000
mprotect(0x7f9024680000, 2097152, PROT_NONE) = 0
mmap(0x7f9024880000, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f9024880000
close(3) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258c8000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258c7000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258c6000
arch_prctl(ARCH_SET_FS, 0x7f90258c7700) = 0
mprotect(0x7f9024880000, 4096, PROT_READ) = 0
mprotect(0x7f9024e2c000, 16384, PROT_READ) = 0
mprotect(0x7f9025037000, 4096, PROT_READ) = 0
mprotect(0x7f90256b5000, 4096, PROT_READ) = 0
mprotect(0x7f90258da000, 4096, PROT_READ) = 0
munmap(0x7f90258cb000, 54707) = 0
set_tid_address(0x7f90258c79d0) = 15850
set_robust_list(0x7f90258c79e0, 0x18) = 0
futex(0x7fffd860f6bc, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x7fffd860f6bc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1,
NULL, 7f90258c7700) = -1 EAGAIN (Resource temporarily unavailable)
rt_sigaction(SIGRTMIN, {0x7f90254a3ae0, [], SA_RESTORER|SA_SIGINFO,
0x7f90254ad500}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x7f90254a3b70, [],
SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f90254ad500}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM_INFINITY}) = 0
shmget(IPC_PRIVATE, 65680, 0600) = 65536
shmat(65536, 0, 0) = ?
shmctl(65536, IPC_RMID, 0) = 0
brk(0) = 0xa43000
brk(0xa64000) = 0xa64000
lstat("/lustre/collectl.conf", {st_mode=S_IFREG|0644, st_size=7361,
...}) = 0
open("/lustre/collectl.conf", O_RDONLY|O_NONBLOCK|O_NOFOLLOW) = 3
ioctl(3, 0x800866ad, 0xa43048) = 0
close(3) = 0
lstat("/lustre", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("/lustre/collectl.conf", {st_mode=S_IFREG|0644, st_size=7361,
...}) = 0
open("/etc/mtab", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=344, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0x7f90258d8000
read(3, "/dev/md0 / ext4 rw 0 0\nproc /pro"..., 4096) = 344
read(3, "", 4096) = 0
close(3) = 0
munmap(0x7f90258d8000, 4096) = 0
open("/lustre", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
ioctl(3, 0x401866d9, 0xa43030) = -1 EINVAL (Invalid argument)
close(3) = 0
write(2, "Cannot send HSM request (use of "..., 73Cannot send HSM
request (use of /lustre/collectl.conf): Invalid argument
) = 73
rt_sigaction(SIGINT, {0x40b230, [RT_1 RT_2 RT_3 RT_4 RT_5 RT_6 RT_7 RT_8
RT_9 RT_10 RT_11 RT_12 RT_13 RT_14 RT_15], SA_RESTORER|SA_RESTART,
0x7f90254ad500}, NULL, 8) = 0
exit_group(22) = ?
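The call that fails in the trace above is the second ioctl, with request number 0x401866d9. As a hedged aid for reading such traces, the sketch below splits a request number into its fields. It assumes the generic Linux _IOC encoding used on x86-64 (8-bit number, 8-bit type, 14-bit size, 2-bit direction); other architectures may differ. The helper name decode_ioctl is hypothetical, and mapping the result to a specific Lustre ioctl name still requires checking the Lustre headers for the installed version.

```python
# Sketch: split a Linux ioctl request number into its _IOC fields.
# Assumes the asm-generic layout (x86-64):
#   bits 0-7 number, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
# decode_ioctl is a hypothetical helper, not part of Lustre or strace.

_DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Return (direction, size_in_bytes, type_char, number) for a request value."""
    number = request & 0xFF
    type_char = chr((request >> 8) & 0xFF)
    size = (request >> 16) & 0x3FFF
    direction = _DIRECTIONS[(request >> 30) & 0x3]
    return direction, size, type_char, number


if __name__ == "__main__":
    # The two request numbers that appear in the strace output.
    for req in (0x800866AD, 0x401866D9):
        print(hex(req), decode_ioctl(req))
```

For the failing request this yields a write-direction ioctl of type 'f', number 217, carrying a 24-byte argument, so EINVAL is being returned for that request on the /lustre directory handle rather than for the earlier per-file ioctl.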
7 years
Debugging LNet RPC Errors
by Tobias Groschup
Hello,
I am still struggling with the GET message on the LND level. After
adding a test to the LNet selftest, there is one GET going through the
LND, and after that nothing happens for some time, until this error message
is dumped to the console:
add test RPC failed on 12345-1@ex: Unknown error 18446744073709551506
Is there any way to find out what caused this error? That would be a
great help in finding what the LND does wrong.
I consulted the different log files like /var/log/dmesg and
/var/log/messages. On my system, there is no file log-lustre under /tmp.
So, I do not know how to investigate this error further. Any help on
this matter would be very much appreciated!
Thanks and kind regards
Tobias Groschup
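One observation that may help with the message quoted above: the number in "Unknown error 18446744073709551506" looks like a negative return code printed as an unsigned 64-bit integer. A minimal sketch of that reinterpretation follows. The helper name to_signed64 is hypothetical, and the errno lookup uses the table of the machine where the script runs, so it is only meaningful if that matches the Linux node that logged the error.

```python
import errno
import os


def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    code = to_signed64(18446744073709551506)
    print(code)
    # Look up the symbolic name and message for the positive errno value.
    print(errno.errorcode.get(-code, "unknown"), os.strerror(-code))
```

The value reinterprets to -110. On Linux, errno 110 is ETIMEDOUT, which would be consistent with a request that never received a reply, though that reading is an inference and is not confirmed by the post.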
7 years, 1 month
Any issues in leaving deleted OSTs as is, without writeconf
by Kumar, Amit
Dear All,
Based on my reading, shutting down the file system (unmounting the clients, the MDT/MGS, and the OSTs) and then performing a writeconf would permanently clear the removed OST.
Question: I do not see any issue in continuing to run the file system without shutting down until the next scheduled maintenance, and only then running writeconf to clear the deleted OSTs from the list on the MDS. Is that right?
Please advise.
Thank you,
Amit H. Kumar
7 years, 1 month
Cannot turn on quotas on 2.1.6 -> 2.4.2 upgrade: not setup yet
by Anthony Alba
Hi list,
I upgraded all servers from 2.1.6 to 2.4.2 following the docs.
This is a combined MGS/MDS.
On MGS/MDS:
tunefs.lustre --mdt --quota on MDT
On OSSs:
tunefs.lustre --ost --quota on OSTs
On MGS/MDS:
lctl conf_param lustre.quota.mdt=ug
lctl conf_param lustre.quota.os=ug
After remounting everything I'm getting:
[root@mds1 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.lustre-MDT0000.quota_slave.info=
target name: lustre-MDT0000
pool ID: 0
type: md
quota enabled: ug
conn to master: not setup yet
space acct: ug
user uptodate: glb[0],slv[0],reint[1]
group uptodate: glb[0],slv[0],reint[1]
Any suggestions on getting 'conn to master' to change to 'setup'?
The filesystem itself behaves normally.
I tried it in a VM on a fresh test fs with just MGS/MDS and the conn to
master was immediately "setup" even before adding any OSSes.
- Anthony
7 years, 1 month
DKMS Debian
by Arden Wiebe
Is it possible to have a Lustre client working on this kernel with DKMS or otherwise:
root@steamos:/home/desktop/git/lustre-release# uname -a
Linux steamos 3.10-3-amd64 #1 SMP Debian 3.10.11-1st1 (2014-01-21) x86_64 GNU/Linux
7 years, 1 month
LUG 2014: Presentation Submission Deadline is January 31
by OpenSFS Administration
<http://www.opensfs.org/events/lug14/>
Call for Presentations: Lustre User Group 2014
Miami Marriott Biscayne Bay, April 8-10
Are you interested in presenting your company's best practices, case
studies, or business experiences to more than 200 Lustre-focused attendees?
Would you like to share specific Lustre development challenges, new
development features, project updates, or cutting-edge implementations? Then
we want to hear from you!
Don't miss your chance to submit a presentation for the 12th annual Lustre®
User Group (LUG) conference on April 8-10 at the
<http://www.marriott.com/hotels/travel/miabb-miami-marriott-biscayne-bay/>
Miami Marriott Biscayne Bay in Miami, Florida! All LUG presentations will be
30 minutes, with a total of 25 speaking opportunities available. We
encourage you to review the agendas and caliber of presentations at
<http://www.opensfs.org/past-events/> past LUG events.
Submission deadline is January 31!
To submit an abstract:
* Visit the <https://www.easychair.org/conferences/?conf=lug2014> LUG
2014 Call for Presentations Submission website
* You will need to create a user account on Easy Chair (our online
submission platform) if you don't already have one
* After providing your user details via Easy Chair, you will be
prompted to complete the online submission process
* An abstract is only required for the submission process;
presentation materials will be requested once abstracts are reviewed and
selected
<http://www.opensfs.org/events/lug14/> LUG 2014 will also feature panel
discussions where leading developers, vendors, and users of Lustre debate
future requirements, explore upcoming enhancements, and share real world
best practices. If you'd like to suggest a topic for a panel discussion - or
be part of a panel, please contact <mailto:panelist@opensfs.org>
admin(a)opensfs.org.
Registration will open in February and additional event logistics are
included on the LUG <http://www.opensfs.org/events/lug14/> event page. Be
sure to check back regularly, as speakers and session highlights will also
be posted over the coming weeks.
Please contact <mailto:admin@opensfs.org> admin(a)opensfs.org if you have any
questions.
Best regards,
OpenSFS LUG Planning Committee
_________________________
OpenSFS Administration
3855 SW 153rd Drive Beaverton, OR 97006 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: <https://twitter.com/opensfs> @OpenSFS
Email: <mailto:admin@opensfs.org> admin(a)opensfs.org | Website:
<http://www.opensfs.org> www.opensfs.org
<http://www.opensfs.org/lug-2014-sponsorship/> Click here to learn how to
become a 2014 LUG Sponsor
Open Scalable File Systems, Inc. was founded in 2010 to advance Lustre
development, ensuring it remains vendor-neutral, open, and free. Since its
inception, OpenSFS has been responsible for advancing the Lustre file system
and delivering new releases on behalf of the open source community. Through
working groups, events, and ongoing funding initiatives, OpenSFS harnesses
the power of collaborative development to fuel innovation and growth of the
Lustre file system worldwide.
7 years, 1 month