Hello all,
I have a running Lustre 2.5.0 + ZFS setup on top of CentOS 6.4 (using the
kernels available on the public Whamcloud site). My clients are on CentOS 6.5
(a minor version difference; I recompiled the client sources with the options
specified on the Whamcloud site).
Now I have some problems. I cannot judge how serious they are, as the only
symptoms I observe are slow responses on ls, rm and tar; apart from that,
it works great. I also export the filesystem over NFS, which sometimes hangs
the client on which it is exported, but I expect this is related to how many
service threads I have running on my servers (old machines).
However, my OSSes (I have two) keep writing these messages to the system
log:
xxxxxxxxxxxxxxxxxxxxxx kernel: SPL: Showing stack for process 3264
xxxxxxxxxxxxxxxxxxxxxx kernel: Pid: 3264, comm: txg_sync Tainted: P --------------- 2.6.32-358.18.1.el6_lustre.x86_64 #1
xxxxxxxxxxxxxxxxxxxxxx kernel: Call Trace:
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa01595a7>] ? spl_debug_dumpstack+0x27/0x40 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0161337>] ? kmem_alloc_debug+0x437/0x4c0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0163b13>] ? task_alloc+0x1d3/0x380 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0160f8f>] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa02926f0>] ? spa_deadman+0x0/0x120 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa016432b>] ? taskq_dispatch_delay+0x19b/0x2a0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0164612>] ? taskq_cancel_id+0x102/0x1e0 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa028259a>] ? spa_sync+0x1fa/0xa80 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810a2431>] ? ktime_get_ts+0xb1/0xf0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295707>] ? txg_sync_thread+0x307/0x590 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810560a9>] ? set_user_nice+0xc9/0x130
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0295400>] ? txg_sync_thread+0x0/0x590 [zfs]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162478>] ? thread_generic_wrapper+0x68/0x80 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffffa0162410>] ? thread_generic_wrapper+0x0/0x80 [spl]
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff81096a36>] ? kthread+0x96/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0ca>] ? child_rip+0xa/0x20
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff810969a0>] ? kthread+0x0/0xa0
xxxxxxxxxxxxxxxxxxxxxx kernel: [<ffffffff8100c0c0>] ? child_rip+0x0/0x20
Does anyone know whether this is a serious problem or just cosmetic? Is
there any way to solve it? Any hints?
Best regards,
Luka Leskovec