Hello.
OST10 remains inactive. Two disks failed in its RAID array, so the OST was disabled and the data on it was unlinked.
The Lustre version is 2.5.3.
Do you have any tips to solve this problem?
Thanks!
________________________________
From: Tim Carlson <tim.s.carlson(a)gmail.com>
Sent: Friday, December 22, 2017 12:50 PM
To: Nicolas Gonzalez
Cc: hpdd-discuss(a)lists.01.org
Subject: Re: [HPDD-discuss] An unbalancing Lustre fs write the first ACTIVE OST always
I have seen this before where an INACTIVE OST will stop Lustre from using OSTs past that
number. Can you reactivate OST10? In my case this was Lustre 2.5.4
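A minimal sketch of what reactivating the OST could look like, assuming the `jaopost` target naming from the `lfs df` listing below (the device number is a placeholder; check `lctl dl` for the real one on your system):

```shell
# List configured devices and find the OSC entry for the inactive target
# (OST10 in the listing is index 16, i.e. hex 0010):
lctl dl | grep OST0010

# Reactivate by device number (27 is illustrative; use the number lctl dl prints):
lctl --device 27 activate

# Equivalent per-parameter form; run on the MDS to re-enable object allocation:
lctl set_param osc.jaopost-OST0010-osc-MDT0000.active=1
```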
On Fri, Dec 22, 2017 at 3:21 AM, Nicolas Gonzalez
<Nicolas.Gonzalez@alma.cl<mailto:Nicolas.Gonzalez@alma.cl>> wrote:
Hello
We have a Lustre fs for data reduction, and this is the current usage distribution:
UUID 1K-blocks Used Available Use% Mounted on
jaopost-MDT0000_UUID 652420096 35893004 573024684 6% /.lustre/jaopost[MDT:0]
jaopost-MDT0001_UUID 307547736 834192 286206104 0% /.lustre/jaopost[MDT:1]
jaopost-OST0000_UUID 15617202700 15384873240 232295720 99% /.lustre/jaopost[OST:0]
jaopost-OST0001_UUID 15617202700 15418334924 198855308 99% /.lustre/jaopost[OST:1]
jaopost-OST0002_UUID 15617202700 15462419636 154754580 99% /.lustre/jaopost[OST:2]
jaopost-OST0003_UUID 15617202700 15461905276 155125548 99% /.lustre/jaopost[OST:3]
jaopost-OST0004_UUID 15617202700 15476870016 140305764 99% /.lustre/jaopost[OST:4]
jaopost-OST0005_UUID 15617202700 15550920180 66263692 100% /.lustre/jaopost[OST:5]
jaopost-OST0006_UUID 15617202700 15495824888 121358212 99% /.lustre/jaopost[OST:6]
jaopost-OST0007_UUID 15617202700 15509071792 108086048 99% /.lustre/jaopost[OST:7]
jaopost-OST0008_UUID 15617202700 15465714268 151463980 99% /.lustre/jaopost[OST:8]
jaopost-OST0009_UUID 15617202700 15490943928 126146476 99% /.lustre/jaopost[OST:9]
jaopost-OST000a_UUID 15617202700 15447985132 169182460 99% /.lustre/jaopost[OST:10]
jaopost-OST000b_UUID 15617202700 15364135336 253034356 98% /.lustre/jaopost[OST:11]
jaopost-OST000c_UUID 15617202700 15532906368 84281576 99% /.lustre/jaopost[OST:12]
jaopost-OST000d_UUID 15617202700 15485639672 131543112 99% /.lustre/jaopost[OST:13]
jaopost-OST000e_UUID 15617202700 15528786804 88404480 99% /.lustre/jaopost[OST:14]
jaopost-OST000f_UUID 15617202700 15523110328 94092292 99% /.lustre/jaopost[OST:15]
OST0010 : inactive device
jaopost-OST0011_UUID 15617202700 13303847400 2313354908 85% /.lustre/jaopost[OST:17]
jaopost-OST0012_UUID 15617202700 2593078056 13024119288 17% /.lustre/jaopost[OST:18]
jaopost-OST0013_UUID 15617202700 580724544 15036476468 4% /.lustre/jaopost[OST:19]
jaopost-OST0014_UUID 15617202700 1793039312 13824161232 11% /.lustre/jaopost[OST:20]
jaopost-OST0015_UUID 15617202700 4323099708 11294102856 28% /.lustre/jaopost[OST:21]
jaopost-OST0016_UUID 15617202700 281201736 15336000780 2% /.lustre/jaopost[OST:22]
jaopost-OST0017_UUID 15617202700 110096064 15507106444 1% /.lustre/jaopost[OST:23]
jaopost-OST0018_UUID 15617202700 2858929908 12758272512 18% /.lustre/jaopost[OST:24]
OSTs 17-24 were added, and I followed the documented procedure: OSTs 0-15
were disabled and lfs_migrate was started. Because of problems with the
reduction software, every working folder has a stripe count of 1 and the
stripe offset set to -1.
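For reference, the disable-and-migrate procedure described above looks roughly like this (a sketch only; target names follow the jaopost naming from the df listing, and the deactivation is run on the MDS so existing data stays readable while new allocations stop):

```shell
# On the MDS: stop new object allocations on an old OST (example: OST0000)
lctl set_param osc.jaopost-OST0000-osc-MDT0000.active=0

# On a client: find files with objects on that OST and migrate them off
lfs find /.lustre/jaopost --obd jaopost-OST0000_UUID -type f | lfs_migrate -y
```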
But only OST 17 filled up (some data was moved with lfs_migrate and forced
onto a specific OST). I ran some tests with the dd command and saw the same
behavior: only when I changed the stripe index to point at a specific OST
did dd write to a different target.
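The dd test mentioned above can be reproduced with something like the following (file paths are illustrative; `lfs getstripe` shows which OST object index each file actually landed on):

```shell
# Default striping: stripe count 1, offset -1 (the MDS chooses the OST)
lfs setstripe -c 1 -i -1 /.lustre/jaopost/test_default
dd if=/dev/zero of=/.lustre/jaopost/test_default bs=1M count=100

# Forced index: pin the file to a specific OST, e.g. index 18
lfs setstripe -c 1 -i 18 /.lustre/jaopost/test_forced
dd if=/dev/zero of=/.lustre/jaopost/test_forced bs=1M count=100

# Compare where each file's objects were placed
lfs getstripe /.lustre/jaopost/test_default /.lustre/jaopost/test_forced
```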
I repeated the test on another cluster we have, and with offset -1 the files
were correctly written across different OSTs.
I also changed the priority of the QOS algorithm
(/proc/fs/lustre/lov/*/qos_prio_free) to 100%, and the result was the same.
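The same tunable can also be inspected and set through lctl, along with the round-robin threshold that determines when QOS (free-space-weighted) allocation takes over from round-robin at all; the values below are for experimentation, not a recommendation:

```shell
# Current QOS settings
lctl get_param lov.*.qos_prio_free lov.*.qos_threshold_rr

# Weight OST selection entirely by free space
lctl set_param lov.*.qos_prio_free=100

# Lower the RR threshold so imbalanced OSTs trigger QOS allocation sooner
lctl set_param lov.*.qos_threshold_rr=10
```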
Do you have any idea what the root cause is?
Could it be a bug or a setup problem?
Thanks in advance...
_______________________________________________
HPDD-discuss mailing list
HPDD-discuss@lists.01.org<mailto:HPDD-discuss@lists.01.org>
https://lists.01.org/mailman/listinfo/hpdd-discuss