o2iblnd peer_credits and concurrent_sends
by Prescott, Craig P
Hi,
Yesterday I tried adding
options ko2iblnd peer_credits=126 concurrent_sends=63
to our /etc/modprobe.d/lnet.conf on all of our IB-connected Lustre 1.8 clients and servers. The motivation for wanting to try it came from section VI of this CUG12 paper:
https://cug.org/proceedings/attendee_program_cug2012/includes/files/pap16...
Here are the client syslog messages which resulted:
Jul 2 18:46:02 c0a-s1 kernel: current num of QPs 0x7
Jul 2 18:46:02 c0a-s1 kernel: command failed, status bad parameter(0x3), syndrome 0x317227
Jul 2 18:46:02 c0a-s1 kernel: LustreError: 2360:0:(o2iblnd.c:808:kiblnd_create_conn()) Can't create QP: -22, send_wr: 16191, recv_wr: 130
Here is a corresponding server message:
Jul 2 18:47:30 ts-lfs-01 kernel: LustreError: 7744:0:(o2iblnd_cb.c:2529:kiblnd_rejected()) 10.13.68.116@o2ib rejected: o2iblnd no resources
I'm not sure what to make of the above - are the values of peer_credits and concurrent_sends that I tried to use too large? Are there other parameters which one must change in order to set o2iblnd peer_credits and/or concurrent_sends?
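(For reference, the requested send_wr value in the error above, 16191, works out to concurrent_sends * (max RDMA fragments + 1) = 63 * 257, which matches my understanding of how the 1.8-era o2iblnd sizes a QP's send queue when map_on_demand is not used; if that reading is right, the QP request simply exceeds what the HCA will accept per queue pair. A minimal arithmetic sketch, where both the 63 and the 256 are assumptions to be checked against o2iblnd.h on your systems:
/* Back-of-envelope check of the send_wr value from the error message.
 * Assumptions, not taken from the 1.8 source directly: concurrent_sends
 * as set in lnet.conf, and 256 max RDMA fragments (LNET_MAX_IOV) when
 * map_on_demand is unset. */
#include <stdio.h>

int main(void)
{
        int concurrent_sends = 63;
        int max_rdma_frags = 256;
        int send_wr = concurrent_sends * (max_rdma_frags + 1);

        printf("requested max_send_wr = %d\n", send_wr); /* prints 16191, as in the log */
        return 0;
})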
Thanks,
Craig
Show Your Support for Lustre!
by OpenSFS Administration
Hello Lustre community,
In an effort to drive awareness about the Lustre® technology and the
companies who are actively utilizing Lustre, we're asking for your help in
linking to the OpenSFS Lustre (http://lustre.opensfs.org/) and OpenSFS
organization (http://www.opensfs.org/) web sites. By highlighting these two
web sites, we'll increase the overall awareness of Lustre in the community
and ensure that participants have access to the best content available.
To help with these efforts, you can take these two simple steps:
- Include a link from your company's web site to www.lustre.opensfs.org
  o The lustre.opensfs.org site includes access to the latest Lustre
    releases, a Lustre overview, roadmap and development plans, issue
    tracking, and participation opportunities.
- Include a link from your company's web site to www.opensfs.org
  o The opensfs.org site includes details about OpenSFS, the non-profit
    organization supporting the HPC open-source file system software
    community.
If you're not an OpenSFS member, we hope you'll consider joining us!
Additionally, we encourage you to stay connected to the Lustre community by
following us on Twitter (@OpenSFS, https://twitter.com/opensfs), joining the
OpenSFS discuss (http://lists.opensfs.org/listinfo.cgi/discuss-opensfs.org)
and Lustre (http://lustre.opensfs.org/join-the-community/) mailing lists, or
joining an OpenSFS Working Group (http://lustre.opensfs.org/join-the-community/).
We appreciate your support as we continue to advance the Lustre file system!
Sincerely,
OpenSFS Administration
__________________________
OpenSFS Administration
3855 SW 153rd Drive Beaverton, OR 97006 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: @OpenSFS (https://twitter.com/opensfs)
Email: admin@opensfs.org | Website: www.opensfs.org
What is opc, and what is its significance?
by Akam
Hello,
In the logs, messages of the following form are generally seen:
00000100:00100000:2.0:1307742632.935994:0:23114:0:(service.c:1705:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io_126:30dc3ffd-9567-deb3-c9ed-423e4589e173+6:21865:x1371266995979156:12345-0@lo:10
00000100:00100000:2.0:1307742632.950439:0:23114:0:(service.c:1752:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io_126:30dc3ffd-9567-deb3-c9ed-423e4589e173+6:21865:x1371266995979156:12345-0@lo:10
What are these values? What do they signify? Are there any standard
definitions for them?
from lustre/include/lustre/lustre_idl.h
struct ptlrpc_body {
        struct lustre_handle pb_handle;
        __u32 pb_type;
        __u32 pb_version;
        __u32 pb_opc;            /* <-- the opc field */
        __u32 pb_status;
        __u64 pb_last_xid;
        __u64 pb_last_seen;
        __u64 pb_last_committed;
        __u64 pb_transno;
        __u32 pb_flags;
        __u32 pb_op_flags;
        __u32 pb_conn_cnt;
        __u32 pb_timeout;        /* for req, the deadline; for rep, the service est */
        __u32 pb_service_time;   /* for rep, actual service time; also used for
                                  * net_latency of req */
        __u32 pb_limit;
        __u64 pb_slv;
        /* VBR: pre-versions */
        __u64 pb_pre_versions[PTLRPC_NUM_VERSIONS];
        /* padding for future needs */
        __u64 pb_padding[4];
};
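(For what it's worth, pb_opc carries the RPC opcode, and the per-subsystem opcode enums are defined in the same header. Below is an excerpt of the OST opcode enum as I recall it from 1.8/2.x trees; treat the names and values as an assumption to verify against your own lustre_idl.h. If it matches your tree, the opc 10 in the trace above would be an OST_PUNCH (truncate) request.
/* Recollection of the OST opcode enum from lustre_idl.h; verify against
 * your own source tree before relying on it. */
typedef enum {
        OST_REPLY      =  0,
        OST_GETATTR    =  1,
        OST_SETATTR    =  2,
        OST_READ       =  3,
        OST_WRITE      =  4,
        OST_CREATE     =  5,
        OST_DESTROY    =  6,
        OST_GET_INFO   =  7,
        OST_CONNECT    =  8,
        OST_DISCONNECT =  9,
        OST_PUNCH      = 10,    /* the opc seen in the trace above */
        OST_OPEN       = 11,
        OST_CLOSE      = 12,
        OST_STATFS     = 13,
        OST_SYNC       = 16,
        /* ... */
} ost_cmd_t;)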
Thanks
--
cheers
Akam
Serious Performance Problems, Need help!!!
by Kumar, Amit
Dear Lustre,
We are having major performance problems this time, and it is hard to grasp what is going on.
Health checks all look good and the network looks good, but performance is bad.
(a) The lfs df output at the end of this email shows a couple of OSTs temporarily unavailable. This normally happens and they reconnect; they do reconnect here too, but then become unavailable again a short while later, and this keeps repeating.
(b) Also included below is the output of the following commands from every OSS:
cat /proc/fs/lustre/devices
lctl get_param ost.*.ost_io.threads_max
lctl get_param ost.*.ost_io.threads_started
grep -i LBUG /var/log/messages
cat /proc/fs/lustre/health_check
cat /proc/sys/lnet/nis
(c) Based on the RPC stats attached to this email, a large backlog of pages pending write appears to be a likely cause. The attached rpc_stats covers all OSTs.
(d) The LNet peer stats below also show a great deal of congestion toward two of the OSS nodes.
I am not sure how to approach reducing these performance problems. Almost all of the OSSes are seeing I/O wait, while the backend storage also looks good.
Can anybody please advise on possible issues that may be causing this, other than the file system being 88% full?
No changes were made to the system recently, except that in order to refresh the disks in one OST I temporarily deactivated it while migrating its data off. Since this problem appeared this Friday, I re-activated the deactivated OST so that I could add additional OSSes and OSTs for load balancing and thereby relieve the performance issues. It seemed to help a bit, but not much.
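(As an aside on reading the /proc/sys/lnet/peers output included further below: in my understanding of the 1.8 layout, the next-to-last column is the lowest send-credit count ever observed for that peer, so a large negative value there means many messages have had to queue waiting for credits toward that NID. A minimal sketch that flags such peers, assuming a column layout of nid/refs/state/max/rtr/min/tx/min/queue; check the header line on your own system before trusting it:
/* Sketch: report peers whose send credits have gone negative, i.e. peers
 * toward which LNet messages have had to queue.  The column layout below
 * is an assumption; compare it with the header printed on your system. */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/proc/sys/lnet/peers", "r");
        char line[256];

        if (!f) {
                perror("open /proc/sys/lnet/peers");
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                char nid[64], state[16];
                int refs, max, rtr, rtr_min, tx, tx_min, queue;

                /* assumed columns: nid refs state max rtr min tx min queue */
                if (sscanf(line, "%63s %d %15s %d %d %d %d %d %d",
                           nid, &refs, state, &max, &rtr, &rtr_min,
                           &tx, &tx_min, &queue) == 9 && tx_min < 0)
                        printf("%-20s send credits bottomed out at %d\n",
                               nid, tx_min);
        }
        fclose(f);
        return 0;
})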
Best,
Thank you,
Amit
Below is the output of the following commands from each OSS:
cat /proc/fs/lustre/devices
lctl get_param ost.*.ost_io.threads_max
lctl get_param ost.*.ost_io.threads_started
grep -i LBUG /var/log/messages
cat /proc/fs/lustre/health_check
cat /proc/sys/lnet/nis
array2
0 UP mgc MGC10.1.1.40@tcp 87942af4-c7b4-5695-4680-2a3a4f232054 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST001a smuhpc-OST001a_UUID 439
3 UP obdfilter smuhpc-OST0000 smuhpc-OST0000_UUID 439
4 UP obdfilter smuhpc-OST0001 smuhpc-OST0001_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=367
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.51@tcp up -1 225 8 0 256 256 -1512
array2b
0 UP mgc MGC10.1.1.40@tcp f4072991-d501-f944-10b0-4c6a460c9c6d 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0002 smuhpc-OST0002_UUID 439
3 UP obdfilter smuhpc-OST0003 smuhpc-OST0003_UUID 438
4 UP obdfilter smuhpc-OST0008 smuhpc-OST0008_UUID 439
ost.OSS.ost_io.threads_max=128
ost.OSS.ost_io.threads_started=64
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.54@tcp up -1 225 8 0 256 256 -306
array3
0 UP mgc MGC10.1.1.40@tcp 524536bc-fb4f-bed5-6e55-924aa46112d1 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0004 smuhpc-OST0004_UUID 439
3 UP obdfilter smuhpc-OST0005 smuhpc-OST0005_UUID 439
4 UP obdfilter smuhpc-OST0006 smuhpc-OST0006_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=362
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.52@tcp up -1 225 8 0 256 256 -1037
array3b
0 UP mgc MGC10.1.1.40@tcp 00fdbef3-fd0c-18db-637b-eb869eb99309 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0007 smuhpc-OST0007_UUID 439
3 UP obdfilter smuhpc-OST0011 smuhpc-OST0011_UUID 439
4 UP obdfilter smuhpc-OST0012 smuhpc-OST0012_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=293
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.55@tcp up -1 225 8 0 256 256 -147
array4
0 UP mgc MGC10.1.1.40@tcp b90bd48b-3f2f-aa60-a1a4-e743ce1d4025 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST000b smuhpc-OST000b_UUID 439
3 UP obdfilter smuhpc-OST000c smuhpc-OST000c_UUID 439
4 UP obdfilter smuhpc-OST000d smuhpc-OST000d_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=512
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.53@tcp up -1 225 8 0 256 256 -966
array4b
0 UP mgc MGC10.1.1.40@tcp 1b31358c-ffc6-ca4d-14ea-78bf8804a15a 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST000e smuhpc-OST000e_UUID 439
3 UP obdfilter smuhpc-OST001c smuhpc-OST001c_UUID 437
4 UP obdfilter smuhpc-OST001d smuhpc-OST001d_UUID 437
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=512
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.56@tcp up -1 223 8 0 256 256 -655
array5
0 UP mgc MGC10.1.1.40@tcp 0bdb83f9-dbf5-aeaa-ff7d-66c0b6471811 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0009 smuhpc-OST0009_UUID 439
3 UP obdfilter smuhpc-OST000a smuhpc-OST000a_UUID 439
4 UP obdfilter smuhpc-OST000f smuhpc-OST000f_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=512
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.57@tcp up -1 225 8 0 256 256 -385
array5b
0 UP mgc MGC10.1.1.40@tcp d5982303-1e80-3ba5-c88b-b712e2d7c7af 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0010 smuhpc-OST0010_UUID 439
3 UP obdfilter smuhpc-OST0017 smuhpc-OST0017_UUID 439
4 UP obdfilter smuhpc-OST001b smuhpc-OST001b_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=312
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.58@tcp up -1 225 8 0 256 249 -383
array6
0 UP mgc MGC10.1.1.40@tcp 624e193a-3f28-2936-14e8-a3ff130bcd0f 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0030 smuhpc-OST0030_UUID 436
3 UP obdfilter smuhpc-OST0031 smuhpc-OST0031_UUID 436
4 UP obdfilter smuhpc-OST0032 smuhpc-OST0032_UUID 436
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=128
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.59@tcp up -1 225 8 0 256 238 -475
array6b
0 UP mgc MGC10.1.1.40@tcp 1854af7c-31b9-43c5-058c-4953afb936bb 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0033 smuhpc-OST0033_UUID 436
3 UP obdfilter smuhpc-OST0034 smuhpc-OST0034_UUID 436
4 UP obdfilter smuhpc-OST0035 smuhpc-OST0035_UUID 436
ost.OSS.ost_io.threads_max=256
ost.OSS.ost_io.threads_started=128
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.60@tcp up -1 225 8 0 256 239 -217
array8
0 UP mgc MGC10.1.1.40@tcp a6744840-8c1a-cd8a-487b-db2e9efbf856 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0013 smuhpc-OST0013_UUID 439
3 UP obdfilter smuhpc-OST0014 smuhpc-OST0014_UUID 439
4 UP obdfilter smuhpc-OST0015 smuhpc-OST0015_UUID 439
5 UP obdfilter smuhpc-OST0016 smuhpc-OST0016_UUID 439
6 UP obdfilter smuhpc-OST0018 smuhpc-OST0018_UUID 439
7 UP obdfilter smuhpc-OST0019 smuhpc-OST0019_UUID 439
ost.OSS.ost_io.threads_max=512
ost.OSS.ost_io.threads_started=512
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.62@tcp up -1 225 8 0 256 256 -673
array7
0 UP mgc MGC10.1.1.40@tcp 315ffeaf-3075-24d7-2a01-02e062b60e34 5
1 UP ost OSS OSS_uuid 3
2 UP obdfilter smuhpc-OST0036 smuhpc-OST0036_UUID 437
3 UP obdfilter smuhpc-OST0037 smuhpc-OST0037_UUID 437
4 UP obdfilter smuhpc-OST0038 smuhpc-OST0038_UUID 437
5 UP obdfilter smuhpc-OST0039 smuhpc-OST0039_UUID 437
6 UP obdfilter smuhpc-OST003a smuhpc-OST003a_UUID 437
7 UP obdfilter smuhpc-OST003b smuhpc-OST003b_UUID 437
ost.OSS.ost_io.threads_max=256
ost.OSS.ost_io.threads_started=64
healthy
nid status alive refs peer rtr max tx min
0@lo up 0 2 0 0 0 0 0
10.1.1.61@tcp up -1 225 8 0 256 251 -1421
MGS/MDS_NODE# cat /proc/sys/lnet/peers | grep "10\.1\.1\." (below are our OSS nodes; you can see congestion toward two of them, even though health_check reports healthy)
10.1.1.51@tcp 1 up 8 8 8 8 -6732 0
10.1.1.52@tcp 1 up 8 8 8 8 -2753 0
10.1.1.53@tcp 1 up 8 8 8 8 -4 0
10.1.1.54@tcp 1 up 8 8 8 8 -40 0
10.1.1.55@tcp 1 up 8 8 8 8 -7 0
10.1.1.56@tcp 1 up 8 8 8 8 0 0
10.1.1.57@tcp 1 up 8 8 8 8 -4 0
10.1.1.58@tcp 1 up 8 8 8 8 -6 0
10.1.1.59@tcp 1 up 8 8 8 8 -2 0
10.1.1.60@tcp 1 up 8 8 8 8 -1 0
10.1.1.61@tcp 1 up 8 8 8 8 -15 0
10.1.1.62@tcp 1 up 8 8 8 8 -11 0
======= More logs from the MDS/MGS =======
# grep '[0-9]' /proc/fs/lustre/osc/*/kbytes{free,avail,total}
/proc/fs/lustre/osc/smuhpc-OST0000-osc/kbytesfree:514058156
/proc/fs/lustre/osc/smuhpc-OST0001-osc/kbytesfree:765667120
/proc/fs/lustre/osc/smuhpc-OST0002-osc/kbytesfree:1096019280
grep: /proc/fs/lustre/osc/smuhpc-OST0003-osc/kbytesfree: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0004-osc/kbytesfree:1577637660
/proc/fs/lustre/osc/smuhpc-OST0005-osc/kbytesfree:132305164
/proc/fs/lustre/osc/smuhpc-OST0006-osc/kbytesfree:899697048
/proc/fs/lustre/osc/smuhpc-OST0007-osc/kbytesfree:857944436
/proc/fs/lustre/osc/smuhpc-OST0008-osc/kbytesfree:36161928
/proc/fs/lustre/osc/smuhpc-OST0009-osc/kbytesfree:39061480
/proc/fs/lustre/osc/smuhpc-OST000a-osc/kbytesfree:938678228
/proc/fs/lustre/osc/smuhpc-OST000b-osc/kbytesfree:8604452
/proc/fs/lustre/osc/smuhpc-OST000c-osc/kbytesfree:44878900
/proc/fs/lustre/osc/smuhpc-OST000d-osc/kbytesfree:1117771508
/proc/fs/lustre/osc/smuhpc-OST000e-osc/kbytesfree:769454268
/proc/fs/lustre/osc/smuhpc-OST000f-osc/kbytesfree:56939372
/proc/fs/lustre/osc/smuhpc-OST0010-osc/kbytesfree:210416704
/proc/fs/lustre/osc/smuhpc-OST0011-osc/kbytesfree:1315953944
/proc/fs/lustre/osc/smuhpc-OST0012-osc/kbytesfree:1112498952
/proc/fs/lustre/osc/smuhpc-OST0013-osc/kbytesfree:917528092
/proc/fs/lustre/osc/smuhpc-OST0014-osc/kbytesfree:818228736
/proc/fs/lustre/osc/smuhpc-OST0015-osc/kbytesfree:119717344
/proc/fs/lustre/osc/smuhpc-OST0016-osc/kbytesfree:818664044
/proc/fs/lustre/osc/smuhpc-OST0017-osc/kbytesfree:1307525340
/proc/fs/lustre/osc/smuhpc-OST0018-osc/kbytesfree:561629216
/proc/fs/lustre/osc/smuhpc-OST0019-osc/kbytesfree:682050424
/proc/fs/lustre/osc/smuhpc-OST001a-osc/kbytesfree:1262541880
/proc/fs/lustre/osc/smuhpc-OST001b-osc/kbytesfree:864048788
/proc/fs/lustre/osc/smuhpc-OST001c-osc/kbytesfree:511371988
/proc/fs/lustre/osc/smuhpc-OST001d-osc/kbytesfree:109860844
grep: /proc/fs/lustre/osc/smuhpc-OST0030-osc/kbytesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0031-osc/kbytesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0032-osc/kbytesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0033-osc/kbytesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0034-osc/kbytesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0035-osc/kbytesfree: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0036-osc/kbytesfree:718292640
/proc/fs/lustre/osc/smuhpc-OST0037-osc/kbytesfree:472531244
/proc/fs/lustre/osc/smuhpc-OST0038-osc/kbytesfree:433755684
/proc/fs/lustre/osc/smuhpc-OST0039-osc/kbytesfree:875580388
/proc/fs/lustre/osc/smuhpc-OST003a-osc/kbytesfree:1161276948
grep: /proc/fs/lustre/osc/smuhpc-OST003b-osc/kbytesfree: Resource temporarily unavailable
/proc/fs/lustre/osc/smuhpc-OST0000-osc/kbytesavail:514033840
/proc/fs/lustre/osc/smuhpc-OST0001-osc/kbytesavail:765639756
/proc/fs/lustre/osc/smuhpc-OST0002-osc/kbytesavail:1095950892
grep: /proc/fs/lustre/osc/smuhpc-OST0003-osc/kbytesavail: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0004-osc/kbytesavail:1577629868
/proc/fs/lustre/osc/smuhpc-OST0005-osc/kbytesavail:132295072
/proc/fs/lustre/osc/smuhpc-OST0006-osc/kbytesavail:899689368
/proc/fs/lustre/osc/smuhpc-OST0007-osc/kbytesavail:857942648
/proc/fs/lustre/osc/smuhpc-OST0008-osc/kbytesavail:36140876
/proc/fs/lustre/osc/smuhpc-OST0009-osc/kbytesavail:38998500
/proc/fs/lustre/osc/smuhpc-OST000a-osc/kbytesavail:938670344
/proc/fs/lustre/osc/smuhpc-OST000b-osc/kbytesavail:8593840
/proc/fs/lustre/osc/smuhpc-OST000c-osc/kbytesavail:44876596
/proc/fs/lustre/osc/smuhpc-OST000d-osc/kbytesavail:1117758504
/proc/fs/lustre/osc/smuhpc-OST000e-osc/kbytesavail:769447360
/proc/fs/lustre/osc/smuhpc-OST000f-osc/kbytesavail:56922292
/proc/fs/lustre/osc/smuhpc-OST0010-osc/kbytesavail:210406920
/proc/fs/lustre/osc/smuhpc-OST0011-osc/kbytesavail:1315948464
/proc/fs/lustre/osc/smuhpc-OST0012-osc/kbytesavail:1112487208
/proc/fs/lustre/osc/smuhpc-OST0013-osc/kbytesavail:917520972
/proc/fs/lustre/osc/smuhpc-OST0014-osc/kbytesavail:818200064
/proc/fs/lustre/osc/smuhpc-OST0015-osc/kbytesavail:119708876
/proc/fs/lustre/osc/smuhpc-OST0016-osc/kbytesavail:818659948
/proc/fs/lustre/osc/smuhpc-OST0017-osc/kbytesavail:1307516124
/proc/fs/lustre/osc/smuhpc-OST0018-osc/kbytesavail:561624584
/proc/fs/lustre/osc/smuhpc-OST0019-osc/kbytesavail:682045540
/proc/fs/lustre/osc/smuhpc-OST001a-osc/kbytesavail:1262529492
/proc/fs/lustre/osc/smuhpc-OST001b-osc/kbytesavail:863983524
/proc/fs/lustre/osc/smuhpc-OST001c-osc/kbytesavail:511362064
/proc/fs/lustre/osc/smuhpc-OST001d-osc/kbytesavail:109827908
grep: /proc/fs/lustre/osc/smuhpc-OST0030-osc/kbytesavail: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0031-osc/kbytesavail: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0032-osc/kbytesavail: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0033-osc/kbytesavail: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0034-osc/kbytesavail: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0035-osc/kbytesavail: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0036-osc/kbytesavail:718253728
/proc/fs/lustre/osc/smuhpc-OST0037-osc/kbytesavail:472467152
/proc/fs/lustre/osc/smuhpc-OST0038-osc/kbytesavail:433729872
/proc/fs/lustre/osc/smuhpc-OST0039-osc/kbytesavail:875578332
/proc/fs/lustre/osc/smuhpc-OST003a-osc/kbytesavail:1161272852
grep: /proc/fs/lustre/osc/smuhpc-OST003b-osc/kbytesavail: Resource temporarily unavailable
/proc/fs/lustre/osc/smuhpc-OST0000-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST0001-osc/kbytestotal:9612387536
/proc/fs/lustre/osc/smuhpc-OST0002-osc/kbytestotal:11534862728
grep: /proc/fs/lustre/osc/smuhpc-OST0003-osc/kbytestotal: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0004-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST0005-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST0006-osc/kbytestotal:9612387536
/proc/fs/lustre/osc/smuhpc-OST0007-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST0008-osc/kbytestotal:9615574536
/proc/fs/lustre/osc/smuhpc-OST0009-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST000a-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST000b-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST000c-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST000d-osc/kbytestotal:9612387536
/proc/fs/lustre/osc/smuhpc-OST000e-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST000f-osc/kbytestotal:9612387536
/proc/fs/lustre/osc/smuhpc-OST0010-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST0011-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST0012-osc/kbytestotal:9615574536
/proc/fs/lustre/osc/smuhpc-OST0013-osc/kbytestotal:13452678016
/proc/fs/lustre/osc/smuhpc-OST0014-osc/kbytestotal:13452678016
/proc/fs/lustre/osc/smuhpc-OST0015-osc/kbytestotal:11530866816
/proc/fs/lustre/osc/smuhpc-OST0016-osc/kbytestotal:13452678016
/proc/fs/lustre/osc/smuhpc-OST0017-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST0018-osc/kbytestotal:13452678016
/proc/fs/lustre/osc/smuhpc-OST0019-osc/kbytestotal:11530866816
/proc/fs/lustre/osc/smuhpc-OST001a-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST001b-osc/kbytestotal:9615574536
/proc/fs/lustre/osc/smuhpc-OST001c-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST001d-osc/kbytestotal:9615574536
grep: /proc/fs/lustre/osc/smuhpc-OST0030-osc/kbytestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0031-osc/kbytestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0032-osc/kbytestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0033-osc/kbytestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0034-osc/kbytestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0035-osc/kbytestotal: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0036-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST0037-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST0038-osc/kbytestotal:11534862728
/proc/fs/lustre/osc/smuhpc-OST0039-osc/kbytestotal:11538687128
/proc/fs/lustre/osc/smuhpc-OST003a-osc/kbytestotal:9612387536
grep: /proc/fs/lustre/osc/smuhpc-OST003b-osc/kbytestotal: Resource temporarily unavailable
# grep '[0-9]' /proc/fs/lustre/osc/*/files{free,total}
/proc/fs/lustre/osc/smuhpc-OST0000-osc/filesfree:128514539
/proc/fs/lustre/osc/smuhpc-OST0001-osc/filesfree:191416790
/proc/fs/lustre/osc/smuhpc-OST0002-osc/filesfree:274004820
grep: /proc/fs/lustre/osc/smuhpc-OST0003-osc/filesfree: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0004-osc/filesfree:394395591
/proc/fs/lustre/osc/smuhpc-OST0005-osc/filesfree:33076291
/proc/fs/lustre/osc/smuhpc-OST0006-osc/filesfree:224911717
/proc/fs/lustre/osc/smuhpc-OST0007-osc/filesfree:214486110
/proc/fs/lustre/osc/smuhpc-OST0008-osc/filesfree:8856919
/proc/fs/lustre/osc/smuhpc-OST0009-osc/filesfree:9624045
/proc/fs/lustre/osc/smuhpc-OST000a-osc/filesfree:234669553
/proc/fs/lustre/osc/smuhpc-OST000b-osc/filesfree:2151113
/proc/fs/lustre/osc/smuhpc-OST000c-osc/filesfree:11219725
/proc/fs/lustre/osc/smuhpc-OST000d-osc/filesfree:279442892
/proc/fs/lustre/osc/smuhpc-OST000e-osc/filesfree:192357679
/proc/fs/lustre/osc/smuhpc-OST000f-osc/filesfree:14234843
/proc/fs/lustre/osc/smuhpc-OST0010-osc/filesfree:52604176
/proc/fs/lustre/osc/smuhpc-OST0011-osc/filesfree:328988486
/proc/fs/lustre/osc/smuhpc-OST0012-osc/filesfree:278118850
/proc/fs/lustre/osc/smuhpc-OST0013-osc/filesfree:229382023
/proc/fs/lustre/osc/smuhpc-OST0014-osc/filesfree:204557180
/proc/fs/lustre/osc/smuhpc-OST0015-osc/filesfree:29929336
/proc/fs/lustre/osc/smuhpc-OST0016-osc/filesfree:204663451
/proc/fs/lustre/osc/smuhpc-OST0017-osc/filesfree:326881334
/proc/fs/lustre/osc/smuhpc-OST0018-osc/filesfree:140407304
/proc/fs/lustre/osc/smuhpc-OST0019-osc/filesfree:170512603
/proc/fs/lustre/osc/smuhpc-OST001a-osc/filesfree:315635470
/proc/fs/lustre/osc/smuhpc-OST001b-osc/filesfree:216012197
/proc/fs/lustre/osc/smuhpc-OST001c-osc/filesfree:127842996
/proc/fs/lustre/osc/smuhpc-OST001d-osc/filesfree:27465211
grep: /proc/fs/lustre/osc/smuhpc-OST0030-osc/filesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0031-osc/filesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0032-osc/filesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0033-osc/filesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0034-osc/filesfree: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0035-osc/filesfree: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0036-osc/filesfree:179276300
/proc/fs/lustre/osc/smuhpc-OST0037-osc/filesfree:118132811
/proc/fs/lustre/osc/smuhpc-OST0038-osc/filesfree:108438921
/proc/fs/lustre/osc/smuhpc-OST0039-osc/filesfree:218891001
/proc/fs/lustre/osc/smuhpc-OST003a-osc/filesfree:290319237
grep: /proc/fs/lustre/osc/smuhpc-OST003b-osc/filesfree: Resource temporarily unavailable
/proc/fs/lustre/osc/smuhpc-OST0000-osc/filestotal:132250935
/proc/fs/lustre/osc/smuhpc-OST0001-osc/filestotal:193743661
/proc/fs/lustre/osc/smuhpc-OST0002-osc/filestotal:277776374
grep: /proc/fs/lustre/osc/smuhpc-OST0003-osc/filestotal: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0004-osc/filestotal:396440189
/proc/fs/lustre/osc/smuhpc-OST0005-osc/filestotal:35044722
/proc/fs/lustre/osc/smuhpc-OST0006-osc/filestotal:226783057
/proc/fs/lustre/osc/smuhpc-OST0007-osc/filestotal:217558056
/proc/fs/lustre/osc/smuhpc-OST0008-osc/filestotal:11184523
/proc/fs/lustre/osc/smuhpc-OST0009-osc/filestotal:12231803
/proc/fs/lustre/osc/smuhpc-OST000a-osc/filestotal:237327760
/proc/fs/lustre/osc/smuhpc-OST000b-osc/filestotal:4003238
/proc/fs/lustre/osc/smuhpc-OST000c-osc/filestotal:12981815
/proc/fs/lustre/osc/smuhpc-OST000d-osc/filestotal:281106176
/proc/fs/lustre/osc/smuhpc-OST000e-osc/filestotal:195190328
/proc/fs/lustre/osc/smuhpc-OST000f-osc/filestotal:16476504
/proc/fs/lustre/osc/smuhpc-OST0010-osc/filestotal:55089005
/proc/fs/lustre/osc/smuhpc-OST0011-osc/filestotal:330925776
/proc/fs/lustre/osc/smuhpc-OST0012-osc/filestotal:279931713
/proc/fs/lustre/osc/smuhpc-OST0013-osc/filestotal:231994647
/proc/fs/lustre/osc/smuhpc-OST0014-osc/filestotal:206633272
/proc/fs/lustre/osc/smuhpc-OST0015-osc/filestotal:31825125
/proc/fs/lustre/osc/smuhpc-OST0016-osc/filestotal:206716377
/proc/fs/lustre/osc/smuhpc-OST0017-osc/filestotal:329458681
/proc/fs/lustre/osc/smuhpc-OST0018-osc/filestotal:142450211
/proc/fs/lustre/osc/smuhpc-OST0019-osc/filestotal:172358875
/proc/fs/lustre/osc/smuhpc-OST001a-osc/filestotal:318295996
/proc/fs/lustre/osc/smuhpc-OST001b-osc/filestotal:218409008
/proc/fs/lustre/osc/smuhpc-OST001c-osc/filestotal:129660363
/proc/fs/lustre/osc/smuhpc-OST001d-osc/filestotal:29250131
grep: /proc/fs/lustre/osc/smuhpc-OST0030-osc/filestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0031-osc/filestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0032-osc/filestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0033-osc/filestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0034-osc/filestotal: Cannot send after transport endpoint shutdown
grep: /proc/fs/lustre/osc/smuhpc-OST0035-osc/filestotal: Cannot send after transport endpoint shutdown
/proc/fs/lustre/osc/smuhpc-OST0036-osc/filestotal:180363444
/proc/fs/lustre/osc/smuhpc-OST0037-osc/filestotal:119215177
/proc/fs/lustre/osc/smuhpc-OST0038-osc/filestotal:109491390
/proc/fs/lustre/osc/smuhpc-OST0039-osc/filestotal:220028416
/proc/fs/lustre/osc/smuhpc-OST003a-osc/filestotal:291302059
grep: /proc/fs/lustre/osc/smuhpc-OST003b-osc/filestotal: Resource temporarily unavailable
# grep '[0-9]' /proc/fs/lustre/mds/*/kbytes{free,avail,total}
/proc/fs/lustre/mds/smuhpc-MDT0000/kbytesfree:677031612
/proc/fs/lustre/mds/smuhpc-MDT0000/kbytesavail:677031612
/proc/fs/lustre/mds/smuhpc-MDT0000/kbytestotal:688076544
# grep '[0-9]' /proc/fs/lustre/mds/*/files{free,total}
/proc/fs/lustre/mds/smuhpc-MDT0000/filesfree:154564006
/proc/fs/lustre/mds/smuhpc-MDT0000/filestotal:196608000
# lfs df
UUID 1K-blocks Used Available Use% Mounted on
smuhpc-MDT0000_UUID 688076544 11044932 677031612 2% /lustre[MDT:0]
smuhpc-OST0000_UUID 11538687128 11024628972 514033840 96% /lustre[OST:0]
smuhpc-OST0001_UUID 9612387536 8846797188 765561148 92% /lustre[OST:1]
smuhpc-OST0002_UUID 11534862728 10438969420 1095808024 90% /lustre[OST:2]
smuhpc-OST0003_UUID 11538687128 10237552284 1301088824 89% /lustre[OST:3]
smuhpc-OST0004_UUID 11534862728 9957280360 1577573564 86% /lustre[OST:4]
smuhpc-OST0005_UUID 11538687128 11406381964 132295072 99% /lustre[OST:5]
smuhpc-OST0006_UUID 9612387536 8712821564 899477908 91% /lustre[OST:6]
smuhpc-OST0007_UUID 11534862728 10676918312 857941880 93% /lustre[OST:7]
smuhpc-OST0008_UUID 9615574536 9587388940 28150780 100% /lustre[OST:8]
smuhpc-OST0009_UUID 11534862728 11502157276 32703404 100% /lustre[OST:9]
smuhpc-OST000a_UUID 11538687128 10600008916 938671044 92% /lustre[OST:10]
smuhpc-OST000b_UUID 11534862728 11526258276 8593840 100% /lustre[OST:11]
smuhpc-OST000c_UUID 11538687128 11493808228 44876596 100% /lustre[OST:12]
smuhpc-OST000d_UUID 9612387536 8494655908 1117723728 88% /lustre[OST:13]
smuhpc-OST000e_UUID 11534862728 10765432012 769356988 93% /lustre[OST:14]
smuhpc-OST000f_UUID 9612387536 9555448164 56922292 99% /lustre[OST:15]
smuhpc-OST0010_UUID 11534862728 11324446024 210408036 98% /lustre[OST:16]
smuhpc-OST0011_UUID 11538687128 10222777216 1315902940 89% /lustre[OST:17]
smuhpc-OST0012_UUID 9615574536 8503099372 1112462716 88% /lustre[OST:18]
smuhpc-OST0013_UUID 13452678016 12535227748 917441212 93% /lustre[OST:19]
smuhpc-OST0014_UUID 13452678016 12634464700 818178500 94% /lustre[OST:20]
smuhpc-OST0015_UUID 11530866816 11411149472 119708876 99% /lustre[OST:21]
smuhpc-OST0016_UUID 13452678016 12639406780 813217988 94% /lustre[OST:22]
smuhpc-OST0017_UUID 11538687128 10231302100 1307378884 89% /lustre[OST:23]
smuhpc-OST0018_UUID 13452678016 12891084320 561588224 96% /lustre[OST:24]
smuhpc-OST0019_UUID 11530866816 10848816880 682044044 94% /lustre[OST:25]
smuhpc-OST001a_UUID 11534862728 10272389460 1262461836 89% /lustre[OST:26]
smuhpc-OST001b_UUID 9615574536 8751547300 864019044 91% /lustre[OST:27]
smuhpc-OST001c_UUID 11538687128 11027353036 511322908 96% /lustre[OST:28]
smuhpc-OST001d_UUID 9615574536 9505741344 109832588 99% /lustre[OST:29]
smuhpc-OST0030_UUID 11534862728 7027461656 4507389808 61% /lustre[OST:48]
smuhpc-OST0031_UUID 11538687128 2208373512 9330099084 19% /lustre[OST:49]
smuhpc-OST0032_UUID 9612387536 5795054380 3817314724 60% /lustre[OST:50]
smuhpc-OST0033_UUID 11534862728 7414666856 4120177440 64% /lustre[OST:51]
smuhpc-OST0034_UUID 11538687128 7489405512 4049271072 65% /lustre[OST:52]
smuhpc-OST0035_UUID 9615574536 6709760396 2905811696 70% /lustre[OST:53]
smuhpc-OST0036_UUID 11534862728 10824985124 709871280 94% /lustre[OST:54]
smuhpc-OST0037_UUID : Resource temporarily unavailable
smuhpc-OST0038_UUID : Resource temporarily unavailable
smuhpc-OST0039_UUID 11538687128 10663152820 875526472 92% /lustre[OST:57]
smuhpc-OST003a_UUID 9612387536 8451144380 1161159188 88% /lustre[OST:58]
smuhpc-OST003b_UUID 9615574536 9396687868 218849404 98% /lustre[OST:59]
filesystem summary: 446049266544 393606006040 52442216896 88% /lustre
Where is cfs_spin_unlock?
by Jay Lan
I tried to build the lustre-2.4.0 client for SLES11 SP2 (kernel 3.0.74-0.6.6.2).
The build failed because cfs_spin_unlock was not defined:
/usr/src/packages/BUILD/lustre-2.4.0/lnet/klnds/socklnd/socklnd_cb.c: In function 'ksocknal_check_peer_timeouts':
/usr/src/packages/BUILD/lustre-2.4.0/lnet/klnds/socklnd/socklnd_cb.c:2525: error: implicit declaration of function 'cfs_spin_unlock'
Where is that defined?
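(For context, and strictly from memory rather than from the 2.4.0 tree itself: in older Lustre releases the cfs_-prefixed locking primitives were thin libcfs wrappers around the native kernel ones, roughly as sketched below, and my understanding is that these wrappers were removed during the libcfs cleanup in the 2.x series. Treat both the location and the exact form as assumptions to verify.
/* Hedged recollection of the old libcfs wrappers, e.g. from a
 * libcfs/include/libcfs/linux lock header in pre-2.4 trees; not a
 * verified copy from any particular release. */
#include <linux/spinlock.h>

#define cfs_spin_lock(lock)     spin_lock(lock)
#define cfs_spin_unlock(lock)   spin_unlock(lock))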
Thanks,
Jay