Hi all,


I previously set up a temporary combined MGS/MDS server, and it worked fine with the OST server and the clients (Lustre 2.5.0). After finishing the final server, I used GNU tar v1.26 to "copy" the data from the temporary server to the new one, following the Lustre user manual. After that, everything went wrong. The basic information:
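
For reference, the procedure I was trying to follow is the file-level MDT backup/restore from the manual. As far as I understand it, it goes roughly like this (device names and paths are just my setup; the manual also saves and restores the extended attributes explicitly with getfattr/setfattr, whereas I relied on tar's --xattrs doing the same job):

# On the old MDS, with the MDT mounted as ldiskfs:
mount -t ldiskfs /dev/mapper/VolGroup00-LogVol03 /MDT
cd /MDT
getfattr -R -d -m '.*' -e hex -P . > ea-backup.bak    # dump all extended attributes
tar czf /backup/mdt.tgz --sparse .

# On the new MDS, after mkfs.lustre and mounting as ldiskfs:
cd /MDT
tar xzpf /backup/mdt.tgz --sparse
setfattr --restore=ea-backup.bak                      # restore the extended attributes

Instead of a local tarball I piped tar over ssh, as shown below.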


*** Client (cola1)

eth0: 10.242.116.6

eth1: 192.168.1.6

modprobe.conf: options lnet networks=tcp0(eth1),tcp1(eth0)


*** Old MDS (cola4, shown as old_mds in the transcripts below)

eth0: 10.242.116.7

eth1: 192.168.1.7

modprobe.conf: options lnet networks=tcp0(eth1),tcp1(eth0)

MGS/MDS mount point: /MDT

device: /dev/mapper/VolGroup00-LogVol03


*** New MDS (new_mds)

eth0: 10.242.116.32

eth1: 192.168.1.32

modprobe.conf: options lnet networks=tcp0(eth1),tcp1(eth0)

MGS/MDS mount point: /MDT

device: /dev/sda6


*** OSS (myoss)

eth0: 192.168.1.34

eth1: Disabled

modprobe.conf: options lnet ip2nets="tcp0 192.168.1.*"

OST mount point: /OST

device: /dev/sda5
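
(In case it matters: this is how I sanity-check LNet between the nodes, using the NIDs from the tables above:

lctl list_nids                 # on each node, show its local NIDs
lctl ping 192.168.1.32@tcp     # e.g. from the client to the new MDS
lctl ping 192.168.1.34@tcp     # and to the OSS
)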



*************************************************************************************

What I did (the steps below may not be completely accurate, since I tried, failed, and retried several times based on what I found by googling):


*** New MDS (new_mds)

[root@new_mds ~]# mkfs.lustre --reformat --fsname=lustre --mgs --mdt --index=0 /dev/sda6


  Permanent disk data:

Target:     lustre:MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x65

             (MDT MGS first_time update )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:


device size = 102406MB

formatting backing filesystem ldiskfs on /dev/sda6

       target name  lustre:MDT0000

       4k blocks     26216064

       options        -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F

mkfs_cmd = mke2fs -j -b 4096 -L lustre:MDT0000  -J size=400 -I 512 -i 2048 -q -O dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E lazy_journal_init -F /dev/sda6 26216064

Writing CONFIGS/mountdata

[root@new_mds ~]# mount -t ldiskfs /dev/sda6 /MDT

[root@new_mds ~]# cd /MDT

[root@new_mds MDT]# ssh 192.168.1.7 "cd /MDT; tar cf - --xattrs --sparse ." | tar xpf - --xattrs --sparse

root@192.168.1.7's password:

tar: ./ROOT/space/users/svnsync/.google/desktop/a2_sock: socket ignored

tar: ./ROOT/space/users/svnsync/.google/desktop/a1_sock: socket ignored

tar: ./ROOT/space/users/svnsync/.google/desktop/a3_sock: socket ignored

tar: ./ROOT/space/users/svnsync/.google/desktop/a4_sock: socket ignored

tar: value -1984391879 out of time_t range 0..8589934591

tar: ./ROOT/space2/a/Database/Database/FILE0064.jpg: implausibly old time stamp 1907-02-13 20:02:01

tar: ./ROOT/space2/Departee/???(??)/????/IMP????/Green.JPG: time stamp 2019-07-07 10:12:00 is 174511703.910839819 s in the future

tar: ./ROOT/space2/Departee/???(??)/????/IMP????/Horizon.jpg: time stamp 2035-03-24 10:12:00 is 670361303.91045891 s in the future

tar: ./ROOT/space2/Departee/???(??)/????/IMP????/a.jpg: time stamp 2035-02-24 10:12:00 is 667942103.910342464 s in the future

tar: ./ROOT/space2/Departee/???(??)/????/IMP????/Blue.JPG: time stamp 2027-07-30 10:12:00 is 428959703.909806892 s in the future

tar: ./ROOT/space2/server_backup/cola8/var/lib/mysql/mysql.sock: socket ignored

tar: value -2147483648 out of time_t range 0..8589934591

tar: value -2147483648 out of time_t range 0..8589934591

tar: ./ROOT/space2/user_backup/Obsolete/DVB/software_sim/.ico: implausibly old time stamp 1901-12-14 04:45:52

tar: ./ROOT/space2/user_backup/Obsolete/DVB/software_sim/small.ico: implausibly old time stamp 1901-12-14 04:45:52

tar: Exiting with failure status due to previous errors
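
Looking back, I suspect I should have checked whether the trusted.* extended attributes actually survived the copy, e.g. by running something like this on the same file on both MDTs while mounted as ldiskfs (the file name is just an example) and comparing the trusted.lov / trusted.lma / trusted.link values:

cd /MDT
getfattr -d -m '.*' -e hex ROOT/somefile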

[root@new_mds MDT]# ls

CATALOGS           lost+found  oi.16.2   oi.16.33  oi.16.47  oi.16.60

CONFIGS            lov_objid   oi.16.20  oi.16.34  oi.16.48  oi.16.61

NIDTBL_VERSIONS    lov_objseq  oi.16.21  oi.16.35  oi.16.49  oi.16.62

O                  oi.16.0     oi.16.22  oi.16.36  oi.16.5   oi.16.63

OI_scrub           oi.16.1     oi.16.23  oi.16.37  oi.16.50  oi.16.7

PENDING            oi.16.10    oi.16.24  oi.16.38  oi.16.51  oi.16.8

REMOTE_PARENT_DIR  oi.16.11    oi.16.25  oi.16.39  oi.16.52  oi.16.9

ROOT               oi.16.12    oi.16.26  oi.16.4   oi.16.53  quota_master

changelog_catalog  oi.16.13    oi.16.27  oi.16.40  oi.16.54  quota_slave

changelog_users    oi.16.14    oi.16.28  oi.16.41  oi.16.55  seq_ctl

fld                oi.16.15    oi.16.29  oi.16.42  oi.16.56  seq_srv

hsm_actions        oi.16.16    oi.16.3   oi.16.43  oi.16.57

last_rcvd          oi.16.17    oi.16.30  oi.16.44  oi.16.58

lfsck_bookmark     oi.16.18    oi.16.31  oi.16.45  oi.16.59

lfsck_namespace    oi.16.19    oi.16.32  oi.16.46  oi.16.6

[root@new_mds MDT]# cd

[root@new_mds ~]# umount /MDT

[root@new_mds ~]# mount -t lustre /dev/sda6 -o nosvc /MDT

[root@new_mds ~]# lctl replace_nids lustre-MDT0000 192.168.1.32@tcp

[root@new_mds ~]# lctl dl

 0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 5

 1 UP mgs MGS MGS 5

 2 UP mgc MGC192.168.1.32@tcp fa00c372-90c9-ce21-7d9b-058e1125be1a 5

[root@new_mds ~]# umount /MDT

[root@new_mds ~]# mount /MDT

[root@new_mds ~]# mount -t lustre /dev/sda6 /MDT
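
(If it helps with diagnosing: I can dump the on-disk target configuration at any point, without modifying anything, with

tunefs.lustre --dryrun /dev/sda6

and post the output if needed.)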



*** OSS (myoss)

# umount /OST

# tunefs.lustre --erase-param --mgsnode=192.168.1.32@tcp --writeconf /dev/sda5

# tunefs.lustre --writeconf /dev/sda5

# mount -t lustre /dev/sda5 /OST
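
My understanding of the writeconf procedure in the manual is that the logs have to be regenerated on all targets together, and the targets brought back in a fixed order (MDT first, then the OSTs, then the clients), roughly:

# unmount clients first, then the OST and the MDT
umount /OST                          # on the OSS
umount /MDT                          # on the MDS
# regenerate the configuration logs on every target
tunefs.lustre --writeconf /dev/sda6  # MDT, on the MDS
tunefs.lustre --writeconf /dev/sda5  # OST, on the OSS
# restart: MDT first, then OSTs, then clients
mount -t lustre /dev/sda6 /MDT
mount -t lustre /dev/sda5 /OST

I am honestly not sure I kept to that order during all my retries.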



Mounting /OST was successful, so I assumed everything was okay. However, the client failed to mount the filesystem with the following error:



# mount -t lustre 192.168.1.32:/lustre /mnt/tmp

mount.lustre: mount 192.168.1.32:/lustre at /mnt/tmp failed: File exists
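
From the client side, the only checks I know are things like:

lctl ping 192.168.1.32@tcp    # can the client reach the MGS NID?
lctl dl                       # any stale devices left from earlier attempts?

Is there anything else worth checking when mount fails with "File exists"?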



On the OSS, I can see errors like these in /var/log/messages:


Dec 25 15:44:21 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 15:44:40 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 15:44:40 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 15:44:41 myoss kernel: LustreError: 13a-8: Failed to get MGS log params and no local copy.

Dec 25 15:44:41 myoss kernel: LustreError: Skipped 1 previous similar message

Dec 25 15:44:41 myoss kernel: Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450

Dec 25 15:44:46 myoss kernel: Lustre: lustre-OST0000: Will be in recovery for at least 2:30, or until 1 client reconnects

Dec 25 15:44:46 myoss kernel: Lustre: lustre-OST0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted.

Dec 25 15:44:46 myoss kernel: Lustre: lustre-OST0000: deleting orphan objects from 0x0:8499078 to 0x0:8499169

Dec 25 15:46:57 myoss kernel: Lustre: Failing over lustre-OST0000

Dec 25 15:46:58 myoss kernel: Lustre: server umount lustre-OST0000 complete




On the new MDS, I see:


Dec 25 15:46:31 new_mds kernel: Lustre: 1738:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387957584/real 1387957584]  req@ffff880279ed3c00 x1455378659279340/t0(0) o13->lustre-OST0000-osc-MDT0000@192.168.1.34@tcp:7/4 lens 224/368 e 0 to 1 dl 1387957591 ref 1 fl Rpc:X/0/ffffffff rc 0/-1

Dec 25 15:46:31 new_mds kernel: Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 192.168.1.34@tcp) was lost; in progress operations using this service will wait for recovery to complete

Dec 25 15:46:37 new_mds kernel: Lustre: 1731:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387957591/real 1387957591]  req@ffff880279ed3000 x1455378659279344/t0(0) o8->lustre-OST0000-osc-MDT0000@192.168.1.34@tcp:28/4 lens 400/544 e 0 to 1 dl 1387957597 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 15:47:07 new_mds kernel: Lustre: 1731:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387957616/real 1387957616]  req@ffff880279ed3400 x1455378659279352/t0(0) o8->lustre-OST0000-osc-MDT0000@192.168.1.34@tcp:28/4 lens 400/544 e 0 to 1 dl 1387957627 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 15:47:31 new_mds kernel: Lustre: Found index 0 for lustre-OST0000, updating log

Dec 25 15:47:37 new_mds kernel: Lustre: 1731:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387957641/real 1387957641]  req@ffff880279ed3400 x1455378659279364/t0(0) o8->lustre-OST0000-osc-MDT0000@192.168.1.34@tcp:28/4 lens 400/544 e 0 to 1 dl 1387957657 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 15:47:38 new_mds kernel: Lustre: 1777:0:(mgc_request.c:1645:mgc_process_recover_log()) Process recover log lustre-mdtir error -22

Dec 25 15:48:02 new_mds kernel: Lustre: 1731:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387957666/real 1387957666]  req@ffff88027b6fe000 x1455378659279452/t0(0) o8->lustre-OST0000-osc-MDT0000@0@lo:28/4 lens 400/544 e 0 to 1 dl 1387957682 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 15:48:11 new_mds kernel: Lustre: lustre-OST0000-osc-MDT0000: Connection restored to lustre-OST0000 (at 192.168.1.34@tcp)

Dec 25 15:54:27 new_mds kernel: Lustre: Failing over lustre-MDT0000

Dec 25 15:54:33 new_mds kernel: Lustre: 2396:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387958067/real 1387958067]  req@ffff88026b260800 x1455378659279928/t0(0) o251->MGC192.168.1.32@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1387958073 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 15:54:33 new_mds kernel: Lustre: server umount lustre-MDT0000 complete

Dec 25 15:54:39 new_mds kernel: LDISKFS-fs (sda6): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 15:54:58 new_mds kernel: LDISKFS-fs (sda6): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 15:54:58 new_mds kernel: Lustre: MGS: Logs for fs lustre were removed by user request.  All servers must be restarted in order to regenerate the logs.

Dec 25 15:54:58 new_mds kernel: Lustre: lustre-MDT0000: used disk, loading

Dec 25 15:54:58 new_mds kernel: LustreError: 2486:0:(osd_io.c:950:osd_ldiskfs_read()) lustre=MDT0000: can't read 128@8192 on ino 28: rc = 0

Dec 25 15:54:58 new_mds kernel: LustreError: 2486:0:(mdt_recovery.c:112:mdt_clients_data_init()) error reading MDS last_rcvd idx 0, off 8192: rc -14

Dec 25 15:54:58 new_mds kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
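
One thing I notice only now: if I read the restore section of the manual correctly, after untarring onto a freshly formatted MDT the old Object Index files should be removed before the first Lustre mount, so that OI scrub can rebuild them, i.e. something like:

mount -t ldiskfs /dev/sda6 /MDT
cd /MDT
rm -rf oi.16*
cd /; umount /MDT

I never did this, and my tar extracted the old oi.16.* files from the old MDS over the fresh ones. Could that explain the last_rcvd / osd_ldiskfs_read errors above?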



After several hours of trial and error, I decided to fall back to the old MDS server, so I did the same things there (again, retrying many times):



*** OSS (myoss)

[root@myoss ~]# tunefs.lustre --erase-param --mgsnode=192.168.1.7@tcp --writeconf /dev/sda5

checking for existing Lustre data: found

Reading CONFIGS/mountdata


  Read previous values:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x1002

             (OST no_primnode )

Persistent mount opts: errors=remount-ro

Parameters: mgsnode=192.168.1.7@tcp



  Permanent disk data:

Target:     lustre=OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x1142

             (OST update writeconf no_primnode )

Persistent mount opts: errors=remount-ro

Parameters: mgsnode=192.168.1.7@tcp


Writing CONFIGS/mountdata

[root@myoss ~]# tunefs.lustre --writeconf /dev/sda5

checking for existing Lustre data: found

Reading CONFIGS/mountdata


  Read previous values:

Target:     lustre-OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x1142

             (OST update writeconf no_primnode )

Persistent mount opts: errors=remount-ro

Parameters: mgsnode=192.168.1.7@tcp



  Permanent disk data:

Target:     lustre=OST0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x1142

             (OST update writeconf no_primnode )

Persistent mount opts: errors=remount-ro

Parameters: mgsnode=192.168.1.7@tcp


Writing CONFIGS/mountdata

[root@myoss ~]# mount -t lustre /dev/sda5 /OST

[root@myoss ~]# df -kh

Filesystem      Size  Used Avail Use% Mounted on

/dev/sda2        29G  8.9G   19G  33% /

tmpfs           7.8G   72K  7.8G   1% /dev/shm

/dev/sda1       291M   62M  215M  23% /boot

/dev/sda4       9.7G  2.5G  6.7G  28% /tmp

/dev/sda5        11T  3.9T  6.5T  38% /OST



*** Old MDS (old_mds)

[root@old_mds ~]# mount -t lustre /dev/mapper/VolGroup00-LogVol03 -o nosvc /MDT

[root@old_mds ~]# lctl replace_nids lustre-MDT0000 192.168.1.7@tcp

[root@old_mds ~]# lctl dl

 0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 5

 1 UP mgs MGS MGS 5

 2 UP mgc MGC192.168.1.7@tcp 6094ff79-4ad8-d93f-1f37-307e324387e2 5

[root@old_mds ~]# umount /MDT

[root@old_mds ~]# tunefs.lustre --writeconf /dev/mapper/VolGroup00-LogVol03

checking for existing Lustre data: found

Reading CONFIGS/mountdata


  Read previous values:

Target:     lustre-MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x5

             (MDT MGS )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:



  Permanent disk data:

Target:     lustre=MDT0000

Index:      0

Lustre FS:  lustre

Mount type: ldiskfs

Flags:      0x105

             (MDT MGS writeconf )

Persistent mount opts: user_xattr,errors=remount-ro

Parameters:


Writing CONFIGS/mountdata

[root@old_mds ~]# mount -t lustre /dev/mapper/VolGroup00-LogVol03 /MDT

[root@old_mds ~]# cat /proc/fs/lustre/devices
 0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 8

 1 UP mgs MGS MGS 7

 2 UP mgc MGC192.168.1.7@tcp 28fc524e-9128-4b16-adbf-df94972b556a 5

 3 UP mds MDS MDS_uuid 3

 4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 4

 5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 7

 6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 4

 7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 4

 8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 5

 9 AT osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 1

[root@old_mds ~]# df -k

Filesystem                      1K-blocks     Used Available Use% Mounted on

/dev/mapper/VolGroup00-LogVol02  30237648 11163024  17538624  39% /

tmpfs                             4029452       72   4029380   1% /dev/shm

/dev/sda1                          297485    62889    219236  23% /boot

/dev/mapper/VolGroup00-LogVol01  30237648   773892  27927756   3% /tmp

/dev/mapper/VolGroup00-LogVol03  78629560  7876016  65511384  11% /MDT



*** Client

[root@cola1 ~]# mount -t lustre 192.168.1.7@tcp0:/lustre /mnt

mount.lustre: mount 192.168.1.7@tcp0:/lustre at /mnt failed: No such file or directory

Is the MGS specification correct?

Is the filesystem name correct?

If upgrading, is the copied client log valid? (see upgrade docs)



*** Messages on OSS

Dec 25 20:53:21 myoss modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32-358.18.1.el6_lustre.x86_64/kernel/drivers/crypto/padlock-sha.ko): No such device

Dec 25 20:53:25 myoss kernel: Lustre: Lustre: Build Version: 2.5.0-RC1--PRISTINE-2.6.32-358.18.1.el6_lustre.x86_64

Dec 25 20:53:26 myoss kernel: LNet: Added LNI 192.168.1.34@tcp [8/256/0/180]

Dec 25 20:53:26 myoss kernel: LNet: Accept secure, port 988

Dec 25 21:41:50 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:42:07 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:51:58 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:51:59 myoss kernel: LDISKFS-fs (sda5): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:52:00 myoss kernel: LustreError: 13a-8: Failed to get MGS log params and no local copy.

Dec 25 21:52:00 myoss kernel: LustreError: 13a-8: Failed to get MGS log params and no local copy.



*** Messages on Old MDS


Dec 25 20:56:31 old_mds kernel: LNet: HW CPU cores: 8, npartitions: 2

Dec 25 20:56:31 old_mds modprobe: FATAL: Error inserting crc32c_intel (/lib/modules/2.6.32-358.18.1.el6_lustre.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko): No such device

Dec 25 20:56:31 old_mds kernel: alg: No test for crc32 (crc32-table)

Dec 25 20:56:31 old_mds kernel: alg: No test for adler32 (adler32-zlib)

Dec 25 20:56:35 old_mds modprobe: FATAL: Error inserting padlock_sha (/lib/modules/2.6.32-358.18.1.el6_lustre.x86_64/kernel/drivers/crypto/padlock-sha.ko): No such device

Dec 25 20:56:39 old_mds kernel: Lustre: Lustre: Build Version: 2.5.0-RC1--PRISTINE-2.6.32-358.18.1.el6_lustre.x86_64

Dec 25 20:56:40 old_mds kernel: LNet: Added LNI 192.168.1.7@tcp [8/256/0/180]

Dec 25 20:56:40 old_mds kernel: LNet: Added LNI 10.242.116.7@tcp1 [8/256/0/180]

Dec 25 20:56:40 old_mds kernel: LNet: Accept secure, port 988

Dec 25 21:45:23 old_mds kernel: LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:46:20 old_mds kernel: LustreError: 2378:0:(obd_mount_server.c:848:lustre_disconnect_lwp()) lustre-MDT0000-lwp-MDT0000: Can't end config log lustre-client.

Dec 25 21:46:20 old_mds kernel: LustreError: 2378:0:(obd_mount_server.c:1426:server_put_super()) MGS: failed to disconnect lwp. (rc=-2)

Dec 25 21:46:26 old_mds kernel: Lustre: 2378:0:(client.c:1897:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1387979180/real 1387979180]  req@ffff8802193b0c00 x1455398531891216/t0(0) o251->MGC192.168.1.7@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1387979186 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1

Dec 25 21:46:26 old_mds kernel: Lustre: server umount MGS complete

Dec 25 21:52:43 old_mds kernel: LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:53:08 old_mds kernel: LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. quota=on. Opts:

Dec 25 21:53:08 old_mds kernel: Lustre: MGS: Logs for fs lustre were removed by user request.  All servers must be restarted in order to regenerate the logs.

Dec 25 21:53:09 old_mds kernel: Lustre: lustre-MDT0000: used disk, loading

Dec 25 21:53:09 old_mds kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.

Dec 25 21:53:20 old_mds kernel: Lustre: MGS: Regenerating lustre-OST0000 log by user request.

Dec 25 21:53:27 old_mds kernel: Lustre: 2461:0:(mgc_request.c:1645:mgc_process_recover_log()) Process recover log lustre-mdtir error -22

Dec 25 21:53:27 old_mds kernel: LustreError: 2516:0:(ldlm_lib.c:429:client_obd_setup()) can't add initial connection

Dec 25 21:53:27 old_mds kernel: LustreError: 2516:0:(osp_dev.c:684:osp_init0()) lustre-OST0000-osc-MDT0000: can't setup obd: -2

Dec 25 21:53:27 old_mds kernel: LustreError: 2516:0:(obd_config.c:572:class_setup()) setup lustre-OST0000-osc-MDT0000 failed (-2)

Dec 25 21:53:27 old_mds kernel: LustreError: 2516:0:(obd_config.c:1591:class_config_llog_handler()) MGC192.168.1.7@tcp: cfg command failed: rc = -2

Dec 25 21:53:27 old_mds kernel: Lustre:    cmd=cf003 0:lustre-OST0000-osc-MDT0000  1:lustre-OST0000_UUID  2:0@<0:0>  
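
Is there a way to inspect the config llogs directly, to see where that bad NID "0@<0:0>" for lustre-OST0000 comes from? I was thinking of something along these lines, with the MDT stopped (paths are mine), but I am not sure it is the right way to read them:

debugfs -c -R 'dump CONFIGS/lustre-MDT0000 /tmp/lustre-MDT0000.cfg' /dev/mapper/VolGroup00-LogVol03
llog_reader /tmp/lustre-MDT0000.cfg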




During the tar copy, I accidentally created a file under /MDT and then deleted it. Could this have damaged the MDT?


After that I even tried adding failover nodes to the OSS and running both MDS servers at the same time, then mounting with "mount -t lustre 192.168.1.7@tcp:/lustre /lustre" and with "mount -t lustre 192.168.1.32@tcp:/lustre /lustre". Both fail.


Since we had almost run out of space everywhere else, I have put some important data onto Lustre. If it cannot be recovered, it will be a disaster for me... I hope somebody can help. Thanks a lot.


   Frank