In a failover situation, any target can be stopped and restarted without
impact on other nodes. The startup order in the manual applies to a cold
startup or full shutdown; it does not apply to failover on a running
filesystem.
I don't think you should have the ordering directive. In particular, the
MGT is only used for client/server mounts; if the filesystem is up and
running, MGT failover should be transparent to the rest of the
cluster and should never require another node to be restarted.
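
To illustrate why the mandatory ordering produces the full restart, here is
a toy Python model. It is only a sketch, not Pacemaker's actual scheduler:
the resource names (ost00, ost01), the MANDATORY_ORDER table and the
restarted_on_migration function are all made up for illustration. It assumes
that a mandatory "A then B" constraint forces B to restart whenever A does.

```python
from collections import deque

# Hypothetical resources: each key must be started before the listed values.
MANDATORY_ORDER = {
    "mgt": ["mdt"],
    "mdt": ["ost00", "ost01"],
}


def restarted_on_migration(moved, order):
    """Return every resource that restarts when `moved` is migrated.

    Toy model: restarts propagate from a resource to everything that is
    ordered after it, directly or transitively.
    """
    restarted = [moved]
    queue = deque([moved])
    while queue:
        current = queue.popleft()
        for dependent in order.get(current, []):
            if dependent not in restarted:
                restarted.append(dependent)
                queue.append(dependent)
    return restarted


# With the mandatory chain, moving the MGT drags every target with it.
print(restarted_on_migration("mgt", MANDATORY_ORDER))
# Without an ordering constraint, only the MGT itself restarts.
print(restarted_on_migration("mgt", {}))
```

In this model, dropping the constraint leaves the MGT as the only resource
that restarts, which matches how Lustre itself handles a single-target
failover.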
On 5/23/14, 7:58 AM, "Riccardo Murri" <riccardo.murri(a)uzh.ch> wrote:
The online Lustre manual recommends that Lustre targets be started in
this order: MGT, MDT, OSTs, clients.
Now we are setting up an HA cluster with Pacemaker, and a strict
ordering directive ("order ... Mandatory: ostXX mdt mgt") results in a
complete restart of all targets if the MGT is migrated. In my
experience (with Lustre 1.8.5) this is usually unnecessary,
and Lustre can recover from a single target restart. However, we
recently switched to Lustre 2.4.3 and things might have changed.
So the question is: is this order strict (in Lustre 2.4.3), or can a
target be stopped and restarted on another node without affecting the
targets running on other nodes?
Thanks for any help!
Grid Computing Competence Centre
University of Zurich
Winterthurerstrasse 190, CH-8057 Zürich (Switzerland)
Tel: +41 44 635 4222
Fax: +41 44 635 6888
Lustre-discuss mailing list