I should also have mentioned the Robin Hood project:




Robin Hood is a storage policy management tool that works well with Lustre.





From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of Cowe, Malcolm J
Sent: Wednesday, May 29, 2013 9:27 AM
To: gary.k.sweely@census.gov; hpdd-discuss@lists.01.org
Cc: Prasad Surampudi; Holtz, JohnX; raymond.illian@census.gov; Chris Churchey; james.m.lessard@census.gov
Subject: Re: [HPDD-discuss] Anyone Backing up a Large LUSTRE file systems, any issues


Hi Gary,


I would recommend a file-based backup strategy in which the backup processes run on Lustre clients that are connected to the backup infrastructure. In fact, this is the only realistic way to provide targeted restores of files and directories. We quite often see data management or mover nodes in HPC architectures: servers on the boundary of the cluster that can interface with external data systems such as tape libraries, over either a network or Fibre Channel. By managing the backups like this, there is no need to interface directly with the OSTs or MDTs, and most, if not all, backup applications will work perfectly well on the data management Lustre client.


One might also want to consider keeping an online duplicate of the most critical data by syncing it to a separate Lustre file system, since restore time from a tape vault can be considerable for a large volume of data. Several strategies exist, depending on requirements and the applications in use.







Malcolm Cowe, Systems Engineer

Intel High Performance Data Division


+61 408 573 001


From: hpdd-discuss-bounces@lists.01.org [mailto:hpdd-discuss-bounces@lists.01.org] On Behalf Of gary.k.sweely@census.gov
Sent: Wednesday, May 29, 2013 1:16 AM
To: hpdd-discuss@lists.01.org
Cc: Prasad Surampudi; Chris Churchey; Holtz, JohnX; raymond.illian@census.gov; james.m.lessard@census.gov
Subject: [HPDD-discuss] Anyone Backing up a Large LUSTRE file systems, any issues


Has anyone identified issues with backing up and restoring a large LUSTRE file system?


We want to be able to back up the file system and restore both individual files and the full file system.

Has anyone identified specific issues with backup and restore of the LUSTRE file system?

Backup needs to run while users are accessing and writing files to the file system.


Backup concerns:

  1. How does it handle backup of data spread across multiple OSTs/OSSs while maintaining consistency of the file segments?
  2. Will the backup system require a backup media service pulling data over Ethernet, or can the OSSs do direct backup and restore of the EXT4 file systems for full system backup/restores while maintaining consistency of the files spread across OSTs?
  3. Is there a specific backup product used to solve some of the file consistency issues?

We would be using a large tape drive library cluster that can stripe the backup across multiple tape drives to improve backup media performance. This would most likely mean having several systems running backups concurrently to multiple tape drive stripe sets. I expect we would need to break the LUSTRE file systems into several backup segments running concurrently, which would also mean several independent restores to restore the whole system. But one major requirement is being able to restore a single file or directory when needed.
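
As a rough illustration of the segmenting idea above, the sketch below splits a set of directories into roughly equal backup segments, one per concurrent backup stream. It is only a sketch under stated assumptions: the directory names and sizes are hypothetical, the sizes would have to come from some external source, and the greedy largest-first assignment is just one simple way to balance the streams, not something any particular backup product does.

```python
import heapq

def partition_segments(dir_sizes, n_streams):
    """Assign directories to backup streams, largest first, always
    adding to the stream with the smallest running total."""
    # Min-heap of (running total, stream index).
    heap = [(0, i) for i in range(n_streams)]
    heapq.heapify(heap)
    streams = [[] for _ in range(n_streams)]
    ordered = sorted(dir_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for path, size in ordered:
        total, idx = heapq.heappop(heap)
        streams[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return streams

# Hypothetical directory names and sizes in TB, for illustration only.
# Scratch space is left out because it does not need to be backed up.
example_sizes = {
    "/lustre/project/a": 120,
    "/lustre/project/b": 80,
    "/lustre/project/c": 70,
    "/lustre/source": 50,
    "/lustre/project/d": 30,
}
example_streams = partition_segments(example_sizes, 2)
```

With these example numbers the two streams come out at 170TB and 180TB. Each list would then be handed to a separate backup process running on a Lustre client.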

Backup windows would be 8-14 hours.

RTO of single file would need to be under 1 hour.

RTO of full file system would be 4 days.

RPO is one day's worth of project data, 1 week's worth of source data.
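
To put the windows and RTOs above in perspective, here is a back-of-the-envelope calculation of the sustained aggregate throughput they imply. It assumes decimal units (1TB = 1000GB) and a worst case in which the whole 500TB project workspace is written in one backup window and a full 1PB file system is restored in 4 days; incremental backups would need far less. The function name is made up for this example.

```python
def required_gb_per_s(data_tb, window_hours):
    """Sustained throughput in GB/s needed to move data_tb terabytes
    within window_hours hours (decimal units, 1 TB = 1000 GB)."""
    return data_tb * 1000 / (window_hours * 3600)

# Worst case: full backup of a 500TB project workspace in one window.
full_backup_8h = required_gb_per_s(500, 8)     # about 17.4 GB/s
full_backup_14h = required_gb_per_s(500, 14)   # about 9.9 GB/s

# Full restore of a 1PB file system within the 4-day RTO.
full_restore_4d = required_gb_per_s(1000, 4 * 24)  # about 2.9 GB/s
```

These figures are aggregates across all concurrent backup streams and tape drives, not per-drive rates.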



We are considering a LUSTRE environment as follows:


30TB-50TB source data, potentially will grow out to about 200TB.

100TB to 500TB Project workspace.

30TB of user Scratch space (does not need to be backed up).


Initial total capacity 170TB growing to max size of 1PB.


Most likely initially using 2TB OSTs across 11+ OSSs. May use larger OSTs if no issues are found in services/supportability/throughput.
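
For a sense of scale, the OST counts implied by these numbers can be worked out as below. This is a simple sketch that treats the quoted capacities as usable space, ignores any file system or RAID overhead, and assumes OSTs are spread evenly over exactly 11 OSSs; the function name is made up for this example.

```python
import math

def ost_layout(capacity_tb, ost_tb, n_oss):
    """Return (number of OSTs, max OSTs per OSS) for a given capacity,
    assuming equal-sized OSTs spread as evenly as possible."""
    n_osts = math.ceil(capacity_tb / ost_tb)
    per_oss = math.ceil(n_osts / n_oss)
    return n_osts, per_oss

initial_layout = ost_layout(170, 2, 11)    # (85, 8)
maximum_layout = ost_layout(1000, 2, 11)   # (500, 46)
```

At 1PB, 2TB OSTs would mean 500 OSTs, which is one reason larger OSTs or more OSSs may be worth considering.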


We were thinking of breaking the total space into separate file systems to allow using multiple MDSs/MDTs to improve metadata performance, which would also make full LUSTRE file system backups/restores easier. But this means losing the flexibility of having one large file system.


OSTs using EXT4 or XFS file systems.


About 25 dedicated client servers with 20 to 40 CPU cores and 200GB-1TB RAM, running scheduled batch compute jobs. Grows as loads dictate.

Potentially add about 10-100 VMware virtual client compute servers running batch jobs (4 or 8 cores with 8 to 32GB RAM).

About 2-5 interactive user nodes, nodes added as load needs dictate.



Truth 23. Your common sense is not always someone else's common sense. Don't assume that just because it's obvious to you, it will be obvious to others.
Gary K Sweely, 301-763-5532, Cell 301-651-2481
SAN and Storage Systems Manager
US Census, Bowie Computer Center

Paper Mail to:
Washington DC 20233

Office Physical or Delivery Address
17101 Melford Blvd
Bowie MD, 20715