Re: [HPDD-discuss] Problem remounting file systems / MDT
by Laurent Herlaud
Hi Andreas,
A few days before that, we upgraded from Lustre 1.8 to Lustre 2.1.6.
Everything was working fine.
On 11 March 2014 at 19:27, "Dilger, Andreas" <andreas.dilger(a)intel.com> wrote:
>
> On 2014/03/11, 10:36 AM, "Laurent Herlaud" <lherlaud(a)arsystemes.fr> wrote:
> After a kernel panic on my OSS on RHEL 5, I decided to upgrade it to RHEL 6.4.
>
> Even more important - which version of Lustre did you upgrade from, and which version are you trying to run now?
>
> Cheers, Andreas
>
> I have 5 file systems, and 2 of them don't want to remount.
>
> I first had problems mounting the MDT and OSTs for them, with "transport endpoint not connected" errors, but after a tunefs.lustre --erase-params run it's OK.
>
> But I have another problem when I want to mount them on clients:
>
> # mount /appli
> mount.lustre: mount lo1117@o2ib:lo1116@o2ib:/appli at /appli failed: No such file or directory
> Is the MGS specification correct?
> Is the filesystem name correct?
> If upgrading, is the copied client log valid? (see upgrade docs)
>
> lctl ping lo1116@o2ib and lctl ping lo1117@o2ib are OK.
>
> So I tried to umount the 2 OSTs associated with this filesystem, then the MDT and the MGS.
> Then I remounted the MGS, but when I want to remount the MDT, it fails and tells me that the file exists.
>
>
> mount.lustre: mount /dev/mapper/appli at /lustre/mdt_appli failed: File exists
>
> When I check /proc/fs/lustre/devices, I see this line for that filesystem:
>
> 19 AT osc appli-OST0002-osc-MDT0000 appli-MDT0000-mdtlov_UUID 1
>
> I have double-checked, but the OSTs and the MDT associated with filesystem "appli" are not mounted.
>
> Any idea?
>
> Need a reboot here in any case.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Software Architect
> Intel High Performance Data Division
8 years, 2 months
Problem remounting file systems / MDT
by Laurent Herlaud
Hi,
After a kernel panic on my OSS on RHEL 5, I decided to upgrade it to
RHEL 6.4.
I have 5 file systems, and 2 of them don't want to remount.
I first had problems mounting the MDT and OSTs for them, with "transport
endpoint not connected" errors, but after a tunefs.lustre --erase-params run it's OK.
But I have another problem when I want to mount them on clients:
# mount /appli
mount.lustre: mount lo1117@o2ib:lo1116@o2ib:/appli at /appli failed: No
such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)
lctl ping lo1116@o2ib and lctl ping lo1117@o2ib are OK.
So I tried to umount the 2 OSTs associated with this filesystem, then
the MDT and the MGS.
Then I remounted the MGS, but when I want to remount the MDT, it fails
and tells me that the file exists.
mount.lustre: mount /dev/mapper/appli at /lustre/mdt_appli failed: File exists
When I check /proc/fs/lustre/devices, I see this line for that
filesystem:
19 AT osc appli-OST0002-osc-MDT0000 appli-MDT0000-mdtlov_UUID 1
I have double-checked, but the OSTs and the MDT associated with
filesystem "appli" are not mounted.
Any idea?
Thanks for your help!
LU-4619 filefrag and FIEMAP
by John Bauer
Is it possible to get the source for filefrag as used on Lustre? I have
a simple program that does FIEMAP ioctl requests, and it crashes the
client systems. I would like to figure out how filefrag does its FIEMAP
calls without crashing systems. I have run strace on both my program and
filefrag, and they both use the same command argument to ioctl. I am
stuck until LU-4619 gets addressed.
https://jira.hpdd.intel.com/browse/LU-4619
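For comparison, a minimal whole-file FIEMAP request on plain Linux can be sketched as follows. This is generic ioctl code, not the filefrag implementation (filefrag itself ships with e2fsprogs, where its FIEMAP handling can be compared); the function name and the 32-extent buffer size are illustrative choices:

```c
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/* Map the whole file and return the number of extents reported,
 * or -1 on error.  FS_IOC_FIEMAP and struct fiemap come from the
 * kernel headers; nothing here is Lustre-specific. */
int count_extents(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    unsigned int max = 32;  /* extent records we leave room for */
    struct fiemap *fm = calloc(1, sizeof(*fm) +
                               max * sizeof(struct fiemap_extent));
    if (fm == NULL) {
        close(fd);
        return -1;
    }

    fm->fm_start = 0;
    fm->fm_length = FIEMAP_MAX_OFFSET;  /* whole file */
    fm->fm_flags = FIEMAP_FLAG_SYNC;    /* write out dirty data first */
    fm->fm_extent_count = max;

    int rc = ioctl(fd, FS_IOC_FIEMAP, fm);
    int n = (rc == 0) ? (int)fm->fm_mapped_extents : -1;

    free(fm);
    close(fd);
    return n;
}
```

A file with more than 32 extents would need repeated calls, restarting fm_start past the last returned extent; filefrag loops in a similar fashion.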
Thanks for any help.
John
--
John Bauer
I/O Doctors LLC
507-766-0378
bauerj(a)iodoctors.com
REMINDER: Lustre Survey 2014 by OpenSFS
by Christopher J. Morrone
Dear Lustre Community,
This is a REMINDER that the 2014 Lustre Survey is under way NOW!
Please note that the Survey will complete on Friday, March 14th.
Here is the original announcement in case you missed it the first time:
The OpenSFS Community Development Working Group is gathering data from
organizations using Lustre in order to develop a long-term support
strategy recommendation for Lustre.
We want to ensure that future Lustre releases are well-aligned with the
needs of the Lustre community.
Please complete this short survey to make sure that your organization's
voice is heard!
https://www.surveymonkey.com/s/P3HFY7R
Note that all questions are optional so it is ok to submit a partially
completed survey if you prefer not to disclose some information.
Best regards,
Christopher Morrone
OpenSFS CDWG Lead
Double mount in an active/active OST configuration?
by Laurent Herlaud
Hi,
I have a question about active/active OST configuration.
I have 2 OSTs in active/active with 4 OSS, 2 OSS mounted on each server.
Do I have to mount 2 OSS on OST1 and 2 OSS on OST2, or mount the 4 OSS
on the 2 OSTs?
Thanks.
8 years, 2 months
Re: [HPDD-discuss] iozone and lustre
by E.S. Rosenberg
On Thu, Feb 20, 2014 at 7:46 PM, Alexander I Kulyavtsev <aik(a)fnal.gov> wrote:
> You may want to use iozone in throughput mode, with flag -t. It will use
> multiple writers/readers on the same node.
>
> If you wish to use several client nodes, use iozone cluster mode (-+m). You
> need to prepare a cluster file to describe file locations.
>
> Use a file size / data set large enough to "flush" data after write on the
> server; otherwise read rates are "too good" because data is read from memory.
> Try the iozone remount option. We used an iozone wrapper on slave nodes and
> set the "don't need" flag for the file on write so it is flushed from cache;
> at least the rates changed.
>
> You may want to watch the network rates and choose a time period when all
> processes are doing I/O. The iozone printout includes periods when some procs
> have finished and late starters are still doing I/O.
> But: network rates include retries and read aheads.
>
> This is independent of file striping.
>
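For reference, the cluster file that iozone's -+m mode reads lists one client per line: a hostname, the test directory on that client, and the path to the iozone binary there. The hostnames and paths in this example are made up:

```
client1  /mnt/lustre/iozone-work  /usr/bin/iozone
client2  /mnt/lustre/iozone-work  /usr/bin/iozone
client3  /mnt/lustre/iozone-work  /usr/bin/iozone
```

It is then passed with something like `iozone -+m clients.txt -t 3 -s 4g -r 1m`, where -t selects how many of the listed clients take part.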
Thanks Alex I'm going to try that!
Sorry to keep bothering everyone, but:
How do I troubleshoot an lfs setstripe that refuses, saying it's getting an
invalid argument (22)?
lfs setstripe iozone.tmp -c 10
unable to open 'iozone.tmp': Invalid argument (22)
error: setstripe: create stripe file 'iozone.tmp' failed
I have tried touching the file first, and I have tried it on directories (both
existing and non-existing), but with the same result every time.
I am running the command as a regular user inside a folder that belongs to
the user.
Thanks,
Eli
>
> Alex
>
> On Feb 20, 2014, at 8:01 AM, "E.S. Rosenberg" <
> esr+hpdd-discuss(a)mail.hebrew.edu> wrote:
>
> > at the moment I am just using automated tests...
> >
> > (iozone -Ra -g 64G)
>
LUG 2014 Agenda Announced
by OpenSFS Administration
<http://www.opensfs.org/events/lug14/>
Lustre User Group 2014
Miami Marriott Biscayne Bay, April 8-10
<http://www.opensfs.org/lug14/> The 12th Annual Lustre® User Group (LUG)
Conference is just over a month away, and we hope to see you there! This
year's event is structured to allow ample time for discussion and
collaboration, while maintaining a strong list of topics.
Event Overview: LUG 2014 will be held on April 8-10 at the
<http://www.marriott.com/hotels/travel/miabb-miami-marriott-biscayne-bay/>
Miami Marriott Biscayne Bay in Miami, Florida! The event will feature 29
sessions over two and a half days on topics including:
* The inaugural Lustre Annual Report from OpenSFS examining the
current status of Lustre in multiple vertical markets spanning the high
performance computing, Big Data, and enterprise computing markets, including
the drivers and barriers for future adoption.
* Deep technical presentations from DataDirect Networks, Intel,
Lawrence Livermore National Laboratory (LLNL), Los Alamos National
Laboratory (LANL), NetApp, Oak Ridge National Laboratory (ORNL), Texas
Advanced Computing Center (TACC), TITECH (Japan), and Xyratex.
* Wide range of technical poster sessions for easy access at any
point during the event.
Take a look at the <http://www.opensfs.org/lug-2014-agenda/> LUG 2014
Agenda.
Early Bird Registration:
<https://opensfs.wufoo.com/forms/lug-conference-2014/> Register before March
17th to take advantage of the special early-bird registration rate of $525.
After March 17th, the LUG registration rate will be $575.
Guestroom Block: The deadline to make your hotel reservation is Friday,
March 21. The General Attendee room rate is $169 USD for single occupancy,
and the Government Attendee room rate is $152 USD. Please note that
guestroom rates are subject to applicable state and local taxes (currently
13%) in effect at the time of check-out. To make your general attendee room
reservation, please click
<http://www.marriott.com/meeting-event-hotels/group-corporate-travel/groupCorp.mi?resLinkData=Open%20SFS%20LUG%20Conference%5Emiabb%60SFSSFSA%60$169.00%60USD%60false%604/7/14%604/11/14%603/21/14&app=resvlink&stop_mobi=yes> here.
If you have any questions, please feel free to contact
<mailto:admin@opensfs.org> admin(a)opensfs.org.
Best regards,
OpenSFS LUG Planning Committee
_________________________
OpenSFS Administration
3855 SW 153rd Drive Beaverton, OR 97006 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: <https://twitter.com/opensfs> @OpenSFS
Email: <mailto:admin@opensfs.org> admin(a)opensfs.org | Website:
<http://www.opensfs.org> www.opensfs.org
<http://www.opensfs.org/lug-2014-sponsorship/>
<http://www.opensfs.org/lug-2014-sponsorship/> Click here to learn how to
become a 2014 LUG Sponsor
Open Scalable File Systems, Inc. was founded in 2010 to advance Lustre
development, ensuring it remains vendor-neutral, open, and free. Since its
inception, OpenSFS has been responsible for advancing the Lustre file system
and delivering new releases on behalf of the open source community. Through
working groups, events, and ongoing funding initiatives, OpenSFS harnesses
the power of collaborative development to fuel innovation and growth of the
Lustre file system worldwide.
HSM questions
by Frank Zago
Hello,
We, at Cray, are writing another copytool, and we've got some questions.
1/ While no copytool is running, if I run "lfs hsm_archive" on a file,
and then delete that file, the action will still be on the MDS (lctl
get_param -n mdt.lustre-MDT0000.hsm.actions).
Also, if I start a copytool, it won't get an event for it. Thus the
action stays in the system until I manually "purge" it. Is that normal
behavior?
2/ What is the way, or is there a way, to archive/restore a single file
from different nodes? For instance, I have a 1TB file, and to hasten the
transfer, I want to spread the load between 3 nodes to do the transfer
between Lustre and the backend, with each node copying 1/3 of the data.
llapi_hsm_action_get_fd() doesn't seem to have been conceived to work
that way. How would the progress reporting work in that case?
3/ In what case is hai_extent.length different from -1 for archiving? The
posix copytool accounts for it, but I haven't seen it in my testing.
Regards,
Frank.
Using multiple NIDs for the same failnode
by Adesanya, Adeyemi
Hi.
I'm trying to perform a rolling upgrade of a Lustre 2.1.4 filesystem to 2.4.2. Each server has multiple NIDs with failover. Here's tunefs.lustre output from an OST:
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: pfs-OST0001
Index: 1
Lustre FS: pfs
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: ost.quota_type=ug mgsnode=192.168.115.8@o2ib,172.23.46.8@tcp failover.node=192.168.115.9@o2ib,172.23.46.9@tcp
Permanent disk data:
Target: pfs-OST0001
Index: 1
Lustre FS: pfs
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: ost.quota_type=ug mgsnode=192.168.115.8@o2ib,172.23.46.8@tcp failover.node=192.168.115.9@o2ib,172.23.46.9@tcp
When I attempt to fail over the OST to a 2.4.x server, I am seeing some type of parsing error:
Mar 3 18:42:57 ki-oss02 kernel: LDISKFS-fs (dm-1): Unrecognized mount option "172.23.46.8@tcp" or missing value
Mar 3 18:42:57 ki-oss02 kernel: LustreError: 789:0:(osd_handler.c:5410:osd_mount()) pfs-OST0001-osd: can't mount /dev/mapper/360080e5000245170000004aa4ebc173c: -22
This looks a lot like the bug described in LU-4460. Is there a fix/workaround for this in a current maintenance release?
-------
Yemi
filesystem usable space
by Javed Shaikh
hi,
specs: lustre 2.4.2 (client and server)
where can I find more info on the /proc/fs/lustre/llite/blah-file-system/kbytes* files and their purpose?
basically I'm interested in getting the actual usable space on the filesystem … using df (or lfs df), if I add up 'avail' + 'used' it doesn't give 'size' … so the filesystem is clearly using some space for itself, but where is this info stored?
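Part of that gap is visible from userspace: statvfs() distinguishes f_bfree (all free blocks) from f_bavail (free blocks available to unprivileged users), and df's 'Avail' column uses the latter, so used + avail comes up short of size by the reserved amount. A minimal sketch, not Lustre-specific; the helper name fs_space is made up for illustration:

```c
#include <sys/statvfs.h>

/* Fill out[0..3] with size, used, avail, and reserved bytes for
 * the filesystem containing 'path'.  Returns 0 on success, -1 on
 * error.  'reserved' is f_bfree - f_bavail: free space that df's
 * "Avail" column does not show to unprivileged users. */
int fs_space(const char *path, unsigned long long out[4])
{
    struct statvfs vfs;

    if (statvfs(path, &vfs) != 0)
        return -1;

    unsigned long long frsize = vfs.f_frsize;
    out[0] = vfs.f_blocks * frsize;                  /* size */
    out[1] = (vfs.f_blocks - vfs.f_bfree) * frsize;  /* used */
    out[2] = vfs.f_bavail * frsize;                  /* avail */
    out[3] = (vfs.f_bfree - vfs.f_bavail) * frsize;  /* reserved */
    return 0;
}
```

By construction, used + avail + reserved adds up exactly to size. The llite kbytestotal, kbytesfree, and kbytesavail files the question refers to appear to expose the same three quantities (total, free, and available space, in KiB) for the Lustre mount.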
any pointers will be helpful.
thanks,
javed