Lustre and kernel buffer interaction
by John Bauer
I have been trying to understand a behavior I am observing in an IOR
benchmark on Lustre. I have pared it down to a simple example.
The IOR benchmark is running in MPI mode. There are 2 ranks, each
running on its own node. The test was run on the "swan" cluster at
Cray Inc., using /lus/scratch. Each rank does the following:
  write a file (10 GB)
  fsync the file
  close the file
  MPI_Barrier
  open the file that was written by the other rank
  read the file that was written by the other rank
  close the file that was written by the other rank
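The per-rank sequence above can be sketched in plain POSIX C. This is only an illustration and not IOR's code: the function names write_sync_close and read_whole_file are hypothetical, the MPI_Barrier between the two phases is left out, and in the real test the read phase targets the file written by the other rank. Interrupted system calls (EINTR) are not retried.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `total` bytes to `path` in pieces of at most `chunk` bytes,
 * then fsync and close. Returns 0 on success, -1 on any error. */
int write_sync_close(const char *path, size_t total, size_t chunk)
{
    assert(path != NULL && chunk > 0);

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int rc = 0;
    size_t done = 0;
    while (done < total) {
        size_t want = (total - done < chunk) ? total - done : chunk;
        ssize_t n = write(fd, buf, want);
        if (n <= 0) {           /* treat a zero-length write as an error too */
            rc = -1;
            break;
        }
        done += (size_t)n;
    }
    if (rc == 0 && fsync(fd) != 0)  /* the "fsync the file" step */
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    free(buf);
    return rc;
}

/* Read `path` until end of file. Returns bytes read, or -1 on error. */
long long read_whole_file(const char *path, size_t chunk)
{
    assert(path != NULL && chunk > 0);

    char *buf = malloc(chunk);
    if (buf == NULL)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    long long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)             /* end of file */
            break;
        total += n;
    }
    close(fd);
    free(buf);
    return total;
}
```

Timing each call separately (for example with clock_gettime) would show where the slow start of the read phase falls.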
The writing of each file goes as expected.
The fsync takes very little time (about 0.05 seconds).
The first reads of the file (written by the other rank) start out *very*
slowly. While these first reads are proceeding slowly, the
kernel's cached memory (the Cached: line in /proc/meminfo) decreases
from the size of the file just written to nearly zero.
Once the cached memory has reached nearly zero, the file reading
proceeds as expected.
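For anyone who wants to watch the same counter, here is a minimal C sketch that extracts the Cached: value from /proc/meminfo. It is not the instrumentation used for the attached plot; the function names cached_kb_from_text and cached_kb_now are hypothetical, and the fixed 8 KiB buffer assumes the Cached: line appears near the top of the file, as it does on typical Linux kernels.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the value of the "Cached:" line (in kB) from meminfo-formatted
 * text, or -1 if no such line exists. Lines such as "SwapCached:" do not
 * match because the comparison starts at the beginning of each line. */
long long cached_kb_from_text(const char *text)
{
    assert(text != NULL);

    const char *line = text;
    while (*line != '\0') {
        long long kb;
        if (sscanf(line, "Cached: %lld kB", &kb) == 1)
            return kb;
        const char *nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;          /* advance to the start of the next line */
    }
    return -1;
}

/* Read /proc/meminfo once and return the current Cached: value in kB,
 * or -1 if the file cannot be read or the line is missing. */
long long cached_kb_now(void)
{
    char buf[8192];
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return cached_kb_from_text(buf);
}
```

Calling cached_kb_now() at a fixed interval during the read phase would give the drain curve described above.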
I have attached a jpg of the instrumentation of the processes that
illustrates this behavior.
My questions are:
Why does the reading of the file written by the other rank wait until
the cached data drains to nearly zero before proceeding normally?
Shouldn't the fsync ensure that the file's data is written to the
backing storage, so that this draining of the cached memory is simply
releasing pages with no further I/O?
For this case the "dead" time is only about 4 seconds, but this "dead"
time scales directly with the size of the files.
John
--
John Bauer
I/O Doctors LLC
507-766-0378
bauerj(a)iodoctors.com
5 years, 10 months
Using a swapfile on lustre
by E.S. Rosenberg
Is anyone using a swapfile stored on Lustre? I made a quick attempt and got a
"kernel: swapon: swapfile has holes"
error. However, a Nabble discussion from the days of Lustre 1.6.x seems
to suggest that it was supported at one time.
Another question:
To what extent does Lustre in the staging tree of the kernel change
from release to release? We are currently using kernel 3.17 + Lustre
2.5.3 (on clients). We are not likely to upgrade the kernel for a
while unless we run into big errors or have issues/needs with other
programs that call for a newer kernel, but I am curious how much gain
(if any) there is from upgrading regularly.
Thanks,
Eli
6 years, 1 month
[PATCH v2] staging: lustre: lustre: osc: modifying seq_printf statements
by Heba Aamer
This patch modifies the seq_printf statements in
drivers/staging/lustre/lustre/osc/lproc_osc.c file.
It changes them to seq_puts and seq_putc wherever applicable.
Signed-off-by: Heba Aamer <heba93aamer(a)gmail.com>
---
v2: changing %% to %
drivers/staging/lustre/lustre/osc/lproc_osc.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/staging/lustre/lustre/osc/lproc_osc.c b/drivers/staging/lustre/lustre/osc/lproc_osc.c
index 8e22e45..1795d3a 100644
--- a/drivers/staging/lustre/lustre/osc/lproc_osc.c
+++ b/drivers/staging/lustre/lustre/osc/lproc_osc.c
@@ -364,7 +364,7 @@ static int osc_checksum_type_seq_show(struct seq_file *m, void *v)
else
seq_printf(m, "%s ", cksum_name[i]);
}
- seq_printf(m, "\n");
+ seq_putc(m, '\n');
return 0;
}
@@ -601,9 +601,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, "pending read pages: %d\n",
atomic_read(&cli->cl_pending_r_pages));
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "pages per rpc rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "pages per rpc rpcs % cum % |");
+ seq_puts(seq, " rpcs % cum %\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_page_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_page_hist);
@@ -624,9 +624,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "rpcs in flight rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "rpcs in flight rpcs % cum % |");
+ seq_puts(seq, " rpcs % cum %\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_rpc_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_rpc_hist);
@@ -647,9 +647,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "offset rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "offset rpcs % cum % |");
+ seq_puts(seq, " rpcs % cum %\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_offset_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_offset_hist);
--
1.7.9.5
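The v2 note ("changing %% to %") follows from how format strings work: a printf-style function collapses "%%" into a single '%', while a puts-style function copies the string verbatim. The userspace sketch below illustrates this with snprintf and a plain copy standing in for seq_printf and seq_puts; it is an analogy only, and the helper names header_via_printf and header_via_puts are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Analog of seq_printf(): the string is a format, so each "%%"
 * is emitted as a single '%'. Returns the formatted length. */
int header_via_printf(char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    return snprintf(out, cap, "rpcs %% cum %%");
}

/* Analog of seq_puts(): the string is copied verbatim, so any "%%"
 * in it stays as two characters. Returns the length, or -1 if the
 * string does not fit in `cap` bytes. */
int header_via_puts(char *out, size_t cap, const char *s)
{
    assert(out != NULL && s != NULL);
    size_t len = strlen(s);
    if (len >= cap)
        return -1;
    memcpy(out, s, len + 1);    /* include the terminating NUL */
    return (int)len;
}
```

With these helpers, header_via_printf yields "rpcs % cum %", whereas header_via_puts given "rpcs %% cum %%" yields the doubled form unchanged. Keeping the output identical after the conversion therefore requires single '%' characters in the seq_puts strings.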
6 years, 1 month
[PATCH] staging/lustre: remove unused lustre_update.h header
by green@linuxhacker.ru
From: Oleg Drokin <green(a)linuxhacker.ru>
lustre_update.h contains various server-side structures that are
not really relevant for the client.
Also remove the only user of this file that does not actually need it.
Signed-off-by: Oleg Drokin <green(a)linuxhacker.ru>
---
.../staging/lustre/lustre/include/lustre_update.h | 189 ---------------------
drivers/staging/lustre/lustre/ptlrpc/layout.c | 1 -
2 files changed, 190 deletions(-)
delete mode 100644 drivers/staging/lustre/lustre/include/lustre_update.h
diff --git a/drivers/staging/lustre/lustre/include/lustre_update.h b/drivers/staging/lustre/lustre/include/lustre_update.h
deleted file mode 100644
index 84defce..0000000
--- a/drivers/staging/lustre/lustre/include/lustre_update.h
+++ /dev/null
@@ -1,189 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.htm
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2013, Intel Corporation.
- */
-/*
- * lustre/include/lustre_update.h
- *
- * Author: Di Wang <di.wang(a)intel.com>
- */
-
-#ifndef _LUSTRE_UPDATE_H
-#define _LUSTRE_UPDATE_H
-
-#define UPDATE_BUFFER_SIZE 8192
-struct update_request {
- struct dt_device *ur_dt;
- struct list_head ur_list; /* attached itself to thandle */
- int ur_flags;
- int ur_rc; /* request result */
- int ur_batchid; /* Current batch(trans) id */
- struct update_buf *ur_buf; /* Holding the update req */
-};
-
-static inline unsigned long update_size(struct update *update)
-{
- unsigned long size;
- int i;
-
- size = cfs_size_round(offsetof(struct update, u_bufs[0]));
- for (i = 0; i < UPDATE_BUF_COUNT; i++)
- size += cfs_size_round(update->u_lens[i]);
-
- return size;
-}
-
-static inline void *update_param_buf(struct update *update, int index,
- int *size)
-{
- int i;
- void *ptr;
-
- if (index >= UPDATE_BUF_COUNT)
- return NULL;
-
- ptr = (char *)update + cfs_size_round(offsetof(struct update,
- u_bufs[0]));
- for (i = 0; i < index; i++) {
- LASSERT(update->u_lens[i] > 0);
- ptr += cfs_size_round(update->u_lens[i]);
- }
-
- if (size != NULL)
- *size = update->u_lens[index];
-
- return ptr;
-}
-
-static inline unsigned long update_buf_size(struct update_buf *buf)
-{
- unsigned long size;
- int i = 0;
-
- size = cfs_size_round(offsetof(struct update_buf, ub_bufs[0]));
- for (i = 0; i < buf->ub_count; i++) {
- struct update *update;
-
- update = (struct update *)((char *)buf + size);
- size += update_size(update);
- }
- LASSERT(size <= UPDATE_BUFFER_SIZE);
- return size;
-}
-
-static inline void *update_buf_get(struct update_buf *buf, int index, int *size)
-{
- int count = buf->ub_count;
- void *ptr;
- int i = 0;
-
- if (index >= count)
- return NULL;
-
- ptr = (char *)buf + cfs_size_round(offsetof(struct update_buf,
- ub_bufs[0]));
- for (i = 0; i < index; i++)
- ptr += update_size((struct update *)ptr);
-
- if (size != NULL)
- *size = update_size((struct update *)ptr);
-
- return ptr;
-}
-
-static inline void update_init_reply_buf(struct update_reply *reply, int count)
-{
- reply->ur_version = UPDATE_REPLY_V1;
- reply->ur_count = count;
-}
-
-static inline void *update_get_buf_internal(struct update_reply *reply,
- int index, int *size)
-{
- char *ptr;
- int count = reply->ur_count;
- int i;
-
- if (index >= count)
- return NULL;
-
- ptr = (char *)reply + cfs_size_round(offsetof(struct update_reply,
- ur_lens[count]));
- for (i = 0; i < index; i++) {
- LASSERT(reply->ur_lens[i] > 0);
- ptr += cfs_size_round(reply->ur_lens[i]);
- }
-
- if (size != NULL)
- *size = reply->ur_lens[index];
-
- return ptr;
-}
-
-static inline void update_insert_reply(struct update_reply *reply, void *data,
- int data_len, int index, int rc)
-{
- char *ptr;
-
- ptr = update_get_buf_internal(reply, index, NULL);
- LASSERT(ptr != NULL);
-
- *(int *)ptr = cpu_to_le32(rc);
- ptr += sizeof(int);
- if (data_len > 0) {
- LASSERT(data != NULL);
- memcpy(ptr, data, data_len);
- }
- reply->ur_lens[index] = data_len + sizeof(int);
-}
-
-static inline int update_get_reply_buf(struct update_reply *reply, void **buf,
- int index)
-{
- char *ptr;
- int size = 0;
- int result;
-
- ptr = update_get_buf_internal(reply, index, &size);
- result = *(int *)ptr;
-
- if (result < 0)
- return result;
-
- LASSERT((ptr != NULL && size >= sizeof(int)));
- *buf = ptr + sizeof(int);
- return size - sizeof(int);
-}
-
-static inline int update_get_reply_result(struct update_reply *reply,
- void **buf, int index)
-{
- void *ptr;
- int size;
-
- ptr = update_get_buf_internal(reply, index, &size);
- LASSERT(ptr != NULL && size > sizeof(int));
- return *(int *)ptr;
-}
-
-#endif
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index dc5ceb5..bbef666 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -65,7 +65,6 @@
#endif
/* struct ptlrpc_request, lustre_msg* */
#include "../include/lustre_req_layout.h"
-#include "../include/lustre_update.h"
#include "../include/lustre_acl.h"
#include "../include/lustre_debug.h"
--
2.1.0
6 years, 1 month
Nominations due January 30th: OpenSFS Community Representative
by OpenSFS Administration
Dear OpenSFS Adopters and Supporters,
As a reminder, January 30th is the last day to submit your nominations for
the 2015 OpenSFS Community Representative Director on the OpenSFS Board of
Directors. This position will be open from February 2015 to February 2016.
Nominations must be emailed to admin(a)opensfs.org before 5pm Pacific Time,
Friday, January 30th.
Please take a moment to give this nomination serious consideration. The
person selected for this position is responsible for representing the
OpenSFS Supporter and Adopter Participants and has a full Board seat and
voting rights to carry out that responsibility. Community participation is
growing and we want to be sure all Supporters and Adopters have a strong
voice in the decisions made by the OpenSFS Board. Your Community Board
Member is guided by Supporter and Adopter values as the Board makes
decisions for OpenSFS including how OpenSFS Development and Support funding
is spent.
We'd like to thank Steve Simms from Indiana University for his excellent
service as the 2014 Community Representative Director.
Full details regarding the nomination process can be found on the Community
Director Election web page <http://opensfs.org/about/community-director/>.
Sincerely,
OpenSFS Board of Directors
__________________________
OpenSFS
3855 SW 153rd Drive Beaverton, OR 97003 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: @OpenSFS <https://twitter.com/opensfs>
Email: admin(a)opensfs.org | Website: www.opensfs.org <http://www.opensfs.org/>
6 years, 1 month
LUG 2015 Presentation Submission Deadline is February 6
by OpenSFS Administration
Call for Presentations: Lustre User Group 2015
Denver Marriott City Center, April 13-15
The deadline to submit a presentation for the 13th annual Lustre® User
Group (LUG) conference <http://opensfs.org/events/lug-2015/> is
quickly approaching. Don't miss your chance to share your company's best
practices, case studies, or business experiences with more than 200
Lustre-focused attendees.
OpenSFS relies on the real world experience of the open source Lustre
community members to provide valuable sessions to LUG attendees, and we want
to hear from you! All LUG presentations will be 30 minutes, with a total of
25 speaking opportunities available.
Submission deadline is Friday, February 6, 2015!
To submit an abstract:
* Visit the LUG 2015 Call for Presentations Submission web page
<http://opensfs.org/lug-call-for-presentations/>
* Only an abstract is required for the submission process;
presentation materials will be requested once abstracts are reviewed and
selected
OpenSFS is also excited to include panel discussions and poster sessions in
the LUG agenda:
* Suggest a topic for a panel discussion - or be part of a panel and
debate the future direction of the industry with leading developers,
vendors, and users of Lustre. OpenSFS Administration
<mailto:admin@opensfs.org> welcomes your suggestions!
* Poster sessions provide an opportunity for informal, interactive
presentation and discussion of diverse topics related to Lustre technology.
* Visit the LUG 2015 Call for Presentations Submission web page
<http://opensfs.org/lug-call-for-presentations/>
With the ongoing growth and outstanding progress of the Lustre community,
LUG 2015 promises to be an exciting and informative event! Registration will
open in February and additional event logistics are included on the LUG
event page <http://opensfs.org/events/lug-2015/>.
Please contact admin(a)opensfs.org if you have any
questions.
Best regards,
OpenSFS LUG Planning Committee
_________________________
OpenSFS Administration
3855 SW 153rd Drive Beaverton, OR 97003 USA
Phone: +1 503-619-0561 | Fax: +1 503-644-6708
Twitter: @OpenSFS <https://twitter.com/opensfs>
Email: admin(a)opensfs.org | Website: www.opensfs.org <http://www.opensfs.org>
Learn how to become a 2015 LUG Sponsor:
<http://cdn.opensfs.org/wp-content/uploads/2015/01/LUG_2015_Sponsorship_Opportunities.pdf>
Open Scalable File Systems, Inc. was founded in 2010 to advance Lustre
development, ensuring it remains vendor-neutral, open, and free. Since its
inception, OpenSFS has been responsible for advancing the Lustre file system
and delivering new releases on behalf of the open source community. Through
working groups, events, and ongoing funding initiatives, OpenSFS harnesses
the power of collaborative development to fuel innovation and growth of the
Lustre file system worldwide.
6 years, 1 month
[PATCH] staging: lustre: lustre: osc: modifying seq_printf statements
by Heba Aamer
This patch modifies the seq_printf statements in
drivers/staging/lustre/lustre/osc/lproc_osc.c file.
It changes them to seq_puts and seq_putc wherever applicable.
Signed-off-by: Heba Aamer <heba93aamer(a)gmail.com>
---
drivers/staging/lustre/lustre/osc/lproc_osc.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/staging/lustre/lustre/osc/lproc_osc.c b/drivers/staging/lustre/lustre/osc/lproc_osc.c
index 8e22e45..61d2ca4 100644
--- a/drivers/staging/lustre/lustre/osc/lproc_osc.c
+++ b/drivers/staging/lustre/lustre/osc/lproc_osc.c
@@ -364,7 +364,7 @@ static int osc_checksum_type_seq_show(struct seq_file *m, void *v)
else
seq_printf(m, "%s ", cksum_name[i]);
}
- seq_printf(m, "\n");
+ seq_putc(m, '\n');
return 0;
}
@@ -601,9 +601,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, "pending read pages: %d\n",
atomic_read(&cli->cl_pending_r_pages));
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "pages per rpc rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "pages per rpc rpcs %% cum %% |");
+ seq_puts(seq, " rpcs %% cum %%\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_page_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_page_hist);
@@ -624,9 +624,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "rpcs in flight rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "rpcs in flight rpcs %% cum %% |");
+ seq_puts(seq, " rpcs %% cum %%\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_rpc_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_rpc_hist);
@@ -647,9 +647,9 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
- seq_printf(seq, "offset rpcs %% cum %% |");
- seq_printf(seq, " rpcs %% cum %%\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "offset rpcs %% cum %% |");
+ seq_puts(seq, " rpcs %% cum %%\n");
read_tot = lprocfs_oh_sum(&cli->cl_read_offset_hist);
write_tot = lprocfs_oh_sum(&cli->cl_write_offset_hist);
--
1.7.9.5
6 years, 1 month
[PATCH] staging: lustre: lustre: osc: fix Prefer seq_puts to seq_printf
by Heba Aamer
This patch fixes the following checkpatch.pl warning:
Prefer seq_puts to seq_printf
Signed-off-by: Heba Aamer <heba93aamer(a)gmail.com>
---
drivers/staging/lustre/lustre/osc/lproc_osc.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/lustre/lustre/osc/lproc_osc.c b/drivers/staging/lustre/lustre/osc/lproc_osc.c
index 8e22e45..4da837e 100644
--- a/drivers/staging/lustre/lustre/osc/lproc_osc.c
+++ b/drivers/staging/lustre/lustre/osc/lproc_osc.c
@@ -364,7 +364,7 @@ static int osc_checksum_type_seq_show(struct seq_file *m, void *v)
else
seq_printf(m, "%s ", cksum_name[i]);
}
- seq_printf(m, "\n");
+ seq_puts(m, "\n");
return 0;
}
@@ -601,7 +601,7 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, "pending read pages: %d\n",
atomic_read(&cli->cl_pending_r_pages));
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
seq_printf(seq, "pages per rpc rpcs %% cum %% |");
seq_printf(seq, " rpcs %% cum %%\n");
@@ -624,7 +624,7 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
seq_printf(seq, "rpcs in flight rpcs %% cum %% |");
seq_printf(seq, " rpcs %% cum %%\n");
@@ -647,7 +647,7 @@ static int osc_rpc_stats_seq_show(struct seq_file *seq, void *v)
break;
}
- seq_printf(seq, "\n\t\t\tread\t\t\twrite\n");
+ seq_puts(seq, "\n\t\t\tread\t\t\twrite\n");
seq_printf(seq, "offset rpcs %% cum %% |");
seq_printf(seq, " rpcs %% cum %%\n");
--
1.7.9.5
6 years, 1 month
[PATCH] staging: lustre: fid: lproc_fid: Improving error control
by Rickard Strandqvist
Improve error checking by using the return value from sscanf.
This was found using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist(a)spectrumdigital.se>
---
drivers/staging/lustre/lustre/fid/lproc_fid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/lustre/lustre/fid/lproc_fid.c b/drivers/staging/lustre/lustre/fid/lproc_fid.c
index 6a21f07..9b4ada4 100644
--- a/drivers/staging/lustre/lustre/fid/lproc_fid.c
+++ b/drivers/staging/lustre/lustre/fid/lproc_fid.c
@@ -85,7 +85,7 @@ static int lprocfs_fid_write_common(const char __user *buffer, size_t count,
rc = sscanf(kernbuf, "[%llx - %llx]\n",
(unsigned long long *)&tmp.lsr_start,
(unsigned long long *)&tmp.lsr_end);
- if (!range_is_sane(&tmp) || range_is_zero(&tmp) ||
+ if (rc != 2 || !range_is_sane(&tmp) || range_is_zero(&tmp) ||
tmp.lsr_start < range->lsr_start || tmp.lsr_end > range->lsr_end)
return -EINVAL;
*range = tmp;
--
1.7.10.4
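The idea behind the patch is that sscanf returns the number of fields it actually converted, so a result other than 2 means at least one end of the range was never assigned. A userspace sketch of the same check follows; struct range and parse_range are hypothetical stand-ins for the kernel structure and lprocfs_fid_write_common, and the sanity checks on the range itself are omitted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the lsr_start/lsr_end fields in the patch. */
struct range {
    unsigned long long start;
    unsigned long long end;
};

/* Parse "[start - end]" with both values in hex. Returns 0 and fills
 * *out only if sscanf converted both fields; otherwise returns -1 and
 * leaves *out untouched. */
int parse_range(const char *text, struct range *out)
{
    assert(text != NULL && out != NULL);

    struct range tmp = { 0, 0 };
    int rc = sscanf(text, "[%llx - %llx]", &tmp.start, &tmp.end);

    if (rc != 2)                /* fewer than two conversions: reject */
        return -1;

    *out = tmp;
    return 0;
}
```

Without the rc check, an input such as "[10 - " would leave the second field at whatever value it held before the call, and later validation would operate on data that was never parsed.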
6 years, 1 month