On Tue, Sep 01, 2015 at 09:06:08AM +0200, Christoph Hellwig wrote:
On Tue, Sep 01, 2015 at 09:38:03AM +1000, Dave Chinner wrote:
> On Mon, Aug 31, 2015 at 12:59:44PM -0600, Ross Zwisler wrote:
> > For DAX msync we just need to flush the given range using
> > wb_cache_pmem(), which is now a public part of the PMEM API.
> This is wrong, because it still leaves fsync() broken on dax.
> Flushing dirty data to stable storage is the responsibility of the
> writeback infrastructure, not the VMA/mm infrasrtucture. For non-dax
> configurations, msync defers all that to vfs_fsync_range(), because
> it has to be implemented there for fsync() to work.
> Even for DAX, msync has to call vfs_fsync_range() for the filesystem to commit
> the backing store allocations to stable storage, so there's not
> getting around the fact msync is the wrong place to be flushing
> DAX mappings to persistent storage.
DAX does call ->fsync before and after this patch. And with all
the recent fixes we take care to ensure data is written though the
cache for everything but mmap-access. With this patch from Ross
we ensure msync writes back the cache before calling ->fsync so that
the filesystem can then do it's work like converting unwritten extents.
The only downside is that previously on Linux you could always use
fsync as a replaement for msymc, which isn't true anymore for DAX.
Which means applications that should "just work" without
modification on DAX are now subtly broken and don't actually
guarantee data is safe after a crash. That's a pretty nasty
landmine, and goes against *everything* we've claimed about using
DAX with existing applications.
That's wrong, and needs fixing.