On Thu, Jan 28, 2016 at 01:38:58PM -0800, Christoph Hellwig wrote:
On Thu, Jan 28, 2016 at 12:35:04PM -0700, Ross Zwisler wrote:
> There are a number of places in dax.c that look up the struct block_device
> associated with an inode. Previously this was done by just using
> inode->i_sb->s_bdev. This is correct for inodes that exist within the
> filesystems supported by DAX (ext2, ext4 & XFS), but when running DAX
> against raw block devices this value is NULL. This causes NULL pointer
> dereferences when these block_device pointers are used.
It's also wrong for an XFS file system with a RT device..
> +#define DAX_BDEV(inode) (S_ISBLK(inode->i_mode) ? I_BDEV(inode) \
> + : inode->i_sb->s_bdev)
.. but this isn't going to fix it. You must use a bdev returned by
get_blocks or a similar file system method.
Jan & Dave,
Before I start in on a solution to this issue I just wanted to confirm that
DAX can rely on the fact that the filesystem's get_block() call will reliably
set bh->b_bdev for non-error returns. From this conversation between Jan &
No. The real problem is a long-standing abuse of struct buffer_head
used for passing block mapping information (it's on my todo list to remove
that at least from DAX code and use cleaner block mapping interface but
first I want basic DAX functionality to settle down to avoid unnecessary
conflicts). Filesystem is not supposed to touch bh->b_bdev.
That has not been true for a long, long time. e.g. XFS always
rewrites bh->b_bdev in get_blocks because the file may not reside on
the primary block device of the filesystem. i.e.:
* If this is a realtime file, data may be on a different device.
* to that pointed to from the buffer_head b_bdev currently.
bh_result->b_bdev = xfs_find_bdev_for_inode(inode);
If you need
that filled in, set it yourself in before passing bh to the block mapping
That may be true, but we cannot assume that the bdev coming back
out of get_block is the same one that was passed in.
It sounds like this is always true for XFS, and from looking at the ext4 code
I think this is true there as well because bh->b_bdev is set in
ext4_dax_mmap_get_block() via map_bh().
Relying on the bh->b_bdev returned by get_block() is correct, yea?