On Sun, Jan 31, 2016 at 2:55 AM, Matthew Wilcox <willy(a)linux.intel.com> wrote:
On Sat, Jan 30, 2016 at 11:12:12PM -0700, Ross Zwisler wrote:
> > I did probably 70% of the work needed to switch the radix tree over to
> > storing PFNs instead of sectors. It seems viable, though it's a big
> > change from where we are today:
> At one point I had kaddrs in the radix tree, so I could just pull the addresses out
> and flush them. That would save us a pfn -> kaddr conversion before flush.
> Is there a reason to store PFNs instead of kaddrs in the radix tree?
Once ARM, MIPS and SPARC get supported, they're going to need temporary
kernel addresses assigned to PFNs rather than permanent ones. Also,
it'll be easier for teardown to delete PFNs associated with a particular
device than kaddrs associated with a particular device. And it lets
us support more persistent memory on a 32-bit machine (also on a 64-bit
machine, but that's mostly theoretical).
+ * DAX uses the 'exceptional' entries to store PFNs in the radix tree.
+ * Bit 0 is clear (the radix tree uses this for its own purposes). Bit
+ * 1 is set (to indicate an exceptional entry). Bits 2 & 3 are PFN_DEV
+ * and PFN_MAP. The top two bits denote the size of the entry (PTE, PMD,
+ * PUD, one reserved). That leaves us 26 bits on 32-bit systems and 58
+ * bits on 64-bit systems, able to address 256GB and 1024EB respectively.
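To make the bit budget in that comment concrete, here is a minimal userspace sketch of the layout. It is not the patch's code: every name (DAX_EXCEPTIONAL, dax_pack, and so on) is hypothetical, the assignment of PFN_DEV to bit 2 and PFN_MAP to bit 3 is an assumption, the numeric size encoding is an assumption, and the capacity figures assume 4KB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the bit positions come from the comment above. */
#define DAX_EXCEPTIONAL (1ULL << 1) /* bit 1 set: exceptional entry */
#define DAX_PFN_DEV     (1ULL << 2) /* assumed to be bit 2 */
#define DAX_PFN_MAP     (1ULL << 3) /* assumed to be bit 3 */
#define DAX_FLAG_BITS   4           /* bits 0..3 are not part of the PFN */
#define DAX_SIZE_BITS   2           /* top two bits hold the entry size */
#define DAX_PAGE_SHIFT  12          /* assumes 4KB pages */

/* Assumed encoding of the size field; the value 3 is the reserved one. */
enum dax_size { DAX_PTE = 0, DAX_PMD = 1, DAX_PUD = 2 };

/* Bits left for the PFN in a word of the given width (32 or 64). */
static unsigned dax_pfn_bits(unsigned word_bits)
{
    return word_bits - DAX_FLAG_BITS - DAX_SIZE_BITS;
}

/* log2 of the addressable bytes: 38 means 256GB, 70 means 1024EB. */
static unsigned dax_capacity_log2(unsigned word_bits)
{
    return dax_pfn_bits(word_bits) + DAX_PAGE_SHIFT;
}

/* Build an entry; bit 0 is always left clear for the radix tree. */
static uint64_t dax_pack(uint64_t pfn, enum dax_size size,
                         uint64_t flags, unsigned word_bits)
{
    uint64_t pfn_mask = (1ULL << dax_pfn_bits(word_bits)) - 1;

    assert(pfn <= pfn_mask); /* the PFN must fit in the remaining bits */
    return ((uint64_t)size << (word_bits - DAX_SIZE_BITS)) |
           (pfn << DAX_FLAG_BITS) |
           (flags & (DAX_PFN_DEV | DAX_PFN_MAP)) |
           DAX_EXCEPTIONAL;
}

/* Recover the PFN from an entry. */
static uint64_t dax_unpack_pfn(uint64_t entry, unsigned word_bits)
{
    uint64_t pfn_mask = (1ULL << dax_pfn_bits(word_bits)) - 1;

    return (entry >> DAX_FLAG_BITS) & pfn_mask;
}

/* Recover the size field from the top two bits. */
static enum dax_size dax_unpack_size(uint64_t entry, unsigned word_bits)
{
    return (enum dax_size)((entry >> (word_bits - DAX_SIZE_BITS)) & 3);
}
```

The word width is passed as a parameter only so that the 32-bit and 64-bit cases can be checked from the same program; real kernel code would use unsigned long and BITS_PER_LONG.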
It's also pretty cheap to look up the kaddr from the pfn, at least on
64-bit architectures without cache aliasing problems:
+static void *dax_map_pfn(pfn_t pfn, unsigned long index)
+{
+	return pfn_to_kaddr(pfn_t_to_pfn(pfn));
+}
pfn_to_kaddr() assumes persistent memory is direct-mapped, which is not
always the case.
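For readers unfamiliar with the direct-map assumption, here is a rough userspace model of it. The struct, the function names and the example base address are all hypothetical; the only thing taken from the kernel is the general shape of the conversion, a fixed virtual base plus the PFN shifted by the page size, and 4KB pages are assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4KB pages */

/* Hypothetical model of a kernel direct map (linear mapping). */
struct direct_map {
    uint64_t virt_base; /* virtual address that maps physical address 0 */
    uint64_t nr_pfns;   /* number of page frames the mapping covers */
};

/* The cheap conversion: valid only for PFNs inside the direct map. */
static uint64_t sketch_pfn_to_kaddr(const struct direct_map *dm, uint64_t pfn)
{
    return dm->virt_base + (pfn << SKETCH_PAGE_SHIFT);
}

/* A PFN outside the covered range has no address in this mapping. */
static int sketch_pfn_is_direct_mapped(const struct direct_map *dm,
                                       uint64_t pfn)
{
    return pfn < dm->nr_pfns;
}
```

In this model, a persistent-memory range whose PFNs lie beyond nr_pfns would still get a number back from the arithmetic, but that number would not point at the memory, which is the failure mode being described.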