[ adding KASAN devs...]
On Mon, Jun 4, 2018 at 4:40 PM, Dan Williams <dan.j.williams(a)intel.com> wrote:
On Sun, Jun 3, 2018 at 6:48 PM, Dan Williams
<dan.j.williams(a)intel.com> wrote:
> On Sun, Jun 3, 2018 at 5:25 PM, Dave Chinner <david(a)fromorbit.com> wrote:
>> On Mon, Jun 04, 2018 at 08:20:38AM +1000, Dave Chinner wrote:
>>> On Thu, May 31, 2018 at 09:02:52PM -0700, Dan Williams wrote:
>>> > On Thu, May 31, 2018 at 7:24 PM, Dave Chinner
>>> > <david(a)fromorbit.com> wrote:
>>> > > On Thu, May 31, 2018 at 06:57:33PM -0700, Dan Williams wrote:
>>> > >> > FWIW, XFS+DAX used to just work on this setup (I hadn't
>>> > >> > installed ndctl until this morning!) but after changing
>>> > >> > the config it no longer works. That would make it a
>>> > >> > regression, yes?
>>> > >> I suspect your kernel does not have CONFIG_ZONE_DEVICE enabled,
>>> > >> which has the following dependencies:
>>> > >>
>>> > >> depends on MEMORY_HOTPLUG
>>> > >> depends on MEMORY_HOTREMOVE
>>> > >> depends on SPARSEMEM_VMEMMAP
>>> > >
>>> > > Filesystem DAX now has a dependency on memory hotplug?
>>> > > OK, works now I've found the magic config incantations to
>>> > > turn on everything I now need.
>>> By enabling these options, my test VM now has a ~30s pause in the
>>> boot very soon after the nvdimm subsystem is initialised.
>>> [ 1.523718] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
>>> [ 1.550353] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
>>> [ 1.552175] Non-volatile memory driver v1.3
>>> [ 2.332045] tsc: Refined TSC clocksource calibration: 2199.909 MHz
>>> [ 2.333280] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1fb5dcd4620, max_idle_ns: 440795264143 ns
>>> [ 37.217453] brd: module loaded
>>> [ 37.225423] loop: module loaded
>>> [ 37.228441] virtio_blk virtio2: [vda] 10485760 512-byte logical blocks (5.37 GB/5.00 GiB)
>>> [ 37.245418] virtio_blk virtio3: [vdb] 146800640 512-byte logical blocks (75.2 GB/70.0 GiB)
>>> [ 37.255794] virtio_blk virtio4: [vdc] 1073741824000 512-byte logical blocks (550 TB/500 TiB)
>>> [ 37.265403] nd_pmem namespace1.0: unable to guarantee persistence of writes
>>> [ 37.265618] nd_pmem namespace0.0: unable to guarantee persistence of writes
>>> The system does not appear to be consuming CPU, but it is blocking
>>> NMIs so I can't get a CPU trace. For a VM that I rely on booting in
>>> a few seconds because I reboot it tens of times a day, this is a
>>> problem.
>> And when I turn on KASAN, the kernel fails to boot to a login prompt.
> What's your qemu and kernel command line? I'll take look at this first
> thing tomorrow.
I was able to reproduce this crash by just turning on KASAN...
investigating. It would still help to have your config; for our own
regression testing purposes it makes sense for us to prioritize
"Dave's test config", similar to the priority of not breaking Linus'
laptop.
I believe this is a bug in KASAN, or a bug in devm_memremap_pages(),
depending on your point of view. At the very least it is a mismatch of
assumptions. KASAN learns of hot added memory via the memory hotplug
notifier. However, the devm_memremap_pages() implementation is
intentionally limited to the "first half" of the memory hotplug
procedure. I.e. it does just enough to set up the linear map for
pfn_to_page() and initialize the "struct page" memmap, but then stops
short of onlining the pages. This is why we are getting a NULL ptr
deref and not a KASAN report: KASAN has no shadow area set up
for the linearly mapped pmem range.
In terms of solving it we could refactor kasan_mem_notifier() so that
devm_memremap_pages() can call it outside of the notifier... I'll give
this a shot.