On Tue, 8 May 2018 13:13:40 -0600
Logan Gunthorpe <logang(a)deltatee.com> wrote:
> On 08/05/18 10:50 AM, Christian König wrote:
> > E.g. transactions are initially sent to the root complex for
> > translation, that's for sure. But at least for AMD GPUs the root complex
> > answers with the translated address which is then cached in the device.
> > So further transactions for the same address range then go directly to
> > the destination.
>
> Sounds like you are referring to Address Translation Services (ATS).
> This is quite separate from ACS and, to my knowledge, isn't widely
> supported by switch hardware.

They are not so unrelated. See the ACS Direct Translated P2P
capability, which in fact must be implemented by switch downstream
ports implementing ACS and works specifically with ATS. This appears to
be the way the PCI SIG intends P2P to occur within an IOMMU-managed
topology: pre-translated DMA is routed directly between peer
devices, while non-translated requests must bounce through the
IOMMU. Really, what's the value of having an I/O virtual address space
provided by an IOMMU if we're going to allow physical DMA between
downstream devices? Couldn't we just turn off the IOMMU altogether? Of
course ATS is not without holes itself, chiefly that we implicitly trust
the endpoint's implementation of ATS. Thanks,
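
For illustration only, the routing described above can be modelled roughly
as follows. This is a simplified sketch, not kernel code: the bit values
match the PCI_ACS_* definitions in Linux's include/uapi/linux/pci_regs.h,
but route_p2p_request() is a hypothetical helper, and it deliberately
ignores Translation Blocking, Egress Control and the other ACS controls.

```python
# ACS capability/control register bits, as defined in Linux's
# include/uapi/linux/pci_regs.h.
PCI_ACS_SV = 0x01  # Source Validation
PCI_ACS_TB = 0x02  # Translation Blocking
PCI_ACS_RR = 0x04  # P2P Request Redirect
PCI_ACS_CR = 0x08  # P2P Completion Redirect
PCI_ACS_UF = 0x10  # Upstream Forwarding
PCI_ACS_EC = 0x20  # P2P Egress Control
PCI_ACS_DT = 0x40  # Direct Translated P2P


def route_p2p_request(acs_ctrl: int, translated: bool) -> str:
    """Hypothetical, simplified model of a switch downstream port.

    Returns "peer" if a peer-to-peer memory request is routed directly
    to the peer device, or "upstream" if it is redirected towards the
    root complex (and hence the IOMMU).
    """
    if not acs_ctrl & PCI_ACS_RR:
        # Without Request Redirect, P2P requests are routed directly.
        return "peer"
    if translated and acs_ctrl & PCI_ACS_DT:
        # Direct Translated P2P: requests already translated via ATS
        # bypass the redirect and go straight to the peer.
        return "peer"
    # Everything else bounces through the root complex.
    return "upstream"


if __name__ == "__main__":
    ctrl = PCI_ACS_RR | PCI_ACS_DT
    print(route_p2p_request(ctrl, translated=True))
    print(route_p2p_request(ctrl, translated=False))
```

With Request Redirect and Direct Translated P2P both enabled, only
requests the endpoint marks as translated skip the IOMMU, which is
exactly where the implicit trust in the endpoint's ATS implementation
comes in.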