
This is not specific to a dGPU; it could apply to any PCIe device. Emphasis on "theoretically" too.

On the device (dGPU here), it is possible to route memory accesses to part of the internal address space to the PCIe controller. In turn, the PCIe controller can translate such a memory access into a PCIe request (read or write) in the separate PCIe address space, applying some address translation.
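
As a rough illustration of what that outbound translation amounts to (all names and the window layout are made up; real controllers do this in hardware via window/ATU-style registers):

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical outbound window: device-internal addresses in
     * [local_base, local_base + size) are forwarded to the PCIe
     * controller and re-based into the PCIe address space. */
    struct outbound_window {
        uint64_t local_base;   /* device-internal address of the window */
        uint64_t pcie_base;    /* PCIe address the window maps to       */
        uint64_t size;
    };

    /* Returns true and fills *pcie_addr if the access hits the window. */
    static bool translate_outbound(const struct outbound_window *w,
                                   uint64_t local_addr, uint64_t *pcie_addr)
    {
        if (local_addr < w->local_base ||
            local_addr >= w->local_base + w->size)
            return false;                   /* not routed to PCIe */
        *pcie_addr = w->pcie_base + (local_addr - w->local_base);
        return true;
    }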

This PCIe request goes to the PCIe host (the CPU in a dGPU scenario). Here too, the host PCIe controller can map the PCIe request, addressed in the PCIe address space, into the host address space. From there it can reach host memory (usually after IOMMU filtering and address translation). For a read, all of this happens again in reverse for the return trip to the device.
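
On the host side, a Linux driver would typically set up that mapping (including the IOMMU entry) through the DMA API; a minimal sketch, assuming a kernel driver context:

    #include <linux/dma-mapping.h>

    /* Sketch: hand the device an address it can target in its PCIe
     * requests. With an IOMMU, the returned dma_addr is an IOVA that
     * the IOMMU translates and filters; without one it is typically
     * just the physical address of the buffer. */
    static int expose_buffer_to_device(struct device *dev, void *host_buf,
                                       size_t len, dma_addr_t *out)
    {
        dma_addr_t dma_addr = dma_map_single(dev, host_buf, len,
                                             DMA_BIDIRECTIONAL);
        if (dma_mapping_error(dev, dma_addr))
            return -ENOMEM;    /* device must not touch host_buf */

        *out = dma_addr;       /* program this address into the device */
        return 0;
    }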

So latency would be rather high, but it is technically possible. In most applications such transfers are offloaded to a DMA engine in the PCIe controller, which copies between the PCIe and local address spaces, but a processing core can certainly do a direct access without DMA if all the address mappings are suitably configured.
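
From the core's point of view, such a direct access is then just a load or store into the mapped window; a toy device-firmware-style sketch with a made-up window address:

    #include <stdint.h>

    /* Made-up device-internal address where the outbound PCIe window
     * is mapped; a load/store here is forwarded to the PCIe controller,
     * translated, and ends up in host memory. */
    #define PCIE_WINDOW_BASE  0x80000000UL

    static inline uint32_t host_read32(uint64_t offset)
    {
        volatile uint32_t *p =
            (volatile uint32_t *)(uintptr_t)(PCIE_WINDOW_BASE + offset);
        return *p;   /* each read stalls for a full PCIe round trip */
    }

    static inline void host_write32(uint64_t offset, uint32_t val)
    {
        volatile uint32_t *p =
            (volatile uint32_t *)(uintptr_t)(PCIE_WINDOW_BASE + offset);
        *p = val;    /* posted write: no wait for host completion */
    }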



Uuuuuh, ok, but... what's the point of doing so? If I do zero-copy on a shared memory area between CPU and GPU, the advantage is clear: no copy and fast transfer.

If I map some host memory to the GPU… I get worse latency and worse bandwidth. Most likely not a win.


That's why the author says "theoretically", I guess ;) Yes, in practice you probably wouldn't want your GPU compute engines to do such direct accesses and stall for a long time on each one, even for one-shot streaming processing. Even to avoid using the GPU's main memory, one would more likely use DMA copies to a local working memory and process the data there in chunks. But the direct mapping can still be convenient: a local DMA engine (or any HW coprocessor) can access host or GPU memory in the same way.
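
A rough sketch of that chunked approach, with placeholder dma_copy_from_host/process_chunk hooks standing in for the real DMA engine programming and compute kernel:

    #include <stddef.h>
    #include <stdint.h>

    /* Placeholder hooks: a real implementation would program the PCIe
     * controller's DMA engine and launch the compute work. */
    void dma_copy_from_host(void *local_dst, uint64_t host_pcie_addr, size_t len);
    void process_chunk(const void *local_buf, size_t len);

    #define CHUNK_SIZE 4096

    /* Stream a host buffer through a small local working buffer: bulk
     * transfers are done by the DMA engine, and the core only touches
     * fast local memory. (A real pipeline would double-buffer so the
     * DMA and the processing overlap.) */
    void stream_from_host(uint64_t host_pcie_addr, size_t total,
                          void *local_buf)
    {
        for (size_t off = 0; off < total; off += CHUNK_SIZE) {
            size_t len = (total - off < CHUNK_SIZE) ? total - off : CHUNK_SIZE;
            dma_copy_from_host(local_buf, host_pcie_addr + off, len);
            process_chunk(local_buf, len);
        }
    }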



