iGPUs are typically weak and/or can't run the LLM at all, so the CPU is used instead. You can run things this way, but it's not fast, and it gets slower as models go up in size.
If you want things to run quickly, then aside from Macs, there's the 2025 ASUS ROG Flow Z13, which (afaik) is the only laptop with AMD's new Ryzen AI Max+ 395 processor. It's powerful and has up to 128GB of RAM that can be shared with the GPU, but these machines are very rare (and Mac-expensive) at the moment.
The other variable for running LLMs quickly is memory bandwidth: the Max+ 395 has 256GB/s, which is similar to the M4 Pro; the M4 Max chips are considerably higher. Apple fell on their feet on this one.
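To give a rough sense of why bandwidth matters so much: token generation is largely memory-bound, since every generated token means streaming the full set of weights out of RAM. Here's a back-of-envelope sketch; the M4 bandwidth figures and the 40GB model size are approximate assumptions on my part, not benchmarks, and real-world speeds will be lower.

```python
# Rough ceiling on decode speed: memory bandwidth / bytes read per token.
# Assumes the whole model is streamed from RAM for each token (typical
# for dense models); ignores compute, KV cache, and other overheads.

def tokens_per_sec_ceiling(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

# A 70B model at 4-bit quantisation is roughly 40GB of weights (assumption).
model_gb = 40

for name, bw in [("Ryzen AI Max+ 395 (~256 GB/s)", 256),
                 ("M4 Pro (~273 GB/s)", 273),
                 ("M4 Max (~546 GB/s)", 546)]:
    print(f"{name}: ~{tokens_per_sec_ceiling(bw, model_gb):.0f} tok/s ceiling")
```

So roughly 6-7 tok/s on the Max+ 395 or M4 Pro versus ~13 tok/s on an M4 Max for a model that size, before any other bottlenecks kick in.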