A higher cost for context switching. Lots of people work on apps with lots of I/O, and there is a long history of coroutine/callback/green-thread architectures beating the pants off of thread-per-request architectures.
No, the context-switching overhead tends to be minimal if you're using a well-tuned kernel. You're already doing a context switch into the kernel for the I/O in the first place, and in a green-thread model you then have to do a userspace context switch, on top of the one the kernel imposes, to get back into your scheduler.
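To make the accounting concrete, here's a rough sketch that just counts the two kinds of transitions. It uses Python generators as a stand-in for green threads, which is an assumption on my part, and the names `worker`, `run`, and `demo` are made up for illustration. It does not measure cost at all; it only shows that every read still enters the kernel, and each yield back to the scheduler is an extra userspace switch on top of that.

```python
import os
from collections import deque


def worker(fd, stats):
    """Hypothetical green thread: read one byte at a time, yielding after each read."""
    while True:
        stats["syscalls"] += 1
        chunk = os.read(fd, 1)  # kernel transition for the I/O itself
        if not chunk:           # EOF: the write end has been closed
            return
        yield                   # extra userspace switch back to the scheduler


def run(tasks, stats):
    """Toy round-robin scheduler; counts each time a task yields back to it."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue            # task finished, drop it
        stats["user_switches"] += 1
        ready.append(task)


def demo():
    stats = {"syscalls": 0, "user_switches": 0}
    read_fds = []
    for _ in range(2):
        r, w = os.pipe()
        os.write(w, b"abc")     # three bytes of pending input per task
        os.close(w)             # closing the write end lets the reader see EOF
        read_fds.append(r)
    run([worker(fd, stats) for fd in read_fds], stats)
    for fd in read_fds:
        os.close(fd)
    return stats


if __name__ == "__main__":
    print(demo())
```

With two tasks and three bytes each, this should report 8 reads (three data reads plus one EOF read per task) and 6 yields back to the scheduler. Whether those extra userspace switches matter compared with kernel thread switches is exactly what's being argued here; the sketch only shows where they come from.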