
The VRAM, for example, limits what local LLMs you can run. A rough rule of thumb: 1 GB of VRAM can handle about 1B parameters, which corresponds to roughly one byte per parameter (i.e. 8-bit quantization), not counting overhead for the KV cache and activations. There are ways to work around the limit (offloading layers to CPU, heavier quantization), but the easiest path to good performance is simply to have enough VRAM.
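The rule of thumb above can be sketched as a back-of-the-envelope calculation. This is a hypothetical helper, not from any library; it estimates memory for the weights alone, assuming the stated bits-per-parameter and ignoring KV cache and framework overhead:

```python
def estimate_vram_gb(params_billion: float, bits_per_param: int = 8) -> float:
    """Rough VRAM needed for model weights alone, in decimal GB.

    bits_per_param: 16 for fp16, 8 for 8-bit quantization, 4 for 4-bit.
    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 1e9

# At 8 bits per parameter, a 7B model needs about 7 GB for weights:
print(estimate_vram_gb(7, 8))    # 7.0
# At fp16 the same model needs roughly double:
print(estimate_vram_gb(7, 16))   # 14.0
```

This makes the "1 GB per 1B parameters" heuristic concrete: it is really a statement about 8-bit weights, and halving the precision (to 4-bit) roughly halves the requirement at some quality cost.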


If running local LLMs were a thing I ever wanted to do, I'm sure that would be something that mattered to me.

