Glad you're working on this. Duolingo is garbage and I've been hopeful that AI can help accelerate language learning in a way that is actually effective.
Check the model's RAM requirements (these can be hard to find, though) and compare them to the RAM you actually have available: VRAM if you're running on GPU, system RAM if running on CPU.
E.g., this extended-context Llama 3 70B requires 64GB at 256K context and over 100GB at 1M.
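When the requirements aren't published, you can get a rough estimate yourself: weight memory is roughly parameter count times bytes per weight (0.5 bytes at 4-bit quantization, 2 at fp16), plus a KV cache that grows linearly with context length. Here's a back-of-envelope sketch in Python; the architecture numbers used in the example (80 layers, 8 KV heads, head dim 128 for Llama 3 70B) match the published config, but treat the output as a ballpark, not a guarantee — real runtimes add overhead for activations and buffers.

```python
def estimate_mem_gib(params_billions, bytes_per_weight,
                     n_layers, n_kv_heads, head_dim,
                     ctx_len, kv_bytes=2):
    """Rough memory estimate for running an LLM: weights + KV cache.

    Ignores activation memory and runtime overhead, so real usage
    will be somewhat higher.
    """
    # Weight memory: one entry per parameter.
    weights = params_billions * 1e9 * bytes_per_weight
    # KV cache: 2 (K and V) per layer, per KV head, per head dim,
    # per token of context, at kv_bytes precision (2 = fp16).
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes
    return (weights + kv_cache) / 2**30

# Llama 3 70B at 4-bit weights, 8K context, fp16 KV cache:
# ~33 GiB of weights plus ~2.5 GiB of KV cache.
print(estimate_mem_gib(70, 0.5, n_layers=80, n_kv_heads=8,
                       head_dim=128, ctx_len=8192))
```

Running the same estimate at 256K or 1M context shows why the KV cache dominates at long contexts: it scales with `ctx_len` while the weights stay fixed.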