It's good that you make a distinction between success rate and safety rate.
However, the safety rate doesn't need to be 100%. If you can keep the failure probability as low as the probability of being hit by a random asteroid falling from the sky, that's good enough.
TDMPC could work for one of the downstream networks, perhaps, but the upstream one will have to be some sort of reasoning model if you want it to be general-purpose w.r.t. babies.
Also VLMs/LAMs aren't going to cut it. You're going to need something like TDMPC.