
Counter-argument: having your entire world-wide deployment operate under a single control plane is a recipe for global outages. There should be no single command that one can fat-finger that will bring down your system globally.

One-cluster-per-region (with each region tied to being its own failure domain, at both the underlying infrastructure and application level) is the way to go for reliability.



The model put forth in TFA seems to address this in that pods (or “sub clusters”) can run for weeks at a time without communication with the top-level cluster. It’s pretty hand-wavy and probably can’t solve for all possible outage scenarios, but it seems like it would help dramatically.



