
My guess is that the main feature which enables this kind of automation is that they can take down any node without consequences. So they can just install an update on all the machines, and then reboot/restart the software on the machines sequentially. If you have implemented redundancy correctly, then software updating becomes simple.
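A minimal sketch of the sequence guessed at here: install the update everywhere, then restart one node at a time so that capacity never drops below a floor. The `Node` class, the `rolling_update` function and the `min_in_rotation` parameter are hypothetical names chosen for illustration and simulate the process in memory; nothing here describes the actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str = "v1"        # version currently running
    installed: str = "v1"      # version on disk, picked up at the next restart
    in_rotation: bool = True   # whether the load balancer sends traffic here


def rolling_update(nodes, new_version, min_in_rotation):
    """Install on every node, then restart the nodes one at a time."""
    # Step 1: installing does not disturb the running process.
    for node in nodes:
        node.installed = new_version

    # Step 2: restart sequentially so capacity never drops below the floor.
    lowest_capacity = len(nodes)
    for node in nodes:
        node.in_rotation = False                   # take this node out
        serving = sum(n.in_rotation for n in nodes)
        if serving < min_in_rotation:
            node.in_rotation = True                # abort, put it back
            raise RuntimeError("not enough redundancy to restart " + node.name)
        lowest_capacity = min(lowest_capacity, serving)
        node.version = node.installed              # simulated restart
        node.in_rotation = True                    # back into rotation
    return lowest_capacity


cluster = [Node(f"node-{i}") for i in range(4)]
lowest = rolling_update(cluster, "v2", min_in_rotation=3)
```

With four nodes and a floor of three, only one node is ever out of rotation, which is the "redundancy implemented correctly" condition in code form.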


We actually update each machine while it is serving live traffic, with no downtime.

We start a new instance of the server, warm it up (pre-load popular Workers), then move all new requests over to the new instance, while allowing the old instance to complete any requests that are in-flight.

Having fewer moving parts makes it really easy to push an update at any time. :)
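The handoff described above (start a new instance, warm it up, route new requests to it, let the old one finish its in-flight requests) can be sketched roughly as follows. This is an in-memory simulation under stated assumptions: `Instance`, `Machine`, `upgrade` and `reap` are hypothetical names, not the real server's API, and this sketch waits indefinitely for the old instance to drain.

```python
class Instance:
    def __init__(self, name):
        self.name = name
        self.warm = set()      # Workers already loaded in this instance
        self.in_flight = 0     # requests started but not yet finished

    def warm_up(self, popular_workers):
        self.warm.update(popular_workers)

    def begin(self):
        self.in_flight += 1

    def finish(self):
        self.in_flight -= 1


class Machine:
    def __init__(self, instance):
        self.active = instance   # receives all new requests
        self.draining = []       # old instances finishing in-flight work

    def start_request(self):
        inst = self.active
        inst.begin()
        return inst              # the request completes on this same instance

    def upgrade(self, new_instance, popular_workers):
        new_instance.warm_up(popular_workers)   # warm before taking traffic
        old, self.active = self.active, new_instance
        self.draining.append(old)

    def reap(self):
        # Drop old instances only once nothing is in flight on them.
        self.draining = [i for i in self.draining if i.in_flight > 0]


machine = Machine(Instance("old"))
slow = machine.start_request()              # still in flight during the upgrade
machine.upgrade(Instance("new"), ["worker-a", "worker-b"])
fresh = machine.start_request()             # lands on the new instance
machine.reap()
still_draining = len(machine.draining)      # old instance is kept alive
slow.finish()
machine.reap()
drained = len(machine.draining)             # old instance can now be shut down
```

The point of the design is that the switch only affects where new requests go; a request that began on the old instance is never moved.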


What happens if you have long-running tasks in the Worker?



