It depends on the application. For Netflix, transient glitches like having outdated play history or catalog entries that take a while to propagate aren't hugely detrimental -- the core value of serving the videos that users request doesn't depend at all on having a consistent view of quickly-changing user data.
For more complex applications (think Facebook), there are useful consistency models other than strong consistency, with causal consistency being one of the most promising: http://queue.acm.org/detail.cfm?id=2610533
Do you know whether the linked article is what Facebook is currently using, or a proposal? Facebook is an example of a webapp that fairly often has user-visible weird behavior, like "read" notifications becoming unread again. But I'm not sure if those are glitches, or a result of their consistency model.
Embrace idempotence. Being able to perform an operation multiple times with the same result adds a lot of flexibility.
For example, using the Rails TODO list tutorial: if you add a task to "buy milk" and then on another client view the list and don't see "buy milk" there (because consistency hasn't been reached yet), you might be inclined to think you forgot to save it or something. Then you might enter it again. If adding is not idempotent (as is the case in the classic tutorial), you will end up with two "buy milk" tasks instead of just one.
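To make that concrete, here's a rough sketch of an idempotent "add task" operation (the names and the in-memory "store" are made up, not from the actual Rails tutorial): the client generates the task id once and reuses it on retries, so resubmitting can't create a duplicate.

    # Hypothetical idempotent "add task": the client supplies a stable task id
    # (generated once when the form is first submitted), so replaying the same
    # request cannot create duplicates.
    import uuid

    tasks = {}  # stand-in for the datastore: task_id -> description

    def add_task(task_id: str, description: str) -> None:
        # Writing the same (task_id, description) pair twice is a no-op,
        # which is what makes the operation idempotent.
        tasks[task_id] = description

    # The client generates the id once and reuses it on every retry.
    milk_id = str(uuid.uuid4())
    add_task(milk_id, "buy milk")
    add_task(milk_id, "buy milk")   # retry after a timeout: still one task
    assert list(tasks.values()) == ["buy milk"]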
This is actually great advice - but not all operations can be made idempotent. Adding and subtracting money from bank accounts comes to mind (although you can make the operation "write transaction to some log" idempotent, you can't perform the actual computations in an idempotent fashion).
You handle this by modeling the account as a ledger with immutable entries containing unique identifiers rather than a single mutable balance. The ledger entries are facts. The balance is a derived computation. Only store facts, not derived computations.
(At some point, for efficiency reasons, you might need to prune the transactions, but this often involves trade-offs like not accepting any new "lost" transactions older than date X.)
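A minimal sketch of the ledger idea, assuming a simple key-value store and made-up transaction ids: entries are immutable facts keyed by a unique id, the balance is derived by folding over them, and replaying a transaction is harmless.

    # "Store facts, derive the balance": each ledger entry carries a unique id,
    # so a duplicated delivery of the same transaction changes nothing.
    ledger = {}  # entry_id -> amount (positive = credit, negative = debit)

    def record(entry_id: str, amount: int) -> None:
        # Idempotent: writing the same entry twice leaves the ledger unchanged.
        ledger[entry_id] = amount

    def balance() -> int:
        # The balance is a derived computation over the immutable facts.
        return sum(ledger.values())

    record("txn-001", +100)
    record("txn-002", -30)
    record("txn-001", +100)   # duplicate delivery of the same transaction
    assert balance() == 70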
Different applications could use different schemas; you just have to keep eventual consistency in mind.
i.e. event sourcing, or just treating everything as a log that gets read and merged (think of it as a CRDT), should be able to handle those kinds of scenarios. That use case matches Cassandra quite well with its wide rows.
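For illustration, this is roughly what the "log that gets read and merged" idea looks like as a grow-only set (about the simplest CRDT there is); the entries here are made up:

    # Merging two replicas' logs as a grow-only set: merge is a union, so it is
    # commutative, associative and idempotent -- replicas converge no matter
    # how often or in what order they exchange state.
    replica_a = {("txn-001", +100), ("txn-002", -30)}
    replica_b = {("txn-002", -30), ("txn-003", +5)}

    def merge(a: set, b: set) -> set:
        return a | b

    assert merge(replica_a, replica_b) == merge(replica_b, replica_a)
    assert merge(replica_a, replica_a) == replica_a   # idempotent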
But now the user is burdened with having to be explicit.
This is exactly the type of case where I'd much rather have the app be explicit about its actions: allow both to be added, and at most show a notice after synchronisation saying "I've noticed you've added two 'buy milk' tasks; merge them or leave them separate?".
I've had nothing but bad experiences with apps that think they know best and try to merge records behind my back.
The idea behind idempotence is that you can apply an operation multiple times, in any order, and still come up with the same result. http://en.wikipedia.org/wiki/Idempotence
If you say "Buy_First_Milk" and "Buy_Second_Milk", and the order is reversed so that "Buy_Second_Milk" is applied first and "Buy_First_Milk" is applied twice, everything still happens as desired.
Being explicit is what allows everything to work without confusion.
That's the point I'm trying to get across - you know the user's intent if they are explicit. If the user enters "Buy Milk" - you don't know their intent. But, if they enter "Buy First Milk" and then "Buy Second Milk" - you know precisely what they want.
From a user interface perspective, if the user ever enters "Buy Milk" and there is no existing milk, that is always "Buy First Milk". If they go to a different client, and the milk order is there, and they enter "Buy Milk", that then becomes "Buy Second Milk". But if "Buy Milk" hasn't synced yet, they can explicitly flag it as "Second Milk Order".
Right, you are shifting the burden of merging conflicts to the user. That may be the best approach, but you are giving up on achieving consistency purely through tech.
No, the app can manage this in most cases - if you enter "buy milk" twice in the same browser session, it will know how to construct the keys properly. If you buy milk once in two different sessions that are not yet consistent, each session would show only one item in the list, so adding another in either or both sessions will still yield correct results.
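One way the key construction could work (purely illustrative, not how any particular app does it): the key encodes the item name plus the next ordinal the client can currently see, so two offline sessions that each add a first "buy milk" converge to one entry, while a deliberate second add in a session that already sees it becomes a new key.

    # Deterministic key construction: name + next visible ordinal.
    def next_key(visible_tasks: dict, name: str) -> str:
        ordinal = sum(1 for k in visible_tasks if k.startswith(name + "#")) + 1
        return f"{name}#{ordinal}"

    tasks_a, tasks_b = {}, {}           # two not-yet-consistent sessions

    tasks_a[next_key(tasks_a, "buy milk")] = "buy milk"   # "buy milk#1"
    tasks_b[next_key(tasks_b, "buy milk")] = "buy milk"   # also "buy milk#1"

    merged = {**tasks_a, **tasks_b}     # after sync: same key, single entry
    assert list(merged) == ["buy milk#1"]

    # In a session that already sees "buy milk#1", another add is explicit:
    merged[next_key(merged, "buy milk")] = "buy milk"     # "buy milk#2"
    assert sorted(merged) == ["buy milk#1", "buy milk#2"]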
Netflix has a couple of talks about this, but it mostly comes down to the fact that, for their specific use case, their users don't care about consistency.
For example, you might not entirely care whether your viewing history is 100% up to date, or that the last video you watched isn't showing yet. And for the times you do care, most users will simply refresh the page (and since eventual consistency in almost all cases means resolution in seconds rather than instantly, most users will see the correct info after a refresh).
I can't find them right now, but Christos Kalantzis (@chriskalan) has a number of talks on using Cassandra at Netflix w.r.t eventual consistency.
I assume: serving content that doesn't change often (videos) and mostly needs to be available, and recording user scores and aggregating their effect later (last-write-wins means you could update the score a second time and only the first update would be recorded - annoying, but not end-of-the-world).
Likely if they have things that require stronger consistency they use another DB.
...seriously, though. All of our views are out of date by the time we see them, and all user input already has to be handled with races and double-submits in mind. Eventual consistency is just the patterns you already know, but bigger.
Eventual consistency is much more complicated than that. Patterns for dealing with nontrivial logical operations (i.e. anything that is bigger than a single 'object') are much easier on consistent DBs than eventually consistent ones. On typical RDBMSs you maintain a per-object version number and foreign keys (possibly also a per-object checksum if you're using more relaxed isolation modes). That's generally enough to prevent races, double submits, and other concurrency artifacts over multi-object operations, and it's pretty easy to implement.
The same is not true for eventually consistent systems - checking against a version number can't save you unless all your updates are trivial, single-object ones: some of your updates may succeed while others fail, and there's generally no way to perform an all-or-nothing operation. In general, there's no one good mechanism for maintaining eventual consistency on nontrivial operations - you have to think (hard) about it on a case by case basis.
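For reference, the RDBMS-side pattern described above looks roughly like this (a toy compare-and-set on a version number, with a dict standing in for the database; names are made up):

    # Optimistic concurrency: every object carries a version number, and an
    # update only applies if the version the caller read is still current.
    class StaleVersionError(Exception):
        pass

    store = {"order-42": {"version": 1, "status": "pending"}}

    def update(key: str, expected_version: int, **changes) -> None:
        row = store[key]
        if row["version"] != expected_version:
            # Someone else won the race; the caller must re-read and retry,
            # which also makes accidental double submits harmless.
            raise StaleVersionError(key)
        row.update(changes)
        row["version"] += 1

    update("order-42", expected_version=1, status="paid")
    try:
        update("order-42", expected_version=1, status="paid")  # double submit
    except StaleVersionError:
        pass  # rejected: the row moved to version 2 after the first write
    assert store["order-42"] == {"version": 2, "status": "paid"}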
Are any of these anti-chaos tools open source or shared with the community? Would love to see more companies that I'm dependent on have this sort of testing and robustness...
HVM is more performant than PV in most situations with modern virtualization technology.
PV drivers on HVM have been a thing for a while now, so your IO and network go through PV even with an HVM instance. SRIOV/"Enhanced Networking" is even better than PV networking drivers, so HVM has another win there if you have NICs that support it (The larger Amazon instances do)
PV is significantly slower with system calls than HVM on 64-bit hardware, because the x86_64 spec removed two of the four CPU protection rings, one of which was where PV lived on 32-bit architectures, below the guest, allowing it to "intercept" these system calls. Now that it shares a ring with the guest, it cannot intercept them, and you are left with double the number of context switches you had previously. Virtualization extensions from Intel and AMD allow HVM to bypass this and skip the context switch. Other hardware advantages like EPT also give HVM an edge when it comes to memory-related performance.
About the only area where PV is still better from a performance standpoint is interrupts/timers.
Amazon rebooted lots of PV guests. Presumably they collocate HVM and PV guests on the same box. If there were any HVM guests on the box, then there could be the possibility of an attack. (I guess they could forcibly kick off the HVM guests, but that wouldn't be very nice.)
HVM, at least in the past, had a bunch more code that the guest DomU interacts with vs. fully PV guests. This has security implications.
Now, my knowledge of HVM is a few years... or more like half a decade out of date. For example, I don't even know how to force an HVM guest to only use PV drivers (which would solve 90% of the problem), and I know that more and more of this has moved into hardware, so it's possible that what was true five years ago is not true now. But... yeah, I don't let untrusted users on HVM guests, for the same reason I don't let untrusted users use pygrub or load untrusted kernels directly.
HVM guests at Amazon will default to using PV drivers for IO and networking. (Unless using SRIOV/"Enhanced Networking", which will not use the PV drivers)
PVH is actually PV on top of an HVM container and is a bit different. You can think of it as PV sitting on top of enough HVM bits to take advantage of the hardware extensions Intel and AMD have invested so heavily in while still being majority PV. This gives you the best of both worlds, including the remaining PV performance benefits related to interrupts and timers that PV drivers on HVM can't utilize.
Correct me if I'm wrong, but I don't think this is the case. In fact, I'm running PV on m3.medium instances (EDIT: and c3.large) right now. It depends on the AMI you use to launch the instance: some instance types only support HVM (e.g. t2.* or r3.*) or only PV (old-gen t1.* or m1.*), but some current-gen types, like m3.* and c3.*, support both. See the Amazon Linux AMI availability matrix for example: http://aws.amazon.com/amazon-linux-ami/instance-type-matrix/
Inside a single data center they're rare, but when running on AWS in multiple availability zones and regions they're a fact of life that you should be designing your systems to deal with. Given that, it would be great to have automated tooling to simulate them and keep everyone on their toes.
Ever lost a switch? Rare, but it happens regularly enough that you should prepare for it. Even if you use bonding/teaming with dual-power-supply network kit and servers, it can still happen.
Yeah, there was an excellent ACM article with him and Bailis. I think if the network starts to partition, or fail, in a datacenter, that's the time to evacuate the datacenter / AZ. If a handful of machines fail, they should be disengaged. If more than, say, 5% of the machines in the DC are having reachability issues at any given point (in a modern DC that's like ~2000 machines), it's time to shut it down.
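As a toy illustration of that heuristic (the 5% threshold is the one mentioned above, everything else is made up):

    # Evacuate the AZ when reachability failures cross a fleet-wide percentage,
    # otherwise just pull the affected hosts out of rotation.
    def partition_action(unreachable: int, fleet_size: int,
                         evacuate_threshold: float = 0.05) -> str:
        if fleet_size and unreachable / fleet_size > evacuate_threshold:
            return "evacuate-az"        # widespread partition: drain the AZ
        return "disengage-hosts"        # handful of failures: isolate them

    assert partition_action(10, 40000) == "disengage-hosts"
    assert partition_action(2500, 40000) == "evacuate-az"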
Not AFAIK - last time I looked it requires one or more AWS autoscaling groups, and works by semi-randomly terminating instances that match configured criteria.