I've used Scientist for every major refactor I've done. IMO it's one of the best examples of a library that does one thing extremely well and has a clean interface.
Feeding production data into two different paths to compare them is obviously valuable. Does anyone have any good reasons why this has to happen in online code and can't happen against database replicas in offline mode?
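For anyone who hasn't seen the pattern, here is a rough Python sketch of the "run both paths and compare" idea. It is not Scientist's API (Scientist is a Ruby library); `run_experiment`, `old_allowed`, and `new_allowed` are hypothetical names made up for illustration. The key property is that the caller only ever sees the result of the existing path, so a broken candidate can't affect users.

```python
import random


def _observe(fn, args):
    """Call fn and capture either its return value or the exception it raised."""
    try:
        return {"value": fn(*args), "error": None}
    except Exception as exc:
        return {"value": None, "error": exc}


def _summary(obs):
    """Reduce an observation to something comparable with ==."""
    err = obs["error"]
    if err is not None:
        return ("raised", type(err).__name__, str(err))
    return ("ok", obs["value"])


def run_experiment(name, control, candidate, *args, report=print):
    """Run both paths, report any mismatch, and behave exactly like control."""
    paths = [("control", control), ("candidate", candidate)]
    random.shuffle(paths)  # vary the order so neither path always runs first
    observed = {label: _observe(fn, args) for label, fn in paths}

    control_summary = _summary(observed["control"])
    candidate_summary = _summary(observed["candidate"])
    if control_summary != candidate_summary:
        report(f"[{name}] mismatch: control={control_summary!r} "
               f"candidate={candidate_summary!r}")

    # The candidate's result (or exception) never reaches the caller.
    if observed["control"]["error"] is not None:
        raise observed["control"]["error"]
    return observed["control"]["value"]


# Hypothetical old and new implementations of a permission check.
def old_allowed(user):
    return user["role"] in ("admin", "editor")


def new_allowed(user):
    return user["role"] == "admin"


if __name__ == "__main__":
    run_experiment("can-edit", old_allowed, new_allowed, {"role": "editor"})
```

Both functions here are pure, so running them twice is harmless. Nothing in this sketch makes it safe to run a candidate that writes to the database.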
I always feel that it would be great to be able to do this with code that has side effects (e.g. anything that changes the database), but I've never seen a general-purpose solution for this. The README mentions using a write replica, but how do you deal with data drifting in case of bad writes?
There’s goreplay[0]. But once you start using it, you quickly find that many parts of your app use random data (such as UUIDs) or clock data (timestamps, etc.). Getting these synced between prod and replay, or ignored on replay, is quite a hassle.
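One way to do the "ignored on replay" part is to strip the non-deterministic fields from both responses before comparing them. A minimal Python sketch, assuming JSON-like responses already parsed into dicts and lists. This is application-side comparison code, not a goreplay feature, and the key names in `VOLATILE_KEYS` are hypothetical:

```python
# Hypothetical field names that legitimately differ between prod and replay.
VOLATILE_KEYS = frozenset({"id", "request_id", "created_at", "updated_at"})


def scrub(value, volatile=VOLATILE_KEYS):
    """Recursively drop volatile keys from nested dicts and lists."""
    if isinstance(value, dict):
        return {k: scrub(v, volatile) for k, v in value.items() if k not in volatile}
    if isinstance(value, (list, tuple)):
        return [scrub(v, volatile) for v in value]
    return value


def responses_match(prod, replay, volatile=VOLATILE_KEYS):
    """Compare two responses while ignoring fields expected to differ."""
    return scrub(prod, volatile) == scrub(replay, volatile)


prod = {"id": "7f3a", "total": 10, "items": [{"id": "a1", "sku": "A"}]}
replay = {"id": "c290", "total": 10, "items": [{"id": "b2", "sku": "A"}]}
same = responses_match(prod, replay)
```

The hassle doesn't go away: someone still has to maintain that list of keys, and a generated ID that is later used as a foreign key can't simply be ignored.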
How do you handle the case where you want to replace a function that makes changes to a database? Is this library aimed only at functions that don't change the system's state?
This kind of testing isn’t just good for refactors. You can also use it for rewrites or substantial architectural changes. I think GitHub did something like this while trying to make the storage format more efficient.
By the time we extracted Scientist, the code we were refactoring had pretty good test coverage. But even the best test suite is an imperfect model of production. The first section of the README briefly mentions this:
"Let's pretend you're changing the way you handle permissions in a large web app. Tests can help guide your refactoring, but you really want to compare the current and refactored behaviors under load."
Even in a well-tested codebase, some integration points are extremely hard to test, notably external services that you connect to. Rewriting these to use a different service, or just a different way of connecting to a service, is an example of where this ends up being useful.
This kind of comparison is really valuable if you don’t have tests, if your test suite doesn’t give you much confidence or is too brittle to survive the changes, or if the code is structured so that it’s really hard to test all the paths that occur in production.