Scientist: A Ruby library for carefully refactoring critical paths

jamesfzhang · on Nov 10, 2019

I've used Scientist for every major refactor I've done--IMO it's one of the best examples of a library that does one thing extremely well and has a clean interface.

Manfred · on Nov 10, 2019

Feeding production data into two different paths to compare them is obviously valuable. Does anyone have any good reasons why this has to happen in online code and can't happen against database replicas in offline mode?

I always feel that it would be great to be able to do this with code that has side effects (e.g. anything that changes the database), but I've never seen a general-purpose solution for this. The README mentions using a write replica, but how do you deal with data drifting in case of bad writes?

jauco · on Nov 10, 2019

There’s goreplay[0]. But once you start using it, you quickly find that many parts of your app use random data (such as UUIDs) or clock data (timestamps, etc.). Getting these synced between prod and replay, or ignored on replay, is quite a hassle.

[0]: https://goreplay.org/
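
One way to approach the "ignored on replay" half of that problem is to strip volatile fields from both results before comparing them. The following is a minimal sketch under stated assumptions: the key names in VOLATILE_KEYS and the helper names scrub and same_ignoring_volatile? are hypothetical and not part of goreplay or Scientist.

```ruby
# Hypothetical list of fields that legitimately differ between prod and replay.
VOLATILE_KEYS = %w[id request_id created_at updated_at].freeze

# Recursively drops volatile keys from nested hashes and arrays.
def scrub(value, volatile = VOLATILE_KEYS)
  case value
  when Hash
    value.reject { |key, _| volatile.include?(key.to_s) }
         .transform_values { |v| scrub(v, volatile) }
  when Array
    value.map { |v| scrub(v, volatile) }
  else
    value
  end
end

# True when the two payloads agree on everything except volatile fields.
def same_ignoring_volatile?(a, b)
  scrub(a) == scrub(b)
end

prod = {
  "id" => "a1", "total" => 42, "created_at" => "2019-11-10T10:00:00Z",
  "items" => [{ "id" => "x", "sku" => "A" }]
}
replay = {
  "id" => "b2", "total" => 42, "created_at" => "2019-11-10T12:00:00Z",
  "items" => [{ "id" => "y", "sku" => "A" }]
}

matches = same_ignoring_volatile?(prod, replay)
```

This only hides the differences at comparison time. If later logic branches on a generated ID or timestamp, the values would still need to be injected or synced, which is the harder part described above.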

dyeje · on Nov 10, 2019

They wrote a good blog post about using this gem back in 2016: https://github.blog/2016-02-03-scientist/

FrancoisBosun · on Nov 10, 2019

Maybe because of the large increase in complexity?

hamandcheese · on Nov 10, 2019

Another possibility might be to do the side effects in a transaction that gets rolled back?
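
A rough sketch of that idea, using a hypothetical in-memory store (TinyStore) in place of a real database so the example stays self-contained. The dry_run helper is also an assumption for illustration: it runs the candidate write path, records what it would have written, and then always rolls back.

```ruby
# Hypothetical in-memory store standing in for a real database.
class TinyStore
  class Rollback < StandardError; end

  def initialize(rows = {})
    @rows = rows
  end

  def [](key)
    @rows[key]
  end

  def []=(key, value)
    @rows[key] = value
  end

  # Deep copy of the current contents.
  def to_h
    Marshal.load(Marshal.dump(@rows))
  end

  # Yields the store; restores the snapshot if the block raises.
  # Rollback is swallowed, any other error is re-raised after restoring.
  def transaction
    snapshot = to_h
    yield self
  rescue Rollback
    @rows = snapshot
    nil
  rescue StandardError
    @rows = snapshot
    raise
  end
end

# Runs the block inside a transaction, captures the resulting state,
# then forces a rollback so nothing is persisted.
def dry_run(store)
  observed = nil
  store.transaction do |tx|
    yield tx
    observed = tx.to_h
    raise TinyStore::Rollback
  end
  observed
end

store = TinyStore.new("balance" => 100)
would_write = dry_run(store) { |tx| tx["balance"] = tx["balance"] - 30 }
```

With a real database the same shape would rely on the database's own transactions. A rollback only undoes writes to that database, so emails, queue messages, or calls to external services made by the candidate path would still go out.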

aflag · on Nov 10, 2019

How do you handle the case where you want to replace a function that makes changes to a database? Is this library aimed only at functions that don't change the system's state?

blairanderson · on Nov 10, 2019

How would you handle this case WITHOUT this library?

aflag · on Nov 11, 2019

I'm not experienced in running live experiments. So, I don't know.

tobyhinloopen · on Nov 10, 2019

Alternative title: “Scientist: A Ruby library for changing production code while still not having to write tests”

hinkley · on Nov 10, 2019

In TDD they might call these pinning tests.

This kind of testing isn’t just good for refactors. You can also use it for rewrites or substantial architectural changes. I think GitHub did something like this while trying to make the storage format more efficient.

jbarnette · on Nov 10, 2019

By the time we extracted Scientist, the code we were refactoring had pretty good test coverage. But even the best test suite is an imperfect model of production. The first section of the README briefly mentions this:

"Let's pretend you're changing the way you handle permissions in a large web app. Tests can help guide your refactoring, but you really want to compare the current and refactored behaviors under load."
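
To make the idea of comparing current and refactored behavior concrete, here is a minimal sketch of the pattern. It is not the Scientist gem's API: TinyExperiment and everything in it are hypothetical names, and the sketch assumes the simplest possible policy of running both paths, recording any disagreement, and always returning what the existing code produced.

```ruby
# Hypothetical minimal experiment runner; not the Scientist gem's API.
class TinyExperiment
  Observation = Struct.new(:value, :error, :duration)

  attr_reader :mismatches

  def initialize(name)
    @name = name
    @mismatches = []
  end

  # Runs both paths, records any disagreement, and always returns
  # (or re-raises) what the control path produced.
  def run(control:, candidate:)
    ctrl = observe(control)
    cand = observe(candidate)

    unless ctrl.value == cand.value && ctrl.error.class == cand.error.class
      @mismatches << { name: @name, control: ctrl, candidate: cand }
    end

    raise ctrl.error if ctrl.error
    ctrl.value
  end

  private

  # Calls the path, capturing its value or error plus elapsed seconds.
  def observe(callable)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    value = callable.call
    Observation.new(value, nil, elapsed(started))
  rescue StandardError => e
    Observation.new(nil, e, elapsed(started))
  end

  def elapsed(started)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

experiment = TinyExperiment.new("permissions")
result = experiment.run(
  control:   -> { [:read, :write].include?(:write) }, # existing behavior
  candidate: -> { [:read].include?(:write) }          # refactored behavior
)
```

Because a failing or disagreeing candidate never changes what the caller sees, the refactored path can run against production traffic while the recorded mismatches are reviewed separately.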

Xylakant · on Nov 10, 2019

Even in a well-tested codebase some integration points are extremely hard to test, notably external services that you connect to. Rewriting these to use a different service or just a different way of connecting to a service would be an example where this ends up being useful.

hamandcheese · on Nov 10, 2019

Which is really valuable if you don’t have tests, if your test suite doesn’t give you much confidence or is too brittle to survive the changes, or if the code is structured so that it’s really hard to test all the paths that occur in production.

neko_ranger · on Nov 10, 2019

Used this at work last week to refactor a database call to a new service call.

sentrysapper · on Nov 10, 2019

Is there a library like this for PHP?

e12e · on Nov 10, 2019

See https://github.com/github/scientist/blob/master/README.md#al...

spraak · on Nov 10, 2019

[flagged]

dang · on Nov 10, 2019

"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."

https://news.ycombinator.com/newsguidelines.html

nchase · on Nov 10, 2019

https://www.npmjs.com/package/scientist

Is it this one, or another?