While this article is mostly focused on collecting/gathering/visualizing cohort metrics, rather than on analysis, I would like to plug my college professor Peter Fader’s research on empirical Bayes modeling and analyzing Customer Lifetime Value.
One thing I learned from his class is that survival rates will naturally trend upward over time, which marketers erroneously attribute to (1) improving the product, (2) better customer service, (3) network effects / lock-in, etc.
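However, if you have a heterogeneous customer base with latently better and worse customers, your worse customers will inevitably churn before your better ones. The surviving cohort is therefore increasingly made up of better customers, which produces that apparent “decrease” in churn even if no individual customer has become any more loyal.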
These models also let you do cool things like conditional expectation: “if a customer has survived 13 months, what’s the probability they churn in the 14th?”
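To make the heterogeneity point concrete, here is a minimal sketch. It is not the model from Fader's research, just a toy with two hypothetical segments whose monthly churn probabilities never change; the shares and churn rates are made up for illustration.

```python
# Hypothetical two-segment customer base; all numbers are illustrative.
# Each segment has a CONSTANT monthly churn probability.
segments = [
    {"share": 0.5, "churn": 0.30},  # latently "worse" customers
    {"share": 0.5, "churn": 0.05},  # latently "better" customers
]


def survival(t):
    """Fraction of the original cohort still active after t months."""
    return sum(s["share"] * (1 - s["churn"]) ** t for s in segments)


def conditional_churn(t):
    """P(churn in month t + 1 | survived t months) for the blended cohort."""
    return 1 - survival(t + 1) / survival(t)


for t in (0, 1, 6, 13):
    print(f"survived {t:2d} months -> P(churn next month) = {conditional_churn(t):.3f}")
```

The blended churn probability starts at 0.175 and drifts down toward the better segment's 0.05, although nobody's individual churn probability moved. `conditional_churn(13)` is the toy answer to the "survived 13 months, churn in the 14th" question.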
This sounds a lot like the type of survival analysis mentioned in the other comment. The Kaplan-Meier estimator can be made conditional by, well, conditioning on earlier parts of the curve.
If you count not "time-to-event" but "total-revenue-to-event" you get a lifetime value estimator!
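For anyone who wants to see what that looks like, here is a rough pure-Python sketch of a Kaplan-Meier curve and the conditioning step. The function names and the eight-customer cohort are made up for illustration; in practice you would probably reach for a survival analysis library instead.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return {t: S(t)} at each time t where at least one churn was observed.

    durations[i] -- months customer i was observed
    observed[i]  -- True if customer i churned at durations[i],
                    False if still active then (right-censored)
    """
    churned = Counter(d for d, o in zip(durations, observed) if o)
    leaving = Counter(durations)  # everyone exits the risk set at their duration
    at_risk = len(durations)
    s, curve = 1.0, {}
    for t in sorted(leaving):
        if churned[t]:
            s *= 1 - churned[t] / at_risk
            curve[t] = s
        at_risk -= leaving[t]
    return curve


def conditional_survival(curve, t_from, t_to):
    """P(active past t_to | active past t_from); both must be keys of curve."""
    return curve[t_to] / curve[t_from]


# Made-up cohort of 8 customers; False marks customers still active today.
durations = [1, 2, 2, 3, 3, 4, 4, 4]
observed = [True, True, False, True, False, True, False, False]

curve = kaplan_meier(durations, observed)
print(curve)
print(conditional_survival(curve, 2, 4))
```

Customers who are still active stay in the `at_risk` denominator until their observation window ends, but they never count as churn events. Feeding cumulative revenue per customer in place of months gives the "total-revenue-to-event" variant described above, at least as a first approximation.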
One thing people often forget about retention is that the data are censored. When customers stop being active, you know it, but you don't know when the customers that are active today will stop being active. If you don't account for that ignorance, you get data that are biased one way or the other.
Another common problem, especially when analysing what effect interventions have on retention, is ignoring selection bias. Most interventions are of the form where they will be preferentially offered to or accepted by users who are either more or less satisfied than the average. This will in turn make it seem like retention is positively or negatively affected by the intervention, when really it's just selecting for user groups that were different to start out with.
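A toy illustration of the selection-bias trap, with entirely hypothetical numbers: the intervention below has no effect by construction, but satisfied users accept it far more often than unsatisfied ones.

```python
# Hypothetical numbers. By construction the intervention has NO effect:
# retention depends only on whether the user was already satisfied.
RETENTION = {"satisfied": 0.90, "unsatisfied": 0.50}

# users[(segment, treated)] = head count; satisfied users accept the offer more often.
users = {
    ("satisfied", True): 800,
    ("satisfied", False): 200,
    ("unsatisfied", True): 200,
    ("unsatisfied", False): 800,
}


def retention_rate(treated, segment=None):
    """Expected retention among users with the given treatment status."""
    groups = {
        key: n
        for key, n in users.items()
        if key[1] == treated and segment in (None, key[0])
    }
    retained = sum(n * RETENTION[seg] for (seg, _), n in groups.items())
    return retained / sum(groups.values())


# Comparing everyone treated against everyone untreated mixes the segments.
naive_lift = retention_rate(True) - retention_rate(False)

# Comparing like with like removes the difference in who got treated.
stratified_lift = {
    seg: retention_rate(True, seg) - retention_rate(False, seg)
    for seg in RETENTION
}

print(f"naive lift: {naive_lift:+.2f}")
print("lift within each segment:", stratified_lift)
```

The naive comparison shows treated users retaining at 82% versus 58% for untreated users, while within each segment the difference is zero. Real data rarely comes with a clean "satisfied" label, so this only shows the mechanism, not a fix.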
That is true for time-based (e.g. monthly) subscriptions, but not for the general repeat-customer case. (Home Depot or Starbucks would have no timely way of knowing if I suddenly decided to boycott them, while Comcast would.)
Right. I was going off the definition in the article that active customers are those that have performed an action within a time span, i.e. at the end of each such time span you can definitely count customers as "no longer active" or "still active... for now".
Here’s a paper of Fader’s from 2004: https://repository.upenn.edu/cgi/viewcontent.cgi?article=141...