I got curious as to how the 37% improvement in James Clear’s book “Atomic Habits” was calculated. As such, I went about figuring out how and tried to generalize it to different time periods (rather than just a year) and with variable improvement and regression rates for each day.

Improvement at a fixed percentage

This calculation is made on the assumption that there is a compounding involved, that the improvement is consistent every day without gaps, and no regression happens during that time. With $\Delta p$ as the improvement in percent and $P(t)$ as the cumulative improvement over time, we arrive at the following:

$$ \begin{aligned} \left( 1 + \Delta p \right)^t &= P(t) \ \left( 1 + 0.01 \right)^{365} &= 37.7834 \end{aligned} $$

Here is where we get the 37% cumulative improvement in the 1% Rule.

Improvement and Regression at a fixed percentage

Now, what if we have regression on some days? As such a much more general equation where regression can be expected is as follows:

$$P(t) = \left( 1 + \Delta p + \Delta r \right)^{t}$$

where

  • $P(t)$ is still the cumulative improvement over time
  • $\Delta p$ is the improvement increment for a day in percent: $\Delta p \in (0,1)$
  • $\Delta r$ is the regression decrement for a day in percent: $\Delta r \in (-1,0)$
  • $t$ as the total period (in days) of which the cumulative improvement is being tracked

Improvement and Regression at a variable percentage

This equation, however, only applies to improvement and regression increments that are constant. Now what if I do not consistently improve at 1% but have days where I improve 2%, 3%, or 0.25%? Or days where things did not go well and I regress by 0.5%, 1%, or 5%?

To convert it to discrete variable improvement and regression percentages per day we have the following (which we can further take its limits to arrive at a product integral):

$$P(t) = \prod_{i=1}^{t} \left( 1 + \Delta p_i + \Delta r_i \right)$$

Calculating P(t) with Python

Given some data (maybe a CSV file with date, $\Delta p$, and $\Delta r$ fields for each date in a given time period), we can code up a quick calculation for $P(t)$. For example in python:

1
2
3
4
5
6
7
8
9
dp = [0.011, 0.020, 0.015, 0.01]
dr = [0.009, 0.001, 0.005, 0.05]
Pt = 1

for i in dp:
 Pt = Pt * (1 + dp[i] + dr[i])

print(f"The cumulative performance for " +
      f"{len(dp)} days is {Pt:.2f}x.")