Weight loss through simple data science

I presented a technical talk at the PyOhio conference this year, describing applications of elementary data science techniques to weight loss. (As I described here, I generally prefer the term "body fat reduction", because it's more specific, but most people are more familiar with the term "weight loss". So it goes.)

You can watch the video here:


My presentation slideshow and the script used to generate the data set are available on github.

The so-called "Hawthorne Effect" describes the result of an experiment in industrial engineering and management at the Hawthorne Works factory in Illinois in 1925. The result suggested that the act of observing workers tended to improve their performance and productivity.

Wikipedia describes the Hawthorne Effect as:

a type of reactivity in which individuals modify an aspect of their behavior in response to their awareness of being observed

Suggested explanations varied widely: hypotheses include excitement that management was taking an uncommon interest in their work, and anxiety that the reason for the increased interest was planning for layoffs.

A similar effect seems to operate in the simpler case of "personal observations", which is what I described in my talk. In this case, the "manager" and the "worker" are the same person, and the "observations" are simple, daily measurements of body weight using a smart scale for easy data logging. The "why" doesn't matter as much as the fact that the effect seems to work, and you can use it to reach your goals.

Studies suggest that test subjects who log their meals and snacks in a food diary ("observation") tend to experience greater weight loss ("effect"). This apparently happens even when the doctors running the study do not ask the subjects to change or limit their eating habits. Measuring your weight every day seems to generate a similar self-awareness, whether on a conscious or unconscious level. Something about the awareness that you are being observed tends to foster habit change in a desired direction.

Over time, this observation can lead a person to adopt new habits and modify old habits - both consciously and unconsciously - that cause the long-term trend line on the graph to move in the desired direction.

Figure: Weight over time (raw data and seven-day moving average)

This is definitely the case in my personal data set, which I started recording on Feb 14, 2019.
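The seven-day moving average in the graph can be sketched in a few lines of Python. This is a minimal illustration, not the script from the talk: it assumes a trailing window (each point averages that day and the six days before it), and the readings are made-up values, not my data set.

```python
def moving_average(values, window=7):
    """Trailing moving average: entry i averages the last `window` values up to day i.

    Early entries average over however many days exist so far (an assumption;
    other conventions leave them undefined).
    """
    averages = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Made-up readings in lbs, for illustration only.
weights = [211.0, 212.0, 210.0, 209.0, 210.0, 208.0, 207.0, 208.0, 206.0]
smoothed = moving_average(weights)

for raw, avg in zip(weights, smoothed):
    print(f"raw {raw:6.1f}   7-day avg {avg:6.2f}")
```

The smoothed series moves far less from day to day than the raw readings, which is why the trend is easier to read from it.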

Computing the daily weight change values, or "deltas", delivers some interesting and actionable insights. The "delta" or weight change at day i (today) is defined as delta[i] = w[i] - w[i-1]. That is, today's weight minus yesterday's weight. It is the answer to the question "how much did my weight change between yesterday and today?" (For best results, I measure my body weight at approximately the same time every day.)
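As a minimal sketch, the delta definition translates directly into Python. The function name and the sample readings below are hypothetical, chosen only for illustration.

```python
def daily_deltas(weights):
    """Return delta[i] = w[i] - w[i-1] for every day after the first."""
    return [today - yesterday for yesterday, today in zip(weights, weights[1:])]


# Made-up readings in lbs, one per day.
weights = [183.0, 181.5, 182.0, 182.0, 180.5]
deltas = daily_deltas(weights)
print(deltas)
```

Note that N weight readings produce N-1 deltas, and that the deltas always sum to the difference between the last and first readings.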

One of the surprising things I observed is this: approximately half the time, I was gaining weight.

Even though I reduced my body weight by over 30 lbs over the interval, nearly half of the measured daily delta values are greater than zero, indicating weight gain.

As of now (2019-09-04), for N=202 observations, the breakdown is:

  • 95 days increasing weight
  • 95 days decreasing weight
  • 12 days with no change
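A breakdown like the one above can be produced by counting the signs of the deltas. The sketch below uses a short list of made-up deltas, not my 202 observations, and the function name is hypothetical.

```python
def count_directions(deltas):
    """Count days with increasing, decreasing, and unchanged weight."""
    up = sum(1 for d in deltas if d > 0)
    down = sum(1 for d in deltas if d < 0)
    flat = sum(1 for d in deltas if d == 0)
    return up, down, flat


# Made-up daily deltas in lbs, for illustration only.
deltas = [1.0, -2.0, 0.5, -1.5, 0.0, 2.0, -3.0]
up, down, flat = count_directions(deltas)
print(f"{up} days increasing, {down} days decreasing, {flat} days with no change")
```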

Figure: Daily deltas over time

The graph of delta versus time shows this clearly. With an apparently random mix of increases and decreases, it's very hard to tell from this plot alone whether it adds up to a net gain or loss. If you add it up, the numbers are clear: the total gain is about 60 lbs and the total reduction is about 90 lbs, adding up to a net reduction of around 30 lbs.
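The "add it up" step can be sketched as follows: sum the positive deltas and the negative deltas separately, then combine them. The deltas here are made up for illustration; with my own data the same arithmetic gives roughly 60 lbs gained, 90 lbs lost, and a net change of about -30 lbs.

```python
def gain_loss_totals(deltas):
    """Return (total gain, total loss, net change), all in the units of the deltas."""
    total_gain = sum(d for d in deltas if d > 0)
    total_loss = -sum(d for d in deltas if d < 0)  # reported as a positive number
    net_change = total_gain - total_loss
    return total_gain, total_loss, net_change


# Made-up daily deltas in lbs, for illustration only.
deltas = [1.0, -2.0, 0.5, -1.5, 0.0, 2.0, -3.0]
gain, loss, net = gain_loss_totals(deltas)
print(f"gained {gain} lbs, lost {loss} lbs, net {net} lbs")
```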

Similarly, the histogram of the deltas shows no obvious skew or asymmetry toward weight gain (right side) or weight loss (left side).

Figure: Histogram of daily deltas
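For readers without a plotting library at hand, a rough text histogram is enough to check for skew. This sketch assumes 1 lb bins and uses made-up deltas; the bin width and function name are my choices for illustration, not the settings behind the figure above.

```python
import math
from collections import Counter


def bin_deltas(deltas, width=1.0):
    """Count deltas per bin; bin k covers the range [k*width, (k+1)*width)."""
    return Counter(math.floor(d / width) for d in deltas)


# Made-up daily deltas in lbs, for illustration only.
deltas = [-2.2, -1.4, -0.6, -0.3, 0.0, 0.4, 0.7, 1.1, 1.8]
counts = bin_deltas(deltas)

# One row per bin, labeled by the bin's lower edge.
for k in sorted(counts):
    print(f"{k * 1.0:+5.1f} lbs | {'#' * counts[k]}")
```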

What we can learn from this is that:

  • very short-term (daily) weight changes bear little or no relation to the long term trend (monthly)
  • a long-term decrease in body weight contains many days during which a weight gain occurs (and vice versa)

In other words, nobody becomes obese overnight, and nobody drops 50 lbs of body fat overnight either. These changes take place over the long term, in response to changes in food composition and quantity, hormone levels, activity levels, and other inputs.

The practical lesson seems to be that there's no point in feeling joy over a 2 lb drop in the number on the scale, or misery over a 2 lb rise. As hard as it may be to believe in the moment, a one-day increase or decrease appears to be absolutely meaningless in its implications for long-term weight change.

It's a real challenge for many people to disconnect their emotions from the random daily fluctuations of the number on the scale. However, seeing today's "number" in the context of historical numbers is a great way of keeping a broad perspective: does it matter if the body weight went from 181 to 183 lbs today if the starting point was 211 lbs?

Another advantage of collecting long-term data like this is that it enables you to run experiments and to catch and observe trends before they become a problem. Without the data recorded and plotted, it's unlikely that you would make the connection between (e.g.) experimentally adding a new food, and a slow rise in body weight over three weeks.

Perhaps you think that adding food X or removing habit Y might give good results. By running an experiment, perhaps for two weeks or thirty days, and making a change ("input"), you can observe the result ("output"). Of course, to make this work, you need to keep other input variables as constant as possible. If you change three inputs at the same time, it's very hard to isolate which one had an influence on the output.
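One way to compare a baseline period with an experiment period is to fit a straight line to each and compare the slopes. This is a sketch of one possible approach under stated assumptions, not the method from my talk: it uses an ordinary least-squares slope, treats readings as evenly spaced one day apart, and the two weeks of data are made up.

```python
def trend_slope(weights):
    """Least-squares slope of weight versus day index, in lbs per day."""
    n = len(weights)
    days = range(n)
    mean_day = sum(days) / n
    mean_weight = sum(weights) / n
    numerator = sum((d - mean_day) * (w - mean_weight) for d, w in zip(days, weights))
    denominator = sum((d - mean_day) ** 2 for d in days)
    return numerator / denominator


# Made-up readings in lbs: one week before and one week during a hypothetical experiment.
baseline = [190.0, 189.5, 190.0, 189.0, 189.5, 189.0, 188.5]
experiment = [188.5, 189.0, 189.0, 189.5, 189.0, 190.0, 190.0]

print(f"baseline trend:   {trend_slope(baseline):+.2f} lbs/day")
print(f"experiment trend: {trend_slope(experiment):+.2f} lbs/day")
```

Because daily readings are so noisy, a slope fitted over a week or two is only a rough signal; longer periods give a more trustworthy comparison.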

The conclusion is that long-term change in body weight, in one direction, is made up of lots of small daily changes, in both directions. The data is very noisy. Therefore, the weight change on a random day has very little to do with either the long-term trend, or the endpoint. Accumulating a body of data over time is a great way to create an objective and impersonal reference about a body metric like weight. Human memory is ineffective and subject to revision and distortion, whereas recorded data is far less likely to lie. Tracking your body measurements is a powerful way for you to observe change and to drive it.