Date: 2024-08-29

Introduction: The Debate over Correlation vs. Causation

At the heart of the academic debate over our claims in The Anxious Generation has been the question of correlation versus causation. While the book is about two major interlocking changes to childhood—the loss of the play-based childhood and the rise of the phone-based childhood—nearly all of the academic debate has been focused on the phone side, and most of that has focused just on the social media component of the phone side, and most of that has been focused on social media’s impact on mental health. Notably, our claims about the importance of real-world independence, unsupervised play, and responsibility in Chapters 2 and 3, and the impact of smartphones and social media on school climate and academic performance in Chapters 5 and 11, have elicited hardly any objections or critiques.

In this four-part series on causality, we zoom in on the center of the debate: our claim that social media use (especially heavy use) is a substantial contributing cause of mental health problems (at least in some substantial minority of adolescent users). Most participants in the debate agree that heavy social media use is associated with many different kinds of harms — mental health deterioration, sleep deprivation, attention fragmentation, etc. (see, for example, Orben 2020 for the typical range of the associations, and see ch. 4 of the recent National Academies of Sciences report for a long list of associated harms). Regarding mental health, heavy users—who often qualify as having “problematic use” that interferes with other areas of functioning—are nearly always found to have higher rates of anxiety and depression than light users or non-users, and these differences are often quite large, especially for girls (among whom heavy users of social media were found to be three times as likely to be depressed as light users; Kelly et al., 2019).

It is, however, a challenge to establish whether something about social media in general causes mental health harms such as depression and anxiety, or whether the association is due primarily to reverse causation (meaning that depression or anxiety is what is causing some adolescents to use social media more often), as is claimed by Candice Odgers at UC-Irvine and by Michael Rich at Harvard.

A great deal hangs on this question of causation because social media companies have consistently dismissed claims—and liability—regarding the harms that parents and teens believe were caused by their platforms. They claim that the scientific evidence at present does not point to causation. Mark Zuckerberg used this defense in his opening statement to a Senate subcommittee last January:

Mental health is a complex issue and the existing body of scientific work has not shown a causal link between using social media and young people having worse mental health outcomes.

Is he right? What kinds of scientific studies could show causal links, if they were there?

The Importance of Experiments

To address questions of causality, social scientists usually turn to experimental studies that use random assignment of people to conditions. (These are sometimes called RCTs, for “Randomized Controlled Trials.”) For example, Davis and Goldfield (2024) randomly assigned 220 distressed college students to either an experimental group, which was asked to reduce its social media usage to no more than 60 minutes per day for three weeks, or to a control group that had no social media restrictions. The study found that the intervention group showed significant reductions in symptoms of depression, anxiety, and FoMO (fear of missing out), and they showed increases in sleep. Such a finding in an RCT supports the inference (though does not prove) that social media reduction caused these benefits, which supports the inference that social media use (especially heavy use) causes such harms.

Since 2019, we (Jon and Zach) and Jean Twenge have been gathering all such experiments we can find into our main open-source Google document (titled Social Media and Mental Health: A Collaborative Review). We search for and continually add studies on all sides, including those that fail to find any benefits from social media reduction. However, as we note in the document, scientific questions are not decided simply by counting up studies on both sides. Conclusions are more reliable when studies are weighted for measures of quality, such as having a large sample size as opposed to a small one.

This kind of analysis is called a “meta-analytical review,” meaning it is a study of studies. In a meta-analysis, the researcher gathers all the studies meeting specific criteria and then extracts an effect size from each study for each relevant dependent variable. How much benefit or harm is there when we compare the experimental group to the control group in each study for each particular outcome? All effect sizes are converted to a common scale, such as Cohen’s d, which estimates the standardized difference between two means. (That is, the number of standard deviations by which the two means differ. That number is usually well below 1 because groups rarely differ by a full standard deviation.) The average of these effect sizes is then calculated, but not as a simple average. Instead, it is a weighted average in which studies with larger sample sizes and lower variance are given more weight than those with smaller sample sizes and higher variance.
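To make the weighting idea concrete, here is a minimal sketch in Python. It assumes the simplest (fixed-effect, inverse-variance) weighting scheme and a common large-sample approximation for the variance of d; published meta-analyses, including Ferguson's, may use a different model. The two studies in the example are hypothetical and exist only to show how a small study with a large effect gets less weight than a large study with a small effect.

```python
import math


def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardized difference between two group means (pooled SD)."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def d_variance(d, n_a, n_b):
    """Large-sample approximation to the sampling variance of d."""
    return (n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b))


def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted average (fixed-effect assumption)."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


# HYPOTHETICAL studies, not taken from any real meta-analysis:
# (effect size d, participants per group)
studies = [(0.40, 20), (0.10, 200)]

effects = [d for d, _ in studies]
variances = [d_variance(d, n, n) for d, n in studies]

simple_average = sum(effects) / len(effects)
weighted_average = weighted_mean_effect(effects, variances)

print(f"simple average:   {simple_average:.3f}")
print(f"weighted average: {weighted_average:.3f}")
```

In this made-up case the weighted average sits much closer to the large study's estimate than the simple average does, which is the point of weighting by precision.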

The Ferguson Meta-Analysis

Stetson University psychology professor Chris Ferguson recently carried out a meta-analysis, which was published in early 2024 in the Journal of Psychology of Popular Media under the title “Do social media experiments prove a link with mental health: A methodological and meta-analytic review.”

Ferguson selected 27 studies for analysis (25 were published studies, two were dissertations). Most of the studies asked participants in the experimental condition to reduce their use of social media in real life, for a period of time, as with the Davis and Goldfield (2024) study that we described. We will call these “reduction studies.” But Ferguson also included seven studies that exposed participants in the experimental condition to some aspect of social media and then looked for effects related to mental health or wellbeing. We will call these “exposure studies.” These exposures usually took place in a psychology lab at a university, and all but one lasted between 5 and 30 minutes.

Ferguson merged all of the 27 studies together (and all of their outcome variables, from satisfaction to loneliness to depression) to calculate an average effect size of d = .088 (which means that the experimental groups differed from the control groups by about 9% of one standard deviation). This finding suggests a small benefit from reducing social media consumption, but when Ferguson calculated a confidence interval around that number, it included zero (though just barely), meaning that the possibility could not be excluded that there was no effect overall. Ferguson summarized what he thought to be the implications of his meta-analysis like this:

Currently, experimental studies should not be used to support the conclusion that social media use is associated with mental health. Taken at surface value, mean effect sizes are no different from zero. Put very directly, this undermines causal claims by some scholars (e.g., Haidt, 2020; Twenge, 2020) that reductions in social media time would improve adolescent mental health. [Bolding added by Zach and Jon]
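The confidence-interval logic behind "included zero (though just barely)" can be sketched as follows. The pooled estimate of .088 comes from the text above, but the standard error used here is a hypothetical value chosen only for illustration; it is not a number Ferguson reported, and the 1.96 multiplier assumes a conventional 95% interval under a normal approximation.

```python
def confidence_interval(effect, standard_error, z=1.96):
    """Return (low, high) bounds of a normal-approximation interval."""
    half_width = z * standard_error
    return effect - half_width, effect + half_width


def excludes_zero(low, high):
    """True when the whole interval lies on one side of zero."""
    return low > 0 or high < 0


pooled_d = 0.088    # pooled estimate quoted in the text
assumed_se = 0.046  # HYPOTHETICAL standard error, for illustration only

low, high = confidence_interval(pooled_d, assumed_se)
print(f"interval: [{low:.3f}, {high:.3f}]")
print("excludes zero:", excludes_zero(low, high))
```

With an interval like this, the lower bound dips only slightly below zero, so a modest change in the set of studies pooled together could move the conclusion in either direction.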

Are Ferguson’s conclusions valid? Is it really true that reducing social media time won’t improve adolescent mental health? Well, one of the most essential prerequisites for any meta-analysis is to ensure that there is no obvious moderator that greatly influences the outcome. For instance, consider an experiment testing the efficacy of a drug for reducing anxiety. Suppose there are two slightly different versions of the drug made by two different companies. If version A consistently reduces anxiety while version B consistently increases it, then ‘drug version’ would be a moderator. In this case, the result of averaging the effects across experiments would depend on how many of the experiments investigated version A and how many investigated version B. Any assertion about ‘the effect’ without regard to the version of the drug would be meaningless, and concluding that the drug has no overall impact would also be a serious error. It would mislead the medical community and discourage doctors from prescribing version A. Instead, the two versions should be analyzed separately, with effect sizes reported for each.
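The drug example can be illustrated with a few lines of Python. The effect sizes below are hypothetical (positive means anxiety went down), and the sketch uses a plain unweighted average for simplicity. It shows that the pooled number tracks the mix of versions studied, not any property of the drug.

```python
def pooled_effect(n_version_a, n_version_b, effect_a=0.30, effect_b=-0.30):
    """Unweighted average effect across experiments on two drug versions.

    effect_a and effect_b are HYPOTHETICAL per-experiment effect sizes:
    version A helps (positive), version B harms (negative).
    """
    total = n_version_a * effect_a + n_version_b * effect_b
    return total / (n_version_a + n_version_b)


# Same two drugs, three different mixes of experiments.
for n_a, n_b in [(10, 10), (15, 5), (5, 15)]:
    print(f"{n_a} on A, {n_b} on B -> pooled d = {pooled_effect(n_a, n_b):+.2f}")
```

An even mix averages out to roughly zero even though neither version has a zero effect, which is why the two versions need to be reported separately.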

The most obvious candidate for a major moderator in Ferguson’s study is type of study. Ferguson blended together two very different kinds of studies: twenty social media reduction studies and seven exposure studies that used entirely different methods, which (as we’ll show) yielded very different outcomes. The conflation of these two types of studies is problematic because it does not make sense to treat a 5-minute exposure to Facebook that measures momentary mood and a 3-week abstinence from social media that measures risk of clinical depression on a validated scale as measures of the same effect.

Furthermore, even within the 20 reduction studies, the duration of the study is another obvious moderator. We would not expect the benefits of social media reduction to kick in within the first few days, given that withdrawal effects are common when people have been heavy users of social media, or cigarettes, or any addictive substance or activity. According to Anna Lembke, who studies and treats behavioral addictions as well as biological addictions, withdrawal symptoms generally include anxiety, irritability, insomnia, depression, and craving. Acute withdrawal symptoms typically peak after a few days, but often last for up to two weeks. So, it makes little sense to combine one-day abstinence studies with four-week reduction studies when it is only the longer studies that get participants past the withdrawal period. It is these multi-week studies that offer us the best test of the hypothesis that social media use causes declines in mental health. Parents who are considering delaying the age at which their children get social media accounts should look to the long-term reduction studies for guidance, not to one-day abstinence studies.

Ferguson’s 27 Studies, Reordered and Reconsidered

We think that Ferguson’s 27 studies should be divided up by type of study into three bins, which should be analyzed separately:

Multi-week reduction experiments. These ten studies examined the impact of reducing social media use for at least two weeks, allowing withdrawal symptoms to dissipate.

Short (one week or less) reduction experiments. These ten studies examined brief periods of abstinence from social media use, which are likely to pick up withdrawal symptoms from heavy users.

Exposure experiments. In these seven studies, the ‘treatment’ is typically brief exposure to some kind of social media, such as requiring high school students to look at their Facebook or Instagram page for 10 minutes.
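The three-bin rule above can be written down as a small classifier. This is a sketch under stated assumptions: the `Study` record, its field names, and the example entries are hypothetical, and the text gives no rule for reduction studies lasting between one and two weeks, so those are returned as "unclassified".

```python
from dataclasses import dataclass


@dataclass
class Study:
    label: str            # free-text description (hypothetical field)
    design: str           # "reduction" or "exposure"
    duration_days: float  # length of the intervention in days


def assign_bin(study):
    """Place a study in one of the three bins described in the text."""
    if study.design == "exposure":
        return "exposure"
    if study.design != "reduction":
        raise ValueError(f"unknown design: {study.design!r}")
    if study.duration_days >= 14:
        return "multi-week reduction"
    if study.duration_days <= 7:
        return "short reduction"
    # Between one and two weeks: the text does not define a bin.
    return "unclassified"


# HYPOTHETICAL examples, not entries from Ferguson's list.
examples = [
    Study("three-week reduction", "reduction", 21),
    Study("one-day abstinence", "reduction", 1),
    Study("ten-minute lab exposure", "exposure", 10 / (60 * 24)),
]

for s in examples:
    print(f"{s.label}: {assign_bin(s)}")
```

Once each study carries a bin label, effect sizes can be averaged within a bin instead of across all studies at once.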

We have produced three tables corresponding to those three bins, which collectively show all of Ferguson’s 27 studies. Positive numbers (shown in orange) indicate that reducing or quitting social media had beneficial effects (or that exposure to social media in lab experiments was detrimental), which indicates that social media is harmful. (That’s why we use orange, a widely used warning color, to mark such findings.)

In contrast, negative numbers (shown in green) indicate that people who quit or reduced social media use were worse off, at least by the well-being measures that Ferguson chose to analyze. This suggests that social media is helpful, at least in the case of multi-week studies, which get past withdrawal effects. (We mark such findings in green, which indicates “go” or “go ahead and use social media.”)

In writing this post we draw heavily from an in-depth analysis made by David Stein, an independent scholar in the Czech Republic who has long studied and written about suicide rates at his blog (now Substack) The Shores of Academia. You can read Stein’s original post here: Fundamental Flaws in Meta-Analytical Review of Social Media Experiments. We have checked Stein’s findings carefully and we report them here, with additions and with his permission, in a form that we think will be more accessible to readers.

We begin with the first bin. Table 1 shows the 10 multi-week reduction experiments selected by Ferguson. In this table and in Tables 2 and 3 we list the effect sizes as calculated by Ferguson (which are reported in an online OSF supplement). As you can see, six of the studies are in orange and just one is in green. (Numbers that fell below .10 but above -.10 we left uncolored.) If we take the simple average of all the effect sizes we get d = .20, meaning that there are clear mental health benefits to participants for reducing their social media consumption for at least two weeks.

Table 1. Multi-Week Social Media Reduction Studies (Two Weeks or More)
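For readers who want to apply the same color coding to their own list of effect sizes, the rule used in these tables can be sketched as a short function. The function name is hypothetical, and treating values of exactly .10 or -.10 as colored is our reading of "below .10 but above -.10" being left uncolored.

```python
def color_for(d, threshold=0.10):
    """Map an effect size to the table color described in the text.

    Orange: evidence that social media is harmful (d at or above +.10).
    Green: evidence that social media is helpful (d at or below -.10).
    Values strictly between the two thresholds are left uncolored.
    """
    if d >= threshold:
        return "orange"
    if d <= -threshold:
        return "green"
    return "uncolored"


# The two averages quoted in this post, plus one HYPOTHETICAL negative value.
for d in (0.20, 0.06, -0.15):
    print(f"d = {d:+.2f} -> {color_for(d)}")
```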

In contrast, Table 2 gives Ferguson’s effect sizes for the short-term reduction studies (one week or less). Most of these produced negative numbers for their effect sizes, which Ferguson took to mean that social media is beneficial, since people felt worse when they quit. For consistency with Table 1 we colored such cells in green, but of course if these short-term studies are primarily measuring withdrawal effects, then this apparent benefit is really indicative of a larger harm.

We separated the short-term studies into two sections in Table 2 to highlight an interesting finding: all four of the very short term studies produced negatively valenced effect sizes (in green), whereas the six one-week studies produced more variable results (three in orange, three in green).

Table 2. Short Term Reduction Studies (One Week or Less)

Ferguson interpreted all of the negative numbers in Table 2 as backfire effects (that is, getting off social media was harmful, which means that social media is good), but we think that what looks like a backfire effect is really just the withdrawal effect we should expect from some users in the first few days (note that the average teen spends 5 hours a day on social media). The fact that the effect size increases so substantially as the duration of the reduction increases from one day to one week to several weeks supports this interpretation.

Now, let’s look at those seven exposure experiments, shown below in Table 3. These studies employ many methods, from spending ten minutes looking at your own Facebook page to spending twenty minutes communicating with others through Facebook. It is not clear that they should be combined. But if we do combine them, and if we draw only from Ferguson’s calculations, we find that the average effect size is d = +0.06. This is the equivalent of a third version of the anti-anxiety drug from our earlier example, one that produces only a tiny improvement in anxiety. (In our next post in this series we’ll correct some errors in the analysis of the exposure studies and show that they produce mostly harmful effects.) Once again, it would be misleading to merge these exposure studies with the reduction studies and report that there is no effect, and no benefit, from reducing social media.

Table 3. Social Media Exposure Studies

Let’s put this all together. To summarize the findings of Tables 1, 2, and 3:

These results clearly indicate that multi-week social media reduction experiments improve some aspects of well-being, while very short-term reductions backfire, producing declines in some aspects of well-being. Exposure studies using heterogeneous methods produce heterogeneous findings. Merging all of these studies together, as Ferguson did, and then reporting that the overall effect is close to zero is as misleading as merging those three anxiety drugs together and reporting that they don’t work. One of them works, one backfires, one does almost nothing.

Conclusion

The Ferguson meta-analysis is the first and only meta-analysis of experiments that we know of, and once we organize it by type of study it clearly shows that when young people reduce their social media consumption for at least two weeks, their mental health improves.

Note that in this first post in our four-part series we used only the studies that Ferguson collected, we used only the effect sizes that Ferguson calculated, and we still found that his data supports (rather than undermines) the contention by some scholars that “reductions in social media time would improve adolescent mental health,” at least as long as the reductions continue for two weeks or longer.

Was Ferguson’s selection of studies correct? Were his calculations for those studies correct? We looked into all twenty-seven studies, and in our next post we’ll show that Ferguson made a number of errors and questionable decisions. To offer a preview of Part 2, we will show that Ferguson made two calculation errors in his effect sizes, he included two studies that did not meet his inclusion criteria, he did not include two studies that he should have known about, and he merged mental health outcomes (which consistently show harmful effects of social media) with measures of well-being (which produce more varied outcomes). When we correct these errors, we find that the evidence of mental health benefits from social media reduction and the evidence of harms from social media exposure get even stronger.

In part 3 of this series, we will present our own analysis of the relevant experiments, starting from scratch. We find many more experiments to analyze (including five multi-week experiments that Ferguson did not include in his analysis, all of which found mental health benefits from reducing use), and we focus on anxiety and depression as the key outcome variables. We find stronger and more consistent evidence that social media causes mental health harms.

In part 4, we will discuss the implications of these findings about causality for a variety of larger issues, including the Surgeon General’s call for warning labels, advice for parents, and why we don’t have to resolve the question of whether social media alone caused the gigantic increase in mental illness at the “population level” to be able to show that social media can cause anxiety and depression, and that it is causing harm to many adolescents in a variety of other ways.

So, was Zuckerberg correct to say that “the existing body of scientific work has not shown a causal link between using social media and young people having worse mental health outcomes”? We think he was wrong. Stay tuned for more.