The Annals of Statistics

2009, Vol. 37, No. 3, 1437–1465

DOI: 10.1214/08-AOS613

© Institute of Mathematical Statistics, 2009



arXiv:0906.1720v1 [math.ST] 9 Jun 2009

By Tyler J. VanderWeele and James M. Robins

University of Chicago and Harvard University

Notions of minimal sufficient causation are incorporated within the directed acyclic graph causal framework. Doing so allows for the graphical representation of sufficient causes and minimal sufficient causes on causal directed acyclic graphs while maintaining all of the properties of causal directed acyclic graphs. This in turn provides a clear theoretical link between two major conceptualizations of causality: one counterfactual-based and the other based on a more mechanistic understanding of causation. The theory developed can be used to draw conclusions about the sign of the conditional covariances among variables.

1. Introduction. Two broad conceptualizations of causality can be discerned in the literature, both within philosophy and within statistics and epidemiology. The first conceptualization may be characterized as giving an account of the effects of certain causes; the approach addresses the question, “Given a particular cause or intervention, what are its effects?” In the contemporary philosophical literature, this approach is most closely associated with Lewis’ work [17, 18] on counterfactuals. In the contemporary statistics literature, this first approach is closely associated with the work of Rubin [30, 31] on potential outcomes, of Robins [25, 26] on the use of counterfac-

been its cause?” In the contemporary philosophical literature, this second approach is most notably associated with Mackie’s work [19] on insufficient but necessary components of unnecessary but sufficient conditions (INUS conditions) for an effect. In the epidemiologic literature, this approach is most closely associated with Rothman’s work [29] on sufficient-component causes. This work is more closely related to the various mechanisms for a particular effect than is the counterfactual approach. Rothman’s work on sufficient-component causes has, however, seen relatively little development, extension or application, though the basic framework is routinely taught in introductory epidemiology courses. Perhaps the only major attempt in the statistics literature to extend and apply Rothman’s theory has been the work of Aickin [1] (comments relating Aickin’s work to the present work are available from the authors upon request).

In this paper, we incorporate notions of minimal sufficient causes, corresponding to Rothman’s sufficient-component causes, within the directed acyclic graph causal framework [21]. Doing so essentially unites the mechanistic and the counterfactual approaches into a single framework. As will be seen in Section 5, we can use the framework developed to draw conclusions about the sign of the conditional covariances among variables. Without the theory developed concerning minimal sufficient causes, such conclusions cannot be drawn from causal directed acyclic graphs. In a related paper [35] we have discussed how these ideas relate to epidemiologic research. The present paper develops the theory upon which this epidemiologic discussion relies.

The theory developed in this paper is motivated by several other considerations. As will be seen below, the incorporation of minimal sufficient cause nodes allows for the identification of certain conditional independencies which hold only within a particular stratum of the conditioning variable (i.e., “asymmetric conditional independencies,” [7]) which were not evident without the minimal sufficient causation structures. We note that these asymmetric conditional independencies have been represented elsewhere by Bayesian multinets [7] or by trees [3]. Another motivation for the development of the theory in this paper concerns the notion of interaction. Product terms are frequently included in regression models to assess interactions among variables; these statistical interactions, however, even if present, need not imply the existence of an actual mechanism in which two distinct causes both participate. Interactions which do concern the actual mechanisms are sometimes referred to as instances of “synergism” [29], “biologic interactions” [32] or “conjunctive causes” [20], and the development of minimal sufficient cause theory provides a useful framework to characterize mechanistic interactions. In related work [37] we have derived empirical tests for interactions in this sufficient cause sense.

Fig. 1. Causal directed acyclic graph under the alternative hypothesis of familial coaggregation.

analytic puzzle faced by psychiatric epidemiologists. Consider the following somewhat simplified version of a study reported in Hudson et al. [10].

Three hundred pairs of obese siblings living in an ethnically homogeneous upper-middle class suburb of Boston are recruited and cross-classified by the presence or absence of two psychiatric disorders: manic-depressive disorder P and binge eating disorder B. The question of scientific interest is whether these two disorders have a common genetic cause, because, if so, studies to search for a gene or genes that cause both disorders would be useful. Consider two analyses. The first analysis estimates the covariance β between P2i and B1i, while the second analysis estimates the conditional covariance α between P2i and B1i among subjects with P1i = 1, where Bki is 1 if the kth sibling in the ith family has disorder B and is zero otherwise, with Pki defined analogously. It was found that the estimates of β and α were both positive with 95% confidence intervals that excluded zero.
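As an illustration only, the two estimands described above can be sketched in Python. The function names and the six-family toy data are hypothetical and are not taken from Hudson et al. [10]; the sketch uses the plug-in (divide-by-n) covariance, which is one of several reasonable estimators.

```python
def covariance(x, y):
    """Plug-in (divide-by-n) covariance of two equal-length sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def conditional_covariance(x, y, c, value=1):
    """Covariance of x and y within the stratum where c == value."""
    stratum = [(a, b) for a, b, k in zip(x, y, c) if k == value]
    if not stratum:
        raise ValueError("the conditioning stratum is empty")
    xs, ys = zip(*stratum)
    return covariance(xs, ys)


# Hypothetical indicators for six sibling pairs (NOT real study data).
p2 = [1, 1, 0, 0, 1, 0]  # sibling 2 has disorder P
b1 = [1, 0, 0, 0, 1, 1]  # sibling 1 has disorder B
p1 = [1, 1, 1, 0, 0, 1]  # sibling 1 has disorder P

beta_hat = covariance(p2, b1)                   # analogue of beta
alpha_hat = conditional_covariance(p2, b1, p1)  # analogue of alpha (P1 = 1)
print(beta_hat, alpha_hat)
```

The second function simply restricts attention to families in which sibling 1 has disorder P before computing the covariance, mirroring the definition of α in the text.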

Hudson et al.’s [10] substantive prior knowledge is summarized in the directed acyclic graph of Figure 1, in which the index i denoting family is suppressed. In what follows, we will make reference to some standard results concerning directed acyclic graphs; these results are reviewed in detail in the following section.

In Figure 1, GB and GP represent the genetic causes of B and P, respectively, that are not common causes of both B and P. The variables E1 and E2 represent the environmental exposures of siblings 1 and 2, respectively, that are common causes of both diseases, for example, exposure to a particularly stressful school environment. The variables GB and GP are assumed independent as would typically be the case if, as is highly likely, they are not genetically linked. Furthermore, as is common in genetic epidemiology, the environmental exposures E1 and E2 are assumed independent of the genetic factors. The causal arrows from P1 to B1 and P2 to B2 represent the investigators’ beliefs that manic-depressive disorder may be a cause of binge eating disorder but not vice-versa. The node F represents the common genetic causes of both P and B as well as any environmental causes of both P and B that are correlated within families. There is no data available for GB, GP, E1, E2 or F. The reason for grouping the common genetic causes with the correlated environmental causes in F is that, based on the available data {Pki, Bki ; i = 1,..., 300, k = 1, 2}, we can only hope to test the null hypothesis that F so defined is absent, which is referred to as the hypothesis of no familial coaggregation. If this null hypothesis is rejected, we cannot determine from the available data whether F is present due to a common genetic cause or a correlated common environmental cause. Thus E1 and E2 are independent on the graph because, by definition, they represent the environmental common causes of B and P that are independently distributed between siblings.

Now, under the null hypothesis that F is absent, we note that P2 and B1 are still correlated due to the unblocked path P2 − GP − P1 − B1, so we would expect β ≠ 0 as found. Furthermore, P2 and B1 are still expected to be correlated given P1 = 1 due to the unblocked path P2 − GP − P1 − E1 − B1, so we would expect α ≠ 0 as found. Thus, we cannot test the null hypothesis that F is absent without further substantive assumptions beyond those encoded in the causal directed acyclic graph of Figure 1.

Now Hudson et al. [10] were also willing to assume that for no subset of the population did the genetic causes GP and GB of P and B prevent disease. Similarly, they assumed there was no subset of the population for whom the environmental causes E1 and E2 of B and P prevented either disease.

We will show in Section 5 that under these additional assumptions, the null hypothesis that F is absent implies that the conditional covariance α must be less than or equal to zero, provided that there is no interaction, in the sufficient cause sense, between E and GP. If it is plausible that no sufficient cause interaction between E and GP exists, then the null hypothesis that F is absent is rejected because the estimate of α is positive with a 95% confidence interval that does not include zero.

Thus, the conclusion in the argument above that familial coaggregation of diseases B and P was present depended critically on the existence of (i) a formal definition of a sufficient cause interaction, (ii) a substantive understanding of what the assumption of no sufficient cause interaction entailed, and (iii) a sound mathematical theory that related assumptions about the absence of sufficient cause interactions to testable restrictions on the distribution of the observed data, specifically on the sign of a particular conditional covariance. In this paper, we provide a theory that offers (i)–(iii).

The remainder of the paper is organized as follows. The second section reviews the directed acyclic graph causal framework and provides some basic definitions; the third section presents the theory which allows for the graphical representation of minimal sufficient causes within the directed acyclic graph causal framework; the fourth section gives an additional preliminary result concerning monotonicity; the fifth section develops results relating minimal sufficient causation and the sign of conditional covariances; the sixth section provides some discussion concerning possible extensions to the present work.

2. Basic definitions and concepts. In this section, we review the directed acyclic graph causal framework and give a number of definitions regarding sufficient conjunctions and related concepts. Following Pearl [21], a causal directed acyclic graph is a set of nodes (X1,..., Xn), corresponding to variables, and directed edges among nodes, such that the graph has no cycles and such that, for each node Xi on the graph, the corresponding variable is given by its nonparametric structural equation Xi = fi(pai, εi), where pai are the parents of Xi on the graph and the εi are mutually independent random variables. These nonparametric structural equations can be seen as a generalization of the path analysis and linear structural equation models [21, 22] developed by Wright [43] in the genetics literature and Haavelmo [9] in the econometrics literature. Robins [27, 28] discusses the close relationship between these nonparametric structural equation models and fully randomized, causally interpreted structured tree graphs [25, 26]. Spirtes, Glymour and Scheines [33] present a causal interpretation of directed acyclic graphs outside the context of nonparametric structural equations and counterfactual variables. It is easily seen from the structural equations that (X1,..., Xn) admits the following factorization: $p(X_1,\ldots,X_n) = \prod_{i=1}^{n} p(X_i \mid pa_i)$. The nonparametric structural equations encode counterfactual relationships among the variables represented on the graph. The equations themselves represent one-step ahead counterfactuals with other counterfactuals given by recursive substitution. The requirement that the εi be mutually independent is essentially a requirement that there is no variable absent from the graph which, if included on the graph, would be a parent of two or more variables [21, 22].
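As a worked instance of this factorization, consider the graph described in the Introduction under the null hypothesis that F is absent. The parent sets used below are read off the verbal description of Figure 1 (GP and E1 as parents of P1; GP and E2 as parents of P2; GB, E1 and P1 as parents of B1; GB, E2 and P2 as parents of B2) and should be treated as an assumption, since the figure itself is not reproduced here.

```latex
\begin{align*}
p(G_P, G_B, E_1, E_2, P_1, P_2, B_1, B_2)
  ={}& p(G_P)\, p(G_B)\, p(E_1)\, p(E_2) \\
     & \times p(P_1 \mid G_P, E_1)\, p(P_2 \mid G_P, E_2) \\
     & \times p(B_1 \mid G_B, E_1, P_1)\, p(B_2 \mid G_B, E_2, P_2).
\end{align*}
```

Each factor corresponds to one structural equation: the four root nodes have empty parent sets, so their factors are marginal distributions, and every other node contributes one conditional distribution given its parents.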

A path is a sequence of nodes connected by edges regardless of arrowhead direction; a directed path is a path which follows the edges in the direction indicated by the graph’s arrows. A node C is said to be a common cause of A and B if there exists a directed path from C to B not through A and a directed path from C to A not through B. A collider is a particular node on a path such that both the preceding and subsequent nodes on the path have directed edges going into that node. A backdoor path from A to B is a path that begins with a directed edge going into A. A path between A and B is said to be blocked given some set of variables Z if either there is a variable in Z on the path that is not a collider or if there is a collider on the path such that neither the collider itself nor any of its descendants are in Z. If all paths between A and B are blocked given Z, then A and B are said to be d-separated given Z. It has been shown that if all paths between A and B are blocked given Z, then A and B are conditionally independent given Z [8, 13, 40].
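The d-separation criterion can be checked mechanically. The following is a minimal sketch using the standard moralization approach (restrict to ancestors, connect co-parents, drop edge directions, delete Z, test connectivity), which is equivalent to the path-blocking definition above but is not the formulation used in this paper. The graph NULL_GRAPH encodes Figure 1 with F absent; its edge list is reconstructed from the verbal description in the Introduction and is an assumption, as are all function and variable names.

```python
from collections import deque


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, a, b, z=()):
    """True if a and b are d-separated given z in the DAG `parents`.

    `parents` maps each node to a list of its parents; root nodes may
    be omitted from the mapping.
    """
    z = set(z)
    keep = ancestors(parents, {a, b} | z)
    # Build the moral graph of the ancestral subgraph.
    adj = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:  # undirected version of each edge
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(pa):  # "marry" every pair of co-parents
            for q in pa[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Breadth-first search from a that never enters a node in z.
    queue = deque([a])
    seen = {a}
    while queue:
        node = queue.popleft()
        if node == b:
            return False
        for nxt in adj[node]:
            if nxt not in seen and nxt not in z:
                seen.add(nxt)
                queue.append(nxt)
    return True


# Figure 1 with F absent (edge list assumed from the text description).
NULL_GRAPH = {
    "P1": ["GP", "E1"],
    "P2": ["GP", "E2"],
    "B1": ["GB", "E1", "P1"],
    "B2": ["GB", "E2", "P2"],
}

print(d_separated(NULL_GRAPH, "P2", "B1"))          # marginal
print(d_separated(NULL_GRAPH, "P2", "B1", {"P1"}))  # given P1
```

Under this assumed graph, both calls report that P2 and B1 are not d-separated, in line with the unblocked paths P2 − GP − P1 − B1 and P2 − GP − P1 − E1 − B1 discussed in the Introduction.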

Suppose that a set of nonparametric structural equations represented by a directed acyclic graph H is such that its variables X are partitioned into two sets X = V ∪ W. If in the nonparametric structural equation for V ∪ W, by replacing each occurrence of Xi ∈ W by fi(pai, εi), the nonparametric structural equations for V can be written so as to correspond to some causal directed acyclic graph G, then G is said to be the marginalization of H over the set of variables W. A causal directed acyclic graph with variables X = V ∪ W can be marginalized over W if no variable in W is a common cause of any two variables in V.

In giving definitions for a sufficient conjunction and related concepts, we will use the following notation. An event is a binary variable taking values in {0, 1}. The complement of some event E we will denote by Ē. A conjunction or product of the events X1,..., Xn will be written as X1 · · · Xn.

