Establishing an argument of causation is an important research activity because it influences the delivery of good medical care. A finding of causation influences decisions related to prognosis, diagnosis, and treatment, and it may have medical-legal ramifications. Examples include the ability to make predictions about future onset of change upon exposure to a causative agent (e.g., aluminum exposure leading to an increased risk of Alzheimer's disease), the inclusion of behavioral changes in the diagnostic criteria set for a causative disorder (e.g., hallucinations in Lewy body disease), research directed toward reversing causative agents (e.g., chelating agents in Alzheimer's disease), and denial of access to funding of health care depending on the nature of the presumed causative agent (e.g., in Ontario, rehabilitation funding provided by automobile insurance has been available only for behavioral changes related to physical injury). Findings of causation may also teach us about the functioning of healthy systems, including the role of the brain in producing changes in affect, behavior, and cognition (abbreviated herein as Behavior).
Bradford Hill's criteria have been summarized² as including 1) the demonstration of a strong association between the causative agent and the outcome, 2) consistency of the findings across research sites and methodologies, 3) the demonstration of specificity of the causative agent in terms of the outcomes it produces, 4) the demonstration of the appropriate temporal sequence, so that the causative agent occurs prior to the outcome, 5) the demonstration of a biological gradient, in which more of the causative agent leads to a poorer outcome, 6) the demonstration of a biologic rationale, such that it makes sense that the causative agent causes the outcome, 7) coherence of the findings, such that the causation argument is in agreement with what we already know, 8) experimental evidence, and 9) evidence from analogous conditions.
Bradford Hill's criteria are potentially helpful for neuropsychiatry because they 1) are widely accepted and applied throughout medicine, 2) may increase rigor in establishing causation through the structure they provide, 3) facilitate the teaching of important lessons about the role of the brain in producing Behavioral change (e.g., if there is an absence of a biologic gradient, this suggests hypotheses related to brain function that might explain this finding), and 4) suggest research approaches (e.g., the need to establish a temporal sequence speaks to the need for prospective studies). The criteria are reviewed below in terms of their role in establishing an argument of causation in neuropsychiatry and of the challenges and opportunities inherent in each of the criteria.
APPLICATIONS OF THE CRITERIA TO NEUROPSYCHIATRY
1. Strength of the Association
Clearly if condition A causes outcome B, then it must be that A and B can be demonstrably associated with each other. The association has to be strong enough to be judged clinically significant by the reader of the argument. This is a necessary, but not sufficient, criterion in establishing an argument of causation. Obviously the direction of causality in the case of A and B may in fact be reversed, with B causing A. Often less obvious is the possibility that some unidentified third factor, C, is in fact causing both A and B, leading to the detected association between A and B. For example, a hypothetical study may show an association between male pattern baldness and heart disease, thus falsely suggesting that baldness causes heart disease. The confounding (or “true”) causative agent in this study would not have been detected unless age, and perhaps testosterone status, were also measured and assessed statistically.
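The baldness example can be made concrete with a small numerical sketch. The counts below are hypothetical, invented purely for illustration, and the function names are our own; the Mantel-Haenszel pooled odds ratio is used here as one common way of adjusting for a measured confounder, not as the only one. In these invented data, baldness and heart disease appear associated in the pooled sample, yet the association disappears within each age stratum.

```python
# Hypothetical counts, invented purely for illustration (not real data).
# Each 2x2 table is (bald & heart disease, bald & no heart disease,
#                    not bald & heart disease, not bald & no heart disease).
strata = {
    "younger": (5, 95, 20, 380),
    "older": (80, 320, 20, 80),
}


def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    tables = list(tables)
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Collapsing over age gives the "crude" table that ignores the confounder.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

# Within each age stratum, and after pooling across strata, the picture changes.
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR: {crude_or:.2f}")
print(f"stratum-specific ORs: {stratum_ors}")
print(f"age-adjusted (Mantel-Haenszel) OR: {adjusted_or:.2f}")
```

With these invented counts the crude odds ratio is well above 1, whereas the stratum-specific and age-adjusted odds ratios equal 1. The adjustment is possible only because age was measured, which is the point made above.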
The possibility of “confounding” by undetected causative agents is one reason for the need for a control population with which to compare the strength of the association in the condition under investigation. This control population will ideally be matched to the exposed population on all relevant variables except the presence of the proposed causative agent. Another reason for demanding a control population is to reduce the risk of bias in designation of the presence of either A or B, especially when such assessments are done blind to the presence of the proposed causative agent.
The selection of an appropriate control group is one challenge in neuropsychiatric research. Clearly the appropriate control group will share important variables with the exposed group, but which variables to match on? The biopsychosocial model of Behavior suggests that variables to be considered include aspects of the medical history and examination findings, demographic and social variables, the individual's premorbid Behavioral and family histories, and the inclusion of variables assessing other outcomes of the neurological insult. For example, one patient summarized his belief related to the causation of his depression, which began after a traumatic brain injury, by asking one of us (R.v.R.), “Wouldn't you be depressed if your life had changed this much?” This patient is suggesting that the brain lesion itself is not necessarily the only causative agent related to his depression, but rather that the sequelae of the insult may drive changes in affect (or behavior and cognition). These sequelae may include pain, impairment, loss of function, inability to access community resources, and change in family status. Given that there may be very many determinants of Behavior, one approach to the research is to match as closely as possible on potential determinants identified from the existing literature, measure to the extent possible the biopsychosocial status of the samples, and then perform analyses such as logistic regression to preliminarily identify the relative importance of the assessed variables. Direct comparisons of the identified significant variables, in terms of their impact on the outcome of interest, are then required in further research to confirm the role of the preliminarily identified causative agents.
Other challenges to the establishment of an association in neuropsychiatry may include the detection of the putative causative agent. Diagnostic issues and the limited sensitivity of existing technologies are sometimes important limiting factors here. Autopsy may be necessary for the final assignment of case status, but this clearly poses its own challenges. The assessment of outcome is another challenge because instruments validated for specific neuropsychiatric disorders remain relatively scarce (although this is changing rapidly). “Borrowing” instruments or diagnostic criteria from non–brain-injured populations is a common practice that may be vulnerable on many fronts. For example, the brain insult may either mimic or mask the production of signs and symptoms of the outcome. Parkinson's disease may mimic depression by producing slowing, or it may mask its expression by producing expressive aprosody through subcortical-frontal involvement. Maintaining blindness to causative agent status is also often very difficult, as the causative agent may produce visibly obvious signs, or a particular pattern of symptoms or sequelae, which allows the outcome assessor to determine causative agent status.
When should outcome status be measured? We know that some Behavioral outcomes are associated with different stages of the disease. A good example is the set of affective and behavioral changes seen over the course of Alzheimer's disease (AD). Who is the appropriate informant of outcome status? Cognitive impairment, including lack of insight, may limit the validity of the subjects' responses. But second-person informants may not truly know what the subjects are experiencing. Obviously there are many challenges to the demonstration of an association between brain pathology and Behavioral change.
Opportunities in neuropsychiatry include ancillary investigations of the status of the brain. Imaging and cognitive testing are two powerful modalities. Increasingly rigorous diagnostic criteria are also becoming available. Outcome measures such as the Neuropsychiatric Inventory⁴ are increasingly available, and because of their design features they are often relatively quick and easy to use. Increasing knowledge is also leading to a narrower range of possible confounding factors and is directing selection of appropriate control populations.
2. Consistency of the Evidence
This is a very important criterion for causation, and one that is particularly amenable to research in neuropsychiatry. If A causes B, then we can expect that evidence supporting this relationship will be found consistently. In neuropsychiatry, multiple approaches to the study of the brain (e.g., imaging, pathology, cognitive testing) are available, and they should all lead to consistent findings supporting the causative role of A in producing B. If findings are inconsistent, then it is very important to examine the possibility that methodological issues may be at the heart of the discrepant findings. If no reasonable explanation exists for the discrepancy, then even a single piece of inconsistent evidence seriously undermines the argument of causation. Consistency of the findings is a necessary criterion for causation in neuropsychiatry, and it implies that a breadth of research findings is available (i.e., an argument of causation will rarely be accepted on the basis of a single piece of evidence).
3. Specificity
This criterion stems from old beliefs related to a one-disease, one-outcome model of illness. Clearly this criterion does not hold even for infectious diseases or toxin exposures, in which multiple pathogens may produce the same set of symptoms, or in which a single pathogen may produce a number of outcomes. E. coli may produce urinary tract infections as well as infections of the gastrointestinal system, and both of these types of infections may also be produced by a number of other pathogens. Similar concerns limit the validity of this criterion in neuropsychiatry; that is, if specificity can be demonstrated, then this is additional support for causation, but if specificity is lacking, then this in no way detracts from the argument of causation.
4. Temporal Sequence
Clearly if A is causing B, then A should necessarily occur prior to B. This is a necessary criterion for establishing an argument of causation, but it is also one that is often difficult to establish in neuropsychiatry.
As with all health-related research, the only way to be certain about the temporal sequence is to conduct prospective studies, in which the samples are studied (for presence of the outcome) prior to the onset of the putative causative agent and then followed over time after the insult. Recall bias may lead an informant to report wellness prior to a brain insult, whereas there may well have been some premorbid Behavioral disturbance. Recall bias potentially affects all nonprospective health research, and it is often exacerbated in neuropsychiatric research by the presence of cognitive impairments affecting the ability to accurately recall pre-insult status. Another type of bias potentially affecting retrospective studies is attribution bias, in which the informant's need to understand the observed changes may falsely lead to the attribution of such change to events temporally associated with the change. Recall and attribution biases may sometimes be addressed to some extent by measuring premorbid status as soon as possible after the onset of the causative agent (ideally before the Behavioral changes are seen) and by using sources of data (e.g., medication history from computerized registries) that are less likely to be affected by these types of bias.
Whereas retrospective assessments are clearly limited by these issues, prospective studies are limited by feasibility issues related to having to follow large nondiseased samples, sometimes over long periods of time, until the onset of the putative causative agent. Selecting populations “at risk” can make this approach more feasible by reducing the sample size necessary, but it comes at the cost of limiting the generalizability of the results to other populations who do not exhibit the factor associated with the increase in risk.
It should be noted here that the presence of the outcome of interest in some of the subjects prior to the onset of the causative agent does not necessarily invalidate the establishment of a temporal sequence. For example, some individuals who develop poststroke depression may well have had prestroke depression. In this case, establishing a finding of new-onset cases or some new-onset feature of the outcome can add to the evidence of a temporal sequence.
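The new-onset approach can be sketched as a simple calculation. The cohort below is hypothetical and the class and function names are our own; the sketch assumes that pre-insult status was measured reliably, which, as discussed above, is itself a challenge. Subjects with the outcome before the insult are excluded from the at-risk group, and the risk of new-onset depression is then compared between exposed and unexposed subjects.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    exposed: bool            # had the insult (e.g., stroke)
    depressed_before: bool   # outcome already present before the insult
    depressed_after: bool    # outcome present at follow-up


def new_onset_risk(subjects, exposed):
    """Risk of the outcome among subjects who were free of it at baseline."""
    at_risk = [
        s for s in subjects
        if s.exposed == exposed and not s.depressed_before
    ]
    if not at_risk:
        raise ValueError("no subjects at risk in this group")
    return sum(s.depressed_after for s in at_risk) / len(at_risk)


# Hypothetical cohort, invented for illustration (not real data).
cohort = (
    [Subject(True, True, True)] * 10       # exposed, depressed beforehand
    + [Subject(True, False, True)] * 27    # exposed, new-onset depression
    + [Subject(True, False, False)] * 63   # exposed, never depressed
    + [Subject(False, True, True)] * 10    # unexposed, depressed beforehand
    + [Subject(False, False, True)] * 9    # unexposed, new-onset depression
    + [Subject(False, False, False)] * 81  # unexposed, never depressed
)

risk_exposed = new_onset_risk(cohort, exposed=True)
risk_unexposed = new_onset_risk(cohort, exposed=False)
risk_ratio = risk_exposed / risk_unexposed

print(f"new-onset risk, exposed:   {risk_exposed:.2f}")
print(f"new-onset risk, unexposed: {risk_unexposed:.2f}")
print(f"risk ratio:                {risk_ratio:.1f}")
```

Restricting the denominator to subjects who were free of the outcome at baseline is what allows pre-existing cases to coexist with a defensible temporal sequence.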
Establishing a temporal sequence also requires that we know when the causative agent first appeared, and this is often very challenging in neuropsychiatry. For example, we now have evidence that the pathophysiologic changes underlying AD may start well before the cognitive impairments become obvious.⁵ Late-onset depression, which occurs at a high rate in persons who later manifest AD,⁶ may appear, on the basis of the temporal sequence criterion, to be causative of AD. More likely, though, is the possibility that underlying brain changes of AD first produce the late-onset depression, and thereafter the cognitive dysfunction typical of AD. This temporal sequence is correct, but can be determined only if the appropriate measurements are made.
Although temporal sequence is necessary for an argument of causation, it is often very difficult to be certain it has been established, and doing so may require extensive research efforts.
5. Biological Gradient
Establishing a biological gradient is appropriate for outcomes secondary to insults such as infectious diseases or exposure to toxins, but it may not be appropriate, or may be difficult to establish, in neuropsychiatric conditions. Brain lesion size has been commonly used as a measure of the severity of the brain insult. However, we know that brain lesion size is not necessarily the only variable influencing the impact of the lesion on the function of the brain. Lesion location is also important, so that small (e.g., subcortical) lesions disrupting neuronal circuits may produce dysfunction of large (e.g., cortical) areas of the brain similar to that seen with large lesions that directly affect the dysfunctional area. Furthermore, very severe lesions may actually prevent expression of the outcome of illness. As a gross example, individuals in prolonged coma after very severe traumatic brain injury (TBI) are obviously unable to experience or report the subjective symptoms of depression.
What is the appropriate measure of severity of the insult to the brain? In TBI research the initial severity of the TBI, as assessed by depth and duration of coma or duration of posttraumatic amnesia, is often used. However, these variables relate primarily to disruption of the reticular activating system in the brainstem and may not reflect the degree of cortical involvement brought about by contusional injuries to the cortex. Degree of impairment or handicap may be indirect measures of severity of the insult, but both clearly may be influenced by non-brain factors, such as level of education in the case of cognitive impairment, or access to rehabilitation in the case of return-to-work status.
Choosing a narrow focus of severity may lead to the problem of missing important correlations or identifying misleading ones, since the variable assessed may not fully correlate with the severity of the insult. For example, cognitive dysfunction in Alzheimer's disease is often used as a measure of the severity of the disease, but we know that this dysfunction is largely attributable to dysfunction in certain brain systems and may not indicate the degree of dysfunction in other brain systems that might have important impacts on other aspects of Behavior. For most neuropsychiatric outcomes, it is likely that some measure of the degree of dysfunction in relevant brain regions will be the best measure of the “severity” of the insult to the brain. Because we are now increasingly able to measure brain function through imaging and testing paradigms, this criterion is more likely to be satisfied in the future. At present, however, given the existing challenges, the demonstration of a biological gradient should be considered supportive of an argument for causation in neuropsychiatry, but the absence of a biological gradient may not preclude the determination of a causative relationship.
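A brief sketch may clarify why the absence of a detectable gradient need not rule out causation. It uses the Cochran-Armitage test for trend in proportions, which is our choice of method for illustration and is not prescribed by the criteria; the counts, severity scores, and function name are hypothetical. The first data set shows a monotone gradient; the second shows an outcome that is most frequent at intermediate severity, as might occur if very severe insults prevent expression of the outcome.

```python
import math


def trend_test(cases, totals, scores=None):
    """Cochran-Armitage test for linear trend in proportions.

    cases[i] of totals[i] subjects at severity level i show the outcome.
    Returns (z, two-sided p-value from the normal approximation).
    """
    if scores is None:
        scores = range(len(totals))  # equally spaced scores 0, 1, 2, ...
    scores = list(scores)
    n_total = sum(totals)
    p = sum(cases) / n_total  # overall proportion with the outcome
    t = sum(s * (r - n * p) for s, r, n in zip(scores, cases, totals))
    var = p * (1 - p) * (
        sum(n * s * s for s, n in zip(scores, totals))
        - sum(n * s for s, n in zip(scores, totals)) ** 2 / n_total
    )
    z = t / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts (not real data): mild, moderate, severe insult.
totals = [100, 100, 100]

# Outcome becomes more frequent as severity increases.
z_monotone, p_monotone = trend_test([10, 20, 30], totals)

# Outcome peaks at moderate severity and falls again at the severe level.
z_inverted_u, p_inverted_u = trend_test([10, 30, 10], totals)

print(f"monotone gradient: z = {z_monotone:.2f}, p = {p_monotone:.4f}")
print(f"inverted-U shape:  z = {z_inverted_u:.2f}, p = {p_inverted_u:.4f}")
```

In the second data set the outcome clearly varies with severity, yet the linear trend statistic is essentially zero. A test that assumes "more insult, poorer outcome" would therefore miss the relationship entirely.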
6. Biologic Rationale
There is a greater likelihood of a causative relationship being present if it makes biological sense that A causes B. Whether or not it makes sense that a putative causative agent causes the outcome of interest is important to us as humans because we need to fit research findings into our understanding of our world and ourselves. But how we understand our world and ourselves is clearly a function of the state of our belief systems at present, and “reality” changes as new belief systems evolve. Although it once made sense that body fluids such as bile were determinants of Behavior, this does not make sense today.
Many of our perceptions about determinants of Behavior are influenced today by post-psychoanalytic thought and by cultural expectations (such as the expectation that we be “in control” of our emotions). Modern neuropsychiatry is once again shifting the focus back to biological processes as determinants of Behavior, but our judgment as to whether arguments of causation for changes in brain function affecting Behavior are valid will continue to be influenced by our preconceived notions of the world and ourselves. Interestingly, it seems that we are much more likely to accept the role of the brain in determining changes in cognitive function than we are to accept its role in determining changes in mood and behavior. It may be that such possibilities threaten us, in the sense of undermining our need for self-control. Whatever our reasons for not considering certain arguments of causation to be biologically plausible, we need to constantly remind ourselves that just because the argument doesn't make sense to us does not necessarily mean that it isn't true. A biologic rationale is necessary for establishing an argument for causation, but it may not be accepted by everyone in the here and now.
7. Coherence
This is similar to the biologic rationale criterion. It stipulates that there is a greater likelihood that A causes B if this postulated causal relationship is consistent with what is already known about the disease or disorder. Clearly the relevance of this criterion will depend to a large extent on the amount of knowledge that we have at the moment. As with the biologic rationale criterion, if this criterion is met, then it is supportive of an argument of causation; if not, then we may simply not yet know enough, or we may need to revisit that which we think we know.
8. Experimental Evidence
Experimental evidence is the most compelling evidence of causation. If it can be shown that experimentally (ideally randomly) inducing the causative agent consistently produces the outcome, at greater rates than in a nonexposed control sample, this is clear and compelling evidence of causation. However, it is obvious that such evidence will be rare in neuropsychiatry, as it is grossly unethical to induce most forms of brain dysfunction experimentally in humans. Transient alterations in brain function, such as with apomorphine or transcranial magnetic stimulation, are sometimes the exception to this ethical concern and may yield important results in the future. Experimental approaches are often applied to nonhuman species, but this practice is also increasingly considered to raise ethical concerns. Further, the nonhuman brain has important differences in brain structure and function that may mislead researchers investigating causation in humans.
Some experimental evidence in humans, however, may be forthcoming from results of treatment studies. Indeed, the dopamine hypothesis of schizophrenia was born from observations of response to treatment with dopamine-active agents such as chlorpromazine. The problem with this type of thinking is that conditions may respond to a treatment that does not necessarily address the causative agent. For example, few believe that headaches are caused by an absence of aspirin, despite the fact that headaches may decrease with aspirin. While there is no “hypoaspirinemia” theory of headaches, this type of experimental evidence may provide important leads into causative relationships. The role of prostaglandins in the formation of pain responses is an example that flows from the observation that aspirin relieves pain.
These limitations on the use of experimental evidence limit the utility of this criterion for causation in neuropsychiatry, rendering it a helpful but not a necessary criterion at present.
9. Analogous Evidence
This approach takes the form of thinking that if some condition similar to A causes an outcome similar to B, then this is evidence that A causes B. While analogous evidence is helpful, there are clearly major limitations to this approach in neuropsychiatry. Although different types of insults to the brain may share certain features, they usually have important differences as well. Furthermore, the nature of the lesion may influence the expression of the outcome of interest in important ways. For example, even within the realm of cerebrovascular disorders, important differences in outcome may depend on lesion location. The same loss of interest that is associated with depression in posterior strokes may represent the presence of an apathy state, and not depression, in strokes involving frontal systems. Hence analogous evidence needs to be treated with caution in neuropsychiatry, and is not a necessary criterion for causation.
SUMMARY
This review illustrates the potential value of applying rigorous criteria to the establishment of arguments of causation in neuropsychiatry, as well as the challenges entailed in their use. It is hoped that this review will facilitate and encourage the use of these criteria in neuropsychiatry, and in so doing allow us to be more certain of the strengths and the limits of our arguments of causation. Considering these criteria, and using research options increasingly available to us in neuropsychiatry to address them, will also potentially allow for improved research planning and methodology in studies of causation in neuropsychiatry.
We suggest that some of Bradford Hill's criteria are more relevant, or more feasible, to use in neuropsychiatry than others. Demonstration of an association between the causative agent and the outcome, consistency of the findings, a biologic rationale, and the appropriate temporal sequence are all necessary criteria that are feasible to achieve (although not without potential difficulties). The biological gradient, coherence, analogous evidence, and specificity criteria are not necessarily appropriate for neuropsychiatry, but where demonstrable will add to the argument for causation. Experimental evidence is extremely compelling where available, but is limited by ethical concerns. Although some of these criteria have their limitations, systematically applying them is still likely to lead to greater rigor than nonsystematic arguments of causation.
This paper has addressed the need for rigorous criteria to establish causation in neuropsychiatry but has not yet addressed the process of establishing an argument of causation. How is it that we finally become convinced that A causes B? Ultimately we need to decide for ourselves, based on our review of the evidence. However, as humans we also value the opinions of others, especially “experts,” and often it is not until a consensus of expert opinion is available that we will accept an argument of causation. Caution is warranted, however, because even experts may be prone to bias. Perhaps nowhere is this more evident than in court, where experts from the two sides may argue with great passion, and often from the same set of facts, as to whether a particular lesion in the brain is responsible for a particular Behavior. Awareness of the criteria for establishing arguments of causation, and of the research strategies (with their limitations) available to address these criteria, lowers our risk of being misled by the “expertise” of others.