Abstract: Objectives. We explore how misclassification in disease status can distort the exposure–disease association in a study with dichotomous disease and exposure status. Methods. We define the difference in population odds ratios between populations with and without disease misclassification as population-level bias and derive the bias as a function of sensitivity and specificity for observed disease status. The magnitude and direction of bias can be elucidated through analytic derivations, as illustrated with numerical examples. Results. Patterns of bias exist not only for nondifferential misclassification but also for some differential misclassification scenarios. We provide conditions, defined in terms of sensitivity and specificity, that correspond to each pattern of bias. Conclusions. Caution is needed in interpreting results when misclassification is present. Our findings can be used to assess the effects of disease misclassification in a population when sensitivity and specificity are known or can be estimated.

In epidemiological and clinical studies, we are often interested in the association between a dichotomous exposure and a dichotomous health outcome such as disease status. However, misclassification is often present in these measures when the gold standard assessment is too expensive to apply and a more affordable but less accurate assessment is used instead. For example, misclassification of disease status is likely to occur when psychiatric disorder status is assessed through self-reported surveys instead of in-person clinical diagnosis. Likewise, misclassification of exposure status is likely to occur when individual exposure to air pollution is assessed by measurements recorded at neighborhood monitoring stations rather than by personal monitoring devices. Misclassification can alter the odds ratio (OR) that measures the exposure–disease association in a population.
This difference can sometimes present significant problems in drawing conclusions about the nature and strength of the exposure–disease association, because the direction of the deviation is unclear and the magnitude of the deviation can be large. Here we focus on the impact of disease misclassification on the exposure–disease relationship when the exposure category is correctly classified.

Two types of disease misclassification can arise in an exposure–disease association study: nondifferential and differential. Nondifferential misclassification occurs when neither sensitivity nor specificity for disease classification varies by exposure category. By contrast, differential misclassification occurs when misclassification of disease status varies by exposure category [1,2]. It is usually believed that nondifferential misclassification in either exposure or disease status results in an estimate that has the same sign as the true association but reduced magnitude, unless the misclassification is so severe that the estimate switches over to the opposite side of the null [3–9]. However, differential misclassification can have effects with indeterminate direction [6]: away from the null, toward the null, or even switched to the opposite side of the null. It is unclear what conditions cause specific deviations. Chyou studied patterns of effects in the OR estimation attributable to differential misclassification by case–control status in a case–control study, with limited numerical examples [10]. However, conclusions based on limited numerical examples may be sensitive to the conditions chosen for the study. Thus it is desirable to use analytic derivation to examine the pattern of misclassification effects in the exposure–disease association, especially when differential misclassification occurs. We focus on the difference in a population parameter, the OR, between populations with and without disease misclassification, referred to as population-level bias.
This population-level bias is different from the bias of an estimator, which represents the difference between an estimator’s expectation and the true value of the parameter being estimated. For sample-based estimation, the estimators are consistent for the corresponding population parameters; thus the patterns of bias for the sample estimators are asymptotically the same as the patterns for the population parameters. We focus on population parameters without estimation error and refer to population-level bias simply as bias.
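
To make the definition concrete, the following is a minimal Python sketch of population-level bias under disease misclassification with correctly classified exposure. It is an illustration under stated assumptions, not the derivation used in this paper: the function names, the parameterization by true disease probabilities in the exposed and unexposed groups (`p1`, `p0`), and the sign convention (misclassified OR minus true OR) are all choices made here. The only relationship it relies on is the standard one, that the probability of being observed as diseased equals sensitivity times the true disease probability plus (1 − specificity) times the true non-disease probability.

```python
def odds(p):
    """Odds corresponding to a probability p in (0, 1)."""
    return p / (1.0 - p)


def observed_prob(p, se, sp):
    """Probability of being classified as diseased, given the true disease
    probability p, sensitivity se, and specificity sp."""
    return se * p + (1.0 - sp) * (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Population odds ratio for disease, exposed versus unexposed."""
    return odds(p_exposed) / odds(p_unexposed)


def population_bias(p1, p0, se1, sp1, se0, sp0):
    """Misclassified OR minus true OR (sign convention assumed here).

    p1, p0   : true disease probabilities in exposed / unexposed (hypothetical inputs)
    se1, sp1 : sensitivity and specificity among the exposed
    se0, sp0 : sensitivity and specificity among the unexposed
    Nondifferential misclassification is the case se1 == se0 and sp1 == sp0.
    """
    or_true = odds_ratio(p1, p0)
    or_obs = odds_ratio(observed_prob(p1, se1, sp1),
                        observed_prob(p0, se0, sp0))
    return or_obs - or_true


if __name__ == "__main__":
    p1, p0 = 0.2, 0.1  # illustrative values only, not taken from the paper
    print("true OR:", odds_ratio(p1, p0))
    # Nondifferential: same sensitivity and specificity in both exposure groups.
    print("nondifferential bias:", population_bias(p1, p0, 0.8, 0.9, 0.8, 0.9))
    # Differential: sensitivity and specificity differ by exposure group.
    print("differential bias:", population_bias(p1, p0, 0.9, 0.9, 0.8, 0.95))
```

With these illustrative inputs the true OR is 2.25. The nondifferential setting gives a misclassified OR of about 1.54, so the bias is negative and the OR moves toward the null. The differential setting gives a misclassified OR of about 2.46, so the OR moves away from the null. These two cases only illustrate that both directions are possible; they do not characterize the general conditions.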