American Journal of Epidemiology
-
The correlation between objective and self-reported measures of physical activity varies between studies. We examined this association and whether it differed by demographic factors or socioeconomic status (SES). Data were from 3,975 Whitehall II (United Kingdom, 2012-2013) participants aged 60-83 years, who completed a physical activity questionnaire and wore an accelerometer on their wrist for 9 days. ⋯ Self-reported physical activity from more energetic activities was more strongly associated with accelerometer data (for sports, r = 0.22; for gardening, r = 0.16; for housework, r = 0.09). High-SES persons reported more energetic activities, producing stronger accelerometer associations in these groups. Future studies should identify the aspects of physical activity that are most critical for health; this involves better understanding of the instruments being used.
-
The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. ⋯ The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction.
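The counts reported above can be turned into rates with a few lines of arithmetic. The sketch below uses only the figures quoted in this abstract (65 reference-standard recurrences, 5 overlooked, 4 of those because documents were unavailable, and a 90% reduction in charts abstracted); the variable names are illustrative and not taken from the study.

```python
# Figures quoted in the abstract; variable names are illustrative assumptions.
total_recurrences = 65       # recurrences in the reference standard
missed = 5                   # recurrences the NLP-based system overlooked
missed_no_documents = 4      # of those, missed because electronic documents were unavailable
chart_reduction_pct = 90     # reported reduction in charts needing abstraction

# Share of reference-standard recurrences the system detected (sensitivity).
sensitivity = (total_recurrences - missed) / total_recurrences

# Share of recurrences lost, overall and excluding cases with no documents to process.
missed_fraction = missed / total_recurrences
missed_with_documents = missed - missed_no_documents

# A 90% reduction leaves 10% of charts, i.e. a 10-fold decrease in abstraction workload.
fold_decrease = 100 / (100 - chart_reduction_pct)

print(f"sensitivity: {sensitivity:.1%}")
print(f"missed: {missed_fraction:.1%} ({missed_with_documents} with documents available)")
print(f"workload: {fold_decrease:.0f}-fold decrease")
```

Under these figures, roughly 92% of recurrences are detected, and only 1 of the 5 misses occurred when electronic documents were actually available to the system.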
-
Historically, clinical epidemiologic research has been constrained by the costs and time associated with manually identifying cases and abstracting clinical data. In this issue, Carrell et al. (Am J Epidemiol. 2014;179(6):749-758) report on their impressive success using natural language processing techniques to correctly identify cases of cancer recurrence among women with previous breast cancer. They report a 10-fold decrease in the need for chart abstraction, though with an 8% loss in case detection. This commentary outlines some recent history associated with the development of "high-throughput clinical phenotyping" of electronic health records and speculates on the impact such computational capabilities may have for observational research and patient consent.
-
Unobserved confounding can seldom be ruled out with certainty in nonexperimental studies. Negative controls are sometimes used in epidemiologic practice to detect the presence of unobserved confounding. An outcome is said to be a valid negative control variable to the extent that it is influenced by unobserved confounders of the exposure effects on the outcome in view, although not directly influenced by the exposure. ⋯ In this paper, we go beyond the use of control outcomes to detect possible unobserved confounding and propose to use control outcomes in a simple but formal counterfactual-based approach to correct causal effect estimates for bias due to unobserved confounding. The proposed control outcome calibration approach is developed in the context of a continuous or binary outcome, and the control outcome and the exposure can be discrete or continuous. A sensitivity analysis technique is also developed, which can be used to assess the degree to which a violation of the main identifying assumption of the control outcome calibration approach might impact inference about the effect of the exposure on the outcome in view.
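To make the idea of calibrating on a control outcome concrete, here is a minimal simulation sketch of one simple linear version. It rests on assumptions this abstract does not spell out: a constant additive exposure effect `psi`, so that the exposure-free outcome is `Y - psi * A`, and a control outcome `N` that is independent of the exposure `A` given that exposure-free outcome. Under those assumptions, an ordinary least squares regression of `N` on `A` and `Y` gives `psi = -beta_A / beta_Y`. The function name `coca_linear` and all simulation parameters are hypothetical.

```python
import numpy as np

def coca_linear(a, y, n):
    """Hypothetical helper: estimate psi as -beta_A / beta_Y from OLS of N on (1, A, Y).

    Assumes a constant additive effect, so Y(0) = Y - psi * A, and that N is
    independent of A given Y(0). These are illustrative assumptions.
    """
    design = np.column_stack([np.ones_like(a), a, y])
    beta, *_ = np.linalg.lstsq(design, n, rcond=None)
    return -beta[1] / beta[2]

# Simulated data in which the exposure-free outcome y0 drives both A and N.
rng = np.random.default_rng(0)
size = 200_000
true_psi = 1.0
y0 = rng.normal(size=size)              # exposure-free outcome (unobserved)
a = 0.8 * y0 + rng.normal(size=size)    # exposure, confounded through y0
y = y0 + true_psi * a                   # observed outcome
n = 0.5 * y0 + rng.normal(size=size)    # control outcome, not affected by A

naive_slope = np.polyfit(a, y, 1)[0]    # unadjusted regression of Y on A
psi_hat = coca_linear(a, y, n)          # calibrated estimate

print(f"true effect: {true_psi:.2f}")
print(f"naive slope: {naive_slope:.2f}")
print(f"calibrated estimate: {psi_hat:.2f}")
```

In this simulated setting the naive slope is biased upward by the confounding, while the calibrated estimate should land close to the true effect. If the independence assumption fails, the calibrated estimate is biased as well, which is the situation the sensitivity analysis described above is meant to probe.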