Estimating disease prevalence using census data. Epidemiology and Infection. Choy, M., Switzer, P., de Martel, C., Parsonnet, J. 2008; 136 (9): 1253-1260

Abstract

We describe a method of working on publicly available data to estimate disease prevalence in small geographic areas using Helicobacter pylori as a model infection. Using data from the Third National Health and Nutrition Examination Survey, risk parameters for H. pylori infection were obtained by logistic regression and validated by predicting 737.5 infections in an independent cohort with 736 observed infections. The prevalence of H. pylori infection in the San Francisco Bay Area was estimated with the probabilities obtained from a predictive logistic model, using risk parameters with individual-level 1990 U.S. Census data as input. Predicted H. pylori prevalence was also compared to gastric cancer incidence obtained from the Northern California Cancer Center and showed a positive correlation with gastric cancer incidence (P < 0.001, R² = 0.87), and no statistically significant association with other malignancies. By exclusively using publicly available data, these methods may be applied to selected conditions with strong demographic predictors.
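The estimation step described in the abstract can be illustrated with a minimal sketch. It assumes a fitted logistic model whose coefficients are applied to individual-level records, with the expected number of infections taken as the sum of the predicted probabilities and the prevalence in each area taken as their mean. The coefficient values, the covariates (`age_decades`, `foreign_born`), the `area` field, and the function names are all hypothetical and chosen for illustration; they are not the risk parameters or variables reported in the paper.

```python
import math
from collections import defaultdict

# Hypothetical coefficients for illustration only (NOT the paper's fitted values).
COEFFS = {"intercept": -2.0, "age_decades": 0.5, "foreign_born": 1.0}


def infection_probability(person):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = COEFFS["intercept"]
    z += COEFFS["age_decades"] * person["age_decades"]
    z += COEFFS["foreign_born"] * person["foreign_born"]
    return 1.0 / (1.0 + math.exp(-z))


def expected_infections(people):
    """Expected number of infections: the sum of individual probabilities."""
    return sum(infection_probability(p) for p in people)


def prevalence_by_area(people):
    """Estimated prevalence per area: the mean predicted probability."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p in people:
        totals[p["area"]] += infection_probability(p)
        counts[p["area"]] += 1
    return {area: totals[area] / counts[area] for area in totals}


# Made-up individual-level records standing in for census microdata.
people = [
    {"area": "A", "age_decades": 4, "foreign_born": 0},
    {"area": "A", "age_decades": 2, "foreign_born": 1},
    {"area": "B", "age_decades": 0, "foreign_born": 0},
]

print(expected_infections(people))
print(prevalence_by_area(people))
```

Summing probabilities instead of thresholding each individual gives an expected count that can be non-integer, which is consistent with the abstract's comparison of 737.5 predicted infections with 736 observed ones.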

DOI: 10.1017/S0950268807009752

Web of Science ID: 000259332400014

PubMed ID: 18047747