Database and other sources for population-based research
Using secondary datasets for population-level epidemiologic, outcomes, and health services research can be an effective, resource-efficient way to conduct high-quality research. The following tables provide the name, sources, most precise level of geographical classification, most recent date of presentation, and a brief description of data that can be used in such research. The databases and datasets range from registries to survey data to government databases, and can be used to research health behaviors, health care utilization, disease incidence, and other health-related issues. The name of each database or dataset is hyperlinked to the source's website, which contains more information on that source. Sources are grouped into five overarching categories: the Centers for Disease Control and Prevention (CDC), Agency for Healthcare Research and Quality (AHRQ), National Cancer Institute (NCI), Substance Abuse and Mental Health Services Administration (SAMHSA), and all other sources. These other sources include the state of Illinois and Medicare, among others. This list is not meant to be exhaustive, but to provide a reasonably extensive list of potential sources for secondary database research. Obtaining some of these sources may involve a cost, an application process, or both.
For assistance in accessing and using these data, as well as Community Health and Demographic Sources (CHADS), please contact:
Albert Botchway, PhD: statistics@siumed.edu (217) 545-3611
Steven Scaife, MS: sscaife@siumed.edu (217) 545-6949
For an introductory guide to developing and conducting research using secondary datasets, please read “Conducting High-Value Secondary Dataset Analysis: An Introductory Guide and Resources” by Dr. Alexander K. Smith and colleagues, which is available here.
| Data Source Name | Most Precise Level of Geography | Data Element Categories |
|---|---|---|
| | National | Diseases and conditions, nutrition monitoring, environmental exposures monitoring, children's growth and development, infectious disease monitoring, etc. |
| | National | Provider characteristics and patient characteristics (including demographics, diagnoses, medications) |
| | National | Diagnosis, payment, and admission type |
| | National | Recruitment, job satisfaction, training, job history, demographics |
| | National | Contraception, sterilization, teenage sexual activity and pregnancy, family planning and unintended pregnancy, infertility, adoption, breastfeeding, marriage, divorce, cohabitation, fatherhood involvement, HIV risk behavior |
| | State (or more specific) | Birth rates, birthweight, teen and nonmarital pregnancy, pregnancy outcomes, method of delivery, preterm delivery, multiple births, infant mortality, life expectancy, causes of death, occupational mortality |
| | National | Background information, services offered, staff profile, resident profile, record keeping |
| | National | Immunization status of preschool children and adolescents, demographics, family resources, health care utilization, barriers to care |
| | State | Immunization status of teens aged 13-17, demographics, family resources, health care utilization, barriers to care |
| | Region/MSA | Patients' symptoms, physicians' diagnoses, and medications ordered or provided. The survey also provides statistics on the demographic characteristics of patients and services provided, including information on diagnostic procedures, patient management, and planned future treatment. |
| | State | Current/former asthma status, doctor visits, asthma management, medication use, lifestyle effects |
| | County | Geographic location, age, race, gender, ICD code for underlying cause of death |
| | State | Demographics, health behaviors, chronic disease presence |
| | State and metro area | Access, disparities, health care, prescription drugs, expenditures, mental health, obesity |
| | 44 states and 1000+ hospitals | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics |
| | 29 million records and 964 hospitals in 29 states (including IL) | Primary and secondary diagnoses, discharge status, patient demographics, payment source, ED charges, hospital characteristics |
| | 44 states, inpatient data; 4,100+ hospitals | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics |
| | 44 states, inpatient hospital data | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics |
| | 28 states ambulatory surgery database: some | 2009 Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics |
| | 27 states; some hospital identification when linked with ARF | 2009 Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics |
| | County or city, depending on location | Cancer incidence, type, staging, and survival from 18 registries (constituting 28% of the US population) around the country. Cancer mortality data are available for the entire country. |
| | County or city, depending on location | Clinical, demographic, cause of death, and Medicare claim information for cancer patients |
| | County or city, depending on location | Clinical, demographic, cause of death, and health-related quality of life of cancer patients with Medicare |
| | Study center | Data from the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, including screening data, results, and patient-related data |
| | Regional | Health information sources, prevention behaviors, cancer knowledge, etc. |
| | National | Demographics, alcohol, tobacco, and illegal drug use |
| | State | Ownership, services offered, types of treatment, number of clients and beds, programs offered, medications prescribed and dispensed |
| | Metropolitan/Micropolitan Area | Client characteristics, service setting, prior treatment, substances abused |
| | Zip code level | Data on cancer type, stage, gender, race, incidence, mortality, etc. The more local the data, the less specific the data. |
| | State level | Decision making about services; coordinated, ongoing comprehensive care; adequate insurance; early and continuous screening; availability of community-based services; receipt of services to make the transition into adult life |
| | State | Beneficiary-specific and provider-specific data, including claims and clinical data for varying types of care, including inpatient, outpatient, hospice, home health, and skilled nursing facility |
| | State | Routine care, specialist care, dental, prescriptions, mental health care, health insurance, accessibility |
| | County | Education and training in nursing, professional nursing certifications, education and workforce participation prior to becoming a registered nurse, current and recent workforce participation, income, demographic characteristics |
| | Census Division | Patient characteristics, facility characteristics, staging, treatment and outcome data |