Databases and other sources for population-based research

Using secondary datasets for population-level epidemiologic, outcomes, and health services research can be an effective, resource-efficient way to conduct high-quality research. The following tables provide the name, source, most precise level of geographical classification, most recent date of presentation, and a brief description of data that can be used in such research. The databases and datasets range from registries to survey data to government databases, and can be used to study health behaviors, health care utilization, disease incidence, and other health-related issues. The name of each database or dataset is hyperlinked to the source’s website, which contains more information on that source. Sources are grouped into five overarching categories: the Centers for Disease Control and Prevention (CDC), the Agency for Healthcare Research and Quality (AHRQ), the National Cancer Institute (NCI), the Substance Abuse and Mental Health Services Administration (SAMHSA), and all other sources, which include the state of Illinois and Medicare, among others. This list is not meant to be exhaustive, but to provide a reasonably extensive set of potential sources for exploring secondary database research. Obtaining some of these sources may involve a cost and/or an application process.

For assistance in accessing and utilizing these data, as well as Community Health and Demographic Sources (CHADS), please contact:

Albert Botchway, PhD: statistics@siumed.edu, (217) 545-3611

Steven Scaife, MS: sscaife@siumed.edu, (217) 545-6949

For an introductory guide to developing and conducting research using secondary datasets, please read “Conducting High-Value Secondary Dataset Analysis: An Introductory Guide and Resources” by Dr. Alexander K. Smith and colleagues, which is available here.

 
Data Source Name | Most Precise Level of Geography | Data Element Categories
National Health and Nutrition Examination Survey (NHANES) | National | Diseases and conditions, nutrition monitoring, environmental exposures monitoring, children’s growth and development, infectious disease monitoring, etc.
National Ambulatory Medical Care Survey | National | Provider characteristics and patient characteristics (including demographics, diagnoses, medications)
Hospital Discharge Survey | National | Diagnosis, payment, and admission type
National Home Health Aide Survey | National | Recruitment, job satisfaction, training, job history, demographics
National Survey of Family Growth | National | Contraception, sterilization, teenage sexual activity and pregnancy, family planning and unintended pregnancy, infertility, adoption, breastfeeding, marriage, divorce, cohabitation, fatherhood involvement, HIV risk behavior
National Vital Statistics System | State (or more specific) | Birth rates, birthweight, teen and nonmarital pregnancy, pregnancy outcomes, method of delivery, preterm delivery, multiple births, infant mortality, life expectancy, causes of death, occupational mortality
National Study of Long-Term Care Providers | National | Background information, services offered, staff profile, resident profile, record keeping
National Immunization Survey | National | Immunization status of preschool children and adolescents, demographics, family resources, health care utilization, barriers to care
National Immunization Survey-Teen | State | Immunization status of teens aged 13-17, demographics, family resources, health care utilization, barriers to care
National Survey of Ambulatory Surgery | Region/MSA | Data are obtained on patients' symptoms, physicians' diagnoses, and medications ordered or provided. The survey also provides statistics on the demographic characteristics of patients and services provided, including information on diagnostic procedures, patient management, and planned future treatment.
Asthma Call Back Survey | State | Current/former asthma status, doctor visits, asthma management, medication use, lifestyle effects
Compressed Mortality File | County | Geographic location, age, race, gender, ICD code for underlying cause of death
Behavioral Risk Factor Surveillance System | State | Demographics, health behaviors, chronic disease presence
Medical Expenditure Panel Survey | State and metro area | Access, disparities, health care, prescription drugs, expenditures, mental health, obesity
Nationwide Inpatient Sample (NIS) | 44 states and 1,000+ hospitals | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, length of stay (LOS), hospital characteristics
Nationwide Emergency Department Sample | 29 million records and 964 hospitals in 29 states (including IL) | Primary and secondary diagnoses, discharge status, patient demographics, payment source, ED charges, hospital characteristics
Kids’ Inpatient Database | 44 states' inpatient data; 4,100+ hospitals | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, Hospital
State Inpatient Database | 44 states' inpatient hospital data | Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics
State Ambulatory Surgery Database | 28 states ambulatory surgery database: some | (2009) Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics
State Emergency Department Databases | 27 states; some hospital identification when linked with ARF | (2009) Primary and secondary diagnosis, admission/discharge statuses, demographics, payment source, charges, LOS, hospital characteristics
Surveillance, Epidemiology, and End Results (SEER) | County or city, depending on location | Cancer incidence, type, staging, and survival from 18 registries (constituting 28% of the US population) around the country. Cancer mortality data are available for the entire country.
SEER-Medicare Linked Database | County or city, depending on location | Clinical, demographic, cause of death, and Medicare claim information for cancer patients
SEER-Medicare Health Outcomes Survey | County or city, depending on location | Clinical, demographic, cause of death, and health-related quality of life of cancer patients with Medicare
Cancer Data Access System | Study center | Data from the National Lung Screening Trial (NLST) and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial, including screening data, results, and patient-related data
Health Information National Trends Survey | Regional | Health information sources, prevention behaviors, cancer knowledge, etc.
National Survey on Drug Use and Health | National | Demographics; alcohol, tobacco, and illegal drug use
National Survey of Substance Abuse Treatment Services | State | Ownership, services offered, types of treatment, number of clients and beds, programs offered, medications prescribed and dispensed
Treatment Episode Data-Admissions | Metropolitan/Micropolitan Area | Client characteristics, service setting, prior treatment, substances abused
Illinois State Cancer Registry | ZIP code level | Data on cancer type, stage, gender, race, incidence, mortality, etc. The more local the data, the less specific the data.
National Survey of Children with Special Health Care Needs | State level | Decision making about services; coordinated, ongoing, comprehensive care; adequate insurance; early and continuous screening; availability of community-based services; receipt of services to make the transition into adult life
Medicare/Medicaid | State | Beneficiary-specific and provider-specific data, including claims and clinical data for varying types of care, including inpatient, outpatient, hospice, home health, and skilled nursing facility
Survey of Adult Transition and Health | State | Routine care, specialist care, dental, prescriptions, mental health care, health insurance, accessibility
National Sample Survey of Registered Nurses | County | Education and training in nursing, professional nursing certifications, education and workforce participation prior to becoming a registered nurse, current and recent workforce participation, income, demographic characteristics
National Cancer Database | Census Division | Patient characteristics, facility characteristics, staging, treatment, and outcome data