Synthesis of HRS-SSA linked data

Award Information
Agency: Department of Health and Human Services
Branch: N/A
Contract: 1R41AG029756-01
Agency Tracking Number: AG029756
Amount: $99,986.00
Phase: Phase I
Program: STTR
Awards Year: 2007
Solicitation Year: 2007
Solicitation Topic Code: N/A
Solicitation Number: N/A
Small Business Information
DUNS: 141050547
HUBZone Owned: N
Woman Owned: Y
Socially and Economically Disadvantaged: Y
Principal Investigator
Phone: (607) 330-5743
Business Contact
Phone: (607) 257-4673
Research Institution
ITHACA, NY 14853-2300
 Nonprofit college or university
DESCRIPTION (provided by applicant): The Health and Retirement Study is one of the world's most important data resources for the study of aging. The basic longitudinal survey instrument has been supplemented with data from a variety of other sources including Social Security Administration records containing the detailed earnings history of the respondent. Under current HRS protocols, the use of the SSA data is restricted. Investigators must make special security arrangements to obtain these files, which they cannot redistribute once they complete their analyses. These protocols are necessary to preserve the confidentiality of the underlying HRS and SSA micro data, which is essential to the continued willingness of respondents to participate in the study, but they severely limit the usefulness of the SSA data. New statistical disclosure limitation methods have been developed that promise to provide much of the information in confidential micro data in a manner that permits much wider dissemination and use of the protected data. This project is a Phase I feasibility study of applying these new methods, called synthetic data, to a subset of the variables in the SSA records that link to the general-use RAND-HRS data. The project has three main components: (1) port a general data synthesizer that was developed at the U.S. Census Bureau for use with SSA data linked to the Survey of Income and Program Participation for adaptation to the HRS/SSA link; (2) synthesize a few variables from the HRS/SSA link and test their usefulness in statistical modeling; (3) perform studies of the statistical disclosure risk associated with linking synthetic SSA data to the RAND-HRS general-release file. If the confidentiality-protected data prove scientifically useful and if the statistical disclosure risk can be controlled, then Phase II of the research would synthesize the entire HRS/SSA data link. The project scientists will work with the HRS Data Release Protocol Committee and SSA to develop appropriate certifications of the statistical disclosure avoidance provided by the methods.
Project Narrative: Many users of the Health and Retirement Study general-release data files would benefit from some access to the restricted-release files without the requirement for special security arrangements. Critical variables on the restricted-release data from the Social Security Administration earnings histories will be confidentiality-protected using powerful new methods that preserve privacy while allowing many important statistical analyses to be performed combining the protected SSA and general-release HRS data. If the demonstration is successful, the methods will be extended to the entire HRS/SSA linked data in a future project.

* Information listed above is at the time of submission. *
