Sample Descriptions

HRS Sample Description

The HRS core sample was an integrated sample of the U.S. household population drawn with a multi-stage area probability sample design. Sampling involved four distinct selection stages, consistent with the general sample design framework and sampling procedures of the SRC National Sample (Heeringa, Connor, and Darrah, 1984). The primary stage sampling units (PSUs) were selected with probability proportionate to size (PPS) from U.S. Metropolitan Statistical Areas (MSAs) and non-MSA counties, based on the county-level 1980 Census Reports of Population and Housing and 1980 SMSA definitions. Second stage sampling units (SSUs) were selected within PSUs from computerized files prepared from the 1990 Census. These SSUs comprised Census blocks or groups of blocks. For the third stage of sample selection, a complete listing was made of all housing units (HUs) physically located within the bounds of the SSU. Within this multi-stage design, HUs were selected at a sampling rate within each SSU that was inversely proportional to the PPS probabilities used to select the PSU and SSU. The number of HUs selected per SSU was based on the expected occupancy rate, the screening required to find age-eligible households, and the expected response rate. The fourth stage was the selection of the household financial unit within a sample HU. A household informant was asked the year of birth of any person in the housing unit aged 50 to 62. A person born between 1931 and 1941 was eligible to be interviewed for the HRS. If only one person was eligible, or if there were two eligible persons who were married/partnered, that person or couple was selected as the financial unit. If more than one unmarried/unpartnered person was eligible, a single financial unit was selected using an objective procedure described by Kish (1965).
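
The inverse relationship between the within-SSU sampling rate and the earlier-stage PPS probabilities is what gives every housing unit the same overall chance of selection. The sketch below works through that arithmetic. The target rate and the PSU/SSU probabilities are illustrative assumptions, not HRS design parameters, and the function name is hypothetical.

```python
from fractions import Fraction

def within_ssu_rate(overall_rate, p_psu, p_ssu):
    """Housing-unit sampling rate within an SSU that yields `overall_rate`.

    Overall probability = p_psu * p_ssu * within-SSU rate, so the
    within-SSU rate is the target divided by the two earlier-stage
    probabilities (that is, inversely proportional to them).
    """
    rate = Fraction(overall_rate) / (Fraction(p_psu) * Fraction(p_ssu))
    if rate > 1:
        raise ValueError("target rate is not attainable in this SSU")
    return rate

# Illustrative values only (assumptions, not HRS parameters).
target = Fraction(1, 2000)
ssus = {
    "A": (Fraction(1, 10), Fraction(1, 20)),   # (PSU prob., SSU prob.)
    "B": (Fraction(1, 5), Fraction(1, 100)),
}

# Multiplying the three stage probabilities recovers the same overall rate.
overall = {
    name: p_psu * p_ssu * within_ssu_rate(target, p_psu, p_ssu)
    for name, (p_psu, p_ssu) in ssus.items()
}
```

In this example SSU A needs a within-SSU rate of 1/10 and SSU B a rate of 1/4, yet both give each housing unit an overall probability of 1/2000.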

In addition to the core sample, the HRS design includes three oversamples. These are designed to increase the number of Black and Hispanic HRS respondents as well as the number of HRS respondents who are residents of the state of Florida. 

In addition to the HRS core sample (called the HRS cohort), several further cohorts have been added to supplement the HRS cohort and refresh the total sample. The AHEAD cohort included persons born before 1924. This cohort was initially part of a separate study (The Study of Assets and Health Dynamics Among the Oldest Old), which was conducted in 1993 and 1995 and then incorporated into the HRS in the 1998 wave. In 1998 the HRS added the Children of Depression (CODA) cohort, which included persons born between 1924 and 1930, and the War Baby (WB) cohort, which included persons born 1942 to 1947. In 2004 the HRS added the Early Baby Boomer (EBB) cohort, which included persons born 1948 to 1953. In 2010 the HRS added the Mid Baby Boomer (MBB) cohort, which included persons born 1954 to 1959.
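
The cohort definitions can be read as a lookup from birth year to entry cohort. The sketch below is a minimal illustration based on birth year alone. The table and function names are hypothetical, the AHEAD range is assumed to cover all birth years before 1924, and the sketch does not attempt to reproduce how the study assigns a cohort to spouses or partners.

```python
# Birth-year ranges for each cohort; None marks an open-ended lower bound.
# Treating AHEAD as "any birth year before 1924" is an assumption here.
COHORT_BIRTH_YEARS = [
    ("AHEAD", None, 1923),
    ("CODA", 1924, 1930),
    ("HRS", 1931, 1941),
    ("WB", 1942, 1947),
    ("EBB", 1948, 1953),
    ("MBB", 1954, 1959),
]

def cohort_for_birth_year(birth_year):
    """Return the cohort label for a birth year, or None if out of range."""
    for name, first, last in COHORT_BIRTH_YEARS:
        if (first is None or birth_year >= first) and birth_year <= last:
            return name
    return None
```

For example, `cohort_for_birth_year(1945)` falls in the War Baby range, while a birth year after 1959 matches none of the cohorts listed here.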

MHAS Sample Description

The original MHAS sample was designed with a probabilistic, stratified cluster approach using the National Survey of Employment (ENE) for the quarter October-December 2000. If there was more than one age-eligible person in a selected household, one person was selected at random. The original MHAS sample also included an oversample of individuals from high-migration states, that is, states with a high proportion of persons who were migrants to the United States. 

In 2012 a refreshment sample was added to the MHAS. This refreshment sample included persons 50 to 60 years old, selected from the National Occupation and Employment Survey (ENOE) in the second quarter of 2012. The sample selection involved three stages: the selection of primary sampling units (PSUs), the selection of households within selected PSUs, and the random selection of one age-eligible person within each selected household. 

ELSA Sample Description

The original ELSA sample was selected from three years of the Health Survey for England (HSE): 1998, 1999, and 2001. ELSA used the core samples for these years, all of which were nationally representative. Households were removed from the HSE sampling frame for ELSA Wave 1 if it was known that there was no adult aged 50 years or older in the household who had agreed to be recontacted at some time in the future. Individuals in the remaining households provided the basis for the ELSA Wave 1 sample (11,578 households containing 18,813 eligible individuals). 

The ELSA sample has been refreshed at three waves of data collection (Wave 3, Wave 4, and Wave 6) to make the sample representative of all age groups. Wave 3 included a refreshment sample of people aged between 50 and 53. This sample included new people from HSE 2001 to 2004 who were previously too young to join ELSA (or become an ELSA core member) in 2002, but who were now aged 50 or over (i.e. people aged 50 to 53 and their partners). At Wave 4 the ELSA sample was further refreshed across a wider age range of 50 to 74 years. This refreshment sample included new people from HSE 2006 and their partners. At Wave 6, a refreshment sample of respondents from HSE 2009, 2010, or 2011 aged between 50 and 55 years was included.

SHARE Sample Description

Because of its unique multi-country design, SHARE uses a mixture of sampling designs based on what was available in each country. The original SHARE sample included persons from Austria, Belgium, Denmark, France, Germany, Greece, Israel, Italy, The Netherlands, Spain, Sweden, and Switzerland. Both Denmark and Sweden used stratified simple random sampling from national population registers. Germany, Italy, Spain, and The Netherlands used multi-stage sampling from regional/local population registers. Austria, Greece, and Switzerland used single or multi-stage sampling from telephone directories followed by field screening. 

Austria, specifically, used a three-stage sampling design. The first stage used a list of municipalities and political districts in areas where interviewers were located, stratified by the combination of nine regions and three population size groups. The second stage selected telephone numbers in the selected locations using a CD-ROM containing all registered telephone numbers. If a business phone number was selected, the next private number on the list was selected instead. The third stage screened each selected number for age-eligibility. All age-eligible households were included in the sample.

Denmark, specifically, used simple random sampling of households from a family (household) register created by Statistics Denmark from the Danish Civil Registration System.

France, specifically, used a master sample of dwellings, which was a subsample of the 1999 census plus a list of new dwellings built since the 1990 census. The master sample was stratified by region and degree of urbanization (divided into four categories). In urban units with more than 20,000 inhabitants, SHARE used two-stage sampling in which districts were first selected within the selected unit. In all other units, households were sampled from all households in the unit. The final stage used systematic sampling with equal probabilities, except in rural strata, where SHARE first selected a sample of counties from the primary unit before the final sampling. 

Germany, specifically, used a two-stage sampling approach. In the first stage, municipalities were classified by district and size using a list of all 13,416 German municipalities. From each combination, one municipality was chosen with probability proportional to population size. In the second stage, 80 persons were chosen from each selected municipality using municipal population and address lists giving the addresses of people born in 1953 or earlier. For the main study, 27 of these persons were selected by simple random sampling without replacement.
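
The two German stages can be sketched as a single PPS draw within a district-by-size cell followed by a simple random subsample without replacement. This is only an illustration of the two techniques named above: the municipality names, populations, person identifiers, and function names are hypothetical, and the actual SHARE procedures may differ in detail.

```python
import random

def pick_municipality_pps(municipalities, rng):
    """Draw one municipality with probability proportional to population.

    `municipalities` is a list of (name, population) pairs for one cell.
    """
    names = [name for name, _ in municipalities]
    sizes = [population for _, population in municipalities]
    return rng.choices(names, weights=sizes, k=1)[0]

def draw_main_sample(address_draw, n_main, rng):
    """Simple random sample without replacement from the address draw."""
    return rng.sample(address_draw, n_main)

rng = random.Random(2004)  # arbitrary fixed seed so the sketch is repeatable

# Hypothetical cell with three municipalities.
cell = [("Town A", 12_000), ("Town B", 48_000), ("Town C", 20_000)]
chosen = pick_municipality_pps(cell, rng)

# 80 persons drawn from the register, of whom 27 go to the main study.
persons = [f"person_{i}" for i in range(80)]
main = draw_main_sample(persons, 27, rng)
```

Because only one municipality is drawn per cell, a single weighted draw is enough to give each municipality a selection probability proportional to its population.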

Greece, specifically, used stratified two-stage sampling. Starting with a list of all 54 Greek prefectures, the first stage selected phone numbers by simple random sampling without replacement within each prefecture using a computerized telephone directory. In the second stage, interviewers called all selected phone numbers to identify age-eligible households. All age-eligible households were interviewed.

Italy, specifically, used a three-stage stratified sampling design. In the first stage, municipalities were stratified by the size of the population aged 50 and over as of 2001 and by geographical location. The 11 largest municipalities were selected, and other municipalities were chosen from all strata by simple random sampling without replacement. In the second stage, electoral divisions were chosen inside the selected municipalities by simple random sampling without replacement from a list of electoral divisions from the Italian Ministry of Interior. The final stage was conducted in two phases. In the first phase, an equal number of men and women were selected by simple random sampling from gender-specific municipal electoral registers within the selected electoral division. In the second phase, once persons who were not age-eligible had been removed, men and women were selected in a 3/4 ratio by simple random sampling. 

The Netherlands, specifically, used two-stage sampling. In the first stage, municipalities were selected from a list of all 489 Dutch municipalities with probability proportional to the population born in 1954 or earlier, using 2003 population statistics. In the second stage, households were chosen by simple random sampling from the selected municipalities using local population registers prepared by the municipality to uniquely list persons born in 1954 or before.

Spain, specifically, used a two-stage sampling design. In the first stage, census districts were selected with probability proportional to total population using a list of all districts by municipality. In the second stage, 11 persons were selected from each selected census district by systematic sampling with a random start, using population registers of individuals born in 1954 or earlier based on census and municipal registers managed by the National Statistical Office.
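
Systematic sampling with a random start, as used in the second Spanish stage, can be sketched as follows. The register size and the function name are hypothetical, and the use of a fractional sampling interval is an assumption of this sketch rather than a documented detail of the SHARE fieldwork.

```python
import random

def systematic_sample(units, n, rng):
    """Select n units from an ordered list by systematic sampling.

    A random start is drawn within the first interval, and every
    following selection lies one interval further down the list.
    A fractional interval (len(units) / n) keeps the sample size at
    exactly n even when the list length is not a multiple of n.
    """
    total = len(units)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of units")
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    positions = (int(start + i * interval) for i in range(n))
    # Clamp guards against floating-point rounding at the very end.
    return [units[min(pos, total - 1)] for pos in positions]

# Hypothetical district register of 237 eligible persons.
register = list(range(237))
selected = systematic_sample(register, 11, random.Random(1))
```

Each run with a different random start yields a different but evenly spread set of 11 register positions.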

Sweden, specifically, used simple random sampling of persons from the population register NAVET of the Swedish tax authority, which includes all registered residents as of 2004 born in 1954 or earlier.

Switzerland, specifically, used simple random sampling without replacement to select telephone numbers from the telephone directory of Switzerland, stratified by the dominant language of the region so that there was a German language region, a French language region, and an Italian language region. Phone numbers were then screened for eligibility. 

In 2006 SHARE added three additional countries to its sample: Czech Republic, Poland, and Ireland. SHARE also added a refreshment sample in Belgium, Denmark, France, Germany, Greece, Israel, Italy, The Netherlands, Spain, Sweden, and Switzerland.

In 2010 SHARE added four additional countries to its sample: Hungary, Portugal, Slovenia, and Estonia. SHARE also added a refreshment sample in Austria, Belgium, Czech Republic, Denmark, France, Italy, The Netherlands, Spain, and Switzerland.

KLoSA Sample Description

The original KLoSA sample was selected as part of a stratified, multi-stage area probability design. The first component of this sampling framework was the probability proportional to size (PPS) systematic sampling of the 2005 South Korean Census enumeration districts, after stratifying by location (15 major metropolitan cities and provinces) and by characteristics of the district (urban or rural, and apartment building or non-apartment dwelling). Households were then selected within PSUs from a listing of households identified in the Census as age-eligible, that is, inhabited by at least one person 45 years of age or older.
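
PPS systematic sampling lays the units end to end according to their size measure and then steps through the cumulative total at a fixed interval from a random start. The sketch below shows that general technique for a single stratum. The district sizes and the function name are hypothetical, and this is not a reproduction of the KLoSA selection program.

```python
import random
from itertools import accumulate

def pps_systematic(sizes, n, rng):
    """Return the indices of n selections made by PPS systematic sampling.

    A unit whose size exceeds the sampling interval is selected with
    certainty and may be hit more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval              # random start in [0, interval)
    points = [start + i * interval for i in range(n)]
    bounds = list(accumulate(sizes))             # cumulative size totals
    chosen = []
    unit = 0
    for point in points:
        # Advance to the unit whose cumulative range contains the point.
        while unit < len(bounds) - 1 and bounds[unit] <= point:
            unit += 1
        chosen.append(unit)
    return chosen

# Hypothetical household counts for six enumeration districts in one stratum.
district_sizes = [120, 300, 80, 500, 250, 150]
selected_districts = pps_systematic(district_sizes, 3, random.Random(2005))
```

In this example the fourth district (500 households) is larger than the interval of about 467, so it is always included in the selection.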


In KLoSA, once it was determined that there was an age-eligible member of the household, all age-eligible household members were interviewed. As a result, there are households in the data in which more than one couple is interviewed.

The initial sample included 10,254 respondents aged 45 and over. The second wave was conducted in 2008 and included 8,688 respondents. The third wave was conducted in 2010 and included 7,920 respondents. The fourth wave was conducted in 2012 and included 7,486 respondents. There was no refresher sample in waves two through four.


JSTAR Sample Description

The baseline JSTAR sample included persons who lived in five municipalities in the eastern area of Japan. The municipalities are Takikawa city in Hokkaido, Sendai city in Miyagi Prefecture, Adachi ward in Tokyo, Kanazawa city in Ishikawa Prefecture, and Shirakawa town in Gifu Prefecture. Unlike other studies, JSTAR did not use nationally representative random sampling but chose to conduct stratified random sampling within each of the five chosen municipalities. JSTAR used a two-stage approach. After dividing the household registry data, which are sorted by address, into groups, first-stage locations were randomly selected for each municipality. The second stage involved randomly selecting 20 individuals from each selected location.

The second wave of JSTAR was conducted in 2009 and included the five original municipalities and two additional municipalities: Tosu city in Saga Prefecture and Naha city in Okinawa Prefecture.

The third wave of JSTAR was conducted in 2011 and encompassed the five original municipalities, the two municipalities added in 2009, and three additional municipalities: Choufu city in Tokyo, Tondabayashi city in Osaka Prefecture, and Hiroshima city in Hiroshima Prefecture.

TILDA Sample Description

The first wave of TILDA was conducted between October 2009 and February 2011 and was led by Trinity College Dublin. The TILDA survey sample was selected by multi-stage sampling based on the Irish Geo-directory, a comprehensive and up-to-date listing and mapping of residential addresses in Ireland compiled by the Ordnance Survey Office. Each address in the country had an equal probability of selection from the sample list of addresses.

The TILDA sample includes age-eligible respondents, at least 50 years of age, and their spouses regardless of age. The TILDA study interviewed 8,504 participants, including 8,175 respondents aged 50 and over and 329 younger partners of eligible individuals. 


CHARLS Sample Description

The baseline wave of CHARLS was conducted from 2011 to 2012, interviewing adults aged 45 or older and their spouses of any age. A stratified multi-stage probability sample was drawn, first by stratifying urban districts and rural counties by per capita GDP, then by selecting urban communities or rural villages with probability proportionate to population size (PPS), and finally by randomly selecting households. The baseline sample included 10,257 households and 17,500 individuals in 450 urban communities or rural villages, from 150 counties/districts in 28 provinces.

LASI Sample Description

The LASI pilot study was conducted in 2010 in four Indian states: Rajasthan and Punjab in the north, and Kerala and Karnataka in the south. The sampling plan was based on the 2001 Indian Census. Two districts were randomly chosen from each state. Within these districts, eight primary sampling units (PSUs) were chosen to be surveyed. Primary sampling units were stratified across urban and rural districts within each of the four states to capture a variety of socioeconomic conditions. Rural PSUs with fewer than 500 households were then selected through a two-stage sampling procedure, while urban PSUs and rural PSUs with more than 500 households were selected through a three-stage procedure.

Eligible households were defined as those with at least one member 45 years of age or older. The LASI sample includes age-eligible respondents, at least 45 years of age, and their spouses regardless of age. LASI randomly sampled 950 households and collected data from 1,683 individuals.