Bridged-Race Postcensal Population Estimates
for July 1, 2000-July 1, 2004
for Calculating Vital Rates
On September 8, 2005, the National Center for Health Statistics released the bridged Vintage 2004 postcensal population file. This file contains estimates of the resident population of the United States as of July 1, 2000, July 1, 2001, July 1, 2002, July 1, 2003, and July 1, 2004, by county, single-year of age (0, 1, 2,..., 85 years and over), bridged-race category (White, Black or African American, American Indian or Alaska Native, Asian or Pacific Islander), Hispanic origin (not Hispanic or Latino, Hispanic or Latino), and sex (1).
The estimates on this file resulted from bridging the Vintage 2004 postcensal estimates with 31 race groups (the 31 race groups used in Census 2000 in accordance with the 1997 Office of Management and Budget (OMB) standards for the collection of data on race and ethnicity) to the four race categories specified under the 1977 OMB standards. Thus, the estimates in this file are based on Census 2000. The bridged-race postcensal estimates were produced by the Population Estimates Program of the U.S. Census Bureau in collaboration with the National Center for Health Statistics (NCHS).
Background
In 1997, OMB issued "Revisions to the Standards for the Classification of Federal Data on Race and Ethnicity," which supersedes the 1977 Statistical Policy Directive 15, "Race and Ethnic Standards for Federal Statistics and Administrative Reporting" (2,3). Both documents specify rules for the collection, tabulation, and presentation of race and ethnicity data within the Federal statistical system. The 1977 standards required Federal agencies to report race-specific tabulations using four single-race categories, namely, White, Black, American Indian or Alaska Native, and Asian or Pacific Islander. The 1997 revision incorporated two major changes designed to reflect the changing racial and ethnic profile of the United States. First, the 1997 revision increased from four to five the minimum set of categories to be used by Federal agencies for identification of race. As in the past, these categories represent a social-political construct and are not anthropologically or biologically based. The five categories for race specified in the 1997 standards are: American Indian or Alaska Native; Asian; Black or African American; Native Hawaiian or Other Pacific Islander; and White. Second, the revised standards add the requirement that Federal data collection programs allow respondents to select one or more race categories when responding to a query on their racial identity. This provision means that there are potentially 31 race groups, depending on whether an individual selects one, two, three, four, or all five of the race categories. Collection of additional detail on race or ethnicity is permitted so long as the additional categories can be aggregated into the minimum categories.
During the transition to full implementation of the 1997 standards, two different standards for the collection of race and ethnicity data are being used, creating incomparability across data systems. Further, within a given data system, the change in the race standards results in incomparability across time, thus making it difficult to perform trend analyses. The OMB recognized that approaches to make data collected under the 1997 standards comparable to data collected under the 1977 standards would be needed. Therefore, the OMB issued "Provisional Guidance on the Implementation of the 1997 Standards for Federal Data on Race and Ethnicity" (4). The guidance document contains a detailed discussion of bridging methods.
Vital rates are based on information obtained from vital records collected through the state-based Vital Statistics Cooperative Program (numerators) and population estimates based on the U.S. Census (denominators). The 2000 decennial census collected race and ethnicity data in accordance with the 1997 standards. However, full implementation of the 1997 standards within the Vital Statistics Cooperative Program had not occurred at that time. Indeed, for this data system, implementation of the 1997 standards is being phased in over several years as the States revise their birth and death certificates to reflect the 1997 standards. Thus, beginning with the 2000 data year, the numerators and denominators for vital rates have incompatible race data. Previously released rates for 2000 and 2001 used 1990-based postcensal estimates of the July 1, 2000 and July 1, 2001 resident population for denominators (5-8). Estimates for 2002 and beyond were not available from the 1990-based postcensal series, so it was necessary to develop a bridging method so that race-specific vital rates could be calculated. It is also important that the more accurate counts available from the 2000 Census be used.
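The numerator/denominator relationship described above can be illustrated with a minimal Python sketch. The function name `vital_rate`, the per-1,000 scaling default, and all counts below are hypothetical and are not taken from any NCHS file; the point is only that events and population must be tabulated under the same race categories before a race-specific rate is computed.

```python
def vital_rate(events: int, population: int, per: int = 1000) -> float:
    """Return events per `per` residents (numerator / denominator * per)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return per * events / population


# Hypothetical counts for a single county. Both dictionaries are keyed by the
# same four bridged-race categories, so numerators and denominators line up.
births = {
    "White": 1200,
    "Black or African American": 300,
    "American Indian or Alaska Native": 15,
    "Asian or Pacific Islander": 60,
}
population = {
    "White": 90000,
    "Black or African American": 20000,
    "American Indian or Alaska Native": 1000,
    "Asian or Pacific Islander": 4000,
}

rates = {race: vital_rate(births[race], population[race]) for race in births}
for race, rate in rates.items():
    print(f"{race}: {rate:.1f} births per 1,000 population")
```

If the numerator used the 1977 categories and the denominator used the 31 race groups of the 1997 standards, the dictionary keys would not match, which is the incompatibility that bridging is meant to resolve.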
Bridging methodology developed by NCHS bridges the multiple-race group population counts to single-race categories. Information from the pooled 1997-2000 National Health Interview Survey was used to develop the bridging methodology. Regression models with person-level and county-level covariates were used to generate the probability of selecting each single-race category possible for a multiple-race group. The probabilities generated from the fitted regression models are referred to as the NHIS bridging proportions. The Census Bureau applied the NHIS bridging proportions generated by NCHS to the Census 2000 Modified Race Data Summary file (9, 10). This application resulted in a bridged population count for each of the four single-race categories (White, Black or African American, American Indian or Alaska Native, and Asian or Pacific Islander) by county, single-year of age, Hispanic origin group, and sex, for April 1, 2000. The bridging methodology is described in detail in the report, "United States Census 2000 Population with Bridged Race Categories" (which is available for download from this site) and in a related report (11, 12).
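As a rough illustration of how bridging proportions reallocate multiple-race counts, the following Python sketch applies a set of proportions to counts for a single demographic cell. The function name `bridge`, the group labels, the counts, and the 0.4/0.6 split are all hypothetical; the actual NHIS bridging proportions come from fitted regression models and vary with person-level and county-level covariates. The sketch also assumes that single-race groups map wholly to one bridged category, with Asian counted under Asian or Pacific Islander.

```python
BRIDGED = (
    "White",
    "Black or African American",
    "American Indian or Alaska Native",
    "Asian or Pacific Islander",
)


def bridge(counts_by_group: dict, proportions: dict) -> dict:
    """Allocate each race group's count to the four bridged categories."""
    totals = {category: 0.0 for category in BRIDGED}
    for group, count in counts_by_group.items():
        shares = proportions[group]
        # Proportions for a group must account for its whole population.
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"proportions for {group!r} do not sum to 1")
        for category, share in shares.items():
            totals[category] += count * share
    return totals


# Hypothetical counts for one county/age/sex/Hispanic-origin cell.
counts = {"White": 1000, "Asian": 200, "White; Black": 100}
# Hypothetical proportions (assumed values, not NHIS estimates).
props = {
    "White": {"White": 1.0},
    "Asian": {"Asian or Pacific Islander": 1.0},
    "White; Black": {"White": 0.4, "Black or African American": 0.6},
}

bridged = bridge(counts, props)
print(bridged)
```

Because every group's proportions sum to 1, the bridged total equals the original total; only the distribution across categories changes.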
Postcensal population estimates are estimates made for the years following a census, before the next census has been taken. Postcensal estimates are derived by updating the resident population enumerated in the decennial census using various measures of population change. The components of population change used in the derivation of the postcensal estimates include: births to U.S. resident women, deaths to U.S. residents, net international immigration, net movement of U.S. Armed Forces and civilian citizens of the U.S., and migration within the U.S. The Census Bureau annually produces a series of postcensal estimates that includes estimates for the current data year and revised estimates for earlier years. Estimates for earlier years in a given series are revised to reflect changes in the components of change data sets (for example, a preliminary natality file is replaced with a final natality file). The last year in a series is used to name the series. For example, the Vintage 2002 postcensal series has estimates for July 1, 2000, July 1, 2001, and July 1, 2002 (released 8/1/2003). The Vintage 2003 series has estimates for July 1, 2000, July 1, 2001, July 1, 2002, and July 1, 2003. The July 1, 2000, July 1, 2001, and July 1, 2002 estimates from the Vintage 2002 and Vintage 2003 series differ.
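The updating step described above amounts to an accounting identity, sketched below in Python. This is only an illustration of how the listed components combine; it is not the Census Bureau's estimation procedure, which operates on detailed demographic cells. The names `Components` and `update_population` and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Components:
    """Components of population change over one estimate period."""
    births: int
    deaths: int
    net_international_migration: int
    net_armed_forces_and_civilian_movement: int
    net_internal_migration: int


def update_population(base: int, c: Components) -> int:
    """Carry a base population forward by adding the net effect of each component."""
    return (
        base
        + c.births
        - c.deaths
        + c.net_international_migration
        + c.net_armed_forces_and_civilian_movement
        + c.net_internal_migration
    )


# Hypothetical county-level figures, for illustration only.
change = Components(
    births=10,
    deaths=8,
    net_international_migration=3,
    net_armed_forces_and_civilian_movement=-1,
    net_internal_migration=2,
)
print(update_population(1000, change))
```

Revising any component, for example replacing preliminary births with final births, changes the result for that year and every later year in the series, which is why estimates for the same date differ between vintages.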
To date, the Census Bureau has produced the Vintage 2001, Vintage 2002, Vintage 2003, and Vintage 2004 series of postcensal estimates of the July 1 resident population of the U.S. using the Census 2000 Modified Race Data Summary File as the base data for the series (9). These series initially had estimates for 31 race groups, in accordance with the 1997 race and ethnicity standards (2). Under a collaborative arrangement with NCHS, the Population Estimates Program of the U.S. Census Bureau applied the NHIS bridging proportions to the 31-race postcensal population estimates to produce bridged-race postcensal estimates (estimates for the four single-race categories: White, Black or African American, American Indian or Alaska Native, and Asian or Pacific Islander).
Release of estimates
In response to the need for bridged estimates by a wide range of users, NCHS is making the bridged-race population estimates available for download from the Population Estimates web site (see http://www.cdc.gov/nchs/about/major/dvs/popbridge/popbridge.htm, look under "Datasets and Documentation"). The report detailing the bridging methodology is available for download from this site (see United States Census 2000 Population with Bridged Race Categories under "Methodology").
NCHS is using the bridged-race postcensal population estimates to calculate birth and death rates. Previously published reports that used 1990-based postcensal population estimates to calculate rates for 2001 have been re-issued in whole or in part; new reports use the bridged-race estimates (13-16).
Although efforts were made to use the best available data and methods to produce these estimates, the modeling process introduces error into the estimates. The potential for error will be greatest for the smallest population groups, particularly the smaller race groups and county-level estimates. The postcensal estimates are updated annually as additional data become available for use in the components of change model. In addition, the bridged-race estimates may be revised periodically to reflect changes made to the bridging process.
NCHS would appreciate receiving feedback on the usefulness of the estimates as well as notification of any problems that have been identified. Please provide comments via e-mail to: PopEst@cdc.gov.
Suggested citation
National Center for Health Statistics. Estimates of the July 1, 2000-July 1, 2004, United States resident population from the Vintage 2004 postcensal series by year, county, age, sex, race, and Hispanic origin, prepared under a collaborative arrangement with the U.S. Census Bureau. Available on the Internet at: http://www.cdc.gov/nchs/about/major/dvs/popbridge/popbridge.htm. September 9, 2005.
References
File layout for the Bridged-Race Vintage 2004 Postcensal Files, 2000-2004
There is one file for the full Vintage 2004 postcensal series with data for all five years in the series (July 1, 2000, July 1, 2001, July 1, 2002, July 1, 2003, and July 1, 2004) and one file with only the Vintage 2004 estimates for July 1, 2004.
The files contain bridged-race postcensal estimates of the July 1 resident population of the United States by year, county, single-year of age (0, 1,..., 85 years and over), bridged-race category (White, Black or African American, American Indian or Alaska Native, Asian or Pacific Islander), Hispanic origin (not Hispanic or Latino, Hispanic or Latino), and sex.
There is a record on the files for each combination of county, age, race and sex, and Hispanic origin.
The population estimates on the files were derived using the Census 2000 Modified Race Data Summary File as the base file (9).
The files were released by NCHS on September 9, 2005.
Control totals for Vintage 2004 data files:
File name                 Number of records (1)   Control total
                                                  Month, Year     Population count
pcen_v2004.txt            4,320,640               July 1, 2000    282,192,162
                                                  July 1, 2001    285,102,075
                                                  July 1, 2002    287,941,220
                                                  July 1, 2003    290,788,976
                                                  July 1, 2004    293,655,404
pcen_v2004_y04.txt        4,320,640               July 1, 2004    293,655,404
pcen_v2004_y04.sas7bdat   4,320,640               July 1, 2004    293,655,404
1. One record for each county, race, sex, Hispanic origin, and age combination
File Layout for pcen_v2004.txt:
Location   Field size   Item and Code Outline                          Format
1-4        4            Series vintage (2004)                          Numeric
5-6        2            FIPS State code                                Numeric
7-9        3            FIPS county code                               Numeric
10-11      2            Age                                            Numeric
                          (0, 1, 2,..., 85 years and over)
12         1            Race-sex                                       Numeric
                          1=White male
                          2=White female
                          3=Black male
                          4=Black female
                          5=American Indian or Alaska Native male
                          6=American Indian or Alaska Native female
                          7=Asian or Pacific Islander male
                          8=Asian or Pacific Islander female
13         1            Hispanic origin                                Numeric
                          1=not Hispanic or Latino
                          2=Hispanic or Latino
14-21      8            July 1, 2000 Population count                  Numeric
22-29      8            July 1, 2001 Population count                  Numeric
30-37      8            July 1, 2002 Population count                  Numeric
38-45      8            July 1, 2003 Population count                  Numeric
46-53      8            July 1, 2004 Population count                  Numeric
File Layout for pcen_v2004_y04.txt:
Location   Field size   Item and Code Outline                          Format
1-4        4            Series vintage (2004)                          Numeric
5-6        2            FIPS State code                                Numeric
7-9        3            FIPS county code                               Numeric
10-11      2            Age                                            Numeric
                          (0, 1, 2,..., 85 years and over)
12         1            Race-sex                                       Numeric
                          1=White male
                          2=White female
                          3=Black male
                          4=Black female
                          5=American Indian or Alaska Native male
                          6=American Indian or Alaska Native female
                          7=Asian or Pacific Islander male
                          8=Asian or Pacific Islander female
13         1            Hispanic origin                                Numeric
                          1=not Hispanic or Latino
                          2=Hispanic or Latino
14-21      8            July 1, 2004 Population count                  Numeric
Description of pcen_v2004_y04.sas7bdat
The SAS data set, pcen_v2004_y04.sas7bdat, contains seven numeric variables:
Variable   Description
Pop        resident population
Race4      1=white; 2=black; 3=American Indian; 4=Asian and Pacific Islander
Age        0-84 single years of age; 85+
Co_fips    County FIPS code
Hisp       1=non-Hispanic; 2=Hispanic
Sex        1=male; 2=female
St_fips    State FIPS code
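A fixed-width record from pcen_v2004.txt could be read along the following lines. This Python sketch assumes the five population counts (July 1, 2000 through July 1, 2004) occupy contiguous 8-character fields in columns 14-53, and that the combined race-sex code maps to the separate Race4 and Sex codes of the SAS data set in the order listed in the layout. The names `Record` and `parse_line` and the sample record are hypothetical; the sample is constructed for illustration and is not a line from the actual file.

```python
from typing import NamedTuple

YEARS = (2000, 2001, 2002, 2003, 2004)


class Record(NamedTuple):
    vintage: int
    st_fips: str   # kept as text so leading zeros survive
    co_fips: str
    age: int       # 0-84 single years; 85 means 85 years and over
    race4: int     # 1=White, 2=Black, 3=AIAN, 4=Asian or Pacific Islander
    sex: int       # 1=male, 2=female
    hisp: int      # 1=not Hispanic or Latino, 2=Hispanic or Latino
    pop: dict      # July 1 population count keyed by year


def parse_line(line: str) -> Record:
    """Split one 53-character record using the documented 1-based column locations."""
    race_sex = int(line[11])  # column 12, codes 1-8
    pop = {
        year: int(line[13 + 8 * i : 21 + 8 * i])  # columns 14-21, 22-29, ...
        for i, year in enumerate(YEARS)
    }
    return Record(
        vintage=int(line[0:4]),
        st_fips=line[4:6],
        co_fips=line[6:9],
        age=int(line[9:11]),
        race4=(race_sex + 1) // 2,  # codes 1-2 -> 1, 3-4 -> 2, 5-6 -> 3, 7-8 -> 4
        sex=2 - race_sex % 2,       # odd codes are male, even codes are female
        hisp=int(line[12]),
        pop=pop,
    )


# Hypothetical sample record: state 01, county 001, age 5, Black male, not Hispanic.
sample = "2004" + "01" + "001" + "05" + "3" + "1" + "".join(
    f"{n:8d}" for n in (120, 125, 130, 128, 131)
)
record = parse_line(sample)
print(record)
```

Summing `pop[2004]` over every record of the full file should reproduce the July 1, 2004 control total of 293,655,404, which gives a simple check that the column offsets are correct.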
Source:
Documentation for bridged-race postcensal Vintage 2004 population estimates for July 1, 2000-July 1, 2004, which was released on September 8, 2005, is on the Internet at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/datasets/nvss/bridgepop/DocumentationBridgedPostcenV2004.doc.