
4/8/2019 Consumer Expenditure Surveys

U.S. Bureau of Labor Statistics

Consumer Expenditure Surveys

Consumer Expenditure Surveys Public-use Microdata Getting Started Guide
This page provides documentation for Consumer Expenditure Surveys (CE) Public-use Microdata (PUMD), its
conventions, files, sample code, and methodology.

Table of Contents
Section 1. CE program
Section 2. CE PUMD
Section 3. Interview Survey
Section 4. Diary Survey
Section 5. Sample code
Section 6. CE PUMD methodology

Section 1. CE program
The Consumer Expenditure Surveys (CE) program provides data on expenditures, income, and demographic
characteristics of consumers in the United States. The CE program provides these data in tables, LABSTAT
databases, news releases, publications, and public-use microdata files.

CE data are collected by the Census Bureau for the Bureau of Labor Statistics (BLS) in two surveys, the
Interview Survey for major and/or recurring items, and the Diary Survey for more minor or frequently purchased
items. CE data are primarily used to revise the relative importance of goods and services in the market basket of
the Consumer Price Index. The CE program conducts the only Federal household survey to provide information
on the complete range of consumers' expenditures and income. For more information, see the overview section
in the CE chapter in the BLS Handbook of Methods.

Section 2. CE PUMD
CE PUMD provide respondents' individual responses to the two surveys. The data have been adjusted to
protect the confidentiality of respondents. The CE PUMD allow researchers to analyze expenditure, income, and
demographic data beyond what is provided in published tabulations.

https://data.bls.gov/cgi-bin/print.pl/cex/pumd-getting-started-guide.htm 1/14
2.1 CE PUMD files

CE PUMD include data from both the Interview and Diary Surveys. Most files are analogous between the two
surveys; however, the Interview Survey includes roughly 50 additional detailed data files, as well as paradata
files that provide detail about the collection process. Table 3: Interview Survey files and content lists the major
files currently available and their content. For a more comprehensive list of files provided in the CE PUMD, see
the Dictionary for the Interview and Diary Surveys.

Are the CE PUMD required for a particular research topic?


A research topic may require the detail that only PUMD files provide, but the CE program does provide a
wealth of information that has already been tabulated and may be sufficient for a user's analysis. This
information includes tables, LABSTAT databases, news releases, and publications. To learn more about
these products, see the introduction to the CE data products.
What is required in order to use the CE PUMD?
Users of CE PUMD need to be familiar with statistical concepts and be proficient with a statistical
software package, such as SAS, R, or STATA.
What data formats are available?
The data files are available in SAS, STATA, SPSS, and CSV formats. Files from 1996 through the latest
release can be downloaded from the public-use microdata data files page. Users whose research requires access
to CE microdata without the disclosure restrictions applied can apply to become visiting researchers through the
BLS onsite researcher page.
Where can users obtain additional information?
If users have comments or questions about this page and its contents, contact us.

2.2 CE PUMD file conventions

For both the Interview and Diary Surveys, the files use the following conventions:

How are CE PUMD files named?


CE PUMD file naming conventions consist of three parts:
File name
Calendar year (YY)
Quarter (Q) if applicable. Quarter can be 1-5.
The detailed annual Interview Survey files do not specify the quarter but only the year, for example
intrvw16.zip\expn16\cla16.sas7bdat.
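
As a rough illustration of the naming convention, the sketch below assembles a file name from its three parts. The helper name `pumd_file_name` and the lowercase output are assumptions made for this example and are not part of the CE PUMD; the file extension depends on the data format that was downloaded.

```python
from typing import Optional


def pumd_file_name(name: str, year: int, quarter: Optional[int] = None) -> str:
    """Build a CE PUMD-style name: file name + two-digit year + optional quarter.

    Hypothetical helper for illustration only.
    """
    # The guide states that the quarter, when present, can be 1-5.
    if quarter is not None and not 1 <= quarter <= 5:
        raise ValueError("quarter must be between 1 and 5")
    yy = f"{year % 100:02d}"  # two-digit calendar year (YY)
    suffix = "" if quarter is None else str(quarter)
    return f"{name.lower()}{yy}{suffix}"


print(pumd_file_name("FMLI", 2016, 2))  # fmli162 (quarterly file)
print(pumd_file_name("CLA", 2016))      # cla16 (annual detailed file, no quarter)
```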
What types of values do CE PUMD variables use?
CE PUMD variables are stored in one of the following three formats:
Numeric (NUM): Variables that predominantly contain dollar amounts and counts
String (CHAR): Variables that contain a sequence of alphanumeric characters
Categorical (CHAR): Coded variables
Where do data users find descriptions for the CE PUMD variables?
The Dictionary for the Interview and Diary Surveys contains a description for each variable from 1996
forward. In addition to the description, the dictionary lists the associated codes, the location of a variable
within the files and within the survey, and the period during which a variable existed in the PUMD.
How do data users track changes in PUMD?
The Dictionary for the Interview and Diary Surveys tracks detailed changes to files, variables, and codes.
For large survey changes, consult the history page in the chapter Consumer Expenditures and Income in
the BLS Handbook of Methods.
How do data users identify a unique record in the CE PUMD?
Identifying a unique record depends on the file. For a list of the primary key variables for each file, see
Table 3 Interview Survey files and content and Table 6 Diary Survey files and content.

How do data users link an interview or diary for a given Consumer Unit (CU) in different files?
NEWID links data for one CU across interviews and files. Users cannot link CUs across surveys because
the Diary and Interview surveys use different samples.
How is the variable NEWID structured?
NEWID is a unique sequential number concatenated with the number of the interview. The last digit of
NEWID indicates the interview number in a series of 4, or the week of diary collection in a series of 2. All
digits before the last digit identify a CU.
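
The structure of NEWID can be illustrated with a small sketch that separates the CU identifier from the final digit. The function name `split_newid` and the sample NEWID value are hypothetical and used only for illustration.

```python
from typing import Tuple, Union


def split_newid(newid: Union[int, str]) -> Tuple[int, int]:
    """Split a NEWID into (CU identifier, interview or diary-week number)."""
    # The last digit is the interview number (Interview Survey) or the
    # collection week (Diary Survey); all preceding digits identify the CU.
    cu_id, sequence = divmod(int(newid), 10)
    return cu_id, sequence


# Hypothetical NEWID: CU 123456, second interview
print(split_newid(1234562))  # (123456, 2)
```

Grouping records by the first element of the result brings together all interviews (or both diary weeks) of one CU within a survey.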

Section 3. Interview Survey


3.1 Interview Survey overview

The Interview Survey is a rotating panel survey in which approximately 10,000 addresses are contacted each
calendar quarter, yielding approximately 6,000 usable interviews. One-fourth of the addresses that are
contacted each quarter are new to the survey. After a housing unit has been in the sample for four consecutive
quarters, it is dropped from the survey, and a new address is selected to replace it. For more information, see the
chapter Consumer Expenditures and Income in the BLS Handbook of Methods.

Before 2015, the Interview Survey included a preliminary bounding interview, and each CU could be contacted
up to five times over five quarters. Although data from the bounding interview were not published, its purpose
was to minimize telescoping errors.ⅰ The CE program stopped fielding the bounding interview in 2015 due to
concerns about its effectiveness in reducing telescoping errors, cost, and impact on respondent burden. For more
information, see Ian Elkin's article Recommendation regarding the use of a CE bounding interview.ⅱ

3.2 Interview Survey file conventions

For the Interview Survey, the files use the following conventions:

What does an Interview "quarter" refer to?


The interview "quarter" refers to the calendar quarter in which the interview occurred. For example, any
CU interviewed in April, May, or June would have their data stored in the quarter 2 (YYQ2) datasets.
During an interview, the CU is asked to report expenditures for the three months prior to the interview. So,
for a CU interviewed in April, the expenditures in the YYQ2 files are for January, February, and March. This
distinction is important to remember when calculating calendar year estimates.
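
The relationship between the interview month and the months it covers can be sketched as follows. The function name `reference_months` is an assumption for this example; it simply returns the three calendar months before the interview month.

```python
from typing import List, Tuple


def reference_months(interview_year: int, interview_month: int) -> List[Tuple[int, int]]:
    """Return the (year, month) pairs for the three months before the interview."""
    if not 1 <= interview_month <= 12:
        raise ValueError("interview_month must be between 1 and 12")
    months = []
    for back in (3, 2, 1):
        # Count months on a single axis so that year boundaries are handled.
        index = interview_year * 12 + (interview_month - 1) - back
        months.append((index // 12, index % 12 + 1))
    return months


print(reference_months(2016, 4))  # [(2016, 1), (2016, 2), (2016, 3)]
print(reference_months(2017, 1))  # [(2016, 10), (2016, 11), (2016, 12)]
```

The second call shows why a first-quarter interview can report expenditures that belong to the previous calendar year.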
How many quarters does a CE PUMD release include?
Each CE PUMD release includes five quarters: four quarters for the release year (YYQ1-YYQ4) and the
first quarter of the next year.
Why do some CE PUMD Interview Survey files exist as part of two different data releases?
Each data release contains five quarters of Interview Survey data in order to allow users to calculate a
calendar year estimate. For more information on this calculation, see Section 6.1 Estimation procedures for
the Interview Survey.

Each annual data release of the CE PUMD is processed using new data and new disclosure avoidance
guidelines. For quarters that appear in two different data releases, an "x" is added to the end of the file
name. This "x" is used as an indicator to inform users that the two files were processed under a different
set of rules and conditions and therefore the content may differ slightly. It is at the user's discretion as to
which file to use.

Table 1: Description of "x" in file names for the fifth quarter

| Is the "x" included? | What file and release? | Did the files, methods, or data change? | Example |
|---|---|---|---|
| No | Fifth file of previous year's release | No, they stayed the same as in the previous four quarters in the package. | FMLIYYQ.sas7bdat |
| Yes | First file of current year's release | Yes, they changed from the previous year's version. | FMLIYYQx.sas7bdat |

What do the flag values in the Interview Survey represent?


In the Interview Survey files, data fields are explained by using flags for selected variables. Variables that
have a flag variable associated with them are identified in the Dictionary for the Interview and Diary
Surveys, on the Variable tab, under the column "Flag Name." Table 2 lists the codes for flags in the
Interview Survey.
Table 2: Interview Survey flag variable codes

| Flag value | Description |
|---|---|
| A | Valid blank; a blank field where a response is not anticipated |
| B | Invalid blank due to invalid nonresponse; nonresponse that is not consistent with other data reported by the CU |
| C | Blank due to "Don't know," refusal, or other nonresponse |
| D | Valid value; unadjusted |
| E | Valid value; allocated |
| F | Valid value; imputed or adjusted in some other way |
| G | Valid value; allocated and imputed |
| H | Valid blank for an expenditure that is a "parent record" where the expenditure was allocated to other records and the original expenditure was overwritten with a blank |
| T | Valid value; topcoded or suppressed |
| U | Valid value; allocated then topcoded or suppressed |
| V | Valid value; imputed or adjusted in some other way then topcoded or suppressed |
| W | Valid value; allocated and imputed or adjusted in some other way then topcoded or suppressed |

3.3 Interview Survey file types

Table 3 summarizes the Interview Survey files currently available. If users encounter a file that is not listed
below, consult the Dictionary for the Interview and Diary Surveys for additional details.

Table 3: Interview Survey files and content

| Name | Content | Variable periodicityⅲ | Files per release | Primary keysⅳ | First year |
|---|---|---|---|---|---|
| FMLI | CU level summary expenditures | Quarterlyⅴ | 5 | NEWID | 1996 |
| | CU level income, assets, and liabilities | Annual | | | |
| | CU characteristics and weights | NA | | | |
| MTBI | Monthly expenditures | Monthly | 5 | NEWID, SEQNO, ALCNO, UCC, RTYPE, EXPNAME, UCCSEQ, REF_MO, REF_YR | 1996 |
| MEMI | Member level income | Annual | 5 | NEWID, MEMBNO | 1996 |
| | Member characteristics | NA | | | |
| ITBI | Detailed income | Monthly | 5 | NEWID, UCC, REF_MO, REFYR | 1996 |
| ITII | Imputed income iterations | Monthly | 5 | NEWID, UCC, REF_MO, REFYR, IMPNUM | 2004 |
| NTAXI | Estimated federal and state income taxes | Annual | 5 | NEWID and TAXID | 2013 Q2 |
| Detailed data files | Detailed expenditure and non-expenditure data | Quarterly, monthly, weekly, or NA | About 50 | NEWID, SEQNO, ALCNO | 1996 |
| FPAR | Data related to the survey process | NA | 1 | NEWID | 2009 |
| MCHI | Data related to the contact history | NA | 1 | NEWID | 2009 |

3.3.1 Detailed data files - Detailed expenditure and non-expenditure data

The roughly 50 detailed data files include expenditure and non-expenditure information that is directly collected
from sections of the Interview Survey (see the Survey materials page for more information). The Dictionary for
the Interview and Diary Surveys contains additional information related to the content and makeup of each of
these files. Each detailed data file consists of five quarters of data. Because these files correspond to specific
sections in the survey, they differ from one another in several ways. These are the main differences:

The reference periods may differ due to different questions.

The number of records per CU differs. Some files have multiple records per CU, some have one record
per CU, and some have no records for a CU interviewed in a given quarter.
The method to identify unique records differs. Users can identify unique records with NEWID and,
depending on the file, these variables:
SEQNO is assigned sequentially during the interview as each expenditure record is recorded into
the database.
ALCNO is assigned sequentially for each record that has been allocated from one expenditure. For
example, a CU may report spending $50 on a pair of men's pants and a shirt. The CE program will
allocate that record into two separate records, one for men's pants and shorts ($30) and one for
men's shirts ($20).

Here is an example of the detailed data file VEQ (Vehicles, maintenance and repair) and some of the variables it
contains.

VEQ-Vehicle maintenance and repair


VOPSERVY is an indicator variable that describes the type of maintenance or repair.
VOPMOA is an indicator variable for the month in which the expense occurred.
VOPEXPX is the total cost of the maintenance or repair expense.

3.3.2 Interview Survey Paradata files

Paradata files provide data about the interview process. The CE program began releasing paradata for the
Interview Survey in 2009. The CE program does not release paradata for the Diary Survey. Paradata are
available in two datasets:

FPAR - Data related to the survey process


Contains data about the survey, including timing for each section and whether the respondent used
records.
Organized by NEWID.
Unique records are defined by NEWID and QYEAR.

MCHI - Data related to the contact history


Contains data about the contact history between the field representative and the respondent,
including reasons for interview refusal and time of contact.
The files are organized by NEWID.
Unique records are defined by NEWID and QYEAR.

How many quarters are in the paradata files?


Each paradata file has nine quarters. These include four quarters for the first year, four for the second year,
and one for the first quarter of the third year.

Section 4. Diary Survey


4.1 Diary Survey overview

The Diary Survey is a panel survey in which approximately 5,000 addresses are contacted each calendar quarter,
yielding approximately 3,000 usable interviews.ⅵ After a housing unit has been in the sample for two
consecutive weeks, it is dropped from the survey, and a new address is selected to replace it. For more
information, see the chapter Consumer Expenditures and Income in the BLS Handbook of Methods.

4.2 Diary Survey file conventions

For the Diary Survey, the files use the following conventions:

What does a Diary Survey "quarter" refer to?


A Diary Survey "quarter" refers to the calendar quarter in which the Diary Survey booklet was placed in
the home of the CU by the Census Field Representative. All Diary Survey files are organized as quarterly
files.
What does a Diary Survey "week" refer to?
The Diary Survey "week" refers to the 7 consecutive days in which the data were recorded. Respondents
only record expenditures of that week. Each CU is in the sample for two consecutive weeks. Each Diary
Survey week is assigned to the Diary Survey quarter in which it was recorded.
What do the flag values in the Diary Survey represent?
In the Diary Survey files, data fields are explained by using flags for selected variables. Variables that
have a flag variable associated with them are identified in the Dictionary for the Interview and Diary
Surveys, on the Variable tab, under the column "Flag Name." Table 4 lists the codes for flags in the Diary
Survey.
Table 4: Diary Survey flags

| Flag value | Description |
|---|---|
| A | Valid blank; a blank field where a response is not anticipated |
| B | Blank due to invalid nonresponse; nonresponse that is not consistent with other data reported by the CU |
| C | Blank due to "Don't know," refusal, or other nonresponse |
| D | Valid value; unadjusted |
| E | Valid value; allocated |
| T | Valid value; topcoded or suppressed |

For Diary Survey expenditures located on the EXPD files, the variable ALLOC can be used to
determine if an expenditure has been adjusted, allocated, topcoded, or any combination of the three. Table
5 lists the allocation codes and their corresponding flag values.

Table 5: Diary Survey allocation codes

| ALLOC code | Description | Corresponding flag |
|---|---|---|
| 0 | Valid value, unadjusted | D |
| 1 | Valid value, allocated | E |
| 2 | Topcoded and allocated | T |
| 3 | Topcoded, not allocated | T |
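
The ALLOC codes in Table 5 can be decoded with a simple lookup. The sketch below is one possible way to do this; the names `ALLOC_CODES` and `describe_alloc` are hypothetical, and the code accepts ALLOC stored either as a number or as a character value.

```python
# Lookup built from Table 5; the structure and names are illustrative only.
ALLOC_CODES = {
    0: {"allocated": False, "topcoded": False, "flag": "D"},
    1: {"allocated": True, "topcoded": False, "flag": "E"},
    2: {"allocated": True, "topcoded": True, "flag": "T"},
    3: {"allocated": False, "topcoded": True, "flag": "T"},
}


def describe_alloc(code):
    """Return the allocation status, topcoding status, and flag for an ALLOC code."""
    key = int(code)  # handles both 2 and "2"
    if key not in ALLOC_CODES:
        raise ValueError(f"unknown ALLOC code: {code!r}")
    return ALLOC_CODES[key]


print(describe_alloc("2"))  # {'allocated': True, 'topcoded': True, 'flag': 'T'}
```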

4.3 Diary Survey file types

Table 6 summarizes the Diary Survey files currently available. If users encounter a file that is not listed below,
consult the Dictionary for the Interview and Diary Surveys for additional details.

Table 6: Diary Survey files and content

| Name | Content | Variable periodicity | Files per release | Primary keysⅶ | First year |
|---|---|---|---|---|---|
| FMLD | Summary expenditures | Weekly | 4 | NEWID | 1996 |
| | CU level income, assets, and liabilities | Annual | | | |
| | CU characteristics and weights | Annual | | | |
| MEMD | Member level income | Annual | 4 | NEWID and MEMBNO | 1996 |
| | Member characteristics | NA | | | |
| EXPD | Detailed expenditure and non-expenditure data | Weekly | 4 | NEWID and ALLOC | 1996 |
| DTBD | Detailed income | Annual | 4 | NEWID and UCC | 1996 |
| DTID | Income imputation iterations | Annual | 4 | NEWID, UCC, IMPNUM | 2004 |

Section 5. Sample code


For CE PUMD, the CE program provides two sample code types:

Annual weighted calendar year estimates of national totals, averages, standard errors, and CVs. The
code integrates data from both surveys at the Universal Classification Code (UCC) level. This code is
available in SAS, R, and STATA on the PUMD documentation page.
Code to approximate the published tables. CE PUMD estimates do not fully match the table estimates.
For more information, see FAQ 26 on the CE FAQ page. This code is available in SAS on the PUMD
documentation page.

When preparing estimates with UCCs, users need to understand their hierarchical groupings. The hierarchical
stub files list UCCs' hierarchical grouping by major categories. For more information on stub files, see the
PUMD documentation page. When embarking on a research project, users should keep in mind that the data and
the resulting estimates have limitations, such as small sample sizes and errors in respondent recollection.

Section 6. CE PUMD methodology


This section describes the estimation procedures for the Interview Survey and the estimation procedures for the
Diary Survey; the formulas to estimate weighted annual calendar year estimates; and sampling statements. The
CE program integrates information from both the Interview and Diary Surveys in its publications. Therefore, any
analysis limited to only one survey may produce results that do not match the published CE estimates. In
addition, users may find that estimates do not match the published estimates due to the non-disclosure criteria
that are applied to the CE PUMD. For more information on non-disclosure requirements, see the Protection of
Respondent Confidentiality page.

6.1 Estimation procedures for the Interview Survey


This section discusses procedures for estimating annual calendar year means with data from the Interview
Survey. Field representatives interview CUs to collect the cost of all expenses during the prior three months.
Data collected by each interview are treated as statistically independent - each quarter's interview is separately
weighted to be representative of the population. For more information, see the collections and data sources
section in the Consumer Expenditures and Income chapter in the BLS Handbook of Methods.

For the Interview Survey, users may want to consider the following general concepts:

What information does the Interview Survey ask CUs?


The Interview Survey asks respondents about all expenses that the CU incurs during the survey period as
well as information about financial data and demographic information. For more information on what is
included and excluded in expenditures, see the entry on expenditures on the glossary page.
How are Interview Survey data organized?
The Interview Survey data are organized and identified by quarter, but particular files may provide data by
month or by year. For more information on the data's periodicity, see Table 3: Interview Survey files and
content.
How many quarters does a user need for calendar year estimates?
To produce calendar year estimates, users need to access all five quarters of data: all four quarters of the
year of interest and the first quarter of the subsequent year. For example, for 2017 estimates, users need
the files for quarters 1 through 4 of 2017 and quarter 1 of 2018.
Why do users need data from two years to estimate one calendar year?
Users need data from two consecutive years to calculate calendar year estimates because Interview Survey
respondents report expenditures for the three months prior to the interview. Thus, in January,
February, and March interviews, a CU may report expenditures from the previous year,
which are out of scope for a current calendar year estimate.

When calculating estimates for 2016, interviews conducted in January 2017 cover expenditures made between
October 2016 and December 2016, and are used to estimate data for these three months in 2016. Similarly,
interviews conducted in March 2017 cover expenditures between December 2016 and February 2017 and
are used to estimate data for December 2016. Thus, users have to use the first-quarter file for 2017 to estimate
data for the last quarter of 2016. Chart 1 illustrates this concept. The green months are in
scope for the 2016 estimates, and the yellow months are the months in 2017 that are out of scope.

Chart 1: Months in scope for quarter 5 (FMLI171)

A similar differentiation of scope happens at the beginning of the year. The data collected in January of
2016 are not in scope for 2016 expenditures because the January interview collects data for the last 3
months of 2015. However, data collected in February and March 2016 are partially in scope. Data
collected in February includes data for January of 2016, and data collected in March 2016 includes data for
January and February 2016. See chart 2.

Chart 2: Months in scope for quarter 1 (FMLI161)

Finally, for interviews conducted in April through December, all reference months are in scope. For example,
quarter 2 interviews conducted in April, May, and June collect expenditure data for January 2016 through
May 2016, which are all in scope for 2016. See chart 3. The same holds true for quarters 3 and 4.

Chart 3: Months in scope for quarter 2 (FMLI162)

How much does a CU contribute to a calendar year estimate in each interview month?
A CU's contribution depends on the interview month and year. For information on how to identify a CU's
contribution to a calendar year estimate, see Section 6.3 Formulas.
Is the periodicity of variable values consistent across files?
No, it is not. Different files and different variables within files may have different periodicities. For more
information, see Table 3: Interview Survey files and content.

6.2 Diary Survey estimation procedures

This section provides users of the Diary Survey with procedures to estimate annual calendar year means.

CUs self-report a detailed description of all expenses using a product-oriented diary for two consecutive 1-week
periods. Data entries can start on any day of the week. Data collected each week are treated as statistically
independent - each week's diary is separately weighted to be representative of the population. For more
information, see the collections and data sources section in the chapter of Consumer Expenditures and Income in
the BLS Handbook of Methods.

For the Diary Survey, users may want to consider the following concepts:

What information does the Diary Survey ask CUs?


The Diary Survey asks for almost all expenses that the CU incurs during the survey week. The Diary Survey
also asks about income and demographic information. It excludes expenses incurred by family members while
away from home overnight or on vacation, as well as credit and installment plan payments.
How are Diary Survey data organized?
The Diary Survey data are organized and identified by the day an item was purchased.
How do users identify the purchase date?
Users cannot identify the exact purchase date. However, users can identify the start month of the reference
week (STRTMNTH), the day of the week (EXPNWDY), the sequential day of the survey (EXPNSQDY),
as well as the reference month (EXPNMO) and year (EXPNYR).
How many quarters does a user need for annual calendar year estimates?
To produce calendar year estimates, users need to access four collection-quarter files of the year of interest.
For example, for 2017 estimates, users need the files for quarters 1 through 4 of 2017.
How much does a CU contribute to a calendar year estimate in each interview month?
In the Diary Survey, a CU contributes 100 percent of its expenditures to the calendar year. Unlike the
Interview Survey, the Diary Survey has no lag between the time an expenditure occurs and the time it is
reported, which means that the potential contribution of each CU to the mean is the same.

6.3 Formulas
The formulas described below can be used to calculate weighted estimates that use data from both surveys. The
formulas calculate annual calendar year aggregates, averages, and standard errors for expenditures and reported
income. While these formulas can also be used to calculate annual averages of imputed income, they
cannot be used to calculate its standard errors. For more information on this topic, see the Description of Income
Imputation Beginning with 2004 Data.

What is the impact of different periodicity by different variables?


Variable periodicity can be annual, quarterly, monthly, or weekly. When working with the PUMD, users
need to take the particular periodicity into account. For example, when combining weekly data with quarterly
data, users need to account for the difference in periodicity by inflating the weekly data to represent a
quarterly value.
How to calculate comprehensive estimates of expenditures and income?
To gain a complete picture of expenditures and income, users need to integrate data from both surveys.
The CE program collects data with two independent surveys. While they complement each other with
respect to the data collection, they use independent samples that do not overlap. To see which UCCs the
CE tables use to integrate and from which survey, see the Source Selection File.
How to integrate data from both surveys?
To integrate data from both surveys, users first need to choose which expenditures they would like to
integrate. Some items are only collected in one survey while others are collected in both surveys. Once a
user has established which UCCs they would like to integrate, users can then develop an estimate for each
UCC individually. After estimates have been developed for each UCC individually, sum the results to
develop an integrated estimate.

When integrating data across surveys, keep in mind that estimates created from the Diary Survey yield
a weekly amount; users therefore need to adjust their estimates so that each survey result
represents the same time period. Multiplying the Diary Survey UCC estimate by 13 (the number of weeks in a
quarter) results in a quarterly amount, which can then be summed with an Interview Survey estimate.
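
A minimal sketch of this integration step is shown below. It assumes that a per-UCC estimate has already been developed for each survey; the function name, the dictionary layout, and the UCC labels and dollar amounts are all made up for illustration and are not CE data.

```python
WEEKS_PER_QUARTER = 13  # multiplier that converts a weekly amount to a quarterly amount


def integrated_quarterly_estimate(interview_quarterly, diary_weekly):
    """Sum per-UCC estimates from both surveys on a common quarterly basis.

    interview_quarterly: dict mapping UCC -> quarterly estimate (Interview Survey)
    diary_weekly:        dict mapping UCC -> weekly estimate (Diary Survey)
    """
    diary_quarterly = sum(value * WEEKS_PER_QUARTER for value in diary_weekly.values())
    return sum(interview_quarterly.values()) + diary_quarterly


# Hypothetical UCC labels and amounts
interview = {"UCC_A": 300.0}  # already quarterly
diary = {"UCC_B": 10.0}       # weekly, so 10.0 * 13 = 130.0 per quarter
print(integrated_quarterly_estimate(interview, diary))  # 430.0
```

Each UCC should be taken from only one survey; the Source Selection File indicates which survey the CE tables use for each UCC.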

How to calculate representative statistics?


Users can calculate representative statistics with the weight variable FINLWT21. This variable attributes a
weight to each NEWID, which allows users to estimate values for the entire population. This variable is
available in the FMLI and FMLD files.
Does the CE program provide sample code that uses the below formulas?
Yes, the CE program provides sample code with the same logic in SAS, STATA, and R on the PUMD
documentation page.

6.3.1 Developing a weighted calendar year estimate

This section presents the methods to calculate the population, aggregate values, and average values for
expenditures or income for a calendar year.

Denominator: Population

NEWID = Identifier for one CU for one quarter


FINLWT21 = Weight of each NEWID
QNUM = Number of quarters in the analysis (Usually equal to 4 for a 1 year estimate)
MO_SCOPE = Indicator for the number of months in scope for each NEWID
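
Written out from these definitions and from the two adjustment factors described in this section (division by QNUM and multiplication by MO_SCOPE/3), the population denominator can be expressed as follows. This is a reconstruction from the surrounding text and should be read as a sketch rather than as the CE program's published notation.

```latex
\text{Population} = \sum_{\text{NEWID}} \frac{\text{FINLWT21}}{\text{QNUM}} \times \frac{\text{MO\_SCOPE}}{3}
```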

How do users calculate representative population weights (FINLWT21)?


To make the population weights representative of the U.S. population, data users need to generate two
adjustment factors:
"QNUM" adjusts the weights from annual to quarterly. The CE sample is designed to be
representative of the entire U.S. population in each collection quarter. Thus, for an annual estimate, the
weight (FINLWT21) needs to be divided by 4. Without this adjustment, the population
in the denominator would be 4 times as large as the U.S. population. For example, for an annual
estimate (4 quarters), QNUM is 4.
"MO_SCOPE/3" adjusts the weights for CUs that are out of scope. Interviews that were conducted
in January, February, or March are not fully in scope. (This applies only to the interview survey. For
more information, see Section 6.1 Estimation procedures for the Interview Survey.) For these
months, only the part that is in scope should be used for representative population weights.
MO_SCOPE adjusts the CU weights to the months in scope.
How to determine the value for MO_SCOPE?
The value for MO_SCOPE depends on the survey. For all four quarters within the Diary Survey, the value
for MO_SCOPE is 3. For the Interview Survey, MO_SCOPE depends on the year and month of the
interview. Users can identify the year using the FMLI variable QINTRVYR. For a description of what
months are in scope, see Section 6.1 Estimation procedures for the Interview Survey.

For the first four quarters, MO_SCOPE is defined by the value of QINTRVMO:

If QINTRVMO is 1 then MO_SCOPE is 0


If QINTRVMO is 2 then MO_SCOPE is 1
If QINTRVMO is 3 then MO_SCOPE is 2
If QINTRVMO is 4-12 then MO_SCOPE is 3

For the fifth quarter, MO_SCOPE is defined by the value of QINTRVMO:

If QINTRVMO is 1 then MO_SCOPE is 3


If QINTRVMO is 2 then MO_SCOPE is 2
If QINTRVMO is 3 then MO_SCOPE is 1
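
The two sets of rules above can be combined into one small function for the Interview Survey. This is a sketch only; the function name `mo_scope` and the `fifth_quarter` argument are assumptions made for this example, and the caller is expected to know whether a record comes from the fifth-quarter file (for example, by checking QINTRVYR).

```python
def mo_scope(qintrvmo: int, fifth_quarter: bool = False) -> int:
    """Return the number of in-scope months for an Interview Survey record.

    qintrvmo:      interview month (1-12), as in the FMLI variable QINTRVMO
    fifth_quarter: True for records from the first quarter of the following year
    """
    if not 1 <= qintrvmo <= 12:
        raise ValueError("QINTRVMO must be between 1 and 12")
    if fifth_quarter:
        if qintrvmo > 3:
            raise ValueError("fifth-quarter interviews occur in months 1-3")
        return 4 - qintrvmo  # 1 -> 3, 2 -> 2, 3 -> 1
    return min(qintrvmo - 1, 3)  # 1 -> 0, 2 -> 1, 3 -> 2, 4-12 -> 3


print([mo_scope(m) for m in range(1, 13)])                  # [0, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3]
print([mo_scope(m, fifth_quarter=True) for m in (1, 2, 3)])  # [3, 2, 1]
```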

Numerator: Aggregate value

X = Expenditures or income variables by NEWID. This formula can be used for quarterly,
annual, weekly, or monthly data.
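
One way to write the aggregate is as the weighted sum of X over all NEWIDs. This sketch assumes that X already covers only the reference months that fall in the calendar year of interest; it is an assumption based on the definitions given here, not a formula quoted from the CE program.

```latex
\text{Aggregate}_X = \sum_{\text{NEWID}} \text{FINLWT21} \times X
```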

Quotient: Average value
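
The average value is the aggregate (numerator) divided by the population (denominator). The sketch below works through that quotient under stated assumptions: the record layout (dictionaries with keys `FINLWT21`, `MO_SCOPE`, and `X`), the function name, and all numbers are hypothetical, and X is assumed to contain only in-scope amounts.

```python
def weighted_calendar_year_estimate(records, qnum=4):
    """Compute population, aggregate, and mean from per-NEWID records."""
    # Denominator: weights adjusted for the number of quarters and months in scope.
    population = sum(r["FINLWT21"] / qnum * r["MO_SCOPE"] / 3 for r in records)
    # Numerator: weighted sum of the expenditure or income variable.
    aggregate = sum(r["FINLWT21"] * r["X"] for r in records)
    return {"population": population, "aggregate": aggregate, "mean": aggregate / population}


# Four made-up records, one per quarter, each fully in scope
records = [{"FINLWT21": 4000.0, "MO_SCOPE": 3, "X": 200.0} for _ in range(4)]
result = weighted_calendar_year_estimate(records)
print(result["population"])  # 4000.0
print(result["aggregate"])   # 3200000.0
print(result["mean"])        # 800.0
```

In this made-up case, a quarterly amount of 200 reported in each of four quarters yields an annual mean of 800, which illustrates why the weights in the denominator are divided by QNUM.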

6.3.2 Reliability statement

Description of sampling and non-sampling errors


Sample surveys are subject to two types of errors, sampling and non-sampling. Sampling errors occur because
observations are not taken from every unit in the entire population. Standard errors measure sampling errors. The
primary purpose of standard errors is to provide users with a measure of the variability associated with the mean
estimates. The sample estimate and its estimated standard error enable one to construct confidence intervals.

Non-sampling errors can be attributed to many sources, such as definitional difficulties, differences in the
interpretation of questions, inability or unwillingness of the respondent to provide correct information, mistakes
in recording or coding the data obtained, and other errors of collection, response, processing, coverage, and
estimation of missing data. Estimates using a small number of observations are less reliable. Research articles
examining CE measurement error and nonresponse bias are included in the CE library. The CE program
regularly examines CE data in the annual data quality assessment and compares CE results with other sources of
federal statistics. For more information, see the Data Quality and Comparisons page.

Estimating sampling error


The CE program estimates sampling error using Balanced Repeated Replication (BRR). The CE program
implements this method with three steps:

1. Selects 44 subsamples that are balanced half-samples of the full sample.
2. Estimates a statistic for each half-sample, using the replicate weight variables WTREP01-WTREP44. Each
replicate weight variable contains a value greater than 0 for CUs assigned to that replicate and a
missing value for CUs not assigned to that replicate.
3. Estimates the variance between the full-sample and half-sample values with the standard formula
for computing sample variances.

Replicate means for expenditures

WTREP = 44 Replicate weights (WTREP01-WTREP44)

Standard error

Note that this method does not work for imputed income data. For information on
calculating sampling errors from imputed income, see the User's Guide to Income
Imputation in the CE.
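
The three BRR steps can be sketched in Python as follows. The equations for the replicate means and the standard error are not reproduced in this copy of the guide, so this is a minimal sketch under assumptions: a plain weighted mean stands in for the statistic, missing replicate weights are represented as None, and the variance is taken as the average squared deviation of the replicate estimates from the full-sample estimate. The function names are hypothetical.

```python
import math

def _weighted_mean(values, weights):
    # CUs with a missing (None) weight are not part of this half-sample.
    pairs = [(x, w) for x, w in zip(values, weights) if w is not None]
    return sum(x * w for x, w in pairs) / sum(w for _, w in pairs)

def brr_standard_error(values, full_weights, replicate_weights):
    """Standard error of a weighted mean from replicate weights.

    values            -- one value of X per CU
    full_weights      -- full-sample weight per CU (for example FINLWT21)
    replicate_weights -- list of weight columns, one per replicate
                         (WTREP01-WTREP44 would give 44 columns)
    """
    full = _weighted_mean(values, full_weights)
    replicates = [_weighted_mean(values, w) for w in replicate_weights]
    # Assumed variance formula: mean squared deviation from the full-sample estimate.
    variance = sum((r - full) ** 2 for r in replicates) / len(replicates)
    return math.sqrt(variance)
```

In practice replicate_weights would hold the 44 columns WTREP01-WTREP44; the sketch accepts any number of columns so it can be checked on small inputs.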

6.4 Sampling statement

6.4.1 Survey sample design

The CE survey sample is a nationwide household survey representing the entire U.S. civilian noninstitutional
population. It includes people living in houses, condominiums, apartments, and group quarters such as college
dormitories. It excludes military personnel living overseas or on base, nursing home residents, and people in
prisons. The civilian noninstitutional population represents more than 98 percent of the total U.S. population. For
more information, see sample design in the chapter Consumer Expenditures and Income of the BLS Handbook
of Methods.

6.4.2 Weighting

Each CU included in the CE sample represents a given number of CUs in the U.S. population, which is
considered to be the universe. Weighting is used to adjust the relative contribution of each CU to reflect the
inverse of its selection probability, as well as to account for nonresponse and to match certain characteristics to
known control totals. For more information, see sample design in the chapter Consumer Expenditures and
Income of the BLS Handbook of Methods.

ⅰ Telescoping errors refer to the temporal displacement of an event. Respondents of the CE surveys may perceive
recent events to be more remote than they are (backwards telescoping) and distant events to be more recent than
they are (forward telescoping).
ⅱ Ian Elkin, Recommendation regarding the use of a CE bounding interview, 2013, Bureau of Labor Statistics.

ⅲ Variable periodicity refers to the period that a given value represents.

ⅳ Primary keys identify each unique record in the database.

ⅴ Quarterly summary expenditures are presented as two variables - one containing expenditures made in the
previous calendar quarter and one containing expenditures made in the current calendar quarter.

ⅵ For more information on the number of contacted addresses and completed interviews, see the CE Data
Quality Profile.
ⅶ Primary keys identify each unique record in the database.

Last Modified Date: February 6, 2019

U.S. Bureau of Labor Statistics | Consumer Expenditure Surveys, PSB Suite 3985, 2 Massachusetts Avenue, NE
Washington, DC 20212-0001

www.bls.gov/CEX | Telephone: 1-202-691-6900 | Contact Us
