RPPI PRACTICAL GUIDE

Approved By James W. Tebrake
Prepared by Vanda Guerreiro

CONTENTS

PREFACE

INTRODUCTION

DATA SOURCES AND WEIGHTS
A. Identifying Data Sources
B. Assessing Source Data Quality
C. Weights

METHODS
D. Method 1: Median with Stratification
E. Hedonic Methods
F. Method 2: Time Dummy Hedonic Method
G. Method 3: Characteristics Hedonic Method
H. Method 4: Imputations Hedonic Method
I. RPPI Results

DISSEMINATION AND OTHER INDICATORS

REFERENCES

BOXES
1. The R Software
2. Price Concepts and their Fitness for Purpose
3. Sample Representativeness
4. Prepare Data for Processing
5. Imputations
6. Should Mean or Median be used?
7. Chained Indices
8. Hedonic Regression

FIGURES
1. Example of a Timeline of the Transaction Process
2. Data Sources and Price Concepts
3. Histograms Before ‘Cleaning’ the data

INTERNATIONAL MONETARY FUND 3


4. Histograms After ‘Cleaning’ the data
5. The Mechanics of the Characteristics Hedonic Method
6. The Mechanics of the Imputations Hedonic Method
7. RPPI Index Compiled with Four Methods
8. RPPI and Sub-indices
9. YoY - RPPI and Sub-indices
10. YoY and Contributions
11. RPPI and Number of dwellings sold
12. RPPI and CPI
13. The Financial Times Vocabulary
14. US Cities with the Highest Share of Million-dollar Dwellings
15. Housing Affordability in Sweden Sample of Cities
16. House Price-to-income Ratio Across Countries
17. Residential Property Sales Variation
18. Heating Map with Price Development Per Square Meter in Italy

TABLES
2. Comparison of Different Data Sources
3. Data Structure
4. Summary Statistics for the Price
5. Summary Statistics for the Surface
6. Frequency of ‘property_type_name’
7. Frequency of ‘town_name’
8. Duplicates
9. Missing Values
10. Outliers
11. Example for Threshold of Number of Bedrooms by Property Type
12. Example of Thresholds for Floor Area by Property Type and Bedrooms
13. Example of Thresholds for Number of Rooms by Property Type and Bedrooms
14. Weights Compilation – Summing the Transactions Value
15. Weights Compilation
1. Five Main Methods for RPPI Compilation
16. Stratification - Index Structure
17. Stratification - Calculation of the Median in the Reference Period (1Q2016)
18. Stratification - Calculation of the Median in the Current Period (2Q2016)
19. Stratification - Compilation of the Sub-indices
20. Stratification - Aggregation of the Sub-indices
21. Stratification - Re-referencing the Sub-indices
22. Stratification - Base Prices
23. Stratification - Median Prices in the Current Period (1Q2017)
24. Stratification - Compilation of the Sub-indices (1Q2017)
25. Stratification - Aggregation of the Sub-indices
26. Stratification - Chaining the Indexes

27. ‘Shadow’ Prices of Houses A and B
28. Example of Chaining the Indexes from Rolling Window Hedonic Regression
29. Time Dummy - Data Set with the Dummy Variables
30. Time Dummy - OLS Results for Stratum 1
31. Time Dummy - OLS Results for Stratum 2
32. Time Dummy – Compilation of the Sub-indices
33. Time Dummy - Aggregation of the Sub-indexes
34. Time Dummy - Re-referencing the Sub-indices
35. Time Dummy - Chaining the Indices
36. Example of Calculation of the
37. Characteristics – Coefficients for Stratum 1 in the Reference Period (1Q2016)
38. Characteristics – Coefficients and Average Characteristics for Stratum 1 in the Reference Period (1Q2016) and in the Current Period (2Q2016)
39. Characteristics – Compilation of the Sub-index for Stratum 1 in the Current Period (2Q2016)
40. Characteristics – Aggregation of the Sub-indices
41. Characteristics - Re-referencing the Sub-indices
42. Characteristics - Chaining the Indices for 2017
43. Characteristics - Chaining the Indices for 2018
44. Imputations - Coefficients
45. Imputations - Sample of Three Dwellings from Stratum 1 of Period 1Q2016
46. Imputations - Example of Price Imputations of Three Dwellings from Stratum 1 of Period 1Q2016
47. Imputations - Jevons of the Imputed Prices for the Year 2016
48. Imputations - Compilation of the Sub-indices for the Year 2016
49. Imputations - Aggregation of the Sub-indices
50. Imputations - Re-referencing the Sub-indices
51. Imputations – Compilation of the Sub-indices for 1Q2017
52. Imputations - Chaining the Indices for 2017
53. Imputations - Chaining the Indices for 2018
54. RPPI Index Compiled with Four Methods

ANNEXES
I. Relation between the Methods and the Data Available
II. G20 Data Gaps Initiative Template on Housing Price Statistics

PREFACE
Forthcoming

INTRODUCTION
1. The RPPI Practical Compilation Guide (Guide) sets out practical advice on the compilation
of a Residential Property Price Index (RPPI) based on the conceptual approach described in the
Handbook on Residential Property Price Indices (Handbook), published in 2013. The preparation of
the Guide was funded by Switzerland's State Secretariat for Economic Affairs (SECO), whose
support is gratefully acknowledged.

2. The Handbook notes that “Residential property price indices have a number of important
uses:

• as a macro-economic indicator of economic growth;
• for use in monetary policy and inflation targeting;
• as an input into estimating the value of housing as a component of wealth;
• as a financial stability or soundness indicator to measure risk exposure;
• as a deflator in the national accounts;
• as an input into an individual citizen’s decision making on whether to buy (or sell) a
residential property;
• as an input into the consumer price index, which in turn is used for wage bargaining and
indexation purposes;
• for use in making inter-area and international comparisons.”

3. The Second Phase of the G-20 Data Gaps Initiative and guidance on Financial Soundness
Indicators identify real estate statistics, in particular price changes on residential property, as a
critical input into financial stability policy analysis and macroprudential measures. Data users in
IMF member countries have expressed a need for RPPIs and the IMF has provided training and
capacity building to many member countries.

4. The Guide was prepared by the IMF Statistics Department, building on experience
gathered during technical assistance and training missions delivered to RPPI compilers from
more than 80 countries over the past six years. Compilers have expressed the need for practical
guidance to complement the theoretical content of the Handbook, particularly when transitioning
from less to more sophisticated methods.

5. The Guide was piloted with several countries.1 In addition, as is common practice in
developing methodological guidance, a public consultation was launched in November 2019,
open to RPPI compilers and other stakeholders world-wide.

1 Switzerland State Secretariat for Economic Affairs 2 (SECO2) RPPI project, funded by the Government of
Switzerland.

6. The Guide is written for data compilers in statistical offices, central banks, housing
agencies, or other statistical agencies and demonstrates how to compile and disseminate RPPIs.
Using the Guide requires a background in statistics and some knowledge of econometrics.

7. The Guide includes several practical exercises that compilers can complete to better
understand the data sources and methods used to compile an RPPI. The exercises are delivered
through a series of step-by-step instructions that make use of a synthetic data set and R scripts
(see Box 1) developed specifically for RPPI compilation.

8. The synthetic data set and the R scripts that accompany this Guide were developed
in-house. Users can access and download these data and the R scripts. The data set has 22
variables and includes three years (2016–2018) of monthly data. In total there are over 13,000
observations.

Box 1. The R Software


R is a free, open-source programming language and software environment for statistical
computing (often used through the RStudio interface), with a large online community that
supports its availability and reliability and provides technical support. The hardware and
software requirements are the same as for other econometric software such as Stata, SAS, or
EViews. Because R is open source, anyone can use it free of charge. The R scripts
accompanying this Guide have been prepared so that compilers should be able to use them
with their own data to compile a monthly or quarterly index, with a few easy-to-perform
adaptations.2 Using the R scripts and the data set can help compilers develop their capacity
for index compilation.

9. The Guide is useful for all RPPI compilers but specifically targets new RPPI compilers with
capacity constraints (lack of econometricians, programmers, or IT support). The aim of the Guide
is to effectively help developing countries close data gaps and improve data quality. Countries
that are already compiling an RPPI can also benefit from the Guide by using it to improve their
methodologies or processes.

10. The Guide is divided into three sections. The first section illustrates the type of data
commonly used to construct RPPIs and how compilers can best evaluate the quality of the source
data. It also outlines how to prepare or pre-process data prior to compiling an RPPI. Finally, the
section stresses the importance of selecting appropriate weights and the different types of data
that can be used to construct the weights.

11. The second section of this Guide outlines the different methods that can be used to
compile RPPIs.

12. Finally, the third section provides guidance on dissemination practices and suggests
some additional indicators compilers may want to construct in order to complement the RPPI.

2 Specific guidance on what and how to change is provided in each R script.

13. The Guide concludes by providing some practical advice regarding the dissemination of
RPPIs.

DATA SOURCES AND WEIGHTS


A. Identifying Data Sources

Reference: Handbook on Residential Property Price Indices, pages 102-111

14. When designing an RPPI program, compilers should target a national monthly or
quarterly index covering monetary transactions (mortgage and cash purchases) of all types of
residential properties (single and multi-family dwellings, new and existing dwellings) in line with
the concepts outlined in the Handbook. In cases where the available data, the operation of the
market, or other constraints do not support complete alignment with the Handbook, compilers
may need to reduce the coverage of the index (e.g., the index only covers urban markets or
registered transactions).

15. To compile an RPPI that conforms to recommended best practices, data on the
transaction price and on the key characteristics 3 of the dwellings are required to ensure a
constant-quality index. As with any price index, the target is to follow the price trend by
comparing like with like. This is a challenge since a direct price comparison requires comparable
dwellings, or the same dwelling, to be transacted in consecutive periods. This is generally not the
case with residential properties, where the same property may be sold only once every 20 years.
Given the infrequent sale and the heterogeneity of residential properties, quality adjustment
techniques are required to derive measures of pure price change. This means that the data
requirements for a high-quality RPPI are extensive and rely heavily on the acquisition of detailed
characteristics for each property, given that a wide range of characteristics can
influence the price of a dwelling. For example, usable floor area, number of rooms, district,
region, property type, age, new or existing, completion state of dwelling, existence of an elevator,
etc. can all affect the price of a dwelling.
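
The effect of a changing mix of dwellings can be illustrated with a small numerical sketch. It is written in Python for illustration and is not one of the R scripts that accompany the Guide; the transactions are invented, and price per square meter is used only as a crude stand-in for the quality adjustment techniques described later.

```python
from statistics import mean

# Hypothetical transactions as (floor area in m2, price). Every dwelling sells
# for 1,000 per m2 in both periods, so there is no pure price change at all.
period_1 = [(50, 50_000), (60, 60_000), (120, 120_000)]
period_2 = [(110, 110_000), (120, 120_000), (130, 130_000)]


def mean_price(sales):
    """Unadjusted average transaction price."""
    return mean(price for _, price in sales)


def mean_price_per_m2(sales):
    """Average price per m2, a very simple adjustment for dwelling size."""
    return mean(price / area for area, price in sales)


raw_change = mean_price(period_2) / mean_price(period_1) - 1
per_m2_change = mean_price_per_m2(period_2) / mean_price_per_m2(period_1) - 1

print(f"Change in mean price:        {raw_change:.1%}")
print(f"Change in mean price per m2: {per_m2_change:.1%}")
```

In this made-up example the unadjusted mean price rises sharply only because larger dwellings were sold in the second period, while the size-adjusted measure shows no change.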

16. As a first step in designing an RPPI, compilers should acquire an in-depth understanding
of the real estate transaction process within their country. This will allow the compiler to
understand the type of information generated throughout the transaction process and which
data would be useful for the purposes of compiling an RPPI. Figure 1 illustrates a possible set of
information that could be generated leading up to a real estate transaction, during the execution
of the transaction, and after the transaction has occurred.

3 Key characteristics refer to structure and location attributes such as usable floor area, number of rooms, district,
region, property type, age, new or existing, completion state, renovations, etc.

Figure 1. Example of a Timeline of the Transaction Process

17. It is clear from Figure 1 that the same dwelling can have many different prices at distinct
points in time. For example, there is typically an asking price, a transaction price, and a declared
price. When constructing an RPPI, the compiler should ideally aim to acquire the agreed-upon
price that corresponds to the date the transaction is finalized (i.e., this is the target
price to be acquired by the compiler).

Box 2. Price Concepts and their Fitness for Purpose

1. Asking price – The price asked when the dwelling is put on the market. It may change
if the dwelling takes some time to sell, and after negotiation. It is normally higher than the target
price. It may be a suitable proxy.
2. Transaction price – After negotiations, the buyer and the seller agree on the price. This is the target
price, i.e., the price actually paid. However, it can be difficult to obtain since at this point the
transaction has not yet been recorded in a formal or institutional database.
3. Valuation price – The dwelling is appraised by the lending institution. The appraised value can be
higher than, lower than, or equal to the transaction price. If the appraised value equals the
transaction price, these data are suitable; otherwise they are not recommended.
4. Declared price – This is the “legal” price of the dwelling, normally used for tax purposes. The
selling price is normally registered later than the transaction date, and declared prices
may differ from transaction prices. Depending on the size of these differences, the declared price
may or may not be a suitable proxy.

18. The data required to construct an RPPI can be obtained from a variety of sources,
including websites, real estate authorities, or surveys conducted by national statistical offices
(NSOs). The most commonly used data sources are illustrated in Figure 2, along with the price
concepts most commonly associated with each data source.

Figure 2. Data Sources and Price Concepts

[Figure: data sources (web sites; agents and developers; lawyers and notaries; land register; tax authorities; lending institution) and the associated price concepts (asking price; declared price; valuation price).]

19. Data from websites can be obtained by web scraping or through direct provision by the
site owner. Data can also be obtained via application programming interfaces (APIs)
when the website or data owner decides to make the data available in this way. In general,
websites provide wide coverage, a detailed description of the property, and the asking price. In
many countries, online sites cover the entire country, all types of dwellings, and both cash and
mortgage transactions. In many cases, dwellings sold through agents, developers, or
owners are all included. Timeliness is an advantage of this source since the price is obtained as
soon as the property enters the market.

20. While data obtained from websites have many advantages, there are often several
drawbacks. In some countries, web listings are confined to higher-end properties that are only
transacted by developers. It is also common for the asking price to differ from the final
transaction price, which is not captured in the advertisement. Finally, compilers cannot always
assume a property has sold if it is no longer listed. A dwelling can be “delisted” simply because
the owner decides not to sell.

21. Compilers can also survey agents and developers to obtain residential real estate prices
and related information. The advantage of a survey is that the compiler can design the
questionnaire to obtain precisely the information required for the concept being measured.
While a survey gives the compiler the greatest degree of control over the
information used to construct the index, it has several drawbacks, including cost and
response burden. In addition, samples are often required, and it is very difficult to obtain a
representative sample, as explained in Box 3.

Box 3. Sample Representativeness


Sample surveys can be used to collect data from agents, developers, lawyers, or notaries. This raises the
issues of sampling bias and representativeness. When sampling respondents, care must be taken to
ensure adequate coverage of turnover, geography, and dwelling type. The sample should be
stratified, with the larger developers and agents included in the sample on a regular basis and the
smaller agents and developers rotating into the sample every year. Information on turnover may be
available from business surveys or government administrative records. The larger agents and developers
may work exclusively with dwellings located in the main cities. To ensure adequate coverage of the
entire territory, a sample of small developers and agents may need to be selected. In addition, small
developers and agents may focus on a niche part of the market, so excluding them can mean that an
important segment of the market is missed. Finally, surveys can be costly and impose a respondent
burden. More information on sampling can be found in Chapter 4 of the CPI Manual: Concepts and
Methods, forthcoming.
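
The stratified design described in Box 3 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a prescribed sampling procedure: the respondent names, turnover figures, cutoff, and the function `draw_sample` are all hypothetical.

```python
import random


def draw_sample(turnover, cutoff, n_small, cycle, seed=2016):
    """Return (take_all, rotating) respondent lists for one survey cycle.

    turnover: dict mapping respondent name to annual turnover (assumed known).
    cutoff:   turnover at or above which a respondent is sampled every cycle.
    n_small:  number of small respondents to include in each cycle.
    cycle:    0 for the first year, 1 for the second, and so on.
    """
    take_all = sorted(name for name, t in turnover.items() if t >= cutoff)
    small = sorted(name for name, t in turnover.items() if t < cutoff)
    if not small or n_small <= 0:
        return take_all, []
    # Shuffle once with a fixed seed so the rotation order is reproducible.
    random.Random(seed).shuffle(small)
    start = (cycle * n_small) % len(small)
    size = min(n_small, len(small))
    rotating = [small[(start + i) % len(small)] for i in range(size)]
    return take_all, rotating


# Hypothetical respondents and turnover values.
turnover = {
    "Developer A": 900, "Agent B": 750, "Agent C": 40, "Agent D": 35,
    "Developer E": 20, "Agent F": 15, "Agent G": 10, "Developer H": 5,
}

for cycle in range(3):
    large, rotating = draw_sample(turnover, cutoff=500, n_small=2, cycle=cycle)
    print(cycle, large, rotating)
```

The large respondents appear in every cycle, while the small respondents are visited in successive blocks of a fixed random order, so each of them rotates into the sample over time.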

22. Key data sources that compilers can use to compile RPPIs are national land registries or
property assessment files maintained by government or tax authorities. These data may cover all
transactions in all geographic regions. Depending on national circumstances, the quality of the
data may be good given their coverage and the fact that they are used for legal and tax
purposes. While land registries and property assessment files are often an excellent source of
information, care must be taken to ensure the transaction is recorded in the proper period. In
certain countries, the sale may be registered three months after the transaction takes place.
If the registration date is used in this case, the price would be recorded in the wrong period and
the compiler may introduce a time bias in the calculation of the index.
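
A simple check for this time bias is to compare the period implied by the transaction date with the period implied by the registration date. The Python sketch below is illustrative only; the field names `transaction_date` and `registration_date` and the records are assumptions, and real sources may not carry both dates.

```python
from datetime import date


def quarter(d):
    """Label a date with its quarter, e.g. 1Q2016."""
    return f"{(d.month - 1) // 3 + 1}Q{d.year}"


# Hypothetical register extract with both dates available.
records = [
    {"id": 1, "transaction_date": date(2016, 3, 20), "registration_date": date(2016, 6, 18)},
    {"id": 2, "transaction_date": date(2016, 4, 5), "registration_date": date(2016, 4, 28)},
    {"id": 3, "transaction_date": date(2016, 6, 25), "registration_date": date(2016, 9, 1)},
]

# Transactions that would land in the wrong quarter if the registration date
# were used to assign the period.
shifted = [
    r["id"]
    for r in records
    if quarter(r["transaction_date"]) != quarter(r["registration_date"])
]

for r in records:
    print(r["id"], quarter(r["transaction_date"]), quarter(r["registration_date"]))
print("Assigned to a different quarter by registration date:", shifted)
```

Where only the registration date is available, the share of shifted records cannot be measured directly and would have to be assessed with the data provider.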

23. Mortgage applications obtained from lending institutions are another possible source of
information that can be used to compile RPPIs. The information is often timely, and the
geographic coverage is generally broad. Lending institutions often hold a large amount of
information related to the characteristics of the dwellings which can also aid in the calculation of
the RPPI.

24. When using mortgage applications as a data source for compiling RPPIs, compilers need
to ensure that the price on the application reflects the final selling price. Mortgage applications
often contain the upper limit of the value that can be borrowed, which can differ from the
agreed-upon transaction price of the property. Additionally, in some countries the banking sector
is very fragmented, and a substantial amount of work will be needed to standardize
the data so that they can be processed into an RPPI. In some countries it may be best to work with
the central bank, which may act as a clearing house for detailed mortgage information.

25. It may also be possible to obtain information from lawyers or notaries. Often, following
the sale of a residential property, the contract is registered with a lawyer or notary. These are
typically good data sources in terms of coverage and contain detailed information about the
characteristics of the dwelling, its location, and the actual transaction price.

26. However, if sales are registered with a lawyer, it may be difficult to obtain a complete
electronic file that includes all transactions for a given period. This is because in some countries
lawyers are not required to register their contracts in a national database. When a national
registration system does not exist, each lawyer may have their own contract registration system.
The compiler would then need to obtain the registry from each lawyer and integrate the data
into a common registry, which is costly and inefficient. Obtaining data from lawyers may also be
difficult due to client privilege rights, since in most jurisdictions lawyers cannot provide
information about their clients to authorities. Notaries are normally regulated and required to
record the sale of a property in a common database. The contract is likely to be the first official
register that secures the transaction and the transfer of ownership of the dwelling. In most cases
the price that is recorded in the database is the target price for the RPPI. The main drawback with
this data source is that it may often lack detailed information on the characteristics of the
dwelling.

27. The compiler should always bear in mind that the available data sources and their
strengths and weaknesses are country-dependent.

Table 1. Comparison of the Features of the Different Data Sources

| Data Provider | Availability | Coverage | Timeliness | Detail for Quality Adjustment | Price target |
|---|---|---|---|---|---|
| Websites | Good. Immediate and free. | Partial | Good | Must be checked on each website | Asking price differs from transaction price |
| Agents | Good. Requires a survey, therefore is costly. | Partial | Good | Good | Asking price differs from transaction price |
| Developers | Good. Requires a survey, therefore is costly. | Partial | Good | Good | Asking price differs from transaction price |
| Lending institutions | Must be negotiated | Excludes cash transactions | Good | Good | Value can differ from transaction price |
| Lawyers or notaries with electronic data set | Good | Good if mandatory or very common | Good | Must be checked | May suffer from underreporting of values |
| Lawyers or notaries without electronic data set | Costly | Sample | Good | Must be checked | May suffer from underreporting of values |
| Land register and Tax authorities | Good | Can be rather late | Can be rather late | Must be checked | May suffer from underreporting of values |

B. Assessing Source Data Quality

28. Prior to constructing an RPPI and selecting the data sources, a compiler should undertake
a thorough assessment of the quality of the data. It may be useful for the compiler to assess the
data source using a quality assessment framework. Statistical authorities often have quality
assessment frameworks that are used to assess statistical programs within their jurisdiction. If
one exists, compilers should use their country-specific quality assessment framework to assess
the quality of their source data. If a data quality framework does not exist, the compiler should
either use the IMF’s generic Data Quality Assessment Framework or at least assess the data
source(s) in terms of accuracy, fit for purpose, structure, coverage, and availability.

29. The accuracy can be determined by assessing the occurrence of errors and missing
observations.
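
As a minimal sketch of such an accuracy check, the share of missing values per variable and the number of obviously erroneous prices can be tabulated. The Python code below is illustrative; the variable names (`price`, `surface`, `town_name`) and the records are assumptions, and what counts as an error must be defined by the compiler.

```python
def missing_share(records, variables):
    """Share of records in which each variable is missing (None or empty)."""
    n = len(records)
    return {
        v: sum(1 for r in records if r.get(v) in (None, "")) / n
        for v in variables
    }


def nonpositive_prices(records):
    """Positions of records whose price is a number that is zero or negative."""
    return [
        i for i, r in enumerate(records)
        if isinstance(r.get("price"), (int, float)) and r["price"] <= 0
    ]


# Hypothetical source records.
records = [
    {"price": 120000, "surface": 85, "town_name": "Town A"},
    {"price": 95000, "surface": None, "town_name": "Town B"},
    {"price": -1, "surface": 70, "town_name": ""},
    {"price": 210000, "surface": 140, "town_name": "Town A"},
]

print(missing_share(records, ["price", "surface", "town_name"]))
print(nonpositive_prices(records))
```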

30. Fit for purpose is related to conceptual issues such as whether the price reflects the
target price concept, whether the coverage is adequate (national vs. local), and whether the data
meet the desired timeliness and frequency (monthly, quarterly, annually) requirements.

31. The structure of the data can be measured by the number of variables or attributes
associated with each observation. For example, source data containing only price and city are less
useful than source data containing price, city, size, type of structure, and age.

32. Coverage is the share of the total observations required that are included in the source
data. For example, if a compiler is targeting a national index but the
data source only includes transactions for two of the 10 largest cities, then the coverage would be
low (even if the source had a census of transactions for the two cities). Coverage also depends on
how the market is regulated. In a regulated system, it is highly likely that at least some
components of the transaction will be registered, and the coverage will be high. Conversely, in a
less regulated or unregulated market, a transaction can be as simple as the two parties agreeing
on a price and the corresponding amount changing hands. It is then less likely that any
component of the transaction will be registered, and the coverage will be low.
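
The two-of-ten-cities example can be expressed as a simple ratio. The Python sketch below is illustrative; the transaction counts are invented, and in practice the number of required observations usually has to be estimated from another source.

```python
def coverage(observed, required):
    """Share of required observations that are present in the source data.

    observed and required map a city name to a number of transactions.
    Observed counts are capped at the required count for each city.
    """
    total_required = sum(required.values())
    total_observed = sum(
        min(observed.get(city, 0), needed) for city, needed in required.items()
    )
    return total_observed / total_required


# Hypothetical target: 100 transactions in each of the 10 largest cities.
required = {f"City {i}": 100 for i in range(1, 11)}
# The source holds a census of transactions, but for two cities only.
observed = {"City 1": 100, "City 2": 100}

print(f"Coverage: {coverage(observed, required):.0%}")
```

Even with every transaction captured in the two cities, the source covers only a small share of the national target.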

33. Compilers also need to assess the availability of the data. While data may be highly
accurate with good coverage, they will be of little use if they cannot be acquired reliably
according to an agreed-upon schedule. Likewise, data that are not in a format that the
compiler can readily access may be of little use. For example, in some cases the information
may only be available in paper format. When the transactions are registered on paper, the
information must be digitized. In other cases, transactions may be recorded in digital form, but
several different regional systems may be in use, so that several data files using different
standards need to be standardized and merged into a single data file. This will pose challenges
such as:

• dealing with different formatting;
• differing sets of variables;
• variables with the same name but different meanings; and
• identifying a key to integrate the information.
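
The challenges listed above can be made concrete with a small sketch that maps two regional files onto one layout. It is written in Python for illustration; the region names, column names, price units, and the composite key are all assumptions rather than features of any actual register.

```python
# Hypothetical mapping from each region's column names to a common layout.
COLUMN_MAPS = {
    "north": {"ref": "source_id", "sale_price": "price", "area_m2": "surface", "town": "town_name"},
    "south": {"id": "source_id", "Price": "price", "Surface": "surface", "City": "town_name"},
}
# Same variable, different meaning: the south file is assumed to report
# prices in thousands, the north file in units.
PRICE_SCALE = {"north": 1, "south": 1000}


def standardize(region, rows):
    """Rename columns, harmonize units and formats, and build a unique key."""
    mapping = COLUMN_MAPS[region]
    out = []
    for row in rows:
        rec = {mapping[k]: v for k, v in row.items() if k in mapping}
        rec["price"] = float(rec["price"]) * PRICE_SCALE[region]
        rec["surface"] = float(rec["surface"])
        rec["town_name"] = rec["town_name"].strip().title()
        # Record ids may repeat across regions, so the key combines both.
        rec["key"] = f"{region}-{rec['source_id']}"
        out.append(rec)
    return out


north = [{"ref": "17", "sale_price": "120000", "area_m2": "85", "town": " riverton "}]
south = [{"id": "17", "Price": "95", "Surface": "70", "City": "LAKESIDE"}]

merged = standardize("north", north) + standardize("south", south)
for rec in merged:
    print(rec)
```

Keeping the mappings in one place, separate from the processing code, makes it easier to document each provider's standard and to update it when a regional system changes.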

34. Once the RPPI compiler has assessed the source data and determined that they can be used to
develop an RPPI, the compiler should enter into a formal data exchange agreement with the data
provider to ensure a consistent and reliable transmission of the source data. This agreement
should articulate, among other things:

• procedures for the secure transmission of the data (since the data often include sensitive
information);
• the provision of metadata;
• an agreed-upon schedule for the transmission; and
• validation reports to ensure the data were transmitted correctly.

35. Detailed explanations and examples of how to assess data quality are shown below
in Exercise 1.

36. The RPPI compiler should also ensure that a data management system is in place to
store, process, and back up the files. The compiler should also ensure that they are able to keep
track of which files were used when calculating the RPPI to guarantee the replicability of the
index compilation. The storage space required for all the files can be quite large. It is
recommended that the compiler maintain and archive the raw data, the cleaned data, and the
processed data for each compilation cycle. If a proper data management system and practice
does not exist, the data can be damaged or even lost.

37. Source data used to construct RPPIs are often sensitive. The compiler's ability to access
the micro-data may be limited even when regulations are in place that entitle statistical offices to
have access to data. For instance, tax authorities may be reluctant to share data with other
institutions. In these cases, the compiler should work with the data provider to address
confidentiality issues. It is considered a best practice to establish a memorandum of
understanding that governs the exchange of the information and clearly lays out each party's
responsibility with respect to protecting the confidentiality of the data.

38. The compiler should aim to obtain data monthly even if the intention is to construct the
index on a quarterly basis. The data should be validated as soon as they are received.

39. It is recommended that the compiler develop standard validation reports that can be
generated upon receipt of the data. In addition to developing validation reports the compiler may
want to open the file for a brief visual inspection. This will allow the compiler to confirm that
the structure of the data remains stable and to detect severe errors such as:

• an empty or corrupted file;
• an unexpected number of observations;
• a file with unrecognizable characters; or
• binary variables with numbers other than zero or one.
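
The checks listed above lend themselves to a small automated report. The following is a minimal Python sketch, not the Guide's own R code; the function name validation_report, the expected column list, and the row-count bounds are hypothetical and would need to be agreed with the data provider.

```python
def validation_report(rows, expected_columns, binary_columns, min_rows, max_rows):
    """Return a list of problems found in a delivered file.

    rows: list of dicts, one per transaction (for example from csv.DictReader).
    The column lists and row-count bounds are illustrative assumptions.
    """
    if not rows:
        return ["file is empty"]
    problems = []
    if not (min_rows <= len(rows) <= max_rows):
        problems.append(f"unexpected number of observations: {len(rows)}")
    # Structure check: the columns should match the agreed layout exactly.
    found = list(rows[0].keys())
    if found != list(expected_columns):
        problems.append(f"unexpected structure: {found}")
    # Binary variables may only contain 0, 1, or the missing-value marker NA.
    for col in binary_columns:
        bad = {str(r.get(col)) for r in rows} - {"0", "1", "NA"}
        if bad:
            problems.append(f"{col}: non-binary values {sorted(bad)}")
    return problems


sample = [
    {"price": "407906", "garage": "1"},
    {"price": "597951", "garage": "2"},  # 2 is not a valid binary value
]
print(validation_report(sample, ["price", "garage"], ["garage"], 1, 10))
```

A report of this kind can be run each time a file is received, before any visual inspection.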

40. Prices should usually be recorded in the official local currency since this is usually the
actual price facing households. If the original data file shows other currencies, use the daily
exchange rates from the national central bank website to convert the transaction value into the
local currency.


Box 4. Prepare Data for Processing


Before importing a data file (Excel or CSV) into a processing system some preparations should be
undertaken to make sure the program reads the file correctly. The required preparations are specific to
the software and the dataset being used; however, some general advice and examples are provided.

• Text in the data file, such as names of variables (e.g., floor area) and values of qualitative
variables, 4 should not contain spaces, accents, or misleading characters such as minus signs (-). The
use of underscores is helpful (e.g., “floor_area” and “semi_detached”) since it facilitates readability
by machines and humans.
• Numbers must be formatted as “number” and qualitative variables as “text.”
• Commas or dots to split the thousands should not be used.
• The names of the variables must be exactly the same in every file. For example, the software will
read “town_house” as different from “Town_house” (Capitalization) and “Floor_Area” as different
from “FloorArea” (missing the underscore).
• It is advisable to replace all missing values with “NA” since the software often does not read
empty cells correctly.
• Normally the data set includes a column with the date to which the transaction refers. Adding
a column with the identification of the quarter (for quarterly indices) or of the month (for
monthly indices) is recommended since it is often the case that the data file of the following
quarter will include transactions that happened in the previous quarter.
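
Several of the preparations in Box 4 can be scripted. The following Python sketch is illustrative only (the Guide's scripts are in R); the helper names clean_name, clean_value, and quarter_label are hypothetical, and lower-casing all names is an assumption made here to avoid capitalization mismatches.

```python
import re
import unicodedata


def clean_name(name):
    """Make a variable name machine-friendly: no accents, spaces, or minus signs."""
    # Decompose accented letters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace runs of spaces and minus signs with a single underscore.
    text = re.sub(r"[\s\-]+", "_", text.strip())
    # Assumption: lower-case everything so "Town_house" and "town_house" match.
    return text.lower()


def clean_value(value):
    """Replace empty cells with the explicit missing-value marker NA."""
    if value is None or value.strip() == "":
        return "NA"
    return value.strip()


def quarter_label(year, month):
    """Return a period label in the style of the synthetic data, e.g. 1Q2016."""
    return f"{(month - 1) // 3 + 1}Q{year}"


print(clean_name("Floor Area"), clean_value(""), quarter_label(2016, 3))
```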

Exercise 1: Assessing Source Data Quality in Practice

41. The following exercise outlines a series of procedures that a compiler may undertake to
analyze their data source and pre-process their data so that they are ready to be used to develop
an RPPI. These steps are examples and are intended to illustrate the type of activity that should
be undertaken. The precise steps that a compiler includes will depend on the data sources and
methods they are using to develop their index as well as their specific institutional arrangements.

42. The R script “data_analysis.r” and the data set “1Q2016_raw.csv” that accompany this
Guide can be used to complete this exercise.

43. Visually inspect the data. The Data Structure table below lists the variables that are included
in the synthetic set of data. As a first step the user should open the data set, visually inspect the
data, and identify any problem cases that may require further investigation.

4
A qualitative variable, also called a categorical variable, is a variable that is not numerical. It describes data that
fit into categories (e.g., semi-detached). A numeric variable can also be treated as categorical when its values are
grouped in categories.


Table 2. Data Structure

Data structure Description Type


Period Month, quarter, and year categorical
Year Year categorical
Month Month categorical
town_name name of the town qualitative/categorical
Region region where the town is located qualitative/categorical
property_type_name type of property qualitative/categorical
Baths number of baths categorical
Beds number of beds categorical
Garden existence of a garden categorical
Garage existence of a garage categorical
Balcony existence of a balcony categorical
Utility room existence of a utility room categorical
Conservatory existence of a conservatory categorical
Ensuite existence of a suite (bed and bath) categorical
Fireplace existence of a fireplace categorical
Dglaze existence of double-glazed windows categorical
Floorsq total floor square meters (including common areas) numerical
Price registered price of the dwelling numerical
Air existence of air conditioner categorical
Elevator existence of an elevator categorical
construction_year year of construction categorical
Vintage difference between year of construction and current year numerical

44. Analyze summary statistics – Compilers should assess the economic reasonableness of the
data by creating summary statistics for all variables in their data set. The tables below provide an
example of the type of summary statistics that can be calculated. From Table 4 we observe that
there is a large range between the minimum and the maximum price, and that the mean is far above
the median. This is a good indication that the data set includes several outliers. The last column
“na” shows that the synthetic data set has two missing prices.

Table 4. Summary Statistics for the Price

Minimum q1 median mean q3 Maximum na


60,074 155,928.2 236,434 775,944 353,538 501,489,000 2

Table 5. Summary Statistics for the Floor Area

Minimum q1 median mean q3 maximum na


51 177 227 327.0707 281 36800 1


q1 – First quartile

q3 – Third quartile
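
A summary of the kind shown in Tables 4 and 5 can be produced with a few lines of code. The sketch below uses Python's standard statistics module rather than the Guide's R script; the function name summarize is hypothetical, and the "inclusive" quartile convention is a choice made here (other conventions give slightly different quartiles).

```python
import statistics


def summarize(values):
    """Summary statistics for one numeric variable; None marks a missing value."""
    present = sorted(v for v in values if v is not None)
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "minimum": present[0],
        "q1": q1,
        "median": median,
        "mean": statistics.mean(present),
        "q3": q3,
        "maximum": present[-1],
        "na": len(values) - len(present),
    }


# A few prices taken from the tables in this exercise, plus one missing value.
prices = [407906, 597951, 509494, 501489000, None]
print(summarize(prices))
```

In this small example a single extreme price pulls the mean far above the median, which is the same pattern visible in Table 4.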

45. Verify the date field – The compiler should ensure that the date field is properly
formatted and that the date range is as expected. For example, if the compiler is only expecting
prices for the first quarter of 2019 and finds observations dated the fourth quarter of 2018,
further investigation may be required.
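
The date check described above can be sketched as follows. This is a hedged Python illustration; the function name outside_quarter is hypothetical, and it assumes the date field has already been parsed into date objects.

```python
from datetime import date


def outside_quarter(dates, year, quarter):
    """Return the dates that do not fall in the expected quarter."""
    first_month = 3 * (quarter - 1) + 1
    expected_months = range(first_month, first_month + 3)
    return [d for d in dates if d.year != year or d.month not in expected_months]


received = [date(2019, 1, 15), date(2018, 12, 20), date(2019, 3, 31)]
print(outside_quarter(received, 2019, 1))
```

Observations returned by such a check are not necessarily errors; they may be late registrations that belong to an earlier period.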

46. Check the number of occurrences of each variable – Compilers may also want to
produce a frequency count of the values of each variable. The aim is to find the variables
that can influence the price. For example, if only one dwelling in the data set has a garage then
the compiler may not want to use this information when constructing the pricing model since it
is not statistically representative. Frequency counts can also be useful when deciding how to
group or stratify the data. For example, consider a data set where there are mostly apartments
and town houses, and few bungalows. The compiler may want to group town houses and
bungalows together into a single stratum.

47. Analyze the coverage - Confirm that the coverage is as expected in terms of geography,
type of dwellings, financing, etc. For example, if the target coverage is only the capital city,
transactions in regions other than the capital city should be removed. Tables 6 and 7 are samples
of coverage reports for the variables property_type_name and town_name.

Table 6. Frequency of “property_type_name”

property_type_name Freq
Apartment 529
Detached 472
Semi_detached 60

Table 7. Frequency of “town_name”

town_name Freq
Town_1 346
Town_10 185
Town_2 51
Town_3 99
Town_4 105
Town_5 93
Town_6 68
Town_7 86
Town_8 22
Town_9 6
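
Frequency counts such as those in Tables 6 and 7 can also be used to flag thinly populated categories. The Python sketch below is illustrative; the function names and the minimum count of 30 are assumptions, not recommendations from the Guide.

```python
from collections import Counter


def frequency_table(rows, variable):
    """Count how often each value of a categorical variable occurs."""
    return Counter(row[variable] for row in rows)


def sparse_categories(freq, min_count):
    """Categories with fewer than min_count observations (threshold is an assumption)."""
    return sorted(name for name, n in freq.items() if n < min_count)


# Frequencies of town_name as reported in Table 7.
town_freq = {"Town_1": 346, "Town_10": 185, "Town_2": 51, "Town_3": 99, "Town_4": 105,
             "Town_5": 93, "Town_6": 68, "Town_7": 86, "Town_8": 22, "Town_9": 6}
print(sparse_categories(town_freq, min_count=30))
```

Categories flagged in this way are candidates for grouping with similar categories when strata are defined.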


48. Check for duplicated records - Compilers should include a check for duplicate records
in cases where duplicate records would be clear errors and may impact the subsequent
processing of the data. For example, it may be logically incorrect for a dwelling with the same
selling date and address to appear on the file multiple times. A validation report could be
created which checks for cases where two observations have the same date and address. This
type of verification is especially important when using web scraping techniques since the same
dwelling can be listed by different real estate agents and by the owner across
multiple websites. Table 8 is an example of a validation report that checks for duplicate values.

Table 8. Duplicates

period town_name region property_type_name baths beds (…) floorsq price


1Q2016 Town_1 1 Detached 1 4 (…) 354 407906
1Q2016 Town_1 1 Detached 1 4 (…) 354 407906
1Q2016 Town_1 1 Detached 1 4 (…) 354 407906
(…)
1Q2016 Town_1 1 Detached 1 4 (…) 354 407906
1Q2016 Town_1 1 Detached 1 4 (…) 354 407906
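
A duplicate check of this kind can be sketched in Python as follows. The function name find_duplicates is hypothetical. The Guide suggests matching on date and address; because the columns shown in Table 8 do not include an address, the example matches on the displayed fields, which is an assumption made for illustration.

```python
from collections import defaultdict


def find_duplicates(rows, key_fields):
    """Group row positions that share the same values for key_fields."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[f] for f in key_fields)].append(i)
    # Keep only keys that occur more than once.
    return {key: idx for key, idx in groups.items() if len(idx) > 1}


record = {"period": "1Q2016", "town_name": "Town_1", "property_type_name": "Detached",
          "floorsq": 354, "price": 407906}
other = {"period": "1Q2016", "town_name": "Town_3", "property_type_name": "Detached",
         "floorsq": 308, "price": 597951}
rows = [dict(record), dict(other), dict(record)]
keys = ["period", "town_name", "property_type_name", "floorsq", "price"]
print(find_duplicates(rows, keys))
```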

49. Check for missing values - Compilers may also want to check for missing values.
One way to assess the extent of missing values is to calculate the percentage of missing values to
total expected values. For example, in the synthetic set of data provided with this Guide there are
five missing values. This represents 0.022 percent of the total number of expected values. Table 9 is
an example of a validation report that checks for missing values.

Table 9. Missing Values

period year Month town_name Region property_type_name floorsq price


(…)
1Q2016 2016 3 Town_3 1 Apartment (…) NA 132999
1Q2016 2016 3 Town_1 1 Apartment (…) 88 NA
1Q2016 2016 3 Town_3 1 NA (…) 181 130259
1Q2016 2016 2 NA 1 Apartment (…) 118 NA
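
The share of missing values can be computed as the number of missing cells divided by the number of expected cells (observations times variables). The Python sketch below is illustrative; the function name missing_share is hypothetical, and it is applied only to the four rows and four of the columns displayed in Table 9, not to the full synthetic data set.

```python
def missing_share(rows, fields, marker="NA"):
    """Percentage of missing values relative to all expected values."""
    expected = len(rows) * len(fields)
    missing = sum(1 for row in rows for f in fields if row.get(f) in (None, "", marker))
    return 100.0 * missing / expected


# The four rows displayed in Table 9, restricted to four of the columns.
table9 = [
    {"town_name": "Town_3", "property_type_name": "Apartment", "floorsq": "NA", "price": "132999"},
    {"town_name": "Town_1", "property_type_name": "Apartment", "floorsq": "88", "price": "NA"},
    {"town_name": "Town_3", "property_type_name": "NA", "floorsq": "181", "price": "130259"},
    {"town_name": "NA", "property_type_name": "Apartment", "floorsq": "118", "price": "NA"},
]
fields = ["town_name", "property_type_name", "floorsq", "price"]
print(missing_share(table9, fields))
```

Within this small extract there are five missing cells out of 16, so the share is much higher than for the full file.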

50. Identify errors and outliers - While it is straightforward to identify missing values,
identifying errors and outliers is more challenging. First, it is important to distinguish errors from
outliers. While an outlier is plausible but not very probable, an error is a value that is not
plausible. For example, a 2,500 square meter single-family house with 50 bedrooms is plausible
but not very probable, whereas an apartment with 2,000 bedrooms is not plausible and would be
an error. Errors can occur because of data entry mistakes or misreported information. Compilers
should establish error and outlier detection routines that are run each time a new file is
received from the data provider.


51. One way to identify outliers is to determine a plausible range of values and track all
observations that fall outside of the range. This range can be developed either using a variety of
statistical techniques or based on the compiler’s judgement and knowledge of the industry. The
thresholds used must be defined according to the data characteristics of each country and for
each stratum. Different thresholds must be tested on the data before setting the final values to be
used in the outlier detection routine. Once thresholds are determined the same thresholds
should be used in each period and not be constantly changed.
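
A range-based outlier routine can be sketched as follows. This Python example is illustrative; the function name flag_outliers and the bounds for floor area and price are hypothetical and would have to be set for each country and stratum as described above.

```python
def flag_outliers(rows, limits):
    """Return the rows with at least one value outside its plausible range.

    limits maps a variable name to (low, high). The bounds used below are
    purely illustrative.
    """
    flagged = []
    for row in rows:
        for variable, (low, high) in limits.items():
            value = row.get(variable)
            if value is not None and not (low <= value <= high):
                flagged.append(row)
                break  # one failed check is enough to flag the row
    return flagged


# Floor area and price pairs from Table 10, plus one ordinary record.
observations = [
    {"floorsq": 5000, "price": 547753},
    {"floorsq": 36800, "price": 485352},
    {"floorsq": 30800, "price": 544277},
    {"floorsq": 31500, "price": 439691},
    {"floorsq": 320, "price": 501489000},
    {"floorsq": 387, "price": 38014700},
    {"floorsq": 354, "price": 407906},
]
limits = {"floorsq": (20, 1000), "price": (10_000, 5_000_000)}
print(len(flag_outliers(observations, limits)))
```

Keeping the limits in a single fixed structure makes it easier to apply the same thresholds in every period.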

52. Figure 3 shows histograms for the variables “price” and “floorsq” (“Surface”) for
observations coded to Region 1 in the synthetic dataset. Given the long tails it is clear there are
several outliers in the dataset. The outliers are shown in Table 10.

Figure 3. Histograms Before “Cleaning” the data

Table 10. Outliers

period year Month town_name region property_type_name (…) floorsq price

1Q2016 2016 2 Town_1 1 Detached (…) 5000 547753


1Q2016 2016 1 Town_1 1 Detached (…) 36800 485352
1Q2016 2016 3 Town_2 1 Detached (…) 30800 544277
1Q2016 2016 2 Town_10 1 Detached (…) 31500 439691
1Q2016 2016 1 Town_3 1 Detached (…) 320 501489000
1Q2016 2016 2 Town_10 1 Detached (…) 387 38014700

Box 5. Imputations
When compiling an RPPI it is often necessary to impute values when they are missing or not reported.
For price indices where the matched model is used, this is particularly important since a price is required
for each observation in each time period.

If imputations are required three options are suggested: impute with the average if the compiler is
imputing a numerical value (such as the price), with the mode if the variable is qualitative (e.g., the unit
has a garage), or use expert knowledge to determine the most likely value. Regardless of the technique
being used, a degree of bias is introduced in the index when estimated prices are used. The compiler
should always seek to obtain as much observed data as possible.
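
The first two imputation options in Box 5 can be sketched in Python as follows. The function name impute is hypothetical, and the example treats "average" as the arithmetic mean, which is an assumption.

```python
import statistics


def impute(values, kind):
    """Fill missing entries (None) with the mean or the mode of the observed ones."""
    observed = [v for v in values if v is not None]
    if kind == "numerical":
        fill = statistics.mean(observed)
    elif kind == "qualitative":
        fill = statistics.mode(observed)
    else:
        raise ValueError("kind must be 'numerical' or 'qualitative'")
    return [fill if v is None else v for v in values]


print(impute([100, None, 300], "numerical"))
print(impute([1, 1, 0, None], "qualitative"))
```

Imputed values should be flagged in the data set so that their share can be monitored over time.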


53. Other types of thresholds can be established to control for outliers. The following tables
from O’Hanlon (2011) show examples of how establishing boundaries or ranges can be used to
identify outliers. These boundaries must be adapted to the circumstances of the housing market
of each country. The country data must be analyzed to understand which values should be used
for these boundaries. For instance, a compiler could establish a range of bedrooms that could be
associated with each property type. If an observation falls outside of the range, it would be
flagged as an outlier. Tables 11, 12, and 13 illustrate three types of thresholds that could be
established when attempting to detect outliers within a set of residential property data.

Table 11. Example for Threshold of Number of Bedrooms by Property Type

Property Type Allowable Range for number of Bedrooms


Detached House between 2 and 6
Semi-detached House between 2 and 5
Bungalow between 1 and 6
Terraced House between 1 and 5
Apartment/Flat between 1 and 4

Table 12. Example of Thresholds for Floor Area by Property Type and Bedrooms

Allowable Range for Floor Area (meters squared), by Number of Bedrooms

Property Type 1 2 3 4 5 6
Detached House - 40-239 60-279 100-339 120-419 120-479
Semi-detached House - 40-179 60-199 80-219 80-339 -
Bungalow 10-219 20-219 40-239 80-319 100-379 100-419
Terraced House 20-159 40-159 60-159 60-259 80-379 -
Apartment/Flat 20-119 40-139 40-159 40-219 - -

Table 13. Example of Thresholds for Number of Rooms by Property Type and Bedrooms

Allowable Range for Number of Rooms, by Number of Bedrooms

Property Type 1 2 3 4 5 6
Detached House - 4-8 5-12 6-14 7-16 8-17
Semi-detached House - 4-8 5-10 6-12 7-14 -
Bungalow 3-8 4-8 5-10 6-13 7-15 8-15
Terraced House 3-6 4-7 5-10 6-12 7-13 -
Apartment/Flat 3-5 4-7 5-8 6-10 - -
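
Threshold tables of this kind can be stored as simple lookup structures. The Python sketch below encodes the bedroom ranges from Table 11; the names BEDROOM_RANGE and bedrooms_out_of_range are hypothetical, and the ranges themselves would need to be adapted to each country.

```python
# Allowable bedroom ranges by property type, as listed in Table 11.
BEDROOM_RANGE = {
    "Detached House": (2, 6),
    "Semi-detached House": (2, 5),
    "Bungalow": (1, 6),
    "Terraced House": (1, 5),
    "Apartment/Flat": (1, 4),
}


def bedrooms_out_of_range(property_type, bedrooms):
    """True when the number of bedrooms falls outside the allowable range."""
    low, high = BEDROOM_RANGE[property_type]
    return not (low <= bedrooms <= high)


print(bedrooms_out_of_range("Apartment/Flat", 5))
print(bedrooms_out_of_range("Detached House", 4))
```

The same pattern extends to Tables 12 and 13 by keying the lookup on both property type and number of bedrooms.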

54. The R script associated with this exercise includes a sample set of algorithms that
illustrate the use of thresholds in detecting outliers. These thresholds include:


• for single-family dwellings the maximum number of bedrooms is four and the maximum
number of bathrooms is three;
• only apartments can have an elevator;
• the number of bedrooms must be greater than or equal to the number of bathrooms; and
• the “vintage” (age of the dwelling) must be equal to the difference between the
“construction_year” and the “year” (current year).
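
The four rules above can be expressed as record-level checks. The following Python sketch is not the Guide's R script; the function name rule_violations and the rule labels are hypothetical, and it assumes that every property type other than "Apartment" counts as a single-family dwelling.

```python
def rule_violations(row):
    """Return the names of the consistency rules that a record breaks."""
    broken = []
    # Assumption: anything that is not an apartment is a single-family dwelling.
    if row["property_type_name"] != "Apartment":
        if row["beds"] > 4 or row["baths"] > 3:
            broken.append("size")
        if row["elevator"] == 1:
            broken.append("elevator")
    if row["beds"] < row["baths"]:
        broken.append("beds_vs_baths")
    if row["vintage"] != row["year"] - row["construction_year"]:
        broken.append("vintage")
    return broken


ok = {"property_type_name": "Detached", "beds": 4, "baths": 1, "elevator": 0,
      "year": 2016, "construction_year": 2000, "vintage": 16}
bad = {"property_type_name": "Detached", "beds": 2, "baths": 3, "elevator": 1,
       "year": 2016, "construction_year": 2000, "vintage": 10}
print(rule_violations(ok), rule_violations(bad))
```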

55. The results after data “cleaning” can be seen in Figure 4. The histograms no longer show
a long right tail.

Figure 4. Histograms After “Cleaning” the Data

C. Weights

Reference: Handbook on Residential Property Price Indices, pages 39 and 156

56. Weights play an important role in the compilation of a price index. Weights are used to
aggregate prices from the primary level into more meaningful aggregates and groups. The
choice and quality of weights can have a significant bearing on any price index calculation. For
example, using a mean of prices to create an aggregated index assumes that each price is of
equal importance. In most cases this is not realistic. There are items in the basket that are of
higher importance than other items. As such, the weights should be differentiated and
correspond to the importance of the item in the basket.

57. Residential properties are highly differentiated. The type of dwelling can differ (single
home, apartments), the properties can be in different locations (rural, urban), and the
characteristics of the properties can vary (bedrooms, bathrooms). Given these differences it is
advisable to place properties into strata where “like” properties are grouped together. As will
be discussed later, to measure pure price change from one period to the next an RPPI compiler
must ensure they are comparing the same property from one period to the next. Strata are useful
ways to group like items together. Once the RPPI compiler creates strata there is a need to
develop weights to aggregate the strata so aggregate measures (totals and sub-totals) can be
produced.


58. There are two types of weights that an RPPI compiler can use: stock weights and flow
weights. Stocks are positions in time and flows reflect changes from one period to the next. For
example, a stock weight could be the number of properties in a given geographic region on a
specific date. A flow weight could be the number of properties that were sold between January
1st and March 31st. Flow weights are therefore associated with the transactions that took place in
a certain period. Stock weights are the weights associated with the accumulated flow of
transactions that have taken place from a point in time. The flow weights are suited for
monitoring financial stability and the stock weights are more appropriate if the index is being
used to track the price change in the stock of housing.

59. In most countries stock weights are derived from census data. This presents several
challenges for the compiler. First, a census is conducted on a periodic basis, often every 10 years
and therefore the weights become outdated. Second, the census may not collect the level of
detail required by the compiler such as the number of bedrooms, bathrooms, and area.

60. Flow weights are generally derived from administrative records such as real estate
transactions or loan data obtained from financial institutions. Generally, when an administrative
data source 5 is used all transactions are available and weights for each transaction type can be
estimated. In some cases, a census of administrative transactions cannot be obtained, and only a
sample of the transactions is available. When a sample is used (for example, data from web
scraping) a complementary data source is required to estimate the weights; otherwise, it is
assumed that the mix of dwellings in the sample is representative of the mix of dwellings being
transacted in that period. The weights can be derived from one year of data or from an average
of years (three is the general rule) to obtain more robust results. As a best practice the weights
should be updated annually and kept constant for the year when constructing a sub-annual
index.

61. The following table illustrates weights calculated from the synthetic data provided with
this Guide. 6 First, the total value of all properties in a given stratum is calculated. Second, the
total value of all properties across all strata is calculated. The stratum weights are derived by
dividing the total value of all properties in the stratum by the total value of all properties, as
shown in Tables 14 and 15.

5
Administrative data sources include lawyers, notaries, land registries, tax authorities, and municipalities.
6
The file 1Q2016.csv and the R code weights.r were used in this example.


Table 14. Weights Compilation – Summing the Transactions Value

Period Year Month town_name region property_type_name Floorsq price
1Q2016 2016 3 Town_1 1 Detached 354 407906
1Q2016 2016 1 Town_3 1 Detached 308 597951
1Q2016 2016 3 Town_1 1 Detached 280 509494
(…)
4Q2016 2016 10 Town_1 1 Detached 352 568347
4Q2016 2016 10 Town_4 1 Detached 282 530657
4Q2016 2016 12 Town_3 1 Detached 303 469976
Total Stratum 1 693130897
1Q2016 2016 3 Town_7 2 Semi_detached 101 104267
1Q2016 2016 3 Town_5 2 Detached 223 278265
1Q2016 2016 1 Town_5 2 Detached 213 337151
(…)
4Q2016 2016 11 Town_9 2 Detached 220 126139
4Q2016 2016 10 Town_6 2 Detached 116 134748
4Q2016 2016 11 Town_9 2 Apartment 189 86306
Total Stratum 2 397370036
Total 1090500933

Table 15. Weights Compilation

Total_W 1
W_Stratum1 693130897/1090500933 = 0.635608
W_Stratum2 397370036/1090500933 = 0.364392
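
The weight calculation in Tables 14 and 15 can be reproduced with a short Python sketch (the Guide's own implementation is the R code weights.r; the function name stratum_weights used here is hypothetical).

```python
def stratum_weights(value_by_stratum):
    """Divide each stratum's total transaction value by the overall total."""
    total = sum(value_by_stratum.values())
    return {stratum: value / total for stratum, value in value_by_stratum.items()}


# Total transaction values per stratum, taken from Table 14.
totals = {"W_Stratum1": 693130897, "W_Stratum2": 397370036}
weights = stratum_weights(totals)
for name, weight in weights.items():
    print(name, round(weight, 6))
```

By construction the weights sum to one, which provides a simple check on the calculation.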


METHODS
62. This Guide illustrates the most commonly used methods to compile an RPPI. Broadly
speaking, these fall into two methodologies: stratification with simple averages and hedonic
methods. Three variants of hedonic methods will be described and thus we will refer to those as
different methods: Time dummy, Characteristics, and Imputations. The following table summarizes
the methods that are covered by this Guide and the circumstances in which each should be used.

Table 3. Four Main Methods for RPPI Compilation

Methods Name Used if…


Method 1 Median with Stratification … prices and at least one characteristic, (e.g.,
location) are available.
Method 2 Time dummy hedonic method … few price observations per period are
available or a monthly index is required.
Method 3 Characteristics hedonic method … more detailed data are available. Equivalent to
the Imputations method when the model is
log-linear and geospatial data are not used.
Method 4 Imputations hedonic method … more detailed data are available. It can also
be used with geospatial data. 7

63. There are two methods that the Guide will not address: the repeated sales and the sales
price appraisal ratio (SPAR).

64. A well-known example of the repeated sales method is the Case-Shiller Index. It matches
the dwellings when these are re-sold. Despite the transparency in the calculations it is not
covered in this Guide for the following reasons:

• It is very data intensive, particularly with respect to historical data. Therefore, it is difficult
to implement in many countries.
• It does not take into account depreciation and renovations.
• It may suffer from sample bias since the mix of dwellings included in the calculations for
each period may not be representative.
• The dwellings sold for the first time are not covered.

65. The SPAR method matches the transaction value with the appraisal value from a previous
period. This method is applicable in countries where the properties are reassessed frequently,
normally for tax purposes. The SPAR method is used in only a few countries.

7
Geospatial data or geographic information are the data or information that identify the geographic location of
features and boundaries on Earth, such as natural or constructed features, oceans, and more. Spatial data are
usually stored as coordinates and topology.


D. Method 1: Median with Stratification

Reference: Handbook on Residential Property Price Indices, pages 38-48

66. One approach that could be taken to measure the change in residential property prices is
to compare the average price of properties transacted in a given period (period t) to the average
prices of properties transacted in a base or reference period (period 0). 8 While this approach is
straightforward and easy to implement it is discouraged due to the extreme heterogeneity of
properties. Averages across all transactions are only appropriate when the data available to the
compiler are limited to the price and date of the transaction.

Box 6. Should Mean or Median be Used?


It is not clear cut whether the mean or the median should be used. However, the distribution of house
prices is positively skewed; thus the median is more likely to follow the price of the “typical” property.
The use of the median incorporates the assumption that prices for all dwellings will follow the price
trend of the typical dwelling. The median is used more widely in RPPI compilation than the mean as it is
less subject to price fluctuations that can be caused by a small number of high-priced properties. More
detail on this discussion can be found in the Handbook on Residential Property Price Indices, pages 38-48.

67. When compilers have access to additional information such as the location of the
property, the size or the type of property the stratification method should be used. Stratification
is a technique by which the compiler groups observations together according to some common
set of characteristics. For example, if a set of data contains information about single dwellings as
well as apartments it would be recommended to create two strata, one that includes all single
dwellings and one that includes all apartments.

68. While stratification will help control for the quality-mix between strata the mix of
dwellings within a stratum will still tend to be rather heterogeneous. This is because compilers
have to balance the need for detailed strata with the need to have sufficient observations to
calculate an accurate price. If the stratum is too narrowly defined, a given stratum in a given
period may not have enough observations to obtain a robust estimate.

69. Building strata requires expert judgement and data analysis, and depends in large part on the
specificities of the available data set. While it is not possible to prescribe a precise and detailed
method to build appropriate strata some general guidelines have emerged.9

70. Location is normally the first level of stratification. The price of a dwelling is mainly
influenced by its location. In general, RPPI compilers have access to the price, date of the
transaction, and address of the property. The address often conforms to administrative
boundaries (e.g., town) and can be a good starting point to develop strata. For example, a
stratum could be created for each town. If stratifying by town does not provide enough
observations the RPPI compiler could group towns together based on their similarities (e.g.,
population size). To find the similarities among the towns the compiler should consult with
agents, developers, and other professionals working in the real estate market.

8
Averages are measures of central tendency, the most common being the median, mean, and mode.
9
The number of strata should be informed by an examination of the data. For example, a compiler may want to
select strata and then calculate the standard error of average prices to determine if the proposed strata are
appropriate.

71. RPPI compilers can also stratify their data by the vintage of the dwelling (new or existing).
New and existing dwellings are different market segments thus price trends might differ. Also,
the variables that influence the price are not the same. For new dwellings it can be important to
use a “state of completion” variable in countries where dwellings are sold to households before
or after completion. For existing dwellings, a variable identifying the renovation status may be
important.

72. User needs should be taken into consideration when developing the stratification
strategy. The compilation of sub-indices is a “by-product” of the stratification process. These
sub-indices may have economic meaning and be useful for analysis and data checking.

73. To illustrate a possible stratification of an RPPI, assume that strata built on the concept of a
town are adopted, e.g., Stratum 1 includes observations from Town 1, Town 2, Town 3, Town 4, and
Town 10 and Stratum 2 includes observations from Town 5, Town 6, Town 7, Town 8, and Town
9. Sub-indices will be compiled at the level of strata but not for each town. If users required
sub-indices for each town, a separate stratum would need to be defined for each town.

Table 16. Stratification - Index Structure

Level 1Q2016 2Q2016 3Q2016 4Q2016


Overall_Index 95.7 100.6 102.5 101.2
I_Stratum1 98.0 95.1 102.2 104.7
I_Stratum2 92.1 109.6 102.9 95.3

To help compilers become familiar with the process of calculating an RPPI using the median with
stratification method an exercise (Exercise 2) has been developed to accompany this Guide.

Exercise 2: Calculation of a Quarterly RPPI Using a Median with Stratification

74. The index compilation for the first year prior to publication follows the next four steps.
The calculations can be replicated using the R code (named “Stratification_starting.r”) and the
synthetic data sets that accompany this Guide. The R code includes guidance on how to adapt
it to work with other data sets.

75. The strata will be defined as above with Stratum 1 including all properties located in
Towns 1, 2, 3, 4, and 10 and Stratum 2 including all properties located in Towns 5, 6, 7, 8, and 9.


76. Step 1 – Calculate the median - After strata have been defined 10 the median price
within each stratum is calculated for each period. The first period is referred to as the reference
period. In this exercise the first quarter of 2016 is the reference period.

Table 17. Stratification - Calculation of the Median in the Reference Period (1Q2016)

Period year Month town_name region property_type_name floorsq price


1Q2016 2016 3 Town_1 1 Detached 354 407906
1Q2016 2016 1 Town_3 1 Detached 308 597951
1Q2016 2016 3 Town_1 1 Detached 280 509494
(…)
1Q2016 2016 10 Town_1 1 Detached 352 568347
1Q2016 2016 10 Town_4 1 Detached 282 530657
1Q2016 2016 12 Town_3 1 Detached 303 469976
Median Stratum 1 237914.5
1Q2016 2016 3 Town_7 2 Semi_detached 101 104267
1Q2016 2016 3 Town_5 2 Detached 223 278265
1Q2016 2016 1 Town_5 2 Detached 213 337151
(…)
4Q2016 2016 11 Town_9 2 Detached 220 126139
4Q2016 2016 10 Town_6 2 Detached 116 134748
4Q2016 2016 11 Town_9 2 Apartment 189 86306
Median Stratum 2 213247

10
In this exercise, two strata have been defined.


Table 18. Stratification - Calculation of the Median in the Current Period (2Q2016)

Period year month town_name region property_type_name floorsq price


2Q2016 2016 5 Town_3 1 Detached 293 468114
2Q2016 2016 6 Town_10 1 Detached 382 586620
2Q2016 2016 4 Town_4 1 Detached 354 585068
(…)
2Q2016 2016 6 Town_4 1 Apartment 129 127961
2Q2016 2016 6 Town_1 1 Apartment 217 114889
2Q2016 2016 4 Town_3 1 Apartment 70 145301
2Q2016 2016 4 Town_2 1 Apartment 83 92673
Median Stratum 1 230950

2Q2016 2016 6 Town_9 2 Detached 211 142737


2Q2016 2016 4 Town_6 2 Detached 116 141905
2Q2016 2016 6 Town_8 2 Detached 233 141178
(…)
2Q2016 2016 4 Town_6 2 Detached 54 108448
2Q2016 2016 5 Town_9 2 Apartment 186 123253
2Q2016 2016 5 Town_5 2 Apartment 192 99670
2Q2016 2016 4 Town_7 2 Apartment 229 129748
Median Stratum 2 253854

77. Step 2 – Calculate the sub-indices - The sub-indices are compiled by dividing the
median price of the current period (t) by the median price of the reference period (0) for
each stratum and multiplying the result by 100.

Table 19. Stratification - Compilation of the Sub-indices

            Median 2016Q1   Median 2016Q2   Calculation             Sub-index
Stratum 1   237915          230950          230950/237915 * 100     97.1
Stratum 2   213247          253854          253854/213247 * 100     119.0
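
The calculation in Table 19 can be sketched in a few lines of Python. This is only an illustration and not one of the R scripts supplied with the Guide; the function name `sub_index` is an assumption, and the four prices used to show the median are taken from Table 17 and do not represent a full stratum.

```python
from statistics import median

def sub_index(median_current, median_reference):
    """Sub-index = median price in period t / median price in period 0 * 100."""
    return median_current / median_reference * 100

# Medians reported in Table 19 (1Q2016 is the reference period).
medians_1q2016 = {"Stratum 1": 237915, "Stratum 2": 213247}
medians_2q2016 = {"Stratum 1": 230950, "Stratum 2": 253854}

sub_indices = {
    stratum: round(sub_index(medians_2q2016[stratum], medians_1q2016[stratum]), 1)
    for stratum in medians_1q2016
}
print(sub_indices)

# Four prices from Table 17, used only to show how a median is obtained
# (with an even number of observations the two middle values are averaged).
example_prices = [407906, 597951, 509494, 568347]
print(median(example_prices))
```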

78. Step 3 – Aggregation (Laspeyres) – The sub-indices are then aggregated using the
base year weights. In the context of this exercise the base year weights represent the total value
of the properties for the stratum divided by the total value of the properties for all strata.

Table 20. Stratification - Aggregation of the Sub-indices

1Q2016 2Q2016 3Q2016 4Q2016 Weights


Overall_Index 100 105.1 107.1 105.7 1.0
I_Stratum1 100 97.1 104.3 106.9 0.635608
I_Stratum2 100 119.0 111.8 103.5 0.364392
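
As a minimal sketch of the aggregation step, the Python snippet below computes the overall index for 2Q2016 as the weighted average of the two sub-indices in Table 20. The function name `laspeyres` is an assumption made for illustration. Because the published tables appear to be computed from unrounded sub-indices, repeating the calculation with the rounded figures for other quarters may differ in the last digit.

```python
def laspeyres(sub_indices, weights):
    """Weighted arithmetic mean of sub-indices using fixed base-year weights."""
    total_weight = sum(weights.values())
    return sum(weights[s] * sub_indices[s] for s in weights) / total_weight

# Base-year weights and 2Q2016 sub-indices from Table 20.
weights = {"I_Stratum1": 0.635608, "I_Stratum2": 0.364392}
sub_indices_2q2016 = {"I_Stratum1": 97.1, "I_Stratum2": 119.0}

overall_2q2016 = laspeyres(sub_indices_2q2016, weights)
print(round(overall_2q2016, 1))
```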


79. Step 4 – Re-reference the sub-indices – At this point the reference period of the
sub-indices is the first period (1Q2016). However, the reference period should be a year
and not a quarter. To re-reference the indices to the year (2016=100), each quarterly index is
divided by the mean of the four quarterly indices, expressed as a ratio (the mean divided by 100).
The average of the four re-referenced quarters is then 100.

Table 21. Stratification – Re-referencing the Sub-indices

                1Q2016               2Q2016                 3Q2016                 4Q2016                 Re-referencing factor (mean of indices)
Overall_Index   100/1.0445 = 95.7    105.1/1.0445 = 100.6   107.1/1.0445 = 102.5   105.7/1.0445 = 101.2   (100+105.1+107.1+105.7)/4 = 1.0445
I_Stratum1      100/1.0208 = 98.0    97.1/1.0208 = 95.1     104.3/1.0208 = 102.2   106.9/1.0208 = 104.7   (100+97.1+104.3+106.9)/4 = 1.0208
I_Stratum2      100/1.0859 = 92.1    119.0/1.0859 = 109.6   111.8/1.0859 = 102.9   103.5/1.0859 = 95.3    (100+119.0+111.8+103.5)/4 = 1.0859
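
The re-referencing step can be illustrated with the short Python sketch below, which rescales the four quarterly values of the overall index so that their mean equals 100. The function name `rereference` is an assumption; the inputs are the rounded overall indices from Table 20.

```python
def rereference(quarterly_indices):
    """Rescale quarterly indices so that their annual mean equals 100."""
    # The factor is the mean of the indices expressed as a ratio (mean / 100).
    factor = sum(quarterly_indices) / len(quarterly_indices) / 100
    return [value / factor for value in quarterly_indices]

overall = [100.0, 105.1, 107.1, 105.7]   # 1Q2016 = 100
overall_2016 = rereference(overall)      # 2016 = 100
print([round(value, 1) for value in overall_2016])
```

By construction the mean of the returned values is 100, which is a convenient check when adapting the calculation to other data.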

80. The index compilation for the following years is as described in the next five steps. The
calculations can be replicated using the R code named “Stratification_production.r.” In addition, a
base price should be calculated every year and the R code named “Base price.r” exemplifies this
calculation. The scripts include guidance on how to adapt them to other data sets.

81. Step 1 – Calculate the reference price - The reference price is the median of the
transaction prices that occurred in the reference period. In this Guide the indices are compiled
in two steps: they are first calculated with the last quarter of the previous year as the base, and
are then chained to the base year (2016=100). The median prices for the fourth quarter of 2016 are
shown in Table 22.

Table 22. Stratification - Reference Prices

4Q2016
I_Stratum1 254311
I_Stratum2 220785

82. Step 2 – Calculate the median for the current period - The median price for the
current period (1Q2017) is shown in Table 23 and was calculated using the R procedure:
“Stratification_production.r.”

Table 23. Stratification - Median Prices in the Current Period (1Q2017)

1Q2017
I_Stratum1 257130
I_Stratum2 225871


83. Step 3 – Compilation of the sub-indices - The sub-indices are compiled by dividing the
median price of the current period (1Q2017) by the median price of the reference period
(4Q2016).

Table 24. Stratification – Compilation of the Sub-indices (1Q2017)

Sub-Indices 1Q2017
I_Stratum1 257130/254311*100 = 101.1
I_Stratum2 225871/220785*100 = 102.3

84. Step 4 – Aggregation (Laspeyres) – The sub-indices are aggregated using the base year
weights.

Table 25. Stratification - Aggregation of the Sub-indices

1Q2017 Weights
Overall_Index 101.6 1.0
I_Stratum1 101.1 0.635608
I_Stratum2 102.3 0.364392

85. Step 5 – Chaining the indices - All indices (stratum indices and overall index) of the
current period are chained (multiplied) by the chained index of the last period of the previous
year. Table 26 shows an example of chaining for the overall index.

Box 7. Chained Indices


In the example that accompanies this Guide, indices are first compiled with the last quarter of the
previous year as the base. However, the reference period of the indices to be published is an earlier
period (2016). To bring the reference period of the indices to 2016, the indices are chained. This
approach allows for annual updates of the indices, namely of the weights and the base price or base
average characteristics, as explained later in this Guide.

The formula below shows how to obtain the index of 3Q2018 in base 2016 from the 3Q2018 index in
base 4Q2017.

I_{2016}^{3Q2018} = I_{4Q2017}^{3Q2018} \times I_{2016}^{4Q2017} / 100

Suppose I_{4Q2017}^{3Q2018} = 95.9 and I_{2016}^{4Q2017} = 107.3; the result would be
I_{2016}^{3Q2018} = 95.9 * 107.3 / 100 = 102.8.

While 95.9 shows a price decrease of 4.1 per cent since the fourth quarter of 2017, the index 102.8 shows
a price increase of 2.8 per cent since 2016.
Further guidance can be found in Chapter 9 of the CPI Manual: Concepts and Methods, forthcoming.


Table 26. Stratification - Chaining the Indices

Quarter Unchained Index Calculation Chained Index
2016Q1 95.7 95.7
2016Q2 100.6 100.6
2016Q3 102.5 102.5
2016Q4 101.2 101.2
2017Q1 101.6 101.6 * 101.2 / 100 102.7
2017Q2 105.9 105.9 * 101.2 / 100 107.2
2017Q3 102.6 102.6 * 101.2 / 100 103.8
2017Q4 106.1 106.1 * 101.2 / 100 107.3
2018Q1 94.8 94.8 * 107.3 /100 101.7
2018Q2 103.8 103.8 * 107.3 /100 111.4
2018Q3 95.9 95.9 * 107.3 /100 102.8
2018Q4 97.5 97.5 * 107.3 /100 104.6
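
The chaining arithmetic in Table 26 can be sketched in Python as follows. The function name `chain` is an assumption made for illustration. The inputs are rounded figures from the table, whereas the published chained indices appear to be computed from unrounded values, so some quarters can differ in the last digit when recomputed this way.

```python
def chain(unchained_index, link_index):
    """Chain an index based on the last quarter of the previous year (=100)
    to the reference year by multiplying it by the chained link index."""
    return unchained_index * link_index / 100

# Links: chained index of the last quarter of the previous year (Table 26).
link_2016q4 = 101.2
link_2017q4 = 107.3

chained_2017q2 = chain(105.9, link_2016q4)   # unchained 2017Q2 = 105.9
chained_2018q2 = chain(103.8, link_2017q4)   # unchained 2018Q2 = 103.8
print(round(chained_2017q2, 1), round(chained_2018q2, 1))
```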

86. As explained before, the stratification process controls for quality-mix changes
between strata, but not within them. The need to control for quality-mix change within strata is an
argument for the use of hedonic regressions within a stratum.

E. Hedonic Methods

Reference: Handbook on Residential Property Price Indices, pages 50-51

87. To measure pure price changes, it is essential to measure the same product over time,
taking into account changes in quality and quantity. Quality adjustments require detailed
information about the characteristics of the properties and can be made using the hedonic
method.

Box 8. Hedonic Regression


“The hedonic regression method recognizes that heterogeneous goods can be described by their attributes
or characteristics. That is, a good is essentially a bundle of (performance) characteristics. In the housing context,
this bundle may contain attributes of both the structure and the location of the properties. There is no market
for characteristics, since they cannot be sold separately, so the prices of the characteristics are not
independently observed. The demand and supply for the properties implicitly determine the characteristics’
marginal contributions to the prices of the properties. Regression techniques can be used to estimate those
marginal contributions or shadow prices.” (Handbook)

88. Hedonic regressions recognize that a dwelling is composed of a bundle of characteristics.
These characteristics are not sold separately, and consequently their prices cannot be
independently observed. Even though these prices cannot be observed they can be estimated
with hedonic regression techniques. These are often referred to as “shadow” prices.

89. Several functional forms for hedonic regressions are possible and are discussed in the
Handbook. This Guide uses the log-linear form. The distribution of real estate prices is usually
positively skewed; the log-linear form reduces the impact of this skewness and, consequently,
reduces heteroscedasticity. The log-linear form also allows for parabolic relationships
of the variables and for a multiplicative association between characteristics.

90. The specification of a hedonic equation in the log-linear form is as follows:

\ln p_n^t = \beta_0^t + \sum_{k=1}^{K} \beta_k^t Z_{nk}^t + \varepsilon_n^t    (1)

t – period
n – dwelling index (over the dwellings transacted in period t)
k – characteristic index (k = 1, …, K)
\ln p_n^t – logarithm of the price of dwelling n in period t
\beta_0^t – intercept in period t
\beta_k^t – “shadow” price of characteristic k in period t
Z_{nk}^t – quantity of characteristic k in period t and dwelling n
\varepsilon_n^t – error term

91. For example, consider two houses (A and B) transacted in two consecutive periods. Both
houses are the same size (800 sqm); house A has a balcony and house B does not. House A has
3 bathrooms while house B has 2 bathrooms. Both houses have two-car garages. Assume that in
period t house A sold for 465,000 and in period t+1 house B sold for 400,000. A very simple (and
misleading) analysis of these data would conclude that housing prices have fallen by about
14 percent. Comparing the prices of these houses directly is not recommended, since the
characteristics of house A are quite different from those of house B. Hedonic regression allows
the compiler to decompose the price of each house into the contributions of its characteristics.
This is shown in the following table.

Table 27. “Shadow” Prices of Houses A and B

Characteristics   House A Shadow Prices   House B Shadow Prices
Size 350,000 350,000
Bathrooms 45,000 30,000
Garage 20,000 20,000
Balcony 10,000
Total 425,000 400,000
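
To make the decomposition concrete, the Python sketch below adds up the “shadow” prices listed in Table 27 and shows which characteristics account for the difference between the two totals. It is only an illustration using the table values; the variable names are assumptions.

```python
# "Shadow" prices from Table 27 (house B has no balcony).
shadow_prices = {
    "House A": {"Size": 350_000, "Bathrooms": 45_000, "Garage": 20_000, "Balcony": 10_000},
    "House B": {"Size": 350_000, "Bathrooms": 30_000, "Garage": 20_000},
}

# Total value of each bundle of characteristics.
totals = {house: sum(parts.values()) for house, parts in shadow_prices.items()}

# Contribution of each characteristic to the difference between the two houses.
characteristics = sorted({name for parts in shadow_prices.values() for name in parts})
quality_gap = {
    name: shadow_prices["House A"].get(name, 0) - shadow_prices["House B"].get(name, 0)
    for name in characteristics
}
print(totals)
print(quality_gap)
```

In this example the whole difference between the two totals comes from the extra bathroom and the balcony, that is, from quality differences rather than from a change in prices.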

92. To estimate hedonic regressions a minimum of 20 observations per variable is generally
recommended.


F. Method 2: Time Dummy Hedonic Method

Reference: Handbook on Residential Property Price Indices, pages 51-52

93. The time dummy method is generally used when the compiler has access to a data set
with a large number of characteristics but few transactions in each period. In this method, prices
and characteristics (z_{nk}^t) of all dwellings (n) for several periods (t) are pooled in the same
regression and a dummy variable (D_n^t) is created for each period. The index follows directly
from the estimated time dummy parameters.

94. The log-linear specification is used, thus the model for regression is:

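\ln p_n^t = \beta_0 + \sum_{\tau=1}^{T} \delta^\tau D_n^\tau + \sum_{k=1}^{K} \beta_k Z_{nk}^t + \varepsilon_n^t    (2)

t – period
n – dwelling index (over the dwellings transacted in period t)
k – characteristic index (k = 1, …, K)
\ln p_n^t – logarithm of the price of dwelling n in period t
\beta_0 – intercept
\delta^\tau – coefficient of the time dummy for period \tau (\tau = 1, …, T)
D_n^\tau – time dummy, equal to 1 if dwelling n was transacted in period \tau and 0 otherwise
\beta_k – “shadow” price of characteristic k, held constant over the pooled periods
Z_{nk}^t – quantity of characteristic k in period t and dwelling n
\varepsilon_n^t – error term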

95. The index for current period (t) is derived as follows:

I^t = \exp(\hat{\delta}^t) \times 100    (3)

96. For the reference period (0) a dummy variable is not required (the price index in the
reference period is 100).
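
The following Python sketch illustrates the mechanics of equations (2) and (3) on a small, entirely hypothetical and noise-free data set. It is not the Guide's R procedure: the data, the coefficient values and the helper functions `solve` and `ols` are assumptions made for illustration, and ordinary least squares is solved through the normal equations only to keep the example self-contained.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical pooled sales: (period, floor area, detached dummy).
sales = [(0, 80, 0), (0, 120, 1), (0, 150, 1),
         (1, 90, 0), (1, 110, 0), (1, 200, 1),
         (2, 70, 0), (2, 130, 1), (2, 160, 0)]
true_delta = [0.0, 0.02, 0.05]   # assumed pure price change per period
log_prices = [12.0 + true_delta[t] + 0.004 * fs + 0.1 * det for t, fs, det in sales]

# Design matrix: intercept, dummies for periods 1 and 2 (no dummy for the
# reference period 0), floor area, detached dummy.
X = [[1.0, float(t == 1), float(t == 2), float(fs), float(det)] for t, fs, det in sales]
beta = ols(X, log_prices)

# Equation (3): the index is exp(delta) * 100; the reference period is 100.
index = [100.0] + [math.exp(delta) * 100 for delta in beta[1:3]]
print([round(value, 1) for value in index])
```

Because the hypothetical data contain no noise, the regression recovers the assumed time dummy coefficients, and the index reflects only the pure price change regardless of the mix of dwellings sold in each period.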

97. If data from a new period are added, the indices of previous periods can change
because the estimated coefficients \hat{\delta}^t of those periods are revised in light of the new
observations. To avoid revisions to published indices, a rolling window approach is used. For
quarterly indices, the “shadow” prices of the characteristics (\hat{\beta}) are normally kept fixed
for at least a year, which means that data covering 12 months are pooled. Every quarter a
regression is estimated with data from the current quarter and the previous three quarters. The
indices from the “new window” and the “previous window” are chained using the last overlap
period between the two windows.

98. An example is given in the table below for a quarterly index:


Table 28. Example of Chaining the Indexes from Rolling Window Hedonic Regression

Quarter   Data from 2018Q1   Data from 2018Q2   Data from 2018Q3   Chained Index
          to 2018Q4          to 2019Q1          to 2019Q2
2018Q1    100.0                                                    100.0
2018Q2    101.2              100.0                                 101.2
2018Q3    101.1              101.3              100.0              101.1
2018Q4    100.9              101.2              100.8              100.9
2019Q1                       101.0              100.3              100.7 = 100.9/101.2 x 101.0
2019Q2                                          101.4              101.8 = 100.7/100.3 x 101.4
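
The chaining of successive windows in Table 28 can be sketched in Python as follows. The function name `extend_chain` is an assumption: it extends the chained series by the quarter-on-quarter movement measured within the newest window, using the last overlap quarter as the link.

```python
def extend_chain(chained_previous, window_previous, window_current):
    """New chained index = previous chained index times the movement between the
    last overlap quarter and the current quarter, both taken from the new window."""
    return chained_previous * window_current / window_previous

# 2019Q1: window 2018Q2-2019Q1 gives 101.2 for 2018Q4 and 101.0 for 2019Q1.
chained_2019q1 = round(extend_chain(100.9, 101.2, 101.0), 1)
# 2019Q2: window 2018Q3-2019Q2 gives 100.3 for 2019Q1 and 101.4 for 2019Q2.
chained_2019q2 = round(extend_chain(chained_2019q1, 100.3, 101.4), 1)
print(chained_2019q1, chained_2019q2)
```

Previously published quarters are never recalculated: only the newest quarter of each window enters the chained series.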

99. Ireland was one of the first countries to implement the time dummy hedonic method
with a rolling window for the compilation of a house price index. O’Hanlon (2011) recommends
using data from 12 months for a monthly index compilation. The 12 months should capture all
seasonality effects; however, this period can be extended to 15 months.

100. To help compilers become familiar with the process to calculate an RPPI using the time
dummy hedonic method, an exercise has been developed to accompany this Guide. The exercise
(Exercise 3) uses a synthetic data set and a set of procedures written in R. The following
step-by-step instructions are intended to walk the reader through the exercise.

Exercise 3: Step-by-step Calculation for a Time Dummy Hedonic Quarterly Index

101. The index compilation for the first year prior to publication is presented in the seven
steps noted below. The calculations can be replicated using the R procedure “time
dummy_starting.r” and the synthetic data sets provided with this Guide.

102. Step 1 - Obtain data pertaining to four quarters.

103. Step 2 - Generate dummy variables - Create dummy variables for each period and for
the categorical variables.


Table 29. Time Dummy - Data Set with the Dummy Variables

Variables Observations
period1Q2016 1 0 0 0 (…) 0
period2Q2016 0 1 1 0 (…) 0
period3Q2016 0 0 0 1 (…) 0
period4Q2016 0 0 0 0 (…) 1
Year 2016 2016 2016 2016 (…) 2016
Month 3 6 4 7 (…) 10
town_name Town_1 Town_10 Town_4 Town_1 (…) Town_3
Region 1 1 1 1 (…) 1
property_type_nameApartment 0 0 0 0 (…) 0
property_type_nameDetached 1 1 1 1 (…) 1
property_type_nameSemi_detached 0 0 0 0 (…) 0
baths1 1 0 1 1 (…) 0
baths2 0 1 0 0 (…) 1
baths3 0 0 0 0 (…) 0
beds1 0 0 0 0 (…) 0
beds2 0 0 0 0 (…) 0
beds3 0 1 1 0 (…) 0
beds4 1 0 0 1 (…) 1
garden0 1 0 1 1 (…) 1
garden1 0 1 0 0 (…) 0
garage0 1 1 1 1 (…) 1
garage1 0 0 0 0 (…) 0
balcony0 1 1 1 1 (…) 0
balcony1 0 0 0 0 (…) 1
utilityroom0 1 1 1 1 (…) 1
utilityroom1 0 0 0 0 (…) 0
conservatory0 1 1 1 1 (…) 1
conservatory1 0 0 0 0 (…) 0
ensuite0 1 1 1 1 (…) 0
ensuite1 0 0 0 0 (…) 1
fireplace0 1 1 0 1 (…) 1
fireplace1 0 0 1 0 (…) 0
dglaze0 1 1 1 1 (…) 1
dglaze1 0 0 0 0 (…) 0
Floorsq 354 358 258 287 (…) 353
Price 407906 401222 390824 422487 (…) 481971
air0 0 1 0 0 (…) 1
air1 1 0 1 1 (…) 0
elevator0 1 1 1 1 (…) 1
elevator1 0 0 0 0 (…) 0
construction_year 1983 1998 1992 1982 (…) 2015
Vintage 32 17 23 33 (…) 1

104. Step 3 - Build the strata (as per previous exercises).

105. Step 4 – Calculate the “shadow” prices - Run a regression with the price logarithm as
the dependent variable and all other significant variables as explanatory variables.


106. There will be no coefficient for the first quarter. The index for the first quarter (the
reference period) equals 100 by definition.

Table 30. Time Dummy - OLS Results for Stratum 1

.rownames Estimate Std.Error t.value


(Intercept) 11.384 0.074 153.230
period2Q2016 0.011 0.018 0.628
period3Q2016 0.009 0.018 0.529
period4Q2016 0.009 0.019 0.497
property_type_nameDetached -0.120 0.033 -3.580
property_type_nameSemi_detached 0.205 0.033 6.137
Baths -0.033 0.029 -1.151
Beds -0.012 0.028 -0.444
Garden -0.014 0.029 -0.500
Garage 0.020 0.025 0.799
Balcony -0.012 0.016 -0.726
Utilityroom -0.032 0.020 -1.622
Conservatory 0.037 0.033 1.128
Ensuite 0.025 0.022 1.140
Fireplace 0.031 0.024 1.274
Dglaze -0.008 0.030 -0.266
Floorsq 0.009 0.019 0.493
Air 0.019 0.019 1.009
Elevator -0.024 0.018 -1.338
Vintage 0.004 0.000 45.375


Table 31. Time Dummy - OLS Results for Stratum 2

.rownames Estimate Std.Error t.value


(Intercept) 11.239 0.092 122.414
period2Q2016 0.002 0.031 0.079
period3Q2016 -0.013 0.030 -0.427
period4Q2016 -0.006 0.029 -0.197
property_type_nameDetached -0.079 0.048 -1.669
property_type_nameSemi_detached 0.306 0.046 6.720
Baths 0.055 0.052 1.066
Beds 0.043 0.048 0.898
Garden 0.038 0.038 0.992
Garage 0.017 0.033 0.523
Balcony 0.015 0.023 0.654
Utilityroom 0.004 0.024 0.170
Conservatory -0.003 0.036 -0.077
Ensuite 0.001 0.029 0.018
Fireplace 0.042 0.031 1.352
Dglaze -0.055 0.049 -1.114
Floorsq 0.013 0.029 0.434
Air 0.009 0.024 0.381
Elevator -0.026 0.026 -1.010
Vintage 0.004 0.000 32.510

107. Step 5 - Calculate the sub-indices – The sub-indices follow from equation (3), where δ are
the estimated coefficients of the time dummies. The coefficients of the other variables are now
disregarded.

Table 32. Time Dummy – Compilation of the Sub-indices

Sub-indices   1Q2016   2Q2016                     3Q2016                     4Q2016
I_Stratum1    100.0    exp(0.011) * 100 = 101.1   exp(0.009) * 100 = 101.0   exp(0.009) * 100 = 100.9
I_Stratum2    100.0    exp(0.002) * 100 = 100.3   exp(-0.013) * 100 = 99.0   exp(-0.006) * 100 = 99.4
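
As a small illustration of equation (3), the Python sketch below converts the time dummy coefficients reported for stratum 1 in Table 30 into sub-indices. The variable names are assumptions. The coefficients in Table 30 are rounded to three decimals, and the published sub-indices are presumably computed from unrounded coefficients, so recomputed values can differ from the published ones in the last digit.

```python
import math

# Estimated time dummy coefficients for stratum 1 (Table 30, rounded).
delta_hat = {"2Q2016": 0.011, "3Q2016": 0.009, "4Q2016": 0.009}

# The reference period has no dummy and equals 100 by definition.
sub_index = {"1Q2016": 100.0}
sub_index.update({quarter: math.exp(delta) * 100 for quarter, delta in delta_hat.items()})

for quarter, value in sub_index.items():
    print(quarter, round(value, 1))
```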

108. Step 6 – Aggregation (Laspeyres) - The aggregation of the sub-indices is a weighted
average using the weights as calculated in the first exercise.

Table 33. Time Dummy - Aggregation of the Sub-indices

1Q2016 2Q2016 3Q2016 4Q2016 Weights


Overall_Index 100 100.8 100.1 100.4 1.0
I_Stratum1 100 101.1 101.0 100.9 0.635608
I_Stratum2 100 100.3 99.0 99.4 0.364392


109. Step 7 – Re-reference the sub-indices - The indices were first compiled with the first period
(1Q2016) as the reference period. However, the reference period should be the year 2016
(2016=100). To re-reference the sub-indices such that 2016=100, the quarterly indices are divided
by their mean.

Table 34. Time Dummy - Re-referencing the Sub-indices

                1Q2016                2Q2016                  3Q2016                  4Q2016                  Re-referencing factor (mean of indices)
Overall_Index   100/1.0033 = 99.7     100.8/1.0033 = 100.5    100.1/1.0033 = 99.8     100.4/1.0033 = 100.1    (100+100.8+100.1+100.4)/4 = 1.0033
I_Stratum1      100/1.0076 = 99.3     101.1/1.0076 = 100.4    100.9/1.0076 = 100.2    100.8/1.0076 = 100.2    (100+101.1+100.9+100.8)/4 = 1.0076
I_Stratum2      100/0.9960 = 100.4    100.4/0.9960 = 100.7    99/0.9960 = 99.1        99.6/0.9960 = 99.8      (100+100.4+99+99.6)/4 = 0.9960

110. The index compilation for the subsequent years is the same as described in the previous
exercise except for Step 7 where the chaining occurs for the overall index and for all sub-indices.
The calculations for the subsequent years (2017 and 2018) can be replicated by using the R
procedure named “time dummy_production.r” and the synthetic data sets supplied with this
Guide. Table 35 illustrates how to chain the overall index.

Table 35. Time Dummy - Chaining the Indices

Period    Indices from         Indices from         Indices from         Chained Index
          1Q2016 to 4Q2016     2Q2016 to 1Q2017     3Q2016 to 2Q2017
1Q2016    99.7                                                           99.7
2Q2016    100.5                100.0                                     100.5
3Q2016    99.8                 100.2                100.0                99.8
4Q2016    100.1                99.7                 99.3                 100.1
1Q2017                         100.4                98.9                 100.7 = 100.1 / 99.7 x 100.4
2Q2017                                              98.7                 100.5 = 100.7 / 98.9 x 98.7

G. Method 3: Characteristics Hedonic Method

Reference: Handbook on Residential Property Price Indices, pages 52-54

111. The characteristics hedonic method measures the price evolution of a “typical” property.
It relies on the use of a detailed set of property characteristics and is generally preferred when
the compiler has access to a large number of transactions for each period. The “typical” property
is estimated by averaging the characteristics of all the properties in the stratum. For example, if
there are 10 properties in a stratum and the average number of bedrooms among those
10 properties is four then the “typical” property would have four bedrooms.


112. A key feature of this method is that it cannot be used with geospatial data
(coordinates),11 because the “typical” property is found by averaging the values of the
characteristics, and an average of geospatial coordinates can give an unreasonable result,
for example a dwelling located in the middle of a bay.

113. A “shadow” price is then estimated for each characteristic in current period (t) and in the
reference period (0). The price index is calculated by comparing the price of the typical property
in period (t) with the price of the “typical” property in the reference period (0).

Figure 5. The Mechanics of the Characteristics Hedonic Method

[Flow diagram] A property represents a bundle of characteristics → The “typical” property is a
bundle of the average characteristics → Comparing the “prices” of period t with the “prices” of
period 0, keeping the average characteristics constant → Price variation of the “typical”
property from period 0 to period t.

114. The mean is calculated for each numerical variable and the frequency for the categorical
variables as in the example shown in the table below:

11 Geospatial data cannot be used with this method because, when the coordinates are averaged, the
result can be that the “typical” dwelling is in the middle of a lake.


Table 36. Example of Calculation of the “Typical” Dwelling

                                      Dwelling                          Mean/Frq
Variable                              A    B    C    D    E    F    H   (Typical Property)
property_type_nameApartment           0    0    0    1    1    1    1   0.57
property_type_nameSemi-Detached       0    0    1    0    0    0    0   0.14
property_type_nameDetached            1    1    0    0    0    0    0   0.29
baths1                                0    0    1    1    1    1    1   0.71
baths2                                0    0    0    0    0    0    0   0.00
baths3                                0    1    0    0    0    0    0   0.14
beds1                                 0    0    0    0    0    0    0   0.00
beds2                                 0    0    0    1    1    0    1   0.43
beds3                                 0    0    1    0    0    1    0   0.29
beds4                                 1    1    0    0    0    0    0   0.29
Garage                                0    0    0    0    0    0    0   0.00
Balcony                               0    0    0    0    0    0    0   0.00
Ensuite                               0    0    0    1    1    0    0   0.29
Floorsq                               289  286  292  76   133  200  206 212
Air                                   0    1    1    0    0    0    0   0.29
Elevator                              0    0    0    0    0    0    1   0.14
Vintage                               0    24   0    24   32   31   22  19
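
The averaging in Table 36 can be reproduced with a few lines of Python. The sketch below uses four of the variables from the table; the dictionary layout is an assumption made for illustration. For a dummy variable the mean is simply the share of dwellings that have the characteristic.

```python
# Values for the seven dwellings in Table 36 (a subset of the variables).
dwellings = {
    "property_type_nameApartment": [0, 0, 0, 1, 1, 1, 1],
    "property_type_nameDetached": [1, 1, 0, 0, 0, 0, 0],
    "floorsq": [289, 286, 292, 76, 133, 200, 206],
    "vintage": [0, 24, 0, 24, 32, 31, 22],
}

# The "typical" property: mean of numerical variables, frequency of dummies.
typical = {name: sum(values) / len(values) for name, values in dwellings.items()}
for name, value in typical.items():
    print(name, round(value, 2))
```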

115. The log-linear specification for each stratum is used.

\ln p_n^t = \beta_0^t + \sum_{k=1}^{K} \beta_k^t z_{nk}^t + \varepsilon_n^t    (4)

116. Separate regressions of this model are run for each quarter in a stratum, on the data of
the reference period (0) and of the current period (t), to obtain the “shadow” prices. The
“shadow” prices are the estimated regression coefficients (\hat{\beta}).

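117. The index is obtained by exponentiating the sum, over the intercept and all characteristics,
of the difference between the estimated regression coefficients of the current period (t) and of
the reference period (0), multiplied by the average characteristics of the reference period (0).
The result is multiplied by 100.

I_0^t = \exp\left( \sum_{k=0}^{K} (\hat{\beta}_k^t - \hat{\beta}_k^0) \bar{Z}_k^0 \right) \times 100, with \bar{Z}_0^0 = 1 for the intercept    (5)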

118. To help compilers become familiar with the process to calculate an RPPI using the
characteristics hedonic method, an exercise has been developed to accompany this Guide. The
exercise (Exercise 4) uses the synthetic data set and R procedures that accompany this Guide. The
following step-by-step instructions are intended to walk the reader through the exercise.


Exercise 4: Step-by-step Calculation for a Quarterly Index with the Characteristics Hedonic Method

119. The index compilation for the first year is presented in the following seven steps. The
calculations can be replicated using the R procedure named “characteristics_starting.r” and the
synthetic set of data that accompanies this Guide.

120. Step 1 - Obtain data.

121. Step 2 - Build the strata – As per exercise 1. The following steps are performed for each
stratum.

122. Step 3 - Calculate the “shadow” prices - Run a separate regression for each period with
the price logarithm as the dependent variable and all other significant variables as the
explanatory variables. The categorical variables are transformed into dummy variables.

Table 37. Characteristics – Coefficients for Stratum 1 in the Reference Period (1Q2016)

.rownames Estimate Std.Error t.value


(Intercept) 11.407948 0.0861976 132.34653
property_type_nameApartment -0.1261738 0.0633157 -1.9927739
property_type_nameDetached 0.1796821 0.0609523 2.9479141
baths1 -0.0960755 0.0596779 -1.6099017
baths2 -0.098451 0.0558587 -1.7624998
beds1 -0.0296235 0.05582 -0.5306975
beds2 0.0156601 0.047362 0.3306459
beds3 -0.022727 0.0306046 -0.7426008
garage -0.0238718 0.0502161 -0.4753814
balcony -0.0111464 0.0414217 -0.2690953
ensuite -0.0546011 0.0325894 -1.6754256
floorsq 0.0044956 0.0001804 24.915658
air 0.0007891 0.0243958 0.0323453
elevator 0.0986272 0.0355168 2.7769199
vintage 0.0003531 0.0012289 0.2873503

123. Step 4 – Calculate average characteristics - Calculate the quantity of the characteristics
for the “typical” property by calculating the mean for each numerical variable and the frequency
for each qualitative variable in the reference period (1Q2016).

Table 38. Characteristics – Coefficients and Average Characteristics for Stratum 1 in the
Reference Period (1Q2016) and in the Current Period (2Q2016)

Intcpt - Intercept
Coef_P2s1 – Coefficients of period 2 (2Q2016) in stratum 1
Coef_P1s1 – Coefficients of period 1 (1Q2016) in stratum 1
Average – Mean or frequency of the characteristics in the reference period (1Q2016)


CHR Coef_P2s1 Coef_P1s1 Average


Intcpt 11.51 11.41
Air 0.05 0.00 0.52
Balcony -0.02 -0.01 0.18
baths1 -0.01 -0.10 0.53
baths2 0.04 -0.10 0.39
beds1 -0.06 -0.03 0.13
beds2 -0.07 0.02 0.16
beds3 -0.06 -0.02 0.37
Elevator 0.00 0.10 0.26
Ensuite -0.01 -0.05 0.49
Floorsq 0.00 0.00 226.27
Garage -0.08 -0.02 0.07
property_type_nameApartment -0.15 -0.13 0.47
property_type_nameDetached 0.13 0.18 0.47
Vintage 0.00 0.00 14.85

124. Step 5 - Compile the sub-indices - The sub-indices are compiled using equation (5). The
results of the calculations are shown in Table 39.

Table 39. Characteristics – Compilation of the Sub-index for Stratum 1 in the Current
Period (2Q2016)

CHR Coef_P2s1 Coef_P1s1 Average Calculation


Intcpt 11.5140 11.4079 11.5140-11.4079
air 0.0491 0.0008 0.52 (0.0491-0.0008) * 0.52
Balcony -0.0249 -0.0111 0.21 (-0.0249+0.0111) * 0.21
baths1 -0.0134 -0.0961 0.53 (-0.0134 + 0.0961) * 0.53
baths2 0.0374 -0.0985 0.39 (0.0374 + 0.0985) * 0.39
beds1 -0.0642 -0.0296 0.13 (-0.0642 + 0.0296) * 0.13
beds2 -0.0652 0.0157 0.16 (-0.0652 - 0.0157) * 0.16
beds3 -0.0589 -0.0227 0.37 (-0.0589 +0.0227) * 0.37
elevator 0.0000 0.0986 0.26 (0.0000-0.0986) * 0.26
ensuite -0.0142 -0.0546 0.47 (-0.0142+0.0546) * 0.497
floorsq 0.0040 0.0045 230.62 (0.0040-0.0045) * 230.62
garage -0.0844 -0.0239 0.07 (-0.0844+0.0239) * 0.07
property_type_nameApartment -0.1474 -0.1262 0.47 (-0.1474+0.1262) * 0.47
property_type_nameDetached 0.1287 0.1797 0.47 (0.1287-0.1797) * 0.47
vintage -0.0014 0.0004 15.08 (-0.0014-0.0004) * 15.08

Sub-index Stratum 1 Exp(sum(…)) * 100 = 101.0
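
The structure of the calculation in Table 39 can be sketched in Python as follows. The function name `characteristics_index` and all coefficient values and averages in the example are hypothetical and chosen only to keep the arithmetic easy to follow; they are not the values of the exercise. As in Table 39, the intercept difference enters the sum with a weight of 1.

```python
import math

def characteristics_index(coef_t, coef_0, avg_0):
    """Index = exp(sum of coefficient differences times the average
    characteristics of the reference period) * 100."""
    total = 0.0
    for name in coef_0:
        # The intercept is treated as a characteristic with a quantity of 1.
        z_bar = 1.0 if name == "(Intercept)" else avg_0[name]
        total += (coef_t[name] - coef_0[name]) * z_bar
    return math.exp(total) * 100

# Hypothetical coefficients for the reference period (0) and current period (t).
coef_0 = {"(Intercept)": 11.40, "floorsq": 0.0045, "detached": 0.18}
coef_t = {"(Intercept)": 11.42, "floorsq": 0.0045, "detached": 0.17}
# Hypothetical average characteristics of the reference period.
avg_0 = {"floorsq": 230.0, "detached": 0.5}

index = characteristics_index(coef_t, coef_0, avg_0)
print(round(index, 1))
```

When working with published tables, note that small coefficients on large-valued variables such as the floor area are sensitive to rounding, so the calculation is best done with unrounded coefficients.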

125. Step 6 – Aggregation (Laspeyres formula) – Aggregate the sub-indices (stratum 1 and
stratum 2) using the weights compiled in exercise 1.


Table 40. Characteristics – Aggregation of the Sub-indices

1Q2016 2Q2016 3Q2016 4Q2016 Weights


Overall_Index 100.0 100.5 100.5 100.8 1.0
I_Stratum1 100.0 101.0 101.9 101.0 0.635608
I_Stratum2 100.0 99.6 98.1 100.4 0.364392

126. Step 7 – Re-reference the sub-indices - The indices initially compiled using the first
quarter of 2016 as the reference period are re-referenced such that 2016=100.

Table 41. Characteristics - Re-referencing the Sub-indices

                1Q2016               2Q2016                 3Q2016                 4Q2016                 Re-referencing factor (mean of indices)
Overall_Index   100/1.0048 = 99.6    100.5/1.0048 = 100.0   100.5/1.0048 = 100.1   100.8/1.0048 = 100.3   (100+100.5+100.5+100.8)/4 = 1.0048
I_Stratum1      100/1.0097 = 99.0    101.0/1.0097 = 100.0   101.9/1.0097 = 100.9   101.0/1.0097 = 100.0   (100+101.0+101.9+101.0)/4 = 1.0097
I_Stratum2      100/0.9952 = 100.5   99.6/0.9952 = 100.0    98.1/0.9952 = 98.6     100.4/0.9952 = 100.9   (100+99.6+98.1+100.4)/4 = 0.9952

127. The index compilation for the following years is described in the next five steps. The
calculations can be replicated using the R procedure named “Stratification_production.r.” The
average characteristics and the coefficients of the base period can be obtained by using the R
procedure named “base_characteristics.r” for 2017 and 2018 using data from the fourth quarter.
These should be updated every year using the last quarter of the previous year.

128. Steps 1 to 3 are the same as for the first year.

129. Step 4 is calculated with the R procedure “base_characteristics.r.”

130. Steps 5 and 6 are the same as described above.

131. Step 7 - Chain the indices.

132. To express the indices in terms of the reference year they need to be chained. Chain the
index by multiplying the index for the current period by the chained index of the last period of
the previous year as shown in Tables 42 and 43.


Table 42. Characteristics - Chaining the Indices for 2017

                4Q2016      1Q2017                  2Q2017                  3Q2017                  4Q2017
                2016=100    4Q2016=100  2016=100    4Q2016=100  2016=100    4Q2016=100  2016=100    4Q2016=100  2016=100
Overall_Index   100.3       101.0       101.3       102.7       103.0       102.8       103.2       99.9        100.2
I_Stratum1      100.0       101.3       101.3       102.9       103.0       102.6       102.7       101.4       101.4
I_Stratum2      100.9       100.5       101.3       102.3       102.7       103.2       104.1       97.3        98.1

Table 43. Characteristics - Chaining the Indices for 2018

                4Q2017      1Q2018                  2Q2018                  3Q2018                  4Q2018
                2016=100    4Q2017=100  2016=100    4Q2017=100  2016=100    4Q2017=100  2016=100    4Q2017=100  2016=100
Overall_Index   100.2       101.2       101.4       100.6       100.9       101.7       101.8       101.3       101.5
I_Stratum1      101.4       101.6       103.0       101.2       102.6       102.0       103.6       103.0       104.6
I_Stratum2      98.1        100.6       98.7        99.9        98.05       101.2       99.0        99.0        96.7

H. Method 4: Imputations Hedonic Method

Reference: Handbook on Residential Property Price Indices, pages 54-56

133. Compilers may also want to consider using the imputations hedonic method. The
imputations hedonic method imputes a price for each property for each period (regardless of
whether a transaction occurred) and calculates an index using the imputed prices.

134. This method requires detailed information on property characteristics and a large
number of transactions. It is generally appropriate if geospatial data (coordinates) are used.
When geospatial data are not used, and the log-linear form is used the characteristics and
imputation hedonic methods are equivalent.

Figure 6. The Mechanics of the Imputations Hedonic Method

[Flow diagram] A property represents a bundle of characteristics → Obtain the “shadow” prices of
the characteristics in period t. → Estimate the current price of the dwellings transacted in
period 0. → Matched model


135. The log-linear specification for each stratum is used:

\ln p_n^t = \beta_0^t + \sum_{k=1}^{K} \beta_k^t z_{nk}^t + \varepsilon_n^t    (6)

136. Separate regressions for the reference period (0) and the current period (t) are used to
estimate the “shadow” prices, i.e., the estimated coefficients (\hat{\beta}), for each quarter in a
stratum. This gives, after exponentiating, the imputed prices for the reference period (0):

\hat{p}_n^0(0) = \exp(\hat{\beta}_0^0) \exp\left( \sum_{k=1}^{K} \hat{\beta}_k^0 z_{nk}^0 \right)    (7)

137. The imputed prices for the current period (t) for the dwellings sold in the reference
period (0) are shown in equation (8).

\hat{p}_n^t(0) = \exp(\hat{\beta}_0^t) \exp\left( \sum_{k=1}^{K} \hat{\beta}_k^t z_{nk}^0 \right)    (8)

138. The index is calculated by taking the ratio of the geometric mean of the imputed prices
for period t for the reference period dwellings to the geometric average of the imputed prices for
reference period 0 for the same dwellings:

I^t = \frac{\prod_{n \in S(0)} \left( \hat{p}_n^t(0) \right)^{1/N(0)}}{\prod_{n \in S(0)} \left( \hat{p}_n^0(0) \right)^{1/N(0)}}    (9)
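
Equations (7) to (9) can be illustrated with the Python sketch below. It is not the Guide's R procedure: the function names, the coefficients and the three dwellings are hypothetical, and the ratio in equation (9) is multiplied by 100 to express it as an index. The last lines also check the statement that, in the log-linear form and without geospatial data, the result coincides with the characteristics method.

```python
import math

def impute_price(coef, dwelling):
    """Imputed price: exp(intercept + sum of coefficient * characteristic)."""
    log_price = coef["(Intercept)"] + sum(coef[k] * z for k, z in dwelling.items())
    return math.exp(log_price)

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def imputation_index(coef_t, coef_0, reference_dwellings):
    """Ratio of geometric means of the prices imputed for the reference period
    dwellings with period t and period 0 coefficients, times 100."""
    prices_t = [impute_price(coef_t, d) for d in reference_dwellings]
    prices_0 = [impute_price(coef_0, d) for d in reference_dwellings]
    return geometric_mean(prices_t) / geometric_mean(prices_0) * 100

# Hypothetical coefficients and reference period dwellings.
coef_0 = {"(Intercept)": 11.40, "floorsq": 0.0045, "detached": 0.18}
coef_t = {"(Intercept)": 11.42, "floorsq": 0.0044, "detached": 0.17}
reference_dwellings = [
    {"floorsq": 200, "detached": 1},
    {"floorsq": 250, "detached": 0},
    {"floorsq": 240, "detached": 1},
]
index = imputation_index(coef_t, coef_0, reference_dwellings)

# Same result from the coefficient differences and the average characteristics.
n = len(reference_dwellings)
averages = {k: sum(d[k] for d in reference_dwellings) / n for k in ("floorsq", "detached")}
equivalent = math.exp(
    (coef_t["(Intercept)"] - coef_0["(Intercept)"])
    + sum((coef_t[k] - coef_0[k]) * averages[k] for k in averages)
) * 100
print(round(index, 1), round(equivalent, 1))
```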

139. Note that the observed prices from the reference period (0) are replaced by the imputed
values. The reasoning is that the estimation errors resulting from omitted variables then offset
each other to some extent, because imputed prices are used for both periods.

140. As with the other methods it is advisable to update the reference period dwellings each
year. For example, for a quarterly index, the dwellings from the fourth quarter in year t are the
reference period dwellings for the index calculation in year t+1.

141. To help compilers become familiar with the process to calculate an RPPI using the
imputations hedonic method an exercise has been developed to accompany this Guide. The
exercise (Exercise 5) uses a synthetic data set and a set of procedures written in R. The following
step-by-step instructions are intended to walk the reader through the exercise.

Exercise 5: Step-by-step Calculations for a Quarterly Index Using the Imputations Hedonic Method

142. The index compilation for the first year is presented in the following eight steps. The
calculations can be replicated using the R procedure named “imputations_starting.r” and the
synthetic set of data accompanying this Guide.

143. Step 1 - Obtain data.


144. Step 2 - Build the strata – Construct the strata as per exercise 1. The following steps
are performed for each stratum.

145. Step 3 - Calculate the “shadow” prices - Run a regression for each period with the
price logarithm as dependent variable and all other significant variables as the explanatory
variables. The coefficients from the regression are the “shadow prices” of the variables.

Table 44. Imputations - Coefficients

Variable                        1Q2016     2Q2016     3Q2016      4Q2016
(Intercept)                     11.4079    11.51      11.39665    11.3768
property_type_nameApartment     -0.1262    -0.15      -0.124095   -0.1641
property_type_nameDetached       0.1797     0.13       0.173795    0.20929
baths1                          -0.0961    -0.01      -0.005754    0.01563
baths2                          -0.0985     0.04       0.038617    0.03664
beds1                           -0.0296    -0.06      -0.033746    0.06489
beds2                            0.0157    -0.07       0.046466    0.05679
beds3                           -0.0227    -0.06       0.009297   -0.0045
Garage                          -0.0239    -0.08      -0.047235    0.07517
Balcony                         -0.0111    -0.02      -0.003069   -0.0308
Ensuite                         -0.0546    -0.01       0.012173    0.03227
Floorsq                          0.0045     0.00       0.004074    0.00406
Air                              0.0008     0.05       0.020406   -0.0281
Elevator                         0.0986     0.00       0.048317    0.01526
Vintage                          0.0004     0.00      -0.000805    0.00034
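
The regression in Step 3 can be sketched as follows. This is a minimal illustration in Python, not the R procedure that accompanies the Guide; the function name `fit_shadow_prices` and the toy data are hypothetical and serve only to show an ordinary least squares fit of log price on the characteristics for one period and one stratum.

```python
import numpy as np

def fit_shadow_prices(log_prices, characteristics):
    """OLS of log price on characteristics.

    Returns [intercept, beta_1, ..., beta_K], the "shadow prices".
    """
    # Add a column of ones so that the first coefficient is the intercept.
    X = np.column_stack([np.ones(len(log_prices)), characteristics])
    coef, *_ = np.linalg.lstsq(X, log_prices, rcond=None)
    return coef

# Hypothetical toy data for one stratum in one period (not the Guide's data set).
rng = np.random.default_rng(0)
floorsq = rng.uniform(60, 300, size=200)
garage = rng.integers(0, 2, size=200)
log_price = 11.4 + 0.0045 * floorsq + 0.05 * garage + rng.normal(0, 0.01, size=200)

coef = fit_shadow_prices(log_price, np.column_stack([floorsq, garage]))
print(coef)  # intercept, floorsq and garage coefficients
```

In practice the same fit would be repeated for every period, giving one column of coefficients per quarter as in Table 44.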

146. Step 4 – Impute the prices - For each property impute a price for each period (including
the reference period) using the coefficients from the regressions.


Table 45. Imputations - Sample of Three Dwellings from Stratum 1 of Period 1Q2016

Dwelling                             A         B         C
property_type_nameApartment          0         1         1
property_type_nameDetached           1         0         0
property_type_nameSemi_detached      0         0         0
baths1                               0         1         0
baths2                               1         0         0
baths3                               0         0         1
beds1                                0         1         0
beds2                                0         0         0
beds3                                0         0         1
beds4                                1         0         0
garage                               0         0         0
balcony                              0         0         0
ensuite                              0         1         1
floorsq                              197       283       258
Price                                282773    254876    216705
Air                                  0         0         0
elevator                             0         0         0
vintage                              0         0         30

Table 46. Imputations - Example of Price Imputations of Three Dwellings from Stratum 1
of Period 1Q2016

                                           Dwelling A         Dwelling B         Dwelling C
Variable                       Estimate    Value  Contrib.    Value  Contrib.    Value  Contrib.
(Intercept)                    11.4079            11.4079            11.4079            11.4079
property_type_nameApartment    -0.1262     0       0.0000     1       0.0000     1       0.0000
property_type_nameDetached      0.1797     1       0.1797     0       0.1797     0       0.1797
baths1                         -0.0961     0      -0.0961     1       0.0000     0       0.0000
baths2                         -0.0985     1       0.0000     0      -0.0985     0      -0.0985
beds1                          -0.0296     0       0.0000     1       0.0000     0       0.0000
beds2                           0.0157     0       0.0000     0       0.0000     0       0.0000
beds3                          -0.0227     0       0.0000     0      -0.0227     1       0.0000
Garage                         -0.0239     0       0.0000     0       0.0000     0       0.0000
Balcony                        -0.0111     0       0.0000     0       0.0000     0       0.0000
Ensuite                        -0.0546     0       0.0000     1       0.0000     1       0.0000
Floorsq                         0.0045     197     1.5914     283     1.2588     258     1.7398
Air                             0.0008     0       0.0008     0       0.0008     0       0.0000
Elevator                        0.0986     0       0.0000     0       0.0000     0       0.0000
Vintage                         0.0004     0       0.0113     0       0.0074     30      0.0109
log imputed price                                 12.3748            12.3737            12.3749
imputed price                                     236762             236506             236784
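
As a hedged illustration of Step 4 and equation (8), the sketch below imputes the 1Q2016 price of dwelling A from the published (rounded) 1Q2016 coefficients in Table 44 and the characteristics in Table 45. It is written in Python rather than the Guide's R, and the helper `impute_log_price` is a hypothetical name. It assumes that categories without a coefficient (for example beds4) are the omitted reference categories, so they contribute nothing.

```python
import math

# 1Q2016 coefficients as published (rounded) in Table 44.
coef_1q2016 = {
    "(Intercept)": 11.4079,
    "property_type_nameApartment": -0.1262,
    "property_type_nameDetached": 0.1797,
    "baths1": -0.0961, "baths2": -0.0985,
    "beds1": -0.0296, "beds2": 0.0157, "beds3": -0.0227,
    "garage": -0.0239, "balcony": -0.0111, "ensuite": -0.0546,
    "floorsq": 0.0045, "air": 0.0008, "elevator": 0.0986, "vintage": 0.0004,
}

# Dwelling A from Table 45: only non-zero characteristics that have a coefficient.
# beds4 = 1 is left out on the assumption that it is the reference category.
dwelling_a = {"property_type_nameDetached": 1, "baths2": 1, "floorsq": 197}

def impute_log_price(coef, dwelling):
    """Intercept plus the sum of coefficient times characteristic value."""
    return coef["(Intercept)"] + sum(coef[name] * value for name, value in dwelling.items())

log_p = impute_log_price(coef_1q2016, dwelling_a)
price = math.exp(log_p)
print(round(log_p, 4), round(price))
```

With the rounded coefficients the log imputed price comes out at about 12.3756, close to the 12.3748 reported for dwelling A in Table 46; the small gap is presumably due to rounding of the published coefficients.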

147. Step 5 - Aggregate the estimated prices - The imputed prices are then aggregated by
taking the geometric mean of the imputed prices for the properties in each stratum for each
period.


Table 47. Imputations - Jevons of the Imputed Prices for the Year 2016

Jevons       1Q2016    2Q2016    3Q2016    4Q2016
Stratum1     236273    238639    240727    238604
Stratum2     201994    201095    198215    203961.7
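
The aggregation in Step 5 is an unweighted geometric mean (Jevons). A minimal Python sketch, using only the three imputed prices from Table 46 as input (the full stratum contains many more dwellings, so the result is not expected to match Table 47); the function name `geometric_mean` is an illustrative choice:

```python
import math

def geometric_mean(prices):
    """Unweighted geometric mean, computed via logs for numerical stability."""
    return math.exp(sum(math.log(p) for p in prices) / len(prices))

imputed_prices = [236762, 236506, 236784]  # dwellings A, B, C from Table 46
jevons = geometric_mean(imputed_prices)
print(round(jevons, 1))
```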

148. Step 6 - Calculate the sub-indices – The stratum index is the ratio of the stratum level
imputed price in period t to the imputed price in period 0 as follows:

$$I^{t} = \frac{\operatorname{Geomean}\left(\hat{p}_n^{t}(0)\right)}{\operatorname{Geomean}\left(\hat{p}_n^{0}(0)\right)} \qquad (10)$$

Table 48. Imputations - Compilation of the Sub-indices for the Year 2016

Sub-indices   1Q2016   2Q2016                              3Q2016                              4Q2016
Stratum1      100      238639.2/236272.5 * 100 = 101.0     240727/236272.5 * 100 = 101.9       238604/236272.5 * 100 = 101.0
Stratum2      100      201095.2/201994.1 * 100 = 99.6      198214.5/201994.1 * 100 = 98.1      203961.7/201994.1 * 100 = 101.0
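
The arithmetic of Step 6 can be reproduced with a few lines of Python. This is an illustrative sketch using the stratum-level values shown in Table 48; the function name `sub_indices` is hypothetical.

```python
# Stratum-level geometric means of imputed prices, 1Q2016 to 4Q2016 (Table 48).
jevons = {
    "Stratum1": [236272.5, 238639.2, 240727.0, 238604.0],
    "Stratum2": [201994.1, 201095.2, 198214.5, 203961.7],
}

def sub_indices(levels):
    """Index each period on the first (reference) period = 100."""
    return [100 * level / levels[0] for level in levels]

for stratum, levels in jevons.items():
    print(stratum, [round(x, 1) for x in sub_indices(levels)])
```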

149. Step 7 - Aggregation (Laspeyres) - The aggregation of the sub-indices is a weighted
average using the weights as calculated in Exercise 1.

Table 49. Imputations - Aggregation of the Sub-indices

                 1Q2016   2Q2016   3Q2016   4Q2016   Weights
Overall_Index    100      100.5    100.5    101.0    1.0
I_Stratum1       100      101.0    101.9    101.0    0.635608
I_Stratum2       100      99.6     98.1     101.0    0.364392
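
The weighted average in Step 7 can be checked as follows. This Python sketch takes the rounded sub-indices and the weights from Table 49 as inputs; the variable names are illustrative.

```python
weights = {"I_Stratum1": 0.635608, "I_Stratum2": 0.364392}
sub_indices = {
    "I_Stratum1": [100.0, 101.0, 101.9, 101.0],
    "I_Stratum2": [100.0, 99.6, 98.1, 101.0],
}

n_periods = 4
# Overall index per quarter: weighted sum of the stratum sub-indices.
overall = [
    sum(weights[s] * sub_indices[s][q] for s in weights)
    for q in range(n_periods)
]
print([round(x, 1) for x in overall])
```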

150. Step 8 - Re-reference the sub-indices - The indices, initially compiled with 1Q2016 as
the reference period (1Q2016=100), need to be re-referenced so that 2016=100. This is done by
first taking the mean of the four quarterly indices of 2016 and then dividing each quarterly index
by that mean.

Table 50. Imputations - Re-referencing the Sub-indices

                 1Q2016                2Q2016                  3Q2016                  4Q2016                  Re-referencing factor (mean of indices)
Overall_Index    100/1.0049 = 99.5     100.5/1.0049 = 100.0    100.5/1.0049 = 100.0    101/1.0049 = 100.5      (100+100.5+100.5+101)/(4*100) = 1.0049
I_Stratum1       100/1.0097 = 99.0     101.0/1.0097 = 100.0    101.9/1.0097 = 100.9    101.0/1.0097 = 100.0    (100+101+101.9+101)/(4*100) = 1.0097
I_Stratum2       100/0.9966 = 100.3    99.6/0.9966 = 99.9      98.1/0.9966 = 98.5      101.0/0.9966 = 101.3    (100+99.6+98.1+101)/(4*100) = 0.9966

Note: inputs are shown rounded to one decimal; results are based on unrounded values.
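
A minimal Python sketch of the re-referencing in Step 8, applied to the rounded overall index from Table 49. The function name `rereference` is an illustrative choice; because the inputs are rounded, the factor here is 1.0050 rather than the 1.0049 shown in Table 50.

```python
def rereference(quarterly_indices):
    """Divide each quarterly index by the annual mean so that the year averages 100."""
    mean = sum(quarterly_indices) / len(quarterly_indices)
    return [100 * x / mean for x in quarterly_indices]

overall_1q2016_base = [100.0, 100.5, 100.5, 101.0]  # 1Q2016 = 100
overall_2016_base = rereference(overall_1q2016_base)  # 2016 = 100
print([round(x, 1) for x in overall_2016_base])
```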

151. The index compilation for the following years (2017 and 2018) is presented in the
steps below. The calculations can be replicated using the R procedure named
“imputations_production.r”. The base price used to compile the index for 2017 and 2018 can be
obtained with the R procedure named “base price.r”.

152. Steps 1 to 5 are the same as for the first year.

153. Step 6 - Calculate the sub-indices - The sub-indices are compiled as the ratio of the
geometric mean of the current period imputed price to the geometric mean of the base period
imputed price (last quarter of previous year).

Table 51. Imputations - Compilation of the Sub-indices for 1Q2017

Sub-indices   Jevons 4Q2016   Jevons 1Q2017   1Q2017
I_Stratum1    236272.5        250655.1        250655.1/236272.5 * 100 = 101.3
I_Stratum2    201994.1        217080.5        217080.5/201994.1 * 100 = 99.8

154. Step 7 – Aggregation (Laspeyres) - The aggregation of the sub-indices uses the
weights as calculated in exercise 1.

155. Step 8 - Chain the indices - Chain the index by multiplying the index of the current
period by the chained index of the last period of the previous year as shown in Tables 52 and 53.

Table 52. Imputations - Chaining the Indices for 2017

                 4Q2016     1Q2017                 2Q2017                 3Q2017                 4Q2017
                 2016=100   4Q2016=100  2016=100   4Q2016=100  2016=100   4Q2016=100  2016=100   4Q2016=100  2016=100
Overall_Index    100.5      100.8       101.2      102.5       103.0      102.7       103.2      99.7        100.2
I_Stratum1       100.0      101.3       101.3      102.8       102.8      102.6       102.7      101.2       101.2
I_Stratum2       101.3      99.8        101.1      101.9       103.2      102.7       104.1      97.1        98.4

Table 53. Imputations - Chaining the Indices for 2018

                 4Q2017     1Q2018                 2Q2018                 3Q2018                 4Q2018
                 2016=100   4Q2017=100  2016=100   4Q2017=100  2016=100   4Q2017=100  2016=100   4Q2017=100  2016=100
Overall_Index    100.2      101.2       101.3      100.6       100.8      101.7       101.8      101.3       101.5
I_Stratum1       101.2      101.6       102.8      101.2       102.4      102.0       103.2      103.0       104.2
I_Stratum2       98.4       100.6       99.0       99.9        98.3       101.2       99.6       99.0        97.4
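
The chaining in Step 8 multiplies each within-year index (previous fourth quarter = 100) by the chained index of that fourth quarter. A hedged Python sketch using the rounded Overall_Index values for 2017 from Table 52; the function name `chain` is hypothetical, and because the inputs are rounded the results can differ from the table in the last digit.

```python
def chain(link_indices, base_chained_index):
    """Convert indices with previous 4Q = 100 to the chained reference (2016 = 100)."""
    return [base_chained_index * link / 100 for link in link_indices]

overall_4q2016 = 100.5                       # 4Q2016 on 2016 = 100
overall_2017_links = [100.8, 102.5, 102.7, 99.7]  # 1Q2017 to 4Q2017 on 4Q2016 = 100

overall_2017_chained = chain(overall_2017_links, overall_4q2016)
print([round(x, 1) for x in overall_2017_chained])
```

The last chained value of the year (here 4Q2017) then becomes the base for chaining the following year, as in Table 53.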

I. RPPI Results

156. Figure 7 and Table 54 show the RPPI compiled with the four methods. The characteristics
and the imputations methods give nearly identical results because they are variants of the
same method. As expected, the time dummy method gives less volatile results, since it uses a
rolling window of four quarters, which smooths the index. In contrast, the RPPI compiled with
a simple stratification method with medians is highly volatile due to the lack of proper quality
adjustment. In this example, because the data are suitable for any of the methods, the compiler
should opt for the least volatile one: the time dummy method.

Table 54. RPPI Index Compiled with Four Methods

RPPI      Stratification   Time dummy   Characteristics   Imputations
1Q2016    95.7             99.7         99.6              99.5
2Q2016    100.6            100.5        100.0             100.0
3Q2016    102.5            99.8         100.1             100.0
4Q2016    101.2            100.1        100.3             100.5
1Q2017    102.7            100.7        101.1             101.2
2Q2017    107.2            100.6        102.8             103.0
3Q2017    103.8            101.1        103.0             103.2
4Q2017    107.3            102.1        100.0             100.2
1Q2018    101.7            102.4        101.2             101.3
2Q2018    111.4            103.1        100.6             100.8
3Q2018    102.8            102.4        101.7             101.8
4Q2018    104.6            101.6        101.3             101.5

Figure 7. RPPI Index Compiled with Four Methods

[Line chart, 2016=100. Series: Stratification, Time dummy, Characteristics, Imputations.]


DISSEMINATION AND OTHER INDICATORS


157. This Guide has provided compilers with a series of step-by-step exercises to help
demonstrate the different ways an RPPI can be computed. The choice of method depends largely
on the availability of data and the needs of the users. The methods that utilize hedonic
regressions are generally considered superior and provide users with higher quality estimates. If
compilers are unable to acquire the necessary data to use a hedonic-based method, other
methods such as median with stratification may be appropriate.

158. Before disseminating official results, compilers may want to consider developing
experimental indices for at least one year. The experimental indices can be vetted by various
users in terms of their accuracy, relevance and timeliness. The compiler can then take the
feedback they receive and refine both the methods and business process used to construct the
index. The development of experimental estimates also allows the compiler to test their overall
business process and make any necessary adjustments prior to the official release of the data.
Once the methodology and business process have been finalized, the index should be compiled
for several periods before disseminating the official results.

159. RPPIs are an important source of information used by policy makers to better understand
financial stability as well as socio-economic issues such as housing affordability. Given the
extensive use of the indices it is important that the compilers develop an effective dissemination
strategy. The indices should be made public with a media release. The compilers may also want
to organize a meeting or seminar with the main stakeholders at the time of the release where
they can explain the index and how it can (and should not) be interpreted. The compilers may
want to consider the use of social media and videos to ensure awareness of the index is as
broad-based as possible.

160. The release should also be accompanied by a technical note on sources and methods
that clearly explains the methodology. The technical note should be short and easy to read,
and should include information such as:

• the aim of the index;
• the coverage;
• the stratification;
• the weighting system;
• the reference periods of the weights and the indices; and
• the method used.

161. The technical note is the main document stakeholders will consult when assessing the
credibility, quality, and usability of the index. It is recommended that the presentation of the
technical note follow a standard format such as the IMF factsheet for the Data Quality
Assessment Framework, the United Nations Generic National Quality Assurance Framework or
the template recommended by the Second Phase of the G20 Data Gaps Initiative as in Annex II.


162. A well-articulated technical note or sources and methods guide is especially important in
the case of an RPPI due to the diversity of methods that can be used for its compilation.
International comparability can be strongly affected by the use of different methodologies. For
this reason, it is crucial to provide information on how the index is compiled. Over time, the data
sources and methods used to compile the index may change. The compiler needs to ensure
users are informed of these developments.

163. For the example followed in this Guide, the preferred method is the time dummy method,
and some graphs should be produced from its results.

164. RPPIs are seen as a leading indicator that can help inform users about the business cycle.
Compilers should target releasing the data within 90 days after the reference period for quarterly
data. Compilers should make the data available in machine readable form (e.g., .csv) and publish
the data according to an advance release calendar.
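
As one possible way to provide the data in machine-readable form, the sketch below writes an index series as CSV text using Python's standard library. The column names are hypothetical, and the values are the first four quarters of the time dummy series from Table 54, used purely as sample input.

```python
import csv
import io

periods = ["1Q2016", "2Q2016", "3Q2016", "4Q2016"]
time_dummy_index = [99.7, 100.5, 99.8, 100.1]  # 2016 = 100, from Table 54

# Write to an in-memory buffer; a real release would write to a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["period", "index_2016_100"])  # hypothetical column names
writer.writerows(zip(periods, time_dummy_index))

csv_text = buffer.getvalue()
print(csv_text)
```

Stating the reference period in the column name (or in accompanying metadata) keeps the file self-explanatory when it is reused outside the release.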

165. Publishing a time-series graph of the RPPI and its sub-indices is useful. The graph
must clearly indicate the reference period (base), as in the example below, where the
reference period is 2016. In the graph below, all indices follow the same trend except in the
last period, when the sub-indices move in opposite directions.

Figure 8. RPPI and Sub-indices

[Line chart, 2016=100, 1Q2016 to 4Q2018. Series: RPPI, I_Stratum1, I_Stratum2.]

166. In addition to publishing the index, compilers may also want to present to users, as in
Figure 9, the percentage change from the same period of the previous year (YoY, year on year)
and the percentage change from the previous quarter.
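
Both rates of change are simple ratios of index values. A minimal Python sketch, using the Imputations series from Table 54 as sample input; the function name `pct_change` is an illustrative choice.

```python
def pct_change(series, lag):
    """Percentage change of each value relative to the value `lag` periods earlier."""
    return [100 * (series[i] / series[i - lag] - 1) for i in range(lag, len(series))]

# Imputations index, 1Q2016 to 4Q2018, 2016 = 100 (Table 54).
rppi = [99.5, 100.0, 100.0, 100.5, 101.2, 103.0, 103.2, 100.2, 101.3, 100.8, 101.8, 101.5]

yoy = pct_change(rppi, lag=4)  # starts at 1Q2017
qoq = pct_change(rppi, lag=1)  # starts at 2Q2016
print([round(x, 1) for x in yoy])
print([round(x, 1) for x in qoq])
```

For a quarterly index the YoY series starts four quarters after the first observation, which is why Figure 9 begins in 1Q2017.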


Figure 9. YoY - RPPI and Sub-indices

[Chart of YoY changes with two vertical axes, 1Q2017 to 4Q2018. Series: I_Stratum1, I_Stratum2, RPPI.]

167. The contribution of each stratum to the YoY rates can be analyzed more easily on
a graph, as shown below. As expected, due to its higher weight, stratum 2 is the greater
contributor to the change in the overall index.

Figure 10. YoY and Contributions

[Chart with two vertical axes, 1Q2017 to 4Q2018. Series: Contribution Stratum 1, Contribution Stratum 2, RPPI.]

168. The trend of the RPPI can be compared with the development of the number of
dwellings being sold in each period.


Figure 11. RPPI and Number of Dwellings Sold

[Two-panel chart, one panel per stratum, each with two vertical axes. Panel "Stratum 1": number of sales and RPPI for Stratum 1. Panel "Stratum 2": number of sales and RPPI for Stratum 2.]

169. The change in dwelling prices may or may not follow general inflation. Comparing the
RPPI with the CPI may indicate how closely house prices move in line with general inflation.

Figure 12. RPPI and CPI

[Line chart, 1Q2016 to 4Q2018. Series: RPPI, CPI.]

170. Comparing the RPPI trend with those of neighboring countries can also provide useful
information for policy making. However, this analysis must take into account the methodologies
used in each country.

171. More and more statistical agencies are experimenting with various ways to visually
present RPPIs. RPPIs lend themselves to visual presentation given the fact that they are multi-
dimensional (e.g., time, geography, dwelling type, etc.).

172. The FT visual vocabulary provides guidance on how to create easily readable graphs for different types of information. The image
below shows some examples of how to build graphs for indicators measuring change over time.

Figure 13. The Financial Times Vocabulary



173. Compilers can also experiment with the use of maps and other less traditional visualizations
to ensure their data are as impactful as possible. For example, Figure 14 shows the cities in the United
States with the highest concentration of dwellings over a certain value.

Figure 14. US Cities with the Highest Share of Million-dollar Dwellings

Source: Lendingtree.

174. Many statistical agencies are releasing RPPIs within a broader set of residential property
information. RPPIs and the data that underpin their calculation can be used to examine issues such
as housing affordability. Consider the examples below which use the price levels in domestic
currency as an input to examine housing affordability in Sweden as well as across a select number of
countries. These indicators can be calculated using the RPPI input data in conjunction with income
data. Figure 15 shows the ratio of house prices to income for cities in Sweden, while Figure 16
compares house prices to income for selected countries. These indicators were constructed using the
same data that are used to compile RPPIs.


Figure 15. Housing Affordability in Sweden Sample of Cities

Figure 16. House Price-to-income Ratio Across Countries

175. The Second Phase of the G-20 Data Gaps Initiative recommends that statistical agencies release
indicators on the number of dwellings sold. Indicators on sales generally appeal to a broader
audience than a price index. Figure 17 provides an example of how data on housing sales can be
disseminated to the general public.


Figure 17. Residential Property Sales Variation

Source: Global Property Guide.

176. Regional price developments and prices per square meter/foot can be of interest to the
public, especially when trends differ from region to region. The interactive graphic below provides a
wealth of housing-related information for the different regions in Italy.

Figure 18. Heat Map with Price Development Per Square Meter in Italy

Source: idealista.it.

177. Dissemination is often an afterthought in the compilation of RPPIs. To ensure the results of
their work are as impactful as possible, compilers are encouraged to develop a dissemination strategy
prior to the development of the index. The strategy should consider user needs, the latest technical
advances, and the diverse set of uses of the information. Ideally, the dissemination of the index
will be part of a larger residential property framework that provides a wealth of information on
housing prices, sales, stocks, quality, and affordability.


REFERENCES
Brand, Manuel; Vermeulen, Corinne; Fischbach, Davide; Carpy, Yves (2017). Swiss residential property
price index.

EUROSTAT, ILO, IMF, OECD, UNECE, the World Bank (2013). Handbook on Residential Property Prices
Indices (RPPIs).

EUROSTAT (2017). Technical Manual on Owner-Occupied Housing and House Price Indices.

Geng, Nan (2018). Fundamental Drivers of House Prices in Advanced Economies. IMF Working Paper.

Linz, Stefan; Behrmann, Timm (2004). Using Hedonic Pricing for the German House Price Index.

IMF. The CPI Manual: Concepts and Methods, forthcoming.

Marola, Bogdan; Santos, Daniel; Evangelista, Rui; Guerreiro, Vanda (2012). Working Towards a
Comparable House Price Inflation Measure in Europe. UNECE Meeting of the Group of Experts on CPI.

O’Hanlon, Niall (2011). Constructing a National House Price Index for Ireland. Journal of the
Statistical and Social Inquiry Society of Ireland, 40, 167-192.

Ottawa Group - International Working Group on Price Indices.

Prud’Homme, Marc; Erdur, Serra (2007). House Price Indices at Statistics Canada: Practices and
Research.

Rechnio, Renata (2016). The Polish experience in developing House Price Index.

Silver, Mick (2016). How to better measure hedonic residential property price indices. IMF Working
Paper.

UNECE Meeting of the Group of Experts on Consumer Price Indices.

Annex I. Relation between the Methods and the Data Available

[Diagram relating the methods to the data available. Axes: number of prices (observations) and
number of characteristics (from, e.g., town, to, e.g., town, size, garage, elevator, geo-spatial, …).
Methods shown: Method 1 Average, Method 2 Stratification, Method 3 Rolling window,
Method 4 Characteristics, Method 5 Imputations.]
Annex II. G20 Data Gaps Initiative Template on Housing Price Statistics

Residential Property Price Indices (RPPIs)

Tier 1 indicators in green


Tier 2 indicators in orange
Category: I. RPPIs

Common to all series below:
Weighting: Sales and/or Stock based RPPI²
Frequency (preferred): At least quarterly
Starting period (minimum required): 2005 Q1
Measures: Index - National reference period

For each series, the breakdown lists the type of dwellings (single-unit, multi-unit), the vintage
(new, existing) and the geographical coverage marked with × in the template.

I.1.1.a. RPPI - New and existing dwellings for all types of dwellings (single- and multi-unit¹) for the whole country
  Short name: RPPI Total - Whole country. Type of dwellings: single-unit, multi-unit. Vintage: new, existing. Geographical coverage: Whole country.
I.1.1.b. RPPI - New and existing dwellings for all types of dwellings (single- and multi-unit¹) for the capital city³
  Short name: RPPI Total - Capital city³. Type of dwellings: single-unit, multi-unit. Vintage: new, existing. Geographical coverage: Capital City.
I.1.1.c. RPPI - New and existing dwellings for all types of dwellings (single- and multi-unit¹) for urban area⁴
  Short name: RPPI Total - Urban areas⁴. Type of dwellings: single-unit, multi-unit. Vintage: new, existing. Geographical coverage: Urban areas³.
I.1.2.a. RPPI - New and existing dwellings for single-unit dwellings¹ for the whole country
  Short name: RPPI Total - Single-unit dwellings¹ - Whole country. Type of dwellings: single-unit. Vintage: new, existing. Geographical coverage: Whole country.
I.1.2.b. RPPI - New and existing dwellings for single-unit dwellings¹ for the capital city³
  Short name: RPPI Total - single-unit dwellings¹ - Capital city³. Type of dwellings: single-unit. Vintage: new, existing. Geographical coverage: Capital City.
I.1.2.c. RPPI - New and existing dwellings for single-unit dwellings¹ for urban areas⁴
  Short name: RPPI Total - single-unit dwellings¹ - Urban areas⁴. Type of dwellings: single-unit. Vintage: new, existing. Geographical coverage: Urban areas³.
I.1.3.a. RPPI - New and existing dwellings for multi-unit dwellings for the whole country
  Short name: RPPI Total - Multi-unit dwellings - Whole country. Type of dwellings: multi-unit. Vintage: new, existing. Geographical coverage: Whole country.
I.1.3.b. RPPI - New and existing dwellings for multi-unit dwellings¹ for the capital city³
  Short name: RPPI Total - Multi-unit dwellings¹ - Capital city³. Type of dwellings: multi-unit. Vintage: new, existing. Geographical coverage: Capital City.
I.1.3.c. RPPI - New and existing dwellings for multi-unit dwellings¹ for urban areas⁴
  Short name: RPPI Total - Multi-unit dwellings¹ - Urban areas⁴. Type of dwellings: multi-unit. Vintage: new, existing. Geographical coverage: Urban areas³.
I.2.1.a. RPPI - New dwellings for all types of dwellings (single- and multi-unit¹) for the whole country
  Short name: RPPI New - Whole country. Type of dwellings: single-unit, multi-unit. Vintage: new. Geographical coverage: Whole country.
I.2.1.b. RPPI - New dwellings for all types of dwellings (single- and multi-unit¹) for the capital city³
  Short name: RPPI New - Capital city³. Type of dwellings: single-unit, multi-unit. Vintage: new. Geographical coverage: Capital City.
I.2.1.c. RPPI - New dwellings for all types of dwellings (single- and multi-unit¹) for urban areas⁴
  Short name: RPPI New - Urban areas⁴. Type of dwellings: single-unit, multi-unit. Vintage: new. Geographical coverage: Urban areas³.
I.2.2.a. RPPI - New dwellings for single-unit dwellings¹ for the whole country
  Short name: RPPI New - single-unit dwellings¹ - Whole country. Type of dwellings: single-unit. Vintage: new. Geographical coverage: Whole country.
I.2.2.b. RPPI - New dwellings for single-unit dwellings¹ for the capital city³
  Short name: RPPI New - single-unit dwellings¹ - Capital city³. Type of dwellings: single-unit. Vintage: new. Geographical coverage: Capital City.
I.2.2.c. RPPI - New dwellings for single-unit dwellings¹ for urban areas⁴
  Short name: RPPI New - single-unit dwellings¹ - Urban areas⁴. Type of dwellings: single-unit. Vintage: new. Geographical coverage: Urban areas³.
I.2.3.a. RPPI - New dwellings for multi-unit dwellings¹ for the whole country
  Short name: RPPI New - Multi-unit dwellings¹ - Whole country. Type of dwellings: multi-unit. Vintage: new. Geographical coverage: Whole country.
I.2.3.b. RPPI - New dwellings for multi-unit dwellings¹ for the capital city³
  Short name: RPPI New - Multi-unit dwellings¹ - Capital city³. Type of dwellings: multi-unit. Vintage: new. Geographical coverage: Capital City.
I.2.3.c. RPPI - New dwellings for multi-unit dwellings¹ for the urban areas⁴
  Short name: RPPI New - Multi-unit dwellings¹ - Urban areas⁴. Type of dwellings: multi-unit. Vintage: new. Geographical coverage: Urban areas³.
I.3.1.a. RPPI - Existing dwellings for all types of dwellings (single- and multi-unit¹) for the whole country
  Short name: RPPI Existing - Whole country. Type of dwellings: single-unit, multi-unit. Vintage: existing. Geographical coverage: Whole country.
I.3.1.b. RPPI - Existing dwellings for all types of dwellings (single- and multi-unit¹) for the capital city³
  Short name: RPPI Existing - Capital city³. Type of dwellings: single-unit, multi-unit. Vintage: existing. Geographical coverage: Capital City.
I.3.1.c. RPPI - Existing dwellings for all types of dwellings (single- and multi-unit¹) for urban areas⁴
  Short name: RPPI Existing - Urban areas⁴. Type of dwellings: single-unit, multi-unit. Vintage: existing. Geographical coverage: Urban areas³.
I.3.2.a. RPPI - Existing dwellings for single-unit dwellings¹ for the whole country
  Short name: RPPI Existing - single-unit dwellings¹ - Whole country. Type of dwellings: single-unit. Vintage: existing. Geographical coverage: Whole country.
I.3.2.b. RPPI - Existing dwellings for single-unit dwellings¹ for the capital city³
  Short name: RPPI Existing - single-unit dwellings¹ - Capital city³. Type of dwellings: single-unit. Vintage: existing. Geographical coverage: Capital City.
I.3.2.c. RPPI - Existing dwellings for single-unit dwellings¹ for urban areas⁴
  Short name: RPPI Existing - single-unit dwellings¹ - Urban areas⁴. Type of dwellings: single-unit. Vintage: existing. Geographical coverage: Urban areas³.
I.3.3.a. RPPI - Existing dwellings for multi-unit dwellings¹ for the whole country
  Short name: RPPI Existing - Multi-unit dwellings¹ - Whole country. Type of dwellings: multi-unit. Vintage: existing. Geographical coverage: Whole country.
I.3.3.b. RPPI - Existing dwellings for multi-unit dwellings¹ for the capital city³
  Short name: RPPI Existing - Multi-unit dwellings¹ - Capital city³. Type of dwellings: multi-unit. Vintage: existing. Geographical coverage: Capital City.
I.3.3.c. RPPI - Existing dwellings for multi-unit dwellings¹ for urban areas⁴
  Short name: RPPI Existing - Multi-unit dwellings¹ - Urban areas⁴. Type of dwellings: multi-unit. Vintage: existing. Geographical coverage: Urban areas³.