
What We Don’t Know | Literary Review of Canada


What We Don’t Know

A modest proposal for fixing Canada’s data deficit

March 2017

Data is back in fashion in Ottawa. As Prime Minister Justin Trudeau
brings back Laurier’s “sunny ways,” one of the promised effects will
be to infuse light into the data dark age of the Harper government’s
suppression of data and information. While the Harper government’s
killing of the long-form census was its most obvious and egregious
attack on data, there were numerous others, such as putting out of
business the National Council of Welfare and its statistical reports,
which were critical to any understanding of poverty in Canada. The
Harper people explained their actions as protecting privacy and
saving money. Its critics dismissed these excuses, saying the data
vandalism was an attempt to keep facts from spoiling the
government’s storylines.
Is data truly back in Ottawa? Certainly it is rhetorically, with
pledges to restore the long-form census, fortify Statistics Canada
and commit the government to evidence-based policy making. Some
wags were fond of characterizing the Harper government approach
as policy-based evidence making, raising images of data sleuths
scurrying around for statistics to support whatever the government
was intent on doing.
Illustration by Jay Dart

Data seems to be back in purpose as well. Much has been made of
the Privy Council Office’s initiative around “deliverology,” an idea
borrowed from the United Kingdom government of Tony Blair, which
deliberately sets out to measure how well government is doing in
meeting its commitments. The PCO “results and delivery” unit is up
and running. It has created a process for assessing how the
government is doing across the twelve broad issue areas of the
government (for example, stronger diversity, better relations with
and outcomes for the indigenous community, and growing the middle
class), and defining what constitutes success, and what needs to be
observed and measured along the way.
It is clear that data needs to be a big part of the process. So an
early part of deliverology is finding out how adequate our data is in
Canada. We may have a good idea of what we need to know in
assessing progress toward the government’s goals, but do those data
sets exist? Are they accessible? Are they useful? More broadly, do we
have a plan for data? Is it just a matter of StatsCan returning to
business as usual? Or do we need a more ambitious plan to meet
Canada’s data challenge, a plan that might envision new data
protocols and even new agencies?

Some complain that criticism of the Harper government’s treatment
of data is overcooked, and that it was merely tidying things up and
eliminating intrusions on the privacy of Canadians. They claim that
the criticism was just so much partisan hacking.
This was not the view of the international statistics community,
which, unconcerned with the ins and outs of Canadian politics,
wondered how a modern nation could handicap itself in this way. In
Canada, objections came not just from the wonkery starved of their
oxygen, but from many commercial and economic organizations
needing the data to understand markets, from institutions needing
to understand trends influencing their clients, and from
governments at all levels faced with constructing effective policy and programs.
The 2010 termination of the long-form census was the biggest
stroke, which created an uproar in the media and in the research
community. Long-form data fed other crucial studies such as the
Labour Force Survey and the Survey of Household Spending on
which the Consumer Price Index is based.
Other than the long-form census, a big area of loss between
2010 and 2012 was in social data:
The Participation and Activity Limitation Survey, known in the
research community as PALS, was interrupted and replaced with an
inferior data set. This was the major source of information on people
with disabilities and the supports they need.
Social Security Statistics: Canada and the Provinces, which
monitored federal/provincial/territorial/municipal programs,
The Survey of Labour and Income Dynamics (known as SLID)
was terminated in 2012, losing a key tracking of individual
movement in and out of poverty.

Welfare Incomes and Poverty Profile were ended when the 2012
budget killed the National Council of Welfare, the social policy
advisory body to the government. Those reports were indispensable
sources of information in policy formation.
With the social policy community in shock as these critical data
sets fell away, the Caledon Institute of Social Policy decided that
Welfare Incomes and Poverty Profile had to be saved, and stepped
into the breach. It helped that Caledon president Ken Battle had
been the architect of both reports when he headed the National
Council of Welfare, before he went on to cofound Caledon in 1992
(with the author of this article).
In the process of incorporating the two reports into Caledon,
Battle looked at the state of social data in Canada generally, and
Caledon launched the Canada Social Report, an effort to put key
social data in one place. While the CSR is a useful and growing
resource, its development has illustrated two points: there is a great
need for a national resource that houses key data for Canada’s
ongoing social development and it must be done at scale, probably
by a stable and well-resourced agency. Caledon is a well-regarded
and successful public policy think tank, but it is small and mostly
privately funded. Finding the resources to run a national data agency
is beyond its reach.

Another key objective of the new federal government is to begin to
close the gap on our infrastructure maintenance and provision.
Across the country we have crumbling and deteriorating roads,
bridges, tunnels, energy distribution systems, schools, hospitals and
social housing. We also need to build new infrastructure to meet the
needs of a growing population and economy. We need new
installations of the infrastructure above, and in public transit,
airports and harbours. The government has made infrastructure an
early funding priority.
Because of that, questions have arisen about how well we do at
building things. Are we on time and on budget?
This question intrigues Matti Siemiatycki, professor of
geography and planning at the University of Toronto. He wrote a
2016 paper on “Cost Overruns on Infrastructure Projects: Patterns,
Causes, and Cures” for the Institute on Municipal Finance and Governance.
Siemiatycki documents what many people sense, even if just
from reading newspaper headlines, that on time and on budget are
rare events. Looking at a large sampling of projects in Canada and
internationally, both publicly and privately executed, he reports that
costs run over by 28 percent. Rail projects go over by 45 percent,
bridges and tunnels by 34 percent, and surface roads by 20 percent.
Large information and technology projects go over by 27 percent,
with one in six of them over by 200 percent. And the average
overrun for Olympic Games? A mighty 179 percent! Faster, higher,
stronger, indeed.
These are alarming numbers. Siemiatycki can explain some of it:
scope changes and change orders once budgets have been approved;
“handover” problems between governments, private contractors and
subcontractors; incomplete studies prior to approval on technical
and engineering issues; inflation in labour and material costs; delays
from strikes, materials sourcing or coordination with utility
companies; and unforeseen events such as bad weather or accidents.
He also notes that there are human factors at work, such as
“optimism biases,” which lead project proponents to assume things
will work out well for the most part. There is also “strategic
misrepresentation” by proponents who want to keep budgets low and
schedules short in order to get approval from a council or
parliament, or from a corporate board of directors.
But a significant observation Siemiatycki makes is this:
The world is in the midst of a big data and analytics revolution.
From professional sports to product marketing, sophisticated new
methods are being developed to improve performance by collecting
and statistically analyzing massive amounts of data. Yet
infrastructure megaproject delivery remains a sector that has been
largely untouched by this trend.
He goes on to recommend that project suppliers be subject to
rigorous prequalification in bids, as a way of separating good from
poor performers. But, he notes, “the strength and legitimacy of the
prequalification system is predicated on the development of a data
collection regime that is rigorous in capturing both the size and
causes of cost overruns as well as construction quality.”
Finally, Siemiatycki recommends a systematic knowledge exchange.
Citing the Major Projects Leadership Academy at Oxford University in
the United Kingdom, he says that using the knowledge arising from
this data collection to educate current and emerging project
managers is critical in reducing inefficiencies.

In two critical areas, then, social policy and infrastructure provision,
we are hearing voices calling for a much stronger data capability. On
one hand, it is a response to the ebbing of StatsCan and the
withdrawal of key elements on which policy makers and analysts
relied. On the other, it is a call for new capacity, collecting and
analyzing data we have not had in the past.
One solution put forward is a simple reinvestment in StatsCan.
After all, the new government has restored the long-form census.
Surely it can restore the interrupted data series, or patriate those
such as Welfare Incomes that are now housed elsewhere. This
presumes that all we lack is the will of the government in power to
be committed to evidence as the basis of decision making.
There is no doubt that StatsCan is a formidable and able
enterprise. Formed in 1918 as the Dominion Bureau of Statistics,
renamed and reconstituted in 1971, it has been regarded as one of
the very best national statistical organizations by observers such as
The Economist. Over the decades, it has built up a strong body of
data, much of it the result of contribution agreements with data
generators across the country, principally provinces and territories. It
issues many regular and useful reports across a broad range of topics.
And yet it is not the most accessible of organizations. Many
researchers have stories of visiting the StatsCan facility in search of
data only to be subjected to a Soviet-style process of surrendering
cellphones and computers, being denied copying of documents, and
other minor privations only to find out some time later that much of
the information in question was already in the public domain. They
would not characterize StatsCan as being customer friendly. Some
have unflatteringly compared the approach to that of the Toronto
Transit Commission: TTC officials think their customers are buses
and trains, not riders; StatsCan thinks its customer is the data, not
the user of the data.
StatsCan and the TTC are not alone in this orientation. In fact,
it is at the base of a great discussion in the data world. On one side
of the discussion, there are traditionalists who believe that data is
proprietary, and that there are deep privacy and security issues at
stake in its distribution. They believe that data falling into the
wrong hands can be used against the interest of the state, or whoever
owns the data. They believe that they have the ability to pose the
right questions to the data set to produce answers that will serve
society. Or at least they have the motivation to find out the right
questions. In the case of an agency like StatsCan, they have earned
enough respect that they may well be right.
On the other side are those who advocate for open data, making
available as much data as possible to the public, recognizing, of
course, that proper attention be paid to anonymity and security.
Advocates say data should be provided in its most raw form with as
little intermediation between it and the user as possible. The
benefits in transparency and accountability would be significant,
they say. Of more interest to many are potential benefits in
innovation and engagement by people who would pose new queries
to the data in order to develop new answers to old questions, new
solutions, and new tools for governments and citizens.
This tussle over data is going on everywhere. Some governments
have adopted open data protocols: municipal governments in San
Francisco, New York, Edmonton; 40 U.S. states and four Canadian
provinces; and many national governments, including Canada. For
the most part these are partial commitments, with only selected data
sets made available, but in most cases there is a commitment for
more to come. It should be said that having an open data policy does
not necessarily mean that a government has become more client- or
citizen-focused in its approach. The government can still be very
selective about what it makes available.
How then should the government go about achieving a more
effective data strategy? Is there a better model than StatsCan? Some
point to the Canadian Institute for Health Information. CIHI was
formed in 1994 with the mission of disseminating quality healthcare
information. A private non-profit organization, it is governed by a
board with members from the federal and provincial/territorial
governments. Like StatsCan it has forged a large number of data
contribution arrangements across the country, and makes an effort
to engage many stakeholders in health care, including institutional
and private actors.
An advantage of a dedicated agency, apart from making data
widely available, is that it can push a research agenda into new
frontiers, or deeper waters. Because of its relationships with
contributors and users of data and information, it can develop new
questions, tackle harder problems and test innovative enquiries.
While there are others in society with deep knowledge of medical
issues, say sleep apnea or pancreatic cancer, perhaps in university or
hospital research facilities, a dedicated body with data at hand can
push consistently across a range of projects. And with a distributed
“ownership” represented by a diverse governance membership, no
one government or interest will dominate the direction of the
research agenda.
A key part of CIHI’s mandate is to make healthcare information
publicly available to Canadians. It is also committed to supporting
data access for graduate students as a way of developing future
generations of healthcare system professionals and analysts. It
balances access with a commitment to data security and privacy.
Without arguing that CIHI is perfect, one can say it provides a
model for other areas with data issues and concerns. StatsCan would
continue as Canada’s predominant data agency, and would be a
significant data supplier to agencies modelled on CIHI. Its power
and expertise should not be diminished. But new agencies would
permit stronger focus and deeper investigation of their areas of
What might that model look like in social policy and in infrastructure?
A Canadian institute of social information, CISI if you will,
would take the idea that Caledon has developed with the Canada
Social Report to scale. It would become a comprehensive data and
information agency that would assemble the broadest possible range
of data. It would, of course, track the condition and performance of
the major social support programs: the Canada Child Benefit, the
Working Income Tax Benefit, the Canada Pension Plan, along with
Old Age Security and the Guaranteed Income Supplement,
Employment Insurance and the various disability supports.
It would also bring to one agency data on unemployment rates, a
basic barometer of economic health; poverty rates, showing whether
more or fewer Canadians are living in poverty; and the incomes of
Canadians, rising or falling. These three are basic measures of how
Canadians are doing at any one time.
Social data encompasses a broad range, some of it currently
measured and some not. The state of our education system from
kindergarten to post-graduate is critical to national competitiveness.
So is the vitality of our cities. We need to see the data story, and be
able to compare ourselves to those who are world leaders in
performance and support. What is the status of our food security?
Are Canadians adequately housed in homes of good quality and that
are affordable, and do they have security of tenure? Or are too many
even without housing, homeless? How are seniors faring, and
women, and youth? In Canada we do not have to ask a question
about the adequacy of our knowledge of conditions of our
indigenous community, particularly those living on reserve, because
we know it falls far short. How do we measure up on quality-of-life
measures, and how do we compare with those who are doing well?
And, on all these dimensions, are we getting better or worse?
A comprehensive approach to social data would allow for deep
dives. Caledon uses unemployment data as an example. The national
unemployment statistic tells one story, but is it the right story? As I
write this the reported Canadian unemployment rate is 6.8 percent.
But it varies by province and by region. The rate in Toronto is
different from the one in northern Ontario, and Kelowna’s is
different from Corner Brook’s. Young men have a rate more than
twice the national average. And it differs by education level. For a
social policy researcher looking to help design the best public
program support for those out of the labour market, or for the
private company looking to inform its human resource practice,
being able to parse these differences is important. These deep dives
are significant because they push the far frontiers of our knowledge
to produce greater insight, which can lead to better responses.
Much of this data is available, and a CISI would have to identify
sources and negotiate contribution agreements. Data would have to
be anonymized, kept secure and monitored for quality. It would have
to be accessible, taking the CIHI mission of making social
information available to Canadians. And it would have to operate at
scale to be adequately funded and to be comprehensive.
If we created a CISI, could we also have a CIII, a Canadian
institute of infrastructure information? Fortunately, the
aforementioned Professor Siemiatycki has written another paper,
“Implementing a Canadian Infrastructure Investment Agency,”
telling us how.
As he says, “the CIIA would be positioned as a national centre of
excellence supporting rigorous project evaluation, procurement best
practices and project financing under a single roof.” He envisions an
arm’s-length agency acting in an advisory role to those building
infrastructure on the matters of project financing, project selection
and prioritization, project delivery, the maintenance of transparency
and accountability in the process, and on building human capacity in
the system to improve overall performance. It would focus on
projects over $100 million, which is where most of the over-budget,
behind-schedule problems occur.
As infrastructure data is a road less travelled than social policy,
such an agency would be designed from the ground up. In social
policy, for example, Welfare Incomes already has a negotiated
contribution agreement between the federal, provincial and
territorial governments. Infrastructure reporting would have to be
negotiated de novo among the many actors, at various governments
and in the private sector. Starting such an agency would require an
entrepreneurial leader in its early years to build the relationships,
programs and products, as well as the advisory models. Siemiatycki
stresses the importance of independence, and the entrepreneurial
style required shows why it is necessary. Deciding to make a call to a
deputy minister or a CEO should not require approvals up and
down a bureaucratic hierarchy, not to mention cross-ministry or
cross-agency sign-off.
Since Siemiatycki wrote his paper, the federal government has
announced the formation of the Canada Infrastructure Bank.
There has been some debate about whether Canada actually needs
an infrastructure bank, but almost every discussion of it concludes
that we need better data and information about infrastructure. There
are many sources of infrastructure financing, the argument goes, but
there is not nearly enough on data and information. So maybe we
will get the bank anyway, and a vibrant information facility will be
part of it.

If a CISI and a CIII are good ideas, what should be some of the
basic design elements?
They should operate on an open data technology platform.
Technology has brought us a long way from handwritten lists or
even from the construction of PDFs. The world has moved to open
data, where raw data is made available so that users can formulate
their own questions that generate answers that meet their needs.
Open data should be the default, and withholding data from the
open regime should only be done after a reasonable argument has
been made. In effect, this reverses the traditional situation, where
the reasonable argument has to be made why the data should be
placed in an open platform. Care has to be taken with data, of
course. It must be anonymized so that the identity of individuals is
protected. It must not reveal information that would place either the
state or individuals at risk, as long as that perception of risk is real.
But we must realize that sometimes we do not even know the
questions that need to be asked, and allowing data to have a wide
range of questions posed to it may provide valuable answers to
questions ranging from freeing up traffic flows to how we should
treat people with autism.
They should pick up on the CIHI model of being a non-profit
independent agency. CIHI has a board representing the key
stakeholders and other important viewpoints, but, as in all
governance, board members’ first allegiance must be to the well-
being of CIHI, not the stakeholder (the province or territory, or
professional group such as doctors or nurses) they come from. Such
independence is vital both for broad ownership of the enterprise and
for the authenticity and legitimacy of the data.
Collaboration should drive much of the data collection. All levels
of government collect data: municipalities know about water, waste,
transit and immigration settlement; provinces know about their big
files such as health care and education, and also about land use and
energy systems; and the federal government knows about income
supports, airports and harbours. The First Nations know about their
people and their needs, about the land, and about justice.
Corporations know about the jobs, markets and consumers. Our
wide range of institutions such as hospitals, schools, police services
and the military know their issues and how people respond to them.
All can contribute data, and can benefit from having that data made
available, analyzed and turned into useful information.
These institutes should operate along the spectrum of data
leading to information leading to knowledge. They need to have
data available on open platforms, but also need to apply the analysis
that can produce reports to show trends and comparisons. And from
time to time they need to say what it all means, to describe to
Canadians the evolving story of the country.
And the institutes should be clients of StatsCan, not competitors
with it. StatsCan will continue to be the world-respected leader it is.
But these particular institutes will allow for deeper, more flexible,
and more accessible engagement.
Is this a costly enterprise? The annual budget of CIHI just
exceeds $100 million. Given the importance of health care to
Canadians and the benefits CIHI produces, that is probably a good
bargain. Siemiatycki estimates his version of a CIII would come in
at $20 million or less. Given the broader range of data in a CISI, it
might cost $40–50 million a year. For less than $200 million, we
would have three major areas of importance to Canadians supplied
with the data they need on a modern technology platform. Against
an annual federal budget of about $300 billion, that is an affordable
amount. In fact, it may be an investment we can’t afford not to make
because it will contribute to policy and programs that are more
effective and of better value.
It seems that for Justin Trudeau’s government’s sunny ways,
evidence, data and information will be a strong platform. After at
least a decade during which Canadians got used to governments
happy to operate in the dark, the light is coming back on. The
creation of a set of vital information institutes would create a legacy
for the current government that would be hard to undo. Based as it
is on a broad collaboration, and having independent status, an
organization like CIHI would be hard to kill. With care in their
construction and conduct, so would a CISI, CIII and beyond.

Alan Broadbent is chair and co-founder of Maytree, the Caledon Institute of
Social Policy, and the Institute on Municipal Finance and Governance. He is the
author of Urban Nation: Why We Need to Give Power Back to the Cities to Make
Canada Strong (HarperCollins, 2010).
