
J. Japanese Int. Economies 43 (2017) 77–87

Contents lists available at ScienceDirect

Journal of The Japanese and International Economies


journal homepage: www.elsevier.com/locate/jjie

Computer technology and probable job destructions in Japan: An evaluation☆
Benjamin David
EconomiX-CNRS, University of Paris Ouest, France

Article info

Article history:
Received 17 November 2015
Revised 2 January 2017
Accepted 4 January 2017
Available online 5 January 2017

JEL classification: C53, J21, O33

Keywords: Computer technology, Japanese labor market, Automation, Random forest

Abstract

Computer technology is currently experiencing important developments by generating new tools and methods with increasing capacities. This suggests that a growing share of economic tasks could be performed by this new capital at the expense of labor. This paper evaluates the risk of job destructions induced by computer technology in Japan. We aim at assessing the vulnerability of employment from a technical point of view by considering jobs’ differential endowment in non-programmable skills. Relying on a machine learning technique, we find evidence that approximately 55% of jobs are susceptible to being carried out by computer capital in the next years. We also show that there is no significant difference on the basis of gender. By contrast, non-regular jobs (those that concern temporary and part-time workers) are more vulnerable to computer technology diffusion than the others. These findings, based on a technical background, shed light on the scale of the potential capital/labor substitution, but these dynamics will also depend on economic and social factors.

© 2017 Elsevier Inc. All rights reserved.

1. Introduction

In recent years, a wide dissemination of computer and communication technologies has been observed in Japan and in other industrialized countries. This diffusion in economic activities is a broad and international movement with country-specific timelines. According to Murata (2010), this process began in Japan in the 1950s with the acquisition of a US computer (Bendix G-15) by the Railway Technical Research Institute (1957) and continued through the 1960s with the introduction of second- and third-generation computers.1 These machines, which were both imported and produced locally, were progressively dedicated to special-purpose terminals in the manufacturing, distribution and financial industries in the 1970s. During the 1980s, Japan underwent a massive development of office automation, benefiting from the invention of the Personal Computer (PC) and of specific programs such as spreadsheets or word processors.2 The spread of computers has continued until now, and they are used every day in a wide range of economic activities. In parallel, technical advances and deregulation policies have caused the emergence of networks supported by computer infrastructure. This concerns the use of the fax machine in the 1980s and, above all, the development of the Internet and the mobile phone in the 1990s and 2000s. Networks have hugely modified the modalities of communication and reduced their costs. The current situation is characterized by massive access to these networks: 90% of Japanese people have access to the Internet and the ownership rate of mobile phones is equal to 120% (International Telecommunication Union, 2015).

Another aspect of the current technological diffusion is the massive development of robotics, which can be viewed as a component of the computer revolution (Spong et al., 2012).3 In this field, Japan is a key country both in terms of production and use, to the point that it has been labelled the “robot kingdom” (Schodt, 1988). Indeed, Japan has a long and fruitful history in this matter, which began in the 1960s with the introduction of industrial robots dedicated to welding, assembling or painting, followed by the first mobile robots used in inspection, transport or spatial tasks in the 1980s (Kumaresan and Miyazaki, 1999). The production and use of robots continue on a large scale and have been extended through the development of micro-robots and service robots. Illustrations of this trend
☆ We are grateful to Professor Valérie Mignon, Professor Ryo Kambayashi and the two anonymous referees for very helpful comments and suggestions.
E-mail address: benjamin.david11@hotmail.fr
1 The second generation of computers is characterized by the introduction of transistors and the third by the use of integrated circuits.
2 The first PC produced in Japan was the NEC PC 8001 (1979).
3 They argue that “The key element in the above definition is the reprogrammability of robots. It is the computer brain that gives the robot its utility and adaptability. The so-called robotics revolution is, in fact, part of the larger computer revolution”.

http://dx.doi.org/10.1016/j.jjie.2017.01.001
0889-1583/© 2017 Elsevier Inc. All rights reserved.

are the current efforts in the field of humanoid robotics, initiated by researchers at Waseda University who created Wabot 1 (1973), as well as other famous examples such as ASIMO (Honda), Wakamaru (Mitsubishi), or Pepper (Aldebaran Robotics and Softbank). Data also show the importance of robotics diffusion in Japan. In 2013, this country had the second highest robot density in the world (323 units per 10,000 workers), behind the Republic of Korea, and the highest robot density in the automotive industry, with 1520 industrial robots per 10,000 employees (International Federation of Robotics, 2014).4

All these digital devices share a common theoretical and technical background based on binary logic and basic electronic components such as the transistor or the microprocessor.5 Many scholars suggest that this set is not a simple group of incremental innovations but constitutes a “Technological Revolution” or a “technological-discontinuity” (Brynjolfsson and McAfee, 2011) able to “transform profoundly the rest of the economy” and produce a “new economic paradigm” (Perez, 2009).

Indeed, efforts to characterize technologies in the literature suggest that innovations, or interrelated clusters of innovations, are not comparable in their scale and their degree. In this perspective, ICT can be viewed as a major technological set likely to produce large-scale impacts. In support of this vision, the literature on “General Purpose Technology” (GPT) (Bresnahan and Trajtenberg, 1995; Rosenberg and Trajtenberg, 2004) supposes that the most important technologies share the feature of generality. This crucial point was anticipated by Simon (1987), who stresses that the higher the level of generality of a technology, the higher its potential, because there is a very large number of possible applications. In the case of computer technology, the degree of generality is very high because all economic activities include information processing tasks.

This view is reinforced by the current process of digitization of human activities, which makes the amount of information available larger and larger (“Big Data”). Furthermore, digitization is coupled with an impressive improvement of ICT capacities in information processing as well as in terms of transmission, storage and transformation of information (Nordhaus, 2007; Nagy et al., 2011; Koomey et al., 2010). This suggests that the field of application of computer technology is growing, a characteristic that could have notable effects on many economic matters. Among the possible consequences of ICT adoption, an important aspect is its potentially destabilizing impact on employment. Indeed, labor activities do not escape this trend of digitization: in production activities, many tasks consist of manipulating information, and some other tasks can be modeled as information flows. For example, accounting calculation is a task which consists of transferring, storing and transforming information. On the other hand, some physical activities such as assembly, construction or transport can also be represented as information. Material elements, the space in which they are situated, and movements can be defined as mathematical objects. From this perspective, these tasks are susceptible to being carried out by robots running computer programs that represent physical objects, environment and motions as information.

Indeed, in addition to digitization and growing computing capacity, significant efforts have been made in computer science, notably in machine learning, to improve data processing. We could point to the example of “deep learning”, which made possible breakthroughs in many fields such as image and speech recognition or natural language understanding (LeCun et al., 2015). A spectacular example of this improvement is the victory of the AlphaGo program designed by Google-Deepmind over the Go master Lee Sedol in March 2016. These new techniques constitute a major qualitative leap opening new economic applications.

At the same time, there are important developments in robotics that benefit from the machine learning advances. We refer, for instance, to mobile robots in use or at an advanced stage of development. The best known example is the autonomous car currently developed by many firms such as Toyota or Mercedes-Benz. Several firms also work on autonomous trucks (see European Truck Platooning Challenge6), while many drones are designed for observation, security, delivery (Rakuten, Amazon, DHL...) or military applications. We also note the appearance of industrial and service machines such as warehouse robots, cleaning robots, gardening robots... This new generation of information technologies offers greatly extended capacities and application fields, with a large number of economic applications. It is the scale of the potential of this new wave which makes the issue of automation particularly important.

With this in mind, it could be expected that a large number of tasks in a wide variety of fields could be given over to computers now and in the near future. This perspective constitutes a major challenge for modern economies. Acknowledging the importance of this topic, our aim in this paper is to investigate the risk of job destructions in Japan due to computer technology.

The prospect that a growing share of tasks will be performed by computer capital has strong implications for several specifically Japanese economic questions, especially those pertaining to the labor market. Among these issues, we can mention the ageing process sustained by gains in longevity, while life expectancy is already one of the highest in the world, and by low immigration flows. Even if this process concerns several countries in Europe (Germany, France) or in Asia (Korea, China), it is important to stress that “Japan has the most rapidly aging population in the world” (IMF, 2013). On the supply side of the labor market, significant consequences of this evolution are the massive reduction of the overall labor force, or a dependency ratio reduction, in the coming years and decades. A possible response to this scenario is to call on the technological solution. Growth of the ICT capital stock could contribute to maintaining the level of production and help in the care of elderly people. On the demand side, we note that the Japanese labor market has experienced the introduction of more flexibility since the 1980s. This situation originates in the labor law reform, in the context of the collapse of the asset-inflated bubble economy (Asao, 2011). The system characterized by job rigidity and wage flexibility (shūshinkoyō) has progressively left space for a growing number of non-regular jobs, which represent roughly twenty million people (Statistics Bureau, 2015). It is important to underline that if computer technology replaces the workforce, it could do so differently according to the type of employment. Indeed, it is possible that computer technology destroys non-regular employment more easily, either because dismissals are facilitated by legal provisions or because computer technology tends to perform tasks typically carried out by non-regular workers. Conversely, if computer and communication tools threaten regular jobs, we can expect the share of non-regular workers in the overall active population to increase significantly, thus profoundly affecting the structure of the Japanese workforce. We also note that the computer revolution could have notable effects on other questions such as the participation of women in the workplace, or could significantly modify the “return to education”.

4 Moreover, Japan not only employs robots but is a major producer, with 127,491 robots produced in 2014, of which 98,882 were exported (Japan Robotics Association, 2015).
5 For this reason, we consider the following expressions as synonyms: “computer technology”, “computer and communication technologies”, “Information Technology (IT)”, “Information and Communication Technologies” (ICT).
6 https://www.eutruckplatooning.com/default.aspx

Turning to methodological issues, in order to assess the potential risk for Japanese workers associated with the diffusion of computer technology, we use Career Matrix data from the Japan Institute for Labour Policy and Training (JILPT), and we build a training sample containing occupations that are without doubt automatable and other occupations that could be considered as non-automatable. With this sample, we estimate a model explaining the probability of computerization by the differential endowment in non-automatable skills. Estimation is achieved by using the “Random Forest” algorithm, which builds several uncorrelated decision trees from bootstrap samples of the training set and averages the results to obtain the final estimation. Then, by using our estimates and data from the Population Census (2010), we evaluate the number of jobs threatened by technology and we study the distribution of the probabilities by gender and type of employment.

Our results suggest that a large share of jobs could be performed by computer systems in the coming years. More precisely, according to our estimates, one half of Japanese jobs are susceptible to being destroyed by computer technology. However, this global result masks differential effects according to the kind of workers, since non-regular workers seem to be more exposed. By contrast, we do not find results differentiated by gender.

It is important to clarify that this work is based on technical criteria, relating the probability of automation by occupation to the differential endowments in non-computerizable skills. However, capital/labor substitution is a multifactorial process including technological determinants such as capacities, quality, and reliability of technologies, but also economic and social drivers. We refer, for instance, to the relative cost of labor and computer capital, acceptance by consumers and producers, or the reaction of public authorities. These economic and social variables have unpredictable future states, making it impossible to have a clear idea of the extent of the future capital/labor substitution.

Since we compute the probability of automation by taking into account only technical determinants, we are particularly careful with the terminology used. This is why we talk of “risk” or “probable” job destructions, “vulnerability” or “job threatened”, to underline that there is no technological determinism. An effective “job destruction” would be the situation where technological, economic and social conditions lead to the execution of all the tasks of a job by ICT capital. Although we build a statistical model able to make predictions, the aim of this study is not to make precise forecasts but rather to define a “technical sensibility”, a degree of exposure to ICT diffusion. This also implies that the objective of this work is not to evaluate directly a possible technological unemployment. A growing participation of computer capital in the production process does not imply an increase in unemployment. This is a possible situation, but a qualitative change of production is also conceivable (see Section 2). The effect on the level of employment is unknown.

The rest of this paper is organized as follows. Section 2 briefly reviews the literature. Section 3 presents the methodology. Section 4 presents the results and Section 5 concludes.

2. Literature review: employment and computer

A vast literature exists on the link between technology and employment, dating at least from classical economists such as Ricardo (1821) or Marx (1867). In this field, a recurrent question concerns the capacity of technology to replace workers in economic activities. The possibility of this capital/labor substitution may be interpreted, from the economic perspective, through the following alternative: compensation theory or “technological unemployment” (Keynes, 1930).

The first approach refers to the fact that there are “compensation mechanisms that are triggered by technical change itself and which can counterbalance the initial labor saving impact of process innovation” (Vivarelli, 2011).7 In this perspective, technology is not a serious threat to the employment level but it could produce a qualitative shift in jobs. Under technological pressure, some tasks are no longer carried out by workers whilst others appear because of new activities. The second view is more pessimistic because it supposes that even if there are some compensation mechanisms, a part of the workforce replaced by technology does not find new jobs and contributes to “technological unemployment”. In this perspective, there is an inequality between the number of jobs destroyed and the number of jobs created.

Recently, this recurrent question has been raised again due to the diffusion of IT, which seems capable of performing an ever-increasing share of tasks traditionally carried out by workers. In the recent literature, a major theoretical and empirical contribution to the understanding of the relationship between computer technology and the evolution of the distribution of occupations was made by Autor et al. (2003). They developed an equilibrium model (“task model”) in which producers use two types of inputs: routine and non-routine labor.8 They suppose that the first kind of labor input is perfectly substitutable by computer capital while the second constitutes a complement. The driving force of the model is the fall in the price of computer capital, which produces a differential evolution of each type of labor. Autor et al. (2003) provide econometric evidence to support their model. A very similar study was written by Maurin and Thesmar (2004), who bring empirical arguments in favor of this hypothesis in France. Goos and Manning (2007) extended the approach by showing that non-routine tasks are mainly located at the top and the bottom of the wage distribution, which can explain the polarization of the labor market observed in the UK. Many other works, such as Goos et al. (2009) or Autor and Dorn (2013), confirm the task-based polarization hypothesis for several economies. Recently, Harrigan et al. (2016) also found evidence of job polarization in France between 1994 and 2007. Their results suggest that the main explanation of this dynamic is technology.

With regard to the Japanese economy, Ikenaga (2009) uses the same framework as Autor et al. (2003) and shows that since the 1990s, the labor input of knowledge-intensive non-routine analytic tasks and low-skill non-routine manual tasks has been growing, whereas, at the same time, the labor input of routine manual tasks has declined. In addition, Ikenaga identifies a complementary relationship between ICT capital and workers who carry out non-routine analytic tasks, and a substitution trend with workers engaged in routine tasks. Ikenaga and Kambayashi (2010) also examine the evolution of the input share of each type of task on the basis of a specific measure of “task intensity” in each occupation. Their analysis suggests that there is a long-term increase in non-routine task input and a long-term decrease in routine tasks between 1960 and 2005. In addition, they found a positive correlation between the average wage in an occupation and the routine cognitive task input, and a negative correlation with routine manual task input.

7 Vivarelli (2011) identifies six types of compensation mechanisms.
8 Each input is an aggregate. Routine tasks are composed of routine cognitive tasks and routine manual tasks. Non-routine tasks can be non-routine analytic tasks, non-routine interactive tasks or non-routine manual tasks.

Recently, Frey and Osborne (2013) have renewed the analysis of the exposure of workers to recent technological diffusion. A starting point of their approach is that the identification between non-routine tasks and low susceptibility to automation is called into question by the improvements of computer tools in the fields of machine learning and mobile robots. Indeed, non-routine tasks can also be carried out by computer capital. A famous example

is the autonomous driverless car, now in development (but operational), which was nonetheless considered a good example of a non-automatable task by Autor et al. (2003) (cited by Brynjolfsson and McAfee, 2011). On the basis of expert opinions and relying on the computer science literature, Frey and Osborne (2013) go beyond the routine/non-routine task distinction and propose that computerization9 has strong limitations in perception and manipulation tasks, creative intelligence tasks, and social intelligence tasks. Using the O*NET and SOC databases (from the US Department of Labor), the authors, helped by a panel of computer scientists, consider 70 occupations which are without doubt automatable or not in the next years. For example, they consider that the cashier occupation is automatable, while the childcare worker occupation is non-automatable. Then, they construct a probabilistic classification model with this training sample. The dependent variable is the probability of computerization and the explanatory variables are nine variables representing the three “bottlenecks”. They use this estimated model to predict the probabilities of automation of the 702 occupations of their database. Then, they evaluate the number and the distribution by sector of jobs now threatened by computer capital. Their results suggest that “47 percent of total US employment is at risk” (considering a probability threshold equal to 0.7).

In the present paper, we aim at implementing a similar analysis for the Japanese labor market by assessing the share of jobs which could be threatened by computer technology over the next few years. This approach is justified by the fact that the results obtained by Frey and Osborne (2013) for the United States are not fully transposable to the case of the Japanese labor market. Indeed, there are some differences in industry and occupational structures between these two countries. Industries do not have the same weight in each economy and occupations are partially different (it follows that the US and Japanese classifications are not the same). We note, for instance, the existence of specifically Japanese occupations such as kimono dressing instructor, sushi chef, pachinko employee, Juku teacher or administrative scrivener, and some differences in occupations which do not have the same valuations of skills (Ikenaga and Kambayashi, 2010).

On the other hand, Japan is a country where the technological issue is particularly interesting. Even if the spread of computer devices is an international dynamic, there are country-specific characteristics. As discussed above, Japan has a high level of technology diffusion and is a major producer of innovations. If computer technology has a significant impact on the structure or on the level of employment, we can expect that it will be particularly important in this country, in particular due to the development and the dissemination of robots.

3. Methodology

3.1. Data

The first step of our analysis is to constitute a training sample with which we build our prediction model. The purpose is to select occupations which, in the current state, can undoubtedly be performed or not by computer capital. We also need some information about the levels of skills required to perform each occupation.

For that purpose, we use data from Career Matrix, a database created by the JILPT containing 499 occupations based on the Classification of Occupations for Employment Services (ESCO). It contains 35 variables that describe the level of different skills necessary for each occupation. These values are defined on the basis of surveys and range between 0 and 5, 0 being the lowest value and 5 the maximum value. Firstly, we select 60 occupations similar or very close to the Frey and Osborne (2013) selection (which is itself based on computer science expert opinions). In order to expand our sample and ensure robust estimation, we add other automatable occupations identified on an empirical basis. This means that they are now subject to an automation process or that a technology at an advanced stage of development is likely to replace workers in these occupations. For example, we include in this sample the occupation “Train Driver” because there are already automatic trains running in Japan. Finally, the training sample considered in this study contains 69 occupations, representing roughly 14% of those of Career Matrix (Tables 3 and 4).

The second step in the constitution of our training sample is to select explanatory variables that determine the probability of automation. The baseline model supposes there are three limitations to automation: perception and manipulation tasks, creative intelligence tasks, and social intelligence tasks. Unfortunately, Career Matrix does not contain variables representing perception and manipulation tasks or creative intelligence tasks.

In order to overcome this problem, we create appropriate dummy variables for the two categories without data. Precisely, we define a dummy variable “manual dexterity” to reflect perception and manipulation tasks and two others to represent creative intelligence tasks: “Fine Arts” and “Originality”. We prefer to create dummies, and only three variables, to reduce the risk of subjectively assigning values between 0 and 5 for the other variables. For example, the occupation of surgeon takes a value of 1 to signify that it requires a high level of manual dexterity, while the occupation of lawyer takes a value of 0 to signify the inverse. We consider that this approach is a convenient tradeoff between accounting for important variables and the risk of subjective assignment.

To represent social intelligence tasks, we select, in Career Matrix, the following five variables: “Coordination”, “Persuasion”, “Negotiation”, “Instructing” and “Service orientation”. Lastly, by examining the variables contained in Career Matrix, we select another variable that seems impossible to automate for the moment: “Judgment and decision making”. We call this task: Appreciation tasks. We present all data in Table 5.

3.2. Random forest

We aim to compute the probability of automation of occupations on the basis of the different endowments in non-computerisable task inputs by using the methodology of Frey and Osborne (2013): (i) we consider a training sample of occupations which can be considered as clearly automatable or not, (ii) we build a model with this subsample, and (iii) we finally predict the probability of automation of all occupations. In this perspective, we have constituted a training sample as described in Section 3.1. For the estimation, we rely on the “Random Forest” (RF) algorithm10, a very useful method from the machine learning framework.

9 Frey and Osborne (2013) define computerization as follows: “We refer to computerisation as job automation by means of computer-controlled equipment”. To avoid unpleasant repetitions, we also use the usual term of “automation”.
10 For estimation, we use R Package randomForest version 4.6–10 (Liaw and Wiener, 2002). We consider a forest containing 100 trees. We do not need more iterations because the study of the execution of the algorithm shows that the Mean Squared Error (MSE) stabilizes around 40 trees. We performed additional estimations by augmenting the number of trees up to 1000. The results remain similar, illustrating the robustness of our findings.

The use of RF instead of a standard method requires explanation. In this kind of situation, when we consider a binary dependent variable, it is usual in applied economics to use logistic regression. However, in this case, for technical reasons, this seems impossible. Indeed, as suggested by Peduzzi et al. (1996), it is necessary to have at least 10 Events Per explanatory Variable (EPV)

in order to perform logistic regression. Below this threshold, there are important risks of non-convergence, loss of accuracy, or non-normality of the regression coefficients. In our case, we have 69 observations in the learning sample, including 33 events (i.e. cases where the probability of automation is equal to one), and 10 explanatory variables (EPV = 3.3). Another potential difficulty is the correlation between several explanatory variables. For example, the persuasion and negotiation variables are highly correlated (r = 0.8). Faced with these problems, it seems difficult to expand the size of the learning sample or to remove some variables on a purely subjective basis.

Fig. 1. RF Algorithm.
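As a quick check of the events-per-variable argument, the ratio can be recomputed from the figures given in the text (33 events, 10 explanatory variables, and the threshold of 10 attributed to Peduzzi et al., 1996). The function and constant names below are purely illustrative and are not taken from the paper.

```python
# Illustrative check of the events-per-variable (EPV) rule of thumb.
# The numbers come from the text; all names are hypothetical.

def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Number of events available per explanatory variable."""
    return n_events / n_predictors

N_EVENTS = 33        # automatable occupations in the learning sample
N_PREDICTORS = 10    # explanatory variables
EPV_THRESHOLD = 10   # minimum EPV suggested by Peduzzi et al. (1996)

epv = events_per_variable(N_EVENTS, N_PREDICTORS)
meets_rule = epv >= EPV_THRESHOLD
events_needed = EPV_THRESHOLD * N_PREDICTORS  # events required to reach the threshold

print(f"EPV = {epv:.1f}, rule satisfied: {meets_rule}, events needed: {events_needed}")
# prints "EPV = 3.3, rule satisfied: False, events needed: 100"
```

Since the learning sample contains only 69 occupations in total, the 100 events that the rule would require cannot be reached with ten variables, which is consistent with the decision not to rely on logistic regression.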
Initially, for simplicity and to circumvent these technical problems, we were interested in using regression trees11,12, specifically the CART algorithm (Breiman et al., 1984), a simple nonparametric method consisting in splitting the space of predictors into different regions and modeling the response of the dependent variable as a constant in each region.13,14 The main advantage of this approach is that no assumption on the functional form linking the variables is required (which allows non-linearity to be taken into account). This characteristic is important because no functional form is assumed for our model, neither standard normal (probit model) nor logistic (logistic model). In our case, the use of a regression tree avoids the risk of misspecification. Moreover, it produces easily readable results and it can avoid the multicollinearity problem between the variables in Career Matrix.

However, regression trees, although a very compelling method, suffer from a major drawback: instability (Breiman, 1996). If there is a small modification of the dataset, results will be altered substantially, which is a key issue in terms of the robustness of results. To overcome this major drawback, we rely here on another approach. Specifically, in order to stabilize trees and improve the accuracy of estimation, various methods have been developed, such as “bagging”15 (Breiman, 1996) and “Random Forest” (Breiman, 2001a). We use the latter approach because it is a simple method that has interesting theoretical properties.

RF has three main components: tree ➀, randomization ➁ and bagging ➂. The first step is to consider a bootstrap sample b of the training data (one third of the sample is left out) and construct a tree ➀. The difference with CART is that at each node p, we randomly select several variables ➁ and we split the node into two child nodes by considering the variable k and the split point s that produce the best binary partition in terms of minimization of the residual sum of squares. Formally, the aim is to minimize the sum of squared errors from the two regions ($R_{p+1}$ and $R_{p+2}$):

\[
\min_{k,s} \left[ \min_{c_1} \sum_{x_i \in R_{p+1}(k,s)} (y_i - c_1)^2 + \min_{c_2} \sum_{x_i \in R_{p+2}(k,s)} (y_i - c_2)^2 \right] \tag{1}
\]

The predicted value for an individual $y_j$ is equal to $\hat{y}_j = \frac{\sum_{b=1}^{B} \hat{y}_{jb}}{B}$. It is important to stress that, unlike CART, there is no “pruning” of the estimated tree.16 At each step, the tree is constructed until it reaches its maximum size defined in advance. This procedure is summarized in Fig. 1.

RF has several advantages, in addition to those generally associated with CART. First, using this method, we lose the possibility of summarizing the estimation in a simple tree, because with each bootstrap sample we estimate a new tree17, but we eliminate the main problem of the instability of the results. Second, RF benefits from the randomization process because it allows the construction of “de-correlated trees”, which in turn reduces the variance of the estimated model. Third, RF is a powerful method which performs well when the sample has limited size, and even if the number of observations is smaller than the number of predictor variables.18 Lastly, RF also makes it possible to compute a value which provides information on the importance of the variables in the estimation, by using the “Mean Decrease in Accuracy” (MDA) criterion. For that purpose, we consider the Out-Of-Bag (OOB) sample, i.e. data that are not used for constructing tree b, and proceed as follows:

• Compute the error of tree b on its OOB sample.
• Randomly permute the values of variable k from the training sample and compute the new OOB error.
• Average the resulting changes in OOB error over all trees to obtain the MDA (the value is normalized by the standard deviation).

The underlying idea of this algorithm is to detect whether the permutation causes a decrease in the accuracy of the model. If a variable is not important, the decrease will be weak, while it will be significant if a variable is important.

4. Results

4.1. General results: classification of occupations
xi ∈R p+1 (k,s ) xi ∈R p+2 (k,s ) Before commenting and studying the probabilities of comput-
erization and their economic implications, it is necessary to evalu-
The best “response” in each m region (cm ) is equal to the aver-
ate the accuracy of our model. A simple approach is to check how
age value of the dependent variable. Since it is a regression tree,
many predictions of the adjusted model are correct by looking at
it is recommended that the number of k randomly selected vari-
the confusion matrix (not reported here). If we consider a standard
ables is equal to the following floor function:  K3  (Hastie et al.,
threshold of 0.5 for assigning a class to each occupation, the esti-
2009) where K denotes the number of explanatory variables. The
mated model achieves accurate prediction in approximately 88% of
third step is to repeat this procedure on B bootstrap samples ➂
cases.
and average the results of prediction to obtain the final estimation.
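The three RF components described in Section 3.2 (a tree grown by the split search of Eq. (1), random selection of $\lfloor K/3 \rfloor$ candidate variables at each node, and averaging over bootstrap samples) can be illustrated with a minimal pure-Python sketch. It is not the implementation used in the paper: the function names, the `min_size` stopping rule and the default of 50 trees are assumptions made for illustration only.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (the inner minimum of Eq. (1))."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, y, features):
    """Search the variable k and split point s minimising the two-region SSE."""
    best = None  # (sse, k, s)
    for k in features:
        for s in sorted({row[k] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[k] < s]
            right = [yi for row, yi in zip(X, y) if row[k] >= s]
            if not left or not right:
                continue
            total = sse(left) + sse(right)
            if best is None or total < best[0]:
                best = (total, k, s)
    return best


def grow_tree(X, y, n_try, rng, min_size=2):
    """Grow an unpruned regression tree; min_size is an assumed stopping rule."""
    if len(y) < 2 * min_size or len(set(y)) == 1:
        return mean(y)  # leaf: average response c_m of the region
    features = rng.sample(range(len(X[0])), n_try)  # randomization step
    split = best_split(X, y, features)
    if split is None:
        return mean(y)
    _, k, s = split
    left = [(r, v) for r, v in zip(X, y) if r[k] < s]
    right = [(r, v) for r, v in zip(X, y) if r[k] >= s]
    return (k, s,
            grow_tree([r for r, _ in left], [v for _, v in left], n_try, rng, min_size),
            grow_tree([r for r, _ in right], [v for _, v in right], n_try, rng, min_size))


def predict_tree(tree, row):
    """Follow the splits down to a leaf and return its constant."""
    while isinstance(tree, tuple):
        k, s, left, right = tree
        tree = left if row[k] < s else right
    return tree


def random_forest(X, y, n_trees=50, seed=0):
    """Bagging step: one tree per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n, n_vars = len(X), len(X[0])
    n_try = max(1, n_vars // 3)  # floor(K/3) candidate variables per node
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Average of the B tree predictions for one observation."""
    return mean(predict_tree(tree, row) for tree in forest)
```

With a 0/1 dependent variable, as in the training sample, `predict_forest` returns a value between 0 and 1 that can be read as a probability of automation.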
¹¹ We give more details about regression trees in the appendix.
¹² Given the particular structure of the data (y = 1 or y = 0), we could use a classification tree, but we want a more nuanced view. By using a regression tree instead, we can get all values between 0 and 1.
¹³ Varian (2014) defends the idea that machine learning methods could be useful in applied economics: “Machine learning techniques such as decision trees, support vector machines, neural nets, deep learning and so on may allow for more effective ways to model complex relationships.”
¹⁴ CART is part of the “Top 10 algorithms in data mining” (Wu et al., 2008).
¹⁵ Bagging refers to “Bootstrap AGGregatING”.
¹⁶ In CART, “Weakest Link Pruning” is used.
¹⁷ The major drawback of RF is that it is a “black box” algorithm, like other machine learning methods (Breiman, 2001b). With these methods, we can obtain very interesting results, but they are limited in terms of interpretation. Indeed, they cannot establish precisely the link between the predictors and the dependent variable as stochastic data models do. For example, it is not possible to get marginal effects. However, this problem is alleviated by the possibility of establishing a ranking of the regressors.
¹⁸ See examples in Verikas et al. (2011).

A more efficient measure of accuracy in classification problems is the Area Under the Curve (AUC), which corresponds to the area situated under the Receiver Operating Characteristic (ROC) curve. This tool makes it possible to situate the quality of a classifier between two references, a perfect and a random classifier, which have an AUC value of 1 and 0.5 respectively. The model considered in our analysis has an AUC value of 0.955, suggesting a very good level of performance that allows us to have relative confidence in the computed probabilities.¹⁹ To assess the statistical significance of these findings, we have calculated the associated 95% confidence intervals. As shown in Table 1, these intervals are narrow around the predicted values, confirming the robustness of our results.²⁰

Application of the RF algorithm yields a probability of automation for all occupations considered. Given that there are 499 occupations, we cannot comment precisely on all the results, but we can give an overview²¹ of the findings by selecting the top ten, bottom ten and some intermediate occupations (Table 1). As shown, the top 10 contains only occupations that require a high level of creativity, manual dexterity or social intelligence. This is reflected in the variable importance measure, which highlights the significant weight of the variables “originality”, “manual dexterity”, “instructing” and “negotiation” in the estimation (Table 6 in appendix).

Table 1
Overview of results.

Occupation | Probability | Confidence interval at 95%
Top 10 occupations
Speech therapist | 0.01378 | [0.01362, 0.01395]
Lawyer | 0.01872 | [0.01826, 0.01917]
Stylist | 0.02108 | [0.02076, 0.02141]
Classical musician | 0.02531 | [0.02369, 0.02692]
Theatre decorator | 0.02531 | [0.02369, 0.02692]
Stage director | 0.03129 | [0.03113, 0.03146]
High school teacher | 0.03275 | [0.03269, 0.03281]
Vocational school teacher | 0.03275 | [0.03269, 0.03281]
Make-up artist | 0.03549 | [0.03091, 0.04007]
Radio director | 0.03806 | [0.03784, 0.03827]
Some intermediate occupations
Dentist | 0.11824 | [0.10987, 0.12661]
Tourist guide | 0.20046 | [0.18293, 0.21799]
Programmer | 0.36540 | [0.36107, 0.36973]
Radiology technologist | 0.42377 | [0.33570, 0.51184]
Nutritionist | 0.50981 | [0.42638, 0.59325]
Cargo handler | 0.62812 | [0.58732, 0.66893]
Car assembler | 0.69384 | [0.65294, 0.73474]
Postal clerk | 0.78250 | [0.76832, 0.79668]
Train driver | 0.87407 | [0.86635, 0.88179]
Taxi driver | 0.95413 | [0.94949, 0.95877]
Bottom 10 occupations
Packing worker | 0.97200 | [0.97105, 0.97295]
Truck driver | 0.97200 | [0.97105, 0.97295]
Hotel worker | 0.97428 | [0.97201, 0.97654]
Tourist bus driver | 0.97428 | [0.97201, 0.97654]
Road patrol worker | 0.97428 | [0.97201, 0.97654]
Computer-assisted-design operator | 0.98173 | [0.98073, 0.98272]
Data entry keyer | 0.98173 | [0.98073, 0.98272]
Industrial waste collection and transportation worker | 0.98173 | [0.98073, 0.98272]
Mail deliverer | 0.98173 | [0.98073, 0.98272]
Computerized typesetting operator | 0.98173 | [0.98073, 0.98272]

Logically, occupations in the bottom ten are poorly endowed with non-computerizable skills. Some of the results are plausible, with, for example, the probable disappearance of the mail deliverer or cashier occupations. On the contrary, some occupations such as model or truck driver hardly appear risky at first sight because no replacement process has begun so far.

4.2. Detailed results

Beyond the results by occupation, it is very interesting to draw lessons at different economic levels. Our aim is to study the number, the distribution and the properties of the jobs which could be destroyed by computer devices in the next years. The drawback is that Career Matrix contains data on skills by occupation but no information on the number and the characteristics of the workers who perform each occupation. To alleviate this problem, we rely on the data of the last Population Census (2010), which provides detailed information on the number, the type of employment and the gender of the people who perform each job.

Our strategy is to construct a new data set combining relevant information from these two sources. Since the Population Census considers only 232 occupations, we assign all Career Matrix occupations to Population Census categories and average the values of all variables. In some cases, this approach is quite simple because occupations are similar (pharmacists, architectural engineers, childcare workers...), which leads to no loss of information. However, in other cases, we must insert several occupations from Career Matrix into a single Population Census category. For example, in the category “Motor vehicle drivers”, we include taxi drivers, truck drivers, tourist bus drivers and bus drivers. In a few cases, we do not have any Career Matrix occupation to introduce. Our solution is to compute the mean of all variables from the occupations included in the same occupation group. For example, for “house cleaning workers”, we average the values of the variables of building cleaning workers, waste treatment workers and other cleaning workers. Lastly, when there is no possible correspondence, we drop the corresponding occupation. We also put aside the category “Workers not classifiable by occupation” because its components are too heterogeneous.²² Finally, our new dataset covers 88% of the employed population. By predicting probabilities of automation with this new sample, we get (i) the total number of jobs which are threatened by current technological advances, and (ii) a detailed view by type of employment and by gender.

Table 2
Share of employment by level of risk (%).

Level of risk | Total | Men | Women | Regular jobs | Non-regular jobs
High risk | 55.611 | 55.827 | 55.018 | 56.147 | 57.518
Middle risk | 25.413 | 24.643 | 26.424 | 21.572 | 36.464
Low risk | 18.977 | 19.530 | 18.559 | 22.281 | 6.018

Frey and Osborne (2013) present their results by decomposing total jobs into three categories: high, middle and low risk of automation. These groups are delimited by two thresholds equal to 0.7 and 0.3 and serve as the basis for determining the number of jobs currently threatened by computer technology (they estimate that 47% of US employment is at risk). If we consider the same thresholds, our results suggest that approximately 55% of employment in Japan is highly susceptible to being replaced by technological equipment, namely 8 points higher than in the US (see Table 2). Our predictions also indicate that roughly 25% of jobs are in the intermediate category and 19% can be considered as non-automatable in the next years. These estimates are consistent with the US results, but the share of at-risk jobs is higher. The difference in the estimated share of automatable occupations can be explained by several factors. Firstly, it is possible that this result comes from methodological issues. Indeed, the aggregation procedure for combining Career Matrix and data from the Population Census could lead to a loss of information, causing an underestimation of the number of jobs susceptible to be computerized. For example, some Population Census categories contain a large number of workers, such as “Other general clerical workers” or “Comprehensive clerical workers”. If we had data at a more detailed level, some of the occupations included in these groups might be considered non-computerizable, which could lower the share of employment threatened. A second type of explanation is linked to the specificities of the Japanese labor market, such as the presence of differences in skills for some occupations and differences in occupational structures. Lastly, we can note that, also dealing with US data, the estimations by Pajarinen and Rouvinen (2014) give a value of 49.2%, thus reducing the difference observed between the two countries.

Beyond the determination of the share of jobs threatened by computer technology, we can also draw lessons at a more detailed level (Table 2, Fig. 2). As shown, the vulnerability to technological pressure is roughly similar between men and women, since their respective curves are both quite close to the total curve. However, we identify a significant difference according to the type of employment. Indeed, we find evidence that the type of employment determines the exposure to ICT capital. Non-regular jobs (temporary and part-time employment)²³ appear to be more vulnerable to computer technology at any level of probability (Fig. 2).

4.3. Comments

Although this study has some methodological limitations, mainly related to the lack of some variables and to the aggregation procedure,²⁴ we can draw several interesting conclusions. Our findings suggest that the diffusion of computer technology could have an important effect on employment in the next years and decades. This is because a growing number of economic tasks could technically be performed by computer devices (they become feasible at admissible costs). From this point of view, it is expected that some occupations and jobs in Japan (and in other industrialized countries) will disappear in the short or medium term. Although the exact share of jobs concerned is difficult to evaluate, our analysis shows that this share is around one half of total employment. Our estimate is slightly higher than the recent analysis by NRI (2015), which estimates the share of jobs at risk at 49%.

These estimates are based on a technological background, and the effective realization of these predictions will ultimately depend on many technological, economic and cultural factors. A first determinant is the technological advance in terms of capacities, quality and cost offered by future technology, which will create investment opportunities for producers. It is indeed on this technological basis and its cost (relative to labor cost) that their choices will be made.

Table 3
Training sample: automatable occupations.

Frey and Osborne (2013) based selection
Communication equipment assembler and repairer | Human resources clerk | Newspaper deliverer
Camera assembler | Parking lot attendant | Deliverer
Data entry keyer | Tractor operator | Taxi driver
Technical writer | Cook | Bus driver
Surveyor | Maid | Truck driver
PC assembler | Seamstress | Tour bus driver
Machine assembler | Cashier | Certified public accountants
Sheet metal worker | Meter reader |

Empirical selection | Technology | Examples
Toll road worker | Self-checkout | Open-source self-check (Google)
Fighter pilot | Drone | Northrop Grumman RQ-4 Global Hawk
Train driver | Automatic train | New Transit Yurikamome
Model | Humanoid robot | HRP-4C, Geminoid
Taxi dispatcher | Computer reservations systems (CRS) | Axess
Building cleaning worker | Robotic vacuum cleaner | Roomba, 360 Eye (Dyson)
Warehouse worker | Warehouse mobile robot | Kiva Systems (Amazon)
Secretary | Office automation | emails, word processor softwares...
Train cleaning worker | Robotic vacuum cleaner | Roomba, 360 Eye (Dyson)

Table 4
Training sample (non-automatable occupations).

Professional golfer | Hairdresser
Professional football player | Prosecutor
Professional baseball player | Judge
Jockey | Landscape architects
Professional cyclists | Lawyers
Sumo wrestler | Child counsellor
Sushi cook | Surgeon
Chefs and head cooks | Obstetrician, gynaecologist
Childcare workers | Pediatrician
Civil engineers | Physician
Clergy | Physicists
Concierges | Plumbers
Dentists | Preschool teacher
Economist | Nurses
Electrical engineers | School nurses
Fashion designers | Transport managers
Flight attendants | Waiters, waitresses
Nail artist | Zoologist
Makeup artist |

¹⁹ Note that the Frey and Osborne (2013) model has an AUC equal to 0.894.
²⁰ Confidence intervals are computed by using the “Infinitesimal Jackknife” adapted to Random Forest (Wager et al., 2014).
²¹ All results (estimations of automation probabilities and confidence intervals) are available upon request to the author.
²² It represents approximately 6% of the employed population.
²³ The data we consider are from the Population Census conducted by the Statistics Bureau of the Ministry of Internal Affairs and Communications. The classification used contains six groups whose main categories are regular, temporary and part-time workers. We have focused on these three sub-groups because (i) they contain the largest number of workers, being thus the most representative, and (ii) data for other categories are not always available. Temporary workers correspond to dispatched workers from temporary labour agencies, and the “part-timers” category contains “Part-time workers” and “Arbeit (temporary workers)”, contract employees and entrusted employees. Our main distinction of interest being between regular and non-regular workers, we can refer to Asao (2011) for a clear definition: a “regular employee is generally considered as an employee who is hired directly by his/her employer without a predetermined period of employment, and works for scheduled hours. In other words, it can be summarized as open-ended, fulltime, direct employment” while a non-regular worker is “an employee who does not meet one of the conditions for regular employment.”
²⁴ We have also used the median instead of the mean in the aggregation procedure. We obtain similar results.
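The evaluation and aggregation steps of Section 4 (accuracy at the 0.5 threshold, AUC, and the employment shares obtained with the 0.3 and 0.7 thresholds) can be sketched as follows. This is only an illustration: the function names are hypothetical, the AUC is computed by direct pairwise comparison rather than from an ROC curve, and the treatment of probabilities exactly equal to a threshold is an assumption, since the text does not specify it.

```python
def accuracy(probs, labels, threshold=0.5):
    """Share of occupations whose thresholded prediction matches the 0/1 label."""
    hits = sum((p >= threshold) == bool(t) for p, t in zip(probs, labels))
    return hits / len(labels)


def auc(probs, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for p, t in zip(probs, labels) if t == 1]
    neg = [p for p, t in zip(probs, labels) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def risk_level(p, low=0.3, high=0.7):
    """Risk category; values exactly on a threshold go to 'middle' (assumption)."""
    if p > high:
        return "high"
    if p < low:
        return "low"
    return "middle"


def employment_shares(probs, employment):
    """Percentage of total employment falling in each risk category."""
    totals = {"high": 0.0, "middle": 0.0, "low": 0.0}
    for p, n in zip(probs, employment):
        totals[risk_level(p)] += n
    total = sum(employment)
    return {level: 100.0 * n / total for level, n in totals.items()}


# Hypothetical values for four census categories (not taken from the paper).
example_probs = [0.02, 0.37, 0.63, 0.95]
example_labels = [0, 0, 1, 1]
example_employment = [100, 300, 200, 400]
example_shares = employment_shares(example_probs, example_employment)
```

Weighting each category by its number of workers is what turns occupation-level probabilities into shares of employment of the kind reported in Table 2.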
Table 5
Variables description.

Tasks | Variable | Definition | Definition source | Data source
Perception and manipulation | Manual dexterity | The ability to make very precise manipulation with hands | Author | Author
Creative intelligence | Fine arts | Knowledge of theory and techniques required to compose, produce, and perform works of music, dance, visual arts, drama, and sculpture | FO (2013) | Author
Creative intelligence | Originality | The ability to come up with unusual or clever ideas about a given topic or situation, or to develop creative ways to solve a problem | FO (2013) | Author
Social intelligence | Social perceptiveness | Being aware of others’ reactions and understanding why they react as they do | Ikenaga and Kambayashi (2010) | Career matrix
Social intelligence | Coordination | Adjusting actions in relation to others’ actions | Ikenaga and Kambayashi (2010) | Career matrix
Social intelligence | Persuasion | Persuading others to change their minds or behavior | Ikenaga and Kambayashi (2010) | Career matrix
Social intelligence | Negotiation | Bringing others together and trying to reconcile differences | Ikenaga and Kambayashi (2010) | Career matrix
Social intelligence | Instructing | Teaching others how to do something | Ikenaga and Kambayashi (2010) | Career matrix
Social intelligence | Service orientation | Actively looking for ways to help people | Ikenaga and Kambayashi (2010) | Career matrix
Appreciation | Judgment and decision making | Considering the relative costs and benefits of potential actions to choose the most appropriate one | Ikenaga and Kambayashi (2010) | Career matrix

Table 6
Variable importance.

Rank | Variables | Type of tasks | % increase in MSE
1 | Originality | Creative intelligence | 8.06676
2 | Manual dexterity | Perception and manipulation | 7.01464
3 | Instructing | Social intelligence | 5.41735
4 | Negotiation | Social intelligence | 4.56105
5 | Persuasion | Social intelligence | 3.09784
6 | Judgment and decision making | Appreciation | 2.18357
7 | Fine arts | Creative intelligence | 1.04819
8 | Social perceptiveness | Social intelligence | 0.96448
9 | Coordination | Social intelligence | 0.42546
10 | Service orientation | Social intelligence | −2.58361
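The idea behind an importance measure expressed as an increase in MSE can be illustrated with a simplified permutation test. The sketch below is an assumption-laden simplification of the MDA procedure of Section 3.2: it uses a single fitted predictor and one evaluation sample instead of per-tree out-of-bag samples, it does not normalize by the standard deviation, and the names `mse` and `permutation_importance` are hypothetical.

```python
import random


def mse(predict, X, y):
    """Mean squared error of a prediction function on the sample (X, y)."""
    return sum((predict(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, k, n_repeats=20, seed=0):
    """Average increase in MSE after randomly shuffling column k of X."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        column = [row[k] for row in X]
        rng.shuffle(column)  # break the link between variable k and y
        shuffled = [row[:k] + [v] + row[k + 1:] for row, v in zip(X, column)]
        increases.append(mse(predict, shuffled, y) - baseline)
    return sum(increases) / n_repeats
```

A variable the predictor does not use yields an increase of zero, while a variable that drives the predictions yields a clearly positive value; small negative values, as for the last row of Table 6, can arise by chance when a variable carries little information.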

Another economic determinant of the extent of future automation is the specialization of the Japanese economy since, in the context of globalization, the quality of domestic production conditions the type of tasks carried out and thus the type of labor needed in a country. Moreover, some firms are engaged in an internationalization process by participating in global value chains (GVC). This refers to a situation where production is segmented and spread among different firms and countries (see for example the case of the electronics industry (Sturgeon and Kawakami, 2010; OECD, 2013)). Japanese firms are involved in this kind of strategy, for instance in the automotive industry, where Toyota and Mazda offshore a part of their production to Thailand and China (Taglioni and Winkler, 2014).

This trend in international trade has important implications for the issue of automation. Indeed, the specialization of an economy and the position of local firms in global value chains determine the nature of the occupations and jobs needed locally for production activities. This point could explain the cross-country differences measured in the risk of automation (beyond methodological issues) and could condition the future impact.

For now, Japanese firms engaged in GVC are in a leading position, keeping the design and development activities while they offshore the low-return activities to low-income countries. This position protects local jobs because they are those which mainly need some non-computerizable skills. Our estimates take into account the current specialization of the Japanese economy, but the future of capital/labor substitution will depend on future insertion in international trade.

Furthermore, an important element will be the institutional and social “response” to this technological wave. In this respect, the attitude of public authorities will play a key role by promoting or not technological diffusion. The observation of the past and recent situation suggests that they will act in favor of the development and adoption of ICT. Indeed, historically, the Japanese government has supported the development of computer and robotic technologies since their beginning through the Ministry of Internal Affairs and Communications (MIC), the Ministry of Economy, Trade, and Industry (METI) or the New Energy and Industrial Technology Development Organization (NEDO), and continues to actively promote innovation in these fields (Kitano, 2005; Lechevalier et al., 2010; Murata, 2010). One example of this continuity is the call, in May 2015, for a “robot revolution” by Prime Minister Abe. These efforts are also made by firms and universities and are likely to be continued due to ageing and because ICT constitutes a challenge for competitiveness in a context of regional competition with China and the Republic of Korea.
Fig. 2. Detailed results. [Three panels plotting the share of total employment (%) against the probability of automation (0.0 to 1.0): total; men and women; regular and non-regular workers.]

On the other hand, the availability of a technology is obviously not a sufficient condition for its adoption. It is important to emphasize that a crucial determinant of future automation is the social acceptance of technology by producers and consumers. For instance, it must be noted that a large share of the jobs which could probably be automated in the future will be automated through robots, with an important share of humanoid robots. We further observe that there are already experiments with the introduction of these machines in shops, such as the Pepper robot in the Omotesando Softbank shop or the Toshiba robot (Chihira Aico) in the Mitsukoshi department store. A key determinant of this kind of computerization will be the public reception. Some scholars suggest that Japanese people are more willing to accept these new technologies than western people, which might encourage automation. For example, Kaplan (2004)²⁵ suggests that this better acceptance is due to the fact that, from a western point of view, there is a clear distinction between the natural and the artificial, which generates a feeling of strangeness or aversion about this kind of robot. Conversely, in Japanese culture, there is no such distinction but rather a representation which emphasises a network of beings. This type of analysis suggests that Japanese producers and consumers may display a more positive attitude about robots and, more broadly, toward information technologies, which may foster job destructions.

Another very important clarification to make about these results is that these possible destructions of jobs are not equal to future unemployment. Indeed, as pointed out in Section 2, the development of new technologies also supports the creation of jobs through several compensation mechanisms. Jobs could be created in the new activities producing ICT (computers, software, robots...) and in other ICT-using firms, because these technologies need specific workers, coined “techies” by Harrigan et al. (2016). These workers “mediate the adoption of new technology within firms: they are the ones who plan, purchase, and install new technology, and who train and support other workers in the use of new technology”. Thus, they are complementary to the adoption of computer tools, suggesting that their number would increase with ICT diffusion.

In addition to this complementarity, we must take into account a complementarity in use (i.e., when production requires both workers and capital). Indeed, the topic of automation can be viewed through the substitutability-complementarity alternative, as in the seminal paper of Autor et al. (2003). They consider that some tasks are substitutable and others are complementary to labor. In the introduction, we mentioned that Frey and Osborne (2013) identify an expansion of the share of tasks where computers could challenge human workers and some “bottlenecks” to automation. In this perspective, the number of workers subject to “limited substitution” (Bresnahan et al., 2002) is decreasing and the risk of job destructions and technological unemployment is strong.

However, it is possible that technological pressure could lead to a modification of the content of the tasks specific to each occupation and job by refocusing them on non-programmable tasks. The consequence would not be job destructions, but a redefinition of production activities around new complementarities susceptible to improve the quality and/or productivity of labor. Another possibility is that the social response to technology would be only partially positive. In this case, some tasks will be automated because of their arduousness and their social depreciation, but other, more desirable and recognized jobs would continue to be the responsibility of workers even if they are technically replaceable. This kind of complementarity will be dictated more by the level of acceptance of computer tools than by technical criteria.

The existence or importance of technological unemployment will depend on the extent of these unpredictable dynamics. One possibility is that this technological diffusion produces no unemployment, but a deep restructuring of the production apparatus. Another positive point is specific to the Japanese situation and is linked to the decline in the population and its consequence in terms of reduced labor participation. This dynamic might offset the potential negative impact of computer technology because it is possible that computer capital will replace the individuals who enter retirement instead of the available workers. The convergence of these two phenomena might create a historical opportunity to avoid the risk of technological unemployment. On the contrary, a more pessimistic view could be supported by the fact that ICT tools seem able to perform a very wide range of tasks. The range of application seems to be unprecedented, and it appears that technological advances are continually breaking down the barriers to capital/labor substitution. This advance could cause serious problems by putting forward a clear risk of technological unemployment in the short run during a transition period, but also in the long run if the compensation mechanisms are insufficient.

5. Conclusion

The aim of this paper is to assess the number of jobs which are at risk of being performed by computer technology in Japan in the near future. Relying on a machine learning method, our findings suggest that 55% of current jobs in Japan could be considered to be at risk. We also found that non-regular workers are more threatened than regular workers.

These results are based on a technical background, and other economic and social determinants are necessary to draw a clear picture of future automation, such as the relative cost of labor and computer capital or the social acceptance by producers and consumers. Overall, the current development in computer technology, the decline in the working population, and the support for ICT development by public authorities, firms and universities suggest that computer technology could strongly reshape the Japanese labor market and production activities.

Appendix

Regression tree with CART and Random Forest

Our aim is to approximate an output vector of observations y by a function of the regressors $x = (x_1, \ldots, x_K)$, so that $y \approx f(x)$. The regression tree²⁶ ²⁷ is a statistical method developed by Breiman et al. (1984) which models the response of y by partitioning “the space of all joint predictor variable values into disjoint regions” (Hastie et al., 2009).²⁸

Formally, a regression tree has the following expression:

\[
f(x) = \sum_{m=1}^{M} c_m \, I(x \in R_m) \tag{2}
\]

The response of the output variable is a constant ($c_m$) corresponding to the average value of y in each region ($R_m$).

Trees have a very clear graphical representation facilitating the interpretation of results. We consider a simple example where an output variable y is regressed on two predictors $x_1$ and $x_2$. The following tree represents the estimated model:

[Tree diagram: a first split on $x_1 < 10$, a second split on $x_2 < 2$, and three leaves $R_3$, $R_4$, $R_5$.]

This tree has two internal nodes corresponding to the $R_1$ and $R_2$ regions and three terminal nodes (or leaves), which are the $R_3$, $R_4$ and $R_5$ regions. Interpretation of this model is quite simple. If $x_1 < 10$, then the predicted value of y depends on the value of $x_2$: if $x_2 < 2$ then $\hat{y} = c_4 = 1$, else $\hat{y} = c_5 = 6$. If $x_1 > 10$, then the predicted value is $\hat{y} = c_3 = 8$.

To build regression trees within the CART framework, we have to consider sequentially the k regressors (the splitting variables) and the s splitting points to search for the best binary partition of the data in terms of minimization of the residuals. Mathematically, we solve at each node p the following program:

\[
\min_{k,s}\left[\min_{c_1}\sum_{x_i \in R_{p+1}(k,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_{p+2}(k,s)} (y_i - c_2)^2\right]
\]

The procedure continues until a very large tree is constructed. The final step of the algorithm is the “pruning” of this tree by minimizing the “cost complexity criterion”. The aim is to find an appropriate trade-off between accuracy and tree size. If there is no partition, no structure is identified, and if the tree is too large, there is a risk of overfitting. The detail of the pruning procedure is presented in Breiman et al. (1984); for a short explanation see James et al. (2013) or Hastie et al. (2009).

As stressed in Section 3.2, a limitation of decision trees is their potential instability (estimation is sensitive to dataset modification). One way to overcome this drawback is to rely on RF, which is composed of many trees, each of them being based on a bootstrap subsample (b). Each tree is grown with randomly selected predictors (“Randomization”) and the results of all trees are finally aggregated by averaging (“Bagging”). For a given set of regressor values, the predicted value for an individual j is:

\[
\hat{y}_j = \frac{1}{B}\sum_{b=1}^{B} \hat{y}_j^{b}
\]

²⁵ See also Kitano (2005).
²⁶ In the CART framework, there are two types of decision trees: regression and classification trees. We present only the first because the RF used in our work is based only on regression trees.
²⁷ C4.5 is another method for growing decision trees (see Quinlan, 1993).
²⁸ We take the same mathematical notations.

References

Asao, Y., 2011. Overview of Non-Regular Employment in Japan. JILPT Report 10. JILPT.
Autor, D., Dorn, D., 2013. The growth of low-skill service jobs and the polarization of the US labor market. Am. Econ. Rev. 103 (5), 1553–1597. American Economic Association.
Autor, D., Levy, F., Murnane, R., 2003. The skill content of recent technological change: an empirical exploration. Q. J. Econ. 118 (4), 1279–1333.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.
Breiman, L., 2001a. Random forests. Mach. Learn. 45 (1), 5–32.
Breiman, L., 2001b. Statistical modeling: the two cultures. Stat. Sci. 16 (3), 199–231.
Breiman, L., Friedman, J., Olshen, R., Stone, C., 1984. Classification and Regression Trees. Chapman & Hall.
Bresnahan, T., Brynjolfsson, E., Hitt, L., 2002. Information technology, workplace organization, and the demand for skilled labor: firm-level evidence. Q. J. Econ. 117 (1), 339–376. MIT Press.
Bresnahan, T., Trajtenberg, M., 1995. General purpose technologies-engines of growth? J. Econom.
Brynjolfsson, E., McAfee, A., 2011. Race Against the Machine.
Frey, C., Osborne, M., 2013. The Future of Employment: How Susceptible are Jobs to Computerisation? Oxford University Martin School.
Goos, M., Manning, A., 2007. Lousy and lovely jobs: the rising polarization of work in Britain. Rev. Econ. Stat. 89 (1), 118–133. MIT Press.
Goos, M., Manning, A., Salomons, A., 2009. Job polarization in Europe. Am. Econ. Rev. 99 (2), 58–63. American Economic Association.
Harrigan, J., Reshef, A., Toubal, A., 2016. The march of the techies: technology, trade and job polarization in France, 1994–2007. NBER Working Paper (22110).
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Prediction, Inference and Data Mining, 2nd ed. Springer Verlag.
Ikenaga, T., 2009. Polarization of the Japanese labor market: the adoption of IT and changes in tasks contents. Jpn. J. Labour Stud. 584 (73), 90.
Ikenaga, T., Kambayashi, R., 2010. Long-term trends in the polarization of the Japanese labor market: the increase of non-routine task input and its valuation in the labor market. PIE/CIS Discussion Paper 464.
IMF, 2013. Japan: Selected Issues. IMF Country Report 13/254. International Monetary Fund.
(ITU) International Telecommunication Union, 2015. http://www.itu.int/en/ITU-D/Statistics/Pages/stat/default.aspx.
James, G., Witten, D., Hastie, T., Tibshirani, R., 2013. An Introduction to Statistical Learning with Applications in R. Springer.
(JARA) Japan Robotics Association, 2015. http://www.jara.jp/.
Kaplan, F., 2004. Who is afraid of the humanoid? Investigating cultural differences in the acceptance of robots. Int. J. Humanoid Rob. 1 (3), 1–16.
Keynes, J., 1930. Economic Possibilities for our Grandchildren. Saturday Evening Post.
Kitano, N., 2005. Roboethics - a comparative analysis of social acceptance of robots between the West and Japan -. Waseda J. Social Sci. 6. Tokyo.
Koomey, J., Berard, S., Sanchez, M., Wong, H., 2010. Implications of historical trends in the electrical efficiency of computing. Ann. Hist. Comput. 33 (3), 46–54. IEEE (Stanford University).
Kumaresan, N., Miyazaki, K., 1999. An integrated network approach to systems of innovation – the case of robotics in Japan. Res. Policy 28 (6), 563–585.
Lechevalier, S., Ikeda, Y., Nishimura, J., 2010. The effect of participation in government consortia on the R&D productivity of firms: a case study of robot technology in Japan. Econ. Innovation New Technol. 19 (8), 669–692.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 251 (7553), 436–44.
Liaw, A., Wiener, M., 2002. Classification and regression by randomForest. R News 2 (3), 18–22.
Marx, K., 1867. Capital.
Maurin, E., Thesmar, D., 2004. Changes in the functional structure of firms and the demand for skill. J. Labor Econ. University of Chicago Press.
Murata, K., 2010. Lessons from the history of information system development and use in Japan. Entrep. Hist. 60, 50–61.
Nagy, B., Farmer, J., Trancik, J., Gonzales, J., 2011. Superexponential long-term trends in information technology. Technol. Forecast. Soc. Change 78, 1356–1364. Doi: 10.1016/j.techfore.2011.07.006.
Nordhaus, W., 2007. Two centuries of productivity growth in computing. J. Econ. Hist. 67, 128–159.
NRI, 2015. Nihon no roudoujinkou no 49 nado de daitaikanou ni -601syu no syokugyou goto ni computer gijyutsu niyoru daitaikakuritsu wo sisan-. NOMURA, Institute.
OECD, 2013. Trade policy implications of global value chains: case studies. OECD Trade Policy Papers 161 128–159. Doi: 10.1787/5k3tpt2t0zs1-en.
Pajarinen, M., Rouvinen, P., 2014. Computerization Threatens One Third of Finnish Employment. ETLA Brief 22. The Research Institute of the Finnish Economy (ETLA). http://pub.etla.fi/ETLA-Muistio-Brief-22.pdf.
Peduzzi, P., Concato, J., Kemper, E., Holford, T., Feinstein, A., 1996. A simulation study of the number of events per variable in logistic regression analysis. J. Clin. Epidemiol. 49, 1373–1379.
Perez, C., 2009. Technological revolutions and techno-economic paradigms. Cambridge J. Econ. 34 (1), 185–202.
Quinlan, R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo.
Ricardo, D., 1821. On the principles of political economy and taxation. 3rd ed.
(IFR) International Federation of Robotics, 2014. www.ifr.org/.
Rosenberg, N., Trajtenberg, M., 2004. A general-purpose technology at work: the Corliss steam engine in the late-nineteenth-century United States. J. Econ. Hist. 64 (1), 61–99.
Schodt, F., 1988. Inside the Robot Kingdom: Japan, Mechatronics and the Coming Robotopia. Kodansha International, New York.
Simon, H., 1987. The steam engine and the computer: what makes technology revolutionary. Educom Bull. 22 (1), 2–5.
Spong, M., Hutchinson, S., Vidyasagar, M., 2012. Robot Modeling and Control. John Wiley and Sons, Inc.
Ministry of Internal Affairs (MIC) Statistics Bureau, 2015. http://www.stat.go.jp/index.htm.
Sturgeon, T., Kawakami, M., 2010. Global value chains in the electronics industry: was the crisis a window of opportunity for developing countries? In: Cattaneo, O., Gereffi, G., Staritz, C. (Eds.), Global Value Chains in a Postcrisis World: A Development Perspective. The World Bank, Washington, D.C.
Taglioni, D., Winkler, D., 2014. Making Global Value Chains Work for Development. Economic Premise note series.
Varian, H., 2014. Big data: new tricks for econometrics. J. Econ. Perspect. 28 (2).
Verikas, A., Gelzinis, A., Bacauskiene, M., 2011. Mining data with random forests: a survey and results of new tests. Pattern Recognit. 44. Doi: 10.1016/j.patcog.2010.08.011
Vivarelli, M., 2011. Innovation, Employment and Skills in Advanced and Developing Countries: A Survey of the Literature. Technical Report IDB Publications 61058. Inter-American Development Bank.
Wager, S., Hastie, T., Efron, B., 2014. Confidence intervals for random forest: the jackknife and the infinitesimal jackknife. J. Mach. Learn. Res. 15 (1), 1625–1651.
Wu, X., Kumar, V., Quinlan, J., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G., Ng, A., Liu, B., Yu, P., Zhou, Z.-H., Steinbach, M., Hand, D., Steinberg, D., 2008. Top 10 algorithms in data mining. Knowl. Inf. Syst. 14 (1), 37.
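
Illustrative sketch of the Appendix procedure

The two building blocks described in the Appendix, a regression tree that predicts a regional average and a random forest that averages trees grown on bootstrap subsamples with randomly selected predictors, can be illustrated with a minimal Python sketch. It is not the estimation code used in the paper. The function names, the toy data, and the restriction to one-split trees (two regions) are assumptions made for brevity, and predictors are drawn once per tree, as in the Appendix description, rather than at each split.

```python
import random
from statistics import mean


def fit_stump(X, y, features):
    """Fit a one-split regression tree (two regions) on the allowed predictors.

    Returns (k, t, c_left, c_right): split on predictor k at threshold t and
    predict the regional average c_left (x[k] < t) or c_right (x[k] >= t).
    """
    best = None
    for k in features:
        values = sorted({row[k] for row in X})
        for t in values[1:]:  # skipping the minimum keeps both regions non-empty
            left = [yi for row, yi in zip(X, y) if row[k] < t]
            right = [yi for row, yi in zip(X, y) if row[k] >= t]
            c_l, c_r = mean(left), mean(right)
            sse = (sum((yi - c_l) ** 2 for yi in left)
                   + sum((yi - c_r) ** 2 for yi in right))
            if best is None or sse < best[0]:
                best = (sse, k, t, c_l, c_r)
    if best is None:  # no valid split: a single region with one constant
        c = mean(y)
        return (features[0], float("inf"), c, c)
    return best[1:]


def predict_stump(stump, x):
    """Piecewise-constant prediction: the constant of the region containing x."""
    k, t, c_l, c_r = stump
    return c_l if x[k] < t else c_r


def fit_forest(X, y, n_trees=50, max_features=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap subsample and a random predictor subset."""
    rng = random.Random(seed)
    n, n_predictors = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap subsample b
        features = rng.sample(range(n_predictors), max_features)  # randomization
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict_forest(forest, x):
    """Aggregate by averaging the B tree predictions."""
    return mean(predict_stump(stump, x) for stump in forest)


# Hypothetical toy data: y depends only on the first predictor.
X = [[i, (7 * i) % 10] for i in range(10)]
y = [0 if i < 5 else 10 for i in range(10)]
forest = fit_forest(X, y)
print(predict_forest(forest, [1, 3]), predict_forest(forest, [9, 3]))
```

Because roughly half of the trees in this toy setting only see the uninformative second predictor, the averaged predictions are pulled toward the middle of the range. This is the price paid for the reduced instability that motivates the use of RF.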
