Zhu, X. Machine Teaching - An Inverse Problem To Machine Learning and An Approach Toward Optimal Education
Abstract
I draw the reader’s attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.
Of Machines
Q: I know machine learning; What is machine teaching?
Consider a “student” who is a machine learning algorithm, for example, a Support Vector Machine (SVM) or kmeans clustering. Now consider a “teacher” who wants the student to learn a target model θ∗. For example, θ∗ can be a specific hyperplane in SVM, or the location of the k centroids in kmeans. The teacher knows θ∗ and the student’s learning algorithm, and teaches by giving the student training examples. Machine teaching aims to design the optimal training set D.
Q: What do you mean by optimal?
One definition is the cardinality of D: the smaller |D| is, the better. But there are other definitions as we shall see.
[Figure: the learning algorithm A maps a training set D in 𝔻 to a model A(D) in Θ; the inverse A⁻¹ maps a target model θ∗ in Θ back to the set A⁻¹(θ∗) in 𝔻.]
Given a training set D ∈ 𝔻, machine learning returns a model A(D) ∈ Θ. Note A in general is many-to-one. Conversely, given a target model θ∗ ∈ Θ the inverse function A⁻¹ returns the set of training sets that will result in θ∗. Machine teaching aims to identify the optimal member among A⁻¹(θ∗). However, A⁻¹ is often challenging to compute, and may even be empty for some θ∗. Machine teaching must handle these issues.
Q: Isn’t machine teaching just active learning / experimental design?
No. Recall active learning allows the learner to “ask questions” by selecting items x and asking an oracle for its label y (Settles 2012). Consider learning a noiseless threshold classifier in [0, 1], as shown below.
[Figure: noiseless threshold classifier on [0, 1], with a region marked ε.]
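To make the inverse view of learning concrete, here is a minimal sketch on the threshold-classifier setting. Everything in it is an illustrative assumption rather than a method from the text: the toy learner `learner` (which returns the midpoint between the largest negative and the smallest positive example), the finite candidate `pool` of grid points, the brute-force search in `inverse`, the size cap `max_size`, and the choice of cardinality as the only optimality criterion. The sketch enumerates small training sets, keeps those the learner maps exactly to θ∗, and returns the smallest one; it also shows a target for which the restricted inverse set is empty.

```python
from fractions import Fraction
from itertools import combinations


def learner(D):
    """Hypothetical toy learner A: midpoint between the largest negative
    and the smallest positive example. Returns None when D lacks one of
    the two labels or is not consistent with any threshold."""
    neg = [x for x, y in D if y == 0]
    pos = [x for x, y in D if y == 1]
    if not neg or not pos or max(neg) >= min(pos):
        return None
    return (max(neg) + min(pos)) / 2


def inverse(theta_star, pool, max_size):
    """Brute-force stand-in for A^{-1}(theta*): all subsets of the pool,
    up to max_size items, that the learner maps exactly to theta*."""
    return [
        D
        for k in range(1, max_size + 1)
        for D in combinations(pool, k)
        if learner(D) == theta_star
    ]


def teach(theta_star, pool, max_size=3):
    """Return a smallest-cardinality member of the restricted inverse set,
    or None when that set is empty."""
    candidates = inverse(theta_star, pool, max_size)
    return min(candidates, key=len) if candidates else None


def labeled_pool(theta_star, grid):
    """Label each grid point with the noiseless threshold rule x >= theta*."""
    return [(x, int(x >= theta_star)) for x in grid]


# Exact rational arithmetic avoids floating-point ties in the midpoint.
grid = [Fraction(i, 10) for i in range(11)]

theta_star = Fraction(2, 5)
best = teach(theta_star, labeled_pool(theta_star, grid))
print("teaching set:", best, "-> learned model:", learner(best))

# A target this learner cannot reach from the grid: midpoints of tenths
# are multiples of 1/20, so 1/3 has an empty inverse set here.
unreachable = Fraction(1, 3)
print("unreachable target:", teach(unreachable, labeled_pool(unreachable, grid)))
```

Exhaustive enumeration is only workable for a tiny pool. It is used here because it mirrors the definition directly: compute the inverse set, then pick its best member.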