2
Natural Language Understanding
3
Natural Language Understanding
Give me a recipe
for lasagna
4
Natural Language Understanding
Give me a recipe
for lasagna
Please bring me my
coffee mug from the
kitchen
5
Natural Language Understanding
Give me a recipe
for lasagna
Please bring me my
coffee mug from the
kitchen
6
Human Interactions
7
Human Interactions
Please bring me my
coffee mug from the
kitchen
8
Human Interactions
Please bring me my
coffee mug from the
kitchen
9
Human Interactions
Please bring me my
coffee mug from the
kitchen
What color is
your coffee mug?
10
Teach Machines to Ask Clarification Questions
11
Teach Machines to Ask Clarification Questions
12
Teach Machines to Ask Clarification Questions
In which field?
13
Teach Machines to Ask Clarification Questions
14
Teach Machines to Ask Clarification Questions
Please bring me my
coffee mug from the
kitchen
What color is your coffee mug?
15
PRIOR WORK
16
Reading Comprehension Question Generation
17
Question Generation for Slot Filling
USER: I want to go to Melbourne on July 14
SYSTEM: What time do you want to leave?
USER: I must be in Melbourne by 11 am
SYSTEM: Would you like a Delta flight that arrives at 10.15 am?
USER: Sure
SYSTEM: In what name should I make the reservation?
SLOTS: <origin city>, <departure city>, <origin time>, <departure time>, <airline>
18
Visual Question Generation Task
19
We consider two scenarios
20
We consider two scenarios -- First Scenario
StackExchange
21
We consider two scenarios -- First Scenario
StackExchange
22
We consider two scenarios -- Second Scenario
Amazon
23
We consider two scenarios -- Second Scenario
Amazon
24
Our Contributions
25
Our Contributions
26
Talk Outline
o Future Directions
27
Talk Outline
o Future Directions
28
Clarification Questions Dataset: StackExchange
29
Clarification Questions Dataset: StackExchange
30
Clarification Questions Dataset: StackExchange
Finding: Questions go unanswered for a long time if they are not clear enough
31
Clarification Questions Dataset: StackExchange
32
Clarification Questions Dataset: StackExchange
I'm aiming to install ape in Ubuntu 14.04 LTS, a simple code for
pseudopotential generation.
I'm having this error message while running ./configure
<error message> [Updated Post]
So I have the library but the program installation isn't finding it.
Any help? Thanks in advance!
33
Clarification Questions Dataset: StackExchange
[Edit as an answer to the question]
I'm aiming to install ape in Ubuntu 14.04 LTS, a simple code for pseudopotential generation.
I'm having this error message while running ./configure
<error message> [Updated Post]
So I have the library but the program installation isn't finding it.
Any help? Thanks in advance!
34
Clarification Questions Dataset: StackExchange
[Edit as an answer to the question]
I'm aiming to install ape in Ubuntu 14.04 LTS, a simple code for pseudopotential generation.
I'm having this error message while running ./configure
<error message> [Updated Post]
So I have the library but the program installation isn't finding it.
Any help? Thanks in advance!
35
Clarification Questions Dataset: StackExchange
Dataset Creation
36
Clarification Questions Dataset: Amazon
37
Clarification Questions Dataset: Amazon
McAuley and Yang. Addressing complex and subjective product-related queries with customer reviews. WWW 2016
38
Clarification Questions Dataset: Amazon
context
question
answer
McAuley and Yang. Addressing complex and subjective product-related queries with customer reviews. WWW 2016
39
Clarification Questions Dataset: Amazon
context
question
answer
McAuley and Yang. Addressing complex and subjective product-related queries with customer reviews. WWW 2016
40
Talk Outline
o Future Directions
41
Talk Outline
o Future Directions
Sudha Rao, Hal Daumé III, "Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information", ACL 2018
42
Expected Value of Perfect Information (EVPI) inspired model
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
43
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
44
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
45
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
46
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
EVPI(x | c) = Σ_{x ∈ X} P(x | c) U(c + x)
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
47
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
EVPI(x | c) = Σ_{x ∈ X} P(x | c) U(c + x), where P(x | c) is the likelihood of x given c
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
48
Expected Value of Perfect Information (EVPI) inspired model
o Use EVPI to identify questions that add the most value to the given post
EVPI(x | c) = Σ_{x ∈ X} P(x | c) U(c + x), where P(x | c) is the likelihood of x given c and U(c + x) is the value of updating c with x
Avriel and Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
49
EVPI formulation for our problem
50
EVPI formulation for our problem
EVPI ( qi | c )=
c : given context
51
EVPI formulation for our problem
EVPI ( qi | c )= P( aj | c , qi )
c : given context
52
EVPI formulation for our problem
EVPI ( qi | c )= P( aj | c , qi ) U( c + aj )
c : given context
53
EVPI formulation for our problem
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj )
U( c + aj ): utility of updating the context c with answer aj
c : given context
54
We rank questions by their EVPI value
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj )
55
We rank questions by their EVPI value
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj )
Candidate questions with EVPI scores:          Ranked by EVPI:
  What is the make of your wifi card?   0.34   1. What version of Ubuntu do you have?
  What version of Ubuntu do you have?   0.85   2. What OS are you using?
  What OS are you using?                0.67   3. What is the make of your wifi card?
56
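To make the ranking concrete, here is a minimal sketch (not the authors' code) of scoring and ranking candidate questions by EVPI; `p_answer` and `utility` are assumed, hypothetical callables standing in for the learned answer model and utility calculator.

```python
# Minimal sketch of EVPI-based ranking; p_answer(a, c, q) and utility(c, a)
# are assumed placeholders for the learned answer and utility models.

def evpi(context, question, candidate_answers, p_answer, utility):
    """EVPI(q | c) = sum over answers a of P(a | c, q) * U(c + a)."""
    return sum(p_answer(a, context, question) * utility(context, a)
               for a in candidate_answers)

def rank_questions(context, candidate_questions, candidate_answers, p_answer, utility):
    """Return candidate clarification questions sorted by decreasing EVPI."""
    scored = [(evpi(context, q, candidate_answers, p_answer, utility), q)
              for q in candidate_questions]
    return [q for _, q in sorted(scored, key=lambda x: x[0], reverse=True)]
```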
Three parts of our formulation:
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj ),  for qi ∈ Q
(1) question & answer generation   (2) answer modeling: P( aj | c , qi )   (3) utility calculation: U( c + aj )
57
Three parts of our formulation:
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj ),  for qi ∈ Q
58
1. Question & Answer Generator
[Diagram: the dataset of (post, question, answer) triples is indexed, with posts as documents, in a Lucene search engine; the given post p is issued as the search query]
59
1. Question & Answer Generator
[Diagram: the Lucene search engine returns the 10 posts p1, ..., pj, ..., p10 most similar to the given post p]
60
1. Question & Answer Generator
[Diagram: the questions q2, ..., qj, ..., q10 asked on the retrieved posts p2, ..., pj, ..., p10 become candidate questions]
61
1. Question & Answer Generator
[Diagram: each retrieved post pj contributes its question qj and answer aj as a candidate (question, answer) pair]
62
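As a rough illustration of this candidate-generation step, the sketch below substitutes scikit-learn TF-IDF retrieval for the Lucene index used in the deck; `dataset` is an assumed list of (post, question, answer) triples.

```python
# Sketch of candidate (question, answer) retrieval; TF-IDF cosine similarity
# stands in for the Lucene search engine on the slides.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def candidate_questions_and_answers(post, dataset, k=10):
    posts = [p for p, _, _ in dataset]
    vec = TfidfVectorizer().fit(posts + [post])
    doc_matrix = vec.transform(posts)            # posts as "documents"
    query = vec.transform([post])                # the given post as the query
    sims = cosine_similarity(query, doc_matrix)[0]
    top = sims.argsort()[::-1][:k]               # indices of the k most similar posts
    return [(dataset[i][1], dataset[i][2]) for i in top]   # their (question, answer) pairs
```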
Three parts of our formulation:
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj ),  for qi ∈ Q
Answer
Modeling
63
2. Answer Modeling
P( aj | c , qi )≈ cosine_sim ( Embans( c , qi ), aj )
64
2. Answer Modeling
P( aj | c , qi )≈ cosine_sim ( Embans( c , qi ), aj )
[Diagram: a neural embedding network takes the context c, question qi, and answer aj as input]
65
2. Answer Modeling
P( aj | c , qi )≈ cosine_sim ( Embans( c , qi ), aj )
[Diagram: the other candidate answers a1, ..., a10 are considered as well]
66
2. Answer Modeling
P( aj | c , qi )≈ cosine_sim ( Embans( c , qi ), aj )
[Diagram: LSTMs encode the context c and question qi; their averaged representations feed a feedforward neural network, whose output Embans( c , qi ) is compared with the answer aj]
67
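A minimal PyTorch sketch of this answer model follows; hidden sizes and the answer pooling are assumptions, since the slides only state that LSTMs encode the context and question, a feedforward network combines them, and cosine similarity against the answer representation approximates P( aj | c , qi ).

```python
# Sketch of the answer model: cosine_sim(Emb_ans(c, q), a). Sizes and pooling
# choices are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AnswerModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hid_dim=200):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.context_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.question_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(hid_dim, hid_dim), nn.Tanh(),
                                nn.Linear(hid_dim, emb_dim))

    def forward(self, context_ids, question_ids, answer_ids):
        _, (hc, _) = self.context_lstm(self.emb(context_ids))
        _, (hq, _) = self.question_lstm(self.emb(question_ids))
        emb_ans_cq = self.ff((hc[-1] + hq[-1]) / 2)          # Emb_ans(c, q)
        answer_vec = self.emb(answer_ids).mean(dim=1)         # simple answer embedding
        return F.cosine_similarity(emb_ans_cq, answer_vec)    # approximates P(a | c, q)
```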
Three parts of our formulation:
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj ),  for qi ∈ Q
Utility
Calculator
68
3. Utility Calculator
U( c + aj ): a value between 0 and 1
[Diagram: a neural network takes c, qi, aj as input]
69
3. Utility Calculator
U( c + aj ): a value between 0 and 1
Training objective:
  ( c , q0 , a0 ) — the original (ques, ans) pair — labeled y = 1
  ( c , q1 , a1 ) — other (ques, ans) pairs — labeled y = 0
[Diagram: a neural network takes c, qi, aj as input]
70
3. Utility Calculator
U( c + aj ): a value between 0 and 1
[Diagram: a feedforward neural network over representations of c, qi, aj]
71
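The utility calculator is trained as a binary classifier over (context, question, answer) triples; the sketch below illustrates that training objective. The encoder function and layer sizes are assumptions, mirroring but not reproducing the slide's architecture.

```python
# Sketch of the utility calculator's training step. `encode(c, q, a)` is an
# assumed function returning a (1, 600) vector; sizes are illustrative only.
import torch
import torch.nn as nn

utility_head = nn.Sequential(nn.Linear(600, 200), nn.ReLU(),
                             nn.Linear(200, 1), nn.Sigmoid())
bce = nn.BCELoss()

def utility_loss(encode, context, pos_pair, neg_pair):
    # Original (question, answer) pair labeled 1; a mismatched pair labeled 0.
    u_pos = utility_head(encode(context, *pos_pair))
    u_neg = utility_head(encode(context, *neg_pair))
    labels = torch.tensor([[1.0], [0.0]])
    return bce(torch.cat([u_pos, u_neg]), labels)
```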
Our EVPI inspired question ranking model (in summary)
EVPI( qi | c ) = Σ_{aj ∈ A} P( aj | c , qi ) U( c + aj ),  for qi ∈ Q
72
Human-based Evaluation Design
73
Human-based Evaluation Design
74
Human-based Evaluation Design
What is EVPI?
When is lunch?
75
Human-based Evaluation Design
What is EVPI?
When is lunch?
What is EVPI?
When is lunch?
What is EVPI?
When is lunch?
What is EVPI?
When is lunch?
80
Research Questions for Experimentation
1. Does a neural network architecture improve upon non-neural baselines?
81
Research Questions for Experimentation
1. Does a neural network architecture improve upon non-neural baselines?
82
Research Questions for Experimentation
1. Does a neural network architecture improve upon non-neural baselines?
3. Does EVPI formalism improve over a traditionally trained neural network?
83
Neural Baseline Model
o Neural (c, q, a)
[Diagram: LSTMs encode the context ci, question qi, and answer ai; a feedforward neural network combines them]
Note: both Neural (c, q, a) and EVPI (q | c, a) have a similar no. of parameters
84
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Random 17.5]
85
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Random 17.5]
86
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Random 17.5]
Nandi, Titas, et al. IIT-UHH at SemEval-2017 task 3: Exploring multiple features for community question
answering and implicit dialogue identification. Workshop on Semantic Evaluation (SemEval-2017), 2017.
87
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Features (c, q) 23.1, Random 17.5; non-linear vs linear comparison]
88
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Random 17.5]
89
Human based evaluation results on StackExchange
[Bar chart — Precision@1 against the union of "best" annotations: Random 17.5]
Train: 61,678   Tune: 7,710   Test: 500
Note: Difference between EVPI and all baselines is statistically significant with p < 0.05
90
Talk Outline
o Future Directions
91
Talk Outline
o Future Directions
o Conclusion
Sudha Rao, Hal Daumé III, "Answer-based Adversarial Training for Generating Clarification Questions", In Submission
92
Issue with the ranking approach
What version of Ubuntu do you have? What version of Windows do you have?
93
Issue with the ranking approach
What version of Ubuntu do you have? What version of Windows do you have?
94
Sequence-to-sequence neural network model
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
95
Sequence-to-sequence neural network model
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
96
Sequence-to-sequence neural network model
A B C <EOS>
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
97
Sequence-to-sequence neural network model
A B C <EOS>
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
98
Sequence-to-sequence neural network model
W X
A B C <EOS> W
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
99
Sequence-to-sequence neural network model
W X Y
A B C <EOS> W X
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
100
Sequence-to-sequence neural network model
W X Y Z
A B C <EOS> W X Y
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
101
Sequence-to-sequence neural network model
W X Y Z <EOS>
A B C <EOS> W X Y Z
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
102
Sequence-to-sequence neural network model
W X Y Z <EOS>
A B C <EOS> W X Y Z
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. Sequence to sequence learning with neural networks. NIPS 2014
103
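The slides above illustrate encoder-decoder (seq2seq) generation: the encoder reads the input sequence, then the decoder emits one token at a time, feeding each prediction back in until it produces <EOS>. A minimal sketch of that greedy decoding loop is below; `encoder`, `decoder_step`, and the token ids are assumed components, not the authors' implementation.

```python
# Sketch of greedy seq2seq decoding as drawn on the slides.
def greedy_decode(encoder, decoder_step, input_ids, eos_id, max_len=50):
    state = encoder(input_ids)             # summarize the input "A B C <EOS>"
    output, prev_token = [], eos_id        # decoding starts from the end-of-sequence symbol
    for _ in range(max_len):
        logits, state = decoder_step(prev_token, state)
        prev_token = int(logits.argmax())  # pick the most likely next token
        if prev_token == eos_id:
            break
        output.append(prev_token)
    return output
```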
Max-likelihood clarification question generation model
Question
Generator
(Seq2seq)
Question
104
Max-likelihood clarification question generation model
[Diagram: Question Generator (Seq2seq) → Question]
Issues
o Maximum-likelihood (MLE) training generates generic questions
Li et al. A diversity-promoting objective function for neural conversation models. In NAACL, 2016.
105
Max-utility based clarification question generation model
Context
Question Answer
Generator Generator
(Seq2seq) (Seq2seq)
Question Answer
106
Max-utility based clarification question generation model
Context
Question Answer
Utility
Generator Generator Reward
Calculator
(Seq2seq) (Seq2seq)
Question Answer
107
Max-utility based clarification question generation model
Context
Question Answer
Utility
Generator Generator Reward
Calculator
(Seq2seq) (Seq2seq)
Question Answer
108
Max-utility based clarification question generation model
Context
Reward
Calculator
Question Answer
Utility
Generator Generator Reward
Calculator
(Seq2seq) (Seq2seq)
Question Answer
109
Max-likelihood vs Max-utility
Question
Reward
Calculator
Reward
110
Max-likelihood vs Max-utility
Question
Differentiable Non- Differentiable
Reward
Calculator
Similar to discrete metrics
like BLEU & ROUGE
Reward
Ranzato, Marc'Aurelio, et al. "Sequence level training with recurrent neural networks." ICLR 2016
111
Max-likelihood vs Max-utility
Question
Differentiable Non- Differentiable
Reward
Ranzato, Marc'Aurelio, et al. "Sequence level training with recurrent neural networks." ICLR 2016
112
Reinforcement Learning for Clarification Question Generation
Key Idea:
ü Estimate loss by drawing samples ("questions")
Loss = - reward( q^s | c )
[Diagram: Context → Question Generator (Seq2seq) → Question → Reward Calculator → Reward]
113
Reinforcement Learning for Clarification Question Generation
Key Idea:
ü Estimate loss by drawing samples ("questions")
ü Differentiate the loss
Loss = - reward( q^s | c ) ∇ log Pr( q^s | c )
[Diagram: Context → Question Generator (Seq2seq) → Question → Reward Calculator → Reward]
REINFORCE: Ronald J Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4):229–256, 1992.
114
Reinforcement Learning for Clarification Question Generation
Key Idea:
ü Estimate loss by drawing samples ("questions")
ü Differentiate the loss
ü Mixed Incremental Cross-Entropy Reinforce (MIXER)
Loss = - ( r( q^s ) - r( q^b ) ) log Pr( q^s | c )
r( q^s ): UTILITY-based reward on the sampled question; r( q^b ): a baseline reward that reduces the high variance of REINFORCE (the reward of the greedily decoded question, following self-critical training, Rennie et al. 2017)
[Diagram: Context → Question Generator (Seq2seq) → Question → Reward Calculator → Reward]
Ranzato, Marc'Aurelio, et al. "Sequence level training with recurrent neural networks." ICLR 2016
115
Max-utility based clarification question generation model
Context
Question
Reward
Generator Reward
Calculator
(Seq2seq)
Question
116
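The sketch below illustrates the reward-based training step described on the preceding slides: sample a question, compare its utility-based reward with that of a greedily decoded baseline question, and weight the sample's log-likelihood by the difference. `generator` and `utility_reward` are assumed components; this is an illustration, not the authors' implementation.

```python
# Sketch of the REINFORCE-with-baseline (self-critical) sequence loss:
# Loss = -(r(q^s) - r(q^b)) * log Pr(q^s | c).
import torch

def max_utility_loss(generator, utility_reward, context):
    sampled_q, log_prob = generator.sample(context)   # q^s and sum_t log p(q_t | q_<t, c)
    greedy_q = generator.greedy(context)               # q^b, the self-critical baseline
    advantage = utility_reward(context, sampled_q) - utility_reward(context, greedy_q)
    return -(advantage * log_prob)                     # minimize to raise reward of samples
```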
Max-utility based clarification question generation model
Context
Trained Offline
Question
Reward
Generator Reward
Calculator
(Seq2seq)
Question
117
Max-utility based clarification question generation model
Question
Reward
Generator Reward
Calculator
(Seq2seq)
Question
118
Generative Adversarial Networks (GAN) based training
Context
Generator
Question
Generator
(Seq2seq)
Question
Model Data
119
Generative Adversarial Networks (GAN) based training
Context
Generator Discriminator
Question
Reward
Generator
Calculator
(Seq2seq)
Question
Model Data
120
Generative Adversarial Networks (GAN) based training
Real Data
Context
(context,
question,
Generator Discriminator
answer)
Question
Reward
Generator Reward
Calculator
(Seq2seq)
Question
Model Data
ü Discriminator tries to distinguish between real and model data
ü Generator tries to fool the discriminator by generating real looking data
121
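A rough sketch of how the adversarial loop on this slide could be alternated, with the UTILITY calculator playing the discriminator over (context, question, answer) triples; every object here (the utility's classification loss, the generators' greedy/sample interfaces) is an assumed placeholder, and the generator update reuses a MIXER-style reward-weighted loss like the one sketched after slide 116.

```python
# Sketch of alternating GAN-Utility training (assumed interfaces throughout).
def gan_utility_step(generator, answer_generator, utility, real_batch, d_optim, g_optim):
    # 1) Discriminator step: real triples vs. generated triples.
    fake_batch = []
    for context, _, _ in real_batch:
        q = generator.greedy(context)
        a = answer_generator.greedy(context, q)
        fake_batch.append((context, q, a))
    d_loss = utility.classification_loss(real_batch, fake_batch)  # real=1, fake=0
    d_optim.zero_grad(); d_loss.backward(); d_optim.step()

    # 2) Generator step: maximize the utility-based reward of sampled questions
    #    via the reward-weighted log-likelihood loss sketched earlier.
    g_loss = sum(max_utility_loss(generator, utility.reward, c) for c, _, _ in real_batch)
    g_optim.zero_grad(); g_loss.backward(); g_optim.step()
```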
GAN-Utility based Clarification Question Generation Model
Real Data
Context
(context,
question,
Generator Discriminator
answer)
Question
Reward
Generator Reward
Calculator
(Seq2seq)
Question
122
Our clarification question generation model (in summary)
123
Our clarification question generation model (in summary)
Sequence-to-sequence model trained using MLE
Context
Question
Generator
(Seq2seq)
Question
124
Our clarification question generation model (in summary)
Sequence-to-sequence model trained using RL
Context
Question Answer
Utility
Generator Generator Reward
Calculator
(Seq2seq) (Seq2seq)
Question Answer
125
Our clarification question generation model (in summary)
Sequence-to-sequence model trained using GAN
Context
Generator Discriminator
Question Answer
Utility
Generator Generator Reward
Calculator
(Seq2seq) (Seq2seq)
Question Answer
126
Example outputs
Original: are these pillows firm and do they keep their shape
GAN-Utility: does this pillow come with a cover or does it have a zipper ?
127
Example outputs
Original: are these pillows firm and do they keep their shape
GAN-Utility: does this pillow come with a cover or does it have a zipper ?
Max-Likelihood: is it waterproof ?
128
Error Analysis of GAN-Utility model
Incompleteness
what is the size of the towel ? i 'm looking for something to be able to use it for
Word repetition
what is the difference between this and the picture of the cuisinart deluxe
deluxe deluxe deluxe deluxe deluxe deluxe
129
Research Questions for Experimentation
130
Research Questions for Experimentation
131
Research Questions for Experimentation
132
Research Questions for Experimentation
4. How do models perform when evaluated for specificity and usefulness?
133
Human-based Evaluation Design
Context
Evaluation set size: 500
Generated Question
Context
Evaluation set size: 500
Generated Question
136
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Specificity score: Original 3.07]
137
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Specificity score: Lucene (information retrieval) 2.8, Original 3.07]
138
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Specificity score: Max-Likelihood 2.84, Lucene 2.8, Original 3.07; learning vs non-learning]
139
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Specificity score: Max-Utility (reinforcement learning) 2.88, Max-Likelihood 2.84, Lucene 2.8, Original 3.07]
140
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Specificity score: GAN-Utility (adversarial training) 2.99, Max-Utility 2.88, Max-Likelihood 2.84, Lucene 2.8, Original 3.07]
141
Human-based Evaluation Results on Amazon Dataset
[Bar chart: GAN-Utility 2.51, Max-Utility 2.47, Max-Likelihood 2.48, Lucene 2.56, Original 2.68; difference statistically insignificant]
142
Human-based Evaluation Results on Amazon Dataset
[Bar chart — Usefulness score: GAN-Utility 0.94, Max-Utility 0.90, Max-Likelihood 0.93, Lucene 0.77, Original 0.79; difference statistically insignificant]
143
Talk Outline
o Future Directions
144
Talk Outline
o Future Directions
145
Generic versus specific questions
Amazon
146
Sequence-to-sequence model for question generation
[Diagram: the Question Generator (Seq2seq) maps an input Context to an output Question; training data: (Context → Question) pairs]
147
Sequence-to-sequence model for controlling specificity
[Diagram: the Question Generator (Seq2seq) is trained on inputs tagged with a specificity token — Context + <specific> → Specific Question, Context + <generic> → Generic Question]
Sennrich et al. Controlling politeness in neural machine translation via side constraints. NAACL 2016
148
Sequence-to-sequence model for controlling specificity
[Diagram: at test time, the input Context + <specific> is fed to the Question Generator (Seq2seq), which outputs a Specific Question; training data as before — Context + <specific> → Specific Question, Context + <generic> → Generic Question]
Sennrich et al. Controlling politeness in neural machine translation via side constraints. NAACL 2016
149
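The side-constraint trick on these slides (following Sennrich et al., 2016) amounts to prepending a specificity token to the source sequence so the same seq2seq model can be steered at test time; a minimal sketch of that preprocessing is below, with the label coming from the specificity annotations discussed next.

```python
# Sketch of adding a specificity side constraint to the source sequence.
def add_side_constraint(context_tokens, label):
    assert label in ("specific", "generic")
    return [f"<{label}>"] + context_tokens

# Training example:  add_side_constraint(post_tokens, "specific") -> specific question
# Test time:         feed add_side_constraint(post_tokens, "specific") to the generator
#                    to request a specific clarification question.
```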
Annotating questions with level of specificity
o We need annotations on training data
o Manually annotating is expensive
[Training data (Input → Output): Context + <specific> → Specific Question, Context + <generic> → Generic Question]
150
Annotating questions with level of specificity
o We need annotations on training data
o Manually annotating is expensive
o Hence
Ø Ask humans to annotate a set of 3000 questions
Ø Train a machine learning model to automatically annotate the rest
[Training data (Input → Output): Context + <specific> → Specific Question, Context + <generic> → Generic Question]
151
Specificity classifier
Input Output
Louis & Nenkova. "Automatic identification of general and specific sentences by leveraging discourse annotations.” IJCNLP 2011
152
Specificity classifier
Louis & Nenkova. "Automatic identification of general and specific sentences by leveraging discourse annotations.” IJCNLP 2011
153
Specificity classifier
Louis & Nenkova. "Automatic identification of general and specific sentences by leveraging discourse annotations.” IJCNLP 2011
154
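A minimal sketch of the auto-annotation step described above: train a specificity classifier on the ~3000 human-labeled questions, then label the remaining training questions. A bag-of-words logistic regression stands in for the full feature set (question bag-of-words, syntax, word embeddings, polarity, WordNet paths) used in the deck.

```python
# Sketch of auto-annotating question specificity (simplified feature set).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def auto_annotate(labeled_questions, labels, unlabeled_questions):
    clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(labeled_questions, labels)                # labels: "specific" / "generic"
    return clf.predict(unlabeled_questions).tolist()  # predicted tags for the rest
```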
Summary of specificity-controlled question generation model
[Diagram: the Question Generation Model is trained on contexts tagged with specificity tokens paired with questions (e.g., Context + <specific> → Specific Question); at test time, the input Context + <specific> yields a Specific Question as output]
155
Specificity classifier results (with feature ablation)
[Bar chart, two bars per feature set: All features 0.73 / 0.79, Question bag-of-words 0.71 / 0.80, Syntax 0.70 / 0.71, Average word embeddings 0.64 / 0.66, Polarity 0.65 / 0.65, Path in WordNet 0.64 / 0.63]
156
Example Outputs
157
Automatic metric based evaluation of question generation
[Bar chart — Diversity: GAN-Utility 0.13, MLE 0.12]
158
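The slide reports a Diversity score without defining it; a common instantiation is the proportion of distinct n-grams among the generated questions, so treating it that way here is an assumption made purely for illustration.

```python
# Sketch of a distinct-n-gram diversity score (an assumed stand-in for the
# slide's unstated Diversity metric).
def distinct_ngrams(questions, n=2):
    total, unique = 0, set()
    for q in questions:
        tokens = q.split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0
```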
Automatic metric based evaluation of question generation
[Bar chart — Diversity: Specificity-GAN-Utility 0.14, Specificity-MLE 0.16, GAN-Utility 0.13, MLE 0.12]
159
Automatic metric based evaluation of question generation
[Bar chart — Diversity: Specificity-GAN-Utility 0.14 / 0.10, Specificity-MLE 0.16 / 0.10, GAN-Utility 0.13, MLE 0.12]
160
Automatic metric based evaluation of question generation
[Bar chart — BLEU (specific): Specificity-GAN-Utility 2.95, Specificity-MLE 4.45, GAN-Utility 2.69, MLE 1.41]
161
Automatic metric based evaluation of question generation
[Bar chart — BLEU: Specificity-GAN-Utility 2.95 / 12.84, Specificity-MLE 4.45 / 12.61, GAN-Utility 2.69 / 12.01, MLE 1.41 / 12.61]
162
Talk Outline
o Future Directions
163
Talk Outline
o Future Directions
164
1. Using multi-modal context (Text + Image)
165
1. Using multi-modal context (Text + Image)
166
2. Knowledge-grounded question asking
167
2. Knowledge-grounded question asking
168
2. Knowledge-grounded question asking
What version of Ubuntu are you using? What is the dimensions of the toaster?
169
3. Towards more intelligent dialog agents
Please bring me my
coffee mug from the
kitchen
Black
170
CONCLUSION
171
CONCLUSION
172
CONCLUSION
173
CONCLUSION
174
CONCLUSION
175
Collaborators
o Clarification Questions
ü Sudha Rao, Hal Daumé III, "Learning to Ask Good Questions: Ranking Clarification Questions
using Neural Expected Value of Perfect Information ”, ACL 2018 (Best Long Paper Award)
ü Sudha Rao, Hal Daumé III, “Answer-based Adversarial Training for Generating Clarification
Questions” In Submission
o Semantic Representations
ü Sudha Rao, Yogarshi Vyas, Hal Daume III, Philip Resnik, "Parser for Abstract Meaning
Representation using Learning to Search", Meaning Representation Parsing, NAACL 2016
ü Sudha Rao, Daniel Marcu, Kevin Knight, Hal Daumé III, "Biomedical Event Extraction using
Abstract Meaning Representation” Biomedical Natural Language Processing, ACL 2017
179
Generalization beyond large datasets
ü Bootstrapping process:
1. Use template based approach or humans to write initial set of questions
2. Train model on small set of questions and generate more
3. Add these (noisy) questions to training data and retrain
ü Domain adaptation:
1. Find a similar domain that has large no. of clarification questions
2. Train neural network parameters on out-domain and tune on in-domain
180
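The bootstrapping process listed above can be summarized as a simple training loop; the sketch below is an illustration under assumed placeholders (`train_model`, `model.generate`), not a prescribed recipe.

```python
# Sketch of the bootstrapping loop: seed questions -> train -> generate noisy
# questions for new posts -> add to training data -> retrain.
def bootstrap(seed_pairs, unlabeled_posts, train_model, n_rounds=3):
    data = list(seed_pairs)                      # initial (post, question) pairs
    model = train_model(data)
    for _ in range(n_rounds):
        noisy = [(p, model.generate(p)) for p in unlabeled_posts]
        data.extend(noisy)                       # add the (noisy) generated questions
        model = train_model(data)                # retrain on the enlarged set
    return model
```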
StackExchange dataset: Example of comment as answer
181
StackExchange dataset: Example of comment as answer
Just bought a new external drive. Plugged it in, erased current partition using fdisk and
created a new extended partition using fdisk. Used all the defaults for start and end
blocks. I then try to format the new partition using the following:
sudo mkfs.ext4 /dev/sdb1 Initial
However, I received the following error: Post
mke2fs 1.42 (29-Nov-2011)
/dev/sdb1: Not enough space to build proposed filesystem while setting up superblock
Any ideas what could be wrong? Should I have created a primary partition? If so, why?
Question
are you installing from a bootable thumb drive ?
comment
182
StackExchange dataset: Example of edit as answer
use virtualbox
2. virtual machine os : ubuntu 12.04 lts Edit to the post
3. host machine os : ubuntu 12.04 lts .
183
StackExchange dataset: Example of non-answer
I did download Ubuntu 12.04LTS. I tried to install - no progress. I tried to remove all
partition using a bootable version of GParted. I created one big partition ext4 formatted. Initial
It all did not help. The installation stops after "Preparing to install Ubuntu". All three Post
checkmarks are checked an I can click "Continue" but then nothing for hours. What can I
do? Please help!
184
Human-based Evaluation Results (Specificity)
How specific is the question to the product?
Original Lucene Max-Likelihood Max-Utility GAN-Utility
This product Similar Products Products in Home & Kitchen N/A
185
Human-based Evaluation Results (Usefulness)
How useful is the question to a potential buyer?
Original Lucene Max-Likelihood Max-Utility GAN-Utility
Should be in the description Useful to large no. of users
Useful to small no. of users Useful only to person asking
N/A
186
Human-based Evaluation Results (Seeking new information)
Does the question ask for new information currently not included in the description?
Original Lucene Max-Likelihood Max-Utility GAN-Utility
Completely Somewhat No N/A
187
Human-based Evaluation Results (Relevance)
Original Lucene Max-Likelihood Max-Utility GAN-Utility
Yes No
188
Human-based Evaluation Results (Grammaticality)
Original Lucene Max-Likelihood Max-Utility GAN-Utility
Grammatical Comprehensible Incomprehensible
189
Human-based Evaluation Results
190
Error Analysis of MLE model
dishwasher safe ?
191
Error Analysis of Max-Utility model
what are the dimensions of this item ? i have a great size of baking pan and pans and pans
what are the dimensions of this topper ? i have a queen size mattress topper topper topper
can this be used with the sodastream system system system system
192
Error Analysis of GAN-Utility model
what is the size of the towel ? i 'm looking for something to be able to use it for
what is the difference between this and the picture of the cuisinart <unk> deluxe
deluxe deluxe deluxe deluxe deluxe deluxe
193
Error Analysis of specificity model
Incomplete questions
what are the dimensions of the table ? i 'm looking for something to put it in a suitcase
what is the density of the mattress pad ? i 'm looking for a mattress for a memory foam
does this unit come with a hose ? i need to know if the window window can be mounted
can you use this in a conventional oven ? i have a small muffin pan for baking .
what are the dimensions of the basket ? i need to know if the baskets are in the picture
194
Reward Calculator
[Diagram: at testing time, the Reward Calculator compares real data with the model output (Context, Generated Question, Generated Answer)]
195
Other types of Question Generation
o Liu, et al. “Automatic question generation for literature review writing support." International
Conference on Intelligent Tutoring Systems. 2010
o Penas and Hovy, “Filling knowledge gaps in text for machine reading” International Conference
on Computational Linguistics: Posters ACL 2010
o Artzi & Zettlemoyer, “Bootstrapping semantic parsers from conversations” EMNLP 2011
o Labutov, et al.“Deep questions without deep understanding” ACL 2015
o Mostafazadeh et al. "Generating natural questions about an image." ACL 2016
o Mostafazadeh et al. "Multimodal Context for Natural Question and Response Generation.” IJCNLP
2017.
o Rothe, Lake and Gureckis. “Question asking as program generation” NIPS 2017.
196
Key Idea behind Expected Value of Perfect Information (EVPI)
Possible questions
Avriel, Mordecai, and A. C. Williams. "The value of information and stochastic programming." Operations Research 18.5 (1970)
197
4. Writing Assistance
Hi Kathy,
Hey John,
198
4. Writing Assistance
Hi Kathy,
Hey John,
199
4. Writing Assistance
200
4. Writing Assistance
Hi Kathy,
Sounds good!
201
3. Interactive Search Query
202
3. Interactive Search Query
Which region?
203
3. Interactive Search Query
Which region?
Which period?
204
4. Asking questions to help build reasoning
205
4. Asking questions to help build reasoning
206
4. Asking questions to help build reasoning
Because she
did not win
the race.
207
Generating Natural Questions from Images (+ Text)
208
Example outputs
209
GAN-Utility based Clarification Question Generation Model
Ø General GAN objective:
L_GAN(D, G) = max_{d ∈ D} min_{g ∈ G}  E_{x ~ p̂}[ log d(x) ] + E_{z ~ p_z}[ log(1 - d(g(z))) ]
[Diagram: Generator and Discriminator]
A sequence GAN model for text generation (2017) treats the generator as an agent and uses the discriminator as a reward to update the generator with reinforcement learning techniques. The generator g ∈ G produces outputs; the discriminator d ∈ D attempts to classify between real and model-generated outputs. The generator aims to produce data as close as possible to the real data distribution; the discriminator aims to successfully distinguish real from generated data. Our GAN-Utility model makes two main modifications: a) we use the MIXER algorithm as our generator (§2.2) instead of a policy gradient approach; and b) we use the UTILITY function (§2.3) as our discriminator instead of a convolutional neural network (CNN). In our model, the answer is a latent variable: we do not actually use it ...
211
Generative Adversarial Networks (GAN)
Goal: Train a model to generate digits
Latent Space + Noise
Generator Discriminator
Model Data
212
Generative Adversarial Networks (GAN)
Real Data
Latent Space + Noise
1 (Real)
Generator Discriminator
0 (Fake)
Model Data
213
Generative Adversarial Networks (GAN)
Real Data
Latent Space + Noise
1 (Real)
Generator Discriminator
0 (Fake)
214
Style transfer prior work
Informal Formal
Gotta see both sides of the story You have to consider both sides of the story
Niu et al. Controlling the formality of machine translation output. EMNLP 2017
Rao and Tetreault. Corpus, Benchmarks and Metrics for Formality Style Transfer. NAACL 2018
215
Upwork annotation statistics
216
Detailed human evaluation results
                   B1 ∪ B2                      V1 ∩ V2                      Original
Model              p@1   p@3   p@5   MAP        p@1   p@3   p@5   MAP        p@1
Random             17.5  17.5  17.5  35.2       26.4  26.4  26.4  42.1       10.0
Bag-of-ngrams      19.4  19.4  18.7  34.4       25.6  27.6  27.5  42.7       10.7
Community QA       23.1  21.2  20.0  40.2       33.6  30.8  29.1  47.0       18.5
Neural (p, q)      21.9  20.9  19.5  39.2       31.6  30.0  28.9  45.5       15.4
Neural (p, a)      24.1  23.5  20.6  41.4       32.3  31.5  29.0  46.5       18.8
Neural (p, q, a)   25.2  22.7  21.3  42.5       34.4  31.8  30.1  47.7       20.5
EVPI               27.7  23.4  21.5  43.6       36.1  32.2  30.5  49.2       21.4
Table 4.1: Model performances on 500 samples when evaluated against the union of the "best" annotations (B1 ∪ B2), the intersection of the "valid" annotations (V1 ∩ V2), and the original question paired with the post in the dataset. The difference between the bold and the non-bold numbers is statistically significant with p < 0.05 as calculated using a bootstrap test. p@k is the precision of the k questions ranked highest by the model and MAP is the mean average precision of the ranking predicted by the model.
217
Detailed human evaluation results (without original)
                   B1 ∪ B2                      V1 ∩ V2
Model              p@1   p@3   p@5   MAP        p@1   p@3   p@5   MAP
Random             17.4  17.5  17.5  26.7       26.3  26.4  26.4  37.0
Bag-of-ngrams      16.3  18.9  17.5  25.2       26.7  28.3  26.8  37.3
Community QA       22.6  20.6  18.6  29.3       30.2  29.4  27.4  38.5
Neural (p, q)      20.6  20.1  18.7  27.8       29.0  29.0  27.8  38.9
Neural (p, a)      22.6  20.1  18.3  28.9       30.5  28.6  26.3  37.9
Neural (p, q, a)   22.2  21.1  19.9  28.5       29.7  29.7  28.0  38.7
EVPI               23.7  21.2  19.4  29.1       31.0  30.0  28.4  39.6
Table 4.2: Model performances on 500 samples when evaluated against the union of the "best" annotations (B1 ∪ B2) and the intersection of the "valid" annotations (V1 ∩ V2), with the original question excluded. The differences between all numbers except the random and bag-of-ngrams are statistically insignificant.
predict the "best" question. The model predicts "why would you need this" with very high probability, likely because it is a very generic question, unlike the question marked as "best" by the annotator, which is too specific. In the third example, the model again predicts a very generic question which is also marked as "valid" by the
218
StackExchange example (ranking)
0.50 define "frozen". did it panic ? or did something else happen ?
0.50 maybe you need to use your 'fn' key when pressing print screen ?
0.50 tried ctrl + alt + f2 ?
0.49 does the script output process 1 iteration successfully ?
0.49 laptop or desktop ?
Title: How to flash a USB drive?.
Post: I have a 8 GB Sandisk USB drive. Recently it became write protected somehow.
So I searched in Google and I tried to remove the write protection
through almost all the methods I found. Unfortunately nothing worked.
So I decided to try some other ways.
Some said that flashing the USB drive will solve the problem.
But I don’t know how. So how can it be done ?
1.01 what file system was the drive using ?
1.00 was it 16gb before or it has been 16mb from the first day you used it ?
0.74 which os are you using ? which file system is used by your pen drive ?
0.64 what operation system you use ?
0.51 can you narrow ’a hp usb down ’ ?
0.50 could the device be simply broken ?
0.50 does it work properly on any other pc ?
0.50 usb is an interface , not a storage device . was it a flash drive or a portable disk ?
0.49 does usb flash drive tester have anything useful to say about the drive ?
0.49 your drive became writeable ? or read-only ?
Table 4.4: Examples of human annotation from the unix and superuser domain of
our dataset. The questions are sorted by expected utility, given in the first column.
The “best” annotation is marked with black ticks and the “valid”’ annotations
are marked with grey ticks .
219
StackExchange example output (ranking)
Table 4.3: Example of human annotation from the askubuntu domain of our dataset.
The questions are sorted by expected utility, given in the first column. The “best”
annotation is marked with black ticks and the "valid" annotations are marked with grey ticks.
221
Automatic metric based evaluation (question generation)
Amazon StackExchange
Model Diversity Bleu Meteor Diversity Bleu Meteor
Reference 0.6934 — — 0.7509 — —
Lucene 0.6289 4.26 10.85 0.7453 1.63 7.96
MLE 0.1059 17.02 12.72 0.2183 3.49 8.49
Max-Utility 0.1214 16.77 12.69 0.2508 3.89 8.79
GAN-Utility 0.1296 15.20 12.82 0.2256 4.26 8.99
Generic Specific
Model Diversity Bleu Meteor Diversity Bleu Meteor
Our best model is the one that uses all the features and attains an accuracy of 0.73 on the test set. In comparison, a baseline model that predicts the specificity label at random gets an accuracy of 0.58 on the test set.
223