
Outline

¤ Intro
¤ Recurrent Generator
¤ Rerankers
¤ Experiments
    ¤ Setup
    ¤ Automatic Evaluation
    ¤ Human Evaluation
¤ Conclusion
Spoken Dialogue System

[Architecture diagram: Speech Recognition → Language Understanding → Dialogue Manager → Language Generation → Speech Synthesis; the Dialogue Manager also consults a Knowledge Base and the Web.]
NLG: Problem Definition

¤ Given a meaning representation, map it into natural language utterances.

Dialogue Act: Inform(restaurant=Seven_days, food=Chinese)
Realisations:
    Seven days is a nice restaurant serving Chinese.
    Seven days is a good Chinese restaurant.

¤ What do we care about?
    ¤ adequacy, fluency, readability, variation (Stent et al. 2005)
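The input side of this mapping is just an act type plus slot-value pairs. As a hypothetical sketch (not code from the paper), the meaning representation above could be captured as:

```python
# Hypothetical sketch of a dialogue act: an act type plus slot-value pairs,
# mirroring Inform(restaurant=Seven_days, food=Chinese).
from dataclasses import dataclass

@dataclass(frozen=True)
class DialogueAct:
    act_type: str          # e.g. "inform", "confirm", "request"
    slots: tuple           # ((slot, value), ...) pairs

act = DialogueAct("inform", (("restaurant", "Seven_days"), ("food", "Chinese")))
```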
Traditional pipeline approach

Dialogue Act → Sentence Planning → tree-like template → Surface Realisation → Utterance

Inform(
    name=Z_House,
    price=cheap
)   →   "Z House is a cheap restaurant."
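At its simplest, the template-based surface realisation step amounts to a lookup-and-fill. A minimal sketch (templates and slot names invented for illustration, not the system's actual grammar):

```python
# Minimal sketch of a handcrafted template realiser: pick a template by
# act type and slot names, then fill in the slot values.
TEMPLATES = {
    ("inform", ("name", "price")): "{name} is a {price} restaurant.",
}

def realise(act_type, slots):
    """Select a template by (act type, sorted slot names) and fill it."""
    key = (act_type, tuple(sorted(slots)))
    template = TEMPLATES[key]
    return template.format(**slots)

print(realise("inform", {"name": "Z House", "price": "cheap"}))
# → Z House is a cheap restaurant.
```

Every new act-type/slot combination needs a new handcrafted entry, which is exactly the scalability problem raised on the next slide.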
Problems

¤ Scalability
    ¤ Grammars are handcrafted.
    ¤ Require expert knowledge.
Problems

¤ Boring
    ¤ Frequent repetition of outputs: "Thank you, good bye." over and over.
¤ Non-colloquial, awkward utterances:

"Seven Days is a nice restaurant in the expensive price range, in the north part of the town, if you don't care about what food they serve."
Recurrent Generation Model

RNNLM (Mikolov et al. 2010)

Inform(name=Seven_Days, food=Chinese)
    → dialogue act 1-hot representation: 0, 0, 1, 0, 0, …, 1, 0, 0, …, 1, 0, 0, 0, 0, 0, …

Delexicalisation:
    </s> SLOT_NAME serves SLOT_FOOD . </s>
    </s> Seven Days serves Chinese . </s>
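Delexicalisation replaces slot values with placeholder tokens for training, and relexicalisation substitutes them back after generation. A sketch of the idea (function names illustrative):

```python
# Sketch of delexicalisation/relexicalisation: slot values are swapped
# with SLOT_* placeholder tokens and restored after generation.
def delexicalise(utterance, slots):
    for slot, value in slots.items():
        utterance = utterance.replace(value, f"SLOT_{slot.upper()}")
    return utterance

def relexicalise(utterance, slots):
    for slot, value in slots.items():
        utterance = utterance.replace(f"SLOT_{slot.upper()}", value)
    return utterance

slots = {"name": "Seven Days", "food": "Chinese"}
delex = delexicalise("Seven Days serves Chinese .", slots)
# delex == "SLOT_NAME serves SLOT_FOOD ."
```

Training on the delexicalised side lets one model generalise across all restaurants and cuisines instead of memorising each value.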
Recurrent Generation Model

¤ Gates are controlled by exact matching of features and generated tokens.
¤ Apply a decay factor δ < 1 on feature values.

[Diagram: as "SLOT_NAME serves SLOT_FOOD . </s>" is generated, the feature values n_NAME and n_FOOD are decayed from 1 once the matching slot token is emitted.]

¤ Binary slots/special values need to be additionally handled.
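The gating idea above can be sketched in a few lines: each slot carries a feature value that is decayed by δ < 1 once its placeholder token has been generated, discouraging repeated mentions. The value of δ and the exact-match rule here are illustrative, not the paper's precise formulation:

```python
# Sketch of the decay gate: when a slot placeholder is emitted, its
# feature value is multiplied by DELTA (< 1), switching it "off".
DELTA = 0.5  # illustrative value

def update_features(features, generated_token):
    """Decay the feature of the slot whose placeholder was just emitted."""
    updated = dict(features)
    if generated_token in updated:          # exact matching of token vs. feature
        updated[generated_token] *= DELTA
    return updated

feats = {"SLOT_NAME": 1.0, "SLOT_FOOD": 1.0}
feats = update_features(feats, "SLOT_NAME")
# feats["SLOT_NAME"] == 0.5, feats["SLOT_FOOD"] stays 1.0
```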
Over-generation & Reranking

¤ Generate a bunch of candidate utterances.
¤ Rerank them!

    Seven days is a good restaurant in the south.   0.9
    There is no restaurant in the south.            0.2
    Seven days is in the south part of town.        0.7

¤ Simple & variation included. (Oh & Rudnicky 2000)
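The paradigm itself is a short loop: draw many candidates from the generator, score each with a reranker, keep the best. In this sketch `generate` and `score` are placeholders standing in for the RNN sampler and the reranking score:

```python
# Sketch of over-generation & reranking: sample candidates, keep the
# highest-scoring one. `generate` and `score` are placeholder callables.
def overgenerate_and_rerank(generate, score, n_candidates=3):
    candidates = [generate() for _ in range(n_candidates)]
    return max(candidates, key=score)

# Toy usage with the candidates and scores shown on the slide:
scores = {
    "Seven days is a good restaurant in the south.": 0.9,
    "There is no restaurant in the south.": 0.2,
    "Seven days is in the south part of town.": 0.7,
}
pool = iter(scores)
best = overgenerate_and_rerank(lambda: next(pool), scores.get)
# best == "Seven days is a good restaurant in the south."
```

Sampling rather than taking the single most likely output is what buys the variation mentioned above.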


CNN Semantic Reranker

(Kalchbrenner et al., 2014)

Target dialogue act: inform(name=Seven_days, food=Chinese)
Generated candidate: </s> SLOT_NAME serves SLOT_FOOD . </s>

[Diagram: sentence representation over the delexicalised corpus → 1-D convolutional layer with multiple feature maps → average pooling over time → fully connected layer classifying the dialogue act: the act type (inform / confirm / request / …) and each slot feature (SLOT_NAME=Value vs. SLOT_NAME=NIL, SLOT_FOOD=Value vs. SLOT_FOOD=NIL, ALLOW_KID=Yes / No / NIL).]
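The shape of that computation, stripped to its essentials, looks like the following toy numpy sketch. All sizes and weights are arbitrary; this only illustrates the conv → average-pool → linear-classifier pipeline, not the paper's trained model:

```python
# Toy sketch of the CNN reranker's computation: token embeddings →
# 1-D convolution (multiple feature maps) → average pooling over time →
# linear layer scoring dialogue-act classes. Weights are random.
import numpy as np

rng = np.random.default_rng(0)
T, d, n_maps, n_classes = 6, 8, 4, 3   # tokens, embed dim, feature maps, act types
X = rng.normal(size=(T, d))            # delexicalised sentence as embeddings

W_conv = rng.normal(size=(n_maps, 3, d))     # filter width 3
conv = np.stack([
    [np.sum(W_conv[m] * X[t:t + 3]) for t in range(T - 2)]
    for m in range(n_maps)
])                                     # (n_maps, T-2) feature maps
pooled = conv.mean(axis=1)             # average pooling over time → (n_maps,)
W_out = rng.normal(size=(n_classes, n_maps))
logits = W_out @ pooled                # one score per dialogue-act class
```

Average pooling over time makes the sentence representation length-independent, so candidates of different lengths can be classified by the same fully connected layer.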
Backward Reranker

¤ Train an RNN on utterances with their word order reversed,
    ¤ in order to consider backward context.
    ¤ Ex. "Seven Days is an exceptional restaurant."
¤ Reranking score:
    ¤ LL_ForwardRNN + LL_BackwardRNN − Loss_CNN
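The score on the slide is a direct sum, transcribed below; the three inputs are placeholders for the forward RNN log-likelihood, the backward RNN log-likelihood, and the CNN reranker's semantic loss:

```python
# Direct transcription of the slide's reranking score:
# LL_ForwardRNN + LL_BackwardRNN - Loss_CNN.
def rerank_score(ll_forward, ll_backward, cnn_loss):
    return ll_forward + ll_backward - cnn_loss

# A candidate that both language models like (high log-likelihoods) and
# that matches the target act (low CNN loss) scores highest.
score = rerank_score(-10.0, -12.0, 3.0)
```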
Setup

¤ Data collection:
    ¤ SFX Restaurant domain: 8 act types, 12 slots (1 binary).
    ¤ Workers recruited from Amazon MT.
    ¤ Asked to generate system responses given a DA.
    ¤ Resulted in ~5.1K utterances, 228 distinct acts.
¤ Training: BPTT, L2 regularisation, SGD w/ early stopping.
    train/valid/test: 3/1/1, data up-sampling
Generated Examples

[Slides 17–19: tables of example dialogue acts with generated utterances; the tables are not recoverable from this extraction.]
Automatic Evaluation

¤ Test set: 1039 utterances, 1848 required slots.
¤ Metrics: BLEU-4 (against multiple references), ERR (slot errors).
¤ Averaged over 10 randomly initialised networks.
¤ Baselines:
    ¤ handcrafted generator (hdc)
    ¤ class-based LM (classlm, O&R 2000)
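The slide does not define ERR; a common definition for this task counts missing plus redundant slot mentions over the required slots, which the following sketch assumes (the exact formulation is in the paper):

```python
# Sketch of a slot error rate (ERR), assuming it counts missing plus
# redundant slot mentions over the number of required slots.
def slot_error_rate(required, generated):
    missing = sum(1 for s in required if s not in generated)
    redundant = sum(1 for s in generated if s not in required)
    return (missing + redundant) / len(required)

err = slot_error_rate(["SLOT_NAME", "SLOT_FOOD"],
                      ["SLOT_NAME", "SLOT_PRICE"])
# one missing + one redundant over two required slots → 1.0
```

Working on the delexicalised output makes this check trivial: the SLOT_* tokens can be compared directly against the slots of the target dialogue act.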
Automatic Evaluation

Metrics   hdc     classlm   rnn
BLEU      0.440   0.757     0.777
ERR       0       47.8      0
Human Evaluation

¤ Setup
    ¤ Judges (~60) recruited from Amazon MT.
    ¤ Asked to evaluate two system responses pairwise.
    ¤ Comparing handcrafted (hdc), RNN top-1 (rnn1), RNN sampled from top-5 (rnn5), and class-based LM sampled from top-5 (classlm5).
¤ Metrics:
    ¤ Informativeness, Naturalness (rating out of 5)
    ¤ Preference
Human Evaluation

[Bar charts, slide 25: pairwise Informativeness and Naturalness ratings (out of 5) for hdc vs. rnn5, classlm5 vs. rnn5, and rnn1 vs. rnn5; the per-system bar values are not recoverable from this extraction. Preference: rnn5 was preferred over hdc (63% vs. 37%) and over classlm5 (53% vs. 47%); the rnn1 vs. rnn5 split was close (47% / 53%). Legend: p < 0.05.]
Conclusion

¤ NLG can be solved using RNNs.
¤ Over-generation & reranking paradigm: a hybrid RNN + CNN approach.
¤ Both automatic & human evaluations were conducted.
¤ More colloquial, more scalable.
¤ Potential for open-domain SDS.
Papers

¤ Tsung-Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, Pei-Hao Su, David Vandyke, and Steve Young. Stochastic language generation in dialogue using recurrent neural networks with convolutional sentence reranking. In Proceedings of SIGdial 2015.
¤ Tsung-Hsien Wen, Milica Gasic, Nikola Mrksic, Pei-Hao Su, David Vandyke, and Steve Young. Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems. To appear in Proceedings of EMNLP 2015.
Selected References

¤ Amanda Stent, Matthew Marge, and Mohit Singhai. 2005. Evaluating evaluation methods for generation in the presence of variation. In Proceedings of CICLing 2005.
¤ Alice H. Oh and Alexander I. Rudnicky. 2000. Stochastic language generation for spoken dialogue systems. In Proceedings of the 2000 ANLP/NAACL Workshop on Conversational Systems.
¤ Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan Cernocky, and Sanjeev Khudanpur. 2010. Recurrent neural network based language model. In Proceedings of InterSpeech.
¤ Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. 2014. A convolutional neural network for modelling sentences. In Proceedings of the 52nd Annual Meeting of ACL.
Thank you! Questions?

This project is supported by Toshiba Research Europe Ltd, Cambridge Research Laboratory.

Dialogue Systems Group