Get To The Point: Summarization with Pointer-Generator Networks
Stanford NLP · Google Brain
Extractive: select parts (typically sentences) of the original text to form a summary.
Abstractive: generate novel sentences using natural language generation techniques.
● Multi-sentence summaries (usually 3 or 4 sentences, average 56 words)
● Summary contains information from throughout the article
Sequence-to-sequence + attention model
[Figure: sequence-to-sequence + attention. Encoder hidden states are computed over the source text ("Germany emerge victorious in 2-0 win against Argentina on Saturday ..."); at each step the decoder hidden state yields an attention distribution over the source, whose weighted sum of encoder states is combined with the decoder state to produce a vocabulary distribution, from which the next word ("beat") is generated.]
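The attention step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the weight matrices `W_h`, `W_s` and vector `v` are hypothetical parameter names, and the Bahdanau-style scoring is assumed from the standard seq2seq-with-attention formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_step(enc_states, dec_state, W_h, W_s, v):
    """One attention step of the seq2seq baseline (sketch).
    enc_states: (src_len, hidden) encoder hidden states h_i
    dec_state:  (hidden,)         decoder state s_t
    Returns the attention distribution a_t over source positions
    and the context vector (attention-weighted sum of enc_states)."""
    scores = v @ np.tanh(enc_states @ W_h + dec_state @ W_s).T  # e_t, shape (src_len,)
    attn = softmax(scores)                                      # a_t sums to 1
    context = attn @ enc_states                                 # weighted sum of h_i
    return attn, context
```

In the full model, `context` is concatenated with the decoder state and passed through linear layers and a softmax to give the vocabulary distribution.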
Two Problems
[Figure: pointer-generator network. The attention distribution lets the model point to and copy source words (e.g. "2-0") directly; the final distribution is a mixture of the vocabulary distribution (weighted by the generation probability p_gen) and the attention distribution over the source (weighted by 1 − p_gen), with p_gen computed from the context vector and decoder state.]
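The mixture that produces the final distribution can be sketched directly. This is an illustrative NumPy version (function and argument names are my own, not the paper's code); it assumes an "extended vocabulary" in which out-of-vocabulary source words like "2-0" get ids beyond the fixed vocabulary, so they can be copied but never generated.

```python
import numpy as np

def final_distribution(p_gen, vocab_dist, attn_dist, src_ids, extended_size):
    """Pointer-generator mixture (sketch):
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum over {i : w_i = w} of a_i.
    vocab_dist: (V,) generation probabilities over the fixed vocabulary
    attn_dist:  (src_len,) attention probabilities over source positions
    src_ids:    extended-vocabulary ids of the source tokens, where
                OOV source words get ids V, V+1, ... (< extended_size)."""
    final = np.zeros(extended_size)
    final[: len(vocab_dist)] = p_gen * vocab_dist
    # Scatter-add copy probabilities; repeated source words accumulate.
    np.add.at(final, src_ids, (1.0 - p_gen) * attn_dist)
    return final
```

Because copy probability mass lands on extended ids, an OOV token can receive nonzero probability even though the generator alone could only emit UNK.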
Before: UNK UNK was expelled from the dubai open chess tournament
After:  gaioz nigalidze was expelled from the dubai open chess tournament
Before: the 2015 rio olympic games
After:  the 2016 rio olympic games
[Figure: final coverage vector over the source text. Coverage — the running sum of attention distributions over decoder steps — records which source words have already been attended to, and penalizing overlap between attention and coverage reduces repetition.]
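The coverage penalty described above can be written in a few lines. A minimal sketch, assuming a matrix of per-step attention distributions (the function name is my own): each step is penalized by the elementwise minimum of its attention and the coverage accumulated so far.

```python
import numpy as np

def coverage_loss(attn_dists):
    """Coverage penalty (sketch): the coverage vector is the sum of
    attention distributions at earlier decoder steps, and each step t
    contributes sum_i min(a_t[i], coverage[i]) to the loss, which
    discourages attending repeatedly to the same source positions.
    attn_dists: (T, src_len) attention distribution at each decoder step."""
    coverage = np.zeros(attn_dists.shape[1])
    loss = 0.0
    for attn in attn_dists:
        loss += np.minimum(attn, coverage).sum()
        coverage += attn
    return loss
```

Attending twice to the same position is penalized, while spreading attention across fresh positions costs nothing.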
Results
ROUGE compares the machine-generated summary to the human-written reference summary and counts co-occurrence of 1-grams (ROUGE-1), 2-grams (ROUGE-2), and the longest common subsequence (ROUGE-L).

                                          ROUGE-1  ROUGE-2  ROUGE-L
Nallapati et al. 2016                       35.5     13.3     32.7   previous best abstractive result
Ours (pointer-generator)                    36.4     15.7     33.4   } our improvements
Ours (pointer-generator + coverage)         39.5     17.3     36.4   }
Paulus et al. 2017 (hybrid RL approach)     39.9     15.8     36.9   worse ROUGE; better human eval
Paulus et al. 2017 (RL-only approach)       41.2     15.8     39.1   better ROUGE; worse human eval
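The n-gram overlap that ROUGE counts can be illustrated with a toy scorer. This is a simplified sketch of ROUGE-N F1 for a single candidate/reference pair (real evaluations use the official ROUGE toolkit with stemming and bootstrap resampling, which this omits):

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Toy ROUGE-N F1: clipped n-gram overlap between a candidate
    summary and a single reference summary (illustrative only)."""
    def ngrams(text):
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())   # clipped co-occurrence counts
    prec = overlap / sum(cand.values())
    rec = overlap / sum(ref.values())
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)
```

A candidate missing one reference unigram out of four scores perfect precision but 0.75 recall, so its F1 drops below 1 — exactly the kind of partial credit the table's scores summarize.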
The difficulty of evaluating summarization
● Summarization is subjective
○ There are many correct ways to summarize
[Figure: a metaphorical map of the field. Extractive methods sit in the safety zone; the climb up "Mount Abstraction" toward human-level summarization demands paraphrasing and understanding long text, while the "Swamp of Basic Errors" below — repetition, copying errors, nonsense — is where RNNs currently wander. Open questions along the way: more high-level understanding? more scalability? better metrics?]
Thank you!