NLP
Carlos Escolano
carlos.escolano@tsc.upc.edu
Postdoc
Universitat Politècnica de Catalunya
Technical University of Catalonia
Outline
● Machine Translation
● Text Summarization
● Question Answering
● Dialog
Machine Translation
The origins of Machine Translation
Source: https://www.youtube.com/watch?v=K-HfpsHPmvw&feature=youtu.be
Statistical Machine Translation
Given a source sentence x, we want to find the most probable sentence y in the target language.
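The noisy-channel formulation splits this with Bayes' rule into a translation model and a language model:

```latex
\[
y^{*} = \operatorname*{arg\,max}_{y} P(y \mid x)
      = \operatorname*{arg\,max}_{y} \underbrace{P(x \mid y)}_{\text{translation model}} \; \underbrace{P(y)}_{\text{language model}}
\]
```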
Statistical Machine Translation
Source: http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture08-nmt.pdf
Alignment
Source: https://www.aclweb.org/anthology/J93-2003.pdf
Seq2Seq
● The model acts as a conditional language model.
● It models the translation probability as:
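Written out, the conditional probability factorizes token by token with the chain rule:

```latex
\[
P(y \mid x) = \prod_{t=1}^{T} P(y_t \mid y_1, \ldots, y_{t-1}, x)
\]
```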
Source: http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture08-nmt.pdf
Training
Source: http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture08-nmt.pdf
Transformer. Autoregressive inference
Source: https://lena-voita.github.io/nlp_course/seq2seq_and_attention.html
Greedy decoding
Source: http://web.stanford.edu/class/cs224n/slides/cs224n-2020-lecture08-nmt.pdf
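A minimal sketch of greedy decoding in Python. The log_prob_fn interface is hypothetical, standing in for any trained seq2seq model that maps a source and a partial prefix to next-token log-probabilities:

```python
import numpy as np

def greedy_decode(log_prob_fn, source, bos_id, eos_id, max_len=50):
    """Greedy decoding: at each step pick the single most probable token.

    log_prob_fn(source, prefix) -> np.ndarray of vocabulary log-probabilities
    (a hypothetical interface standing in for a real seq2seq model).
    """
    prefix = [bos_id]
    for _ in range(max_len):
        next_id = int(np.argmax(log_prob_fn(source, prefix)))
        prefix.append(next_id)
        if next_id == eos_id:  # stop once <END> is produced
            break
    return prefix
```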
Beam Search Decoding
● At each decoder step, keep track of the k most probable partial translations (hypotheses).
● Usually k is between 5 and 10.
● Decoding stops when the maximum length is reached or an <END> token is produced.
● Beam search does not guarantee an optimal decoding.
● Hypotheses are ranked according to their score, shown below:
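The score of a hypothesis is its summed token log-probability, usually normalized by length so longer hypotheses are not unfairly penalized:

```latex
\[
\text{score}(y_1, \ldots, y_t) = \frac{1}{t} \sum_{i=1}^{t} \log P(y_i \mid y_1, \ldots, y_{i-1}, x)
\]
```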
Beam Search Decoding
Source: https://huggingface.co/blog/how-to-generate
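A compact sketch of the procedure, reusing the hypothetical log_prob_fn interface from the greedy example:

```python
import numpy as np

def beam_search(log_prob_fn, source, bos_id, eos_id, k=5, max_len=50):
    """Keep the k best partial hypotheses (beams) at every decoding step."""
    beams = [([bos_id], 0.0)]          # (token ids, summed log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            log_probs = log_prob_fn(source, prefix)
            for token_id in np.argsort(log_probs)[-k:]:   # top-k extensions
                candidates.append((prefix + [int(token_id)],
                                   score + float(log_probs[token_id])))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:k]:
            if prefix[-1] == eos_id:
                finished.append((prefix, score / len(prefix)))  # length-normalized
            else:
                beams.append((prefix, score))
        if not beams:                   # every surviving hypothesis ended
            break
    finished.extend((p, s / len(p)) for p, s in beams)  # unfinished leftovers
    return max(finished, key=lambda c: c[1])
```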
BLEU
● Automatic measure of Machine Translation quality, computed by comparing the decoded sentences with one or several human-generated references.
● Based on:
○ N-gram precision: the fraction of correctly predicted n-grams, from size 1 up to a maximum size (usually 4).
○ Length penalty: a penalty on decoded outputs that are short compared to the reference.
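These combine into the standard BLEU formula, where p_n is the modified n-gram precision, w_n = 1/4, and c and r are the candidate and reference lengths:

```latex
\[
\text{BLEU} = \underbrace{\min\!\left(1,\; e^{\,1 - r/c}\right)}_{\text{brevity penalty}} \cdot \exp\!\left( \sum_{n=1}^{4} w_n \log p_n \right)
\]
```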
Text Summarization
Summarization
Given an input text x, write a summary y which is shorter and contains the main information of x.
Strategies: Extractive summarization
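Extractive summarization selects and copies the most salient sentences of the source text verbatim. A toy frequency-based sketch (the scoring heuristic is illustrative, not a specific published method):

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=2):
    """Toy extractive summarizer: score each sentence by the frequency of
    its words in the whole document and keep the top-scoring sentences."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    scored = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r'\w+', sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(scored[:n_sentences])          # restore document order
    return ' '.join(sentences[i] for i in keep)
```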
Strategies: Abstractive summarization
● Generates new text using language generation techniques.
● More difficult to implement.
● Not restricted by the phrasing of the source text.
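In practice, abstractive summarization is usually done with a pretrained seq2seq model. A sketch assuming the Hugging Face transformers package is installed (the default model it downloads varies by library version):

```python
from transformers import pipeline

# Downloads a default pretrained summarization model on first use.
summarizer = pipeline("summarization")

article = "..."  # a long input document goes here
result = summarizer(article, max_length=60, min_length=20)
print(result[0]["summary_text"])
```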
Precision and recall
● Precision is the fraction of retrieved documents that are relevant to the query.
● Recall is the fraction of relevant documents that are successfully retrieved.
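In the retrieval setting, the two measures differ only in the denominator:

```latex
\[
\text{Precision} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{retrieved}|}
\qquad
\text{Recall} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{relevant}|}
\]
```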
Neural Summarization
● Single-document summarization is a translation task.
● Seq2Seq + attention architecture.
Metrics: ROUGE
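ROUGE-N is the recall-oriented counterpart of BLEU's n-gram precision: the fraction of the reference's n-grams that also appear in the generated summary:

```latex
\[
\text{ROUGE-N} = \frac{\sum_{g \in \text{n-grams}(\text{ref})} \text{count}_{\text{match}}(g)}{\sum_{g \in \text{n-grams}(\text{ref})} \text{count}(g)}
\]
```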
Question Answering
Example
Pretrained models for QA
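A sketch using the transformers question-answering pipeline; this is extractive QA, so the answer is a span copied from the context:

```python
from transformers import pipeline

qa = pipeline("question-answering")
result = qa(
    question="Where is the Universitat Politecnica de Catalunya located?",
    context="The Universitat Politecnica de Catalunya (Technical University "
            "of Catalonia) is a public university located in Barcelona, Spain.",
)
print(result["answer"], result["score"])  # e.g. a span such as "Barcelona"
```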
Dialog
End2End Models
Prompt Tuning
● Input provided to a language model system to perform a downstream task (e.g., summarization, dialog, NER, etc.).
● Example prompt: "Given the above information, write a witty speaker bio about Reid."
Example: Dialog
You are an expert baker answering users' questions. Reply as agent.
Example conversation:
User: I want to bake a cake but don't know what temperature to set the oven to.
Agent: For most cakes, the oven should be preheated to 350°F (177°C).
Current conversation:
Agent:
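A sketch of how such a prompt is assembled before being sent to a language model; plain string construction, no particular API assumed, and the live user question is invented for illustration:

```python
SYSTEM = "You are an expert baker answering users' questions. Reply as agent."

EXAMPLES = [
    ("I want to bake a cake but don't know what temperature to set the oven to.",
     "For most cakes, the oven should be preheated to 350°F (177°C)."),
]

def build_prompt(user_message):
    """Concatenate system instruction, few-shot examples, and the live turn."""
    parts = [SYSTEM, "", "Example conversation:"]
    for user, agent in EXAMPLES:
        parts += [f"User: {user}", f"Agent: {agent}"]
    parts += ["", "Current conversation:", f"User: {user_message}", "Agent:"]
    return "\n".join(parts)

print(build_prompt("How long should I bake a 9-inch sponge?"))
```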
ChatGPT
Virtual Assistants
Virtual Assistants. Example
● ASR: the spoken utterance "Alexa, open Netflix" is transcribed to the text "Open Netflix".
● Intent Recognition: "Alexa, open Netflix" → Start an App (which one?)
● NER: "Alexa, open Netflix" → "Netflix"
● Action: run netflix
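A toy rule-based sketch of the intent-recognition and NER steps above; real assistants use trained classifiers, and the pattern and app registry here are purely illustrative:

```python
import re

APPS = {"netflix": "run netflix", "spotify": "run spotify"}  # illustrative registry

def handle_utterance(text):
    """Toy NLU: detect an 'open app' intent, extract the app entity, act."""
    match = re.search(r"\bopen\s+(\w+)", text.lower())  # intent: Start an App
    if match:
        app = match.group(1)                            # NER: which app?
        command = APPS.get(app)
        if command:
            return command
        return f"Sorry, I don't know the app '{app}'."
    return "Sorry, I didn't understand that."

print(handle_utterance("Alexa, open Netflix"))  # -> run netflix
```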
Questions?