NMT Vs SMT: Dragos Munteanu

Summary: NMT vs SMT

• SMT learns from past examples and copies translations, while NMT learns concepts and can generalize better.
• SMT trains on parallel and monolingual text, using statistical analysis to build a translation model and a language model; NMT uses neural networks with word embeddings, LSTM units, and attention.
• NMT improves translation quality over SMT and is easier to adapt or re-train for new domains or languages.

Original title: 2017_MTSummit_SMTvsNMT

NMT vs SMT

Dragos Munteanu

MT approaches

• Rule-based: explicitly model all aspects of language

• SMT: learn from, and copy, past experience

• NMT: learn “concepts”, generalize

Machine Learning

[Diagram: a machine with two phases, Training + Decoding. Labels: data, model, input, output.]

SMT: Training

Parallel text (French, English) -> Statistical Analysis -> Translation Model, P(s|t)
Monolingual text (English) -> Statistical Analysis -> Language Model, P(t)

Translation Model

Source      Target      Probability
la          the         80%
la          a           12%
la          (no word)   8%
capitale    capital     70%
capitale    death       30%
de          of          53%
de          from        47%
france      france      100%
est         is          75%
est         was         25%
paris       paris       100%

Language Model

Phrase                  Probability
the death of            54%
the capital of          34%
a capital of            11%
capital of france       41%
capital from france     9%
of france is            45%
of the france           2%
france is paris         23%
france was paris        22%

SMT: Decoding

[The slide repeats the Translation Model and Language Model tables from SMT: Training.]

Input: la capitale de la france est paris

Statistical Search

Translation                         Score
the capital of france is paris      94%
capital of france is paris          71%
a capital of france is paris        65%
...                                 …
a death from france was paris       3%
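
As a rough illustration of how a statistical decoder might rank candidates, the sketch below multiplies word translation probabilities by trigram language model probabilities. The table values are copied from the slides (the "est" rows are read as est -> is and est -> was). The function name `score`, the one-to-one monotone word alignment, and the `UNSEEN` floor for missing entries are assumptions made for this example. It is not meant to reproduce the percentages listed in the Score column.

```python
import math

# Word translation probabilities from the slide's Translation Model.
# ("la", "") models the source word being dropped.
TM = {
    ("la", "the"): 0.80, ("la", "a"): 0.12, ("la", ""): 0.08,
    ("capitale", "capital"): 0.70, ("capitale", "death"): 0.30,
    ("de", "of"): 0.53, ("de", "from"): 0.47,
    ("france", "france"): 1.00,
    ("est", "is"): 0.75, ("est", "was"): 0.25,  # assumed reading of the slide
    ("paris", "paris"): 1.00,
}

# Trigram probabilities from the slide's Language Model.
LM = {
    ("the", "death", "of"): 0.54, ("the", "capital", "of"): 0.34,
    ("a", "capital", "of"): 0.11, ("capital", "of", "france"): 0.41,
    ("capital", "from", "france"): 0.09, ("of", "france", "is"): 0.45,
    ("of", "the", "france"): 0.02, ("france", "is", "paris"): 0.23,
    ("france", "was", "paris"): 0.22,
}

UNSEEN = 0.01  # assumed floor for entries missing from either table


def score(source, aligned_target):
    """Log score: sum of log translation and log trigram probabilities.

    aligned_target[i] is the translation of source[i]; "" means dropped.
    """
    logp = 0.0
    for s, t in zip(source, aligned_target):
        logp += math.log(TM.get((s, t), UNSEEN))
    words = [t for t in aligned_target if t]
    for i in range(len(words) - 2):
        logp += math.log(LM.get(tuple(words[i:i + 3]), UNSEEN))
    return logp


source = "la capitale de la france est paris".split()
candidates = [
    ["the", "capital", "of", "", "france", "is", "paris"],
    ["a", "death", "from", "", "france", "was", "paris"],
]
ranked = sorted(candidates, key=lambda c: score(source, c), reverse=True)
best = " ".join(w for w in ranked[0] if w)
print(best)
```

A real decoder would search over many segmentations and reorderings instead of scoring a fixed list, and would combine more models than these two.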

SMT Models

• Translation Model
• Language Model
• Syntax Model
• Part of Speech
• Lexicalized Reordering
• Distortion
• Alignment
• Reordering
• Preordering
• Morphology
• Smoothing
• Capitalization
• Transliteration
• Word Deletion

NMT Decoding

[Diagram: Input Text -> ENCODER -> vectors of numbers -> DECODER -> Output Text]

A Neural Network

[Diagram label: PARAMETERS]

A Deep Neural Network

A Deep Recurrent Neural Network

Word representations

Sparse
All words are equally different

Dense
Similar words have similar vectors
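
To make the sparse versus dense contrast concrete, here is a small sketch. The three-word vocabulary and the two-dimensional dense vectors are hypothetical values picked by hand for illustration; real embeddings are learned and have far more dimensions.

```python
import math


def one_hot(index, size):
    """Sparse representation: a single 1 at the word's vocabulary index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocab = ["cat", "dog", "paris"]
sparse = {w: one_hot(i, len(vocab)) for i, w in enumerate(vocab)}

# Hypothetical dense vectors, hand-picked for this example only.
dense = {"cat": [0.9, 0.1], "dog": [0.8, 0.2], "paris": [0.1, 0.9]}

# Sparse: every pair of distinct words has the same similarity (zero).
sparse_cat_dog = cosine(sparse["cat"], sparse["dog"])
sparse_cat_paris = cosine(sparse["cat"], sparse["paris"])

# Dense: related words can sit closer together than unrelated ones.
dense_cat_dog = cosine(dense["cat"], dense["dog"])
dense_cat_paris = cosine(dense["cat"], dense["paris"])

print(sparse_cat_dog, sparse_cat_paris)
print(dense_cat_dog > dense_cat_paris)
```
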
Word Embeddings

X_King - X_Man + X_Woman = X_Queen
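
The vector arithmetic can be tried on toy data. In the sketch below the three-dimensional vectors (roughly "royal", "male", "female") and the helper `analogy` are invented for illustration, so the identity holds exactly. With learned embeddings it holds only approximately, and the answer is taken to be the nearest remaining word.

```python
# Hypothetical vectors with dimensions (royal, male, female).
vectors = {
    "king":  [1.0, 1.0, 0.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "queen": [1.0, 0.0, 1.0],
    "apple": [0.0, 0.0, 0.0],
}


def analogy(a, b, c):
    """Return the word nearest to vectors[a] - vectors[b] + vectors[c]."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]

    def squared_distance(word):
        return sum((t - v) ** 2 for t, v in zip(target, vectors[word]))

    # Exclude the query words themselves, as is common for analogy lookups.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=squared_distance)


result = analogy("king", "man", "woman")
print(result)
```
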

Training: backpropagation

Predict

Update
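
The predict/update cycle can be sketched on the smallest possible model. The example below trains a single linear unit, so the backward pass is one application of the chain rule instead of a full multi-layer backpropagation. The toy data, the learning rate, and the number of epochs are assumptions chosen for illustration.

```python
# Toy data for the assumed target function y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0        # parameters, starting from zero
learning_rate = 0.05   # assumed step size

for epoch in range(2000):
    for x, y in data:
        # Predict: run the model forward with the current parameters.
        prediction = w * x + b
        error = prediction - y

        # Gradients of the loss 0.5 * error**2 with respect to w and b.
        grad_w = error * x
        grad_b = error

        # Update: move each parameter against its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

In a deep network the same two steps apply, with the gradients passed backward layer by layer.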
Training: Dropout
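
A minimal sketch of dropout follows, using the common "inverted" variant in which surviving activations are scaled up during training so that nothing needs to change at translation time. The slide does not say which variant is meant. The function signature and the drop probability are assumptions for this example.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Randomly zero activations during training (inverted dropout)."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # Kept units are scaled by 1 / keep_prob so the expected value is unchanged.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # seeded so runs are repeatable
hidden = [1.0] * 10

dropped = dropout(hidden, 0.5, rng)
unchanged = dropout(hidden, 0.5, rng, training=False)
print(dropped)
print(unchanged)
```
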
Long Short Term Memory Units
Attention
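
One way to picture attention: at each output step the decoder scores every encoder state against its current state, turns the scores into weights, and reads a weighted average. The sketch below uses dot-product scoring, which is an assumption because the slide does not name a scoring function. The vectors and the name `attend` are made up for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Dot-product attention: return (weights, context vector)."""
    scores = [sum(q * h for q, h in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Hypothetical encoder states, one per source word, and a decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

weights, context = attend(query, encoder_states)
print(weights)
print(context)
```

The weights show which source positions the decoder is relying on for the current output word.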
NMT Advantages

• Improved quality

• Easy to adapt or re-train

• Opens new opportunities

Investment Effects on Quality

NMT Opportunities: Multilingual translation
NMT Opportunities: Low resource translation

[Diagram labels: Alignment (TRAINING), Alignment (TRANSLATING)]
