The document discusses implementing Levenshtein distance in code, comparing it to other algorithms like Needleman-Wunsch and Smith-Waterman, and noting there are two approaches to Levenshtein distance based on whether character replacement has a cost of 1 or 2. It also shows figures demonstrating the Levenshtein distance calculation and character edits between strings.
CHUA, Justin

LEGASPI, John

INTRNLP Assignment #2 : Levenshtein Distance

1) Implementing the code for Levenshtein Distance

Research shows that there are two common ways to implement Levenshtein Distance
in code. The main difference is the cost of replacing one character with another.
In one variant a replacement costs 2: 1 for removing the character and 1 for
inserting the character that replaces it. In the other variant, which is the
standard definition of Levenshtein Distance, a replacement is a single operation
with a cost of 1. Both variants were implemented, hence two Java files are
submitted along with this document.
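
The recurrence shared by both variants can be sketched as follows. This is a
minimal illustration, not the submitted code; the class name LevenshteinSketch,
the method name distance, and the substitutionCost parameter are assumptions
made for this example.

```java
final class LevenshteinSketch {
    // Hypothetical helper: edit distance with a configurable replacement cost.
    // Insertions and deletions always cost 1; substitutionCost is 1 or 2.
    static int distance(String source, String target, int substitutionCost) {
        int m = source.length();
        int n = target.length();
        int[][] d = new int[m + 1][n + 1];

        // Turning a prefix into the empty string takes one deletion per character,
        // and building a prefix from the empty string takes one insertion each.
        for (int i = 0; i <= m; i++) d[i][0] = i;
        for (int j = 0; j <= n; j++) d[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int replace = d[i - 1][j - 1] + (same ? 0 : substitutionCost);
                int delete = d[i - 1][j] + 1;
                int insert = d[i][j - 1] + 1;
                d[i][j] = Math.min(replace, Math.min(delete, insert));
            }
        }
        return d[m][n];
    }

    public static void main(String[] args) {
        System.out.println(distance("intention", "execution", 2)); // prints 8
        System.out.println(distance("intention", "execution", 1)); // prints 5
    }
}
```

With a replacement cost of 2, a replacement is never cheaper than a deletion
followed by an insertion, which is why the two variants can give different
distances for the same pair of strings.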

Figure 1. Levenshtein Distance with Replacement Cost of 2

Figure 2. Levenshtein Distance with the Replacement Cost of 1

In addition to the generated table and the resulting Levenshtein Distance, the
program prints a trace of what happened to the source string as it was edited
into the target string. The characters R, M, I, and D denote Replacing a
character, Matching a character, Including (inserting) a new character, and
Deleting a character, respectively.

Figure 3. Display of edits done to the strings
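
One way such a trace of R, M, I, and D characters could be produced is by
walking back through the filled table from the bottom-right cell. The sketch
below is an assumption about how this might be done, not the submitted code;
the names EditTraceSketch and trace are hypothetical, and when several edit
paths have the same cost it simply prefers the diagonal move, then deletion,
then insertion.

```java
final class EditTraceSketch {
    // Hypothetical helper: returns one optimal edit script as a string of
    // R (replace), M (match), I (insert), and D (delete).
    static String trace(String source, String target, int substitutionCost) {
        int m = source.length();
        int n = target.length();
        int[][] d = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) d[i][0] = i;
        for (int j = 0; j <= n; j++) d[0][j] = j;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int replace = d[i - 1][j - 1] + (same ? 0 : substitutionCost);
                d[i][j] = Math.min(replace, Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
            }
        }

        // Walk back from the bottom-right cell, recording the move that
        // explains each cell's value.
        StringBuilder ops = new StringBuilder();
        int i = m;
        int j = n;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0) {
                boolean same = source.charAt(i - 1) == target.charAt(j - 1);
                int step = same ? 0 : substitutionCost;
                if (d[i][j] == d[i - 1][j - 1] + step) {
                    ops.append(same ? 'M' : 'R');
                    i--;
                    j--;
                    continue;
                }
            }
            if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                ops.append('D'); // a source character was deleted
                i--;
            } else {
                ops.append('I'); // a target character was inserted
                j--;
            }
        }
        // Operations were collected from the end of the strings, so reverse them.
        return ops.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(trace("kitten", "sitting", 1)); // prints RMMMRMI
    }
}
```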

2) Comparing Levenshtein Distance with Other Algorithms

a) Needleman-Wunsch Algorithm
The Levenshtein Distance is a type of edit distance used to measure the
degree of similarity between two strings or sequences. It is defined by a set of
edit operations (insertion, deletion, and substitution), each of which has a
cost. The distance between the two strings or sequences is the total cost of
the operations needed to transform one string into the other.

The Levenshtein Distance finds the minimum number of edit operations needed to
transform one sequence into another. The Needleman-Wunsch algorithm instead
builds up the best possible alignment of the two sequences: it divides the
large problem into a series of smaller problems and combines the optimal
alignments of smaller subsequences. It assigns a score to each alignment
between the two input strings and returns the score of the best alignment,
which is the one with the highest total score.

The main difference between the Needleman-Wunsch algorithm and the
Levenshtein distance algorithm is that the Levenshtein distance algorithm applies
a fixed penalty to every mismatched letter and minimizes the total cost, while
the Needleman-Wunsch algorithm lets matches, mismatches, and gaps be weighted
differently and maximizes the total score.
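
The scoring described above can be sketched as follows. This is a minimal
illustration that assumes a single linear gap penalty; the names
NeedlemanWunschSketch and score, and the parameters match, mismatch, and gap,
are hypothetical and chosen only for this example.

```java
final class NeedlemanWunschSketch {
    // Hypothetical helper: score of the best global alignment of a and b.
    // match, mismatch, and gap are the weights added for each aligned pair
    // or gap; mismatch and gap are normally negative.
    static int score(String a, String b, int match, int mismatch, int gap) {
        int m = a.length();
        int n = b.length();
        int[][] s = new int[m + 1][n + 1];

        // Aligning a prefix against nothing costs one gap per character.
        for (int i = 1; i <= m; i++) s[i][0] = i * gap;
        for (int j = 1; j <= n; j++) s[0][j] = j * gap;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                boolean same = a.charAt(i - 1) == b.charAt(j - 1);
                int diagonal = s[i - 1][j - 1] + (same ? match : mismatch);
                int gapInB = s[i - 1][j] + gap;
                int gapInA = s[i][j - 1] + gap;
                // Unlike edit distance, the best alignment has the highest score.
                s[i][j] = Math.max(diagonal, Math.max(gapInB, gapInA));
            }
        }
        return s[m][n];
    }

    public static void main(String[] args) {
        // Weights 0 / -1 / -1 reproduce the negated Levenshtein Distance.
        System.out.println(score("kitten", "sitting", 0, -1, -1)); // prints -3
        // Rewarding matches changes the result.
        System.out.println(score("GATTACA", "GATTACA", 1, -1, -1)); // prints 7
    }
}
```

With a match weight of 0 and mismatch and gap weights of -1, the score is the
negative of the Levenshtein Distance with a replacement cost of 1, which shows
how closely the two algorithms are related.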

b) Smith-Waterman Algorithm
The Smith-Waterman algorithm is based on the earlier Needleman-Wunsch
algorithm. It considers alignments of any length that start at any location in
the sequences, rather than aligning the sequences from end to end. Each pair of
compared characters is assigned a score (weight); the scores along an alignment
are added together, and the highest-scoring alignment is chosen.

It is similar to edit distance, but instead of finding the minimum cost it finds
the maximum score by locating the most similar parts of the two sequences.
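
A minimal sketch of this idea, assuming a single linear gap penalty, is given
below. The names SmithWatermanSketch and bestLocalScore and the weight
parameters are hypothetical. The only changes from a global alignment table are
that no cell may drop below zero and that the answer is the largest value
anywhere in the table.

```java
final class SmithWatermanSketch {
    // Hypothetical helper: score of the best local alignment between a and b.
    static int bestLocalScore(String a, String b, int match, int mismatch, int gap) {
        int m = a.length();
        int n = b.length();
        // Row 0 and column 0 stay at 0, so an alignment may start anywhere.
        int[][] h = new int[m + 1][n + 1];
        int best = 0;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                boolean same = a.charAt(i - 1) == b.charAt(j - 1);
                int diagonal = h[i - 1][j - 1] + (same ? match : mismatch);
                int gapInB = h[i - 1][j] + gap;
                int gapInA = h[i][j - 1] + gap;
                // Clamping at 0 discards a poor prefix and restarts the alignment.
                h[i][j] = Math.max(0, Math.max(diagonal, Math.max(gapInB, gapInA)));
                best = Math.max(best, h[i][j]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Only the shared "abc" contributes: three matches at 2 points each.
        System.out.println(bestLocalScore("xxabcxx", "yyabcyy", 2, -1, -1)); // prints 6
    }
}
```
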

References:

- Smith-Waterman Algorithm. (n.d.). Retrieved July 27, 2020, from
https://cs.stanford.edu/people/eroberts/courses/soco/projects/computers-and-the-hgp/smith_waterman.html
- Doan, A., & Ives, Z. (2012). Levenshtein Distance. Retrieved July 27, 2020, from
https://www.sciencedirect.com/topics/computer-science/levenshtein-distance
