Welcome to Scribd!

Blosum 10

Uploaded by

0% found this document useful (0 votes)

7 views3 pages

The document provides an overview of how BLOSUM matrices are computed to score amino acid substitutions. It discusses that BLOSUM matrices are derived from large datasets of local multiple sequence alignments of conserved blocks from related proteins, rather than from full-length alignments. The specific steps are to cluster sequences based on a percentage identity threshold, count observed amino acid replacement frequencies across clusters, calculate expected frequencies, and determine log-odds scores to capture how much more or less likely substitutions are compared to random chance. While BLOSUM matrices do not use an explicit evolutionary model like PAM matrices, they still tend to favor substitutions between biochemically similar amino acids.

Original Description:

Original Title

blosum10

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

7 views3 pages

Blosum 10

Uploaded by

Qazal Sarvari

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 3

Search inside document

Computational Genomics and Molecular Biology 1

BLOSUM Lecture Notes

Dannie Durand

BLOSUM Matrices

See Ewens and Grant, 6.5.2. for a detailed discussion of how the BLOSUM matrices are computed. Note that their
notation is slightly different.

Overview

• BLOSUM = BLOck SUbstitution Matrices, (Henikoff and Henikoff, 1992).

• Trusted Data

– Much larger dataset (∼ 2,000 blocks)

– Local MSA’s (ungapped blocks). Highly conserved due to selective pressure.

• Compute pairwise amino acid alignment counts

– Count amino acid replacement frequencies directly from columns in blocks (no trees.)
– General idea:
∗ Cluster sequences that are n% identical.
∗ Count amino acid pairs across clusters, treating clusters as an ”average sequence”. Normalize by the
number of sequences in the cluster.
∗ Do not count amino acid pairs within a cluster.

Specifics of calculating the Blosumn log odds matrix

• Input: B blocks of sequences. Each block b contains kb sequences of length nb (no gaps).

• Cluster sequences such that within each cluster, each sequence is at least n% identical to one other sequence in
the cluster.

• Let Cb be the number of clusters in block b following the clustering step, where the ith cluster Cbi has kbi sequences
(kb = ∑ kbi ).
Computational Genomics and Molecular Biology 2

1) The observed frequency of x aligned with y is calculated as follows:

Abxy =
Cb nb
∑i=1 ∑ j>i ∑l=1 (# x’s at site l in Cbi ) · (# y’s at site l in Cb j ) + (# y’s at site l in Cbi ) · (# x’s at site l in Cb j )
kbi · kb j

∑Bb=1 abxy
Axy = Cb
∑Bb=1 nb · 2

2) The expected frequency of x aligned with y is calculated as follows:

Cb nb
(# x’s at site l in Cbi )
pbx = ∑∑ kbi
i=1 l=1
∑Bb=1 pbx
px =
∑Bb=1 nb ·Cb
Exy = px py + py px
Exx = p2x

3) Calculate the log odds matrix from the observed and expected frequencies:
Axy
S[x, y] = 2 log2
Exy

Blosumx matrices:

• Sequences that are n% identical were clustered during the construction of the matrix.
• Corrects for sample bias.
• Parameter of evolutionary divergence.

Comparing PAM and BLOSUM Matrices

PAM BLOSUM
Evolutionary model Explicit evolutionary model None
Data Full length MSAs of closely related sequences. Conserved blocks in protein
Bias correction Trees Clustering
Evolutionary distance From Markov model of sequence evolution. From clustering of sequences.
Matrices Transition and log odds scoring matrices Log odds scoring matrix only.
Parameter n Distance increases with n Distance decreases with n
Biophysical properties Derived indirectly from data Derived indirectly from data
Computational Genomics and Molecular Biology 3

The PAM and BLOSUM matrices were constructed from an evolutionary model and conserved blocks where amino
acids are under selective constraints, respectively. Nevertheless, the matrices favor replacement of amino acids which
share biochemical properties. Inspection of the BLOSUM 62 matrix shows that alignments of residues in the same
biochemical group tend to have positive log odds scores. These residues are more likely to be observed together in
related sequences than by chance. Residues from different groups tend to have negative scores. These residues are less
likely to be observed together in related sequences than in chance alignments. A score of zero means that this pair of
residues is equally likely in related and chance alignments.

Seq Indentity PAM BLOSUM

20 250 45
30 160 62
40 120 80
50 80 -
60 60 -

Numerical Analysis II Essentials
From Everand
Numerical Analysis II Essentials
The Editors of REA
No ratings yet
Hawker 8TS Battery Charger - Technical Programing Handbook
Document15 pages
Hawker 8TS Battery Charger - Technical Programing Handbook
Daniel Aguirre
100% (4)
(Yves Meyer, Ronald Coifman) Wavelets Calderón-Z
Document335 pages
(Yves Meyer, Ronald Coifman) Wavelets Calderón-Z
charlyshaka1
No ratings yet
Oracle
Document1 page
Oracle
山口記世
0% (1)
Cyberknife 1
Document37 pages
Cyberknife 1
Hari Kumar
No ratings yet
Alan Watts-Instant Wind Forecasting - Adlard Coles (2010)
Document119 pages
Alan Watts-Instant Wind Forecasting - Adlard Coles (2010)
Wee Wee
No ratings yet
Lecture-03 Organization of Maintenance Force
Document34 pages
Lecture-03 Organization of Maintenance Force
Mohammad Shafi
No ratings yet
RAPIDPoint 500 Operator S Guide - English - Rev C DXDCM 09008b83808b103e-1573180329705
Document498 pages
RAPIDPoint 500 Operator S Guide - English - Rev C DXDCM 09008b83808b103e-1573180329705
Олександр
100% (1)
Electrochemical Processes in Biological Systems
From Everand
Electrochemical Processes in Biological Systems
Andrzej Lewenstam
No ratings yet
Sequence Alignment Methods and Algorithms
Document37 pages
Sequence Alignment Methods and Algorithms
api-3747254
75% (4)
Algebraic Geometry - Gathmann
Document214 pages
Algebraic Geometry - Gathmann
Niflheim
No ratings yet
CBC-Event Management Services NC III - Modules of Instruction - Core
Document39 pages
CBC-Event Management Services NC III - Modules of Instruction - Core
Jhainne Yome Fulgencio
No ratings yet
JR Maths-1a CDF & Imp Questions-1
Document16 pages
JR Maths-1a CDF & Imp Questions-1
daramhemanth1
No ratings yet
Icassp40776 2020 9053737
Document5 pages
Icassp40776 2020 9053737
Prasad Nizampatnam
No ratings yet
Linkage Numerical
Document4 pages
Linkage Numerical
Abaidullah
No ratings yet
G P S Raghava: Basics of Sequence Alignment and Weight Matrices and DOT Plot
Document35 pages
G P S Raghava: Basics of Sequence Alignment and Weight Matrices and DOT Plot
Raj Kumar Soni
No ratings yet
Bioinformatics Session5
Document32 pages
Bioinformatics Session5
Rohan Ray
No ratings yet
Research On Collision Detection Algorithm Based On AABB-OBB Bounding Volume
Document3 pages
Research On Collision Detection Algorithm Based On AABB-OBB Bounding Volume
Sachin G
No ratings yet
Extra Notes On Threading
Document6 pages
Extra Notes On Threading
Arthur Derman Sith-lee
No ratings yet
BLOSUM Matrices
Document18 pages
BLOSUM Matrices
shushantobanik77
No ratings yet
CMA-ES With MATLAB Code
Document30 pages
CMA-ES With MATLAB Code
ReveloApraezCesar
No ratings yet
Solid State Physics Notes Solid State Physics Notes
Document87 pages
Solid State Physics Notes Solid State Physics Notes
Roy Vesey
No ratings yet
Condensed Matter Physics
Document24 pages
Condensed Matter Physics
Ilkay Demir
No ratings yet
Lect 5
Document43 pages
Lect 5
norain ismail sulieman
No ratings yet
Day 1 Algebraic
Document50 pages
Day 1 Algebraic
eetaha
No ratings yet
A Robust and Sparse K-Means Clustering Algorithm
Document21 pages
A Robust and Sparse K-Means Clustering Algorithm
MUHAMMAD MUHAJIR, S.Si., M.Sc.
No ratings yet
3P LF in The PFR - Sugiarto
Document11 pages
3P LF in The PFR - Sugiarto
Zulqibal
No ratings yet
Non-Linear Effects in Seismic Base Isolation
Document13 pages
Non-Linear Effects in Seismic Base Isolation
Rolex11
No ratings yet
Principal Component Analysis (PCA) - : San José State University Math 253: Mathematical Methods For Data Visualization
Document49 pages
Principal Component Analysis (PCA) - : San José State University Math 253: Mathematical Methods For Data Visualization
Andreea Isar
No ratings yet
Multi-Sub-Swarm PSO Classifier Design and Rule Extraction: Yanwei Chang Guofang Yu
Document4 pages
Multi-Sub-Swarm PSO Classifier Design and Rule Extraction: Yanwei Chang Guofang Yu
marcio pivello
No ratings yet
Sequence Analysis - Pairwise Alignment
Document26 pages
Sequence Analysis - Pairwise Alignment
Anjana's World
No ratings yet
Hoofdstuk 5
Document19 pages
Hoofdstuk 5
Eremas Kotipe
No ratings yet
Mock Exam Model Answers
Document13 pages
Mock Exam Model Answers
Fatma Zorlu
No ratings yet
A Decision-Directed Bayesian Equalizer: California Santa Clara University
Document5 pages
A Decision-Directed Bayesian Equalizer: California Santa Clara University
api-3832492
No ratings yet
HT TP: //qpa Pe R.W But .Ac .In: 2012 Soft Computing
Document7 pages
HT TP: //qpa Pe R.W But .Ac .In: 2012 Soft Computing
arnab paul
No ratings yet
Short Circuit Analysis in Unbalanced Distribution Networks: R. Ebrahimi, S. Jamali, A. Gholami and A.Babaei
Document5 pages
Short Circuit Analysis in Unbalanced Distribution Networks: R. Ebrahimi, S. Jamali, A. Gholami and A.Babaei
Abcd
No ratings yet
Linearna Algebra 2
Document14 pages
Linearna Algebra 2
rankoman
0% (1)
Cycle-Based Singleton Local Consistencies: Robert J. Woodward, Berthe Y. Choueiry Christian Bessiere
Document2 pages
Cycle-Based Singleton Local Consistencies: Robert J. Woodward, Berthe Y. Choueiry Christian Bessiere
Thanh Khong Minh
No ratings yet
Assignment 1 Measurements and Unceratinities (Answers)
Document4 pages
Assignment 1 Measurements and Unceratinities (Answers)
panghua tan
No ratings yet
Rami Mangoubi
Document4 pages
Rami Mangoubi
Neeraj Kannaugiya
No ratings yet
Model-Based Clustering: o K o K
Document6 pages
Model-Based Clustering: o K o K
yellow
No ratings yet
A DCM Based Orientation Estimation Algorithm With An Inertial Measurement Unit and A Magnetic Compass
Document18 pages
A DCM Based Orientation Estimation Algorithm With An Inertial Measurement Unit and A Magnetic Compass
anshulara
No ratings yet
740 Fall 2017 Krakernauyd3896Lesson 2
Document2 pages
740 Fall 2017 Krakernauyd3896Lesson 2
Fabrisio Nathaniel
0% (1)
Craig-Bampton Method For A Two Component System: This Is A Work in Progress
Document15 pages
Craig-Bampton Method For A Two Component System: This Is A Work in Progress
nemaderakesh
No ratings yet
Prez 2009
Document22 pages
Prez 2009
marina987
No ratings yet
Apply Genetic Algorithm To The Learning Phase of A Neural Network
Document6 pages
Apply Genetic Algorithm To The Learning Phase of A Neural Network
jkl316
No ratings yet
L05 Deseq2 Anders
Document46 pages
L05 Deseq2 Anders
xin dd
No ratings yet
Asymptotic Equipartition Property of Output When Rate Is Above Capacity
Document23 pages
Asymptotic Equipartition Property of Output When Rate Is Above Capacity
Bob Andrews
No ratings yet
Physics Notes
Document69 pages
Physics Notes
Bilal Ahmed
No ratings yet
Crystal Structure
Document14 pages
Crystal Structure
Mahesh Lohith K.S
100% (4)
Scientific Reasons / Short Questions: Chapter # 1 Scope of Physics
Document22 pages
Scientific Reasons / Short Questions: Chapter # 1 Scope of Physics
Kamran Ali
No ratings yet
Text Clustering
Document47 pages
Text Clustering
Alex Ciocan
No ratings yet
Block Diagonal Inversion
Document13 pages
Block Diagonal Inversion
Konstantinos Politis
No ratings yet
ECS7020P ClassificationExercisesSolutions II
Document7 pages
ECS7020P ClassificationExercisesSolutions II
Yen-Kai Cheng
No ratings yet
EP UNIT New
Document20 pages
EP UNIT New
tejav2468
No ratings yet
Modal Synthesis Analysis Using Craig-Bampton Methodin An Object Oriented Approach
Document3 pages
Modal Synthesis Analysis Using Craig-Bampton Methodin An Object Oriented Approach
Mr. S. Thiyagu Asst Prof MECH
No ratings yet
Week 2 AIM
Document50 pages
Week 2 AIM
tamileditsckdrama
No ratings yet
Single Crossbar Array of Memristors With Bipolar I
Document7 pages
Single Crossbar Array of Memristors With Bipolar I
Nakul singh
No ratings yet
Solution 2
Document6 pages
Solution 2
student mail
0% (1)
(Yaw, Pitch, Roll) to (ω, φ, κ) - Companion notes: 1 A few definitions
Document2 pages
(Yaw, Pitch, Roll) to (ω, φ, κ) - Companion notes: 1 A few definitions
Cristi Valentin
No ratings yet
Solu 7
Document24 pages
Solu 7
engrs.umer
No ratings yet
Introduction To Parallel Programming: Parallel Methods For Matrix Multiplication
Document50 pages
Introduction To Parallel Programming: Parallel Methods For Matrix Multiplication
Shalu Ojha
No ratings yet
Ligand-Based+structure-Based Screening
Document107 pages
Ligand-Based+structure-Based Screening
Manohar
No ratings yet
Business Math Assignment
Document4 pages
Business Math Assignment
Ashiqur Rahman
No ratings yet
Replacement Policy
Document25 pages
Replacement Policy
Salvador Catalfamo
No ratings yet
Blocked Matrix Multiply
Document6 pages
Blocked Matrix Multiply
Deivid CArpio
No ratings yet
Advances in Chemical Physics
From Everand
Advances in Chemical Physics
Stuart A. Rice
No ratings yet
Combining Pattern Classifiers: Methods and Algorithms
From Everand
Combining Pattern Classifiers: Methods and Algorithms
Ludmila I. Kuncheva
No ratings yet
Financial Literacy Kim Redgwell
Document1 page
Financial Literacy Kim Redgwell
api-527336218
No ratings yet
BKD Action Plan SY 2022-2023
Document1 page
BKD Action Plan SY 2022-2023
Kirt Allen Florida
100% (1)
Architecture - July 2018
Document16 pages
Architecture - July 2018
Artdata
No ratings yet
Bacs3413 Project II 22.1.16
Document4 pages
Bacs3413 Project II 22.1.16
Rexl Rxz
No ratings yet
Modern Trends in Internal Combustion Engines
Document15 pages
Modern Trends in Internal Combustion Engines
MahmOud S. El-Mashad
No ratings yet
Old Oil Barrel Generator
Document15 pages
Old Oil Barrel Generator
Albert Newhearth
No ratings yet
11 Inlis-II (5sag)
Document93 pages
11 Inlis-II (5sag)
Serdar Kurbanov
No ratings yet
iNTRO TO Technopreneurship
Document8 pages
iNTRO TO Technopreneurship
jun jun
No ratings yet
CBLM Driving Ncii 2022 Drive Light Vehicle
Document96 pages
CBLM Driving Ncii 2022 Drive Light Vehicle
masterachilles109
No ratings yet
CAT 330 B 9 HN Parts Catalog PDF - Part12
Document1 page
CAT 330 B 9 HN Parts Catalog PDF - Part12
riansiregar
No ratings yet
Pe1 - 101318090 - Hasanuddin Parlindungan Lubis
Document7 pages
Pe1 - 101318090 - Hasanuddin Parlindungan Lubis
Jeffrey chrystian
No ratings yet
AA - Activity 2
Document1 page
AA - Activity 2
Nathaniel Laranjo
No ratings yet
Katalog DC Motoru 25 500w
Document10 pages
Katalog DC Motoru 25 500w
Francesco Moratti
No ratings yet
Zoeppritz Equations
Document3 pages
Zoeppritz Equations
riyadi
No ratings yet
cNT1ZwHyTOid5IhGX3dU - Orica - WebGen 4pp-A4-Brochure - WEB
Document4 pages
cNT1ZwHyTOid5IhGX3dU - Orica - WebGen 4pp-A4-Brochure - WEB
Juan Medina
No ratings yet
Guidelines For Rating Elementary Pupils
Document7 pages
Guidelines For Rating Elementary Pupils
CJ Romano
No ratings yet
Eéáfh Tdibn3ihuvbr3792
Document9 pages
Eéáfh Tdibn3ihuvbr3792
Ati Animations
No ratings yet
Unitex Roof Cowl System Fitting Instruction Sheet: Typical Installations. Ducting Supplied Separately
Document1 page
Unitex Roof Cowl System Fitting Instruction Sheet: Typical Installations. Ducting Supplied Separately
Adam Son
No ratings yet
Pareidolia: Why We See Faces in Hills, The Moon and Toasties
Document4 pages
Pareidolia: Why We See Faces in Hills, The Moon and Toasties
Michael Smith
No ratings yet
Bettis - Line Break Detection - Lineguard2100
Document6 pages
Bettis - Line Break Detection - Lineguard2100
andy175
No ratings yet
Cbme Chapter 1
Document3 pages
Cbme Chapter 1
Pauline Ocampo
No ratings yet
Unit 5 - QB FML
Document12 pages
Unit 5 - QB FML
Santhosh Kannekanti
No ratings yet