• Primary structure of protein:
- Sequence of Amino Acids: The primary structure refers to the specific sequence of amino acids
linked together in a polypeptide chain.
- Amino Acids: Proteins are built from 20 standard amino acids. The order and number of
these residues define the protein's primary structure.
- Peptide Bonds: Amino acids are linked together by peptide bonds, formed through a
dehydration synthesis reaction between the carboxyl group of one amino acid and the amino
group of another.
- Directionality: A polypeptide chain has a defined direction. It begins at the end carrying a free
amino group (N-terminus) and finishes at the end carrying a free carboxyl group (C-terminus).
- Genetic Code: The sequence of amino acids in a protein is dictated by the genetic code encoded
in DNA and transcribed into mRNA, which is then translated into a specific sequence of amino
acids during protein synthesis.
- Variability: The primary structure's unique sequence of amino acids determines a protein's
specific function, shape, and properties. Even a small change in the sequence can drastically
alter a protein's structure and function.
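The path from mRNA to amino acid sequence, and the effect of a single-base change, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the codon table is deliberately partial (only the codons used here), and the mRNA fragment is an illustrative sequence rather than data from these notes. The substitution shown changes a glutamate codon (GAG) into a valine codon (GUG), the same kind of change that underlies sickle-cell hemoglobin.

```python
# Partial codon table: only the codons needed for this example (assumption).
CODON_TABLE = {
    "AUG": "M", "GUG": "V", "CAU": "H", "CUG": "L", "ACU": "T",
    "CCU": "P", "GAG": "E", "AAG": "K", "UAA": "*",  # "*" marks a stop codon
}

def translate(mrna):
    """Translate an mRNA string codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Illustrative fragment (hypothetical input), written 5' to 3'.
normal = "AUGGUGCAUCUGACUCCUGAGGAGAAGUAA"
# Change one base: the A in the middle of the seventh codon becomes U (GAG -> GUG).
mutant = normal[:19] + "U" + normal[20:]

print(translate(normal))
print(translate(mutant))
```

The two printed sequences differ at a single position, which is all it takes to alter the primary structure.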
• Secondary structure of protein:
o Alpha Helix:
o Beta Sheets:
- Structure: Beta sheets are formed by polypeptide strands lying adjacent to each other,
connected by hydrogen bonds between the peptide backbone atoms of different strands.
- Directionality: Beta sheets can be either parallel or antiparallel. In parallel sheets, adjacent
strands run in the same direction, while in antiparallel sheets, adjacent strands run in
opposite directions.
- H-bonding Pattern: The hydrogen bonds form between the carbonyl oxygen of one peptide
bond and the amide hydrogen of a neighbouring peptide bond in an adjacent strand.
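- Common Amino Acids: Beta-branched residues (valine, isoleucine, threonine) and large aromatic
residues (tyrosine, phenylalanine, tryptophan) are favoured in beta sheets. Small residues such as
glycine and alanine dominate only in special cases such as silk fibroin, where they let the sheets
pack together tightly.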
- Role in Proteins: Beta sheets are often involved in the formation of the protein core or as
parts of structural motifs, like beta barrels found in proteins like porins.
Both alpha helices and beta sheets are important secondary structures that contribute to the overall
three-dimensional structure of proteins, influencing their function, stability, and interactions with
other molecules.
• Tertiary structure of protein:
• Quaternary structure of protein:
The quaternary structure of a protein refers to the arrangement and interaction between multiple
individual polypeptide chains (subunits) to form a functional protein complex.
o Multiple Subunits: Proteins with quaternary structure consist of two or more polypeptide chains
called subunits, which may be identical or different. These subunits come together to form a
larger functional unit.
o Interactions Between Subunits: The interactions stabilizing the quaternary structure can include
various types:
- Noncovalent Bonds: Such as hydrogen bonds, hydrophobic interactions, van der Waals
forces, and electrostatic interactions between the subunits.
- Disulfide Bonds: Covalent bonds formed between cysteine residues in different subunits can
contribute to the stability of the complex.
o Functional Assembly: The arrangement of subunits in the quaternary structure is crucial for the
protein's overall function. It can create active sites, allosteric sites, or binding sites that are not
present in the individual subunits.
o Examples: Many proteins exhibit quaternary structure, such as hemoglobin (composed of four
subunits), DNA polymerase (composed of multiple subunits), and antibodies (composed of two
heavy and two light chains).
o Symmetry: Quaternary structures can exhibit different types of symmetry, such as:
- Cyclic Symmetry: Where the subunits are arranged in a circular fashion.
- Helical Symmetry: Where the subunits form a helical arrangement.
- Dihedral Symmetry: Where an n-fold rotation axis is combined with two-fold axes perpendicular to it.
o Regulation: Changes in the quaternary structure can impact the protein's function, and this
structure can be regulated by various factors, including pH, temperature, binding of ligands, or
post-translational modifications.
o Importance: Proteins with quaternary structure often display enhanced stability, increased
specificity, and new functional properties compared to their individual subunits. This complex
structure allows proteins to perform intricate biological functions.
MOTIFS:
Protein motifs are short, conserved sequences of amino acids or three-dimensional structures that
have a specific function or contribute to a particular protein folding pattern. These motifs often
recur in various proteins and play key roles in their structural and functional properties. Some
common protein motifs:
❖ Helix-Turn-Helix (HTH):
➢ Structure: Consists of two alpha helices joined by a short turn; the second (recognition)
helix fits into the major groove of DNA.
➢ Function: Often involved in DNA binding in transcription factors and other DNA-binding
proteins.
❖ Zinc Finger Motif:
➢ Structure: Involves a small protein domain with a zinc ion coordinated by cysteine and
histidine residues.
➢ Function: Functions as a DNA- or RNA-binding motif in transcription factors, nucleases,
and other regulatory proteins.
❖ Leucine Zipper:
➢ Structure: Involves a leucine residue at every seventh position of an alpha helix, so that the
leucines line up along one face of the helix.
➢ Function: Mediates protein-protein interactions, often involved in the formation of
dimers or multimers.
❖ Coiled-Coil Motif:
➢ Structure: Formed by two or more alpha helices wrapping around each other in a coiled
fashion.
➢ Function: Common in structural proteins and involved in various cellular processes,
including vesicle trafficking and muscle contraction.
❖ Beta-Turn:
➢ Structure: A sharp bend or turn in a polypeptide chain, typically involving four amino
acid residues.
➢ Function: Important for reversing the direction of a protein chain and contributing to the
overall protein fold.
These motifs highlight the diversity of structures and functions in proteins, and they often serve as
building blocks for the assembly of more complex protein structures. The identification and
understanding of these motifs contribute to our knowledge of protein structure and function in
various biological processes.
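Sequence motifs are often searched for with simple patterns. As a minimal sketch, the leucine zipper's heptad repeat (a leucine at every seventh position, four times in a row) can be written as a regular expression. The function name and the test sequence below are hypothetical and exist only for illustration.

```python
import re

# Heptad repeat: L, six arbitrary residues, three times over, then a final L.
# The lookahead allows overlapping candidates to be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def find_leucine_zippers(sequence):
    """Return 0-based start positions of candidate leucine-zipper repeats."""
    return [match.start() for match in LEUCINE_ZIPPER.finditer(sequence.upper())]

# Hypothetical sequence: two leading residues, three heptads, a closing leucine.
toy_sequence = "MK" + "LEDKVEE" * 3 + "LGS"
print(find_leucine_zippers(toy_sequence))
```

A pattern hit only flags a candidate. Whether the region actually forms a helix and dimerizes has to be checked structurally or experimentally.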
• DOMAIN:
Steps in homology modeling:
1. Sequence Similarity: Begin with a protein sequence whose structure is unknown (the target) but
which has a homologous protein of known structure (the template). The target sequence can be
obtained from a sequence database such as UniProt.
2. Template Selection: Identify a suitable template protein with a similar sequence to the
target protein. This can be done through sequence alignment methods to find the best
match, for example a BLAST search restricted to sequences of known structure in the PDB.
3. Model Building: Use the known structure of the template protein as a scaffold to predict the
structure of the target protein. This involves aligning the sequences, transferring structural
information, and adjusting for differences between the target and template sequences.
Model building can be carried out on SWISS-MODEL.
4. Model Refinement: Refine the initial model through energy minimization or molecular
dynamics simulations to improve the accuracy of the predicted structure. This step can be
carried out by ModRefiner.
5. Validation: Assess the quality of the model using various validation tools to ensure the
reliability and accuracy of the predicted structure. The validation of the structure can be
carried out by PROCHECK.
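Template selection usually comes down to how similar the target and template sequences are. The sketch below computes percent identity from a pair of already-aligned sequences, counting only columns where neither sequence has a gap. The function name and the two aligned strings are hypothetical, and real tools may define identity over a different denominator. As a commonly quoted rule of thumb, models built on templates below roughly 30% identity tend to be much less reliable.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over aligned columns that contain no gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip insertions/deletions
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical pairwise alignment (one gap in the target, one mismatch).
target_aln = "MKT-AYIAK"
template_aln = "MKTLAYLAK"
print(percent_identity(target_aln, template_aln))
```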
Advantages and Limitations of Homology Modeling:
Advantages:
Limitations:
RasMol is a molecular visualization tool that has been widely used in the field of structural biology
for the analysis and visualization of macromolecular structures, especially protein and nucleic acid
structures. RasMol is used for:
While RasMol has been a valuable tool, it is worth noting that its development has slowed, and
other modern molecular visualization tools such as PyMOL and Chimera have gained popularity.
Researchers often choose tools based on their specific requirements and the features offered by
different software packages.
• USE OF PyMOL:
PyMOL is a powerful molecular visualization software widely used in structural biology and related
fields. Here are several key uses and functionalities of PyMOL:
PyMOL's user-friendly interface and extensive functionalities make it a versatile tool for molecular
visualization, structural analysis, and research in various scientific disciplines. Its flexibility and
capabilities cater to the needs of both beginners and advanced users in the field of structural
biology.
ENERGY MINIMIZATIONS AND EVALUATION BY RAMACHANDRAN PLOT:
The Ramachandran plot is a useful tool in protein structure analysis, particularly in evaluating and
guiding energy minimization processes. Here's how energy minimizations can be guided by and
evaluated using the Ramachandran plot:
The Ramachandran plot serves as a visual aid to guide and evaluate energy minimization processes
by highlighting energetically favorable and unfavorable regions in protein structure, aiding in the
refinement and validation of protein models.
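One way to use the plot during refinement is to count how many residues fall in favourable regions before and after minimization. The sketch below assumes the backbone angles (phi, psi, in degrees) have already been extracted. The rectangular region boundaries are crude approximations chosen for illustration, not the contoured regions used by validation programs such as PROCHECK.

```python
def classify_phi_psi(phi, psi):
    """Assign a (phi, psi) pair to a rough Ramachandran region."""
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "alpha-helix region"
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "beta-sheet region"
    if 20 <= phi <= 125 and -45 <= psi <= 90:
        return "left-handed helix region"
    return "outlier"

def fraction_in_allowed_regions(angles):
    """Fraction of residues whose angles fall in any of the rough regions."""
    allowed = sum(1 for phi, psi in angles if classify_phi_psi(phi, psi) != "outlier")
    return allowed / len(angles)

# Hypothetical angles for four residues.
example_angles = [(-60, -45), (-120, 130), (60, 45), (60, -150)]
for phi, psi in example_angles:
    print(phi, psi, classify_phi_psi(phi, psi))
print(fraction_in_allowed_regions(example_angles))
```

If the fraction rises after energy minimization, the backbone geometry has likely improved; a drop suggests the minimization has strained the model.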
PROTEOME:
• Complete Set of Proteins: The proteome refers to the entire complement of proteins expressed
by a cell, tissue, organism, or a biological system at a specific time under defined conditions.
• Dynamic and Complex: It represents the dynamic and complex assembly of proteins present in
an organism, including variations in protein isoforms, post-translational modifications, and
protein-protein interactions.
• Derived from Genome: The proteome is a product of the genome, where genetic information
encoded in DNA is transcribed into mRNA and translated into proteins. However, the proteome
is more dynamic and diverse than the genome due to various regulatory mechanisms influencing
protein expression.
• Functional Entities: Proteins within the proteome perform diverse biological functions, including
enzymatic catalysis, structural support, signaling, transport, regulation of gene expression, and
participation in various cellular processes.
• Highly Varied: It encompasses a vast range of proteins with different sizes, shapes, functions,
and cellular locations, reflecting the complexity and diversity of biological systems.
• Subject to Change: The proteome of a cell or organism can change dynamically in response to
internal cellular cues, environmental stimuli, developmental stages, and disease conditions.
• Studying the Proteome: Proteomics is the study of the proteome, involving techniques such as
mass spectrometry, protein microarrays, and bioinformatics to analyze and characterize the
entire complement of proteins within a biological sample.
Understanding the proteome is crucial in elucidating cellular functions, biological processes, disease
mechanisms, and identifying potential biomarkers or therapeutic targets for various conditions.
INTERACTOME:
The term "interactome" refers to the complete set of molecular interactions occurring within a cell,
tissue, organism, or biological system.
• Molecular Interactions: The interactome comprises all the physical and functional interactions
among biomolecules within a biological system. This includes interactions between proteins,
nucleic acids, lipids, small molecules, and other cellular components.
• Protein-Protein Interactions (PPIs): A significant part of the interactome involves interactions
between proteins. These interactions can include direct physical contacts, transient associations,
or stable complexes formed between different proteins.
• Complex Networks: The interactome forms a complex network of interactions, often visualized
as nodes (representing biomolecules) connected by edges (representing interactions). These
networks help in understanding the functional relationships and organization within biological
systems.
• Dynamic Nature: Interactomes are dynamic and context-dependent, varying based on cellular
conditions, environmental stimuli, developmental stages, and disease states. They can change
over time and in response to different cellular processes or perturbations.
• Studying the Interactome: Interactomics, a subfield of systems biology, focuses on studying and
characterizing the entire spectrum of molecular interactions using various experimental and
computational approaches. Techniques like yeast two-hybrid assays, co-immunoprecipitation,
mass spectrometry, and computational modeling contribute to understanding the interactome.
• Biological Significance: Understanding the interactome provides insights into the functional
organization of biological systems, cellular pathways, regulatory networks, disease mechanisms,
and the identification of potential drug targets or biomarkers.
• Challenges: Mapping the entire interactome of a cell or organism remains challenging due to its
complexity, dynamics, and the sheer number of potential interactions. However, advancements
in experimental techniques and computational tools continue to improve our understanding of
interactomes across various biological systems.
In summary, the interactome represents the intricate web of molecular interactions that drive
cellular functions and underlie biological processes, offering valuable insights into the complexities
of living systems.
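The node-and-edge view of an interactome maps directly onto a simple graph data structure. The sketch below stores an undirected interaction network as a dictionary of partner sets and lists highly connected nodes ("hubs"). The protein labels P1 to P5 and the interactions between them are hypothetical.

```python
from collections import defaultdict

def build_network(interactions):
    """Build an undirected network: each node maps to its set of partners."""
    network = defaultdict(set)
    for a, b in interactions:
        network[a].add(b)
        network[b].add(a)
    return network

def hubs(network, min_degree):
    """Return nodes with at least min_degree interaction partners, sorted by name."""
    return sorted(node for node, partners in network.items()
                  if len(partners) >= min_degree)

# Hypothetical pairwise interactions (edges).
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P5", "P4")]
network = build_network(interactions)
print({node: len(partners) for node, partners in network.items()})
print(hubs(network, min_degree=3))
```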
MALDI-TOF:
Matrix-Assisted Laser Desorption/Ionization - Time of Flight mass spectrometry.
Principle:
Significance:
• High Sensitivity and Accuracy: MALDI-TOF spectrometry enables the rapid and accurate analysis
of biomolecules with high sensitivity, detecting ions in the mass range of large molecules.
• Minimal Fragmentation: It's a soft ionization method, causing minimal fragmentation of the
analyte molecules, allowing intact molecular ions to be detected.
• Analysis of Large Biomolecules: MALDI-TOF is particularly useful for analyzing proteins, peptides,
nucleic acids, lipids, and other large biomolecules.
Procedure:
• Sample Preparation: Biomolecules, such as proteins, peptides, nucleic acids, or other large
molecules, are prepared for analysis. They are mixed with a matrix material, often an organic
acid or aromatic compound, to assist in ionization.
• Application to Sample Target: The prepared sample is applied or spotted onto a metal or
conductive target plate. It's crucial to evenly distribute the sample-matrix mixture on the target
to ensure consistent ionization.
• Desorption and Ionization: A pulsed laser beam is directed onto the sample target. The matrix
absorbs the laser energy and vaporizes, carrying the analyte molecules into the gas phase. Charge
transfer from the matrix (typically protonation) ionizes the analyte, creating gas-phase ions.
• Ion Acceleration: An electric field accelerates the ions into the flight tube. Ions of the same
charge all receive the same kinetic energy, so their velocities differ only according to their
mass.
• Time-of-Flight Measurement: The ions travel through the flight tube, which is under vacuum,
allowing the ions to travel unimpeded. Lighter ions travel faster than heavier ones and reach the
detector at different times based on their mass-to-charge ratio (m/z).
• Detection and Data Collection: As ions reach the detector, their arrival times are recorded. The
time taken by an ion to reach the detector (time-of-flight) is proportional to the square root of
its m/z ratio, so each arrival time can be converted to an m/z value. The detector signal is the
raw data for the mass spectrum.
• Data Processing and Analysis: The acquired time-of-flight data is processed using specialized
software to generate a mass spectrum. This spectrum displays the distribution of ion masses in
the sample, providing information about the molecular weight and abundance of the analyte
molecules.
• Interpretation and Analysis: Researchers interpret the mass spectrum to identify and
characterize the biomolecules present in the sample. Peak intensities and positions in the mass
spectrum provide information about the molecular weights of the ions, aiding in the analysis and
identification of the sample's constituents.
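The relationship between mass and flight time follows from the acceleration step. An ion of charge z accelerated through a voltage U gains kinetic energy z·e·U = ½·m·v², so its flight time over a field-free tube of length L is t = L·sqrt(m / (2·z·e·U)). The sketch below evaluates this idealized formula. The 20 kV voltage and 1 m tube length are assumed values for illustration, and real instruments apply calibration and corrections that are not modelled here.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton

def flight_time(mass_da, charge=1, voltage=20_000.0, tube_length=1.0):
    """Idealized time of flight in seconds for an ion of given mass and charge."""
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage  # joules
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)   # metres per second
    return tube_length / velocity

# Singly charged ions of 1000 Da and 4000 Da (hypothetical analytes).
t_light = flight_time(1000.0)
t_heavy = flight_time(4000.0)
print(t_light, t_heavy, t_heavy / t_light)
```

Quadrupling the mass doubles the flight time, which is the square-root dependence in action.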
Applications:
• Proteomics: Identification and characterization of proteins, including post-translational
modifications, protein profiling, and biomarker discovery.
• Biochemistry and Structural Biology: Studying protein complexes, protein-protein interactions, and
molecular structure determination.
STRING:
STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a bioinformatics database and
web resource that focuses on the functional associations and interactions between proteins.
MMDB:
The Molecular Modeling Database (MMDB) is a database that provides access to experimentally
determined three-dimensional structures of biological macromolecules, including proteins, nucleic
acids, and complex assemblies.
• Content and Scope: MMDB primarily contains structures obtained through experimental
methods such as X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy (cryo-EM).
It includes atomic coordinates, structural information, and associated experimental data for a
wide range of biomolecules.
• Integration with NCBI: MMDB is part of the National Center for Biotechnology Information
(NCBI), which is a comprehensive resource for biological information. MMDB integrates
seamlessly with other NCBI databases, allowing users to access structural data in conjunction
with sequence, function, and literature information.
• Data Sources: MMDB structures are derived from Protein Data Bank (PDB) depositions. NCBI adds
value by linking each structure record to related sequence, literature, and taxonomy data.
• Accessibility and Search: Users can access MMDB through the NCBI website or through
programmatic access methods. The database offers search functionalities based on keywords,
sequence homology, molecular function, and structure similarity, facilitating efficient retrieval of
structural data.
• Visualization and Analysis: MMDB provides tools for visualizing three-dimensional structures of
biomolecules. Users can view molecular structures, analyze atomic coordinates, and perform
structural comparisons or superimpositions to understand structural relationships among
macromolecules.
• Data Integration: MMDB data is cross-referenced with other NCBI databases, such as GenBank and
PubMed, through the Entrez search system. This allows users to explore relationships between
structural information and associated biological data, such as genetic sequences, protein
functions, pathways, and literature citations.
• Applications: MMDB is extensively used in structural biology, bioinformatics, drug discovery,
molecular modeling, and computational biology. Researchers use MMDB to study
macromolecular structures, understand their functions, and develop computational models for
predicting structure-function relationships.
• Updates and Maintenance: MMDB is regularly updated with newly determined structures,
ensuring that users have access to the latest experimental data on biological macromolecules.
In summary, MMDB serves as a valuable resource for researchers, providing a centralized repository
of experimentally determined macromolecular structures along with associated data, facilitating the
exploration, analysis, and understanding of biomolecular structures and functions.
CADD:
Computer-Aided Drug Design (CADD) uses computational methods to aid in the discovery and
development of new drugs. The procedure and applications of CADD are outlined below:
Procedure:
• Target Identification: Define the biological target (e.g., protein, enzyme, receptor) associated
with a disease or condition.
• Target Validation and Structure Determination: Validate the biological relevance of the target
through experimental studies. Obtain or predict the three-dimensional structure of the target
protein through experimental techniques (e.g., X-ray crystallography, NMR) or computational
modeling (e.g., homology modeling, molecular dynamics simulations).
• Virtual Screening: Utilize computational methods (ligand-based or structure-based) to screen
large chemical libraries (in silico) for potential drug candidates that bind to the target. Ligand-
based methods involve comparing the chemical features of known active compounds to find
similar molecules. Structure-based methods involve docking small molecules into the binding
site of the target protein to predict their binding affinity and mode.
• Hit Identification and Optimization: Identify potential lead compounds or hits that exhibit
promising interactions with the target. Optimize the chemical structure of the hits to improve
their potency, selectivity, and pharmacokinetic properties and to reduce toxicity, through
iterative cycles of design, synthesis, and testing.
• ADME-Tox Prediction: Predict the Absorption, Distribution, Metabolism, Excretion (ADME), and
potential toxicity (Tox) profiles of the lead compounds using computational models to prioritize
compounds with favorable drug-like properties.
• Experimental Validation: Synthesize the most promising lead compounds for experimental
validation through in vitro and in vivo studies to assess their efficacy, safety, and
pharmacological profile.
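Two of the steps above, ligand-based virtual screening and a drug-likeness check, can be sketched together. Ligand-based similarity is often measured with the Tanimoto coefficient on molecular fingerprints, and a common first-pass drug-likeness filter is Lipinski's rule of five. In this minimal sketch the fingerprints are plain sets of integers and the compound properties are made-up numbers; in practice both would come from a cheminformatics toolkit. The filter is a strict variant that rejects any violation, whereas the rule is often applied with one violation tolerated.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    features: frozenset   # stand-in for a molecular fingerprint (hypothetical)
    mol_weight: float     # daltons
    logp: float           # octanol-water partition coefficient
    h_donors: int         # hydrogen-bond donors
    h_acceptors: int      # hydrogen-bond acceptors

def tanimoto(a, b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def passes_rule_of_five(c):
    """Strict rule-of-five check (no violations allowed in this sketch)."""
    return (c.mol_weight <= 500 and c.logp <= 5
            and c.h_donors <= 5 and c.h_acceptors <= 10)

def screen(library, reference_features, min_similarity=0.5):
    """Return (similarity, name) pairs for drug-like compounds, best first."""
    scored = [(tanimoto(c.features, reference_features), c.name)
              for c in library if passes_rule_of_five(c)]
    return sorted((hit for hit in scored if hit[0] >= min_similarity), reverse=True)

# Hypothetical known active and a three-compound library.
reference = frozenset({1, 2, 3, 4})
library = [
    Compound("cmpd_A", frozenset({1, 2, 3, 5}), 320.0, 2.1, 2, 5),
    Compound("cmpd_B", frozenset({1, 2, 3, 4}), 610.0, 6.3, 6, 12),  # fails the filter
    Compound("cmpd_C", frozenset({7, 8}), 250.0, 1.0, 1, 3),         # dissimilar
]
print(screen(library, reference))
```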
Applications:
• Drug Discovery: CADD accelerates the discovery of novel drug candidates by screening vast
chemical libraries to identify potential hits and optimizing them for better efficacy and safety
profiles.
• Target Identification and Validation: Helps in identifying and validating new drug targets
involved in various diseases, aiding in the understanding of disease mechanisms.
• Repurposing of Drugs: Identifies existing drugs that can be repurposed for new therapeutic
applications by analyzing their interactions with different targets.
• Cost and Time Efficiency: Reduces the cost and time required for experimental synthesis and
testing of compounds by prioritizing the most promising candidates.
CADD plays a pivotal role in modern drug discovery pipelines, complementing experimental
approaches to expedite the development of safe and effective therapeutic agents.