BIO Code Report

This document discusses using Biopython to analyze a COVID-19 DNA sequence. It shows how to import modules, parse the FASTA format DNA sequence, transcribe it to mRNA, translate the mRNA to an amino acid sequence, split the sequence at stop codons to identify proteins, and use ProtParam to analyze properties of the identified proteins such as molecular weight and flexibility.

Uploaded by Sai Sangavi

COVID-19 DNA sequence data using Python.

Major Modules Used:

Biopython
Squiggle
Pandas

Importing Modules:

from __future__ import division

from Bio.SeqUtils import ProtParam
import warnings
import pandas as pd
from Bio import SeqIO
from Bio.Data import CodonTable

We will use Bio.SeqIO from Biopython to parse the DNA sequence data (FASTA). It provides a simple, uniform interface for reading and writing assorted sequence file formats.

for sequence in SeqIO.parse(r'Covid.fna', "fasta"):
    print(sequence.seq)
    print(len(sequence), 'nucleotides')

DNAsequence = SeqIO.read(r'Covid.fna', "fasta")

print(DNAsequence)
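
To make concrete what parsing a FASTA file involves, here is a minimal sketch that uses only the Python standard library. It is not how Bio.SeqIO is implemented; the function name read_fasta and the toy record are assumptions made for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Toy record: the header text is made up, the bases are split over two lines.
example = StringIO(">toy_record example header\nATTAAAGGTT\nTATACCTTCC\n")
records = list(read_fasta(example))
for header, seq in records:
    print(header, len(seq), "nucleotides")
```

The sequence lines of a record are simply concatenated, which is why the reported length is independent of how the file wraps its lines.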

Since the input sequence is FASTA (DNA), and the coronavirus is an RNA virus, we need to:
Transcribe the DNA to RNA (ATTAAAGGTT… => AUUAAAGGUU…)
Translate the RNA to an amino acid sequence (AUUAAAGGUU… => IKGLYLPR*Q…)
In this case the .fna file starts with ATTAAAGGTT. Calling transcribe() replaces each T (thymine) with U (uracil), so we get an mRNA sequence that starts with AUUAAAGGUU.
DNA = DNAsequence.seq
mRNA = DNA.transcribe()
print(mRNA)
print('Size : ', len(mRNA))
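
The transcription step can be sketched on a plain string without Biopython. This is an illustrative stand-in for transcribe(), assuming the input is the coding strand written in the letters A, C, G and T; the function names are hypothetical.

```python
def transcribe(dna):
    """Return the RNA form of a coding-strand DNA string (T becomes U)."""
    return dna.upper().replace("T", "U")


def back_transcribe(rna):
    """Inverse operation: U becomes T."""
    return rna.upper().replace("U", "T")


dna = "ATTAAAGGTT"  # first ten bases quoted in the report
mrna = transcribe(dna)
print(mrna)                    # AUUAAAGGUU
print(len(dna) == len(mrna))   # True: transcription does not change the length
```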

The only difference between the DNA and the mRNA is that the T bases (thymine) are replaced with U (uracil).
Next, we translate the mRNA sequence to an amino acid sequence using the translate() method. We get something like IKGLYLPR*Q, where * marks a so-called STOP codon and effectively acts as a separator between proteins.
Amino_Acid = mRNA.translate(table=1, cds=False)
print('Amino Acid', Amino_Acid)
print("Length of Protein:", len(Amino_Acid))
print("Length of Original mRNA:", len(mRNA))
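
As a rough illustration of what translation does, the sketch below builds the standard genetic code from its usual 64-letter summary string and reads an mRNA string three bases at a time. It is a simplified stand-in for translate(table=1), not Biopython's code; the names are hypothetical, and dropping trailing bases that do not fill a codon is a choice made for this sketch.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string codon by codon; '*' marks a stop codon."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete final codon
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3))


# First 30 bases, transcribed from the DNA prefix ATTAAAGGTTTATACCTTCCCAGGTAACAA.
mrna = "AUUAAAGGUUUAUACCUUCCCAGGUAACAA"
protein = translate(mrna)
print(protein)  # IKGLYLPR*Q
```

Because every codon yields one letter, the translated sequence is about one third the length of the mRNA.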

The standard genetic code is traditionally represented as an RNA codon table because, when proteins are made in a cell by ribosomes, it is mRNA that directs protein synthesis. The mRNA sequence is determined by the sequence of genomic DNA. Here are some features of codons:
Most codons specify an amino acid.
Three "stop" codons mark the end of a protein.
One "start" codon, AUG, marks the beginning of a protein and also encodes the amino acid methionine.
An mRNA molecule carries a series of codons. Each codon consists of three nucleotides, usually corresponding to a single amino acid. The nucleotides are abbreviated with the letters A, U, G, and C; mRNA uses U (uracil) where DNA uses T (thymine). The mRNA instructs a ribosome to synthesize a protein according to this code.

print(CodonTable.unambiguous_rna_by_name['Standard'])
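
The codon features listed above can be checked with a small standalone sketch that rebuilds the standard table and groups codons by the amino acid they encode. The variable names are assumptions for illustration; this does not use Biopython's CodonTable object.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by what they encode ('*' stands for stop).
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

stop_codons = sorted(codons_for["*"])
sense_codons = len(CODON_TABLE) - len(stop_codons)

print(stop_codons)      # ['UAA', 'UAG', 'UGA']
print(codons_for["M"])  # ['AUG']
print(sense_codons)     # 61
```

Most amino acids are encoded by more than one codon, which is why 61 sense codons map to only 20 amino acids.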
Now we extract the proteins (chains of amino acids) by splitting the sequence at the stop codons, marked by * (asterisk). Then we remove any sequence less than 20 amino acids long, on the assumption that this is about the length of the smallest known functional protein.

# Split at stop codons; store each fragment as a plain string so that
# pandas keeps one sequence per row
Proteins = [str(p) for p in Amino_Acid.split('*')]
df = pd.DataFrame({'sequence': Proteins})
print('Total proteins:', len(df))
df['length'] = df['sequence'].apply(len)
print(df.head())

# Keep only chains of at least 20 amino acids
functional_proteins = df.loc[df['length'] >= 20]
print('Total functional proteins:', len(functional_proteins))
print(functional_proteins.describe())
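
The split-and-filter step can also be sketched on plain strings, without pandas. The translated string below is a toy example, not output from the genome, and MIN_LENGTH simply restates the 20-residue cut-off used above.

```python
MIN_LENGTH = 20  # cut-off from the report

# Toy translated sequence: '*' marks stop codons, '**' yields an empty fragment.
translated = "IKGLYLPR*MESLVPGFNEKTHVQLSLPVLQV**MK"

fragments = translated.split("*")
functional = [frag for frag in fragments if len(frag) >= MIN_LENGTH]

print(len(fragments))                # 4
print([len(f) for f in fragments])   # [8, 23, 0, 2]
print(functional)                    # ['MESLVPGFNEKTHVQLSLPVLQV']
```

Two adjacent stop codons produce an empty fragment, so the raw fragment count overstates the number of real chains; the length filter removes these along with the very short ones.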

Protein analysis with the ProtParam module in Biopython

poi_list = []
MW_list = []

for record in Proteins[:]:
    print("\n")
    X = ProtParam.ProteinAnalysis(str(record))
    POI = X.count_amino_acids()
    poi_list.append(POI)
    MW = X.molecular_weight()
    MW_list.append(MW)
    print("Protein of Interest = ", POI)
    try:
        print("Amino acids percent = ",
              str(X.get_amino_acids_percent()))
    except ZeroDivisionError:
        pass
    print("Molecular weight = ", MW)
    try:
        print("Aromaticity = ", X.aromaticity())
    except ZeroDivisionError:
        pass
    print("Flexibility = ", X.flexibility())
    try:
        print("Secondary structure fraction = ",
              X.secondary_structure_fraction())
    except ZeroDivisionError:
        pass
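
The ZeroDivisionError handlers in the loop above are there because splitting at stop codons can produce empty fragments. The sketch below shows the idea with two simplified, hypothetical helpers: amino acid counts, and aromaticity taken as the fraction of Phe, Trp and Tyr residues (the definition Biopython documents for aromaticity()). These are illustrations, not ProtParam's code.

```python
from collections import Counter

AROMATIC = "FWY"  # phenylalanine, tryptophan, tyrosine


def count_amino_acids(protein):
    """Return a mapping from one-letter amino acid code to its count."""
    return dict(Counter(protein))


def aromaticity(protein):
    """Fraction of aromatic residues; divides by the sequence length."""
    return sum(protein.count(aa) for aa in AROMATIC) / len(protein)


counts = count_amino_acids("IKGLYLPR")
print(counts["L"])               # 2
print(aromaticity("IKGLYLPR"))   # 0.125 (one Y in eight residues)

# An empty fragment has length zero, so the ratio is undefined.
try:
    empty_result = aromaticity("")
except ZeroDivisionError:
    empty_result = None
print(empty_result)              # None
```

An alternative to catching the exception is to skip empty fragments before analysis, for example by looping over only the fragments that pass the length filter.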

Because the code above produces output for all 775 proteins, only one of the output screens is attached.
