Lecture 01
Learning Outcomes
After reading this, students should be able to:
Pursue deeper study in specialist subjects.
Demonstrate analytical and problem-solving skills, enabling them to apply their knowledge in a wide variety of situations.
symbols. Similarly, the first time scientists saw gene and protein sequences, they saw a string of symbols with no clear meaning in terms of biological function. But now bioinformatics is showing us many things about what sequences mean. Using bioinformatics, sequences are being used to reveal relationships among different life forms that we could not find out any other way. Bioinformatics is revealing the rules and meaning of a language that is new to human beings but is in fact billions of years old: the Language of Life.
Bioinformatics is an important part of modern biology because it gives scientists useful, powerful ways to look at their data. It's one thing to have several DNA sequences from different organisms written down on a piece of paper, but it's quite another to have those sequences available in computer databases and to be able to use computers to compare how similar those sequences are, investigate what functions the DNA sequences might have, and so on. Another important point is that the number of available DNA sequences is growing exponentially, so bioinformatics work is becoming
There are several definitions of bioinformatics; here I give my own definition.
2A.501
annotation, storage, analysis, and searching/retrieval of nucleic acid sequence (genes and RNAs), protein sequence and structural information. This includes databases of the sequences and structural information as well as methods to access, search, visualize and retrieve the information. Functional genomics, biomolecular structure, proteome analysis, cell metabolism, biodiversity, downstream processing in chemical engineering, and drug and vaccine design are some of the areas in which bioinformatics is an integral component.
Bioinformatics concerns the creation and maintenance of databases of biological information whereby researchers can both access existing information and submit new entries.
Analysis
The most pressing tasks in bioinformatics involve the analysis of sequence information. Computational biology is the name given to this process, and it involves the following:
Sub-disciplines within Bioinformatics
There are three important sub-disciplines within bioinformatics involving computational biology:
The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information.
The Loose Definition
The creation of databases of biological information
The maintenance of these databases
Today, we are sequencing tens of millions of bases a year and undertaking to sequence whole organism genomes. The growth of the sequence databases is an unbroken exponential.
There are other fields, for example medical imaging and image analysis, which might be considered part of bioinformatics. There is also a whole other discipline of biologically inspired computation: genetic algorithms, artificial intelligence, neural networks. Often these areas interact in strange ways. Neural networks, inspired by crude models of the functioning of nerve cells in the brain, are used in a program called PHD to predict, surprisingly accurately, the secondary structures of proteins from their primary sequences. What almost all bioinformatics has in common is the processing of large amounts of biologically derived information, whether DNA sequences or breast X-rays.
We should not think all biological computing is bioinformatics; for example, mathematical modeling is not bioinformatics, even when connected with biology-related problems. In my opinion, bioinformatics has to do with the management and subsequent use of biological information, particularly genetic information.
Even though the three terms bioinformatics, computational biology and bioinformation infrastructure are often used interchangeably, broadly the three may be defined as follows:
1. Bioinformatics: refers to database-like activities, involving persistent sets of data that are maintained in a consistent state over essentially indefinite periods of time;
2. Computational biology: encompasses the use of algorithmic tools to facilitate biological analyses; while
3. Bioinformation infrastructure: comprises the entire collective of information management systems, analysis tools and communication networks supporting biology.
Thus, the latter may be viewed as a computational scaffold of the former two.
We can define bioinformatics as the study of information content and information flow in biological systems and processes. It has evolved to serve as the bridge between observations (data) in diverse biologically related disciplines and the derivation of understanding (information) about how the systems or processes function, and subsequently the application (knowledge). A more pragmatic definition in the case of diseases is the understanding of dysfunction (diagnostics) and the subsequent application of that knowledge to therapeutics and prognosis.
characterize the genetic code of genes, the proteins linked to each gene and their associated functions.
Definitions Bioinformatics
advanced computing techniques. Bioinformatics is particularly important as an adjunct to genomics research, because of the large amount of complex data this research generates.
DNA Chips and Array Analyses
Medical applications: Genetic Disease ... SNPs
Information about human and other animal genes and The management and analysis of data from biological
research.
The greatest challenge facing the molecular biology community today is to make sense of the wealth of data that has been produced by the genome sequencing projects. Traditionally, molecular biology research was carried out entirely at the experimental laboratory bench, but the huge increase in the scale of data being produced in this genomic era has created a need to incorporate computers into the research process. Sequence generation, and its subsequent storage, interpretation and analysis, are entirely computer-dependent tasks. However, the molecular biology of an organism is a very complex issue, with research being carried out at different levels including the genome, proteome, transcriptome and metabolome levels. Following on from the explosion in the volume of genomic data, similar increases in data have been observed in the fields of proteomics, transcriptomics and metabolomics.
The first challenge facing the bioinformatics community today is the intelligent and efficient storage of this mass of data. It is then their responsibility to provide easy and reliable access to this data. The data itself is meaningless before analysis, and the sheer volume present makes it impossible for even a trained biologist to begin to interpret it manually. Therefore, incisive computer tools must be developed to allow the extraction of meaningful biological information.
There are three central biological processes around which bioinformatics tools must be developed:
DNA sequence determines protein sequence
Protein sequence determines protein structure
Protein structure determines protein function
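The first of these relationships, DNA sequence determining protein sequence, can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the function name translate is hypothetical, the standard genetic code is assumed, the input is treated as a coding strand read from its first base, and translation stops at the first stop codon. Real tools also handle reading frames, ambiguous bases and alternative genetic codes.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G at each position (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Hypothetical helper: reads from the first base, three bases at a time,
    and stops at the first stop codon ('*'). Bases other than A, C, G, T
    raise a KeyError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))  # ATG GCC TGG TAA -> MAW
```

The other two relationships, from sequence to structure and from structure to function, have no such simple lookup table, which is why they remain active areas of bioinformatics research.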
The integration of information learned about these key biological processes should allow us to achieve the long-term goal of a complete understanding of the biology of organisms. We will now look at a few challenges of bioinformatics, which show why the field is important.
Challenges of Bioinformatics
Precise, predictive model of transcription initiation and termination: ability to predict where and when transcription will occur in a genome
splicing: ability to predict the splicing pattern of any primary transcript in any tissue
Precise, quantitative models of signal transduction pathways:
curricula for secondary, undergraduate and graduate education. This challenge is very important in India because bioinformatics is a mix of biology and informatics, and as of now we do not have a full-fledged course that covers both biology and information technology. We need to introduce such courses from the school level, as bioinformatics requires a person who is strong in both biology and computer science.
History of Bioinformatics
Bioinformatics encompasses the use of tools and techniques from three separate disciplines: molecular biology (the source of the data to be analyzed), computer science (which supplies the hardware for running analyses and the networks to communicate the results), and the data analysis algorithms, which strictly define bioinformatics. For this reason, the editors have decided to incorporate events from these areas into a brief history of the field.
1933: A new technique, electrophoresis, is introduced by Tiselius for separating proteins in solution.
1951: Pauling and Corey propose the structure for the alpha helix and beta-sheet (Proc. Natl. Acad. Sci. USA, 37: 205-211, 1951; Proc. Natl. Acad. Sci. USA, 37: 729-740, 1951).
1953: Watson and Crick propose the double helix model for DNA based on X-ray data obtained by Franklin and Wilkins (Nature, 171: 737-738, 1953).
1954: Perutz's group develops heavy atom methods to solve the phase problem in protein crystallography.
1955: The sequence of the first protein to be analyzed, bovine insulin, is announced by F. Sanger.
1958: The first integrated circuit is constructed by Jack Kilby at Texas Instruments. The Advanced Research Projects Agency (ARPA) is formed in the US.
1968: Packet-switching network protocols are presented to ARPA.
1969: The ARPANET is created by linking computers at Stanford, UCSB, the University of Utah and UCLA.
1970: The details of the Needleman-Wunsch algorithm for sequence comparison are published.
1971: Ray Tomlinson (BBN) invents the email program.
1972: The first recombinant DNA molecule is created by Paul Berg and his group.
1973: The Brookhaven Protein Data Bank is announced (Acta Cryst. B, 1973, 29: 1746). Robert Metcalfe receives his Ph.D. from Harvard University. His thesis describes Ethernet.
1974: Vint Cerf and Robert Kahn develop the concept of connecting networks of computers into an internet and develop the Transmission Control Protocol (TCP). Charles Goldfarb invents SGML (Standard Generalized Markup Language).
1975: Microsoft Corporation is founded by Bill Gates and Paul Allen. Two-dimensional electrophoresis, where separation of proteins on SDS polyacrylamide gel is combined with separation according to isoelectric points, is announced by P. H. O'Farrell (J. Biol. Chem., 250: 4007-4021, 1975). E. M. Southern publishes the experimental details for the Southern blot technique for detecting specific sequences of DNA (J. Mol. Biol., 98: 503-517, 1975).
1976: The Unix-To-Unix Copy Protocol (UUCP) is developed at Bell Labs.
1977: The full description of the Brookhaven PDB (http://www.pdb.bnl.gov) is published (Bernstein, F.C.; Koetzle, T.F.; Williams, G.J.B.; Meyer, E.F.; Brice, M.D.; Rodgers, J.R.; Kennard, O.; Shimanouchi, T.; Tasumi, M.J.; J. Mol. Biol., 1977, 112: 535). Allan Maxam and Walter Gilbert (Harvard) and Frederick Sanger (U.K. Medical Research Council) report methods for sequencing DNA.
1978: The first Usenet connection is established between Duke and the University of North Carolina at Chapel Hill by Tom Truscott, Jim Ellis and Steve Bellovin.
1980: The first complete gene sequence for an organism (ΦX174) is published. The gene consists of 5,386 base pairs which code for nine proteins. Wüthrich et al. publish a paper detailing the use of multidimensional NMR for protein structure determination (Kumar, A.; Ernst, R.R.; Wüthrich, K.; Biochem. Biophys. Res. Comm., 1980, 95: 1). IntelliGenetics, Inc. is founded in California. Their primary product is the IntelliGenetics Suite of programs for DNA and protein sequence analysis.
1981: The Smith-Waterman algorithm for sequence alignment is published. IBM introduces its Personal Computer to the market.
1982: Genetics Computer Group (GCG) is created as a part of the University of Wisconsin Biotechnology Center. The company's primary product is the Wisconsin Suite of molecular biology tools.
1983: The Compact Disk (CD) is launched. Name servers are developed at the University of Wisconsin.
1984: Jon Postel's Domain Name System (DNS) is placed online. The Macintosh is announced by Apple Computer.
1985: The FASTP algorithm is published. The PCR reaction is described by Kary Mullis and co-workers.
1986: The term Genomics appears for the first time to describe the scientific discipline of mapping, sequencing, and analyzing genes. The term was coined by Thomas Roderick as a name for the new journal. Amoco Technology Corporation acquires IntelliGenetics. NSFnet debuts. The SWISS-PROT database is created by the Department of Medical Biochemistry of the University of Geneva and the European Molecular Biology Laboratory (EMBL).
1987: The use of yeast artificial chromosomes (YAC) is described (David T. Burke, et al., Science, 236: 806-812). The physical map of E. coli is published (Y. Kohara, et al., Cell 51: 319-337). Perl (Practical Extraction and Report Language) is released by Larry Wall.
1988: The National Center for Biotechnology Information (NCBI) is established at the National Cancer Institute. The Human Genome Initiative is started (Commission on Life Sciences, National Research Council. Mapping and Sequencing the Human Genome, National Academy Press: Washington, D.C., 1988). The FASTA algorithm for sequence comparison is published by Pearson and Lipman. A new program, an Internet computer virus designed by a student, infects 6,000 military computers in the US.
1989: The Genetics Computer Group (GCG) becomes a private company. Oxford Molecular Group, Ltd. (OMG) is founded in Oxford, UK by Anthony Marchington, David Ricketts, James Hiddleston, Anthony Rees, and W. Graham Richards. Primary products: Anaconda, Asp, Cameleon and others (molecular modeling, drug design, protein design).
1990: The BLAST program (Altschul, et al.) is implemented. Molecular Applications Group is founded in California by Michael Levitt and Chris Lee. Their primary products are Look and SegMod, which are used for molecular modeling and protein design. InforMax is founded in Bethesda, MD. The company's products address sequence analysis, database and data management, searching, publication graphics, clone construction, mapping and primer design. The HTTP 1.0 specification is published. Tim Berners-Lee publishes the first HTML document.
1991: The research institute in Geneva (CERN) announces the creation of the protocols which make up the World Wide Web. Linus Torvalds announces a Unix-like operating system which later becomes Linux. The creation and use of expressed sequence tags (ESTs) is described (J. Craig Venter, et al., Science, 252: 1651-1656). Incyte Pharmaceuticals, a genomics company headquartered in Palo Alto, California, is formed. Myriad Genetics, Inc. is founded in Utah. The company's goal is to lead in the discovery of major common human disease genes and their related pathways. The company has discovered and sequenced, with its academic collaborators, the following major genes: BRCA1, BRCA2, CHD1, MMAC1, MMSC1, MMSC2, CtIP, p16, p19, and MTS2.
1992: Human Genome Sciences, Gaithersburg, Maryland, is formed by William Haseltine. The Institute for Genomic Research (TIGR) is established by Craig Venter. Genome Therapeutics announces its incorporation. Mel Simon and coworkers announce the use of BACs for cloning.
1993: CuraGen Corporation is formed in New Haven, CT. Affymetrix begins independent operations in Santa Clara, California. Compugen begins operations in Israel. InterNIC is created by the National Science Foundation.
1994: Netscape Communications Corporation is founded and releases Navigator, the commercial version of NCSA's Mosaic. Gene Logic is formed in Maryland. The PRINTS database of protein motifs is published by Attwood and Beck. Oxford Molecular Group acquires IntelliGenetics.
1995: Microsoft releases version 1.0 of Internet Explorer. Sun releases version 1.0 of Java. Sun and Netscape release version 1.0 of JavaScript. Version 1.0 of Apache is released. The Haemophilus influenzae genome (1.8 Mb) is sequenced. The Mycoplasma genitalium genome is sequenced.
1996: The working draft for XML is released by W3C. Oxford Molecular Group acquires the MacVector product from Eastman Kodak. The genome for Saccharomyces cerevisiae (baker's yeast, 12.1 Mb) is sequenced. The Prosite database is reported by Bairoch, et al. Affymetrix produces the first commercial DNA chips. Structural Bioinformatics, Inc. is founded in San Diego, CA.
1997: The genome for E. coli (4.7 Mbp) is published. Oxford Molecular Group acquires the Genetics Computer Group. LION bioscience AG is founded as an integrated genomics company with a strong focus on bioinformatics. The company is built from IP out of the European Molecular Biology Laboratory (EMBL), the European Bioinformatics Institute (EBI), the German Cancer Research Center (DKFZ), and the University of Heidelberg. Paradigm Genetics Inc., a company focused on the application of genomic technologies to enhance worldwide food and fiber production, is founded in Research Triangle Park, NC. deCode genetics publishes a paper that describes the location of the FET1 gene, which is responsible for familial essential tremor, on chromosome 13 (Nature Genetics).
1998: The genomes for Caenorhabditis elegans and baker's yeast are published. The Swiss Institute of Bioinformatics is established as a nonprofit foundation. Craig Venter forms Celera in Rockville, Maryland. PE Informatics is formed as a Center of Excellence within PE Biosystems. This center brings together and leverages the complementary expertise of PE Nelson and Molecular Informatics, to further complement the genetic instrumentation expertise of Applied Biosystems. Inpharmatica, a new genomics and bioinformatics company, is established by University College London, the Wolfson Institute for Biomedical Research, five leading scientists from major British academic centers and Unibio Limited. GeneFormatics, a company dedicated to the analysis and prediction of protein structure and function, is formed in San Diego. Molecular Simulations Inc. is acquired by Pharmacopeia.
1999: deCode genetics maps the gene linked to pre-eclampsia as a locus on chromosome 2p13.
2000: The genome for Pseudomonas aeruginosa (6.3 Mbp) is published. The A. thaliana genome (100 Mb) is sequenced. The D. melanogaster genome (180 Mb) is sequenced. Pharmacopeia acquires Oxford Molecular Group.
2001: The human genome (3,000 Mbp) is published.
2002: Structural Bioinformatics and GeneFormatics merge.
2004: The draft genome sequence of the Brown Norway laboratory rat, Rattus norvegicus, is completed by the Rat Genome Sequencing Project Consortium. The paper appears in the April 1 edition of Nature.
Scope of Bioinformatics
Bioinformatics has evolved into a full-fledged scientific discipline over the last decade. The definition of bioinformatics is not restricted to computational molecular biology and computational structural biology. Bioinformatics uses advances in computer science, information science, computer and information technology, and communication technology to solve complex problems in the life sciences, such as comparative genomics, structural genomics, transcriptomics, proteomics, cellomics and metabolic pathway engineering, and particularly in biotechnology. Developments in these fields have direct implications for healthcare, medicine, discovery of next-generation drugs, development of agricultural products, renewable energy, environmental protection, etc.
Data capture, data warehousing and data mining have become major issues for biotechnologists and biological scientists due to the sudden growth in quantitative data in biology, such as complete genomes of biological species including the human genome, protein sequences, protein 3-D structures, metabolic pathway databases, cell line and hybridoma information, and biodiversity-related information. Advancements in information technology, particularly the Internet, are being used to gather and access ever-increasing information in biology and biotechnology. Functional genomics, proteomics, discovery of new drugs and vaccines, molecular diagnostic kits and pharmacogenomics are some of the areas in which bioinformatics has become an integral part of research and development. Knowledge of multimedia databases, and of tools to carry out data analysis and modeling of molecules and biological systems on computer workstations as well as in a network environment, has become essential for any student of bioinformatics.
Bioinformatics, a multidisciplinary area, has grown so much that it is now divided into molecular bioinformatics, organal bioinformatics and species bioinformatics. Issues related to biodiversity and the environment, cloning of higher animals such as Dolly and Polly, and tissue culture and cloning of plants have shown that bioinformatics is not only a support branch of science but also a subject that directs the future course of research in biotechnology and the life sciences.
The importance and usefulness of bioinformatics has been recognized in the last few years by many industries. Large bioinformatics R&D divisions are therefore being established in many pharmaceutical companies, biotechnology companies and even other conventional industries dealing with biologicals. Bioinformatics is thus rated as the number one career in the field of biosciences. The need for trained manpower in this area is rising sharply, but there are very few institutions in the world where such training is provided.
In short, bioinformatics deals with database creation, data analysis and modeling. Data is captured not only from printed material but also from network resources. Databases in biology are generally in multimedia form, organized in a relational database model. Modeling is done not only on single biological molecules but also on multiple systems, thus requiring the use of high-performance computing systems.
Potential of Bioinformatics
The potential of bioinformatics in the identification of useful genes, leading to the development of new gene products, drug discovery and drug development, has led to a paradigm shift in biology and biotechnology: these fields are becoming more and more computationally intensive. The new paradigm now emerging is that all the genes will be known, in the sense of being resident in databases available electronically, and that the starting point of a biological investigation will be theoretical. A scientist will begin with a theoretical conjecture and only then turn to experiment to follow or test the hypothesis. With a much deeper understanding of biological processes at the molecular level, bioinformatics scientists have developed new techniques to analyze genes on an industrial scale, resulting in a new area of science known as genomics.
The shift from gene biology has resulted in the development of strategies, from lab techniques to computer programs, to analyze whole batches of genes at once. Genomics is revolutionizing drug development, gene therapy, and our entire approach to health care and human medicine. Genomic discoveries are being translated into practical biomedical results through bioinformatics applications. Work on proteomics and genomics will continue using highly sophisticated software tools and data networks that can carry multimedia databases. Thus, research will focus on the development of multimedia databases in various areas of the life sciences and biotechnology. There will be an urgent need for the development of software tools for data mining, analysis and modeling, and downstream processing. Security of data, data transfer and data compression, and automatic checks on data accuracy and correctness will also be major research areas of bioinformatics. The use of virtual reality in drug design, metabolic pathway design, and unicellular organism design, paving the way to the design and modification of unicellular organisms, will be among the challenges that bioinformatics scientists and specialists have to tackle. It has now been universally recognized that bioinformatics is the key to the new, data-intensive molecular biology that will take us into the 21st century.
Some questions will certainly arise in your mind, and they are best clarified here. I have tried to anticipate those questions and answer them.
How does bioinformatics help biologists?
Biology in the 21st century is being transformed from a purely lab-based science into an information science as well. Major advances in the field of molecular biology over the past few years, including the ever-growing genomic data, have led to a large amount of biological information which is difficult for the scientific community to decipher. Bioinformatics is all about retrieving, organizing, analyzing and storing data, which requires the help of computational methods. It delivers easy access to information and provides a method for extracting only the information that is specifically asked for by the biologist. The field of bioinformatics has therefore evolved such that the most pressing task involves the analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures. This ultimately helps biologists to obtain a comprehensive picture of cellular activities and thus base their research on how these are altered under various conditions.
Does bioinformatics pertain only to data mining?
No, it does not. It involves much more intensive research and analysis of huge volumes of data that would otherwise be unmanageable. A bioinformatician is usually involved in areas such as biological tool development using neural networks and genetic algorithms; comparative genomics, functional genomics and structural genomics; database development and management; integration of various fields of the life sciences to develop systems biology tools; phylogenetic analysis; proteomic studies; and the like.
What are the areas of research in bioinformatics?
Research in bioinformatics includes the development and implementation of tools that enable efficient access to various types of information, which should be usable as well as manageable. Another area of research is the development of algorithms for predicting different kinds of biological data, such as genes, protein functions, protein structures and domains, and for assessing the relationships within large data sets. It is now very common for a scientist to conduct vast numbers of database searches to formulate hypotheses and to design large-scale experiments. The areas of genomics and proteomics have come a long way due to inputs from bioinformatics analysis.
I close this lesson by giving a few definitions of the terminology we came across.
Language of Science
Genome: All the genetic material in the chromosomes of a particular organism; its size is generally given as the total number of base pairs. One complete set of genes in an organism (a haploid set). Except for occasional unrepaired damage to its DNA (mutations), the genome is fixed.
Proteome: All of the proteins produced by a given species, just as the genome is the totality of the genetic information possessed by that species. The complete profile of proteins expressed in a given tissue, cell or biological system at a given time. Two popular definitions:
All the proteins that can be synthesized by the cell. (The original definition.)
All the proteins synthesized by a particular cell at a particular time.
Transcriptome: The entire set of messenger RNA expressed while building, running and maintaining an organism. The transcriptome is all the mRNA transcribed from genes within a given genome.
Metabolome: All the metabolic machinery (e.g., enzymes, coenzymes, small metabolites such as the intermediates in glycolysis and cellular respiration, and nucleotides) present in a cell at a given time. It varies with the differentiated state of the cell and its current activities.
Genomics: The comprehensive study of whole sets of genes and their interactions rather than single genes or proteins.
Proteomics: The study of the full set of proteins encoded by a genome.
Transcriptomics: The generation and study of complete mRNA expression profiles.
Metabolomics: The computing of emergent properties of biological systems, such as development and biological clocks, and the inference of kinetic models of DNA, RNA, and proteins.
Data Mining: An information extraction activity whose goal is to discover hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis.
Phylogeny: Evolutionary relationships within and between taxonomic levels, particularly the patterns of lines of descent.
Phylogenetics: The taxonomical classification of organisms based on their degree of evolutionary relatedness.
Phylogenetic tree: A variety of dendrogram (diagram) in which organisms are shown arranged on branches that link them according to their relatedness and evolutionary descent.
Dendrogram: A tree-like diagram that summarizes the process of clustering. Similar cases are joined by links whose position in the diagram is determined by the level of similarity between the cases.
Protein Domains: In the light of structural and biochemical evidence which has accumulated over recent years, it has become increasingly clear that the traditional view that polypeptide = protein is inadequate to describe some naturally occurring polypeptides. In particular, it can be shown that different regions along a single polypeptide chain can act as independent units, to the extent that they can be excised from the chain and still be shown to fold correctly and often still exhibit biological activity. These independent regions are termed domains.
Protein Motifs: A conserved element of a protein sequence alignment that usually correlates with a particular function. Motifs are generated from a local multiple protein sequence alignment corresponding to a region whose function or structure is known. It is sufficient that the region is conserved, and hence likely to be predictive of any subsequent occurrence of such a structural or functional region in any other novel protein sequence.
mRNA: Messenger RNA.
Mutagen: An agent that increases the rate of mutations in an organism.
Mutation: An inheritable change of a gene, which includes genetic (point or single-base) changes from one allelic form to another, or larger-scale alterations such as chromosomal deletions or rearrangements.
Algorithms: A finite set of step-by-step instructions for a problem-solving or computation procedure, especially one that can be implemented by a computer.
Pharmacogenomics: The science of understanding the correlation between an individual patient's genetic make-up (genotype) and their response to drug treatment. Some drugs work well in some patient populations and not as well in others. Studying the genetic basis of patient response to therapeutics allows drug developers to design therapeutic treatments more effectively.

Molecular Dynamics: The study of intramolecular conformations and molecular motions using computational simulations. The calculations simulate the motion of each atom in a molecular system at a fixed energy, at a fixed temperature, or with controlled temperature changes.

Biodiversity: The variety and variability among living organisms and the ecosystems in which they occur. Biodiversity includes the number of different items and their relative frequencies; these items are organized at many levels, ranging from complete ecosystems to the biochemical structures that are the molecular basis of heredity. Thus, biodiversity encompasses expressions of the relative abundances of different ecosystems, species, and genes. In other words, biodiversity, or biological diversity, is the term for the variety of life and the natural processes of which living things are a part. This includes the living organisms, the genetic differences between them, and the communities in which they occur. The concept of biodiversity represents the ways that life is organized and interacts on our planet. These interactions can take place on scales ranging from the smallest, at the chromosome level, to organisms, ecosystems, and even entire landscapes.

SNP: Single Nucleotide Polymorphism. When the same sequence from two individuals is compared, there are often single base pair changes. These can be useful genetic markers.

EMBL: European Molecular Biology Laboratory. The EMBL Nucleotide Sequence Database is a comprehensive database of DNA and RNA sequences. The database is produced in collaboration with GenBank and the DNA Data Bank of Japan (DDBJ).

DNA chips / DNA Microarrays: This technology promises to monitor the whole genome on a single chip, so that researchers can have a better picture of the interactions among thousands of genes simultaneously. Arrays can also be made on standard blotting membranes, and can be created by hand or with robotics that deposit the samples. In general, arrays are described as macroarrays or microarrays, the difference being the size of the sample spots.

Forensic: The branch of science that employs scientific technology to assist in the determination of facts in courts of law.
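The SNP entry above describes comparing the same sequence from two individuals and looking for single base changes. A minimal sketch of that comparison follows, assuming the two sequences are already aligned and of equal length; the function name and the example sequences are hypothetical.

```python
def find_single_base_differences(seq_a, seq_b):
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        # This sketch does no alignment, so unequal lengths are rejected.
        raise ValueError("sequences must be aligned and of equal length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical sequences from two individuals, invented for this example.
individual_1 = "ATGCCGTA"
individual_2 = "ATGACGTA"

for position, base_1, base_2 in find_single_base_differences(individual_1, individual_2):
    print(position, base_1, base_2)
```

Real SNP calling works on aligned sequencing reads and has to account for sequencing errors and insertions or deletions; this sketch only shows the position-by-position comparison.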
RNA Splicing / Alternative Splicing: Alternative RNA splicing operates in multicellular organisms to generate rich proteomic diversity and to regulate the appearance of tissue-specific mRNA transcripts. Yet there is limited understanding of these complex mechanisms and of how they respond to physiological inputs and developmental cues.

Ab Initio: A Latin phrase meaning "from the beginning". It refers to molecular orbital calculations that use all the molecular orbitals in a calculation, not just the valence electron orbitals.

Absolute configuration: The way, in three-dimensional space, in which four different substituents are arranged around a chiral carbon. It can only be determined directly by X-ray crystallography. However, other compounds that can be related to one of known configuration, by syntheses in which there are no changes at a chiral centre, can also be assigned an absolute configuration.

Ontology: Ontology is the study of being, and it encompasses everything involved with the beings within humans, the process of becoming our beings fully, and the relationships between degrees of being and the ontological worlds they create. Ontological refers to anything that has to do with the real self. For example, ontologically sensitive people are sensitive to the real selves within themselves and within others.

Speciation: The process by which one or more populations of a species become genetically different enough to form a new species. The process often requires populations to be isolated for a long period of time.

Computer Science: The systematic study of computing systems and computation. The body of knowledge resulting from this discipline contains theories for understanding computing systems and methods; design methodology, algorithms, and tools; methods for the testing of concepts; methods of analysis and verification; and knowledge representation and implementation. Or, briefly, the study of the implementation, organization, and application of computer software and hardware resources.
Information Science: Pure and applied science involving the collection, organization, and management of information.

Information Technology: The acquisition, processing, storage, and dissemination of all types of information using computer technology and telecommunication systems.

Comparative Genomics: The study of human genetics by comparisons with model organisms such as mice, the fruit fly, and the bacterium E. coli. The comparison of genomes and of distinct individuals within a genome. Comparative genomics makes possible the application of information gained from a simple genome to a more complex genome, and is the basis for the understanding of genetic variation amo
Structural Genomics: The branch of genomics that determines the three-dimensional structures of proteins.

Cellomics: In the abstract, one could visualize the combination of different molecules in a particular cell at an instant of time as a cellular state. The set of all states that a particular cell could enter is known as the cellome; cellomics is the study of a particular cellome.

Metabolic Pathway Engineering: It offers
Higher yields and productivities
Fewer side-products
Good stereospecificity
Optimal conditions for biological reactions
Conversion of a cheap raw material to a high-value product
Novel products and processes
Environmentally friendly production methods
Biotechnology: The simplest definition of biotechnology is applied biology: the application of biological knowledge and techniques to develop products. It may be further defined as the use of living organisms to make a product or run a process. By this definition, the classic techniques used for plant and animal breeding, fermentation, and enzyme purification would be considered biotechnology. Some people use the term only to refer to the newer tools of genetic science. In this context, biotechnology may be defined as the use of biotechnical methods to modify the genetic material of living cells so that they will produce new substances or perform new functions. Examples include recombinant DNA technology, in which a copy of a piece of DNA containing one or a few genes is transferred between organisms or recombined within an organism.

Organal Bioinformatics: The collection, storage, retrieval, and analysis of data about organs. An organ is a part of the body that consists of different types of tissue and performs a particular function; examples include the kidneys, heart, and brain.

Data warehousing: A data warehouse is a collection of data gathered and organized so that it can easily be analyzed, extracted, synthesized, and otherwise used for the purposes of further understanding the data. It may be contrasted with data that is gathered to meet immediate business objectives, such as order and payment transactions, although this data would also usually become part of a data warehouse.

Sequence: This word can be used as a noun or as a verb. A sequence (noun) is a string of bases in DNA or a string of amino acids in a protein. To sequence (verb) means to carry out the experimental process of determining the order of the bases in a DNA fragment or the order of amino acids in a protein.
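Because a sequence (noun) is simply a string of bases, it can be held in a computer as an ordinary text string and analyzed with a few lines of code. The sketch below computes the GC content and the reverse complement of a DNA string; the function names and the example sequence are assumptions made for illustration.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(dna):
    """Fraction of bases in the sequence that are G or C (0.0 for an empty string)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def reverse_complement(dna):
    """Sequence of the opposite strand, read in its own 5' to 3' direction."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# Hypothetical DNA fragment, invented for this example.
fragment = "ATGC"
print(gc_content(fragment))
print(reverse_complement(fragment))
```

The sketch assumes the string contains only the four bases A, C, G, and T; ambiguity codes such as N are passed through unchanged by `reverse_complement` and are not counted by `gc_content`.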
Pathway: A system of proteins that work together. For example, a pathway could include protein A, which sends a signal to protein B, which sends a signal to protein C, and so on until a biological effect occurs. Both p53 and pRB are members of pathways in human cells.

Promoters: A promoter is a DNA site to which RNA polymerase binds to initiate transcription. It is a DNA sequence located in front of a gene that controls gene expression.

Repeat: The distance from the center of one motif or pattern to the center of the next.

Gene: The fundamental physical and functional unit of heredity. A gene is an ordered sequence of nucleotides located in a particular position on a particular chromosome that encodes a specific functional product (i.e., a protein or RNA molecule).

CDS: Coding sequence. This is the portion of an mRNA or genomic sequence that encodes a protein sequence.

Clustering: Clustering is the use of multiple computers and storage devices to create what seems to be a single system. Clustering is often used to increase a system's availability and for load balancing on highly trafficked Web sites.
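The CDS entry above says that a coding sequence encodes a protein sequence. As a rough sketch of what "encodes" means computationally, the code below reads a coding sequence three bases at a time and looks each codon up in the standard genetic code. The function name and the example sequence are hypothetical, and the sketch assumes the sequence starts in the correct reading frame.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a coding sequence into one-letter amino acids, stopping at a stop codon."""
    cds = cds.upper().replace("U", "T")  # accept mRNA (U) as well as DNA (T)
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # "X" for unrecognized codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical coding sequence, invented for this example.
print(translate_cds("ATGGCCAAATGA"))
```

Some organisms and organelles use variant genetic codes, so the table above should be read as the common default rather than a universal rule.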
based homology. Modeling studies are carried out to predict the 3D structure of the Egp of JE virus. The template used is the structure of the Egp of TBE virus. Conformational epitopes are predicted using this model of the Egp. Thus it has been shown that the use of Bioinformatics tools and techniques not only reduces the time required to identify the candidate peptide as a vaccine but also provides insight into the structure-function relationship of the virus protein.

Testing your knowledge: This section lets you test how much you have understood from this lesson. Try to answer the following questions.
1. What is Bioinformatics?
2. What is the scope of Bioinformatics?
3. What are the challenges of Bioinformatics?
4. How old is bioinformatics?
5. Are there any standards in Bioinformatics?
6. Can you give only one exact definition of Bioinformatics?
7. How can we tackle tasks in bioinformatics?
The scientist is not a person who gives the right answers; he is one who asks the right questions
Notes