Research Project - Literature Survey
A Report submitted by
MASTERS IN TECHNOLOGY
IN
ARTIFICIAL INTELLIGENCE
AT
This is to certify that the literature review entitled “Machine learning with Mice Protein
Expression Dataset” has been done by Mr. Devansh Parikh under my guidance and
supervision for the degree of Masters of Technology in Artificial Intelligence of MPSTME,
SVKM’s NMIMS (Deemed-to-be University), Mumbai, India.
_______________
Date
Place: Mumbai
1. Introduction
Down syndrome (DS) is the most common genetic cause of learning/memory deficits. It is
due to the presence of an extra copy of the long arm of human chromosome 21. The
overexpression of genes encoded by the extra copy of a normal chromosome in DS is
believed to be sufficient to perturb normal pathways and normal responses to stimulation,
causing learning and memory deficits.
The dataset chosen is the Mice Protein Expression dataset, from the UCI Machine Learning
repository.
In this project, we aim to do two things. The first is to correctly classify the type of
mouse based on the expression levels of 77 proteins; this is a supervised learning problem.
The second is an unsupervised learning problem: clustering the mice, which have been
treated differently (different classes), in order to assess the effect of the drug
memantine in recovering the ability to learn in trisomic mice. Some mice have been injected
with the drug and others have not.
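The supervised task can be illustrated with a minimal sketch, assuming a nearest-centroid classifier as a stand-in for whichever model is eventually chosen. The toy vectors, the two genotype labels and the function names `fit_centroids` and `predict` are hypothetical; the real data has 77 protein levels per sample and eight classes.

```python
from collections import defaultdict
from math import dist


def fit_centroids(samples, labels):
    """Return {label: mean expression vector} computed from the training samples."""
    grouped = defaultdict(list)
    for vector, label in zip(samples, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, vector):
    """Assign a sample to the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Toy data: three made-up protein levels per sample (illustrative values only).
train_x = [[0.2, 1.1, 0.5], [0.3, 1.0, 0.4], [0.9, 0.2, 1.5], [1.0, 0.3, 1.4]]
train_y = ["control", "control", "trisomic", "trisomic"]

centroids = fit_centroids(train_x, train_y)
prediction = predict(centroids, [0.25, 1.0, 0.5])
print(prediction)  # control
```

The same centroid idea carries over to the clustering task, where the class labels are withheld and the groups must be discovered from the expression levels alone.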
As mentioned before, Down syndrome (DS) is the most common genetic cause of
learning/memory deficits. It is due to an extra copy of the long arm of human chromosome 21
(Hsa21) and the consequent increased level of expression, due to dosage, of some subset of
the genes it encodes. Although no pharmacotherapies for learning deficits in DS are
available, because the incidence is high (one in 1000 live births worldwide), there is
considerable interest in their identification.
However due to improvements in machine learning
Dataset Description
The data set consists of the expression levels of 77 proteins/protein modifications that
produced detectable signals in the nuclear fraction of cortex. There are 38 control mice and
34 trisomic mice (Down syndrome), for a total of 72 mice. In the experiments, 15
measurements were registered of each protein per sample/mouse. Therefore, for control mice,
there are 38x15, or 570 measurements, and for trisomic mice, there are 34x15, or 510
measurements. The dataset contains a total of 1080 measurements per protein. Each
measurement can be considered as an independent sample/mouse.
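The counts above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones stated in the description.

```python
MEASUREMENTS_PER_MOUSE = 15
mice = {"control": 38, "trisomic": 34}

# Rows contributed by each genotype: number of mice times measurements per mouse.
rows = {group: count * MEASUREMENTS_PER_MOUSE for group, count in mice.items()}
total_mice = sum(mice.values())
total_rows = sum(rows.values())

print(rows)                    # {'control': 570, 'trisomic': 510}
print(total_mice, total_rows)  # 72 1080
```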
The eight classes of mice are defined by genotype, behavior and treatment. According to
genotype, mice can be control or trisomic. According to behavior, some mice have been
stimulated to learn (context-shock) and others have not (shock-context). To assess the
effect of the drug memantine in recovering the ability to learn in trisomic mice, some
mice have been injected with the drug and others have not.
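Classes:
c-CS-s: control mice, stimulated to learn, injected with saline (9 mice)
c-CS-m: control mice, stimulated to learn, injected with memantine (10 mice)
c-SC-s: control mice, not stimulated to learn, injected with saline (9 mice)
c-SC-m: control mice, not stimulated to learn, injected with memantine (10 mice)
t-CS-s: trisomic mice, stimulated to learn, injected with saline (7 mice)
t-CS-m: trisomic mice, stimulated to learn, injected with memantine (9 mice)
t-SC-s: trisomic mice, not stimulated to learn, injected with saline (9 mice)
t-SC-m: trisomic mice, not stimulated to learn, injected with memantine (9 mice)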
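Each class label encodes the three factors in a fixed order (genotype, behavior, treatment). The sketch below decodes a label into those factors. The names `MouseClass` and `parse_class` are hypothetical, and the `t` prefix for trisomic mice is assumed by analogy with the `c` prefix of the control classes.

```python
from typing import NamedTuple

GENOTYPE = {"c": "control", "t": "trisomic"}  # "t" is an assumed prefix
BEHAVIOR = {
    "CS": "stimulated to learn (context-shock)",
    "SC": "not stimulated to learn (shock-context)",
}
TREATMENT = {"s": "saline", "m": "memantine"}


class MouseClass(NamedTuple):
    genotype: str
    behavior: str
    treatment: str


def parse_class(label: str) -> MouseClass:
    """Split a label such as 'c-CS-m' into genotype, behavior and treatment."""
    genotype, behavior, treatment = label.split("-")
    return MouseClass(GENOTYPE[genotype], BEHAVIOR[behavior], TREATMENT[treatment])


# Mouse counts for the control classes, as listed above.
CONTROL_COUNTS = {"c-CS-s": 9, "c-CS-m": 10, "c-SC-s": 9, "c-SC-m": 10}

print(parse_class("c-CS-m"))
print(sum(CONTROL_COUNTS.values()))  # 38
```

The control counts sum to 38, which agrees with the number of control mice given in the dataset description.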
The aim is to identify subsets of proteins that are discriminant between the classes.
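One simple way to look for discriminant proteins is to score each protein by how far apart its class means are relative to the spread within each class. The sketch below uses a Fisher-style ratio for this. It is an illustrative stand-in rather than the method used in the surveyed papers, and the protein names and values are made up.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Ratio of between-class variance to within-class variance for one protein."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    overall = mean(values)
    n = len(values)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values()) / n
    within = sum(len(g) * pvariance(g) for g in groups.values()) / n
    return between / within if within > 0 else float("inf")


labels = ["c-CS-s"] * 3 + ["c-SC-s"] * 3
proteins = {
    "protein_A": [1.0, 1.1, 0.9, 2.0, 2.1, 1.9],  # class means clearly separated
    "protein_B": [1.0, 2.0, 1.5, 1.1, 1.9, 1.6],  # classes overlap heavily
}

scores = {name: fisher_score(values, labels) for name, values in proteins.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['protein_A', 'protein_B']
```

Proteins at the top of such a ranking are candidates for a discriminant subset. A per-protein score ignores interactions between proteins, so it is only a first pass.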
What is Down Syndrome?
In every cell in the human body there is a nucleus, where genetic material is stored in genes.
Genes carry the codes responsible for all of our inherited traits and are grouped along
rod-like structures called chromosomes. Typically, the nucleus of each cell contains 23 pairs
of chromosomes, half of which are inherited from each parent. Down syndrome occurs when
an individual has a full or partial extra copy of chromosome 21.
This additional genetic material alters the course of development and causes the
characteristics associated with Down syndrome. A few of the common physical traits of
Down syndrome are low muscle tone, small stature, an upward slant to the eyes, and a single
deep crease across the center of the palm – although each person with Down syndrome is a
unique individual and may possess these characteristics to different degrees, or not at all.
Memantine Drug
Saline
Saline, also known as saline solution, is a mixture of sodium chloride in water and has a
number of uses in medicine. Applied to the affected area it is used to clean wounds, help
remove contact lenses, and help with dry eyes. By injection into a vein it is used to treat
dehydration such as from gastroenteritis and diabetic ketoacidosis. It is also used to dilute
other medications to be given by injection.
2. Literature Review:
Summary of the papers I studied while working on this project.
4. Protein Dynamics Associated with Failed and Rescued Learning in the Ts65Dn Mouse
Model of Down Syndrome
Authors:
Md. Mahiuddin Ahmed, A. Ranjitha Dhanasekaran, Aaron Block, Suhong Tong, Alberto C.
S. Costa, Melissa Stasko, Katheleen J. Gardiner
Summary:
The study examines the effect of memantine on the Ts65Dn mouse model. These mice display
many features relevant to those seen in DS, including deficits in learning and memory (L/M)
tasks requiring a functional hippocampus. The N-methyl-D-aspartate (NMDA) receptor
antagonist memantine was shown to rescue performance of the Ts65Dn in several L/M tasks.
The results obtained are as follows:
(i) Of the dynamic responses seen in control mice in normal learning, >40% also occur in
Ts65Dn in failed learning or are compensated by baseline abnormalities, and thus are
considered necessary but not sufficient for successful learning.
(ii) Treatment with memantine does not in general normalize the initial protein levels but
instead induces direct and indirect responses in approximately half the proteins measured and
results in normalization of the endpoint protein levels.
5. Discovery and genetic localization of Down syndrome cerebellar phenotypes using the
Ts65Dn mouse
Authors: Laura L. Baxter, Timothy H. Moran, Joan T. Richtsmeier, Juan Troncoso, Roger H.
Reeves
Summary:
Down syndrome (DS) is the most common genetic cause of mental retardation and affects
many aspects of brain development. DS individuals exhibit an overall reduction in brain size
with a disproportionately greater reduction in cerebellar volume. The Ts65Dn mouse is
segmentally trisomic for the distal 12–15 Mb of mouse chromosome 16, a region that shows
perfect conserved linkage with human chromosome 21, and therefore provides a genetic
model for DS.
In this study, high resolution magnetic resonance imaging and histological analysis
demonstrate precise neuroanatomical parallels between the DS and the Ts65Dn cerebellum.
Cerebellar volume is significantly reduced in Ts65Dn mice due to reduction of both the
internal granule layer and the molecular layer of the cerebellum. Granule cell number is
further reduced by a decrease in cell density in the internal granule layer. Despite these
changes in Ts65Dn cerebellar structure, motor deficits have not been detected in several tests.
Reduction in granule cell density in Ts65Dn mice correctly predicts an analogous pathology
in humans; a significant reduction in granule cell density in the DS cerebellum is reported
here for the first time.
Results:
The candidate region of genes on chromosome 21 affecting cerebellar development in DS is
therefore delimited to the subset of genes whose orthologs are at dosage imbalance in
Ts65Dn mice, providing the first localization of genes affecting a neuroanatomical phenotype
in DS. The application of this model for analysis of developmental perturbations is extended
by the accurate prediction of DS cerebellar phenotypes.
7. Adversarial Autoencoders
Authors:
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey
Summary:
In this paper, they propose the "adversarial autoencoder" (AAE), which is a probabilistic
autoencoder that uses the recently proposed generative adversarial networks (GAN) to
perform variational inference by matching the aggregated posterior of the hidden code vector
of the autoencoder with an arbitrary prior distribution. Matching the aggregated posterior to
the prior ensures that generating from any part of prior space results in meaningful samples.
As a result, the decoder of the adversarial autoencoder learns a deep generative model that
maps the imposed prior to the data distribution. The authors show how the adversarial autoencoder
can be used in applications such as semi-supervised classification, disentangling style and
content of images, unsupervised clustering, dimensionality reduction and data visualization.
They performed experiments on the MNIST, Street View House Numbers and Toronto Face
datasets and showed that adversarial autoencoders achieve competitive results in generative
modeling and semi-supervised classification tasks.
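The "aggregated posterior" mentioned in the summary can be written out explicitly. The sketch below uses notation assumed for illustration: q(z|x) for the encoder, p_d(x) for the data distribution, p(z) for the imposed prior and D for the discriminator. The second expression is the standard GAN minimax objective applied to the code space, given as a sketch of the regularization step rather than a full statement of the training procedure.

```latex
% Aggregated posterior of the hidden code z over the data distribution
q(z) = \int_{x} q(z \mid x)\, p_d(x)\, dx

% Adversarial regularization (sketch): D separates prior samples from encoder codes,
% while the encoder is trained so that q(z) becomes indistinguishable from p(z)
\min_{q} \max_{D} \;
  \mathbb{E}_{z \sim p(z)}\big[\log D(z)\big]
  + \mathbb{E}_{z \sim q(z)}\big[\log\big(1 - D(z)\big)\big]
```

In addition to this adversarial term, the autoencoder is trained to minimize the usual reconstruction error, so the code both preserves the input and follows the prior.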
6. Unsupervised Learning Using Generative Adversarial Training And Clustering
Summary:
GANs are at the cutting edge of unsupervised learning, and along with a traditional method
like clustering, they are shown to give promising results.
Results:
New and improved algorithm for clustering.