Microarray Data Analysis Using Bioconductor
Alex Sánchez
Statistics and Bioinformatics Research Group
Departament d’Estadística. Universitat de Barcelona
April 12, 2007
Contents
1 Bioconductor Classes
1.1 Biobase
1.1.1 class phenoData
1.1.2 class MIAME
1.1.3 class exprSet
1.1.4 Computing on exprSet objects
1.1.5 Exercises
1.2 affy
1.2.1 class AffyBatch
1.2.2 Computing on AffyBatch
1.2.3 cdfenvs
1.2.4 Exercises
1 Bioconductor Classes
Object-oriented design provides a convenient way to represent data and the
actions that can be performed on them. A class can be thought of as a template:
a description of what constitutes each instance of the class. An instance of a
class is a realization of what the class describes. The attributes of a class
are its data components, and the methods of a class are functions, that is, the
actions an instance of the class is capable of (see footnote 1).
The R language implements these object concepts through the package methods.
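The vocabulary above (class, instance, attribute, method) can be illustrated with a minimal sketch that uses only the methods package. The class name Sample, its slots and the generic meanIntensity are hypothetical names invented for this illustration; they are not part of Biobase.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# A class is a template: it lists the slots (attributes) of every instance.
# "Sample" and its slots are made-up names for illustration only.
setClass("Sample",
         representation = representation(name = "character",
                                         intensity = "numeric"))

# An instance is a realization of the template.
s <- new("Sample", name = "array1", intensity = c(2.5, 3.5, 6.0))

# A method is a function attached to the class through a generic.
setGeneric("meanIntensity", function(object) standardGeneric("meanIntensity"))
setMethod("meanIntensity", "Sample", function(object) mean(object@intensity))

result <- meanIntensity(s)
print(result)
```

Slots are read with the @ operator, as in s@name, although in practice accessor functions are usually preferred over direct slot access.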
1.1 Biobase
The package Biobase contains basic structures for microarray data.
> library(Biobase)
> print(pData(my.targets))
                     age treat
Aged LPS 80L.CEL    Aged   LPS
Aged LPS 86L.CEL    Aged   LPS
Aged LPS 88L.CEL    Aged   LPS
Aged Medium 81m.CEL Aged   MED
Aged Medium 82m.CEL Aged   MED
Aged Medium 84m.CEL Aged   MED
Footnote 1: This lab is based on some of Laurent Gautier's excellent labs.
Experiment data
Experimenter name: LPS_Experiment
Laboratory: National Cancer Institute
Contact information: Lakshman Chelvaraja
Title: Molecular basis of age associated cytokine dysregulation in LPS stimulated macroph
URL: http://www.jleukbio.org/cgi/content/abstract/79/6/1314
PMIDs:
No abstract available.
> data(sample.exprSet.1)
> m <- exprs(sample.exprSet.1)[, c(1:3, 13:15)]
> colnames(m) <- NULL
> eset <- new("exprSet", exprs = m, phenoData = my.targets,
+ description = my.desc)
> eset
varLabels
age: read from file
treat: read from file
> description(eset)
Experiment data
Experimenter name: LPS_Experiment
Laboratory: National Cancer Institute
Contact information: Lakshman Chelvaraja
Title: Molecular basis of age associated cytokine dysregulation in LPS stimulated macroph
URL: http://www.jleukbio.org/cgi/content/abstract/79/6/1314
PMIDs:
No abstract available.
1.1.5 Exercises
1. Obtain an expression matrix from somewhere on the net and create a data
frame describing this dataset.
2. Create an instance of class ’exprSet’ using the matrix and the data frame
you created.
3. Export it to an MS Excel-friendly format.
4. Load the data package golubEsets. The exprSet to use is the golubTrain.
5. For each gene, compute the ratio
$r_{\mathrm{ALL.AML}} = \frac{\mathrm{mean}(\mathrm{ALL})}{\mathrm{mean}(\mathrm{AML})}$
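As a hint for exercise 5, the per-gene ratio of class means can be sketched on a small made-up matrix. The matrix m and the label vector cls below are invented for illustration; with golubTrain the expression values and class labels would have to be extracted from the exprSet first, which is left as part of the exercise.

```r
# Made-up expression matrix: 3 genes (rows) x 4 samples (columns).
m <- matrix(c(2, 4, 1, 1,
              6, 6, 3, 3,
              1, 3, 4, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("g1", "g2", "g3"), paste0("s", 1:4)))

# Made-up class labels, one per sample (column).
cls <- c("ALL", "ALL", "AML", "AML")

# Row-wise mean over the ALL columns divided by row-wise mean over the
# AML columns; drop = FALSE keeps a matrix even if a class has one sample.
r <- rowMeans(m[, cls == "ALL", drop = FALSE]) /
     rowMeans(m[, cls == "AML", drop = FALSE])
print(r)
```

Using rowMeans avoids an explicit loop over genes. Whether the ratio should be taken on the raw or the log scale depends on how the data were preprocessed.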
1.2 affy
1.2.1 class AffyBatch
The class AffyBatch extends the class exprSet. This means that it inherits
the characteristics of its ancestor. It therefore also consists of:
> library(affy)
> data(affybatch.example)
> affybatch.example
AffyBatch object
size of arrays=100x100 features (240 kb)
cdf=cdfenv.example (150 affyids)
number of samples=3
number of genes=150
annotation=
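What "extends" means in practice can be sketched with two toy classes built with the methods package. The names BaseSet, ChipBatch and nSamples are hypothetical stand-ins chosen for this illustration; they are not the actual definitions of exprSet or AffyBatch.

```r
library(methods)

# Hypothetical parent class: a matrix of values plus a free-text note.
setClass("BaseSet",
         representation = representation(values = "matrix",
                                         note = "character"))

# Hypothetical child class: 'contains' makes it inherit every slot of
# BaseSet, and it adds one slot of its own.
setClass("ChipBatch", contains = "BaseSet",
         representation = representation(cdfName = "character"))

# A method defined for the parent class only.
setGeneric("nSamples", function(object) standardGeneric("nSamples"))
setMethod("nSamples", "BaseSet", function(object) ncol(object@values))

batch <- new("ChipBatch",
             values = matrix(1:6, nrow = 2),
             note = "toy data",
             cdfName = "toy_cdf")

is_base <- is(batch, "BaseSet")   # an instance of the child is also a BaseSet
n <- nSamples(batch)              # the parent's method is inherited
slots <- slotNames("ChipBatch")   # own slot plus the inherited ones
```

The same mechanism is the reason functions written for the ancestor class can be applied to an instance of the descendant class.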
eset <- computeExprSet(affybatch.example, pmcorrect.method="pmonly", summary.method="m
expresso The function expresso is a wrapper around the processing methods
bgcorrect, normalize and computeExprSet applied in sequence.
eset <- expresso(affybatch.example, widget=TRUE)
mas, rma These functions are wrappers around expresso. They define
popular/standard settings for pre-processing.
1.2.3 cdfenvs
The cdfenvs are objects of class environment. They contain associative mappings
between probe set identifiers and the indices of the rows in the matrix that
contains the probe intensities.
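The idea of an environment used as an associative mapping can be sketched in base R. Everything below is a toy stand-in: the probe set names, the index values and the two-column pm/mm layout of each entry are assumptions made for illustration, not the contents of a real cdfenv.

```r
# Toy stand-in for a cdfenv: an environment used as a lookup table.
toy_cdf <- new.env(hash = TRUE)

# Each (invented) probe set identifier maps to row indices in the
# intensity matrix.
assign("probeset_A", cbind(pm = c(11, 27, 43), mm = c(12, 28, 44)),
       envir = toy_cdf)
assign("probeset_B", cbind(pm = c(5, 19), mm = c(6, 20)),
       envir = toy_cdf)

ids <- sort(ls(toy_cdf))                   # all identifiers in the mapping
idx <- get("probeset_A", envir = toy_cdf)  # look up one identifier

# Made-up intensity matrix: 50 probes (rows), 1 array (column).
intensities <- matrix(seq(100, by = 10, length.out = 50), ncol = 1)

# Use the looked-up indices to pull out the rows for this probe set.
pm_values <- intensities[idx[, "pm"], 1]
print(pm_values)
```

Lookup by name with get is what makes the environment behave like a dictionary keyed by probe set identifier.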
1.2.4 Exercises
1. Load the package affydata.
2. Load the dataset Dilution. This dataset is used for the remaining exercises.
3. Use the function hist to plot the intensity distributions.
4. Process the AffyBatch using no background correction, the qspline
normalization method, the pmonly perfect match correction method and the
medianpolish summary method.