BioMarker
Definition:
A biomarker is a substance used as an indicator of a biological state: the existence of a living organism, a biological process, or a particular disease state.
Metabolites:
-Carbohydrates
-Lipids
-Small molecules
Biomarker
Detection of biomarker for diagnosis
-Self properties, e.g. enzymatic activities
-Antibodies: IHC, ELISA
Detection of biomarker
-Quantitative: a link between the quantity of the marker and disease
-Qualitative: a link between the existence of a marker and disease
The marker must be easily detected without complicated medical procedures. The disease markers released to serum and urine are good targets for application of early screening.
The method for screening should be cost effective.
Large screening trials have shown that PSA nearly doubles the rate of detection when combined with other methods. Based on these data, PSA testing was approved by the US FDA for the screening and early detection of prostate cancer.
PSA is also found in the cytoplasm of benign prostate cells.
"I never dreamed that my discovery four decades ago would lead to such a profit-driven public health disaster." -Richard Ablin (inventor of the PSA test)
PSA screening generates ~$1.7 billion annually in the U.S. alone.
Sensitivity = the ability of the test to detect the disease (True Positive rate)
Specificity = the likelihood that your test will be normal if you are disease free (True Negative rate)
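The two definitions above can be written as ratios of counts. The sketch below is only an illustration: the function names and the example counts are hypothetical, not taken from any study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased people the test flags (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of disease-free people the test reports as normal (true negative rate)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 diseased people (20 detected),
# 1000 disease-free people (60 falsely flagged).
sens = sensitivity(true_pos=20, false_neg=80)
spec = specificity(true_neg=940, false_pos=60)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Note that the false positive rate is simply 1 - specificity.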
-Statistics are the formalization of common sense
  -because they have to handle many different situations, they can be really complicated
  -they should make you feel really good or really bad about your data
-People are inherently bad at statistics and probability
Case Study:
rate for being HIV positive: 1:10000
false positive rate of HIV test: 1:1000
If my test comes back positive, what is the chance that I am actually HIV negative? 0.0001 0.001 0.01 0.1 0.9 0.99 0.9999
Per 10,000 people tested, for every 1 True Positive there will be 10 False Positives, so after a positive test my chance of being Negative is 10/11 (about 0.9).
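The 10/11 figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the test catches every true case (sensitivity = 100%), which the case study does not state; the function name is made up for illustration.

```python
def chance_negative_given_positive(prevalence, false_positive_rate, sensitivity=1.0):
    """Probability of being disease-free given a positive test result."""
    true_pos = prevalence * sensitivity                     # diseased and flagged
    false_pos = (1.0 - prevalence) * false_positive_rate    # healthy but flagged
    return false_pos / (true_pos + false_pos)


# Case study numbers: prevalence 1:10000, false positive rate 1:1000.
hiv = chance_negative_given_positive(prevalence=1 / 10_000,
                                     false_positive_rate=1 / 1_000)
print(f"chance of being negative after a positive test: {hiv:.3f}")
```

The exact value differs slightly from 10/11 because the false positives come from the 9,999 uninfected people rather than all 10,000.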
Prostate cancer and the PSA test:
Rate is 15:10000
False Positive Rate is 60:1000
For every 15 True Positives, there will be 600 False Positives!
Chance of being Negative 600/615 = 0.976
Chance of being Positive = 0.024 (before test chance was 0.0015)
-Is this true?
The test will miss 80% of the true positives (sensitivity = 20%), so there will only be 3 True Positives detected, so:
Chance of being Negative 600/603 = 0.995
Chance of being True Positive = 0.005
Follow-up for a +HIV test is another blood test. Follow-up for a +PSA test is a tissue biopsy.
By age 65 the rate of prostate cancer climbs to 8:1000 and the test performs much better.
For every 8 True Positives, there will be 60 False Positives!
Chance of being Negative 60/68 = .88
Chance of being Positive = .12 (before test chance was 0.008)
Prostate cancer is one of the most frequent cancers (15:10000). Most cancers are much less frequent (1:10000 to 1:50000), so a biomarker for them would have to be much better than the PSA test. It is currently believed that a new biomarker would need sensitivity and specificity better than 95%.
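To see why rare cancers demand such a good test, the chance that a positive result is real (the positive predictive value, PPV) can be computed for the prevalences quoted above. This is a sketch under stated assumptions: the 95%/95% test is the threshold mentioned in the text, while the 50% target PPV is a hypothetical choice made only to illustrate the calculation.

```python
def ppv(prevalence, sensitivity, specificity):
    """Chance that a positive test result is a true positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def required_specificity(prevalence, sensitivity, target_ppv):
    """Specificity needed so that a positive result is real with probability target_ppv."""
    true_pos = sensitivity * prevalence
    allowed_false_pos = true_pos * (1.0 - target_ppv) / target_ppv
    return 1.0 - allowed_false_pos / (1.0 - prevalence)


# PSA-like test from the slides: prevalence 15:10000, sensitivity 20%,
# false positive rate 60:1000 (specificity 94%).
psa = ppv(15 / 10_000, sensitivity=0.20, specificity=0.94)
print(f"PSA-like test: PPV = {psa:.4f}")

# A 95%/95% test at the prevalences quoted in the text.
PREVALENCES = {"15:10000": 15 / 10_000, "1:10000": 1 / 10_000, "1:50000": 1 / 50_000}
for label, prev in PREVALENCES.items():
    value = ppv(prev, sensitivity=0.95, specificity=0.95)
    needed = required_specificity(prev, sensitivity=0.95, target_ppv=0.5)  # hypothetical target
    print(f"{label}: PPV = {value:.5f}, specificity for PPV 0.5 = {needed:.6f}")
```

Even at 95% sensitivity and specificity, most positive results for a rare disease are false positives, which is the point the slide is making.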
SELDI can detect 200-300 features in a sample. It has been used to find biomarkers from everything from blood to tears.
-Realization that serum and other biofluids are incredibly complex.
-Realization that serum and other biofluids are incredibly variable and fragile
-Some strong biomarkers:
  -blood collection tube
  -# of freeze-thaw cycles
  -diet
A typical biomarker discovery study will take 50 samples per condition. It typically takes 10 samples per condition to have a 90% chance of finding 2-fold differences. Validation will take 1000s of samples. Finally, the assay will have to be converted to something that can be done in a clinical lab.
[Table (2007), columns: Gastro, Colon, Bladder, Brain, Leukemia, Myeloma, Thyroid, Testicular; cells marked with x, row labels not recoverable]
Conclusions
-Biomarker Discovery is difficult
  -biofluids are complex
  -biofluids have a high dynamic range
  -biomarkers are usually low abundance
  -even taking proximal fluids typically does not help
  -there is a lot of person-to-person variability
-Most Biomarkers will never become clinically relevant
  -statistical standards for diagnostic tools are very high
  -the more prevalent the disease, the better the biomarker will perform
-An MS-based biomarker assay is unlikely due to the greater analytical performance of antibody-based methods.
Quantitative Approaches
Stable Isotope Labeling methods
-adds heavy isotopes to one sample so chemically identical compounds are mass shifted
-added to the peptides/proteins using reactive groups
-added to the proteins in vivo using heavy amino acids
-can be multiplexed
Label-free methods
-extracted ion chromatograms
-spectral counting
[Figure: mass spectrum (4700 Reflector Spec #1), % Intensity vs. m/z from 799 to 4013; inset shows the isotope cluster at m/z 1737.88, 1738.88, 1739.88, 1740.88]
Isotope-labeled linker: heavy or light, depending on which isotope is used
Affinity tag: enables the protein or peptide bearing an ICAT to be isolated by affinity chromatography in a single step
Linker: the heavy version will have deuteriums at *, the light version will have hydrogens at *
[Figure: chemical structure of the ICAT reagent with the labeled positions marked *]
[Figure: ICAT Quantitation. Light- and heavy-labeled samples are mixed; quantification comes from MS and identification from MS/MS (example peptide NH2-EACDPLR-COOH)]
Reporter (charged)
-Gives strong signature ion in MS/MS
-Gives good b- and y-ion series
-Maintains charge state
-Maintains ionization efficiency of peptide
-Reporter group mass = 114 thru 117 (retains charge)
Balance (neutral loss)
-Balance changes in concert with reporter mass to maintain total mass of 145
-Neutral loss in MS/MS
-Balance mass = 31 thru 28
PRG
-Amine specific
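The reporter/balance bookkeeping can be sketched in a few lines: each reporter mass pairs with a balance mass so that the total stays at 145, and relative quantitation comes from the reporter ion intensities seen in MS/MS. The intensities and the choice of 114 as the reference channel are hypothetical, for illustration only.

```python
# Nominal masses from the slide: reporter 114-117, balance 31-28, total 145.
REPORTER_MASSES = (114, 115, 116, 117)
BALANCE_MASSES = (31, 30, 29, 28)
TOTAL_TAG_MASS = 145

# Every reporter/balance pair has the same total mass, so the labeled
# peptides from all samples appear as one precursor in MS.
for reporter, balance in zip(REPORTER_MASSES, BALANCE_MASSES):
    assert reporter + balance == TOTAL_TAG_MASS


def reporter_ratios(intensities, reference=114):
    """Relative abundance of each channel versus a reference reporter ion."""
    ref = intensities[reference]
    if ref <= 0:
        raise ValueError("reference reporter intensity must be positive")
    return {channel: value / ref for channel, value in intensities.items()}


# Hypothetical reporter ion intensities from one MS/MS spectrum.
example = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 1000.0}
ratios = reporter_ratios(example)
print(ratios)
```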
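[Figure: samples S1 to S4 are labeled with reporter/balance pairs 114/31, 115/30, 116/29, 117/28 attached through the PRG, then mixed; MS shows the combined labeled peptide, MS/MS shows b- and y-ions plus reporter ions 114 to 117]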
[Figure: MS/MS spectrum of the labeled peptide *-TPHPALTEAK-*, % Intensity vs. Mass (m/z), with the b- and y-ion series annotated; inset shows the reporter ions at m/z 114.1, 115.1, 116.1, 117.1]
Reporter Group Placement: Selection of Quiet Region
[Figure: summed ion intensity (~75,000 spectra) vs. m/z, 0 to 2000]
[Figure: 4-plex workflow. Samples (Test 1, Test 2, Test 3) undergo Trypsin Digestion, are labeled with 114, 115, 116, 117, mixed, and separated by SCX; a single 2D LC-MS/MS analysis of the combined samples gives both quantitation and identification from MS/MS]
Expensive
Works best on high resolution instruments.
Label-Free Quantitation
All approaches so far require purchase of isotopically labeled reagents (can be expensive).
What if you want to compare large numbers of samples (10+)?
What if you can't afford lots of reagents?
-Peak/Spectral counting
-Peak area comparison (Extracted Ion Chromatograms)
Spectral Counting
Count the number of peptides identified from a protein in each sample. Typically do not count repeat identifications of the same peptide. Not accurate at quantifying the magnitude of change, but can be used to determine if there is a difference.
In general, you need a spectral count difference of about 4 peptides in order to be confident that a difference is real. Most proteins in complex mixtures are identified by fewer than 4 peptides.
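The counting rule above can be sketched as follows. The data layout (a list of protein/peptide pairs per sample), the function names, and the example identifications are assumptions made for illustration; only the "count each peptide once" rule and the difference threshold of 4 come from the text.

```python
from collections import defaultdict


def spectral_counts(identifications):
    """Count distinct peptides per protein; repeat identifications count once."""
    unique = defaultdict(set)
    for protein, peptide in identifications:
        unique[protein].add(peptide)
    return {protein: len(peptides) for protein, peptides in unique.items()}


def differing_proteins(counts_a, counts_b, min_difference=4):
    """Proteins whose counts differ by at least min_difference between samples."""
    proteins = set(counts_a) | set(counts_b)
    return sorted(
        p for p in proteins
        if abs(counts_a.get(p, 0) - counts_b.get(p, 0)) >= min_difference
    )


# Hypothetical identifications (protein, peptide) from two samples.
sample_a = [("P1", "PEPA"), ("P1", "PEPA"), ("P1", "PEPB"), ("P2", "PEPX")]
sample_b = [("P1", f"PEP{i}") for i in range(6)] + [("P2", "PEPX"), ("P2", "PEPY")]

counts_a = spectral_counts(sample_a)   # P1: 2, P2: 1
counts_b = spectral_counts(sample_b)   # P1: 6, P2: 2
print(differing_proteins(counts_a, counts_b))
```

Here only P1 is flagged; P2 changes by a single peptide, which is below the threshold.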
EIC
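An extracted ion chromatogram follows the intensity of one m/z value across the chromatographic run, and the area under its peak is compared between samples. The sketch below assumes a simple in-memory scan format and hypothetical numbers; real data would come from an instrument file reader, which is not shown.

```python
def extract_ion_chromatogram(scans, target_mz, tol=0.01):
    """Sum intensity within target_mz +/- tol for each scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity).
    """
    return [
        (rt, sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol))
        for rt, peaks in scans
    ]


def peak_area(eic):
    """Area under the chromatogram by the trapezoid rule."""
    area = 0.0
    for (t0, i0), (t1, i1) in zip(eic, eic[1:]):
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area


# Hypothetical scans for two samples; sample B has twice the signal.
scans_a = [
    (10.0, [(500.25, 0.0), (600.0, 5.0)]),
    (10.1, [(500.25, 100.0)]),
    (10.2, [(500.251, 200.0), (700.0, 9.0)]),
    (10.3, [(500.25, 100.0)]),
    (10.4, [(500.25, 0.0)]),
]
scans_b = [(rt, [(mz, 2.0 * inten) for mz, inten in peaks]) for rt, peaks in scans_a]

area_a = peak_area(extract_ion_chromatogram(scans_a, target_mz=500.25))
area_b = peak_area(extract_ion_chromatogram(scans_b, target_mz=500.25))
print(f"B/A peak area ratio = {area_b / area_a:.2f}")
```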
emPAI
(Exponentially Modified Protein Abundance Index)
$\mathrm{emPAI} = 10^{\mathrm{PAI}} - 1$
where $\mathrm{PAI} = N_{\mathrm{observed}} / N_{\mathrm{observable}}$
What is an observable peptide?
-Peptides with a precursor mass between 800 and 2400 Da.
There is a roughly linear relationship between log protein concentration and PAI (the ratio of observed to observable peptides) in the range of 3-500 fmol.
If you know how much total protein you analyzed, you can derive absolute abundances.
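A short sketch of the emPAI calculation follows. The peptide counts are hypothetical, and the conversion to a molar fraction (each protein's emPAI divided by the sum over all proteins) is a commonly used normalization that these slides do not spell out, so treat it as an assumption.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10**PAI - 1, with PAI = observed / observable peptides."""
    pai = n_observed / n_observable
    return 10 ** pai - 1


def molar_fractions(empai_by_protein):
    """Assumed normalization: each emPAI divided by the sum of all emPAI values."""
    total = sum(empai_by_protein.values())
    return {protein: value / total for protein, value in empai_by_protein.items()}


# Hypothetical (observed, observable) peptide counts per protein.
counts = {"protein_A": (10, 10), "protein_B": (5, 10), "protein_C": (1, 20)}
values = {name: empai(obs, total) for name, (obs, total) in counts.items()}
fractions = molar_fractions(values)

for name in counts:
    print(f"{name}: emPAI = {values[name]:.3f}, molar fraction = {fractions[name]:.3f}")
```

Multiplying each molar fraction by the total amount of protein analyzed would then give an estimate of absolute abundance.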
MRM
(Multiple Reaction Monitoring)
Look for a component of a specific mass that, when fragmented, forms a fragment of another specific mass.
Transition: the pair of precursor mass and fragment mass being monitored.
MRM
-Best performed on a triple quadrupole instrument.
-Scans are very fast, so multiple transition scans can be performed on a chromatographic time-scale.
-Requires a lot of optimization:
  -Verify transitions are reproducible; typically want 2-3 transitions/peptide, 3-4 peptides/protein.
  -Determine the retention time to maximize the number of peptides that can be analyzed per run.
-It is possible to analyze 100s of transitions per hour.
-MRM coupled to isotopically labeled peptides allows for very high sensitivity and high accuracy analysis and can give absolute quantification.
-Once optimized, 1000s of samples can be run in a short time frame.
-Not for discovery! You must already know what you are looking for; sometimes referred to as targeted proteomics.
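The idea of a transition, and of absolute quantification against a spiked isotopically labeled peptide, can be sketched as below. All m/z values, peak areas, the tolerance, and the peptide name are hypothetical, and the quantification step assumes the labeled standard behaves identically to the endogenous peptide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """A precursor/fragment m/z pair monitored for one peptide."""
    peptide: str
    precursor_mz: float
    fragment_mz: float


def matches(transition, precursor_mz, fragment_mz, tol=0.5):
    """True if an observed precursor/fragment pair falls within the transition window."""
    return (abs(precursor_mz - transition.precursor_mz) <= tol
            and abs(fragment_mz - transition.fragment_mz) <= tol)


def absolute_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Endogenous (light) amount from the light/heavy peak area ratio."""
    return light_area / heavy_area * heavy_spiked_fmol


# Hypothetical transition and observations.
t = Transition(peptide="EXAMPLEPEPTIDE", precursor_mz=600.3, fragment_mz=800.4)
print(matches(t, precursor_mz=600.4, fragment_mz=800.2))   # inside both windows
print(matches(t, precursor_mz=600.4, fragment_mz=750.0))   # wrong fragment

# Hypothetical peak areas with 50 fmol of labeled peptide spiked in.
amount = absolute_amount(light_area=2000.0, heavy_area=1000.0, heavy_spiked_fmol=50.0)
print(f"estimated endogenous amount: {amount:.1f} fmol")
```

Monitoring 2-3 such transitions per peptide, as recommended above, guards against a single interfering signal being mistaken for the target.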
Conclusions
-There are many different ways to quantitate proteomics data.
-Quantitative studies need to be approached carefully, because it is easy to make mistakes.
-No one strategy is best.
-MRM is the most sensitive and accurate, but requires the most optimization and cannot be used for discovery.