NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
World Scientific
PRINCIPLES AND ADVANCED METHODS IN
MEDICAL IMAGING AND IMAGE ANALYSIS
ATAM P DHAWAN
New Jersey Institute of Technology, USA
H K HUANG
University of Southern California, USA
DAE-SHIK KIM
Boston University, USA
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN-13 978-981-270-534-1
ISBN-10 981-270-534-1
ISBN-13 978-981-270-535-8 (pbk)
ISBN-10 981-270-535-X (pbk)
Typeset by Stallion Press
Email: enquiries@stallionpress.com
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.
Copyright © 2008 by World Scientific Publishing Co. Pte. Ltd.
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Printed in Singapore.
January 23, 2008 11:18 WSPC/SPIB540:Principles and Recent Advances fm
To
My wife, Nilam,
for her support and patience;
and my sons, Anirudh and Akshay,
for their quest for learning.
(Atam P Dhawan)
To
My wife, Fong,
for her support;
and my daughter, Cammy; and my son, Tilden,
for their young wisdom.
(HK Huang)
To
My daughter, Zeno,
for her curiosity.
(Dae-Shik Kim)
Preface and Acknowledgments
We are pleased to bring "Principles and Advanced Methods in
Medical Imaging and Image Analysis", a volume of contributed
chapters, to the scientific community. The book is a compilation
of carefully crafted chapters written by leading researchers in the
field of medical imaging, who have put a great deal of effort into
their contributions. This book can be used as a research reference
or a textbook for graduate-level courses in biomedical engineering
and medical sciences.
The book is a unique combination of chapters describing the
principles as well as state-of-the-art advanced methods in medical
imaging and image analysis for selected applications. Though
computerized medical imaging has a very wide spectrum of applications
in diagnostic radiology and medical research, we have selected a
subset of important imaging modalities with specific applications
that are significant in medical sciences and clinical practice. The
topics covered in the chapters have been developed with a natural
progression of understanding, keeping in mind future technological
advances that are expected to have a major impact on clinical
practice and the understanding of complex pathologies. We hope that
this book will provide researchers, clinicians and students a unique
learning experience, from theoretical concepts to advanced methods
and applications.
We are very grateful to our contributors, who are internationally
renowned experts and experienced researchers in their respective
fields within the wide spectrum of medical imaging and computerized
medical image analysis. We also gratefully acknowledge the
support provided by the editorial board and staff members of World
Scientific Publishing. Special thanks to Ms CT Ang for her guidance
and patience in preparing this book.
We hope that readers will find this book useful as a concise
presentation of important principles, advances, and applications
in medical imaging and image analysis.
Atam P Dhawan
HK Huang
Dae-Shik Kim
Contributors
Walter J Akers, PhD
Staff Scientist, Optical Radiology Laboratory
Department of Radiology
Washington University School of Medicine
St Louis, Missouri
Elsa Angelini, PhD
Ecole Nationale Supérieure des
Télécommunications
Paris, France
Leonard Berliner, MD
Department of Radiology
New York Methodist Hospital, NY
Sharon Bloch, PhD
Optical Radiology Laboratory, Department of Radiology
Washington University School of Medicine
St Louis, Missouri
Christos Davatzikos, PhD
Director, Section of Biomedical Image Analysis
Associate Professor, Department of Radiology
University of Pennsylvania
Mathieu De Craene, PhD
Computational Imaging Lab
Department of Information
and Communication Technologies
Universitat Pompeu Fabra, Barcelona
Atam P Dhawan, PhD
Professor, Department of Electrical
and Computer Engineering
Professor, Department of Biomedical Engineering
New Jersey Institute of Technology
Qi Duan, PhD
Department of Biomedical Engineering
Columbia University
Alejandro F Frangi, PhD
Computational Imaging Lab
Department of Information
and Communication Technologies
Universitat Pompeu Fabra, Barcelona
Shunichi Homma, MD
Margaret Millikin Hatch Professor
Department of Medicine
Columbia University
HK Huang, DSc
Professor and Director, Imaging Informatics Division
Department of Radiology, Keck School of Medicine
Department of Biomedical Engineering
Viterbi School of Engineering
University of Southern California
Dae-Shik Kim, PhD
Director, Center for Biomedical Imaging
Associate Professor, Anatomy and Neurobiology
Boston University School of Medicine
Elisa E Konofagou, PhD
Assistant Professor
Department of Biomedical Engineering
Columbia University
Andrew Laine, PhD
Professor
Department of Biomedical Engineering
Columbia University
Angela R Laird, PhD
Assistant Professor, Department of Radiology
University of Texas Health Sciences Center
San Antonio
Maria YY Law, PhD
Associate Professor
Department of Health Technology and Informatics
The Hong Kong Polytechnic University
Heinz U Lemke, PhD
Research Professor, Department of Radiology
University of Southern California
Los Angeles, CA
Guang Li, PhD
Medical Physicist, Radiation Oncology Branch
National Cancer Institute,
NIH, Bethesda, Maryland
Brent J Liu, PhD
Assistant Professor and Deputy Director of Informatics
Department of Radiology, Keck School of Medicine
Department of Biomedical Engineering
Viterbi School of Engineering
University of Southern California
Tianming Liu, PhD
Center for Bioinformatics
Harvard Medical School
Department of Radiology
Brigham and Women’s Hospital, MA
Sachin Patwardhan, PhD
Research Scientist, Department of Radiology
Mallinckrodt Institute of Radiology
Washington University School of Medicine, St Louis
Xiaochuan Pan, PhD
Professor
Department of Radiology
Cancer Research Center
The University of Chicago
Itamar Ronen, PhD
Assistant Professor
Center for Biomedical Imaging
Department of Anatomy and Neurobiology
Boston University School of Medicine
Yulin Song, PhD
Associate Professor
Department of Radiology
Memorial Sloan-Kettering Cancer Center
New Jersey
Song Wang, PhD
Department of Electrical
and Computer Engineering
New Jersey Institute of Technology
Pat Zanzonico, PhD
Molecular Pharmacology and Chemistry
Memorial Sloan-Kettering Cancer Center
New York
Zheng Zhou, PhD
Manager
Imaging Processing and Informatics Lab
Department of Radiology
University of Southern California
Xiang Sean Zhou, PhD
Senior Staff Scientist, Program Manager
Computer Aided Diagnosis and Therapy Solutions
Siemens Medical Solutions, Inc., Malvern PA
Lionel Zuckier, MD
Head, Nuclear Medicine
Department of Radiology
University of Medicine and Dentistry of New Jersey
Contents
Preface and Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1. Introduction to Medical Imaging and Image
Analysis: A Multidisciplinary Paradigm . . . . . . . . 1
Atam P Dhawan, HK Huang and Dae-Shik Kim
Part I. Principles of Medical Imaging and Image
Analysis
2. Medical Imaging and Image Formation. . . . . . . . . . 9
Atam P Dhawan
3. Principles of X-ray Anatomical Imaging
Modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Brent J Liu and HK Huang
4. Principles of Nuclear Medicine Imaging
Modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Lionel S Zuckier
5. Principles of Magnetic Resonance Imaging . . . . . . 99
Itamar Ronen and Dae-Shik Kim
6. Principles of Ultrasound Imaging Modalities . . . . 129
Elisa Konofagou
7. Principles of Image Reconstruction Methods. . . . . 151
Atam P Dhawan
8. Principles of Image Processing Methods . . . . . . . . . 173
Atam P Dhawan
9. Image Segmentation and Feature Extraction . . . . . 197
Atam P Dhawan
10. Clustering and Pattern Classiﬁcation . . . . . . . . . . . . 229
Atam P Dhawan and Shuangshuang Dai
Part II. Recent Advances in Medical Imaging and
Image Analysis
11. Recent Advances in Functional Magnetic
Resonance Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Dae-Shik Kim
12. Recent Advances in Diffusion Magnetic
Resonance Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Dae-Shik Kim and Itamar Ronen
13. Fluorescence Molecular Imaging: Microscopic to
Macroscopic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Sachin V Patwardhan, Walter J Akers and
Sharon Bloch
14. Tracking Endocardium Using Optical Flow
Along Iso-Value Curve . . . . . . . . . . . . . . . . . . . . . . 337
Qi Duan, Elsa Angelini, Shunichi Homma and
Andrew Laine
15. Some Recent Developments in Reconstruction
Algorithms for Tomographic Imaging . . . . . . . . . . . 361
Chien-Min Kao, Emil Y Sidky, Patrick La Rivière
and Xiaochuan Pan
16. Shape-Based Reconstruction from Nevoscope
Optical Images of Skin Lesions . . . . . . . . . . . . . . . . . . 393
Song Wang and Atam P Dhawan
17. Multimodality Image Registration and Fusion . . . 413
Pat Zanzonico
18. Wavelet Transform and Its Applications in
Medical Image Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 437
Atam P Dhawan
19. Multiclass Classiﬁcation for Tissue
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Atam P Dhawan
20. From Pairwise Medical Image Registration to
Populational Computational Atlases. . . . . . . . . . . . . 481
M De Craene and AF Frangi
21. Grid Methods for Large Scale Medical Image
Archiving and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 517
HK Huang, Zheng Zhou and Brent Liu
22. Image-Assisted Knowledge Discovery
and Decision Support in Radiation
Therapy Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
Brent J Liu
23. Lossless Digital Signature Embedding
Methods for Assuring 2D and 3D Medical
Image Integrity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
Zheng Zhou, HK Huang and Brent J Liu
Part III. Medical Imaging Applications, Case Studies
and Future Trends
24. The Treatment of Superﬁcial Tumors Using
Intensity Modulated Radiation Therapy and
Modulated Electron Radiation Therapy . . . . . . . . . 599
Yulin Song and Maria Chan
25. Image Guidance in Radiation Therapy. . . . . . . . . . . 635
Maria YY Law
26. Functional Brain Mapping and Activation
Likelihood Estimation Meta-Analysis . . . . . . . . . . 663
Angela R Laird, Jack L Lancaster and Peter T Fox
27. Dynamic Human Brain Mapping and Analysis:
From Statistical Atlases to Patient-Specific
Diagnosis and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 677
Christos Davatzikos
28. Diffusion Tensor Imaging Based Analysis
of Neurological Disorders . . . . . . . . . . . . . . . . . . . . . . . 703
Tianming Liu and Stephen TC Wong
29. Intelligent Computer Aided Interpretation
in Echocardiography: Clinical Needs
and Recent Advances . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
Xiang Sean Zhou and Bogdan Georgescu
30. Current and Future Trends in Radiation
Therapy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745
Yulin Song and Guang Li
31. IT Architecture and Standards for a
Therapy Imaging and Model Management
System (TIMMS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
Heinz U Lemke and Leonard Berliner
32. Future Trends in Medical and Molecular
Imaging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
Atam P Dhawan, HK Huang and Dae-Shik Kim
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch01 FA
CHAPTER 1
Introduction to Medical Imaging
and Image Analysis:
A Multidisciplinary Paradigm
Atam P Dhawan, HK Huang and Dae-Shik Kim
Recent advances in medical imaging, with significant contributions from
electrical and computer engineering, medical physics, chemistry, and
computer science, have brought revolutionary growth to diagnostic
radiology. Rapid improvements in engineering and computing technologies
have made it possible to acquire high-resolution multidimensional
images of complex organs and to analyze structural and functional
information of human physiology for computer-assisted diagnosis,
treatment evaluation, and intervention. Through large databases of vast
amounts of information, such as standardized atlases of images,
demographics, genomics, etc., new knowledge about physiological
processes and associated pathologies is continuously being derived to
improve our understanding of critical diseases for better diagnosis and
management. This chapter provides an introduction to this ongoing
knowledge quest and to the contents of the book.
1.1 INTRODUCTION
In a general sense, medical imaging refers to the process involving
specialized instrumentation and techniques to create images or
relevant information about the internal biological structures and
functions of the body. Medical imaging is sometimes categorized, in a
wider sense, as a part of the radiological sciences. This is particularly
relevant because of its most common applications in diagnostic
radiology. In a clinical environment, medical images of a specific organ
or part of the body are obtained for clinical examination for the
diagnosis of a disease or pathology. However, medical imaging tests are
also performed to obtain images and information to study anatomical
and functional structures for research purposes with normal as
well as pathological subjects. Such studies are very important for
understanding the characteristic behavior of physiological processes in
the human body and for detecting the onset of a pathology. Such
an understanding is extremely important for early diagnosis as well
as for developing a knowledge base to study the progression of a disease
associated with physiological processes that deviate from their
normal counterparts. The significance of the medical imaging paradigm
is its direct impact on healthcare through diagnosis, treatment
evaluation, intervention and prognosis of a specific disease.
From a scientific point of view, medical imaging is highly
multidisciplinary and interdisciplinary, with a wide coverage of physical,
biological, engineering and medical sciences. The overall technology
requires the direct involvement of expertise in physics, chemistry,
biology, mathematics, engineering, computer science and medicine so
that useful procedures and protocols for medical imaging tests with
appropriate instrumentation can be developed. The development
of a specific imaging modality system starts with the physiological
understanding of the biological medium and its relationship to
the targeted information to be obtained through imaging. Once
such a relationship is determined, a method for obtaining the
targeted information using a specific energy transformation process,
often known as the physics of imaging, is investigated. Once a method
for imaging is established, proper instrumentation with energy
source(s), detectors, and data acquisition systems is designed
and integrated to physically build an imaging system for imaging
patients to obtain target information in the context of a pathological
investigation. For example, to obtain anatomical information
about internal organs of the body, X-ray energy may be used.
X-ray energy, while transmitted through the body, undergoes
attenuation based on the density of the internal structures. Thus,
the attenuation of the X-ray energy carries the target information
about the density of internal structures, which is then displayed as a
two-dimensional (in the case of radiography or mammography) or
multidimensional (3D in the case of computed tomography (CT); 4D in the
case of cine-CT) image. This information (image) can be directly
interpreted by a radiologist, or further processed by a computer for
image processing and analysis for better interpretation.
With the evolutionary progress in engineering and computing
technologies over the last century, medical imaging technologies have
witnessed tremendous growth that has made a major impact in
diagnostic radiology. These advances have revolutionized healthcare
through fast imaging techniques; data acquisition, storage and
analysis systems; high-resolution picture archiving and communication
systems; and information mining with modeling and simulation
capabilities to enhance our knowledge base about the diagnosis,
treatment and management of critical diseases such as cancer, cardiac
failure, brain tumors and cognitive disorders.
Figure 1 provides a conceptual view of the medical imaging
process, from determining the principle of imaging based on the
target pathological investigation to acquiring data for image
reconstruction, processing and analysis for diagnostic, treatment
evaluation, and/or research applications.
Many medical imaging modalities and techniques have been
developed over the past years. Anatomical structures
can be effectively imaged today with X-ray computed tomography
(CT), magnetic resonance imaging (MRI), ultrasound, and optical
imaging methods. Furthermore, information about physiological
structures with respect to metabolism and/or function can be
obtained through nuclear medicine [single photon emission computed
tomography (SPECT) and positron emission tomography
(PET)], ultrasound, optical fluorescence, and several derivative
protocols of MRI such as fMRI, diffusion-tensor MRI, etc.
The selection of an appropriate medical imaging modality is
important for obtaining the target information for a successful
pathological investigation. For example, if information has to be
obtained about the cardiac volumes and functions associated with
[Figure 1 block diagram: Target Investigation or Pathology → Physiology
and Understanding of Imaging Medium → Principle of Imaging → Physics of
Imaging (Energy Source Physics, Detector Physics) → Imaging
Instrumentation → Data Acquisition → Image Reconstruction → Image
Processing → Database and Computerized Analysis → Interpretation
(Diagnosis, Evaluation, Intervention) → New Knowledge]
Fig. 1. A conceptual block diagram of the medical imaging process for
diagnostic, treatment evaluation and intervention applications.
a beating heart, one has to determine the requirements and limitations
on the spatial and temporal resolution for the target set of
images. It is also important to keep in mind the type of pathology
being investigated for the imaging test. Depending on the
investigation, such as the metabolism of the cardiac walls, or opening
and closing measurements of the mitral valve, a specific medical imaging
modality (e.g. PET) or a combination of different modalities
(e.g. stress-PET and ultrasound) can be selected.
1.1.1 Book Chapters
In this book, we present a collection of carefully written chapters
describing the principles and recent advances of major medical imaging
modalities and techniques. Case studies and data analysis protocols
are also described for investigating selected critical pathologies. We
hope that this book will be useful for engineering as well as clinical
students and researchers. The book presents a natural progression
of technology development and applications through chapters
written by leading and renowned researchers and educators.
The book is organized in three parts: Principles of Imaging
and Image Analysis (Chapters 2–10); Recent Advances in Medical
Imaging and Image Analysis (Chapters 11–23); and Medical Imaging
Applications, Case Studies and Future Trends (Chapters 24–32).
Chapter 2 describes some basic principles of medical imaging
and image formation. In this chapter, Atam Dhawan focuses on
a basic mathematical model of image formation for a linear spatially
invariant imaging system.
In Chapter 3, Brent Liu and HK Huang present basic principles of
X-ray imaging modalities. X-ray radiography, mammography, computed
tomography (CT) and the more recent PET-XCT fusion imaging
systems are described.
Principles of nuclear medicine imaging are described by Lionel
Zuckier in Chapter 4, where he provides the foundation and clinical
applications of single photon emission computed tomography (SPECT) and
positron emission tomography (PET).
In Chapter 5, Itamar Ronen and Dae-Shik Kim describe sophisticated
principles and imaging techniques of Magnetic Resonance
Imaging (MRI). Imaging parameters and pulse techniques for useful
MR imaging are presented.
Elisa Konofagou presents the principles of ultrasound imaging
in Chapter 6. Instrumentation and various imaging methods with
examples are described.
In Chapter 7, Atam Dhawan describes the foundation of
multidimensional image reconstruction methods. A brief introduction to
different types of transform and estimation methods is presented.
Atam Dhawan presents a spectrum of image enhancement,
restoration and filtering operations in Chapter 8. Image processing
methods in the spatial (image) domain as well as the frequency (Fourier)
domain are described. In Chapter 9, Atam Dhawan describes basic
image segmentation and feature extraction methods for the
representation of regions of interest for classification.
In Chapter 10, Atam Dhawan and Shuangshuang Dai present
principles of pattern recognition and classification. Genetic
algorithm-based feature selection and nonparametric classification
methods are also described for image/tissue classification for
diagnostic applications.
Advances in MR imaging with respect to new methods and pulse
sequences associated with functional imaging of the brain are described
by Dae-Shik Kim in Chapter 11. Diffusion and diffusion-tensor based
magnetic resonance imaging methods are described by Dae-Shik
Kim and Itamar Ronen in Chapter 12. These two chapters bring the
most recent developments in functional brain imaging to investigate
neuronal information, including the hemodynamic response and axonal
pathways.
Chapter 13 provides a spectrum of optical and fluorescence
imaging for 3D tomographic applications. Through specific contrast
imaging methods, Sachin Patwardhan, Walter Akers and Sharon
Bloch explore molecular imaging applications.
In Chapter 14, Qi Duan, Elsa Angelini, Shunichi Homma and
Andrew Laine present recent investigations in dynamic ultrasound
image analysis for tracking the endocardium in 4D cardiac imaging.
Chien-Min Kao, Emil Y Sidky, Patrick La Rivière, and Xiaochuan
Pan describe recent advances in model-based multidimensional
image reconstruction methods for medical imaging applications in
Chapter 15. These methods use multivariate statistical estimation
methods in image reconstruction.
Shape-based optical image reconstruction of specific entities
from multispectral images of skin lesions is presented by Song Wang
and Atam Dhawan in Chapter 16.
Clinical multimodality image registration and fusion methods
with nuclear medicine and optical imaging are described by Pat
Zanzonico in Chapter 17. He emphasizes the clinical need for
localization of metabolic information with real-time processing and
efficiency requirements.
Recently, the wavelet transform has been extensively investigated
for obtaining localized spatio-frequency information. The use of the
wavelet transform in medical image processing and analysis is
described by Atam Dhawan in Chapter 18.
Medical image processing and analysis often require a multiclass
characterization of image contents. Atam Dhawan presents a
probabilistic multiclass tissue characterization method for MR brain
images in Chapter 19.
In Chapter 20, Mathieu De Craene and Alejandro F Frangi
present a review of advances in image registration methods for
constructing standardized computational atlases.
In Chapter 21, HK Huang, Zheng Zhou and Brent Liu describe
information processing and computational methods to deal with
large-scale image archiving and communication for large
medical image databases.
Brent Liu, in Chapter 22, describes knowledge mining and decision
making strategies for medical imaging applications in radiation
therapy planning and treatment.
With large image archiving and communication systems linked
with large image databases, information integrity becomes a critical
issue. In Chapter 23, Zheng Zhou, HK Huang and Brent J Liu present
lossless digital signature embedding methods in multidimensional
medical images for authentication and integrity.
Medical imaging applications in intensity modulated radiation
therapy (IMRT), a radiation treatment protocol, are discussed by
Yulin Song in Chapter 24.
In Chapter 25, Maria Law presents the detailed role of medical
imaging-based, computer-assisted protocols for radiation treatment
planning and delivery.
Recently developed fMR and diffusion-MR imaging methods
provide overwhelming volumes of image data. A productive
and useful analysis of targeted information extracted from
such MR images of the brain is a challenging problem. In Chapter 26,
Angela Laird, Jack Lancaster and Peter Fox describe recently
developed activation likelihood estimation based "meta"-analysis
algorithms for the investigation of a specific pathology. In Chapter 27,
Christos Davatzikos presents dynamic brain mapping methods for
the analysis of patient-specific information for better pathological
characterization and diagnosis. Tianming Liu and Stephen Wong, in
Chapter 28, explore recently developed model-based image analysis
algorithms for analyzing diffusion-tensor MR brain images for
the characterization of neurological disorders.
Model-based intelligent analysis and decision-support tools are
important in medical imaging for computer-assisted diagnosis and
evaluation. Xiang Sean Zhou, in Chapter 29, presents specific
challenges of intelligent medical image analysis, specifically for the
interpretation of cardiac ultrasound images. However, the issues raised
in this chapter can be extended to other modalities and applications.
In Chapter 30, Yulin Song and Guang Li present an overview
of future trends and challenges in radiation therapy methods
that are closely linked with high-resolution multidimensional medical
imaging.
Heinz U Lemke and Leonard Berliner, in Chapter 31, describe
specific methods and information technology (IT) issues in dealing
with image management systems involving very large databases
and widely networked image communication systems.
To conclude, Chapter 32 presents a glimpse of future trends and
challenges in high-resolution medical imaging, intelligent image
analysis, and smart data management systems.
CHAPTER 2
Medical Imaging and Image Formation
Atam P Dhawan
Medical imaging involves a good understanding of the imaging medium
and object, the physics of imaging, instrumentation, and often
computerized reconstruction and visual display methods. Though a number
of medical imaging modalities are available today, involving ionizing
radiation, nuclear medicine, magnetic resonance, ultrasound, and
optical methods, each modality offers a characteristic response to
structural or metabolic parameters of the tissues and organs of the
human body. This chapter provides an overview of the principles of
medical imaging modalities and a basic linear spatially invariant image
formation model used for most common image processing tasks.
2.1 INTRODUCTION
Medical imaging is a process of collecting information about a
specific physiological structure (an organ or tissue) using a
predefined characteristic property that is displayed in the form of
an image. For example, in X-ray radiography, mammography and
computed tomography (CT), tissue density is the characteristic
property that is displayed in images to show anatomical structures.
The information about the tissue density of anatomical structures
is obtained by measuring the attenuation of X-ray energy when
it is transmitted through the body. On the other hand, a nuclear
medicine positron emission tomography (PET) image may show
glucose metabolism information in the tissue or organ. A PET
image is obtained by measuring gamma-ray emission from the body
when a radioactive pharmaceutical material, such as fluorodeoxyglucose
(FDG), is injected into the body. FDG metabolizes with the tissue
through blood circulation, eventually making it a source of
emission of gamma-ray photons. Thus, medical images may provide
anatomical, metabolic or functional information related to
an organ or tissue. Through proper instrumentation and data
collection methods, these images can be reconstructed in
two or three dimensions and then displayed as multidimensional
data sets.
The basic process of image formation requires an energy source
to obtain information about the object that is displayed in the form
of an image. Some form of radiation, such as optical light, X-rays,
gamma rays, RF or acoustic waves, interacts with the object tissue
or organ to provide information about its characteristic property.
The energy source can be external (X-ray radiography, mammography,
CT, ultrasound), internal [nuclear medicine: single photon
emission computed tomography (SPECT); positron emission tomography
(PET)], or a combination of both internal and external, such as
in magnetic resonance imaging, where proton nuclei available
in the tissues of the body provide electromagnetic RF signals
in the presence of an external magnetic field and a
resonating RF energy source.
As described above, image formation requires an energy source, a mechanism of interaction of the energy with the object, instrumentation to collect the data by measuring the energy after the interaction, and a method of reconstructing images of the characteristic property of the object from the collected data.

The imaging modalities commonly used in medical applications today are briefly described below with their respective principles of imaging.
Medical Imaging and Image Formation
2.2 X-RAY IMAGING
X-rays were discovered by Wilhelm Conrad Röntgen in 1895, who described them as a new kind of rays that could penetrate almost anything. He described the diagnostic capabilities of X-rays for imaging the human body and received the first Nobel Prize in Physics in 1901. X-ray radiography is the simplest form of medical imaging: X-rays are transmitted through the body and collected on a film or an array of detectors. The attenuation or absorption of X-rays is described by the photoelectric and Compton effects, with bones producing more attenuation than soft tissues or air.¹⁻⁵
The diagnostic range of X-rays lies between wavelengths of about 0.5 Å and 0.01 Å, which correspond to photon energies of approximately 20 keV to 1.0 MeV. In this range, the attenuation is suitable for discriminating bones, soft tissue, and air. In addition, the wavelength is short enough to provide excellent image resolution, even with sub-millimeter accuracy. Wavelengths shorter than the diagnostic range carry much higher photon energy and therefore produce less attenuation; increasing the photon energy further makes the human body effectively transparent, with the loss of any contrast in the image. The diagnostic X-ray wavelength range provides high energy per photon and a refractive index of unity for almost all materials in the body. This guarantees that diffraction will not distort the image and that the rays will travel in straight lines.¹⁻⁸
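As a check on these numbers, the wavelength-energy relation E = hc/λ can be evaluated directly. The following sketch (not from the text; it uses standard physical constants) converts the quoted wavelength limits to photon energies:

```python
# Photon energy from wavelength, E = h*c/lambda, for the diagnostic X-ray range.
H = 6.62607015e-34    # Planck constant, J*s
C = 2.99792458e8      # speed of light, m/s
EV = 1.602176634e-19  # joules per electron-volt

def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in angstroms."""
    wavelength_m = wavelength_angstrom * 1e-10
    return H * C / wavelength_m / EV / 1e3

print(photon_energy_kev(0.5))   # ~24.8 keV, near the quoted 20 keV end
print(photon_energy_kev(0.01))  # ~1240 keV, near the quoted 1.0 MeV end
```

This confirms that 0.5 Å and 0.01 Å bracket roughly the 20 keV to 1 MeV range stated above.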
X-ray medical imaging uses an external ionizing radiation source, an X-ray tube, to generate an X-ray beam that is transmitted through the human body. The attenuation of the X-ray beam is measured to provide information about variations in tissue density, which are displayed in 2D X-ray radiographs or 3D computed tomography (CT) images. The output intensity of a radiation beam parallel to the x-direction, for a specific y-coordinate location in the selected z-axial planar cross section, I_out(y; x, z), is given by:

I_out(y; x, z) = I_in(y; x, z) e^(−∫ μ(x, y; z) dx),

where μ(x, y; z) represents the attenuation coefficient for the transmitted X-ray energy.
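The line integral in this expression can be approximated numerically by accumulating μ·dx over the segments a ray crosses. A minimal sketch (the attenuation coefficients and thicknesses below are illustrative round numbers, not tabulated data):

```python
# Numerical form of I_out = I_in * exp(-integral of mu dx) along one ray.
import math

def transmitted_intensity(i_in, segments):
    """segments: list of (mu_per_cm, thickness_cm) crossed by the ray."""
    line_integral = sum(mu * dx for mu, dx in segments)
    return i_in * math.exp(-line_integral)

# A ray through 4 cm of soft tissue (mu ~ 0.2/cm) and 1 cm of bone (mu ~ 0.5/cm):
ray = [(0.2, 4.0), (0.5, 1.0)]
print(transmitted_intensity(1.0, ray))  # exp(-1.3), roughly 0.27
```

The denser bone segment contributes more attenuation per unit length, which is exactly the contrast mechanism of radiography.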
Fig. 1. An X-ray mammography image with microcalcification areas.
Conventional X-ray radiography creates a 2D image of a 3D object projected on the detector plane. Figure 1 shows a 2D mammography image of a female breast. Several microcalcification areas can be seen in this image.
While 2D projection radiography may be adequate for many diagnostic applications, it does not provide the 3D qualitative and quantitative information about anatomical structures and associated pathology that is necessary for diagnosing and treating a number of diseases or abnormalities. Combining Radon-transform-based ray-integral measurements with a 3D scanning geometry, X-ray computed tomography (CT) provides a three-dimensional reconstruction of internal organs and structures.⁹⁻¹¹ X-ray CT has proven to be a very useful and sophisticated imaging tool in diagnostic radiology and therapeutic intervention protocols. The basic principle of X-ray CT is the same as that of X-ray digital radiography: X-rays are transmitted through the body and collected by an array of detectors to measure the total attenuation along each X-ray path.⁸⁻¹¹ Figure 2 shows a pathological axial image of the cardiovascular cavity of a cadaver. The corresponding image obtained from X-ray CT is shown at the bottom of Fig. 2.
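The Radon transform underlying CT collects line integrals of the attenuation map at many angles. A minimal parallel-beam sketch (using image rotation as a stand-in for ray tracing; the rectangular phantom is invented for illustration):

```python
# Parallel-beam projections of a 2D phantom: rotate the image, then sum columns.
# Each row of the sinogram is one projection angle - a toy model of CT acquisition.
import numpy as np
from scipy.ndimage import rotate

def radon_projections(image, angles_deg):
    sinogram = []
    for theta in angles_deg:
        rotated = rotate(image, theta, reshape=False, order=1)
        sinogram.append(rotated.sum(axis=0))  # line integrals along one direction
    return np.array(sinogram)

phantom = np.zeros((64, 64))
phantom[24:40, 20:44] = 1.0  # a simple rectangular "organ"
sino = radon_projections(phantom, angles_deg=range(0, 180, 10))
print(sino.shape)  # (18, 64): 18 angles, 64 detector bins
```

Reconstruction algorithms (e.g. filtered backprojection) invert this sinogram to recover the attenuation map.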
Fig. 2. Top: a pathological axial image of the cardiovascular cavity of a cadaver; bottom: the corresponding image obtained from X-ray CT.
2.3 MAGNETIC RESONANCE IMAGING
The principle of nuclear magnetic resonance for medical imaging was first demonstrated by Raymond Damadian in 1971 and Paul Lauterbur in 1973. Nuclear magnetic resonance (NMR) is a phenomenon of magnetic systems that possess both a magnetic moment and an angular momentum. In magnetic resonance imaging (MRI), electromagnetic-induction signals at the magnetic resonance frequency, which lies in the radio frequency (RF) range, are collected through nuclear magnetic resonance from excited nuclei in the body that possess a magnetic moment and angular momentum.⁴⁻⁷
All materials consist of nuclei, which are protons, neutrons, or a combination of both. Nuclei that contain an odd number of protons, neutrons, or both possess a nuclear spin and a magnetic moment. Most materials are composed of several nuclei with magnetic moments, such as ¹H, ²H, ¹³C, ²³Na, etc. When such a material is placed in a magnetic field, the randomly oriented nuclei experience an external magnetic torque that aligns the nuclei either in a parallel or an antiparallel direction with respect to the external magnetic field. The number of nuclei aligned in parallel is greater by a fraction than the number aligned antiparallel, the excess depending on the strength of the applied magnetic field. Thus, a net magnetization vector results in the parallel direction. The nuclei in the magnetic field rotate, or precess, like spinning tops precessing around the direction of the gravitational field. The precessional frequency of the spins is called the Larmor precession frequency and is proportional to the magnetic field strength. The energy state of the nuclei in the antiparallel direction is higher than that of the nuclei in the parallel direction. When external electromagnetic radiation at the Larmor frequency is applied through the RF coils (the natural magnetic resonance frequencies of these nuclei fall within the radio-frequency range), some of the nuclei aligned in the parallel direction are excited to the higher energy state, flipping to the direction antiparallel to the external magnetic field. The lower energy state has a larger population of spins than the higher energy state; thus, through the application of the RF signal, the spin population is also affected.
When the RF excitation signal is removed, the excited nuclei tend to return to their lower energy states through relaxation, resulting in the recovery of the net magnetization vector and the spin population. The relaxation process causes the emission of an RF signal at the same Larmor frequency, which is received by the RF coils and generates an electric potential signal called the free induction decay (FID). This signal becomes the basis of MR imaging.
Given an external magnetic field H_0, the angular (Larmor) frequency ω_0 of nuclear precession can be expressed as:

ω_0 = γH_0.    (1)

Thus, the precession frequency depends on the type of nucleus, through its specific gyromagnetic ratio γ, and on the intensity of the external magnetic field. This is the frequency at which the nuclei can receive radio frequency (RF) energy to change their states, exhibiting nuclear magnetic resonance. The excited nuclei return to thermal equilibrium through a process of relaxation, emitting energy at the same precession frequency, ω_0.
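For example, Eq. (1) can be evaluated directly. The sketch below (not from the chapter; it uses the standard gyromagnetic-ratio value for ¹H) gives the proton resonance frequency at common field strengths:

```python
# Larmor frequency, omega_0 = gamma * H_0, reported here as f_0 = gamma_bar * B_0.
# gamma_bar for 1H is ~42.577 MHz/T, a standard value not taken from this chapter.
GAMMA_BAR_1H_MHZ_PER_T = 42.577

def larmor_frequency_mhz(b0_tesla: float) -> float:
    """Proton resonance frequency in MHz for a field strength in tesla."""
    return GAMMA_BAR_1H_MHZ_PER_T * b0_tesla

for b0 in (0.5, 1.5, 3.0):
    print(b0, "T ->", round(larmor_frequency_mhz(b0), 1), "MHz")
# 1.5 T gives ~63.9 MHz and 3.0 T gives ~127.7 MHz, squarely in the RF range
```

This is why the excitation and detection hardware of an MRI scanner is built around RF coils, as the text describes.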
It can be shown that during the RF pulse (nuclear excitation phase), the rate of change of the net stationary magnetization vector M can be expressed as (the Bloch equation):

dM/dt = γ M × H,    (2)

where H is the net effective magnetic field.
Considering the total response of the spin system in the presence of an external magnetic field, along with the RF pulse for nuclear excitation followed by the nuclear relaxation phase, the change of the net magnetization vector can be expressed as⁵:

dM/dt = γ M × H − (M_x i + M_y j)/T_2 − (M_z − M_z^0) k/T_1,    (3)

where M_z^0 is the net magnetization vector in thermal equilibrium in the presence of the external magnetic field H_0 only, and T_1 and T_2 are, respectively, the longitudinal (spin-lattice) and transverse (spin-spin) relaxation times in the nuclear relaxation phase, when excited nuclei return to their thermal equilibrium state.
In other words, the longitudinal relaxation time T_1 represents the return of the net magnetization vector in the z direction to its thermal equilibrium state, while the transverse relaxation time T_2 represents the loss of coherence, or dephasing, of the spins, leading to a net zero vector in the x-y plane. The longitudinal and transverse magnetization vectors, with respect to the relaxation times in the actual stationary coordinate system, can be given by:

M_xy(t) = M_xy(0) e^(−t/T_2) e^(−iω_0 t),
M_z(t) = M_z^0 (1 − e^(−t/T_1)) + M_z(0) e^(−t/T_1),    (4)

where M_xy(0) = M_x'y'(0) e^(−iω_0 τ_p) represents the initial transverse magnetization vector, with the time set to zero at the end of the RF pulse of duration τ_p.
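The exponential envelopes in Eq. (4) are easy to tabulate. A sketch for the case of an ideal 90° pulse, where M_z(0) = 0 and |M_xy(0)| = M_z^0 (the T_1/T_2 values are illustrative, not tissue tables):

```python
# Magnitudes of the Eq. (4) relaxation terms after an ideal 90-degree pulse:
# |M_xy(t)| = M0 * exp(-t/T2) and M_z(t) = M0 * (1 - exp(-t/T1)).
# T1 and T2 below are illustrative gray-matter-like values, not tabulated data.
import math

M0, T1, T2 = 1.0, 900.0, 90.0  # relaxation times in milliseconds

def m_xy(t_ms: float) -> float:
    return M0 * math.exp(-t_ms / T2)

def m_z(t_ms: float) -> float:
    return M0 * (1.0 - math.exp(-t_ms / T1))

for t in (0.0, 90.0, 900.0):
    print(t, round(m_xy(t), 3), round(m_z(t), 3))
# transverse decay (T2) is much faster than longitudinal recovery (T1)
```

The large separation between T_1 and T_2, and their variation between tissues, is what image contrast in MRI is built on.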
During imaging, the RF pulse transmitted through an RF coil causes nuclear excitation, changing the longitudinal and transverse magnetization vectors. After the RF pulse is turned off, the excited nuclei go through the relaxation phase, emitting the absorbed energy at the same Larmor frequency; this can be detected as an electrical signal called the free induction decay (FID). The FID is the raw NMR signal and can be acquired through the same RF coil tuned to the Larmor frequency.
Let us represent a spatial location vector r in the spinning nuclei system, with a net magnetic field vector H_r(r) and the corresponding net magnetization vector M(r, t). The magnetic flux φ(t) through the RF coil can then be given as⁵:

φ(t) = ∫_object H_r(r) · M(r, t) dr,    (5)

and the voltage induced in the RF coil, V(t), which is the raw NMR signal, can be expressed (using Faraday's law) as:

V(t) = −∂φ(t)/∂t = −(∂/∂t) ∫_object H_r(r) · M(r, t) dr.    (6)
Figure 3 provides axial, coronal, and sagittal cross-sectional MR images of a brain. Details of the gray and white matter structures are evident in these images.
2.4 SINGLE PHOTON EMISSION COMPUTED
TOMOGRAPHY
In 1934, Jean Frédéric Joliot-Curie and Irène Joliot-Curie discovered radiophosphorus (³⁰P), a radioisotope used to demonstrate radioactive decay.
Fig. 3. From left to right: axial, coronal, and sagittal cross-sectional MR images of a human brain.
In 1951, radionuclide imaging of the thyroid was demonstrated by Cassen through the administration of the iodine radioisotope ¹³¹I. In 1952, Anger developed a scintillation camera, also known as the Anger camera, with sodium iodide crystals coupled to photomultiplier tubes. Kuhl and Edwards developed a transverse-section tomographic gamma-ray scanner for radionuclide imaging in the 1960s.¹²⁻¹⁵ Their imaging system included an array of multiple collimated detectors surrounding a patient, with rotate-translate motion to acquire projections for emission tomography. With advances in computer reconstruction algorithms and detector instrumentation, gamma-ray imaging is now known as single photon emission computed tomography (SPECT), providing 3D imaging of human organs and extending even to full-body imaging. The radioisotopes are introduced into the body through the administration of radiopharmaceutical drugs that metabolize with the tissue, making the tissue a source of gamma-ray emissions. The gamma rays from the tissue pass through the body and are captured by the detectors surrounding the body to acquire the raw data defining projections. The projection data are then used in reconstruction algorithms to display images with the help of a computer and high-resolution displays. In SPECT imaging, the commonly used radionuclides are thallium-201 (²⁰¹Tl), technetium-99m (⁹⁹ᵐTc), iodine-123 (¹²³I), and gallium-68 (⁶⁸Ga). These radionuclides decay by emitting gamma rays with photon energies ranging from 135 keV to 511 keV.
The attenuation of gamma rays is similar in nature to that of X-rays and can be expressed as:

I_d = I_0 e^(−μx),

where I_0 is the intensity of the gamma rays at the source and I_d is the intensity at the detector after the gamma rays have traveled a distance x in the body with a linear attenuation coefficient μ that depends on the density of the medium and the energy of the gamma-ray photons.
Figure 4 shows ⁹⁹ᵐTc SPECT images of a human brain. It can be noticed that SPECT images are poorer in resolution and anatomical detail than CT or MR images. However, SPECT images show the radioactivity distribution in the tissue, representing a specific metabolism or blood flow.
Fig. 4. SPECT image of a human brain.
2.5 POSITRON EMISSION TOMOGRAPHY
Positron emission tomography (PET) imaging methods were developed in the 1970s by a number of researchers, including Phelps, Robertson, Ter-Pogossian, Brownell, and several others.¹⁴,¹⁶ The concept of PET imaging is based on the simultaneous detection of two 511 keV photons traveling in opposite directions. The distinct feature of PET imaging is its ability to trace radioactive material metabolized in the tissue, providing specific information about its biochemical and physiological behavior.
Some radioisotopes decay by emitting positively charged particles called positrons. The emission of a positron is accompanied by a significant amount of kinetic energy. After emission, a positron typically travels 1 mm-3 mm, losing some of its kinetic energy. The loss of energy makes the positron suitable for interaction with a loosely bound electron in the material, leading to annihilation. The annihilation of the positron with the electron produces two 511 keV gamma photons traveling in opposite directions (close to 180° apart). The two photons can be detected simultaneously by two surrounding scintillation detectors within a small time window (typically on the order of nanoseconds). This simultaneous detection is called a coincidence detection, indicating that the annihilation originated along the line joining the two detectors involved. Thus, by detecting a large number of coincidences, the source location and distribution can be reconstructed through image reconstruction algorithms. It should be noted that the point of emission of a positron is different from the point of its annihilation with an electron. Though the imaging process aims to reconstruct the source, representing the locations of positron emission, it is the locations of the annihilation events that are reconstructed as an image in positron emission tomography (PET). However, the distribution of emission events is considered close enough to the distribution of annihilation events within the resolution limit.
The main advantage of PET imaging is its ability to extract metabolic and functional information about the tissue because of the unique interaction of the positron with the matter of the tissue. The most common positron-emitting radionuclide used in PET imaging is fluorine-18 (¹⁸F), administered as the fluorine-labeled radiopharmaceutical fluorodeoxyglucose (FDG). The FDG images obtained through PET imaging provide very significant information about the glucose metabolism and blood flow of the tissue. Such metabolism information has proven critical in determining the heterogeneity and invasiveness of tumors.
Figure 5 shows a set of axial cross-sectional brain PET images showing glucose metabolism. Streaking artifacts and low-resolution details can be noticed in these images. The artifacts seen in PET images are primarily due to the low volume of data, caused by the nature of the radionuclide-tissue interaction and the electronic collimation necessary to reject scattered events.
Fig. 5. Serial axial images of a human brain acquired using FDG PET.

2.6 ULTRASOUND IMAGING

Sound or acoustic waves were successfully used in sonar technology for military applications during World War II. The potential of ultrasound waves in medical imaging was explored and demonstrated by several researchers in the 1970s and 1980s, including Wild, Reid, Frey, Greenleaf, and Goldberg.¹⁷⁻²⁰ Today, ultrasound imaging is successfully used in diagnostic imaging of anatomical structures, blood flow measurements, and tissue characterization. Its safety, portability, and low cost have made ultrasound imaging a significantly successful diagnostic imaging modality.
Sound waves are characterized by wavelength and frequency. Sound waves audible to the human ear comprise frequencies ranging from about 15 Hz to 20 kHz; sound waves with frequencies above 20 kHz are called ultrasound waves. The velocity of propagation of sound in water and in most body tissues is about 1500 m/sec. Since sound is not electromagnetic radiation, electromagnetic wavelength-based resolution criteria do not apply; the resolution capability of acoustic energy instead depends on its frequency spectrum. The attenuation coefficient in body tissues is approximately proportional to the acoustic frequency, at about 1.5 dB/cm/MHz. Thus, at much higher frequencies, imaging is not meaningful because of excessive attenuation. In diagnostic ultrasound, imaging resolution is limited by the wavelength: shorter wavelengths provide better imaging resolution, but the corresponding higher frequencies attenuate more rapidly and therefore penetrate less deeply into tissue. Since the velocity of sound waves in a specific medium is fixed, the wavelength is inversely proportional to the frequency. In medical ultrasound imaging, sound waves of 2 MHz to 10 MHz can be used, but frequencies of 2 MHz to 5 MHz are most common.
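This resolution-versus-penetration trade-off can be quantified with λ = c/f and the quoted attenuation slope. A sketch (round numbers from the text; the 5 cm depth is chosen for illustration):

```python
# Wavelength and round-trip attenuation for diagnostic ultrasound,
# using c ~ 1500 m/s and ~1.5 dB/cm/MHz from the text; depth is illustrative.
C_TISSUE_M_PER_S = 1500.0
ATTEN_DB_PER_CM_PER_MHZ = 1.5

def wavelength_mm(freq_mhz: float) -> float:
    return C_TISSUE_M_PER_S / (freq_mhz * 1e6) * 1e3  # meters -> millimeters

def round_trip_loss_db(freq_mhz: float, depth_cm: float) -> float:
    # Factor of 2: the echo traverses the depth twice (out and back).
    return ATTEN_DB_PER_CM_PER_MHZ * freq_mhz * 2.0 * depth_cm

for f in (2.0, 5.0, 10.0):
    print(f, "MHz: wavelength =", wavelength_mm(f), "mm,",
          "loss over 5 cm depth =", round_trip_loss_db(f, 5.0), "dB")
```

At 2 MHz the wavelength is 0.75 mm with a 30 dB round-trip loss over 5 cm, while at 10 MHz the wavelength shrinks to 0.15 mm but the loss grows to 150 dB, which is why the lower frequencies dominate deep imaging.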
Let us assume that a transducer provides an acoustic signal of intensity s(x, y) with a pulse ω(t), transmitted in a medium with an attenuation coefficient μ and reflected by a biological tissue of reflectivity R(x, y, z) at a distance z from the transducer. The recorded reflected intensity of the time-varying acoustic signal, J_r(t), over the region R can then be expressed as:

J_r(t) = K ∫∫∫ (e^(−2μz)/z) R(x, y, z) s(x, y) ω̄(t − 2z/c) dx dy dz,    (7)

where K, ω̄(t), and c, respectively, represent a normalizing constant, the received pulse, and the velocity of the acoustic signal in the medium.
Using an adaptive time-varying gain to compensate for the attenuation of the signal, Eq. (7) for the compensated recorded reflected signal from the tissue, J_cr(t), can be simplified to:

J_cr(t) = K ∫∫∫ R(x, y, z) s(x, y) ω̄(t − 2z/c) dx dy dz,

or, in terms of a convolution, as:

J_cr(t) = K [R(x, y, ct/2) ⊗ s(−x, −y) ω̄(t)],    (8)

where ⊗ represents 3D convolution. This is a convolution of a reflectivity term characterizing the tissue with an impulse response characterizing the source parameters.
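The z = ct/2 relation appearing in Eq. (8) is the basis of pulse-echo ranging: an echo arriving at time t is assigned to the depth the pulse could have reached and returned from. A small sketch (the echo times are invented for illustration):

```python
# Pulse-echo depth ranging: an echo at time t comes from depth z = c*t/2,
# since the pulse travels to the reflector and back. Echo times are illustrative.
C_TISSUE_M_PER_S = 1500.0

def reflector_depth_cm(echo_time_us: float) -> float:
    return C_TISSUE_M_PER_S * (echo_time_us * 1e-6) / 2.0 * 100.0  # m -> cm

for t_us in (13.0, 65.0, 130.0):
    print(t_us, "us ->", round(reflector_depth_cm(t_us), 2), "cm")
```

An echo delay of 130 μs maps to a depth of about 9.75 cm, so the full depth axis of a B-mode line is built directly from arrival times.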
Backscattered echo and Doppler-shift principles are the ones most commonly exploited in the interaction of sound waves with human tissue. Sometimes the scattering information is complemented with transmission- or attenuation-related information, such as the velocity of sound in the tissue. Figure 6 shows diastolic color Doppler flow convergence in the apical four-chamber view of mitral stenosis.
Fig. 6. A diastolic color Doppler flow image showing an apical four-chamber view of mitral stenosis.
2.7 PRINCIPLES OF IMAGE FORMATION
It is usually desirable for an imaging system to behave like a linear, spatially invariant system. In other words, the response of the imaging system should be consistent, scalable, and independent of the spatial position of the object being imaged. A system is said to be linear if it follows two properties: scaling and superposition.¹⁻³
In mathematical representation, this can be expressed as:

h{aI_1(x, y, z) + bI_2(x, y, z)} = ah{I_1(x, y, z)} + bh{I_2(x, y, z)},    (9)

where a and b are scalar multiplication factors, and I_1(x, y, z) and I_2(x, y, z) are two inputs to the system represented by the response function h.
It should be noted that in real-world situations it is difficult to find a perfectly linear image formation system. For example, the response of photographic film or X-ray detectors cannot be linear over the entire operating range. Nevertheless, under constrained conditions and limited exposures, the response can be practically linear. Also, a nonlinear system can be modeled with piecewise-linear properties under specific operating conditions.
In general, image formation is a neighborhood process. One can assume a radiant energy source, such as light, illuminating an object represented by the function f(α, β, γ). Using the additive property of the radiating energy distribution to form an image, g(x, y, z) can be written as:

g(x, y, z) = ∫∫∫_{−∞}^{+∞} h(x, y, z, α, β, γ, f(α, β, γ)) dα dβ dγ,    (10)

where h(x, y, z, α, β, γ, f(α, β, γ)) is called the response function of the image formation system. If the image formation system is assumed to be linear, the image expression becomes:

g(x, y, z) = ∫∫∫_{−∞}^{+∞} h(x, y, z, α, β, γ) f(α, β, γ) dα dβ dγ.    (11)
The response function h(x, y, z, α, β, γ) is called the point spread function (PSF) of the image formation system. The PSF depends on the spatial extent of the object and image coordinate systems. The expression h(x, y, z, α, β, γ) is the generalized version of the PSF for the linear image formation system, which can be further characterized as spatially invariant (SI) or spatially variant (SV). If a linear image formation system is such that the PSF is uniform across the entire spatial extent of the object and image coordinates, the system is called a linear spatially invariant (LSI) system. In such a case, the image formation can be expressed as:

g(x, y, z) = ∫∫∫_{−∞}^{+∞} h(x − α, y − β, z − γ) f(α, β, γ) dα dβ dγ.    (12)
In other words, for an LSI image formation system, the image is represented as the convolution of the object radiant energy distribution and the PSF of the image formation system. It should be noted that the PSF is basically a degrading function that causes blur in the image and can be compared to the unit impulse response, a common term used in signal processing. Thus, the acquired image g(x, y, z) for an LSI imaging system can be expressed as the convolution of the object distribution with the PSF:

g(x, y, z) = h(x, y, z) ⊗ f(x, y, z) + n(x, y, z),    (13)

where n(x, y, z) represents an additive noise term.
Applying the Fourier transform, the above equation can be represented in the frequency domain as:

G(u, v, w) = H(u, v, w)F(u, v, w) + N(u, v, w),    (14)

where G(u, v, w), H(u, v, w), F(u, v, w), and N(u, v, w) are, respectively, the Fourier transforms of g(x, y, z), h(x, y, z), f(x, y, z), and n(x, y, z):

G(u, v, w) = ∫∫∫ g(x, y, z) e^(−j2π(ux+vy+wz)) dx dy dz,
H(u, v, w) = ∫∫∫ h(x, y, z) e^(−j2π(ux+vy+wz)) dx dy dz,
N(u, v, w) = ∫∫∫ n(x, y, z) e^(−j2π(ux+vy+wz)) dx dy dz.    (15)

Image processing and enhancement operations can be performed easily and more effectively on this representation of image formation through an LSI imaging system. However, the validity of such an assumption for real-world imaging systems may be limited.
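Equations (13) and (14) can be demonstrated numerically in one dimension. The sketch below (a toy object and Gaussian PSF, both invented for illustration) verifies that circular convolution in the spatial domain matches pointwise multiplication in the frequency domain:

```python
# LSI image formation g = h (*) f and its Fourier form G = H*F (Eqs. 13-14),
# demonstrated in 1D with a toy object and a Gaussian PSF (values invented).
import numpy as np

n_pts = 64
x = np.arange(n_pts)
f = np.zeros(n_pts)
f[20], f[40] = 1.0, 0.5                           # point-like "object"
h = np.exp(-0.5 * ((x - n_pts // 2) / 2.0) ** 2)  # Gaussian PSF, sigma = 2 samples
h /= h.sum()                                      # normalize so blur preserves total
h = np.roll(h, -(n_pts // 2))                     # center the PSF at index 0

# Frequency domain: multiply transforms, then invert (noise term omitted).
g_fourier = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(f)))
# Spatial domain: direct circular convolution, per Eq. (13).
g_direct = np.array([sum(f[m] * h[(k - m) % n_pts] for m in range(n_pts))
                     for k in range(n_pts)])

print(np.allclose(g_fourier, g_direct))  # True: the convolution theorem holds
```

The two point sources in f come out as overlapping Gaussian blurs in g, which is exactly the degrading role the text ascribes to the PSF.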
2.8 RECEIVER OPERATING CHARACTERISTICS (ROC)
ANALYSIS AS A PERFORMANCE MEASURE
Receiver operating characteristic (ROC) analysis is a statistical measure for studying the performance of an imaging or diagnostic system with respect to its ability to detect abnormalities accurately and reliably (true positives) without producing false detections. In other words, ROC analysis provides a systematic analysis of the sensitivity and specificity of a diagnostic test.¹,⁸,²¹
Let us assume the total number of examination cases is N_tot, out of which N_tp cases have a positive true condition, with the actual presence of the object, while the remaining cases, N_tn, have a negative true condition, with no object present. Suppose these cases are examined through the test whose accuracy, sensitivity, and specificity we need to evaluate. Assuming the observer does not cause any loss of information or misinterpretation, let N_otp (true positives) be the number of positive observations from the N_tp positive true-condition cases and N_ofn (false negatives) be the number of negative observations from the N_tp positive true-condition cases. Also, let N_otn (true negatives) be the number of negative observations from the N_tn negative true-condition cases and N_ofp (false positives) be the number of positive observations from the N_tn negative true-condition cases. Thus,

N_tp = N_otp + N_ofn    and    N_tn = N_ofp + N_otn.    (16)
The following relationships can easily be derived from the above definitions.

(1) True positive fraction (TPF) is the ratio of the number of positive observations to the number of positive true-condition cases:

TPF = N_otp / N_tp.    (17)

(2) False negative fraction (FNF) is the ratio of the number of negative observations to the number of positive true-condition cases:

FNF = N_ofn / N_tp.    (18)

(3) False positive fraction (FPF) is the ratio of the number of positive observations to the number of negative true-condition cases:

FPF = N_ofp / N_tn.    (19)

(4) True negative fraction (TNF) is the ratio of the number of negative observations to the number of negative true-condition cases:

TNF = N_otn / N_tn.    (20)
It should be noted that:

TPF + FNF = 1    and    TNF + FPF = 1.    (21)
Fig. 7. ROC curves (TPF versus FPF): curve "a" indicates better overall classification ability than curve "b," while curve "c" shows random probability.
A graph of TPF against FPF is called a receiver operating characteristic (ROC) curve for a specific medical imaging or diagnostic test for the detection of an object. It should also be noted that statistical random trials with equal probability of positive and negative observations would lead to the diagonal straight line as the ROC curve. Different tests and different observers may lead to different ROC curves for the same object-detection task. Figure 7 shows three different ROC curves for a hypothetical detection/diagnosis task. It can be noted that the observer corresponding to curve "a" is far better than the observer corresponding to curve "b."
The true positive fraction (TPF) is also called the sensitivity, while the true negative fraction (TNF) is known as the specificity of the test for the detection of an object. The accuracy of the test is given by the ratio of correct observations to the total number of examination cases. Thus,

Accuracy = (N_otp + N_otn) / N_tot.    (22)

In other words, different imaging modalities and observers may lead to different accuracy, sensitivity, and specificity levels.
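These definitions are straightforward to compute from a 2×2 table of observation counts. A sketch following Eqs. (16)-(22) (the counts are invented for illustration):

```python
# Sensitivity (TPF), specificity (TNF), and accuracy from observation counts,
# following Eqs. (16)-(22); the counts below are invented for illustration.
def roc_metrics(n_otp, n_ofn, n_otn, n_ofp):
    n_tp = n_otp + n_ofn        # positive true-condition cases, Eq. (16)
    n_tn = n_ofp + n_otn        # negative true-condition cases, Eq. (16)
    tpf = n_otp / n_tp          # sensitivity, Eq. (17)
    fpf = n_ofp / n_tn          # Eq. (19)
    tnf = n_otn / n_tn          # specificity, Eq. (20)
    accuracy = (n_otp + n_otn) / (n_tp + n_tn)  # Eq. (22)
    return tpf, fpf, tnf, accuracy

tpf, fpf, tnf, acc = roc_metrics(n_otp=45, n_ofn=5, n_otn=90, n_ofp=10)
print(tpf, fpf, tnf, acc)  # 0.9 0.1 0.9 0.9
```

Sweeping the decision threshold of a test yields a sequence of (FPF, TPF) pairs, which is exactly the ROC curve plotted in Fig. 7.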
2.9 CONCLUDING REMARKS
This chapter has presented the basic principles of the major medical imaging modalities and a linear, spatially invariant model of image formation that simplifies postprocessing operations for image enhancement and analysis. Though these assumptions may not be strictly satisfied by real-world imaging scanners, medical imaging systems often perform close to them. Medical imaging modalities and image analysis systems are evaluated on the basis of their ability to provide true detections of pathologies while minimizing false detections. Such performance evaluations are often conducted through receiver operating characteristic (ROC) curves, which provide a very useful way of understanding detection capability in terms of sensitivity and specificity and the potential trade-offs between true positive and false positive detections. Quantitative data analysis with appropriate models can improve image presentation (through better image reconstruction methods) and image analysis, with feature detection, analysis, and classification improving the true positive rate while minimizing the false positive rate for the specific pathology for which the imaging tests are performed.
References
1. Dhawan AP, Medical Image Analysis, John Wiley & Sons, Hoboken, 2003.
2. Barrett H, Swindell W, Radiological Imaging: The Theory of Image Formation, Detection and Processing, Volumes 1-2, Academic Press, New York, 1981.
3. Bushberg JT, Seibert JA, Leidholdt EM, Boone JM, The Essentials of Medical Imaging, Williams & Wilkins, 1994.
4. Cho ZH, Jones JP, Singh M, Fundamentals of Medical Imaging, John Wiley & Sons, New York, 1993.
5. Liang Z, Lauterbur PC, Principles of Magnetic Resonance Imaging, IEEE Press, 2000.
6. Lev MH, Hochberg F, Perfusion magnetic resonance imaging to assess brain tumor responses to new therapies, J Moffit Cancer Center 5: 446-450, 1998.
7. Stark DD, Bradley WG, Magnetic Resonance Imaging, 3rd edn., Mosby, 1999.
8. Shung KK, Smith MB, Tsui B, Principles of Medical Imaging, Academic Press, 1992.
9. Hounsfield GN, A method and apparatus for examination of a body by radiation such as X or gamma radiation, Patent 1283915, The Patent Office, London, England, 1972.
10. Hounsfield GN, Computerized transverse axial scanning tomography: Part 1, description of the system, Br J Radiol 46: 1016-1022, 1973.
11. Cormack AM, Representation of a function by its line integrals with some radiological applications, J Appl Phys 34: 2722-2727, 1963.
12. Cassen B, Curtis L, Reed C, Libby R, Instrumentation for ¹³¹I used in medical studies, Nucleonics 9: 46-48, 1951.
13. Anger H, Use of gamma-ray pinhole camera for in vivo studies, Nature 170: 200-204, 1952.
14. Brownell G, Sweet HW, Localization of brain tumors, Nucleonics 11: 40-45, 1953.
15. Casey ME, Eriksson L, Schmand M, Andreaco M, Paulus M, Dahlborn M, Nutt R, Investigation of LSO crystals for high spatial resolution positron emission tomography, IEEE Trans Nucl Sci 44: 1109-1113, 1997.
16. Kuhl E, Edwards RQ, Reorganizing data from transverse section scans using digital processing, Radiology 91: 975-983, 1968.
17. Fish P, Physics and Instrumentation of Diagnostic Medical Ultrasound, John Wiley & Sons, Chichester, 1990.
18. Kremkau FW, Diagnostic Ultrasound: Principles and Instrumentation, Saunders, Philadelphia, 1995.
19. Kremkau FW, Doppler Ultrasound: Principles and Instruments, Saunders, Philadelphia, 1991.
20. Hykes D, Ultrasound Physics and Instrumentation, Mosby, New York, 1994.
21. Swets JA, Pickett RM, Evaluation of Diagnostic Systems, Academic Press, Harcourt Brace Jovanovich Publishers, New York, 1982.
CHAPTER 3

Principles of X-ray Anatomical Imaging Modalities

Brent J Liu and HK Huang

This chapter provides basic concepts of various X-ray imaging modalities. The first part of the chapter addresses digital X-ray projection radiography, which includes digital fluorography, computed radiography, X-ray mammography, and digital radiography. The key components of each of these imaging modalities are discussed, along with the basic principles used to reconstruct the 2D image. The second part of the chapter focuses on 3D volume X-ray acquisition, which includes X-ray CT and multislice, cine, and 4D CT. The image reconstruction methods are discussed, along with the key components that have advanced CT technology to the present day.
3.1 INTRODUCTION
This chapter presents the X-ray anatomical imaging modalities, which account for a large share of all diagnostic imaging procedures; X-ray projection radiography alone accounts for 70% of the total. In this chapter, we focus only on digital X-ray anatomical imaging modalities, which include digital fluorography, computed radiography, X-ray mammography, digital radiography, X-ray CT, and multislice, cine, and 4D X-ray CT.
There are two approaches to converting a film-based image to digital form. The first is to utilize existing equipment in the radiographic procedure room and change only the image receptor component. Two technologies are in this category: computed radiography (CR), which uses photostimulable phosphor imaging plate technology, and digital fluorography. This approach requires no modification of the procedure room and is therefore more easily adopted for daily clinical practice. The second approach is to redesign the conventional radiographic procedure equipment, including the geometry of the X-ray beams and the image receptor. This method is more expensive to adopt, but it offers special features, such as low X-ray scatter, that would not otherwise be achievable with the conventional procedure.
3.2 DIGITAL FLUOROGRAPHY
Since 70% of radiographic procedures still use film as the output medium, it is necessary to develop methods to convert images on film to digital format. This section discusses digital fluorography, which converts images to digital format using a video camera and an A/D converter.

The video scanning system is a low-cost X-ray digitizer that produces either a 512 × 512 or a 1 K × 1 K digitized image with 8 bits/pixel. The system consists of three major components: a scanning device with a video or charge-coupled device (CCD) camera that scans the X-ray film; an analog-to-digital (A/D) converter that converts the video signals from the camera to gray-level values; and an image memory that stores the digital signals from the A/D converter. The image stored in the image memory is the digital representation of the X-ray film, or of the image in the image intensifier tube, obtained by using the video scanning system. If the image memory is connected to digital-to-analog (D/A) conversion circuitry and to a TV monitor, the image can be displayed back on the monitor as a video image. The memory can also be connected to a peripheral storage device for long-term image archiving. Figure 1 shows a block diagram of a video scanning
Fig. 1. Block diagram of a video scanning system: video scanner, A/D converter, image memory, image processor/computer, digital storage device, D/A converter, and video monitor. The digital chain is a standard component in all types of scanners.
system. The digital chain shown is a standard component in all types of scanners.
A video scanner system can be connected to an image intensifier tube to form a digital fluoroscopic system. Digital fluorography is a method that can produce dynamic digital X-ray images without drastically changing the radiographic procedure room used for conventional fluorography. The technique requires an add-on unit in the conventional fluorographic system. Figure 2 shows a schematic of the digital fluorographic system with the following major components:

(1) X-ray source: the X-ray tube and a grid to minimize X-ray scatter.
(2) Image receptor: an image intensifier tube.
(3) Video camera plus optical system: the output light from the image intensifier passes through an optical system, which allows the video camera to be adjusted for focusing. The amount of light entering the camera is controlled by a light diaphragm. The camera used is usually a plumbicon or a CCD (charge-coupled device) with 512 or 1024 scan lines.
Fig. 2. Schematic of a digital fluorographic system coupling the image intensifier and the digital chain: (1) X-ray tube, collimator, table, patient, and grid; (2) image intensifier tube; (3) light diaphragm, optics, and TV camera; (4) digital chain and TV monitor. See text for key to numbers.
(4) Digital chain: the digital chain consists of an A/D converter, image memories, an image processor, digital storage, and a video display. The A/D converter, the image memory, and the digital storage can handle a 512 × 512 × 8-bit image at 30 frames per second, or a 1024 × 1024 × 8-bit image at 7.5 frames per second. Sometimes a RAID (redundant array of inexpensive disks) is used to handle the high-speed data transfer.
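The two frame formats quoted above are not arbitrary: they demand the same sustained bandwidth from the digital chain. A minimal sketch of the arithmetic (the function name is ours, for illustration):

```python
def data_rate_mb_per_s(width, height, bytes_per_pixel, fps):
    """Sustained data rate (in MB/s) the digital chain must absorb."""
    return width * height * bytes_per_pixel * fps / 1e6

# Both acquisition modes quoted in the text demand the same bandwidth:
rate_512 = data_rate_mb_per_s(512, 512, 1, 30)    # 512 x 512 x 8-bit at 30 fps
rate_1k = data_rate_mb_per_s(1024, 1024, 1, 7.5)  # 1024 x 1024 x 8-bit at 7.5 fps
```

Both work out to about 7.9 MB/s, which is why the higher-resolution mode runs at one quarter of the frame rate.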
Fluorography is used to visualize the motion of body compartments (e.g. blood flow, heart beat) and the movement of a catheter, as well as to pinpoint an organ in a body region for subsequent detailed diagnosis. The exposure required for each image in a fluorography procedure is very small compared with that of a conventional X-ray procedure.
Digital fluorography is considered an add-on system because a digital chain is added to an existing fluorographic unit. The method utilizes the established X-ray tube assembly, image intensifier, video scanning, and digital technologies. The output from a digital fluorographic system is a sequence of digital images displayed on a video monitor. Digital fluorography has an advantage over conventional fluorography in that it produces images with a larger dynamic range and can remove uninteresting structures in the images by performing digital subtraction.
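The digital subtraction just mentioned is, at its core, a pixel-wise operation on a mask frame (acquired before contrast injection) and a contrast frame. A minimal sketch, assuming Beer-Lambert attenuation; the function name is ours, for illustration:

```python
import numpy as np

def dsa_subtract(mask, contrast, eps=1e-6):
    """Mask-mode digital subtraction (sketch).

    Log-transform both frames before subtracting so that, under
    Beer-Lambert attenuation, the static anatomy cancels exactly and
    only the attenuation added by the contrast agent remains.
    """
    return np.log(mask + eps) - np.log(contrast + eps)
```

With intensities I = I0·exp(−μ), the result equals the extra attenuation contributed by the contrast-filled vessels, independent of the overlying anatomy.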
When image processing is introduced into the digital fluorographic system, other names are used depending on the application, for example, digital subtraction angiography (DSA), digital subtraction arteriography (DSA), digital video angiography (DVA), intravenous video arteriography (IVA), computerized fluoroscopy (CF), and digital video subtraction angiography (DVSA).
3.3 IMAGING PLATE TECHNOLOGY
An imaging plate system, commonly called computed radiography (CR), consists of two components: the imaging plate and the scanning mechanism. The imaging plate (a laser-stimulated luminescence phosphor plate) used for X-ray detection is similar in principle to the phosphor intensifier screen used in the standard screen/film receptor. The scanning of a laser-stimulated luminescence phosphor imaging plate also uses a scanning mechanism (the reader) similar to that of a laser film scanner; the only difference is that the laser scans the imaging plate instead of an X-ray film. This section describes the principle of the imaging plate, specifications of the system, and system operation.
3.3.1 Principle of the Laser-Stimulated Luminescence Phosphor Plate
The physical size of the imaging plate is similar to that of a conventional radiographic screen; it consists of a support coated with a photostimulable phosphor layer made of BaFX:Eu²⁺ (X = Cl, Br, I), europium-activated barium fluorohalide compounds. After the X-ray exposure, the photostimulable phosphor crystal is able to store a part of the absorbed X-ray energy in a quasistable state. Stimulation of the plate by a 633-nanometer-wavelength helium-neon (red) laser beam leads to the emission of luminescence radiation of a different wavelength (400 nanometers), the amount of which is a function of the absorbed X-ray energy [Fig. 3(B)].

The luminescence radiation stimulated by the laser scanning is collected through a focusing lens and a light guide into a photomultiplier tube, which converts it into electronic signals. Figure 3(A)
Fig. 3. Physical principle of the laser-stimulated luminescence phosphor imaging plate. (A) From the X-ray photons exposing the imaging plate to the formation of the light image: the laser beam extracts the X-ray image from the plate by converting it to light photons, which form a light image; the small residual image left on the plate is thoroughly erased by flooding the plate with light, and the erased plate can be used again. (B) The wavelength of the scanning laser beam (b) is different from that of the light (a) emitted from the imaging plate after stimulation (courtesy of J Miyahara, Fuji Photo Film Co Ltd).
shows the physical principle of the laser-stimulated luminescence phosphor imaging plate. The size of the imaging plate can be 8 × 10, 10 × 12, 14 × 14, or 14 × 17 square inches. The image produced is 2000 × 2500 × 10 bits.
3.3.2 Computed Radiography System Block Diagram
and its Principle of Operation
The imaging plate is housed inside a cassette, just like a screen/film receptor. Exposure of the imaging plate (IP) to X-ray radiation results in the formation of a latent image on the plate (similar to the latent image formed in a screen/film receptor). The exposed plate is processed through a CR reader to extract the latent image, analogous to an exposed film being developed by a film developer. The processed imaging plate can be erased by bright light and used again. The imaging plate can be either removable or nonremovable. An image processor is used to optimize the display (e.g. look-up tables) based on the type of examination and body region.

The output of this system can take one of two forms, a printed film or a digital image; the latter can be stored in a digital storage device and displayed on a video monitor. Figure 4 illustrates the
Fig. 4. Dataflow of an upright CR system with nonremovable imaging plates (IP). (1) Formation of the latent image on the IP. (2) The IP is scanned by the laser beam. (3) Light photons are converted to electronic signals. (4) Electronic signals are converted to digital signals, which form a CR image (courtesy of Konica Corporation, Japan).
dataflow of an upright CR system with three nonremovable imaging plates. Figure 5 shows the latest XG5000 multiplate reader system with removable imaging plates and its components.
3.3.3 Operating Characteristics of the CR System
A major advantage of the CR system over the conventional screen/film system is that the imaging plate response is linear and has a large dynamic range between the X-ray exposure and the relative intensity of the stimulated phosphors. Hence, under similar X-ray exposure conditions, the image reader is capable of producing images with density resolution comparable or superior to that of the conventional screen/film system. Since the image reader automatically adjusts for the amount of exposure received by the plate, over- or underexposure within a certain limit does not affect the appearance of the image. This useful feature can best be explained by the two examples given in Fig. 6.
In quadrant A of Fig. 6, example I represents a plate exposed to a higher relative exposure level but with a narrower exposure range (10³–10⁴). The linear response of the plate after laser scanning yields a high-level but narrow light intensity (photostimulable luminescence, PSL) range, from 10³ to 10⁴. These light photons are converted into electronic output signals representing the latent
Fig. 5. A Fuji XG5000 CR system with the multi-imaging-plate reader and two QA/image-processing workstations (IIP and IIP Lite). Note that the second workstation shares the same database as the first, so that one X-ray technician can perform QA and image processing while another operates the plate reader and processes the imaging plates.
image stored on the imaging plate. The image processor senses a narrow range of electronic signals and selects a special look-up table [the linear line in Fig. 6(B)], which converts the narrow dynamic range 10³–10⁴ to a large relative light exposure of 1 to 50 [Fig. 6(B)]. If hardcopy is needed, a large-latitude film can be used that covers the dynamic range of the light exposure from 1 to 50; as shown in quadrant C, these output signals will register the entire optical density (OD) range, from OD 0.2 to OD 2.8, on the film. The total system response, including the imaging plate, the look-up table, and the
Fig. 6. Two examples, I and II, illustrate the operating characteristics of the CR system and explain how it compensates for over- and underexposure.
film subject to this exposure range is depicted as curve I in quadrant D. The system-response curve, relating the relative exposure on the plate to the OD of the output film, shows a high gamma value and is quite linear. This example demonstrates how the system accommodates a high exposure level with a narrow exposure range.
Consider example II, in which the plate receives a lower exposure level but with a wider exposure range. The CR system automatically selects a different look-up table in the image processor to accommodate this range of exposure, so that the output signals again span the entire light exposure range from 1 to 50.
The system-response curve is shown as curve II in quadrant D. The key in selecting the correct look-up table is that the range of the exposure has to span the total light exposure of the film, namely from 1 to 50. Note that in both examples, the entire optical density range useful for diagnostic radiology is utilized.
If a conventional screen/film combination system were used, the exposure in example I of Fig. 6 would utilize only the higher optical density region of the film, whereas in example II it would utilize only the lower region. Neither case would utilize the full dynamic range of the optical density of the film. From these two examples, it is seen that the CR system allows utilization of the full optical density dynamic range, regardless of whether the plate is overexposed or underexposed. Figure 7 shows an example comparing the results of using screen/film versus CR under identical X-ray exposures.
Fig. 7. Comparison of the quality of images obtained using (A) the conventional screen/film method and (B) CR techniques. Exposures were 70 kVp and 10 mAs, 40 mAs, 160 mAs, and 320 mAs on a skull phantom. It is seen in this example that the CR technique is almost dose independent (courtesy of Dr S Balter).
The same effect is achieved if the image signals are intended for digital output rather than for hardcopy film. That is, the digital image produced by the image reader and the image processor will also utilize the full dynamic range of quadrant D to produce 10-bit digital numbers.
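The automatic range adjustment described above amounts to a linear look-up table that stretches whatever exposure range the reader senses onto the full 10-bit output scale. A minimal sketch (the function name and the purely linear mapping are ours, for illustration):

```python
import numpy as np

def lut_rescale(exposure, lo, hi, out_max=1023):
    """Map the sensed exposure range [lo, hi] onto the full 10-bit
    output scale, wherever that range sits in absolute exposure."""
    e = np.clip(np.asarray(exposure, dtype=float), lo, hi)
    return np.rint((e - lo) / (hi - lo) * out_max).astype(int)

# Example I: high but narrow range; example II: low but wide range.
high_narrow = lut_rescale([1e3, 1e4], 1e3, 1e4)  # -> [0, 1023]
low_wide = lut_rescale([1.0, 50.0], 1.0, 50.0)   # -> [0, 1023]
```

Either way, the output spans the full digital range, which is the software analog of always filling the OD 0.2 to OD 2.8 band on film.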
3.4 FULL-FIELD DIRECT DIGITAL MAMMOGRAPHY
3.4.1 Screen/Film and Digital Mammography
Conventional screen/film mammography produces a very high quality mammogram on an 8 in × 10 in film. Some abnormalities in the mammogram require 50 µm spatial resolution to be recognized. For this reason, it is difficult to use CR or a laser film scanner to convert a mammogram to a digital image, which hinders the integration of these modality images into PACS. Yet mammography examinations account for about 8% of all diagnostic procedures in a typical radiology department. During the past several years, owing to strong support from the National Cancer Institute and the United States Army Medical Research and Development Command, direct digital mammography systems have been developed through joint efforts between academic institutions and private industry, and some of these systems are in clinical use. In the next section, we describe the principle of digital mammography, a very critical component of a totally digital imaging system in a hospital.
3.4.2 Full-Field Direct Digital Mammography
There are two methods of obtaining a full-field direct digital mammogram. One is the imaging plate technology described in Sec. 3.3, but with a higher-resolution imaging plate made of different materials and a detector system of higher quantum efficiency. The other is the slot-scanning method, which this section summarizes.

The slot-scanning technology modifies the image receptor of a conventional mammography system by using a slot-scanning
mechanism and detector system. The slot-scanning mechanism scans the breast with an X-ray fan beam, and the image is recorded by a charge-coupled device (CCD) camera encompassed in the Bucky antiscatter grid of the mammography unit. Figure 8 shows a picture of an FFDDM system. The X-ray photons emitted from the X-ray tube are shaped by a collimator into a fan beam. The width of the fan beam covers one dimension of the image area (e.g. the x-axis), and the fan beam sweeps in the other direction (the y-axis). The movement of the detector system is synchronized with the sweep of the fan beam. The detector system of the FFDDM shown is composed of a thin phosphor screen coupled to four CCD detector arrays via a tapered fiber optic bundle. Each CCD array is composed of 1100 × 300 CCD cells. The gap between any two adjacent CCD arrays
Fig. 8. A slot-scanning digital mammography system. The slot, 300 pixels wide, covers the x-axis (4400 pixels); the X-ray beam sweeps (arrow) in the y direction, producing over 5500 pixels. X: X-ray and collimator housing; C: breast compressor.
requires a procedure called “butting” to minimize the loss of pixels. The phosphor screen converts the penetrating X-ray photons (i.e. the latent image) into light photons. The light photons pass through the fiber optic bundle, reach the CCD cells, and are then transformed into electronic signals; the more light photons received by a CCD cell, the larger the resulting signal. The electronic signals are quantized by an analog-to-digital converter to create a digital image. Finally, the image pixels travel through a data channel to the system memory of the FFDDM acquisition computer. Figure 9 shows a 4 K × 5 K × 12-bit digital mammogram obtained with the system shown in Fig. 8. A screening mammography examination requires four images, two of each breast, producing a total of 160 Mbytes of image data.
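The 160-Mbyte figure follows directly from the image dimensions, since each 12-bit pixel occupies two bytes of storage. A quick check (the function name is ours, for illustration):

```python
def exam_size_mbytes(n_images=4, width=4000, height=5000, bits_stored=12):
    """Storage for one screening exam; a 12-bit pixel occupies 2 bytes."""
    bytes_per_pixel = (bits_stored + 7) // 8   # 12 bits -> 2 bytes
    return n_images * width * height * bytes_per_pixel / 1e6
```

Four 4 K × 5 K views at 2 bytes per pixel give 160 Mbytes, matching the figure quoted above.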
Fig. 9. A 4 K × 5 K × 12-bit digital mammogram obtained with the slot-scanning FFDDM, shown on a 2 K × 2.5 K monitor. The window at the upper part of the image is the magnifying glass showing a true 4 K × 5 K region (courtesy of Drs E Sickles and SL Lou).
3.5 DIGITAL RADIOGRAPHY
During the past five years, research laboratories and manufacturers have devoted tremendous energy and resources to investigating new digital radiography systems other than CR. The main emphases are to improve image quality and operational efficiency, and to reduce the cost of projection radiography examinations. Digital radiography (DR) is an ideal candidate. In order to compete with conventional screen/film and CR, a good DR system should:

• Have a detector with high detective quantum efficiency (DQE), spatial resolution of 2–3 line pairs/mm or higher, and a high signal-to-noise ratio.
• Produce digital images of high quality.
• Deliver a low dose to patients.
• Produce the digital image within seconds after X-ray exposure.
• Comply with industrial standards.
• Have an open architecture for connectivity.
• Be easy to operate.
• Be compact in size.
• Offer competitive cost savings.
Depending on the method used for X-ray photon conversion, DR can be categorized into direct and indirect image capture methods. In indirect image capture, attenuated X-ray photons are first converted to light photons by a phosphor or scintillator, and the light photons are then converted to electronic signals to form the DR image. The direct image capture method generates a digital image without going through the light photon conversion process. Figure 10 shows the difference between the direct and indirect digital capture methods. The advantage of the direct image capture method is that it eliminates the intermediate step of light photon conversion. The disadvantages are that the engineering involved in direct digital capture is more elaborate, and that it is inherently difficult to use the detector for dynamic image acquisition because the detector must be recharged after each readout. The indirect
Fig. 10. Direct and indirect image capture methods in digital radiography. (A) Direct image capture: selenium plus a semiconductor converts X-rays directly to electrical signals, yielding a direct digital radiograph. (B) Indirect image capture: a scintillator or phosphor converts X-rays to light photons, which are then converted to electrical signals, yielding an indirect digital radiograph.
capture method uses either amorphous silicon phosphor or scintillator panels. The direct capture method uses an amorphous selenium panel. The direct capture method thus appears to have an advantage over the indirect capture method, since it eliminates the intermediate step of light photon conversion.
The two prevailing scanning modes in digital radiography are slot scanning and areal scanning. The digital mammography system discussed in the last section uses the slot-scanning method. Current technology for the areal detection mode uses flat-panel sensors; the flat panel can be one large panel or several smaller panels put together. The areal scan method has the advantage of fast image capture, but it also has two disadvantages: high X-ray scattering, and the technical difficulty of manufacturing large flat panels.
Digital radiography (DR) designs are flexible: a DR unit can be used as an add-on in a typical radiography room or as a dedicated system. Among dedicated systems, some designs can be used both as a tabletop unit attached to a C-arm radiographic device and as an upright unit, as shown in Fig. 11. Figure 12 illustrates the formation of a DR image, comparing it with that of a CR image in Fig. 4. A typical DR unit produces a 2000 × 2500 × 12-bit image instantaneously after the exposure.
Fig. 11. Three configurations of digital radiography design: (A) dedicated C-arm system, (B) dedicated chest unit, and (C) add-on unit.
Fig. 12. Steps in the formation of a DR image, compared with those of a CR image shown in Fig. 4.
3.6 X-RAY CT AND MULTISLICE CT
3.6.1 Image Reconstruction from Projections
Since most sectional images, such as CT images, are generated by image reconstruction from projections, we first summarize the Fourier projection theorem, the algebraic reconstruction method, and the filtered back-projection method before discussing the imaging modalities.
3.6.1.1 The Fourier Projection Theorem
Let f(x, y) be a 2D cross-sectional image of a three-dimensional object. The image reconstruction theorem states that f(x, y) can be reconstructed from its one-dimensional cross-sectional projections. In general, 180 different projections in one-degree increments are necessary to produce a satisfactory image, and using more projections always results in a better reconstructed image.

Mathematically, the image reconstruction theorem can be described with the help of the Fourier transform (FT). Let f(x, y) represent the two-dimensional image to be reconstructed, and let p(x) be the one-dimensional projection of f(x, y) onto the horizontal axis, which can be measured experimentally (see Fig. 13, the zero-degree projection). In the case of X-ray CT, we can consider p(x) to be the total linear attenuation of the tissues traversed by a collimated X-ray beam at location x.
Then:

p(x, 0) = ∫_{−∞}^{+∞} f(x, y) dy.  (1)

The 1D Fourier transform of p(x) has the form:

P(u) = ∫_{−∞}^{+∞} [ ∫_{−∞}^{+∞} f(x, y) dy ] exp(−i2πux) dx.  (2)
Equations (1) and (2) imply that the 1D Fourier transform of a one-dimensional projection of a two-dimensional image is identical to the corresponding central section of the two-dimensional Fourier transform of the object. For example, the two-dimensional image can be a transverse (cross-) sectional X-ray image of the body, and
Fig. 13. Principle of the Fourier projection theorem for image reconstruction from projections. F(0, 0) is at the center of the 2D FT; low-frequency components are represented in the center region. The numerals represent the steps described in the text. P(x, θ): X-ray projection at angle θ; F(u, θ): 1D Fourier transform of P(x, θ); IFT: inverse Fourier transform; θ = 0°, ..., 180°.
the one-dimensional projections can be the X-ray attenuation profiles (projections) of the same section obtained from linear X-ray scans at certain angles. If 180 projections at one-degree increments are accumulated and their 1D FTs performed, each of these 180 1D Fourier transforms represents a corresponding central line of the two-dimensional Fourier transform of the X-ray cross-sectional image. The collection of all 180 of these 1D Fourier transforms is the 2D Fourier transform of f(x, y).
The steps of a 2D image reconstruction from its 1D projections, shown in Fig. 13, are as follows:

(1) Obtain 180 1D projections of f(x, y), p(x, θ), where θ = 1, ..., 180.
(2) Perform the FT on each 1D projection.
(3) Arrange all these 1D FTs according to their corresponding angles in the frequency domain. The result is the 2D FT of f(x, y).
(4) Perform the inverse 2D FT of (3), which gives f(x, y).
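The theorem behind these steps can be checked numerically: the 1D FT of the zero-degree projection must equal the central line of the image's 2D FT. A minimal sketch using NumPy's FFT routines (the function name is ours, for illustration):

```python
import numpy as np

def fourier_slice_holds(f):
    """Check the Fourier projection theorem for the 0-degree view.

    The 1D FT of the projection of f onto the x-axis (summing over y)
    must equal the v = 0 central line of the 2D FT of f.
    """
    projection = f.sum(axis=0)            # p(x, 0): integrate over y
    P = np.fft.fft(projection)            # 1D FT of the projection
    central_line = np.fft.fft2(f)[0, :]   # v = 0 line of the 2D FT
    return np.allclose(P, central_line)
```

The identity is exact for the discrete transform as well, which is what makes step (3), assembling 1D FTs into the 2D frequency plane, legitimate.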
The Fourier projection theorem forms the basis of tomographic image reconstruction. Other methods that can also be used to reconstruct a 2D image from its projections are discussed later in this chapter. We emphasize that the reconstructed image from projections is not always exact; it is only an approximation of the original image. A different reconstruction method will give a slightly different version of the original image. Since all these methods require extensive computation, specially designed image reconstruction hardware is normally used to implement the algorithm. The term “computerized (computed) tomography” (CT) is often used to signify that the image is obtained from its projections using a reconstruction method. If the 1D projections are obtained from X-ray transmission (attenuation) profiles, the procedure is called XCT or X-ray CT. In the following sections, we summarize the algebraic and filtered back-projection methods with simple numerical examples.
3.6.1.2 The Algebraic Reconstruction Method
The algebraic reconstruction method is often used for the reconstruction of images from an incomplete number of projections (i.e. fewer than 180°). The result can be an exact reconstruction of the original image f(x, y), though only by pure chance. For a 512 × 512 image, over 180 projections, each with a sufficient number of data points, are required to render a good-quality image.
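Algebraic reconstruction is commonly implemented as Kaczmarz's method: the image, flattened into a vector x, is repeatedly projected onto the hyperplane defined by each measured ray sum. A minimal sketch under the assumption that the system matrix A is given explicitly (the function and variable names are ours, for illustration):

```python
import numpy as np

def art_reconstruct(A, p, n_sweeps=500, relax=1.0):
    """Algebraic reconstruction via Kaczmarz's method (sketch).

    A : (n_rays, n_pixels) system matrix; A[i, j] is the contribution
        of pixel j to ray i.
    p : measured ray sums, p[i] = sum_j A[i, j] * x[j].
    """
    x = np.zeros(A.shape[1])
    row_norms = (A ** 2).sum(axis=1)
    for _ in range(n_sweeps):
        for i in range(len(p)):
            if row_norms[i] == 0.0:
                continue
            # Project x onto the hyperplane A[i] . x = p[i].
            x += relax * (p[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```

For a consistent system the iterates converge to a solution; with fewer rays than pixels, the incomplete-projection case mentioned above, they converge to the minimum-norm solution when started from zero.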
3.6.1.3 The Filtered (Convolution) Back-Projection Method
The filtered back-projection method requires two components: the back-projection algorithm, and the selection of a filter to modify the projection data. The selection of a proper filter for a given anatomical region is the key to obtaining a good reconstruction with the filtered (convolution) back-projection method, which is the method of choice for almost all XCT scanners. The result of this method, again only by pure chance, is an exact reconstruction of the original f(x, y). The mathematical formulation of the filtered back-projection method is
given in Eq. (3):

f(x, y) = ∫₀^π h(t) ∗ m(t, θ) dθ,  (3)

where m(t, θ) is the sampling point t of the projection at angle θ, h(t) is the filter function, and “∗” is the convolution operator.
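Equation (3) translates almost line by line into code: filter each projection (here with a simple ramp filter applied in the frequency domain), then smear each filtered profile back across the image along its angle and sum over angles. A minimal sketch with nearest-neighbor interpolation; the names and the specific filter choice are ours, for illustration:

```python
import numpy as np

def fbp_reconstruct(sinogram, thetas_deg):
    """Filtered back-projection (sketch), Eq. (3) made concrete.

    sinogram : (n_angles, n_det) array; row k is the projection
               m(t, theta_k) sampled at n_det detector positions.
    """
    n_ang, n_det = sinogram.shape
    # h(t): ramp (Ram-Lak) filter, applied in the frequency domain.
    ramp = np.abs(np.fft.fftfreq(n_det))
    filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))
    # Back-project each filtered profile along its angle and sum.
    mid = n_det // 2
    coords = np.arange(n_det) - mid
    X, Y = np.meshgrid(coords, coords)
    recon = np.zeros((n_det, n_det))
    for profile, theta in zip(filtered, np.deg2rad(thetas_deg)):
        t = X * np.cos(theta) + Y * np.sin(theta)   # detector coordinate
        idx = np.clip(np.rint(t).astype(int) + mid, 0, n_det - 1)
        recon += profile[idx]
    return recon * np.pi / n_ang
```

Real scanners replace the nearest-neighbor lookup with interpolation and the plain ramp with an apodized filter chosen per anatomical region, as noted above.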
3.6.2 Transmission X-Ray Computed Tomography (XCT)
3.6.2.1 Conventional XCT
A CT scanner consists of a scanning gantry housing an X-ray tube and a detector unit, and a movable bed that can align a specific cross section of the patient with the gantry. The gantry provides a fixed relative position between the X-ray tube and the detector unit. A scanning mode is the procedure for collecting X-ray attenuation profiles (projections) from a transverse (cross) section of the body. From these projections, the CT scanner's computer program or back-projector hardware reconstructs the corresponding cross-sectional image of the body. Figures 14 and 15 show schematics of the two most popular XCT scanner designs (third and fourth generation), both using an X-ray fan beam. These types of XCT take about 5 seconds for one sectional scan, plus additional time for image reconstruction.
3.6.2.2 Spiral (Helical) XCT
Three other configurations can improve the scanning speed: helical (spiral) CT, cine CT (Sec. 3.6.2.3), and multislice CT (Sec. 3.6.2.4). Helical CT is based on the design of the third- or fourth-generation scanner, cine CT uses a scanning electron beam X-ray tube, and multislice CT uses a cone beam instead of a fan beam.

The CT configurations shown in Figs. 14 and 15 have one common characteristic: the patient's bed remains stationary during the scanning; after a complete scan, the bed advances a certain distance and the second scan begins. The start-and-stop motions of the bed slow down the scanning operation. If the patient's
Fig. 14. Schematic of the rotation scanning mode using a fan-beam X-ray. The detector array rotates with the X-ray tube as a unit.
bed can assume a forward motion at a constant speed while the scanning gantry rotates continuously, the total scanning time of a multiple-section examination can be reduced. Such a configuration is not possible with a conventional scanner, however, because the scanning gantry is connected to the external high-energy transformer and power supply through cables. The spiral or helical CT design does not involve cables.
Figure 16 illustrates the principle of spiral CT. There are two possible scanning modes: single helical and cluster helical. In the single helical mode, the bed advances linearly while the gantry rotates in sync for a period of time, say 30 seconds. In the cluster helical
Principles of Xray Anatomical Imaging Modalities
Fig. 15. Schematic of the rotation scanning mode with a stationary scintillation detector array; only the X-ray source rotates.
Fig. 16. Helical (spiral) CT scanning modes.
mode, the simultaneous rotation and translation lasts only 15 seconds, whereupon both motions stop for 7 seconds before resuming again. The single helical mode is used for patients who can hold their breath for a longer period of time, while the cluster helical mode is for patients who need to take a breath after 15 seconds.
The helical XCT design was introduced in the late 1980s. It is based on three technological advances: the slip-ring gantry, improved detector efficiency, and greater X-ray tube cooling capability. The slip-ring gantry contains a set of rings and electrical components that rotate, slide and make contact to generate both high energy (to supply the X-ray tube and generator) and standard energy (to supply power to other electrical and computer components). For this reason, no electrical cables are necessary to connect the gantry and external components. During helical scanning, the term "pitch" is used to define the relationship between the X-ray beam collimation and the velocity of the bed movement.
Pitch = Table movement in mm per gantry rotation/slice thickness.
Thus, a pitch equal to "1" means that the gantry rotates a complete 360° as the bed advances 1.5 mm in one second, which gives a slice thickness of 1.5 mm. During this time, raw data are collected covering 360° and 1.5 mm. Assuming one rotation takes one second, then for the single helical scan mode, 30 seconds of raw data are continuously collected while the bed moves 45 mm. After the data collection phase, the raw data are interpolated and/or extrapolated to sectional projections. These organized projections are used to reconstruct individual sectional images. In this case, they are 1.5 mm contiguous slices. Reconstruction slice thickness can be from 1.5 mm to 1 cm, depending on the interpolation and extrapolation methods used.
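The single-slice pitch definition above can be checked numerically; a minimal sketch (the function name is illustrative, not from the text):

```python
def pitch(table_mm_per_rotation, slice_thickness_mm):
    """Spiral CT pitch: table travel per gantry rotation / slice thickness."""
    return table_mm_per_rotation / slice_thickness_mm

# Chapter example: the bed advances 1.5 mm per one-second rotation,
# with a 1.5 mm reconstructed slice thickness.
print(pitch(1.5, 1.5))   # contiguous scanning, pitch = 1.0
# Over the 30-second single helical mode, total bed travel in mm:
print(30 * 1.5)          # 45.0, matching the text
```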
The advantages of spiral CT scans are speed of scanning, the ability to select slices from continuous data to reconstruct slices with peak contrast medium, retrospective creation of overlapping or thin slices, and volumetric data collection. The disadvantages are helical reconstruction artifacts and potential object boundary unsharpness.
3.6.2.3 Cine XCT
Cine XCT, introduced in the early 1980s, uses a completely different X-ray technology, namely, an electron beam X-ray tube; this scanner is fast enough to capture the motion of the heart. The detector array of the system is based on the fourth-generation stationary detector array (scintillator and photodiode). As shown schematically in Fig. 17, an electron beam (1) is accelerated through the X-ray tube and bent by the deflection coil (2) toward one of the four target rings (3). Collimators at the exit of the tube restrict the X-ray beam to a 30° fan beam, which forms the energy source of scanning. Since there are four tungsten target rings, each of which has a fairly large area (210° tungsten, 90 cm radius) for heat dissipation, the X-ray fan beam can sustain the energy level required for continuous scanning in various scanning modes. In addition, the detector and data collection technologies used in this system allow very rapid data acquisition. Two detector rings (indicated by 4 in Fig. 17) allow data acquisition for two consecutive sections simultaneously. For example, in the slow acquisition mode with a 100 ms scanning time and an 8 ms interscan delay, cine XCT can provide 9 scans/s; in the fast acquisition mode with a 20 ms scanning time, 34 scans/s.
The scanning can be done continuously on the same body section (to collect dynamic motion data of the section) or along the axis of the patient (to observe the vascular motion). Because of its fast scanning speed, cine XCT is used for cardiac motion and vascular studies and
Fig. 17. Schematic of the cine XCT. Source: Diagram adapted from a technical
brochure of Imatron Inc.
emergency room scans. Until the availability of multislice XCT, cine XCT was the fastest scanner for dynamic studies.
3.6.2.4 Multislice XCT
In spiral XCT, the patient's bed moves during the scan, but the X-ray beam is a fan beam perpendicular to the patient axis, and the detector system is built to collect data for the reconstruction of one slice. If the X-ray beam is shaped into a three-dimensional cone beam with the z-axis parallel to the patient's axis, and if a multiple detector array (in the z-direction) system is used to collect the data, then we have a multislice XCT scanner (see Fig. 18). Multislice XCT, in essence, is also a spiral scan except that the X-ray beam is shaped into a cone beam geometry. Multislice XCT can obtain many images in one examination with a very rapid acquisition time, for example, 160 images in 20 seconds, or 8 images/sec, or 4 MB/sec of raw data. Figure 18 shows the schematic. It is seen from this figure that a full rotation of the cone beam is necessary to collect sufficient projection data to reconstruct a number of slices equal to the z-axis collimation of the detector system (see below for the definition). Multislice XCT uses several new technologies:
(1) New detector: A ceramic-type detector is used to replace traditional crystal technology. The ceramic detector has the advantages of more light photons in the output, less afterglow time, higher resistance to radiation and mechanical damage, and can be shaped much thinner (1/2) for an equivalent amount of X-ray absorption, compared with crystal scintillators.
(2) Real-time dose modulation: A method to minimize the dose delivered to the patient using the cone beam geometry by modulating the mAs (milliampere-seconds) of the X-ray beam during the scan.
(3) Cone beam geometry image reconstruction algorithm: Efficient collection and recombination of cone beam X-ray projections (raw data) for sectional reconstruction.
Fig. 18. Geometry of the multislice XCT. The patient axis is in the z-direction. The X-rays (X), shaped as a collimated cone beam, rotate around the z-axis 360° continuously, in sync with the patient's bed moving linearly in the z-direction. The detector system (D) is a combination of detector arrays shaped in a concave surface facing the X-ray beam. The number of slices per 360° rotation is determined by two factors: the number of detector arrays in the z-direction, and the method used to recombine the cone beam projection data into transverse sectional projections (Fig. 13). The reconstructed images are transverse views perpendicular to the z-axis. If the cone beam does not rotate while the patient's bed is moving, the reconstructed image is equivalent to a digital fluorographic image.
(4) High speed data output channel: During one examination, say,
for 160 images, much more data have to be collected during the
scanning. Fast I/O data channels from the detector system to
image reconstruction are necessary.
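The acquisition-rate figures quoted above can be verified with a quick back-of-the-envelope calculation (the per-image size of 512 × 512 × 2 bytes is given later, in Sec. 3.6.5.2):

```python
# 160 images acquired in 20 seconds; each image is 512 x 512 x 2 bytes.
bytes_per_image = 512 * 512 * 2            # 524288 bytes = 0.5 MB
images_per_sec = 160 / 20
mb_per_sec = images_per_sec * bytes_per_image / 2**20
print(images_per_sec, mb_per_sec)          # 8.0 images/sec, 4.0 MB/sec
```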
If the patient bed is moving linearly but the gantry does not rotate, the result is a digital fluorographic image with better image quality than that discussed in Sec. 3.2. Currently, 16-slice and 32-slice detector CT scanners are in heavy clinical use, with 64-, 128-, and 256-slice detector CT scanners on the near horizon. Figure 19 shows a 3D rendered volume image of acquisition data from a 64-slice detector CT scan of a human heart. With the advent of the 256-slice detector CT scanner, it will be feasible to acquire image data for an entire organ such as the heart in a single scan cycle, as shown in the figure.
3.6.3 Some Standard Terminology Used in Multi-Slice XCT
Recall the term "pitch" defined for spiral XCT. With cone beam multidetector scanning, because of the multidetector arrays in the z-direction (Fig. 18), the table movement can be many times the thickness of an individual slice. For example, take a 16 × 1.5 mm detector system (16 arrays with 1.5 mm thickness per array), and
Fig. 19. A 3D volume rendered image of the heart from data acquired by a 64-slice CT scanner. Note that a 256-slice CT scanner would be able to scan the entire heart in one single rotation (courtesy of Toshiba Medical Systems).
with the slice thickness of an individual image being 1.5 mm, then use the definition of "pitch" from the spiral scan:
Pitch = Table movement in mm per gantry rotation/Slice thickness
= (16 × 1.5 mm/rotation)/1.5 mm = (24 mm/rotation)/1.5 mm = 16.
That means a table that moves 24 mm/rotation with a reconstructed slice thickness of 1.5 mm would have a pitch of 16 (cf. Sec. 3.6.2.2). This case also represents contiguous scans. Comparing this example with that shown in Sec. 3.6.2.2 for a single-slice scan, the definition of "pitch" shows some discrepancy due to the size of the multidetector arrays. Since different manufacturers produce different sizes of multidetector arrays, the word "pitch" becomes confusing. For this reason, the International Electrotechnical Commission (IEC) accepts the following definition of pitch (now often referred to as the IEC pitch):
z-axis collimation (T) = the width of the tomographic section along the z-axis imaged by one data channel (array). In multidetector row (multislice) CT scanners, several detector elements may be grouped together to form one data channel (array).
Number of data channels (N) = the number of tomographic sections imaged in a single axial scan.
Table speed or increment (I) = the table increment per axial scan, or the table increment per rotation of the X-ray tube in a helical (spiral) scan.
Pitch (P) = Table speed (I mm/rotation)/(N · T)
Thus, for a 16-detector scanner in a 16 × 1.5 mm scan mode, N = 16 and T = 1.5 mm, and if the table speed = 24 mm/rotation, then P = 1, a contiguous scan. If the table speed is 36 mm/rotation, then the pitch is 36/(16 × 1.5) = 1.5.
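The IEC pitch follows directly from the definition P = I/(N · T); a minimal sketch with the chapter's example values (the function name is illustrative):

```python
def iec_pitch(table_speed_mm_per_rot, n_channels, z_collimation_mm):
    """IEC pitch: table increment per rotation divided by N * T."""
    return table_speed_mm_per_rot / (n_channels * z_collimation_mm)

# 16-detector scanner in 16 x 1.5 mm mode:
print(iec_pitch(24, 16, 1.5))   # 1.0 -> contiguous scan
print(iec_pitch(36, 16, 1.5))   # 1.5
```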
3.6.4 Four-Dimensional (4D) XCT
Referring to Fig. 17, with the bed stationary but the gantry continuously rotating, we would have a four-dimensional XCT, with the fourth
dimension being time. In this scanning mode, human body physiological dynamics can be visualized in 3D. Current multislice XCT, with a limited size of detector arrays in the z-direction and a data collection system of 100 MB/sec, can only visualize a limited segment of the body. In order to realize the potential clinical applications of 4D XCT, several challenges must be addressed:
(1) Extend the cone beam X-ray and the length of the detector array in the z-direction. Currently, a detector system with 256 arrays and 912 detectors per array is available in some prototype 4D XCT systems.
(2) Improve the efficiency and performance of the A/D conversion at the detector.
(3) Increase the data transfer rate between the data acquisition system and the display system from 100 MB/sec to 1 GB/sec.
(4) Revolutionize display methods for 4D images.
4D XCT can produce images in the gigabyte range per examination. Methods of archiving, communication, and display become challenging issues.
3.6.4.1 PET/XCT Fusion Scanner
XCT is excellent for anatomical delineation with a fast scanning time, while positron emission tomography (PET) is slow in obtaining physiological images of poorer resolution, but good for the differentiation between benign and malignant tumors. PET requires attenuation correction in image reconstruction, and the fast CT scan can provide the anatomical tissue attenuation in seconds, which can be used as a base for PET data correction. Thus, the combination of a CT and a PET scanner during a scan gives a very powerful tool for improving clinical diagnostic accuracy when neither alone would be able to provide such a result. Yet, the two scanners have to be combined as one system; otherwise, the misregistration between CT and PET images would sometimes give misinformation. The CT/PET fusion scanner is such a hybrid scanner, which can
Fig. 20. Reconstructed image showing fusion of both CT image data and positron
emission tomography (PET) image data into a single image. Note that PET image
data shows physiological function while the CT image data shows the anatomical
features. Tools allow the user to dynamically change how much PET or CT data
is displayed in the fused image. Note the areas in the body such as the heart with
high activity signal from the acquired PET data.
obtain the CT images as well as the PET images during one examination. The PET images so obtained actually have better resolution than those obtained without the CT attenuation correction. The output of a PET/CT fusion scanner is two sets of images, CT and PET, with the same coordinate system for easy fusing of the images together. Figure 20 shows an example of a CT/PET fused image data set showing both anatomy and physiological function.
3.6.4.2 Components and Data Flow of an XCT Scanner
The major components and data flow of an XCT scanner include a gantry housing the X-ray tube, the detector system, and signal processing/conditioning circuits; a front-end preprocessor unit for
cone/fan beam projection data corrections and recombination into transverse sectional projection data; a high-speed computational processor; a hardware backprojector unit; and a video controller for displaying images. In XCT, the CT number (or pixel/voxel value, or Hounsfield number) represents the relative X-ray attenuation coefficient of the tissue in the pixel/voxel and is defined as follows:
CT number = K(µ − µ_w)/µ_w,
where µ is the attenuation coefficient of the material under consideration, µ_w is the attenuation coefficient of water, and K is a constant set by the manufacturer.
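The formula can be sketched numerically. Note two assumptions not in the text: K = 1000 is a common choice that yields the conventional Hounsfield scale, and the µ value for water below is only illustrative:

```python
def ct_number(mu, mu_water, k=1000.0):
    """CT number: K * (mu - mu_w) / mu_w."""
    return k * (mu - mu_water) / mu_water

MU_WATER = 0.19   # illustrative linear attenuation coefficient of water (1/cm)
print(ct_number(MU_WATER, MU_WATER))   # water -> 0.0
print(ct_number(0.0, MU_WATER))        # air (mu ~ 0) -> -1000.0
```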
3.6.5 XCT Image Data
3.6.5.1 Slice Thickness
Current multislice CT scanners can feature up to 32 detectors in an array. In a spiral scan, multiple slices of data can be acquired simultaneously for different detector sizes, and 0.75 mm, 1 mm, 2 mm, 3 mm, 4 mm, 5 mm, 6 mm, 7 mm, 8 mm, and 10 mm slice thicknesses can be reconstructed.
3.6.5.2 Image Data Size
A standard chest CT with coverage between 300 mm and 400 mm can yield image sets from 150–200 images all the way up to 600–800 images depending on the slice thickness, or data sizes from 75 MB up to 400 MB. Performance-wise, that same standard chest CT can be acquired in 0.75 mm slices in 10 seconds. A whole body CT can produce up to 2 500 images or 1 250 MB (1.25 GB) of data. Each image is 512 × 512 × 2 bytes in size.
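The study sizes above follow directly from the per-image size; a quick check (the helper name is ours):

```python
BYTES_PER_IMAGE = 512 * 512 * 2               # 0.5 MB per reconstructed image

def study_size_mb(n_images):
    """Total data size of a CT study in megabytes."""
    return n_images * BYTES_PER_IMAGE / 2**20

print(study_size_mb(150), study_size_mb(800), study_size_mb(2500))
# -> 75.0, 400.0, 1250.0 MB, matching the chest and whole-body figures
```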
3.6.5.3 Data Flow/Post-Processing
The fan/cone beam raw data are obtained by the acquisition host computer. Slice thickness reconstructions are performed on the raw data. Once the set of images is acquired in DICOM format, any postprocessing is performed on the DICOM data. This includes sagittal,
coronal, and off-axis slice reformats as well as 3D postprocessing. Sometimes the cone beam raw data are saved for future reconstruction of different slice thicknesses.
Some newer scanners feature a secondary computer, which shares the same database as the acquisition host computer. This secondary computer can perform the same postprocessing functions while the scanner is acquiring new patient data. It can also perform network send jobs to data storage or another DICOM destination (e.g. a highly specialized 3D processing workstation) and maintains a send queue, thus relieving the acquisition host computer of these functions and improving system throughput.
CHAPTER 4
Principles of Nuclear Medicine
Imaging Modalities
Lionel S Zuckier
Nuclear medicine utilizes radioactive molecules (radiopharmaceuticals) for the diagnosis and treatment of disease. The diagnostic information obtained from imaging the distribution of radiopharmaceuticals is fundamentally functional and thus differs from other imaging disciplines within radiology, which are primarily anatomic in nature. Imaging using radiopharmaceuticals can be subdivided into single- and dual-photon modalities. A wide selection of radiopharmaceuticals is available for single-photon imaging, designed to study numerous physiologic processes within the body. Static, dynamic, gated and tomographic modes of single-photon acquisition can be performed. Dual-photon imaging is the principle underlying positron emission tomography (PET) and is fundamentally tomographic. PET has expanded rapidly due to the clinical impact of the radiopharmaceutical 18F-fluorodeoxyglucose, a glucose analogue used for imaging of malignancy. The fusion of nuclear medicine tomographic images with anatomic CT is evolving into a dominant imaging technique. The current chapter will review the physical, biological and technical concepts underlying nuclear medicine.
4.1 INTRODUCTION
4.1.1 Physical Basis of Nuclear Medicine
Nuclear medicine is a branch of medicine which utilizes molecules containing radioactive atoms (radiopharmaceuticals) for the diagnosis and treatment of disease. Radioactive atoms have structurally unstable nuclei and seek to achieve greater stability by the release of energy and/or particles in a process termed radioactive decay.[1]
Atoms with unstable arrangements of protons and neutrons are termed radionuclides. This stochastic process is governed by first-order kinetics such that for N atoms, the rate of decay dN/dt is equal to −λN, where t is time and λ is the physical decay constant. It follows that N(t) = N_0 e^(−λt), where N_0 is the number of radioactive atoms present at time 0. The time for half of a sample of atoms to decay is a constant termed the physical half-life (T_1/2) and is characteristic for each radionuclide. The physical decay constant λ can be expressed as 0.693/T_1/2. It is customary to quantify the amount of a radioactive substance by its rate of decay, or activity. The S.I. unit Becquerel (Bq) is equal to 1 disintegration per second (dps), while the traditional unit Curie (Ci) is equal to 3.7 × 10^10 dps.
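The first-order decay law can be illustrated numerically; a small sketch (the function name is ours, not the chapter's):

```python
import math

def atoms_remaining(n0, t, half_life):
    """N(t) = N0 * exp(-lambda * t), with lambda = ln(2)/T_half (ln 2 ~ 0.693)."""
    lam = math.log(2) / half_life
    return n0 * math.exp(-lam * t)

# 99mTc has a physical half-life of six hours:
print(atoms_remaining(1.0, 6.0, 6.0))    # ~0.5 after one half-life
print(atoms_remaining(1.0, 12.0, 6.0))   # ~0.25 after two half-lives
```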
A second important feature in characterizing a radionuclide is the nature, frequency, and energy of its emitted radiations. Various types of radiation may be emitted from the atomic nucleus (Table 1). Alpha (α), beta (β−) and positron (β+) radiations are particulate and penetrate relatively short distances in tissue. Gamma (γ) radiation is nonparticulate and penetrating, making it useful for diagnostic imaging purposes, where it can be detected by instruments external to the body. Other types of penetrating radiation which are imaged in nuclear medicine include X-rays, which are emitted as a consequence of rearrangement of shell electrons, and 511 keV photons (annihilation radiations), which result from positron-electron annihilation.
Table 1. Ionizing Radiations[33]

Type | Rest Mass (Atomic Mass Units) | Charge | Origin
Alpha (α) | 4.002 | +2 | Nucleus
Beta negative (β−) or electron (e−) | 5.486 × 10^−4 | −1 | Nucleus
Gamma (γ) | 0 | 0 | Nucleus
Beta positive (β+) or positron (e+) | 5.486 × 10^−4 | +1 | Nucleus
X-ray | 0 | 0 | Shell electrons
Annihilation photons | 0 | 0 | Annihilation of e+ and e−
Penetrating radiation is subject to attenuation in soft tissues in an exponential manner. As photons travel a distance x through matter, the intensity of radiation decreases as e^(−µx), where µ is the linear attenuation coefficient, dependent on the mass density of the attenuator, its atomic number (Z), and the energy of the radiation. For photons typically used in nuclear medicine, the predominant interaction with soft tissue is Compton scattering, potentially degrading the image by redirecting the photons. Attenuation is also a fundamental foil to quantitative analysis, as the radiation measured by detectors external to the body is reduced to a variable degree depending on the nature and amount of intervening attenuator and is no longer directly proportional to the activity at the source being imaged. Methods are available to estimate and compensate for attenuation in nuclear medicine imaging and will be discussed where relevant.
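The exponential attenuation law can be sketched the same way; the µ value below is an illustrative assumption (roughly appropriate for 140 keV photons in soft tissue), not a figure from the text:

```python
import math

def transmitted_fraction(mu_per_cm, depth_cm):
    """Fraction of photons surviving attenuation: e^(-mu * x)."""
    return math.exp(-mu_per_cm * depth_cm)

# ~0.15 /cm is a rough soft-tissue value at 140 keV (assumption):
print(transmitted_fraction(0.15, 10.0))   # only ~22% of photons exit 10 cm of tissue
```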
The types of radiation enumerated above are all ionizing; when they pass through tissues they deposit energy, leading to potential chemical and biologic effects. In addition to man-made radiation from medical, industrial and military sources, inevitable exposure to ionizing radiation also results from natural radiation emanating from outer space and radionuclides in the earth's crust. Mammalian cells possess repair mechanisms which, at least in part, repair such damage. The study of the interaction of radiation and living organisms comprises the discipline of radiation biology.[2] Much has been learned regarding the potential toxicity of radiation, and this has informed the field of radiation safety.[3] Controversy persists to the present day as to whether there is a minimal amount of radiation necessary to cause cell damage. However, for safety and regulatory purposes we assume that exposure to even a small amount of radiation is associated with a finite risk (nonthreshold model). As a general rule, particulate radiation is significantly more injurious than nonparticulate, as the energy is deposited in a more concentrated track. Physicians must weigh the potential benefits of radiologic studies against their inherent risks. Radiopharmaceuticals are therefore administered in the smallest amounts necessary to effect a diagnosis or therapy.
4.1.2 Conceptual Basis of Nuclear Medicine
Nuclear medicine owes much of its approach to the Tracer Principle, developed by George de Hevesy, who was awarded the Nobel Prize in Chemistry in 1943 for his "work on the use of isotopes as tracer elements in researches on chemical processes."[4] In de Hevesy's method, a radioactive atom is introduced within a molecule under study, thereby allowing the newly formed radioactive moiety to serve as an easily identifiable version of the compound, a radioactive tracer.[5] The use of radioactive molecules to elucidate physiologic pathways has been adopted by nuclear medicine. Radionuclides with appropriate physical properties are substituted into molecules of biological interest, thereby creating radiopharmaceuticals which, combined with the selectivity of the body's physiologic processes, can be used to identify and target cells and organs of interest (Fig. 1). Techniques
Fig. 1. 18F-FDG PET scan demonstrates foci of malignant tumor within lymph nodes in the patient's right neck (arrow). Malignant tumor, in contrast to most other tissues, tends to preferentially metabolize glucose; FDG can therefore be used to identify sites of malignancy, as in the present case. The distribution of FDG also reflects normal tissues of the body that avidly utilize glucose, including brain (Br), and to a lesser degree liver (L) and spleen (S). In the resting state, uptake of FDG by muscles (M) is minimal. Uptake by the heart (H), tonsils (T) and bowel (arrowheads) noted in the current study tends to be variable in intensity. Unlike glucose, a large proportion of FDG is excreted by the kidneys (K) into the urinary bladder (Bl).
within nuclear medicine are unique amongst the radiologic modalities in that they primarily yield information regarding the function of tissues, rather than anatomic or structural detail.
Optimally, one of the atoms within a biologically relevant molecule is substituted with a suitable radioactive isotope of the same element. The difference in atomic mass of isotopes is due to a variation in the number of neutrons while the number of protons is unchanged, the latter guaranteeing virtually identical chemical behavior of the moieties. For example, 123I or 131I can be substituted for stable (nonradioactive) 127I within sodium iodide (NaI), which is still taken up by the thyroid gland in a manner identical to the nonradioactive substrate. More commonly, because of limitations in available radionuclides and their imaging properties, an analog of the molecule of interest is created which shares critical biochemical features, although its chemical structure differs and its biological fate is not identical to that of the original compound. 18F-Fluorodeoxyglucose (FDG) represents a radiopharmaceutical which shares some, but not all, features of glucose, yet is of immense clinical utility (Fig. 2).
Fig. 2. Structures of glucose and 18F-fluorodeoxyglucose (FDG) are illustrated on the left side of the diagram; in the latter, a hydroxyl group has been replaced with radioactive 18F-fluorine to form FDG. Although glucose and FDG are taken up similarly into cells by the glut1 and glut3 glucose transporters, and both are phosphorylated by the enzyme hexokinase, FDG-6-phosphate (FDG-6P), in contrast to glucose-6-phosphate (glucose-6P), cannot proceed to glycolysis and in effect becomes trapped within the cells.
The methods of tracking the distribution of radiopharmaceuticals in clinical nuclear medicine are varied. Infrequently, radiopharmaceuticals are used in nonimaging quantitative assays where samples of blood or urine are measured in sensitive well-type scintillation detectors as a means of deriving information regarding physiologic function and metabolic clearance. Examples include the measurement of the absorption and subsequent excretion of 57Co-labeled vitamin B12 (used in the evaluation of vitamin B12 deficiency), and the determination of the rate of renal excretion of 51Cr-EDTA (used in the measurement of renal function). Nonimaging probe-type radiation detectors are used for organ-based counting, such as in the measurement of thyroid uptake of radioactive iodine. While no image is generated by the thyroid probe, the data are fundamentally spatial in that the detector interrogates a defined volume of tissue. The use of nonimaging probes has also spread to the surgical suite, in order to identify lymph nodes which have been rendered radioactive by virtue of draining an anatomic region where radiopharmaceutical has been injected into the subcutaneous tissue. These collimated solid-state handheld scintillation detectors, used in sentinel lymph node biopsy, have a well-defined field-of-view and can be slowly translated over the surgical bed to reveal the location of radioactive lymph nodes or other targets. A third and most common method of assaying the distribution of radiopharmaceuticals in the current practice of clinical nuclear medicine is radionuclide imaging. This noninvasive technique has evolved as an integral component of medical imaging for over five decades, often serving as the proving ground for concepts that were subsequently introduced into other radiologic modalities.[6]
An important facet of nuclear medicine is the administration of therapeutic radiopharmaceuticals designed to destroy targeted cells.[7] Therapeutic radionuclides in clinical use today emit β− particles. For example, 131I-NaI is used for treatment of thyroid cancer, while 90Y- or 131I-labeled antibodies are administered to destroy lymphoma cells. While these therapies, per se, are not imaging examinations, therapeutic applications are frequently preceded by imaging using γ-emitting analogues in order to predict
efﬁcacy and toxicity, a discipline within medical physics termed
dosimetry.
4.1.3 Radiopharmaceuticals in Nuclear Medicine
Radiopharmaceuticals for diagnostic purposes are generally labeled with γ-emitting radionuclides; γ photons readily exit the body, are detectable by a variety of instruments discussed within this chapter, and pose the lowest radiation risk to the patient. The γ-emitting radionuclides used emit photons with one or more principal energies in the range of 69 keV–394 keV (Table 2). The most common radionuclide used today is 99mTc, which possesses the nearly optimal characteristics of 140 keV γ-radiation, a physical T_1/2 of six hours, and the absence of energetic particulate emissions. Determination of the spatial distribution of a radiopharmaceutical based on the detection of individual photons emitted from the patient's body is termed single-photon imaging. A contrasting imaging process prevails in positron emission tomography (PET), where radiopharmaceuticals incorporate radionuclides that emit positrons in the course of their radioactive decay (Table 3). In PET, the spatial distribution of the radiopharmaceutical is determined by detecting a pair of simultaneously emitted photons resulting from the
Table 2. Radionuclides Used in Single-Photon Radionuclide Imaging[34]

Radionuclide | Principal Photon Energies (keV) | Half-Life (Hours) | Common Radiopharmaceutical Forms | Clinical Application
Tc-99m | 140 | 6.02 | Numerous | Numerous
I-131 | 364 | 193 | NaI, MIBG | Thyroid, tumors (1)
Ga-67 | 93, 185, 300, 394 | 78.3 | Ga-citrate | Inflammation, infection
In-111 | 171, 245 | 67.9 | Labeled leukocytes, octreotide | Infection, tumors (2)
Tl-201 | 69–80 (Hg X-rays) | 73.1 | Thallous chloride | Cardiac perfusion
I-123 | 159 | 13.2 | NaI, MIBG | Thyroid, tumors (1)

Legend: MIBG — metaiodobenzylguanidine; (1) tumors of the APUD family; (2) somatostatin receptor-bearing tumors.
Lionel S Zuckier
Table 3. Radionuclides Commonly Used in Positron Emission Tomography [26]

Radionuclide  Method of               Half-Life  Max β+ Energy  Maximal Range in
              Production              (Minutes)  (MeV)          Water (mm)
C-11          Cyclotron               20.4       0.96           3.9
N-13          Cyclotron               9.96       1.2            5.1
O-15          Cyclotron               2.05       1.7            8.0
F-18          Cyclotron               110        0.64           2.3
Rb-82         Strontium-82 generator  1.3        3.4            18
annihilation of a positron and electron, in a process termed dual-photon or coincidence imaging. In the discussion that follows, instruments for both single-photon and dual-photon imaging will be reviewed. Discussion of nonimaging detectors will serve as an introduction to the principles of equipment used in nuclear medicine imaging.
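The physical half-lives listed in Table 2 determine how quickly a radiopharmaceutical's activity declines after calibration. As a minimal sketch of the standard exponential decay law (the 740 MBq dose below is a hypothetical value, not one taken from the text):

```python
def activity(a0_mbq: float, t_hours: float, half_life_hours: float) -> float:
    """Remaining activity after time t, from the decay law A = A0 * 2^(-t / T1/2)."""
    return a0_mbq * 2.0 ** (-t_hours / half_life_hours)

# 99mTc (T1/2 = 6.02 h): a 740 MBq dose halves after one half-life.
print(activity(740.0, 6.02, 6.02))   # -> 370.0
print(activity(740.0, 24.0, 6.02))   # roughly 1/16 remains after ~4 half-lives
```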
4.2 NUCLEAR MEDICINE EQUIPMENT [8]

4.2.1 Nonimaging

Basic to understanding the techniques and equipment used for imaging in nuclear medicine are the principles underlying the scintillation detector.[9] When used for in vivo assay of a radiopharmaceutical within an organ, such as the amount of radioactive iodine uptake within the thyroid gland, the scintillation detector is colloquially termed a scintillation probe. Components of the scintillation probe include a collimator, scintillation crystal, photomultiplier tube (PMT), and electronics (Fig. 3). The collimator restricts the field of view of the crystal to a finite region opposite the collimator opening (aperture). The scintillation crystal effectively shifts the wavelength of photon energy from γ rays to visible light, in quanta proportional to the energy of the incident photon. In clinical systems, crystals of sodium iodide, purposely contaminated or "doped" with thallium ions (NaI(Tl)), are commonly used. The crystals are optically
Fig. 3. Scintillation crystal. In the illustration, a 159 keV photon, emitted from the decay of 123I within the thyroid gland, enters the aperture of the collimator and is absorbed within the NaI(Tl) crystal, resulting in conversion to visible light (a scintillation). The light is incident on the photocathode within the photomultiplier tube (PMT), which dislodges electrons that are subsequently amplified manyfold by a series of dynodes held at progressively greater voltages. Pulses produced by the PMT for each photon absorbed are sorted by the pulse-height analyzer (PHA), resulting in an energy spectrum (actual 123I spectrum shown). Counts within a defined range of energies (i.e. the photopeak energy window) on the PHA are integrated over time by the scaler/ratemeter. In a related design, the scintillation crystal is fabricated with a well into which samples are placed for analysis and counting with high geometric efficiency (lower right).
coupled to PMTs, which convert the scintillations of light into current which is amplified to detectable levels. PMTs consist of a photocathode, designed to emit photoelectrons upon being struck by incident light; multiple dynodes held at progressively increasing voltages, producing an amplified cascade of electrons; and an anode which collects the current. Voltage across PMTs is in the 1000 volt range. Each γ photon originally absorbed in the crystal results in a discrete pulse of current exiting the PMT; the amplitude of this pulse is proportional to the incident photon energy. A device similar to the scintillation probe is used to characterize and count radioactive
samples which are placed within a specialized scintillation crystal with an indentation or well which serves to increase the efficiency of counting (Fig. 3).

Electronics in counting systems typically include a pulse-height analyzer (PHA), which can discriminate pulses of differing amplitudes originating from photons of differing energies. Scattered photons lose a portion of their initial energy and can thereby be differentiated from unscattered photons and excluded from counting if so desired. Photons emitted from multiple radionuclides can also be discriminated using the PHA. As a general rule, counts are integrated over a fixed period of time and displayed on a scaler/ratemeter. Modern devices can estimate and compensate for the fraction of counts lost due to dead time (pulse resolving time).
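The dead-time compensation mentioned above can be sketched with one common counting model — the non-paralyzable detector. The choice of model (and the 5 µs dead time) is an assumption for illustration; the chapter does not specify which model a given device uses:

```python
def deadtime_corrected_rate(measured_cps: float, tau_s: float) -> float:
    """Estimate the true count rate from the measured rate for a
    non-paralyzable detector with dead time tau: n = m / (1 - m * tau)."""
    lost_fraction = measured_cps * tau_s
    if lost_fraction >= 1.0:
        raise ValueError("measured rate saturates this dead-time model")
    return measured_cps / (1.0 - lost_fraction)

# With a 5 microsecond dead time, 50,000 measured cps implies ~25% loss:
print(round(deadtime_corrected_rate(50_000.0, 5e-6)))  # -> 66667
```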
4.2.2 The Rectilinear Scanner

When collimated, the scintillation probe effectively interrogates a well-defined volume, and therefore conveys spatial information. Early efforts to provide a distribution map of radiopharmaceuticals within the body were obtained by manually translating the scintillation crystal over a region of interest.[5] A mechanical system of rectilinear scanning, developed by Benedict Cassen in the early 1950s, incorporated a systematic method of measuring count rates in a raster pattern over a region of interest.[10] Images were recorded on paper or film as dots whose intensity was proportional to the count rate sampled at the corresponding locations over the patient. One of the advantages of the rectilinear scanner was that it allowed for a simple method of mapping palpable abnormalities on the patient to locations on the printed image. A disadvantage of rectilinear scanning was the relatively protracted time needed to scan an area of interest, since the data were collected serially.[11] This precluded imaging of dynamic processes, such as the flow of blood to an organ, leading to the development of alternate imaging methods.
4.2.3 The Anger Gamma Camera

4.2.3.1 Background

The need to simultaneously, rather than sequentially, sample counts from within the volume of interest was a factor that led to development of the gamma camera (γ-camera) by Hal Anger in 1952.[12,13] Basic principles introduced by Anger remain operative in nuclear medicine imaging systems used today, albeit with refinements in image acquisition, storage and retrieval made possible by widespread availability of microprocessors. Elements of the Anger scintillation camera design include the collimator, crystal, PMTs, and electronics (Fig. 4).
4.2.3.2 Collimation [14,15]

The purpose of collimation in γ-cameras is to map the distribution of activity onto the crystal surface in an orderly and interpretable manner. The majority of collimators used today are parallel-hole, consisting of multiple lead septa (or partitions) arranged perpendicularly to the crystal face so that they only permit passage of γ photons normal to the crystal. Collimators are designed to be specific to a particular range of radionuclide energies. Collimators are also designed based on preferences between the competing goals of count-rate sensitivity and spatial resolution, as determined by the length, thickness, and spacing (aperture width) of the septa.[16] Clinical systems sold today are typically equipped with a selection of low-energy collimators designated for "high-resolution," "high-sensitivity," and "all-purpose" applications. For nuclear imaging laboratories that utilize 67Ga, 111In or 131I, collimators designed for medium-energy (67Ga, 111In) and high-energy (131I) imaging are also required.

There is a relationship between the source-to-collimator distance (r), image spatial resolution, and count-rate sensitivity for parallel-hole collimators. As r increases, spatial resolution worsens while the count-rate sensitivity remains constant (Fig. 5). While this latter observation appears to contradict the so-called inverse square law, in fact, the intensity of radiation at each collimator aperture does
Fig. 4. Basic principle of the Anger scintillation camera. In the current illustration, an area of concentrated radiopharmaceutical within the patient's body emits γ photons in an isotropic manner. The fate of various emitted photons is illustrated. Photon 1 exits the body but does not intersect the γ-camera, while photon 2 is completely absorbed within the patient. Photon 3 exits the body and intersects the γ-camera, but the angle of incidence is such that the photon is absorbed by the lead septa of the parallel-hole collimator. Photon 4 travels in a direction such that it is able to pass through a collimator aperture, strike, and be absorbed by the NaI(Tl) crystal. The energy of the photon is transformed to visible light emitted isotropically within the crystal, which travels or is reflected towards the photomultiplier tubes (PMTs). These convert the light signal to an electronic pulse which is amplified and then analyzed by the positioning and summing circuits to determine the apparent position of absorption. If the total energy, or Z signal, of the amplified photon (as indicated on the illustrated 99mTc energy spectrum by the number 4) falls within a 20% window centered on the 140 keV energy peak, the event is accepted and x and y coordinates are stored within the image matrix. If a γ photon is scattered within the patient, as in photon 5, the lower energy of its pulse will be rejected by the PHA, and the erroneous projection of this photon will not be added to the image matrix. Less commonly, diverging, converging or pinhole collimators may be used in place of the parallel-hole collimator.
decrease as 1/r²; however, the number of elements which admit photons increases as r², based on geometric considerations. This in turn explains the loss of resolution with increasing distance, with radiation emitted from a focal source passing through the collimator to interact with ever-larger regions of the crystal.
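The geometric argument above can be made concrete. The resolution formula R = d(l + r)/l is a standard textbook approximation (d, aperture width; l, septal length), and the dimensions chosen below are purely illustrative:

```python
def collimator_resolution_mm(aperture_d_mm: float, septal_len_mm: float, r_mm: float) -> float:
    """Geometric resolution of a parallel-hole collimator (textbook
    approximation): R = d * (l + r) / l, worsening linearly with distance r."""
    return aperture_d_mm * (septal_len_mm + r_mm) / septal_len_mm

def relative_sensitivity(r_mm: float) -> float:
    """Per-aperture flux falls as 1/r^2 while the number of apertures that can
    admit photons from a focal source grows as r^2, so the product is constant."""
    per_aperture_flux = 1.0 / r_mm**2
    apertures_exposed = r_mm**2
    return per_aperture_flux * apertures_exposed

# Resolution degrades with distance; sensitivity does not.
print(collimator_resolution_mm(1.5, 25.0, 0.0))    # -> 1.5 (mm, at the collimator face)
print(collimator_resolution_mm(1.5, 25.0, 100.0))  # -> 7.5 (mm, at 10 cm)
print(relative_sensitivity(50.0), relative_sensitivity(200.0))  # both ≈ 1.0
```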
Fig. 5. Four one-minute images taken anteriorly over the chest of a patient with the collimator placed 2, 4, 8 and 16 inches from the subject. While the count rate remains relatively constant over these distances (total counts noted in the lower right corner of each image), degradation of spatial resolution is readily apparent.
An additional collimator design is used to image small objects with superior resolution. The pinhole collimator (Fig. 4) consists of a single small aperture (3 mm–5 mm in diameter) within a lead housing which is offset from the crystal, thereby restricting incident photons to those that pass through the aperture. This creates an inverted and potentially magnified image on the crystal face in a manner analogous to that used in pinhole photography. The pinhole collimator has the capacity to increase resolution (through its magnifying effect), especially at close distances, but does so at the expense of field of view and parallax distortion. As source-to-collimator distance increases with this collimator, the field of view of the camera increases while magnification and resolution decrease and count-rate efficiency markedly decreases.
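The pinhole trade-off follows the familiar camera-obscura magnification M = f/z (f, pinhole-to-crystal distance; z, source-to-pinhole distance), a relationship borrowed from pinhole photography; the dimensions below are illustrative assumptions:

```python
def pinhole_magnification(focal_mm: float, source_dist_mm: float) -> float:
    """Pinhole image magnification M = f / z: the image is magnified (M > 1)
    only when the source is closer to the pinhole than the crystal is (z < f)."""
    return focal_mm / source_dist_mm

# A small organ 50 mm from the pinhole, crystal 200 mm behind it: 4x magnification.
print(pinhole_magnification(200.0, 50.0))   # -> 4.0
print(pinhole_magnification(200.0, 400.0))  # -> 0.5 (minified at larger distance)
```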
Additional collimator designs include diverging and converging collimators, the former allowing imaging of an area larger than the collimator face and the latter allowing the enlargement of small
regions of interest (Fig. 4). These are infrequently used today but may still find application in portable cameras with small field-of-view crystals.

4.2.3.3 Crystal

As in the nonimaging probe, γ-camera crystals are generally composed of NaI(Tl). Features that make this crystal desirable include high mass density and atomic number (Z), thereby effectively stopping γ photons, and high efficiency of light output. Most current cameras incorporate large (50 cm × 60 cm) rectangular detectors. While expensive, the larger field of view results in increased efficiency.[17] In early designs, crystals were often 0.5 inches thick, which was well-suited for high-energy γ photons. In more recent implementations of the γ-camera, crystals only 3/8-inch or 1/4-inch thick are used, which is more than adequate for stopping the predominantly low-energy photons in common use today and which also results in superior intrinsic spatial resolution. In the Anger camera design, the NaI(Tl) crystal is optically coupled to an array of PMTs which is packed against the undersurface of the crystal. Light pipes may be used to redirect light from the crystal into the PMTs.
4.2.3.4 Positioning Circuitry and Electronics

Early positioning circuitry in the Anger γ-camera was analog in nature. γ photons incident on the NaI crystal resulted in production of light which propagated throughout the crystal and was converted to an amplified electrical pulse by the PMTs. Output of the PMTs was summed to produce an aggregate X and Y signal which reflected the location of the scintillation event in the crystal, and which was used to deflect the beam of a cathode ray tube (CRT) in order to produce a single spot on the image. The sum of the PMT signals (Z signal) was proportional to the γ photon energy and was used to exclude lower-energy scattered photons. In order to accurately superimpose the distribution of multiple γ photons of different energies emanating from one or more radionuclides (such as the 171 and 245 keV photons
of 111In and the 93, 185, 300 and 394 keV photons of 67Ga), the Z pulse is also used to normalize the X and Y signals so that the images described by each photopeak are superimposable and image size is energy-invariant (Z-pulse normalization).
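The Anger positioning logic just described — position as a signal-weighted centroid of PMT locations, with the Z (energy) signal as normalizer — can be sketched numerically. The four-PMT square geometry below is a toy example, not an actual camera layout:

```python
def anger_position(pmt_signals, pmt_x, pmt_y):
    """Anger logic sketch: X and Y are signal-weighted centroids of the PMT
    positions; Z (the energy signal) is the sum of all PMT outputs. Dividing
    by Z normalizes the position so that image size is energy-invariant."""
    z = sum(pmt_signals)
    x = sum(s * px for s, px in zip(pmt_signals, pmt_x)) / z
    y = sum(s * py for s, py in zip(pmt_signals, pmt_y)) / z
    return x, y, z

# Four PMTs at the corners of a square; a flash nearer the right-hand tubes:
signals = [1.0, 3.0, 1.0, 3.0]
xs = [-1.0, 1.0, -1.0, 1.0]
ys = [1.0, 1.0, -1.0, -1.0]
print(anger_position(signals, xs, ys))  # -> (0.5, 0.0, 8.0)
```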
As microcomputers became faster, less expensive, and more widely available, successive versions of γ-cameras increasingly incorporated microprocessors.[18] Initially, the x and y signals of the γ-camera were first processed by analog means and subsequently converted to digital signals by analog-to-digital converters (ADCs). Computers were then used for image storage, viewing, image correction[19] and various quantitative analyses. γ-cameras eventually became fully "digital" in that the output of each PMT was digitized. Most current digital γ-cameras have the ability to adjust the gain of each PMT individually, leading to improved overall camera performance.[20] Individual events, as detected by the PMTs, are then corrected for local differences in pulse-height spectra and for positioning. These refinements have led to improvement in spatial resolution and image uniformity.
4.2.3.5 Modes of Acquisition

Prior to image acquisition, the operator must specify acquisition parameters such as the size of the image matrix, number of brightness levels, photopeak, and window width. Typically, for a 99mTc acquisition, a 128 × 128 matrix is used with 2^8 or 2^16 levels of brightness, corresponding to a maximum of 256 or 65,536 counts per pixel, respectively. The acquisition window refers to the range of photon energies which will be accepted by the PHA. The peak and window width selected for 99mTc are 140 keV and 20%, respectively. As mentioned earlier, scattered photons have decreased energy, and the energy window excludes most, though not all, of these lower-energy photons. Modern cameras allow concurrent acquisition of photons in multiple energy windows, whether emanating from a single radionuclide with multiple γ emissions (such as 67Ga), or multiple radionuclides with one or more energy peaks each (Fig. 6).
Fig. 6. Dual-isotope acquisition. 24 hours prior to imaging, 0.5 mCi of 111In-labeled autologous white blood cells (111In-WBCs) were injected intravenously into the patient to localize sites of infection. 30 minutes prior to imaging, 10 mCi of 99mTc sulfur colloid were injected intravenously to localize marrow. Co-imaging of the 99mTc window (140 keV ± 20%) and dual 111In windows (171 keV ± 15% and 245 keV ± 15%) was performed, thereby producing simultaneous images of the marrow (left panel) and white blood cell distribution (right panel). In spite of differing energy, Z-pulse normalization has resulted in superimposable images. Marrow, liver and spleen are visible on both marrow and WBC studies. The 99mTc study is used to estimate WBC activity which is due to normal marrow distribution and of no pathologic consequence. To signify infection, 111In-WBC activity must be in areas other than the visualized marrow distribution.
In the latter case, photons derived from each radionuclide can subsequently be separated into separate images, each reflecting the distribution of a single radiopharmaceutical. Multi-isotope imaging is especially helpful when the two sets of images are used for comparison purposes. Depending on the relative activities of the radiopharmaceuticals and other considerations, the images of the isotope emitting the lower-energy photons may have to be corrected for downscatter of higher-energy photons into its energy window.
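The photopeak-window arithmetic and the downscatter correction can be sketched as follows. The window logic matches the 20% window on 140 keV described in the text; the downscatter scale factor k is a hypothetical, camera-specific calibration value, not a standard constant:

```python
def window_bounds(peak_kev: float, window_pct: float):
    """A photopeak window `window_pct` percent wide, centered on the peak:
    a 20% window on 140 keV accepts 126-154 keV."""
    half_width = peak_kev * window_pct / 200.0
    return peak_kev - half_width, peak_kev + half_width

def in_window(energy_kev: float, peak_kev: float, window_pct: float) -> bool:
    lo, hi = window_bounds(peak_kev, window_pct)
    return lo <= energy_kev <= hi

print(window_bounds(140.0, 20.0))     # -> (126.0, 154.0)
print(in_window(129.0, 140.0, 20.0))  # lightly scattered photon, just inside: True
print(in_window(110.0, 140.0, 20.0))  # heavily scattered photon: False

def downscatter_corrected(low_window_counts: float, high_window_counts: float,
                          k: float) -> float:
    """Subtract an assumed fraction k of the high-energy-window counts that
    scatter down into the low-energy window (k is camera/geometry specific)."""
    return max(low_window_counts - k * high_window_counts, 0.0)

print(downscatter_corrected(1000.0, 400.0, 0.5))  # -> 800.0
```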
Current clinical γ-cameras typically acquire several types of data (Fig. 7). To acquire a static (or spot) view, the camera is placed over a region of the body and an acquisition is performed for a predetermined number of counts or interval of time. The latter is appropriate when intensities of different parts of the body are being compared. Dynamic imaging refers to the acquisition of multiple sequential images at defined intervals. These may be of short duration, such as a series of two-second images to portray blood flow, or
Fig. 7. Images taken from a bone scan illustrate various modes of acquisition. Initial anterior images from the dynamic flow study (top panel) consist of sequential 2-second images taken over the feet subsequent to injection of 25 mCi of 99mTc-labeled MDP. Static (spot) images were taken two hours thereafter in anterior, left lateral, and plantar (sole of foot) projections for five minutes each. Sweep images were also taken at that time, in anterior and posterior projections, where the detector and patient table move with respect to each other in order to produce an extended field-of-view image. Increased flow and bone uptake within the left foot are highly suggestive of osteomyelitis (infected bone).
of longer duration, such as multiple one-minute images to demonstrate uptake and excretion of radiopharmaceuticals by the kidneys or liver. Many cameras have the ability to acquire whole-body views, where the detector and patient bed move with respect to each other during the acquisition, allowing an elongated sweep image to be obtained. Gated images are obtained during the course of a cyclical physiologic process, where a series of dynamic images is acquired based on repetitive sampling synchronized by a physiologic trigger (Fig. 8). This is commonly used to obtain a series of images during phases of the cardiac cycle, thereby portraying the change in left-ventricular volume during this period. In this method, the R wave of the electrocardiograph (ECG) is used to repetitively trigger acquisition into a series of 16 brief frames into which counts are accrued. When summed over the course of several hundred cardiac beats, the limitation of statistical "noise" imposed by the few counts collected over each subsecond segment of the physiologic cycle is overcome.
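The R-wave-triggered frame assignment described above can be sketched as follows; the 16-frame division matches the text, while the event times and R-R interval are illustrative:

```python
def gated_frame_index(event_time_s: float, last_r_wave_s: float,
                      rr_interval_s: float, n_frames: int = 16) -> int:
    """Assign a scintillation event to one of n_frames phase bins of the
    cardiac cycle, based on its delay after the most recent R wave."""
    phase = (event_time_s - last_r_wave_s) / rr_interval_s  # 0.0 .. 1.0
    return min(int(phase * n_frames), n_frames - 1)

# R-R interval of 0.8 s: an event 0.41 s after the R wave lands in frame 8 of 16.
print(gated_frame_index(10.41, 10.0, 0.8))  # -> 8
print(gated_frame_index(10.0, 10.0, 0.8))   # -> 0 (immediately after the R wave)
```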
In general, two modes of data collection are possible, frame mode and list mode. In the former and more common method of imaging, events detected by the PHA that fall within predetermined parameters are incremented into a specific image matrix. For example, in dynamic or gated imaging, frame length is prescribed a priori, and the counts are parsed into the appropriate matrices in real time. Frame mode is an efficient method of memory utilization, and images are retrievable immediately following completion of acquisition. A disadvantage of frame mode is that the acquisition parameters must be selected prior to the acquisition, and cannot be retrospectively changed. For example, if the patient's heart rate changes during the acquisition, or if we wish to adjust the energy window, there is no way to alter the acquisition parameters. Alternatively, the time, location, and even energy of each scintillation event over the entire duration of the acquisition, in addition to any relevant physiologic triggers, are stored when acquiring in list mode. At the conclusion of the acquisition, each event can be retrospectively sorted from the list into specific time bins or energy windows. As the data list remains intact, this exercise can be repeated as many times as desired. List mode
[Fig. 8 illustration: ECG tracing with the cardiac cycle divided into 16 frames (A–P), the corresponding image matrices, a left-ventricular region of interest (ROI) with adjacent background (BK) region, and a time-activity curve of % end-diastolic volume (EDV) over time.]
Fig. 8. Gated cardiac study performed after labeling of the patient's red blood cells with 25 mCi of 99mTc. The electrocardiograph tracing (top) illustrates the division of the cardiac cycle into 16 frames, marked A–P for illustrative purposes. Counts from the γ-camera during each portion of the cycle are assigned to a corresponding image matrix (labeled A–P). After counts from several hundred beats have been summed, count statistics are adequate to portray the change in volume of blood within the heart during a cardiac cycle. These can be shown as sequential images, as illustrated, or as a cine loop. Quantitative analysis can also be performed. As illustrated, regions of interest (ROIs) are placed over the left ventricle (LV) at both end diastole and end systole (black and white curves, respectively). Nonspecific activity overlying the heart is estimated from an adjacent background (BK) region and subtracted from the LV regions. A time-activity curve (solid line) depicts the change in LV volume during the course of an average heartbeat. The dotted line illustrates the first derivative. In the current illustration, the percent change in LV volume during contraction (ejection fraction) is 45%, slightly below normal. RV and SP indicate locations of the right ventricle and spleen, respectively.
necessitates increased storage requirements, but is especially useful in research applications where data may be analyzed in multiple ways.
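The retrospective re-sorting that distinguishes list mode from frame mode can be sketched as follows; the event tuples and frame lengths are hypothetical:

```python
# List-mode sketch: every event is stored as (time, x, y, energy), so the
# same list can be re-framed with any frame length after the acquisition.
def rebin_list_mode(events, frame_len_s: float):
    """Return per-frame counts; call again with a different frame_len_s to
    re-sort the identical data retrospectively."""
    if not events:
        return []
    n_frames = int(max(t for t, _, _, _ in events) / frame_len_s) + 1
    counts = [0] * n_frames
    for t, _x, _y, _energy in events:
        counts[int(t / frame_len_s)] += 1
    return counts

events = [(0.5, 10, 12, 140.0), (1.2, 11, 12, 138.0),
          (1.9, 10, 13, 141.0), (3.1, 12, 12, 139.0)]
print(rebin_list_mode(events, 1.0))  # -> [1, 2, 0, 1]
print(rebin_list_mode(events, 2.0))  # same events, coarser frames -> [3, 1]
```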
Tomographic imaging refers to the acquisition of 3D data. The initial development of tomography occurred in nuclear medicine;[6]
subsequently this technique was extended to transmission computed tomography (CT) and magnetic resonance imaging (MRI). Single- and dual-photon methods of tomography will be discussed separately below.
4.2.3.6 Analysis of Data

A strength of nuclear medicine is quantitative analysis. Regions of interest (ROIs) can be defined on an image, or series of images, and used to extract areal count rates. When applied to a dynamic series of images, the result is called a time-activity curve (TAC). For example, an ROI over the left ventricle, in conjunction with gating by the electrocardiograph (ECG), can be applied to obtain a TAC of ventricular volume during systole, from which we can derive a left-ventricular ejection fraction (Fig. 8). Routine applications in common use today which utilize ROIs include studies of renal function, gastric emptying, gallbladder ejection fraction, and cardiac ejection fraction.
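The background-corrected ejection-fraction arithmetic used with LV ROIs can be sketched as follows; all count values below are hypothetical:

```python
def ejection_fraction(ed_counts: float, es_counts: float,
                      bg_counts_per_pixel: float, roi_pixels: int) -> float:
    """LV ejection fraction from ROI counts, EF = (ED - ES) / ED, after
    subtracting nonspecific background estimated from an adjacent region."""
    background = bg_counts_per_pixel * roi_pixels
    ed = ed_counts - background   # end-diastolic counts, corrected
    es = es_counts - background   # end-systolic counts, corrected
    return (ed - es) / ed

# Hypothetical counts: 12,000 at end diastole, 7,800 at end systole,
# 10 background counts/pixel over a 300-pixel LV ROI.
print(round(ejection_fraction(12_000.0, 7_800.0, 10.0, 300), 3))  # -> 0.467
```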
Two factors complicate the analysis of ROIs in planar scintigraphy. Attenuation by overlying soft tissues may vary across a single image, among multiple images of the same patient, and certainly from patient to patient. Attempts to compare relative uptake by the left and right kidneys within a single image may therefore be confounded by differences in attenuation of the overlying soft tissues. To a certain degree, the approximate depth of organs, estimated from orthogonal views or other anatomic imaging modalities, can be used to compensate for attenuation based on the assumption that soft tissue is equivalent to water as an attenuating medium.
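Under the water-equivalence assumption just described, depth-based compensation amounts to multiplying measured counts by e^(μd). The value μ ≈ 0.15 cm⁻¹ used below is an approximate linear attenuation coefficient of water for 140 keV photons, stated here as an assumption rather than taken from the text:

```python
import math

def attenuation_correction_factor(depth_cm: float, mu_per_cm: float = 0.15) -> float:
    """Multiplicative correction for soft-tissue attenuation, assuming
    water-equivalent tissue of known depth: counts_true ~ counts * e^(mu * d)."""
    return math.exp(mu_per_cm * depth_cm)

# A kidney 6 cm deep: measured counts are boosted ~2.5x to compensate.
print(round(attenuation_correction_factor(6.0), 2))  # -> 2.46
```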
The second factor which confounds quantitative analysis is the activity residing within tissues above or below a structure of interest. With reference to the example cited above, attempts to compare left and right renal activity may be confounded by activity in overlying soft tissues, such as the liver. To compensate, a background ROI is typically defined adjacent to the area of interest and is then used to estimate and correct for these nonspecific counts. A similar method is used to correct the left ventricular ROI for blood pool activity
originating in the overlying chest wall and lungs in calculation of
left ventricular ejection fraction (Fig. 8).
4.2.4 Alternate Scintigraphic Camera Designs

By far, the Anger-style γ-camera has predominated in clinical radionuclide imaging. However, other camera designs have been developed and commercialized, especially for niche applications. One such camera, designed by Bender and Blau in the early 1960s, utilized 294 small NaI(Tl) crystals that were monitored by 35 PMTs assigned to 14 rows and 21 columns.[21] In contrast to the Anger scintillation camera, in which imaging is predicated upon localizing the position of scintillation events in a large crystal, this scintillation camera decodes position based on the photon's interaction with specific crystals, each of which represents a finite spatial location. A major advantage of this design, which found application in first-pass cardiac studies, is a higher count-rate capability than that of the Anger gamma camera. Recently, development of solid-state detectors has resulted in reemergence of multicrystal cameras, especially for portable or dedicated cardiac applications (Fig. 9).
4.3 TOMOGRAPHY

4.3.1 Single-Photon

In current methods of single-photon tomography, the γ-camera describes a circular or elliptical orbit around the patient as it acquires projection images at multiple angles. Data are then reconstructed using either filtered backprojection or iterative algorithms to estimate the original 3D distribution of activity[22] (Fig. 10). Tomography does not increase spatial resolution. It is used to increase contrast by eliminating overlying activity and is helpful in improving anatomic localization. Tomographic images are also amenable to fusion with CT or MRI data.
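A minimal sketch of one widely used iterative approach — maximum-likelihood expectation-maximization (MLEM), one member of the family of iterative algorithms mentioned above — on a deliberately tiny, hypothetical two-pixel geometry:

```python
import numpy as np

def mlem(system_matrix: np.ndarray, projections: np.ndarray,
         n_iter: int = 20) -> np.ndarray:
    """MLEM update x <- x * A^T(y / Ax) / A^T(1), where A maps pixel
    activity to projection-bin counts and y is the measured sinogram."""
    a = system_matrix
    x = np.ones(a.shape[1])          # flat initial activity estimate
    sensitivity = a.sum(axis=0)      # A^T 1, per-pixel sensitivity
    for _ in range(n_iter):
        forward = a @ x              # expected projections for current estimate
        ratio = np.divide(projections, forward,
                          out=np.zeros_like(forward), where=forward > 0)
        x *= (a.T @ ratio) / sensitivity
    return x

# Toy 2-pixel "patient" seen by 2 projection bins (hypothetical geometry):
A = np.array([[1.0, 0.0],
              [0.5, 1.0]])
true_x = np.array([4.0, 2.0])
y = A @ true_x                       # noise-free projections
print(np.round(mlem(A, y, n_iter=200), 2))  # converges toward [4. 2.]
```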
In theory, projection images obtained subtending 180° around the patient should be sufficient to reconstruct the three-dimensional
Fig. 9. Pediatric bone scan image using a portable solid-state multicrystal camera (Digirad, Poway, CA) to visualize detailed uptake of radiopharmaceutical in bones of the hand. The detector consists of 4096 crystals, 3 mm × 3 mm each, of thallium-doped cesium iodide [CsI(Tl)].
distribution of activity, and this is the standard acquisition method used in cardiac perfusion tomography. The heart is situated anterolaterally in the left chest, and a 180° acquisition, centered on the heart, is obtained extending from right anterior oblique to left posterior oblique projections. Cameras with multiple detector heads are useful for tomographic acquisitions because they reduce the acquisition time required. An efficient method of using two detectors for cardiac imaging is to arrange the detectors 90° to one another, in a so-called "L" configuration. In this way, the assembly need rotate only 90° to complete the entire 180° data acquisition.
Attenuation of photons in tissue and loss of resolution with dis
tance from the collimator act to degrade photons originating on the
far side of the body. As a result, in most noncardiac tomographic
applications, acquisitions are performed using projections over a
Fig. 10. SPECT imaging. In this example, red blood cells from the patient have been labeled with 99mTc and reinjected into the patient. An acquisition is performed consisting of 64 projections taken about the upper abdomen. Eight representative projection images are displayed in panel A. The 3D distribution of activity has been iteratively reconstructed from the projection data; selected axial, sagittal and coronal images are shown (panel B). Note the relatively greater intensity of the periphery of the liver as compared to its center, due to the effect of soft-tissue attenuation. Legend: A aorta; H heart; I inferior vena cava; K kidney; L liver; S spleen; V vertebral bodies.
full 360° rotation. In this case, dual-headed cameras are configured with the heads opposing each other at 180°. The assembly rotates 180° to complete the entire 360° data acquisition. For a three-headed camera, heads are spaced 120° apart and a complete acquisition can take place following a 120° rotation.
Attenuation interferes with quantitative analysis of images in SPECT. This problem is manifested in cardiac perfusion imaging, where variable soft-tissue attenuation leads to apparent regional defects in the myocardium which simulate lack of perfusion. In imaging of the brain, it has been possible to correct for attenuation
by assuming the cranial contour conforms to an oval and consists of water density; however, this method cannot be generalized to more complicated and heterogeneous parts of the body. Attenuation correction can also be based on actual attenuation measurements using radioactive sources which are rotated around the patient, thereby obtaining a crude transmission-based attenuation map. The measured attenuation map is then typically segmented, to minimize stochastic noise, and used to derive an energy-appropriate correction for the emission scan.[23] Most recently, SPECT cameras have been manufactured with integrated in-line CT scanners.[24] Using lookup tables, it is then possible to translate the attenuation of the low-energy X-ray beam to the energy-appropriate attenuation correction. At the present time, a minority of clinical SPECT is performed with attenuation correction; however, its use appears to be increasing.
4.3.2 Dual-Photon

Radionuclides for dual-photon imaging emit positrons.[25] As discussed earlier, these particles do not exit the body and therefore cannot be directly imaged. Each positron travels only several millimeters or less within the patient (depending on its kinetic energy and location), comes to rest, and combines with an electron. In this process, the electron and positron annihilate each other and their rest masses are converted into energy, resulting in creation of two 511 keV photons which travel in nearly opposite directions (Fig. 11). Imaging systems in positron emission tomography (PET) are designed to identify these paired photons and determine their line of origin, the line-of-response.[26] In some early clinical systems, modified dual-headed γ-cameras with rectangular detectors were used to detect coincident photons in PET imaging. However, because these detectors only subtended a small fraction of the angles surrounding the patient, count-rate sensitivity was poor and use of this method has declined. Currently, clinical systems utilize multiple dedicated rings of detectors which surround the patient in a 360° geometry. A typical clinical PET system consists of 3–4 rings of detectors, each subdivided into 6–8 planes and with 1000 elements per transaxial plane. The large number of detector
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch04 FA
Principles of Nuclear Medicine Imaging Modalities
87
Fig. 11. Principle of PET imaging. A patient with lymphoma is noted to have intense ¹⁸F-FDG concentration in an axillary lymph node. ¹⁸F-FDG localizes in the tumor, presumably due to the increased rate of glycolysis (glucose metabolism) in malignant tissue. With radioactive decay of the ¹⁸F, a positron (β⁺) is emitted, travels on the order of 1 mm, comes to rest, and combines with a ubiquitous electron (β⁻) to produce a pair of nearly-opposed 511 keV photons (dotted lines). In the illustration, the pair of annihilation radiation photons nearly simultaneously intersect two elements within the ring of detectors (asterisks). The line that is defined by the two detectors is termed the line-of-response (dashed line). Millions of such coincidences are used to reconstruct the original distribution of ¹⁸F, thereby identifying the location of the tumor.
elements makes a PET scanner expensive and difficult to design. A method which has been used to simplify the scanner is to score a large block detector crystal in such a way as to have it function as up to 64 individual detector elements, backed by only 4 PMTs. In some designs, each PMT is shared among four adjacent detectors, further reducing their number and the overall cost. The exact location of each photon interaction within the block detector is encoded by the pattern of light intensities recorded at the PMTs, which is unique for each crystal element.
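The position-encoding idea just described can be sketched in code. The 2 × 2 PMT layout, the ratio weighting, and the 8 × 8 crystal grid below are illustrative assumptions (real block detectors use manufacturer-specific calibration maps), but the underlying Anger-logic calculation is as shown:

```python
# Sketch of Anger-logic position decoding in a PET block detector.
# Four PMTs (A, B, C, D) at the corners of the block view the scintillation
# light; the ratios of their signals encode which crystal element fired.
# Layout, weighting, and grid size are illustrative, not a vendor design.

def anger_position(a, b, c, d):
    """Return a normalized (x, y) position from four corner PMT signals.

    Assumed PMT layout (viewed from the crystal face):
        A  B
        C  D
    """
    total = a + b + c + d
    x = ((b + d) - (a + c)) / total   # right column minus left column
    y = ((a + b) - (c + d)) / total   # top row minus bottom row
    return x, y

def crystal_index(x, y, n=8):
    """Map a normalized position in [-1, 1] to one of n x n crystal elements."""
    col = min(int((x + 1.0) / 2.0 * n), n - 1)
    row = min(int((y + 1.0) / 2.0 * n), n - 1)
    return row * n + col

# A flash seen mostly by PMT D lands in the lower-right region of the block.
x, y = anger_position(10.0, 20.0, 20.0, 50.0)
element = crystal_index(x, y)
```

In practice the mapping from light ratios to crystal elements is measured with a flood source and stored as a lookup table rather than computed geometrically.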
Optimal scintillators for PET are different from those for single-photon imaging. The 511 keV photons are difficult to stop, and lack
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch04 FA
88
Lionel S Zuckier
Table 4. Scintillators used in PET (after Zanzonico²⁶)

Scintillator                        Mass Density ρ   Effective Atomic   Light Output     Scintillation Decay
                                    (g/cm³)          Number, Z          (photons/MeV)    Time (nsec)
NaI(Tl)                             3.7              51                 41 000           230
Bismuth germanate, BGO              7.1              75                  9 000           300
Lutetium oxyorthosilicate, LSO      7.4              66                 30 000            40
Gadolinium oxyorthosilicate, GSO    6.7              59                  8 000            60
of absorptive collimation in PET leads to high count rates and potentially high dead times. Crystals with high densities, high atomic numbers, and rapid light output are favored (Table 4). High light output is also desirable in that it reduces statistical uncertainty and therefore improves scatter rejection.
In contrast to single-photon imaging, where collimation is required to relate a photon to its line of origin, no absorptive collimation is required in PET. This allows for far greater count-rate sensitivity than in single-photon systems. Furthermore, no degradation of resolution occurs with increasing distance from the detectors. Detector elements are usually operated in coincidence with only a subset of all the other remaining detector elements, eliminating the consideration of adjacent elements where the lines of response would lie outside of the patient's body. When two photons are detected by the scanner within a finite time interval τ, typically only 6 ns–12 ns, the detectors involved define a line-of-response for subsequent reconstruction of the in vivo distribution of radionuclide. The energies of the photons are usually windowed to reject scattered events. Millions of lines-of-response are then used to calculate the distribution of radiopharmaceutical within the patient.
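The timing and energy tests described above can be sketched as a simple acceptance function. The timing window is a typical value from the text; the energy window around 511 keV is an assumed illustrative setting, not a particular scanner's specification:

```python
# Sketch of the PET coincidence test: two detected single events define a
# line-of-response only if they arrive within the timing window tau and
# both fall inside the energy acceptance window. Values are illustrative.

TAU_NS = 10.0                        # coincidence timing window (text: ~6-12 ns)
ENERGY_WINDOW_KEV = (450.0, 570.0)   # assumed acceptance window around 511 keV

def in_coincidence(t1_ns, t2_ns, e1_kev, e2_kev):
    """Return True if two single events qualify as a coincidence."""
    lo, hi = ENERGY_WINDOW_KEV
    within_time = abs(t1_ns - t2_ns) <= TAU_NS
    within_energy = lo <= e1_kev <= hi and lo <= e2_kev <= hi
    return within_time and within_energy
```

A true pair arrives nearly simultaneously with both photons near 511 keV; a photon that scattered and lost too much energy, or an unrelated photon arriving outside the window, fails the test.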
The paired 511 keV annihilation photons may interact with
the PET camera in several different ways (Fig. 13). Depending
on the orientation of the 511 keV photons, some positron annihi
lations are missed completely. In a large number of others, only
one of the pair of annihilation photons interacts with a detector, resulting in an unpaired single. Of the pairs of photons detected by the camera within the timing window τ and which therefore define a line-of-response, some reflect accurate or true events, corresponding to absorption of non-scattered 511 keV photons that originate from a single positron-electron annihilation. Other pairs of photons detected occur following the scattering of one or both photons through shallow angles, thereby remaining within an acceptable energy window but defining an erroneous line-of-response (scattered coincidences). A third category of paired photons is designated as randoms, due to the mistaken pairing of two independent 511 keV photons which are incident on the detectors within the specified timing window τ despite originating from two separate positron-electron annihilations. The narrower the timing window τ, the smaller the number of random coincidences accepted. However, if τ is made too short, true coincidences are also lost because of the nontrivial time required for photon flight, scintillation within the crystal, and electronic processing. Analogously, a narrower energy window will decrease scattered photons but at a cost of decreased true coincidences due to limitations in energy resolution of the system.
In the past, a major limitation of PET systems has been the high count rate and the demands it places upon the electronics and processors. Additionally, while the true count rate increases in proportion to the activity present within the patient, the random coincidence rate increases as the square of the activity, and becomes critical at high count rates. In order to decrease the number of randoms and scattered coincidences and to reduce the huge computational demand, which increases as the square of the number of detectors, many systems introduce lead or tungsten septa between the rings of detectors, which prevent oblique lines-of-response. These systems are termed 2D in that only events within rings or between adjacent rings are acquired (Fig. 12). The transaxial images so derived are stacked to constitute a three-dimensional volume of distribution from which coronal, sagittal and maximum-intensity projection (MIP) images are derived.
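The different scaling of trues and randoms with activity can be illustrated numerically. The estimate R_randoms = 2τ·S₁·S₂, with S₁ and S₂ the singles rates on the two detectors, is a standard textbook approximation rather than a formula from this chapter, and the rate constants below are arbitrary:

```python
# Illustration of why randoms grow as the square of the activity while
# trues grow linearly. Uses the common estimate R_randoms = 2*tau*S1*S2,
# where S1, S2 are singles rates; all constants are arbitrary and for
# illustration only.

TAU_S = 10e-9   # assumed 10 ns coincidence window, in seconds

def trues_rate(activity_mbq, k_true=100.0):
    """True coincidences scale linearly with activity (k_true is arbitrary)."""
    return k_true * activity_mbq

def randoms_rate(activity_mbq, k_singles=5.0e4):
    """Singles scale with activity, so randoms scale with its square."""
    s = k_singles * activity_mbq      # singles rate on each detector (cps)
    return 2.0 * TAU_S * s * s

# Doubling the activity doubles the trues but quadruples the randoms.
trues_ratio = trues_rate(200.0) / trues_rate(100.0)
randoms_ratio = randoms_rate(200.0) / randoms_rate(100.0)
```

This quadratic growth is why randoms dominate at high administered activities, and why narrowing τ (or lowering the activity, as noted below for 3D acquisition) is an effective countermeasure.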
Fig. 12. 2D versus 3D PET. In 2D acquisition, lead or tungsten septa are extended between adjacent rings of scintillation detectors. Only lines-of-response within a single ring or possibly adjacent rings are permitted (solid line) while paired photons with greater obliquity are absorbed by the septa (broken lines). In 3D acquisition, the septa are retracted and each detector element may be in coincidence with opposite detector elements in any of the rings. This results in increased sensitivity but markedly increased random coincidences and potentially dead time. Sequential scans in a patient with previously treated tumor in the frontal lobe demonstrate the improved quality of the 3D scan relative to 2D scans, which is achievable when imaging small body parts such as the head, where scatter is minimal.
By removing the lead or tungsten septa between adjacent rings of detectors, it is possible to perform an acquisition where lines-of-response between all the detector rings are potentially available to detect coincident pairs of photons, an approach termed 3D (Fig. 12). 3D acquisition has been facilitated by faster computer processors and improvements in detectors which allow better temporal and energy resolution. Because of the deleterious effect of high count rate, the administered activities are decreased in 3D acquisitions. Imaging times may also be significantly reduced due to the greater count-rate sensitivity.
Fig. 13. Variety of photon interactions in PET imaging. Positrons emitted from the tumor in the patient's left axilla result in emission of pairs of annihilation photons. Representative interactions are illustrated. (1) A pair of annihilation photons is detected by the ring of detectors, resulting in a "true" coincidence which is recorded as the line-of-response "A" (dashed line). (2) Only one of the pair of photons is recorded by the ring of detectors, a "single." The second photon escapes by passing through the ring of detectors, or by passing out of plane or into a lead septum. (3a) and (4a) Two unrelated single photons are recorded within the finite coincidence timing window τ, resulting in the false line-of-response "B" (random). (5) One of the 511 keV photons is directly detected by a detector (5a) while the second undergoes Compton scattering within the patient (5b). Depending on the scattering angle, if this latter photon remains within the acceptable energy window, the result is a malpositioned coincidence and the erroneous line-of-response "C."
A number of sources of error impact upon the accuracy of dual-photon imaging. Depending on its kinetic energy, the positron may travel up to several millimeters prior to coming to rest, which displaces the line-of-response from the actual site of the annihilation event. Secondly, because positrons may actually have nonzero momenta immediately prior to annihilation, the emitted 511 keV photons may not be exactly collinear (at 180° to each other). This,
too, leads to errors in the designated line-of-response. Each detector element also has a finite size, which introduces further uncertainty in the line-of-response. As mentioned above, random and scattered coincidences can lead to erroneous lines-of-response as well. Modern PET scanners in clinical use have spatial resolution on the order of 3 mm–4 mm.
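A common rule of thumb from the PET literature, not given in this chapter, combines the error sources just listed in quadrature to estimate overall scanner resolution; the coefficient for photon non-collinearity and the example inputs below are representative, not measured values:

```python
import math

# Rule-of-thumb quadrature combination of the resolution-degrading effects
# listed above: finite detector size, positron range, and photon
# non-collinearity. This is a literature approximation, not this chapter's
# formula; all numbers are illustrative.

def pet_fwhm_mm(detector_mm, positron_range_mm, ring_diameter_mm):
    """Approximate intrinsic spatial resolution (FWHM, mm) of a PET ring."""
    geometric = detector_mm / 2.0                  # finite detector element size
    noncollinearity = 0.0022 * ring_diameter_mm    # ~0.25 deg deviation from 180 deg
    return math.sqrt(geometric ** 2 +
                     positron_range_mm ** 2 +
                     noncollinearity ** 2)

# 4 mm crystals, ~1 mm effective 18F positron range, 80 cm ring diameter:
resolution = pet_fwhm_mm(4.0, 1.0, 800.0)
```

With these assumed inputs the estimate lands near the 3 mm–4 mm figure quoted above; isotopes with more energetic positrons than ¹⁸F degrade the first term further.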
In PET imaging, attenuation correction is routinely performed for clinical interpretation and is required for quantitative analysis. In uniform and geometrically simple regions of the body such as the skull, attenuation correction may be based on the assumption that the body contour conforms to a water-equivalent regular geometric shape, as discussed with regard to SPECT scanners. Measurement-based attenuation correction in PET utilizes radioactive sources such as ⁶⁸Ge (energy 511 keV) and ¹³⁷Cs (energy 662 keV) which are rotated around the patient and used to yield a crude attenuation map for correcting the emission data. Often, the ⁶⁸Ge and ¹³⁷Cs attenuation maps are segmented into lung, bone, and soft tissue densities to overcome the degradative effects of the low-count ("noisy") transmission data. These are then used to derive a 511 keV-appropriate attenuation map with which to correct the emission scans. Most recently, PET scanners have been manufactured with integrated in-line CT scanners.²⁷ Using segmentation and lookup tables, it is also possible to translate the energy-specific attenuation of the X-ray beam to that of the appropriate 511 keV photon energy (Fig. 14).
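The CT-to-511 keV translation is often implemented as a piecewise-linear ("bilinear") mapping from Hounsfield units to linear attenuation coefficients. The sketch below uses representative slopes and a soft-tissue/bone breakpoint of the kind found in the literature; they are assumptions for illustration, not values from this chapter:

```python
# Sketch of a bilinear CT-to-PET attenuation conversion: Hounsfield units
# from the CT scan are mapped to linear attenuation coefficients at 511 keV.
# The slopes and the breakpoint at 0 HU are representative literature-style
# values, not taken from this chapter.

MU_WATER_511 = 0.096   # 1/cm, linear attenuation coefficient of water at 511 keV

def mu_511_from_hu(hu):
    """Approximate linear attenuation (1/cm) at 511 keV for a given HU value."""
    if hu <= 0.0:
        # Air (-1000 HU) up to water (0 HU): scale linearly from 0 to mu_water.
        return max(0.0, MU_WATER_511 * (hu + 1000.0) / 1000.0)
    # Above water, bone-like material attenuates proportionally less at
    # 511 keV than at CT energies, so a shallower slope is applied.
    return MU_WATER_511 + hu * 5.0e-5
```

Applying such a mapping voxel by voxel to the CT volume yields the energy-appropriate attenuation map used to correct the emission data.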
A widely used measure of radiopharmaceutical uptake in clinical PET imaging is the standardized uptake value (SUV), which is a dimensionless ratio of the radiotracer concentration in the structure of interest to the average concentration in the body (administered activity divided by patient mass). Quantitation is most useful when values are compared in a single subject before and after therapy. Accurate measurement of SUV is dependent on accurate attenuation correction, and its clinical utility also requires standardization of the interval between FDG administration and imaging, fasting of the patient before FDG administration, and other biologically relevant variables.²⁸ Most studies have shown that a semiquantitative
Fig. 14. Non-attenuation corrected (NAC) and attenuation corrected (AC) scans in the same patient as Fig. 12. Original data from a PET-CT scan consists of NAC and CT images. The CT scan is used to create an energy-appropriate attenuation correction map which is applied to create AC images. The CT scan is often fused with the AC images in order to improve anatomic localization of abnormalities. Note the distribution of activity in the NAC image, where central structures are relatively lower in activity than peripheral structures; this has been corrected in the AC PET images.
visual grading system is as effective as SUV-based diagnostic criteria in differentiating malignant from benign tissue.
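The SUV definition given earlier translates directly into code. The assumption of roughly 1 g/mL tissue density (so that kBq/mL and kBq/g are interchangeable) is standard practice, and the example numbers are illustrative:

```python
# The SUV as defined above: radiotracer concentration in the structure of
# interest divided by injected activity per unit body mass. Assumes ~1 g/mL
# tissue density; example values are illustrative.

def suv(tissue_kbq_per_ml, injected_mbq, weight_kg):
    """Standardized uptake value (dimensionless)."""
    injected_kbq = injected_mbq * 1000.0
    weight_g = weight_kg * 1000.0
    return tissue_kbq_per_ml / (injected_kbq / weight_g)

# A lesion measuring 18.5 kBq/mL after a 370 MBq injection in a 70 kg patient:
example_suv = suv(18.5, 370.0, 70.0)
```

In routine use the tissue concentration would also be decay-corrected to the injection time, which is one reason the standardized uptake interval mentioned above matters.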
4.3.3 Fusion Imaging in Nuclear Medicine
A development in nuclear medicine which continues to evolve is
the combination or fusion of scintigraphic images with anatomic
radiologic modalities, chieﬂy CT. To a certain degree, this has
been facilitated by development of standards for image storage
and retrieval (Digital Imaging and Communications in Medicine
[DICOM]) and the interface of multiple modalities to common data
and storage systems (Picture Archive and Communication Systems, or PACS).²⁹,³⁰ Nuclear medicine images contain unique functional information but are often limited in anatomic content. CT and anatomic MR imaging have superior spatial resolution, but generally lack functional information. By fusing nuclear medicine and anatomic images, it is possible to correlate changes in function with particular anatomic structures. Initially, this was attempted by using software to retrospectively register³¹ and then fuse (overlay) studies performed on separate scanners at different times (Fig. 15). Methods
Fig. 15. Retrospective SPECT fusion. A contrast-enhanced CT scan, performed several days prior to the blood pool study illustrated in Fig. 10, has been registered and fused to the nuclear medicine study using a mutual information method (MIMvista version 4.0, MIMvista Corp, Cleveland, OH). Note the excellent registration of the inferior vena cava (arrows) and spleen (arrowhead) on blood pool and CT images. The study was done to evaluate the blood volume corresponding to an area of contrast enhancement in the periphery of the left lobe of the liver on CT (dashed circle). No corresponding increase in blood volume is noted and the lesion therefore does not represent a hemangioma.
of retrospective registration were frequently hampered by inevitable differences in patient positioning, bowel preparation and other variable factors. To minimize this problem, a hardware approach was developed where SPECT and PET scanners were combined in-line with CT scanners, thereby permitting both studies to be sequentially acquired on the same gantry with little or no time and patient motion between them (Fig. 16). An additional benefit of combined devices has been the ability to correct for attenuation as described above. Currently, the vast majority of PET scanners sold for clinical use incorporate CT; while less common on SPECT scanners, this feature is increasing in frequency. Indeed, the success of CT fusion in clinical
Fig. 16. Hardware fusion of SPECT and multislice CT coronary angiography (CTA) in a 50 year old man with chest pain, performed on an in-line dedicated scanner (Research SPECT/16CT Infinia LS, General Electric Healthcare Technologies). (A) Selected tomographic perfusion images of the heart are displayed in short axis and vertical long axis. An area of diminished perfusion at stress (odd rows, arrows) exhibits improved perfusion at rest (even rows, arrowheads). (B) Fused SPECT/CTA data combining an epicardial display of myocardial perfusion at stress (upper image) and rest (lower image) with the coronary tree derived from the CTA study illustrates the relationship of the ischemic territory and the right coronary artery (RCA). The course of the left anterior descending (LAD) artery is also demonstrated. CTA images illustrate luminal narrowing in the RCA (not shown). (Images courtesy of Dr R Bar-Shalom, Rambam Health Care Campus, Haifa, Israel.)
single- and dual-photon tomography has led to the development of parallel techniques for small animal imaging and research.³²
4.4 CONCLUDING REMARKS

In nuclear medicine, molecules containing radioactive atoms, termed radiopharmaceuticals, are used to diagnose and treat disease; the interaction of the radiopharmaceuticals with physiologic processes within the body reveals unique functional information. Methods of diagnosis in nuclear medicine include imaging of single photons originating from the radiopharmaceuticals and, in the case of PET, imaging of dual photons that derive from annihilation of positrons. Current developments in nuclear medicine include the fusion of SPECT and PET with anatomic modalities such as CT.
References
1. Groch MW, Radioactive decay, Radiographics 18(5): 1245–1246, 1998.
2. Hall EJ, Giaccia AJ, Radiobiology for the Radiologist, 6th edn., Lippincott
Williams & Wilkins, Philadelphia, PA, 2006.
3. Brodsky A, Kathren RL, Historical development of radiation safety
practices in radiology, Radiographics 9(6): 1267–1275, 1989.
4. Schuck HSR, Osterling A et al., Nobel. The Man and his Prizes, 2nd edn.,
Elsevier Publishing Company, Amsterdam, 1962.
5. Graham LS et al., Nuclear medicine from Becquerel to the present,
Radiographics 9(6): 1189–1202, 1989.
6. Kuhl DE, Edwards RQ, Reorganizing data from transverse section
scans of the brain using digital processing, Radiology 91(5): 975–983,
1968.
7. Kassis AI, Adelstein SJ, Radiobiologic principles in radionuclide therapy, J Nucl Med 46(Suppl 1): 4S–12S, 2005.
8. Budinger TF, Rollo FD, Physics and instrumentation, Prog Cardiovasc
Dis 20(1): 19–53, 1977.
9. Ranger NT, Radiation detectors in nuclear medicine, Radiographics
19(2): 481–502, 1999.
10. Blahd WH, Ben Cassen and the development of the rectilinear scanner,
Semin Nucl Med 26(3): 165–170, 1996.
11. Gottschalk A, Anger HO, Use of the scintillation camera to reduce radioisotope scanning time, JAMA 192: 448–452, 1965.
12. Anger HO, Scintillation Camera, The Review of Scientiﬁc Instruments
29(1): 27–33, 1958.
13. Collica CJ, Robinson T, Hayt DB, Comparative study of the gamma
camera and rectilinear scanner, Am J Roentgenol Radium Ther Nucl Med
100(4): 761–779, 1967.
14. Formiconi AR, Collimators, Q J Nucl Med 46(1): 8–15, 2002.
15. Swann S et al., Optimized collimators for scintillation cameras, J Nucl
Med 17(1): 50–53, 1976.
16. Anger HO, Scintillation camera with multichannel collimators, J Nucl
Med 5: 515–531, 1964.
17. Murphy PH, Burdine JA, Large-field-of-view (LFOV) scintillation cameras, Semin Nucl Med 7(4): 305–313, 1977.
18. Todd-Pokropek A, Advances in computers and image processing with applications in nuclear medicine, Q J Nucl Med 46(1): 62–69, 2002.
19. Muehllehner G, Colsher JG, Stoub EW, Correction for field nonuniformity in scintillation cameras through removal of spatial distortion, J Nucl Med 21(8): 771–776, 1980.
20. Genna S, Pang SC, Smith A, Digital scintigraphy: Principles, design,
and performance, J Nucl Med 22(4): 365–371, 1981.
21. Bender MA, Blau M, The autoﬂuoroscope, Nucleonics 21(10): 52–56,
1963.
22. Larsson SA, Gamma camera emission tomography: Development and properties of a multi-sectional emission computed tomography system, Acta Radiol Suppl 363: 1–75, 1980.
23. Xu EZ et al., A segmented attenuation correction for PET, J Nucl Med
32(1): 161–165, 1991.
24. Bocher M et al., Gamma camera-mounted anatomical X-ray tomography: Technology, system characteristics and first images, Eur J Nucl Med 27(6): 619–627, 2000.
25. Votaw JR, The AAPM/RSNA physics tutorial for residents: Physics of PET, Radiographics 15(5): 1179–1190, 1995.
26. Zanzonico P, Positron emission tomography: A review of basic prin
ciples, scanner design and performance, and current systems, Semin
Nucl Med 34(2): 87–111, 2004.
27. Beyer T et al., A combined PET/CT scanner for clinical oncology, J Nucl Med 41(8): 1369–1379, 2000.
28. Keyes JW Jr, SUV: Standard uptake or silly useless value? J Nucl Med 36(10): 1836–1839, 1995.
29. Alyafei S et al., Image fusion system using PACS for MRI, CT, and PET
images, Clin Positron Imaging 2(3): 137–143, 1999.
30. Graham RN, Perriss RW, Scarsbrook AF, DICOM demystified: A review of digital file formats and their use in radiological practice, Clin Radiol 60(11): 1133–1140, 2005.
31. Zitova B, Flusser J, Image registration methods: A survey, Image and
Vision Computing 21(11): 977–1000, 2003.
32. Lewis JS et al., Small animal imaging. Current technology and perspec
tives for oncological imaging, Eur J Cancer 38(16): 2173–2188, 2002.
33. National Institute of Standards and Technology [cited February 11, 2007]; Available from: http://physics.nist.gov/cuu/index.html.
34. Cherry SR, Sorenson JA, Phelps ME, Physics in Nuclear Medicine, 3rd edn., Saunders, Philadelphia, PA, 2003.
CHAPTER 5
Principles of Magnetic Resonance Imaging
Itamar Ronen and DaeShik Kim
The phenomenon of nuclear magnetic resonance was first described by Felix Bloch¹ and independently by EM Purcell in 1946.² Both scientists shared the Nobel Prize in 1952 for this pivotal discovery. The phenomenon is tightly linked to the broader field of interaction between matter and radiation commonly known as spectroscopy. It is within the frame of nuclear magnetic resonance (NMR) spectroscopy that the field developed in leaps and bounds, leading to the discovery of Fourier transform NMR by RR Ernst (Nobel Prize in Chemistry, 1991) and to an astonishingly wide range of applications, from the determination of protein structure in solution to the investigation of metabolic processes in live organisms, and from solid state research to myriad applications in organic chemistry. The unexpected paradigm shift in NMR research came in the early 1970s, when independent research by two ingenious scientists, Paul Lauterbur, then at SUNY Stony Brook, and Peter Mansfield, at Nottingham University, UK, raised the possibility of obtaining images based on the signal generated by nuclear magnetic resonance.³⁻⁵ The humble beginnings, namely the projection reconstruction maps of two water-filled test tubes shown by Lauterbur in his first publication on the matter in the journal Nature, were soon followed by the first applications of this new technique to obtain images of the human body, and this spawned a new field — the field of magnetic resonance imaging (MRI). Both scientists shared the Nobel Prize in Physiology or Medicine in 2003. MRI effectively revolutionized the biomedical sciences, allowing the noninvasive imaging of practically every organ of the human body in health and disease. MRI methodology grew to cover a broad range of diagnostic tools, and entered an astonishing variety of basic research fields, including the ability to visualize brain function, characterize tissue microscopic
structure, reconstruct neural connections, and more — all in a noninvasive and harmless manner. In this chapter, we will explore the basic principles of nuclear magnetic resonance — the way in which the NMR signal is generated and detected, the properties of the NMR signal, and the way in which this signal is manipulated to provide us with images.
5.1 PHYSICAL AND CHEMICAL FOUNDATIONS OF MRI

5.1.1 Angular Momentum of Atomic Nuclei

Atomic nuclei possess intrinsic angular momentum, associated with precession of the nucleus about its own axis (spin). In classical physics, angular momentum is a vector associated with a body rotating or orbiting around an axis of rotation, and is given by

    L = r × P,

where L is the angular momentum vector, r is the radius vector from the center of rotation and P is the linear momentum vector. The vector (cross) product, ×, makes the angular momentum perpendicular to the plane defined by P and r, as can be seen in Fig. 1.
In quantum mechanics, the angular momentum of particles such as electrons, protons or atomic nuclei is given by L = √(I(I + 1)) ℏ, where L stands for the total angular momentum of the
Fig. 1. The relationship between the angular momentum L, the linear momentum P and the radius vector r.
particle, I is the spin number, or simply the spin of the particle, and ℏ is the reduced Planck constant (Planck's constant divided by 2π). The spin quantum number I is characteristic of every nuclear species. I can be zero or can take positive integer or half-integer values. The differences in the value of the spin quantum number reflect differences in the nuclear composition and charge distribution. For instance, the nucleus of ¹H (I = 1/2) consists of one proton only, while the nucleus of ²H (I = 1) consists of a proton and a neutron. For ¹²C and ¹⁶O, I = 0, and these nuclei have zero angular momentum. Table 1 lists some of the stable isotopes of common elements in the periodic table together with their spin numbers.
5.1.2 Energy States of a Nucleus with a Spin I

A nucleus with a spin I possesses 2I + 1 possible states, defined by a quantum number m_I, which can take the values −I, −I + 1, . . . , I − 1, I. These states are associated with the different projections of L on the (arbitrarily chosen) z-axis. The projection is then given by L_z(m) = mℏ. The relationship between L and the different L_z for the case of I = 3/2 is given in Fig. 2. It should be noted that the projection of the angular momentum is well defined on one axis only; as a result of Heisenberg's uncertainty principle, L_x and L_y are not well defined, causing L to lie on an uncertainty cone. In the absence of an
Table 1. Spin properties of some stable isotopes.

Isotope   Natural Abundance (%)   Spin (I)   Magnetic Moment (µ)†   Gyromagnetic Ratio (γ)*
¹H        99.9844                 1/2          2.7927                26.753
²H         0.0156                 1            0.8574                 4.107
¹¹B       81.17                   3/2          2.6880                 —
¹³C        1.108                  1/2          0.7022                 6.728
¹⁷O        0.037                  5/2         −1.8930                −3.628
¹⁹F      100.0                    1/2          2.6273                25.179
²⁹Si       4.700                  1/2         −0.5555                −5.319
³¹P      100.0                    1/2          1.1305                10.840

* γ in units of 10⁷ rad T⁻¹ sec⁻¹
† µ in units of nuclear magnetons = 5.05078 × 10⁻²⁷ J T⁻¹
Fig. 2. Quantization of the angular momentum for I = 3/2. The four possible m states represent four projections on the z-axis.

external magnetic field, states with different m have the same energy — they are degenerate states.
5.1.3 Nuclear Magnetic Moment

The overall spin of the (charged) nucleus generates a magnetic dipole moment, or magnetic moment, along the spin axis. The nuclear magnetic moment µ is again an intrinsic property of the specific nucleus. The magnetic moment µ results from the motion of a charged particle, similar to the generation of a magnetic moment by a loop current. The magnetic moment µ is also a vector, and it is proportional to the angular momentum L through the gyromagnetic ratio γ:

    µ = γL,    µ_z = γL_z = γmℏ.

The gyromagnetic ratio is one of the most useful constants in MR physics, and we will encounter it on several
occasions in our discussion. Table 1 lists the gyromagnetic ratio for
several nuclei of stable isotopes, as well as for the electron.
5.1.4 The Interaction with an External Magnetic Field

The nuclear magnetic moment can interact with an external magnetic field B₀. The energy of this interaction is given by

    E = −µ · B₀ = −(µ_x B₀,x + µ_y B₀,y + µ_z B₀,z),

the scalar product between the magnetic field and the magnetic dipole. If the field is oriented solely along the z-axis, the energy of this interaction is thus given by

    E = −µ · B₀ = −µ_z B₀ = −γ_I m_I ℏB₀,    m_I = −I, −I + 1, . . . , I − 1, I,

where γ_I is the gyromagnetic ratio of the nucleus I. As can be appreciated, the effect of the external magnetic field is the removal of the degeneracy of the energy levels. Each m state is now characterized by a distinct energy. In the case of spin 1/2, m is equal to either −1/2 or +1/2, and the energy levels associated with these two states are:

    E(m = −1/2) = E₀ + ½γℏB₀    and    E(m = +1/2) = E₀ − ½γℏB₀.

Figure 3 describes the energy level diagram of a spin 1/2 particle in the presence of an external magnetic field. The
Fig. 3. The effect of B₀ on the two degenerate m states for an I = 1/2 particle.
energy gap between the two levels is given by ΔE = E(m = −1/2) − E(m = +1/2) = γℏB₀. This energy, expressed as a frequency using the Planck relation ΔE = hν₀, where ν₀ is the Larmor frequency, gives

    ω₀ = 2πν₀ = γB₀.
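The Larmor relation can be checked numerically with the ¹H gyromagnetic ratio from Table 1; the field strength below is an example value:

```python
import math

# Numerical check of the Larmor relation omega_0 = gamma * B0, using the
# 1H gyromagnetic ratio from Table 1 (26.753 x 10^7 rad T^-1 s^-1).

GAMMA_1H = 26.753e7   # rad T^-1 s^-1

def larmor_mhz(b0_tesla, gamma=GAMMA_1H):
    """Larmor frequency nu_0 = gamma * B0 / (2*pi), in MHz."""
    return gamma * b0_tesla / (2.0 * math.pi) / 1.0e6

# At an example field of 3 T this gives ~127.7 MHz, the value used later
# in this chapter for the population-ratio estimate.
nu_3t = larmor_mhz(3.0)
```

The linearity of the relation is why doubling the field doubles the resonance frequency, and why the energy gap (and, as shown below, the population difference) grows with field strength.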
The removal of energy-level degeneracy can be viewed as a result of the break of spherical symmetry introduced by the external magnetic field B₀. As a result of the interaction with B₀, L is now confined to 2I + 1 orientations with respect to B₀, dictated by the 2I + 1 possible values of L_z, which now reflect the projections of L on the axis defined by the direction of B₀. In the case of I = 1/2, the state m = +1/2 has L_z lying on the positive z-axis, and m = −1/2 on the negative z-axis, similarly to what is seen in Fig. 2 for the case I = 3/2.
5.1.5 The Classical Picture

For spin 1/2, the same result can be obtained solely from classical considerations. One can envision the nucleus as a small magnetic dipole with a magnetic moment µ. When placed in an external magnetic field B₀, the nucleus will experience a torque in a plane perpendicular to B₀ and precess about the magnetic field lines with a precession frequency ω₀, proportional to the external magnetic field through the gyromagnetic ratio: ω₀ = γB₀. The dipole precesses along a cone that forms an angle θ with the z-axis. This is equivalent to the uncertainty cone described for L_x and L_y of the quantum particle. It should be noted that the classical picture converges with the quantum mechanical picture only for I = 1/2.
5.1.6 Distribution Among m States

The removal of energy degeneracy thus creates a uniformly spaced energy ladder with 2I + 1 distinct states, separated by ΔE = ℏω₀ = γℏB₀. The energy of the states increases as m decreases. In the case of I = 1/2, the state m = +1/2 has the lower energy and is typically designated as the α state, while the state with m = −1/2 is higher in energy and is designated as the β state. Most importantly for MRI, the energy gap is linearly proportional to the external magnetic field, and so is the
Larmor frequency. At room temperature, one can use the Maxwell–Boltzmann distribution to estimate the distribution of the particles — in our case, nuclei — among the various energy levels. For I = 1/2, the Maxwell–Boltzmann distribution among the states α and β is given by:

    n(α,β)/N = exp[−(E₀ ∓ ½hν₀)/kT] / { exp[−(E₀ − ½hν₀)/kT] + exp[−(E₀ + ½hν₀)/kT] }

(upper sign for α, lower sign for β),
where k is the Boltzmann constant, ν₀ is the Larmor frequency and T is the temperature. For T = 300 K and a magnetic field of 3 Tesla, the Larmor frequency is roughly 127.7 MHz, and the α state is more populated than the β state such that:
$$\frac{n_\beta}{n_\alpha} = \frac{\exp\left[-\left(E_0 + \tfrac{1}{2}h\nu_0\right)/kT\right]}{\exp\left[-\left(E_0 - \tfrac{1}{2}h\nu_0\right)/kT\right]} = \exp\left(-\frac{h\nu_0}{kT}\right) \approx 0.99998.$$
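The quoted ratio is easy to check numerically. A minimal Python sketch, using standard physical constants rounded to four digits:

```python
# Verify the thermal population ratio n_beta/n_alpha for 1H at 3 T
# (Larmor frequency ~127.7 MHz) and T = 300 K.
import math

H = 6.626e-34    # Planck constant, J*s
KB = 1.381e-23   # Boltzmann constant, J/K

def population_ratio(nu0_hz: float, temp_k: float) -> float:
    """n_beta / n_alpha = exp(-h*nu0 / (k*T)) for a spin-1/2 system."""
    return math.exp(-H * nu0_hz / (KB * temp_k))

print(f"{population_ratio(127.7e6, 300.0):.5f}")  # -> 0.99998
```

The tiny excess population of the α state (about 2 parts in 100,000 at 3 T) is what ultimately supplies the detectable MR signal.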
5.1.7 Macroscopic (Bulk) Magnetization

In a macroscopic sample, the total magnetic moment of the sample is called the macroscopic, or bulk, magnetization. The bulk magnetization, or simply the magnetization, at thermal equilibrium is denoted by M(eq) or simply by M, and it is the sum of the individual moments in the sample. The sums of all moments in state α or β are given by M_{α,β}, where M_{α,β}(x) = M_{α,β}(y) = 0 and M_{α,β}(z) > 0. The x and y projections of M_{α,β} vanish because of the uncertainty cone (in the quantum-mechanical picture) or the lack of phase coherence among the individual precessions (in the classical picture). As a result of the thermal distribution among states, M_α > M_β, and the total magnetization M is given by M = M_α − M_β. It can easily be shown that at high temperatures, M = (1/4)N(γℏ)²B_0/kT, known also as Curie's law.
5.1.8 The Interaction with Radiofrequency Radiation — the Resonance Phenomenon

At this point, an important point has been reached: the creation of an energy gap between two unevenly populated states. Radiofrequency (RF) radiation at a frequency equal to the frequency gap between the states will result in transitions of particles from the α to the β state and in absorption of energy quanta. This is the nuclear magnetic resonance phenomenon, and thus the resonance condition is ω_RF = ω_0. The simplest experimental setting that can be envisioned is that of a magnet that generates a static homogeneous magnetic field B_0 and a radiofrequency source that generates RF radiation at ω_RF. If the sample inside the homogeneous B_0 contains nuclei with I > 0 (e.g. a water sample, where the hydrogen atoms have nuclei with I = 1/2), then by slowly varying either the external magnetic field B_0 or ω_RF, the resonance condition will be met at some point, resulting in absorption of RF energy. This absorption can be detected by an RF detector. One of the first NMR spectra ever obtained was of ethanol (CH3CH2OH). The three resonances visible on the spectrum were those of the three hydrogen "types" (the CH3 group, the CH2 group and the OH group), and the slight variations in resonance frequencies among the three stem from slight differences in the electron shielding around the different ¹H nuclei.
5.2 THE BLOCH EQUATIONS

A phenomenological description of the equations of motion for M, the bulk magnetization, was given by Felix Bloch, and the result is known as the Bloch equations. The Bloch equations are extremely useful for understanding the various effects that experimental manipulations of M have, and in particular the effects of radiofrequency radiation. The Bloch equations for the three components of the magnetization are:
$$\frac{dM_x}{dt} = -\gamma(B_{0,y}M_z - B_{0,z}M_y) - \frac{M_x}{T_2},$$
$$\frac{dM_y}{dt} = -\gamma(B_{0,z}M_x - B_{0,x}M_z) - \frac{M_y}{T_2},$$
$$\frac{dM_z}{dt} = -\gamma(B_{0,x}M_y - B_{0,y}M_x) - \frac{M_z - M_{z,\mathrm{eq}}}{T_1},$$
where the first term in each equation represents the torque exerted on each magnetization component by the components of B_0 perpendicular to it. The second term in each equation is a relaxation term that allows the magnetization to regain its equilibrium value. Since we assume B_0 lies along the z-axis, the relaxation along the z-axis is called the longitudinal relaxation, whereas the relaxation in the xy plane is called the transverse relaxation. The sources of these relaxation processes are different, and will be discussed later. If the magnetic field B_0 is aligned along the z-axis, then taking into account the relation ω_0 = γB_0, the Bloch equations in the presence of a static magnetic field take the form:
$$\frac{dM_x}{dt} = \omega_0 M_y - \frac{M_x}{T_2},$$
$$\frac{dM_y}{dt} = -\omega_0 M_x - \frac{M_y}{T_2},$$
$$\frac{dM_z}{dt} = \frac{M_{z,\mathrm{eq}} - M_z}{T_1}.$$
The solution of the Bloch equations for the transverse (xy) components of the magnetization is a clockwise precession accompanied by decay at a rate 1/T_2 until M_xy → 0. The longitudinal component of M relaxes at a rate 1/T_1, approaching M_{z,eq}:

$$M_x(t) = M_{xy}(0)\cos(\omega_0 t)\cdot\exp\left(-\frac{t}{T_2}\right),$$
$$M_y(t) = -M_{xy}(0)\sin(\omega_0 t)\cdot\exp\left(-\frac{t}{T_2}\right),$$
$$M_z(t) = M_{z,\mathrm{eq}} + \left[M_z(0) - M_{z,\mathrm{eq}}\right]\cdot\exp\left(-\frac{t}{T_1}\right).$$
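As a sanity check, the free-precession Bloch equations can be integrated numerically and compared against these closed-form solutions. A minimal sketch in plain Python, with illustrative ω_0, T_1 and T_2 values (not from the text):

```python
# Integrate the free-precession Bloch equations (B0 along z, no RF) with a
# 4th-order Runge-Kutta stepper and compare with the closed-form solutions.
import math

W0, T1, T2, MZ_EQ = 2 * math.pi * 100.0, 0.8, 0.1, 1.0  # illustrative

def deriv(m):
    mx, my, mz = m
    return (W0 * my - mx / T2,
            -W0 * mx - my / T2,
            (MZ_EQ - mz) / T1)

def rk4(m, dt, steps):
    for _ in range(steps):
        k1 = deriv(m)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(m, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(m, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(m, k3)))
        m = tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                  for s, a, b, c, d in zip(m, k1, k2, k3, k4))
    return m

t = 0.05
mx, my, mz = rk4((1.0, 0.0, 0.0), t / 5000, 5000)  # M tipped onto x at t=0

mx_ref = math.cos(W0 * t) * math.exp(-t / T2)
my_ref = -math.sin(W0 * t) * math.exp(-t / T2)
mz_ref = MZ_EQ * (1.0 - math.exp(-t / T1))         # Mz(0) = 0 here
print(max(abs(mx - mx_ref), abs(my - my_ref), abs(mz - mz_ref)))  # tiny
```

The numerical trajectory reproduces the clockwise spiral in the xy plane and the exponential recovery of M_z to within integration error.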
5.2.1 The Inclusion of the RF Field in the Bloch Equations

As mentioned earlier, the actual MR experiment involves the perturbation of M_eq with radiofrequency irradiation. The RF irradiation generates an EM field oscillating at a frequency which we denote by ω. The amplitude of the magnetic part of the electromagnetic field, B_1, defines the nutation frequency ω_1 = γB_1. In order to drive M out of equilibrium, B_1 must operate perpendicular to the z-axis. For simplicity, in our discussion we will assume B_{1,x} > 0 and B_{1,y} = B_{1,z} = 0; in other words, B_1 exerts torque in the yz plane, tilting the equilibrium magnetization away from the z-axis toward the y-axis. Typically we apply a linearly polarized oscillating field. The linear polarization can be decomposed into two circularly polarized counter-rotating fields with a frequency difference of 2ω:

$$\mathbf{B}_{RF} = B_1\left[\cos(\omega t)\,\hat{x} + \sin(\omega t)\,\hat{y}\right] + B_1\left[\cos(\omega t)\,\hat{x} - \sin(\omega t)\,\hat{y}\right].$$
The first component is a counterclockwise rotating component. We are interested in irradiation frequencies close to resonance, and at resonance ω = ω_0. Since M is precessing clockwise under B_0, the counterclockwise component is 2ω_0 away from resonance, and its influence on M can be neglected. Thus B_RF can be viewed as a circularly polarized field whose polarization rotates at a frequency ω:

$$\mathbf{B}_{RF} = B_1\left[\cos(\omega t)\,\hat{x} - \sin(\omega t)\,\hat{y}\right].$$

When a B_RF field is applied, the total magnetic field B is thus:

$$\mathbf{B} = \begin{pmatrix} B_1\cos(\omega t) \\ -B_1\sin(\omega t) \\ B_0 \end{pmatrix}.$$
The Bloch equations thus assume the following form:

$$\frac{dM_x}{dt} = \omega_1\sin(\omega t)M_z + \omega_0 M_y - \frac{M_x}{T_2},$$
$$\frac{dM_y}{dt} = \omega_1\cos(\omega t)M_z - \omega_0 M_x - \frac{M_y}{T_2},$$
$$\frac{dM_z}{dt} = -\omega_1\cos(\omega t)M_y - \omega_1\sin(\omega t)M_x + \frac{M_{z,\mathrm{eq}} - M_z}{T_1},$$

where ω_1 = γB_1.
5.2.2 The Rotating Frame of Reference

This is a rather complicated picture, since the effective field B is a combination of a static magnetic field B_0 and a rotating field B_1. In order to simplify the picture, we move from the laboratory frame of reference to a frame of reference that moves along with the rotating field B_1, i.e. rotates clockwise at a frequency ω. Since B_1 is perpendicular to B_0, and the frame of reference rotates at exactly the frequency of rotation of the RF field, both fields now appear static in this frame. Expressing the components of the transverse magnetization in the rotating frame, M_x′ and M_y′, in terms of the components in the laboratory frame yields:

$$M_{x'} = M_x\cos(\omega t) - M_y\sin(\omega t); \qquad M_{y'} = M_x\sin(\omega t) + M_y\cos(\omega t).$$
Rewriting the Bloch equations for M_x′ and M_y′ gives:

$$\frac{dM_{x'}}{dt} = (\omega_0 - \omega)M_{y'} - \frac{M_{x'}}{T_2},$$
$$\frac{dM_{y'}}{dt} = \omega_1 M_z - (\omega_0 - \omega)M_{x'} - \frac{M_{y'}}{T_2},$$
$$\frac{dM_z}{dt} = -\omega_1 M_{y'} + \frac{M_{z,\mathrm{eq}} - M_z}{T_1}.$$
In this frame, the effective magnetic field is now:

$$\mathbf{B}_{\mathrm{eff}} = \begin{pmatrix} B_1 \\ 0 \\ B_0 - \omega/\gamma \end{pmatrix}.$$

This is a static magnetic field that is the sum of the RF field B_1, which operates along the x′-axis, and a reduced static magnetic field B_0 − ω/γ that operates along the z-axis. The magnetic field along the z-axis appears reduced because the rotating frame follows, at a frequency ω, the magnetization, which precesses at a frequency ω_0. The relative frequency between them is thus ω_0 − ω, and the static magnetic field therefore seems "reduced." The precession of M about the axis defined by the effective magnetic field is depicted in Fig. 4(a).

Fig. 4. The effective field in the rotating frame of reference (A) off resonance and (B) on resonance.
At resonance, ω = ω_0 and the z component of B_eff vanishes. This creates a particularly simple picture, in which the motion of the magnetization is dictated solely by B_1. This is an extremely important achievement in our discussion, because it makes the description of the effects of pulse sequences on the magnetization extremely intuitive. In Fig. 4(b), the resonance condition in the rotating frame is depicted. As can be seen, with the application of B_{1,x}, M will precess about the axis defined by B_{1,x}, i.e. in the zy plane, moving from the positive z-axis towards the positive y-axis and so on.
5.2.3 RF Pulses

RF can be applied in a constant manner (CW) or a transient one (pulses). An RF pulse will cause M to precess around B_1, but only for the period of its duration. After the pulse has ended, M will obey the Bloch equations in which the field consists only of B_0. When an RF pulse is applied to M, the angle between M and the positive z-axis achieved at the end of the pulse is defined as the flip, or tilt, angle. A simple formula for the tilt angle is θ = γB_1τ, where θ is the tilt angle, B_1 is the amplitude of the RF field, and τ is the pulse duration. Figure 5 describes a 90° tilt angle for B_1 applied on the x-axis, or 90°_x, and a 180° pulse when B_1 is applied on the y-axis, or 180°_y.

Fig. 5. The effect of (A) a 90 degree RF pulse when B_1 is along the x-axis; (B) a 180 degree RF pulse when B_1 is along the y-axis.
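The relation θ = γB_1τ is easy to work with numerically for a rectangular ("hard") pulse. A sketch that solves for the B_1 amplitude of a 90° pulse of a given duration; the ¹H gyromagnetic ratio and the 1 ms duration are assumptions of this example, not values from the text:

```python
# Tilt angle theta = gamma * B1 * tau for a rectangular RF pulse.
import math

GAMMA_1H = 2 * math.pi * 42.577e6   # rad/s/T (gamma/2pi = 42.577 MHz/T)

def tilt_angle_deg(b1_tesla: float, tau_s: float) -> float:
    return math.degrees(GAMMA_1H * b1_tesla * tau_s)

def b1_for_angle(theta_deg: float, tau_s: float) -> float:
    return math.radians(theta_deg) / (GAMMA_1H * tau_s)

b1_90 = b1_for_angle(90.0, 1e-3)          # 90-degree pulse, 1 ms long
print(f"B1 = {b1_90 * 1e6:.2f} uT")       # a few microtesla
print(f"check: {tilt_angle_deg(b1_90, 1e-3):.1f} degrees")
```

Note the trade-off the formula implies: halving the pulse duration requires doubling the B_1 amplitude for the same flip angle.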
5.3 THE FREE INDUCTION DECAY

The simplest pulse-NMR experiment involves an excitation of the magnetization by an RF pulse, and the detection of the precession of the magnetization about the B_0 axis in the absence of an RF field. The signal picked up by the RF receiver coil (which may or may not be the one used for the RF transmission) is the one induced by the oscillations of the x and y components of M. This signal is called the free induction decay, or the FID. The FID should be identical to the solution of the Bloch equations given previously. However, since the signal undergoes demodulation, or in other words, the RF frequency is subtracted from the actual frequency of the detected signal, the FID is analogous to the solution of the Bloch equations in the rotating frame. The solution is given by:

$$S_x(t) = S(0)\cos[(\omega_0 - \omega_{\mathrm{ref}})t]\cdot\exp\left(-\frac{t}{T_2}\right),$$
$$S_y(t) = -S(0)\sin[(\omega_0 - \omega_{\mathrm{ref}})t]\cdot\exp\left(-\frac{t}{T_2}\right),$$

where S(0) is proportional to the projection of M on the xy plane immediately after the RF pulse is given, and is thus proportional to M_z sin θ, where θ is the flip angle. The reference, or demodulation, frequency ω_ref is essentially the rotation frequency of the rotating frame. The projection is of course maximized when θ = 90°. The FID is in fact a complex signal, and with a quadrature detection coil both the real and imaginary parts of the signal, separated by a phase of π/2, are detected. The FID for a simple signal that consists of one resonance at ω_0 is thus a damped oscillation at a frequency ω_0 − ω_ref with a decay time constant of T_2, as can be seen in Fig. 6.
5.3.1 The NMR Spectrum

The typical NMR experiment, and as we will see later, the MRI image, carries the FID data onto the frequency domain via the Fourier transformation. The Fourier transformation of the FID yields the real and imaginary parts of the NMR spectrum, also known as the absorption and dispersion modes.

Fig. 6. The real part of the FID (off resonance).

It should be noted that the phase associated with the detection of the FID is arbitrary, and thus can be modified in the processing phase to yield a "pure" absorption (in-phase) spectrum, or any desired phase. The explicit expressions for the real and imaginary parts of the NMR spectrum are given by:
$$\hat{S}_y(\omega) = \int_0^\infty S_y(t)\exp(i\omega t)\,dt = \frac{S(0)\,T_2}{1 + T_2^2(\omega - \Delta\omega)^2},$$
$$\hat{S}_x(\omega) = \int_0^\infty S_x(t)\exp(i\omega t)\,dt = \frac{S(0)\,T_2^2(\omega - \Delta\omega)}{1 + T_2^2(\omega - \Delta\omega)^2},$$

where Δω = ω_0 − ω_ref.
The real and imaginary parts of the spectrum are seen in Fig. 7. The real part is a spectral line with a Lorentzian line shape. The full width at half maximum is inversely proportional to the characteristic decay time of the FID. Here it is given by the relaxation constant T_2, and the relation between the full width at half maximum (FWHM) and T_2 is Δν_{1/2} = 1/(πT_2). Later on we will see that relaxation is enhanced by experimental factors that are not necessarily intrinsic to the sample, and thus a new term will be added to the apparent transverse relaxation: T_2*.
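The linewidth relation Δν_{1/2} = 1/(πT_2) can be verified by Fourier transforming a synthetic FID. A sketch using numpy's FFT; the parameter values are illustrative, not from the text:

```python
# Simulate a single-resonance FID and check that the FWHM of the
# absorption-mode line equals 1/(pi*T2).
import numpy as np

T2 = 0.05                         # s
d_omega = 2 * np.pi * 40.0        # offset (omega_0 - omega_ref), rad/s
dt, n = 1e-4, 2**16               # dwell time (s) and number of points

t = np.arange(n) * dt
fid = np.exp(1j * d_omega * t) * np.exp(-t / T2)   # complex FID

spec = np.fft.fftshift(np.fft.fft(fid)) * dt       # discretized FT
freq = np.fft.fftshift(np.fft.fftfreq(n, dt))      # frequency axis, Hz

absorption = spec.real
above = freq[absorption >= absorption.max() / 2]
fwhm = above.max() - above.min()
print(f"measured FWHM = {fwhm:.2f} Hz, 1/(pi*T2) = {1/(np.pi*T2):.2f} Hz")
```

With T_2 = 50 ms the predicted linewidth is about 6.4 Hz, and the width measured on the computed spectrum agrees to within the frequency-grid resolution.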
Fig. 7. The real part, or absorption mode (left), and the imaginary part, or dispersion mode (right), of the NMR spectrum of a single resonance.
5.3.2 Relaxation in NMR

The NMR signal is governed by two distinct relaxation times. T_1 = 1/R_1 is the longitudinal relaxation time, which is dictated by the energy exchange between the system and the "lattice," which contains all other degrees of freedom to which the spin system is coupled. T_1 describes the rate of the return of M_z to its equilibrium value, M_{z,eq}. T_2 = 1/R_2 is the transverse relaxation time. T_2 is associated with loss of coherent spin motion in the xy plane, which results in a net decrease of M_xy and ultimately in its vanishing. Both relaxation times are intimately related to molecular motion and the interactions of the spins with neighboring spins and their surroundings. Relaxation is induced by randomly fluctuating magnetic fields, typically associated with the modulation of nuclear interactions by random, or stochastic, molecular motion.
5.3.2.1 T_1 Relaxation

The magnetization at equilibrium, M_eq, is governed by the distribution of the spins among the two magnetic states α and β: M_eq = M_α − M_β. At equilibrium this distribution is given by the Boltzmann distribution. When the magnetization is out of equilibrium, what drives it back to equilibrium are fluctuations in the magnetic field with frequency components near ω_0, which allow for energy exchange. In the case of T_1, which operates on M_z, the fluctuations have to induce changes in M_z, and thus will be generated by fluctuations in B_{x,y}. Fluctuations in the magnetic field are generated by interactions that are modulated by, e.g., molecular motion. Many of these mechanisms involve interactions between two neighboring spins. If the interaction between two spins depends on their orientation in the magnetic field, this interaction will be modulated, for example, by rotation of the molecule in which these spins are incorporated.
5.3.2.2 Example — the dipole-dipole interaction

Two neighboring magnetic dipoles interact with each other (think of two magnets). The interaction strength when the two are in an external magnetic field B_0 depends, among other things, on the angle θ between the axis that connects the two dipoles and the magnetic field. Specifically, this interaction is proportional to

$$\frac{1}{r_{ij}^3}\left(3\cos^2\theta_{ij} - 1\right),$$

where r_ij is the internuclear distance between nuclei i and j, and θ_ij is the angle described above. In liquids, random motion will
modulate both r and θ, and if the two nuclei belong to the same molecule (e.g. the two hydrogen atoms in a water molecule), θ is primarily modulated by molecular rotational motion. Rotation is a random motion, but a typical rotation time will be closely related to, e.g., the size of the molecule at hand: the larger the molecule, the slower its characteristic rotation. The characteristic time for such random motion is given by a correlation time, τ_c. If the characteristic motional/rotational time constant τ_c corresponds to a frequency 2π/τ_c that is similar to ω_0, a new kind of resonance is achieved between the Larmor precession and a random process (e.g. molecular rotation). This allows for energy exchange between the spin system and the "lattice," here characterized only by its rotational degree of freedom. This energy exchange is irreversible and leads to loss of energy in the spin system, which eventually returns to thermal equilibrium, where M = M_eq. It is thus the relationship between τ_c, governed among other things by the molecular size, and ω_0 that primarily defines T_1 in most situations.
5.3.2.3 T_2 Relaxation

T_2 governs the decay of the x and y components of the magnetization. For argument's sake, let us assume M = M_x. The decay will result from random fluctuations in B_y and B_z. Fluctuations in B_y induce energy level changes, since they act on M_z, similarly to what we saw previously; only this time we do not have the contribution from B_x, so the energy-exchange component in T_2 is half of that in T_1. Fluctuations in B_z are tantamount to randomly varying the Larmor frequency ω_0. This broadening of the resonance from a single frequency ω_0 to a distribution of frequencies results in a loss of coherence in the precession of the transverse components of M about the z-axis. This irreversible phase loss gradually decreases the magnitude of M_xy until phase coherence is completely lost and M_xy → 0. This effect increases with B_0, and thus T_2 decreases monotonically with B_0. T_2 is referred to as the transverse relaxation time, or the spin-spin relaxation time.
5.3.2.4 T_2* — The Effects of Field Inhomogeneity

As we saw earlier, T_2 contributes to the decay in the xy plane even when the external field is completely homogeneous. Inhomogeneity of B_0 contributes further to the loss of coherence, or dephasing, of M_xy, simply because different parts of the sample "feel" a different field B_0, resulting in a spread of frequencies. The total amount of decay is given by an additional decay constant, T_2*. The relaxation rate due to T_2* combines contributions from the "pure" T_2 relaxation and those that stem from B_0 inhomogeneity: 1/T_2* = 1/T_2 + 1/T_2′, where T_2′ denotes the contribution to transverse relaxation from B_0 inhomogeneity.
5.3.2.5 Refocusing the Effects of Static B_0 Inhomogeneity — The Spin Echo

Erwin Hahn first noticed that if an excitation pulse is followed by another pulse after a time period τ, a FID is regenerated after another τ period has elapsed from the second pulse, even though the original FID has completely vanished.² This phenomenon was later called the spin echo, and it became a staple feature of numerous pulse sequences, the most basic of which is used to measure T_2. The spin echo is best explained using the concept of two isochromats, or two spin populations with distinctly different resonance frequencies, ω_s and ω_f, i.e. a "slow" and a "fast" frequency stemming from the different B_0 felt by these populations. A diagrammatic description of the sequence of events in a spin echo is given in Fig. 8.

Fig. 8. Spin dynamics of two isochromats during a Hahn spin echo sequence: (A) immediately after excitation; (B) following a delay τ; (C) after the 180° (refocusing) pulse; and (D) after a second delay τ.

Following the first 90°(x) pulse, both isochromats create an initial transverse magnetization (a). After a period τ, as a result of the frequency difference between the two, ω_s is lagging behind ω_f, as seen in (b). If a 180°(x) pulse is given, the isochromats are flipped around the x-axis, and the "mirror image" of the two isochromats is such that now ω_s is in the lead and ω_f is lagging behind (c). After the same period τ, the two isochromats will converge, or refocus, on the negative y-axis (d). The phase between the two isochromats that was created by the inhomogeneity of B_0 is now restored to 0. By generalization, the spin echo sequence refocuses phase loss that is due to static B_0
inhomogeneities. One should note that phase losses due to T_2 relaxation are not restored, and neither are losses due to spin motion in an inhomogeneous B_0. Figure 9 shows the FID and the echo that is generated by a spin echo sequence. It should be noted that although the intensity of the echo is weighted by T_2, the envelopes of both the original FID and the echo still decay as a function of T_2*.
5.3.2.6 The Effect of T_1

Since T_1 operates on the z-axis, its effects are not directly visible in the FID, or the NMR spectrum for that matter. Since the amount of magnetization available for detection is dictated by M_z, the intensity of the detected signal will depend on how large M_z was prior to the excitation pulse. If the time between subsequent excitations, also known as TR (time-to-repetition), is too short to let M_z from the previous excitation reach its equilibrium value M_{z,eq}, then a reduction in signal intensity occurs. This reduction is more severe for spin populations with a longer T_1, and this is the basis for obtaining T_1-based contrast in MR images. Since T_1 affects M_z, an inversion pulse (a 180° pulse) applied first to the sample inverts the magnetization to yield M(0) = −M_{z,eq}. From this point on, the magnetization, which does not possess transverse components, will relax along the z-axis according to M_z(t) = M_{z,eq}(1 − 2 exp(−t/T_1)) (see the solution for M_z in the section on the Bloch equations). To make the magnetization detectable, another pulse, a detection pulse, is needed to flip the magnetization to the xy plane. This is the inversion-recovery sequence, used both for measuring T_1 in a sample and for generating contrast based on T_1 and on other mechanisms that will be briefly mentioned later.

Fig. 9. The FID and echo formation for a Hahn spin echo.
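The inversion-recovery expression makes the signal null easy to locate: M_z(t) = 0 when t = T_1 ln 2, the basis of null-point methods for suppressing a tissue of known T_1. A small sketch (the T_1 value is illustrative, not from the text):

```python
# Inversion recovery: Mz(t) = Mz_eq * (1 - 2*exp(-t/T1)).
# The longitudinal magnetization crosses zero at t = T1 * ln(2).
import math

def mz_inversion_recovery(t: float, t1: float, mz_eq: float = 1.0) -> float:
    return mz_eq * (1.0 - 2.0 * math.exp(-t / t1))

def null_time(t1: float) -> float:
    return t1 * math.log(2.0)

t1 = 1.2                                   # s, illustrative T1
ti = null_time(t1)
print(f"TI(null) = {ti * 1e3:.0f} ms")     # -> TI(null) = 832 ms
print(f"Mz at null: {mz_inversion_recovery(ti, t1):.1e}")
```

Choosing the inversion delay this way nulls one spin population while others, with different T_1 values, still contribute signal.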
5.4 SPATIAL ENCODING — DIFFERENTIATING THE NMR SIGNAL ACCORDING TO ITS SPATIAL ORIGINS

Let us revisit some of the simplest principles we know so far through a simple example: in a homogeneous field, a couple of test tubes filled with water, set apart from each other on the x-axis, will generate a single peak, whose frequency is dictated by B_0 and γ: ω_0 = γB_0. Or in other words: the FID (and thus the spectrum) will have one characteristic frequency, defined by the chemical species in our sample (e.g. water protons). As long as the field B_0 is homogeneous, ω_0 is constant across the sample.

If variability is introduced in B_0 along a certain axis, e.g. the x-axis, the same variability will be expressed in ω_0, and each point in space with the same x coordinate will have the same ω_0, only that now ω_0 = ω_0(x). The simplest such variability is one that is linear with distance from an arbitrary point: a linear magnetic field gradient
(MFG). It should be emphasized that in the MR convention, the B_0 field is always oriented along the z-axis, but the variation in B_0 is linear along the axis of choice. The resulting magnetic field in the presence of a MFG, e.g. along the x-axis, is then:

$$B_0(x) = B_0(0) + \Delta B_0(x) = B_0(0) + \frac{dB_0}{dx}\,x = B_0(0) + g_x \cdot x.$$

g_x thus has units of G·cm⁻¹ (cgs), and it is the slope of the variation of B_0 with x.
5.4.1 Acquisition in the Presence of a MFG

As can be seen in Fig. 10, application of a MFG on the x-axis assigns a resonance frequency to each position on that axis. In other words, the FID now consists of a range of resonance frequencies, and this range is a result of the variability of B_0 across the sample. Acquisition of a FID in the presence of the MFG, followed by a FT, yields a spectrum in which frequency is proportional to position on the x-axis. The intensity of the "peak" is proportional to the total M(0) at that specific location on the x-axis. This 1D image is thus a projection of the 3D spin distribution in our sample onto the x-axis.

Fig. 10. The sample in the presence of a magnetic field gradient. Each point in the sample along the x-axis feels a different magnetic field (left). The result (right) is a frequency-encoded one-dimensional image (projection) of the object.
Historically speaking, Paul Lauterbur (Nature, 1973) first suggested the use of MFGs for spatial encoding. His idea was to measure projections in different radial directions, and to reconstruct the object from them (projection-reconstruction). He called his method Zeugmatography.
5.4.2 MFG, Spectral Width and Field of View

The range of frequencies, or spectral width SW, that is spanned by the MFG is related to the spatial range D on the axis of interest through the gradient strength and the gyromagnetic ratio. In the case where the gradient is applied along the x-axis, SW(x) = γ·g(x)·D(x). The implication is that the stronger the gradient, the broader the frequency span for a specific desired field of view (FOV) on a desired axis. Or, conversely, increasing the FOV on a specific axis increases the SW on that axis.
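The relation SW = γ·g·D is straightforward to evaluate. A sketch in SI units; the gradient strength, FOV and ¹H gyromagnetic ratio are assumptions of this example:

```python
# Spectral width spanned by a linear gradient g over a field of view D:
# SW = (gamma/2pi) * g * D, expressed in Hz.
import math

GAMMA_BAR = 42.577e6   # Hz/T for 1H (gamma / 2*pi)

def spectral_width_hz(g_t_per_m: float, fov_m: float) -> float:
    """SW in Hz for gradient strength g (T/m) over a FOV (m)."""
    return GAMMA_BAR * g_t_per_m * fov_m

# A 10 mT/m gradient over a 25.6 cm FOV:
sw = spectral_width_hz(10e-3, 0.256)
print(f"SW = {sw / 1e3:.1f} kHz")   # -> SW = 109.0 kHz
```

Doubling either the gradient strength or the FOV doubles the spectral width, exactly as the text states.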
5.4.3 Another Way to Look at the Effect of MFG

If two locations a and b on an object along the gradient axis are designated x_a and x_b, respectively, then in the rotating frame the frequencies generated at those two locations in the presence of a MFG are two non-identical frequencies, ω_a and ω_b, respectively. By doubling the gradient strength, the frequencies are also doubled, becoming 2ω_a and 2ω_b, respectively. This means that the evolution of the FID in the presence of a magnetic field gradient, or more specifically, of the phase of each frequency component of the FID, is a function of both time (t) and the gradient strength (g). Twice the time means twice the phase for the same g; twice the gradient strength means twice the phase for the same t. Thus the detected signal S is S(γ, t, g). A new variable k can be introduced, which has units of cm⁻¹ (inverse space). k is defined as k = γ∫g(t)dt, and if the gradient is constant with time, then S = S(k) = S(γ·g·t). In a similar way in which the time and frequency domains are related to each other via the Fourier transformation, so are the coordinate vector r and the vector k:

$$f(\mathbf{r}) = \frac{1}{2\pi}\int F(\mathbf{k})\exp(i\mathbf{k}\cdot\mathbf{r})\,d\mathbf{k}, \qquad F(\mathbf{k}) = \int f(\mathbf{r})\exp(-i\mathbf{k}\cdot\mathbf{r})\,d\mathbf{r},$$

or in other words, k-space and coordinate space are Fourier conjugates.

The effect that gradients and RF pulses have on the phase of the transverse magnetization can thus be efficiently described as a trajectory in k-space. It is instructive to consider the case where a slice-selective excitation pulse (e.g. a 90°_x pulse) is applied to the equilibrium magnetization, and only manipulations of the magnetization in the xy plane are considered. This covers the common situation of trajectories in a 2D k-space, encountered in all multislice schemes. At t = 0 (right after the excitation pulse), M_xy is in phase and thus k_x = k_y = 0.

This is all very similar to spectroscopy, only that instead of having different resonance frequencies that originate from different chemical shifts, the different frequencies originate from different positions in a non-homogeneous field!
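The Fourier-conjugate relationship can be made concrete in one dimension: evaluate the k-space signal S(k) = Σ ρ(x) exp(−ikx) for a simple two-tube "phantom," then recover ρ(x) with an inverse FFT. A sketch with illustrative parameters (numpy assumed):

```python
# 1D demonstration that k-space and coordinate space are Fourier conjugates.
import numpy as np

n, fov = 256, 0.256                       # samples, field of view (m)
x = (np.arange(n) - n // 2) * (fov / n)   # positions on the x-axis
rho = np.zeros(n)
rho[np.abs(x + 0.05) < 0.01] = 1.0        # test tube 1
rho[np.abs(x - 0.05) < 0.01] = 0.5        # test tube 2 (half the water)

dk = 2 * np.pi / fov                      # k-space step matched to the FOV
k = (np.arange(n) - n // 2) * dk
# Directly evaluate the encoded signal S(k) = sum_x rho(x) exp(-i k x)
signal = np.array([(rho * np.exp(-1j * ki * x)).sum() for ki in k])

# Image reconstruction = inverse DFT of the sampled k-space signal
recon = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(signal)))
print("max reconstruction error:", np.max(np.abs(recon - rho)))
```

The recovered profile matches the original spin density to floating-point precision, illustrating why sampling k-space suffices to form an image.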
5.4.4 Flexibility in Collecting Data in k-Space

Since S = S(k) = S(γ·t·g), the signal (encoded for position on, e.g., the x-axis) can be acquired in two different ways:

Keep the gradient constant: let the signal evolve with time, and sample the FID at different time points. This is typically referred to as frequency encoding.

Keep the time constant: sample the FID following the application of short gradients of the same duration but different strength. This is referred to as phase encoding.

Thus, if a rectangular portion of k-space needs to be sampled, a logical way to achieve this goal is to acquire the data following sequential excitations of the spin system, where each excitation is frequency encoded in one direction (say, the x-axis) and phase encoded in the perpendicular direction (e.g. the y-axis).
5.4.5 The Gradient Echo

Typically, in order to allow for efficient time management of other pulse sequence elements and the acquisition of a full echo signal, the magnetization is first dephased along the frequency encoding direction, and then rephased using a gradient of opposite polarity. If the dephasing gradient amplitude equals the rephasing gradient amplitude, then magnetization components that gained phase, created by a local field B_0 − ΔB for a time period t, will now recover the same phase if the local field at the same point is B_0 + ΔB for the same time t. More generally, the refocusing condition is that the area of the dephasing gradient be equal and opposite to that of the rephasing gradient:

$$\int_{t_{\mathrm{start}}}^{t_{\mathrm{end}}} g_{\mathrm{deph}}\,dt = -\int_{t_{\mathrm{start}}}^{t_{\mathrm{end}}} g_{\mathrm{reph}}\,dt.$$

This allows for flexibility in choosing the time it takes the magnetization to refocus. The time between excitation and refocusing is the gradient echo time (TE). The principle of the gradient echo is illustrated in Fig. 11.

Fig. 11. The gradient echo.
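The refocusing condition can be checked in a few lines: under a piecewise-constant gradient, the phase accumulated at position x is γ·g·x·τ per lobe, and lobes of equal and opposite area cancel it at every x. A sketch (the gradient amplitudes and durations are illustrative; γ for ¹H assumed):

```python
# Numeric check of the gradient-echo refocusing condition: opposite-area
# dephase/rephase lobes null the accumulated phase at all positions.
import math

GAMMA = 2 * math.pi * 42.577e6   # rad/s/T, 1H

def accumulated_phase(x_m: float, lobes) -> float:
    """Phase at position x for a list of (gradient T/m, duration s) lobes."""
    return sum(GAMMA * g * x_m * tau for g, tau in lobes)

dephase = (-10e-3, 2e-3)   # -10 mT/m for 2 ms
rephase = (+5e-3, 4e-3)    # +5 mT/m for 4 ms: equal area, opposite sign

for x in (-0.1, 0.0, 0.02, 0.1):   # positions along the readout axis (m)
    assert abs(accumulated_phase(x, [dephase, rephase])) < 1e-6
print("net phase is zero at every position: a gradient echo forms")
```

Note that the two lobes need not match in amplitude or duration individually; only their areas must cancel, which is the flexibility in choosing TE mentioned above.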
5.4.6 Encoding for the Third Dimension: Slice Selection or Additional Phase Encoding

There are two main options for spatially encoding the out-of-plane dimension. One is to add a phase-encoding loop on the third dimension. This choice is popular with imaging modalities that aim for high spatial resolution. The other option is to combine frequency-selective RF pulses with magnetic field gradients for a spatially selective excitation. For example, for an RF pulse with a sinc-shaped envelope sin(τ)/τ, the frequency response function is rectangular with a bandwidth of 1/τ. In the presence of a gradient, the external magnetic field is given by B(x) = B(0) + ΔB = B(0) + g·x. The range of the magnetic field ΔB can be expressed as a range of frequencies Δω = γΔB, and it is this range of frequencies that is in resonance with those contained in the sinc pulse bandwidth. This provides a simple relationship between the bandwidth of the pulse, the gradient applied in conjunction with the pulse, and the spatial extent of the excitation, or slice thickness: BW = γ·g·Δx, where BW is the bandwidth of the RF pulse, γ is the gyromagnetic ratio, g is the gradient strength and Δx is the slice thickness. The carrier frequency of the RF pulse can be varied to shift the location of the center of the slice. In a typical multislice MRI experiment, the slice location is varied cyclically, and in order to avoid artifacts associated with slice overlaps, the cycle is performed on odd and even slices sequentially.
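The slice-selection relation BW = γ·g·Δx rearranges directly to the slice thickness. A sketch (the pulse bandwidth, gradient strength and ¹H gyromagnetic ratio are assumptions of this example):

```python
# Slice selection: BW = gamma * g * dx, so dx = BW / (gamma * g).
# Working with gamma/2pi so that BW is in Hz.
import math

GAMMA_BAR = 42.577e6   # Hz/T for 1H (gamma / 2*pi)

def slice_thickness_m(bw_hz: float, g_t_per_m: float) -> float:
    return bw_hz / (GAMMA_BAR * g_t_per_m)

# A 1.28 kHz pulse with a 6 mT/m slice-select gradient:
dx = slice_thickness_m(1280.0, 6e-3)
print(f"slice thickness = {dx * 1e3:.1f} mm")   # -> 5.0 mm
```

Thinner slices therefore require either a narrower pulse bandwidth (a longer pulse) or a stronger slice-select gradient.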
5.4.7 Intraslice Phase Dispersion
One problem associated with slice selection stems from the fact that, due to the presence of the gradient during the application of the RF pulse, a range of frequencies is being created. This means that except for the one frequency ω0, i.e. the reference frequency for that particular nucleus, all other frequencies excited by the pulse are off resonance. Off-resonance effects are marginal when the pulse duration is short with respect to the frequency offset caused by the gradients, but this is typically not the case. The result is that frequency components within the slice gain a phase component that is proportional to their
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch05 FA
124
Itamar Ronen and DaeShik Kim
distance from the point in the slice excited by ω0 (typically the center of the slice for symmetric frequency responses). This in turn causes signal loss, which can at times be quite severe. In order to refocus this phase dispersion, a gradient with the opposite polarity to that of the slice-selection gradient is applied immediately at the end of the pulse. It can be shown that for complete refocusing the following condition has to be met: S(g_refocus) = S(g_slice)/2, where S is the "area," or the integral over time, of the given gradient.
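Since the refocusing condition constrains gradient areas only, any amplitude/duration pair with the right area works. A small sketch with assumed rectangular gradient lobes:

```python
# Rectangular-gradient sketch of the refocusing condition
# S(g_refocus) = S(g_slice)/2, with the "area" S = amplitude * duration.
# All numbers are assumed example values.
g_slice, t_slice = 10e-3, 2e-3          # T/m, s: slice-selection lobe
area_slice = g_slice * t_slice          # area S of the slice gradient

g_refocus = -20e-3                      # opposite polarity, stronger lobe
t_refocus = (area_slice / 2) / abs(g_refocus)  # duration meeting S/2
print(round(t_refocus * 1e3, 3), "ms")  # 0.5 ms
```

Doubling the refocusing amplitude halves the time it needs, which is one way sequences shorten the minimum echo time.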
5.4.8 A Complete Pulse Sequence
The first MRI pulse sequence that incorporated all three elements of spatial encoding (frequency encoding, phase encoding and slice selection) was the "spin warp," suggested by W. Edelstein in 1980.³ The schematics of the spin warp are given in Fig. 12. Many of the MRI pulse sequences that were subsequently developed are conceptually similar to the spin warp. Notable exceptions are sequences that are based on a single excitation, such as echo planar imaging (EPI) and multiple spin-echo sequences.
Fig. 12. The spin warp pulse sequence.
5.4.9 Contrast in MRI Sequences
Contrast is the visual differentiation between different parts of the image. In MRI, contrast is typically based on a physical property related to a specific spin population. The physical property is thus called a contrast mechanism. Contrast can be based on relaxation properties: T1, T2, T2*. Additionally, contrast can be based on spin mobility: flow (e.g. blood flow), perfusion (mobility of water through the capillary bed into tissue), and self-diffusion (random motion of water molecules). A different type of contrast is based on chemical environment effects: proton or water chemical exchange between different environments (e.g. binding sites on heavy macromolecules) gives rise to contrast through the magnetization transfer mechanism.
Relaxation-based contrast is the most basic way to obtain contrast in MRI. The contrast is achieved by sensitizing the image to one (or more) of the relaxation mechanisms previously mentioned. By examining the simple spin-warp sequence, it is already possible to get a sense of how contrast is achieved. First, since this is a gradient echo sequence, the image will be primarily T2*-weighted: the intensity of the echo is given by S(TE) = S(0)·exp(−TE/T2*). This is not entirely correct, since there are two other main factors that influence the contrast. One is explicitly present in the equation above: S(0), or spin density. The other is caused by the finite time between consecutive excitations (TR), which affects the amount of longitudinal magnetization available for the next excitation. This contrast is based on T1 and becomes more pronounced as TR becomes shorter or as the flip angle is closer to 90°. The possibility of obtaining contrast based on T2 rests on the introduction of a spin-echo element in the pulse sequence. This can easily be done by inserting a 180° pulse between the excitation and the center of the acquisition, and accounting for the polarity of the gradient-echo gradients. This modification converts the spin-warp sequence into a T2-weighted sequence. Other mechanisms will be described in detail in other chapters of this book.
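The echo intensity S(TE) = S(0)·exp(−TE/T2*), together with the TR-dependent recovery of longitudinal magnetization, can be evaluated numerically. The sketch below folds in the standard saturation-recovery factor (1 − exp(−TR/T1)) for a 90° flip angle; the tissue relaxation times are assumed, illustrative numbers, not values from the text.

```python
import math

def gradient_echo_signal(s0, te, tr, t1, t2_star):
    """Gradient-echo signal for a 90-degree flip: T2* decay over TE and
    saturation-recovery T1 weighting over TR (assumed standard model)."""
    return s0 * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2_star)

# Two hypothetical tissues at the same sequence settings (times in ms)
te, tr = 30.0, 500.0
a = gradient_echo_signal(1.0, te, tr, t1=900.0, t2_star=60.0)   # "tissue A"
b = gradient_echo_signal(1.0, te, tr, t1=1400.0, t2_star=40.0)  # "tissue B"
print(a > b)  # True: shorter T1 and longer T2* both favor tissue A here
```

Sweeping TE and TR in such a model is a quick way to see why short TR emphasizes T1 differences while long TE emphasizes T2* differences.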
5.4.10 Echo Planar Imaging (EPI)
In our review of pulse sequence principles, we assumed that each k-space line requires a separate excitation. This puts a severe limit on the minimum time required for obtaining an image: the need to introduce a delay between excitations (TR) is the single most time-consuming element in the entire pulse sequence. P. Mansfield⁵ suggested the possibility of obtaining an image with a single excitation. The trick is to find a trajectory in k-space that will cover the portion of k-space we are interested in. The way it is done is demonstrated in Fig. 13. Following the excitation pulse, pre-encoding gradients are applied in the phase-encoding and readout (frequency-encoding) directions (a). From then on, the readout gradients switch polarity back and forth to allow for a "zigzag" scan of k-space. Phase-encoding "blips" are introduced between readout lines to bring the magnetization to consecutive k-space lines. Since the readout time is typically very short (on the order of 1 ms–2 ms), the acquisition of an entire image of a single slice typically takes less than 100 ms. An entire volume that consists of several slices can thus be acquired in a couple of seconds. EPI is thus the natural sequence to be used in applications where temporal (and not spatial) resolution is required.
Fig. 13. The EPI pulse sequence.
The most notable application of this class is that of functional MRI. In the version of EPI shown above, the signal is T2*-weighted. The introduction of a 180° pulse, similar to what has been previously mentioned, will result in a T2-weighted EPI image.
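The timing claim above can be checked with simple arithmetic; the line count, readout time and slice count below are assumed, representative numbers.

```python
# Single-shot EPI timing sketch (assumed, representative numbers)
readout_per_line_ms = 1.0     # within the 1 ms-2 ms range quoted above
n_lines = 64                  # k-space lines in one image
t_image_ms = n_lines * readout_per_line_ms  # ignores blips and excitation

n_slices = 20
t_volume_s = n_slices * t_image_ms / 1000.0
print(t_image_ms, t_volume_s)  # 64.0 ms per slice, 1.28 s per volume
```

Even this crude estimate lands under the 100 ms per slice and "couple of seconds" per volume figures, which is why EPI underpins time-resolved applications such as functional MRI.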
5.5 CONCLUSION
In this chapter, we developed the theoretical foundations of NMR and subsequently of MRI. The principles of nuclear magnetic resonance, nuclear relaxation, spatial encoding and MRI image contrast have been discussed and amply illustrated. This basis should give the reader a strong tool for understanding the sophisticated applications of MRI in the biomedical sciences, such as functional MRI of the brain using the blood oxygenation level dependent (BOLD) effect, diffusion-weighted and diffusion tensor imaging (DTI), and more.
References
1. Bloch F, Nuclear induction, Phys Rev 70(7–8): 460, 1946.
2. Hahn EL, Spin echoes, Phys Rev 80: 580–594, 1950.
3. Edelstein WA, Hutchison JMS, Johnson G, Redpath T, Spin warp NMR
imaging and applications to human whole body imaging, Phys Med Biol
25: 751–756, 1980.
4. Lauterbur PC, Image formation by induced local interactions: Examples employing nuclear magnetic resonance, Nature 242: 190–191, 1973.
5. Mansfield P, Multi-planar image formation using NMR spin echoes, J Phys C 10: L55–L58, 1977.
6. Purcell EM, Torrey HC, Pound RV, Resonance absorption by nuclear
magnetic moments in a solid, Phys Rev 69(1–2): 37, 1946.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch06 FA
CHAPTER 6
Principles of Ultrasound Imaging
Modalities
Elisa Konofagou
Despite the fact that medical ultrasound preceded MRI and PET, ongoing advances have allowed it to continuously expand as a field in its numerous applications. In the past decade, with the advent of faster processing, specialized contrast agents, a better understanding of nonlinear wave propagation, novel real-time signal and image processing, and complex ultrasound transducer manufacturing, ultrasound imaging and ultrasound therapy have enjoyed a multitude of new features and clinical applications. Due to these developments, ultrasound has become a very powerful imaging modality, mainly owing to its unique temporal resolution, low cost, nonionizing radiation and portability. Lately, unique features such as harmonic imaging, coded excitation, 3D visualization and elasticity imaging have further added to the quality and range of applications of diagnostic ultrasound images. In this chapter, a short overview of the fundamentals of diagnostic ultrasound and a brief summary of its many applications and methods are provided. The first part of the chapter provides a short background on ultrasound physics, and the second part constitutes a short overview of ultrasound imaging and image formation.
6.1 INTRODUCTION
Sounds with a frequency above 20 kHz are called ultrasonic, since they occur at frequencies inaudible to the human ear. When ultrasonic waves are emitted in short bursts, propagate through media with low reflection coefficients (such as water), and are reflected by obstacles along their propagation path, the detection of the reflection, or echo, of the ultrasonic
wave can help localize the obstacle. This principle has been used
by sonar (SOund NAvigation and Ranging) and inherently used
by marine mammals, such as dolphins and whales, to help them
localize prey, obstacles or predators. In fact, the frequencies used for "imaging" vary significantly depending upon the application: from underwater sonar (up to 300 kHz), diagnostic ultrasound (1 MHz–40 MHz), therapeutic ultrasound (0.8 MHz–4 MHz) and industrial nondestructive testing (0.8 MHz–20 MHz) to acoustic microscopy (up to 2 GHz).
6.2 BACKGROUND
6.2.1 The Wave Equation
As the ultrasonic wave propagates through the tissue, its energy and momentum are transferred to the tissue. No net transfer of mass occurs at any particular point in the medium unless this is induced by the momentum transfer. As the ultrasonic wave passes through the medium, the peak local pressure in the medium increases. The oscillations of the particles result in harmonic pressure variations within the medium and in a pressure wave that propagates through
Fig. 1. Particle displacement and particle distribution for a traveling longitudinal wave. The direction of propagation is from left to right, namely the longitudinal (or axial) direction. A shear wave can be created in the perpendicular direction, in which case the particles would also be moving in a direction orthogonal to the direction of propagation (not shown here).
Fig. 2. A small volume of the medium of impedance Z (1) at equilibrium and
(2) undergoing oscillatory motion when an oscillatory force F is applied.
the medium as neighboring particles move with respect to one another (Fig. 1). The particles of the medium can move back and forth in a direction parallel (longitudinal wave) or perpendicular (transverse wave) to the traveling direction of the wave.

Let's consider the first case. Assuming that a small volume of the medium, which can be modeled as a nonviscous fluid (no shear waves can be generated), is as shown in Fig. 2, an applied force δF produces a displacement of u + δu at position z + δz on the right-hand side of the small volume. A gradient of force ∂F/∂z is thus generated across the element in question and, assuming that the element is small enough that the measured quantities within the medium are constant, the gradient can be assumed to be linear, or:

δF = (∂F/∂z) δz,    (1)
and according to Hooke’s Law,
F = KS
∂u
∂z
, (2)
where K is the adiabatic bulk modulus of the liquid and S is the area
of the regiononwhichthe force is exerted. Bytakingthe derivative of
both sides of Eq. 2 with respect to z and following Newton’s Second
Law, from Eq. 1 we obtain the so-called "wave equation":

∂²u/∂z² − (1/c²) ∂²u/∂t² = 0,    (3)

where c is the speed of sound, given by c = √(K/ρ) = 1/√(ρκ), where ρ is the density of the medium and κ is the compressibility of the medium. Eq. 3 relates the second derivative of the particle displacement with respect to distance to the acceleration of a simple harmonic oscillator. Note that the average speed of sound in most soft tissues is about 1540 m/s, with a total range of ±6%. For the shear-wave derivation of this equation, please refer to Wells¹ or Kinsler and Frey,² among others.
The solution of the wave equation is given by a function u, where:

u = u(ct − z).    (4)

An appropriate choice of function for u in Eq. 4 can be:

u(t, z) = u0 exp[jk(ct − z)],    (5)

where k is the wavenumber, equal to 2π/λ, with λ denoting the wavelength (Fig. 1).
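The relation c = √(K/ρ) can be sanity-checked numerically: a water-like soft tissue with an assumed bulk modulus K ≈ 2.37 GPa and density ρ ≈ 1000 kg/m³ gives roughly the 1540 m/s quoted above, and λ = c/f then fixes the wavenumber k used in Eq. 5. The material values below are assumptions, not taken from the text.

```python
import math

K = 2.37e9     # Pa, assumed adiabatic bulk modulus (water-like tissue)
rho = 1000.0   # kg/m^3, assumed density

c = math.sqrt(K / rho)    # speed of sound, c = sqrt(K/rho)
f = 5.0e6                 # 5 MHz diagnostic frequency
lam = c / f               # wavelength
k = 2 * math.pi / lam     # wavenumber used in Eq. (5)
print(round(c), round(lam * 1e3, 3))  # ~1539 m/s, ~0.308 mm
```

The sub-millimeter wavelength at 5 MHz is what sets the scale of achievable spatial resolution in diagnostic imaging.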
6.2.1.1 Impedance, Power and Reﬂection
The pressure wave that results from the displacement generated and given by Eq. 5 is given by:

p(t, z) = p0 exp[jk(ct − z)],    (6)

where p0 is the pressure wave amplitude and j is equal to √−1. The particle speed and the resulting pressure wave are related through the following relationship:

u = p/Z,    (7)

where Z is the acoustic impedance, defined as the ratio of the acoustic pressure wave at a point in the medium to the speed of the particle at
the same point. The impedance is thus characteristic of the medium and given by:

Z = ρc.    (8)

The acoustic wave intensity is defined as the average flow of energy through a unit area in the medium perpendicular to the direction of propagation.² By following that definition, the intensity can be found equal to³:

I = p0²/(2Z),    (9)

and is usually measured in units of mW/cm² in diagnostic ultrasound.
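Equation 9 converts a pressure amplitude into an intensity through the impedance. The soft-tissue impedance (Z ≈ 1.5 MRayl) and the peak pressure below are assumed, illustrative values:

```python
Z = 1.5e6       # Rayl (kg m^-2 s^-1), assumed soft-tissue impedance
p0 = 100e3      # Pa, assumed peak pressure amplitude (100 kPa)

I = p0**2 / (2 * Z)          # W/m^2, Eq. (9)
I_mw_cm2 = I * 1000 / 1e4    # convert W/m^2 -> mW/cm^2
print(round(I_mw_cm2, 1))    # intensity in mW/cm^2
```

Because I grows with the square of p0, doubling the drive pressure quadruples the delivered intensity, which matters for acoustic-output safety limits.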
A first step into understanding the generation of ultrasound images is to follow the interaction of the propagating wave with the tissue. Owing to the varying mean acoustic properties of tissues, a wave transmitted into the tissue will be partly reflected at areas where the properties of the tissue, and thus its impedance, are changing. These areas constitute a so-called "impedance mismatch" (Fig. 3).
Fig. 3. An incident wave at an impedance mismatch (interface): A reﬂected and
a transmitted wave with certain velocities and pressure amplitudes are created
ensuring continuity at the boundary.
The reflection coefficient R of the pressure wave at an incidence angle of ϑi is given by:

R = pr/pi = (Z2 cos ϑt − Z1 cos ϑi)/(Z2 cos ϑt + Z1 cos ϑi),    (10)

where ϑt is the angle of the transmitted wave (Fig. 3), related to the incidence angle through Snell's Law:

λ1 cos ϑi = λ2 cos ϑt,    (11)

where λ1 and λ2 are the wavelengths of the waves in media 1 and 2, respectively, and are related to the speeds in the two media through:

c = λf,    (12)

where f is the frequency of the propagating wave.
As Fig. 3 also shows, the wave impinging upon the impedance mismatch also generates a transmitted wave, i.e. a wave that propagates through. The transmission coefficient is defined as:

T = pt/pi = 2Z2 cos ϑi / (Z2 cos ϑi + Z1 cos ϑt).    (13)
According to the parameters reported by Jensen³ on the impedance and speed of sound of air, water and certain tissues, the reflection coefficient at a fat–air interface is equal to −99.94%, showing that virtually all of the energy incident on the interface is reflected back in tissues such as the lung. A more realistic example found in the human body is the muscle–bone interface, where the reflection coefficient is 49.25%, demonstrating the challenges encountered when using ultrasound for the investigation of bone structure. On the other hand, given the overall similar acoustic properties of different soft tissues, the reflection coefficient between soft-tissue structures is low, ranging only between −10% and 0, when used to differentiate between them.

The values mentioned above determine both the interpretation of ultrasound images, or sonograms, and the design of transducers, as discussed in the sections below.
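At normal incidence Eq. 10 reduces to R = (Z2 − Z1)/(Z2 + Z1), which reproduces the percentages quoted above. The impedance values below are approximate, assumed numbers on the order of those in Jensen's table, chosen to match the quoted coefficients:

```python
def reflection_coeff(z1, z2):
    """Pressure reflection coefficient at normal incidence: Eq. (10)
    with cos(theta_i) = cos(theta_t) = 1."""
    return (z2 - z1) / (z2 + z1)

# Approximate impedances in MRayl (assumed, Jensen-like values)
Z_air, Z_fat, Z_muscle, Z_bone = 0.0004, 1.33, 1.70, 5.0

print(round(100 * reflection_coeff(Z_fat, Z_air), 2))     # -99.94 (fat -> air)
print(round(100 * reflection_coeff(Z_muscle, Z_bone), 2)) # 49.25 (muscle -> bone)
```

The same function applied to two soft tissues (e.g. fat to muscle) gives a coefficient of only about 12%, consistent with the small soft-tissue reflections noted above.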
6.2.1.2 Tissue Scattering
In the previous section, the notions of reflection, transmission and propagation were discussed in the simplistic scenario of plane-wave propagation and its impingement on plane boundaries. In tissues, however, such a situation is rarely encountered. In fact, tissues are constituted by cells and groups of cells that serve as complex boundaries to the propagating wave. As the wave propagates through all these complex structures, reflected and transmitted waves are generated at each one of these interfaces, depending on the local density, compressibility and absorption of the tissue. The groups of cells are called "scatterers," as they scatter acoustic energy. The backscattered field, or what is "scattered back" to the transducer, is used to generate the ultrasound image. In fact, the backscattered echoes are usually coherent and can be used as "signatures" of tissues that are, e.g., in motion or under compression, as applied in elasticity imaging methods.

An example of such an ultrasound image can be seen in Fig. 4. The capsule of the prostate is shown to have a strong echo, mainly due to the high impedance mismatch between the surrounding medium, gel in this case, and the prostate capsule. However, the remaining area of the prostate is depicted as a grainy region surrounding the fluid-filled area of the urethra (the dark, or low-scattering, area in the middle of the prostate). This grainy appearance is
Fig. 4. Sonogram of (A) an in vitro canine prostate and (B) its corresponding
anatomy at the same plane as that scanned.
called "speckle," a term borrowed from the laser literature.⁴ Speckle is produced by the constructive and destructive interference of the scattered signals from structures smaller than the wavelength; hence the appearance of bright and dark echoes, respectively. Thus, speckle does not necessarily relate to a particular structure in the tissue.

Given its statistical significance, in its simplest representation the amplitude of speckle has been represented as having a Gaussian distribution with a certain mean and variance.⁵ In fact, these same parameters have been used to indicate that the signal-to-noise ratio of an ultrasound image is fundamentally limited to only 1.91.⁵ As a result, in the past, several authors have tried different speckle cancellation techniques⁶ in an effort to increase the image quality of diagnostic ultrasound. However, speckle offers one important advantage that has rendered it vital in the current applications of ultrasound (Sec. 5). Despite being described solely by statistics, speckle is not a random signal. As mentioned earlier, speckle is coherent, i.e. it preserves its characteristics when shifting position. Consequently, motion estimation techniques that can determine anything from blood flow to tissue elasticity are made possible in a field that is widely known as "speckle tracking."
6.2.1.3 Attenuation
As the ultrasound wave propagates inside the tissue, it undergoes a loss of power dependent on the distance traveled in the tissue. Attenuation of the ultrasonic signal can be attributed to a variety of factors, such as divergence of the wavefront, reflection at planar interfaces, scattering from irregularities or point scatterers, and absorption of the wave energy.⁷ In this section, we will concentrate on the latter, it being the strongest factor in soft (other than lung) tissues. In this case, the absorption of the wave's energy leads to a heat increase. The actual cause of absorption is still relatively unknown, but simple models have been developed to demonstrate the dependence of the resulting decrease in wave pressure amplitude on the viscosity of tissues.⁸
Without going into detail concerning the derivations of such a relationship, an explanation of the phenomenon is provided here. Let's consider a fluid with a certain viscosity that provides a certain resistance to a wave propagating through its different layers. In order to overcome the resistance, a certain force per unit area, or pressure, needs to be applied that is proportional to the shear viscosity of the fluid η as well as to the spatial gradient of the velocity,⁷ or:

p ∝ η (∂u/∂z).    (14)
Equation 14 shows that a fluid with higher viscosity will require a higher force to experience the same velocity gradient compared to a less viscous fluid. By considering Eqs. 2 and 14, an extra term can be added to the wave equation that includes both the viscosity and the compressibility of the medium,⁷ or:

∂²u/∂z² + ((4η/3 + ξ)/K) ∂³u/(∂z² ∂t) − (1/c²) ∂²u/∂t² = 0,    (15)
where ξ denotes the dynamic coefficient of compressional viscosity. The solution to this equation is given by:

u(t, z) = u0 exp(−αz) exp[jk(ct − z)],    (16)

where α is the attenuation coefficient, given (for α ≪ k) by:

α = (4η/3 + ξ) k² / (2ρc).    (17)

From Eq. 16, the effect of attenuation on the amplitude of the wave is clearly depicted (Fig. 5). An exponential decay of the envelope of the pressure wave, highly dependent on the distance, results from the tissue attenuation. The intensity of the wave will decrease at twice the rate, given that from Eq. 9:

I(t, z) = (p0²/Z) exp(−2αz) exp[2jk(ct − z)],    (18)

or, the average intensity is equal to:

I = I0 exp(−2αz).    (19)
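Equations 16 and 19 give exponential decay of amplitude and intensity. In practice, attenuation is usually quoted in dB/cm; the sketch below converts an assumed coefficient and depth into amplitude and intensity ratios (20·log10 for amplitude, 10·log10 for intensity, so intensity falls twice as fast in dB, as in Eq. 19):

```python
alpha_db = 0.5 * 5.0   # dB/cm: assumed 0.5 dB/cm/MHz coefficient at 5 MHz
z_cm = 4.0             # one-way depth, cm (assumed)

loss_db = alpha_db * z_cm             # one-way amplitude loss in dB
amp_ratio = 10 ** (-loss_db / 20)     # u(z)/u0, i.e. exp(-alpha z) in Eq. (16)
int_ratio = amp_ratio ** 2            # I(z)/I0, i.e. exp(-2 alpha z) in Eq. (19)
print(loss_db, round(amp_ratio, 3), round(int_ratio, 3))  # 10.0 0.316 0.1
```

A 10 dB one-way amplitude loss thus leaves only a tenth of the intensity at 4 cm depth, motivating the time-gain compensation discussed later in the chapter.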
Fig. 5. This is the attenuated wave of Fig. 1. Note that the envelope of the wave is
dependent on the attenuation of the medium.
Another important effect that tissue attenuation can have on the propagating wave is a frequency shift. This is because a more complete form for the attenuation α is:

α = β0 + β1 f,    (20)

where β0 and β1 are the frequency-independent and frequency-dependent attenuation coefficients, respectively. In fact, the frequency-dependent term is the largest source of attenuation and increases linearly with frequency. As a result, the spectrum of the received signal changes as the pulse propagates through the tissue, in such a way that a shift to lower frequencies, or downshift, occurs. In addition, the downshift is dependent on the bandwidth of the pulse propagating in the tissue, and the mean frequency of a spectrum (in this case Gaussian³) can be given by:

f̄ = f0 − (β1 B² f0²) z,    (21)

where f0 and B denote the center frequency and bandwidth of the pulse. Thus, according to Eq. 21, the downshift due to attenuation depends on the tissue's frequency-dependent attenuation coefficient, the pulse center frequency and the bandwidth. A graph showing typical values of frequency-dependent attenuation coefficients (measured in dB/cm/MHz) in biological tissue is given in Fig. 6.
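Equation 21 can be evaluated directly. Note that β1 must be in units consistent with f and z; all of the numbers below (attenuation slope, fractional bandwidth, depth) are assumed values for illustration only, not parameters from the text.

```python
# Mean-frequency downshift of a Gaussian pulse, Eq. (21):
# f_mean = f0 - (beta1 * B**2 * f0**2) * z   (assumed example values)
beta1 = 0.006   # assumed attenuation slope, units consistent with MHz and cm
f0 = 5.0        # MHz, pulse center frequency
B = 0.5         # assumed fractional bandwidth of the Gaussian pulse
z = 10.0        # cm, propagation depth

f_mean = f0 - (beta1 * B**2 * f0**2) * z
print(f_mean)   # downshifted mean frequency in MHz
```

With these numbers the pulse arrives centered near 4.6 MHz rather than 5 MHz, and the downshift grows linearly with depth, so deep echoes are systematically lower in frequency than shallow ones.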
Fig. 6. Attenuation values of certain fluids and soft tissues.⁹
6.3 KEY TOPICS WITH RESULTS AND FINDINGS
6.3.1 Transducers
The pressure wave discussed in the previous section is generated using an ultrasound transducer, which is typically made of a piezoelectric material. "Piezoelectric" denotes the particular property of certain crystals and polymers of transmitting a pressure wave ("piezo" means "to press" in Greek) generated when an electrical potential is applied across the material. Most importantly, since the piezoelectric effect is reversible, i.e. a piezoelectric crystal will convert an impinging pressure wave into an electric potential, the same transducer can also be used as a receiver. Such crystalline or semicrystalline materials include polyvinylidene fluoride (PVDF), quartz, barium titanate and lead zirconate titanate (PZT).
A single-element ultrasound transducer is shown in Fig. 7. Depending upon its thickness (l) and propagation speed (c), the piezoelectric material has a resonance frequency given by:

f0 = c/(2l).    (22)

The speed of sound in the PZT material is around 4 000 m/s, so for a 5 MHz transducer the thickness should be 0.4 mm. The matching layer is usually coated onto the piezoelectric crystal in order to minimize the impedance mismatch between the crystal and the skin surface and thus maximize the transmission coefficient (Eq. 13). In order to overcome the aforementioned impedance mismatch, the ideal impedance Zm and thickness dm of the matching layer are respectively given by:

Zm = √(ZT Z)    (23)

and

dm = λ/4,    (24)

with ZT denoting the transducer impedance and Z the impedance of the medium.
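Equations 22-24 can be combined into a small design sketch. The PZT speed is the round number quoted above; the PZT and tissue impedances and the matching-layer sound speed are assumed, typical-order-of-magnitude values, not figures from the text.

```python
import math

c_pzt = 4000.0    # m/s, speed of sound in PZT (as quoted above)
f0 = 5.0e6        # Hz, desired resonance frequency

l = c_pzt / (2 * f0)        # Eq. (22): crystal thickness
Z_T, Z = 30e6, 1.5e6        # Rayl: assumed PZT and soft-tissue impedances
Z_m = math.sqrt(Z_T * Z)    # Eq. (23): ideal matching-layer impedance

c_m = 2500.0                # m/s, assumed speed in the matching-layer material
d_m = (c_m / f0) / 4        # Eq. (24): quarter-wave thickness at f0
print(round(l * 1e3, 3), round(Z_m / 1e6, 2), round(d_m * 1e6, 1))
```

The 0.4 mm crystal thickness matches the worked number in the text; the geometric-mean impedance (here about 6.7 MRayl) is what lets the quarter-wave layer cancel the crystal-skin mismatch at resonance.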
The backing layers behind the piezoelectric crystal are used in order to increase the bandwidth and the energy output. If the backing layer contains air, then the air–crystal interface yields a maximum reflection coefficient, given the high impedance mismatch. Another byproduct of an air-backed crystal element is that the crystal remains relatively undamped, i.e. the signal transmitted will have a low bandwidth and a longer duration. On the other hand, the axial resolution of the transducer depends on the duration, or pulse width, of the transmitted signal. As a result, there is a tradeoff between transmitted power and resolution in an ultrasound system. Depending on the application, different backing layers are therefore used. Air-backed transducers are used in continuous-wave and ultrasound therapy applications. Heavily backed transducers are utilized in order to obtain high resolution, e.g. for high-quality imaging, at the expense of lower sensitivity and reduced
Fig. 7. Typical construction of a single-element transducer.³
penetration. Coded-excitation techniques have recently been successfully applied to circumvent such tradeoffs.
For imaging purposes, an assembly of elements such as the one in Fig. 7, called an "array," is usually used. In an array, the elements are stacked next to each other at a distance of less than a wavelength for minimum interference and reduced grating lobes. The linear array has the simplest geometry. It selects the region of interest by firing the elements above that region. The beam can then be moved along a line by firing groups of adjacent elements, and the rectangular image obtained is formed by combining the signals received by all the elements. A curved array is used when the transducer is smaller than the area scanned. A phased array can be used to change the "phase," or delay, between the fired elements and thus achieve steering of the beam. The phased array is usually the choice for cardiovascular exams, where the window between the ribs allows a very small transducer to image the whole heart. Focusing and steering can both be achieved by modifying the profile of firing delays between elements (Fig. 8).
6.3.2 Ultrasonic Instrumentation
Figure 9 shows a block diagram of the different steps that are used in order to acquire, process and display the received signal from the tissue.
Fig. 8. Electronic (A) beam forming, (B) focusing and (C) focusing and beam steering, as achieved in phased arrays. The time delay between the firings of different elements is denoted here by τ.
Fig. 9. Block diagram of a pulsed-wave system and the resulting signal or image at three different steps.
6.3.2.1 Transducer Frequency
In ultrasound imaging, a pulse of a given duration, frequency and bandwidth is first transmitted. As mentioned before, a tradeoff between penetration (or low attenuation) and resolution exists. Therefore, the chosen frequency will depend on the application. Usually, for deeper organs, such as the heart, the uterus and the liver, the frequencies are restricted to the range of 3 MHz–5 MHz, while for more superficial structures, such as the thyroid, the breast, the testis and applications on infants, a wider range of 4 MHz–10 MHz is applied. Finally, for ocular applications, a range of 7 MHz–25 MHz is determined by the low attenuation, low depth and higher resolution required.

The pulse is usually a few cycles of that frequency long (usually 3–4 cycles), so as to ensure high resolution, and is generated by the transmitter through a voltage step or sinusoidal function at a voltage amplitude of 100 V–500 V and a frequency equal to the resonance frequency of the transducer elements. For static structures, a single pulse or multiple pulses (usually used for averaging later) can be used at an arbitrary repetition frequency. However, for moving structures, such as blood, the liver and the heart, a fundamental limit on the maximum pulse repetition frequency (PRF) is set by the maximum depth of the structure: PRF = c/(2Dmax). Typically, the PRF is in the range of 1 kHz–3 kHz.
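The PRF bound follows from waiting for the deepest echo to return before firing again. The imaging depth below is an assumed example:

```python
c = 1540.0       # m/s, average soft-tissue speed of sound
d_max = 0.15     # m, assumed maximum imaging depth (15 cm)

t_round_trip = 2 * d_max / c     # time for the deepest echo to return
prf_max = 1.0 / t_round_trip     # Hz, maximum pulse repetition frequency
print(round(prf_max))  # ~5133 Hz; practical 1-3 kHz PRFs sit below this
```

Practical systems stay well under the bound to leave margin for reverberation die-down and processing.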
6.3.2.2 RF Ampliﬁer
The received signal needs to be amplified initially so as to guarantee a good signal-to-noise ratio. At the same time, the input of the amplifier should be isolated from the high-voltage transmit pulse in order to protect the circuits while maintaining low noise and high gain. A typical dynamic range expected at the output is on the order of 70 dB–80 dB.
6.3.2.3 Time-Gain Compensation (TGC)
As indicated above, attenuation is unavoidable as the wave travels through the medium, and it increases with depth. In order to
avoid the resulting artificial darkening of deeper structures, a voltage-controlled attenuator is usually employed, in which a control voltage is utilized to manually adjust the system gain after reception of an initial scan. A logarithmic voltage ramp is usually applied that compensates for a mean attenuation level with depth.⁶ The dynamic range is thereby further reduced to 40 dB–50 dB.
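Since the round-trip attenuation in dB grows linearly with depth, the compensating gain is a linear ramp in dB (logarithmic in voltage). The attenuation coefficient and frequency below are assumed example values:

```python
alpha_db = 0.5 * 3.5   # dB/cm one-way: assumed 0.5 dB/cm/MHz at 3.5 MHz

def tgc_gain_db(z_cm):
    """Gain (dB) applied to echoes from depth z_cm: compensates the
    round-trip (factor 2) attenuation with a linear dB ramp."""
    return 2 * alpha_db * z_cm

print([tgc_gain_db(z) for z in (0, 5, 10)])  # [0.0, 17.5, 35.0]
```

A 35 dB swing over 10 cm illustrates why uncompensated deep echoes would be invisible next to shallow ones on a 20 dB–30 dB display.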
6.3.2.4 Compression Ampliﬁer
The signals will ultimately be displayed in greyscale on a cathode ray tube (CRT), whose dynamic range is typically only 20 dB–30 dB. For this purpose, an amplifier with a logarithmic response is utilized.
6.3.3 Ultrasonic Imaging
Ultrasonic imaging is usually known as echography or sonography, depending on which side of the Atlantic Ocean one is scanning from. As mentioned earlier, the signal acquired by the scanner can be processed and displayed in several different fashions. In this section, the most typical and routinely used ones are discussed.
6.3.3.1 A-Mode
Since the image is a grayscale picture, the amplitude of the signal is displayed. For this, the envelope of the RF signal needs to be calculated. This is, for example, achieved by applying the Hilbert transform. The resulting signal is called a detected A-scan, A-line or A-mode scan (A for Amplitude). An example is shown in Fig. 8.
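A minimal A-mode detection sketch using SciPy's Hilbert transform; the Gaussian-windowed 5 MHz pulse and the 50 MHz sampling rate are illustrative assumptions:

```python
import numpy as np
from scipy.signal import hilbert

fs = 50e6                                   # sampling rate (50 MHz), assumed
t = np.arange(0, 4e-6, 1 / fs)              # 4 microseconds of RF data
# Simulated RF echo: a Gaussian-windowed 5 MHz pulse centred at t = 2 us
rf = np.exp(-((t - 2e-6) ** 2) / (2 * (0.2e-6) ** 2)) * np.cos(2 * np.pi * 5e6 * t)
# Detected A-line: magnitude of the analytic signal (Hilbert transform)
envelope = np.abs(hilbert(rf))
```

The envelope peaks where the echo amplitude is largest, independent of the oscillations of the carrier.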
6.3.3.2 B-Mode
When the received A-scans are spatially combined after acquisition using either a mechanically moved transducer or the previously mentioned arrays, and used to brightness-modulate the display in a 2D format, the brightness or B-mode is created, which has a true image format and is by far the most widely used diagnostic ultrasound mode. By default, sonogram or echogram refers to B-mode. Figure 10 shows a longitudinal B-mode image of an abdominal aorta.

Fig. 10. Top: B-scan of an abdominal aorta in a mouse at 30 MHz; Bottom: M-mode image over several cardiac cycles taken along the dashed line in the B-scan.
One of the biggest advantages of ultrasound scanning is real-time scanning, and this is achieved thanks to the shallow depth of scanning in most tissues and the high speed of sound. The frame rate is usually on the order of 30 Hz–100 Hz (while in the M-mode version it can be as fast as the PRF itself, see below). The frame rate is limited by the number of A-mode scans acquired, $N_A$, and the maximum depth; i.e. the maximum frame rate is given by $\mathrm{PRF}_F = c/(2D_{\max}N_A)$.
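The depth/line-count/frame-rate tradeoff can be sketched numerically; the function is a hypothetical helper and the soft-tissue sound speed is an assumption:

```python
def max_frame_rate(depth_m, n_lines, c=1540.0):
    """Upper bound on B-mode frame rate (Hz).

    Each of the n_lines A-scans must wait one round trip to the maximum
    depth before the next line is fired: F_max = c / (2 * depth * n_lines).
    """
    return c / (2.0 * depth_m * n_lines)

# 128 lines at 15 cm depth gives roughly 40 frames per second, which is
# why deeper views or denser line counts force the frame rate down.
```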
6.3.3.3 M-Mode
Another way of displaying the A-scans is as a function of time, especially in cases where tissue motion needs to be monitored and analyzed, such as the case of heart valves or other cardiac structures. In the case of Fig. 10, only one A-scan from a particular tissue structure is displayed in brightness mode and followed in time, called motion, or M-mode scan. A depth-time display is then generated. A typical application of the M-mode display is the examination of heart valve leaflet motion and Doppler displays.
6.4 DISCUSSION
One of the main problems with the standard use of ultrasound arises from high attenuation in some tissues, and especially in small vessels and blood cavities. In order to overcome this limitation, contrast agents are routinely used. Contrast agents are typically microspheres of encapsulated gas or liquid coated by a shell, usually albumin. Due to the high impedance mismatch created by the gas or liquid contained, the resulting backscatter generated by the contrast agents is a lot higher than that of the blood echoes.
An alternative method to generating higher backscatter due to the increased impedance mismatch is based on the harmonics generated by the bubble's interaction with the ultrasonic wave. The bubble vibration also generates harmonics above and below the fundamental frequency, with the second harmonic possibly exceeding the first harmonic. In other words, the contrast agent introduces nonlinear backscattering properties into the medium where it lies.

Several processes of filtering out undesired echoes from stationary media surrounding the region where flow characteristics are assessed result in weakening of the overall signal at the fundamental frequency. Therefore, since residual harmonics will result from moving scatterers, motion characteristics can all be obtained from the higher harmonic echoes, after using a highpass filter and filtering out the fundamental frequency spectrum that also contains the undesired stationary echoes. Another method for distilling the harmonic echo information is the more widely used phase or pulse inversion method, in which two pulses (instead of one) are sequentially transmitted with their phases reversed. Upon reception, the echoes resulting from the two pulses are then added and only the higher harmonics remain.
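The cancellation at the heart of pulse inversion can be seen in a toy model; the quadratic echo model and its coefficients below are assumptions made purely for illustration:

```python
import numpy as np

a1, a2 = 1.0, 0.1                      # assumed linear and quadratic coefficients
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
p = np.sin(2 * np.pi * 5 * t)          # transmitted pulse (5 cycles)

def echo(x):
    """Toy nonlinear scatterer: a weak quadratic term models the bubble."""
    return a1 * x + a2 * x ** 2

# Transmit p and its phase-inverted copy -p, then sum the received echoes:
summed = echo(p) + echo(-p)
# The linear (fundamental) parts cancel; what remains is 2*a2*p**2, whose
# energy sits at twice the transmit frequency (the second harmonic).
```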
Despite the fact that the idea of contrast agent use originated for blood flow measurements, the same type of approach can be applied in the case of soft tissues as well. After being injected into the bloodstream, the contrast agents can also appear and remain in the tissues and offer the same advantages of motion detection and characterization as in the case of blood flow. However, it turns out that contrast agents are not always needed for imaging of tissues at higher harmonics, especially since tissue scattering can be up to two orders of magnitude higher than blood scattering. The nonlinear wave characteristic of the tissues themselves is thus sufficient to allow imaging of tissues, despite the resulting higher attenuation at those frequencies. The avoidance of patient discomfort following contrast agent injection is one of the major advantages of this approach in tissues. Imaging using the harmonic approach (whether with or without contrast agents) is generally known as harmonic imaging. Compared to the standard approach, harmonic imaging in tissues offers the ability to distinguish between noise and fluid-filled structures, e.g. cysts and the gall bladder. In addition, harmonic imaging allows for better edge definition in structures and, thus, is generally known to increase image clarity, mainly due to the much smaller influence of the transmitted pulse on the received spectrum. Harmonic imaging is now available in most commercially available ultrasound systems. One of the main requirements for harmonic imaging is a large bandwidth of the transducer on receive, so as to allow reception of the higher frequency components. This is in very good agreement with the higher resolution requirement for diagnostic imaging.
Another field that has emerged out of ultrasonic imaging in the past decade is elasticity imaging. Its premise is built on two proven facts: (1) that significant differences between the mechanical properties of several tissue components exist; and (2) that the information contained in the coherent scattering, or speckle, is sufficient to depict these differences following an external or internal mechanical stimulus. For example, in the breast, not only is the hardness of fat different from that of glandular tissue but, most importantly, the hardness of normal glandular tissue differs from that of tumorous tissue (benign or malignant) by up to one order of magnitude. This is also the reason why palpation has been proven a crucial tool in the detection of cancer.
The second observation is based on the fact that coherent echoes can be tracked while or after the tissue in question undergoes motion and/or deformation caused by the mechanical stimulus, e.g. an external vibration or a quasistatic compression in a method called Elastography. Speckle tracking techniques are also employed here for the motion estimation. In fact, Doppler techniques, such as those used for blood velocity estimation, were initially applied in order to track motion during vibration (Sonoelasticity imaging or Sonoelastography). Parameters, such as velocity and strain, are estimated and imaged in conjunction with the mechanical property of the underlying tissue: the higher the estimated velocity or strain, the softer the material, and vice versa. Numerous applications ranging from the breast to the thyroid and the heart have been implemented in clinical settings.
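A one-dimensional speckle-tracking sketch: the displacement between pre- and post-motion RF windows is estimated from the location of the cross-correlation peak. The synthetic signals and the rigid 7-sample shift are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
pre = rng.standard_normal(256)       # speckle-like RF window before motion
true_shift = 7                       # axial displacement, in samples
post = np.roll(pre, true_shift)      # rigidly shifted copy after motion

# Locate the cross-correlation peak to estimate the displacement
xcorr = np.correlate(post, pre, mode="full")
est_shift = int(np.argmax(xcorr)) - (len(pre) - 1)
```

In elastography this estimate, repeated window by window along depth, yields a displacement profile whose axial gradient gives the strain image.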
6.5 CONCLUDING REMARKS
Despite the fact that diagnostic ultrasound is an older imaging modality compared to MRI and PET, it is very intriguing to see that it continues to expand as a field, offering numerous and diverse applications. In this chapter, we have described some of the fundamental aspects of ultrasound physics and ultrasonic imaging, as well as referred to examples of more recent methods and applications.
References
1. Wells PNT, Biomedical Ultrasonics, Medical Physics Series, Academic Press, London, 1977.
2. Kinsler LE, Frey AR, Fundamentals of Acoustics, 2nd edn., John Wiley & Sons, NY, 1962.
3. Jensen JA, Estimation of Blood Velocities Using Ultrasound, Cambridge University Press, Cambridge, UK, 1996.
4. Burckhardt CB, Speckle in ultrasound B-mode scans, IEEE Trans on Son and Ultras SU-25: 1–6, 1978.
5. Wagner RF, Smith SW, Sandrik JM, Lopez H, Statistics of speckle in ultrasound B-scans, IEEE Trans on Son and Ultras 30: 156–163, 1983.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch06 FA
Principles of Ultrasound Imaging Modalities
149
6. Bamber JC, Tristam M, in Webb S (ed.), Diagnostic Ultrasound, IOP Publishing Ltd., pp. 319–386, 1988.
7. Christensen PA, Ultrasonic Bioinstrumentation, 1st edn., John Wiley & Sons, 1988.
8. Morse, Ingard, Theoretical Acoustics, McGraw-Hill, New York, 1968.
9. Haney MJ, O'Brien Jr WD, Temperature dependence of ultrasonic propagation in biological materials, in Greenleaf JF (ed.), Tissue Characterization with Ultrasound, CRC Press, Boca Raton, FL, pp. 15–55, 1986.
CHAPTER 7
Principles of Image Reconstruction Methods
Atam P Dhawan
Multidimensional medical imaging in most radiological applications involves three major tasks: (1) raw data acquisition using imaging instrumentation; (2) image reconstruction from the raw data; and (3) image display and processing operations as needed. Image reconstruction in multidimensional space is generally an ill-posed problem, where a unique solution representing an ideal reconstruction of the true object from the acquired raw data may not be possible due to limitations on data acquisition. However, using specific filtering operations on the acquired raw data along with appropriate assumptions and constraints in the reconstruction methods, a feasible solution for image reconstruction can be obtained. The Radon transform has been most extensively used in image reconstruction from acquired projection data in medical imaging applications such as X-ray computed tomography. The Fourier transform is directly applied to the raw data for reconstructing images in medical imaging applications, such as magnetic resonance imaging (MRI), where the raw data is acquired in the frequency domain. Statistical estimation and optimization methods often show advantages in obtaining better results in image reconstruction dealing with the ill-posed problems of imaging. This chapter describes principles of image reconstruction in multidimensional space from raw data using basic transform and estimation methods.
7.1 INTRODUCTION
Diagnostic radiology has evolved into multidimensional imaging in the second half of the twentieth century in terms of X-ray
computed tomography (CT), nuclear magnetic resonance imaging (NMRI/MRI), nuclear medicine (single photon emission computed tomography (SPECT) and positron emission tomography (PET)), ultrasound computed tomography, and optical tomographic imaging. The foundation of these and many other multidimensional tomographic imaging techniques started from a basic theory of image reconstruction from projections that was first published by J Radon in 1917¹ and later explored by a number of researchers, including Cramer and Wold,² Renyi,³ Gilbert,⁴ Bracewell,⁵ Cormack⁶ and Hounsfield,⁷,⁸ and many others, for imaging applications in many areas including medicine, astronomy, microscopy and geophysics.⁹⁻¹¹ The implementation of the Radon transform for reconstructing medical images from the data collected from imaging instrumentation was only realized in the 1960s. Cormack in 1963⁶ showed the radiological applications of Radon's work for image reconstruction from projections using a set of measurements defining line integrals. In 1972, GN Hounsfield developed the first commercial X-ray computed tomography (CT) scanner that used a computerized image reconstruction algorithm based on the Radon transform. GN Hounsfield and AM Cormack jointly received the 1979 Nobel Prize for their contributions to the development of computerized tomography for radiological applications.⁶⁻⁸
Image reconstruction algorithms have been continuously developed to reconstruct the true structural characteristics, such as shape and density, of an object in the image. Image reconstruction from projections or data collected from a scanner is an ill-posed problem because of the finite amount of data used to reconstruct the characteristics of the object. Furthermore, the acquired data is severely degraded because of occlusion, detector noise, radiation scattering and inhomogeneities of the medium.

The classical image reconstruction from projections method based on the Radon transform is popularly known as the "backprojection" method. The backprojection method has been modified to incorporate specific data collection schemes and to improve quality. Fourier transform and iterative series expansion based methods have also been developed for reconstructing images from projections. With
the fast developments in computer technology, advanced image reconstruction algorithms using statistical and estimation methods were developed and implemented for several medical imaging modalities.
7.2 RADON TRANSFORM
The Radon transform first defines ray or line integrals to form projections from an unknown object, and then uses an infinite number of projections to reconstruct an image of the object. It should be noted that, though the early evolution in computed tomography was based on image reconstruction using parallel-beam geometry for data acquisition, more sophisticated geometrical configurations and scanning instrumentation are used today for faster data collection and image reconstruction. New computed tomography (CT) image scanners (often called fourth-generation CT scanners) utilize a cone beam of X-ray radiation and multiple rings of detectors for fast 3D multislice scanning. Also, the basic Radon transform that established the foundation of image reconstruction from projections has been extended to a spectrum of exciting applications of image reconstruction in multidimensional space using a variety of imaging modalities. However, the discussion in this chapter is focused on the two-dimensional representation of the Radon transform only, for image reconstruction from projections obtained through a parallel-beam scanning geometry in computed tomography.
Let us define a two-dimensional object function f(x, y) and its Radon transform by R{f(x, y)}. Let us use the rectangular coordinate system (x, y) in the spatial domain. The Radon transform is defined by the line integral along the path L such that:

$$R\{f(x, y)\} = J_\theta(p) = \int_L f(x, y)\,dl, \qquad (1)$$
where the projection J_θ(p), acquired at angle θ in the polar coordinate system, is a one-dimensional symmetric and periodic function with a period of 2π. The polar coordinate system (p, θ) can be expressed in rectangular coordinates in the Radon domain by using a rotated coordinate system (p, q), obtained by rotating the (x, y) coordinate system (Fig. 1) by an angle θ as:

$$\begin{aligned} x\cos\theta + y\sin\theta &= p,\\ -x\sin\theta + y\cos\theta &= q. \end{aligned} \qquad (2)$$

Fig. 1. Line integral projection J_θ(p) of a two-dimensional object f(x, y) at an angle θ.
A set of line integrals or projections can be obtained for different angles θ as:

$$R\{f(x, y)\} = J_\theta(p) = \int_{-\infty}^{\infty} f(p\cos\theta - q\sin\theta,\; p\sin\theta + q\cos\theta)\,dq. \qquad (3)$$

A higher-dimensional Radon transform can be defined in a similar way. For example, the projection space for a three-dimensional Radon transform would be defined by 2D planes instead of lines.
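Equation (3) can be approximated on a discrete grid by rotating the image so the rays align with the grid columns and then summing each column; a sketch assuming parallel-beam geometry and SciPy's bilinear rotation:

```python
import numpy as np
from scipy.ndimage import rotate

def radon_projection(image, theta_deg):
    """Discrete parallel-beam projection J_theta(p) of a 2D image.

    Rotating the image by -theta aligns the rays with the grid columns,
    so the line integrals of Eq. (3) reduce to column sums.
    """
    return rotate(image, -theta_deg, reshape=False, order=1).sum(axis=0)
```

Since rotation only resamples the image, the total "mass" of a projection of a centred object stays close to the integral of the object at every angle, which is a quick sanity check on the implementation.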
The significance of using the Radon transform for computing projections in medical imaging is that an image of a human organ can be reconstructed by backprojecting the projections acquired through the imaging scanner. Figure 2 shows an illustration of the backprojection method for image reconstruction using projections.
Fig. 2. A schematic diagram for reconstructing images from projections. Three projections are backprojected to reconstruct objects A and B.
Three simulated projections of two objects A and B are backprojected into the reconstruction space. Each projection has two segments of values corresponding to the objects A and B. When the projections are backprojected, the areas of higher values resulting from the intersection of the backprojected projection data represent the two reconstructed objects. It should be noted that the reconstructed objects may have geometrical or aliasing artifacts because of the limited number of projections used in the imaging and reconstruction processes. In the early development of the first and second generations of CT scanners, only parallel-beam scanning geometry was used for direct implementation of the Radon transform for image reconstruction from projections. To improve the geometrical shape and accuracy of the reconstructed objects, a large number of projections is needed, and these must be acquired in a fast and efficient way. Today, fourth-generation CT scanners utilize a cone beam of X-ray radiation and multiple rings of detectors for fast 3D multislice scanning. More advanced imaging protocols, such as spiral CT, use even faster scanning and data manipulation techniques. Figure 3 shows a fourth-generation X-ray CT scanner geometry used to obtain projections with a divergent cone-beam X-ray source that is rotated to produce multiple projections at various angles for multislice 3D scanning. Modern CT scanners are used in many biomedical, industrial, and other commercial applications using a large spectrum of imaging modalities in multidimensional image reconstruction space.

Fig. 3. An advanced X-ray CT scanner geometry with rotating source and ring of detectors.
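The smearing idea of Fig. 2 can be sketched as a naive, unfiltered backprojection (parallel-beam geometry and SciPy's rotation are assumed; a practical reconstruction would filter the projections first, as discussed in Sec. 7.3):

```python
import numpy as np
from scipy.ndimage import rotate

def backproject(projections, angles_deg, size):
    """Naive (unfiltered) backprojection onto a size x size grid.

    Each 1D projection is smeared uniformly along its viewing direction,
    and the rotated smears are summed over all angles.
    """
    recon = np.zeros((size, size))
    for proj, ang in zip(projections, angles_deg):
        smear = np.tile(proj, (size, 1))    # constant along the ray direction
        recon += rotate(smear, ang, reshape=False, order=1)
    return recon / len(angles_deg)
```

For a point object, the backprojected smears all intersect at the object's location, so the reconstruction peaks there, blurred by the limited number of views.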
To establish a fundamental understanding of the Radon transform and image reconstruction from projections, only the 2D representation of the Radon transform, with image reconstruction from projections defined through a parallel-beam scanning geometry, is discussed below.
7.2.1 Reconstruction with Fourier Transform
The projection theorem, also called the central slice theorem, provides a relationship between the Fourier transform of the object function and the Fourier transform of its Radon transform, or projection.
The Fourier transform of the Radon transform of the object function f(x, y) can be written as¹,⁹⁻¹³:

$$F\{R\{f(x, y)\}\} = F\{J_\theta(p)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(p\cos\theta - q\sin\theta,\; p\sin\theta + q\cos\theta)\,e^{-j2\pi\omega p}\,dq\,dp, \qquad (4)$$

where ω represents the frequency component in the Fourier domain. The Fourier transform S_θ(ω) of the projection J_θ(p) can also be expressed as:

$$S_\theta(\omega) = \int_{-\infty}^{\infty} J_\theta(p)\,e^{-j2\pi\omega p}\,dp. \qquad (5)$$
From Eqs. (4) and (5), the Fourier transform of the Radon transform of the object function can be written as:

$$S_\theta(\omega) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,e^{-j2\pi\omega(x\cos\theta + y\sin\theta)}\,dx\,dy = F(\omega, \theta). \qquad (6)$$

Equation (6) can be considered as the two-dimensional Fourier transform of the object function f(x, y) and can be represented as F(u, v) with:

$$u = \omega\cos\theta, \qquad v = \omega\sin\theta, \qquad (7)$$

where u and v represent the frequency components along the x- and y-directions in a rectangular coordinate system.
It should be noted that S_θ(ω) represents the Fourier transform of the projection J_θ(p) that is taken at an angle θ in the space domain with a rotated coordinate system (p, q). The frequency spectrum S_θ(ω) is placed along a line or slice at an angle θ in the frequency domain of F(u, v).

If several projections are obtained using different values of the angle θ, their Fourier transforms can be computed and placed along the respective radial lines in the frequency domain of the Fourier transform F(u, v) of the object function f(x, y). Additional projections acquired in the space domain provide more spectral information in the frequency domain, eventually filling the entire frequency domain. The object function can then be reconstructed using the two-dimensional inverse Fourier transform of the spectrum F(u, v).
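The central slice theorem is easy to verify numerically for the simplest slice (θ = 0): the 1D Fourier transform of the projection obtained by summing the object along y must equal the corresponding central line of the 2D Fourier transform. The random test image below is an arbitrary assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((32, 32))              # arbitrary object function f(x, y)

projection = f.sum(axis=0)            # line integrals along y (theta = 0)
slice_1d = np.fft.fft(projection)     # F{J_theta(p)} for theta = 0
F2 = np.fft.fft2(f)                   # full 2D spectrum F(u, v)
# F2[0, :] is the central line of the 2D spectrum and equals slice_1d
```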
7.2.2 Reconstruction using Inverse Radon Transform
The forward Radon transform is used to obtain projections of an object function at different viewing angles. Using the central slice theorem, an object function can be reconstructed by taking the inverse Fourier transform of the spectral information in the frequency domain that is assembled with the Fourier transforms of the individual projections. Thus, the reconstructed object function $\hat{f}(x, y)$ can be obtained by taking the two-dimensional inverse Fourier transform of F(u, v) as:

$$\hat{f}(x, y) = F^{-1}\{F(u, v)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)\,e^{j2\pi(ux + vy)}\,du\,dv. \qquad (8)$$
With the change of variables

$$u = \omega\cos\theta, \qquad v = \omega\sin\theta,$$

Eq. (8) can be rewritten as:

$$\hat{f}(x, y) = \int_0^{\pi}\int_{-\infty}^{\infty} F(\omega, \theta)\,e^{j2\pi\omega(x\cos\theta + y\sin\theta)}\,|\omega|\,d\omega\,d\theta. \qquad (9)$$
In Eq. (9), the frequency variable |ω| appears because of the Jacobian of the change of variables. Replacing F(ω, θ) with S_θ(ω), the reconstructed image $\hat{f}(x, y)$ can be expressed as the backprojected integral (sum) of the modified projections J*_θ(p) as:

$$\hat{f}(x, y) = \int_0^{\pi}\int_{-\infty}^{\infty} |\omega|\,S_\theta(\omega)\,e^{j2\pi\omega(x\cos\theta + y\sin\theta)}\,d\omega\,d\theta = \int_0^{\pi}\int_{-\infty}^{\infty} |\omega|\,S_\theta(\omega)\,e^{j2\pi\omega p}\,d\omega\,d\theta = \int_0^{\pi} J^*_\theta(p)\,d\theta,$$

where

$$J^*_\theta(p) = \int_{-\infty}^{\infty} |\omega|\,S_\theta(\omega)\,e^{j2\pi\omega p}\,d\omega. \qquad (10)$$
7.3 BACKPROJECTION METHOD FOR IMAGE RECONSTRUCTION
The classical image reconstruction from projections method based on the Radon transform is popularly known as the backprojection method. The backprojection method has been modified by a number of investigators to incorporate specific data collection schemes and to improve the quality of reconstructed images.

Though the object function can be reconstructed using the inverse Fourier transform of the spectral information of the frequency domain F(u, v) obtained using the central slice theorem, an easier implementation of Eq. (10) can be obtained through its realization with the modified projections J*_θ(p). This realization leads to the convolution backprojection, also known as the filtered backprojection method, for image reconstruction from projections.
The modified projection J*_θ(p) can be expressed as a convolution:

$$J^*_\theta(p) = \int_{-\infty}^{\infty} |\omega|\,S_\theta(\omega)\,e^{j2\pi\omega p}\,d\omega = F^{-1}\{|\omega|\,S_\theta(\omega)\} = F^{-1}\{|\omega|\} \otimes J_\theta(p), \qquad (11)$$

where ⊗ represents the convolution operator.
Equation (11) presents some interesting challenges for implementation. The integration over the spatial frequency variable ω should be carried out from −∞ to ∞. In practice, however, the projections are considered to be band-limited: any spectral energy beyond a spatial frequency, say Ω, must be ignored. Using Eqs. (10) and (11), it can be shown that the reconstructed image $\hat{f}(x, y)$ can be computed as:

$$\hat{f}(x, y) = \frac{1}{\pi}\int_0^{\pi} d\theta \int_{-\infty}^{\infty} dp'\; J_\theta(p')\,h(p - p'), \qquad (12)$$

where h(p) is a filter function that is convolved with the projection function.
Ramachandran and Lakshminarayanan⁹ computed the filter function h(p) strictly from Eq. (11) in the Fourier domain as:

$$H_{R\text{-}L}(\omega) = \begin{cases} |\omega| & \text{if } |\omega| \le \Omega,\\ 0 & \text{otherwise,} \end{cases} \qquad (13)$$

where H_{R-L}(ω) is the Fourier transform of the band-limited filter kernel function h_{R-L}(p) in the spatial domain.
In general, a band-limited filter function H(ω) in the frequency domain (Fig. 4) can be expressed as:

$$H(\omega) = |\omega|\,B(\omega),$$

where B(ω) denotes the band-limiting function:

$$B(\omega) = \begin{cases} 1 & \text{if } |\omega| \le \Omega,\\ 0 & \text{otherwise.} \end{cases} \qquad (14)$$
For the convolution operation with the projection function in the spatial domain (Eqs. (10) and (11)), the filter kernel function h(p) can be obtained from H(ω) by taking the inverse Fourier transform as:

$$h(p) = \int_{-\infty}^{\infty} H(\omega)\,e^{j2\pi\omega p}\,d\omega. \qquad (15)$$
If the projections are sampled with a sampling interval of τ, the projections can be represented as J_θ(kτ), where k is an integer.

Fig. 4. A band-limited filter function H(ω).

Using the
sampling theorem and the band-limited constraint, all spatial frequency components beyond Ω are ignored such that:

$$\Omega = \frac{1}{2\tau}. \qquad (16)$$

For the band-limited projections with a sampling interval of τ, Eq. (15) can be expressed with some simplification as:

$$h(p) = \frac{1}{2\tau^2}\,\frac{\sin(\pi p/\tau)}{\pi p/\tau} - \frac{1}{4\tau^2}\left[\frac{\sin(\pi p/2\tau)}{\pi p/2\tau}\right]^2. \qquad (17)$$
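The discrete kernel of Eq. (17) can be tabulated directly; the helper below is a sketch (`np.sinc(x)` computes sin(πx)/(πx), matching the expression):

```python
import numpy as np

def ram_lak_kernel(n_taps, tau):
    """Discrete Ram-Lak convolution kernel h(k*tau) from Eq. (17).

    n_taps is assumed odd so the kernel is centred at p = 0. The kernel
    equals 1/(4 tau^2) at the centre, zero at even taps, and
    -1/(pi^2 k^2 tau^2) at odd taps k.
    """
    k = np.arange(n_taps) - n_taps // 2
    p = k * tau
    return (np.sinc(p / tau) / (2 * tau ** 2)
            - np.sinc(p / (2 * tau)) ** 2 / (4 * tau ** 2))
```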
Thus the modified projection J*_θ(p) and the reconstructed image can be computed as:

$$J^*_\theta(p) = \int_{-\infty}^{\infty} J_\theta(p')\,h(p - p')\,dp',$$

$$\hat{f}(x, y) = \frac{\pi}{L}\sum_{i=1}^{L} J^*_{\theta_i}(p), \qquad (18)$$

where L is the total number of projections acquired during the imaging process at viewing angles θ_i, for i = 1, ..., L.
The quality of the reconstructed image depends heavily on the number of projections and on the spatial sampling interval of the acquired projections. For better quality images to be reconstructed, it is essential to acquire a large number of projections covering the entire range of viewing angles around the object. Higher resolution images with fine details can only be reconstructed if the projections are acquired with a high spatial sampling rate satisfying the basic principle of the sampling theorem. If the raw projection data is acquired at a sampling rate lower than the Nyquist rate, aliasing artifacts occur in the reconstructed image because of the overlapping spectra in the frequency domain. The fine details in the reconstructed image represent high-frequency components.

The maximum frequency component that can be reconstructed in the image is thus limited by the detector size and the scanning procedure used in the acquisition of raw projection data. To reconstruct images of higher resolution and quality, the detector size should be small. On the other hand, the projection data may suffer from poor
signal-to-noise ratio if an insufficient number of photons is collected by the detector due to its smaller size.
Several variations in the design of the filter function H(ω) have been investigated in the literature. The acquired projection data is discrete in the spatial domain. To implement the convolution backprojection method in the spatial domain, the filter function has to be realized in discrete form. The major problem of the Ramachandran-Lakshminarayanan filter⁹ is that it has sharp cut-offs in the frequency domain at ω = 1/2τ and ω = −1/2τ, as shown in Fig. 4. These sharp cut-offs produce sinc functions for the filter in the spatial domain, as shown in Eq. (17), causing modulated ringing artifacts in the reconstructed image. To avoid such artifacts, the filter function must have smooth cut-offs, such as those obtained from a Hamming window function. A band-limited generalized Hamming window can be represented as:

$$H_{\mathrm{Hamming}}(\omega) = |\omega|\,[\alpha + (1 - \alpha)\cos(2\pi\omega\tau)]\,B(\omega), \qquad 0 \le \alpha \le 1, \qquad (19)$$

where the parameter α can be adjusted to provide the appropriate characteristic shape of the function.
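A sketch of the band-limited Hamming-weighted ramp of Eq. (19); the uniform frequency sampling below is an assumption, and α = 1 recovers the plain Ram-Lak ramp:

```python
import numpy as np

def hamming_filter(n_freqs, tau, alpha=0.54):
    """Band-limited Hamming-weighted ramp filter H(omega) per Eq. (19).

    Frequencies are sampled uniformly on [-1/(2 tau), 1/(2 tau)]; the
    cosine term tapers the ramp toward the cut-off, suppressing the sharp
    transition that causes ringing with the plain Ram-Lak filter.
    """
    omega = np.linspace(-1.0 / (2 * tau), 1.0 / (2 * tau), n_freqs)
    return np.abs(omega) * (alpha + (1 - alpha) * np.cos(2 * np.pi * omega * tau))
```

At the cut-off the plain ramp (α = 1) reaches 1/(2τ), while the Hamming-weighted version is reduced by a factor 2α − 1, i.e. to 8% of that value for α = 0.54.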
Fig. 5. A Hamming window based filter kernel function in the frequency domain.

The Hamming window based filter kernel function provides smoother cut-offs, as shown in Fig. 5. The Hamming window based convolution function provides a smoother function in the spatial
domain that reduces the ringing artifacts and improves the signal-to-noise ratio in the reconstructed image. Other smoothing functions can be used for reducing ringing artifacts and improving the quality of the reconstructed image.¹²⁻¹³

Fig. 6. (A) A reconstructed image of a cross-sectional slice of the chest of a cadaver using the Radon transform based backprojection method; (B) the actual pathologically stained slice of the respective cross section.
Figure 6(A) shows a reconstructed image of a cross-sectional slice of the chest of a cadaver using the Radon transform based backprojection method. The actual pathologically stained slice of the respective cross section is shown in Fig. 6(B).
7.4 ITERATIVE ALGEBRAIC RECONSTRUCTION TECHNIQUES (ART)
The iterative reconstruction methods are based on optimization strategies incorporating specific constraints about the object domain and the reconstruction process. Algebraic reconstruction techniques (ART)¹¹⁻¹⁴ are popular algorithms used in iterative image reconstruction. In the algebraic reconstruction methods, the raw projection data from the scanner is distributed over a prespecified image reconstruction grid such that the error between the computed projections from the reconstructed image and the actual acquired projections is minimized. Such methods provide a mechanism to incorporate additional optimization criteria, such as smoothing and entropy maximization, in the reconstruction process to improve the image quality and signal-to-noise ratio. The algebraic reconstruction methods are based on the series expansion representation of a function and were used by Gordon and Herman for medical image reconstruction.¹²⁻¹⁴
Let us assume a two-dimensional image reconstruction grid of N pixels. Let us define p_i as the projection data, a set of ray sums collected by M scanning rays passing through the image at specific angles (rays as defined in Fig. 1). Let f_j be the value of the jth pixel of the image, weighted by w_{i,j} to meet the projection measurements. Thus the ray sum p_i in the projection data can be expressed as:

$$p_i = \sum_{j=1}^{N} w_{i,j}\,f_j \qquad \text{for } i = 1, \ldots, M. \qquad (20)$$
The representation in Eq. (20) provides M equations in N unknown variables. The weight w_{i,j} represents the contribution of the pixel value in determining the ray sum and can be determined by geometrical consideration as the ratio of the area of the pixel overlapping with the scanning ray to the total area of the pixel. The problem of determining f_j for image reconstruction can be solved iteratively using the ART algorithm. Alternatively, it can be solved through matrix inversion, since the measured projection data p_i is known. The set of equations can also be solved using dynamic programming methods.¹²
In algebraic reconstruction methods, each pixel is assigned a predetermined value, such as the average of the raw projection data per pixel, to start the iterative process. At any time during the reconstruction process, a computed ray sum is obtained from the image under reconstruction by passing a ray through it. In each iteration, the error between the measured projection ray sum and the computed ray sum is evaluated and distributed over the corresponding pixels in a weighted manner. The correction to the pixel values can be applied in an additive or multiplicative manner, i.e. the correction value is either added to the current pixel value or multiplied with it to obtain the next value. The iterative process continues until the error between the measured and computed ray sums is minimized or meets a prespecified criterion. The f_j values from the last iteration provide the final reconstructed image.
Let q_i^k be the computed ray sum for ray i in the kth iteration that is projected over the reconstruction grid in the next iteration. The iterative procedure can then be expressed as:

$$q_i^k = \sum_{l=1}^{N} f_l^{k-1}\,w_{i,l} \qquad \text{for all } i = 1, \ldots, M,$$

$$f_j^{k+1} = f_j^k + \left[\frac{p_i - q_i^k}{\sum_{l=1}^{N} w_{i,l}^2}\right] w_{i,j}. \qquad (21)$$
Gordon^{14} used an easier way to avoid the large computation of the
weight matrix by replacing each weight with 1 or 0: if the ray
passes through the center of the pixel, the corresponding weight is
assigned as 1, and 0 otherwise. This simplification provides an
efficient implementation of the algorithm and is known as additive
ART. Other versions of ART, including multiplicative ART, have been
developed to improve the reconstruction efficacy and quality.^{12}
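The additive ART update of Eq. 21, with Gordon's binary-weight simplification, can be sketched on a hypothetical 2 x 2 image measured by four rays (two rows and two columns). All sizes and values below are illustrative, not taken from any real scanner geometry.

```python
import numpy as np

# Additive ART sketch for a tiny 2x2 image with 4 rays (2 rows + 2 columns).
# Binary weights: w[i, j] = 1 if ray i passes through pixel j, else 0.
true_image = np.array([1.0, 2.0, 3.0, 4.0])      # pixels f_j, row-major 2x2
W = np.array([[1, 1, 0, 0],                       # ray 0: top row
              [0, 0, 1, 1],                       # ray 1: bottom row
              [1, 0, 1, 0],                       # ray 2: left column
              [0, 1, 0, 1]], dtype=float)         # ray 3: right column
p = W @ true_image                                # measured ray sums p_i

f = np.full(4, p.sum() / (2 * 4))                 # start: average projection value per pixel
for _ in range(50):                               # sweep Eq. 21 ray by ray
    for i in range(4):
        q_i = W[i] @ f                            # computed ray sum q_i^k
        f += (p[i] - q_i) / (W[i] @ W[i]) * W[i]  # distribute the error on pixels in ray i

print(np.round(W @ f - p, 6))                     # residuals shrink toward zero
```

The inner loop is exactly the additive correction of Eq. 21; with consistent, noise-free data it converges to an image whose ray sums match the measurements.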
Iterative ART methods offer an attractive alternative to the
filtered backprojection method because of their ability to deal with
the noise and random fluctuations in the projection data caused by
detector inefficiency and scattering. These methods are particularly
suitable for limited-view image reconstruction, as additional
constraints defining the imaging geometry and prior information
about the object can easily be incorporated into the reconstruction
process.
7.5 ESTIMATION METHODS
Though the filtered backprojection methods are most commonly used in
medical imaging, in practice a significant number of approaches
using statistical estimation methods have been investigated for
image reconstruction for transmission as well as emission computed
tomography.^{15-26} These methods assume a certain distribution of
the measured photons and then find the parameters of the attenuation
function (in the case of transmission scans such as X-ray CT) or the
emitter density (in the case of emission scans such as PET).
The photon detection statistics of a detector are usually
characterized by a Poisson distribution. Let us define a measurement
vector J = [J_1, J_2, ..., J_N] with J_i the random variable
representing the number of photons collected by the detector for the
ith ray, such that^{17}:

    E[J_i] = m_i e^{-\int_L \mu(x,y,z)\,dl}   for i = 1, 2, ..., N,        (22)

where L defines the ray along which the photons with monochromatic
energy have been attenuated with the attenuation coefficients
denoted by \mu, and m_i is the mean number of photons collected by
the detector for the ith ray position. In the above formulation, the
noise, scattering, and random coincidence effects are ignored.
The attenuation parameter vector \mu can be expressed in terms of a
series expansion as a weighted sum of the individual attenuation
coefficients of the corresponding pixels (for 2D reconstruction) or
voxels (for 3D reconstruction). If the parameter vector \mu has N_p
individual elements (pixels or voxels), it can be represented as:

    \mu = \sum_{j=1}^{N_p} \mu_j w_j,        (23)

where w_j is the basis function that is the weight associated with
the individual \mu_j belonging to the corresponding pixel or voxel.
One simple solution to obtain w_j is to assign it a value of 1 if
the ray contributing to the corresponding photon measurement passes
through the pixel (or voxel), and 0 otherwise. It can be shown that
the line integral or ray sum for the ith ray is given by:

    \int_{L_i} \mu(x, y, z)\,dl = \sum_{k=1}^{N_p} a_{ik} \mu_k,        (24)

where a_{ik} = \int_{L_i} w_k(\vec{x})\,dl, with \vec{x}
representing the position vector in the (x, y, z) coordinate system.
The weight matrix A = {a_{ik}} is defined to rewrite the measurement
vector as:

    J_i(\mu) = m_i e^{-[A\mu]_i},
    where
    [A\mu]_i = \sum_{k=1}^{N_p} a_{ik} \mu_k.        (25)
The reconstruction problem is to estimate \mu from a measured set of
detector counts realizing the random vector J. The maximum
likelihood (ML) estimate can be expressed as^{17-19}:

    \hat{\mu} = \arg\max_{\mu \geq 0} L(\mu),
    L(\mu) = \log P[J = j; \mu],        (26)

where L(\mu) is the likelihood function, defined as the logarithm of
the probability function P[J = j; \mu]. The ML reconstruction
methods are developed to obtain an estimate of the parameter vector
\mu that maximizes the probability of observing the measured data
(photon counts).
Using the Poisson distribution model for the photon counts, the
joint probability function of the measurements P[J = j; \mu] can be
expressed as:

    P[J = j; \mu] = \prod_{i=1}^{N} P[J_i = j_i; \mu]
                  = \prod_{i=1}^{N} \frac{e^{-J_i(\mu)} [J_i(\mu)]^{j_i}}{j_i!}.        (27)
If the measurements are obtained independently through the defined
ray sums, the log-likelihood function can be expressed by combining
Eqs. 22, 26 and 27 as:

    L(\mu) = \sum_{i=1}^{N} h_i([A\mu]_i),
    where
    h_i(l) = j_i \log (m_i e^{-l}) - m_i e^{-l}.        (28)
Let us consider an additive nonnegative function r_i representing
the background photon count for the ith detector due to scattering
and random coincidences. The likelihood function can then be
expressed as^{17}:

    L(\mu) = \sum_{i=1}^{N} h_i([A\mu]_i),
    where
    h_i(l) = j_i \log (m_i e^{-l} + r_i) - (m_i e^{-l} + r_i).        (29)
Several algorithms have been investigated in the literature to
obtain an estimate of the parameter vector that maximizes the
log-likelihood function given in Eq. 28. However, it is unlikely
that there is a unique solution to this problem. There may be
several solutions of the parameter vector that maximize the
likelihood function, and not all of them may be appropriate or even
feasible for image reconstruction. To improve the quality of
reconstructed images, a number of methods imposing additional
constraints, such as smoothness, are applied by incorporating
penalty functions in the optimization process. Several iterative
optimization processes incorporating a roughness penalty function on
the neighborhood values of the estimated parameter vector have been
investigated in the literature.^{17-19}
Let us represent a general roughness penalty function R(\mu)^{17-19}
such that:

    R(\mu) = \sum_{k=1}^{K} \psi_k([C\mu]_k),
    where
    [C\mu]_k = \sum_{l=1}^{N_p} c_{kl} \mu_l,        (30)

where the \psi_k's are potential functions acting as a norm on the
smoothness constraints C\mu \approx 0, and K is the number of such
constraints. The matrix C is a K x N_p penalty matrix. It should be
noted that the \psi_k's are convex, symmetric, nonnegative and
differentiable functions.^{17}
A potential choice for a quadratic penalty function is
\psi_k(t) = w_k t^2 / 2 with nonnegative weights, i.e. w_k \geq 0.
The roughness penalty function R(\mu) is then given by:

    R(\mu) = \sum_{k=1}^{K} w_k \frac{1}{2} ([C\mu]_k)^2.        (31)
The objective function for optimization using the penalized ML
approach can now be revised as:

    \hat{\mu} = \arg\max_{\mu \geq 0} \Phi(\mu),
    where
    \Phi(\mu) = L(\mu) - \beta R(\mu).        (32)

The parameter \beta controls the level of smoothness in the final
reconstructed image.
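The penalized objective can be evaluated directly once the weight matrix, the counts, and a penalty matrix are fixed. The sketch below does this for a made-up 1D transmission problem: the sizes, the uniform weight matrix, the photon count m_i = 10^4, and the first-difference penalty are all illustrative assumptions, not values from any real system.

```python
import numpy as np

# Evaluating Phi(mu) = L(mu) - beta*R(mu) (Eqs. 28, 31, 32) for a toy
# 1D transmission problem with Poisson-distributed counts.
rng = np.random.default_rng(0)
Np, N = 5, 8                                    # pixels, rays (illustrative)
A = rng.uniform(0.0, 1.0, (N, Np))              # weight matrix a_ik
mu_true = np.array([0.1, 0.2, 0.3, 0.2, 0.1])   # "true" attenuation map
m = np.full(N, 1e4)                             # mean incident photons m_i
j = rng.poisson(m * np.exp(-A @ mu_true))       # measured counts j_i

C = (np.eye(Np) - np.eye(Np, k=1))[:-1]         # first differences: K = Np-1 constraints

def objective(mu, beta=10.0):
    l = A @ mu                                  # line integrals [A mu]_i
    L = np.sum(j * (np.log(m) - l) - m * np.exp(-l))  # Eq. 28, log(m e^-l) = log m - l
    R = 0.5 * np.sum((C @ mu) ** 2)             # quadratic roughness, w_k = 1
    return L - beta * R

# The map that generated the data should score above a flat zero guess.
print(objective(mu_true) > objective(np.zeros(Np)))
```

A maximizer would climb this objective, e.g. by one of the iterative algorithms discussed next; here we only confirm that the true map scores higher than a trivial one.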
Several methods for obtaining the ML estimate have been investigated
in the literature. These optimization methods include expectation
maximization (EM), complex conjugate gradient, gradient descent
optimization, grouped coordinate ascent, fast gradient-based
Bayesian reconstruction, and ordered subsets algorithms.^{28-30}
Such iterative algorithms have been applied to obtain a solution for
the parameter vector for reconstructing an image from both
transmission and emission scans. In addition, multigrid EM methods
have also been applied for image reconstruction in positron emission
tomography (PET).^{23,24} Figure 7(A) shows axial PET images of the
brain reconstructed using filtered backprojection methods, while
Fig. 7(B) shows the same cross-sectional images reconstructed using
a multigrid EM method.
Fig. 7. (A) Axial PET images of the brain reconstructed using
filtered backprojection methods; (B) the same cross-sectional images
reconstructed using a multigrid EM method.
7.6 CONCLUDING REMARKS

Image reconstruction is an integral and probably the most important
part of medical imaging. By utilizing more information about the
imaging geometry and the physics of imaging, the quality of
reconstruction can be improved. Furthermore, a priori and
model-based information can be used with constrained optimization
methods for better reconstruction. In this chapter, basic image
reconstruction approaches are presented that are based on the Radon
transform, Fourier transform, filtered backprojection, iterative
ART, and statistical estimation and optimization methods. More
details and advanced image reconstruction methods are presented in
Chapter 15 of this book.
References

1. Radon J, Über die Bestimmung von Funktionen durch ihre
   Integralwerte längs gewisser Mannigfaltigkeiten, Ber Verh Saechs
   Akad Wiss, Math Phys Kl 69: 262–277, 1917.
2. Cramer H, Wold H, Some theorems on distribution functions,
   J London Math Soc 11: 290–294, 1936.
3. Renyi A, On projections of probability distributions, Acta Math
   Acad Sci Budapest 3: 131–141, 1952.
4. Gilbert WM, Projections of probability distributions, Acta Math
   Acad Sci Budapest 6: 195–198, 1955.
5. Bracewell RN, Strip integration in radio astronomy, Aust J
   Physics 9: 198–217, 1956.
6. Cormack AM, Representation of a function by its line integrals
   with some radiological applications, J Appl Phys 34: 2722–2727,
   1963.
7. Hounsfield GN, Computerized transverse axial scanning tomography:
   Part 1, description of the system, Br J Radiol 46: 1016–1022,
   1973.
8. Hounsfield GN, A method and apparatus for examination of a body
   by radiation such as X or gamma radiation, Patent 1283915, The
   Patent Office, London, England, 1972.
9. Ramachandran GN, Lakshminarayanan AV, Three-dimensional
   reconstruction from radiographs and electron micrographs, Proc
   Nat Acad Sci USA 68: 2236–2240, 1971.
10. Deans SR, The Radon Transform and Some of Its Applications, John
    Wiley & Sons, 1983.
11. Dhawan AP, Medical Image Analysis, John Wiley & Sons, 2003.
12. Herman GT, Image Reconstruction from Projections, Academic
    Press, 1980.
13. Rosenfeld A, Kak AC, Digital Picture Processing, Vol 1, Academic
    Press, 1982.
14. Gordon R, A tutorial on ART (Algebraic Reconstruction
    Techniques), IEEE Trans Nucl Sci 21: 78–93, 1974.
15. Dempster AP, Laird NM, Rubin DB, Maximum likelihood from
    incomplete data via the EM algorithm, J R Stat Soc Ser B 39:
    1–38, 1977.
16. Shepp LA, Vardi Y, Maximum likelihood reconstruction for
    emission tomography, IEEE Trans Med Imag 1: 113–121, 1982.
17. Fessler JA, Statistical image reconstruction methods for
    transmission tomography, in Sonka M, Fitzpatrick JM (eds.),
    Handbook of Medical Imaging, Vol 2, Medical Image Processing and
    Analysis, SPIE Press, pp. 1–70, 2000.
18. Erdogan H, Fessler JA, Monotonic algorithms for transmission
    tomography, IEEE Trans Med Imag 18: 801–814, 1999.
19. Yu DF, Fessler JA, Ficaro EP, Maximum likelihood transmission
    image reconstruction for overlapping transmission beams, IEEE
    Trans Med Imag 19: 1094–1105, 2000.
20. Lange K, Carson R, EM reconstruction algorithms for emission and
    transmission tomography, J Comp Asst Tomogr 8: 306–316, 1984.
21. Ollinger JM, Maximum likelihood reconstruction of transmission
    images in emission computed tomography via the EM algorithm,
    IEEE Trans Med Imag 13: 89–101, 1994.
22. Welch A, Clack R, Natterer F, Gullberg G, Toward accurate
    attenuation correction in SPECT without transmission
    measurements, IEEE Trans Med Imag 16: 532–541, 1997.
23. Ranganath MV, Dhawan AP, Mullani N, A multigrid expectation
    maximization reconstruction algorithm for positron emission
    tomography, IEEE Trans Med Imag 7: 273–278, 1988.
24. Raheja A, Dhawan AP, Wavelet based multiresolution expectation
    maximization reconstruction algorithm for emission tomography,
    Comp Med Imag Graph 24: 87–98, 2000.
25. Solo V, Purdon P, Weisskoff R, Brown E, A signal estimation
    approach to functional MRI, IEEE Trans Med Imag 20: 26–35, 2001.
26. Basu S, Bresler Y, O(N^3 log N) backprojection algorithm for the
    3D Radon transform, IEEE Trans Med Imag 21: 76–88, 2002.
27. Bouman CA, Sauer K, A unified approach to statistical tomography
    using coordinate descent optimization, IEEE Trans Image Process
    5: 480–492, 1996.
28. Erdogan H, Gualtiere G, Fessler JA, Ordered subsets algorithms
    for transmission tomography, Phys Med Biol 44: 2835–2851, 1999.
29. Mumcuoglu EU, Leahy R, Cherry SR, Zhou Z, Fast gradient-based
    methods for Bayesian reconstruction of transmission and emission
    PET images, IEEE Trans Med Imag 13: 687–701, 1994.
30. Green PJ, Bayesian reconstructions from emission tomography data
    using a modified EM algorithm, IEEE Trans Med Imag 9: 84–93,
    1990.
CHAPTER 8
Principles of Image Processing Methods
Atam P Dhawan
Medical image processing methods, including image restoration and
enhancement methods, are very useful for effective visual
examination and computerized analysis. Image processing methods
enhance features of interest for better analysis and
characterization. Though more advanced model-based image processing
methods have been investigated and developed recently, this chapter
presents the principles of selected basic image processing methods.
Advanced image processing and reconstruction methods are described
in other chapters of this book.
8.1 INTRODUCTION
Medical images are examined through visual inspection by expert
physicians or analyzed through computerized methods for specific
feature extraction, classification, and statistical analysis. In
both of these approaches, image processing operations such as image
restoration (e.g. smoothing operations for noise removal) and
enhancement for better feature representation, extraction, and
analysis are very useful. The principles of some of the most
commonly used basic image processing methods for noise removal,
image smoothing, and feature enhancement are described in this
chapter. These methods are usually available in image processing
software such as MATLAB through its image processing toolbox.
Medical images show characteristic information about the
physiological properties of structures and tissues. However, the
quality and visibility of this information depend on the imaging
modality and the response functions (such as the point spread
function) of the imaging scanner. Medical images from specific
modalities need to be processed using methods suited to enhancing
the features of interest. For example, a chest X-ray radiographic
image shows the anatomical structure of the chest based on total
attenuation coefficients. If the radiograph is being examined for a
possible fracture in the ribs, an image enhancement method is
required to improve the visibility of hard bony structure. But if an
X-ray mammogram is obtained for the examination of potential breast
cancer, an image processing method is required to enhance the
visibility of microcalcifications, spiculated masses, and soft
tissue structures such as parenchyma. A single image enhancement
method may not serve both of these applications. Image enhancement
methods for improving the soft tissue contrast in MR brain images
may be entirely different from those used for PET brain images.
Thus, image enhancement tasks and methods are very much application
dependent.
Image enhancement methods may also include image restoration
methods, which are generally based on minimum mean-squared error
operations, such as Wiener filtering and other constrained
deconvolution methods incorporating some a priori knowledge of the
degradation.^{1-5} Since the main objective is to enhance features
of interest, a suitable combination of restoration and contrast
enhancement algorithms is an integral part of preprocessing in image
analysis. The selection of a specific restoration algorithm for
noise removal is highly dependent on the image acquisition system.
For example, in the filtered backprojection method for
reconstructing images in computed tomography (CT), the raw data
obtained from the scanner is first deconvolved with a specific
filter. Filter functions such as the Hamming window, as described in
Chapter 7, may also be used to reduce noise in the projection data.
On the other hand, several image enhancement methods, such as
neighborhood-based operations and frequency filtering operations,
implicitly de-emphasize noise for feature enhancement.
Image processing methods are usually performed in one of two
domains: (1) the spatial domain; (2) the spectral domain. The image
or spatial domain provides the distribution of an image feature,
such as brightness, over a spatial grid of samples. The spectral or
frequency domain provides spectral information in a transformed
domain, such as the one obtained through the Fourier transform. In
addition, specific transform-based methods such as the Hough
transform, neural networks, and model-based methods have also been
used for image processing operations.^{1-7}
8.2 IMAGE PROCESSING IN SPATIAL DOMAIN
Spatial domain methods process an image with pixel-by-pixel
transformations based on histogram statistics or neighborhood
operations. These methods are usually faster in computer
implementation than frequency filtering methods, which require
computation of the Fourier transform for the frequency domain
representation. However, frequency filtering methods may provide
better results in some applications if a priori information about
the characteristic frequency components of the noise and of the
features of interest is available. For example, spike-based
degradations in the raw signal, due to mechanical stress and
vibration on the gradient coils, often cause striation artifacts in
fast MR imaging techniques. The noise from these spike degradations
in the MR signal can be modeled with its characteristic frequency
components and removed by selective filtering and wavelet processing
methods.^{7} Wiener filtering methods have been applied for signal
enhancement in MR imaging, to remove frequency components related to
the undesired resonance effects of the nuclei and for noise
suppression.^{8-10}
8.2.1 Image Histogram Representation
A histogram of an image provides information about the intensity
distribution of pixels in the image. The simplest form of a
histogram is a plot of the occurrence of specific gray-level values
of the pixels in the image. The occurrence of gray levels can be
given in absolute terms, i.e. the number of times a specific gray
level occurs in the image, or as probability values, i.e. the
probability of occurrence of a specific gray level in the image. In
mathematical terms, a histogram h(r_i) is expressed as:

    h(r_i) = n_i   for i = 0, 1, ..., L - 1,        (1)

where r_i is the ith gray level in the image for a total of L gray
values and n_i is the number of occurrences of gray level r_i in the
image.
If a histogram is expressed in terms of the probability of
occurrence of gray levels, it can be expressed as:

    p(r_i) = n_i / n,        (2)

where n is the total number of pixels in the image. Thus, a
histogram is a plot of h(r_i) or p(r_i) versus r_i. Figure 1(A)
shows an X-ray mammogram image, while Fig. 1(B) shows its gray-level
histogram.
Fig. 1. (A) An X-ray mammogram image; (B) its gray-level histogram.
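As a concrete illustration of Eqs. 1 and 2, the sketch below computes h(r_i) and p(r_i) for a small made-up image with L = 8 gray levels; the 4 x 4 array is purely illustrative.

```python
import numpy as np

# Computing the counting histogram h(r_i) and the probability
# histogram p(r_i) of Eqs. 1 and 2 for a toy 8-level image.
image = np.array([[0, 1, 1, 2],
                  [1, 2, 2, 3],
                  [2, 3, 3, 4],
                  [3, 4, 4, 7]])
L = 8                                         # total number of gray levels
h = np.bincount(image.ravel(), minlength=L)   # occurrences h(r_i) = n_i
p = h / image.size                            # probabilities p(r_i) = n_i / n
print(h.tolist())   # [1, 3, 4, 4, 3, 0, 0, 1]
print(p.sum())      # probabilities sum to 1.0
```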
8.2.2 Histogram Equalization
A popular general-purpose method of image enhancement is histogram
equalization. In this method, a monotonically increasing
transformation function T(r) is used to map the original gray values
r_i of the input image into new gray values s_i of the output image
such that:

    s_i = T(r_i) = \sum_{j=0}^{i} p_r(r_j) = \sum_{j=0}^{i} n_j / n
    for i = 0, 1, ..., L - 1,        (3)

where p_r(r_i) is the probability-based histogram of the input
image, which is transformed into the output image with the histogram
p_s(s_i).

The transformation function T(r_i) in Eq. 3 stretches the histogram
of the input image such that the gray values occur in the output
image with equal probability of occurrence. It should be noted that
the uniform distribution of the output histogram is limited by the
discrete computation of the gray-level transformation. The histogram
equalization method forces the image intensity levels to be
redistributed with an equal probability of occurrence.
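A minimal sketch of discrete histogram equalization follows. Eq. 3 yields cumulative values in [0, 1]; scaling by L - 1 and rounding, as done below, is a common discrete implementation choice rather than something specified in the text, and the toy image is illustrative only.

```python
import numpy as np

# Discrete histogram equalization via Eq. 3, scaled back to the
# gray-level range [0, L-1].
image = np.array([[1, 1, 2, 2],
                  [2, 2, 3, 3],
                  [3, 3, 3, 6],
                  [6, 6, 7, 7]])
L = 8
h = np.bincount(image.ravel(), minlength=L)
cdf = np.cumsum(h / image.size)               # T(r_i), monotone in [0, 1]
T = np.round((L - 1) * cdf).astype(int)       # quantize back to L levels
equalized = T[image]                          # look up new gray value per pixel
print(T.tolist())                             # [0, 1, 3, 5, 5, 5, 6, 7]
```

Note how the crowded levels 1-3 are spread apart while the empty levels collapse, which is exactly the stretching effect described above.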
Figure 2 shows the original mammogram image and its
histogram-equalized image, with their respective histograms. Image
saturation around the middle of the image can be noticed in the
histogram-equalized image.
8.2.3 Histogram Modiﬁcation
The histogram equalization method stretches the contrast of an image
by redistributing the gray values to achieve a uniform distribution.
This general method may not provide good results in many
applications. It can be noted from Fig. 2 that histogram
equalization can cause saturation in some regions of the image,
resulting in a loss of details and high-frequency information that
may be necessary for interpretation. Sometimes, local histogram
equalization is applied separately on predefined local neighborhood
regions, such as 7 x 7 pixels, to provide better results.^{1}

Fig. 2. Top left: original X-ray mammogram image; bottom left:
histogram of the original image; top right: the histogram-equalized
image; bottom right: histogram of the equalized image.

If a desired distribution of gray values is known a priori, a
histogram modification method is used to apply a transformation that
changes the gray values to match the desired distribution. The
target distribution can be obtained from a good-contrast image
acquired under similar imaging conditions. Alternatively, an
original image from a scanner can be interactively modified through
regional scaling of gray values to achieve the desired contrast.
This image can then provide a target distribution for the rest of
the images, obtained under similar imaging conditions, for automatic
enhancement using the histogram modification method.
The conventional scaling method of changing gray values from the
range [a, b] to [c, d] is given by the linear transformation:

    z_new = \frac{d - c}{b - a} (z - a) + c,        (4)

where z and z_new are, respectively, the original and new gray
values of a pixel in the image.
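Eq. 4 is a straight line mapping the endpoints a and b to c and d; a minimal sketch, with arbitrary illustrative ranges, is:

```python
# Linear contrast scaling of Eq. 4: map gray values from [a, b] to [c, d].
def rescale(z, a, b, c, d):
    return (d - c) / (b - a) * (z - a) + c

print(rescale(0, 0, 100, 0, 255))     # lower endpoint maps to c
print(rescale(100, 0, 100, 0, 255))   # upper endpoint maps to d
print(rescale(50, 0, 100, 0, 255))    # midpoint maps near the new midpoint
```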
Let us assume that p_z(z_i) is the target histogram, and that
p_r(r_i) and p_s(s_i) are, respectively, the histograms of the input
and output images. A transformation is needed such that the output
image histogram p_s(s_i) matches the desired histogram p_z(z_i). The
first step in this process is to equalize p_r(r_i) using Eq. 3 such
that^{1,6}:

    u_i = T(r_i) = \sum_{j=0}^{i} p_r(r_j)   for i = 0, 1, ..., L - 1,        (5)

where u_i represents the equalized gray values of the input image.
A new transformation V can be defined to equalize the target
histogram such that:

    v_i = V(z_i) = \sum_{k=0}^{i} p_z(z_k)   for i = 0, 1, ..., L - 1.        (6)
Setting V(z_i) = T(r_i) = u_i to achieve the target distribution,
the new gray values s_i of the output image are computed from the
inverse transformation V^{-1} as:

    s_i = V^{-1}[T(r_i)] = V^{-1}(u_i).        (7)

With the transformation defined in Eq. 7, the histogram distribution
of the output image, p_s(s_i), becomes similar to that of p_z(z_i).
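The two-step mapping of Eqs. 5-7 can be sketched directly on a pair of made-up 8-level histograms. The discrete inverse V^{-1} below is implemented, as is common, by taking the smallest level z whose cumulative value reaches u_i; the histograms themselves are illustrative assumptions.

```python
import numpy as np

# Histogram modification via Eqs. 5-7: equalize the input (T), equalize
# the target (V), then map each level through V^{-1} o T.
L = 8
input_hist  = np.array([8, 4, 2, 1, 1, 0, 0, 0]) / 16   # p_r, dark-heavy
target_hist = np.array([0, 0, 1, 2, 4, 4, 3, 2]) / 16   # p_z, brighter

u = np.cumsum(input_hist)     # T(r_i), Eq. 5
v = np.cumsum(target_hist)    # V(z_i), Eq. 6
# V^{-1}: for each u_i pick the smallest z with V(z) >= u_i  (Eq. 7)
s = np.searchsorted(v, u)
print(s.tolist())             # [5, 6, 6, 7, 7, 7, 7, 7]
```

The dark input levels are pushed toward the bright end, as the brighter target distribution demands.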
8.2.4 Image Averaging
Signal averaging is a well-known method for enhancing the
signal-to-noise ratio. In medical imaging, data from the detector is
often averaged over time or space for signal enhancement. However,
such signal enhancement is achieved at the cost of some loss of
temporal or spatial resolution. Sequence images, if properly
registered and acquired in nondynamic applications, can be averaged
for noise reduction, leading to smoothing effects. Selective
weighted averaging can also be performed over a specified
neighborhood of pixels in the image.

Let us assume that an ideal image f(x, y) suffers from an additive
noise n(x, y). The acquired image g(x, y) can then be represented
as:

    g(x, y) = f(x, y) + n(x, y).        (8)

In a general imaging process, the noise is assumed to be
uncorrelated and random, with a zero average value. If a sequence of
K images is acquired for the same object under the same imaging
conditions, the average image \bar{g}(x, y) can be obtained as:

    \bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y),        (9)

where g_i(x, y), i = 1, 2, ..., K, represents the sequence of images
to be averaged.
As the number of images K increases, the expected value of the
average image \bar{g}(x, y) approaches f(x, y), reducing the noise
per pixel in the averaged image as:

    E\{\bar{g}(x, y)\} = f(x, y)
    \sigma_{\bar{g}(x,y)} = \frac{1}{\sqrt{K}} \sigma_{n(x,y)},        (10)

where \sigma represents the standard deviation of the respective
random field.
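The 1/sqrt(K) noise reduction of Eq. 10 can be checked on synthetic data; the flat ideal image, frame count, and noise level below are all illustrative choices.

```python
import numpy as np

# Averaging K noisy acquisitions of the same scene (Eqs. 9 and 10):
# the noise standard deviation shrinks by roughly 1/sqrt(K).
rng = np.random.default_rng(1)
f = np.zeros((64, 64))                           # ideal image (flat, for clarity)
K = 100
frames = f + rng.normal(0.0, 1.0, (K, 64, 64))   # g_i = f + n_i, sigma_n = 1
g_bar = frames.mean(axis=0)                      # Eq. 9

print(frames[0].std())   # single frame: close to sigma_n = 1
print(g_bar.std())       # averaged: close to 1/sqrt(100) = 0.1
```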
8.2.4.1 Neighborhood Operations
Spatial filtering methods using neighborhood operations involve the
convolution of the input image with a specific mask (such as a
Laplacian-based high-frequency emphasis filtering mask) to enhance
an image. The gray value of each pixel is replaced by a new value
computed according to the mask applied in the neighborhood of the
pixel. The neighborhood of a pixel may be defined in any appropriate
manner, based on simple connectedness or any other adaptive
criterion.^{13}

Let us assume a general weight mask of (2p + 1) x (2p + 1) pixels,
where p can take integer values 1, 2, ..., depending upon the size
of the mask. For p = 1, the size of the weight mask is 3 x 3 pixels.
A discrete convolution of an image f(x, y) with a spatial filter
represented by a weight mask w(x', y') is given by:

    g(x, y) = \frac{1}{\sum_{x'=-p}^{p} \sum_{y'=-p}^{p} w(x', y')}
              \sum_{x'=-p}^{p} \sum_{y'=-p}^{p} w(x', y') f(x + x', y + y'),        (11)
where the convolution is performed for all values of x and y in the
image. In other words, the weight mask of the filter is translated
and convolved over the entire extent of the input image to provide
the output image.

The values of the weight mask are derived from a discrete
representation of the selected filter. Depending on the filter, the
characteristics of the input image are changed in the output image.
For example, Fig. 3 shows a weighted averaging mask that can be used
for image smoothing and noise reduction. In this mask, the pixels in
the 4-connected neighborhood are weighted twice as much as the other
pixels, as they are closer to the central pixel. The mask is used
with a scaling factor of 1/16 that multiplies the values obtained by
convolution of the mask with the image in Eq. 11.
    1 2 1
    2 4 2
    1 2 1

Fig. 3. A weighted averaging mask for image smoothing.

Figure 4 shows an X-ray mammogram image smoothed by spatial
filtering using the weighted averaging mask shown in Fig. 3. Some
loss of detail can be noted in the smoothed image because of
the averaging operation. To minimize the loss of detail, adaptive
median filtering may be applied.^{1-4}

Fig. 4. Left: an original X-ray mammogram image; right: a smoothed
image using the weight mask shown in Fig. 3.
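The weighted averaging of Eq. 11 with the mask of Fig. 3 can be sketched on a toy image containing a single bright noise pixel. Border pixels are skipped here purely for brevity; real implementations pad the image instead.

```python
import numpy as np

# Smoothing with the 3x3 weighted averaging mask of Fig. 3; the mask
# weights sum to 16, which gives the 1/16 normalization of Eq. 11.
w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]], dtype=float)
image = np.zeros((5, 5))
image[2, 2] = 16.0                      # a single bright "noise" pixel

out = image.copy()
for x in range(1, 4):                   # interior pixels only
    for y in range(1, 4):
        patch = image[x-1:x+2, y-1:y+2]
        out[x, y] = (w * patch).sum() / w.sum()   # w.sum() == 16
print(out[1:4, 1:4])                    # the impulse is spread into the mask shape
```

Because the input is an impulse, the smoothed output reproduces the mask itself (scaled), which is a handy way to visualize a filter's impulse response.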
8.2.4.2 Median Filter
The median filter is a well-known order-statistics filter that
replaces the original gray value of a pixel by the median of the
gray values of the pixels in a specified neighborhood. For example,
for a fixed neighborhood of 3 x 3 pixels, the gray value of the
central pixel f(0, 0) is replaced by the median of the gray values
of all nine pixels in the neighborhood. Instead of replacing the
gray value of the central pixel by the median of the neighborhood
pixels, other operations, such as the midpoint, arithmetic mean, and
geometric mean, can also be used in order-statistics filtering
methods.^{1-5} A median filter operation for a smoothed image
\hat{f}(x, y) computed from the acquired image g(x, y) is defined
as:

    \hat{f}(x, y) = median_{(i,j) \in N} \{g(i, j)\},        (12)

where N is the prespecified neighborhood of the pixel (x, y).
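A minimal sketch of Eq. 12 on a toy image with one impulse-noise pixel shows the key property of the median: unlike the averaging mask, it rejects the outlier entirely instead of smearing it.

```python
import numpy as np

# 3x3 median filtering (Eq. 12) of a toy image with salt noise.
image = np.ones((5, 5))
image[2, 2] = 100.0                     # one impulse-noise pixel

filtered = image.copy()
for x in range(1, 4):                   # interior pixels only, for brevity
    for y in range(1, 4):
        filtered[x, y] = np.median(image[x-1:x+2, y-1:y+2])
print(filtered[2, 2])                   # the outlier is replaced by 1.0
```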
8.2.4.3 Adaptive Arithmetic Mean Filter
Adaptive local noise reduction filtering can be applied using the
variance information of the selected neighborhood and an estimate of
the overall variance of the noise in the image. If the noise
variance of the image is similar to the variance of the gray values
in the specified neighborhood of pixels, the filter provides the
arithmetic mean value of the neighborhood. Let \sigma_n^2 be an
estimate of the variance of the noise in the image and \sigma_s^2 be
the variance of the gray values of the pixels in the specified
neighborhood; an adaptive local noise reduction filter can then be
implemented as:

    \hat{f}(x, y) = g(x, y) - \frac{\sigma_n^2}{\sigma_s^2} [g(x, y) - \bar{g}_{ms}(x, y)],        (13)

where \bar{g}_{ms}(x, y) is the mean of the gray values of the
pixels in the specified neighborhood. It should be noted that if the
noise variance is zero, the resultant image is the same as the input
image. If an edge is present in the neighborhood, the local variance
will be higher than the noise variance of the image, and the
estimate in Eq. 13 will return a value close to the original gray
value of the central pixel.
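The two behaviors of Eq. 13 can be sketched on two hand-made 3 x 3 neighborhoods. Clipping the variance ratio at 1 is a common stabilization added here as an assumption, and \sigma_n^2 is simply assumed known; in practice it is estimated from the image.

```python
import numpy as np

# The adaptive filter of Eq. 13 on a flat patch (returns the local
# mean) and on an edge patch (keeps the central pixel nearly intact).
sigma_n2 = 1.0                                  # assumed known noise variance

def adaptive(patch):
    g = patch[1, 1]                             # central pixel g(x, y)
    s2 = patch.var()                            # local variance sigma_s^2
    ratio = min(sigma_n2 / s2, 1.0) if s2 > 0 else 1.0
    return g - ratio * (g - patch.mean())       # Eq. 13

flat = np.array([[5., 6., 5.], [6., 7., 6.], [5., 6., 5.]])       # variance ~ noise
edge = np.array([[0., 0., 100.], [0., 0., 100.], [0., 0., 100.]])
print(adaptive(flat))   # equals the local mean here
print(adaptive(edge))   # stays near 0, the original central value
```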
8.2.4.4 Image Sharpening and Edge Enhancement
Edges in an image are defined by changes in the gray values of
pixels in a neighborhood. The change in gray values of adjacent
pixels in the image can be expressed by a derivative (in the
continuous domain) or a difference (in the discrete domain)
operation.

A first-order derivative operator, such as the Sobel operator,
computes the gradient information in a specific direction. The
derivative operator can be encoded into a weight mask. Figure 5
shows the two Sobel weight masks.

    -1 -2 -1        -1  0  1
     0  0  0        -2  0  2
     1  2  1        -1  0  1

Fig. 5. Weight masks for the first-derivative (Sobel) operator. The
mask at the left computes the gradient in the x-direction, while the
mask at the right computes the gradient in the y-direction.

These masks are used, respectively, to compute the first-order
gradients in the x- and y-directions (defined by
\partial f(x, y)/\partial x and \partial f(x, y)/\partial y). These
3 x 3 weight masks are used for convolution to compute the
respective gradient images. For spatial image enhancement based on
the first-order gradient information, the resultant gradient image
can simply be added to the original image and rescaled using the
full dynamic range of gray values.
A second-order derivative operator, known as the Laplacian, can be
defined as:

    \nabla^2 f(x, y) = \frac{\partial^2 f(x, y)}{\partial x^2} + \frac{\partial^2 f(x, y)}{\partial y^2}
                     = [f(x + 1, y) + f(x - 1, y) + f(x, y + 1)
                        + f(x, y - 1) - 4 f(x, y)],        (14)

where \nabla^2 f(x, y) represents the second-order derivative, or
Laplacian, of the image f(x, y).
An image can be sharpened, with enhanced edge information, by adding
the Laplacian of the image to the original image itself. A mask that
adds the Laplacian to the image is shown in Fig. 6. Figure 7 shows
the enhanced version of the original mammographic image shown in
Fig. 4.
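The discrete Laplacian of Eq. 14 responds only where the gray values change nonlinearly, which is what makes it an edge detector. The sketch below evaluates it on a small made-up ramp-plus-step image.

```python
import numpy as np

# The discrete Laplacian of Eq. 14: zero on a linear ramp, large at a
# step edge. The image values are illustrative only.
f = np.array([[0., 1., 2., 10., 10.],
              [0., 1., 2., 10., 10.],
              [0., 1., 2., 10., 10.]])

def laplacian(f, x, y):
    return f[x+1, y] + f[x-1, y] + f[x, y+1] + f[x, y-1] - 4 * f[x, y]

print(laplacian(f, 1, 1))   # 0.0 on the linear ramp
print(laplacian(f, 1, 2))   # 7.0 just before the step
print(laplacian(f, 1, 3))   # -8.0 just after the step
```

Adding this response back to f(x, y), as the Fig. 6 mask does in a single convolution, boosts exactly those step locations.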
8.3 FREQUENCY DOMAIN FILTERING
Frequency domain filtering methods process an acquired image in the
Fourier domain to emphasize or de-emphasize specified frequency
components. In general, the frequency components can be expressed in
low and high ranges. The low-frequency components usually represent
shapes and blurred structures in the image, while the high-frequency
information belongs to sharp details, edges, and noise. Thus, a
lowpass filter, which attenuates the high-frequency components,
provides image smoothing and noise removal. A highpass filter, which
attenuates the low-frequency components, extracts edges and sharp
details for image enhancement and sharpening effects.

    -1 -1 -1
    -1  9 -1
    -1 -1 -1

Fig. 6. Weight mask for image enhancement through addition of the
Laplacian gradient information to the image.

Fig. 7. The original mammogram image on the left, also shown in
Fig. 4 (left), with the Laplacian-gradient-based image enhancement
at the right.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch08 FA
186
Atam P Dhawan
8.3.1 Inverse Filtering

As presented in Chapter 2, an acquired image g(x, y) can be expressed as a convolution of the object f(x, y) with the point spread function (PSF) h(x, y) of a linear spatially invariant imaging system, with additive noise n(x, y):

g(x, y) = h(x, y) \otimes f(x, y) + n(x, y).   (17)

The Fourier transform of Eq. 17 provides a multiplicative relationship between F(u, v), the Fourier transform of the object, and H(u, v), the Fourier transform of the PSF:

G(u, v) = H(u, v)F(u, v) + N(u, v),   (18)

where u and v represent the frequency-domain coordinates along the x- and y-directions, and G(u, v) and N(u, v) are, respectively, the Fourier transforms of the acquired image g(x, y) and the noise n(x, y).

The object information in the Fourier domain can be recovered by inverse filtering as:

\hat{F}(u, v) = \frac{G(u, v)}{H(u, v)} - \frac{N(u, v)}{H(u, v)},   (19)

where F̂(u, v) is the restored image in the frequency domain.

The inverse filtering operation represented in Eq. 19 provides a basis for image restoration in the frequency domain. The inverse Fourier transform of F̂(u, v) provides the restored image in the spatial domain. The PSF of the imaging system can be experimentally determined or statistically estimated.¹
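A sketch of the inverse filtering idea in NumPy, assuming the PSF is known and the noise term of Eq. 19 is ignored; the `eps` guard against tiny |H(u, v)| values is an illustrative safeguard, anticipating the "division by zero" problem discussed next:

```python
import numpy as np

def inverse_filter(g, h, eps=1e-3):
    """Restore image g given PSF h via Eq. 19 with the noise term dropped.
    Frequencies where |H| is tiny are left untouched to avoid blow-up."""
    G = np.fft.fft2(g)
    H = np.fft.fft2(h, s=g.shape)
    mask = np.abs(H) > eps
    H_safe = np.where(mask, H, 1.0)          # placeholder where |H| ~ 0
    F_hat = np.where(mask, G / H_safe, G)    # divide only where it is safe
    return np.real(np.fft.ifft2(F_hat))
```

The FFT-based formulation implies circular convolution; for a real imaging system the image would normally be padded before filtering.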
8.3.2 Wiener Filtering

The image restoration approach presented in Eq. 19 appears to be simple but poses a number of challenges in practical implementation. Besides the difficulties associated with the determination of the PSF, low values or zeros in H(u, v) cause computational problems. Constrained deconvolution approaches and weighted filtering have been used to avoid the "division by zero" problem in Eq. 19.¹⁻³
Wiener filtering is a well known and effective method for image restoration that performs weighted inverse filtering as:

\hat{F}(u, v) = \left[\frac{1}{H(u, v)}\right]\left[\frac{|H(u, v)|^2}{|H(u, v)|^2 + S_n(u, v)/S_f(u, v)}\right] G(u, v),   (20)

where S_f(u, v) and S_n(u, v) are, respectively, the power spectra of the signal and the noise.

The Wiener filter, also known as the minimum mean square error filter, reduces to exact inverse filtering if the noise spectrum is zero. In cases of a nonzero noise-to-signal spectrum ratio, the division is appropriately weighted. If the noise can be assumed to be spectrally white, Eq. 20 reduces to a simple parametric filter with a constant K:

\hat{F}(u, v) = \left[\frac{1}{H(u, v)}\right]\left[\frac{|H(u, v)|^2}{|H(u, v)|^2 + K}\right] G(u, v).   (21)
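The parametric form of Eq. 21 is straightforward to sketch in NumPy; the value of K, which stands in for the noise-to-signal spectrum ratio of Eq. 20, would in practice be tuned or estimated:

```python
import numpy as np

def wiener_filter(g, h, K=0.01):
    """Parametric Wiener filter of Eq. 21. The constant K replaces the
    noise-to-signal power spectrum ratio S_n/S_f of Eq. 20."""
    G = np.fft.fft2(g)
    H = np.fft.fft2(h, s=g.shape)
    H2 = np.abs(H) ** 2
    # (1/H) * |H|^2 / (|H|^2 + K) simplifies to conj(H) / (|H|^2 + K)
    F_hat = (np.conj(H) / (H2 + K)) * G
    return np.real(np.fft.ifft2(F_hat))
```

Unlike plain inverse filtering, the denominator |H|² + K never vanishes, so spectral zeros of the PSF no longer cause division problems.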
In implementing inverse-filtering-based methods for image restoration, the major issue is the estimation of the PSF and noise spectra. The estimation of the PSF depends on the instrumentation and parameters of the imaging modality. For example, in the echo planar imaging (EPI) method of MR imaging, the image formation process can be described in a discrete representation by¹⁶:

g(x, y) = \sum_{x'=0}^{M-1} \sum_{y'=0}^{N-1} f(x', y')\, H(x', y'; x, y),   (22)

where g(x, y) is the reconstructed image of M × N pixels, f(x', y') is the ideal image of the object and H(x', y'; x, y) is the PSF of the image formation process in EPI. The MR signal s(k, l) at a location (k, l) in the k-space for the EPI method can be represented as:

s(k, l) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, A(x, y; k, l),   (23)

where

A(x, y; k, l) = e^{-2\pi j\left(\frac{kx}{M} + \frac{ly}{N} - \frac{\gamma}{2\pi} B_{x,y} T_{k,l}\right)},   (24)

where B_{x,y} is the spatially variant field inhomogeneity and T_{k,l} is the time between the sampling of the k-space location (k, l) and the RF excitation.

With the above representation, the PSF H(x', y'; x, y) can be obtained from the 2D inverse FFT of the function A(x, y; k, l) as:

H(x', y'; x, y) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} A(x, y; k, l)\, e^{2\pi j\left(\frac{kx'}{M} + \frac{ly'}{N}\right)}
               = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} e^{2\pi j\left(\frac{k(x'-x)}{M} + \frac{l(y'-y)}{N} + \frac{\gamma}{2\pi} B_{x,y} T_{k,l}\right)}.   (25)
8.4 CONSTRAINED LEAST SQUARE FILTERING

The constrained least square filtering method uses optimization techniques on a set of equations representing the image formation process. Equation 17 can be rewritten in matrix form as:

g = Hf + n,   (26)

where g is a column vector representing the reconstructed image g(x, y), f is a column vector of MN × 1 dimension representing the ideal image f(x, y), and n represents the noise vector. The PSF is represented by the matrix H of MN × MN elements.

For image restoration using the above equation, an estimate f̂ needs to be computed such that the mean-square error between the ideal image and the estimated image is minimized. The overall problem may not have a unique solution. Also, small variations in the matrix H may have a significant impact on the noise content of the restored image. To overcome these problems, regularization methods involving constrained optimization techniques are used. Thus, the optimization process is subjected to specific constraints, such as smoothness, to avoid noisy solutions for the vector f̂. The smoothness constraint can be derived from the Laplacian of the estimated image. Using the theory of random variables, the optimization process is defined to estimate f̂ such that the mean square error e², given by:

e^2 = \mathrm{Trace}\; E\{(f - \hat{f})(f - \hat{f})^t\},

is minimized subject to the smoothness constraint involving the minimization of the roughness, or Laplacian, of the estimated image as

\min\{\hat{f}^t [C]^t [C] \hat{f}\},

where

[C] = \begin{bmatrix}
-2 &  1 &        &        &        \\
 1 & -2 &  1     &        &        \\
   &  1 & -2     &  1     &        \\
   &    & \ddots & \ddots & \ddots \\
   &    &        &  1     & -2
\end{bmatrix}.   (27)

It can be shown that the estimated image f̂ can be expressed as⁴:

\hat{f} = \left([H]^t[H] + \frac{1}{\lambda}[C]^t[C]\right)^{-1} [H]^t g,   (28)

where λ is a Lagrange multiplier.
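For small problems, Eq. 28 can be evaluated directly with dense linear algebra. The sketch below works on a 1D signal so the matrices stay small; the boundary handling of the [C] matrix is an illustrative choice:

```python
import numpy as np

def second_difference_matrix(n):
    """Banded smoothness matrix [C] of Eq. 27 with the (1, -2, 1) stencil."""
    C = np.zeros((n, n))
    for i in range(n):
        C[i, i] = -2.0
        if i > 0:
            C[i, i - 1] = 1.0
        if i < n - 1:
            C[i, i + 1] = 1.0
    return C

def constrained_ls_restore(g, H, C, lam=1.0):
    """Constrained least square estimate of Eq. 28:
    f_hat = (H^t H + (1/lambda) C^t C)^(-1) H^t g."""
    A = H.T @ H + (1.0 / lam) * (C.T @ C)
    return np.linalg.solve(A, H.T @ g)
```

A large λ de-weights the smoothness term so the estimate follows the data; a small λ enforces a smoother, less noisy solution.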
8.4.1 Low-Pass Filtering

The ideal low-pass filter suppresses noise and high-frequency information, providing a smoothing effect to the image. A two-dimensional low-pass filter function H(u, v) is multiplied with the Fourier transform G(u, v) of the image to provide a smoothed image as:

\hat{F}(u, v) = H(u, v)\, G(u, v),   (29)

where F̂(u, v) is the Fourier transform of the filtered image f̂(x, y), which can be obtained by taking an inverse Fourier transform.

An ideal low-pass filter can be designed by assigning a frequency cutoff value ω_0. The frequency cutoff value can also be expressed as the distance D_0 from the origin in the Fourier (frequency) domain:

H(u, v) = \begin{cases} 1 & \text{if } D(u, v) \le D_0 \\ 0 & \text{otherwise} \end{cases},   (30)

where D(u, v) is the distance of a point in the Fourier domain from the origin representing the dc value.

An ideal low-pass filter has sharp cutoff characteristics in the Fourier domain, creating a rectangular window for the pass band. From Chapter 2, it can be shown that a rectangular function in the frequency domain corresponds to a sinc function in the spatial domain. Also, the multiplicative relationship of the filter model in Eq. 29 leads to a convolution operation in the spatial domain. The rectangular pass-band window of the ideal low-pass filter therefore causes ringing artifacts in the spatial domain. To reduce ringing artifacts, the pass band should have a smooth fall-off characteristic. A Butterworth low-pass filter of nth order can be used to provide smoother fall-off characteristics and is defined as:

H(u, v) = \frac{1}{1 + [D(u, v)/D_0]^{2n}}.   (31)

As the order n increases, the fall-off characteristics of the pass band become sharper. Thus, a first-order Butterworth filter provides the least amount of ringing artifacts in the filtered image.

A Gaussian function is also commonly used for low-pass filtering to provide smoother fall-off characteristics of the pass band and is defined by:

H(u, v) = e^{-D^2(u, v)/2\sigma^2},   (32)

where D(u, v) is the distance from the origin in the frequency domain and σ represents the standard deviation of the Gaussian function, which can be set to the cutoff distance D_0 in the frequency domain.
In this case, the gain of the filter is down to 0.607 of its maximum value at the cutoff frequency. Figure 8 shows a CT axial image of the chest cavity with its Fourier transform. The image was processed with a low-pass filter with the frequency response shown in the middle column of Fig. 8. The resultant low-pass filtered image with its Fourier transform is shown in the right column. It can be seen that low-frequency information is preserved while some of the high-frequency information is removed from the filtered image. The filtered image appears to be smoother.

Fig. 8. Left column: the original CT image with its Fourier transform; middle column: frequency response of the desired and actual low-pass filter; right column: the resultant low-pass filtered image with its Fourier transform.
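The Gaussian low-pass filter of Eq. 32 can be sketched in NumPy as follows; using `fftfreq` to build the distance D(u, v) from the dc term is an implementation choice tied to NumPy's unshifted FFT layout:

```python
import numpy as np

def gaussian_lowpass(image, d0):
    """Low-pass filter of Eq. 32 applied in the Fourier domain (Eq. 29),
    with sigma set to the cutoff distance D0."""
    M, N = image.shape
    u = np.fft.fftfreq(M) * M                 # frequency indices, dc at (0, 0)
    v = np.fft.fftfreq(N) * N
    D2 = u[:, None] ** 2 + v[None, :] ** 2    # squared distance from dc
    H = np.exp(-D2 / (2.0 * d0 ** 2))
    G = np.fft.fft2(image)
    return np.real(np.fft.ifft2(H * G))
```

Since H(0, 0) = 1, the mean gray level of the image is preserved while higher frequencies are progressively attenuated.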
8.4.2 High-Pass Filtering

High-pass filtering is used for image sharpening and extraction of high-frequency information such as edges. The low-frequency information is attenuated or blocked depending on the design of the filter. An ideal high-pass filter has a rectangular window function for the high-frequency pass band. Since the noise in the image usually carries high-frequency components, high-pass filtering passes the noise along with the edge information. An ideal 2D high-pass filter with a cutoff frequency at a distance D_0 from the origin in the frequency domain is defined as:

H(u, v) = \begin{cases} 1 & \text{if } D(u, v) \ge D_0 \\ 0 & \text{otherwise} \end{cases}.   (33)

As described above for the ideal low-pass filter, the sharp cutoff characteristic of the rectangular window function in the frequency domain as defined in Eq. 33 causes ringing artifacts in the filtered image in the spatial domain. To avoid ringing artifacts, filter functions with smoother fall-off characteristics, such as Butterworth and Gaussian, are used. A Butterworth high-pass filter of nth order is defined in the frequency domain as:

H(u, v) = \frac{1}{1 + [D_0/D(u, v)]^{2n}}.   (34)

Figure 9 shows a CT axial image of the chest cavity with its Fourier transform. The image was processed with a high-pass filter with the frequency response shown in the middle column of Fig. 9. The resultant high-pass filtered image with its Fourier transform is shown in the right column. It can be seen that the low-frequency information is attenuated or de-emphasized in the high-pass filtered image. High-frequency information belonging to the edges can be seen in the filtered image.
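A companion sketch for the Butterworth high-pass filter of Eq. 34; handling of the dc term, where D(u, v) = 0 and H tends to 0, is made explicit:

```python
import numpy as np

def butterworth_highpass(image, d0, n=1):
    """nth-order Butterworth high-pass filter of Eq. 34 applied in the
    Fourier domain. D(u, v) is the distance from the dc term."""
    M, N = image.shape
    u = np.fft.fftfreq(M) * M
    v = np.fft.fftfreq(N) * N
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    D[0, 0] = 1.0                       # placeholder, overwritten below
    H = 1.0 / (1.0 + (d0 / D) ** (2 * n))
    H[0, 0] = 0.0                       # D -> 0 gives H -> 0: dc is blocked
    G = np.fft.fft2(image)
    return np.real(np.fft.ifft2(H * G))
```

Because the dc component is removed, the output has zero mean: flat regions map to zero and only edge-like variations survive.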
Fig. 9. Left column: the original CT image with its Fourier transform; middle column: frequency response of the desired and actual high-pass filter; right column: the resultant high-pass filtered image with its Fourier transform.
8.5 CONCLUDING REMARKS

Image processing operations such as noise removal, averaging, filtering and feature enhancement are critically important in computerized image analysis for feature characterization, analysis and classification. These operations are also important in aiding visual examination and diagnostic evaluation in medical applications. Though the basic image processing operations described in this chapter are quite efficient and effective, more sophisticated model-based methods have been developed for image-specific feature enhancement operations. These methods utilize a priori information about the statistical distribution of gray-level features in the context of a specific application. Such methods are useful in enhancing the signal-to-noise ratio of the acquired image for better analysis and classification of medical images. Some of the recently developed image processing methods are described in various chapters in the second and third parts of this book.
References

1. Jain AK, Fundamentals of Digital Image Processing, Prentice Hall, 1989.
2. Gonzalez RC, Woods RE, Digital Image Processing, Prentice Hall, 2002.
3. Jain R, Kasturi R, Schunck BG, Machine Vision, McGraw-Hill, 1995.
4. Rosenfeld A, Kak AC, Digital Picture Processing, Vols. 1 & 2, 2nd edn., Academic Press, 1982.
5. Russ JC, The Image Processing Handbook, 2nd edn., CRC Press, 1995.
6. Schalkoff RJ, Digital Image Processing and Computer Vision, John Wiley & Sons, 1989.
7. Kao YH, MacFall JR, Correction of MR k-space data corrupted by spike noise, IEEE Trans Med Imag 19: 671–680, 2000.
8. Ahmed OA, Fahmy MM, NMR signal enhancement via a new time-frequency transform, IEEE Trans Med Imag 20: 1018–1025, 2001.
9. Goutte C, Nielson FA, Hansen LK, Modeling of hemodynamic response in fMRI using smooth FIR filters, IEEE Trans Med Imag 19: 1188–1201, 2000.
10. Zaroubi S, Goelman G, Complex denoising of MR data via wavelet analysis: Applications for functional MRI, Mag Reson Imag 18: 59–68, 2000.
11. Davis GW, Wallenslager ST, Improvement of chest region CT images through automated gray-level remapping, IEEE Trans Med Imag MI-5: 30–35, 1986.
12. Pizer SM, Zimmerman JB, Staab EV, Adaptive gray-level assignment in CT scan display, J Comput Assist Tomogr 8: 300–306, 1984.
13. Dhawan AP, LeRoyer E, Mammographic feature enhancement by computerized image processing, Comput Methods Programs Biomed 27: 23–29, 1988.
14. Kim JK, Park JM, Song KS, Park HW, Adaptive mammographic image enhancement using first derivative and local statistics, IEEE Trans Med Imag 16: 495–502, 1997.
15. Chen G, Avram H, Kaufman L, Hale J, et al., T2 restoration and noise suppression of hybrid MR images using Wiener and linear prediction techniques, IEEE Trans Med Imag 13: 667–676, 1994.
16. Munger P, Crelier GR, Peters TM, Pike GB, An inverse problem approach to the correction of distortion in EPI images, IEEE Trans Med Imag 19: 681–689, 2000.
17. Dhawan AP, Medical Image Analysis, Wiley-Interscience, John Wiley and Sons, Hoboken, NJ, 2003.
CHAPTER 9

Image Segmentation and Feature Extraction

Atam P Dhawan

Medical image segmentation tasks are important to visualize features of interest, such as lesions, with boundary and volume information. Similar information is required in computerized quantitative analysis and classification for diagnostic evaluation and characterization. This chapter presents some of the most effective and commonly used edge and region segmentation methods. Statistical quantitative features from the gray-level distribution, segmented regions, and texture in the image are also presented.
9.1 INTRODUCTION

After an image is processed for noise removal, restoration and feature enhancement as needed, it is important to analyze the image for extraction of features of interest, involving edges, regions, texture, etc., for further analysis. This goal is accomplished by the image segmentation task. Image segmentation refers to the process of partitioning an image into distinct regions by grouping together neighboring pixels based on a predefined similarity criterion. The similarity criterion can be determined using specific properties or features of pixels representing objects in the image. Thus, image segmentation can also be considered a pixel classification technique that allows an edge- or region-based representation towards the formation of regions of similarity in the image. Once the regions are defined, statistical and other features can be computed to represent the regions for characterization, analysis and classification. This chapter describes major image segmentation methods for medical image analysis and classification.
9.2 EDGE-BASED IMAGE SEGMENTATION

Edge-based approaches use spatial filtering methods to compute the first-order or second-order gradient information of the image. There are a number of gradient operators that can be used for edge-based segmentation. These operators include Roberts, Sobel, Laplacian, Canny and others.¹⁻⁵ Some involve directional derivative masks that are used to compute gradient information. The Laplacian mask can be used to compute second-order gradient information of the image. For segmentation purposes, after edges are extracted, an edge-linking algorithm is applied to form closed regions.¹⁻³ Gradient information of the image can be used to track and link relevant edges. This step is usually very tedious, for it needs to deal with the noise and irregularities in the gradient information.
9.2.1 Edge Detection Operations

The gradient magnitude and directional information from the Sobel horizontal and vertical direction masks can be obtained by convolving the respective G_x and G_y masks with the image as¹⁻²:

G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \qquad
G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}   (1)

M = \sqrt{G_x^2 + G_y^2} \approx |G_x| + |G_y|,

where M represents the magnitude of the gradient, which can be approximated as the sum of the absolute values of the horizontal and vertical gradient images obtained by convolving the image with the horizontal and vertical masks, G_x and G_y.
The second-order gradient operator, the Laplacian, can be computed by convolving the image with one of the following masks, G_{L(4)} and G_{L(8)}, which, respectively, use a 4- and 8-connected neighborhood:

G_{L(4)} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} \quad \text{or} \quad
G_{L(8)} = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}   (2)

The second-order derivative, the Laplacian, is very sensitive to noise, as can be seen from the distribution of weights in the masks in Eq. 2. The Laplacian mask provides a nonzero output even for single-pixel speckle noise in the image. Therefore, it is usually beneficial to apply a smoothing filter first before taking the Laplacian of the image. The image can be smoothed using Gaussian-weighted spatial averaging as the first step. The second step then uses a Laplacian mask to determine the edge information. Marr and Hildreth³ combined these two steps into a single Laplacian of Gaussian function as:

h(x, y) = \nabla^2 [g(x, y) \otimes f(x, y)]
        = \nabla^2 [g(x, y)] \otimes f(x, y),   (3)

where \nabla^2[g(x, y)] is the Laplacian of the Gaussian function that is used for spatial averaging and is commonly expressed as the Mexican Hat operator:

\nabla^2[g(x, y)] = \left(\frac{x^2 + y^2 - 2\sigma^2}{\sigma^4}\right) e^{-(x^2 + y^2)/2\sigma^2},   (4)

where σ² is the variance of the Gaussian function.
A Laplacian of Gaussian (LoG) mask for computing the second-order gradient information of the smoothed image can be computed from Eq. 4. With σ = 2, the LoG mask G_{LOG} of 5 × 5 pixels is given by:

G_{LOG} = \begin{bmatrix}
 0 &  0 & -1 &  0 &  0 \\
 0 & -1 & -2 & -1 &  0 \\
-1 & -2 & 16 & -2 & -1 \\
 0 & -1 & -2 & -1 &  0 \\
 0 &  0 & -1 &  0 &  0
\end{bmatrix}.   (5)

The image obtained by convolving the LoG mask with the original image is analyzed for zero crossings to detect edges, since the output image contains values ranging from negative to positive. One simple method to detect zero crossings is to threshold the output image at zero. This operation provides a new binary image such that a "0" gray value is assigned to the binary image if the output image has a negative or zero value for the corresponding pixel. Otherwise, a high gray value (such as "255" for an 8-bit image) is assigned to the binary image. The zero crossings of the output image can now be easily determined by tracking the pixels with a transition from black ("0" gray value) to white ("255" gray value).
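The binarize-and-track scheme just described can be sketched as follows; marking a crossing at whichever of the two neighboring pixels comes first is an illustrative convention:

```python
import numpy as np

def log_zero_crossings(log_output):
    """Detect zero crossings of a LoG-filtered image: binarize at zero,
    then mark black-to-white transitions between horizontal or vertical
    neighbors."""
    binary = np.where(log_output > 0, 255, 0)
    edges = np.zeros(binary.shape, dtype=bool)
    # a pixel is flagged when it differs from its right or lower neighbor
    edges[:, :-1] |= binary[:, :-1] != binary[:, 1:]
    edges[:-1, :] |= binary[:-1, :] != binary[1:, :]
    return edges
```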
9.2.1.1 Boundary Tracking

Edge detection operations are usually followed by edge-linking procedures to assemble meaningful edges into closed regions. Edge-linking procedures are based on a pixel-by-pixel search to find connectivity among the edge segments. The connectivity can be defined using a similarity criterion among edge pixels. In addition, geometrical proximity or topographical properties are used to improve edge-linking operations for pixels that are affected by noise, artifacts or geometrical occlusion. Estimation methods based on probabilistic approaches, graphs and rule-based methods for model-based segmentation have also been used.⁴⁻¹²
In the neighborhood search methods, the simplest approach is to follow the edge detection operation with a boundary-tracking algorithm. Let us assume that the edge detection operation produces an edge magnitude e(x, y) and an edge orientation φ(x, y). The edge orientation information can be directly obtained from the directional masks, as described in Chapter 6, or computed from the horizontal and vertical gradient masks. Let us start with a list of edge pixels that can be selected from scanning the gradient image obtained from the edge detection operation. Assuming the first edge pixel is a boundary pixel b_j, a successor boundary pixel b_{j+1} can be found in the 4- or 8-connected neighborhood if the following conditions are satisfied:

e(b_j) > T_1
e(b_{j+1}) > T_1
|e(b_j) - e(b_{j+1})| < T_2
|\phi(b_j) - \phi(b_{j+1})| \bmod 2\pi < T_3,   (6)

where T_1, T_2, and T_3 are predetermined thresholds.

If there is more than one neighboring pixel satisfying these conditions, the pixel that minimizes the differences is selected as the next boundary pixel. The algorithm is recursively applied until all neighbors are searched. If no neighbor is found satisfying these conditions, the boundary search for the starting edge pixel is stopped and a new edge pixel is selected. It can be noted that such a boundary-tracking algorithm may leave many edge pixels and partial boundaries unconnected. Some a priori knowledge about the object boundaries is often needed to form regions with closed boundaries. Also, relational tree structures or graphs can be used to help the formation of closed regions.¹³⁻¹⁴
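A greedy sketch of this tracker under the conditions of Eq. 6. The combined score used to break ties between candidates, and the stopping rule, are illustrative simplifications of the procedure described in the text:

```python
import numpy as np

def track_boundary(edge_mag, edge_dir, start, T1, T2, T3):
    """Follow 8-connected successors whose magnitude exceeds T1, whose
    magnitude difference is below T2 and whose orientation difference
    (mod 2*pi) is below T3, as in Eq. 6."""
    boundary = [start]
    visited = {start}
    current = start
    while True:
        candidates = []
        ci, cj = current
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = ci + di, cj + dj
                if (di, dj) == (0, 0) or (ni, nj) in visited:
                    continue
                if not (0 <= ni < edge_mag.shape[0] and 0 <= nj < edge_mag.shape[1]):
                    continue
                dmag = abs(edge_mag[ci, cj] - edge_mag[ni, nj])
                ddir = abs(edge_dir[ci, cj] - edge_dir[ni, nj]) % (2 * np.pi)
                if edge_mag[ni, nj] > T1 and dmag < T2 and ddir < T3:
                    candidates.append((dmag + ddir, (ni, nj)))
        if not candidates:
            return boundary  # no admissible successor: stop this boundary
        _, best = min(candidates)  # pixel minimizing the differences
        boundary.append(best)
        visited.add(best)
        current = best
```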
A graph-based search method attempts to find paths between the start and end nodes minimizing a cost function that may be established based on distance and transition probabilities. The start and end nodes are determined from scanning the edge pixels based on some heuristic criterion. For example, an initial search may label the first edge pixel in the image as the start node, and all the other edge pixels in the image, or a part of the image, as potential end nodes. Among several graph-based search algorithms, the A* algorithm is widely used.¹³⁻¹⁵
9.3 PIXEL-BASED DIRECT CLASSIFICATION METHODS

The pixel-based direct classification methods use histogram statistics to define single or multiple thresholds to classify an image pixel by pixel. The threshold for classifying pixels into classes is obtained from the analysis of the histogram of the image. A simple approach is to examine the histogram for a bimodal distribution. If the histogram is bimodal, the threshold can be set to the gray value corresponding to the deepest point in the histogram valley. If not, the image can be partitioned into two or more regions using some heuristics about the properties of the image. The histogram of each partition can then be used for determining thresholds. By comparing the gray value of each pixel to the selected threshold, a pixel can be classified into one of the two classes.

Let us assume that an image, or a part of the image, has a bimodal histogram of gray values. The image f(x, y) can be segmented into two classes using a gray value threshold T such that:

g(x, y) = \begin{cases} 1 & \text{if } f(x, y) > T \\ 0 & \text{if } f(x, y) \le T \end{cases}   (7)

where g(x, y) is the segmented image with two classes of binary gray values "1" and "0", and T is the threshold selected at the valley point of the histogram. A simple approach to determine the gray value threshold T is to analyze the histogram for the peak values and then find the deepest valley point between the two consecutive major peaks.
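A crude sketch of this peak-and-valley rule. The way the second peak is located (the most populated bin at least a fixed distance from the first) is an illustrative heuristic; real histograms usually need smoothing before the peaks and valley are reliable:

```python
import numpy as np

def valley_threshold(values, bins=64):
    """Pick the deepest valley between two major histogram peaks, then
    segment by Eq. 7 (1 above the threshold, 0 at or below it)."""
    hist, edges = np.histogram(values, bins=bins)
    p1 = np.argmax(hist)                       # first major peak
    far = np.abs(np.arange(bins) - p1) > bins // 8
    p2 = np.argmax(np.where(far, hist, -1))    # second peak, away from p1
    lo, hi = sorted((p1, p2))
    valley = lo + np.argmin(hist[lo:hi + 1])   # deepest point between peaks
    T = edges[valley + 1]
    return (values > T).astype(np.uint8), T
```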
9.3.1 Optimal Global Thresholding

To determine an optimal global gray value threshold for image segmentation, parametric distribution based methods can be applied to the histogram of an image.¹,²,⁵,¹⁵ Let us assume that the histogram of an image to be segmented has two Gaussian distributions belonging to two respective classes, such as background and object. Thus, the histogram can be represented by a mixture probability density function p(z) as:

p(z) = P_1 p_1(z) + P_2 p_2(z),   (8)

where p_1(z) and p_2(z) are the Gaussian distributions of class 1 and 2, respectively, with the class probabilities P_1 and P_2 such that:

P_1 + P_2 = 1.   (9)

Using a gray value threshold T, a pixel in the image f(x, y) can be classified to class 1 or class 2 in the segmented image g(x, y) as:

g(x, y) = \begin{cases} \text{Class 1} & \text{if } f(x, y) > T \\ \text{Class 2} & \text{if } f(x, y) \le T \end{cases}.   (10)
Let us define the error probabilities of misclassifying a pixel as:

E_1(T) = \int_{T}^{\infty} p_2(z)\,dz

and

E_2(T) = \int_{-\infty}^{T} p_1(z)\,dz,   (11)

where E_1(T) and E_2(T) are, respectively, the probability of erroneously classifying a class 2 pixel to class 1 and a class 1 pixel to class 2.

The overall probability of error in pixel classification using the threshold T is then expressed as:

E(T) = P_2 E_1(T) + P_1 E_2(T).   (12)
For image segmentation, the objective is to find an optimal threshold T that minimizes the overall probability of error in pixel classification. The optimization process requires the parameterization of the probability density distributions and the likelihood of both classes. These parameters can be determined from a model or a set of training images.¹,²,¹⁵,¹⁹,²⁴

Let us assume σ_i and μ_i to be the standard deviation and mean of the Gaussian probability density function of class i (i = 1, 2 for two classes) such that:

p(z) = \frac{P_1}{\sqrt{2\pi}\,\sigma_1} e^{-(z-\mu_1)^2/2\sigma_1^2} + \frac{P_2}{\sqrt{2\pi}\,\sigma_2} e^{-(z-\mu_2)^2/2\sigma_2^2}.   (13)
The optimal global threshold T can be determined by finding a general solution that minimizes Eq. 12 with the mixture distribution in Eq. 13, and thus satisfies the following quadratic expression²:

AT^2 + BT + C = 0,

where

A = \sigma_1^2 - \sigma_2^2
B = 2(\mu_1 \sigma_2^2 - \mu_2 \sigma_1^2)
C = \sigma_1^2 \mu_2^2 - \sigma_2^2 \mu_1^2 + 2\sigma_1^2 \sigma_2^2 \ln(\sigma_2 P_1 / \sigma_1 P_2).   (14)

If the variances of both classes can be assumed to be equal to σ², the optimal threshold T can be determined as:

T = \frac{\mu_1 + \mu_2}{2} + \frac{\sigma^2}{\mu_1 - \mu_2} \ln\left(\frac{P_2}{P_1}\right).   (15)

It should be noted that in the case of equally likely classes, the above expression for determining the optimal threshold reduces simply to the average of the mean values of the two classes. Figure 1 shows the results of the optimal thresholding method applied to a T2-weighted MR brain image. It can be seen that such a segmentation method is quite effective in determining the intracranial volume.

Fig. 1. Segmentation of a T2-weighted MR brain image (shown at the left) using the optimal thresholding method at T = 54, yielding the binary segmented image shown at the right.
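Given estimated class parameters, Eqs. 14 and 15 give T directly. A sketch, with the root-selection rule (keep the root between the two class means) as an illustrative choice:

```python
import numpy as np

def optimal_threshold(mu1, sigma1, P1, mu2, sigma2, P2):
    """Solve the quadratic of Eq. 14 for the optimal threshold T; with
    equal variances this reduces to the closed form of Eq. 15."""
    A = sigma1**2 - sigma2**2
    B = 2 * (mu1 * sigma2**2 - mu2 * sigma1**2)
    C = (sigma1**2 * mu2**2 - sigma2**2 * mu1**2
         + 2 * sigma1**2 * sigma2**2 * np.log(sigma2 * P1 / (sigma1 * P2)))
    if abs(A) < 1e-12:                       # equal variances: Eq. 15
        return (mu1 + mu2) / 2 + sigma1**2 / (mu1 - mu2) * np.log(P2 / P1)
    lo, hi = sorted((mu1, mu2))
    for r in np.roots([A, B, C]):
        if np.isreal(r) and lo <= r.real <= hi:
            return r.real                    # root separating the two classes
    return np.roots([A, B, C])[0].real       # fallback for degenerate cases
```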
9.3.2 Pixel Classification Through Clustering

In the histogram-based pixel classification method for image segmentation, the gray values are partitioned into two or more clusters, depending on the peaks in the histogram, to obtain thresholds. The basic concept of segmentation by pixel classification can be extended to clustering the gray values or feature vectors of pixels in the image. This approach is particularly useful when images whose pixels represent a feature vector consisting of multiple parameters of interest are to be segmented. For example, a feature vector may consist of gray value, contrast and local texture measures for each pixel in the image. A color image may have additional color components in a specific representation, such as the red, green and blue components in the RGB color coordinate system, that can be added to the feature vector. Magnetic resonance (MR) or multimodality medical images may also require segmentation using a multidimensional feature space with multiple parameters of interest.

Images can be segmented by pixel classification through clustering of all features of interest. The number of clusters in the multidimensional feature space thus represents the number of classes in the image. As the image is classified into cluster classes, segmented regions are obtained by checking the neighborhood pixels for the same class label. However, clustering may produce disjoint regions with holes, or regions with a single pixel. After the image data is clustered and pixels are classified, a postprocessing algorithm, such as region growing, pixel connectivity or a rule-based algorithm, is usually applied to obtain the final segmented regions.²¹,³⁷ There are a number of clustering algorithms developed in the literature and used for a wide range of applications.¹⁵,²⁰,²¹,³⁶⁻⁴¹

Clustering is the process of grouping data points with similar feature vectors together in a single cluster, while data points with dissimilar feature vectors are placed in different clusters. Thus, the data points that are close to each other in the feature space are clustered together. The similarity of feature vectors can be represented by an appropriate distance measure, such as the Euclidean or Mahalanobis distance.⁴² Each cluster is represented by its mean (centroid) and variance (spread) associated with the distribution of the corresponding feature vectors of the data points in the cluster. The formation of clusters is optimized with respect to an objective function involving prespecified distance and similarity measures along with additional constraints such as smoothness.
9.3.2.1 kMeans Clustering
The kmeans clustering is a popular approach to partition
ddimensional data into k clusters such that an objective function
providingthe desiredproperties of the distributionof feature vectors
of clusters in terms of similarity anddistance measures is optimized.
Ageneralizedkmeans clusteringalgorithminitiallyplaces k clusters
at arbitrarily selected cluster centroids v
i
; i = 1, . . . 2, k and modiﬁes
centroids for the formation of new cluster shapes optimizing the
objective function. The kmeans clustering algorithm includes the
following steps:
(1) Select the number of clusters k with initial cluster centroids v
i
;
i = 1, . . . 2, k.
(2) Partition the input data points into k clusters by assigning each
data point x
j
to the closest cluster centroid v
i
using the selected
distance measure, e.g. Euclidean distance, deﬁned as:
d
ij
= x
j
−v
i
, (16)
where X = {x
1
, x
2
, . . . , x
n
} is the input data set.
(3) Compute a cluster assignment matrix U representing the partition of the data points with the binary membership value of the jth data point to the ith cluster such that:

    U = [u_ij],

where

    u_ij ∈ {0, 1} for all i, j;   Σ_{i=1}^{k} u_ij = 1 for all j;   0 < Σ_{j=1}^{n} u_ij < n for all i.    (17)
(4) Recompute the centroids using the membership values as:

    v_i = ( Σ_{j=1}^{n} u_ij x_j ) / ( Σ_{j=1}^{n} u_ij )   for all i.    (18)

(5) If the cluster centroids or the assignment matrix do not change from the previous iteration, stop; otherwise go to step 2.

The k-means clustering method optimizes the sum-of-squared-error based objective function J_w(U, v) such that:

    J_w(U, v) = Σ_{i=1}^{k} Σ_{j=1}^{n} ||x_j − v_i||².    (19)
It can be noted from the above algorithm that the k-means clustering method is quite sensitive to the initial cluster assignment and the choice of the distance measure. Additional criteria such as within-cluster and between-cluster variances can be included in the objective function as constraints to force the algorithm to adapt the number of clusters k (as needed for optimization of the objective function).
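The five steps above can be sketched in a few lines of pure Python; the function name and the choice of Euclidean distance via `math.dist` are illustrative, not prescribed by the chapter:

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Iterate steps 2-5 of the k-means algorithm: assign each point to the
    nearest centroid (Eq. 16), then recompute the centroids (Eq. 18)."""
    for _ in range(max_iter):
        # Steps 2-3: the assignment matrix, stored as one cluster index per point.
        labels = [min(range(len(centroids)),
                      key=lambda i: math.dist(x, centroids[i]))
                  for x in points]
        # Step 4: recompute each centroid as the mean of its member points.
        new_centroids = []
        for i in range(len(centroids)):
            members = [x for x, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(tuple(sum(coord) / len(members)
                                           for coord in zip(*members)))
            else:  # keep the old centroid if a cluster becomes empty
                new_centroids.append(centroids[i])
        # Step 5: stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

On two well-separated 2D groups, the centroids converge to the group means and the labels form the expected partition.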
9.3.2.2 Fuzzy c-Means Clustering

The k-means clustering method utilizes hard binary values for the membership of a data point to the cluster. The fuzzy c-means clustering method utilizes an adaptable membership value that can be updated based on the distribution statistics of the data points assigned to the cluster, minimizing the following objective function J_m(U, v):

    J_m(U, v) = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m d_ij² = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m ||x_j − v_i||²,    (20)

where c is the number of clusters, n the number of data vectors, u_ij is the fuzzy membership, and m is the fuzziness index. Based on the constraints defined on the distribution statistics of the data points in the clusters, the fuzziness index can be defined between 1 and a very large value for the highest level of fuzziness (maximum allowable variance within a cluster). The membership values in the fuzzy c-means algorithm can be defined as36:

    0 ≤ u_ij ≤ 1 for all i, j;   Σ_{i=1}^{c} u_ij = 1 for all j;   0 < Σ_{j=1}^{n} u_ij < n for all i.    (21)

The algorithm described for k-means clustering can be used for fuzzy c-means clustering with the update of the fuzzy membership values as defined in Eq. 21, minimizing the objective function as defined in Eq. 20.
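The chapter does not spell out the update equations; the sketch below uses the standard fuzzy c-means updates from the clustering literature (memberships u_ij = 1 / Σ_k (d_ij/d_kj)^{2/(m−1)}, centroids weighted by u_ij^m), so treat those exact formulas as an assumption:

```python
import math

def fcm_step(points, centroids, m=2.0):
    """One fuzzy c-means iteration: update the memberships u_ij (satisfying
    the constraints of Eq. 21), then recompute the centroids with weights
    u_ij**m, consistent with the objective of Eq. 20."""
    c = len(centroids)
    u = []
    for x in points:
        # Distances to every centroid; a tiny floor avoids division by zero.
        d = [max(math.dist(x, v), 1e-12) for v in centroids]
        u.append([1.0 / sum((d[i] / d[k]) ** (2 / (m - 1)) for k in range(c))
                  for i in range(c)])
    new_centroids = []
    for i in range(c):
        w = [u[j][i] ** m for j in range(len(points))]
        total = sum(w)
        new_centroids.append(tuple(
            sum(wj * xj[dim] for wj, xj in zip(w, points)) / total
            for dim in range(len(points[0]))))
    return u, new_centroids
```

Each membership row sums to one, and for points sitting on distinct centroids the memberships are nearly hard, recovering k-means as a limiting case.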
Figure 2 shows the results of k-means clustering on a T2-weighted MR brain image with k = 9. Different regions segmented from selected clusters are shown in Fig. 2.

Fig. 2(A). A T2-weighted MR brain image used for segmentation in Fig. 2(B).

Fig. 2(B). Results of segmentation of the image shown in Fig. 2(A) using the k-means clustering algorithm with k = 9; top left: all segmented regions belonging to all 9 clusters; top middle: regions segmented from cluster k = 1; top right: regions segmented from cluster k = 4; bottom left: regions segmented from cluster k = 5; bottom middle: regions segmented from cluster k = 6; bottom right: regions segmented from cluster k = 9. (Courtesy Don Adams, Arwa Gheith and Valerie Rafalko from their class project.)
9.4 REGION-BASED SEGMENTATION

Region-growing-based segmentation algorithms examine pixels in the neighborhood based on a predefined similarity criterion and then assign pixels into groups to form regions. The neighborhood pixels with similar properties are merged to form closed regions for segmentation. The region growing approach can be extended to merging regions instead of merging pixels to form larger meaningful regions of similar properties. Such a region merging approach is quite effective when the original image is segmented into a large number of regions in the preprocessing phase. Large meaningful regions may provide a better correspondence and matching to the object models for recognition and interpretation. An alternate approach is region splitting, in which either the entire image or large regions are split into two or more regions based on a heterogeneity or dissimilarity criterion. For example, if a region has a bimodal distribution of its gray-value histogram, it can be split into two regions of connected pixels with gray values falling in their respective distributions. The basic difference between the region-based and thresholding-based segmentation approaches is that region-growing methods guarantee that the segmented regions consist of connected pixels. On the other hand, pixel-thresholding-based segmentation methods as defined in the previous section may yield regions with holes and disconnected pixels.
9.4.1 Region-growing

Region-growing methods merge pixels of similar properties by examining the neighborhood pixels. The process of merging pixels continues, with the growing region adapting a new shape and size, until there is an insufficient number of neighborhood pixels to be added to the current region. Thus, the region-growing process requires a similarity criterion that defines the basis for inclusion of pixels in the growth of the region, and a stopping criterion that stops the growth of the region. The stopping criterion is usually based on the minimum number or percentage of neighborhood pixels required to satisfy the similarity criterion for inclusion in the growth of the region.
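A minimal sketch of such a procedure, assuming a 4-connected neighborhood and a gray-value similarity criterion against the region's running mean (both illustrative choices; the chapter leaves the criteria open):

```python
def region_grow(image, seed, threshold):
    """Grow a region from a seed pixel: a 4-connected neighbor is merged when
    its gray value is within `threshold` of the region's running mean (the
    similarity criterion); growth stops when no neighbor qualifies."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    frontier = [seed]
    total = image[seed[0]][seed[1]]
    while frontier:
        r, c = frontier.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)
                if abs(image[nr][nc] - mean) <= threshold:
                    region.add((nr, nc))
                    total += image[nr][nc]
                    frontier.append((nr, nc))
    return region
```

Seeding inside a bright patch returns exactly the connected pixels of that patch, which is the connectedness guarantee discussed above.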
In the region-merging algorithms, an image may be partitioned into a large number of potential homogeneous regions. For example, an image of 1024 × 1024 pixels can be partitioned into regions of 8 × 8 pixels. Each region of 8 × 8 pixels can now be examined for homogeneity of a predefined property such as gray values, contrast, texture, etc. If the histogram of the predefined property for the region is unimodal, the region is said to be homogeneous. Two neighborhood regions can be merged if they are homogeneous and satisfy a predefined similarity criterion. The similarity criterion imposes constraints on the value of the property with respect to its mean and variance values. For example, two homogeneous regions can be merged if the difference in their mean gray values is within 10%
of the entire dynamic range and the difference in their variances is within 10% of the variance in the image. These thresholds may be selected heuristically or through probabilistic models.10,15 It is interesting to note that the above criterion can be easily implemented as a conditional rule in a knowledge-based system. Region-merging or region-splitting (described in the next section) methods have been implemented using a rule-based system for image segmentation.25
Model-based systems typically encode knowledge of anatomy and image acquisition parameters. Anatomical knowledge can be modeled symbolically, describing the properties and relationships of individual structures, or geometrically, either as masks or templates of anatomy, or using an atlas.18,19,25–26 Figure 3 shows an MR brain image and the segmented regions for ventricles. The knowledge of the anatomical locations of ventricles was used to establish initial seed points for region growing. A feature-adaptive region growing method was used for segmentation.
9.4.2 Region-splitting

Region-splitting methods examine the heterogeneity of a predefined property of the entire region in terms of its distribution and the mean, variance, minimum and maximum values. If the region is evaluated as heterogeneous, that is, it fails the similarity or homogeneity criterion, the original region is split into two or more regions. The region-splitting process continues until all regions satisfy the homogeneity criterion individually.

Fig. 3(A). A T2-weighted MR brain image used for ventricle segmentation using a region growing approach.26

Fig. 3(B). Segmented ventricle regions of the image shown in Fig. 3(A) using a model-based region growing algorithm.26

In the region-splitting process, the original region R is split into subregions R_1, R_2, …, R_n such that the following conditions are met2,5:
(1) Each region R_i; i = 1, 2, …, n is connected.
(2) ∪_{i=1}^{n} R_i = R.
(3) R_i ∩ R_j = ∅ for all i, j; i ≠ j.
(4) H(R_i) = TRUE for i = 1, 2, …, n.
(5) H(R_i ∪ R_j) = FALSE for i ≠ j,

where H(R_i) is a logical predicate for the homogeneity criterion on the region R_i.
Region-splitting methods can also be implemented by rule-based systems and quadtrees. In the quadtree-based region-splitting method, the image is partitioned into four regions that are represented by nodes in a quadtree. Each region is checked for homogeneity and evaluated for the logical predicate H(R_i). If the
region is homogeneous, no further action is taken for the respective
node. If the region is not homogeneous, it is further split into four
regions.
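The recursive quadtree procedure is compact enough to sketch directly; the uniform-gray predicate H used here is just one possible homogeneity test, chosen for illustration:

```python
def split(image, r0, c0, size, is_homogeneous):
    """Recursive quadtree region splitting: if the square region rooted at
    (r0, c0) fails the homogeneity predicate H, split it into four quadrants
    and recurse; homogeneous regions become leaf nodes."""
    block = [row[c0:c0 + size] for row in image[r0:r0 + size]]
    if size == 1 or is_homogeneous(block):
        return [(r0, c0, size)]  # leaf node: a homogeneous region
    half = size // 2
    leaves = []
    for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
        leaves.extend(split(image, r0 + dr, c0 + dc, half, is_homogeneous))
    return leaves

# Example predicate H(R): all gray values within the region are equal.
H = lambda block: len({v for row in block for v in row}) == 1
```

An image that is uniform except for one quadrant yields three large leaves plus the four single-pixel leaves produced by splitting the non-uniform quadrant.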
9.5 RECENT ADVANCES IN SEGMENTATION

The problem of segmenting medical images into anatomically and pathologically meaningful regions has been addressed using various approaches, including model-based estimation methods and rule-based systems.17–27 Nevertheless, automatic (or semiautomatic with minimal operator interaction) segmentation methods for specific applications are still current topics of research. This is due to the large variability in anatomical structures and the challenging need for reliable, accurate, and diagnostically useful segmentation. A rule-based low-level segmentation system for automatic identification of brain structures from MR images has been described by Raya.17 Neural network based classification approaches have also been applied for medical image segmentation.10,28 A multilevel adaptive segmentation method (MAS) was used to segment and classify multiparameter MR brain images into a large number of classes of physiological and pathological interest.24 The MAS method is based on estimation of signatures for each segmentation class for pixel-by-pixel classification.
9.6 IMAGE SEGMENTATION USING NEURAL NETWORKS

Neural networks provide another pixel classification paradigm that can be used for image segmentation.10,28–29 Neural networks do not require the underlying class probability distributions for accurate classification. Rather, the decision boundaries for pixel classification are adapted through an iterative training process. Neural network based segmentation approaches may provide good results for medical images with considerable variance in structures of interest. For example, angiographic images show a significant variation in arterial structures and therefore are difficult to segment. The variation
in image quality among various angiograms and the introduction of noise in the course of image acquisition emphasize the importance of an adaptive nonparametric segmentation method. Neural network paradigms such as Backpropagation, Radial Basis Function and Self-Organizing Feature Maps have been used to segment medical images.10,28–35

Neural networks learn from examples in the training set in which the pixel classification task has already been performed using manual methods. A nonlinear mapping function between the input features and the desired output for labeled examples is learned by neural networks without using any parameterization. After the learning process, a pixel in a new image can be classified for segmentation by the neural network.

It is important to select a meaningful set of features to provide as input to the neural network for classification. The selection of training examples is also very important, as they should represent a reasonably complete statistical distribution of the input data. The architecture of the network and the distribution of training examples play a major role in determining its performance for accuracy, generalization and robustness. In its simplest form, the input to a neural network can be the gray values of pixels in a predefined neighborhood in the image. Thus, the network can classify the center pixel of the neighborhood based on the information of the entire set of pixels in the corresponding neighborhood. As the neighborhood window is translated in the image, the pixels in the central locations of the translated neighborhoods are classified. Neural network architectures and learning methods for pattern classification, which can be used for pixel-based classification for image segmentation, are described in Chapter 10.28–35
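The sliding-window input described above can be made concrete. This sketch only builds the raw neighborhood feature vectors (one per classifiable center pixel); training a network on them is outside its scope, and the function name is our own:

```python
def neighborhood_vectors(image, size=3):
    """Slide a size x size window over the image and return, for each valid
    center pixel, the flattened gray values of its neighborhood -- the raw
    feature vector a pixel-classification network would receive."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    samples = []
    for r in range(half, rows - half):        # skip borders where the
        for c in range(half, cols - half):    # window would fall outside
            window = [image[r + dr][c + dc]
                      for dr in range(-half, half + 1)
                      for dc in range(-half, half + 1)]
            samples.append(((r, c), window))
    return samples
```

For a 4 × 4 image and a 3 × 3 window, only the four interior pixels have full neighborhoods, each described by a 9-element vector whose middle entry is the center pixel itself.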
9.7 FEATURE EXTRACTION AND REPRESENTATION

Gray-level statistics of the image, gray-level statistics and shape of the segmented regions, and texture can be used in the feature representation of the image for characterization, analysis and classification. Selection of correlated features for a specific classification task is very important. Details about clustering and classification are provided in Chapter 10 of this book. Various commonly used features in image analysis and classification are briefly described below.
9.7.1 Statistical Pixel-Level Image Features

Once the regions are segmented in the image, the gray values of pixels within the region can be used for computing the following statistical pixel-level (SPL) features1–2:

(1) The histogram of the gray values of pixels in the image:

    p(r_i) = n(r_i)/n,    (22)

where p(r_i) and n(r_i) are, respectively, the probability and number of occurrences of a gray value r_i in the region, and n is the total number of pixels in the region.

(2) The mean m of the gray values of the pixels in the image can be computed as:

    m = Σ_{i=0}^{L−1} r_i p(r_i),    (23)

where L is the total number of gray values in the image, taking values 0, 1, …, L − 1.

(3) The variance and central moments in the region can be computed as:

    μ_n = Σ_{i=0}^{L−1} p(r_i)(r_i − m)^n,    (24)

where the second central moment μ_2 is the variance of the region. The third and fourth central moments can be computed for n = 3 and n = 4, respectively. The third central moment is a measure of noncentrality, while the fourth central moment is a measure of the flatness of the histogram.

(4) Energy: The total energy E of the gray values of pixels in the region is given by:

    E = Σ_{i=0}^{L−1} [p(r_i)]².    (25)
(5) Entropy: The entropy Ent, as a measure of the information represented by the distribution of gray values in the region, is given by:

    Ent = −Σ_{i=0}^{L−1} p(r_i) log_2 [p(r_i)].    (26)

(6) Local contrast corresponding to each pixel can be computed by the difference of the gray value of the center pixel and the mean of the gray values of the neighborhood pixels. The normalized local contrast C(x, y) for the center pixel can also be computed as:

    C(x, y) = [P_c(x, y) − P_s(x, y)] / max{P_c(x, y), P_s(x, y)},    (27)

where P_c(x, y) and P_s(x, y) are the average gray-level values of the pixels corresponding to the "center" and the "surround" regions that are grown around the centered pixel through a region growing method.5,45

(7) Additional features such as the maximum and minimum gray values can also be used for representing regions.

(8) Features based on the statistical distribution of local contrast values in the region also provide useful characteristic information about the regions representing objects.

(9) Features based on the gradient information for the boundary pixels of the region are also an important consideration in defining the nature of edges. For example, fading edges with a low gradient form a characteristic feature of malignant melanoma and must be included in the classification analysis of images of skin lesions.9
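The first five SPL features are straightforward to compute from a list of region gray values; a pure-Python sketch of Eqs. 22 through 26 (function and variable names are our own):

```python
import math

def spl_features(pixels, levels):
    """Statistical pixel-level features of a region: normalized histogram
    (Eq. 22), mean (Eq. 23), variance (Eq. 24 with n = 2), energy (Eq. 25)
    and entropy (Eq. 26)."""
    n = len(pixels)
    p = [pixels.count(r) / n for r in range(levels)]               # Eq. 22
    mean = sum(r * p[r] for r in range(levels))                    # Eq. 23
    var = sum(p[r] * (r - mean) ** 2 for r in range(levels))       # Eq. 24
    energy = sum(pr ** 2 for pr in p)                              # Eq. 25
    entropy = -sum(pr * math.log2(pr) for pr in p if pr > 0)       # Eq. 26
    return p, mean, var, energy, entropy
```

A region split evenly between two gray levels gives the expected values: a flat histogram, energy 0.5 and entropy of exactly one bit.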
9.7.2 Shape Features

Shape features of the segmented region can also be used in classification analysis. The shape of a region is basically defined by the spatial distribution of the boundary pixels. A simple approach for computing shape features for a 2D region is representing circularity, compactness and elongatedness through the minimum bounded rectangle that covers the region.1–5
Several shape features using the boundary pixels of the segmented region can be computed as:

(1) Longest axis.
(2) Shortest axis.
(3) Perimeter and area of the minimum bounded rectangle.
(4) Elongation ratio.
(5) Perimeter p and area A of the segmented region.
(6) Hough transform of the region using the gradient information of the boundary pixels of the region1–5 (also described later in this chapter).
(7) Circularity (C = 1 for a circle) of the region computed as:

    C = 4πA/p².    (28)

(8) Compactness C_p of the region computed as:

    C_p = p²/A.    (29)

(9) Chain code for the boundary contour as obtained using a set of orientation primitives on the boundary segments derived from a piecewise linear approximation.
(10) Fourier descriptor of boundary contours as obtained using the Fourier transform of the sequence of boundary segments derived from a piecewise linear approximation.
(11) Central-moments-based shape features for the segmented region.
(12) Morphological shape descriptors as obtained through morphological processing on the segmented region.46–51
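Circularity and compactness (Eqs. 28 and 29) are one-liners once the perimeter p and area A are measured; a minimal sketch:

```python
import math

def circularity(area, perimeter):
    """Circularity C = 4*pi*A / p**2 (Eq. 28); equals 1 for a perfect circle
    and decreases for elongated or irregular regions."""
    return 4 * math.pi * area / perimeter ** 2

def compactness(area, perimeter):
    """Compactness Cp = p**2 / A (Eq. 29); minimized (4*pi) by a circle and
    larger for regions with convoluted boundaries."""
    return perimeter ** 2 / area
```

For an ideal circle of radius r (A = pi r^2, p = 2 pi r) the circularity is exactly 1 and the compactness exactly 4 pi, which makes a convenient sanity check for a discrete boundary-tracing implementation.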
9.7.3 Moments for Shape Description

The shape of a boundary or contour can be represented quantitatively by the central moments for matching. The central moments represent specific geometrical properties of the shape; they are invariant to translation, and the normalized moments and moment invariants derived from them below are additionally invariant to scaling and rotation. The central moments μ_pq of a segmented region or binary image f(x, y) are given by1,2,5,52:

    μ_pq = Σ_{i=1}^{L} Σ_{j=1}^{L} (x_i − x̄)^p (y_j − ȳ)^q f(x_i, y_j),

where

    x̄ = Σ_{i=1}^{L} Σ_{j=1}^{L} x_i f(x_i, y_j),   ȳ = Σ_{i=1}^{L} Σ_{j=1}^{L} y_j f(x_i, y_j).    (30)
For example, the central moment μ_21 represents the vertical divergence of the shape of the region, indicating the relative extent of the bottom of the region compared to the top. The normalized central moments can be computed as:

    η_pq = μ_pq/(μ_00)^γ,

where

    γ = (p + q)/2 + 1.    (31)

The seven invariant moments φ_1–φ_7 for shape matching are defined as52:
    φ_1 = η_20 + η_02
    φ_2 = (η_20 − η_02)² + 4η_11²
    φ_3 = (η_30 − 3η_12)² + (3η_21 − η_03)²
    φ_4 = (η_30 + η_12)² + (η_21 + η_03)²
    φ_5 = (η_30 − 3η_12)(η_30 + η_12)[(η_30 + η_12)² − 3(η_21 + η_03)²]
          + (3η_21 − η_03)(η_21 + η_03)[3(η_30 + η_12)² − (η_21 + η_03)²]
    φ_6 = (η_20 − η_02)[(η_30 + η_12)² − (η_21 + η_03)²] + 4η_11(η_30 + η_12)(η_21 + η_03)
    φ_7 = (3η_21 − η_03)(η_30 + η_12)[(η_30 + η_12)² − 3(η_21 + η_03)²]
          + (3η_12 − η_30)(η_21 + η_03)[3(η_30 + η_12)² − (η_21 + η_03)²].    (32)
The invariant moments are used extensively in the literature for shape matching and pattern recognition.1,52
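The first two invariant moments can be sketched directly from a binary mask. Note that, following common practice in the moment-invariant literature, this sketch normalizes the centroid by m_00 (x̄ = m_10/m_00), which is what makes the η_pq and φ values translation-invariant; treat that normalization convention as an assumption:

```python
def hu_phi1_phi2(img):
    """Compute the first two invariant moments of Eq. 32 for a binary image,
    via central moments (Eq. 30) and normalized moments eta_pq (Eq. 31)."""
    # Coordinates of all foreground pixels.
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    m00 = len(pts)
    xbar = sum(x for x, _ in pts) / m00   # centroid, normalized by m00
    ybar = sum(y for _, y in pts) / m00

    def eta(p, q):
        # mu_pq over region pixels, normalized by m00**((p + q)/2 + 1) (Eq. 31)
        mu = sum((x - xbar) ** p * (y - ybar) ** q for x, y in pts)
        return mu / m00 ** ((p + q) / 2 + 1)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2
```

Translating the same 2 × 2 block to a different image position leaves φ_1 unchanged, illustrating the invariance that makes these moments useful for shape matching.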
9.7.4 Texture Features

Texture is an important spatial property that can be used in region segmentation as well as description. There are three major approaches to representing texture: statistical, structural and spectral. Since texture is a property of the spatial arrangement of the gray values of pixels, the first-order histogram of gray values provides no information about the texture. Statistical methods representing the higher order distribution of gray values in the image are used for texture representation. The second approach uses structural methods, such as arrangements of prespecified primitives, in texture representation. For example, a repetitive arrangement of square and triangular shapes can produce a specific texture. The third approach is based on spectral analysis methods such as Fourier and wavelet transforms. Using spectral analysis, texture is represented by a group of specific spatio-frequency components.53,54
The gray-level co-occurrence matrix (GLCM) exploits the higher order distribution of gray values of pixels that are defined with a specific distance or neighborhood criterion. In the simplest form, the GLCM P(i, j) is the distribution of the number of occurrences of a pair of gray values i and j separated by a distance vector d = [dx, dy].

The GLCM can be normalized by dividing each value in the matrix by the total number of occurrences, providing the probability of occurrence of a pair of gray values separated by a distance vector. Statistical texture features are computed from the normalized GLCM as the second order histogram H(y_q, y_r, d), representing the probability of occurrence of a pair of gray values y_q and y_r separated by a distance vector d. Texture features can also be described by a difference histogram H_d(y_s, d), where y_s = y_q − y_r. H_d(y_s, d) indicates the probability that a difference in gray levels exists between
two distinct pixels. Commonly used texture features based on the
second order histogram statistics are as follows:
(1) Entropy of H(y_q, y_r, d), S_H:

    S_H = −Σ_{y_q=y_1}^{y_t} Σ_{y_r=y_1}^{y_t} H(y_q, y_r, d) log_10 [H(y_q, y_r, d)].    (33)

The entropy is a measure of texture nonuniformity. Lower entropy values indicate greater structural variation among the image regions.

(2) Angular Second Moment of H(y_q, y_r, d), ASM_H:

    ASM_H = Σ_{y_q=y_1}^{y_t} Σ_{y_r=y_1}^{y_t} [H(y_q, y_r, d)]².    (34)

The ASM_H indicates the degree of homogeneity among textures, and is also representative of the energy in the image.11 A lower value of ASM_H is indicative of finer textures.

(3) Contrast of H(y_q, y_r, d):

    Contrast = Σ_{y_q=y_1}^{y_t} Σ_{y_r=y_1}^{y_t} δ(y_q, y_r) H(y_q, y_r, d),    (35)

where δ(y_q, y_r) is a measure of intensity similarity defined by δ = (y_q − y_r)². Thus the contrast characterizes the extent of variation in pixel intensity.
(4) Inverse Difference Moment of H(y_q, y_r, d), IDM_H:

    IDM_H = Σ_{y_q=y_1}^{y_t} Σ_{y_r=y_1}^{y_t} H(y_q, y_r, d) / [1 + δ(y_q, y_r)],    (36)

where δ is defined as before. The IDM_H provides a measure of the local homogeneity among textures.

(5) Correlation of H(y_q, y_r, d):

    Cor_H = (1/(σ_{y_q} σ_{y_r})) Σ_{y_q=y_1}^{y_t} Σ_{y_r=y_1}^{y_t} (y_q − μ_{y_q})(y_r − μ_{y_r}) H(y_q, y_r, d),    (37)

where μ_{y_q}, μ_{y_r}, σ_{y_q} and σ_{y_r} are the respective means and standard deviations of y_q and y_r. The correlation can also be
expanded and written in terms of the marginal distributions of the second order histogram, which are defined as:

    H_m(y_q, d) = Σ_{y_r=y_1}^{y_t} H(y_q, y_r, d),   and
    H_m(y_r, d) = Σ_{y_q=y_1}^{y_t} H(y_q, y_r, d).    (38)

The correlation attribute is large for similar elements of the second order histogram.

(6) Mean of H_m(y_q, d), μ_Hm:

    μ_Hm = Σ_{y_q=y_1}^{y_t} y_q H_m(y_q, d).    (39)

The mean characterizes the nature of the gray-level distribution. Its value is typically small if the distribution is localized around y_q = y_1.
(7) Deviation of H_m(y_q, d), σ_Hm:

    σ_Hm = √{ Σ_{y_q=y_1}^{y_t} [ y_q − Σ_{y_r=y_1}^{y_t} y_r H_m(y_r, d) ]² H_m(y_q, d) }.    (40)

The deviation indicates the amount of spread around the mean of the marginal distribution. The deviation is small if the histogram is densely clustered about the mean.
(8) Entropy of H_d(y_s, d), S_Hd:

    S_Hd = −Σ_{y_s=y_1}^{y_t} H_d(y_s, d) log_10 [H_d(y_s, d)].    (41)

(9) Angular second moment of H_d(y_s, d), ASM_Hd:

    ASM_Hd = Σ_{y_s=y_1}^{y_t} [H_d(y_s, d)]².    (42)
(10) Mean of H_d(y_s, d), μ_Hd:

    μ_Hd = Σ_{y_s=y_1}^{y_t} y_s H_d(y_s, d).    (43)

The features computed using the difference histogram H_d(y_s, d) have the same significance as the attributes determined from the second order statistics.
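The second order statistics above all start from the normalized GLCM. A small pure-Python sketch of the matrix and a few of the features follows; the displacement vector and number of gray levels are illustrative parameters, and natural logs versus log base 10 vary across implementations (base 10 is used here to match Eqs. 33 and 41):

```python
import math

def glcm(image, d=(0, 1), levels=4):
    """Normalized gray-level co-occurrence matrix: P[i][j] is the probability
    of gray values i and j occurring at pixel pairs separated by vector d."""
    dy, dx = d
    rows, cols = len(image), len(image[0])
    P = [[0.0] * levels for _ in range(levels)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[image[r][c]][image[r2][c2]] += 1
                count += 1
    return [[v / count for v in row] for row in P]

def texture_features(P):
    """Entropy (Eq. 33), angular second moment (Eq. 34), contrast (Eq. 35)
    and inverse difference moment (Eq. 36) from a normalized GLCM."""
    n = len(P)
    ent = -sum(p * math.log10(p) for row in P for p in row if p > 0)
    asm = sum(p ** 2 for row in P for p in row)
    contrast = sum((i - j) ** 2 * P[i][j] for i in range(n) for j in range(n))
    idm = sum(P[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))
    return ent, asm, contrast, idm
```

For a two-level image with equal numbers of (0,0), (0,1) and (1,1) horizontal pairs, each nonzero GLCM entry is 1/3, giving ASM = 1/3, contrast = 1/3 and IDM = 5/6, which is easy to verify by hand.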
9.7.5 Hough Transform

The Hough transform is used to detect straight lines and other parametric curves such as circles, ellipses, etc.1,2,5 It can also be used to detect boundaries of an arbitrarily shaped object if the parameters of the object are known. The basic concept of the generalized Hough transform is that an analytical function such as a straight line, circle or closed shape, represented in the image space (spatial domain), has a dual representation in the parameter space. For example, the general equation of a straight line can be given as:

    y = mx + c,    (44)

where m is the slope and c is the y-intercept.

As can be seen from Eq. 44, the locus of points is described by two parameters, slope and y-intercept. Therefore, a line in the image space forms a point (m, c) in the parameter space. Likewise, a point in the image space forms a line in the parameter space. Therefore, a locus of points forming a line in the image space will form a set of lines in the parameter space, whose intersection represents the parameters of the line in the image space. If a gradient image is thresholded to provide edge pixels, each edge pixel can be mapped to the parameter space. The mapping can be implemented using bins of points in the parameter space. For each edge pixel of the straight line in the image space, the corresponding bin in the parameter space is updated. At the end, the bin with the maximum count represents the parameters of the straight line detected in the image.
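A toy accumulator over a discretized (m, c) space illustrates the voting scheme just described. The bin grids here are arbitrary illustrative choices, and practical implementations usually prefer the (ρ, θ) parameterization because the (m, c) form of Eq. 44 cannot represent vertical lines:

```python
def hough_lines(edge_pixels, slopes, intercepts):
    """Accumulate votes in a discretized (m, c) parameter space (Eq. 44):
    each edge pixel (x, y) votes for the nearest c bin of every slope m,
    since the line through (x, y) with slope m has intercept c = y - m*x."""
    acc = {}
    for x, y in edge_pixels:
        for m in slopes:
            c = y - m * x
            c = min(intercepts, key=lambda ci: abs(ci - c))  # nearest bin
            acc[(m, c)] = acc.get((m, c), 0) + 1
    best = max(acc, key=acc.get)  # bin with the maximum count
    return best, acc
```

Edge pixels sampled from y = 2x + 1 all vote for the same (m, c) = (2, 1) bin, so it collects one vote per pixel and wins the peak search.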
The concept can be extended to map and detect boundaries of a predefined curve. In general, the points in the image space become
hyperplanes in the N-dimensional parameter space, and the parameters of the object function in the image space can be found by searching for the peaks in the parameter space caused by the intersection of the hyperplanes.

To detect object boundaries using the Hough transform, it is necessary to create a parameter model of the object. The object model is transferred into a table called an R-table. The R-table can be considered as a one-dimensional array where each entry of the array is a list of vectors. For each point in the model description (MD), a gradient along with the corresponding vector extending from the boundary point to the centroid is computed. The gradient acts as an index into the R-table.

For object recognition, a 2D parameter space of possible x-y coordinate centers is initialized with the accumulator values associated with each location set to zero. An edge pixel from the gradient image is selected. The gradient information is indexed into the R-table. Each vector in the corresponding list is added to the location of the edge pixel. The endpoint of the vector should now point to a new edge pixel in the gradient image. The accumulator of the corresponding location in the parameter space is then incremented by one count. As each edge pixel is examined, the accumulator of the corresponding location receives the highest count. If the model object is considered to be translated in the image, the accumulator of the correct translation location would receive the highest count. To deal with rotation and scaling, the process must be repeated for all possible rotations and scales. Thus, the complete process could become very tedious if a large number of rotations and scales are examined. To avoid this complexity, simple transformations can be made in the R-table of the transformation.16
9.8 CONCLUDING REMARKS

Segmenting an image into regions of interest and extracting features from the image and its segmented regions are essential for analyzing and classifying the information represented in the image. In this chapter, commonly used edge and region segmentation methods are described. Features extracted from the gray-level distributions of the image and its regions, along with shape and texture features, are also presented. Depending on the contextual knowledge, a number of relational features with adjacency graphs and relational attributes can also be included in the analysis and classification of medical images. Model-based methods representing anatomical knowledge from standardized atlases can be introduced in the segmentation and feature analysis to help computerized classification and interpretation of medical images. Pattern classification methods are described in Chapter 10, while model-based registration and medical image analysis methods are described in various chapters in the second and third parts of this book. Recent developments in model-based medical image analysis include probabilistic and knowledge-based approaches and can be found in detail in the published literature.22–24,53–62 This trend of using multifeature analysis incorporating a priori and model-based knowledge is expected to continue in medical image analysis for diagnostic applications, as well as for understanding physiological processes linked with critical diseases and designing better treatment intervention protocols for better healthcare.
References
1. Jain R, Kasturi R, Schunck BG, Machine Vision, McGRawHill Inc, 1995.
2. Gonzalez RC, Woods RE, Digital Image Processing, Prentice Hall, 2nd
edn., 2002.
3. Marr D, Hildreth EC, Theory of edge detection, Proc R Soc Lond B 207:
187–217, 1980.
4. Haralick RM, Shapiro LG, Image segmentation techniques, Comp Vis
Graph Imag Process 7: 100–132, 1985.
5. DhawanAP, Medical Image Analysis, Wiley Interscience, JohnWiley and
Sons, Hoboken, NJ, 2003.
6. Stansﬁeld SA, ANGY: A rulebased expert system for automatic seg
mentation of coronary vessels from digital subtracted angiograms,
IEEE Trans Patt Anal Mach Intel 8: 188–199, 1986.
7. Ohlander R, Price K, ReddyDR, Picture segmentationusinga recursive
region splitting method, Comp Vis Graph Imag Process 8: 313–333, 1978.
8. Zucker S, Regiongrowing: Childhoodandadolescence, Comp Vis Graph
Imag Process 5: 382–399, 1976.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch09 FA
Image Segmentation and Feature Extraction
225
9. Dhawan AP, Sicsu A, Segmentation of images of skin lesions using
color and texture information of surface pigmentation, Comp Med Imag
Graph 16: 163–177, 1992.
10. Dhawan AP, Arata L, Segmentation of medical images through com
petitive learning, Comp Methods and Prog In Biomed 40: 203–215, 1993.
11. Raya SR, Lowlevel segmentation of 3D magnetic resonance brain
images, IEEE Trans Med Imag 9: 327–337, 1990.
12. Liang Z, Tissue classiﬁcation and segmentation of MR images, IEEE
Eng Med Biol Mag 12: 81–85, 1993.
13. Nilson NJ, Principles of Artiﬁcial Intelligence, Springer Verlag, 1982.
14. Winston PH, Artiﬁcial Intelligence, Addison Wesley, 3rd edn., 1992.
15. Dawant BM, Zijdenbos AP, Image segmentation, in Sonka M,
Fitzpatrick JM (eds.), Handbook of Medical Imaging, Vol 2: Medical Image
Processing and Analysis, SPIE Press, 2000.
16. Ballard DH, Generalizing the Hough transform to detect arbitrary
shapes, Pattern Recognition 13: 111–122, 1981.
17. Bomans M, Hohne KH, Tiede U, Riemer M, 3D segmentation of MR
images of the head for 3D display, IEEE Trans Medical Imaging 9: 177–
183, 1990.
18. Raya SR, Lowlevel segmentation of 3D magnetic resonance brain
images: A rule based system, IEEE Trans Med Imaging 9(1): 327–337,
1990.
19. Cline HE, Lorensen WE, Kikinis R, Jolesz F, Three-dimensional
segmentation of MR images of the head using probability and connectivity,
Journal of Computer Assisted Tomography 14: 1037–1045, 1990.
20. Clarke L, Velthuizen R, Phuphanich S, Schellenberg J, et al., MRI:
Stability of three supervised segmentation techniques, Magnetic Resonance
Imaging 11: 95–106, 1993.
21. Hall LO, Bensaid AM, Clarke LP, Velthuizen RP, et al., A comparison of
neural network and fuzzy clustering techniques in segmenting magnetic
resonance images of the brain, IEEE Trans on Neural Networks 3:
672–682, 1992.
22. Vannier M, Pilgram T, Speidel C, Neumann L, et al., Validation of
magnetic resonance imaging (MRI) multispectral tissue classiﬁcation,
Computerized Medical Imaging and Graphics 15: 217–223, 1991.
23. Choi HS, Haynor DR, Kim Y, Partial volume tissue classification of
multichannel magnetic resonance images — A mixed model, IEEE
Transactions on Medical Imaging 10: 395–407, 1991.
24. Zavaljevski A, Dhawan AP, Holland S, Ball W, et al., Multispectral MR
brain image classiﬁcation, Computerized Medical Imaging, Graphics and
Image Processing 24: 87–98, 2000.
25. Nazif AM, Levine MD, Low-level image segmentation: An expert
system, IEEE Trans Pattern Anal Mach Intell 6: 555–577, 1984.
26. Arata LK, Dhawan AP, Levy AV, Broderick J, et al., Three dimensional
anatomical model based segmentation of MR brain images through
principal axes registration, IEEE Trans Biomed Eng 42: 1069–1078, 1995.
27. Xu L, Jackowski M, Goshtasby A, Yu C, et al., Segmentation of skin
cancer images, Image and Vision Computing 17: 65–74, 1999.
28. Sarwal A, Dhawan AP, Segmentation of coronary arteriograms through
Radial Basis Function neural network, Journal of Computing and
Information Technology, 135–148, 1998.
29. Ozkan M, Dawant BM, Maciunas RJ, Neural-Network-Based Segmentation
of Multi-Modal Medical Images: A Comparative and Prospective
Study, IEEE Trans on Medical Imaging 12: 1993.
30. Xie XL, Beni G, A Validity Measure for Fuzzy Clustering, IEEE Trans
on Pattern Anal Mach Intell 13(8): 1991.
31. Bezdek JC, Pattern Recognition with Fuzzy Objective Function Algorithms,
Plenum, New York, 1981.
32. Chen C, Cowan CFN, Grant PM, Orthogonal least squares learning
for radial basis function networks, IEEE Trans On Neural Networks 2(2):
302–309, 1991.
33. Poggio T, Girosi F, Networks for approximation and learning, Proceed
ings of the IEEE 78(9): 1481–1497, 1990.
34. Jacobson IRH, Radial basis functions: A survey and new results, in
Handscomb DC (ed.), The Mathematics of Surfaces III, pp. 115–133,
Clarendon Press, 1989.
35. Sarwal A, Dhawan AP, Segmentation of coronary arteriograms through
Radial Basis Function neural network, Journal of Computing and
Information Technology, 135–148, 1998.
36. Xie XL, Beni G, A Validity Measure for Fuzzy Clustering, IEEE Trans
on Pattern Anal Mach Intell 13(8): 1991.
37. Loncaric S, Dhawan AP, Brott T, Broderick J, 3D image analysis of
intracerebral brain hemorrhage, Computer Methods and Programs in
Biomed 46: 207–216, 1995.
38. Broderick J, Narayan S, Dhawan AP, Gaskil M, et al., Ventricular
measurement of multifocal brain lesions: Implications for treatment trials
of vascular dementia and multiple sclerosis, Neuroimaging 6: 36–43,
1996.
39. Schmid P, Segmentation of digitized dermatoscopic images by two-dimensional
color clustering, IEEE Trans Med Imag 18: 164–171, 1999.
40. Pham DL, Prince JL, Adaptive fuzzy segmentation of magnetic resonance
images, IEEE Trans Med Imag 18: 737–752, 1999.
41. Kanungo T, Mount DM, Netanyahu NS, Piatko CD, et al., An efficient
k-means algorithm: Analysis and implementation, IEEE Trans on
Pattern Anal Mach Intell 24: 881–892, 2002.
42. Duda RO, Hart PE, Pattern Classification and Scene Analysis, Wiley, 1973.
43. Zurada JM, Introduction to Artificial Neural Systems, West Publishing Co,
1992.
44. Fahlman SE, Lebiere C, The cascade-correlation learning architecture,
Tech Report, School of Computer Science, Carnegie Mellon University,
1990.
45. Dhawan AP, LeRoyer E, Mammographic feature enhancement by
computerized image processing, Comp Methods & Programs in Biomed 27:
23–29, 1988.
46. Serra J, Image Analysis and Mathematical Morphology, Academic Press,
1982.
47. Sternberg S, Shapiro L, MacDonald R, Ordered structural shape matching
with primitive extraction by mathematical morphology, Pattern
Recognition 20: 75–90, 1987.
48. Maragos P, Pattern spectrum and multiscale shape representation, IEEE
Trans on Pattern Anal Mach Intell 11: 701–716, 1989.
49. Loncaric S, Dhawan AP, A morphological signature transform for shape
description, Pattern Recognition 26(7): 1029–1037, 1993.
50. Loncaric S, Dhawan AP, Brott T, Broderick J, 3D image analysis of
intracerebral brain hemorrhage, Computer Methods and Programs in
Biomed 46: 207–216, 1995.
51. Loncaric S, Dhawan AP, Optimal MST-based shape description via
genetic algorithms, Pattern Recognition 28: 571–579, 1995.
52. Flusser J, Suk T, Pattern recognition by affine moment invariants,
Pattern Recognition 26: 167–174, 1993.
53. Loew MH, Feature extraction, in Sonka M, Fitzpatrick JM (eds.), Handbook
of Medical Imaging, Vol. 2: Medical Image Processing and Analysis, SPIE
Press, 2000.
54. Dhawan AP, Chitre Y, Kaiser-Bonasso C, Moskowitz M, Analysis of
mammographic microcalcifications using gray-level image structure
features, IEEE Trans Med Imaging 15: 246–259, 1996.
55. Xu L, Jackowski M, Goshtasby A, Yu C, et al., Segmentation of skin
cancer images, Image and Vision Computing 17: 65–74, 1999.
56. Dhawan AP, Sicsu A, Segmentation of images of skin lesions using
color and texture information of surface pigmentation, Comp Med Imag
Graph 16: 163–177, 1992.
57. Staib LH, Duncan JS, Boundary finding with parametrically
deformable models, IEEE Trans Pattern Anal Mach Intel 14: 1061–1075,
1992.
58. Fan Y, Shen D, Gur RC, Gur RE, et al., COMPARE: Classification of
Morphological Patterns using Adaptive Regional Elements, IEEE
Transactions on Medical Imaging, 2006.
59. Grosbras MH, Laird AR, Paus T, Cortical regions involved in gaze
production, attention shifts and gaze perception, Hum Brain Mapp 25:
140–154, 2005.
60. Laird AR, Fox PM, Price CJ, Glahn DC, et al., ALE meta-analysis:
Controlling the false discovery rate and performing statistical contrasts,
Hum Brain Mapp 25: 155–164, 2005.
61. Zhang Y, Brady M, Smith S, Segmentation of brain MR images
through a hidden Markov random field model and the expectation-maximization
algorithm, IEEE Trans Med Imaging 20(1): 45–57, 2001.
62. Scherfler C, Schocke MF, Seppi K, Esterhammer R, et al., Voxel-wise
analysis of diffusion weighted imaging reveals disruption of the olfactory
tract in Parkinson's disease, Brain 129(Pt 2): 538–542, 2006.
CHAPTER 10
Clustering and Pattern Classiﬁcation
Atam P Dhawan and Shuangshuang Dai
Clustering is a method to arrange data points into groups or clusters
based on a predefined similarity criterion. Classification maps the
data points, or their representative features, into predefined classes
to aid the interpretation of the input data. Several clustering and
classification methods are available for computer-aided diagnostic or
decision-making systems in medical applications. This chapter reviews
some of these methods, using deterministic as well as fuzzy approaches
to data analysis.
10.1 INTRODUCTION
Image classification is an important task in computer-aided diagnosis.
An image, after any preprocessing needed to enhance features of interest,
is processed to extract features for further analysis. The computed
features are then arranged as a feature vector. Since features may
utilize different dynamic ranges of values, normalization may be
required before they are analyzed for classification into various
categories. For example, a mammography image may be processed to
extract features related to microcalcifications, e.g. the number of
microcalcification clusters, the number of microcalcifications in each
cluster, the size and shape of microcalcifications, the spatial
distribution of microcalcifications, spatial frequency and texture
information, the mean and variance of gray-level values of
microcalcifications, etc. These features are then used in a
classification method such as
a statistical pattern classifier, a Bayesian classifier, or a neural
network to classify the image into two classes: benign and malignant.
Let us review some terms commonly used in pattern classification.

Pattern: A pattern (feature vector, observation, or datum) χ is a vector
of measurements used by the clustering algorithm. It typically consists
of a vector of d measurements: χ = (x_1, ..., x_d).

Feature: A feature is defined as an individual scalar component x_i of a
pattern χ.

Dimensionality: The dimensionality d usually refers to the number of
variables in the pattern or feature vector.

Pattern Set: A pattern set is denoted ℵ = {χ_1, ..., χ_n}. The ith
pattern in ℵ is denoted χ_i = (x_{i,1}, ..., x_{i,d}). In many cases, a
pattern set to be clustered can be viewed as an n × d pattern matrix.

Class: A class, in the abstract, refers to a state of nature that
governs the pattern-generation process. More concretely, a class can be
viewed as a source of patterns whose distribution in feature space is
governed by a probability density specific to the class.

Clustering: Clustering is a specific method that attempts to group
patterns into various classes on the basis of a similarity criterion.

Hard clustering: Hard clustering techniques assign a class label l_i to
each pattern χ_i, using a deterministic similarity criterion or crisp
membership function.

Fuzzy clustering: Fuzzy clustering methods assign a class to each input
pattern χ_i based on a fuzzy membership criterion, with a fractional
degree of membership f_{ij} for each cluster j.

Distance measure: A distance measure is a metric on the feature space
used to quantify the similarity of patterns.
A traditional pattern classification system can be viewed as a mapping
from input variables, representing the raw data or a feature set, to an
output variable representing one of the categories or classes. To
obtain a reasonable dimensionality, it is usually advantageous to
[Figure 1: block diagram — Input Data → Feature Selection/Extraction →
Interpattern Similarity → Clustering and/or Classifier → Classification.]

Fig. 1. A typical classification system.
apply preprocessing transformations to the raw data before it is fed
into a classiﬁcation system. Preprocessing usually involves feature
extraction and/or feature selection to reduce the dimensionality to
a reasonable number. Feature selection is the process of identifying
the most effective subsets of the original features to be used in the
clustering. The selected features are expected to be correlated with
the classiﬁcation task for better results.
After the preprocessing and pattern (feature) representation are
established, interpattern similarity should be deﬁned on pairs of
patterns and it is often measured by a distance function. Finally, the
output of the clustering task is a set of clusters and it can be hard
(a deterministic partition of the data into clusters) or fuzzy where
each pattern has a variable degree of membership in each of the
output clusters. Figure 1 shows a schematic diagram of a typical
classiﬁcation system.
10.2 DATA CLUSTERING
Clustering is assigning data points or patterns (usually represented as
vectors of measurements in a multidimensional space) into groups or
clusters based on a predefined similarity measure. Intuitively,
patterns within a valid cluster are more similar to each other than
they are to a pattern belonging to a different cluster. Data clustering
is an efficient method to organize a large set of data for subsequent
classification. Except in certain advanced fuzzy clustering techniques,
each data point should belong to a single cluster, and no point should
be excluded from membership in the complete set of clusters.
Since similarity is fundamental to the definition of a cluster, a
measure of the similarity between two patterns drawn from the same
feature space is essential to most clustering procedures [1-10].
Because of the variety of feature types and scales, the proper choice
of distance measure is of great importance. It is common to calculate
the dissimilarity between two patterns using a distance measure defined
on the feature space. Euclidean distance is the most popular metric
[1, 2] and it is defined as:

$$ d_2(x_i, x_j) = \left( \sum_{k=1}^{d} (x_{i,k} - x_{j,k})^2 \right)^{1/2} = \left\| x_i - x_j \right\|_2. \tag{1} $$
It is noted that the Euclidean distance is actually a special case
(p = 2) of the Minkowski metric [1, 2]:

$$ d_p(x_i, x_j) = \left( \sum_{k=1}^{d} \left| x_{i,k} - x_{j,k} \right|^p \right)^{1/p} = \left\| x_i - x_j \right\|_p. \tag{2} $$
The Euclidean distance has an intuitive appeal, as it is commonly used
to evaluate the proximity of objects in two- or three-dimensional
space. It works well when a data set has "compact" or "isolated"
clusters [11]. The drawback to the direct use of the Minkowski metrics
is the tendency of the largest-scaled feature to dominate all others.
Solutions to this problem include normalization of the continuous
features or other weighting schemes. Linear correlation among features
can also distort distance measures. This distortion can be alleviated
by applying a whitening transformation to the data or by using the
squared Mahalanobis distance:
$$ d_M(x_i, x_j) = (x_i - x_j) A^{-1} (x_i - x_j)^T, \tag{3} $$

where A is the sample covariance matrix of the patterns.
In this process, d_M(x_i, x_j) assigns different weights to different
features based on their variances and pairwise linear correlations. It
is implicitly assumed here that the class-conditional densities are
unimodal and characterized by multidimensional spread, i.e. that the
densities are multivariate Gaussian. The regularized Mahalanobis
distance was used in [11] to extract hyperellipsoidal clusters.
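As a concrete illustration, the distance measures above can be sketched in pure Python (function names are illustrative; the Mahalanobis sketch takes the precomputed inverse A⁻¹ of the sample covariance matrix as an argument):

```python
def minkowski(x, y, p=2):
    """Minkowski distance of Eq. (2); p = 2 gives the Euclidean distance of Eq. (1)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def mahalanobis_sq(x, y, a_inv):
    """Squared Mahalanobis distance of Eq. (3); a_inv is the inverse of the
    sample covariance matrix A, given as a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    # quadratic form (x - y) A^{-1} (x - y)^T
    return sum(d[i] * a_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
```

With A equal to the identity matrix, the squared Mahalanobis distance reduces to the squared Euclidean distance, which is one quick sanity check on an implementation.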
Traditional clustering algorithms can be classified into two main
categories [1, 2]: hierarchical and partitional. In hierarchical
clustering, the number of clusters need not be specified a priori, and
problems due to initialization and local minima do not arise. However,
since hierarchical methods consider only local neighbors in each step,
they cannot incorporate a priori knowledge about the global shape or
size of clusters. As a result, they cannot always separate overlapping
clusters. Moreover, hierarchical clustering is static, and points
committed to a given cluster in the early stages cannot move to a
different cluster.
Partitional clustering obtains a single partition of the data instead
of a clustering structure, by optimizing a criterion function defined
either locally (on a subset of the patterns) or globally (over all of
the patterns). Partitional clustering can be further divided into two
classes: crisp clustering and fuzzy clustering. In crisp clustering,
every data point belongs to only one cluster, while in fuzzy clustering
every data point belongs to every cluster to a certain degree, as
determined by the membership function [3]. Partitional algorithms are
dynamic, and points can move from one cluster to another. They can
incorporate knowledge about the shape or size of clusters by using
appropriate prototypes and distance measures.
Hierarchical clustering is inflexible due to its greedy approach: after
a merge or a split is selected, it is not refined. Fisher [4] studied
iterative hierarchical cluster redistribution to improve already
constructed dendrograms. Karypis et al. [5] also researched refinements
for hierarchical clustering. The problem with partitional algorithms is
the initial guess of the number of clusters. A simple way to mitigate
the effects of cluster initialization was suggested by Bradley and
Fayyad [6]. First, k-means is performed on several small samples of the
data with a random initial guess. Each of the systems constructed this
way is then used as a potential initialization for a union of all the
samples. The centroids of the best such system are suggested as
intelligent initial guesses to ignite the k-means algorithm on the full
data. Zhang [7] suggested another way to rectify the optimization
process: soft assignment of points to different clusters with
appropriate weights, rather than moving them decisively from one
cluster to another. More recently, probabilistic models have been
proposed as a basis for cluster analysis. In this approach, the data
are viewed as coming from a mixture of probability distributions, each
representing a different cluster. Methods of this type have shown
promise in a number of practical applications [8-10].
10.2.1 Hierarchical Clustering with the Agglomerative Method
In hierarchical clustering, the number of clusters may not be specified
in advance. It builds a cluster hierarchy or, in other words, a tree of
clusters. Every cluster node contains child clusters; sibling clusters
partition the points covered by their common parent. Such an approach
allows exploring the data at different levels of granularity.
Hierarchical clustering methods are divided into agglomerative and
divisive [2, 10, 11]. An agglomerative clustering method may start with
one-point (singleton) clusters and recursively merge two or more
appropriate clusters. A divisive clustering method starts with one
cluster of all data points and recursively splits the most appropriate
cluster. The process continues until a stopping criterion is satisfied,
providing a reasonable number of clusters.
Hierarchical methods of cluster analysis permit a convenient graphical
display in which the entire sequence of merging (or splitting) is
shown. Because of its tree-like nature, the display is called a
dendrogram. The agglomerative method is usually chosen because it is
more widely used. One reason for its popularity is that the choice of
threshold during the merging process is not a major concern, as
illustrated in the details of the algorithm given below. Divisive
methods, in contrast, are more computationally intensive and suffer
from the difficulty of choosing potential allocations to clusters
during the splitting stages.
To merge or split subsets of points rather than individual points, the
distance between individual points has to be generalized to the
distance between subsets. Such a derived proximity measure is called a
linkage metric. The type of linkage metric used significantly affects
hierarchical algorithms, since it reflects a particular concept of
closeness and connectivity. Major inter-cluster linkage metrics include
single link, average link, and complete link [2, 10-13]. The underlying
dissimilarity measure (usually distance) is computed for every pair of
points, with one point in the first set and the other point in the
second set. A specific operation such as minimum (single link), average
(average link), or maximum (complete link) is applied to the pairwise
dissimilarity measures:

$$ d(C_1, C_2) = \operatorname{operation} \{ d(x, y) : x \in C_1, \ y \in C_2 \}. \tag{4} $$
For example, the SLINK algorithm [12], based on the single-link metric
representation, provides the Euclidean minimal spanning tree with
O(N²) computational complexity.
As described above, the agglomerative methods are based on measures of
distance between clusters. From a representation of single-point
clusters, the two clusters that are nearest and satisfy the similarity
criterion are merged to form a reduced number of clusters. This is
repeated until just one cluster is obtained. Let us suppose that n
sample (data) points are to be clustered; the initial number of
clusters will then be equal to n as well. Let us represent the data
vector D with n data points as D = {x(1), ..., x(n)}, and a function
D(C_i, C_j) as the distance measure between two clusters C_i and C_j.
An agglomerative algorithm for clustering can be defined as follows:
Algorithm (agglomerative hierarchical clustering)

Step 1: For i = 1, ..., n, let C_i = {x(i)}.
Loop: While there is more than one cluster left, do:
    Find the two clusters C_i and C_j that minimize the distance
    D(C_k, C_h) over all cluster pairs;
    Merge: C_i = C_i ∪ C_j;
    Remove cluster C_j.
End
In the above algorithm, a distance measure should be carefully chosen.
Normally, Euclidean distance is employed, which assumes some degree of
commensurability between the different variables. It makes less sense
if the variables are non-commensurate, that is, measured in different
units. A common strategy is to standardize the data by dividing the
sample value of each variable by its sample standard deviation, so that
all variables are equally important. Figure 2 shows a sample dendrogram
produced by the agglomerative hierarchical clustering method for a
given data set.

Fig. 2. A sample dendrogram.
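The agglomerative procedure above, using the single-link metric of Eq. (4), can be sketched in plain Python as follows (names are illustrative; `dist` is any pairwise distance function, and the loop stops at k clusters rather than one so the result is easy to inspect):

```python
def single_link(c1, c2, dist):
    """Single-link inter-cluster distance: minimum pairwise distance, Eq. (4)."""
    return min(dist(x, y) for x in c1 for y in c2)

def agglomerate(points, dist, k=1):
    """Repeatedly merge the closest pair of clusters until k clusters remain."""
    clusters = [[p] for p in points]          # Step 1: singleton clusters
    while len(clusters) > k:                  # Loop while more than k clusters
        i, j = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda ab: single_link(clusters[ab[0]],
                                              clusters[ab[1]], dist))
        clusters[i].extend(clusters[j])       # C_i = C_i ∪ C_j
        del clusters[j]                       # remove C_j
    return clusters
```

For instance, on the one-dimensional points [1, 2, 10, 11] with absolute difference as the distance, stopping at k = 2 recovers the two natural groups {1, 2} and {10, 11}.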
Linkage-metrics-based hierarchical clustering suffers from high time
complexity. Under reasonable assumptions, such as the reducibility
condition, linkage-metric methods have O(N²) complexity [10-14]. Chiu
et al. [15] proposed another hierarchical clustering algorithm using a
model-based approach in which maximum likelihood estimates were
introduced.
Traditional hierarchical clustering is inflexible due to its greedy
approach: after a merge or a split is selected, it is not refined. In
addition, since these methods consider only local neighbors in each
step, it is difficult to incorporate a priori knowledge about the
global shape or size of clusters. Moreover, hierarchical clustering is
static, in the sense that points assigned to a cluster in the early
stages cannot be moved to a different cluster in later stages.
10.2.2 Nonhierarchical or Partitional Clustering
A nonhierarchical or partitional clustering algorithm obtains a single
partition of the data instead of a hierarchical clustering
representation such as the dendrogram. Partitional methods have
advantages in applications involving large data sets for which the
construction of a dendrogram is computationally problematic. The
partitional techniques usually produce clusters by optimizing an
objective function defined either locally (on a subset of the patterns)
or globally (over all of the patterns).
10.2.2.1 K-Means Clustering Approach
K-means [2] is the simplest and most commonly used algorithm employing
a squared-error criterion, which is defined as:

$$ e^2(\aleph, \mathcal{L}) = \sum_{j=1}^{K} \sum_{i=1}^{n_j} \left\| x_i^{(j)} - c_j \right\|^2, \tag{5} $$

where $\mathcal{L}$ is a partition of the pattern set ℵ into K
clusters, $x_i^{(j)}$ is the ith pattern of the jth cluster, $c_j$ is
the centroid of the jth cluster, and $n_j$ is the number of patterns in
cluster j.
The K-means algorithm starts with a random initial partition and keeps
reassigning the patterns to clusters, based on the similarity between
each pattern and the cluster centers, until a convergence criterion is
met: e.g. there is no reassignment of any pattern from one cluster to
another, or the squared error ceases to decrease significantly after
some number of iterations. The K-means algorithm is popular because it
is easy to implement, with a computational complexity of O(N), where N
is the number of patterns.
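A minimal sketch of the K-means iteration just described (one-dimensional data and fixed initial centroids rather than a random partition, to keep the example deterministic; names are illustrative):

```python
def kmeans(points, centroids, iters=100):
    """Assign points to the nearest centroid, recompute centroids, repeat."""
    centroids = list(centroids)
    for _ in range(iters):
        # assignment step: index of the nearest centroid for each point
        labels = [min(range(len(centroids)),
                      key=lambda j: (p - centroids[j]) ** 2) for p in points]
        # update step: each centroid moves to the mean of its assigned points
        new = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new.append(sum(members) / len(members) if members else centroids[j])
        if new == centroids:      # convergence: no centroid moved
            break
        centroids = new
    return centroids, labels
```

On the points [1, 2, 10, 11] with initial centroids [0, 5], the algorithm converges to centroids 1.5 and 10.5, assigning the first two points to one cluster and the last two to the other.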
A major problem with this algorithm is that it is sensitive to the
selection of the initial partition and may converge to a local minimum
of the criterion function if the initial partition is not properly
chosen. Bradley and Fayyad [6] suggested a way to mitigate the effects
of cluster initialization. One variation of the K-means algorithm is to
permit the splitting and merging of the resulting clusters. Typically,
a cluster is split when its variance is above a prespecified threshold,
and two clusters are merged when the distance between their centroids
is below another prespecified threshold. Under such a scheme, it is
possible to obtain the optimal partition starting from any arbitrary
initial partition, provided proper threshold values are specified.
Another variation of the K-means algorithm involves selecting a
different criterion function altogether. Diday [16] and Symon [17]
described a dynamic clustering approach obtained by formulating the
clustering problem in the framework of maximum-likelihood estimation.
The regularized Mahalanobis distance was used by Mao and Jain [11] to
obtain hyperellipsoidal clusters.
Partitional clustering algorithms can be divided into two classes:
crisp (or hard) clustering and fuzzy clustering. Hard clustering is the
traditional approach, in which each pattern belongs to one and only one
cluster; hence, the clusters are disjoint. Fuzzy clustering extends
this notion to associate each pattern with every cluster using a
membership function [2]. Fuzzy set theory was initially applied to
clustering by Ruspini [28]. The most popular fuzzy clustering algorithm
is the fuzzy K-means (FCM) algorithm. A generalization of the FCM
algorithm was proposed by Bezdek [18] through a family of objective
functions. A fuzzy c-shells algorithm and an adaptive variant for
detecting circular and elliptical boundaries was presented by Dave
[19]. It was also extended in medical image analysis to segment
magnetic resonance images [20]. Even though it is better than the hard
K-means algorithm at avoiding local minima, FCM can still converge to
local minima of the squared-error criterion. The design of the
membership function is the most important problem in fuzzy clustering;
different choices include those based on similarity decomposition and
centroids of clusters.
10.2.3 Fuzzy Clustering
Conventional clustering and classification approaches assign a data
point to a cluster or class with a well-defined metric. In other words,
the membership of a data point in a cluster is deterministic and can be
represented by a crisp membership function. In many real-world
applications, setting up a crisp membership function for clustering or
classification often makes the result intuitively unreasonable. Using a
less deterministic approach, with probabilistic membership functions
providing fuzzy overlapping boundaries in the feature space, has
provided very useful results in many applications [18-20].
10.2.3.1 Fuzzy Membership Function
A fuzzy set is a set without a crisp boundary for its membership. If X
is a space of input data points denoted generically by x, then a fuzzy
set A in X is defined as a set of ordered pairs:

$$ A = \{ (x, \mu_A(x)) \mid x \in X \}, \tag{6} $$

where μ_A(x) is called the membership function (MF) for the fuzzy set A
and its value ranges from 0 to 1. In other words, a membership function
can be represented as a mapping function that provides each point in
the input space with a membership value (or degree of membership)
between 0 and 1. For example, the age of a person can be assigned to
predefined deterministic groups in intervals of 10 years, e.g. 21–30,
31–40, 41–50, etc. However, defining a "middle-aged" group of people is
quite subjective to individual perception. If we consider a range, say
between 40 and 50, as "middle-aged," a probabilistic membership
function can be determined to represent the degree of belongingness to
the group of middle-aged people.
A membership function may be expressed as the generalized Cauchy
distribution [18] as:

$$ \mu_A(x) = \operatorname{bell}(x; a, b, c) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}}, \tag{7} $$

where c is the median value of the range (for example, 45 in the
middle-aged group described above), and a and b are parameters that
adjust the width and sharpness of the curve. The membership function
for a = 15, b = 3, and c = 45 is shown in Fig. 3 as μ_A(x) =
bell(x; 15, 3, 45).
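Equation (7), with the parameters used in Fig. 3, can be sketched directly:

```python
def bell(x, a, b, c):
    """Generalized bell (Cauchy) membership function of Eq. (7)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def middle_aged(age):
    """The 'middle-aged' MF of Fig. 3: a = 15, b = 3, c = 45."""
    return bell(age, 15, 3, 45)
```

At the center, middle_aged(45) is exactly 1; at age 30, where |(x − c)/a| = 1, the membership drops to 0.5; and it decays smoothly toward 0 far from the center.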
Fig. 3. A plot of the "bell-shape" membership function bell(x; 15, 3, 45).

It can be noted that the definition of "middle-aged" as represented by
a membership function becomes more reasonable compared with a crisp
representation. If a person is between 40 and 50, the membership
function value is close to 1, which is considered middle-aged.
Extending this concept to three groups, "young," "middle-aged," and
"old," with three membership-function (MF) based representations, a
probabilistic interpretation of the age group can be obtained, as shown
in Fig. 4. A person 35 years of age is more likely to be considered
middle-aged than young, because the corresponding MF value is around
0.8 for the middle-aged group versus 0.2 for the young group.
Therefore, a particular age has three corresponding MF values in
different categories. As mentioned above, the three MFs completely
cover the value range of X, and the transition from one MF to another
is smooth and gradual.
10.2.3.2 Membership Function Formulation
Fig. 4. A plot of three bell MFs for "young," "middle-aged" and "old."

The parameterized functions can be used to define membership functions
(MFs) with different transition properties. For example, triangular,
trapezoidal, Gaussian, and bell-shape functions have different
transition curves, and therefore the corresponding probability function
provides different mappings to the data distribution. Further,
multidimensional MFs with a desired shape (triangular, Gaussian, bell,
etc.) may be needed to deal with multidimensional data. A
multidimensional Gaussian MF can be represented as:

$$ \mu_A(X) = \operatorname{gaussian}(X; M, K) = \exp \left[ -\frac{1}{2} (X - M)^T K^{-1} (X - M) \right], \tag{8} $$

where X and M are column vectors defined by X = [x_1, x_2, ..., x_n]^T
and M = [m_1, m_2, ..., m_n]^T = [E(x_1), E(x_2), ..., E(x_n)]^T, m_i
is the mean value of variable x_i, and K is the covariance matrix of
the variables x_i, defined as:

$$ K = \begin{bmatrix} \operatorname{var}(x_1) & \operatorname{cov}(x_1, x_2) & \cdots & \operatorname{cov}(x_1, x_n) \\ \operatorname{cov}(x_2, x_1) & \operatorname{var}(x_2) & \cdots & \operatorname{cov}(x_2, x_n) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{cov}(x_n, x_1) & \operatorname{cov}(x_n, x_2) & \cdots & \operatorname{var}(x_n) \end{bmatrix}. \tag{9} $$
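A minimal sketch of the multidimensional Gaussian MF of Eq. (8), assuming the inverse covariance matrix K⁻¹ has been precomputed and is passed in as a list of rows (names are illustrative):

```python
import math

def gaussian_mf(x, m, k_inv):
    """Multidimensional Gaussian MF, Eq. (8): exp(-1/2 (x-m)^T K^{-1} (x-m))."""
    d = [xi - mi for xi, mi in zip(x, m)]
    # quadratic form (x - m)^T K^{-1} (x - m)
    quad = sum(d[i] * k_inv[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    return math.exp(-0.5 * quad)
```

At the mean vector the membership is exactly 1, and it decays toward 0 as X moves away from M, with the covariance matrix shaping how fast the decay is along each direction.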
10.2.3.3 Fuzzy k-Means Clustering
The fuzzy k-means algorithm [18] is based on the minimization of an
appropriate objective function J, with respect to U, a fuzzy
K-partition of the data set, and to V, a set of K prototypes:

$$ J_q(U, V) = \sum_{j=1}^{N} \sum_{i=1}^{K} (u_{ij})^q \, d^2(X_j, V_i); \quad K \leq N, \tag{10} $$
where q is any real number greater than 1, X_j is the jth
m-dimensional feature vector, V_i is the centroid of the ith cluster,
u_ij is the degree of membership of X_j in the ith cluster, d²(X_j, V_i)
is any inner-product metric (distance between X_j and V_i), N is the
number of data points, and K is the number of clusters. The parameter q
is the weighting exponent for u_ij and controls the "fuzziness" of the
resulting clusters [18]. Fuzzy partitioning may be carried out through
an iterative optimization of the above objective function, as in the
following algorithm.
Step 1: Choose primary centroids V_i (prototypes).

Step 2: Compute the degree of membership of all feature vectors in all
the clusters:

$$ u_{ij} = \frac{\left( 1 / d^2(X_j, V_i) \right)^{1/(q-1)}}{\sum_{k=1}^{K} \left( 1 / d^2(X_j, V_k) \right)^{1/(q-1)}}. \tag{11} $$

Step 3: Compute new centroids $\hat{V}_i$:

$$ \hat{V}_i = \frac{\sum_{j=1}^{N} (u_{ij})^q X_j}{\sum_{j=1}^{N} (u_{ij})^q}, \tag{12} $$

and update the degrees of membership, u_ij, to $\hat{u}_{ij}$ according
to Eq. (11).

Step 4: If $\max_{ij} |u_{ij} - \hat{u}_{ij}| < \varepsilon$, stop;
otherwise go to Step 3,

where ε is a termination criterion between 0 and 1.
Computation of the degree of membership u_{ij} depends on the definition of the distance measure d^2(X_j, V_i),^{18} given by:

d^2(X_j, V_i) = (X_j − V_i)^T A (X_j − V_i).  (13)

The inclusion of A (an m × m positive-definite matrix) in the distance measure results in a weighting according to the statistical properties.^{2,18}
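The iteration of Eqs. 10–12 can be sketched in NumPy as below. This minimal version assumes A = I in Eq. 13 (plain squared Euclidean distance) and starts from a random membership matrix rather than initial centroids, an equivalent and common variant of Step 1:

```python
import numpy as np

def fuzzy_kmeans(X, K, q=2.0, eps=1e-4, max_iter=100, seed=0):
    """Fuzzy k-means (Eqs. 10-12). X is (N, m); returns memberships U (K, N)
    and centroids V (K, m). Assumes A = I in the distance of Eq. 13."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    U = rng.random((K, N))
    U /= U.sum(axis=0)                    # columns sum to 1: a fuzzy K-partition
    for _ in range(max_iter):
        Uq = U ** q
        V = (Uq @ X) / Uq.sum(axis=1, keepdims=True)          # Eq. 12
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(-1)   # d^2(X_j, V_i)
        d2 = np.maximum(d2, 1e-12)        # guard against division by zero
        inv = (1.0 / d2) ** (1.0 / (q - 1.0))
        U_new = inv / inv.sum(axis=0)                          # Eq. 11
        if np.max(np.abs(U - U_new)) < eps:                    # Step 4
            U = U_new
            break
        U = U_new
    return U, V
```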
Clustering and Pattern Classification
10.3 NEAREST NEIGHBOR CLASSIFIER
A popular statistical method for classification is the nearest neighbor classifier, which assigns a data point to the nearest class model in the feature space. The nearest neighbor classifier is a supervised method, as it uses labeled clusters of training samples in the feature space as models of the classes. Let us assume that there are C classes represented by c_j; j = 1, 2, …, C. An unknown feature vector f is to be assigned to the class that is closest to the class model developed by clustering the labeled feature vectors during training. A distance measure D_j(f) is defined by the Euclidean distance in the feature space as^2:

D_j(f) = ‖f − u_j‖,  (14)
where

u_j = \frac{1}{N_j} \sum_{f \in c_j} f,  j = 1, 2, …, C,

is the mean of the feature vectors for the class c_j, and N_j is the total number of feature vectors in the class c_j. The unknown feature vector is assigned to the class c_i if:

D_i(f) = \min_{j=1,…,C} [D_j(f)].  (15)
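Eqs. 14 and 15 amount to computing one mean vector per labeled class and assigning a new vector to the nearest mean; a minimal sketch with an illustrative toy training set:

```python
import numpy as np

def class_means(X, labels, C):
    """u_j of Eq. 14: mean feature vector of each class j = 0..C-1."""
    return np.array([X[labels == j].mean(axis=0) for j in range(C)])

def nearest_mean_classify(f, means):
    """Eq. 15: assign f to the class with minimum Euclidean distance D_j(f)."""
    D = np.linalg.norm(means - f, axis=1)
    return int(np.argmin(D))

# toy training set: two classes clustered around (0, 0) and (5, 5)
X = np.array([[0.1, 0.0], [-0.2, 0.1], [5.0, 4.9], [5.2, 5.1]])
labels = np.array([0, 0, 1, 1])
means = class_means(X, labels, C=2)
```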
A probabilistic approach can be applied to the classification task to incorporate a priori knowledge and improve performance. Bayesian and maximum-likelihood methods have been widely used in object recognition and classification for different applications. Let us assume that the probability of a feature vector f belonging to the class c_i is denoted by p(c_i|f). Let the average risk of wrong classification in assigning the feature vector to the class c_j be expressed by r_j(f) as:

r_j(f) = \sum_{k=1}^{C} Z_{kj} \, p(c_k|f),  (16)

where Z_{kj} is the penalty of classifying a feature vector to the class c_j when it belongs to the class c_k.
It can be shown that:

r_j(f) = \sum_{k=1}^{C} Z_{kj} \, p(f|c_k) P(c_k),  (17)

where P(c_k) is the probability of occurrence of the class c_k. A Bayes classifier assigns an unknown feature vector to the class c_i if:

r_i(f) < r_j(f),

or

\sum_{k=1}^{C} Z_{ki} \, p(f|c_k) P(c_k) < \sum_{q=1}^{C} Z_{qj} \, p(f|c_q) P(c_q)  for all j ≠ i; j = 1, 2, …, C.  (18)

Other versions of Bayesian classification applied to medical image classification can be found in many papers^{20–25} on radiological image analysis and computer-aided diagnosis.
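The risk computation of Eqs. 16–18 can be sketched directly. The penalty matrix Z and the likelihood values below are illustrative assumptions; with a zero-one penalty the rule reduces to choosing the class with the largest p(f|c_k)P(c_k):

```python
import numpy as np

def bayes_classify(likelihoods, priors, Z):
    """Minimum-risk classification (Eqs. 16-18).
    likelihoods[k] = p(f | c_k), priors[k] = P(c_k), and Z[k, j] is the
    penalty for deciding class j when the true class is k. Returns the
    class index minimizing r_j(f) = sum_k Z[k, j] p(f | c_k) P(c_k)."""
    r = Z.T @ (likelihoods * priors)   # risk r_j(f) for every class j
    return int(np.argmin(r))

# two classes with a zero-one penalty: Z[k, j] = 0 if k == j else 1
Z = np.array([[0.0, 1.0],
              [1.0, 0.0]])
```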
10.4 DIMENSIONALITY REDUCTION
As described above, the goal of clustering is to group the data or feature vectors into meaningful categories for better classification and decision making, without assigning a data vector to the wrong class. For example, in computer-aided analysis of mammograms, mammography image feature vectors may need to be classified into "benign" or "malignant" classes by a pattern classification system. A classification error may assign a normal patient to the "malignant" class (creating a false positive) or assign a cancer patient to the "benign" class (missing a cancer). If the data (or features) are assumed to be statistically independent, the probability of classification error decreases as the distance between the classes increases. This distance is defined as^{14}:

d^2 = \sum_{i=1}^{n} \frac{(\mu_{i1} − \mu_{i2})^2}{\sigma_i^2},  (19)
where μ_{i1} and μ_{i2} are the means of each feature for the two classes. Thus, the most useful features are those with large differences in mean as compared to their standard deviation. Performance should continue to improve with the addition of new features as long as the means for the two classes differ (thereby increasing d) and the number of observations is increased accordingly. Classifier performance may be degraded by unnecessary or noisy observations or by features that are not well correlated with the required classes. Therefore, it is useful to reduce the number of features to those that provide maximum separation among the required classes in the feature space. In addition, reducing the number of features may yield significant gains in computational efficiency. This process is usually called dimensionality reduction. Although a number of approaches have been investigated for dimensionality reduction and for improving classifier performance in the feature space, two useful approaches, based on principal component analysis (PCA) and genetic algorithms (GA), are described here.
10.4.1 Principal Component Analysis
Principal component analysis (PCA) is an efficient method for reducing the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variation present in the data set.^2 The goal here is to map vectors X_d in a d-dimensional space (x_1, x_2, …, x_d) onto vectors Z_M in an M-dimensional space (z_1, z_2, …, z_M), where M < d. Without loss of generality, we express the vector X as a linear combination of a set of d orthonormal vectors u_i:

X = \sum_{i=1}^{d} x_i u_i,  (20)

where the vectors u_i satisfy the orthonormality relation:

u_i^T u_j = δ_{ij}.  (21)
Therefore, the coefficients in Eq. 20 can be expressed as:

x_i = u_i^T X.  (22)

Let us suppose that only a subset of M < d of the basis vectors u_i is to be retained, so that only M coefficients x_i are used. In general, PCA does not retain a subset of the original set of basis vectors. It finds a new set of basis vectors that spans the original d-dimensional space, such that the data can be well represented by a subset of these new basis vectors. Here, v_i is used to denote the new basis vectors, which meet the orthonormality requirement. As above, only M coefficients x_i are used, and the remaining coefficients are replaced by constants b_i. Each vector X is then approximated by an expression of the form:

\tilde{X} = \sum_{i=1}^{M} x_i v_i + \sum_{i=M+1}^{d} b_i v_i,  (23)

x_i = v_i^T X.  (24)
We need to choose the basis vectors v_i and the coefficients b_i such that the approximation given by Eq. 23, with the values of x_i determined by Eq. 24, provides the best approximation of the original vector X on average over the whole data set.

The next step is to minimize the sum of squared errors over the whole data set. The sum-of-squares error can be written as:

E_M = \frac{1}{2} \sum_{i=M+1}^{d} v_i^T A v_i,  (25)

where A is the covariance matrix of the set of vectors X_n, defined as:

A = \sum_n (x_n − \bar{x})(x_n − \bar{x})^T.  (26)
Now the problem is converted to minimizing E_M with respect to the choice of basis vectors v_i. A minimum value is obtained when the basis vectors satisfy the condition:

A v_i = β_i v_i.  (27)

Thus, the v_i (i = M+1, …, d) are eigenvectors of the covariance matrix. Note that, since the covariance matrix is real and symmetric, its eigenvectors can indeed be chosen to be orthonormal. Finally, the minimum error takes the form:

E_M = \frac{1}{2} \sum_{i=M+1}^{d} β_i.  (28)
Therefore, the minimum error is achieved by rejecting the (d − M) smallest eigenvalues and their corresponding eigenvectors; the M largest eigenvalues are retained. Each of the associated eigenvectors v_i is called a principal component.

With a matrix representation, the singular value decomposition (SVD) algorithm can be employed to calculate the eigenvalues and their corresponding eigenvectors. The use of SVD has two important implications. First, it is computationally efficient; second, it provides additional insight into what PCA actually does. It also provides a way to represent the results of PCA graphically and analytically.
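The procedure above, centering the data, obtaining the eigenvectors of the covariance matrix through SVD, and retaining the M components with the largest eigenvalues, can be sketched as:

```python
import numpy as np

def pca_svd(X, M):
    """PCA by SVD. X is (N, d); returns (components (M, d), projections (N, M)).
    The rows of Vt are eigenvectors of the covariance matrix, and the squared
    singular values are proportional to its eigenvalues."""
    Xc = X - X.mean(axis=0)                 # center: x_n - x_bar (Eq. 26)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:M]                     # M principal components v_i
    return components, Xc @ components.T   # coefficients x_i = v_i^T X (Eq. 24)
```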
10.4.2 Genetic Algorithms Based Optimization
In nature, the features that characterize an organism determine its ability to endure in a competition for limited resources. These features are fixed by the building block of genetics, the gene. Genes form chromosomes, the genetic structures that ultimately define the survival capability of an organism. Thus, the most superior organisms survive and pass their genes on to future generations, while the genes of less fit individuals are eventually eliminated from the population.

Reproduction introduces diversity into a population of individuals through the exchange of genetic material. Repeated selection of the fittest individuals and recombination of chromosomes promote evolution in the gene pool of a species, creating even better population members.

A genetic algorithm (GA) is a robust optimization and search method based on the natural selection principles outlined above. Genetic algorithms provide improved performance by exploiting past information and promoting competition for survival. GAs generate a population of individuals through selection, and search for the fittest individuals through crossover and mutation. A fundamental feature of GAs is that they operate on a representation of the problem parameters, rather than manipulating the parameters themselves. These parameters are typically encoded as binary strings that are associated with a measure of goodness, or fitness value. As in natural evolution, GAs encourage the survival of the fittest through selection and recombination. Through the process of reproduction, individual strings are copied according to their degree of fitness. In crossover, strings are probabilistically mated by swapping all characters located after a randomly chosen bit position. Mutation is a secondary genetic operator that randomly changes the value of a string position to introduce variation into the population and recover lost genetic information.^{31,32}
GAs maintain a population of structures that are potential solutions to an objective function. Let us assume that features are encoded into binary strings that can be represented as A = a_1 a_2 … a_L, where L is the specified string length, or the number of representative bits. A simple genetic algorithm operates on these strings according to the following iterative procedure:

(1) Initialize a population of binary strings.
(2) Evaluate the strings in the population.
(3) Select candidate solutions for the next population and apply mutation and crossover operators to the parent strings.
(4) Allocate space for the new strings by removing members from the population.
(5) Evaluate the new strings and add them to the population.
(6) Repeat steps 3–5 until the stopping criterion is satisfied.
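The six steps can be sketched as a minimal generational GA. The fraction-of-ones fitness function, population size, and operator rates below are illustrative assumptions (the same bit-counting objective is used in the Table 1 example later in this section):

```python
import random

def simple_ga(L=10, pop_size=8, generations=30, pc=0.5, pm=0.05, seed=1):
    """Minimal generational GA: proportionate selection, single-point
    crossover, and bit mutation, maximizing the fraction of 1 bits."""
    rng = random.Random(seed)
    fitness = lambda s: sum(s) / len(s)          # illustrative objective
    # step 1: initialize a population of binary strings
    pop = [[rng.randint(0, 1) for _ in range(L)] for _ in range(pop_size)]
    for _ in range(generations):
        # steps 2-3: evaluate strings and select parents (roulette wheel)
        weights = [fitness(s) + 1e-9 for s in pop]
        pop = [list(s) for s in rng.choices(pop, weights=weights, k=pop_size)]
        # crossover: pair consecutive strings (random pairing omitted for
        # brevity) and swap tails after a random cut among the L-1 positions
        for a, b in zip(pop[::2], pop[1::2]):
            if rng.random() < pc:
                cut = rng.randint(1, L - 1)
                a[cut:], b[cut:] = b[cut:], a[cut:]
        # mutation: invert each bit with probability pm
        for s in pop:
            for i in range(L):
                if rng.random() < pm:
                    s[i] ^= 1
    # steps 4-6 are folded into the generational replacement above
    return max(pop, key=fitness)
```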
Detailed knowledge of the encoding mechanism, the objective function, the selection procedure, and the genetic operators, crossover and mutation, is essential for a firm understanding of the above procedure as applied to a specific problem. These components are considered below.

The structure of the GA is based on the encoding mechanism used to represent the variables in the given optimization problem.
The candidate solutions may encode any number of variable types, including continuous, discrete, and Boolean variables. Although alternate string codings exist,^{31,32} only a simple binary encoding mechanism is considered within the scope of this chapter. Thus, the allele of a gene in the chromosome indicates whether or not a feature is significant in microcalcification description. The objective function evaluates each chromosome in a population to provide a measure of the fitness of a given string. Since the value of the objective function can vary widely between problems, a fitness function is used to normalize the objective function to the range 0 to 1. The selection scheme uses this normalized value, or fitness, to evaluate a string.
One of the most basic reproduction techniques is proportionate selection, which is carried out by the roulette wheel selection scheme. In roulette wheel selection, each chromosome is given a segment of a roulette wheel whose size is proportionate to the chromosome's fitness. A chromosome is reproduced if a randomly generated number falls in the chromosome's corresponding roulette wheel slot. Thus, since fitter chromosomes are allocated larger wheel portions, they are more likely to generate offspring after a spin of the wheel. The process is repeated until the population for the next generation is completely filled. However, due to sampling errors, the population must be very large in order for the actual number of offspring produced for an individual chromosome to approach the expected value for that chromosome.
In proportionate selection, a string is reproduced according to how its fitness compares with the population average, in other words, as f_i/\bar{f}, where f_i is the fitness of the string and \bar{f} is the average fitness of the population. This proportionate expression is also known as the selective pressure on an individual. The mechanics of proportionate selection can be expressed as follows: A_i receives more than one offspring on average if f_i > \bar{f}; otherwise, A_i receives less than one offspring on average. Since the result of applying the proportionate fitness expression will in general be a fraction, this value represents the expected number of offspring allocated to each string, not the actual number.
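The expected offspring count f_i/f̄ is straightforward to compute; for the initial fitness values of Table 1 (0.2, 0.5, 0.5, 0.8) it yields 0.4, 1.0, 1.0, and 1.6, consistent with the allocation of 0, 1, 1, and 2 offspring described there:

```python
def expected_offspring(fitnesses):
    """Selective pressure f_i / f_bar: expected offspring count per string."""
    f_bar = sum(fitnesses) / len(fitnesses)
    return [f / f_bar for f in fitnesses]
```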
Once the parent population is selected through reproduction, the offspring population is created by applying the genetic operators. The purpose of recombination, also referred to as crossover, is to discover new regions of the search space, rather than relying on the same population of strings. In recombination, strings are randomly paired and selected for crossover. If the crossover probability condition is satisfied, then a crossover point along the length of the string pair is randomly chosen. The offspring are generated by exchanging the portions of the parent strings beyond the crossover position. For a string of length l, the l − 1 possible crossover positions are chosen with equal probability.

Mutation is a secondary genetic operator that preserves the random nature of the search process and regenerates fit strings that may have been destroyed or lost during crossover or reproduction. The mutation rate controls the probability that a bit value will be changed. If the mutation probability condition is satisfied, then the selected bit is inverted.
An example of a complete cycle of the simple genetic algorithm is shown in Table 1.^{31} The initial population contains four strings composed of ten bits each. The objective function counts the number of 1's in a chromosome, and the fitness function normalizes this value to lie in the range 0 to 1.

The proportional selection scheme allocates 0, 1, 1, and 2 offspring to the initial strings, in their respective order. After selection, the offspring are randomly paired for crossover, so that strings 1 and 3 and strings 2 and 4 are mated. However, since the crossover rate is 0.5, only strings 1 and 3 are selected for crossover; the other strings are left intact. The paired chromosomes then exchange their genetic material after the fifth bit position, which is the randomly selected crossover point. The final step in the cycle is mutation. Since the mutation rate is selected to be 0.05, only two bits out of the forty present in the population are mutated: the second bit of string 2 and the fourth bit of string 4 are randomly selected for mutation. As can be seen from the table, the average fitness of population P_4 is significantly better than the initial fitness after only one generational cycle.
Table 1. A Sample Generational Cycle of the Simple Genetic Algorithm

                                     Chromosome    Fitness Value   Average Fitness
Population P_1 (initial population)  0001000010    0.2             0.50
                                     0110011001    0.5
                                     1010100110    0.5
                                     1110111011    0.8
Population P_2 (after selection)     0110011001    0.5             0.65
                                     1010100110    0.5
                                     1110111011    0.8
                                     1110111011    0.8
Population P_3 (after crossover)     0110011011    0.6             0.65
                                     1010100110    0.5
                                     1110111001    0.7
                                     1110111011    0.8
Population P_4 (after mutation)      0110011011    0.6             0.70
                                     1110100110    0.6
                                     1110111001    0.7
                                     1111111011    0.9
Although roulette wheel selection is the simplest method for implementing proportionate reproduction, it is highly inefficient, since it requires n spins of the wheel to fill a population with n members. Stochastic universal selection (SUS) is an efficient alternative to roulette wheel selection. SUS also uses a weighted roulette wheel, but adds equally spaced markers along the outside rim of the wheel. The wheel is spun only once, and each individual receives as many copies of itself as there are markers in its slot.^{32}

The average fitness value in the initial stages of a GA is typically low. Thus, during the first few generations the proportionate selection scheme may assign a large number of copies to a few strings with relatively superior fitness, known as super individuals. These strings will eventually dominate the population and cause the GA to converge prematurely. The proportionate selection procedure also suffers from decreasing selective pressure during the final generations, when the average fitness value is high. Scaling techniques and ranking selection can help alleviate the problems of inconsistent selective pressure and domination by superior individuals.
In linear scaling, the fitness value is adjusted as:

f′ = af + b,  (29)

where f is the original fitness value and f′ is the scaled fitness value. The coefficients a and b are chosen so that the fittest individuals do not receive too many copies while average individuals typically receive one copy. These coefficients should also be adjusted to avoid negative fitness values.
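Equation 29 leaves a and b unspecified; one common choice (an assumption here, not stated in the text) maps the average fitness to itself, so an average individual keeps one expected copy, and maps the best fitness to a fixed multiple of the average:

```python
def linear_scale(fitnesses, c_mult=2.0):
    """Linear scaling f' = a*f + b (Eq. 29), with a and b chosen so that
    f'_avg = f_avg and f'_max = c_mult * f_avg. Negative scaled values are
    still possible for very weak individuals and would need clamping."""
    f_avg = sum(fitnesses) / len(fitnesses)
    f_max = max(fitnesses)
    if f_max == f_avg:                     # degenerate case: all fitnesses equal
        return list(fitnesses)
    a = (c_mult - 1.0) * f_avg / (f_max - f_avg)
    b = f_avg * (1.0 - a)
    return [a * f + b for f in fitnesses]
```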
Ranking selection techniques assign offspring to individuals by qualitatively comparing levels of fitness. The population is sorted according to fitness values, and offspring are allotted on the basis of rank. In ranking selection, subsequent populations are not influenced by the balance of the current fitness distributions, so the selective pressure is uniform. Each cycle of the simple GA produces a completely new population of offspring from the previous generation, a scheme known as generational replacement. Thus, the simple GA is naturally slower in exploiting useful areas of the search space for a large population. Steady-state replacement is an alternative method that typically replaces one or more of the worst members of the population in each generation. Steady-state replacement can be combined with an elitist strategy, which retains the best strings in the population.^{32}
GAs are efficient global optimization techniques that are well suited to searching nonlinear, multidimensional problem spaces.^{32} The most widely accepted theory on the operation of the GA search mechanism in global optimization is the Schema Theorem. This theorem states that the search for the fittest individuals is guided by exploiting similarities among the superior strings in a population. These similarities are described by schemata, which are composed of strings with identical alleles at the same positions on each string. The order of a particular schema is the number of fixed positions among the strings, and the defining length is the distance between the first and last fixed positions on a string. Schemata with superior fitness, low order, and small defining length increase in number with each passing generation.
From a set of coded parameters, GAs use a population of points to search for the optimal solution, not just a single point in the search space. The GA thus has a high probability of discovering the globally optimal solution in a multimodal search space, since it is less likely to be trapped by false optima. This ability is a tremendous advantage over traditional methods in more complex problems.
10.5 NONPARAMETRIC CLASSIFIERS
Artificial neural network based classifiers have been explored extensively in the literature for nonparametric classification, using a set of training vectors that provides the relationship between input features or measurements and output classes. Such classification methods do not require any prior probabilistic model of the class distributions of the input vectors; they learn this relationship during training. Though a number of such classifiers have been used for different applications, the more common networks, such as backpropagation and radial basis function neural networks, are described here.^{33,34}
10.5.1 Backpropagation Neural Network for Classiﬁcation
The backpropagation network is the most commonly used neural network in signal processing and classification applications. It uses a set of interconnected neural elements that process information in a layered manner. A computational neural element, also called a perceptron, provides an output as a thresholded weighted sum of all its inputs. The basic function of the neural element, as shown in Fig. 5, is analogous to the synaptic activity of a biological neuron. In a layered network structure, the neural element may receive its input from an input vector or from other neural elements. A weighted sum of these inputs constitutes the argument of a nonlinear activation function, such as a sigmoidal function. The resulting thresholded value of the activation function is the output of the neural element. The output is distributed along weighted connections to other neural elements.
Fig. 5. A computational neuron model with linear synapses.
In order to learn a specific pattern of input vectors for classification, an iterative learning algorithm, such as the LMS algorithm, often called the Widrow-Hoff delta rule,^{34} is used with a set of preclassified training examples that are labeled with the input vectors and their respective class outputs. For example, if there are two output classes for the classification of input vectors, the weighted sum of all input vectors may be thresholded to a binary value, 0 or 1. The output 0 represents class 1, while the output 1 represents class 2. The learning algorithm repeatedly presents the input vectors of the training set to the network and forces the network output to produce the respective classification output. Once the network converges on all training examples to produce the respective desired classification outputs, the network can be used to classify new input vectors into the learned classes.
The computational output of a neural element can be expressed as:

y = F\left( \sum_{i=1}^{n} w_i x_i + w_{n+1} \right),  (30)

where F is a nonlinear activation function that is used to threshold the weighted sum of the inputs x_i, with w_i the respective weights. A bias is added to the element as w_{n+1}, as shown in Fig. 5.
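A sketch of Eq. 30 with a sigmoidal activation; the convention that the last weight is the bias w_{n+1} follows the text, while the weight values in the usage below are arbitrary:

```python
import numpy as np

def sigmoid(v):
    """Sigmoidal activation F(y) = 1 / (1 + e^{-y})."""
    return 1.0 / (1.0 + np.exp(-v))

def neuron_output(x, w):
    """Eq. 30: y = F(sum_i w_i x_i + w_{n+1}), with the bias as the last weight."""
    return sigmoid(np.dot(w[:-1], x) + w[-1])
```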
Fig. 6. A feedforward backpropagation neural network.

Let us assume a multilayer feedforward neural network with L layers of N neural elements (perceptrons) in each layer, as shown in Fig. 6, such that:

y^{(k)} = F\left( W^{(k)} y^{(k−1)} \right)  for k = 1, 2, …, L,  (31)
where y^{(k)} is the output of the kth layer of neural elements, with k = 0 representing the input layer, and W^{(k)} is the weight matrix for the kth layer, such that:

y^{(0)} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \\ 1 \end{bmatrix};  y^{(k)} = \begin{bmatrix} y_1^{(k)} \\ y_2^{(k)} \\ \vdots \\ y_n^{(k)} \\ y_{(n+1)}^{(k)} \end{bmatrix}

and

W^{(k)} = \begin{bmatrix}
w_{11}^{(k)} & w_{12}^{(k)} & \cdots & w_{1n}^{(k)} & w_{1(n+1)}^{(k)} \\
w_{21}^{(k)} & w_{22}^{(k)} & \cdots & w_{2n}^{(k)} & w_{2(n+1)}^{(k)} \\
\vdots & \vdots & & \vdots & \vdots \\
w_{n1}^{(k)} & w_{n2}^{(k)} & \cdots & w_{nn}^{(k)} & w_{n(n+1)}^{(k)} \\
w_{(n+1)1}^{(k)} & w_{(n+1)2}^{(k)} & \cdots & w_{(n+1)n}^{(k)} & w_{(n+1)(n+1)}^{(k)}
\end{bmatrix}.  (32)
The neural network is trained by presenting classified examples of input and output patterns. Each example consists of the input and output vector pair {y^{(0)}, y^{(L)}} or {x, y^{(L)}}, encoded for the desired classes. The objective of the training is to determine a weight matrix that provides the desired output for each input vector in the training set. The least mean squared (LMS) error algorithm^{43,44} can be implemented to train a feedforward neural network using the following steps:

(1) Assign random weights in the range [−1, +1] to all weights w_{ij}^{(k)}.
(2) For each classified pattern pair {y^{(0)}, y^{(L)}} in the training set, do the following steps:
    a. Compute the output values of each neural element using the current weight matrix.
    b. Find the error e^{(k)} between the computed output vector and the desired output vector for the classified pattern pair.
    c. Adjust the weight matrix using the change ΔW^{(k)} computed as ΔW^{(k)} = α e^{(k)} [y^{(k−1)}]^T for all layers k = 1, …, L, where α is the learning rate, which can be set between 0 and 1.
(3) Repeat step 2 for all classified pattern pairs in the training set until the error vector for each training example is sufficiently low or zero.
The nonlinear activation function is an important consideration in computing the error vector for each classified pattern pair in the training set. A sigmoidal activation function can be used:

F(y) = \frac{1}{1 + e^{−y}}.  (33)

The above-described gradient descent algorithm for training a feedforward neural network, also called a backpropagation neural network (BPNN), is sensitive to the selection of initial weights and to noise in the training set, which can cause the algorithm to get stuck in local minima in the solution space. This causes poor generalization performance when the network is used to classify new patterns. Another problem with the BPNN is finding the optimal network architecture, including the optimal number of hidden layers and of neural elements in each hidden layer. Several solutions for finding the best architecture and generalization performance have been explored in the literature.^{34}
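The training steps above can be sketched for the single-layer case. This minimal version also multiplies the error by the derivative of the sigmoid, a common variant of the delta rule, and the logical-OR training set is illustrative:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_delta_rule(X, Y, alpha=0.5, epochs=2000, seed=0):
    """Single-layer delta-rule training. X is (N, n) inputs, Y is (N, p)
    targets in [0, 1]; a constant bias input of 1 is appended to each x."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])     # append bias input
    W = rng.uniform(-1, 1, (Y.shape[1], Xb.shape[1])) # step 1: random weights
    for _ in range(epochs):
        for x, y in zip(Xb, Y):                       # step 2
            out = sigmoid(W @ x)                      # 2a: forward pass
            e = (y - out) * out * (1 - out)           # 2b: error times F'(.)
            W += alpha * np.outer(e, x)               # 2c: delta-W update
    return W

# illustrative training set: learn logical OR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1.0]])
Y = np.array([[0], [1], [1], [1.0]])
W = train_delta_rule(X, Y)
```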
10.5.2 Classiﬁcation Using Radial Basis Functions
Radial basis function (RBF) classifiers are useful interpolation methods for multidimensional tasks. One major advantage of RBFs is their structural simplicity, as seen in Fig. 7. The response of each node in the single hidden layer is weighted and linearly summed at the output.

Fig. 7. The radial basis function neural network representation.

The RBF network is configured by finding the centers and widths of the basis functions and then determining the weights at the output of each node. The goal in selecting the unit width, or variance, is to minimize the overlap between nearest neighbors and to maximize the network's generalization ability. For good generalization, the eigenvalues of the covariance matrix of each basis function are chosen to be as large as possible. Typically, the kernel function is a Gaussian with unit normalization, given by^{34}:

ϕ_i(x) = \exp\left( −\frac{‖x − c_i‖^2}{2σ_i^2} \right),  (34)

where c_i is the center of a given kernel and σ_i^2 is the corresponding variance. Basis functions that decay more slowly than an exponential should be avoided because of their inferior local response.
The network output can be written in terms of the above Gaussian basis functions and the hidden-to-output connection weights w_i as:

f(x) = \sum_{i=1}^{K} w_i ϕ_i(x).  (35)

To account for large variances among the nodal outputs, the network output is usually normalized. The normalized result is:

f(x) = \frac{\sum_{i=1}^{K} w_i ϕ_i(x)}{\sum_{i=1}^{K} ϕ_i(x)},  (36)

where K is the total number of basis functions.
After the centers and widths of the basis functions are determined, the network weights can be computed from:

y = F_{n×p} w,  (37)

where the elements of F_{n×p} are the activation functions ϕ_{ij}, found by evaluating the jth Gaussian function at the ith input vector. Typically F_{n×p} is rectangular, with more rows than columns, so that w is overdetermined and no exact solution exists. Thus, instead of solving for the weights by matrix inversion, w is determined by solving a sum-of-squared-error functional:

F^T F w = F^T y,  such that  w = (F^T F)^{−1} F^T y = \tilde{F} y,  (38)

where \tilde{F} is called the pseudo-inverse of F (4). In order to guarantee a unique solution to Eq. 38, \tilde{F} is better expressed as:

\tilde{F} = (F^T F + εI)^{−1} F^T,  (39)

where ε is a small constant known as the regularization parameter and I is the identity matrix. If F were square and nonsingular, simple matrix inversion could be used to solve for the network weights.
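Equations 34–39 combine into a short fitting routine: build the Gaussian design matrix F, then solve for w with the regularized pseudo-inverse of Eq. 39. Fixed, evenly spaced centers and a single shared width are simplifying assumptions made here for illustration:

```python
import numpy as np

def rbf_design(X, centers, sigma):
    """Design matrix F: phi_ij = exp(-||x_i - c_j||^2 / (2 sigma^2)) (Eq. 34)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def rbf_fit(X, y, centers, sigma, eps=1e-8):
    """Output weights via the regularized pseudo-inverse of Eq. 39."""
    F = rbf_design(X, centers, sigma)
    return np.linalg.solve(F.T @ F + eps * np.eye(F.shape[1]), F.T @ y)

def rbf_predict(X, centers, sigma, w):
    """Network output f(x) = sum_i w_i phi_i(x) (Eq. 35)."""
    return rbf_design(X, centers, sigma) @ w
```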
When the amount of data is insufficient for complete approximation and the data are inherently noisy, it becomes necessary to impose additional a priori constraints in order to manage the problem of learning by approximation. The typical a priori supposition is smoothness, or at least piecewise smoothness. The smoothness condition assumes that the response to an unknown data point should be similar to the response from its neighboring points. Without the smoothness criterion, it would be infeasible to approximate any function because of the large number of examples required.^{34}
Standard regularization is the learning-by-approximation method that utilizes a smoothness criterion. A regularization function accomplishes two separate tasks: it minimizes the distance between the actual data and the desired solution, and it minimizes the deviation from the learning constraint, which can be piecewise smoothness in classification problems. The general functional to be minimized is:

H[f] = \sum_{i=1}^{N} \left( f(x_i) − y_i \right)^2 + ε ‖Df‖^2,  ε ∈ ℝ^+,  (40)

where N is the dimension of the regularization solution, ε is the positive regularization parameter, y_i is the actual solution, f(x_i) is the desired solution, and ‖Df‖^2 is a stabilizer term with D a first-order differential operator.
The solution to the above regularization functional is given by:

f(x) = \sum_{i=1}^{N} b_i G(x; c_i),  (41)

where G is the basis for the solution of the regularization problem, centered at c_i, and b_i = \frac{y_i − f(x_i)}{ε}. Under conditions of rotational and translational invariance, the solution can be written as:

f(x) = \sum_{i=1}^{N} b_i G\left( ‖x − c_i‖ \right).  (42)
10.6 EXAMPLE CLASSIFICATION ANALYSIS USING FUZZY
MEMBERSHIP FUNCTION
Skin lesion images obtained using the Nevoscope were classified using different techniques into two classes, melanoma and dysplastic nevus.^{36} The combined set of epi-illuminance and multispectral transilluminance images was classified using a wavelet-decomposition-based ADWAT method^{37} and fuzzy membership function based classification.^{36} Wavelet transform based bimodal channel energy features obtained from the images were used in the analysis. Methods using both crisp and fuzzy membership based partitioning of the feature space were evaluated. For this purpose, the ADWAT classification method using crisp partitioning was extended to handle multispectral image data. In addition, multidimensional fuzzy membership functions with Gaussian and bell profiles were used for classification. Results show that the fuzzy membership functions with a bell profile are more effective than the extended ADWAT method in discriminating melanoma from dysplastic nevus. The sensitivity and specificity of melanoma diagnosis can be improved by adding the lesion depth and structure information obtained from the multispectral transillumination images to the surface characteristic information obtained from the epi-illumination images.
Bimodal features were obtained from the epi-illumination images and the multispectral transillumination images using wavelet decomposition and statistical analysis of the channel energy and energy ratios for the extended ADWAT classification method. All these features were combined to form a composite feature set. In this composite feature set, the dynamic range of the channel energy ratio features is far smaller than the dynamic range of the channel energy features. For classification, it is necessary to normalize the feature set so that all the features have a similar dynamic range. Using linear transformations, all the features in the composite feature set were normalized to a dynamic range between zero and one. Using the covariance information obtained from the feature distribution of the learning data set, the values of the dysplastic and melanoma membership functions were calculated. The decision as to whether an unknown image set belongs to the melanoma or dysplastic nevus class was taken based on the "winner takes all" criterion: the unknown image set was assigned to the class with the maximum membership function value. Although the membership functions can be thought of as multivariate conditional densities similar to those used in the Bayes classifier, making the decision based on the probabilities of all the image classes for the candidate gives the classifier its fuzzy nature.
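The pipeline just described — min-max normalization to [0, 1], Gaussian-profile membership values built from class means and covariances, and a winner-takes-all decision — can be sketched as below. The function names and toy feature statistics are illustrative; the chapter's actual channel-energy features are not reproduced here, and the fallback class for undecidable cases mirrors the handling described next for ties at zero.

```python
import numpy as np

def minmax_normalize(features):
    """Linearly map each feature column to a [0, 1] dynamic range."""
    lo, hi = features.min(axis=0), features.max(axis=0)
    return (features - lo) / (hi - lo)

def gaussian_membership(x, mean, cov):
    """Multidimensional Gaussian-profile membership value (unnormalized)."""
    d = x - mean
    return float(np.exp(-0.5 * d @ np.linalg.inv(cov) @ d))

def classify(x, class_stats, tie_class="melanoma"):
    """Winner-takes-all over class memberships; if no class wins
    (e.g. all membership values underflow to zero), fall back to
    the cautious default class."""
    values = {c: gaussian_membership(x, m, cov)
              for c, (m, cov) in class_stats.items()}
    ranked = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_class
    return ranked[0][0]

# Column-wise normalization of a raw composite feature matrix
raw = np.array([[10.0, 200.0], [2.0, 50.0], [6.0, 120.0]])
norm = minmax_normalize(raw)                      # each column spans [0, 1]

# Toy 2D class statistics (illustrative means/covariances)
stats = {
    "melanoma":   (np.array([0.8, 0.7]), 0.01 * np.eye(2)),
    "dysplastic": (np.array([0.3, 0.2]), 0.01 * np.eye(2)),
}
label = classify(np.array([0.75, 0.65]), stats)   # near the melanoma mean
```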
Out of the 60 unknown images (15 melanoma and 45 dysplastic nevus cases) used in the classification phase, 52 cases were correctly classified using the Gaussian membership function.[36] All the cases of melanoma and 37 cases of dysplastic nevus were identified, giving a true positive fraction of 100 percent with a false positive fraction of 17.77 percent. For the eight dysplastic nevus cases that were misclassified, the values of both the melanoma and dysplastic nevus membership functions were equal to zero. These cases were assigned to the melanoma category, since no decision about the class can be taken if both membership function values are the same. Classification results were obtained for the bell membership function using different values of the weighting constant W. Of all the values of W used, the best classification results were obtained for a value of 0.6, with a true positive fraction of 100 percent and a false positive fraction of 4.44 percent. The results obtained from all these classification techniques are summarized in Table 2.
Table 2. Results of classification of optical images using different classification methods for detection of melanoma.[36]

    Type of Images Used       Method                           Images Correctly Classified   True       False
                                                               Melanoma       Dysplastic    Positive   Positive
    -----------------------------------------------------------------------------------------------------------
    Epi-illuminance           Neural Network                   13/15          34/45         86.66%     24.44%
    Images                    Bayesian Classifier              13/15          40/45         86.66%     11.11%
    Multispectral and         Fuzzy Classifier with Gaussian   15/15          37/45         100%       17.77%
    Epi-illuminance           Membership Function
    Images                    Fuzzy Classifier with Bell       15/15          43/45         100%       4.44%
                              Membership Functions
10.7 CONCLUDING REMARKS

Clustering and image classification methods are critically important in medical imaging for computer-aided analysis and diagnosis. Though a wide spectrum of pattern analysis and classification methods has been explored for medical image analysis, clustering and classification methods have to be customized and carefully implemented for specific medical image analysis and decision-making applications. A good understanding of the features involved and of the contextual information may be incorporated into model-based approaches utilizing deterministic or fuzzy classification approaches.
References

1. Jain AK, Dubes RC, Algorithms for Clustering Data, Prentice Hall, Englewood Cliffs, NJ, 1988.
2. Duda RO, Hart PE, Stork DG, Pattern Classification (2nd edn.), John Wiley & Sons, 2001.
3. Zadeh LA, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1: 3–28, 1978.
4. Fisher D, Iterative optimization and simplification of hierarchical clustering, Journal of Artificial Intelligence Research 4: 147–179, 1996.
5. Karypis G, Han EH, Multilevel refinement for hierarchical clustering, Technical Report #99-020, 1999.
6. Bradley P, Fayyad U, Refining initial points for k-means clustering, in Proceedings of the 15th ICML, pp. 91–99, Madison, WI, 1998.
7. Zhang B, Generalized k-harmonic means — dynamic weighting of data in unsupervised learning, in Proceedings of the 1st SIAM ICDM, Chicago, IL, 2001.
8. Campbell JG, Fraley C, Murtagh F, Raftery AE, Linear flaw detection in woven textiles using model-based clustering, Pattern Recognition Letters 18: 1539–1548, 1997.
9. Celeux G, Govaert G, Gaussian parsimonious clustering models, Pattern Recognition 28: 781–793, 1995.
10. Olson C, Parallel algorithms for hierarchical clustering, Parallel Computing 21: 1313–1325, 1995.
11. Mao J, Jain AK, A self-organizing network for hyperellipsoidal clustering (HEC), IEEE Trans Neural Networks 7: 16–29, 1996.
12. Sibson R, SLINK: An optimally efficient algorithm for the single link cluster method, Computer Journal 16: 30–34, 1973.
13. Voorhees EM, Implementing agglomerative hierarchical clustering algorithms for use in document retrieval, Information Processing and Management 22(6): 465–476, 1986.
14. Dai S, Adaptive learning for event modeling and pattern classification, PhD dissertation, New Jersey Institute of Technology, Jan 2004.
15. Chiu T, Fang D, Chen J, Wang Y, A robust and scalable clustering algorithm for mixed type attributes in large database environments, in Proceedings of the 7th ACM SIGKDD, pp. 263–268, San Francisco, CA, 2001.
16. Diday E, The dynamic cluster method in nonhierarchical clustering, J Comput Inf Sci 2: 61–88, 1973.
17. Symon MJ, Clustering criterion and multivariate normal mixture, Biometrics 77: 35–43, 1977.
18. Bezdek JC, Pattern Recognition With Fuzzy Objective Function Algorithms, Plenum Press, New York, NY, 1981.
19. Dave RN, Generalized fuzzy C-shells clustering and detection of circular and elliptic boundaries, Pattern Recogn 25: 713–722, 1992.
20. Pham DL, Prince JL, Adaptive fuzzy segmentation of magnetic resonance images, IEEE Trans on Med Imaging 18(9): 737–752, 1999.
21. Vannier M, Pilgram T, Speidel C, et al., Validation of magnetic resonance imaging (MRI) multispectral tissue classification, Computerized Medical Imaging and Graphics 15: 217–223, 1991.
22. Choi HS, Haynor DR, Kim Y, Partial volume tissue classification of multichannel magnetic resonance images — a mixed model, IEEE Transactions on Medical Imaging 10: 395–407, 1991.
23. Zavaljevski A, Dhawan AP, Holland S, et al., Multispectral MR brain image classification, Computerized Medical Imaging, Graphics and Image Processing 24: 87–98, 2000.
24. Nazif AM, Levine MD, Low-level image segmentation: An expert system, IEEE Trans Pattern Anal Mach Intell 6: 555–577, 1984.
25. Arata LK, Dhawan AP, Levy AV, et al., Three-dimensional anatomical model based segmentation of MR brain images through principal axes registration, IEEE Trans Biomed Eng 42: 1069–1078, 1995.
26. Dhawan AP, Chitre Y, Kaiser-Bonasso C, Moskowitz M, Analysis of mammographic microcalcifications using gray levels image structure features, IEEE Trans Med Imaging 15: 246–259, 1996.
27. Hall LO, Bensaid AM, Clarke LP, Velthuizen RP, et al., A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain, IEEE Trans on Neural Networks 3: 672–682, 1992.
28. Xu L, Jackowski M, Goshtasby A, et al., Segmentation of skin cancer images, Image and Vision Computing 17: 65–74, 1999.
29. Huo Z, Giger ML, Vyborny CJ, Computerized analysis of multiple mammographic views: Potential usefulness of special view mammograms in computer aided diagnosis, IEEE Trans Med Imaging 20: 1285–1292, 2001.
30. Grohman W, Dhawan AP, Fuzzy convex set based pattern classification of mammographic microcalcifications, Pattern Recognition 34(7): 119–132, 2001.
31. Bonasso C, GA based selection of mammographic microcalcification features for detection of breast cancer, MS Thesis, University of Cincinnati, 1995.
32. Peck C, Dhawan AP, A review and critique of genetic algorithm theories, J of Evolutionary Computing, MIT Press 3(1): 39–80, 1995.
33. Dhawan AP, Medical Image Analysis, John Wiley Publications and IEEE Press, June 2003; Reprint, 2004.
34. Zurada JM, Introduction to Artificial Neural Systems, West Publishing Co., 1992.
35. Mitra S, Pal SK, Fuzzy multi-layer perceptron, inferencing and rule generation, IEEE Trans Neural Networks 6(1): 51–63, 1995.
36. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and classification of melanoma using fuzzy membership based partitions, Computerized Medical Imaging and Graphics 29: 287–296, 2005.
37. Patwardhan SV, Dhawan AP, Relue PA, Classification of melanoma using tree-structured wavelet transforms, Computer Methods and Programs in Biomedicine 72: 223–239, 2003.
CHAPTER 11
Recent Advances in Functional Magnetic Resonance Imaging
Dae-Shik Kim
While functional imaging of the brain using magnetic resonance imaging (fMRI) has gained wide acceptance as a useful tool in basic and clinical neurosciences, its ultimate utility remains elusive due to our lack of understanding of its basic physiological processes and limitations. In the present chapter, we will discuss recent advances that are shedding light on the relationship between the observable blood oxygenation level dependent (BOLD) fMRI contrast and the underlying neuroelectrical activities. Finally, we will discuss topical issues that remain to be solved in the future.
11.1 INTRODUCTION

The rapid progress of blood oxygenation level dependent (BOLD) functional magnetic resonance imaging (fMRI) in recent years[1-3] has raised the hope that — unlike most existing neuroimaging techniques — the functional architecture of the human brain can be studied directly in a noninvasive manner. The BOLD technique is based on the use of deoxyhemoglobin as nature's own intravascular paramagnetic contrast agent.[4-6] When placed in a magnetic field, deoxyhemoglobin alters the magnetic field in its vicinity, particularly when it is compartmentalized, as it is within red blood cells and vasculature. The effect increases as the concentration of deoxyhemoglobin increases. At concentrations found in venous blood vessels, a detectable local distortion of the magnetic field
surrounding the red blood cells and surrounding blood vessels is produced. This affects the magnetic resonance behavior of the water proton nuclei within and surrounding the vessels, which in turn results in decreases in the transverse relaxation times T2 and T2*.[4,6] During activation of the brain, this process is reduced: the increase in neuronal and metabolic activity results in a reduction of the relative deoxyhemoglobin concentration due to the increase of blood flow (and hence increased supply of fresh oxyhemoglobin) that follows. Consequently, in conventional BOLD fMRI, brain "activity" can be measured as an increase in T2- or T2*-weighted MR signals.[1-3] Since its introduction about 10 years ago, BOLD fMRI has been successfully applied — among numerous other examples — to precisely localize the cognitive,[7] motor,[8] and perceptual[9-11] functions of the human cortex cerebri (Figs. 1 and 2). The explanatory power of BOLD fMRI is being further strengthened in recent years through the introduction of high (∼3T) and ultrahigh (∼7T) MRI scanners.[12] This is based on the fact that a stronger magnetic field will not only increase the fMRI signal per se, but in addition will specifically enhance the signal components originating from parenchymal capillary tissue. Conventional low-field magnets, on the other hand, can be expected to "overrepresent" macrovascular signals.
11.2 NEURAL CORRELATE OF fMRI

BOLD fMRI contrast does not measure neuronal activity per se. Rather, it reflects a complex convolution of changes ranging from the cerebral metabolic rate of oxygen (CMRO2) to cerebral blood flow (CBF) and cerebral blood volume (CBV) following focal neuronal activity (Fig. 1). This poses a fundamental problem for the accuracy and validity of BOLD fMRI for clinical and basic neurosciences: while the greatest body of existing neurophysiological data provides spiking and/or subthreshold measurements from a small number of neurons (10⁰–10²), fMRI on the other hand labels the local hemodynamics of a parenchymal lattice consisting of millions of neurons (10⁶–10⁸) and a dense network of microcapillaries. How can we
Fig. 1. Hemodynamic basis of functional MRI. Note that fMRI is an indirect measure of the neuronal activity elicited by an external stimulus ("visual stimulation"), mediated through hemodynamic processes occurring in the dense network of veins ("V"), arteries ("A") and capillaries.
bridge this gap from micron-scale neuronal receptive field properties to millimeter-scale voxel behaviors? The problem of bridging this conceptual gap is greatly hindered by the presence of substantial differences between neuronal and fMRI voxel properties: a small number (10⁰–10²) versus a large number (10⁶–10⁸) of neurons underlying the observed activation; point-like individual neurons versus a neurovascular lattice grid; largely spiking versus largely subthreshold activities; excitatory or inhibitory versus excitatory and/or inhibitory (see Fig. 3 for differences in time scale between fMRI and electrophysiological signals). The crucial questions we need to address are discussed below.
11.2.1 Do BOLD Signal Changes Reflect the Magnitude of Neural Activity Change Linearly?

The amplitude of the fMRI signal intensity change has been employed by itself to obtain information beyond simple identification of the spatial compartmentalization of brain function, by correlating variations
Fig. 2. Functional MRI of the human visual cortex using BOLD contrast at 3T. Here, the receptive field properties for isoeccentricity were mapped using the standard stimuli. Color-coded activation areas were responding to the eccentricities represented by the colored rings in the upper right corner. Regions of activity were superimposed on the reconstructed and inflated brain surfaces.
in this amplitude with behavioral (e.g. Refs. 13–15) or electroencephalography (EEG) responses.[16] However, extracting such information requires the deconvolution of the compounded fMRI response,[17] assuming that fMRI signals are additive. This assumption, however, appears not to be generally valid (e.g. Refs. 18–20). Tight and highly quantitative coupling between the EEG and T2* BOLD signals in the rat model was reported where the frequency of the forepaw stimulation rate was varied under steady-state conditions.[21] A linear relationship between the BOLD response and somatosensory evoked potentials was demonstrated for brief stimuli, but the nature of the relationship depended on the stimulation duration and ultimately became nonlinear;[22] in this study, the linearity was used in a novel way to extract temporal information on the millisecond time scale. More recently, local field potentials and spiking activity were recorded for the first time simultaneously with T2* BOLD fMRI signals in the monkey cortex, showing a linear relationship
between local field potentials and spiking rate, but displaying better correlation with the former.[23] In a recent study, recording from multiple sites for the first time, spiking activity was shown to be linearly correlated with the T2* BOLD response in the cat visual cortex using a single orientation of a moving grid but with different spatial frequencies of the grid lines.[24] However, the correlation varied from point to point on the cortical surface and was generally valid only when the data were averaged over at least a 4 mm–5 mm spatial scale,[24] demonstrating that T2* BOLD responses are not spatially accurate at the level of orientation columns in the visual system, as discussed previously. A detailed set of studies was performed asking the same type of questions and using laser Doppler techniques to measure cerebral blood flow (CBF);[25,26,28] these studies concluded that linear domains exist between CBF increases and aspects of electrical activity, and that hemodynamic changes evoked by neuronal activity depend on the afferent input function but do not necessarily reflect the output level of activity of a region.
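The additivity assumption discussed above corresponds to the standard linear-systems view of fMRI: the predicted BOLD time course is the stimulus time course convolved with a hemodynamic response function (HRF). The sketch below uses a simple gamma-shaped HRF; the gamma shape and its parameter are illustrative defaults, not values from the chapter.

```python
import math
import numpy as np

def gamma_hrf(t, shape=6.0):
    """Toy gamma-shaped HRF: peaks a few seconds after an impulse."""
    return t ** (shape - 1) * np.exp(-t) / math.gamma(shape)

tr = 0.5                              # sampling interval in seconds
t = np.arange(0.0, 30.0, tr)
hrf = gamma_hrf(t)

stim = np.zeros(120)                  # 60 s run sampled every 0.5 s
stim[20:40] = 1.0                     # 10 s stimulus block starting at t = 10 s
bold = tr * np.convolve(stim, hrf)[:stim.size]   # linear BOLD prediction
```

Under this model the responses to successive stimuli simply add; the nonlinearities reported in the studies cited above are precisely departures from this behavior.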
11.2.2 Small Versus Large Number

Given the nominal voxel size of most fMRI scans (several millimeters at best), it is safe to conclude that BOLD reflects the activity of many neurons (say, around 10⁵ neurons for a voxel of 1 mm³–2 mm³).[28] The overwhelming body of existing electrophysiological data, however, is based on electrode recordings from single neurons (single unit recording, SUA) or a handful of neurons (multiunit recording, MUA). The real question is hence how accurately the responses of single cells (our "gold standard" given the existing body of data) are reflected by a population response, such as in BOLD fMRI. Theoretically, if each neuron "fired" independently of its neighbors' behavior, this would be an ill-posed problem, as fMRI would not be able to distinguish small activity changes in a large cellular population from large changes in a small population. Fortunately, however, neurons are embedded in tight local circuitries, forming functional clusters with similar receptive field properties ranging from "microcolumns" and "columns" to "hypercolumns." Both the
neuronal firing rate and phase are correlated between neighboring neurons (Singer, 1999), and in most sensory areas there is a good correlation between local field potentials (LFP), which are assumed to reflect the average activity of a large number of neurons, and the responses of individual spiking neurons. In fact, it is difficult to imagine how BOLD contrast could be detectable at all if it were sensitized to the behavior of uncorrelated individual neurons, as the metabolic demand of a single neuron would hardly be sufficient to initiate the chain of hemodynamic events giving rise to BOLD.
11.2.3 Relationship between Voxel Size and Neural Correspondence

Clearly, the MRI voxel size is a key element in determining the spatial dependence of the correlation between the BOLD and electrode data. A large voxel will improve the relationship to the neuronal event, since a voxel that displays BOLD signal changes will have a much higher probability of including the site of the electrically active column as its size increases, for example to sizes that are often used in human studies (e.g. 3 mm × 3 mm × 3 mm). However, such a large voxel will provide only limited information about the pattern of activation, due to its low spatial resolution. Smaller voxels (i.e. at the size of individual single unit recording sites), which could potentially yield a much better spatial resolution, will result in a large variability between neuronal correspondence and the BOLD signal, and a large number of "active" voxels will actually originate from positions beyond the site of electrical activity (Fig. 4).
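The scale of this trade-off can be made concrete with back-of-envelope arithmetic, assuming a cortical density on the order of 10⁵ neurons per mm³ (a round figure consistent with the estimate in Sec. 11.2.2, not a measured value):

```python
def neurons_per_voxel(edge_mm, density_per_mm3=1e5):
    """Neuron count in an isotropic voxel of the given edge length (mm)."""
    return density_per_mm3 * edge_mm ** 3

small = neurons_per_voxel(1.0)   # high-resolution voxel: ~1e5 neurons
large = neurons_per_voxel(3.0)   # typical human-study voxel: ~2.7e6 neurons
```

A 3 mm isotropic voxel thus averages over roughly 27 times more neurons than a 1 mm voxel, which is the resolution/correspondence tension described above.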
11.2.4 Spiking or Subthreshold?

According to the standard "integrate-and-fire" model of neurons, an action potential is generated when the membrane potential reaches threshold by depolarization, which in turn is determined by the integration of incoming excitatory (EPSP) and inhibitory (IPSP) postsynaptic potentials. Action potentials are usually generated only around the axon hillock, while synaptic potentials can be
generated all across the dendritic tree (mostly on dendritic spines) and cell soma.

Fig. 3. Time course of BOLD and single unit recordings from the same cortical location. Identical visual stimuli were used for the fMRI and subsequent single unit recording sessions. Blue trace: peristimulus histogram of the spike activity; bin size for the histogram = 0.5 sec = TR for fMRI. Red trace: BOLD percent change during visual stimulation. X-axis: time after stimulus onset (sec). Left Y-axis: spikes per second. Right Y-axis: BOLD percent change. Gray box: stimulus duration. The black trace above indicates the original low-frequency analog signals (100 Hz–300 Hz) underlying the depicted spike counts.

The threshold-dependent action potential firing
means that much more sub- and suprathreshold synaptic activity than action potential activity is likely at any one time. And the much larger neural surface area associated with synaptic activity means that the total metabolic demand (i.e. the number of Na⁺/K⁺ pumps involved, etc.) for synaptic activity ought to be significantly higher than that required for generating action potentials. It therefore seems likely that BOLD contrast — like other methods based on cortical metabolism, such as 2DG (¹⁴C-2-deoxyglucose)[49] and optical imaging — is dominated by synaptic subthreshold activity. However, the precise
contributions of synaptic and spiking activities are hard to quantify, since with conventional stimuli one would expect synaptic input and spiking output activity to be roughly correlated with each other.[29-31] Indeed, it is not trivial to imagine an experiment where input and output activities would not correlate with each other. One way this has been proposed in the past is to look, in a visual area, at the spatial activity resulting from the edge of a visual stimulus.[32-34] Since "extraclassical" receptive fields across such an edge are by definition subthreshold activity, it follows that a stimulus with an edge in it creates regions of cortex where activity is only subthreshold in origin. Existing optical imaging studies[32,35] have concluded that subthreshold activity does indeed contribute significantly to the optical signal, suggesting that it might contribute to the BOLD signal as well. The results of our combined BOLD and single unit studies suggest that both local field potential (LFP) and single-unit activity correlate
suggest that both local ﬁeld potential (LFP) and singleunit correlate
well with the BOLD signal (see Figs. 3 and 4). We have used LFP on
1 7
0
0.5
1.0
Neural modulation [∆ spikes/sec]
B
O
L
D
m
o
d
u
l
a
t
i
o
n
[
%
]
∆
R = 0.85
2
y = 0.12x + .085
7.95 spikes per 1%BOLD
Fig. 4. Results of direct comparison between BOLD and single unit recordings
across all sites (n=58). Xaxis: neural modulation for the single unit response in
spikes per seconds. Yaxis: % BOLD modulation. The six data points indicate the
BOLD/single unit responses for six different spatial frequencies used for this study.
The thick black line is the regression line for the depicted data points. Coefﬁcient
of determination of the regression line, R
2
=0.85.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch11 FA
Recent Advances in Functional Magnetic Resonance Imaging
275
the assumption that it represents the average activity of thousands
of neurons. In agreement with previous ﬁndings,
36
LFP signals may provide a better estimate of BOLD responses than suprathreshold spike rate. However, whether intracellular or extracellular activity is better correlated with BOLD is harder to address, since with a grating stimulus (and in fact with many types of visual stimuli), one would expect intracellular and extracellular activity to be roughly correlated with each other.[29-31] Separating intracellular and extracellular activity would have to be accomplished using a visual stimulus known to do so. One imaging experiment presumptively showing a large contribution of intracellular activity to the optical imaging signal uses focal iontophoresis of the GABA-A antagonist bicuculline methiodide[37,38] to generate a mismatch between intracellular and extracellular activity. This is a rare case where a blood-dependent signal could be reversibly altered by an artificial manipulation of neural activity. We are currently repeating these studies using fMRI techniques to elucidate the spatial contributions of intracellular and extracellular activity to BOLD functional MRI signals.
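The analysis summarized in Fig. 4 is an ordinary least-squares regression of percent BOLD modulation on spike-rate modulation, reported as a slope and a coefficient of determination R². A sketch with synthetic spike/BOLD pairs (the recorded values are not reproduced here; the noise values are arbitrary):

```python
import numpy as np

def fit_line(x, y):
    """OLS slope, intercept, and coefficient of determination R^2."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2

# Six synthetic points built around a 0.12 %BOLD per (spike/s) slope
spikes = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
noise = np.array([0.02, -0.03, 0.01, 0.00, -0.02, 0.02])
bold_pct = 0.12 * spikes + 0.085 + noise
slope, intercept, r2 = fit_line(spikes, bold_pct)
spikes_per_pct = 1.0 / slope          # spikes/s needed per 1% BOLD change
```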
11.2.5 Excitatory or Inhibitory Activity?

Although the neuroscience and cognitive-science communities have embraced fMRI with exuberance, numerous issues remain poorly understood regarding this technique. Because fMRI maps are based on secondary metabolic and hemodynamic events that follow neuronal activity, and not on the electrical activity itself, it remains mostly unclear what the spatial specificity of fMRI is (i.e. how accurate are the maps generated by fMRI compared to the actual sites of neuronal activity?). In addition, the nature of the link between the magnitudes of neuronal activity and fMRI signals is not well understood (i.e. what does a change of particular magnitude in fMRI signals mean with respect to the change in magnitude of the processes that define neuronal signaling, such as action potentials or neurotransmitter release?). fMRI is often used without considering these unknowns. For example, modulating the intensity of fMRI signals by means of different paradigms and interpreting the intensity changes as
changes in neuronal activity of corresponding magnitude is a common practice that is not fully justified under most circumstances. To the best of our knowledge, there is currently no evidence that the metabolic demands differ greatly between excitatory and inhibitory synapses. Therefore, fundamentally, both excitatory (EPSP) and inhibitory (IPSP) synaptic inputs can be expected to cause similar metabolic and hemodynamic events, ultimately giving rise to similar BOLD contrasts. On the side of the spiking output activity, however, they have opposite effects: accumulation of EPSPs will increase the probability of spike generation (and therefore also the metabolic demand), while IPSPs will decrease it. Assuming that the BOLD response predominantly reflects changes in synaptic subthreshold activity, it remains elusive whether excitatory and inhibitory cortical events can be differentiated using the BOLD response in any single region. Recently, one group proposed that inhibition, unlike excitation, elicits no measurable change in the BOLD signal.[39] They hypothesized that because of the lower number of inhibitory synapses,[40] their strategically superior location (inhibitory receptors: basal cell body; excitatory receptors: distal dendrites), and increased efficiency,[41] there could be a lower metabolic demand during inhibition compared to excitation. The validity of this claim notwithstanding, both empirical and theoretical studies suggest that excitatory and inhibitory neurons in the cortex are so tightly interconnected in local circuits (see e.g. Ref. 42 for details of the local circuitry in cat primary visual cortex; see also Ref. 43 for the anatomy of local inhibitory circuits in cats) that one is unlikely to observe an increase in excitation without an increase in inhibition. After all, for an inhibitory neuron to increase its firing rate, it must be receiving more excitatory input, and most of the excitatory input comes from the local cortical neighborhood (see Refs. 42 and 44 for an overview). Naturally, excitation and inhibition would not occur in temporal unison, as otherwise no cell would reach threshold. On the temporal scale of several hundred milliseconds to seconds during which BOLD contrast emerges,[3] however, such potential temporal differences would most likely be rendered indistinguishable. One viable hypothesis is therefore that BOLD contrast reflects a
steady-state balance of local excitation and inhibition, in particular if BOLD is more sensitive to subthreshold than to spiking activity.
11.3 NONCONVENTIONAL fMRI

BOLD fMRI at a conventional low magnetic field of 1.5T can commonly achieve a spatial resolution of 3–5 millimeters. This is sufficient for labeling cortical organization at the hypercolumn (several millimeters) or area (several centimeters) scale, but functional images at this resolution fail to accurately label the columnar organization of the brain. Studies at higher magnetic fields (such as 3T or 7T) can produce significant enhancements of the spatial resolution and specificity of fMRI. Theoretical and experimental studies have shown at least a linear increase in signal-to-noise ratio (SNR) with magnetic field strength. The increase of the static MR signal can be used to reduce the volume needed for signal averaging. Furthermore, as the field strength increases, the field gradient around the capillaries becomes larger and extends further into the parenchyma, thus increasing the participation of the brain tissue in the functional signal. Concurrently, the shortened T2* of the blood at high B0 reduces the relative contribution from the large veins.
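The SNR argument can be quantified with simple arithmetic. Assuming only the at-least-linear scaling stated above (real gains also depend on field-dependent relaxation changes), moving from 1.5T to 7T gives:

```python
def snr_gain(b0_new_tesla, b0_ref_tesla=1.5):
    """Lower-bound SNR gain under linear scaling with field strength."""
    return b0_new_tesla / b0_ref_tesla

gain = snr_gain(7.0)              # ~4.7x SNR at 7T relative to 1.5T
volume_ratio = 1.0 / gain         # voxel volume achievable at matched SNR
```

The gain can thus be spent on roughly 4.7-fold smaller voxel volumes at unchanged SNR, consistent with the push toward columnar-scale imaging.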
While these results suggest that a stronger magnetic field per se will specifically enhance the signal components originating from parenchymal capillary tissue, recent optical spectroscopy and functional MRI data[45-48] suggest that the spatial specificity of BOLD could be further and more dramatically improved if a hypothesized initial decrease of MR signals can be utilized for functional image formation. To this end, it is suggested that the first event following focal neuronal activity is a prolonged increase in oxygen consumption, caused by an elevation in the oxidative metabolism of active neurons. Based on 2DG data,[49] one can assume the increase in oxidative metabolism in mammalian cortex to be colocalized with the site of electrical activity. The increase in oxidative metabolism will naturally elevate the local deoxyhemoglobin content in the parenchyma of active neurons, assuming there is no immediate commensurate change in cerebral blood flow.[50] In T2- or T2*-weighted BOLD fMRI images, such an increase in paramagnetic deoxyhemoglobin should therefore be detectable as a transient decrease
in observable MR signals. Such an initial deoxygenation of the local cortical tissue will last only for a brief period, as fresh blood (fresh oxyhemoglobin) will rush into the capillaries in response to the increased metabolism, thus reversing the local ratio of hemoglobin in favor of oxyhemoglobin and hence resulting in a delayed increase in observable MR signals (i.e. the conventional BOLD signal). The crucial question here is the "where" of the above described "biphasic" hemodynamic processes. Grinvald and coauthors (Refs. 51 and 46) hypothesized a fundamentally distinct functional specificity for these two events: the initial deoxygenation, as a consequence of an increase in oxidative metabolism, should be coregistered with the site of electrical activity up to the level of individual cortical columns (in fact, the well established "optical imaging of intrinsic signals" (Refs. 52 and 53), which has been cross-validated with single-unit techniques (Refs. 54 and 55), is similarly based on measuring the local transient increase of deoxyhemoglobin). The delayed oxygenation of the cortical tissue, on the other hand, is suggested to be far less specific due to the spread of hemodynamic activity beyond the site of the original neural activity. Both the existence of the "biphasic" BOLD response per se and the suggested differences in functional specificity have been the subject of heated controversy in recent years (see Ref. 56 for a comprehensive update of this saga). While the initial deoxygenation signal in fMRI (termed the "initial dip") has been reported in awake behaving humans (Refs. 57 and 58) and anesthetized monkeys (Ref. 59), studies in rodents failed to detect any significant initial decrease in the BOLD signal following sensory stimulation (Refs. 60–62; but see Ref. 63). The question of whether the use of the initial dip would indeed improve the spatial specificity of BOLD has been far more difficult to address experimentally. This is largely because most fMRI studies examining this phenomenon so far have been conducted in humans (e.g. Refs. 57 and 64) and therefore, by necessity, have used relatively coarse nominal spatial resolution, above the level of individual cortical columns.
Recent Advances in Functional Magnetic Resonance Imaging
In animal studies using ultrahigh magnetic fields (e.g. 9.4T), in which functional images at submillimeter scale can be acquired, the results of our own group (Ref. 45) (Fig. 6) suggest that the use of the "initial dip" can indeed significantly improve the spatial specificity of BOLD. This result has been questioned afterwards (Ref. 65); see Ref. 66 for our reply.
On the other hand, in a recent pioneering study, preoperative functional MRI and intraoperative optical imaging were performed in the same human subject (Ref. 67). While the spatial overlap between optical imaging and conventional (positive) fMRI was poor, there was a dramatic improvement in spatial correspondence between the two datasets when the initial-dip portion of the MRI signal was used. Furthermore, combined single-unit and oxygen-tension probe measurements (Ref. 68) convincingly demonstrated both the presence as well as the functional significance of the initial deoxygenation signal component.
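Although purely schematic, the biphasic time course described above is easy to caricature numerically. In the sketch below (an illustrative toy model: the gamma-variate shapes, amplitudes, and peak times are invented, not taken from the studies cited), an early deoxygenation "dip" is subtracted from a larger, delayed positive BOLD response, and the timing of the two extrema is located:

```python
import math

def gamma_variate(t, peak, shape=3.0):
    """Unit-peak gamma-variate: equals 1 at t == peak and 0 at t == 0."""
    if t <= 0:
        return 0.0
    return (t / peak) ** shape * math.exp(shape * (1.0 - t / peak))

# Invented amplitudes and timings for a biphasic BOLD response (arbitrary
# units): a small, fast initial deoxygenation dip followed by the larger,
# delayed positive BOLD signal.
ts = [0.1 * i for i in range(200)]              # 0-20 s after stimulus onset
s = [1.0 * gamma_variate(t, 6.0) - 0.5 * gamma_variate(t, 2.0) for t in ts]

t_dip = ts[min(range(len(s)), key=lambda i: s[i])]
t_peak = ts[max(range(len(s)), key=lambda i: s[i])]
print(t_dip, t_peak)   # the dip precedes the positive peak
```

Running it prints the dip time before the peak time, mirroring the transient signal decrease followed by the conventional positive BOLD response.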
As an alternative to the initial deoxygenation signals, the spatial specificity of T₂*-based fMRI can be further improved if only the arterial contribution is utilized for functional image construction and/or the draining-vessel artifacts are attenuated. For example, perfusion-weighted images based on arterial spin labeling can be made sensitive to the cerebral blood flow (CBF) changes from upstream arterial networks to the capillaries, thus providing better spatial localization ability (Refs. 69 and 70) than T₂*-based BOLD imaging methods.
11.4 CONCLUSIONS AND FUTURE PROBLEMS OF fMRI
In less than a decade since the first noninvasive measurements of functional blood oxygenation level signals from the human brain, fMRI has developed into an indispensable neuroimaging tool that is ubiquitous in both clinical and basic neuroscience settings. The explanatory power of fMRI, however, is currently limited by the presence of major theoretical and practical shortcomings. These include (but are not limited to): (a) the lack of a detailed understanding of its neural correlates; (b) limited spatial resolution; and (c) the difficulty of combining fMRI with other imaging/measurement techniques. Furthermore, it is important to note that conventional functional MRI data analysis techniques (e.g. the general linear model, t-test,
Fig. 5. Figure 4 shows the neuronal correspondence (R² between BOLD and single-unit responses) as a function of the reshuffled voxel sizes. For each voxel size, the distribution of the neuronal qualities is indicated by the standard deviation. The red curve marks the mean neuronal correspondence for each voxel size. For curve fitting, conventional sigmoidal fitting was used. The results depicted in Fig. 8 predict that the neuronal correspondence saturates around R² = 0.7 at a voxel size of around 4.7 × 4.7 mm². Larger voxel sizes are suggested to be ineffective in further improving the level of neuronal correspondence. That is, the maximum amount of variance in the underlying neuronal modulation that can be explained with the variance of conventional T₂*-based positive BOLD is about 70%. Once the voxel size has been reduced to smaller than ∼2.8 × 2.8 mm², less than 50% of the variance in the underlying neuronal modulation can be explained through the observed BOLD responses.
cross-correlation, etc.) implicitly assume a modularity of cortical functions: parametric statistical methods test the hypothesis that certain areas of the brain are significantly more active than others, with a nonvanishing residual false-positive detection error (represented as a p-value). However, such techniques assume that the brain consists of individual computational modules (similar to "phrenological" ideas) that are spatially distinct from each other. Interestingly, an increasing body of evidence in recent years suggests an
alternative representation model: that information in the brain is represented in a more distributed fashion (Refs. 71 and 72). In the latter case, the conventional statistical techniques may fail to detect the correct pattern of neuronal activation, because they attempt to detect the areas of "strongest" activation, while the brain may represent information using a much larger area of cortical tissue than conventionally assumed. In their original work, Haxby and colleagues (Ref. 71) used simple voxel-to-voxel comparison methods to look for
Fig. 6. Improvement of BOLD spatial specificity by using nonconventional functional MRI signals. The time course on the left side shows the biphasic evolution of MR signals, resulting in the early deoxygenation contrast. If used, such deoxygenation signals produce high-resolution images of exceedingly high functional specificity (termed BOLD−), which contrast with conventional BOLD fMRI signals (termed BOLD+).
activity patterns in the human brain. Linear pattern discrimination techniques, such as support vector machines (SVM) or Fisher's linear discriminants (FLD), are inherently better suited for classifying observed activation patterns into separable categories. For example, when applied to discriminating the orientation tuning behavior of voxels from primary visual areas, SVM was able to detect minute differences in the orientation selectivity of individual voxels in human V1 (Ref. 73).
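To make the contrast with voxel-wise statistics concrete, the following toy sketch (invented two-voxel "activation patterns", not real fMRI data or the analysis of Ref. 73) computes a Fisher linear discriminant of the kind mentioned above and uses it to classify the training patterns:

```python
# Toy Fisher linear discriminant (FLD) for two classes of 2-voxel
# activation patterns. All numbers are invented for illustration.

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]

def fld_weights(class0, class1):
    """w = Sw^-1 (m1 - m0) for 2D data, with Sw the within-class scatter."""
    m0, m1 = mean(class0), mean(class1)
    # Within-class scatter matrix Sw (2x2), summed over both classes.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for data, m in ((class0, m0), (class1, m1)):
        for v in data:
            d = [v[0] - m[0], v[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    # Invert the 2x2 scatter matrix analytically.
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    dm = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * dm[0] + inv[0][1] * dm[1],
         inv[1][0] * dm[0] + inv[1][1] * dm[1]]
    # Decision threshold: midpoint of the projected class means.
    c = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, c

# Hypothetical voxel-pair responses to two stimulus conditions.
cls0 = [[1.0, 0.2], [1.2, 0.4], [0.9, 0.3]]
cls1 = [[0.3, 1.1], [0.4, 0.9], [0.2, 1.3]]
w, c = fld_weights(cls0, cls1)
label = lambda v: int(w[0] * v[0] + w[1] * v[1] > c)
print([label(v) for v in cls0], [label(v) for v in cls1])
```

The weight vector pools evidence across voxels rather than testing each voxel separately, which is what makes such linear readouts suited to distributed activation patterns.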
Finally, while fMRI provides detailed information about the "where" of the brain's functional architecture noninvasively, such localization information alone must leave pivotal questions about the brain's information processing (the "how" of the processing) unanswered, as long as the underlying pattern of neuronal connectivity cannot be mapped in an equally noninvasive manner. Future fMRI studies in cognitive neuroimaging will have to embrace a significantly more multimodal approach. For example, combining fMRI with diffusion tensor imaging (Refs. 74 and 75) will label the pattern of structural connectivity between functionally active areas. The direction of the flow of functional information within this mesh of neural networks could then be elucidated by performing time-resolved fMRI, effective connectivity analyses, and possibly also repetitive transcranial magnetic stimulation (rTMS), together with high-resolution fMRI experiments.
11.5 ACKNOWLEDGMENTS
We thank Drs Louis Toth, Itamar Ronen, Mina Kim and Kamil
Ugurbil for their help during the studies. This work was supported
by grants from NIH (MH67530, NS44820).
References
1. Bandettini PA, Wong EC, Hinks RS, Tikofsky RS, et al., Time course EPI
of human brain function during task activation, Magn Reson Med 25:
390–397, 1992.
2. Kwong KK, Belliveau J, Chesler DA, Goldberg IE, et al., Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation, Proc Natl Acad Sci USA 89: 5675–5679, 1992.
3. Ogawa S, Tank DW, Menon R, Ellermann JM, et al., Intrinsic signal changes accompanying sensory stimulation: Functional brain mapping with magnetic resonance imaging, Proc Natl Acad Sci USA 89: 5951–5955, 1992.
4. Ogawa S, Lee TM, Nayak AS, Glynn P, Oxygenation-sensitive contrast in magnetic resonance image of rodent brain at high magnetic fields, Magn Reson Med 14: 68–78, 1990.
5. Pauling L, Coryell CD, The magnetic properties and structures of hemoglobin, oxyhemoglobin, and carbonmonoxyhemoglobin, Proc Natl Acad Sci USA 22: 210–216, 1936.
6. Thulborn KR, Waterton JC, Matthews PM, Radda GK, Dependence of the transverse relaxation time of water protons in whole blood at high field, Biochem Biophys Acta 714: 1982.
7. Wagner AD, Schacter DL, Rotte M, Koutstaal W, et al., Building memories: Remembering and forgetting of verbal experiences as predicted by brain activity, Science 281: 1188–1191, 1998.
8. Kim SG, Ashe J, Hendrich K, Ellermann JM, et al., Functional magnetic resonance imaging of motor cortex: Hemispheric asymmetry and handedness, Science 261: 615–617, 1993.
9. Engel SA, Glover GH, Wandell BA, Retinotopic organization in human visual cortex and the spatial precision of functional MRI, Cereb Cortex 7: 181–192, 1997.
10. Sereno MI, Dale AM, Reppas JB, Kwong KK, et al., Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging, Science 268: 889–893, 1995.
11. Tootell RB, Mendola JD, Hadjikhani NK, Ledden PJ, et al., Functional analysis of V3A and related areas in human visual cortex, J Neurosci 17: 7060–7078, 1997.
12. Ugurbil K, Toth L, Kim DS, How accurate is magnetic resonance imaging of brain function? Trends Neurosci 26: 108–114, 2003.
13. Gandhi SP, Heeger DJ, Boynton GM, Spatial attention affects brain activity in human primary visual cortex, Proc Natl Acad Sci USA 96: 3314–3319, 1999.
14. Salmelin R, Schnitzler A, Parkkonen L, Biermann K, et al., Native language, gender, and functional organization of the auditory cortex, Proc Natl Acad Sci USA 96: 10460–10465, 1999.
15. Tagaris GA, Kim SG, Strupp JP, Andersen P, et al., Mental rotation studied by functional magnetic resonance imaging at high field (4 tesla): Performance and cortical activation, J Cogn Neurosci 9: 419–432, 1997.
16. Dehaene S, Spelke E, Pinel P, Stanescu R, et al., Sources of mathematical thinking: Behavioral and brain-imaging evidence, Science 284: 970–974, 1999.
17. Glover GH, Deconvolution of impulse response in event-related BOLD fMRI, Neuroimage 9: 416–429, 1999.
18. Boynton GM, Engel SA, Glover GH, Heeger DJ, Linear systems analysis of functional magnetic resonance imaging in human V1, J Neurosci 16: 4207–4221, 1996.
19. Sidtis JJ, Strother SC, Anderson JR, Rottenberg DA, Are brain functions really additive? Neuroimage 9: 490–496, 1999.
20. Vazquez AL, Noll DC, Nonlinear aspects of the BOLD response in functional MRI, Neuroimage 7: 108–118, 1998.
21. Brinker G, Bock C, Busch E, Krep H, et al., Simultaneous recording of evoked potentials and T₂-weighted MR images during somatosensory stimulation of rat, Magn Reson Med 41: 469–473, 1999.
22. Ogawa S, Lee TM, Stepnoski R, Chen W, et al., An approach to probe some neural systems interaction by functional MRI at neural time scale down to milliseconds, Proc Natl Acad Sci USA 97: 11026–11031, 2000.
23. Logothetis NK, Pauls J, Augath M, Trinath T, et al., Neurophysiological investigation of the basis of the fMRI signal, Nature 412: 150–157, 2001.
24. Toth LJ, Ronen I, Olman C, Ugurbil K, et al., Spatial correlation of BOLD activity with neuronal responses, Paper presented at Soc Neurosci Abstracts, 2001.
25. Lauritzen M, Relationship of spikes, synaptic activity, and local changes of cerebral blood flow, J Cereb Blood Flow Metab 21: 1367–1383, 2001.
26. Mathiesen C, Caesar K, Akgoren N, Lauritzen M, Modification of activity-dependent increases of cerebral blood flow by excitatory synaptic activity and spikes in rat cerebellar cortex, J Physiol 512(Pt 2): 555–566, 1998.
27. Mathiesen C, Caesar K, Lauritzen M, Temporal coupling between neuronal activity and blood flow in rat cerebellar cortex as indicated by field potential analysis, J Physiol 523(Pt 1): 235–246, 2000.
28. Braitenberg V, Brain size and number of neurons: An exercise in synthetic neuroanatomy, J Comput Neurosci 10: 71–77, 2001.
29. Ferster D, Linearity of synaptic interactions in the assembly of receptive fields in cat visual cortex, Curr Opin Neurobiol 4: 563–568, 1994.
30. Jagadeesh B, Wheat HS, Ferster D, Linearity of summation of synaptic potentials underlying direction selectivity in simple cells of the cat visual cortex, Science 262: 1901–1904, 1993.
31. Nelson S, Toth L, Sheth B, Sur M, Orientation selectivity of cortical neurons during intracellular blockade of inhibition, Science 265: 774–777, 1994.
32. Grinvald A, Lieke EE, Frostig RD, Hildesheim R, Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex, J Neurosci 14: 2545–2568, 1994.
33. Gulyas B, Orban GA, Duysens J, Maes H, The suppressive influence of moving textured backgrounds on responses of cat striate neurons to moving bars, J Neurophysiol 57: 1767–1791, 1987.
34. Knierim JJ, van Essen DC, Neuronal responses to static texture patterns in area V1 of the alert macaque monkey, J Neurophysiol 67: 961–980, 1992.
35. Toth LJ, Rao SC, Kim DS, Somers D, et al., Subthreshold facilitation and suppression in primary visual cortex revealed by intrinsic signal imaging, Proc Natl Acad Sci USA 93: 9869–9874, 1996.
36. Logothetis NK, Pauls J, Augath M, Trinath T, et al., Neurophysiological investigation of the basis of the fMRI signal, Nature 412: 150–157, 2001.
37. Ajima A, Matsuda Y, Ohki K, Kim DS, et al., GABA-mediated representation of temporal information in rat barrel cortex, Neuroreport 10: 1973–1979, 1999.
38. Toth LJ, Kim DS, Rao SC, Sur M, Integration of local inputs in visual cortex, Cereb Cortex 7: 703–710, 1997.
39. Waldvogel D, van Gelderen P, Muellbacher W, Ziemann U, et al., The relative metabolic demand of inhibition and excitation, Nature 406: 995–998, 2000.
40. Beaulieu C, Colonnier M, A laminar analysis of the number of round-asymmetrical and flat-symmetrical synapses on spines, dendritic trunks, and cell bodies in area 17 of the cat, J Comp Neurol 231: 180–189, 1985.
41. Koos T, Tepper JM, Inhibitory control of neostriatal projection neurons by GABAergic interneurons, Nat Neurosci 2: 467–472, 1999.
42. Payne BR, Peters A, The cat primary visual cortex, Academic Press, San Diego, 2001.
43. Kisvarday ZF, Kim DS, Eysel UT, Bonhoeffer T, Relationship between lateral inhibitory connections and the topography of the orientation map in cat visual cortex, Eur J Neurosci 6: 1619–1632, 1994.
44. Shepherd GM, The synaptic organization of the brain, Oxford University Press, Oxford, 1990.
45. Kim DS, Duong TQ, Kim SG, High-resolution mapping of iso-orientation columns by fMRI, Nat Neurosci 3: 164–169, 2000.
46. Malonek D, Grinvald A, Interactions between electrical activity and cortical microcirculation revealed by imaging spectroscopy: Implications for functional brain mapping, Science 272: 551–554, 1996.
47. Malonek D, Grinvald A, Vascular regulation at sub-millimeter range: Sources of intrinsic signals for high-resolution optical imaging, Adv Exp Med Biol 413: 215–220, 1997.
48. Vanzetta I, Grinvald A, Increased cortical oxidative metabolism due to sensory stimulation: Implications for functional brain imaging, Science 286: 1555–1558, 1999.
49. Sokoloff L, Reivich M, Kennedy C, Des Rosiers MH, et al., The [14C]deoxyglucose method for the measurement of local cerebral glucose utilization: Theory, procedure, and normal values in the conscious and anesthetized albino rat, J Neurochem 28: 897–916, 1977.
50. Fox PT, Raichle ME, Focal physiological uncoupling of cerebral blood flow and oxidative metabolism during somatosensory stimulation in human subjects, Proc Natl Acad Sci USA 83: 1140–1144, 1986.
51. Malonek D, Dirnagl U, Lindauer U, Yamada K, et al., Vascular imprints of neuronal activity: Relationships between the dynamics of cortical blood flow, oxygenation, and volume changes following sensory stimulation, Proc Natl Acad Sci USA 94: 14826–14831, 1997.
52. Frostig RD, Lieke EE, Ts'o DY, Grinvald A, Cortical functional architecture and local coupling between neuronal activity and the microcirculation revealed by in vivo high-resolution optical imaging of intrinsic signals, Proc Natl Acad Sci USA 87: 6082–6086, 1990.
53. Grinvald A, Lieke E, Frostig RD, Gilbert CD, et al., Functional architecture of cortex revealed by optical imaging of intrinsic signals, Nature 324: 361–364, 1986.
54. Crair MC, Gillespie DC, Stryker MP, The role of visual experience in the development of columns in cat visual cortex, Science 279: 566–570, 1998.
55. Shmuel A, Grinvald A, Functional organization for direction of motion and its relationship to orientation maps in cat area 18, J Neurosci 16: 6945–6964, 1996.
56. Buxton RB, The elusive initial dip, Neuroimage 13: 953–958, 2001.
57. Hu X, Le TH, Ugurbil K, Evaluation of the early response in fMRI in individual subjects using short stimulus duration, Magn Reson Med 37: 877–884, 1997.
58. Menon RS, Ogawa S, Hu X, Strupp JP, et al., BOLD-based functional MRI at 4 Tesla includes a capillary bed contribution: Echo-planar imaging correlates with previous optical imaging using intrinsic signals, Magn Reson Med 33: 453–459, 1995.
59. Logothetis NK, Guggenberger H, Peled S, Pauls J, Functional imaging of the monkey brain, Nat Neurosci 2: 555–562, 1999.
60. Lindauer U, Royl G, Leithner C, Kuhl M, et al., No evidence for early decrease in blood oxygenation in rat whisker cortex in response to functional activation, Neuroimage 13: 988–1001, 2001.
61. Marota JJ, Ayata C, Moskowitz MA, Weisskoff RM, et al., Investigation of the early response to rat forepaw stimulation, Magn Reson Med 41: 247–252, 1999.
62. Silva AC, Kim SG, Pseudo-continuous arterial spin labeling technique for measuring CBF dynamics with high temporal resolution, Magn Reson Med 42: 425–429, 1999.
63. Mayhew J, Johnston D, Martindale J, Jones M, et al., Increased oxygen consumption following activation of brain: Theoretical footnotes using spectroscopic data from barrel cortex, Neuroimage 13: 975–987, 2001.
64. Yacoub E, Le TH, Ugurbil K, Hu X, Further evaluation of the initial negative response in functional magnetic resonance imaging, Magn Reson Med 41: 436–441, 1999.
65. Logothetis N, Can current fMRI techniques reveal the micro-architecture of cortex? Nat Neurosci 3: 413–414, 2000.
66. Kim DS, Duong TQ, Kim SG, Reply to "Can current fMRI techniques reveal the micro-architecture of cortex?" Nat Neurosci 3: 414, 2000.
67. Cannestra AF, Pouratian N, Bookheimer SY, Martin NA, et al., Temporal spatial differences observed by functional MRI and human intraoperative optical imaging, Cereb Cortex 11: 2001.
68. Thompson JK, Peterson MR, Freeman RD, Single-neuron activity and tissue oxygenation in the cerebral cortex, Science 299: 1070–1072, 2003.
69. Duong TQ, Kim DS, Ugurbil K, Kim SG, Localized cerebral blood flow response at submillimeter columnar resolution, Proc Natl Acad Sci USA 98: 10904–10909, 2001.
70. Luh WM, Wong EC, Bandettini PA, Ward BD, et al., Comparison of simultaneously measured perfusion and BOLD signal increases during brain activation with T(1)-based tissue identification, Magn Reson Med 44: 137–143, 2000.
71. Haxby JV, Gobbini MI, Furey ML, Ishai A, et al., Distributed and overlapping representations of faces and objects in ventral temporal cortex, Science 293: 2425–2430, 2001.
72. Ishai A, Ungerleider LG, Haxby JV, Distributed neural systems for the generation of visual images, Neuron 28: 979–990, 2000.
73. Kim DS, Kim M, Ronen I, Formisano E, et al., In vivo mapping of functional domains and axonal connectivity in cat visual cortex using magnetic resonance imaging, Magn Reson Imaging 21: 1131–1140, 2003.
74. Kim M, Ducros M, Carlson T, Ronen I, et al., Anatomical correlates of the functional organization in the human occipito-temporal cortex, Magn Reson Imaging 24: 583–590, 2006.
75. Singer W, Time as coding space? Curr Opin Neurobiol 9: 189–194, 1999.
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch12 FA
CHAPTER 12
Recent Advances in Diffusion
Magnetic Resonance Imaging
DaeShik Kim and Itamar Ronen
Diffusion weighted magnetic resonance imaging (DWI) plays an increasingly important role in the clinical and basic neurosciences, thanks to DWI's exceptional capability to represent structural properties of neural tissue as local water molecular displacements: changes in mean diffusivity reflect changes in macroscopic structural properties, while gradient-direction encoded diffusion tensor imaging (DTI) can reveal neuroanatomical connections in a noninvasive manner. Finally, recent advances in compartment-specific diffusion MRI suggest that microscopic cellular tissue properties might be measurable as well using diffusion MRI.
12.1 INTRODUCTION
Magnetic resonance imaging has paved the way for accurately mapping the structural and functional properties of the brain in vivo. In particular, the intrinsic noninvasiveness of magnetic resonance (MR) methods and the sensitivity of the MR signal to subtle changes in the structural and physiological neuronal tissue fabric make it a nearly ideal research and diagnostic tool for characterizing intact neural tissue and for studying processes that affect neural tissue properties, such as cortical thinning, demyelination, and nerve degeneration/regeneration following injury. To this end, the technique of diffusion weighted MRI (DWI) has become one of the primary research and diagnostic tools in evaluating tissue structure, thanks to its ability to represent structural properties of neural tissue as local water molecular displacements. For example, the sharp differences in structural characteristics between tissue types in the central nervous system have been extensively exploited in countless DWI applications, ranging from the characterization of ischemia (Refs. 1 and 2) and the demarcation of brain tumors (Ref. 3) to the extensive investigation of connectivity through the use of diffusion tensor imaging (DTI) (Refs. 4 and 5). In addition, recent advances in diffusion tensor imaging promise to label axonal connectivity patterns in a noninvasive manner by utilizing directionally encoded local water diffusivity. Finally, recent advances in compartment-specific diffusion MRI suggest that diffusion MRI might also be able to provide semiquantitative information about microscopic cellular tissue properties.
12.1.1 Brownian Motion and Molecular Diffusion
The essential nature of diffusion is that a group of molecules that start at the same location will spread out over time. Each molecule experiences a series of random displacements, so that after a time T, the spread of position along a spatial axis x has a variance of:

\sigma_x^2 = 2DT,   (1)

where D is the diffusion coefficient, a constant characteristic of the medium. The diffusion of water molecules in most biological tissues is recognized as being smaller than the value in pure water. In brain tissue, the diffusion coefficient is two to ten times lower than in pure water (Ref. 6). It has been shown that in brain gray matter, the diffusion properties are relatively independent of orientation (i.e. isotropic). Conversely, in fibrous tissues such as brain white matter, the diffusion properties vary with orientation. A very important empirical observation is that the diffusion parallel to the fiber is much greater than the diffusion perpendicular to it (Ref. 7). This variation with orientation is termed diffusion anisotropy (Fig. 1). Isotropic diffusion may indicate either a structurally isotropic medium, or the existence of multiple anisotropic structures that are randomly oriented in the same sample volume.
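Equation (1) can be checked with a minimal random-walk simulation (illustrative parameters, pure Python for portability): each walker takes Gaussian steps of variance 2D·dt, and after a time T the empirical variance of the positions should approach 2DT.

```python
import random

random.seed(0)

D = 1.0        # diffusion coefficient (arbitrary units)
dt = 0.05      # time step
steps = 100    # total diffusion time T = steps * dt = 5.0
walkers = 4000

# Each step is Gaussian with variance 2*D*dt, so after time T the
# positions should have variance close to 2*D*T, as in Eq. (1).
sigma_step = (2 * D * dt) ** 0.5
final = []
for _ in range(walkers):
    x = 0.0
    for _ in range(steps):
        x += random.gauss(0.0, sigma_step)
    final.append(x)

T = steps * dt
var = sum(x * x for x in final) / walkers   # the mean is zero by symmetry
print(var, 2 * D * T)   # the two values should agree to within a few percent
```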
Fig. 1. The upper panel shows a schematic representation of a typical white matter voxel. The voxel is mostly occupied by closely packed myelinated axons. Water molecule diffusion is restricted in the direction perpendicular to the axonal fibers, leading to an anisotropic diffusion pattern. In the lower panel, a schematic representation of a gray matter voxel is shown. Although the presence of cell membranes still poses a restriction on diffusion, the well oriented structure of the white matter fiber tract no longer exists, and thus the diffusion pattern is more isotropic.
12.1.2 Anisotropic Diffusion
Whereas the factors that determine the lower diffusion coefficient in brain tissue and the anisotropic water diffusion in white matter are not completely understood, it is assumed that increased viscosity, tissue compartmentalization, as well as interaction with structural components of the tissue such as macromolecules, membranes, and intracellular organelles contribute to this phenomenon. One hypothesis of biological diffusion properties is related to the restriction of diffusion by obstacles such as membranes (Refs. 6 and 8). For very short diffusion times (i.e. if the diffusion path is short relative to the structural dimensions), the molecular diffusion should resemble free diffusion in a homogeneous medium. As the diffusion time increases, the water molecules diffuse far enough to encounter obstacles that may obstruct their movement. In certain media where the diffusion is restricted by impermeable barriers, it has been shown that as the diffusion time increases, the diffusion coefficient decreases when the diffusion distance becomes comparable with the structure dimensions (Ref. 6). Another hypothesis is that the behavior of water diffusion in tissue may reflect hindered rather than restricted diffusion (Refs. 6 and 9). The movement of water molecules may be hindered by much slower moving macromolecules and by membranes, resulting in complicated, tortuous pathways. The anisotropic behavior of diffusion in white matter may also be due to the intrinsic order of the axoplasmatic medium (Ref. 6). The presence of microtubules and neurofilaments associated with axonal transport and the lamellar structure of the myelin sheath may inhibit motion perpendicular to axons, but does not restrict motion parallel to the fiber. When diffusion is hindered, the observed or apparent diffusion coefficient relates to the inherent diffusion coefficient, D₀, through a tortuosity factor, λ (Ref. 9):

D_{app} = \frac{D_0}{\lambda^2}.   (2)
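As a quick numerical illustration of Eq. (2) (the values below are assumptions chosen for illustration, not measurements from the chapter):

```python
# Apparent diffusion coefficient from tortuosity, Eq. (2).
# D0 and lam are illustrative values, not measured ones.
D0 = 3.0e-3        # inherent diffusion coefficient, mm^2/s (about free water at body temperature)
lam = 1.6          # assumed tortuosity factor
D_app = D0 / lam ** 2
print(D_app)
```

The result, roughly 1.2 × 10⁻³ mm²/s, falls in the two-to-tenfold reduction relative to pure water that the text quotes for brain tissue.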
12.1.3 Data Acquisition for DWI and DTI
As suggested by Stejskal and Tanner (Ref. 10), the MR image is sensitized to diffusion in a given direction using a pair of temporally separated pulsed B₀ field gradients in the desired direction. The application of a magnetic field gradient pulse along, e.g., one of the three spatial dimensions (x, y, z) dephases the protons (spins) along the respective dimension (Fig. 2). A second pulse in the same direction, but of opposite polarity ("refocusing pulse"), will rephase these spins. However, such rephasing cannot be perfect if the protons moved between the
Fig. 2. MR pulse sequence for diffusion tensor imaging (DTI), showing the RF pulses (90°, 180°), the diffusion gradient pulses of duration δ and separation ∆ on the gradient channels (g_x, g_y, g_z), and the MR signal acquired at echo time TE. The direction of the magnetic field gradient shown is the one in which g(x) = g(y) and g(z) = 0, i.e. g = (1, 1, 0). See text for further details.
two gradient pulses. That is to say, the signal loss, which cannot be recovered after the application of the second gradient pulse, is a function of the local molecular motion. The amount of molecular diffusion is known to obey Eq. (3), assuming the sample is isotropic (no directionality in water diffusion):

\frac{S}{S_0} = e^{-\gamma^2 G^2 \delta^2 (\Delta - \delta/3) D},   (3)

where S and S₀ are the signal intensities with and without the diffusion weighting, γ is a constant (the gyromagnetic ratio), G and δ are the gradient strength and duration, and ∆ is the separation between the pair of gradient pulses. Because these parameters are all known, the diffusion constant at each voxel can be derived from the amount of signal decrease (S/S₀). Such measurements have revealed that the diffusion of brain water has a strong directionality (anisotropy), which
is attributed to the existence of natural boundaries, such as axons and/or myelination. The properties of such water diffusion can be expressed as an ellipsoid, the "diffusion ellipsoid" (Refs. 11 and 12). This ellipsoid can be characterized by six parameters: the diffusion constants along the longest, middle, and shortest axes (λ₁, λ₂, and λ₃, called the principal axes) and the directions of the three principal axes, which are perpendicular to each other. Once the diffusion ellipsoid is fully characterized at each pixel of the brain images, the local fiber structure can be derived. For example, if λ₁ ≫ λ₂ ≥ λ₃ (diffusion is anisotropic), it suggests the existence of dense and aligned fibers within each pixel, whereas isotropic diffusion (λ₁ ≈ λ₂ ≈ λ₃) suggests sparse or unaligned fibers. When diffusion is anisotropic, the direction of λ₁ indicates the direction of the fibers.
12.1.4 Measures of Anisotropy Using Diffusion Tensors
One important application of the diffusion tensor is the quantitative characterization of brain tissue structure and of the degree of anisotropy in brain white matter. Several scalar measures, which emphasize different tensor features, have been derived from the diffusion tensor by different groups.^{7,13,14} To this end, the diffusion tensor elements can be calculated by:
b = \gamma^2 \delta^2 (\Delta - \delta/3) G^2,    (4)

S = S_0 \exp(-bD),    (5)

D = \frac{1}{b} \ln\frac{S_0}{S}.    (6)
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch12 FA
Recent Advances in Diffusion Magnetic Resonance Imaging
295

While the diffusion D is a scalar for conventional DWI, it is a tensor in the case of DTI data. That is, instead of being characterized by a single number, it is described by a 3×3 matrix of numbers. For example, if the diffusion-sensitizing gradient pulses are applied along the x-axis, u = (1, 0, 0), or if the measurement axis is at an angle θ to the x-axis and in the x-y plane, u = (cos θ, sin θ, 0), then the measured value of D along any axis u is given by:

D = \begin{pmatrix} u_x & u_y & u_z \end{pmatrix}
    \begin{pmatrix} D_{xx} & D_{xy} & D_{xz} \\
                    D_{xy} & D_{yy} & D_{yz} \\
                    D_{xz} & D_{yz} & D_{zz} \end{pmatrix}
    \begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix},    (7)
D = u_x^2 D_{xx} + u_y^2 D_{yy} + u_z^2 D_{zz} + 2u_x u_y D_{xy} + 2u_y u_z D_{yz} + 2u_z u_x D_{zx},    (8)
\therefore \frac{1}{b} \ln\frac{S_0}{S} = u_x^2 D_{xx} + u_y^2 D_{yy} + u_z^2 D_{zz} + 2u_x u_y D_{xy} + 2u_y u_z D_{yz} + 2u_z u_x D_{zx}.    (9)
For example, for 12 directions,

\begin{pmatrix} \frac{1}{b}\ln\frac{S_0}{S_1} \\ \vdots \\ \frac{1}{b}\ln\frac{S_0}{S_{12}} \end{pmatrix} = U D,    (10)

where

U = \begin{pmatrix}
u_{x1}^2 & u_{y1}^2 & u_{z1}^2 & u_{x1}u_{y1} & u_{y1}u_{z1} & u_{z1}u_{x1} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
u_{x12}^2 & u_{y12}^2 & u_{z12}^2 & u_{x12}u_{y12} & u_{y12}u_{z12} & u_{z12}u_{x12}
\end{pmatrix}

and

D = \begin{pmatrix} D_{xx} \\ D_{yy} \\ D_{zz} \\ 2D_{xy} \\ 2D_{yz} \\ 2D_{zx} \end{pmatrix}.
Now, if we assume that the columns of U are linearly independent, then the matrix U^T U is invertible and the least squares solution is

D_0 = (U^T U)^{-1} U^T \begin{pmatrix} \frac{1}{b}\ln\frac{S_0}{S_1} \\ \vdots \\ \frac{1}{b}\ln\frac{S_0}{S_{12}} \end{pmatrix}.    (11)
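The least-squares estimate of Eqs. 9-11 can be sketched with NumPy. The 12 gradient directions and the ground-truth tensor below are illustrative assumptions made only to simulate noiseless data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth tensor used to simulate the measurements (prolate, principal
# axis along x); all numbers are illustrative, not from the chapter.
D_true = np.diag([1.5e-3, 0.4e-3, 0.4e-3])   # mm^2/s
b, S0 = 1000.0, 1.0                          # s/mm^2, arbitrary signal units

# 12 random unit gradient directions u_i (a real protocol would use a
# uniform scheme; any set giving linearly independent columns of U works).
u = rng.normal(size=(12, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)

# Simulated signals via Eq. 5 with the projected diffusivity of Eq. 9.
uDu = np.einsum('ij,jk,ik->i', u, D_true, u)
S = S0 * np.exp(-b * uDu)

# Design matrix U of Eq. 10: rows [ux^2, uy^2, uz^2, ux*uy, uy*uz, uz*ux].
U = np.column_stack([u[:, 0]**2, u[:, 1]**2, u[:, 2]**2,
                     u[:, 0]*u[:, 1], u[:, 1]*u[:, 2], u[:, 2]*u[:, 0]])

# Eq. 11: solve (U^T U) d = U^T y with y_i = (1/b) ln(S0/S_i).
y = np.log(S0 / S) / b
d = np.linalg.solve(U.T @ U, U.T @ y)   # [Dxx, Dyy, Dzz, 2Dxy, 2Dyz, 2Dzx]
print(d)
```

With noiseless data the solve recovers the simulated tensor elements exactly; with real data the same expression gives the least-squares fit.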
January 22, 2008 12:2 WSPC/SPIB540:Principles and Recent Advances ch12 FA
296
DaeShik Kim and Itamar Ronen
Since the 3×3 tensor matrix

D = \begin{pmatrix} D_{xx} & D_{xy} & D_{xz} \\
                    D_{xy} & D_{yy} & D_{yz} \\
                    D_{xz} & D_{yz} & D_{zz} \end{pmatrix}

is symmetric along the diagonal, the eigenvalues and eigenvectors can be obtained by diagonalizing the matrix using the Jacobi transformation. The resulting eigenvalues

\Lambda = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}

and corresponding eigenvectors P = (\vec{p}_1 \; \vec{p}_2 \; \vec{p}_3) can then be used to describe the diffusivity and directionality (or anisotropy) of water diffusion within a given voxel. An important measure associated with the diffusion tensor is its trace:

\mathrm{tr}\{D\} = D_{xx} + D_{yy} + D_{zz} = 3\bar{\lambda} = \lambda_1 + \lambda_2 + \lambda_3.    (12)
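A minimal sketch of the diagonalization and trace computation; the tensor values are hypothetical, and NumPy's symmetric eigensolver stands in for an explicit Jacobi sweep:

```python
import numpy as np

# A hypothetical symmetric 3x3 diffusion tensor for one voxel (mm^2/s).
D = np.array([[1.4e-3, 0.1e-3, 0.0],
              [0.1e-3, 0.5e-3, 0.0],
              [0.0,    0.0,    0.4e-3]])

# Because D is symmetric, eigh returns real eigenvalues and orthonormal
# eigenvectors (playing the role of the Jacobi transformation in the text).
evals, evecs = np.linalg.eigh(D)          # sorted ascending
lam3, lam2, lam1 = evals                  # lambda_1 >= lambda_2 >= lambda_3
p1 = evecs[:, 2]                          # principal eigenvector

# Eq. 12: tr{D} = Dxx + Dyy + Dzz = lambda_1 + lambda_2 + lambda_3.
assert np.isclose(np.trace(D), lam1 + lam2 + lam3)
mean_diffusivity = np.trace(D) / 3.0      # the lambda-bar of Eq. 12
print(lam1, p1, mean_diffusivity)
```

The invariance of the trace under rotation is what makes it usable as a direction-independent tissue measure.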
The trace has similar values in healthy white and gray matter (tr{D} ∼ 2.1×10^{-3} mm²/s). However, the trace value drops considerably in brain tissue affected by acute stroke.^{15} This drop is attributed to an increase in the tortuosity factor due to the shrinkage of the extracellular space.^{15} Consequently, the trace of the diffusion tensor can be used as an early indicator of ischemic brain injury. Finally, the anisotropy of the diffusion tensor characterizes the amount of diffusion variation as a function of direction (i.e. the deviation from isotropy). Several of these anisotropy measures are normalized to a range from 0 to 1. One of the most commonly used measures of anisotropy is the fractional anisotropy (FA)^{7}:
FA = \frac{1}{\sqrt{2}} \sqrt{\frac{(\lambda_1-\lambda_2)^2 + (\lambda_2-\lambda_3)^2 + (\lambda_3-\lambda_1)^2}{\lambda_1^2 + \lambda_2^2 + \lambda_3^2}},    (13)

which is the ratio of the root-mean-square (RMS) of the eigenvalues' deviation from their mean, normalized by the eigenvalues' Euclidean norm. FA has been shown to provide the best contrast between different classes of brain tissues.^{16}
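Eq. 13 translates directly into code; the two eigenvalue triples evaluated below are illustrative limiting cases:

```python
import numpy as np

def fractional_anisotropy(l1, l2, l3):
    """Eq. 13: RMS deviation of the eigenvalues from their mean, normalized
    by their Euclidean norm (the 1/sqrt(2) factor keeps FA in [0, 1])."""
    num = np.sqrt((l1 - l2)**2 + (l2 - l3)**2 + (l3 - l1)**2)
    den = np.sqrt(l1**2 + l2**2 + l3**2)
    return num / (np.sqrt(2.0) * den)

print(fractional_anisotropy(1.0, 1.0, 1.0))   # isotropic voxel -> 0.0
print(fractional_anisotropy(1.0, 0.0, 0.0))   # diffusion along one axis -> 1.0
```

The two prints show the normalization at work: FA is 0 for perfectly isotropic diffusion and 1 when all diffusivity lies along a single axis.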
A useful way to display tract orientation is to use color to encode the direction of the tensor's major eigenvector.^{17,18} The 3D eigenvector space is associated with the 3D RGB (red-green-blue) color space by assigning a color to each component of the eigenvector (e.g. red to x, green to y, and blue to z). Consequently, the fibers that are oriented from left to right of the brain appear red, the fibers oriented anteriorly-posteriorly (front-back) appear green, and those oriented superiorly-inferiorly (top-bottom) appear blue (Fig. 3). All the other orientations are combinations of these three colors. Color maps allow the identification of different white matter structures. Eigenvector color maps for three orthogonal planes in a 3D brain volume are presented in Fig. 3. The color intensities are weighted by FA to emphasize white matter anatomy.

Fig. 3. Color maps of several brain slices. (Left) axial, (middle) coronal, and (right) sagittal slices. See text for further details.
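The color-encoding scheme above can be sketched in a few lines; the eigenvectors and FA values below are invented stand-ins for per-voxel tensor-fit output:

```python
import numpy as np

# Three hypothetical voxels with unit principal eigenvectors e1 and FA values.
e1 = np.array([[1.0, 0.0, 0.0],    # left-right fiber
               [0.0, 1.0, 0.0],    # anterior-posterior fiber
               [0.0, 0.0, 1.0]])   # superior-inferior fiber
fa = np.array([0.9, 0.6, 0.3])

# Map |e1| components to RGB (red = x, green = y, blue = z) and weight the
# intensity by FA, as in the FA-weighted color maps described in the text.
rgb = np.abs(e1) * fa[:, None]
print(rgb)   # rows: pure red, green, blue, each dimmed by its FA
```

Taking the absolute value makes antiparallel eigenvectors (which describe the same fiber orientation) map to the same color.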
12.1.5 White Matter Tractography
White matter tractography (WMT) is based on the estimation of white matter tract orientation using the measured diffusion properties of water, as described in the previous sections. Some of the major techniques for DTI-based fiber tractography are discussed below.
12.1.6 Propagation Algorithms
In algorithms developed by many groups,^{11} continuous representations of the diffusion tensor and of the principal eigenvector ε_1 are interpolated from the discrete voxel data. The fiber track direction at any location along the tract is given by the continuous ε_1. Typically, the tracking algorithm stops when the fiber radius of curvature or the anisotropy factor falls below a threshold (Figs. 4 and 5). With this approach, the fiber is not represented by a succession of line segments but by a relatively smooth curve that follows the local diffusion direction and is more representative of the behavior of real fibers. These two approaches, often designated "streamline" approaches, are based on the assumption that diffusion is locally uniform and can be accurately described by a single vector ε_1. Unfortunately, this fails to describe voxels occupied by fibers with different diffusion tensors.^{19}

Fig. 4. In vivo high-resolution diffusion tensor imaging (DTI) of the human corpus callosum. The left panel depicts the user-defined seeding ROI for fiber reconstruction, and the right panel shows the result of DTI-based fiber tractography of the human corpus callosum.
Furthermore, the presence of noise in the diffusion MRI data induces a small uncertainty in the direction of the vectors ε_1, which can lead to significant fiber tract propagation errors. To try to overcome these problems, tensorline approaches have been developed such that the entire tensor information is used instead of reducing it to a single eigenvector.^{20,21} A recently proposed approach is a continuous approximation of the tensor field using B-splines to derive fiber tracts. Tensorline algorithms seem to perform better than streamline algorithms for reconstructing low-curvature fibers and, in general, achieve better reproducibility. Poupon et al.^{22,23} have developed an algorithm based on a probabilistic approach aimed at minimizing fiber bending along the fiber tract. A regularization step based on the analogy between fiber pathways in white matter and so-called "spaghetti plates" is used to improve robustness. A consequence of this approach is that it can represent fiber branching and forks that are typically present in white matter fascicles, a clear advantage over previously published methods.

Fig. 5. The explanatory power of DTI can further be increased by combining DTI fiber tractography with conventional functional imaging. Here, the areas of high functional MRI (fMRI) activity during visual stimulation along the human ventro-temporal cortex are used as seeding points for DTI-based fiber reconstructions.
12.1.6.1 Fiber assignment by continuous tracking
Mori et al.^{24} developed one of the earliest and most commonly employed algorithms: fiber assignment by continuous tracking (FACT). FACT is based on the extrapolation of continuous vector lines from discrete DTI data. The reconstructed fiber direction within each voxel is parallel to the diffusion tensor eigenvector (ε_1) associated with the greatest eigenvalue (λ_1). Within each voxel, the fiber tract is a line segment defined by the input position, the direction of ε_1, and an output position at the boundary with the next voxel. The track is propagated from voxel to voxel and terminated when a sharp turn in the fiber orientation occurs. FACT uses as the propagation direction the value corresponding to the current voxel: v*_prop = v_prop. The value of p is chosen such that the current step crosses the entire voxel and reaches its boundary. To this end, the trajectory is formed by a series of segments of variable length. FACT integration has the advantage of high computational efficiency.
12.1.6.2 Streamline tracking
The streamline tracking (STT) technique^{11,25,26} approximates v_prop by the major eigenvector of the tensor:

v_{prop} = e_1.    (14)

This approach is analogous to simulated flow propagation in fluid dynamics, including the study of blood flow phenomena from MRI flow measurements with 3D phase contrast.^{27}
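A minimal streamline tracker in the spirit of Eq. 14 can be sketched as follows. The synthetic tensor field, fixed step size, and stopping thresholds are all illustrative assumptions of this sketch, not the published implementations:

```python
import numpy as np

def tensor_at(pos):
    # Synthetic stand-in for an interpolated tensor field: fibers along x.
    return np.diag([1.5e-3, 0.3e-3, 0.3e-3])

def track(seed, step=0.5, n_steps=10, fa_min=0.2, max_turn_deg=45.0):
    pos = np.asarray(seed, dtype=float)
    v_prev, path = None, [pos.copy()]
    for _ in range(n_steps):
        evals, evecs = np.linalg.eigh(tensor_at(pos))
        e1 = evecs[:, -1]                      # v_prop = e1 (Eq. 14)
        if v_prev is not None and e1 @ v_prev < 0:
            e1 = -e1                           # eigenvectors are sign-ambiguous
        l1, l2, l3 = evals[::-1]
        fa = np.sqrt(((l1 - l2)**2 + (l2 - l3)**2 + (l3 - l1)**2)
                     / (2 * (l1**2 + l2**2 + l3**2)))
        if fa < fa_min:                        # low-anisotropy stop criterion
            break
        if v_prev is not None:
            turn = np.degrees(np.arccos(np.clip(e1 @ v_prev, -1.0, 1.0)))
            if turn > max_turn_deg:            # sharp-curvature stop criterion
                break
        pos = pos + step * e1
        path.append(pos.copy())
        v_prev = e1
    return np.array(path)

path = track([0.0, 0.0, 0.0])
print(path.shape, path[-1])   # the track marches along the x fiber direction
```

A real implementation would interpolate the tensor field between voxels rather than return a constant tensor, but the propagation and stopping logic take this form.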
12.1.6.3 Tensor deflection
An alternative approach for determining tract direction is to use the entire diffusion tensor to deflect the incoming vector (v_in) direction:^{14,28}

v_{out} = D \cdot v_{in}.    (15)

The incoming vector represents the propagation direction from the previous integration step. The tensor operator deflects the incoming vector towards the major eigenvector direction, but limits the curvature of the deflection, which should result in smoother tract reconstructions. Tensor deflection (TEND) was proposed in order to improve propagation in regions with low anisotropy, such as crossing-fiber regions, where the direction of fastest diffusivity is not well defined.^{29}
12.1.6.4 Tensorline algorithms
The tensorline algorithm, described by Weinstein et al.,^{30} dynamically modulates the STT and TEND contributions to steer the tract:

v_{out} = f e_1 + (1 - f)((1 - g)v_{in} + g D v_{in}),    (16)

where f and g are user-defined weighting factors that vary between 0 and 1. The algorithm has three terms: (a) an STT term (e_1) weighted by f, (b) a TEND term (D · v_in) weighted by (1 − f)g, and (c) an undeviated v_in term weighted by (1 − f)(1 − g). The vectors are normalized to unity before being used in Eq. 16. Estimated trajectories with different properties can be achieved by changing f and g. Tensorline may be considered a family of tractography algorithms that can be tuned to accentuate specific behavior. In the original implementation of this algorithm, Weinstein et al. used a measure of prolate tensor shape, f = CL,^{14} to weight the STT term. Note that for f = 1, the tensorline algorithm is equivalent to STT.
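A single tensorline update step (Eq. 16) can be sketched as below. Normalizing the TEND term before mixing is an assumption of this sketch, and the tensor, eigenvector, and incoming direction are illustrative values:

```python
import numpy as np

# Illustrative local quantities: a prolate tensor with fibers along x, its
# major eigenvector, and a unit incoming direction from the previous step.
D = np.diag([1.5e-3, 0.3e-3, 0.3e-3])
e1 = np.array([1.0, 0.0, 0.0])
v_in = np.array([0.8, 0.6, 0.0])

def tensorline_step(e1, D, v_in, f, g):
    # Eq. 16: v_out = f*e1 + (1-f)*((1-g)*v_in + g*D@v_in).
    # The deflected vector D@v_in only matters up to scale, so this sketch
    # normalizes it before mixing (an assumption, not from the chapter).
    tend = D @ v_in
    tend = tend / np.linalg.norm(tend)
    v_out = f * e1 + (1 - f) * ((1 - g) * v_in + g * tend)
    return v_out / np.linalg.norm(v_out)

print(tensorline_step(e1, D, v_in, f=1.0, g=0.5))   # f = 1 reduces to STT: e1
print(tensorline_step(e1, D, v_in, f=0.0, g=1.0))   # pure TEND deflection
```

The second print shows the TEND behavior: the incoming direction is pulled toward the major eigenvector without snapping onto it.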
12.1.6.5 Probabilistic mapping algorithm
Diffusion tensor imaging is based on the assumption that the local orientation of nerve fibers is parallel to the first eigenvector of the diffusion tensor. However, due to issues such as imaging noise, limited spatial resolution, and partial volume effects, the fiber orientation cannot be determined without uncertainty. Probabilistic methods for determining the connectivity between brain regions using information obtained from DTI have recently been introduced.^{31-33} These approaches utilize probability density functions (PDFs) defined at each point within the brain to describe the local uncertainty in fiber orientation. The probabilistic tractography algorithm reveals fiber connectivity that progresses into the gray matter, where conventional streamline algorithms fail to yield acceptable results. The goal of probabilistic tracking approaches is to determine the probability that fibers project from a starting point (or group of points) to regions of interest.

In the data analysis performed in this research, the local fiber orientation is given by the first eigenvector of the diffusion tensor, which we call ε_1. To perform probabilistic tracking, we need to introduce an uncertainty in the ε_1 orientation at every point along a fiber created by a streamline tracking method. Then we repeat the tracking a great number of times to generate a 3D probability map. The uncertainty in the ε_1 orientation can be described by the probability that it is deflected about its original position. The result of this deflection is a vector ε′_1; θ is the angle between ε_1 and ε′_1, and φ is the rotation of ε′_1 about ε_1. The PDFs for θ and φ are given by the zeroth-order model of uncertainty described in Ref. 32: φ is uniformly distributed between 0 and 2π, and θ is normally distributed about 0 with a standard deviation Sigma linked to FA. Indeed, the smaller the FA, the greater the fiber orientation uncertainty. We define Sigma = S(FA), S being a sigmoid function. In our computation, we can modify the sigmoid function parameters: Sigma_max, the standard deviation of θ as FA tends to 0; Sigma_0, the standard deviation of θ as FA tends to 1 (i.e. a residual uncertainty); FA_0, the value of FA for which Sigma = (Sigma_0 + Sigma_max)/2; and slope, the slope of the sigmoid. To create a probabilistic map, a great number of fibers are generated using the streamline tracking algorithm. At each point along the fiber propagation, ε_1 is modified into ε′_1 using a random number generator and the PDFs for φ and θ described above. The probability map is the number of fibers reaching a voxel divided by the total number of fibers that were generated. When probabilistic tracking is performed from multiple starting points (such as an entire ROI), the probability is multiplied by the number of starting points.
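The central ingredient of this scheme, deflecting ε_1 by a random (θ, φ), can be sketched as follows. The sigmoid parameters are illustrative stand-ins for the Sigma_max, Sigma_0, FA_0, and slope values discussed above:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigma_of_fa(fa, sigma0=0.05, sigma_max=0.6, fa0=0.3, slope=10.0):
    # Decreasing sigmoid: low FA -> large angular uncertainty (radians).
    return sigma0 + (sigma_max - sigma0) / (1.0 + np.exp(slope * (fa - fa0)))

def perturb(e1, fa):
    theta = rng.normal(0.0, sigma_of_fa(fa))   # polar deflection ~ N(0, Sigma)
    phi = rng.uniform(0.0, 2 * np.pi)          # azimuth ~ U(0, 2*pi)
    # Build an orthonormal frame (e1, a, b) and tilt e1 by theta toward the
    # in-plane direction at azimuth phi.
    a = np.cross(e1, [0.0, 0.0, 1.0])
    if np.linalg.norm(a) < 1e-8:               # e1 parallel to z: use another axis
        a = np.cross(e1, [0.0, 1.0, 0.0])
    a /= np.linalg.norm(a)
    b = np.cross(e1, a)
    return np.cos(theta) * e1 + np.sin(theta) * (np.cos(phi) * a + np.sin(phi) * b)

e1 = np.array([1.0, 0.0, 0.0])
samples = np.array([perturb(e1, fa=0.8) for _ in range(2000)])
print(samples.mean(axis=0))   # tightly clustered around e1 at high FA
```

Repeating the whole streamline propagation with such perturbed eigenvectors, and counting how often each voxel is visited, yields the 3D probability map described in the text.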
12.1.7 Limitations of DTI Techniques
Despite its great promise for visualizing and quantitatively characterizing white matter connections, DTI has some important limitations. It is not clear what is actually being measured with the anisotropy index. For example, the precise contribution of two factors, fiber density and myelination, to the anisotropy index has not been completely understood. Thus, it is not clear to what degree the results of DTI correspond to the actual density and orientation of the local axonal fiber bundles. It is also important to understand how white matter is, in general, organized. The most basic shortcoming of DTI is that it can only determine a single fiber orientation at any given location in the brain. This is clearly inadequate in regions with complex white matter architecture, where different axonal pathways crisscross through each other. The crossing fibers create multiple fiber orientations within a single MRI voxel, where a voxel refers to a 3D pixel and constitutes the individual element of the MR image. Since the diffusion tensor assumes only a single preferred direction of diffusion within each voxel, DTI cannot adequately describe regions of crossing fibers, or of converging or diverging fibers. 3D DTI fiber tracking techniques are also confounded in these regions of complex white matter architecture, since there is no well-defined single dominant fiber orientation for them to follow.

In recent years, some of these problems have been addressed by measuring the full 3D dispersion of water diffusion in each MRI voxel at high angular resolution. Thus, instead of obtaining diffusion measurements in only a few independent directions to determine a single fiber orientation as in DTI, dozens or even hundreds of uniformly distributed diffusion directions in 3D space are acquired to resolve multiple fiber orientations in high angular resolution diffusion imaging (HARDI). Each distinct fiber population can be visualized on maps of the orientation distribution function (ODF), which are computed from the 3D high angular resolution diffusion data through a projection reconstruction technique known as the Funk-Radon transform. This 3D projection reconstruction is mathematically very similar to the 2D method by which CT images are calculated from X-ray attenuation data. Unlike DTI, HARDI has the advantage of being model-independent, and therefore does not assume any particular 3D distribution of water diffusion or any specific number of fiber orientations within a voxel.
12.1.8 The Use of High b-value DWI for Tissue Structural Characterization
As a result of the structural heterogeneity of tissue on a spatial scale significantly smaller than the typical image voxel size, diffusion-weighted signals display a multiexponential dependence on the diffusion weighting magnitude, quantified with the parameter b, where b = γ²δ²g²(Δ − δ/3) in a spin-echo diffusion experiment, γ is the gyromagnetic ratio, g is the magnitude of the Stejskal-Tanner gradient pair, each pulse of which is of duration δ, and Δ is the temporal separation of the gradient pair. The complexity of this multiexponential behavior of the signal led to a more detailed inspection of diffusion properties in matter, as proposed.^{34,35} The method, known as q-space imaging, is based on the acquisition of data with multiple gradient strength values g. When a Fourier transformation is performed pixel by pixel with respect to the variable q = γδg/2π:
P(\vec{R}, \Delta) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\vec{q}, \Delta) \exp(-i 2\pi \vec{q} \cdot \vec{R}) \, d\vec{q},    (17)

the transformed data set P represents the displacement probability of the water molecules with respect to the axis which was sensitized to diffusion, at a given diffusion time Δ. This concept has been successfully applied in various in vitro and in vivo applications,^{36-40} where the use of long diffusion times combined with gradation of b-values and Fourier transformation has yielded displacement maps with exquisite accuracy.
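Eq. 17 can be illustrated numerically for free (Gaussian) diffusion, for which the narrow-pulse q-space signal is S(q) = exp(−4π²q²DΔ) and the recovered displacement profile should have mean-squared displacement 2DΔ. All parameter values below are illustrative:

```python
import numpy as np

D, Delta = 1.0e-3, 0.05                      # mm^2/s and s (illustrative)
q = np.linspace(-300.0, 300.0, 2001)         # sampled q values, 1/mm
S = np.exp(-4 * np.pi**2 * q**2 * D * Delta) # free-diffusion q-space signal

R = np.linspace(-0.05, 0.05, 501)            # displacement axis, mm
dq, dR = q[1] - q[0], R[1] - R[0]

# Direct evaluation of the Fourier integral of Eq. 17 (real part, since the
# signal here is symmetric in q), as it would be done pixel by pixel.
P = (S[None, :] * np.cos(2 * np.pi * np.outer(R, q))).sum(axis=1) * dq
P /= P.sum() * dR                            # normalize to a probability density

rms_sq = (P * R**2).sum() * dR               # mean-squared displacement
print(rms_sq, 2 * D * Delta)                 # the two agree closely
```

The recovered profile is the displacement probability described in the text; its width directly reflects how far water molecules move during the diffusion time Δ.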
Although q-space imaging potentially yields detailed diffusion data on heterogeneous tissue, the straightforward use of q-space data for imaging purposes has mostly been limited to displaying one of the main parameters of the displacement distribution function, i.e. the zero-displacement probability (the amplitude at displacement = 0), or the displacement probability RMS (the FWHM of the distribution function). This particular use clusters together the various diffusion components, and thus it is particularly suitable for applications in which diffusion in a voxel is dominated by one component, either because of the nature of the tissue or by eliminating non-restricted diffusion components by means of a large Δ value.
The other approach to using diffusion data acquired with multiple b-values is to model the data according to a plausible model that governs the diffusion pattern in each voxel. In this approach, the data are fitted to a multiparametric model function that best represents the expected behavior of the signal with respect to b. The advantage of modeling the diffusion data is in the possibility of extracting information about the diffusion characteristics of water in various compartments from the same data set, and thus simultaneously obtaining volumetric and structural information about those compartments. The most common and useful model for that matter is a biexponential decay diffusion model, which partitions the diffusion data into slow and fast diffusing components.^{41-46} It is now accepted that there is no stoichiometric relation between the two components in the biexponential model and two distinct tissue compartments. However, it is widely accepted that the largest contribution to the non-monoexponential behavior stems from the restriction imposed on diffusion, mostly on the intracellular and intra-axonal water pool.^{44} This view gains support from studies that measured the diffusion of intracellular metabolites such as N-acetyl aspartate (NAA), for which the diffusion attenuation curve as a function of b-value was shown to be non-monoexponential.^{47,48}
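A sketch of fitting the biexponential model to multi-b-value data; the volume fraction and diffusivities are invented, and `scipy.optimize.curve_fit` is used here as one convenient fitting routine, not necessarily the method of the cited studies:

```python
import numpy as np
from scipy.optimize import curve_fit

def biexp(b, f, d_fast, d_slow):
    # S/S0 = f*exp(-b*D_fast) + (1 - f)*exp(-b*D_slow)
    return f * np.exp(-b * d_fast) + (1 - f) * np.exp(-b * d_slow)

b = np.linspace(0.0, 6000.0, 25)            # s/mm^2, extending to high b-values
signal = biexp(b, 0.7, 1.1e-3, 0.15e-3)     # noiseless synthetic data

popt, _ = curve_fit(biexp, b, signal,
                    p0=[0.5, 1.0e-3, 1.0e-4],
                    bounds=([0.0, 0.0, 0.0], [1.0, 5.0e-3, 1.0e-3]))
f_fit, d_fast_fit, d_slow_fit = popt
print(f_fit, d_fast_fit, d_slow_fit)        # recovers the simulated parameters
```

With noisy in vivo data this separation is considerably less stable, which is one reason the fitted fractions cannot be read directly as physical compartment volumes, as the text notes.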
12.2 SUMMARY AND CONCLUSIONS
Diffusion-weighted magnetic resonance imaging (DWI) already plays a crucial role in detecting neurostructural deviations at the macroscopic level. With recent advances in DTI, multimodal imaging, and compartment-specific imaging, the importance of diffusion MRI for clinical and basic neuroscience is likely to increase exponentially.
12.3 ACKNOWLEDGMENTS
Drs Mina Kim and Susumu Mori provided crucial help in DWI/DTI data acquisition and analyses. We also thank Mathieu Ducros, Sahil Jain and Keun-Ho Kim for their help with DTI postprocessing. This work was supported by grants from the NIH (RR08079, NS44825), The MIND Institute, the Keck Foundation, and the Human Frontiers Science Program.
References
1. Moseley ME, Cohen Y, et al., Early detection of regional cerebral ischemia in cats: Comparison of diffusion- and T2-weighted MRI and spectroscopy, Magn Reson Med 14(2): 330–346, 1990.
2. Moseley ME, Kucharczyk J, et al., Diffusion-weighted MR imaging of acute stroke: Correlation with T2-weighted and magnetic susceptibility-enhanced MR imaging in cats, AJNR Am J Neuroradiol 11(3): 423–429, 1990.
3. Eis M, Els T, et al., Quantitative diffusion MR imaging of cerebral tumor and edema, Acta Neurochir Suppl (Wien) 60: 344–346, 1994.
4. Basser PJ, Mattiello J, et al., Estimation of the effective self-diffusion tensor from the NMR spin echo, J Magn Reson B 103(3): 247–254, 1994.
5. Pierpaoli C, Jezzard P, et al., Diffusion tensor MR imaging of the human brain, Radiology 201(3): 637–648, 1996.
6. Le Bihan D, Turner R, et al., Imaging of diffusion and microcirculation with gradient sensitization: Design, strategy, and significance, J Magn Reson Imaging 1(1): 7–28, 1991.
7. Basser PJ, Pierpaoli C, Microstructural and physiological features of tissues elucidated by quantitative-diffusion-tensor MRI, J Magn Reson B 111(3): 209–219, 1996.
8. Hajnal JV, Doran M, et al., MR imaging of anisotropically restricted diffusion of water in the nervous system: Technical, anatomic, and pathologic considerations, J Comput Assist Tomogr 15(1): 1–18, 1991.
9. Norris DG, The effects of microscopic tissue parameters on the diffusion weighted magnetic resonance imaging experiment, NMR Biomed 14(2): 77–93, 2001.
10. Stejskal EO, Tanner JE, Restricted self-diffusion of protons in colloidal systems by the pulsed-gradient, spin-echo method, J Chem Phys 49(4): 1768–1777, 1968.
11. Conturo TE, Lori NF, et al., Tracking neuronal fiber pathways in the living human brain, Proc Natl Acad Sci USA 96(18): 10422–10427, 1999.
12. Basser PJ, Mattiello J, et al., MR diffusion tensor spectroscopy and imaging, Biophys J 66(1): 259–267, 1994.
13. Conturo TE, McKinstry RC, et al., Encoding of anisotropic diffusion with tetrahedral gradients: A general mathematical diffusion formalism and experimental results, Magn Reson Med 35(3): 399–412, 1996.
14. Westin CF, Maier SE, et al., Image Processing for Diffusion Tensor Magnetic Resonance Imaging, Springer, Cambridge, 1999.
15. Sotak CH, The role of diffusion tensor imaging in the evaluation of ischemic brain injury: A review, NMR Biomed 15(7–8): 561–569, 2002.
16. Alexander AL, Hasan K, et al., A geometric analysis of diffusion tensor measurements of the human brain, Magn Reson Med 44(2): 283–291, 2000.
17. Makris N, Worth AJ, et al., Morphometry of in vivo human white matter association pathways with diffusion-weighted magnetic resonance imaging, Ann Neurol 42(6): 951–962, 1997.
18. Pajevic S, Pierpaoli C, Color schemes to represent the orientation of anisotropic tissues from diffusion tensor data: Application to white matter fiber tract mapping in the human brain, Magn Reson Med 43(6): 921, 2000.
19. Alexander AL, Hasan KM, et al., Analysis of partial volume effects in diffusion-tensor MRI, Magn Reson Med 45(5): 770–780, 2001.
20. Weinstein D, Rabinowitz R, et al., Ovarian hemorrhage in women with Von Willebrand's disease. A report of two cases, J Reprod Med 28(7): 500–502, 1983.
21. Pajevic S, Basser P, A continuous tensor field approximation for DT-MRI data, 9th Annual Conference of the ISMRM, 2001.
22. Poupon C, Clark CA, et al., Regularization of diffusion-based direction maps for the tracking of brain white matter fascicles, Neuroimage 12(2): 184–195, 2000.
23. Poupon C, Mangin J, et al., Towards inference of human brain connectivity from MR diffusion tensor data, Med Image Anal 5(1): 1–15, 2001.
24. Mori S, Crain BJ, et al., Three-dimensional tracking of axonal projections in the brain by magnetic resonance imaging, Ann Neurol 45(2): 265–269, 1999.
25. Basser PJ, Pajevic S, et al., In vivo fiber tractography using DT-MRI data, Magn Reson Med 44(4): 625–632, 2000.
26. Lori NF, Akbudak E, et al., Diffusion tensor fiber tracking of human brain connectivity: Acquisition methods, reliability analysis and biological results, NMR Biomed 15(7–8): 494–515, 2002.
27. Napel S, Lee DH, et al., Visualizing three-dimensional flow with simulated streamlines and three-dimensional phase-contrast MR imaging, J Magn Reson Imaging 2(2): 143–153, 1992.
28. Lazar M, Weinstein DM, et al., White matter tractography using diffusion tensor deflection, Hum Brain Mapp 18(4): 306–321, 2003.
29. Westin CF, Maier SE, et al., Image Processing for Diffusion Tensor Magnetic Resonance Imaging, Springer, Cambridge, 1999.
30. Weinstein DM, Kindlmann GL, et al., Tensorlines: Advection-diffusion based propagation through diffusion tensor fields, IEEE Visualization Proc, San Francisco, 1999.
31. Behrens TE, Johansen-Berg H, et al., Non-invasive mapping of connections between human thalamus and cortex using diffusion imaging, Nat Neurosci 6(7): 750–757, 2003.
32. Parker GJ, Haroon HA, et al., A framework for a streamline-based probabilistic index of connectivity (PICo) using a structural interpretation of MRI diffusion measurements, J Magn Reson Imaging 18(2): 242–254, 2003.
33. Jones DK, Pierpaoli C, Confidence mapping in diffusion tensor magnetic resonance imaging tractography using a bootstrap approach, Magn Reson Med 53(5): 1143–1149, 2005.
34. Callaghan PT, Eccles CD, et al., NMR microscopy of dynamic displacements: k-space and q-space imaging, J Phys E: Sci Instrum 21(8): 820–822, 1988.
35. Cory DG, Garroway AN, Measurement of translational displacement probabilities by NMR: An indicator of compartmentation, Magn Reson Med 14(3): 435–444, 1990.
36. King MD, Houseman J, et al., q-Space imaging of the brain, Magn Reson Med 32(6): 707–713, 1994.
37. King MD, Houseman J, et al., Localized q-space imaging of the mouse brain, Magn Reson Med 38(6): 930–937, 1997.
38. Assaf Y, Cohen Y, Structural information in neuronal tissue as revealed by q-space diffusion NMR spectroscopy of metabolites in bovine optic nerve, NMR Biomed 12(6): 335–344, 1999.
39. Assaf Y, Cohen Y, Assignment of the water slow-diffusing component in the central nervous system using q-space diffusion MRS: Implications for fiber tract imaging, Magn Reson Med 43(2): 191–199, 2000.
40. Assaf Y, Ben-Bashat D, et al., High b-value q-space analyzed diffusion-weighted MRI: Application to multiple sclerosis, Magn Reson Med 47(1): 115–126, 2002.
41. Niendorf T, Dijkhuizen RM, et al., Biexponential diffusion attenuation in various states of brain tissue: Implications for diffusion-weighted imaging, Magn Reson Med 36(6): 847–857, 1996.
42. Mulkern RV, Gudbjartsson H, et al., Multicomponent apparent diffusion coefficients in human brain, NMR Biomed 12(1): 51–62, 1999.
43. Clark CA, Le Bihan D, Water diffusion compartmentation and anisotropy at high b values in the human brain, Magn Reson Med 44(6): 852–859, 2000.
44. Inglis BA, Bossart EL, et al., Visualization of neural tissue water compartments using biexponential diffusion tensor MRI, Magn Reson Med 45(4): 580–587, 2001.
45. Mulkern RV, Vajapeyam S, et al., Biexponential apparent diffusion coefficient parametrization in adult vs newborn brain, Magn Reson Imaging 19(5): 659–668, 2001.
46. Clark CA, Hedehus M, et al., In vivo mapping of the fast and slow diffusion tensors in human brain, Magn Reson Med 47(4): 623–628, 2002.
47. Assaf Y, Cohen Y, In vivo and in vitro bi-exponential diffusion of N-acetyl aspartate (NAA) in rat brain: A potential structural probe?, NMR Biomed 11(2): 67–74, 1998.
48. Assaf Y, Cohen Y, Non-mono-exponential attenuation of water and N-acetyl aspartate signals due to diffusion in brain tissue, J Magn Reson 131(1): 69–85, 1998.
CHAPTER 13
Fluorescence Molecular Imaging:
Microscopic to Macroscopic
Sachin V Patwardhan, Walter J Akers and Sharon Bloch
Medical imaging has revolutionized our understanding and ability to monitor specific macroscopic physical, physiological, and metabolic functions at cellular and subcellular levels. In the years to come, it will enable detection and characterization of disease even before anatomic changes become apparent. Fluorescence molecular imaging is revolutionizing drug discovery and development with real-time in vivo monitoring in intact tissues. Technological advancements have taken fluorescence-based imaging from microscopy to preclinical and clinical instruments for medical imaging. This chapter describes the current state of technology associated with in vivo noninvasive or minimally invasive fluorescence imaging, along with the underlying principles. An overview of microscopic and macroscopic fluorescence imaging techniques is presented, and their role in the development and applications of exogenous fluorescence contrast agents is discussed.
13.1 INTRODUCTION
Present medical imaging technologies rely on macroscopic physical, physiological, or metabolic changes that differentiate pathological from normal tissue, rather than identifying specific molecular events (e.g. gene expression) responsible for disease.^{1} The human genome project is making molecular medicine an exciting reality. Developments in quantum chemistry, molecular genetics, and high-speed computers have created unparalleled capabilities for understanding complex biological systems. Current research has indicated that many diseases such as cancer occur as the result of the gradual buildup of genetic changes in single cells.^{1-4} Molecular imaging exploits specific molecular probes as the source of image contrast for studying such genetic changes at the subcellular level. Molecular imaging is capable of yielding the critical information bridging molecular structure and physiological function for understanding integrative biology, which is the most important process in the characterization of disease, prevention, earlier detection, treatment, and evaluation of treatment.
The use of contrast agents for disease diagnostics and functional assessment is very common in established imaging modalities like positron emission tomography (PET), magnetic resonance imaging (MRI), and X-ray computed tomography (CT). Contrast agents provide accurate difference images under nearly identical biological conditions and yield superior diagnostic information. Fluorescence molecular imaging is a novel multidisciplinary field in which fluorescence contrast agents are used to produce images that reflect cellular and molecular pathways and in vivo mechanisms of disease within physiologically authentic environments. The limitation of fluorescence imaging is that the excitation light must reach the fluorescent molecule, which is governed by the absorption-dependent penetration depth of the light within the tissue. However, fluorophores can be excited continuously, and the signal is not governed by inherent properties of the probe such as radioactive decay. Further, a set of photophysical properties is accessible, including fluorophore concentration, fluorescence quantum yield, and fluorescence lifetime. Some of these parameters are influenced by the local environment (pH, ions, oxygen, etc.) and therefore provide more relevant information about the physiological and molecular condition. Most importantly, light is a nonionizing radiation, rendering it harmless and nontoxic.
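The absorption-dependent penetration depth mentioned above can be illustrated with the effective attenuation coefficient from diffusion theory, mu_eff = sqrt(3 mu_a (mu_a + mu_s')). The sketch below uses illustrative placeholder coefficients, not values from this chapter:

```python
import math

def effective_attenuation(mu_a, mu_s_prime):
    """Effective attenuation coefficient (1/cm) from diffusion theory."""
    return math.sqrt(3.0 * mu_a * (mu_a + mu_s_prime))

def penetration_depth(mu_a, mu_s_prime):
    """1/e penetration depth (cm) of diffuse light in tissue."""
    return 1.0 / effective_attenuation(mu_a, mu_s_prime)

# Illustrative (not measured) soft-tissue coefficients: absorption is
# strong for visible light and much weaker in the near-infrared window.
visible = penetration_depth(mu_a=2.0, mu_s_prime=10.0)
nir = penetration_depth(mu_a=0.1, mu_s_prime=10.0)
print(f"visible: {visible:.2f} cm, NIR: {nir:.2f} cm")
```

The several-fold deeper NIR penetration computed here is the reason, noted later in the chapter, that in vivo probes are designed to operate in the NIR spectral window.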
Biophotonics can provide tools capable of identifying speciﬁc
subset of genes encoded within the human genome that can cause
the development of cancer and other diseases. Photonic techniques
are being developed to image and identify the molecular alterations
Fluorescence Molecular Imaging: Microscopic to Macroscopic
that distinguish a diseased cell from a normal cell. Such technologies will ultimately aid in characterizing and predicting the pathological behavior of the cell, as well as its responsiveness to drug treatment. The rapid development of laser and imaging technology has yielded powerful tools for the study of disease on all scales: single molecules to tissue materials and whole organs. Biochemical analyses of individual compounds characterize the basic fluorescence properties of common fluorophores within the tissue. Additional information associated with complex systems, such as cell and tissue structure, can be obtained from in vitro measurements. The study of in vivo animal disease models provides information about intercellular interactions and regulatory processes. Human clinical trials will then lead to optical diagnostic, monitoring, and treatment procedures.

The purpose of this chapter is to provide an overview of microscopic and macroscopic fluorescence imaging techniques. Fluorescence confocal microscopy, planar reflectance imaging, and diffuse optical tomography techniques are discussed, along with their role in the development of exogenous fluorescence contrast agents for cellular-level to in vitro and in vivo tissue imaging. For more specific details on fluorescence contrast agents, measurement setups, image reconstruction techniques, and applications, the reader is encouraged to tap the extensive literature available on these subjects.
13.2 FLUORESCENCE CONTRAST AGENT:
ENDOGENOUS AND EXOGENOUS
Light-induced fluorescence is a powerful noninvasive method for tissue pathology recognition and monitoring.[4-7] The attractiveness of fluorescence imaging is that fluorescent dyes can be detected at low concentrations using nonionizing, harmless radiation that can be applied repeatedly to the patient. In fluorescence imaging, the energy from an external source of light is absorbed and almost immediately re-emitted at a longer, lower-energy wavelength that is related to the electronic transition from the excited state to the ground state of the fluorescent molecule. Fluorescence
that originates from chromophores naturally present in the tissue (endogenous) is known as autofluorescence. Synthesized chromophores (exogenous) may also be administered that target a specific tissue type, or that are activated by functional changes in the tissue.
13.2.1 Endogenous Fluorophores
These fluorophores are generally associated with the structural matrix of tissue (e.g. collagen and elastin)[8] or with cellular metabolic pathways (e.g. NAD and NADH).[9] Cells in various disease states often undergo different rates of metabolism, or have different structures associated with distinct fluorescent emission spectra. Fluorescence emission generally depends on the fluorophore concentration, its spatial distribution throughout the tissue, the local microenvironment, and light attenuation due to differences in the amount of nonfluorescing chromophores. Autofluorescence of proteins is associated with amino acids such as tryptophan, tyrosine, and phenylalanine, with absorption maxima at 280 nm, 275 nm, and 257 nm respectively, and emission maxima between 280 nm (phenylalanine) and 350 nm (tryptophan). One of the main imaging applications of fluorescent proteins is in monitoring tumor growth[10,11] and metastasis formation,[12,13] as well as occasionally gene expression.[4] Structural fluorophores like collagen or elastin have absorption maxima between 300 nm and 400 nm and show broad emission bands between 400 nm and 600 nm, with maxima around 400 nm. Fluorescence of collagen or elastin has been used to distinguish between various tissue types, e.g. epithelial and connective tissue.[14-20] NADH is excited in the 330 nm-370 nm wavelength range and is most concentrated within the mitochondrial membrane, where it is oxidized within the respiratory chain. Its fluorescence is an appropriate parameter for the detection of ischemic or neoplastic tissue. Fluorescence of free and protein-bound NADH has been shown to be sensitive to oxygen concentration.[21] The main drawback of endogenous fluorophores is their low excitation and emission wavelengths. In this spectral range, tissue absorption is relatively high, limiting the light penetration.
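The spectral properties quoted above can be collected into a small lookup table. The values are the ones given in the text; the dictionary layout and helper function are purely illustrative:

```python
# Absorption/emission maxima (nm) for the endogenous fluorophores
# discussed in the text; spectral ranges are stored as (low, high) tuples.
ENDOGENOUS_FLUOROPHORES = {
    "tryptophan":    {"absorption": 280, "emission": 350},
    "tyrosine":      {"absorption": 275, "emission": None},  # emission not given
    "phenylalanine": {"absorption": 257, "emission": 280},
    "collagen":      {"absorption": (300, 400), "emission": (400, 600)},
    "elastin":       {"absorption": (300, 400), "emission": (400, 600)},
    "NADH":          {"absorption": (330, 370), "emission": None},  # excitation range
}

def absorption_maximum(name):
    """Return the absorption maximum (or range) in nm for a fluorophore."""
    return ENDOGENOUS_FLUOROPHORES[name]["absorption"]

print(absorption_maximum("tryptophan"))  # 280
```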
13.2.2 Exogenous Fluorophores
Various fluorescent dyes can be used for probing cell anatomy and cell physiology. Exogenous fluorescence probes target specific cellular and subcellular events, and this ability differentiates them from nonspecific dyes, such as indocyanine green (ICG), which reveal generic functional characteristics such as vascular volume and permeability. These fluorescence probes typically consist of the active component, which interacts with the target (i.e. the affinity ligand or enzyme substrate); the reporting component (i.e. the fluorescent dye); and possibly a delivery vehicle (for example, a biocompatible polymer), which ensures optimal biodistribution. An important characteristic in the design of active and activatable probes for in vivo applications is the use of fluorochromes that operate in the NIR spectrum of optical energy. This is due to the low light absorption that tissue exhibits in this spectral window, which makes light penetration of several centimeters possible.
Exogenous targeted and activatable imaging probes yield particularly high tumor/background signal ratios because of their nondetectability in the native state. In activatable probes, the fluorochromes are usually arranged in close proximity to each other so that they self-quench, or they are placed next to a quencher using enzyme-specific peptide sequences.[22] These peptide sequences can be cleaved in the presence of the enzyme, thus freeing the fluorochromes, which can then emit light upon excitation. In contrast to active probes, activatable probes minimize background signals because they are essentially dark in the absence of the target, and can improve contrast and detection sensitivity. A variety of exogenous reporter probes have been used for enhanced detection of early cancers, including somatostatin-receptor-targeted probes;[23,24] folate-receptor-targeted agents;[25] tumor-cell-targeted agents;[26-29] agents that incorporate into areas of calcification, bone formation, or both;[30] and agents activated by tumor-associated proteases.[31] Dyes like fluorescein and indocyanine green are commonly used for fluorescence angiography or blood volume determination in a clinical setup. Extensive research is also being carried out on the development of exogenous fluorophores with applications as activatable probes that carry quenched fluorochromes[24,33] and as photosensitizers or tumor-killing agents for cancer treatment using photodynamic therapy. A photosensitizer is a drug that is preferentially taken up by malignant tissue and can be photoactivated. After an optimal time from administration, light is shone on the tissue area of interest and absorbed by the sensitizer. The sensitizer then kills the surrounding tumor tissue, leaving the healthy tissue undamaged. Tissue localization, effectiveness in promoting cell death, and toxicity are some of the parameters that need to be characterized before human trials.
13.3 FLUORESCENCE IMAGING
Fluorescence imaging can provide information at different resolutions and depth penetrations, ranging from micrometers (microscopy) to centimeters (fluorescence reflectance imaging and fluorescence molecular tomography).[2,3] At the microscopic level, fluorescent reporter dyes are typically used for monitoring the distribution of important chemical species throughout the cell by obtaining fluorescence microscopy images of the cell after injecting it with the dye. The viability of the cell or the permeability of its membrane can also be determined using fluorescence microscopy. Compared with microscopic cellular imaging, macroscopic in vitro tissue imaging allows us to study interactions between cells and provides a platform much closer to true in vivo analysis in terms of structural architecture on microscopic and macroscopic scales. There is a significant difference in tissue uptake and storage of various exogenous fluorophores between in vitro and in vivo specimens. However, in vitro measurements can provide information associated with complex systems, such as the interaction of various biochemicals that are present in functional systems. Further, the effect of the local environment on tissue optical properties, and properties such as reactivity to a specific chemical, can be investigated prior to involving live subjects. For diagnostic purposes, the actual location and kinetics of tissue uptake are important. This information cannot be obtained using in vitro tissue analysis. The pharmacokinetics, tissue discrimination capabilities, toxicity, and clearance pathways of fluorescence probes need to be studied prior to use in human trials. Such studies are performed in vivo using animal models.
13.3.1 Fluorescence Microscopic Imaging
Fluorescence microscopy using endogenous fluorophores finds applications in discriminating normal tissue from cancerous or even precancerous tissue in a real-time clinical setting. Unique fluorescence spectral patterns associated with cell proliferation, and differences between rapidly growing and slowly growing cells, have been studied. Autofluorescence has been used to identify terminal squamous differentiation of normal oral epithelial cells in culture and to discriminate proliferating from nonproliferating cell populations. Fluorescence microscopy using exogenous dyes is the most common technique used for monitoring the spatial distribution of a particular analyte throughout a cell. One or more exogenous dyes are introduced into the cell and allowed to disperse. These dyes then interact with the analyte of interest, which in turn changes their fluorescence properties. By obtaining a fluorescence image of the cell using excitation at specific wavelengths, relative concentrations of the analyte can be determined. Another important application of exogenous dyes is in elucidating the role of a particular chemical in cellular biology.
In epifluorescence microscopy, the specimen is typically excited using a mercury or xenon lamp along with a set of monochromator filters. The excitation light, after reflecting from a dichromatic mirror, shines onto the sample through a microscope objective. The dichromatic mirror reflects light shorter than a certain wavelength (excitation) and passes light longer than that wavelength (emission). Thus only the emitted fluorescence light passes to the eyepiece or is projected onto an electronic array detector positioned behind the dichroic mirror. While imaging thick specimens, the emitted fluorescent signal must pass through the volume of the specimen, which decreases the resolution of objects in the focal plane. Additionally, fluorescence emitted from excited objects that lie above and below the focal plane obscures the emission from the in-focus objects.
Laser-scanning confocal microscopy offers distinct advantages over epifluorescence microscopy by using a pinhole aperture, as shown in Fig. 1. The laser excitation light reflects off a dichromatic mirror and, rather than broadly illuminating the entire specimen, is focused on a single point within the tissue of interest using a computer-controlled X-Y scanning mirror pair. With only a single point illuminated, the illumination intensity rapidly falls off above and below the plane of focus as the beam converges and diverges, thus reducing excitation of fluorescence from interfering objects
Fig. 1. The principle of operation of a confocal microscope is shown on the left. The pinhole aperture placed at the focal length of the lens blocks the light coming from out-of-focus planes (green and blue lines), while allowing the light coming from the in-focus plane to reach the detector. A schematic of a point-scanning fluorescence confocal microscope is shown on the right. The dichromatic mirror reflects the emission light while allowing the excitation light to pass through. A motorized X-Y scanning mirror pair is used to collect the data from the selected sample area.
situated out of the focal plane being examined. The emitted fluorescence light from the sample is descanned by the same mirrors that are used to scan the excitation light from the laser. The emitted light passes through the dichromatic mirror and is focused onto a pinhole aperture. The light that passes through the pinhole is measured by a detector, i.e. a photomultiplier tube. Any light emitted from regions away from the vicinity of the illuminated point is blocked by the pinhole aperture, thus attenuating out-of-focus interference. Most confocal imaging systems provide adjustable pinhole blocking apertures. This enables a tradeoff to be made between vertical resolution and sensitivity: a small pinhole gives the highest resolution and the lowest signal, and vice versa. With point-by-point scanning, there is never a complete image of the sample at any given instant. The detector is attached to a computer, which builds up the image one pixel at a time.
Point-scanning microscopes, when used with high numerical aperture lenses, have an inherent speed limitation in fluorescence. This arises because of a limitation in the amount of light that can be obtained from the small volume of fluorophore contained within the focus of the scanned beam (less than a cubic micron). At moderate levels of excitation, the amount of light emitted will be proportional to the intensity of the incident excitation. However, fluorophore excited states have significant lifetimes (on the order of a few nanoseconds). Therefore, as the level of excitation is increased, the situation eventually arises where most of the fluorophore molecules are pumped up to their excited state and the ground state becomes depleted. At this stage, the fluorophore is saturated, and no more signal can be obtained from it by increasing the flux of the excitation source.
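The saturation behavior described above follows a simple two-level model: the emitted signal grows with the excitation intensity I as I/(1 + I/I_sat), where the saturation intensity I_sat is set by the excited-state lifetime. A minimal sketch, with arbitrary illustrative units rather than values from the text:

```python
def emission_rate(excitation, i_sat):
    """Two-level saturation model: emission is linear in the excitation
    intensity at low levels and plateaus once the ground state depletes."""
    return excitation / (1.0 + excitation / i_sat)

I_SAT = 1.0  # assumed saturation intensity (arbitrary units)

# Well below saturation, doubling the excitation nearly doubles the signal...
low_1 = emission_rate(0.01, I_SAT)
low_2 = emission_rate(0.02, I_SAT)

# ...but far above saturation, doubling the excitation barely helps.
high_1 = emission_rate(100.0, I_SAT)
high_2 = emission_rate(200.0, I_SAT)
print(low_2 / low_1, high_2 / high_1)  # first ratio near 2, second near 1
```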
Despite their success, conventional microscopy methods suffer significant limitations when used in biological experimentation. They usually require chemical fixation of removed tissues, involve the observation of biological samples under nonphysiological conditions, and generally cannot resolve the dynamics of cellular processes; most importantly, it is very difficult to generate quantitative data using microscopy.
13.3.2 Fluorescence Macroscopic Imaging
Planar fluorescence imaging, transillumination, and fluorescence molecular tomography (FMT) are the most common imaging techniques used for obtaining fluorescence information at macroscopic resolution. Collapsing the volume of an animal or tissue into a single image, known as planar imaging, is generally fast, the data sets generated are small, and imaging can be done in a high-throughput fashion, at the expense of internal resolution. Tomographic imaging, on the other hand, allows a virtual slice of the subject to be obtained and is more quantitative and capable of displaying internal anatomic structures and/or functional information. However, FMT requires longer acquisition times, generates a very large data set, and is computationally expensive. Further, light becomes diffuse within a few millimeters of propagation within tissues, owing to the elastic scattering experienced by photons when they interact with various cellular components, such as the membranes and different organelles. Diffusion results in the loss of imaging resolution. Therefore, macroscopic fluorescence imaging largely depends on spatially resolving and quantifying bulk signals from specific fluorescent entities reporting on cellular and molecular activity.
13.3.3 Planar Fluorescence Imaging
The most common technique to record fluorescence within a large tissue volume is to illuminate the tissue with a plane wave, i.e. an expanded light beam, and then collect the fluorescence signals emitted towards a CCD camera.[37] These methods can be generally referred to as planar methods, and can be applied in epi-illumination or transillumination mode. Figure 2 shows a typical setup of a planar reflectance imaging system. The imaging plane is uniformly illuminated using a light source of a particular wavelength, and the light emitted by the fluorophore is captured using a CCD camera. An illustrative image of a nude mouse with a subcutaneous human breast cancer xenograft obtained using a near-infrared fluorescent probe is also shown.
Fig. 2. Schematic diagram of a typical planar reflectance imaging system. The imaging plane is uniformly illuminated using a light source of a particular wavelength, and the light emitted by the fluorophore is captured using a CCD camera. An illustrative image of a nude mouse with a subcutaneous human breast cancer xenograft MDA-MB-361 obtained using a near-infrared fluorescent probe is also shown.
Planar imaging has the added advantage that the same instrumentation can be used to image fluorescence in solutions and excised tissues. However, a significant drawback of this method is that it cannot resolve depth and does not account for the nonlinear dependence of the detected signal on propagation depth and the surrounding tissue. Superficial fluorescence activity may reduce the contrast of underlying activity owing to the simple projection viewing. Despite these drawbacks, planar imaging remains popular because setting up a reflectance imaging system is comparatively easy and inexpensive. Planar fluorescence imaging is a very useful technique when probing superficial structures (<5 mm deep), for example during endoscopy,[41,42] dermatological imaging,[43] intraoperative imaging,[44] probing tissue autofluorescence,[45,46] or small animal imaging,[47] with very high throughputs.
13.3.3.1 Fluorescence molecular tomography
A recent technological evolution has been the development of fluorescence tomography for investigations at the whole-animal or tissue level. These technologies allow three-dimensional imaging of fluorescence biodistribution in whole animals and account for tissue optical heterogeneity and the nonlinear dependence of fluorescence intensity on depth and optical properties. Such systems can localize and quantify fluorescent probes three-dimensionally in deep tissues at high sensitivities.[48,49] Diffuse optical tomography (DOT) methods account for partial volume effects, reduce the influence of superficial tissues, and improve the contrast-to-noise ratio (CNR) of buried targets,[50-54] thereby overcoming the shortcomings of planar reflectance imaging.
Optical tomography is far more complex than X-ray CT. In X-ray CT, the radiation propagates through the medium in a straight line from the source to the detector. The forward problem then becomes a set of line integrals (the Radon transform), and the inverse problem is linear and well posed (backprojection methods). In optical imaging, on the other hand, by the time the light reaches the detector it has lost all information about the originating source due to multiple scattering. Each measurement is therefore sensitive to the whole tissue volume, resulting in an ill-posed, underdetermined inverse problem. Mathematical models based on radiative transport (e.g. Monte Carlo techniques) or the diffusion equation are required to reconstruct the most probable photon propagation path through tissue for a given source-detector geometry (the forward problem).[55,56] Algorithms based on linear numerical inversion methods (the inverse solution) start with the diffusion equation, which is then transformed into an integral equation via Green's theorem. A linear version of the equation is then obtained using the Born (or Rytov) approximation and discretized into a system of linear equations as follows:
The fluorophore concentration is reconstructed by inverting ratiometric data derived from the intensities of the excitation and fluorescence light measured on the detector plane for each source position. The light intensity at the excitation wavelength is written as $\Phi_o(r_{s(i)}, r_{d(i)}, \lambda_{exc})$, where $r_{s(i)}$ and $r_{d(i)}$ are the positions of the $i$th source and $i$th detector respectively, and $\lambda_{exc}$ is the excitation wavelength. Similarly, the fluence at the emission wavelength $\lambda_{emi}$ is written as $\Phi(r_{s(i)}, r_{d(i)}, \lambda_{emi})$. Following the normalized Born approach, the formulation of the ratiometric fluorescence/excitation measurements is written in discrete notation as $y = Ax$ with the following definitions:[21,22]

$$y_i = \frac{\Phi(r_{s(i)}, r_{d(i)}, \lambda_{emi}) - \theta_f \, \Phi_o(r_{s(i)}, r_{d(i)}, \lambda_{exc})}{\Phi_o(r_{s(i)}, r_{d(i)}, \lambda_{exc})} \qquad (1)$$

$$A_{i,j} = -\frac{S_o \, v \, h^3}{D_o} \, \frac{G(r_{s(i)}, r_j, \lambda_{exc}) \, G(r_j, r_{d(i)}, \lambda_{emi})}{G(r_{s(i)}, r_{d(i)}, \lambda_{exc})} \qquad (2)$$

$$x_j = \partial N_j \qquad (3)$$

Here, the two-point Green's function, $G$, models light transport for the given boundary conditions and optical properties. Each image voxel $x_j$ has concentration $\partial N_j$ and position $r_j$. These equations are then numerically solved using some type of regularization scheme; for example, singular value decomposition, algebraic reconstruction technique, or conjugate gradient algorithms are used with Tikhonov regularization.[57-60]
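As a concrete illustration of the regularized inversion of y = Ax, the sketch below solves a small synthetic system with Tikhonov regularization via the normal equations. The matrix A, the data y, and the regularization weight are random stand-ins for the sensitivity matrix, the normalized Born measurements, and a tuned parameter; this is not the chapter's reconstruction code.

```python
import numpy as np

def tikhonov_solve(A, y, alpha):
    """Solve min_x ||A x - y||^2 + alpha^2 ||x||^2 via the
    regularized normal equations (A^T A + alpha^2 I) x = A^T y."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha**2 * np.eye(n), A.T @ y)

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))    # underdetermined: 40 measurements, 100 voxels
x_true = np.zeros(100)
x_true[30] = 1.0                  # a single fluorescent "voxel"
y = A @ x_true + 0.01 * rng.normal(size=40)  # noisy synthetic data

x_rec = tikhonov_solve(A, y, alpha=1.0)
print(int(np.argmax(np.abs(x_rec))))  # index of the strongest reconstructed voxel
```

The regularization term is what makes this underdetermined, ill-posed system solvable at all: without it, A^T A is singular and the solution is dominated by noise.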
The linear formulation works well when perturbations are small and isolated, and when the background medium is relatively uniform. However, the diffusion equation is inherently nonlinear, because both the photon fluence rate and the Green's function depend on the unknown quantities we are trying to solve for. In algorithms based on nonlinear iterative methods, a global norm, such as the mean square error, is iteratively minimized. The unknown inhomogeneity that best predicts the measurement data, subject to some a priori knowledge, is obtained: at every iteration, the measurements predicted from the current estimate of the inhomogeneity are computed and compared to the actual measurements.[61-64]
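The iterative loop just described can be sketched generically: given a nonlinear forward model F (a toy two-variable stand-in here, not a diffusion solver), repeatedly compare the predicted measurements against the data and step the estimate down the gradient of the squared error.

```python
import numpy as np

def reconstruct(F, y_meas, x0, step=0.05, iters=1000, eps=1e-6):
    """Minimize ||F(x) - y||^2 by gradient descent, with the gradient
    J^T r estimated by finite differences (a generic nonlinear inversion loop)."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        Fx = F(x)
        r = Fx - y_meas                 # compare prediction to measurements
        grad = np.zeros_like(x)
        for j in range(x.size):         # finite-difference column of the Jacobian
            xp = x.copy()
            xp[j] += eps
            grad[j] = ((F(xp) - Fx) / eps) @ r
        x -= step * grad                # update the current estimate
    return x

# Toy nonlinear forward model standing in for the diffusion model.
def F(x):
    return np.array([x[0]**2 + x[1], x[1]**2 - x[0]])

y_meas = F(np.array([1.0, 2.0]))        # synthetic "measurements"
x_rec = reconstruct(F, y_meas, x0=[0.5, 0.5])
```

In a real DOT reconstruction the gradient is obtained far more efficiently (e.g. via adjoint methods), and the a priori knowledge mentioned above enters as an additional penalty term in the minimized norm.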
The fluorescence optical data can be obtained before and after administration of the absorbing fluorescence contrast agent, and the DOT images can be reconstructed and subtracted. However, a more robust approach is to use differential measurements due to the extrinsic perturbation. The two data sets (excitation and emission) are obtained within a short time of one another, thereby minimizing positional and movement errors and instrumental drift. The use of emission/excitation differential measurements eliminates systematic errors associated with operational parameters and provides a baseline measurement for independent reconstruction.[3,65] Further, these ratio measurements reduce the influence of heterogeneous optical properties and path lengths.[65] Fluorescence DOT images have recently been demonstrated in vivo using both fiber-coupled cylindrical geometries[66-68] and lens-coupled planar geometries.[69-71]
The use of a lens to relay light from the tissue surface to a charge-coupled device (CCD) for detection[69,71-73] permits dense spatial sampling and large imaging domains on the detection surface. Fiber-coupled illumination systems introduce an undesired asymmetry between the illumination plane (sparsely sampled by discrete fibers) and the detection plane (densely sampled by a CCD array detector), and force tradeoffs between sampling density and field of view on the illumination plane. In addition, fiber optic switching times (>0.1 seconds) limit data acquisition speeds. Rather than direct lens coupling, other systems have used arrays of detector fibers to relay light from tissue to a CCD.[66-68,74-76] While providing source-detector symmetry, this approach does not provide the dense sampling of lens-coupled detection. The source plane can be sampled using fast-acquisition, flexible, high-density, and large field-of-view arrangements by raster scanning the source laser.
A schematic of a small animal continuous-wave fluorescence DOT system is shown in Fig. 3.[77] Here, the source illumination is provided by a laser diode. The collimated output of the laser passes through a beam splitter that deflects 5% of the beam to a photodiode for a reference measure of the laser intensity. The remainder of the collimated beam (95%) passes through a lens, L, into a dual-axis X-Y galvanometer mirror system. The mirror pair samples the source plane using a flexible, high-density, and large
Fig. 3. Fluorescence tomography system. The mouse subject is suspended and held in light compression between two movable windows (W1 and W2). Light from a laser diode at 785 nm (LD) is collimated and passes through a 95/5 beam splitter (BS). A reference photodiode (PD) collects 5% of the beam. The main 95% beam passes through lens (L1) into an X-Y galvo scanning system (XY Gal). The mirror pair scans the beam onto the illumination window (W1) of the imaging tank. Light emitted from W2 is detected by an EMCCD via a filter (F1) and lens system (L2).[77]
field-of-view arrangement by raster scanning the focused illumination (spot size = 100 µm) in two dimensions, with a switch time from position A to position B of <0.5 ms. The 100 µm source spot size is similar to the multimode fiber sizes used in a wide variety of DOT systems.[66-69,71-76] The use of the galvanometer mirror pair permits the system to scan an adjustable area of up to 8 cm × 8 cm with flexible source positioning and source separations. After propagating through the sample volume, transmitted light passes through a selectable filter element and is detected on the opposite plane using a lens-coupled CCD camera. The typical scanning protocol consists of two separate excitation and fluorescence scans. The excitation light intensity profile is measured for each source position using a neutral density filter. The fluorescence emission light intensity profile is then measured using a narrowband interference filter. The excitation and emission images obtained from the CCD camera are normalized using the mean source intensity values obtained from the photodiode. This normalization compensates for the differences in light levels between the excitation and emission scans. Full-frame, 4 × 4 binned image data (128 × 128) are collected for all source positions. The full detector images are cropped and binned to generate a set of detector measurement positions symmetrically arranged in the x-y plane, such that for each x-y source position there is a matched x-y detector position. With a total small animal whole-body scan time of ∼2.2 min, this fluorescence DOT system provides a 10× larger imaging domain (5 cm × 5 cm × 1.5 cm) compared with an equivalent fiber-switched system, while maintaining the same resolution (small object FWHM ≤ 2.2 mm) and sensitivity (<0.1 pmole).[69,77]
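The binning and photodiode-normalization steps of this protocol can be sketched as follows. The 4 × 4 binning and 128 × 128 output follow the text; the CCD frame size, the random data, and the function names are illustrative stand-ins, not details of the actual instrument software:

```python
import numpy as np

def bin_image(img, factor=4):
    """Sum `factor` x `factor` blocks of pixels into one detector bin."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).sum(axis=(1, 3))

def normalize_frame(frame, photodiode_mean):
    """Divide out the source intensity recorded by the reference photodiode,
    putting excitation and emission scans on a common intensity scale."""
    return frame / photodiode_mean

# A hypothetical 512 x 512 CCD frame binned 4 x 4 gives 128 x 128 data.
ccd_frame = np.random.default_rng(1).random((512, 512))
binned = normalize_frame(bin_image(ccd_frame, factor=4), photodiode_mean=1.0)
print(binned.shape)  # (128, 128)
```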
Imaging the distribution of tumor-targeted molecular probes simultaneously in the liver, kidneys, and tumors is demonstrated in Fig. 4 by imaging the uptake of a breast-tumor-specific polypeptide in nude mice bearing subcutaneously implanted human breast
Fig. 4. Representative slices from a 3D tomographic reconstruction of a nude mouse with a subcutaneous human breast cancer xenograft MDA-MB-361. (A) An x-y slice parallel to the detector plane at a depth of z = 2.5 mm, and (B) an x-z slice extending from the source plane to the detector plane at y = 12 mm.[77]
cancer carcinoma MDA-MB-361. The polypeptide was conjugated with the near-infrared fluorescent probe cypate, which serves as the fluorescent contrast for optical imaging. For imaging, anesthetized nude mice (ketamine/xylazine via intraperitoneal injection) were suspended between the source and detector windows. A matching fluid (µ_a = 0.3 cm⁻¹, µ_s = 10 cm⁻¹) surrounds the animal. With warmed matching fluid (T = 38°C), the mice can be imaged multiple times over the normal course of an anesthetic dose (30 minutes-60 minutes). The full protocol for a combined fluorescence/excitation scan (a 24 × 36 array with 2 mm square spacing between source positions, x = −24 mm to 24 mm, y = −36 mm to 36 mm) took 5 minutes-6 minutes, including animal positioning, emission and excitation scanning, retrieval of the animal from the scanner, and reconstruction of the data. Figure 4 shows a 2D slice
Fig. 5. Retinal angiography images of a diabetic fundus (FA), showing loss of normal retinal capillaries and growth of abnormal ones that leak the fluorescein dye. (Source: Dr Levent Akduman, Saint Louis University Eye Center.)
parallel to the detector plane at a depth of z = 2.5 mm, and a 2D slice extending from the source plane to the detector plane at y = 12 mm, obtained from the 3D tomographic reconstruction. The tumor (breast cancer) shows uptake of a fluorescing near-infrared cypate-derivative probe with a polypeptide that targets a protein receptor expressed in breast cancer. The kidneys also show contrast. The maximum values of probe concentration obtained from the tumor, liver, and kidney volumes, as a ratio of the background, are 54.7, 32.4, and 58.3 respectively.
Besides applications for disease diagnosis and monitoring, molecular imaging assays in intact living animals can also help resolve biological questions raised by pharmaceutical scientists. Transgenic animals are useful in guiding early drug discovery by "validating" the target protein, evaluating test compounds, determining whether the target is involved in any toxicological effects of test compounds, and testing the efficacy of compounds to ensure that the compounds will act as expected in man (Livingston, 1999). The implementation of molecular imaging approaches in this drug discovery process offers the strong advantage of being able to meaningfully study a potential drug labeled for imaging in an animal model, often before phenotypic changes become obvious, and then quickly move into human studies. It is likely that preclinical trials can be accelerated to rule out drugs with unfavorable biodistribution and/or pharmacokinetics prior to human studies. A further advantage over in vitro and cell culture experimentation may be achieved by repetitive study of the same animal model, using identical or alternative biological imaging assays at different time points. This reveals a dynamic and more meaningful picture of the progressive changes in the biological parameters under scrutiny, as well as possible temporal assessment of therapeutic responses, all in the same animal without recourse to its death. This yields better quality results from far fewer experimental animals. Another benefit of molecular imaging assays is their quantitative nature. The images obtained are usually not just subjective or qualitative, as is the case with standard use of several conventional medical imaging modalities, but instead usually
provide meaningful numerical measures of biological phenomena (exemplified below). Such quantitative data could even be considered more useful than similar data obtainable in vitro or ex vivo, on account of preserving the intactness and the physiology of the experimental subject.
13.4 CONCLUSIONS
With the completion of several genome sequences, the next crucial step is to understand the function of gene products and their role in the development of disease. This knowledge will potentially facilitate the discovery of informative biomarkers that can be used for the earliest detection of disease and for the creation of new classes of drugs directed at new therapeutic targets. Thus, one of the capabilities most highly sought after is the noninvasive visualization of specific molecular targets, pathways and physiological effects in vivo. Revolutionary advances in fluorescent probes, photoproteins and imaging technologies have allowed cell biologists to carry out quantitative examination of cell structure and function at high spatial and temporal resolution. Indeed, whole cell assays have become an increasingly important tool in screening and drug discovery.
Fluorescence molecular imaging now creates the possibility of achieving several important goals in biomedical research, namely: (1) to develop noninvasive in vivo imaging methods that reflect specific cellular and molecular processes, for example, gene expression, or more complex molecular interactions such as protein-protein interactions; (2) to monitor multiple molecular events near-simultaneously; (3) to follow trafficking and targeting of cells; (4) to optimize drug and gene therapy; (5) to image drug effects at a molecular and cellular level; (6) to assess disease progression at a molecular pathological level; and (7) to create the possibility of achieving all of the above goals of imaging in a rapid, reproducible, and quantitative manner, so as to be able to monitor time-dependent experimental, developmental, environmental, and therapeutic influences on gene products in the same animal or patient.[1]
Fluorescein and ICG are FDA-approved fluorescent dyes for human medical applications and are routinely used in clinical retinal angiography[78] and liver function testing.[79] Sample images of retinal angiography of a diabetic fundus (FA), showing loss of normal retinal capillaries and growth of abnormal ones that leak the fluorescein dye, are shown in Fig. 5. ICG exhibits favorable pharmacokinetic properties for assessment of hepatic function and cardiac output and has been applied in clinical settings.[80] ICG has also been reported as a NIR contrast agent for detection of tumors in animal research[81,82] and at the clinical level.[83] The first fluorescence contrast-enhanced imaging in a clinical setting was reported by Ntziachristos et al.,[83] who demonstrated uptake and localization of ICG in breast lesions using DOT. Fluorescence imaging has shown very promising results as a potential imaging modality that will provide specific macroscopic physical, physiological, or metabolic information at the molecular level. With the current resources and research efforts, it will not be long before a library of fluorescence biomarkers and photosensitizers for diagnosis, monitoring, and treatment of various diseases is formed. Technological advancements will soon take fluorescence-based imaging devices from preclinical to clinical setups.
13.5 ACKNOWLEDGMENT
The authors acknowledge the help and support of Joseph P Culver, Samuel Achilefu and the entire team of the Optical Radiology Laboratory in the Department of Radiology at Washington University School of Medicine, Saint Louis, Missouri. The authors are thankful to Dr Levent Akduman, Saint Louis University Eye Center, Saint Louis, Missouri, for providing the retinal angiography images illustrated in this chapter. Some of the work presented here was supported in part by the following research grants: National Institutes of Health, K25NS44339, BRG R01 CA109754, Small Animal Imaging Resource Program (SAIRP) grant, R24 CA83060.
References
1. Massoud TF, Gambhir SS, Molecular imaging in living subjects: Seeing fundamental biological processes in a new light, Genes & Dev 17: 545–580, 2003.
2. Weissleder R, Ntziachristos V, Shedding light onto live molecular targets, Nature Medicine 9(1): 123–128, 2003.
3. Ntziachristos V, Fluorescence molecular imaging, Annu Rev Biomed Eng 8: 1–33, 2006.
4. Yang M, Baranov E, Moossa AR, et al., Visualizing gene expression by whole-body fluorescence imaging, Proc Natl Acad Sci USA 97: 12278–12282, 2000.
5. Tuchin VV, Handbook of Optical Biomedical Diagnostics, PM107, SPIE Press, Bellingham, WA, 2002.
6. Das BB, Liu F, Alfano RR, Time-resolved fluorescence and photon migration studies in biomedical and random media, Rep Prog Phys 60: 227, 1997.
7. Lakowicz JR, Principles of Fluorescence Spectroscopy, 2nd edn., Kluwer Academic, New York, 1999.
8. Denk W, Two-photon excitation in functional biological imaging, J Biomed Opt 1: 296, 1996.
9. Fujimoto D, Akiba KY, Nakamura N, Isolation and characterization of a fluorescent material in bovine Achilles tendon collagen, Biochem Biophys Res Commun 76: 1124, 1977.
10. Wagnieres GA, Star WM, Wilson BC, In vivo fluorescence spectroscopy and imaging for oncological applications, Photochem Photobiol 68: 603, 1998.
11. Hoffman RM, Visualization of GFP-expressing tumors and metastasis in vivo, Biotechniques 30: 1016–1022, 1024–1026, 2001.
12. Yang M, et al., Whole-body optical imaging of green fluorescent protein-expressing tumors and metastases, Proc Natl Acad Sci USA 97: 1206–1211, 2000.
13. Moore A, Sergeyev N, Bredow S, et al., Model system to quantitate tumor burden in locoregional lymph nodes during cancer spread, Invasion Metastasis 18: 192–197, 1998.
14. Wunderbaldinger P, Josephson L, Bremer C, et al., Detection of lymph node metastases by contrast-enhanced MRI in an experimental model, Magn Reson Med 47: 292–297, 2002.
15. Das BB, Liu F, Alfano RR, Time-resolved fluorescence and photon migration studies in biomedical and random media, Rep Prog Phys 60: 227, 1997.
16. Lakowicz JR, Principles of Fluorescence Spectroscopy, 2nd edn., Kluwer Academic, New York, 1999.
17. Schneckenburger H, Steiner R, Strauss W, et al., Fluorescence technologies in biomedical diagnostics, in Tuchin VV (ed.), Optical Biomedical Diagnostics, SPIE Press, Bellingham, WA, 2002.
18. Sinichkin Yu P, Kollias N, Zonios G, et al., Reflectance and fluorescence spectroscopy of human skin in vivo, in Tuchin VV (ed.), Handbook of Optical Biomedical Diagnostics, SPIE Press, Bellingham, WA, 2002.
19. Sterenborg HJ, Motamedi M, Wagner RF, et al., In vivo fluorescence spectroscopy and imaging of human skin tumors, Lasers Med Sci 9: 344, 1994.
20. Zeng H, MacAulay C, McLean DI, et al., Spectroscopic and microscopic characteristics of human skin autofluorescence emission, Photochem Photobiol 61: 645, 1995.
21. Schneckenburger H, Steiner R, Strauss W, et al., Fluorescence technologies in biomedical diagnostics, in Tuchin VV (ed.), Optical Biomedical Diagnostics, SPIE Press, Bellingham, WA, 2002.
22. Tung CH, Fluorescent peptide probes for in vivo diagnostic imaging, Biopolymers 76: 391–403, 2004.
23. Zaheer A, Lenkinski RE, Mahmood A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
24. Weissleder R, Tung CH, Mahmood U, et al., In vivo imaging of tumors with protease-activated near-infrared fluorescent probes, Nat Biotechnol 17: 375–378, 1999.
25. Tung CH, Fluorescent peptide probes for in vivo diagnostic imaging, Biopolymers 76: 391–403, 2004.
26. Ballou B, et al., Tumor labeling in vivo using cyanine-conjugated monoclonal antibodies, Cancer Immunol Immunother 41: 257–263, 1995.
27. Neri D, et al., Targeting by affinity-matured recombinant antibody fragments on an angiogenesis-associated fibronectin isoform, Nat Biotechnol 15: 1271–1275, 1997.
28. Muguruma N, et al., Antibodies labeled with fluorescence agent excitable by infrared rays, J Gastroenterol 33: 467–471, 1998.
29. Folli S, et al., Antibody-indocyanine conjugates for immunophotodetection of human squamous cell carcinoma in nude mice, Cancer Res 54: 2643–2649, 1994.
30. Zaheer A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
31. Tung CH, Mahmood U, Bredow S, et al., In vivo imaging of proteolytic enzyme activity using a novel molecular reporter, Cancer Res 60: 4953–4958, 2000.
32. Bogdanov AA Jr, Lin CP, Simonova M, et al., Cellular activation of the self-quenched fluorescent reporter probe in tumor microenvironment, Neoplasia 4: 228–236, 2002.
33. Funovics M, Weissleder R, Tung CH, Protease sensors for bioimaging, Anal Bioanal Chem 377: 956–963, 2003.
34. Phair RD, Misteli T, Kinetic modeling approaches to in vivo imaging, Nat Rev Mol Cell Biol 2: 898–907, 2003.
35. Ke S, Wen XX, Gurfinkel M, et al., Near-infrared optical imaging of epidermal growth factor receptor in breast cancer xenografts, Cancer Res 63: 7870–7875, 2003.
36. Zaheer A, Lenkinski RE, Mahmood A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
37. Weissleder R, Tung CH, Mahmood U, et al., In vivo imaging of tumors with protease-activated near-infrared fluorescent probes, Nat Biotechnol 17: 375–378, 1999.
38. Wunder A, Tung CH, Muller-Ladner U, et al., In vivo imaging of protease activity in arthritis: A novel approach for monitoring treatment response, Arthritis Rheum 50: 2459–2465, 2004.
39. Mahmood U, Tung C, Bogdanov A, et al., Near-infrared optical imaging system to detect tumor protease activity, Radiology 213: 866–870, 1999.
40. Yang M, Baranov E, Jiang P, et al., Whole-body optical imaging of green fluorescent protein-expressing tumors and metastases, Proc Natl Acad Sci USA 97: 1206–1211, 2000.
41. Ito S, et al., Detection of human gastric cancer in resected specimens using a novel infrared fluorescent anti-human carcinoembryonic antigen antibody with an infrared fluorescence endoscope in vitro, Endoscopy 33: 849–853, 2001.
42. Marten K, et al., Detection of dysplastic intestinal adenomas using enzyme-sensing molecular beacons in mice, Gastroenterology 122: 406–414, 2002.
43. Zonios G, Bykowski J, Kollias N, Skin melanin, hemoglobin, and light scattering properties can be quantitatively assessed in vivo using diffuse reflectance spectroscopy, J Invest Dermatol 117: 1452–1457, 2001.
44. Kuroiwa T, Kajimoto Y, Ohta T, Development and clinical application of near-infrared surgical microscope: Preliminary report, Minim Invasive Neurosurg 44: 240–242, 2001.
45. Richards-Kortum R, Sevick-Muraca E, Quantitative optical spectroscopy for tissue diagnosis, Annu Rev Phys Chem 47: 555–606, 1996.
46. Wang TD, et al., In vivo identification of colonic dysplasia using fluorescence endoscopic imaging, Gastrointest Endosc 49: 447–455, 1999.
47. Mahmood U, Tung C, Bogdanov A Jr, et al., Near-infrared optical imaging of protease activity for tumor detection, Radiology 213: 866–870, 1999.
48. Ntziachristos V, Bremer C, Weissleder R, Fluorescence-mediated tomography resolves protease activity in vivo, Nat Med 8: 757–760, 2002.
49. Ntziachristos V, Weissleder R, Charge-coupled device based scanner for tomography of fluorescent near-infrared probes in turbid media, Med Phys 29: 803–809, 2002.
50. Hebden JC, Wong KS, Time-resolved optical tomography, Appl Opt 32(4): 372–380, 1993.
51. Barbour RL, Graber HL, Chang JW, et al., MRI-guided optical tomography: Prospects and computation for a new imaging method, IEEE Comput Sci Eng 2(4): 63–77, 1995.
52. Pogue BW, Patterson MS, Jiang H, et al., Initial assessment of a simple system for frequency-domain diffuse optical tomography, Phys Med Biol 40(10): 1709–1729, 1995.
53. O'Leary MA, Boas DA, Chance B, et al., Experimental images of heterogeneous turbid media by frequency-domain diffusing-photon tomography, Opt Lett 20(5): 426–428, 1995.
54. Gonatas CP, Ishii M, Leigh JS, et al., Optical diffusion imaging using a direct inversion method, Phys Rev E 52(4): 4361–4365, 1995.
55. Gibson AP, Hebden JC, Arridge SR, Recent advances in diffuse optical imaging, Phys Med Biol 50: R1–R43, 2005.
56. Arridge SR, Optical tomography in medical imaging, Inverse Problems 15: R41–R93, 1999.
57. O'Leary MA, Boas DA, Chance B, et al., Experimental images of heterogeneous turbid media by frequency-domain diffusing-photon tomography, Opt Lett 20: 426, 1995.
58. Yao Y, Wang Y, Pei Y, et al., Frequency-domain optical imaging of absorption and scattering distributions by a Born iterative method, J Opt Soc Am A 14: 325, 1997.
59. Gaudette RJ, Brooks DH, DiMarzio CA, et al., A comparison study of linear reconstruction techniques for diffuse optical tomography imaging of absorption coefficient, Phys Med Biol 45: 1051, 2000.
60. Pogue B, McBride T, Prewitt J, et al., Spatially variant regularization improves diffuse optical tomography, Appl Opt 38: 2950, 1999.
61. Ye JC, Webb KJ, Millane RP, et al., Modified distorted Born iterative method with an approximate Frechet derivative for optical diffusion tomography, J Opt Soc Am A 16: 1814, 1999.
62. Hielscher AH, Klose AD, Hanson KM, Gradient-based iterative image reconstruction scheme for time-resolved optical tomography, IEEE Trans Med Imag 18: 262, 1999.
63. Bluestone AY, Abdoulaev G, Schmitz CH, et al., Three-dimensional optical tomography of hemodynamics in the human head, Opt Express 9: 272, 2001.
64. Roy R, Sevick-Muraca EM, A numerical study of gradient-based nonlinear optimization methods for contrast-enhanced optical tomography, Opt Express 9: 49, 2001.
65. Soubret A, Ripoll J, Ntziachristos V, Accuracy of fluorescent tomography in the presence of heterogeneities: Study of the normalized Born ratio, IEEE Trans Med Imaging 24(10): 1377–1386, 2005.
66. Ntziachristos V, Weissleder R, Charge-coupled device based scanner for tomography of fluorescent near-infrared probes in turbid media, Med Phys 29(5): 803–809, 2002.
67. Ntziachristos V, Tung CH, Bremer C, Weissleder R, et al., Fluorescence molecular tomography resolves protease activity in vivo, Nat Med 8(7): 757–760, 2002.
68. Ntziachristos V, Bremer C, Tung C, et al., Imaging cathepsin B up-regulation in HT1080 tumor models using fluorescence-mediated molecular tomography (FMT), Acad Radiol 9: S323–S325, 2002.
69. Graves EE, Ripoll J, Weissleder R, et al., A submillimeter resolution fluorescence molecular imaging system for small animal imaging, Med Phys 30(5): 901–911, 2003.
70. Graves EE, Weissleder R, Ntziachristos V, Fluorescence molecular imaging of small animal tumor models, Curr Mol Med 4(4): 419–430, 2004.
71. Ntziachristos V, Schellenberger EA, Ripoll J, et al., Visualization of antitumor treatment by means of fluorescence molecular tomography with an annexin V-Cy5.5 conjugate, Proc Natl Acad Sci USA 101(33): 12294–12299, 2004.
72. Culver JP, Choe R, Holboke MJ, et al., Three-dimensional diffuse optical tomography in the parallel plane transmission geometry: Evaluation of a hybrid frequency domain/continuous wave clinical system for breast imaging, Med Phys 30(2): 235–247, 2003.
73. Schulz RB, Ripoll J, Ntziachristos V, Experimental fluorescence tomography of tissues with noncontact measurements, IEEE Trans Med Imaging 23(4): 492–500, 2004.
74. Godavarty A, Eppstein MJ, Zhang CY, et al., Fluorescence-enhanced optical imaging in large tissue volumes using a gain-modulated ICCD camera, Phys Med Biol 48(12): 1701–1720, 2003.
75. Ntziachristos V, Weissleder R, Experimental three-dimensional fluorescence reconstruction of diffuse media by use of a normalized Born approximation, Opt Lett 26(12): 893–895, 2001.
76. O'Leary MA, Boas DA, Li XD, et al., Fluorescence lifetime imaging in turbid media, Opt Lett 21(2): 158–160, 1996.
77. Patwardhan SV, et al., Time-dependent whole-body fluorescence tomography of probe biodistributions in mice, Opt Express 13(7): 2564–2577, 2005.
78. Richards G, Soubrane G, Yanuzzi L, Fluorescein and ICG Angiography, Thieme, Stuttgart, Germany, 1998.
79. Flanagan JH Jr, Khan S, Menchen S, et al., Functionalized tricarbocyanine dyes as near-infrared fluorescent probes for biomolecules, Bioconjug Chem 8: 751, 1997.
80. Caesar J, Shaldon S, Chiandussi L, et al., The use of indocyanine green in the measurement of hepatic blood flow and as a test of hepatic function, Clin Sci 21: 43, 1961.
81. Gurfinkel M, Thompson AB, Ralston W, et al., Pharmacokinetics of ICG and HPPH-car for the detection of normal and tumor tissue using fluorescence, near-infrared reflectance imaging: A case study, Photochem Photobiol 72: 94, 2000.
82. Licha K, Riefke B, Ntziachristos V, et al., Hydrophilic cyanine dyes as contrast agents for near-infrared tumor imaging: Synthesis, photophysical properties and spectroscopic in vivo characterization, Photochem Photobiol 72: 392, 2000.
83. Ntziachristos V, Yodh AG, Schnall M, et al., Concurrent MRI and diffuse optical tomography of breast after indocyanine green enhancement, Proc Natl Acad Sci USA 97: 2767, 2000.
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch14 FA
CHAPTER 14
Tracking Endocardium Using Optical Flow along Iso-Value Curve
Qi Duan, Elsa Angelini, Shunichi Homma and Andrew Laine
In cardiac image analysis, optical flow techniques are widely used to track ventricular borders as well as to estimate myocardial motion fields. The optical flow computation is typically performed in Cartesian coordinates and is not constrained by a priori knowledge of normal myocardium deformation patterns. However, for cardiac motion analysis, displacements along specific directions and their derivatives are usually more interesting than the 2D or 3D displacement fields themselves. In this context, we propose two general frameworks for optical flow estimation along iso-value curves. We applied the proposed frameworks in several specific applications: endocardium tracking on cine cardiac MRI series and real-time 3D ultrasound, and thickening computation in 2D ultrasound images. The endocardial surfaces tracked with the proposed algorithm were quantitatively compared with manual tracing at each frame. The proposed method was also compared to the traditional Lucas-Kanade optical flow method directly applied to MRI image data in Cartesian coordinates and to standard correlation-based optical flow estimation on real-time 3D echocardiography. Quantitative comparison showed a positive improvement in average tracking errors or efficiency through the whole cardiac cycle.
14.1 INTRODUCTION
Cardiac imaging techniques, including echocardiography, cardiac MRI, cardiac CT, and cardiac PET/SPECT, are widely used in clinical screening and diagnostic examinations as well as in research for in vivo studies. These imaging techniques provide structural and
functional information. In most clinical studies, quantitative evaluation of cardiac function requires endocardial border segmentation throughout the whole cardiac cycle.
Recent advances in cardiac imaging technology have greatly improved the spatial and temporal resolution of acquired data, such as with real-time three-dimensional echocardiography[1] and high temporal resolution MRI.[2] However, as the information content becomes more detailed, the amount of data to be analyzed for one cardiac cycle also increases dramatically, making manual analysis of these data sets prohibitively labor-intensive in clinical diagnosis centers. In this context, many computer-aided methods were developed to automate or semi-automate endocardial segmentation or tracking tasks throughout the whole cardiac cycle. These computer-based techniques can be divided into two classes: segmentation methods and motion tracking methods.
Today, cardiac image segmentation is a very active research area. Many techniques have been proposed, including active contours,[3,4] level-set methods and deformable models,[5-9] classification,[10] active appearance models,[11] and other methods.[12] Optical flow algorithms for tracking the endocardial borders or other anatomical landmarks throughout whole sequences were studied in several recent works.[13-18] Optical-flow based tracking techniques offer the possibility of computing the myocardium motion field. Usually, these methods require initialization of the tracked points, either by manual tracing or with other segmentation techniques (as a preprocessing step).
However, in cardiac motion analysis, displacements along specific directions are usually better indicators of wall motion abnormality. In this context, we propose a general framework for optical flow estimation along iso-value curves. An additional constraint related to a specific motion direction was incorporated into the original optical flow system of equations to properly constrain the problem. A least-squares fitting method was applied to small neighborhoods around each point of interest to increase the robustness of the method. The proposed method was then applied to endocardium tracking, and results were quantitatively compared with
those obtained by manual tracing, as well as by tracking with the original Lucas-Kanade optical flow method.[19]
14.2 MATHEMATICAL ANALYSIS
14.2.1 Optical Flow Constraint Equation
Optical flow (OF) tracking refers to the computation of the displacement field of objects in an image, based on the assumption that the intensity of the object remains constant. This notion was first proposed by Horn[20] and drove the active area of motion analysis in the 1990s. Barron et al.[21] wrote an extensive survey of the major optical-flow techniques at that time and drew the conclusion that the Lucas-Kanade and Fleet-Jepson methods were the most reliable among the nine techniques they implemented and tested on several image motion sequences.
Assuming the intensity at time frame t of the image point (x, y) is I(x, y, t), with u(x, y) and v(x, y) being the corresponding x and y components of the optical flow vector at that point, it is assumed that the image intensity remains constant at point (x + dx, y + dy) at time t + dt, where dx = u dt and dy = v dt are the actual displacements of the point during the time period dt, leading to the following equation:

$$I(x + dx, y + dy, t + dt) = I(x, y, t) \qquad (1)$$

If the image intensity is smooth with respect to x, y, and t, the left-hand side of Eq. (1) can be expanded into a Taylor series.[20] Simplifications, as detailed in Ref. 20, performed by ignoring the higher order terms and taking limits as dt → 0, lead to the following equation:

$$\frac{\partial I}{\partial x}\frac{dx}{dt} + \frac{\partial I}{\partial y}\frac{dy}{dt} + \frac{\partial I}{\partial t} = 0 \qquad (2)$$
Using the notations:

$$u = \frac{dx}{dt}, \quad v = \frac{dy}{dt}, \quad I_x = \frac{\partial I}{\partial x}, \quad I_y = \frac{\partial I}{\partial y}, \quad I_t = \frac{\partial I}{\partial t}, \qquad (3)$$

Eq. (2) can be simplified as:

$$I_x u + I_y v + I_t = 0. \qquad (4)$$
Eq. (4) is called the optical flow constraint equation, as it expresses a constraint on the components u and v of the optical flow. This system is under-constrained, and with this equation alone the optical flow problem cannot be uniquely solved. All gradient-based optical flow methods try to add additional constraints to make the system sufficiently constrained or even over-constrained. For example, the Lucas-Kanade method[19] tries to solve Eq. (2) through a weighted least-squares fit in each small spatial neighborhood $\Omega$ by minimizing the following expression, assuming a constant motion within the neighborhood:

$$\sum_{(x,y) \in \Omega} W^2(x, y)\,\big[I_x u + I_y v + I_t\big]^2 \qquad (5)$$
where W(x, y) denotes a window function applied to the neighborhood. The solution to Eq. (5) is given by the following linear system:

$$A^T W^2 A \begin{bmatrix} u \\ v \end{bmatrix} = A^T W^2 b \qquad (6)$$
where, for n points in the neighborhood at a single time t,

$$A = \begin{bmatrix} I_x(x_1, y_1) & \cdots & I_x(x_n, y_n) \\ I_y(x_1, y_1) & \cdots & I_y(x_n, y_n) \end{bmatrix}^T,$$

$$W = \mathrm{diag}\big(W(x_1, y_1), \ldots, W(x_n, y_n)\big),$$

$$b = -\big[I_t(x_1, y_1), \ldots, I_t(x_n, y_n)\big]^T. \qquad (7)$$
The system described in Eq. (6) can be solved by matrix inversion when the 2 × 2 matrix $A^T W^2 A$ is non-singular. The intrinsic least-squares fitting property increases the robustness of the optical flow estimation for the Lucas-Kanade method.
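To make the least-squares solve of Eqs. (5)-(7) concrete, here is a minimal single-point sketch in Python. The function name, the Gaussian window, and the finite-difference discretization are our own assumptions, not part of the original method description; spatial gradients are taken from the average of the two frames, and dt = 1.

```python
import numpy as np

def lucas_kanade_point(I0, I1, x, y, half_win=2, sigma=1.0):
    """Solve A^T W^2 A [u v]^T = A^T W^2 b (Eq. (6)) in a square
    neighborhood centered at pixel (x, y), for two frames I0, I1."""
    I0 = I0.astype(float)
    I1 = I1.astype(float)
    # Spatial gradients (central differences) from the frame average,
    # temporal gradient from the frame difference (dt = 1).
    Iy_full, Ix_full = np.gradient((I0 + I1) / 2.0)
    It_full = I1 - I0

    ys, xs = np.mgrid[y - half_win:y + half_win + 1,
                      x - half_win:x + half_win + 1]
    Ix = Ix_full[ys, xs].ravel()
    Iy = Iy_full[ys, xs].ravel()
    b = -It_full[ys, xs].ravel()              # vector b of Eq. (7)

    # Gaussian window weights W(x, y) of Eq. (5)
    w = np.exp(-((xs - x)**2 + (ys - y)**2) / (2.0 * sigma**2)).ravel()

    A = np.stack([Ix, Iy], axis=1)            # n x 2 matrix A of Eq. (7)
    AtW2A = A.T @ (w[:, None]**2 * A)         # 2 x 2 normal matrix
    AtW2b = A.T @ (w**2 * b)
    if abs(np.linalg.det(AtW2A)) < 1e-12:     # A^T W^2 A must be non-singular
        return None
    return np.linalg.solve(AtW2A, AtW2b)      # optical flow vector [u, v]
```

On a synthetic image pair related by a one-pixel translation, the recovered vector is close to the true motion; accuracy degrades wherever the brightness-constancy or constant-motion assumptions do not hold within the window.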
14.2.2 Optical Flow along Iso-Value Curves
In cardiac motion analysis, motion along some iso-value curves is usually more interesting than the full 2D or 3D displacement itself. In both cardiac biomechanics[22] and cardiac imaging analysis, such as in Ref. 23, 2D or 3D displacement vectors are usually decomposed into radial and circumferential displacement components. These components and their derivatives (strains) are usually good indicators of ventricular abnormalities. For example, myocardium thickening, computed via radial derivatives of radial displacements, is the best indicator of ischemia according to a recent biomechanics study.[24] With the correct choice of a coordinate system, such as polar coordinates[25] in 2D and cylindrical coordinates[23] in 3D, displacements along some directions (e.g. along radial directions) can be mathematically formulated as motion along some iso-value curves (e.g. θ = const). In this context, investigating optical flow along iso-value curves becomes important.
Given a time-varying N-dimensional time series $I(\vec{X}, t)$, where $\vec{X} = [x_1, \ldots, x_N]^T$ are the spatial coordinates and t is the temporal dimension, the constant intensity constraint is

$$I(\vec{X}, t) = I(\vec{X} + \vec{dX}, t + dt) \qquad (8)$$

where $\vec{dX}$ is the N-D displacement vector within the time period dt for the pixel located at $\vec{X}$ at time t. Using a Taylor series expansion and omitting higher order terms, we have

$$\nabla I(\vec{X}, t) \cdot \vec{dX} + \frac{\partial I(\vec{X}, t)}{\partial t}\,dt = 0 \qquad (9)$$

where $\nabla I(\vec{X}, t) = \left[\frac{\partial I}{\partial x_1}, \ldots, \frac{\partial I}{\partial x_N}\right]^T$ is the image spatial gradient vector and "·" represents the vector dot product.

By defining the velocity vector (i.e. the optical flow vector) as $\vec{v} = \frac{d\vec{X}}{dt} = \left[\frac{dx_1}{dt}, \ldots, \frac{dx_N}{dt}\right]^T$, the optical flow constraint equation for an N-dimensional time series can be derived as

$$\nabla I(\vec{X}, t) \cdot \vec{v} + \frac{\partial I(\vec{X}, t)}{\partial t} = 0 \qquad (10)$$

by taking limits as dt → 0.
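As a quick numerical sanity check of Eq. (10), the sketch below (a hypothetical helper of our own, with dt = 1 and finite-difference gradients) evaluates the left-hand side voxel by voxel; for a known translation it should be much smaller with the true velocity than with v = 0:

```python
import numpy as np

def of_residual(I0, I1, v):
    """Voxel-wise residual of Eq. (10), grad(I) . v + dI/dt, for two
    N-D frames I0, I1 separated by dt = 1; v has one entry per axis."""
    I0 = I0.astype(float)
    I1 = I1.astype(float)
    grads = np.gradient(I0)       # spatial gradient, one array per axis
    It = I1 - I0                  # forward-difference temporal gradient
    return sum(g * vi for g, vi in zip(grads, v)) + It
```

For a 3D Gaussian blob translated by one voxel, the mean absolute residual with the true velocity is far below that of the v = 0 case (which is just the temporal difference), confirming that the constraint is dimension-agnostic.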
Assume the optical flow estimation is performed along iso-value curves $G(\vec{X}, \vec{dX}) = \mathrm{const}$. Note that in N-dimensional space, more than one equation may be needed to represent iso-value curves or hypersurfaces, so G could be a vector of functions and const a constant vector of the same length as G. By letting $F(\vec{X}, \vec{dX}) = G(\vec{X}, \vec{dX}) - \mathrm{const}$, the problem can be converted into an optical flow estimation along the zero-value curve(s) $F(\vec{X}, \vec{dX}) = 0$ (note that F could be a vector, for the same reason as G). Thus, for a point $\vec{X}$, two general constraints are imposed on the optical flow vector $\vec{v}$:

$$\begin{cases} \nabla I(\vec{X}, t) \cdot \vec{v} + \dfrac{\partial I(\vec{X}, t)}{\partial t} = 0 \\ F(\vec{X}, \vec{dX}) = 0 \end{cases} \qquad (11)$$
There are many ways to solve the system described by Eq. (11). Here, we propose a framework that solves this system via energy minimization, since this framework can be easily extended to image spaces with different dimensionalities, can easily incorporate neighborhood information, and can easily accommodate additional constraints.
One straightforward way to solve the optical flow along iso-value curves as in Eq. (11) is to follow the rationale of the Lucas-Kanade method. To increase the robustness of the optical flow estimation, for each point $\vec{X}_c$ the final optical flow estimate is obtained by minimizing the energy defined in Eq. (12), in the least-squares fitting sense, over an n-point neighborhood $\Omega$ centered at $\vec{X}_c$, assuming a constant motion within the neighborhood:

$$\vec{v} = \arg\min_{\vec{v}} E_1 = \arg\min_{\vec{v}} \left(E_{OF} + E_{ISO}\right) = \arg\min_{\vec{v}} \left\{ \left\| W(\vec{X}) \cdot OF(\vec{X}) \right\|^2_{\vec{X} \in \Omega} + \left\| F(\vec{X}_c, \vec{dX}_c) \right\|^2 \right\}, \qquad (12)$$

where $\| \cdot \|$ represents the $l_2$ norm, and the weighting vector $W(\vec{X})$ and optical flow constraint vector $OF(\vec{X})$ are defined as follows
in the neighborhood $\Omega$:

$$W(\vec{X}) = \left[ W(\vec{X}_1), \ldots, W(\vec{X}_n) \right]^T,$$

$$OF(\vec{X}) = \begin{bmatrix} \nabla I(\vec{X}_1, t) \cdot \vec{v} + \dfrac{\partial I(\vec{X}_1, t)}{\partial t} \\ \vdots \\ \nabla I(\vec{X}_n, t) \cdot \vec{v} + \dfrac{\partial I(\vec{X}_n, t)}{\partial t} \end{bmatrix} \quad \text{given } (\vec{X}_1, \ldots, \vec{X}_n) \in \Omega. \qquad (13)$$
In general, solving the energy minimization problem in Eq. (12) is not trivial, depending upon the nonlinearity of the function $F(\vec{X}, \vec{dX})$.
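For a concrete special case relevant to radial tracking: when $F(\vec{X}, \vec{dX}) = 0$ constrains the displacement at $\vec{X}_c$ to the radial direction of a polar system (θ = const), substituting $\vec{v} = s\,\hat{r}$ satisfies the constraint exactly and collapses Eq. (12) to a scalar least-squares problem with a closed-form solution for the radial speed s. The sketch below uses our own discretization choices (unit window weights, dt = 1, gradients from the frame average):

```python
import numpy as np

def radial_flow(I0, I1, x, y, cx, cy, half_win=2):
    """Optical flow at (x, y) constrained to the radial direction of a
    polar system centered at (cx, cy): v = s * r_hat. The iso-value
    constraint is enforced exactly, leaving a scalar least-squares fit
    for the radial speed s over the neighborhood."""
    I0 = I0.astype(float)
    I1 = I1.astype(float)
    Iy_full, Ix_full = np.gradient((I0 + I1) / 2.0)
    It_full = I1 - I0

    r = np.array([x - cx, y - cy], dtype=float)
    r_hat = r / np.linalg.norm(r)            # radial unit vector at X_c

    ys, xs = np.mgrid[y - half_win:y + half_win + 1,
                      x - half_win:x + half_win + 1]
    # Directional derivative grad(I) . r_hat over the neighborhood
    g = Ix_full[ys, xs] * r_hat[0] + Iy_full[ys, xs] * r_hat[1]
    It = It_full[ys, xs]
    s = -(g * It).sum() / (g * g).sum()      # closed-form minimizer of E_1
    return s * r_hat                         # radial flow vector [u, v]
```

Because the constraint is substituted rather than penalized, E_ISO vanishes identically and no nonlinear optimization is needed in this special case.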
One important feature of the proposed framework, as in Eq. (12), is that everything is formulated in the original coordinate system of the input image series. There is no need to resample the image data to another coordinate system, e.g. polar coordinates, as is usually done for motion analysis or segmentation along one direction, such as in Ref. 26. The main advantage of the proposed framework over these image resampling frameworks is that it avoids image resampling, which is a relatively expensive step, especially for 3D image volumes, and may introduce artifacts depending upon the interpolation scheme used. This saves a lot of computational power when dealing with higher dimensional image series.
Another point worth noting is that Eq. (12) is not the only way to formulate the optical flow along iso-value curves. Another framework with an identical optimum solution in the ideal case will be proposed for the real-time 3D ultrasound application, in a constrained energy minimization fashion.
In the following section, the proposed framework will be applied to different applications. Specific zero-value curve functions $F(\vec{X}, \vec{dX})$ will be derived, and the instances of Eq. (12) or other energy minimization schemes will be derived as well. The tracking results will
be quantitatively compared to the results derived from manual tracing through an area-based index or finite-element model based contour/surface comparison.
14.3 METHODS AND RESULTS
14.3.1 Example I: Tracking Radial Displacements of the
Endocardium in 2D Cardiac MRI Series
A direct application of the proposed framework was tested to track the endocardium motion along radial displacements. Previous work involving tracking endocardial borders using optical flow, such as Ref. 27, usually applied the optical flow algorithm directly on the Cartesian image data without additional constraints on motion direction. Since radial displacements and their derivatives are the most interesting components of endocardial motion, we focused on OF radial displacement computation only.
14.3.1.1 Mathematical analysis
Usually in 2D cardiac images, a polar coordinate system is used to decompose the endocardium displacement field into radial and circumferential directions. We followed the same coordinate system convention. The center of the polar coordinate system cannot simply be the centroid of the blood pool because of the well known "floating centroid" problem in cardiac biomechanics.²⁸ Following the proper ventricle axis selection protocol described in Ref. 28, the long axis of the left ventricle was first selected, and then the center of the polar coordinate system was set as the intersection of the LV long axis and the imaging plane. In this coordinate system, radial displacements can be defined as displacements along iso-value lines $\theta = \text{const}$. The corresponding zero-value function $F(\vec{X}, \vec{dX}) = f(x_c, y_c, u, v) = 0$, expressing the fact that the point $(x_c, y_c)$ and its motion vector $(u, v)$ are along the line $\theta = \text{const}$, is given by:
$$\begin{cases} x_c \sin\theta - y_c \cos\theta = 0 \\ u \sin\theta - v \cos\theta = 0, \end{cases} \qquad (14)$$
which can be simpliﬁed into
$$f(x_c, y_c, u, v) = y_c u - x_c v = 0. \qquad (15)$$
So the total energy associated with the optical flow along the zero-value curve is:
$$E_1 = \sum_{(x,y) \in \Omega} W^2(x, y) \left[ I_x u + I_y v + I_t \right]^2 + f^2(x_c, y_c, u, v). \qquad (16)$$
Similar to the original Lucas-Kanade method, the energy minimization problem described by Eq. (16) can be solved by least-square fitting of the following equivalent overconstrained system:
$$\begin{bmatrix} \sum W^2 I_x^2 & \sum W^2 I_x I_y \\ \sum W^2 I_x I_y & \sum W^2 I_y^2 \\ y_c & -x_c \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} A^T W^2 b \\ 0 \end{bmatrix}, \qquad (17)$$
where W, A, and b are deﬁned in Eq. (7).
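The per-point system of Eq. (17) is small enough to solve directly by least squares. Below is a minimal NumPy sketch, not from the chapter; the function name is illustrative, and since Eq. (7) is not reproduced here, the standard Lucas-Kanade definitions $A = [I_x\ I_y]$ and $b = -I_t$ are assumed for the right-hand side.

```python
import numpy as np

def radial_flow(Ix, Iy, It, W, xc, yc):
    """Solve the overconstrained system of Eq. (17) for one point.

    Ix, Iy, It : 1D arrays of spatial/temporal image derivatives over
                 the n-point neighborhood; W : per-point weights;
    (xc, yc)   : point coordinates relative to the polar-coordinate center.
    Returns the flow (u, v), softly constrained toward the radial line
    theta = const through (xc, yc).
    """
    W2 = W ** 2
    # Normal-equation entries of the weighted Lucas-Kanade system,
    # i.e. A^T W^2 A with A = [Ix Iy] (an assumption about Eq. (7)),
    # plus the iso-value constraint row y_c*u - x_c*v = 0.
    M = np.array([
        [np.sum(W2 * Ix * Ix), np.sum(W2 * Ix * Iy)],
        [np.sum(W2 * Ix * Iy), np.sum(W2 * Iy * Iy)],
        [yc, -xc],
    ])
    # Right-hand side A^T W^2 b with b = -It, stacked with 0.
    rhs = np.array([
        -np.sum(W2 * Ix * It),
        -np.sum(W2 * Iy * It),
        0.0,
    ])
    (u, v), *_ = np.linalg.lstsq(M, rhs, rcond=None)
    return u, v
```

When the true motion already satisfies the radial constraint, the least-squares solution recovers it exactly; otherwise the constraint row pulls the estimate toward the radial line.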
14.3.1.2 Data and evaluation methods
The endocardial border tracking scheme developed in the previous
section was tested on two cardiac MRI protocols:
• A 2D cardiac MRI series with ECG gating acquired by a GE 1.5T system using the FIESTA protocol for 2D short axis stacks, from an IRB-approved experiment of LAD occlusion in sheep hearts. This protocol, which is also called SSFP by other vendors, generates clear anatomical images of the heart. For this reason, it is widely used in cardiac MRI. This data set was selected to test the performance of the optical flow on clear images with standard temporal resolution in cardiac MRI.
• A 2D cardiac MRI series with ECG gating acquired by a Siemens 1.5T system using a novel high-temporal-resolution Phase Train Imaging (PTI) protocol proposed by Pai et al.² for 2D short axis stacks, from a volunteer heart. This novel high-speed protocol can provide 2 ms temporal resolution on average and about four hundred
frames per cardiac cycle. The image quality is worse than that of the FIESTA or SSFP protocol. This data set was selected to test the performance of the optical flow on high-speed, low-quality image series, and also its robustness in long-term tracking.
Endocardial border points for each time frame of the FIESTA data and for the last frame of the PTI data were traced by an experienced expert. The optical flow algorithm was initialized with the manual tracing points on the first frame (end-diastole) and then automatically run to track those points throughout the whole cardiac cycle (20 frames in total for the FIESTA data and 412 frames for the PTI data).
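The initialize-and-propagate procedure just described can be sketched generically: points traced on the first frame are fed forward, with each frame's estimated displacements becoming the next frame's starting positions. In this sketch, `flow_between` is a hypothetical stand-in for any of the per-point optical flow estimators in this chapter, not a function from the original text.

```python
import numpy as np

def track_points(frames, initial_points, flow_between):
    """Propagate manually traced border points through an image sequence.

    frames         : sequence of 2D image arrays (one per cardiac phase)
    initial_points : (n, 2) array of points traced on frames[0]
    flow_between   : callable(frame_t, frame_t1, points) -> (n, 2)
                     displacement estimates (placeholder for an
                     optical flow method)
    Returns a list of the tracked point sets, one per frame.
    """
    points = np.asarray(initial_points, dtype=float)
    tracked = [points.copy()]
    for t in range(len(frames) - 1):
        # Each frame's estimate seeds the next frame's starting positions,
        # which is also why errors can accumulate over long sequences.
        points = points + flow_between(frames[t], frames[t + 1], points)
        tracked.append(points.copy())
    return tracked
```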
Two error measurements were used to evaluate the performance of the optical flow: (1) the Tanimoto index $TI = \frac{TP}{1 + FP} = \frac{|Seg_1 \cap Seg_2|}{|Seg_1 \cup Seg_2|}$,²⁹ which is widely used in the comparison of segmentation results; (2) relative errors in radial coordinates. A 24-element finite element model was used to fit the manually traced points or the optical flow tracked points for each frame of interest. The relative errors in the radial coordinates of each element were then computed, with their mean serving as a performance indicator for each frame. The original Lucas-Kanade optical flow method, without the iso-value curve constraint, was also implemented and applied to the same data as a comparison method for endocardium tracking.
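On binary segmentation masks, the set-based form of the Tanimoto index above reduces to intersection-over-union. A minimal sketch, assuming the two segmentations are given as boolean NumPy arrays:

```python
import numpy as np

def tanimoto_index(seg1, seg2):
    """Tanimoto index |Seg1 ∩ Seg2| / |Seg1 ∪ Seg2| of two binary masks."""
    seg1 = np.asarray(seg1, dtype=bool)
    seg2 = np.asarray(seg2, dtype=bool)
    union = np.logical_or(seg1, seg2).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return np.logical_and(seg1, seg2).sum() / union
```

Identical masks score 1.0 and disjoint masks score 0.0, which makes the index convenient for comparing tracked borders against manual tracings frame by frame.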
14.3.1.3 Results
On the FIESTA data, the radial lengths of the endocardial border points tracked by our method at end-diastole (ED) and end-systole (ES) are plotted in Fig. 1(a). When compared to the endocardium obtained by manual tracing, our proposed method has a TI value of 74.62% ± 8.54%, compared with 72.06% ± 9.13% for the original Lucas-Kanade method. These results show that our proposed method is more accurate and robust than the original Lucas-Kanade method. Example tracking results at frame 10 are shown in Fig. 1(b), showing that our method is less likely to fail compared with the original method. A similar conclusion can be drawn from
Fig. 1. On FIESTA sheep data: (a) radial length of endocardium points at ED (solid line) and ES (dashed line); (b) tracking result at frame 10 with the proposed method (center of red circle) and the Lucas-Kanade method (center of green cross); (c–d) relative radial coordinate errors: (c) relative error and (d) standard deviation. In (b–d), solid line: proposed method; dashed line: Lucas-Kanade method.
the comparison of the relative errors in radial coordinates plotted in Figs. 1(c) and 1(d), for which the proposed method has lower average errors as well as lower standard deviations in the relative radial coordinate errors. The additional constraint of the OF motion along iso-value curves improved the robustness and accuracy of the endocardium tracking.
Error accumulation over consecutive frames in the OF estimation can be noticed in the plots, which suggests that applying forward and backward tracking, or adding more reference points, may improve the performance of the OF estimation.
Fig. 2. Tracking result at frame 412 with the proposed method (red circle) and the Lucas-Kanade method (green cross) on the high-speed PTI data.
On the PTI data, after tracking the endocardium through the whole cardiac cycle, the TI values at the last frame are 85.40% for our method and 63.70% for the original Lucas-Kanade method. The relative errors are 7.10% ± 9.52% for our method and 96.49% ± 126.91% for the original Lucas-Kanade method. Tracking results for the last frame (the 412th frame) are shown in Fig. 2, which shows that the additional constraint derived from the iso-value curve increases the robustness of our method for high temporal resolution tracking.
14.3.2 Example II: Tracking the Endocardium in Real-Time 3D Ultrasound
Development of real-time 3D (RT3D) echocardiography started in the late 1990s³⁰ based on matrix phased array transducers. Recently, a new generation of RT3D transducers was introduced by Philips Medical Systems (Best, The Netherlands) with the SONOS 7500 transducer, followed by the iE33, which can acquire a fully sampled cardiac volume in four cardiac cycles. This technical design enabled a dramatic increase in spatial resolution and image quality, which makes such 3D ultrasound techniques increasingly
attractive for daily cardiac clinical diagnosis. Since RT3D ultrasound acquires volumetric ultrasound sequences with fairly high temporal resolution and a stationary transducer, it can capture the complex 3D cardiac motion very well. Advantages of using three-dimensional ultrasound in cardiology include the possibility to display a three-dimensional dynamic view of the beating heart, and the ability for the cardiologist to explore the three-dimensional anatomy at arbitrary angles, to localize abnormal structures, and to assess wall deformation. This technology has been shown, in the past decade, to provide more accurate and reproducible screening for quantification of cardiac function, for two main reasons: the absence of geometrical assumptions for ventricular shapes, and the accuracy of the visualization planes for performing ventricular volume measurements. It was validated through several clinical studies for quantification of LV function, as reviewed in Refs. 31 and 5. The development of computer-aided tools for RT3D ultrasound is relatively limited compared with the development of image processing techniques for other modalities. Early studies¹⁷ used simple simulated phantoms, while recent research³² used 3D ultrasound data sequences for LV volume estimation. In Ref. 27, we proposed a framework based on correlation-based optical flow estimation to track the endocardium. The results were quantitatively validated against manual tracing results. In a recent study,³³ 3D speckle tracking techniques, which are similar to our method in Ref. 27, were tested mainly on simulated data. All tracking in these previous studies was performed directly in 3D Cartesian coordinates. However, for the purpose of tracking the endocardium, the problem can be reformulated as an optical flow along iso-value curves problem, which is much more efficient with comparable tracking results. An example frame of RT3D ultrasound is shown in Fig. 3 in the Philips QLAB interface.
14.3.2.1 Mathematical analysis
As mentioned in the previous section, Eq. (12) is not the only way to formulate the optical flow along iso-value curves. An equivalent framework to Eq. (12) can be formulated through constrained energy
Fig. 3. Example frame of RT3D ultrasound at ED for a patient with a transplanted heart: (a) axial, (b) elevation and (c) azimuth views.
minimization:

$$\begin{cases} \vec{v} = \arg\min_{\vec{v}} E_{OF} = \arg\min_{\vec{v}} \left\| W(\vec{X}) \cdot OF(\vec{X}) \right\|^2 \Big|_{\vec{X} \in \Omega} \\ F(\vec{X}_c, \vec{dX}_c) = 0. \end{cases} \qquad (18)$$
Equation (18) is an equivalent system to Eq. (12) in the sense that both systems have the same optimum solution in the ideal case, i.e. when the minimum value of $E_{OF}$ is zero.
In order to show that our framework is not limited to the gradient-based optical flow framework, in this example we will derive an energy term that is equivalent to the correlation-based optical flow used in Ref. 27. Since maximizing the correlation coefficient is equivalent to minimizing the sum squared difference between two neighborhoods, the optical flow energy $E_{OF}$ can simply be defined with this error energy. To properly define "radial displacement," a prolate spheroidal coordinate system $(\lambda, \mu, \theta)$ with focus $d$ was established as described in Refs. 27 and 34. So for each point $\vec{X}_c$ with an $n$-point neighborhood $\Omega$ centered at $\vec{X}_c$, the tracking problem can be formulated as:
$$\begin{cases} \vec{v} = \arg\min_{\vec{v}} E_{OF} = \arg\min_{\vec{v}} \displaystyle\sum_{\vec{X} \in \Omega} \left[ I(\vec{X}, t) - I(\vec{X} + \vec{v}\,dt,\; t + dt) \right]^2 \\[4pt] F(\vec{X}_c, \vec{dX}_c) = \begin{bmatrix} \mu_c^{t+dt} - \mu_c^t \\ \theta_c^{t+dt} - \theta_c^t \end{bmatrix} = 0, \end{cases} \qquad (19)$$

where

$$\vec{X} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d \sinh\lambda \sin\mu \cos\theta \\ d \sinh\lambda \sin\mu \sin\theta \\ d \cosh\lambda \cos\mu \end{bmatrix}. \qquad (20)$$
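Equation (20) can be implemented directly, and the constraint of Eq. (19) then turns the 3D search into a one-dimensional one: only $\lambda$ varies while $\mu$ and $\theta$ are held fixed. A small sketch under those assumptions; the function names, the SSD callback, and the candidate grid are illustrative, not from the chapter.

```python
import numpy as np

def prolate_to_cartesian(lam, mu, theta, d):
    """Map prolate spheroidal coordinates (lambda, mu, theta) with
    focal distance d to Cartesian (x, y, z), per Eq. (20)."""
    x = d * np.sinh(lam) * np.sin(mu) * np.cos(theta)
    y = d * np.sinh(lam) * np.sin(mu) * np.sin(theta)
    z = d * np.cosh(lam) * np.cos(mu)
    return np.array([x, y, z])

def track_radial(ssd, lam0, mu0, theta0, d, candidates):
    """Pick the candidate lambda minimizing a caller-supplied SSD energy
    between the neighborhoods at times t and t+dt, keeping mu and theta
    fixed -- the constraint of Eq. (19)."""
    return min(candidates, key=lambda lam: ssd(
        prolate_to_cartesian(lam0, mu0, theta0, d),
        prolate_to_cartesian(lam, mu0, theta0, d)))
```

Because the search runs over a single coordinate instead of a 3D displacement space, this is where the dimensionality reduction (and the resulting computational saving discussed later) comes from.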
14.3.2.2 Data and evaluation method
The tracking approach was tested on one data set acquired with a SONOS 7500 3D ultrasound machine (Philips Medical Systems, Best, The Netherlands): one transthoracic clinical data set acquired from a heart transplant patient. The spatial resolution of the analyzed data was 0.8 mm³, and 16 frames were acquired for one cardiac cycle. The endocardial surfaces were manually traced by one experienced expert for every frame between end-diastole and end-systole. The optical flow algorithms were initialized using the manual tracings at the ED and ES frames. The endocardial surfaces in between were generated by averaging the results of forward and backward tracking by both methods. The manual tracing for each frame was used as the gold standard for surface comparison.
We evaluated the OF tracking performance via visualization and quantification of the dynamic ventricular geometry compared to the segmented surfaces. Usually, comparison of segmentation results is performed via global measurements like volume difference or mean squared error. In order to provide a local comparison, we proposed a novel comparison method in Ref. 35 based on a parameterization of the endocardial surface in prolate spheroidal coordinates³⁶
and previously used for comparison of ventricular geometries from two 3D ultrasound machines in Ref. 37. The endocardial surfaces were registered using three manually selected anatomical landmarks: the center of the mitral orifice, the endocardial apex, and the equatorial mid-septum. The data was fitted in prolate spheroidal coordinates $(\lambda, \mu, \theta)$, projecting the radial coordinate $\lambda$ onto a 64-element surface mesh with bicubic Hermite interpolation, yielding a realistic 3D endocardial surface. The fitting process was performed using the custom finite element package Continuity 5.5 developed at the University of California San Diego (http://cmrg.ucsd.edu). The fitted nodal values and spatial derivatives of the radial coordinate, $\lambda$, were then used to map relative differences between two surfaces, $\varepsilon = (\lambda_{seg} - \lambda_{OF})/\lambda_{seg}$, using custom software. A Hammer mapping was used to flatten the endocardial surface via an area-preserving mapping,²⁸ through which relative $\lambda$ difference maps were generated for end-systole (ES), providing a direct quantitative comparison of ventricular geometry. These maps are visualized with iso-level lines, quantified in percentage values of radial difference. The area under 10% difference is used as the criterion for quantitative comparison.
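The comparison amounts to a pointwise relative difference in the fitted radial coordinate, and the fraction of the surface where that difference stays below 10%. A simplified sketch on sampled nodal $\lambda$ values, ignoring the Hammer projection and true area weighting (so the "area" here is just an unweighted sample fraction, an assumption of this sketch):

```python
import numpy as np

def lambda_difference_map(lam_seg, lam_of):
    """Relative radial-coordinate differences eps = (lam_seg - lam_of) / lam_seg."""
    lam_seg = np.asarray(lam_seg, dtype=float)
    lam_of = np.asarray(lam_of, dtype=float)
    return (lam_seg - lam_of) / lam_seg

def area_under_threshold(lam_seg, lam_of, threshold=0.10):
    """Fraction of surface samples whose |eps| is below the threshold --
    an unweighted stand-in for the 'area under 10% difference' criterion."""
    eps = lambda_difference_map(lam_seg, lam_of)
    return float(np.mean(np.abs(eps) < threshold))
```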
The average intraobserver and interobserver variances were also computed by a similar scheme, using two tracings from a single user one month apart and two tracings from two different users at the same time.
14.3.2.3 Results
The area percentages under 10% difference are plotted in Fig. 4(a). The mean values are 69.66% ± 21.42% for the proposed method and 87.29% ± 10.38% for the direct tracking scheme. Example Hammer maps from both methods at one frame are shown in Figs. 4(b) and 4(c), respectively. The average intra- and interobserver differences are 79.38% and 55.33%, respectively, in terms of the same surface comparison criterion. Both methods are comparable to the interobserver variance, and the direct tracking has better performance than the proposed method.
Fig. 4. Optical flow tracking results on RT3D ultrasound: (a) area percentage under 10% difference from manual tracing for each frame, generated by the proposed method (blue) and the direct tracking method (green); the average intraobserver variance (red) and interobserver variance (cyan) are also plotted for reference; (b) Hammer map of the direct tracking result at frame 5; (c) Hammer map of the constrained tracking result at frame 5.
From a computational cost point of view, the proposed method used 5.3767 seconds on average to track the surface between two frames, whereas the direct tracking method needed 112.78 seconds on average for the same task, about a 20-fold saving in computational power for our method compared with the direct tracking scheme. With performance comparable to the inter-user difference and a much shorter computation time than the direct tracking scheme, our method may be more suitable for clinical applications, where the total analysis time is limited to 5–10 minutes for each data set.
14.3.3 Example III: Thickening Computation
on 2D Ultrasound Slices
In the previous two applications, optical flow along iso-value curves was used mainly as a tracking tool. In this example, we will show that the displacement estimated in this framework can also be used in motion analysis and strain computation.
14.3.3.1 Data and method
One basal short-axis cross-section view was extracted from the RT3D clinical data used in the previous section. 2D versions of the optical flow
Fig. 5. Results of the thickening computation: (a) example 2D slice; (b) segmental average of thickening from the direct tracking scheme; (c) segmental average of thickening from our method.
methods detailed in the previous section were also implemented. Both optical flow methods were initialized at end-diastole with the manual tracing. The segmental average thickening from ED to ES was computed.
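The chapter does not spell out the thickening formula, but a common way to turn tracked contours into segmental thickening is to average the radial wall thickness per angular segment at ED and ES and report the fractional change. A sketch under that assumption; the segment layout, polar contour format, and function name are all illustrative.

```python
import numpy as np

def segmental_thickening(endo_ed, epi_ed, endo_es, epi_es, n_segments=6):
    """Fractional wall thickening per angular segment, from contour points
    given in polar form as (theta, r) pairs (radians, mm).

    Thickness is the epicardial radius minus the endocardial radius,
    averaged within each segment; thickening = (ES - ED) / ED.
    """
    def mean_thickness(endo, epi):
        edges = np.linspace(-np.pi, np.pi, n_segments + 1)
        out = np.empty(n_segments)
        for i in range(n_segments):
            in_endo = (endo[:, 0] >= edges[i]) & (endo[:, 0] < edges[i + 1])
            in_epi = (epi[:, 0] >= edges[i]) & (epi[:, 0] < edges[i + 1])
            out[i] = epi[in_epi, 1].mean() - endo[in_endo, 1].mean()
        return out

    th_ed = mean_thickness(np.asarray(endo_ed), np.asarray(epi_ed))
    th_es = mean_thickness(np.asarray(endo_es), np.asarray(epi_es))
    return (th_es - th_ed) / th_ed
```

A hypokinetic segment, such as the septum in the data set above, would then show a markedly lower value than its neighbors.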
14.3.3.2 Results
The segmental average results of the thickening computation from both methods are shown in Fig. 5. Both methods generated similar results and correctly indicated the reduced motion at the septum in the original data set.
14.4 DISCUSSION
From the three examples shown, we can conclude that, with the additional energy term or constraint from the iso-value curve, the optical flow algorithm can either perform better at roughly the same computational cost, or perform much more efficiently without significantly degrading accuracy, especially for tracking tasks such as tracking the endocardium in cardiac imaging. The radial displacements and thickening estimates derived from the constrained scheme were similar to those obtained by direct tracking.
The frameworks proposed in Eqs. (12) and (18) are generic. They can be easily extended to higher dimensional spaces, and the energy for optical flow estimation can be chosen differently from the optical flow constraint equation. Actually, with a proper choice of the optical flow energy term, the well known intensity constancy assumption can be loosened, which could increase the robustness of the estimation. Moreover, in addition to the direct benefit of the energy minimization framework, additional constraints, such as a smoothness constraint, can be seamlessly incorporated by simply adding weighted energy terms associated with these constraints. This framework could also be merged with variational optical flow approaches, such as the works in Refs. 38 and 39.
The proposed frameworks were formulated directly in the same coordinate system as the input image, so no data resampling is required, which reduces the overall computational cost and the dependency of the accuracy on the interpolation methods. The key points in these frameworks are to properly define the zero-value curve function vector F and to properly minimize the energy. The latter could be formulated as a nonlinear problem for some applications.
Although the ideal systems described by Eqs. (12) and (18) have the same optimal solutions, the results on real image series from these two frameworks may differ if the zero minimum of the optical flow energy cannot be reached. In this case, the framework defined by Eq. (18) will still give an optical flow displacement vector along the iso-value curves, whereas the framework defined by Eq. (12) may loosen this constraint to obtain an estimate with even lower energy, so that the framework defined by Eq. (12) outperforms its constrained counterpart defined by Eq. (18). Considering computational cost, the framework defined by Eq. (12) slightly increases the cost compared to the direct tracking method because of the additional energy term; on the contrary, the framework defined by Eq. (18) usually offers a huge saving in computational power due to dimensionality reduction. So for tracking purposes, if accuracy is more important than efficiency, we suggest using the unconstrained version, Eq. (12). If efficiency is more important, or if the
displacement is required to strictly follow the iso-value curves, the constrained version, Eq. (18), would be a good choice.
The last point that needs to be made is that the displacements estimated by the proposed frameworks cannot be used for motion analysis along directions other than the given iso-value curve. For example, if the displacement of the endocardium is estimated by optical flow along the radial direction ($\theta = \text{const}$) in 2D, this estimate cannot be directly used to estimate the circumferential displacement or cardiac twist. This is a limitation of the proposed frameworks, since in some sense we are trading universality in free motion estimation for much better accuracy or efficiency in motion estimation along specific iso-value curves. Fortunately, this limitation does not greatly restrict the usefulness of the proposed framework, since in most cardiac applications, landmark or surface tracking and motion analysis along specific directions are more common than free motion analysis.
14.5 CONCLUSION
Two generic frameworks for optical flow were proposed as an energy minimization problem with local constraints related to iso-value curves. Three applications of these frameworks were presented for tracking of the endocardium on 2D MRI data series (both FIESTA and PTI protocols) and real-time 3D ultrasound series. The endocardium borders tracked by the proposed method as well as by the Lucas-Kanade method were quantitatively compared to manual tracing on each frame through the Tanimoto index and relative errors in radial coordinates after FEM fitting. The results showed superior performance of the proposed method in tracking the endocardium. The constrained version was applied to real-time 3D ultrasound data. Quantitative evaluation yielded performance comparable to the interobserver variance, with about a 20-fold saving in computational cost compared to the direct tracking scheme. Thickening computations from the proposed method and the direct tracking method were compared, with similar results. These frameworks are generic and can
be readily extended to n-dimensional spaces and can seamlessly incorporate additional constraints via a similar energy minimization framework.
14.6 ACKNOWLEDGMENT
This work was funded by National Science Foundation grant BES-0201617, American Heart Association grant #0151250T, Philips Medical Systems, and the New York State NYSTAR/CAT Technology Program. Dr Andrew McCulloch at the University of California, San Diego provided the finite element software "Continuity" through the National Biomedical Computation Resource (NIH P41RR08605). The authors also would like to thank Dr Todd Pulerwitz (Department of Medicine, Columbia University), Susan L. Herz, Christopher M. Ingrassia, Drs Jeffrey W. Holmes and Kevin D. Costa (Department of Biomedical Engineering), and Dr Vinay M. Pai (Radiology, New York University).
References
1. Ramm OTV, Pavy JHG, Smith SW, Kisslo J, Real-time, three-dimensional echocardiography: The first human images, Circulation 84: 685, 1991.
2. Pai V, Axel L, Kellman P, Phase train approach for very high temporal resolution cardiac imaging, J Cardiovasc Magn Reson 7: 98–99, 2005.
3. Drezek R, Stetten GD, Ota T, Fleishman C, et al., Active contour based on the elliptical Fourier series, applied to matrix-array ultrasound of the heart, presented at 25th AIPR Workshop: Emerging Applications of Computer Vision, 1997.
4. Chalana V, Linker DT, Haynor DR, Kim Y, A multiple active contour model for cardiac boundary detection on echocardiographic sequences, IEEE Transactions on Medical Imaging 15: 290–298, 1996.
5. Angelini ED, Homma S, Pearson G, Holmes JW, et al., Segmentation of real-time three-dimensional ultrasound for quantification of ventricular function: A clinical study on right and left ventricles, Ultrasound in Med & Biol 31: 1143–1158, 2005.
6. Paragios N, A level set approach for shape-driven segmentation and tracking of the left ventricle, IEEE Transactions on Medical Imaging 22: 773–776, 2003.
7. Lin N, Duncan JS, Generalized robust point matching using an extended free-form deformation model: Application to cardiac images, presented at 2004 2nd IEEE International Symposium on Biomedical Imaging: Macro to Nano, 2004.
8. Rueckert D, Burger P, Geometrically Deformable Templates for Shape-based Segmentation and Tracking in Cardiac MR Images, presented at Energy Minimization Methods in Computer Vision and Pattern Recognition, Venice, Italy, 1997.
9. Montagnat J, Delingette H, Spatial and Temporal Shape Constrained Deformable Surfaces for 3D and 4D Medical Image Segmentation, INRIA, Sophia Antipolis RR-4078, 2000.
10. van Assen HC, Danilouchkine MG, Frangi AF, Ordas S, et al., SPASM: A 3D-ASM for segmentation of sparse and arbitrarily oriented cardiac MRI data, Medical Image Analysis 10: 286–303, 2006.
11. Mitchell SC, Lelieveldt BPF, van der Geest R, Schaap J, et al., Segmentation of Cardiac MR Images: An Active Appearance Model Approach, presented at SPIE, The International Society for Optical Engineering, 2000.
12. Setarehdan SK, Soraghan JJ, Automatic cardiac LV boundary detection and tracking using hybrid fuzzy temporal and fuzzy multiscale edge detection, IEEE Transactions on Biomedical Engineering 46: 1364–1378, 1999.
13. Veronesi F, Corsi C, Caiani EG, Sarti A, et al., Tracking of left ventricular long axis from real-time three-dimensional echocardiography using optical flow techniques, IEEE Transactions on Information Technology in Biomedicine 10: 174–181, 2006.
14. Duan Q, Angelini E, Herz SL, Ingrassia CM, et al., Dynamic Cardiac Information From Optical Flow Using Four Dimensional Ultrasound, presented at 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), Shanghai, China, 2005.
15. Loncaric S, Majcenic Z, Optical Flow Algorithm for Cardiac Motion Estimation, presented at 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Jul 23–28 2000, Chicago, IL, 2000.
16. Gindi GR, Gmitro AF, Delorie DHJ, Velocity Flow-Field Analysis of Cardiac Dynamics, presented at Proceedings of the Thirteenth Annual Northeast Bioengineering Conference, Philadelphia, PA, USA, 1987.
17. Gutierrez MA, Moura L, Melo CP, Alens N, Computing Optical Flow in Cardiac Images for 3D Motion Analysis, presented at Proceedings of the 1993 Conference on Computers in Cardiology, London, UK, 1993.
18. Suhling M, Arigovindan M, Jansen C, Hunziker P, et al., Myocardial motion analysis from B-mode echocardiograms, IEEE Transactions on Image Processing 14: 525–536, 2005.
19. Lucas BD, Kanade T, An Iterative Image Registration Technique with an Application to Stereo Vision, presented at International Joint Conference on Artificial Intelligence (IJCAI), 1981.
20. Horn BKP, Robot Vision, MIT Press, Cambridge, 1986.
21. Barron JL, Fleet D, Beauchemin S, Performance of optical flow techniques, Int Journal of Computer Vision 12: 43–77, 1994.
22. Humphrey JD, Cardiovascular Solid Mechanics: Cells, Tissues, and Organs, Springer, New York, USA, 2002.
23. Papademetris X, Sinusas AJ, Dione DP, Duncan JS, Estimation of 3D left ventricular deformation from echocardiography, Medical Image Analysis 8: 285–294, 2004.
24. Azhari H, Sideman S, Weiss JL, Shapiro EP, et al., Three-dimensional mapping of acute ischemic regions using MRI: Wall thickening versus motion analysis, American Journal of Physiology 259: H1492–H1503, 1990.
25. Suhling M, Arigovindan M, Jansen C, Hunziker P, et al., Myocardial motion analysis from B-mode echocardiograms, IEEE Transactions on Image Processing 14: 525–536, 2005.
26. Noble N, Hill D, Breeuwer M, Schnabel J, et al., Myocardial delineation via registration in a polar coordinate system, Acad Radiol 10: 1349–1358, 2003.
27. Duan Q, Angelini ED, Herz SL, Gerard O, et al., Tracking of LV Endocardial Surface on Real-Time Three-Dimensional Ultrasound with Optical Flow, presented at Third International Conference on Functional Imaging and Modeling of the Heart 2005, Barcelona, Spain, 2005.
28. Herz S, Pulerwitz T, Hirata K, Laine A, et al., Novel Technique for Quantitative Wall Motion Analysis Using Real-Time Three-Dimensional Echocardiography, presented at Proceedings of the 15th Annual Scientific Sessions of the American Society of Echocardiography, 2004.
29. Theodoridis S, Koutroumbas K, Pattern Recognition, Academic Press, USA, 1999.
30. Ramm OTV, Smith SW, Real-time volumetric ultrasound imaging system, Journal of Digital Imaging 3: 261–266, 1990.
31. Krenning BJ, Voormolen MM, Roelandt JRTC, Assessment of left ventricular function by three-dimensional echocardiography, Cardiovasc Ultrasound 1(1), 2003.
32. Shin IS, Kelly PA, Lee KF, Tighe DA, Left Ventricular Volume Estimation From Three-Dimensional Echocardiography, presented at Proceedings of
SPIE, Medical Imaging 2004 — Ultrasonic Imaging and Signal Processing, San Diego, CA, United States, 2004.
33. Yu W, Yan P, Sinusas AJ, Thiele K, et al., Towards pointwise motion tracking in echocardiographic image sequences: Comparing the reliability of different features for speckle tracking, Medical Image Analysis 10: 495–508, 2006.
34. Herz S, Ingrassia C, Homma S, Costa K, et al., Parameterization of left ventricular wall motion for detection of regional ischemia, Annals of Biomedical Engineering 33: 912–919, 2005.
35. Duan Q, Angelini ED, Herz SL, Ingrassia CM, et al., Evaluation of Optical Flow Algorithms for Tracking Endocardial Surfaces on Three-Dimensional Ultrasound Data, presented at SPIE International Symposium, Medical Imaging 2005, San Diego, CA, USA, 2005.
36. Ingrassia CM, Herz SL, Costa KD, Holmes JW, Impact of Ischemic Region Size on Regional Wall Motion, presented at Proceedings of the 2003 Annual Fall Meeting of the Biomedical Engineering Society, 2003.
37. Angelini ED, Hamming D, Homma S, Holmes J, et al., Comparison of Segmentation Methods for Analysis of Endocardial Wall Motion with Real-Time Three-Dimensional Ultrasound, presented at Computers in Cardiology, Memphis TN, USA, 2002.
38. Bruhn A, Weickert J, Feddern C, Kohlberger T, et al., Variational optical flow computation in real-time, IEEE Transactions on Image Processing 14: 608–615, 2005.
39. Ruhnau P, Kohlberger T, Schnorr C, Nobach H, Variational optical flow estimation for particle image velocimetry, Experiments in Fluids 38: 21–32, 2005.
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch15 FA
CHAPTER 15
Some Recent Developments in
Reconstruction Algorithms for
Tomographic Imaging
Chien-Min Kao, Emil Y Sidky, Patrick La Rivière
and Xiaochuan Pan
Ionizing-radiation-based imaging techniques play an extremely important
role in noninvasively yielding information about the internal anatomic
structure and functional information within a subject under study. Computed
tomography (CT), positron emission tomography (PET), and single
photon emission computed tomography (SPECT) are the main imaging
modalities based upon ionizing radiation, and they have found applications
in virtually every discipline in science, engineering, biology, chemistry,
and, more notably, medicine. In these imaging techniques, one needs
to develop algorithms for accurately reconstructing the underlying object
function from acquired projection data. In the last decade or so, in parallel
to tremendous tomographic hardware advancement for data acquisition,
there have also been important breakthroughs in the development of innovative
algorithms for reconstructing the underlying object function. In this
chapter, we briefly review some of the recent developments in reconstruction
algorithms for tomographic imaging in CT, PET, and SPECT.
15.1 IMAGE RECONSTRUCTION IN COMPUTED
TOMOGRAPHY
15.1.1 Introduction
X-ray projection imaging is the most common noninvasive scan
employed for probing the interior of a subject, and it found wide
application very quickly after its initial discovery in 1895. For many
purposes, the projection of the subject's X-ray attenuation coefficient
yields important diagnostic information in medical imaging,
or structural and compositional information in industrial imaging.
There are, however, an increasing number of imaging applications
where it is desirable to have full 3D information on the X-ray attenuation
coefficient. Such information can be provided by combining
and processing X-ray projections of a subject taken from multiple
view angles surrounding the subject. In the 1970s, as computer technology
began its rapid ascension, computed tomography (CT) was
developed to address the need for internal 3D structural information.
The early CT scanners obtained 3D images slice by slice by illuminating
the subject with a fan of X-rays, rotating the X-ray source
and detector to obtain complete information to reconstruct the 2D
slice image. By translating the subject, the subsequent slices could
be obtained. The theory of image reconstruction led to the filtered
back projection (FBP) and fan-beam filtered back projection (FFBP)
algorithms, corresponding respectively to parallel- and diverging-ray
illumination. This step-and-shoot process was streamlined by
introducing the helical source trajectory. This trajectory is what is
seen from the subject reference frame as the subject is translated at a constant
rate through a rotating gantry that carries the X-ray source
and detector on a circular trajectory. If the helical pitch, the distance
covered by the subject during a single turn of the gantry, is not too
great, then variations on the 2D FFBP algorithm can be utilized to
obtain accurate reconstruction of the subject's 3D X-ray attenuation
coefficient.
The trend in the technical development of CT scanners is to
include more and more rows on the detector, extending its dimension
along the longitudinal axis of the helical scan (referred to
simply as the longitudinal direction for short). Currently, commercial
scanners employ up to 64 detector rows, and this number will certainly
increase, because more detector rows allow for higher helical
pitches and more rapid coverage of 3D volumes. As the detector
size increases longitudinally, the X-ray source slit is opened up to
illuminate the subject with an X-ray cone beam. This evolution of
CT scanners has increased the sense of urgency for the development
of practical algorithms that can yield accurate image reconstruction
from cone beam CT projection data.
A theory of 3D image reconstruction for cone beam CT has been
known since work by Tuy,1 who derived an inversion formula that
yields a 3D distribution from its cone beam projection at views along
a general class of X-ray source trajectories. The helical trajectory falls
into this class. Although the Tuy formula represents an important
advance in the theory of cone beam CT image reconstruction, it has
two major practical shortcomings. First, direct implementation of
this formula is numerically inefficient. Second, the projection data
cannot be truncated; a complete projection of the subject is needed
from all of the view angles. This is particularly impractical in helical,
cone-beam CT for human subjects, as the detector would have to have
an extent larger than the body's projection at all the sampled views.
During the 1990s and early 2000s much effort was devoted to deriving
a practical image reconstruction algorithm, using a relation between
cone beam projection data and the 3D Radon transform of the object
function derived by Grangeat.2 These algorithms sought to solve the
so-called long object problem, where the cone beam projection data
are truncated only in the longitudinal direction.2 A breakthrough
in image reconstruction theory occurred in 2001 when Katsevich
published an exact formula for image reconstruction directly from
helical, cone-beam projection data.3 This algorithm, though related to
the Tuy formula,4 can support longitudinal truncation of the cone
beam projection data, and requires only 1D filtering of the projection
data, thereby improving numerical efficiency.
The ideas of the Katsevich algorithm, combined with the geometrical
construct of the so-called π-line in helical cone beam scanning,5
led Zou and Pan to develop a new class of cone beam CT image
reconstruction algorithms.6−10 These algorithms obtain the image
in a curvilinear coordinate system that is defined by the chords of
a general source trajectory. In helical cone beam CT, the π-lines can
be interpreted as a special set of chords. The new algorithms are
efficient and create opportunities to design novel data acquisition
configurations that allow for dose reduction and increased scanning
speed. The Zou-Pan image reconstruction formula involves
reversing the usual data processing steps of data filtration followed
by back projection to the image array. These algorithms instead
perform the back projection step first, and are hence called
backprojection-filtration (BPF). The reversal of these operations improves
algorithm efficiency, because the filtration in the image space is less
time consuming than in the data space. More importantly, BPF can
perform exact image reconstruction for projection data that are truncated
both longitudinally and transversely. In the following sections,
we introduce the data model for helical cone beam CT; we then
explain the BPF image reconstruction algorithm; and finally we discuss
the implications for region-of-interest (ROI) imaging.
15.1.2 The Data Model of Helical Cone Beam CT
In helical cone beam CT, the X-ray source travels along a helical trajectory
along with the 2D detector array. The detector shown in the
image is a flat-panel array, while current helical cone beam systems
generally use curved detector arrays. The image reconstruction theory
below is presented in a detector-independent formulation which
can be easily adapted to either detector geometry. The data model for
the helical cone beam system assumes that the line integral of the
X-ray attenuation coefficient for a ray originating from the source
and terminating at a detector bin can be obtained from:
$d_i = -\ln\left[\frac{I_i}{(I_0)_i}\right],$  (1)
where i is a generic index for the rays specified by the combination
of all detector bin and source locations; (I_0)_i is the X-ray intensity, in
number of photons, that would be measured for the ith ray if there
were no subject; I_i is the actual measured intensity; and d_i represents
the line integral of the X-ray attenuation for the ith ray:

$d_i = \int_0^\infty d\ell\, \mu(\vec{s}_i + \ell\hat{\theta}_i).$  (2)
The vector s_i is the X-ray source location and θ̂_i is the unit vector for
the ith ray; µ is the spatially varying X-ray attenuation coefficient,
ignoring energy dependence. The data model is idealized; X-ray
scatter, beam polychromaticity, partial volume averaging, etc.11 are
all neglected here. The set of d_i is interpreted as the measurements
because they can be computed from the raw measurements through
Eq. (1). The aim of the reconstruction algorithm is to find µ(r) given
measurements d_i.
As described by Eq. (2), the measurement set is a large but finite
set of line integrals. The theory of image reconstruction in helical
cone beam CT is formulated in terms of a continuous data function;
thus we rewrite Eq. (2) to reflect this fact, and discuss discretization
after the reconstruction formula is written down in Eq. (3). To
develop the image reconstruction formula, we assume that we can
obtain the continuous data function

$g(\lambda, \hat{\theta}) = \int_0^\infty d\ell\, \mu[\vec{s}(\lambda) + \ell\hat{\theta}],$  (3)

where λ is the continuously varying helical parameter indicating
the source position. The source position is given in Cartesian coordinates
by:

$\vec{s}(\lambda) = \left(R\cos\lambda,\; R\sin\lambda,\; \frac{h}{2\pi}\lambda\right),$  (4)
where R is the helical radius, and h is the pitch length. The coordinate
system is set up so that the axis of the helix is aligned along z.
The detector bin locations are not specified. It is assumed that the
detector captures the attenuation measurements along the necessary
rays originating at s(λ) in the direction θ̂. The sufficient range of λ
and θ̂ is discussed below.
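As a concrete illustration, the measurement transform of Eq. (1) and the helical source trajectory of Eq. (4) can be sketched numerically. This is a minimal sketch; the radius, pitch, and intensity values below are arbitrary illustrative choices, not taken from the text:

```python
import numpy as np

def line_integral_from_intensity(I, I0):
    # Eq. (1): d_i = -ln(I_i / (I0)_i)
    return -np.log(I / I0)

def source_position(lam, R, h):
    # Eq. (4): helical source position with radius R and pitch length h
    return np.array([R * np.cos(lam), R * np.sin(lam), h * lam / (2 * np.pi)])

# with no subject, the measured intensity equals the blank scan, so d = 0
d = line_integral_from_intensity(I=1000.0, I0=1000.0)

# one full gantry turn advances the source by exactly one pitch length in z
s0 = source_position(0.0, R=50.0, h=4.0)
s1 = source_position(2 * np.pi, R=50.0, h=4.0)
```

The helix axis is along z, matching the coordinate convention stated above.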
15.1.3 The BPF Algorithm
The BPF algorithm for image reconstruction in helical cone beam CT
involves decomposing the imaging volume into chords of the source
trajectory. Mathematically, a single chord is described by:

$\vec{r}_c(\lambda_1, \lambda_2, t) = \vec{s}(\lambda_1)(1 - t) + \vec{s}(\lambda_2)\,t; \quad t \in [0, 1].$  (5)
The chord, specified by the helical parameters λ_1 and λ_2, is a line segment
that joins the source positions s(λ_1) and s(λ_2). The parameter
t locates a point on the chord. It has been previously observed that
all points internal to the convex hull of the helix can be uniquely
assigned to a point on a helical chord with the restriction that
|λ_1 − λ_2| < 2π; chords that satisfy this restriction are called π-lines.5
The BPF algorithm obtains the volume image by reconstructing it
chord by chord.
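The chord parameterization of Eq. (5) is straightforward to compute; a small sketch (the helix radius and pitch are arbitrary illustrative values):

```python
import numpy as np

def source_position(lam, R=50.0, h=4.0):
    # Eq. (4): helical source position for parameter lam
    return np.array([R * np.cos(lam), R * np.sin(lam), h * lam / (2 * np.pi)])

def chord_point(lam1, lam2, t):
    # Eq. (5): linear interpolation between the two source positions
    return source_position(lam1) * (1.0 - t) + source_position(lam2) * t

lam1, lam2 = 0.3, 0.3 + np.pi           # |lam1 - lam2| < 2*pi, so this chord is a pi-line
p_start = chord_point(lam1, lam2, 0.0)  # coincides with s(lam1)
p_end = chord_point(lam1, lam2, 1.0)    # coincides with s(lam2)
is_pi_line = abs(lam1 - lam2) < 2 * np.pi
```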
The main steps of the BPF algorithm involve taking a derivative
of the projection data, back projection of the data derivative to the
chord to form an intermediate image function, and finally filtration
of the intermediate image to obtain the actual image function. The
first processing step for the data function follows this equation:

$g_D(\lambda, \hat{\theta}) = \left.\frac{\partial}{\partial p}\, g(p, \hat{\theta})\right|_{p=\lambda}.$  (6)
The next step involves back projecting the data onto the chord:

$f_I(\lambda_1, \lambda_2, t) = \int_{\lambda_1}^{\lambda_2} d\lambda\, \frac{1}{|\vec{s}(\lambda) - \vec{r}_c(\lambda_1, \lambda_2, t)|}\, g_D(\lambda, \hat{\theta}_c), \quad \text{where} \quad \hat{\theta}_c = \frac{\vec{r}_c(\lambda_1, \lambda_2, t) - \vec{s}(\lambda)}{|\vec{r}_c(\lambda_1, \lambda_2, t) - \vec{s}(\lambda)|}.$  (7)
Before continuing on to the last step of the chord image reconstruction,
we note here that the above formula says something about the
projection data sufficient for reconstructing the image on the chord.
The integration over λ for the back projection goes from λ_1 to λ_2, so
projection views for λ ∈ [λ_1, λ_2] are needed to form f_I(λ_1, λ_2, t),
and for each view the rays that intersect the chord need to be
measured.
It turns out that the intermediate chord image f_I(λ_1, λ_2, t) is simply
the Hilbert transform of the desired image function µ_c(λ_1, λ_2, t)
along the chord:

$f_I(\lambda_1, \lambda_2, t) = 2\int_{-\infty}^{\infty} dt'\, \frac{\mu_c(\lambda_1, \lambda_2, t')}{t - t'}, \quad \text{where} \quad \mu_c(\lambda_1, \lambda_2, t) = \mu[\vec{r}_c(\lambda_1, \lambda_2, t)].$  (8)
The Hilbert transform involves an infinite range integration, but it
is known that the object function µ has compact support. It turns
out that the solution µ_c(λ_1, λ_2, t) to the integral equation, Eq. (8), can
be expressed with a finite range integration because of the compact
support property. Assuming that µ_c is compactly supported within
the interval t ∈ [t_a, t_b], we have:
$\mu_c(\lambda_1, \lambda_2, t) = \sqrt{\frac{t_b - t}{t - t_a}} \int_{t_a}^{t_b} dt'\, \frac{1}{t - t'} \sqrt{\frac{t' - t_a}{t_b - t'}}\, f_I(\lambda_1, \lambda_2, t').$  (9)
The fact that the t′ integration only runs from t_a to t_b has further
implications on the data sufficiency conditions. Only the projection
rays that intersect the chord for t ∈ [t_a, t_b] need to be measured, not
the complete π-line. This inverse for the finite Hilbert transform is
actually only one of many possibilities.8,12 This completes the chain
of operations needed to go from the projection data to the image
along a π-line. The only issues that remain are how to obtain volume
images, and what projection data are sufficient for reconstruction.
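The Hilbert relation of Eq. (8) can be checked numerically on the classical analytic pair µ_c(t′) = √(1 − t′²) on [−1, 1], whose transform in the convention of Eq. (8) is 2πt. This is a sketch, not from the text; the midpoint-rule principal-value evaluation relies on placing the singular point t on a cell boundary so that the singular contributions cancel in symmetric pairs:

```python
import numpy as np

def hilbert_forward(mu_vals, nodes, t):
    # Eq. (8): f_I(t) = 2 * PV integral of mu(t') / (t - t') dt'
    # midpoint rule; t must sit on a cell boundary for pairwise cancellation
    h = nodes[1] - nodes[0]
    return 2.0 * np.sum(mu_vals / (t - nodes)) * h

N = 8000
h = 2.0 / N
nodes = -1.0 + (np.arange(N) + 0.5) * h   # cell midpoints covering [-1, 1]
mu_vals = np.sqrt(1.0 - nodes**2)         # compactly supported semicircle profile
t = 0.25                                  # a cell boundary of the grid above
f_I = hilbert_forward(mu_vals, nodes, t)  # should approximate 2*pi*t
```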
15.1.4 The Long Object Problem and ROI Reconstruction
The theory of π-line image reconstruction, above, tells how to obtain
the reconstructed image on the trajectory chords. The end goal, however,
is volume reconstruction. This section clarifies the connection
between the two cases, and along the way discusses scanning data
requirements for various scanning tasks.
For diagnostic helical cone-beam CT the most important task
that the image reconstruction can fulfill is to provide numerically
exact images efficiently from projection data that are longitudinally
truncated. The BPF algorithm does this. As the BPF algorithm
itself provides the image on individual π-lines, the volume must be
parameterized first in the curvilinear system specified by the independent
variables λ_1, λ_2, and t. The variables λ_1 and λ_2 specify a
π-line, and t yields a specific point on that chord. We illustrate now
how the volume coverage works in this coordinate system. First,
one can fix one end of the chord, say λ_1 = λ_A, then sweep λ_2 in
the range [λ_A, λ_A + 2π]. Such a set of π-lines defines a π-surface
whose geometry depends only on λ_A. To obtain the volume, λ_A is
swept through an interval [λ_start, λ_end]. The data sufficiency condition
for such a volume scan is easy to derive from geometric considerations
of the individual π-lines: it is λ ∈ [λ_start, λ_end + 2π]. The
required projection data on the detector from each view, however, are
less obvious but not difficult to derive. From Sec. 15.1.3, rays passing
through the π-line defined by λ_1 and λ_2 must be detected. It turns
out that the area on the detector that should be measured is specified
by the so-called Tam-Danielsson (TD) window.5,13 This window
represents the shadow of all π-lines to which the current view angle
contributes. Geometrically, the boundaries of the TD window are
defined by the shadows of the helical scanning trajectory on the
detector within 2π of the current scanning angle λ. Note that this
geometrical definition can be applied to any detector geometry as
long as the TD window fits within the detector. In practice, even the
TD window is only an upper limit on the required detector area. If it is known that
the subject support is confined well within the convex hull of the helical
scan, then the required detector area can be reduced further. In
either case, the BPF reconstruction allows the utilization of projection
data that are longitudinally truncated, thus solving the long object
problem.
Because the BPF theory reconstructs a volume image chord-by-chord,
substantial reduction of scanning effort, even over long object
scanning, is possible when the image is desired only within a certain
ROI. Given the ROI, one needs only identify the π-lines that intersect
with the ROI and then reconstruct them. The scanning range is
found by examining the volume parameterization in terms of λ_1 and
λ_2. For example, a spherical volume can be reparameterized in terms
of λ_1, λ_2, and t, then subsequently projected down to the λ_1-λ_2 plane
(by integrating over t). Each point within the area represents a single
π-line that should be reconstructed. The actual volume that is
reconstructed is the union of the support segments of all of these
π-lines, which in general will be larger than the desired ROI. True
ROI reconstruction (known as the interior problem) is theoretically
not possible. As with long object scanning, the necessary projection
data are identified by the shadow of the support segments of
each π-line on the detector. While long object scanning serves the
bulk of diagnostic helical cone-beam CT, ROI scanning may prove
useful for specific protocols in image-guided radiation therapy, CT
breast screening, or cardiac imaging.
15.2 IMAGE RECONSTRUCTION IN POSITRON
EMISSION TOMOGRAPHY
15.2.1 Introduction
Positron emission tomography (PET) is a unique functional imaging
modality that is capable of producing quantitative in vivo assays
of a large variety of molecular pathways of biological systems. PET
has been routinely used in cancer diagnosis and evaluation.14 It is
also widely used in neurology15 and cardiology,16 and is promising
for providing effective treatment outcome evaluations.17 Recently,
there has been substantial interest in developing dedicated PET systems
for imaging small animals (such systems are referred to as
microPET systems below).18 In combination with the use of animal
models of human biology and diseases, microPET systems are powerful
tools in preclinical research.19 MicroPET imaging of gene transfer,
expression, and therapy has been successfully demonstrated,20
and there are high expectations that microPET systems will play
important roles in discovering new biology, as well as in drug and
treatment developments.21 In comparison with human PET imaging,
microPET imaging demands much higher imaging performance
characteristics,18 making microPET system development a useful
test bed for innovative PET designs and technologies. Because both
animal and human PET systems are available, PET imaging is also
a useful translational research tool. Finally, in recent years there has
also been greatly renewed enthusiasm for time-of-flight (TOF) PET
imaging due to its ability to produce improved image quality and
the availability of fast and dense scintillators adequate for implementing
TOF-PET systems.22
The imaging performance of a PET system depends critically
on both its instrumentation and reconstruction.23 Many discoveries
and innovations in PET instrumentation have taken place
in recent years.18,23 These include new scintillators and detector
designs that enable substantial improvement of the spatial resolution
and timing accuracy in cost-effective manners. Currently,
PET imaging can reach a spatial resolution of about, or better than,
1 mm. On the other hand, there exist efforts in achieving exceptionally
high sensitivity for microPET systems.24 Parallel to advances
in instrumentation, there are substantial advances in PET image
reconstruction as well.25 In practice, as we will discuss below, PET
data are significantly degraded, making the imaging model considerably
different from the ideal. Major degradations in PET imaging
include data noise, effects of finite detector size, and the presence
of unwanted radiation (scatter and random coincidences).18,23
These degradations need to be addressed in reconstruction in order
to produce high-quality PET images. As the application domain of
PET imaging enlarges, higher demands in all performance aspects
of PET imaging can be expected. These demands will require
many more advances in PET instrumentation and reconstruction.
Excellent review articles on PET instrumentation and reconstruction
can be found in Refs. 18, 23 and 25. Below, we will discuss
issues and challenges facing PET image reconstruction and describe
approaches for addressing them.
15.2.2 Imaging Model
PET imaging is based on the principles of annihilation coincidence
detection and tracer kinetic modeling.23 PET tracers are molecules
radioactively labeled with positron-emitting isotopes, which include
F-18, C-11, N-13, and O-15. Positrons emitted by PET tracers will
annihilate with electrons in their surroundings and give rise to a
pair of 511 keV photons traveling in opposite directions. Typically,
rings of gamma ray detectors are placed around the subject being
imaged. A simultaneous detection of two 511 keV photons by the
detector rings, called a coincidence detection, registers an annihilation
event. Generally, one can define the response function h_i(x) to
represent the probability for a positron emission occurring at x to
be detected by the ith detector pair of a PET scanner. The response
function h_i(x) includes factors such as the (geometric) detection efficiency
of the ith detector pair to the position x and the attenuation
that the annihilation photons are subject to before exiting the subject.
Because the annihilation photons travel in opposite directions,
for small detectors we have, to a good approximation, h_i(x) = ε_i a_i
for x ∈ L_i and h_i(x) = 0 for x ∉ L_i, where L_i denotes the line that
connects the centers of the front faces of the two detectors of the ith
detector pair, and ε_i and a_i are the detection efficiency and subject
attenuation on L_i, respectively. In the literature, L_i is called the line of
response (LOR). The number of coincidence events collected at
the ith detector pair, denoted by g_i, is then related to the image function
f(x), i.e. the spatial density distribution of the positron decays
taking place during the imaging time, by
$g_i = \varepsilon_i a_i \int_{L_i} dl\, f(\vec{x}),$  (10)
where $\int_{L_i} dl$ denotes the line integral along the LOR L_i. Consequently,
under suitable conditions PET measurements are related to a collection
of line integrals of the image function, i.e. to certain samplings
of the Radon transform of the image function. Provided that the
resulting samplings are adequate, according to the theory of the Radon
transform, the image function can be recovered from the acquired
PET measurements, up to the spatial resolution limit supported by
the samplings.
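Equation (10) can be illustrated with a toy discrete LOR: sampling the image function along the line L_i and scaling by the detection efficiency and attenuation factors. All numbers here are arbitrary illustrative choices:

```python
import numpy as np

def lor_measurement(f_along_line, dl, eff, atten):
    # Eq. (10): g_i = eps_i * a_i * (line integral of f along the LOR L_i)
    return eff * atten * np.sum(f_along_line) * dl

# uniform activity of 2.0 along a 10-cm line segment, sampled every 0.1 cm
dl = 0.1
f_samples = np.full(100, 2.0)
g = lor_measurement(f_samples, dl, eff=0.9, atten=0.5)
# expected: 0.9 * 0.5 * (2.0 * 10.0) = 9.0
```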
The above description provides the basic principle underlying
PET imaging. This description is greatly simplified; it omits many
physical factors involved in the imaging process, including the
positron range, photon noncolinearity, the presence of scattered and
random coincidences, and the effects of finite detector size. Positron
range is the finite distance between the location where a positron
is emitted and where the annihilation takes place. Therefore, rigorously
speaking, the image function f(x) refers to the density function
of the positron annihilation, rather than that of the positron emitter
itself, unless the positron range is negligibly small. Depending
on the positron-emitting isotope employed, the positron range can
vary from 0.83 mm to 8.54 mm.18 Photon noncolinearity refers to the
fact that the directions of the two annihilation photons can slightly
deviate from the ideal 180°. The full-width-at-half-maximum of
this angular deviation is small but finite (about 0.5°). The departure
of the annihilation position from the detected LOR due to photon
noncolinearity increases as the size of the detector ring increases.
Scattered events are registered coincidence events in which at least
one annihilation photon undergoes scattering before detection. There
are also random events, which are chance coincidence detections of two
photons originating from two independent positron annihilations.
These event types can significantly contaminate PET measurements
in 3D whole-body PET imaging and in applications that employ high
tracer concentrations.23 A pair of detectors is sensitive to all annihilation
events taking place inside the common volume seen by the
detectors. In the above simplified imaging model, we have assumed
sufficiently small detectors such that the sensitive volume reduces to
the LOR. In practice, this is often a poor assumption. Furthermore,
due to gamma-ray penetration the sensitive volume can be much
larger than that suggested by the dimension of the detector front
face, leading to the phenomenon of parallax errors.23 The sensitivity
of a detector pair to points within the sensitive volume of the detector
pair can also vary considerably. In addition to the above-described
physical factors, radioactive decay and photon detection are random
processes, giving rise to statistical variations in the number of
detected events when given the same image function and imaging
conditions. To include these physical and statistical factors in PET
imaging, one can write:
$\bar{g}_i = E\{g_i\} = \int d^3x\, h_i(\vec{x})\, f(\vec{x}) + s_i + r_i, \quad i = 1, \ldots, N,$  (11)
where s_i and r_i are the expectations of the numbers of scattered and
random events accumulated at the ith detector pair during the
imaging time, and E{g_i} denotes the ensemble mean of g_i. In PET,
the g_i's are independent Poisson variates. Therefore, the conditional probability
distribution of the measurement g = [g_1, …, g_N]^t, given the
image function f(x), is equal to:
$p(\vec{g} \mid f) = \prod_{i=1}^{N} e^{-\bar{g}_i}\, \bar{g}_i^{\,g_i} / g_i!\,.$  (12)
It is well known that Var{g_i} = ḡ_i. Therefore, PET data
noise is generally not stationary (i.e. it varies with the measurements), with the
relative standard deviation of the noise with respect to its mean
decreasing with the number of detected events. From Eq. (12), the
log-likelihood function for the measurement is given by:
$l(f \mid \vec{g}) = \log p(\vec{g} \mid f) = -\sum_{i=1}^{N} \bar{g}_i + \sum_{i=1}^{N} g_i \log \bar{g}_i + \text{constant}.$  (13)
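The log-likelihood of Eq. (13) is easy to evaluate, and it is maximized when the model means ḡ_i match the data. A minimal sketch with made-up counts:

```python
import numpy as np

def pet_loglik(g, gbar):
    # Eq. (13): l = -sum(gbar_i) + sum(g_i * log(gbar_i)), dropping the constant
    return -np.sum(gbar) + np.sum(g * np.log(gbar))

g = np.array([5.0, 12.0, 3.0, 40.0])   # made-up coincidence counts
# the likelihood peaks when the model mean equals the measured counts
l_match = pet_loglik(g, g)
l_over = pet_loglik(g, 1.2 * g)
l_under = pet_loglik(g, 0.8 * g)
```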
15.2.3 Image Reconstruction
Under idealized conditions, PET measurements are related to certain
samplings of the Radon transform of the underlying image
function. After correcting for the detection sensitivity and subject
attenuation, analytic algorithms developed for inverting the Radon
transform, such as the celebrated filtered backprojection (FBP) algorithm,
can be employed for reconstructing the unknown image
function from PET measurements. Methods that can compensate for
certain deviations from the Radon transform, such as the positron
range and stationary detector response, have also been proposed.26
Analytic PET reconstruction methods, however, have two major
shortcomings. First, the tomographic reconstruction process is ill-conditioned,
such that small data noise can give rise to large errors
in the solution image. Unfortunately, PET data generated in typical
studies are quite noisy; therefore, achieving effective control of
the negative effects of data noise is a concern of special importance
in PET reconstruction. Noise reduction in analytic reconstruction
methods is typically achieved by employing ad hoc lowpass filters.
By assuming stationary data noise (which is incorrect), Wiener filters
for reducing noise have also been developed and investigated.27
Generally speaking, analytic methods lack proper mechanisms for
implementing optimized handling of the nonstationary data noise
encountered in PET imaging. Second, analytic reconstruction algorithms
are based on simplified imaging models that do not take into
account most physical factors present in PET imaging. Therefore, it is
necessary to apply prereconstruction corrections so that the imaging
model for the corrected data can approximate the assumed models.
Physical factors that are uncorrected, or only partially corrected,
in the preprocessing can lead to image degradations, such
as image blur and spatially varying resolution, or even cause image
artifacts. Accurate prereconstruction data corrections are often difficult
to achieve. Furthermore, such prereconstruction corrections will
deteriorate the statistical nature of the acquired data and aggravate
the aforementioned concern regarding the inferior noise-handling
capability of the analytic reconstruction methods.
Model-based approaches that can fully account for the physical
and statistical models of the PET imaging process are necessary for
achieving the best image reconstructions. Iterative reconstruction methods
are such model-based techniques. For the purpose of computation,
the image function needs to be discretized:

$f(\vec{x}) = \sum_{j=1}^{M} f_j\, b_j(\vec{x}),$  (14)

where b_j(x), j = 1, …, M, is an expansion set. The continuous image
model given by Eq. (11) then becomes:

$E\{\vec{g}\} = H\vec{f} + \vec{s} + \vec{r},$  (15)

where $\vec{f} = [f_1, \ldots, f_M]^t$, $\vec{s} = [s_1, \ldots, s_N]^t$, $\vec{r} = [r_1, \ldots, r_N]^t$, and H
is an N × M system response matrix having the elements $H_{ij} = \int d^3x\, h_i(\vec{x})\, b_j(\vec{x})$. The probability model for g is still given by Eq. (12).
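For a toy 1D problem, the discrete model of Eq. (15) is just a matrix-vector product plus the scatter and randoms terms. All entries below are made up for illustration:

```python
import numpy as np

# a 3-LOR, 2-voxel system response matrix H (made-up geometric weights)
H = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
f = np.array([10.0, 20.0])     # voxel activities
s = np.array([1.0, 1.0, 1.0])  # expected scattered events per LOR
r = np.array([0.5, 0.5, 0.5])  # expected random events per LOR

g_mean = H @ f + s + r         # Eq. (15): E{g} = H f + s + r
```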
In the literature, the voxel representation, in which the image is
assumed to consist of a lattice of cubic elements containing uniform
radioactivity within, is widely adopted for image discretization.
Other discrete image representations have also been proposed
for PET image reconstruction. It is also common for researchers to
consider the following simplified PET imaging model:

$E\{\vec{y}\} = H\vec{f}.$  (16)

In this case, one either removes scattered and random events from
the data by prereconstruction corrections, or simply ignores such
events. In addition, the Poisson model is often assumed for y even
though the model is no longer valid after data corrections. Many
iterative methods for solving the discrete PET imaging models given
by Eqs. (15) and (16) have been developed. These methods differ in
the cost functions they employ for finding solutions to these models.
They also differ substantially in their quantitative performance
characteristics (i.e. the tradeoff behavior between reconstruction
accuracy and noise sensitivity), convergence behavior, and
computational complexity.
Iterative PET reconstruction methods include the algebraic reconstruction techniques (ART),^{28} projection-onto-convex-sets (POCS) techniques,^{29} penalized weighted least-squares (PWLS) methods,^{30} maximum likelihood-based (ML) approaches,^{31} and maximum a posteriori (MAP) approaches.^{32,33} In the ART methods, one observes that the solution to Eq. (16) can be interpreted as:

f ∈ ∩_{i=1}^{N} A_i,  A_i = {x : h_i^t x = y_i},    (17)
where h_i = [H_{i1}, ..., H_{iM}]^t. The projection operator P_i that maps an arbitrary vector x to the closest point on the hyperplane A_i is given by:

P_i x = x + (y_i − h_i^t x) h_i / ‖h_i‖².    (18)
The ART algorithm seeks to sequentially enforce the hyperplane constraints until convergence is reached, yielding the following algorithm: given an initial estimate f^{(0)}, the nth estimate is given by:

f^{(n)} = P_N ··· P_1 f^{(n−1)}.    (19)

The order of the projections is arbitrary. The resulting algorithm is fast in terms of both convergence and the computation time needed per iteration, but it lacks the ability to explicitly incorporate mechanisms
Chien-Min Kao et al.
for handling data noise, and it often fails to converge when subject to inconsistent data. The POCS techniques are generalizations of the ART methods in which the solution image is given by:

f ∈ ∩_{i=1}^{N′} Ã_i,  N′ > N,    (20)

where, in addition to the N hyperplane constraints given by the measurements, the Ã_i can include any convex set. Letting P̃_i denote the projection operator associated with the convex constraint set Ã_i, the POCS update equation is given by:

f^{(n)} = P̃_{N′} ··· P̃_1 f^{(n−1)}.    (21)

Therefore, certain information regarding the data noise and the solution image, in the form of convex constraints, can be specified to alleviate the negative effects of data noise. Convergence can still be an issue in POCS methods.
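The sequential projections of Eqs. (18)-(21) can be sketched on a small hypothetical system; the matrix, data, and sweep count below are invented for the illustration and are not taken from the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy system: an N x M matrix H and noise-free, consistent data
# y = H f_true, so the N hyperplanes of Eq. (17) share the single point f_true.
N, M = 8, 5
H = rng.normal(size=(N, M))
f_true = rng.uniform(0.5, 2.0, size=M)
y = H @ f_true

def art_sweep(f, H, y):
    """One ART pass: project sequentially onto each hyperplane h_i^t x = y_i (Eq. 18)."""
    for i in range(H.shape[0]):
        h = H[i]
        f = f + (y[i] - h @ f) * h / (h @ h)
    return f

def pocs_sweep(f, H, y):
    """One POCS pass (Eqs. 20-21): the N hyperplane projections followed by the
    projection onto one extra convex set, here the nonnegativity cone."""
    return np.maximum(art_sweep(f, H, y), 0.0)

f = np.zeros(M)
for _ in range(500):          # Eq. (19): f^(n) = P_N ... P_1 f^(n-1)
    f = pocs_sweep(f, H, y)
```

For consistent data the iterates approach the intersection point; with inconsistent (noisy) data the plain ART sweep cycles instead of converging, which is the failure mode noted above.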
In contrast to ART and POCS methods, the ML, MAP, and PWLS methods are statistical methods that explicitly employ the probability distributions of the data in the reconstruction. In the ML methods, the solutions maximize the log-likelihood function given by Eq. (13), and they are often generated by using the expectation-maximization (EM) algorithm:

f_j^{(n)} = [f_j^{(n−1)} / ∑_i H_{ij}] ∑_i H_{ij} g_i / (∑_k H_{ik} f_k^{(n−1)}),   j = 1, ..., M.    (22)
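In matrix-vector form, one EM iteration of Eq. (22) is a ratio backprojection followed by a sensitivity normalization. A minimal sketch on a hypothetical toy system (all sizes and values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonnegative system with noise-free mean data g = H f_true.
N, M = 12, 6
H = rng.uniform(0.2, 1.0, size=(N, M))
f_true = rng.uniform(1.0, 5.0, size=M)
g = H @ f_true

def em_update(f, H, g):
    """One MLEM iteration, Eq. (22):
    f_j <- f_j / (sum_i H_ij) * sum_i H_ij g_i / (H f)_i."""
    sens = H.sum(axis=0)                 # sensitivity, sum_i H_ij
    return f / sens * (H.T @ (g / (H @ f)))

f = np.ones(M)                           # positive initial estimate f^(0)
for _ in range(2000):
    f = em_update(f, H, g)               # stays nonnegative automatically
```

Each update multiplies the current estimate by a nonnegative factor, which is how the voxel positivity condition is enforced for free.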
When the measurements are strictly independent Poisson variates, the EM algorithm is guaranteed to converge to the ML solution given a positive initial estimate f^{(0)}.^{31} The EM algorithm has a relatively simple update equation, offers favorable quantitative performance characteristics, and automatically enforces the voxel positivity condition. In the MAP approach (also called the Bayesian approach), one seeks to maximize the a posteriori distribution p(f|g) or, equivalently, log p(f|g). According to Bayes' theorem, we have:

log p(f|g) = log[ p(g|f) p(f) / p(g) ] = l(f|g) + log p(f) + constant.    (23)
The a priori information p(f) imposes smoothness conditions on the solution,^{32} or introduces structural information derived from associated anatomical images.^{33} In the literature, MAP methods are also called penalized maximum-likelihood methods because the prior term penalizes the log-likelihood function in Eq. (23). Many iterative algorithms for generating the MAP estimates have been proposed, including EM-like algorithms. Both ML and MAP methods require exact knowledge of the probability distributions of the data. In many practical situations (such as with precorrected data), such exact probability distributions are not available. Approximate distributions for randoms-corrected data have been proposed, including the shifted Poisson model and its variations, and the saddle-point approximation.^{34} Although exact distributions are difficult to obtain, the second-order statistics of the corrected data can be readily derived. It is therefore attractive to employ PWLS methods that seek to minimize the cost function:^{30}
Φ(f) = (1/2)(y − Hf)^t W(y − Hf) + β(f) G(f),    (24)

where the weighting matrix W is the inverse of an estimate of the conditional variance of y, G(f) imposes penalties on image roughness, and β(f) provides a mechanism for preserving edge structures in the image.
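A minimal numeric sketch of a PWLS-type cost, using a scalar β and a quadratic first-difference roughness penalty in place of the more general β(f) and G(f) of Eq. (24), minimized here by plain gradient descent; all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical small system with noisy data y.
N, M = 10, 6
H = rng.uniform(0.1, 1.0, size=(N, M))
f_true = rng.uniform(1.0, 4.0, size=M)
y = H @ f_true + rng.normal(0.0, 0.05, size=N)

W = np.diag(1.0 / np.maximum(y, 1e-3))  # inverse of an estimate of var(y_i)
D = np.diff(np.eye(M), axis=0)          # first-difference operator (roughness)
beta = 0.1                              # scalar penalty weight (illustrative)

def cost(f):
    r = y - H @ f
    return 0.5 * r @ W @ r + 0.5 * beta * np.sum((D @ f) ** 2)

def grad(f):
    return -H.T @ W @ (y - H @ f) + beta * (D.T @ (D @ f))

f = np.ones(M)
step = 0.05                             # small enough for stable descent here
for _ in range(5000):
    f = f - step * grad(f)
```

The weighting by W down-weights the noisiest measurements, which is what gives PWLS its attractive second-order-statistics behavior.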
The EM algorithm is quite attractive for PET image reconstruction; the main drawback that limits its practical usefulness is its slow convergence rate. An important variation of the EM algorithm is the ordered-subsets EM (OSEM) algorithm, which has been widely adopted as the de facto standard for practical applications.^{35} In this algorithm, the data are divided into a number of disjoint subsets and the EM algorithm is sequentially applied to these subsets to constitute one iteration. This simple modification has been observed to
remarkably increase the algorithm's convergence rate and, empirically, faster convergence is achieved with the use of more subsets. Unfortunately, under certain conditions the OSEM algorithm may not converge (similar to the situation with ART and POCS). Modifications to ensure the convergence of the OSEM algorithm have been developed and investigated. Important examples are the row-action maximum-likelihood algorithm (RAMLA)^{36} and the convergent OSEM (COSEM) algorithm.^{37}
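The subset structure is a small change to the MLEM sketch above; everything below (sizes, subset count, data) is a hypothetical toy, not a real scanner geometry:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical toy system; the N rows (LORs) are split into disjoint subsets,
# and one EM update per subset makes up one OSEM iteration.
N, M, n_subsets = 12, 6, 4
H = rng.uniform(0.2, 1.0, size=(N, M))
f_true = rng.uniform(1.0, 5.0, size=M)
g = H @ f_true

subsets = [np.arange(s, N, n_subsets) for s in range(n_subsets)]  # disjoint, cover all rows

def osem_iteration(f):
    for idx in subsets:
        Hs, gs = H[idx], g[idx]
        f = f / Hs.sum(axis=0) * (Hs.T @ (gs / (Hs @ f)))  # subset EM update
    return f

f = np.ones(M)
for _ in range(100):
    f = osem_iteration(f)

rel_err = float(np.linalg.norm(H @ f - g) / np.linalg.norm(g))
```

Empirically the speed-up is roughly the number of subsets but, as noted above, convergence is no longer guaranteed in general.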
It is noted that, in practice, the properties of a statistical reconstruction method depend not only on the cost function it aims to optimize but also on the specific updating equation it employs. This is because the algorithm is often terminated before reaching convergence, and in some situations multiple solutions can exist. We also note that, at convergence, the EM algorithm is known to minimize the Kullback-Leibler distance between the acquired data and the predicted data based on the estimated solution. This interpretation of the EM algorithm is valid irrespective of the specific data noise model.
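This Kullback-Leibler interpretation can be checked numerically on a toy system: each EM pass lowers KL(g, Hf) even when g is not consistent with any Hf. The sizes and values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

N, M = 10, 5
H = rng.uniform(0.2, 1.0, size=(N, M))
g = rng.uniform(5.0, 20.0, size=N)     # arbitrary, generally inconsistent data

def kl(g, q):
    """KL distance sum_i [g_i log(g_i/q_i) - g_i + q_i]; >= 0, zero iff g = q."""
    return float(np.sum(g * np.log(g / q) - g + q))

f = np.ones(M)
kl_trace = [kl(g, H @ f)]
for _ in range(50):
    f = f / H.sum(axis=0) * (H.T @ (g / (H @ f)))  # EM update of Eq. (22)
    kl_trace.append(kl(g, H @ f))
# kl_trace is monotonically non-increasing, whatever the noise model of g
```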
15.2.4 Three-Dimensional Imaging, Dynamic Imaging, and List-Mode Reconstruction
Most modern PET systems are fully 3D systems, with the so-called 3DRP algorithm of Kinahan^{38} being widely offered for performing analytical 3D PET image reconstruction. Alternative rebinning approaches that convert a fully 3D PET dataset to a collection of 2D datasets associated with individual transaxial image slices have also been developed. The conversion process can be either approximate^{39} or mathematically exact.^{40} Hybrid iterative reconstruction methods that first analytically rebin 3D PET datasets to 2D datasets and then employ 2D iterative reconstruction to achieve slice-by-slice reconstruction have also been developed and investigated. In such hybrid approaches, the system response matrices for, and the probability distributions of, the rebinned data need to be determined. Generally, as expected, direct 3D iterative reconstruction produces the best solutions. Hybrid approaches, nonetheless, greatly alleviate
the tremendous computational demands of fully 3D iterative reconstruction and provide attractive tradeoffs between image accuracy and computational burden.
To derive certain local biochemical or physiological parameters within individual voxels, dynamic PET imaging is often performed. Conventionally, dynamic PET data are stored as a temporal sequence of static PET datasets. The data acquired at each time point, called a frame, are separately reconstructed by using the analytic or iterative reconstruction methods described above to generate a temporal sequence of PET images. Appropriate kinetic models are then employed to account for the observed temporal variations of the PET tracer within each voxel and to derive the relevant parametric images. In this conventional approach, the spatial and temporal information available in the dynamic PET data are treated independently, although they are not uncorrelated. In Ref. 41, Kao et al. made the observation that the temporal information available in the dynamic data can be exploited to greatly reduce the data noise associated with each frame and hence significantly improve the resulting image quality. With better frame images, more accurate kinetic parameters for the PET tracer are also obtained. Reconstruction approaches that generate parametric images directly from the dynamic PET data have also been reported.^{42}
So far, we have discussed the histogram data format, in which the accumulated event counts at the individual detector pairs (i.e. at the individual LORs) of a PET scanner are stored. In contrast, the list-mode data format stores a stream of individual event records in the chronological order of event detection. The list-mode format is more versatile than the histogram format. In principle, as much information as desired regarding the detected events can be stored in the event records, thereby permitting maximal utilization of the detected event information for achieving optimized image reconstruction. Obviously, list-mode datasets grow linearly in size with the imaging time, while histogram datasets have a fixed size determined by the number of LORs of a PET scanner. As the number of LORs in a modern high-resolution PET system has drastically increased, the list-mode data format has gained popularity because its advantages are starting to outweigh the storage disadvantage. Iterative algorithms based on the ML and MAP criteria for reconstructing list-mode PET data have been developed.^{43,44} Methods for jointly estimating the image functions and the temporal basis functions underlying the tracer kinetics from list-mode data have also been investigated.^{45} The combination of list-mode data and physiological gating information also provides excellent mechanisms for performing cardiac and respiratory motion corrections. Readers are referred to Ref. 25 for a more detailed discussion of list-mode PET image reconstruction.
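A hypothetical sketch of list-mode ML-EM: the data arrive as a chronological stream of event records (here just LOR indices), and the EM backprojection sums over events instead of over histogram bins. Binning the same stream and running ordinary MLEM gives the identical update, which is the sense in which the two formats carry the same count information:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy scanner: N LORs, M voxels; events are drawn from the forward model.
N, M = 8, 4
H = rng.uniform(0.2, 1.0, size=(N, M))
f_true = rng.uniform(1.0, 3.0, size=M)
p = H @ f_true
events = rng.choice(N, size=5000, p=p / p.sum())  # chronological stream of LOR indices

sens = H.sum(axis=0)              # sensitivity uses all LORs, not only those hit
f = np.ones(M)
for _ in range(50):
    fwd = H @ f                   # predicted mean per LOR
    back = np.zeros(M)
    for i in events:              # event-by-event backprojection
        back += H[i] / fwd[i]
    f = f / sens * back
```

Real list-mode records would also carry timing, energy, or depth-of-interaction information per event, which the histogram format cannot retain without enlarging the bin space.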
15.3 IMAGE RECONSTRUCTION IN SINGLE-PHOTON EMISSION COMPUTED TOMOGRAPHY

15.3.1 Introduction
In single-photon emission computed tomography (SPECT), a radiopharmaceutical is injected into a patient with the expectation that it will track some functional or physiological process of interest. At any given time, one seeks to know the 3D distribution of the tracer. This can be achieved by employing one or more scintillation cameras placed outside the patient, each of which records the 2D distribution of emitted photons incident on it.
In order to form a projection image that represents a known mapping from the 3D distribution of activity to the 2D projection, the camera is generally equipped with a lead collimator that restricts the angular range of photons that reach the face of the camera. In this section, we focus on the case in which a so-called parallel-hole collimator is employed. Such collimators attempt to restrict attention to photons that are travelling normal to the face of the camera, although in practice they admit photons incident from a range of angles centered around zero degrees. This acceptance cone leads to depth-dependent resolution, and it is important to model and account for this effect in order to obtain more accurate reconstructed images of the activity distribution.
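The depth-dependent blur can be illustrated with a simple numeric model in which a Gaussian kernel's width grows linearly with the distance from the collimator face; the values of sigma0 and the slope below are made-up illustrative numbers, not real collimator parameters:

```python
import numpy as np

def gaussian_kernel(sigma, half=10):
    x = np.arange(-half, half + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()                     # normalized: blurring preserves counts

def project_with_depth_blur(activity, sigma0=0.6, slope=0.15):
    """activity[eta, xi]: one transaxial plane, with eta the distance from the
    collimator.  Each depth row gets its own blur before the rows are summed."""
    proj = np.zeros(activity.shape[1])
    for eta, row in enumerate(activity):
        kernel = gaussian_kernel(sigma0 + slope * eta)   # width grows with depth
        proj += np.convolve(row, kernel, mode="same")
    return proj

shallow = np.zeros((40, 64))
shallow[5, 32] = 1.0                       # point source near the collimator
deep = np.zeros((40, 64))
deep[35, 32] = 1.0                         # the same source, deeper in the patient
p_shallow = project_with_depth_blur(shallow)
p_deep = project_with_depth_blur(deep)
# p_deep is broader and lower-peaked than p_shallow: depth-dependent resolution
```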
Another physical effect that must be accounted for is attenuation of the photons as they travel through the patient from the point at which they are emitted. SPECT is often performed with photons around 140 keV, for which the attenuation coefficient in soft tissue is approximately 0.15 cm^{−1}. In some areas of the body, such as the abdomen, it is reasonable to assume that attenuation is uniform. In other regions, such as the thorax, where the lungs, soft tissue, and bone all present significantly different attenuation coefficients, a more general model of nonuniform attenuation helps improve quantitative accuracy.
To obtain sufficient data to invert the mapping from the 3D activity distribution to the 2D projections, the camera or cameras must be rotated around the patient to a variety of angles. If we represent the coordinates of the camera face by ξ and z and the angular position of the camera by θ, then the mean of the set of measurements, which we denote p(ξ, z, θ), can be related to the 3D activity distribution by the following very general equation, adapted from Liang (PMB 1997), which includes the effects of nonuniform attenuation and depth-dependent resolution:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη ∫_{−∞}^{∞} ∫_{−∞}^{∞} dξ′ dz′ h(ξ − ξ′, z − z′, η) a_θ(ξ′, z′, η)
    × exp[ −∫_{L_θ(ξ′,z′,η;ξ,z)} μ_θ(ξ″, z″, η″) dl ].    (25)
Here, a_θ(ξ, z, η) represents the activity distribution a(x, y, z) in a coordinate system rotated by θ about the z axis:

ξ = x cos θ + y sin θ,
η = x sin θ − y cos θ,    (26)

and μ_θ(ξ, z, η) represents the attenuation map μ(x, y, z) in the same rotated coordinate system. The detector response kernel is represented by h(ξ, z, η); it models blurring that is depth-dependent but shift-invariant at a specified depth. The attenuation term is written as a line integral through the attenuation map along the line L_θ(ξ′, z′, η; ξ, z) that connects the point (ξ′, z′, η) to the detector bin (ξ, z) at angle θ. This very general form accounts for the fact that
photons travelling from different portions of the field of view of a given bin at a given depth could experience different amounts of attenuation, because of the different paths they travel along toward the detector.
Fortunately, it is very reasonable to simplify Eq. (25) by assuming that the attenuation experienced by the photons traveling along any of the lines contributing to a given projection bin is the same, and can be represented by the attenuation that takes place along the central ray of the bundle. We then obtain:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη ∫_{−∞}^{∞} ∫_{−∞}^{∞} dξ′ dz′ h(ξ − ξ′, z − z′, η) a_θ(ξ′, z′, η)
    × exp[ −∫_{−∞}^{η} μ_θ(ξ′, z′, η′) dη′ ].    (27)
We will take Eq. (27) as our fundamental imaging equation and consider the approaches that have been developed to invert it under a variety of special cases.
15.3.2 No Attenuation, No Depth-Dependent Resolution

The simplest possible case arises when one ignores the effects of both attenuation and depth-dependent resolution. Then we obtain:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη a_θ(ξ, z, η),    (28)

which is a stack of two-dimensional Radon transforms. This can be inverted by use of a number of standard reconstruction algorithms, including filtered backprojection (FBP) and direct Fourier methods.
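Equation (28) can be exercised numerically: sampling the activity in the rotated frame of Eq. (26) and summing over η yields one view per angle, and every view carries the same total activity regardless of θ. The anisotropic Gaussian below is an arbitrary test object chosen for the illustration:

```python
import numpy as np

def projection(theta, n=201, h=0.25):
    """Evaluate a_theta on a (xi, eta) grid via Eq. (26) and integrate over eta."""
    xi = (np.arange(n) - n // 2) * h
    eta = (np.arange(n) - n // 2) * h
    XI, ETA = np.meshgrid(xi, eta, indexing="ij")
    # The transform in Eq. (26) is its own inverse, giving (x, y) from (xi, eta):
    x = XI * np.cos(theta) + ETA * np.sin(theta)
    y = XI * np.sin(theta) - ETA * np.cos(theta)
    a = np.exp(-(x ** 2 / 2.0 + y ** 2 / 8.0))   # anisotropic Gaussian activity
    return (a * h).sum(axis=1)                    # p(xi) = integral over eta

# 0.25 is the grid spacing h; each total approximates the full 2D integral 4*pi.
totals = [float(projection(t).sum() * 0.25) for t in (0.0, 0.7, np.pi / 2)]
```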
15.3.3 Uniform Attenuation Alone

The next simplest case arises when one ignores depth-dependent resolution and assumes that the attenuation can be represented by a uniform attenuation coefficient within some closed boundary. In this case, the imaging equation can be written:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη a_θ(ξ, z, η) e^{−μ[η + D(ξ,z,θ)]},    (29)

where D(ξ, z, θ) represents the distance from the point (x = ξ cos θ, y = ξ sin θ, z) to the boundary in the direction of the projection. By defining a set of modified projections:

m(ξ, z, θ) ≡ e^{μD(ξ,z,θ)} p(ξ, z, θ),    (30)
we obtain:

m(ξ, z, θ) = ∫_{−∞}^{∞} dη a_θ(ξ, z, η) e^{−μη},    (31)

an equation generally known as the exponential Radon transform (ERT).^{46}
Tretiak and Metz developed an FBP-style reconstruction formula in which appropriately modified projections are subjected to exponentially weighted backprojection.^{46} The reconstruction formula for the activity a(r, φ, z), given in cylindrical coordinates, can be expressed as:

a(r, φ, z) = ∫_0^{2π} e^{μη} ∫_{|ν_m| ≥ ν_μ} (|ν_m|/2) e^{j2πν_m ξ} [ ∫_{−∞}^{∞} m(ξ′, z, θ) e^{−j2πν_m ξ′} dξ′ ] dν_m dθ,    (32)

where ν_μ = μ/2π, ξ = r cos(θ − φ), and η = r sin(θ − φ). A number of
different analytic algorithms for inverting this imaging model were proposed over the years. Bellini et al.^{47} and Inouye et al.^{48} developed methods that worked in the spatial frequency domain to estimate the 2D Fourier transform of the unattenuated sinogram, from which the exact image could be obtained by inverting the 2D Radon transform. Hawkins proposed a method based on the use of circularly harmonic Bessel transforms.^{49} These algorithms are all exact in the face of perfect data, but they propagate noise and inconsistencies differently.

In 1995, Metz and Pan^{50} analyzed the 2D Fourier transform of the 2D ERT and demonstrated that all these methods can be
interpreted as special cases of a broad class of methods. In particular, they showed that these methods represent different choices of the weighting coefficients implicitly used to combine the redundant data that arise from certain Fourier-domain symmetries. Metz and Pan also showed that a new member of the class, given by a different choice of those weighting coefficients, had better noise properties than the existing methods.^{50,51}
Specifically, the method provides a means of estimating the coefficients of the angular Fourier-series representation of the 2D Fourier transform of a(r, φ, z) at a fixed z, which we denote A_k(ν_a), from the 2D Fourier transform (actually a combination of a 1D Fourier transform and a 1D Fourier-series expansion) of the modified data, which we denote M_k(ν_m), by use of:

A_k(ν_a) = ω γ^k M_k(ν_m) + (1 − ω)(−1)^k γ^{−k} M_k(−ν_m),    (33)

where ν_m = √(ν_a² + ν_μ²), γ = √(ν_m² − ν_μ²)/(ν_m + ν_μ), and 0 ≤ ω ≤ 1 is a weight that allows the two independent estimates of A_k(ν_a) to be combined in a way that minimizes the variance of the final image. Metz and Pan showed that the existing algorithms can be derived by selecting different ω, and that new algorithms can be derived that may have noise properties superior to those of the existing algorithms.^{50,51}
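The role of ω can be seen in a generic setting (this is an illustration of the variance-minimizing idea, not the actual Metz-Pan weighting): combining two independent unbiased estimates as ω·e1 + (1 − ω)·e2, the weight ω* = var2/(var1 + var2) minimizes the variance of the combination:

```python
import numpy as np

rng = np.random.default_rng(6)

# Two independent unbiased estimates of the same quantity (true value 0),
# with different variances; all numbers are illustrative.
var1, var2 = 1.0, 4.0
e1 = rng.normal(0.0, np.sqrt(var1), size=200_000)
e2 = rng.normal(0.0, np.sqrt(var2), size=200_000)

def combined_var(w):
    """Empirical variance of the convex combination w*e1 + (1-w)*e2."""
    return float(np.var(w * e1 + (1.0 - w) * e2))

w_star = var2 / (var1 + var2)   # inverse-variance weight, = 0.8 here
# combined_var(w_star) ~ var1*var2/(var1+var2) = 0.8, below either input variance
```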
15.3.4 Distance-Dependent Resolution Alone

If attenuation is ignored but distance-dependent resolution effects are modeled, then Eq. (27) becomes:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη ∫_{−∞}^{∞} ∫_{−∞}^{∞} dξ′ dz′ h(ξ − ξ′, z − z′, η) a_θ(ξ′, z′, η).    (34)

Appledorn^{52} presented an analytic solution to this equation for the case in which h(ξ, z, η) is a Cauchy function whose width parameter grows linearly with the distance η.
15.3.5 Distance-Dependent Resolution and Uniform Attenuation

When both distance-dependent resolution and uniform attenuation are modeled, Eq. (27) becomes:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη ∫_{−∞}^{∞} ∫_{−∞}^{∞} dξ′ dz′ h(ξ − ξ′, z − z′, η) a_θ(ξ′, z′, η) e^{−μ[η + D(ξ,z,θ)]}.    (35)

Soares et al. derived the first analytic solution to this equation for the case in which h(ξ, z, η) is a Cauchy function.^{53} Van Elmbt and Walrand^{54} generalized Bellini's method for inverting the ERT to invert Eq. (35) for the more practical case in which h(ξ, z, η) is modeled as a Gaussian whose standard deviation grows linearly with the distance η. Pan and Metz extended the earlier work of Metz and Pan to this equation for both the Cauchy form of h(ξ, z, η) considered by Soares et al. and the Gaussian form considered by van Elmbt and Walrand.^{55}
15.3.6 Nonuniform Attenuation Alone

When the attenuation is nonuniform and distance-dependent resolution effects are ignored, Eq. (27) becomes:

p(ξ, z, θ) = ∫_{−∞}^{∞} dη a_θ(ξ, z, η) exp[ −∫_{−∞}^{η} μ_θ(ξ, z, η′) dη′ ].    (36)
This equation is often referred to as the attenuated Radon transform. An approximate approach to inverting it was developed by Chang.^{56} The multiplicative Chang method entails calculating, for each point in the reconstructed volume, the average fraction of photons that escape unattenuated to the various detector locations. The reconstructed image is then multiplied by the reciprocal of this average transmission factor map to obtain the corrected image. This correction is only approximate, but it can be refined through an iterative process in which the corrected image is reprojected and the resulting data are compared to the measured data. The difference between the reprojections and the measured data is used to generate an error image by FBP reconstruction. The error images are corrected for attenuation by the same multiplicative factors and then added to the original corrected image. This process can be continued for a desired number of iterations.
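A numeric sketch of the zeroth-order multiplicative correction for a uniformly attenuating disk; the grid size, pixel size, and number of views below are arbitrary choices for the illustration:

```python
import numpy as np

mu = 0.15                                  # cm^-1, roughly soft tissue at 140 keV
n = 21                                     # grid size (pixels)
h = 0.5                                    # pixel size (cm)
c = (n - 1) / 2.0
yy, xx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
inside = (xx - c) ** 2 + (yy - c) ** 2 <= (c - 1) ** 2   # circular support mask

def path_to_boundary(ix, iy, theta, step=0.2):
    """Length (cm) from pixel (ix, iy) to the support boundary along theta."""
    dx, dy = step * np.cos(theta), step * np.sin(theta)
    x, y, length = float(ix), float(iy), 0.0
    while True:
        rx, ry = int(np.rint(x)), int(np.rint(y))
        if not (0 <= rx < n and 0 <= ry < n and inside[ry, rx]):
            return length
        x += dx
        y += dy
        length += step * h

thetas = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
corr = np.ones((n, n))
for iy, ix in zip(*np.nonzero(inside)):
    # average transmission from this pixel to the boundary over all views ...
    trans = np.mean([np.exp(-mu * path_to_boundary(ix, iy, t)) for t in thetas])
    corr[iy, ix] = 1.0 / trans             # ... and its reciprocal as the correction
# corr peaks at the center, where photons cross the most tissue on average
```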
The explicit solution for the attenuated Radon transform was first presented by Novikov.^{57} Natterer made significant contributions to the theory, including a different inversion formula as well as an alternative, simpler derivation of Novikov's formula.^{58} Kunyansky also provided a slightly modified version of Novikov's formula, which allowed for simpler implementation and from which FBP and the Tretiak-Metz algorithm can easily be obtained in the cases of zero and uniform attenuation, respectively.^{59}
15.3.7 Short-Scan and Region-of-Interest Imaging

While it is well known that inversion of the Radon transform requires data acquired over only a 180° angular range, it was not known until recently whether the exponential Radon transform and the attenuated Radon transform could also be inverted from such so-called short-scan data.

Noo and Wagner, however, showed that the ERT can be inverted from data on the angular interval θ ∈ [0, π].^{60} Pan et al.^{61} generalized this result to develop a so-called π-scheme short-scan strategy, in which the full angular range of 2π is divided into a number of nonoverlapping angular intervals, and the data function is acquired only over disjoint angular intervals whose summation, without conjugate views, is equal to π. This approach does not yield an explicit inversion formula, but it was demonstrated that an iterative algorithm is able to generate high-quality reconstructions from π-scheme data.

As for the attenuated Radon transform, Sidky et al. were able to show, both heuristically and rigorously, that a short scan is sufficient here as well, by adopting the so-called potato-peeler perspective to establish that there is a twofold redundancy in a full-scan dataset.^{62} Recently, Noo et al. have developed approaches to reconstructing ROI images from the ERT and the attenuated Radon transform with truncated data.^{63}
15.4 ACKNOWLEDGMENTS

This work was supported in part by NIH grants EB00225 and EB02765. Dr Emil Sidky was supported by NIH grant K01 EB003913. The authors are thankful to Mr Xiao Han and Mr Dan Xia for their work on the LaTeX file of the manuscript.
References
1. Tuy HK, An inversion formula for conebeam reconstruction, SIAM
J Appl Math 43: 546–552, 1983.
2. Grangeat P, Mathematical framework of conebeam3Dreconstruction
via the ﬁrst derivative of the Radon transform, in Herman GT, Louis
AK, Natterer F (eds.), Mathematical Methods in Tomography, Lecture Notes
in Mathematics, SpringerVerlag, Berlin, pp. 66–97, 1991.
3. Katsevich A, Analysis of an exact inversion algorithm for spiral cone
beam CT, Phys Med Biol 47: 2583–2597, 2002.
4. Chen G, An alternative derivation of Katsevich’s conebeam recon
struction formula, Med Phys 30: 3217–3226, 2003.
5. Danielsson PE, Edholm P, Seger M, Towards exact 3Dreconstruction
for helical conebeamscanning of long objects. Anewdetector arrange
ment anda newcompleteness condition, inTownsendDW, KinahanPE
(eds.), Proceedings of the 1997 International Meeting on Fully Three
Dimensional Image Reconstruction in Radiology and Nuclear Medicine,
Pittsburgh, pp. 141–144, 1997.
6. Zou Y, Pan X, Exact image reconstruction on PIline from minimum
data in helical conebeam CT, Phys Med Biol 49: 941–959, 2004.
7. Zou Y, Pan X, Image reconstruction on PIlines by use of ﬁltered back
projection in helical conebeam CT, Phys Med Biol 49: 2717–2731, 2004.
8. Sidky EY, Zou Y, Pan X, Minimum data image reconstruction algo
rithms with shiftinvariant ﬁltering for helical, conebeam CT, Phys
Med Biol 50: 1643–1657, 2005.
9. Zou Y, Pan X, An extended data function and its backprojection onto
PIlines in helical conebeam CT, Phys Med Biol 49: N383–N387, 2004.
10. Zou Y, Pan X, Sidky EY, Theory and algorithms for image reconstruc
tion on chords and within region of interests, Journal of the Optical Soci
ety of America A 22: 2372–2384, 2005.
11. Jiang Hsieh, Computed Tomography — Principles, Designs, Artifacts, and
Recent Advances, SPIE Press, Bellingham, WA, 2003.
12. Sidky EY, Pan X, Recovering compactly supported functions from
knowledge of its hilbert transform on a ﬁnite interval, IEEE Signal
Processing Lett 12: 97–100, 2005.
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch15 FA
388
ChienMin Kao et al.
13. Tam KC, Samarasekera S, Sauer F, Exact conebeam CT with a spiral
scan, Phys Med Biol 43: 1015–1024, 1998.
14. Gambhir SS, Molecular imaging of cancer with positron emission
tomography, Nature Rev Cancer 2: 683–693, 2002.
15. Herholz K, Heiss WD, Positron emission tomography in clinical neu
rology, Molecular Imaging and Biology 6: 239–269, 2004.
16. Lance Gould K, PET perfusion imaging and nuclear cardiology, J Nucl
Med 32: 579–606, 1991.
17. Alexander GE et al., Longitudinal PET evaluation of cerebral metabolic
decline in dementia: Apotential outcome measure in Alzheimer’s dis
ease treatment studies, Am J Psychiastry 159: 738–745, 2002.
18. Tai YC, Laforest R, Instrumentation aspects of animal PET, Annu Rev
Biomed Eng 7: 255–285, 2005.
19. Weissleder R, Scaling down imaging: Molecular mapping of cancer in
mice, Nature Rev Cancer 2: 11–18, 2002.
20. Herschman HR et al., Seeing is believing: Noninvasive, quantita
tive and repetitive imaging of reporter gene expression in living ani
mals, using positron emission tomography, J Neurosci Res 59: 699–705,
2000.
21. Kelloff GJ, Progress and promise of FDGPET imaging for cancer
patient management and oncologic drug develpment, Clinical Cancer
Research 11: 2785–2808, 2005.
22. Moses WW, Timeofﬂight in PET revisited, IEEE Trans Nucl Sci 50:
1325–1330, 2003.
23. Wernick MN, Aarsvold JN (eds.), Emisson Tomography: The Fundamen
tals of PET and SPECT, Elsevier Academic Press, San Diego, CA, 2004.
24. Kao CM, Chen CT, Development and evaluation of a dualhead PET
system for highthroughput smallanimal imaging, 2003 IEEE Nuclear
Science Symposium Conference Record, 2072–2076, 2003.
25. Qi J, Leahy RM, Iterative reconstruction techniques in emission com
puted tomography, Phys Med Biol 51: R541–R578, 2006.
26. HuesmanR, SalmeronE, Baker J, Compensationfor crystal penetration
in high resolution positron tomography, IEEE Trans Nucl Sci 36: 1100–
1107, 1989.
27. Shao L, Karp JS, Countryman P, Practical considerations of the Wiener
ﬁltering technique on projection data for PET, IEEE Trans Nucl Sci 41:
1560–1565, 1994.
28. Herman GT, Meyer LB, Algebraic reconstruction techniques can be
made computationally efﬁcient [positron emission tomography appli
cation], IEEE Trans Med Imag 12: 600–609, 1993.
29. Wernick MN, Chen CT, Superresolved tomography by convex projec
tions and detector motion, J Opt Soc Am A 9: 1547–1553, 1992.
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch15 FA
Some Recent Developments in Reconstruction Algorithms for Tomographic Imaging
389
30. Fessler JA, Penalized weighted leastsquares image reconstruction for
PET, IEEE Trans Med Imag 13: 290–300, 1994.
31. Shepp LA, Vardi Y, Maximum likelihood reconstruction for emission
tomography, IEEE Trans Signal Process 41: 534–548, 1982.
32. Hebert T, Leahy R, Ageneralized EMalgorithmfor 3DBayesian recon
struction from Poisson data using Gibbs priors, IEEE Trans Med Imag
8: 194–202, 1989.
33. Ouyang X, Wong WH, Johnson VE, Hu X, Chen CT, Incorporation of
correlated structural image in PET image reconstruction, IEEE Trans
Med Imag 13: 627–640, 1994.
34. Ahn S, Fessler JA, Emission image reconstruction for randoms
precorrected PET allowing negative singoram values, IEEE Trans Med
Imag 23: 591–601, 2004.
35. Hudson HM, Larkin RS, Accelerated image reconstruction using
ordered subsets of projection data, IEEE Trans Med Imag 13: 601–609,
1994.
36. Browne J, De Pierro AR, Arowaction alternative to the EM algorithm
for maximizing likelihoods in emission tomography, IEEE Trans Med
Imag 15: 687–699, 1996.
37. Hsiao IT, Rangarajan A, Khurd P, Gindi G, An accelerated convergent
ordered subsets algorithm for emission tomography, Phys Med Biol 49:
2145–2156, 2004.
38. Kinahan PE, Rogers WL, Analytic 3D image reconstruction using all
detected events, IEEE Trans Nucl Sci 36: 864–968, 1989.
39. DuabeWitherspoon ME, Muehllehner G, Treatment of axial data in
threedimensional PET, J Nucl Med 28: 1717–1724, 1987.
40. Defrise Met al., Exact andapproximate rebinning algorithmfor 3DPET
data, IEEE Trans Med Imag 16: 167–186, 1997.
41. Kao CM, Yap JT, Mukherjee J, Wernick MN, Image reconstruction for
dynamic pet based on loworder approximation and restoration of the
sinograms, IEEE Trans Med Imag 16: 738–749, 1997.
42. Kamasak ME, Bouman CA, Morris ED, Sauer K, Direct reconstruction
of kinetic parameter images from dynamic PET data, IEEE Trans Med
Imag 25: 636–650, 2005.
43. Barrett HH, White T, Parra L, Listmodel likelihood, J Opt Soc Am A14:
2914–2923, 1997.
44. Rahmin A, Blinder S, Cheng JC, Sossi V, Statistical list model recon
sturction in quantitative dynamic imaging using high resolution
research tomograph, 8th Fully 3D Meeting 117–120, 2005.
45. Reader AJ, Sureau FC, Comtat C, Trebossen R et al., Joint estimation
of dynamic PET images and temporal basis functions using fully 4D
MLEM, Phys Med Biol 51: 5455–5474, 2006.
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch15 FA
390
ChienMin Kao et al.
46. Tretiak O, Metz CE, The exponential Radon transform, SIAM J Appl Math 39: 341–354, 1980.
47. Bellini S, Piacenti M, Caffario C, Rocca F, Compensation of tissue absorption in emission tomography, IEEE Trans Acoust Speech Sig Processing 27: 213–218, 1979.
48. Inouye T, Kose K, Hasegawa A, Image reconstruction algorithm for single-photon-emission computed tomography with uniform attenuation, Phys Med Biol 34: 299–304, 1989.
49. Hawkins WG, Leichner PK, Yang NC, The circular harmonic transform for SPECT reconstruction and boundary conditions on the Fourier transform of the sinogram, IEEE Trans Med Imaging 7: 135–148, 1988.
50. Metz CE, Pan X, A unified analysis of exact methods of inverting the 2D exponential Radon transform, with implications for noise control in SPECT, IEEE Trans Med Imaging 14: 643–658, 1995.
51. Pan X, Metz CE, Analysis of noise properties of a class of exact methods of inverting the 2D exponential Radon transform, IEEE Trans Med Imaging 14: 659–668, 1995.
52. Appledorn CR, An analytical solution to the nonstationary reconstruction problem in single photon emission computed tomography, in Ortendahl DA, Llacer J (eds.), Information Processing in Medical Imaging, Wiley-Liss, New York, pp. 69–79, 1990.
53. Soares EJ, Byrne CL, Glick SJ, Appledorn CR et al., Implementation and evaluation of an analytical solution to the photon attenuation and nonstationary resolution reconstruction problem in SPECT, IEEE Trans Nucl Sci 40: 1231–1237, 1993.
54. van Elmbt L, Walrand S, Simultaneous correction of attenuation and distance-dependent resolution in SPECT: An analytical approach, Phys Med Biol 38: 1207–1217, 1993.
55. Pan X, Metz CE, Analytical approaches for image reconstruction in 3D SPECT, in Grangeat P, Amans J (eds.), 3D Image Reconstruction in Radiology and Nuclear Medicine, Kluwer Academic Publishers, New York, pp. 103–116, 1996.
56. Chang LT, A method for attenuation correction in radionuclide computed tomography, IEEE Trans Nucl Sci 25: 638–643, 1978.
57. Novikov RG, An inversion formula for the attenuated X-ray transformation, Département de Mathématiques, Université de Nantes, Nantes, France (preprint), 2000.
58. Natterer F, Inversion of the attenuated Radon transform, Inv Probs 17: 113–119, 2001.
59. Kunyansky LA, A new SPECT reconstruction algorithm based upon Novikov's explicit inversion formula, Inv Probs 17: 293–306, 2001.
60. Noo F, Wagner JM, Image reconstruction in 2D SPECT with 180-degree acquisition, Inv Probs 17: 1357–1372, 2001.
61. Pan X, Kao CM, Metz C, A family of π-scheme exponential Radon transforms and the uniqueness of their inverses, Inv Probs 18: 825–836, 2002.
62. Sidky E, Pan X, Variable sinograms and redundant information in SPECT with non-uniform attenuation, Inv Probs 18: 1483–1497, 2002.
63. Noo F, Defrise M, Pack JD, Clackdoyle R, Image reconstruction from truncated data in single-photon emission computed tomography with uniform attenuation, Inv Probs 23: 645–667, 2007.
CHAPTER 16
Shape-Based Reconstruction from Nevoscope Optical Images of Skin Lesions
Song Wang and Atam P Dhawan
Optical imaging of skin lesions has been of significant interest for early diagnosis and characterization of skin cancers. The work presented in this chapter continues the development of a portable, optical-imaging based system with computerized analysis to detect skin cancers, particularly melanomas at an early, curable stage. The method developed here can reconstruct the melanin and blood contents associated with a skin lesion from its multispectral transillumination-based optical images. The results of simulating a skin lesion for reconstruction of melanin and blood (hemoglobin) information from multispectral optical images are presented. Changes in melanin and hemoglobin contents in a skin lesion detected over time using the proposed method would allow early detection of malignant transformation and the development of a cancerous lesion.
16.1 INTRODUCTION
In recent years, optical medical modalities have drawn significant attention from researchers. Visible and near-infrared wavelengths have been used in surface-reflectance, transillumination, and transmission based methods.[1] Optical modalities can also provide a portable imaging system for routine screening and monitoring of skin lesions. They usually make use of light in the lower-energy part of the electromagnetic spectrum, which is believed not to damage the interrogated tissue, or at least to greatly reduce any side effects. In the so-called "therapeutic window," covering part of the visible and infrared spectrum, physiologically meaningful chromophores such as melanin, oxyhemoglobin, and deoxyhemoglobin have relatively low absorption coefficients. Meanwhile, the scattering coefficients of human tissues are relatively high in this range, resulting in a penetration depth favorable for investigating the brain, breast, skin, etc. More importantly, the optical coefficients of these chromophores are strongly wavelength-dependent. These differences among chromophores therefore give us a chance to reveal their distributions using multispectral light.[1] Though the work presented in this chapter is motivated by the need to develop a computer-aided optical imaging system for diagnosis and characterization of skin cancers, the methods presented here are generally applicable to optical image reconstruction.
Skin cancer is one of the fastest growing of all cancers.[2] The majority of skin cancers are non-melanoma skin cancers, derived from keratinocytes, the main cell type of the epidermis. Malignant melanoma results from the uncontrolled growth of melanocytes originally residing in the epidermis. Though non-melanoma cancer prevails among all kinds of skin cancer, malignant melanoma is the most fatal form, accounting for about 90% of skin cancer deaths.[2] It is fatal if not detected in its early stages, yet it can be cured with a nearly 100% survival rate if removed early. Malignant melanoma is currently diagnosed by dermatologists according to its color and morphology.[2] However, such a diagnostic process is to a large extent subjective, and diagnostic accuracy rests on the dermatologist's individual experience. There is an urgent need for a noninvasive modality that reveals the physiological features of malignant melanoma quantitatively and objectively, so that even a less experienced dermatologist is able to make the right decision.
As an effective tool for diagnosing malignant melanoma, a light-based device should provide both morphological and physiological information. Morphological information may be used to determine the depth of invasion of a malignant melanoma, while physiological characteristics such as the distribution of melanin and blood vessels are essential to differentiate it from a benign lesion. Within the visible and infrared spectra, the major chromophores of malignant melanoma are melanin, oxyhemoglobin, and deoxyhemoglobin; the absorption of water is negligible compared to these. Among these chromophores, melanin presents a higher absorption than oxyhemoglobin and deoxyhemoglobin, and their absorption spectra are not linearly dependent. Hence, it is possible to uncover the distributions of these chromophores by multispectral optical measurement. As mentioned before, a malignant melanoma has a non-blood, melanin-rich core and a hemoglobin-rich peripheral blood net. Once the distributions of the chromophores are rendered, its structure is available simultaneously. Having investigated the physical properties of malignant melanoma under visible and infrared light, it is clear that an optical device is a reasonable choice for imaging malignant melanoma.
An optical transillumination imaging device, the Nevoscope, is used for imaging skin lesions. It was introduced by Dhawan[3] for noninvasive diagnosis of malignant melanoma and other skin cancers. In its transillumination mode, light is directed by a channel at 45° with respect to the normal of the skin and enters the skin through a ring light source. The re-emerging light is captured by a CCD camera and forms the transillumination image, which contains information about the underlying optical properties. Within the optical tomography framework, it is possible to retrieve two key signatures of malignant melanoma, the spatial distributions of melanin and blood, from the optical reflectance measurements.
16.2 OPTICAL IMAGING METHODS
During the last several decades, various optical imaging modalities have been developed.[10] These methods can be divided into five categories: surface imaging, fluorescence imaging, optical coherence tomography (OCT), optical spectroscopy, and optical tomography (OT).
16.2.1 Surface Imaging
Surface imaging methods provide a specific light source to illuminate the surface of the skin and skin lesion. Surface-reflectance based measurements are then stored as a high-resolution image using a high-resolution CCD digital camera with a magnifying optical lens. For example, the "Dermoscope" has been used for surface-reflectance based imaging of skin lesions.[2,3] Better accuracy in detecting melanoma can be obtained through the Epiluminescence Light Microscopy (ELM) imaging method, where the reflection of surface light is reduced either by an oil-glass interface on the skin or by cross-polarization of the incident and reflected light to cancel the surface reflection.[2] Dermoscopy utilizes surface-reflectance dominant illumination methods that render the skin translucent, thereby allowing visualization of subsurface structures and colors. These subsurface structures and colors, in combination with their location and distribution (pattern), have been shown to improve a clinician's ability to detect early melanoma and basal cell carcinoma. Dermoscopy can be performed using polarized or nonpolarized light. The cross-polarization method for epiluminescence places linear polarizers in the incident light path and the viewing lens to cancel the light reflected from the skin. Since most of the light reflected from the skin surface retains the polarization angle of the incident light, cross-polarization blocks most of the surface-reflected light, and only light diffused below the skin surface is visualized.
A novel optical imaging system, the Nevoscope, which uses transillumination as well as a combination of surface illumination and transillumination, has been developed by Dhawan to provide images with significant information about the subsurface pigmentation architecture of skin lesions.[3,4] The Nevoscope consists of a digital CCD camera attached to a zoom lens and a customized optical assembly to obtain surface and/or transillumination-based images of the skin and skin lesion. In the Nevoscope transillumination method, light is transmitted into the skin area surrounding the lesion at a 45° angle. A virtual light source is thus created a few millimeters below the skin surface for uniform transillumination of a small area of skin containing the lesion. In this side-transillumination method, no surface light is used. An annular transillumination ring provides fiber-optics-directed light to illuminate the region of interest uniformly. The skin lesion is positioned inside the transillumination ring through the opening, providing a direct field of view to the digital camera through a zoom lens assembly. The light from the illuminator ring that is not reflected back due to a mismatch in refractive indices enters the skin and undergoes multiple internal reflections and scattering. This light eventually becomes diffused across the layers of the skin, and backscattered diffused photons emerge from the skin to form a transilluminated image of the skin and skin lesion. For surface illumination, additional fiber-optics-directed point sources distributed around the internal wall of the Nevoscope are provided to reflect light off the surface of the skin lesion. The surface light intensity can be adjusted and is polarized. Another polarizing lens (cross-polarized by 90°) is used with the cross-polarization method for imaging the skin lesion. The Nevoscope, by virtue of its design, thus provides three different ways of imaging a skin lesion.
Besides using the pigment and color information from surface reflectance, optical models relating the reflectance measurements to the underlying optical properties have been developed. For instance, Claridge et al.[15] use a Kubelka-Munk model to simulate the formation of color images of melanoma, and are eventually able to recover the blood and melanin distributions in various skin layers. The Kubelka-Munk model is basically a one-dimensional theory. Using this model and multispectral imaging, the usually ill-posed, underdetermined inverse problem that occurs in optical tomography can be handled for image reconstruction.
16.2.2 Fluorescence Imaging
Fluorescence imaging uses ultraviolet light to excite fluorophores and collects the light emitted at a longer wavelength. Fluorophores may be endogenous or exogenous. The former are natural fluorophores intrinsic to the skin, such as amino acids and structural proteins. These fluorophores are randomly distributed in the skin, so directly reconstructing their distributions is meaningless, if not impossible. The latter usually refers to smart polymer nanoparticles targeting specific molecules such as hemoglobin. Due to differences in metabolic state, some kinds of exogenous fluorophores have distinctive distributions in malignant melanoma compared to normal tissue, which may suggest the presence of a cancer. Fluorescence imaging has a mechanism similar to single photon emission computed tomography (SPECT): the purpose is to recover a source distribution given boundary measurements. Promising results have been obtained by several groups.[16,17]
16.2.3 Optical Coherence Tomography
The relatively new modality OCT makes use of the coherence properties of light.[18] In an OCT system, light with a low coherence length is divided into two parts: one serves as a reference while the other is directed into the tissue. When light travels in the tissue, it encounters interfaces with different refractive indices, and part of the light is reflected. This reflected light is subsequently mixed with the reference. Once the difference in optical path length between the reference light and the reflected light is less than the coherence length, interference occurs. By observing the interference pattern and changing the optical path length of the reference arm with a mirror, a cross section of the skin can be rendered.
With a sufficiently low coherence length, the resolution of OCT may reach the order of micrometers, and it can hence disclose subtle changes in cancerous tissue at the cellular level. OCT recovers the structure of the interrogated tissue by a mechanism analogous to ultrasonic imaging, in which a sound wave is sent into the tissue and reflects when it encounters an impedance-varying interface. However, the resolution of OCT is much higher than that of ultrasonic imaging.
16.2.4 Optical Spectroscope
Another optical imaging modality is optical spectroscopy. Its application dates back several decades, when the spectroscope was first used to evaluate blood oxygenation. The spectroscope samples the investigated tissue using a suitably spaced detector and source. It measures the re-emerging light at multiple wavelengths, usually ranging from the visible to the infrared spectrum. The measured absorption spectrum is a direct reflection of what happens within the sampling volume.
In skin, the absorption spectrum is the combined effect of several chromophores, and recognizing the fractions of these chromophores is difficult. The common assumption is that the chromophores are homogeneously distributed in the sampling volume. In fact, this is not true for skin, a complex and heterogeneous layered tissue; even the weaker assumption that the chromophores in each skin layer are homogeneous is against reality. That said, the absorption spectrum is indeed related to the underlying composition of the tissue, and may therefore by itself provide significant signatures of some diseases.
Malignant melanoma contains more melanin and blood than normal tissue; hence more light is absorbed, and the absorption varies according to the characteristics of melanin and hemoglobin. Studies show that the absorption spectrum of malignant melanoma differs significantly from that of normal tissue. Features extracted from the spectrum can subsequently be used to identify malignant melanoma. Tomatis et al.[19] used an artificial neural network as the classifier, and reported a sensitivity of 80.4% and a specificity of 75.6% on 1391 cases, of which 184 were melanoma. A study based on multivariate discriminant analysis[20] also shows promising results.
16.2.5 Optical Tomography
By optical tomography, we refer to optical imaging systems that aim to reconstruct internal, spatially resolved optical properties from multiple source-detector channels. Though some optical tomography systems borrow ideas from well-established tomography systems such as CT, the fundamental design concept of optical devices may deviate greatly from these established ones. The characteristics of the system are related to the configuration of the source-detector channels. They are also related to the tissue, since the optical properties of the tissue affect light transport and consequently the spatial sensitivity of the system. The reason is that, unlike the straight-line trajectories of CT, light becomes diffuse in tissue: both the system configuration and the tissue optical properties change the trajectories of the light photons.
The reconstruction of optical properties differs from most well-established modalities in that it usually cannot use the standard algorithms such as filtered backprojection. Given the underdetermined and ill-posed nature of the inverse problem, an optical reconstruction algorithm seeks an optimal solution among the numerous possible ones meeting some priors. Optical reconstruction algorithms are far from perfect, and they remain a hot research topic in the optics community. In the case of the Nevoscope, because of its reflectance geometry, the measurements of the source-detector channels are highly dependent; that is, the number of effective measurements is greatly reduced. In other words, there are fewer constraints on the parameter space under this scenario, which makes the inverse problem even harder to solve. The shape-based multi-constraint algorithm presented in this chapter has several advantages over conventional voxel-by-voxel approaches: it has fewer parameters and more constraints, and it uses a global method to search the parameter space. The algorithm is illustrated and discussed in the rest of the chapter.
16.3 METHODOLOGY: SHAPE-BASED OPTICAL RECONSTRUCTION
Figure 1 shows a flow chart of the proposed reconstruction method in terms of Nevoscope transillumination images. First, the overall goal is to minimize the difference between the real measurement and the predicted measurement; this minimization problem is solved by a genetic algorithm, which offers a global search. Second, a linearized forward model is adopted, evaluated by Monte Carlo simulation using typical optical properties of normal skin. Third, the malignant melanoma is represented by the shapes of its melanin part and blood part, whose parameters are passed to the genetic algorithm.

[Figure 1 here: flow chart relating the normal skin and skin lesion images (ΔM_real), the Monte Carlo simulation, the Jacobian matrix and normal skin optical properties (ΔM_cal), the sampling function, the predicted shape, and the genetic algorithm.]

Fig. 1. Flow chart of proposed method.
16.3.1 Forward Modeling
To develop an optical tomographic system, a forward model is required to relate the measurement to the optical properties of the investigated tissue. Regardless of the imaging geometry, an optical system may be described as:

M = F(x),  (1)

where M is the measurement, F is the forward model, and x is the distribution of unknown optical properties.

Given a reasonable initial guess x_0 of the background optical properties, we may expand Eq. (1) into:

M = F(x_0) + F'(x_0)(x - x_0) + \frac{1}{2} F''(x_0)(x - x_0)^2 + \cdots,  (2)

where F' and F'' are the first-order and second-order Fréchet derivatives, respectively.
Letting \Delta M_{cal} = M - F(x_0) and \Delta x = x - x_0, Eq. (2) may be rearranged as:

\Delta M_{cal} = F' \Delta x + \frac{1}{2} F'' (\Delta x)^2 + \cdots.  (3)

The discrete form of Eq. (3) becomes:

\Delta M_{cal} = J \Delta x + \frac{1}{2} \Delta x^{T} H \Delta x + \cdots.  (4)

Here, J is the Jacobian matrix and H is the Hessian matrix; \Delta M_{cal} is the measurement vector and \Delta x is the vector of variations from the background x_0.

Neglecting the higher-order terms in Eq. (4), we get a simplified linear system:

\Delta M_{cal} = J \Delta x.  (5)

The formulation of Eq. (5) leads to linear optical tomography, also known as "difference imaging." That is, two measurements are taken: one of the background tissue (that is, x_0) and one of the abnormal tissue (that is, the unknown x). Their difference is then fed to the reconstruction algorithm to obtain the optical properties. In this study, the linear approach is adopted for the Nevoscope, and the Jacobian matrix is extracted by Monte Carlo simulation using a seven-layered optical skin model.
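As a minimal numerical illustration of difference imaging with a linearized model of the form of Eq. (5), one can simulate ΔM = JΔx for a known perturbation and invert it with Tikhonov-regularized least squares. This is a generic sketch, not the Monte Carlo-derived Jacobian of the Nevoscope study: the problem sizes, the random J, and the regularization weight are arbitrary choices for illustration.

```python
import numpy as np

# Hypothetical sizes: 64 detector readings, 500 voxel unknowns.
rng = np.random.default_rng(0)
n_meas, n_vox = 64, 500

# J would come from Monte Carlo simulation of the layered skin model;
# here it is random, for illustration only.
J = rng.normal(size=(n_meas, n_vox))

# A sparse "true" perturbation of the optical properties.
dx_true = np.zeros(n_vox)
dx_true[100:110] = 0.05

# Difference imaging: lesion measurement minus background measurement.
dM = J @ dx_true

# Tikhonov-regularized least-squares inversion of dM = J dx.
lam = 1e-3
dx_est = np.linalg.solve(J.T @ J + lam * np.eye(n_vox), J.T @ dM)
```

Because the system is strongly underdetermined (64 equations, 500 unknowns), the regularized solution matches the data but is not unique, which is exactly the ill-posedness that motivates the shape-based parametrization of the next section.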
16.3.2 Shape Representation of SkinLesions
A variety of shape representation methods have been adopted by authors working on optical tomography. In their 2D shape-based reconstruction, Kilmer et al.[5] used a B-spline curve to describe the 2D shape. Babaeizadeh[6] used tensor-product B-splines to create a 3D heart shape when studying electrical impedance tomography. A B-spline curve can sufficiently describe a complex shape given a few control points. Kilmer[7] later used an ellipsoid in their 3D study: to fully determine the ellipsoid, they need three parameters for the centroid, three for the lengths of the semi-axes, and three for the orientation of the ellipsoid. The advantage of this approach is that only nine parameters are required, which is quite a small number in 3D geometry. However, this simplification makes it impossible to describe more complex 3D shapes. Zacharopoulos[8] used a spherical harmonic representation in their 3D study, and showed that an eleventh-degree spherical harmonic representation can describe the shape of a neonatal head fairly well.
A melanoma is basically a 3D object, so a 3D description like spherical harmonics is appropriate. In addition, we can observe malignant melanoma one step further. Malignant melanoma results from the uncontrolled replication of melanocytes sitting in the basal layer of the epidermis, so the shape of the melanoma is bounded by the epidermis layer. All we need to describe the lesion is therefore reduced to 2D surfaces in the 3D domain.
In order to represent a malignant melanoma with 2D surfaces, we break it into two parts: the melanin part and the blood part. The melanin part is a 3D region bounded by a single surface and the epidermis layer. Within this region, the optical properties are constant and the only absorber is melanin. Furthermore, a second surface, which sits below the first surface, is used to represent the blood part: the region bounded by the first surface and the second surface contains blood only. This model mimics a deteriorated lesion, and its xz cross section is shown in Fig. 2.
Let the first surface be represented as f_1(x, y), which corresponds to the depth of the lesion from the epidermis layer at position (x, y). The idea is to represent the continuous surface with a limited number of parameters. First, we lay an N x N rectangular grid over the epidermal layer. Second, the function f_1(x, y) is sampled to N x N discrete values f_{d1}(X, Y). Here, (x, y) is continuous and (X, Y) ranges over the N x N discrete sampling positions. Third, the discrete values are interpolated by the cubic tensor-product B-spline, which satisfies the following condition:

f_{d1}(X, Y) = \sum_{i=1}^{N} \sum_{j=1}^{N} c_1(i, j) \beta^3(X - i, Y - j).  (6)
[Figure 2 here: xz cross section of the model, showing the stratum corneum, epidermis, dermis and fat layers, the melanin region bounded by f_1(x, y), and the blood region between f_1(x, y) and f_2(x, y).]

Fig. 2. Shape representation of malignant melanoma.
The original function f_1(x, y) can then be approximated by:

f_{B1}(x, y) = \sum_{i=1}^{N} \sum_{j=1}^{N} c_1(i, j) \beta^3(x - i, y - j),  (7)

where

\beta^3(x - i, y - j) = \beta^3(x - i) \beta^3(y - j)  (8)

is the tensor product of the one-dimensional cubic B-spline bases \beta^3(x - i) and \beta^3(y - j), and c_1(i, j) is a B-spline coefficient.
Similarly, the second surface can be defined by N x N discrete values f_{d2}(X, Y) or, equivalently, by z_d(X, Y) = f_{d2}(X, Y) - f_{d1}(X, Y), which is the thickness of the blood region between the first surface and the second surface.
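The tensor-product interpolation of Eqs. (6)-(8) can be sketched directly. The basis below is the standard cubic B-spline; the flat test surface (all coefficients equal) exploits the partition-of-unity property of the basis, which serves here only as a sanity check and is our own addition, not part of the chapter's method.

```python
import numpy as np

def beta3(t):
    """Standard cubic B-spline basis, nonzero on |t| < 2."""
    t = np.abs(np.asarray(t, dtype=float))
    out = np.zeros_like(t)
    m1 = t < 1
    m2 = (t >= 1) & (t < 2)
    out[m1] = 2.0 / 3.0 - t[m1] ** 2 + 0.5 * t[m1] ** 3
    out[m2] = (2.0 - t[m2]) ** 3 / 6.0
    return out

def surface(x, y, c):
    """Evaluate f(x, y) = sum_ij c(i, j) beta3(x - i) beta3(y - j).

    c is the N x N coefficient grid; indices i, j run from 1 to N
    as in Eq. (7), so c[i-1, j-1] holds c(i, j).
    """
    n = c.shape[0]
    i = np.arange(1, n + 1)
    bx = beta3(x - i)          # basis values along x
    by = beta3(y - i)          # basis values along y
    return bx @ c @ by         # tensor-product double sum

# Flat surface: with all coefficients equal to 1, the interpolant
# evaluates to 1 at interior points (partition of unity).
c = np.ones((9, 9))
val = surface(4.5, 4.5, c)
```

A real surface would use the coefficients c_1(i, j) fitted to the sampled depths f_{d1}(X, Y), but the evaluation step is the same.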
16.3.3 Reconstruction Algorithm
To reconstruct the surfaces and the piecewise-constant optical properties, the continuous surface representation must be incorporated into the forward photon transport model. In our study, the linearized forward model (Eq. 5) is kept intact and the continuous representation is sampled into the discrete vector of unknowns \Delta x. That is:

\Delta M = J \cdot \Delta x = J \cdot S(f_{d1}(X, Y), mp1, z_d(X, Y), bp2),  (9)

where S(\cdot) is the sampling function that converts the continuous shape representation to the voxel-based optical properties in the forward model, mp1 is the fraction of melanin, and bp2 is the fraction of blood.
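The sampling function S(·) is not spelled out in detail here, so the following is a hypothetical sketch of how such a function might rasterize the two depth maps and the fractions mp1, bp2 into voxel-wise chromophore fractions. The grid size, voxel pitch, and the toy lesion values are invented for illustration.

```python
import numpy as np

def sample_shape(f_d1, z_d, mp1, bp2, dz=0.001, nz=100):
    """Hypothetical sampling function S: convert the depth maps
    (in cm below the epidermal layer) and the melanin/blood
    fractions into voxel grids of chromophore fractions.

    Returns two (N, N, nz) arrays, melanin fraction and blood
    fraction per voxel, with voxel thickness dz along depth z.
    """
    n = f_d1.shape[0]
    z = (np.arange(nz) + 0.5) * dz          # voxel center depths
    mel = np.zeros((n, n, nz))
    blood = np.zeros((n, n, nz))
    for ix in range(n):
        for iy in range(n):
            top = f_d1[ix, iy]              # melanin region: 0..top
            bot = top + z_d[ix, iy]         # blood region: top..bot
            mel[ix, iy, z < top] = mp1
            blood[ix, iy, (z >= top) & (z < bot)] = bp2
    return mel, blood

# Toy lesion: a 60 um melanin cap over a 30 um blood layer at one site.
f = np.zeros((9, 9)); f[4, 4] = 0.0060
z = np.zeros((9, 9)); z[4, 4] = 0.0030
mel, blood = sample_shape(f, z, mp1=0.05, bp2=0.20)
```

The voxel-wise fractions would then be converted to absorption coefficients and multiplied by the Jacobian J of Eq. (9).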
The inverse problem can therefore be formulated as minimizing the objective function:

F_{obj} = \frac{1}{2} \| \Delta M_{real} - J \cdot \Delta x \|^2 = \frac{1}{2} \| \Delta M_{real} - J \cdot S(f_{d1}(X, Y), mp1, z_d(X, Y), bp2) \|^2.  (10)

The unknowns of this inverse problem are thus reduced to f_{d1}(X, Y), z_d(X, Y), mp1, and bp2.
The multispectral shape reconstruction using N wavelengths can be formulated as a multi-objective optimization problem, with the objective function:

F_{obj} = \alpha_1 \cdot F_{obj}^{\lambda_1} + \alpha_2 \cdot F_{obj}^{\lambda_2} + \cdots + \alpha_N \cdot F_{obj}^{\lambda_N},  (11)

where \{\alpha_1, \alpha_2, \ldots, \alpha_N\} is a set of coefficients that balance the contributions from the different single-wavelength objective functions.
In our study, we use a genetic algorithm to solve the optimization problem[8] for the following reasons. First, in a genetic algorithm the gradient need not be evaluated, which simplifies the computation and improves reliability. Second, the genetic algorithm is one of the most popular methods for seeking a global minimum. Third, among global optimization techniques, the genetic algorithm provides a reasonable convergence rate due to its implicit parallelism. Its elements include the fitness function, the coding of the chromosome, reproduction, crossover (breeding), and mutation.
For the optimization problem that occurs in the shape-based reconstruction, the objective function (Eq. 11) is selected as the fitness function. Its parameters are coded into a chromosome as:

f_{d1}(X_1, Y_1) - f_{d1}(X_2, Y_2) - \cdots - f_{d1}(X_{N \times N}, Y_{N \times N}) - mp1 - z_d(X_1, Y_1) - z_d(X_2, Y_2) - \cdots - z_d(X_{N \times N}, Y_{N \times N}) - bp2.  (12)
Reproduction is governed by the “Roulette wheel” rule and
crossover and mutation events occur according to some predeﬁned
probabilities.
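The genetic-algorithm machinery just described (roulette-wheel reproduction, crossover, mutation over a real-coded chromosome) can be sketched in miniature. This is an illustrative real-coded GA on a toy quadratic fitness standing in for Eq. (11), not the authors' implementation; the population size, rates, and test function are arbitrary choices.

```python
import random

def genetic_minimize(fitness, bounds, pop=40, gens=60,
                     p_cross=0.8, p_mut=0.1, seed=1):
    """Minimal real-coded GA: roulette-wheel selection on inverse
    fitness, arithmetic crossover, uniform mutation within bounds.
    `bounds` is a list of (lo, hi) pairs, one per gene."""
    rng = random.Random(seed)
    P = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop)]
    best = min(P, key=fitness)
    for _ in range(gens):
        scores = [fitness(c) for c in P]
        best = min(P + [best], key=fitness)      # keep elite solution
        # Roulette wheel on 1/(1 + F_obj): lower objective, bigger slice.
        w = [1.0 / (1.0 + s) for s in scores]
        parents = rng.choices(P, weights=w, k=pop)
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            c1, c2 = a[:], b[:]
            if rng.random() < p_cross:           # arithmetic crossover
                t = rng.random()
                c1 = [t * u + (1 - t) * v for u, v in zip(a, b)]
                c2 = [t * v + (1 - t) * u for u, v in zip(a, b)]
            for c in (c1, c2):                   # per-gene mutation
                for g, (lo, hi) in enumerate(bounds):
                    if rng.random() < p_mut:
                        c[g] = rng.uniform(lo, hi)
            nxt += [c1, c2]
        P = nxt
    return min(P + [best], key=fitness)

# Toy fitness: distance to a known optimum (stands in for Eq. 11).
target = [0.3, 0.7, 0.05]
f = lambda c: sum((u - v) ** 2 for u, v in zip(c, target))
sol = genetic_minimize(f, [(0, 1)] * 3)
```

In the actual reconstruction, the chromosome of Eq. (12) would replace the toy vector, and the bounds would encode the melanin, blood, and thickness constraints discussed next.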
Before running the optimization algorithm, we may add some reasonable constraints to the parameters. First, the region of support of the lesion in the xy plane is readily available from its surface shape. This further reduces the number of parameters representing a surface from N x N to a smaller set, so the optimization algorithm converges faster. Second, the blood region is typically a thin layer within a few hundred micrometers, which puts a constraint on z_d(X, Y). Third, the fractions of melanin and blood are not free parameters; they can also be bounded according to the appearance of the melanoma and clinical experience. Lastly, multispectral imaging provides implicit constraints: given the distinct absorption spectra of blood and melanin, a reasonable solution must satisfy the measurements at all involved wavelengths.
16.3.4 Phantom and Error Evaluation
To validate the shape-based multispectral algorithm, a double-surface phantom is created to represent a malignant melanoma. The first and second surfaces are described by mixed Gaussian functions:

f_1(x, y) = \max(peak1 \cdot G(x, y, \mu_{1a}, \mu_{2a}, \sigma_a),\; peak2 \cdot G(x, y, \mu_{1b}, \mu_{2b}, \sigma_b)),
f_2(x, y) = \max(peak3 \cdot G(x, y, \mu_{1a}, \mu_{2a}, \sigma_a),\; peak4 \cdot G(x, y, \mu_{1b}, \mu_{2b}, \sigma_b)),  (13)

where the Gaussian function is:

G(x, y, \mu_1, \mu_2, \sigma) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{(x - \mu_1)^2 + (y - \mu_2)^2}{2\sigma^2} \right).  (14)
The parameters used in Eq. (13) are:

\mu_{1a} = 0.0375 \times 2 cm,    \mu_{2a} = -0.0375 \times 3 cm,
\mu_{1b} = -0.0375 \times 2 cm,   \mu_{2b} = 0.0375 \times 3 cm,
\sigma_a = 0.0375 \times 2.5 cm,  \sigma_b = 0.0375 \times 2 cm,
peak1 = 100 \times 6 \mu m,       peak2 = 100 \times 4 \mu m,
peak3 = 100 \times 8 \mu m,       peak4 = 100 \times 6 \mu m.  (15)

The fraction of melanin is set to 5% between the epidermal layer and the first surface f_1(x, y), and the fraction of blood is set to 20% between the first surface f_1(x, y) and the second surface f_2(x, y). This model has sufficient variation to verify the reliability of the reconstruction algorithm. Figure 3(A) displays a 3D view of this model.
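Equations (13)-(15) can be evaluated directly on a grid; the sketch below follows them literally. The grid extent and the µm-to-cm conversion of the peak values are our own assumptions for illustration.

```python
import numpy as np

def gauss2d(x, y, mu1, mu2, sigma):
    """Isotropic 2D Gaussian of Eq. (14)."""
    r2 = (x - mu1) ** 2 + (y - mu2) ** 2
    return np.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

# Parameters from Eq. (15); lengths in cm, peaks converted from um.
mu1a, mu2a = 0.0375 * 2, -0.0375 * 3
mu1b, mu2b = -0.0375 * 2, 0.0375 * 3
sa, sb = 0.0375 * 2.5, 0.0375 * 2
um = 1e-4                      # 1 micrometer in cm (assumed unit)
p1, p2 = 100 * 6 * um, 100 * 4 * um
p3, p4 = 100 * 8 * um, 100 * 6 * um

# Evaluate the two phantom surfaces of Eq. (13) on a grid.
x, y = np.meshgrid(np.linspace(-0.3, 0.3, 65),
                   np.linspace(-0.3, 0.3, 65))
f1 = np.maximum(p1 * gauss2d(x, y, mu1a, mu2a, sa),
                p2 * gauss2d(x, y, mu1b, mu2b, sb))
f2 = np.maximum(p3 * gauss2d(x, y, mu1a, mu2a, sa),
                p4 * gauss2d(x, y, mu1b, mu2b, sb))
blood_thickness = f2 - f1      # z_d(x, y): blood layer thickness
```

Because peak3 >= peak1 and peak4 >= peak2, the second surface never rises above the first, so the blood-layer thickness z_d is nonnegative everywhere.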
To further evaluate the reconstruction result, we introduce the volume deviations Volerr1 and Volerr2, defined as:

Volerr1 = \frac{| vol1_c - vol1_m |}{vol1_m},  (16)

where vol1_c is the calculated volume bounded by the first surface and vol1_m is the corresponding volume from the model, and

Volerr2 = \frac{| vol2_c - vol2_m |}{vol2_m},  (17)

where vol2_c is the calculated volume bounded by the first and second surfaces and vol2_m is the corresponding volume from the model.
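On a sampled grid, the volume integrals in Eqs. (16) and (17) reduce to sums of depth values times the grid-cell area. A short sketch, with hypothetical grid numbers:

```python
import numpy as np

def volume(depth_map, cell_area):
    """Volume under a sampled depth map: sum of depth x cell area."""
    return float(np.sum(depth_map) * cell_area)

def volerr(vol_calc, vol_model):
    """Relative volume deviation of Eqs. (16)-(17)."""
    return abs(vol_calc - vol_model) / vol_model

# Toy check on a 9 x 9 grid with 0.1 cm cells (invented numbers).
cell = 0.1 * 0.1
model = np.full((9, 9), 0.20)    # uniform 2 mm deep model surface
recon = np.full((9, 9), 0.19)    # reconstruction 5% shallower
err = volerr(volume(recon, cell), volume(model, cell))
```

Volerr1 would use the depth map of the first surface; Volerr2 would use the blood-thickness map z_d between the two surfaces.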
16.4 RESULTS AND DISCUSSIONS
Fig. 3. Reconstruction results: (A) Double-surface model; (B–E) reconstructed surfaces with different constraints.

We select 580 nm and 800 nm to validate the reconstruction algorithm, since at these two wavelengths the absorptions of oxy- and deoxyhemoglobin are equivalent. In addition, the absorptions of melanin and blood at the two wavelengths differ considerably, which provides excellent constraints on the solution. First of all, the double-surface continuous model is sampled and two "real" measurements, M_{580} and M_{800}, are calculated by Monte Carlo simulation at 580 nm and 800 nm, respectively. Next, a nine-by-nine rectangular grid is overlaid on the epidermal layer. The region of support of the lesion is covered by 10 discrete control points. As a result, the chromosome contains 22 genes and is coded as:
f_{d1}(X_1, Y_1) - f_{d1}(X_2, Y_2) - \cdots - f_{d1}(X_{10}, Y_{10}) - mp1 - z_d(X_1, Y_1) - z_d(X_2, Y_2) - \cdots - z_d(X_{10}, Y_{10}) - bp2.  (18)
The fitness function of the genetic algorithm is given as:

F_{obj} = \alpha_1 F_{obj}^{580} + \alpha_2 F_{obj}^{800},  (19)

where in our simulation \alpha_1 = 0.3 and \alpha_2 = 1.
Four simulations with different constraints were implemented with a real-number-coded genetic algorithm, and the results are summarized in Table 1. The recovered surfaces are displayed in Figs. 3(B–E); in each case, the left is the first surface and the right is the second surface. There is no constraint on the first surface, while the thickness between the first surface and the second surface is set to 300 µm to represent a thin layer of the blood net. In addition, the deformation process of the surfaces during optimization is shown in Fig. 4.
As Table 1 shows, the constraints have a significant impact on the
reconstructed surfaces. Except for the most loosely constrained case
(E), all cases yield reasonable reconstructions that are consistent
with the model. Moreover, the reconstructed first surface has a
smaller volume error than the second surface. Several factors explain
the larger volume error of the second surface. First, the absorption
coefficient of blood is smaller than that of melanin; as a result,
changes in the blood region contribute less to the fitness function.
Second, since a reflectance geometry is adopted in the Nevoscope, the
sensitivity decreases at deeper layers. This also influences
Table 1. Summary of Reconstruction Results

          Melanin    Blood      Recovered   Recovered
          Bounds     Bounds     Melanin     Blood       Vol. err 1   Vol. err 2
  Case    (%)        (%)        (%)         (%)         (%)          (%)
  (b)     5–5        20–20      5           20           2.68        16.58
  (c)     4.5–5.5    10–30      5.10        18.33        3.81        20.89
  (d)     4–6        10–30      4.64        18.97        5.18        24.71
  (e)     3–7        10–30      3.37        10.08       44.09        63.50
Fig. 4. Deformation process during optimization (from left to right and from top
to bottom): (A) the first surface; (B) the second surface.
the accurate reconstruction of the second surface. Third, because
the two surfaces are attached to each other, error from the first
surface inevitably propagates to the second surface. In the worst
case (E), a large error is observed: the melanin and blood fractions
are underestimated, which is associated with overestimated volumes.
It is nevertheless still a reasonable result for the optimization
problem.
16.5 CONCLUSION

A shape-based reconstruction method using a genetic algorithm has
been presented in this chapter. Though the reconstruction algorithm
has been described for optical images of skin lesions for the
detection of malignant melanoma, the shape-based image reconstruction
framework can be applied to other optical imaging applications.
16.6 ACKNOWLEDGMENTS

This research was partially funded by grants from the George A Ohl Jr
Trust Foundation and the Gustavus and Louise Pfeiffer Research
Foundation. The work presented in this report is also a part of the
doctoral dissertation work of Song Wang, with Dr Atam Dhawan as his
Ph.D. advisor.
References

1. Abramovits W, Stevenson LC, Changing paradigms in dermatology:
   New ways to examine the skin using noninvasive imaging methods,
   Clinics in Dermatol 21: 353–358, 2003.
2. Bashkatov AN, Genina EA, et al., Optical properties of human skin,
   subcutaneous and mucous tissues in the wavelength range from 400
   to 2000 nm, Journal of Physics D: Applied Physics 38: 2543–2555, 2005.
3. Dhawan AP, Gordon R, Rangayyan RM, Nevoscopy: Three-dimensional
   computed tomography for nevi and melanoma by transillumination,
   IEEE Trans on Medical Imaging MI-3(2): 54–61, 1984.
4. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and
   classification of melanoma using fuzzy membership based partitions,
   Computerized Medical Imaging and Graphics 29: 287–296, 2005.
5. Kilmer ME, Miller EL, Boas D, et al., A shape-based reconstruction
   technique for DPDW data, Optics Express 7(13): 481–491, 2000.
6. Babaeizadeh S, Brooks DH, Isaacson D, A deformable-radius B-spline
   method for shape-based inverse problems, as applied to electrical
   impedance tomography, Proc IEEE Int Conf on Acoustics, Speech, and
   Signal Processing (ICASSP '05), 2005.
7. Kilmer ME, Miller EL, et al., Three-dimensional shape-based
   imaging of absorption perturbation for diffuse optical tomography,
   Applied Optics 42(16): 3129–3144, 2003.
8. Zacharopoulos A, Arridge S, Dorn O, et al., 3D shape reconstruction in
   optical tomography using spherical harmonics and BEM, Progress in
   Electromagnetics Research Symposium, 2006.
9. Houck C, Joines J, Kay M, A Genetic Algorithm for Function
   Optimization: A Matlab Implementation, NCSU-IE TR 95-09, 1995.
10. Balch CM, et al., Prognostic factors analysis of 17,600 melanoma
    patients: Validation of the American Joint Committee on Cancer
    melanoma staging system, Journal of Clinical Oncology 19(16): 3622–
    3634, 2001.
11. Marchesini R, et al., Optical imaging and automated melanoma detec-
    tion: Questions and answers, Melanoma Research 12: 279–286, 2002.
12. Ganster H, Pinz A, Kittler H, et al., Computer aided recognition of
    pigmented skin lesions, Melanoma Research 7: 1997.
13. Seidenari S, et al., Digital videomicroscopy and image analysis with
    automatic classification for detection of thin melanomas, Melanoma
    Research 9(2): 163–171, 1999.
14. Menzies S, Crook B, McCarthy W, et al., Automated instrumentation
    and diagnosis of invasive melanoma, Melanoma Research 7: 1997.
15. Claridge E, Cotton S, et al., From color to tissue histology: Physics-
    based interpretation of images of pigmented skin lesion, Medical Image
    Analysis, 489–502, 2003.
16. Claridge E, Preece SJ, An inverse method for recovery of tissue param-
    eters from colour images, Information Processing in Medical Imaging,
    Springer, Berlin, LNCS 2732, pp. 306–317, 2003.
17. Churmakov DY, et al., Analysis of skin tissues spatial fluorescence dis-
    tribution by the Monte Carlo simulation, J Phys D: Applied Phys 36:
    1722–1728, 2003.
18. Chang J, Graber HL, Barbour RL, Imaging of fluorescence in highly
    scattering media, IEEE Trans on Biomedical Engineering 44(9): 810–822,
    1997.
19. Fercher AF, et al., Optical coherence tomography: Principles and
    applications, Rep Prog Phys 66: 239–303, 2003.
20. Tomatis S, et al., Automated melanoma detection: Multispectral imag-
    ing and neural network approach for classification, Med Phys 30(2):
    212–221, 2003.
21. Tomatis S, Bartoli C, et al., Spectrophotometric imaging of subcuta-
    neous pigmented lesion: Discriminant analysis, optical properties and
    histological characteristics, J Photochem Photobiol 42: 32–39, 1998.
22. Young AR, Chromophores in human skin, Physics in Medicine and
    Biology 42: 789–802, 1997.
CHAPTER 17
Multimodality Image Registration
and Fusion
Pat Zanzonico
Imaging has long been a vital component of clinical medicine and, more
recently, of biomedical research in small animals. In addition, image
registration and fusion have become increasingly important components of
both clinical and laboratory (i.e. small-animal) imaging and have led
to the development of a variety of pertinent software and hardware tools,
including multimodality devices, e.g. PET-CT, which "automatically"
provide registered and fused three-dimensional (3D) image sets. This
chapter is a brief, largely nonmathematical review of the basics of image
registration and fusion and of software and hardware approaches to 3D
image alignment, including mutual information algorithms and
multimodality devices.
17.1 INTRODUCTION

Since the discovery of X-rays, imaging has been a vital component
of clinical medicine. Increasingly, in vivo imaging of small
laboratory animals, i.e. mice and rats, has emerged as an important
component of basic biomedical research. Historically, clinical
and laboratory imaging modalities have often been divided into
two general categories, structural (or anatomical) and functional (or
physiological). Anatomical modalities, i.e. those depicting primarily
morphology, include X-rays (plain radiography), CT (computed
tomography), MRI (magnetic resonance imaging), and US (ultrasound).
Functional modalities, i.e. those depicting primarily information
related to underlying metabolism, include (planar) scintigraphy, SPECT
(single-photon emission computed tomography), PET (positron
emission tomography), MRS (magnetic resonance spectroscopy),
and fMRI (functional magnetic resonance imaging). The functional
modalities form the basis of the rapidly advancing field of "molecular
imaging," defined as the direct or indirect noninvasive monitoring
and recording of the spatial and temporal distribution of in vivo
molecular, genetic, and/or cellular processes for biochemical,
biological, diagnostic, or therapeutic applications.^1
Since information derived from multiple images is often complementary,
e.g. localizing the site of an apparently abnormal metabolic
process to a pathologic structure such as a tumor, integration of
this information may be helpful and even critical. In addition to
anatomic localization of "signal" foci, image registration and fusion
provide: intra- as well as intermodality corroboration of diverse
images; more accurate and more certain diagnostic and treatment-
monitoring information; image guidance of external-beam radiation
therapy; and, potentially, more reliable internal radionuclide
dosimetry, e.g. in the form of radionuclide image-derived "isodose"
contours superimposed on images of the pertinent anatomy. The
problem, however, is that differences in image size and dynamic
range, voxel dimensions and depth, image orientation, subject position
and posture, and information quality and quantity make it difficult
to unambiguously colocate areas of interest in multiple image
sets. The objective of image registration and fusion, therefore, is (a) to
appropriately modify the format, size, position, and even shape of
one or both image sets to provide a point-to-point correspondence
between images and (b) to provide a practical integrated display
of the images thus aligned. This process entails spatial registration
of the respective images in a common coordinate system based on
optimization of some "goodness-of-alignment," or "similarity,"
criterion (or metric). This chapter is a brief, largely nonmathematical
review of the basics of image registration and fusion and of software
and hardware approaches to 3D image alignment, and presents
illustrative examples of registered and fused multimodality images
in both clinical and laboratory settings.
17.2 BACKGROUND

The image registration and fusion process^{2–5} is illustrated
diagrammatically and in general terms in Fig. 1. The first step is
reformatting of one image set (the "floating," or secondary, image)
to match that of the other image set (the reference, or primary,
image). Alternatively, both image sets may be transformed to a new,
common image format.

Fig. 1. The image registration and fusion process. See text for details.

Three-dimensional (3D), or tomographic, image sets are characterized
by: the dimensions (e.g. in mm), i.e. the length (X), width (Y), and
depth (Z), of each voxel; the image matrix, i.e. the number of rows
(X) × the number of columns (Y) × the number of tomographic images,
or "slices" (Z); and the image depth (e.g. in bytes), which defines
the dynamic range of signal displayable in each voxel (e.g. a
word-mode, i.e. one-word- or two-byte-"deep," PET image can display
up to 2^16 = 65,536 signal levels for 16-bit words). The foregoing
image parameters are provided in the image "header," a block of data
which may either be in a standalone text file associated with the
image file or be incorporated into the image file itself. Among the
image sets to be registered, either the finer matrix is reformatted
to the coarser matrix by combining voxels, or the coarser matrix is
reformatted to the finer matrix by interpolating voxels. One of the
resulting 3D image sets is then magnified or minified to yield
primary and secondary images with equal voxel dimensions. Finally,
the "deeper" image is rescaled to match the depth of the "shallower"
matrix. Usually, the higher-spatial-resolution and finer-matrix
structural (e.g. CT) image is the primary image, and the functional
(e.g. PET) image the secondary image.
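The reformatting step above can be sketched as follows; the 2× block-averaging factor, the function names, and the 8-bit target depth are assumptions chosen for illustration, not the text's actual implementation.

```python
import numpy as np

# Illustrative sketch of reformatting: resample the finer matrix to a
# coarser one by combining (averaging) voxels, then rescale the "deeper"
# image's dynamic range to a shallower depth (here 16-bit -> 8-bit).

def combine_voxels_2x(img):
    """Reformat a finer matrix to a 2x coarser one by averaging 2x2 blocks."""
    rows, cols = img.shape
    return img.reshape(rows // 2, 2, cols // 2, 2).mean(axis=(1, 3))

def rescale_depth(img, target_bits=8):
    """Rescale a 'deeper' image to a shallower dynamic range."""
    img = img.astype(np.float64)
    span = img.max() - img.min()
    if span == 0:
        return np.zeros_like(img)
    return np.round((img - img.min()) / span * (2 ** target_bits - 1))

secondary = np.arange(16, dtype=np.uint16).reshape(4, 4) * 4096
coarse = combine_voxels_2x(secondary)   # 4x4 matrix -> 2x2 matrix
display = rescale_depth(coarse)         # values now span 0..255
```

A real implementation would read the voxel dimensions and depth from the image header and also magnify or minify to equalize voxel sizes; this sketch only shows the matrix and depth steps.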
The second step in image registration is the actual transformation
[translation, rotation, and/or deformation (warping)] of the
reformatted secondary image set to spatially align it, in three
dimensions, with the primary image set.

The third and fourth steps are, respectively, the evaluation of
the accuracy of the registration of the primary and transformed
secondary images and the iterative adjustment of the secondary image
transformation until the registration (i.e. the goodness-of-alignment
metric) is optimized.

The fifth and final step is image fusion, the integrated display of
the registered images.
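The steps above can be condensed into a generic optimization loop, sketched below. The `transform`, `similarity`, and `propose_update` callables are placeholders for whichever transformation model and goodness-of-alignment metric are chosen; this shows only the control flow, not a production registration routine.

```python
# Generic registration loop: transform the secondary image, evaluate the
# similarity against the primary image, and adjust the transformation
# parameters until the metric is optimized (steps 2-4 in the text).

def register(primary, secondary, transform, similarity, propose_update,
             initial_params, n_iterations=100):
    """Iteratively adjust transform parameters to maximize similarity."""
    params = initial_params
    best_score = similarity(primary, transform(secondary, params))
    for _ in range(n_iterations):
        candidate = propose_update(params)                 # step 4: adjust
        score = similarity(primary, transform(secondary, candidate))
        if score > best_score:                             # step 3: evaluate
            params, best_score = candidate, score
    return params, best_score  # the aligned images can then be fused (step 5)
```

With a rigid model, `params` would be the three translations and three rotations; with a deformable model, it would parameterize the warp.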
17.3 PROCEDURES AND METHODS

17.3.1 "Software" versus "Hardware" Approaches
to Image Registration

In both clinical and laboratory settings, there are two practical
approaches to image registration and fusion, "software" and
"hardware" approaches. In the software approach, images are
acquired on separate devices, imported into a common image-
processing computer platform, and registered and fused using
the appropriate software. In the hardware approach, images are
acquired on a single, multimodality device and transparently
registered and fused with the manufacturer's integrated software.
Both approaches depend on software sufficiently robust to recognize
and import diverse image formats. The availability of industry-wide
standard formats, such as the ACR-NEMA DICOM standard [i.e. the
American College of Radiology (ACR) and National Electrical
Manufacturers Association (NEMA) standard for Digital Imaging and
Communications in Medicine (DICOM)],^{6–9} is therefore critical.
17.3.2 Software Approaches

17.3.2.1 Rigid versus nonrigid transformations

Software-based transformations of the secondary image set to
spatially align it with the primary image set are commonly
characterized as either "rigid" or "nonrigid."^{2–5} In a rigid
transformation, the secondary image is only translated and/or rotated
with respect to the primary image; the Euclidean distance between
any two points (i.e. voxels) within an individual image set therefore
remains constant. In nonrigid, or deformable, transformations
(commonly known as "warping"), selected subvolumes within the image
set may be expanded or contracted and/or their shapes altered;
translations and/or rotations may be performed as well. Such warping
is therefore distinct from any magnification or minification performed
in the reformatting step, where the distances between all points
change by the same relative amount. Unlike rigid transformations,
which may be either manual or automated, nonrigid transformations are
generally automated.
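A minimal sketch of a rigid transformation, shown in 2D for brevity, illustrates the distance-preservation property just described; the function names are illustrative.

```python
import math

# A 2D rigid transformation: rotation by theta radians, then translation
# by (tx, ty). Rigid transforms preserve the Euclidean distance between
# any two points, unlike warping or magnification/minification.

def rigid_transform(points, theta, tx, ty):
    """Rotate each (x, y) point by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def distance(p, q):
    """Euclidean distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])
```

In 3D the rotation becomes a 3 × 3 orthogonal matrix and the translation a 3-vector, i.e. the six-parameter transform used throughout this chapter.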
17.3.2.2 Feature- and intensity-based approaches

Registration transformations are often based on alignment of specific
landmarks visible in the image sets; this is sometimes characterized
as the "feature-based" approach.^{2–5} Such landmarks may be either
intrinsic, i.e. one or more well-defined anatomic structure(s) or the
body contour (i.e. surface outline), or extrinsic, i.e. one or more
fiducial markers placed in or around the subject. Feature-based
registration generally requires some sort of preprocessing
"segmentation" of the image sets being aligned, that is,
identification of the corresponding features (e.g. fiduciary markers)
of the image sets. Feature-based image registration algorithms may be
automated by minimization of the difference(s) in position of the
pertinent feature(s) between the image sets being aligned.
Other registration algorithms are based on analysis of voxel
intensities (e.g. counts in a PET or SPECT image) and are
characterized as "intensity-based" approaches.^{2–5} These include:
alignment of the respective "centers of mass" (e.g. of counts) and
orientations (i.e. principal axes) calculated for each image set;
minimization of absolute or sum-of-square voxel intensity differences
between the image sets; cross-correlation (i.e. maximizing the voxel
intensity correlation between the image sets); minimization of
variance (i.e. matching of identifiable homogeneous regions in the
respective image sets); and matching of voxel intensity histograms
(discussed further in the Results and Findings section).^2 Such
intensity-based approaches implicitly assume that the voxel
intensities in the images being aligned represent the same,
positively correlated parameters (e.g. counts) and thus are directly
applicable only to intramodality image registration.
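Two of the intensity-based measures listed above can be sketched directly for same-modality images already on a common grid; sum-of-squares is minimized during alignment, while the correlation is maximized.

```python
import numpy as np

# Two intensity-based similarity measures for images on a common grid:
# the sum of squared voxel-by-voxel differences (to be minimized) and
# the Pearson correlation of voxel intensities (to be maximized).

def sum_of_squares(a, b):
    """Sum of squared voxel-by-voxel intensity differences."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def intensity_correlation(a, b):
    """Pearson correlation between the voxel intensities of two images."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
```

As the text notes, both assume positively correlated intensities, so they suit intramodality registration (e.g. PET to PET) rather than, say, PET to CT.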
17.3.2.3 Mutual information

A relatively new but already widely used automated registration
algorithm is based on the statistical concept of mutual
information,^{3,10} also known as transinformation or relative
entropy. The mutual information of two random variables A and B is a
quantity that measures the statistical dependence of the two
variables, that is, the amount of information that one variable
contains about the other. Mutual information measures the information
about A that is shared by B. If A and B are independent, then A
contains no information about B and vice versa, and their mutual
information is therefore zero. Conversely, if A and B are identical,
then all information conveyed by A is shared with B and their mutual
information is maximized. Accurate spatial registration of two such
image sets thus results in the maximization of their mutual
information and vice versa.
The concepts of entropy and mutual information are developed
more formally in the following. Given "events" (e.g. grey-scale
values) e_1, e_2, ..., e_n with probabilities (i.e. frequencies of
occurrence) p_1, p_2, ..., p_n in an image set, the entropy
(specifically, the so-called "Shannon entropy") H is defined as
follows^3:

    H ≡ Σ_{i=1}^{n} p_i log(1/p_i)        (1a)
      = −Σ_{i=1}^{n} p_i log p_i.         (1b)
The term log(1/p_i) indicates that the amount of information provided
by an event is inversely related to the probability (i.e. frequency)
of that event: the less frequent an event, the more significant is its
occurrence. The information per event is thus weighted by the
frequency of its occurrence. The uniform "background" (e_BG) occupies
a large portion of a CT image (i.e. p_BG is large), for example, and
therefore contributes relatively little information (i.e. log(1/p_BG)
is small); it would not contribute substantially to accurate alignment
with an MR image. The Shannon entropy is also a measure of the
uncertainty of an event. When all events (e.g. all grey-scale values
in an image) are equally likely to occur (as in a highly heterogeneous
image), the entropy is maximal.^a When an event or a range of events
is more likely to occur (as in a uniform image), the entropy is
minimal. Additionally, the entropy is a measure of the dispersion of
an image's probability distribution (i.e. the probability of a
grey-scale value versus the grey-scale values): a highly heterogeneous
image has a broad dispersion and a high entropy, while a uniform image
has no dispersion and minimal entropy. Entropy thus has several
interpretations: the information content per event (e.g. grey-scale
value), the uncertainty per event, and the statistical dispersion of
events in an image.

^a The analogy between signal entropy, used in the context of mutual
information, and thermodynamic entropy thus becomes clear.
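Equation (1) can be transcribed directly for an image's grey-scale values; the base-2 logarithm below is an arbitrary choice of units (bits), since the text leaves the base unspecified.

```python
import math

# Shannon entropy of an image, Eq. (1): H = -sum_i p_i log p_i over the
# frequencies p_i of its grey-scale values. A uniform image has zero
# entropy; many equally likely values give maximal entropy.

def shannon_entropy(pixels):
    """Entropy (in bits) of the grey-scale value distribution."""
    counts = {}
    for v in pixels:
        counts[v] = counts.get(v, 0) + 1
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

For example, a "flat" region of identical values contributes no information, mirroring the CT-background remark above.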
For two images A and B, the mutual information MI(A,B) may be defined
as follows^{b,3}:

    MI(A,B) ≡ H(B) − H(B|A).        (2)

H(B) is the Shannon entropy of image B (derived from the probability
distribution of its grey-scale values), and H(B|A) is the conditional
entropy of image B with respect to image A [derived from the
conditional probabilities p(b|a), the probability of grey-scale value
b occurring in image B given that grey-scale value a occurs in the
corresponding voxel in image A]. When interpreting entropy in terms
of uncertainty, MI(A,B) thus corresponds to the uncertainty in image
B minus the uncertainty in image B when image A is known. Intuitively,
therefore, MI(A,B), the image-B information contained in image A, is
the amount by which the uncertainty in image B decreases when image A
is given. Because images A and B can be interchanged, MI(A,B) is also
the information image B contains about image A, and it is therefore
mutual information. Registration thus corresponds to maximizing
mutual information: the amount of information the images have about
each other is maximized when, and only when, they are aligned. If a
subject is imaged by two different modalities, there is presumably
considerable mutual information between the spatial distributions of
the respective signals in the two image sets, no matter how diverse
(i.e. unrelated) they may appear to be. For example, the distribution
of fluorine-18-labeled fluorodeoxyglucose (FDG) visualized in a PET
scan is, at some level, dictated by (i.e. dependent on) the
distribution of different tissue types imaged by CT.
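A sketch of mutual information computed from grey-scale frequencies, using the equivalent form MI(A,B) = H(A) + H(B) − H(A,B) (footnote b notes that several definitions exist; this one follows from Eq. (2) since H(B|A) = H(A,B) − H(A)). Inputs here are assumed to be flat lists of corresponding voxel values of equal length.

```python
import math
from collections import Counter

# Mutual information of two equally sized images from their marginal and
# joint grey-scale histograms: MI(A,B) = H(A) + H(B) - H(A,B).

def entropy(events):
    """Shannon entropy (bits) of a list of events."""
    n = len(events)
    return -sum((c / n) * math.log2(c / n) for c in Counter(events).values())

def mutual_information(a, b):
    """MI between two images given as flat lists of voxel values."""
    joint = list(zip(a, b))  # one (a_i, b_i) "event" per voxel pair
    return entropy(a) + entropy(b) - entropy(joint)
```

Identical aligned images give MI equal to the image entropy itself; statistically independent images give MI of zero, matching the discussion above.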
17.3.2.4 Goodness-of-alignment metrics

Regardless of the algorithm employed, the evaluation and adjustment
of the registration requires some metric of its accuracy. It may
be as simple as visual (i.e. qualitative) inspection of the aligned
images and a judgment by the operator that the registration is
or is not "acceptable." A more objective, and ideally quantitative,
evaluation of the accuracy of the registration is, of course,
preferred. One goodness-of-alignment metric, for example, is the sum
of the Euclidean distances between corresponding fiduciary markers
(or anatomic landmarks) in the two image sets; the optimum
alignment corresponds to the transformation yielding the minimum
sum of distances. Another similarity metric, as discussed above, is
the mutual information: when the mutual information between the two
image sets is maximized, they are optimally aligned.

^b In information theory, there are actually a number of different
definitions of mutual information.
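The fiducial-marker metric just described can be written in a few lines; coordinates are assumed to be (x, y, z) tuples with the markers of the two image sets listed in corresponding order.

```python
import math

# Goodness-of-alignment metric: the sum of Euclidean distances between
# corresponding fiduciary markers (or anatomic landmarks) in the two
# image sets. The optimum alignment minimizes this sum.

def fiducial_misalignment(markers_a, markers_b):
    """Sum of distances between corresponding fiducial markers."""
    return sum(math.dist(p, q) for p, q in zip(markers_a, markers_b))
```

A registration routine would re-evaluate this sum after each trial transformation of the secondary image's markers and keep the transformation that drives it toward zero.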
17.3.3 Hardware Approaches

The major manufacturers of PET and CT scanners now also market
multimodality scanners,^{11–13} combining high-performance
state-of-the-art PET and CT scanners in a single device. These
instruments provide near-perfect registration of images of in vivo
function (PET) and anatomy (CT) using a measured, and presumably
fixed, rigid transformation between the image sets. These devices have
already had a major impact on clinical practice, particularly in
oncology, and PET-CT devices are currently outselling "PET-only"
systems by a two-to-one ratio.^{14} Although generally encased in a
single seamless housing, the PET and CT gantries in such multimodality
devices are separate; the respective fields of view are separated by a
distance of the order of 1 m, and the PET and CT scans are performed
sequentially (Figs. 2 and 3). In one such device (Gemini, Philips
Medical), the PET and CT gantries are actually in separate housings
with an adjustable separation (up to ∼1 m) between them; this not only
provides access to patients but also may minimize anxiety among
claustrophobic subjects (Fig. 4).
In addition to PET-CT scanners, SPECT-CT scanners are now
commercially available.

Fig. 2. Schematic diagram (side view) of a commercially available clinical
PET-CT scanner. From Ref. 11 by permission of the authors. Inset: Photo of the
PET-CT scanner in the diagram, the Biograph™ (Siemens-CTI).

The design of SPECT-CT scanners is similar to that of PET-CT scanners
in that the SPECT and CT gantries are separate and the SPECT and CT
scans are acquired sequentially, not simultaneously. In such devices,
the separation of the SPECT and CT scanners is more apparent (Fig. 5)
because the rotational and other motions of the SPECT detectors
effectively preclude encasing them in a housing with the CT scanner.
Multimodality imaging devices for small animals (i.e. rodents),
including PET-CT, SPECT-CT, and even SPECT-PET-CT devices, are now
commercially available as well (Fig. 6).
Multimodality devices simplify image registration and fusion,
conceptually as well as logistically, by taking advantage of the
fixed geometric arrangement between the PET and CT scanners or
the SPECT and CT scanners in such devices.

Fig. 3. A typical imaging protocol for a combined PET-CT study: (A) the
topogram, or scout CT scan, for positioning; (B) the CT scan; (C) generation of
CT-based attenuation correction factors; (D) the PET scan over the same
longitudinal range of the patient as the CT scan; (E) reconstruction of the
attenuation-corrected PET emission data; (F) the attenuation-corrected PET
images; and (G) display of the final fused PET-CT images. From Ref. 13 by
permission of the authors.

Further, because the time interval between the sequential scans is
short (i.e. a matter of minutes) and the subject remains in place, it
is unlikely that subject geometry will change significantly between
the PET or SPECT scan and the CT scan. Accordingly, a rigid
transformation matrix (i.e. translations and rotations in three
dimensions) can be used to align the PET or SPECT and the CT image
sets. This matrix can be measured using a "phantom," i.e. an inanimate
object with PET- or SPECT- and CT-visible landmarks arranged in a
well-defined geometry. The transformation matrix required to align
these landmarks can then be stored and used to automatically register
all subsequent multimodality studies, since the device's geometry, and
therefore this matrix, should be fixed.
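One way the phantom calibration described above could be computed is sketched below: given matched marker coordinates in the two frames, a least-squares rigid fit recovers the rotation R and translation t mapping one frame into the other. The Kabsch/Procrustes algorithm used here is an assumed choice for illustration; the text does not specify the fitting method.

```python
import numpy as np

# Fit the fixed rigid transform from phantom landmarks: find R, t such
# that ct ≈ R @ pet + t in the least-squares sense (Kabsch algorithm).
# A scanner would store (R, t) and apply it to all subsequent studies.

def fit_rigid(pet_pts, ct_pts):
    """Least-squares rigid transform between matched landmark sets."""
    P = np.asarray(pet_pts, float)
    C = np.asarray(ct_pts, float)
    p0, c0 = P.mean(axis=0), C.mean(axis=0)
    H = (P - p0).T @ (C - c0)                 # cross-covariance of landmarks
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against a reflection
    D = np.diag([1.0] * (H.shape[0] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = c0 - R @ p0
    return R, t
```

At least three non-collinear markers are needed; with more, the least-squares fit averages out marker-localization noise.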
17.3.4 Image Fusion

Image fusion may be as simple as the simultaneous display of images
in a juxtaposed format. A more common, and more useful, format is
an overlay of the registered images, where one image is displayed
in one color table and the second image in a different color table.
Typically, the intensities of the respective color tables as well as
the "mixture" of the two overlaid images can be adjusted. Adjustment
(e.g. with a slider) of the mixture allows the operator to
interactively vary the overlay so that the designated screen area
displays only the first image, only the second image, or some
weighted combination of the two images, each in its respective color
table.

Fig. 4. Photo of a commercially available clinical PET-CT scanner, the
Gemini™ (Philips Medical), which allows variable separation of the PET and the
CT subsystems.
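An illustrative sketch of the overlay display: anatomy in a grey color table, function in a crude "hot" ramp, blended with a slider-style mixture weight. The color tables here are simplified stand-ins for a real viewer's lookup tables.

```python
import numpy as np

# Overlay fusion: each registered image is mapped through its own color
# table, then the two RGB images are blended by a mixture weight, as a
# display slider would do. Inputs are 2D arrays normalized to [0, 1].

def grey_table(img):
    """Map a [0, 1] image to RGB grey levels."""
    return np.stack([img, img, img], axis=-1)

def hot_table(img):
    """Crude hot-iron ramp: black -> red -> yellow over [0, 1]."""
    r = np.clip(2 * img, 0, 1)
    g = np.clip(2 * img - 1, 0, 1)
    return np.stack([r, g, np.zeros_like(img)], axis=-1)

def fuse(anatomical, functional, mixture=0.5):
    """Weighted overlay: mixture=0 shows anatomy only, 1 function only."""
    return ((1 - mixture) * grey_table(anatomical)
            + mixture * hot_table(functional))
```

Setting `mixture` to the extremes reproduces the two source images; intermediate values give the weighted combination the text describes.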
17.4 RESULTS AND FINDINGS

17.4.1 Software Approaches to Image Registration

17.4.1.1 Feature-based approach: Extrinsic fiduciary markers

Comparative imaging of multiple radiotracers in the same subject
can be invaluable in elucidating and validating their respective
mechanisms of localization. Comparative imaging of PET tracers,
particularly in small animals, is problematic, however: such
tracers must be administered and imaged separately, because
simultaneously imaged positron emitters cannot be separated based on
energy discrimination. In one such study (Fig. 7),^{15} the
intratumoral distributions of sequentially administered F-18 FDG and
the hypoxia tracer F-18 fluoromisonidazole (FMiso) were compared in
rats by registered R4 microPET™ imaging, with positioning of each
animal in a custom-fabricated whole-body mold. Custom-manufactured
germanium-68 rods were reproducibly positioned in the mold as
external fiduciary markers. The registered microPET™ images
unambiguously demonstrate grossly similar though not identical
distributions of FDG and FMiso in the tumors: a high-activity rim
surrounding a lower-activity core. However, there were subtle but
possibly significant differences in the intratumoral distributions of
FDG and FMiso, and these may not have been discerned without careful
image registration.

Fig. 5. Photo of a commercially available clinical SPECT-CT scanner, the
Precedence™ (Philips Medical).
Fig. 6. Photos of two commercially available laboratory (i.e. rodent) SPECT-CT
scanners: (A) the X-SPECT™ (Gamma Medica); (B) the Inveon™ (Siemens
Preclinical Solutions), which allows the detachment and separate use of the CT
and the PET subsystems.

17.4.1.2 Intensity-based approach: Minimization
of intensity differences

As illustrated in Fig. 8, showing sequential PET brain images of the
same patient,^2 misalignment of the image sets produces visualizable
structure in the difference images (the bottom row of Fig. 8(A)),
i.e. the voxel-by-voxel intensity differences are not zero. In
contrast, accurate registration yields difference images whose
voxel-by-voxel intensity differences are equal to zero within
statistical uncertainty (i.e. "noise") and which therefore show an
absence of visualizable structure (bottom row of Fig. 8(B)).
17.4.1.3 Intensity-based approach: Matching of voxel
intensity histograms

Fig. 7. (A) Left panel: An anesthetized tumor-bearing rat in its
custom-fabricated mold (RapidFoam™, Soule Medical) for immobilization and
reproducible positioning for repeat and/or intermodality imaging studies.
Right panel: Three custom-manufactured ^{68}Ge fiduciary markers (10 µCi each,
1 × 10 mm) (Sanders Medical Products) reproducibly inserted into the mold and
used as extrinsic fiduciary markers for software registration of serial
microPET™ images. (B) The appearance of the ^{68}Ge markers on overlaid F-18
FDG and FMiso transverse-section microPET™ images before and after
registration based on the rigid transform consisting of translations x, y, and
z and rotations θ_x, θ_y, and θ_z. (C) Registered and fused ^{18}F-FDG (grey
scale) and FMiso (hot iron) transverse, coronal, and sagittal microPET™
images; the sagittal views are through a R3327-AT rat prostate tumor xenograft
in the animal's right hindlimb. Discordant areas of FDG and FMiso uptake are
indicated by the white arrows for the R3327-AT tumor and by the yellow arrows
for a FaDu human squamous cell carcinoma tumor xenograft. Both tumors,
20 mm × 20 mm × 30 mm in size, were significantly hypoxic. From Ref. 15 by
permission of the authors.

For two image sets A and B, a 2D joint histogram (also known as
the "feature space") (Fig. 9)^2 can be constructed by plotting, for
each combination of intensity a in image A and intensity b in image
B, the point (a, b) whose darkness or lightness reflects the number
of occurrences of the combination of intensities a and b. Thus, a
darker point in the joint histogram indicates a larger number, and a
lighter point a smaller number, of occurrences of the combination
(a, b). When two identical image sets are aligned (matched), all
voxels coincide and the plot in the voxel intensity histogram is the
line of identity (i.e. a = b for all voxels). As one of the image sets
January 22, 2008 12:3 WSPC/SPIB540:Principles and Recent Advances ch17 FA
428
Pat Zanzonico
Fig. 8. Intramodality image registration based on minimization of voxel intensity
differences. (A) Selected brain images of sequential misaligned (i.e. nonregistered)
PET studies of the same patient, with the sectionbysection difference images in
the bottom row. (B) The same image sets as in (A), now aligned by minimization of
the voxelbyvoxel intensity differences. From Ref. 2 by permission of the authors.
Fig. 9. Intramodality image registration based on matching of voxel intensity histograms. The joint intensity histograms of a transverse-section brain MR image with itself when the two image sets are originally matched (i.e. aligned) and when misaligned by counterclockwise rotations of 10° and 20°, respectively. See text for details. Adapted from Ref. 2 by permission of the authors.
Fig. 10. An intermodality (CT and MR) joint intensity histogram. The featureless (i.e. uniform) area corresponding to brain tissue in the transverse-section head CT image (left panel), in contrast to the anatomic detail in the corresponding area of the MR image (middle panel), yields a distinct vertical cluster (arrow) in the CT-MR joint histogram (right panel). Adapted from Ref. 3 by permission of the authors.
is rotated relative to the other (by 10° and then by 20°), for example, the joint histogram becomes increasingly blurred (i.e. dispersed) (Fig. 9). Alignment of the images can therefore be achieved by minimizing the dispersion in the joint intensity histogram. Like other intensity-based approaches, this approach is most readily adaptable to similar (i.e. intramodality) image sets but in principle can be applied to dissimilar (i.e. intermodality) images by appropriate mapping of one image intensity scale to the other intensity scale (Fig. 10).³
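The joint intensity histogram and its dispersion under misalignment are straightforward to compute. The following numpy sketch is illustrative only: the random test image, bin count, and use of a simple 90° rotation as a stand-in for misalignment are all arbitrary assumptions.

```python
import numpy as np

def joint_histogram(img_a, img_b, bins=64):
    """Joint intensity histogram of two equally sized images.

    Entry H[i, j] counts voxels whose intensity falls in bin i of
    image A and bin j of image B.
    """
    h, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    return h

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(float)

# Identical, aligned images: all mass lies on the line of identity a = b.
h_same = joint_histogram(img, img)
assert np.count_nonzero(h_same - np.diag(np.diag(h_same))) == 0

# A rotated copy disperses the histogram off the diagonal.
h_rot = joint_histogram(img, np.rot90(img))
off_diag = h_rot - np.diag(np.diag(h_rot))
assert off_diag.sum() > 0
```

Minimizing a scalar measure of this off-diagonal dispersion over candidate transformations is the registration strategy described above.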
17.4.1.4 Mutual information

As illustrated in Fig. 11³ for registration of a brain MR image with itself, the joint histogram of two images changes as the alignment of the images changes. When the images are registered, corresponding signal foci overlap and the joint histogram will show certain clusters of grey scale values. As the images become increasingly misaligned (illustrated in Fig. 11 with rotations of 2°, 5°, and then 10° of the brain MRI relative to the original image), signal foci will increasingly overlap with foci that are not their respective counterparts on the original image. Consequently, the cluster intensities for
Fig. 11. Effect of misregistration on joint intensity histograms and mutual information (MI) between a transverse-section brain MR image (top row) and itself. Shown are the joint intensity histograms and mutual information (MI) (middle row) when the two image sets are originally matched (i.e. aligned) and when misaligned by clockwise rotations of 2°, 5°, and 10°, respectively (bottom row). See text for details. Adapted from Ref. 3 by permission of the authors.
corresponding signal foci (e.g. skull and skull, brain and brain, etc.) will decrease and new non-corresponding combinations of grey scale values (e.g. of skull and brain) will appear. The joint histogram will thus become more dispersed; as described above, minimization of this dispersion is the basis of certain intensity-based registration algorithms. At the same time, the mutual information (MI) (see Eqs. 1 and 2), which is maximized when the two images are aligned, will decrease. However, unlike other intensity-based approaches, no assumptions are made in the MI approach regarding the nature of the relationship between image intensities (e.g. a positive or a negative correlation). MI is thus a completely general goodness-of-alignment metric and can be applied to inter- as well as intramodality registration, automatically and without prior segmentation.
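A histogram-based MI estimate makes this behavior concrete. The sketch below is illustrative only (the bin count and random test image are arbitrary assumptions, not the implementation used in the cited studies); it shows MI dropping when one image is misaligned relative to the other.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information estimated from the joint intensity histogram:
    MI = sum_ab p(a,b) * log( p(a,b) / (p(a) p(b)) )."""
    h, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = h / h.sum()               # joint probability
    p_a = p_ab.sum(axis=1)           # marginal of image A
    p_b = p_ab.sum(axis=0)           # marginal of image B
    nz = p_ab > 0                    # avoid log(0)
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz])))

rng = np.random.default_rng(1)
img = rng.normal(size=(64, 64))

mi_aligned = mutual_information(img, img)          # perfectly aligned
mi_rotated = mutual_information(img, np.rot90(img))  # misaligned stand-in
assert mi_aligned > mi_rotated  # MI decreases with misalignment
```

A registration algorithm would search over candidate transformations for the one that maximizes this quantity.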
17.4.2 Hardware Approaches to Image Registration

Multimodality devices simplify image registration and fusion — conceptually as well as logistically — by taking advantage of the fixed geometric arrangement between the PET and CT scanners or the SPECT and CT scanners in such devices. Further, because the time interval between the sequential scans is short (i.e. a matter of minutes), it is unlikely that a subject's geometry will change significantly between the PET or SPECT scan and the CT scan. Accordingly, a rigid transformation matrix (i.e. translations and rotations in three dimensions) can be used to align the PET or SPECT and the CT image sets. This matrix can be measured using a "phantom," i.e. an inanimate object with PET- or SPECT- and CT-visible landmarks arranged in a well defined geometry. The transformation matrix required to align these landmarks can then be stored and used to automatically register all subsequent multimodality studies, since the device's mechanics, and therefore this matrix, are presumably fixed.
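Such a stored calibration is just a 4 × 4 homogeneous rigid transform applied to every subsequent study. The sketch below illustrates the idea; the translation and rotation values and the landmark coordinates are made-up numbers for demonstration, not measured calibration data.

```python
import numpy as np

def rigid_matrix(tx, ty, tz, ax, ay, az):
    """4x4 homogeneous rigid transform: rotations about x, y, z
    (angles in radians) followed by a translation."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    m = np.eye(4)
    m[:3, :3] = rz @ ry @ rx
    m[:3, 3] = [tx, ty, tz]
    return m

# Hypothetical phantom landmarks (homogeneous coordinates, mm) and a
# hypothetical calibration transform measured once from the phantom.
calib = rigid_matrix(2.0, -1.5, 0.5, 0.0, 0.0, np.deg2rad(10))
pts = np.array([[0, 0, 0, 1], [10, 0, 0, 1], [0, 10, 0, 1]], float)
aligned = (calib @ pts.T).T

# Rigid transforms preserve distances between landmarks.
d_before = np.linalg.norm(pts[0, :3] - pts[1, :3])
d_after = np.linalg.norm(aligned[0, :3] - aligned[1, :3])
assert np.isclose(d_before, d_after)
```

Because the same matrix is reused for every study, the per-study registration cost is a single matrix multiplication per voxel or landmark.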
To illustrate the utility of registered and fused multimodality imaging studies in both clinical and laboratory settings, examples are presented in Figs. 12¹⁶ and 13.
17.5 DISCUSSION AND CONCLUDING REMARKS

In practice, two basic approaches to image registration and fusion, "software" and "hardware" approaches, have been developed.

In the software approach, images are acquired on separate devices and registered and fused using the appropriate software. Rather robust and user-friendly software for image registration and fusion is now widely available. Software approaches to registration of images acquired on separate devices have been particularly successful in the brain because of the ability to reliably immobilize and position the head, the pronounced contrast between the bony skull (an intrinsic landmark) and the brain, and the lack of motion or deformation of internal structures. Outside the brain, however, software registration is more difficult because of the many degrees
Fig. 12. Registered and fused FDG-PET and CT scans of a patient with lung cancer and an adrenal gland metastasis. (A) Coronal PET images show typically increased FDG uptake in the primary lung tumor (single arrow in left panel) and in the metastasis in the left adrenal gland (double arrow in left panel) but also in the left side of the neck (arrow in right panel). (B) Transaxial PET and CT images through this neck lesion. Reading these images separately or in the juxtaposed format shown, it is difficult to definitively identify the anatomic site (i.e. tumor versus normal structure) of the focus of activity in the neck. (C) The registered and fused PET-CT images, using the fused, or overlay, display, unambiguously demonstrate that the FDG activity is located within muscle, a physiological normal variant. Because it is best visualized using the original color display, an arrow is used to identify the location of this unusual, but nonpathologic, focus of FDG activity on the fused images. Therefore, the FDG activity in the neck was not previously undetected disease, a finding which would have significantly impacted the subsequent clinical management of the patient. Adapted from Ref. 16 with permission of the authors.
Fig. 13. Registered and fused SPECT-CT images (coronal views) of a mouse with a LAN1 neuroblastoma tumor xenograft in its hindlimb (arrow). The radiotracer was iodine-125-labeled 3F8, an antibody directed against the ganglioside 2 (GD2) antigen, which is overexpressed on neuroblastomas (including LAN1). The images were acquired at two days post-injection with the XSPECT™ [Fig. 6(A)]. The CT image shows the tumor as a space-occupying structure along the contour of the animal (left panel). The specific targeting of the radiolabeled 3F8 to the GD2-expressing tumor xenograft is demonstrated by the high-contrast SPECT image (middle panel). The registered and fused SPECT-CT images, again using the fused, or overlay, display, unambiguously demonstrate that the 3F8 activity is located in the tumor, confirming that the focus of activity represents specific tumor-targeting by this antibody and not, for example, excreted activity in the urinary bladder or radioactive contamination. The images are provided courtesy of Drs Shakeel Modak and Nai-Kong Cheung, Memorial Sloan-Kettering Cancer Center.
of freedom of the torso and its internal structures when imaged at different times by different devices and with the subject in different positions. For example, depending on the variable degree of filling of the bladder with urine or the intestines with gas, pelvic and abdominal structures may be significantly displaced from one imaging study to the next. The registration process may therefore be rather time-consuming and labor-intensive.

In the hardware approach, images are acquired on a single, multimodality device and transparently registered and fused. To date, such multimodality devices have been restricted almost exclusively to PET-CT and SPECT-CT scanners. While MRI-CT scanners might have little practical advantage, since MRI and CT are both anatomic imaging modalities, PET-MRI and SPECT-MRI devices would be highly attractive. Combining PET or SPECT and MRI remains problematic, however, because the magnetic
fields proximal to an MRI scanner interfere with the scintillation detection process in all current-generation PET and SPECT scanners. Nonetheless, practical PET-MRI scanners are currently under development.¹⁷

Both intra- and intermodality image registration and fusion will no doubt become even more widely used and increasingly important in both clinical and laboratory settings.
References

1. RSNA News, RSNA, SNM Urge Interdisciplinary Cooperation to Advance Molecular Imaging, 2005.
2. Hutton BF, Braun M, Thurfjell L, et al., Image registration: An essential tool for nuclear medicine, Eur J Nucl Med 29: 559–577, 2002.
3. Maintz JBA, Viergever MA, A survey of medical image registration, Med Image Anal 2: 1–36, 1998.
4. Hajnal JV, Hill DLG, Hawkes DJ (eds.), Medical Image Registration, Boca Raton, FL, CRC Press, 2001.
5. Hill DLG, Batchelor PG, Holden M, et al., Medical image registration, Phys Med Biol 46: R1–R45, 2001.
6. American College of Radiology, National Electrical Manufacturers Association, "ACR-NEMA Digital Imaging and Communications Standard," NEMA Standards Publication No. 300-1985, Washington, DC, 1985.
7. American College of Radiology, National Electrical Manufacturers Association, "ACR-NEMA Digital Imaging and Communications Standard: Version 2.0," NEMA Standards Publication No. 300-1988, Washington, DC, 1988.
8. American College of Radiology, National Electrical Manufacturers Association, "Digital Imaging and Communications in Medicine (DICOM): Version 3.0," Draft Standard, ACR-NEMA Committee, Working Group VI, Washington, DC, 1993.
9. Mildenberger P, Eichelberg M, Martin E, Introduction to the DICOM standard, Eur Radiol 12: 920–927, 2002.
10. Viola P, Wells III WM, Alignment by maximization of mutual information, Int J Computer Vision 22: 137–154, 1997.
11. Beyer T, Townsend DW, Brun T, et al., A combined PET/CT scanner for clinical oncology, J Nucl Med 41: 1369–1379, 2000.
12. Townsend DW, Carney JPJ, Yap JT, et al., PET/CT today and tomorrow, J Nucl Med 45: 4S–14S, 2004.
13. Yap JT, Carney JPJ, Hall NC, et al., Image-guided cancer therapy using PET/CT, Cancer J 10: 221–223, 2004.
14. J Nucl Med (Newsline), PET on Display: Notes from the 59th SNM Annual Meeting, pp. 24N–26N, 2003.
15. Zanzonico P, Campa J, Polycarpe-Holman D, et al., Animal-specific positioning molds for registration of repeat imaging studies: Comparative microPET™ imaging of F18-labeled fluorodeoxyglucose and fluoromisonidazole in rodent tumors, Nucl Med Biol 33: 65–70, 2006.
16. Schoder H, Erdi Y, Larson S, et al., PET/CT: A new imaging technology in nuclear medicine, Eur J Nucl Med Mol Imaging 30: 1419–1437, 2003.
17. Catana C, Wu Y, Judenhofer MS, et al., Simultaneous acquisition of multislice PET and MR images: Initial results with a MR-compatible PET scanner, J Nucl Med 47: 1968–1976, 2006.
CHAPTER 18

Wavelet Transform and Its Applications in Medical Image Analysis

Atam P Dhawan

Recently, the wavelet transform has been found to be a very productive and efficient tool for image processing, analysis, and compression in medical applications. The wavelet transform provides complete spatio-frequency localization for medical images that may be used to remove noise, undesired features, and artifacts, or to extract useful features for image characterization and classification. This chapter provides an introduction to the wavelet transform with decomposition and reconstruction methods for medical image analysis.
18.1 INTRODUCTION

Wavelet transform has recently emerged as an efficient signal processing tool for the localization of frequency or spectral components in data. From the historical perspective of signal analysis, the Fourier transform has proved to be an extremely useful tool for decomposing a signal into constituent sinusoids of different frequencies. However, Fourier analysis suffers from the drawback of losing localization or time information when transforming information from the time domain to the frequency domain. When looking at the frequency representation of a signal, it is impossible to tell when a particular event took place. If the signal properties do not change much over time, this drawback may be ignored. However, most signals of interest do change over time or space.
An electrocardiogram signal, for example, changes over time with respect to heart beat events. Similarly, in the context of two-dimensional and three-dimensional images, a signal or a property represented by the image changes over the sampled data points in space. Fourier analysis, in general, does not provide event (frequency) information localized with respect to time (in time-series signals) or space (in images). This drawback of the Fourier transform can be somewhat addressed by using the short-time Fourier transform (STFT).¹⁻⁴ This technique adapts the Fourier transform to analyze only a small section of the signal at a time. In effect, the STFT maps a signal into separate functions of time and frequency, and provides some information about frequency localization with respect to a selected window. However, this information is obtained with limited precision determined by the size of the window. A major shortcoming of the STFT is that the window size is fixed for all frequencies once a particular size for the time window is chosen. In real applications, signals may require a variable window size in order to accurately determine event localization with respect to frequency and time or space.
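A plain windowed DFT makes the fixed-window trade-off concrete. The sketch below is a toy STFT (the Hann window, hop size, and test signal are illustrative choices): one window size fixes both the time and the frequency resolution for every frequency in the signal.

```python
import numpy as np

def stft(x, window_size, hop):
    """Short-time Fourier transform with a fixed Hann window.

    The single window_size sets the time and frequency resolution
    for all frequencies at once, which is the STFT limitation that
    the wavelet transform relaxes."""
    win = np.hanning(window_size)
    frames = [x[i:i + window_size] * win
              for i in range(0, len(x) - window_size + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

# A signal whose frequency jumps from 50 Hz to 200 Hz halfway through.
fs = 1000
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

spec = np.abs(stft(x, window_size=128, hop=64))
# The dominant frequency bin shifts between early and late frames,
# so the STFT does localize the event, but only to +-(window_size) samples.
early, late = spec[0].argmax(), spec[-1].argmax()
assert early < late
```

Narrowing the window would sharpen the time localization but blur the frequency estimate for every component equally, low and high alike.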
The wavelet transform may use long sampling intervals where low-frequency information is needed, and shorter sampling intervals where high-frequency information is available. The major advantage of the wavelet transform is its ability to perform multiresolution analysis for event localization with respect to all frequency components in data over time or space. Thus, wavelet analysis is capable of revealing aspects of data that other signal analysis techniques miss, such as breakdown points and discontinuities in higher derivatives.¹⁻⁴

Wavelet transform theory uses two major concepts: scaling and shifting. Scaling, through dilation or compression, provides a capability of analyzing a signal over different windows or sampling periods in the data, while shifting, through delay or advancement, provides translation of the wavelet kernel over the entire signal. Daubechies wavelets¹ are compactly supported orthonormal wavelets which make discrete wavelet analysis practicable. Wavelet analysis has seen numerous applications in statistics,¹⁻⁴ time series analysis,¹⁻²
and image processing.⁵⁻⁸ Generalized wavelet basis functions have been studied for image processing applications.⁶⁻⁸ Furthermore, the wavelet transform has also been used in the data mining field and other data-intensive applications because of its many favorable properties, such as vanishing moments, a hierarchical and multiresolution decomposition structure, linear time and space complexity of the transformations, decorrelated coefficients, and a wide variety of basis functions.
18.2 WAVELET TRANSFORM

Wavelet transform is the decomposition of a signal, f(t), with a family of real orthonormal bases ψ_{j,k}(t) obtained through translation and scaling of a kernel function ψ(t), known as the mother wavelet, in the Hilbert space L²(R) of square integrable functions, i.e.

    ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k);  j, k ∈ Z,   (1)

where j and k are integers representing, respectively, the scaling and shifting indices. Using the orthonormality property, the wavelet coefficients of a signal f(t) can be computed as:

    c_{j,k} = ∫_{−∞}^{+∞} f(t) ψ_{j,k}(t) dt.   (2)

The signal f(t) can be fully recovered or reconstructed from the wavelet coefficients as:

    f(t) = Σ_{j,k} c_{j,k} ψ_{j,k}(t).   (3)

To obtain the wavelet coefficients in Eq. (2), ψ_{j,k}(t), the translated and scaled versions of the mother wavelet ψ(t), are obtained using a scaling function. Using a scale resolution of multiples of two, the scaling function φ(t) can be obtained as:

    φ(t) = √2 Σ_n h_0(n) φ(2t − n).   (4)

Then, the wavelet kernel ψ(t) is related to the scaling function as:

    ψ(t) = √2 Σ_n h_1(n) φ(2t − n),   (5)

where

    h_1(n) = (−1)^n h_0(1 − n).   (6)

The coefficients h_0(n) in Eq. (4) must satisfy several conditions for the set of basis wavelet functions defined in Eq. (1) to be unique, orthonormal, and with a certain degree of regularity.¹⁻⁴
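The filter relation of Eq. (6) and the basic orthonormality conditions are easy to verify numerically. The sketch below uses the two-tap Haar low-pass filter as the simplest possible h_0; this is an illustrative choice, and higher-order Daubechies filters satisfy the same relations.

```python
import numpy as np

# Haar low-pass filter h0(n), n = 0, 1 (the simplest scaling filter).
h0 = np.array([1.0, 1.0]) / np.sqrt(2)

# Eq. (6): the high-pass filter is an alternating-sign, index-reversed
# copy of the low-pass filter: h1(n) = (-1)^n h0(1 - n).
h1 = np.array([(-1) ** n * h0[1 - n] for n in range(len(h0))])
assert np.allclose(h1, [1 / np.sqrt(2), -1 / np.sqrt(2)])

# Conditions on h0: unit norm and coefficients summing to sqrt(2).
assert np.isclose(np.dot(h0, h0), 1.0)
assert np.isclose(h0.sum(), np.sqrt(2))

# The resulting low-pass and high-pass filters are orthogonal.
assert np.isclose(np.dot(h0, h1), 0.0)
```

The same construction carries over to longer filters such as the eight-tap least asymmetric wavelet discussed later in this chapter.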
18.3 SERIES EXPANSION AND DISCRETE WAVELET TRANSFORM

Let x[n] be an arbitrary square summable sequence representing a signal in the time domain such that:

    x[n] ∈ l²(Z).   (7)

The series expansion of a discrete signal x[n] using a set of orthonormal basis functions ϕ_k[n] is given by:

    x[n] = Σ_{k∈Z} ⟨ϕ_k[l], x[l]⟩ ϕ_k[n] = Σ_{k∈Z} X[k] ϕ_k[n],

    where X[k] = ⟨ϕ_k[l], x[l]⟩ = Σ_l ϕ*_k[l] x[l],   (8)

where X[k] is the transform of x[n]. All basis functions must satisfy the orthonormality condition, i.e.

    ⟨ϕ_k[n], ϕ_l[n]⟩ = δ[k − l],  with  ‖x‖² = ‖X‖²,   (9)

where ⟨·, ·⟩ represents the inner product.

The series expansion is considered to be complete if every signal from l²(Z) can be expressed as shown in Eq. (8). Similarly, using a set of biorthogonal basis functions, the series expansion of the signal x[n] can be expressed as:

    x[n] = Σ_{k∈Z} ⟨ϕ_k[l], x[l]⟩ ϕ̃_k[n] = Σ_{k∈Z} X̃[k] ϕ̃_k[n]
         = Σ_{k∈Z} ⟨ϕ̃_k[l], x[l]⟩ ϕ_k[n] = Σ_{k∈Z} X[k] ϕ_k[n],

    where X̃[k] = ⟨ϕ_k[l], x[l]⟩ and X[k] = ⟨ϕ̃_k[l], x[l]⟩,

    and ⟨ϕ_k[n], ϕ̃_l[n]⟩ = δ[k − l].   (10)
Using quadrature mirror filter theory, the orthonormal bases ϕ_k[n] can be expressed as low-pass and high-pass filters for the decomposition and reconstruction of a signal. It can be shown that a discrete signal x[n] can be decomposed into X[k] as:

    x[n] = Σ_{k∈Z} ⟨ϕ_k[l], x[l]⟩ ϕ_k[n] = Σ_{k∈Z} X[k] ϕ_k[n],

    where
    ϕ_{2k}[n] = h_0[2k − n] = g_0[n − 2k],
    ϕ_{2k+1}[n] = h_1[2k − n] = g_1[n − 2k],

    and
    X[2k] = ⟨h_0[2k − l], x[l]⟩,
    X[2k + 1] = ⟨h_1[2k − l], x[l]⟩.   (11)

In Eq. (11), h_0 and h_1 are, respectively, the low-pass and high-pass filters for signal decomposition or analysis, and g_0 and g_1 are, respectively, the low-pass and high-pass filters for signal reconstruction or synthesis. A perfect reconstruction of the signal can be obtained if the orthonormal bases are used in the decomposition and reconstruction stages as:

    x[n] = Σ_{k∈Z} X[2k] ϕ_{2k}[n] + Σ_{k∈Z} X[2k + 1] ϕ_{2k+1}[n]
         = Σ_{k∈Z} X[2k] g_0[n − 2k] + Σ_{k∈Z} X[2k + 1] g_1[n − 2k].   (12)
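For the Haar filters, the analysis and synthesis stages of Eqs. (11) and (12) reduce to pairwise sums and differences, and perfect reconstruction can be checked directly. This is a minimal sketch (the input sequence is arbitrary, and the Haar filter stands in for the general orthonormal filter pair):

```python
import numpy as np

def haar_analyze(x):
    """One analysis stage (Eq. 11 with Haar filters): low-pass (scaled
    pairwise average) and high-pass (scaled pairwise difference) outputs,
    each subsampled by two."""
    x = np.asarray(x, float)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)
    return lo, hi

def haar_synthesize(lo, hi):
    """One synthesis stage (Eq. 12 with Haar filters): upsample and
    recombine the two bands into the original sample positions."""
    x = np.empty(2 * len(lo))
    x[0::2] = (lo + hi) / np.sqrt(2)
    x[1::2] = (lo - hi) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
lo, hi = haar_analyze(x)
assert np.allclose(haar_synthesize(lo, hi), x)  # perfect reconstruction
```

Iterating `haar_analyze` on the low-pass output yields the multilevel decomposition tree described in the next section.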
As described above, the scaling function provides the low-pass filter coefficients and the wavelet function provides the high-pass filter coefficients. A multiresolution signal representation can be constructed based on the differences of information available at two successive resolutions 2^j and 2^{j−1}. Such a representation can be computed by decomposing a signal using the wavelet transform. First, the signal is filtered using the scaling function, a low-pass filter. The filtered signal is then subsampled by keeping one out of every two samples. The result of low-pass filtering and subsampling is called the scale information. If the signal has the resolution 2^j, the scale information provides the reduced resolution 2^{j−1}. The difference of information between resolutions 2^j and 2^{j−1} is called the "detail" signal at resolution 2^j. The detail signal is obtained by filtering the signal with the wavelet, a high-pass filter, and subsampling by a factor of two.
In order to decompose an image, the above method for 1D signals is applied first along the rows of the image, and then along the columns. The image at resolution 2^{j+1}, represented by A_{j+1}, is first low-pass and high-pass filtered along the rows. The result of each filtering process is subsampled. Next, the subsampled results are low-pass and high-pass filtered along each column. The results of these filtering processes are again subsampled. The combination of filtering and subsampling processes essentially provides the bandpass information. The frequency band denoted by A_j in Fig. 1 is

Fig. 1. A three-level wavelet decomposition tree, where A means approximation and D means detail.
Fig. 2. Multiresolution decomposition of an image using the wavelet transform: the image is low-pass (H_0) and high-pass (H_1) filtered and subsampled horizontally, and each result is then filtered and subsampled vertically, yielding the low-low band A_j and the detail bands D_j^1 (low-high), D_j^2 (high-low), and D_j^3 (high-high).
referred to as the low-low frequency band. It contains the scaled low-frequency information. The frequency bands labeled D_j^1, D_j^2, and D_j^3 contain the detail information. They are referred to as the low-high, high-low, and high-high frequency bands, respectively (Fig. 2). This scheme can be applied iteratively to an image to further decompose the signal into narrower frequency bands, i.e. each frequency band can be further decomposed into four narrower bands. Since each level of decomposition reduces the resolution by a factor of two, the length of the filter limits the number of levels of decomposition (Fig. 3).
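One level of this row-then-column decomposition can be sketched with the Haar filters. This is an illustrative implementation only (the Haar filter stands in for the chapter's general filters, and the LH/HL band naming convention varies between authors):

```python
import numpy as np

def haar_pair(x):
    """One Haar filtering stage along the last axis: scaled pairwise
    averages (low) and differences (high), each subsampled by two."""
    return ((x[..., 0::2] + x[..., 1::2]) / np.sqrt(2),
            (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2))

def dwt2_level(img):
    """One 2D decomposition level: filter/subsample along rows,
    then along columns of each result."""
    lo, hi = haar_pair(img)                   # horizontal filtering
    a, d1 = haar_pair(lo.swapaxes(0, 1))      # vertical, on the low band
    d2, d3 = haar_pair(hi.swapaxes(0, 1))     # vertical, on the high band
    # A = low-low, D1 = low-high, D2 = high-low, D3 = high-high
    return (a.swapaxes(0, 1), d1.swapaxes(0, 1),
            d2.swapaxes(0, 1), d3.swapaxes(0, 1))

img = np.arange(64, dtype=float).reshape(8, 8)
a, d1, d2, d3 = dwt2_level(img)

# Each subband has half the resolution in each dimension...
assert a.shape == d1.shape == d2.shape == d3.shape == (4, 4)
# ...and the orthonormal transform conserves total energy.
total = (a**2 + d1**2 + d2**2 + d3**2).sum()
assert np.isclose(total, (img**2).sum())
```

Applying `dwt2_level` again to the low-low band `a` produces the second decomposition level of Fig. 3.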
The signal decomposition at the jth stage can thus be generalized as:

    x[n] = Σ_{j=1}^{J} Σ_{k∈Z} X^{(j)}[2k + 1] g_1^{(j)}[n − 2^j k] + Σ_{k∈Z} X^{(J)}[2k] g_0^{(J)}[n − 2^J k],

    X^{(j)}[2k] = ⟨h_0^{(j)}[2^j k − l], x[l]⟩,
    X^{(j)}[2k + 1] = ⟨h_1^{(j)}[2^j k − l], x[l]⟩.   (13)
Wavelet-based decomposition of a signal x[n] using a low-pass filter h_0[k] (obtained from the scaling function) and a high-pass filter h_1[k] is shown in Fig. 4(A), while the reconstruction of the signal from wavelet coefficients is shown in Fig. 4(B).
Fig. 3. Wavelet transform based image decomposition: the original image (N × N) is decomposed into four images, low-low A_1, low-high D_1^1, high-low D_1^2, and high-high D_1^3, each of which is subsampled to resolution N/2 × N/2. The low-low image is further decomposed into four images of N/4 × N/4 resolution each in the second level of decomposition. For a full decomposition, each of the "detail" components can also be decomposed into four subimages with N/4 × N/4 resolution each.
The "least asymmetric" wavelets were computed and reported by Daubechies.¹ Different least asymmetric wavelets were computed for different support widths, as larger support widths provide more regular wavelets, a desired property in signal and image processing. A least asymmetric wavelet is shown in Fig. 5, with the coefficients of the corresponding low-pass and high-pass filters given in Table 1.
18.4 IMAGE PROCESSING USING WAVELET TRANSFORM
The wavelet transform provides a set of coefﬁcients representing
the localized information in a number of frequency bands. A popular method for denoising and smoothing is to threshold these
Fig. 4. (A) A multiresolution signal decomposition using the wavelet transform: the input x[n] passes through successive high-pass (H_1) and low-pass (H_0) filtering and subsampling (↓2) stages, producing the detail coefficients X^(1)[2k+1], X^(2)[2k+1], X^(3)[2k+1] and the approximation coefficients X^(3)[2k]. (B) The reconstruction of the signal from the wavelet transform coefficients using upsampling (↑2) and the synthesis filters G_0 and G_1.
coefficients in those bands that have a high probability of noise and then reconstruct the image using the reconstruction filters. The reconstruction filters, as described in Eq. (12), can be derived from the decomposition filters using the quadrature mirror theory.¹⁻⁴ The reconstruction process integrates information from specific bands with successive upscaling of resolution to provide the final reconstructed image at the same resolution as the input image. If certain coefficients related to noise or noise-like information are not included in the reconstruction process, the reconstructed image shows a reduction of noise and smoothing effects. As can be seen in Fig. 3, the coefficients available in the low-high, high-low, and high-high frequency bands of the decomposition process provide edge-related information that can be emphasized in the reconstruction process for image sharpening.⁵⁻⁸ Figure 6 shows an original X-ray mammogram image that was smoothed using the wavelet shown in Fig. 5. To obtain the smoothed image shown in Fig. 7, a hard thresholding method was used in which the high-high frequency wavelet coefficients were set to zero and not used in the reconstruction process. The loss of high-frequency information can be seen in the
Fig. 5. The least asymmetric wavelet with eight coefﬁcients.
Table 1. The Coefficients for the Corresponding Low-Pass and High-Pass Filters for the Least Asymmetric Wavelet

n    High-Pass            Low-Pass
0    −0.107148901418      0.045570345896
1    −0.041910965125      0.017824701442
2    0.703739068656       −0.140317624179
3    1.136658243408       −0.421234534204
4    0.421234534204       1.136658243408
5    −0.140317624179      −0.703739068656
6    −0.017824701442      −0.041910965125
7    0.045570345896       0.107148901418
Fig. 6. An original digital mammogram image.
smoothed image. Figure 8 shows the image reconstructed from the high-high wavelet coefficients only.
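The zero-the-band reconstruction described above can be sketched end-to-end with a one-level Haar transform. This is a toy stand-in for the least asymmetric wavelet actually used in the figures, and the synthetic ramp image is an arbitrary test input; it illustrates why discarding the high-high band removes noise from a smooth image.

```python
import numpy as np

def fwd(x):
    """Haar filtering along the last axis: low and high bands,
    each subsampled by two."""
    return ((x[..., 0::2] + x[..., 1::2]) / np.sqrt(2),
            (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2))

def inv(lo, hi):
    """Exact inverse of fwd along the last axis."""
    out = np.empty(lo.shape[:-1] + (2 * lo.shape[-1],))
    out[..., 0::2] = (lo + hi) / np.sqrt(2)
    out[..., 1::2] = (lo - hi) / np.sqrt(2)
    return out

def dwt2(img):
    lo, hi = fwd(img)                    # rows
    a, d1 = fwd(lo.swapaxes(0, 1))       # columns of the low band
    d2, d3 = fwd(hi.swapaxes(0, 1))      # columns of the high band
    return a, d1, d2, d3                 # bands kept axis-swapped

def idwt2(a, d1, d2, d3):
    lo = inv(a, d1).swapaxes(0, 1)
    hi = inv(d2, d3).swapaxes(0, 1)
    return inv(lo, hi)

rng = np.random.default_rng(3)
clean = np.add.outer(np.arange(16.0), np.arange(16.0))  # smooth ramp image
noisy = clean + 0.5 * rng.normal(size=clean.shape)

a, d1, d2, d3 = dwt2(noisy)
den = idwt2(a, d1, d2, np.zeros_like(d3))  # hard-zero the high-high band

# The ramp itself has no high-high energy, so zeroing that band
# removes only noise and the reconstruction error strictly decreases.
assert np.linalg.norm(den - clean) < np.linalg.norm(noisy - clean)
```

For a real mammogram the high-high band also carries fine edge detail, which is why this kind of smoothing visibly blurs the image, as seen in Fig. 7.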
18.5 FEATURE EXTRACTION USING WAVELET
TRANSFORM FOR IMAGE ANALYSIS
Two-dimensional wavelet transform is widely used in image processing applications. Its ability to repeatedly decompose an image in the low-frequency channels makes it ideal for image analysis, since the lower frequencies dominate real images. A smooth image has strong components only in the low frequencies, whereas a textured image has substantial components across a wide frequency/scale spectrum. Features related to the spatio-frequency representation of the image can be efficiently extracted and analyzed using the wavelet transform. The wavelet transform provides one of the best representation methods for the analysis of texture information in an image. Texture has been widely used in image analysis for biomedical applications and satellite image analysis. It is an important characteristic of an image and is useful for image interpretation and recognition. The application of wavelet orthogonal representation to texture discrimination and fractal analysis has been discussed by Mallat.² Feature extraction for texture analysis and segmentation using wavelet transforms has been applied by Chang and Kuo,⁹ Laine and Fan,¹⁰ Unser,¹¹ and others.¹²⁻¹⁵
Each level of decomposition provides bandpass filtered spatio-frequency information that can be used for feature extraction, representation, and analysis. For example, energy ratios in specific subbands from the wavelet transform based multiresolution decomposition have been used in the characterization of skin lesion images for the detection of skin cancer, malignant melanoma.¹⁶⁻¹⁸ The epiluminescence images of skin lesions were obtained using the Nevoscope and classified using texture based features extracted through the wavelet transform based decomposition method.¹⁹⁻²² The method is briefly described here.²¹⁻²² Figure 9
Fig. 9. Sample images (A) dysplastic nevus and (B) malignant melanoma.
shows sample images of a dysplastic nevus (nonmalignant lesion)
and a malignant melanoma.
18.5.1 Feature Extraction Through Wavelet Transform
A three-level wavelet transform was applied to the epiluminescence images using the Daubechies 3 wavelet to obtain the 10 main wavelet subbands. Figure 10 shows the three-level wavelet decomposition coding of the image.

These channels (subbands) were further grouped into low- (channels 1–4), middle- (channels 5–7), and high-frequency (channels 8–10) groups. The ratio of the mean energy in the four low-frequency channels (1–4) to the mean energy in the three middle-frequency channels (5–7) was proposed as a criterion for optimal feature selection by R Porter and N Canagarajah.¹² Similarly, a set of ratios of the wavelet coefficients was studied for the textural analysis, and the optimal set of features was obtained by statistical analysis.
The set of ratios studied is:

    r_1 = m(c_1)/m(c_12);    r_2 = m(c_12)/m(c_11);

    r_3 = [m(c_2) + m(c_3) + m(c_4)] / [m(c_5) + m(c_6) + m(c_7)];

    r_4 = [m(c_5) + m(c_6) + m(c_7)] / [m(c_8) + m(c_9) + m(c_10)];

    r_5 = [m(c_1)/m(c_12)] × [m(c_2) + m(c_3) + m(c_4)] / [m(c_5) + m(c_6) + m(c_7)];

    r_6 = [m(c_12)/m(c_11)] × [m(c_5) + m(c_6) + m(c_7)] / [m(c_8) + m(c_9) + m(c_10)];

    r_7 = m(c_1)/[m(c_2) + m(c_3) + m(c_4)] ÷ m(c_12)/[m(c_5) + m(c_6) + m(c_7)];

    r_8 = m(c_11)/[m(c_5) + m(c_6) + m(c_7)] ÷ m(c_12)/[m(c_8) + m(c_9) + m(c_10)],   (14)

Fig. 10. Three-level wavelet decomposition of an image.
wherec
i
stands for thedifferent wavelet channels i = 1..10, of decom
position and m stand for the mean value of the wavelet coefﬁcients
for different channels given by:
m =
i
j
x
ij
length ∗ breadth
, (15)
where x
ij
is the computed coefﬁcient of wavelet transform; the
length and breadth are the dimensions of the respective channels
decomposed.
The variance of the wavelet coefﬁcient is given by:
ε =
i
j
(x
ij
− mean)
2
length ∗ breadth
, (16)
where mean represents the mean of the wavelet coefﬁcients.
The entropy measure for texture analysis can be defined as:
\[
H = \frac{\sum_i \sum_j x_{ij}^2 \log\left(x_{ij}^2\right)}{\text{length} \times \text{breadth}}. \tag{17}
\]
The energy of the wavelet coefficients is defined as:
\[
E = \frac{\sum_i \sum_j x_{ij}^2}{\text{length} \times \text{breadth}}. \tag{18}
\]
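A minimal sketch of the four channel statistics of Eqs. (15)–(18), assuming the convention 0·log 0 = 0 in the entropy term (the chapter does not state how zero coefficients are handled); `channel_stats` is a hypothetical helper name:

```python
import numpy as np

def channel_stats(x):
    """Mean, variance, entropy and energy of the wavelet coefficients
    of one channel, following Eqs. (15)-(18); the normalizer
    'length * breadth' is simply x.size."""
    n = x.size                       # length * breadth
    m = np.sum(x) / n                # Eq. (15): mean
    var = np.sum((x - m) ** 2) / n   # Eq. (16): variance
    x2 = x.astype(float) ** 2
    # Eq. (17): entropy, taking 0 * log(0) = 0 for zero coefficients
    h = np.sum(np.where(x2 > 0, x2 * np.log(x2), 0.0)) / n
    e = np.sum(x2) / n               # Eq. (18): energy
    return m, var, h, e

coeffs = np.array([[1.0, -2.0], [0.5, 0.0]])
m, var, h, e = channel_stats(coeffs)
print(m, var, e)   # energy here is (1 + 4 + 0.25) / 4
```

Applying `channel_stats` to each of the decomposition channels yields the per-channel quantities m, ε, H and E from which the ratio features are formed.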
The set of ratios mentioned earlier was calculated for the mean, variance, energy and entropy of the wavelet coefficients, giving in all 32 ratios, which henceforth are referred to as features. Gray-level features, namely the mean and standard deviation of the image intensity, were also included in the feature set. Thus, 34 features were considered in this texture analysis.
Statistical correlation analysis was performed on the extracted features to select statistically significant and correlated features. This analysis provided a reduced set of the following five features with the highest statistical significance:
\[
f_1 = \frac{m(c_1)}{m(c_{12})}; \quad
f_2 = \frac{m(c_{12})}{m(c_{11})}; \quad
f_3 = \frac{e(c_{12})}{e(c_{11})};
\]
\[
f_4 = \frac{et(c_1)}{et(c_2) + et(c_3) + et(c_4)} \div \frac{et(c_{12})}{et(c_5) + et(c_6) + et(c_7)};
\]
\[
f_5 = \ln (\text{std} + 1), \tag{19}
\]
where m, e, et and std stand for the mean, energy and entropy of the wavelet coefficients, and the standard deviation of the image intensity, respectively.
The selected features were then used to train a nearest-neighborhood classifier (described in Chapter 10) using a training set of pathologically validated labeled images. The trained classifier was then used to classify the images that were not included in the training set. Results of the nearest-neighborhood classifier were compared to the pathology to obtain true-positive and false-positive rates of melanoma detection. A true-positive rate of 93% for melanoma detection was obtained with a false-positive rate of 0% through this analysis.^20
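The nearest-neighborhood rule can be sketched as a simple 1-NN classifier; the two-dimensional training vectors and labels below are toy stand-ins for the five selected features, not data from the study:

```python
import numpy as np

def nearest_neighbor_classify(train_x, train_y, test_x):
    """1-nearest-neighbor: each test vector receives the label of the
    closest training vector (Euclidean distance in feature space)."""
    # pairwise distance matrix of shape (n_test, n_train)
    d = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    return train_y[np.argmin(d, axis=1)]

# Toy 2D feature set: label 1 = melanoma-like, 0 = benign-like
train_x = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
train_y = np.array([0, 0, 1, 1])
test_x = np.array([[0.05, 0.1], [1.05, 0.9]])
pred = nearest_neighbor_classify(train_x, train_y, test_x)
print(pred)   # [0 1]
```

Comparing such predictions against pathology labels on a held-out set yields the true-positive and false-positive rates reported above.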
18.6 CONCLUDING REMARKS
The wavelet transform has been effectively used for one- and multi-dimensional data analysis in a number of applications, including medical image analysis. The wavelet transform provides simple series-expansion-based signal decomposition and reconstruction methods for localization of characteristic events associated with frequency and time/space information. Utilizing the property of orthonormal basis functions with scaling and shifting operations, multiresolution wavelet packet analysis provides localized responses equivalent to multiband filters, but in a computationally efficient manner. The wavelet transform can be implemented through a simple modular algorithm suitable for fast or real-time applications in any kind of data analysis.
The wavelet transform has been used for image enhancement, restoration and reconstruction of medical images. The localized spatio-frequency information available through the wavelet transform can be effectively used for defining specific features for image representation, characterization and classification. Multidimensional expansion of the wavelet transform and adaptive design of wavelets for specific image processing tasks have become areas of significant research interest in recent years and will continue to be a productive research area in the near future.
References
1. Daubechies I, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1992.
2. Mallat S, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11: 674–693, 1989.
3. Mallat S, Wavelets for a vision, Proceedings of the IEEE 84: 604–614, 1996.
4. Cohen A, Kovacevic J, Wavelets: The mathematical background, Proceedings of the IEEE 84: 514–522, 1996.
5. Bovik A, Clark M, Geisler W, Multichannel texture analysis using localized spatial filters, IEEE Transactions on Pattern Analysis and Machine Intelligence 12: 55–73, 1990.
6. Weaver JB, Yansun X, Healy Jr DM, Cromwell LD, Filtering noise from images with wavelet transforms, Magnetic Resonance in Medicine 21: 288–295, 1991.
7. Pentland AP, Interpolation using wavelet bases, IEEE Transactions on Pattern Analysis and Machine Intelligence 16: 410–414, 1994.
8. Yaou MH, Chang WT, Fast surface interpolation using multiresolution wavelet transform, IEEE Transactions on Pattern Analysis and Machine Intelligence 16: 673–688, 1994.
9. Chang T, Kuo CCJ, Texture analysis and classification with tree-structured wavelet transform, IEEE Transactions on Image Processing 2(4): 429–447, 1993.
10. Laine A, Fan J, Texture classification by wavelet packet signatures, IEEE Transactions on Pattern Analysis and Machine Intelligence 15(11): 1186–1191, 1993.
11. Unser M, Texture classification and segmentation using wavelet frames, IEEE Transactions on Image Processing 4(11): 1549–1560, 1995.
12. Porter R, Canagarajah N, A robust automatic clustering scheme for image segmentation using wavelets, IEEE Transactions on Image Processing 5(4): 662–665, 1996.
13. Wang JW, Chen CH, Chien WM, Tsai CM, Texture classification using non-separable two-dimensional wavelets, Pattern Recognition Letters 19: 1225–1234, 1998.
14. Chitre Y, Dhawan A, M-band wavelet discrimination of natural textures, Pattern Recognition Letters, 773–789, 1999.
15. van Erkel AR, Pattynama PMTh, Receiver operating characteristic (ROC) analysis: Basic principles and applications in radiology, European Journal of Radiology 27: 88–94, 1998.
16. Kopf A, Salopek T, Slade J, Marghoob A, et al., Techniques of cutaneous examination for the detection of skin cancer, Cancer Supplement 75(2): 684–690, 1994.
17. Koh H, Lew R, Prout M, Screening for melanoma/skin cancer: Theoretical and practical considerations, Journal of the American Academy of Dermatology 20: 159–172, 1989.
18. Stoecker W, Moss R, Skin Cancer Recognition by Computer Vision: Progress Report, National Science Foundation Grant ISI 8521284, August 29, 1988.
19. Dhawan AP, Early detection of cutaneous malignant melanoma by three-dimensional Nevoscopy, Computer Methods and Programs in Biomedicine 21: 59–68, 1985.
20. Nimunkar A, Dhawan A, Relue P, Patwardhan S, Wavelet and statistical analysis for melanoma classification, SPIE International Conference on Medical Imaging, MI 4684, 1346–1353, Feb 24–28, 2002.
21. Patwardhan S, Dhawan AP, Relue P, Classification of melanoma using tree-structured wavelet transform, Computer Methods and Programs in Biomedicine 72(3): 223–239, 2003.
22. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and classification of melanoma using fuzzy membership based partitions, Computerized Medical Imaging and Graphics 29: 287–296, 2005.
CHAPTER 19
Multiclass Classiﬁcation for Tissue
Characterization
Atam P Dhawan
Computer-aided diagnostic applications such as cancer detection may require a binary classification into benign and malignant classes. However, there are many medical imaging applications requiring multiclass classification to categorize image data into more than two classes for tissue or pathology characterization. This chapter provides an introduction to some of the approaches, such as Bayesian classification, support vector machines, and neuro-fuzzy systems, that can be applied in multiclass classification.
19.1 INTRODUCTION
Conventional methods for computer-aided medical image analysis for the detection of an outcome or pathology, such as cancer, usually require a binary classification of acquired image data. However, other medical image analysis applications, such as segmentation and tissue characterization from multiparameter images, may require multiclass classification. For example, brain images acquired through multiparameter multidimensional imaging protocols may be analyzed with multiclass segmentation for tissue characterization, for the evaluation and detection of critical neurological functions and disorders. Several chapters in this book describe current and emerging trends in multiparameter brain imaging and radiation therapy that can benefit from multiclass classification approaches. Fusion of anatomical, metabolic and functional information usually leads to multidimensional data sets, in which local regions of interest can be obtained from segmentation and detection approaches based on multiclass classification. In this chapter, we present some of the multiclass classification methods suitable for multiparameter medical image analysis.
19.2 MULTICLASS CLASSIFICATION USING MAXIMUM
LIKELIHOOD DISCRIMINANT FUNCTIONS
Medical image preprocessing and feature extraction analysis lead to a set of spatially distributed multidimensional data vectors of raw measurements and computed features. The total number of measurements and computed features allocated to each pixel in the image sets the dimension d of the feature space. Let us assume that we have an image of m rows and n columns, with mn pixels to be classified into k classes. Thus, we have mn data vectors X = \{x_j;\ j = 1, 2, \ldots, mn\} distributed in a d-dimensional feature space; that is, each element of the data vector (i.e. pixel in the image) is associated with a d-dimensional feature vector. The purpose of multiclass classification is to find a mapping f(X) that maps the input data vectors into the k classes denoted by C = \{c_i;\ i = 1, 2, \ldots, k\}.
In order to learn such a mapping, we can use a training set S of cardinality l with labeled input vectors such that:
\[
S = \{(x_1, c_1), \ldots, (x_l, c_l)\}, \tag{1}
\]
where the x_i \in \chi \subseteq \mathbb{R}^d are input vectors in an inner-product space, and c_i \in \gamma = \{1, \ldots, k\} is the corresponding class or category label.
As shown in Eq. (1), there is a pairwise assignment of each input pixel x to a class c. Let us assume that each class c_i model obtained from the training set has a mean vector \mu_i and a covariance matrix \Sigma_i such that:
\[
\hat{\mu}_i = \frac{1}{n} \sum_j x_j, \tag{2}
\]
where i = 1, 2, \ldots, k and j = 1, \ldots, n; n is the number of pixel vectors in the ith class, and x_j is the jth of the n multidimensional vectors that
comprise the class. The dimension of x_j corresponds to the number of image modalities used in the analysis. The covariance matrix of class i, \hat{\Sigma}_i, is:
\[
\hat{\Sigma}_i = \frac{1}{n-1} \sum_j (x_j - \hat{\mu}_i)(x_j - \hat{\mu}_i)^T. \tag{3}
\]
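Equations (2) and (3) amount to the standard sample mean and unbiased sample covariance of the training vectors of one class. A sketch, with `class_model` a hypothetical helper name:

```python
import numpy as np

def class_model(vectors):
    """Estimate the class mean (Eq. 2) and the unbiased covariance
    matrix (Eq. 3) from the n d-dimensional training vectors of one
    class."""
    x = np.asarray(vectors, dtype=float)      # shape (n, d)
    n = x.shape[0]
    mu = x.sum(axis=0) / n                    # Eq. (2)
    centered = x - mu
    sigma = centered.T @ centered / (n - 1)   # Eq. (3)
    return mu, sigma

rng = np.random.default_rng(1)
x = rng.standard_normal((50, 3))              # 50 vectors, d = 3
mu, sigma = class_model(x)
# matches NumPy's own unbiased estimator
print(np.allclose(sigma, np.cov(x, rowvar=False)))
```

One such (mean, covariance) pair is estimated per class from the labeled training pixels.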
For developing an estimation model,^{1–3} let us assume that the image to classify is a realization of a pair of random variables \{C_{mn}, X_{mn}\}, where C_{mn} is the class of the pixel mn. C_{mn} represents the spatial variability of the class in the image and can take values in a discrete set \{1, 2, \ldots, k\}. X_{mn} is a d-dimensional random variable of pixel mn describing the variability of the measurements for that pixel; that is, X_{mn} describes the variability of the observed values x in a particular class. Given that C_{mn} = i\ (i = 1, 2, \ldots, k), the distribution of X_{mn} is estimated to obey the general multivariate normal distribution described by the density function:
\[
\hat{p}(x) = \frac{1}{(2\pi)^{d/2} \, |\hat{\Sigma}_i|^{1/2}} \exp\left[ -\frac{1}{2} (x - \hat{\mu}_i)^T \hat{\Sigma}_i^{-1} (x - \hat{\mu}_i) \right], \tag{4}
\]
where x is a d-element column vector, \hat{\mu}_i is the d-element estimated mean vector for class i calculated from the training set, \hat{\Sigma}_i is the estimated d \times d covariance matrix for class i, also calculated from the training set, and d is the dimension of the multiparameter or feature vector.
Maximum likelihood based discriminant analysis can then be used to assign a class to a given pixel in the image.^{1–4} For each pixel, four transition matrices P_r(m, n) = [p_{ijr}(m, n)] can be estimated, where r is a direction index (following the four spatial connectedness directions in the image) and the p_{ijr}(m, n) are the transition probabilities defined by:
\[
p_{ij1}(m, n) = P\{C_{mn} = j \mid C_{m,n-1} = i\}, \tag{5}
\]
\[
p_{ij2}(m, n) = P\{C_{mn} = j \mid C_{m+1,n} = i\}, \tag{6}
\]
\[
p_{ij3}(m, n) = P\{C_{mn} = j \mid C_{m,n+1} = i\}, \tag{7}
\]
\[
p_{ij4}(m, n) = P\{C_{mn} = j \mid C_{m-1,n} = i\}. \tag{8}
\]
A generalized estimation of the transition probabilities for the classes can be obtained using b images in the training set, averaged over a small neighborhood of h pixels around the pixel mn, as:
\[
p_{ij1}(m, n) = \frac{\sum_b \sum_n \#\{\text{pix} \mid C_{mn} = j,\ C_{m,n-1} = i\}}{\sum_b \sum_n \#\{\text{pix} \mid C_{m,n-1} = i\}},
\]
\[
p_{ij2}(m, n) = \frac{\sum_b \sum_n \#\{\text{pix} \mid C_{mn} = j,\ C_{m+1,n} = i\}}{\sum_b \sum_n \#\{\text{pix} \mid C_{m+1,n} = i\}},
\]
\[
p_{ij3}(m, n) = \frac{\sum_b \sum_n \#\{\text{pix} \mid C_{mn} = j,\ C_{m,n+1} = i\}}{\sum_b \sum_n \#\{\text{pix} \mid C_{m,n+1} = i\}},
\]
\[
p_{ij4}(m, n) = \frac{\sum_b \sum_n \#\{\text{pix} \mid C_{mn} = j,\ C_{m-1,n} = i\}}{\sum_b \sum_n \#\{\text{pix} \mid C_{m-1,n} = i\}}, \tag{9}
\]
where \sum_b \#\{\text{pix} \mid CP\} denotes the number of pixels with property CP in the images of the training set used to generate the model, and \sum_n counts the pixels with the given property in the predefined neighborhood.
The equilibrium transition probabilities can then be estimated using a similar procedure as:
\[
\pi_i(mn) = \frac{\sum_b \sum_n \#\{\text{pix} \mid C_{mn} = i\}}{\sum_b \sum_n \#\{\text{pix}\}}. \tag{10}
\]
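The counting estimates of Eqs. (9) and (10) can be sketched for a single labeled training image and the horizontal direction (r = 1); extending to b images and the other three directions follows the same pattern. The function name and the toy label image are illustrative:

```python
import numpy as np

def estimate_transitions(label_image, k):
    """Estimate the horizontal transition matrix p_ij1 of Eqs. (5)/(9)
    and the equilibrium probabilities pi_i of Eq. (10) by counting,
    here over one labeled image rather than b training images."""
    counts = np.zeros((k, k))
    for row in label_image:
        # count pairs C_{m,n-1} = i followed by C_{mn} = j
        for left, right in zip(row[:-1], row[1:]):
            counts[left, right] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    p = np.divide(counts, row_sums, out=np.zeros_like(counts),
                  where=row_sums > 0)
    pi = np.bincount(label_image.ravel(), minlength=k) / label_image.size
    return p, pi

labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 2]])
p, pi = estimate_transitions(labels, 3)
print(p[0])   # transitions out of class 0: [0.5 0.5 0. ]
print(pi)     # class frequencies: all 1/3
```

Each row of the estimated transition matrix sums to one, as a conditional distribution must.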
19.2.1 Maximum Likelihood Discriminant Analysis
The class random variable C_{mn} is assumed to constitute a k-state Markov random field. Rows and columns of C_{mn} constitute segments of k-state Markov chains. The chains are specified by the k \times k transition matrix P = [p_{ij}], where:
\[
p_{ij} = P\{C_{mn} = j \mid C_{m,n-1} = i\}, \tag{11}
\]
which leads to the equilibrium probabilities (\pi_1, \pi_2, \ldots, \pi_k).
Using the above model,^1 the probability of each pixel belonging to a specific class i is:
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\}, \tag{12}
\]
where N(m, n) is a predefined neighborhood of the pixel (m, n). For example, a four-connected neighborhood around a pixel mn can be defined as:
\[
N(m, n) = \{(m, n), (m - 1, n), (m, n - 1), (m + 1, n), (m, n + 1)\}. \tag{13}
\]
It follows that:
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{C_{mn} = i,\ X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}} \tag{14}
\]
and
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i, X_{m\pm1,n}, X_{m,n\pm1}\}\, P\{C_{mn} = i \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}, \tag{15}
\]
where
\[
P\{\circ \mid X_{m\pm1,n}, X_{m,n\pm1}\} \equiv P\{\circ \mid X_{m-1,n}\}\, P\{\circ \mid X_{m,n-1}\}\, P\{\circ \mid X_{m+1,n}\}\, P\{\circ \mid X_{m,n+1}\}. \tag{16}
\]
Taking into account the class conditional independence, Eq. (15) can be stated as:
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i\}\, P\{C_{mn} = i \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}. \tag{17}
\]
With the Bayes estimation method, the above expression leads to:
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i\}\, P\{X_{m\pm1,n}, X_{m,n\pm1} \mid C_{mn} = i\}\, P\{C_{mn} = i\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}\, P\{X_{m\pm1,n}, X_{m,n\pm1}\}}, \tag{18}
\]
where
\[
P\{X_{m\pm1,n}, X_{m,n\pm1} \mid \circ\} \equiv P\{X_{m-1,n} \mid \circ\}\, P\{X_{m,n-1} \mid \circ\}\, P\{X_{m+1,n} \mid \circ\}\, P\{X_{m,n+1} \mid \circ\}. \tag{19}
\]
The terms in Eq. (19) can be further expressed as:
\[
P\{X_{m-1,n} \mid C_{mn} = i\} = \sum_{j=1}^{N} P\{X_{m-1,n} \mid C_{m-1,n} = j\}\, P\{C_{m-1,n} = j \mid C_{mn} = i\} \equiv H_{m-1,n}(i). \tag{20}
\]
Finally, substituting Eqs. (19) and (20) into Eq. (18), the probability of the current pixel mn belonging to class i, given the characteristics of the pixels in the neighborhood of mn, can be defined as:
\[
P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{C_{mn} = i \mid X_{mn}\}\, P\{X_{mn}\}\, H_{m-1,n}(i)\, H_{m,n-1}(i)\, H_{m+1,n}(i)\, H_{m,n+1}(i)}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}\, P\{X_{m\pm1,n}, X_{m,n\pm1}\}}. \tag{21}
\]
Equation (21) shows that the conventional expression for the class probabilities, denoted by P\{C_{mn} = i \mid X_{mn}\}\, P\{X_{mn}\}, is modified by the factors H_{ij} according to the evidence found in the immediate neighborhood. Pixels are classified into the class that maximizes this probability.
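A simplified sketch of the decision rule of Eqs. (20) and (21), dropping the class-independent denominator (which does not affect the argmax); the per-pixel likelihoods and the transition matrix below are hypothetical:

```python
import numpy as np

def contextual_class(lik, P, pi, m, n):
    """Score each class for pixel (m, n) per Eq. (21), without the
    class-independent denominator. lik[r, c, i] = P{X_rc | C_rc = i}.
    The H factor of Eq. (20) sums neighbor likelihoods weighted by
    transition probabilities P[j, i] = P{C_neighbor = j | C_mn = i}."""
    k = len(pi)
    def H(r, c, i):
        return sum(lik[r, c, j] * P[j, i] for j in range(k))
    scores = np.array([
        lik[m, n, i] * pi[i]
        * H(m - 1, n, i) * H(m, n - 1, i)
        * H(m + 1, n, i) * H(m, n + 1, i)
        for i in range(k)
    ])
    return int(np.argmax(scores))

# Two classes; the center pixel's own likelihood is ambiguous,
# but all four neighbors strongly favor class 1.
lik = np.full((3, 3, 2), [0.5, 0.5])
for (r, c) in [(0, 1), (1, 0), (2, 1), (1, 2)]:
    lik[r, c] = [0.1, 0.9]
P = np.array([[0.8, 0.2],   # transitions favor same-class neighbors
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])
print(contextual_class(lik, P, pi, 1, 1))   # 1
```

The example illustrates the point of Eq. (21): an ambiguous pixel is pulled toward the class supported by its neighborhood evidence through the H factors.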
19.3 NEURO-FUZZY CLASSIFIERS FOR MULTICLASS CLASSIFICATION
Pattern recognition systems such as the backpropagation neural network, the radial basis function (RBF) network or the k-nearest-neighbor (KNN) classifier can provide multiclass classification using crisp decision surfaces, which often suffer from low immunity to noise in the training patterns. Neural networks and clustering methods for classification are described in Chapter 10 of this book. To overcome the problems of crisp-function-based classifiers, fuzzy functions have been used for classification applications.
Several approaches using fuzzy set theory for pattern recognition can be found in a number of publications.^{5–19} A novel pattern recognition method using fuzzy functions with a winner-take-all strategy is presented here that can be used for multiclass classification. In this approach, the feature space is first partitioned into all categories using the training data. The data are thus transformed into convex sets in the feature space. This is achieved by dividing them into homogeneous (containing only points from one category), non-overlapping, closed convex subsets, and then placing separating hyperplanes between neighboring subsets from different categories. The hyperplane separation of the obtained subsets with homogeneous convex regions provides the consecutive network layer that determines what region a given input pattern belongs to. In our approach, a fuzzy membership function M_f is devised for each created convex subset (f = 1, 2, \ldots, k). The classification decision is made by the output layer based on the "winner-take-all" principle: the resulting category C is the category of the convex set with the highest value of the membership function for the input pattern. A schematic diagram of such a neuro-fuzzy classification system is shown in Fig. 1.^5
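A minimal sketch of the winner-take-all output layer, assuming a simple distance-to-centroid membership function (the chapter does not fix a particular form of M_f, so this choice, like the toy subsets below, is illustrative):

```python
import numpy as np

def winner_take_all(x, subsets, labels):
    """Winner-take-all decision over per-subset fuzzy memberships.

    Each convex subset f is summarized here by its centroid, and the
    membership M_f is a distance-based function in (0, 1] that peaks
    inside subset f -- a hypothetical choice for illustration."""
    memberships = []
    for pts in subsets:
        centroid = np.mean(pts, axis=0)
        d = np.linalg.norm(x - centroid)
        memberships.append(1.0 / (1.0 + d))     # M_f(x) in (0, 1]
    return labels[int(np.argmax(memberships))]  # winner takes all

# Two convex subsets, one per category (toy 2D data)
subsets = [np.array([[0.0, 0.0], [0.2, 0.1]]),   # category 'A'
           np.array([[2.0, 2.0], [2.2, 2.1]])]   # category 'B'
labels = ['A', 'B']
print(winner_take_all(np.array([0.1, 0.0]), subsets, labels))   # A
```

In the full scheme each category may own several convex subsets, and the winning subset's category becomes the output class C.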
19.3.1 Convex Set Creation
There are two requirements for the convex sets: they have to be homogeneous and non-overlapping. To satisfy the first condition, one needs to devise a method of finding one category's points within another category's hull. Thus, two problems can be defined: (1) how to find whether a point P lies inside a convex hull (CH) of points; (2) how to find out whether two convex hulls of points are overlapping. The second problem is more difficult to examine because hulls can be overlapping over a common (empty) space that contains no points from either category. This problem can be defined as a generalization of the first one,^20 and the first condition can be seen as a special case of