NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

World Scientific
PRINCIPLES AND ADVANCED METHODS IN
MEDICAL IMAGING AND IMAGE ANALYSIS
ATAM P DHAWAN
New Jersey Institute of Technology, USA
H K HUANG
University of Southern California, USA
DAE-SHIK KIM
Boston University, USA
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to
photocopy is not required from the publisher.
ISBN-13 978-981-270-534-1
ISBN-10 981-270-534-1
ISBN-13 978-981-270-535-8 (pbk)
ISBN-10 981-270-535-X (pbk)
Typeset by Stallion Press
Email: enquiries@stallionpress.com
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.
Copyright © 2008 by World Scientific Publishing Co. Pte. Ltd.
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
Printed in Singapore.
To
My wife, Nilam,
for her support and patience;
and my sons, Anirudh and Akshay,
for their quest for learning.
(Atam P Dhawan)
To
My wife, Fong,
for her support;
and my daughter, Cammy; and my son, Tilden,
for their young wisdom.
(HK Huang)
To
My daughter, Zeno,
for her curiosity.
(Dae-Shik Kim)
Preface and Acknowledgments
We are pleased to bring “Principles and Advanced Methods in
Medical Imaging and Image Analysis”, a volume of contributory
chapters, to the scientific community. The book is a compilation
of carefully crafted chapters written by leading researchers in the
field of medical imaging, who have put a great deal of effort
into their contributions. This book can be used as
a research reference or a textbook for graduate-level courses in
biomedical engineering and medical sciences.
The book is a unique combination of chapters describing the
principles as well as state-of-the-art advanced methods in medical
imaging and image analysis for selected applications. Though
computerized medical imaging has a very wide spectrum of applications
in diagnostic radiology and medical research, we have selected a
subset of important imaging modalities with specific applications
that are significant in medical sciences and clinical practice. The
topics covered in the chapters have been developed with a natural
progression of understanding, keeping in mind future technological
advances that are expected to have a major impact on clinical
practice and the understanding of complex pathologies. We hope that
this book will provide a unique learning experience, from theoretical
concepts to advanced methods and applications, for researchers,
clinicians and students.
We are very grateful to our contributors, who are internationally
renowned experts and experienced researchers in their respective
fields within the wide spectrum of medical imaging and computerized
medical image analysis. We also gratefully acknowledge the
support provided by the editorial board and staff members of World
Scientific Publishing. Special thanks to Ms CT Ang for her guidance
and patience in preparing this book.
We hope that readers will find this book useful in providing a
concise version of important principles, advances, and applications
in medical imaging and image analysis.
Atam P Dhawan
HK Huang
Dae-Shik Kim
Contributors
Walter J Akers, PhD
Staff Scientist, Optical Radiology Laboratory
Department of Radiology
Washington University School of Medicine
St Louis, Missouri
Elsa Angelini, PhD
Ecole Nationale Supérieure des
Télécommunications
Paris, France
Leonard Berliner, MD
Department of Radiology
New York Methodist Hospital, NY
Sharon Bloch, PhD
Optical Radiology Laboratory, Department of Radiology
Washington University School of Medicine
St Louis, Missouri
Christos Davatzikos, PhD
Director, Section of Biomedical Image Analysis
Associate Professor, Department of Radiology
University of Pennsylvania
Mathieu De Craene, PhD
Computational Imaging Lab
Department of Information
and Communication Technologies
Universitat Pompeu Fabra, Barcelona
Atam P Dhawan, PhD
Professor, Department of Electrical
and Computer Engineering
Professor, Department of Biomedical Engineering
New Jersey Institute of Technology
Qi Duan, PhD
Department of Biomedical Engineering
Columbia University
Alejandro F Frangi, PhD
Computational Imaging Lab
Department of Information
and Communication Technologies
Universitat Pompeu Fabra, Barcelona
Shunichi Homma, MD
Margaret Millikin Hatch Professor
Department of Medicine
Columbia University
HK Huang, DSc
Professor and Director, Imaging Informatics Division
Department of Radiology, Keck School of Medicine
Department of Biomedical Engineering
Viterbi School of Engineering
University of Southern California
Dae-Shik Kim, PhD
Director, Center for Biomedical Imaging
Associate Professor, Anatomy and Neurobiology
Boston University School of Medicine
Elisa E Konofagou, PhD
Assistant Professor
Department of Biomedical Engineering
Columbia University
Andrew Laine, PhD
Professor
Department of Biomedical Engineering
Columbia University
Angela R Laird, PhD
Assistant Professor, Department of Radiology
University of Texas Health Sciences Center
San Antonio
Maria YY Law, PhD
Associate Professor
Department of Health Technology and Informatics
The Hong Kong Polytechnic University
Heinz U Lemke, PhD
Research Professor, Department of Radiology
University of Southern California
Los Angeles, CA
Guang Li, PhD
Medical Physicist, Radiation Oncology Branch
National Cancer Institute,
NIH, Bethesda, Maryland
Brent J Liu, PhD
Assistant Professor and Deputy Director of Informatics
Department of Radiology, Keck School of Medicine
Department of Biomedical Engineering
Viterbi School of Engineering
University of Southern California
Tianming Liu, PhD
Center for Bioinformatics
Harvard Medical School
Department of Radiology
Brigham and Women’s Hospital, MA
Sachin Patwardhan, PhD
Research Scientist, Department of Radiology
Mallinckrodt Institute of Radiology
Washington University School of Medicine, St Louis
Xiaochuan Pan, PhD
Professor
Department of Radiology
Cancer Research Center
The University of Chicago
Itamar Ronen, PhD
Assistant Professor
Center for Biomedical Imaging
Department of Anatomy and Neurobiology
Boston University School of Medicine
Yulin Song, PhD
Associate Professor
Department of Radiology
Memorial Sloan-Kettering Cancer Center
New Jersey
Song Wang, PhD
Department of Electrical
and Computer Engineering
New Jersey Institute of Technology
Pat Zanzonico, PhD
Molecular Pharmacology and Chemistry
Memorial Sloan-Kettering Cancer Center
New York
Zheng Zhou, PhD
Manager
Imaging Processing and Informatics Lab
Department of Radiology
University of Southern California
Xiang Sean Zhou, PhD
Senior Staff Scientist, Program Manager
Computer Aided Diagnosis and Therapy Solutions
Siemens Medical Solutions, Inc., Malvern PA
Lionel Zuckier, MD
Head, Nuclear Medicine
Department of Radiology
University of Medicine and Dentistry of New Jersey
Contents
Preface and Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Contributors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1. Introduction to Medical Imaging and Image
Analysis: A Multidisciplinary Paradigm . . . . . . . . . 1
Atam P Dhawan, HK Huang and Dae-Shik Kim
Part I. Principles of Medical Imaging and Image
Analysis
2. Medical Imaging and Image Formation. . . . . . . . . . 9
Atam P Dhawan
3. Principles of X-ray Anatomical Imaging
Modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Brent J Liu and HK Huang
4. Principles of Nuclear Medicine Imaging
Modalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Lionel S Zuckier
5. Principles of Magnetic Resonance Imaging . . . . . . 99
Itamar Ronen and Dae-Shik Kim
6. Principles of Ultrasound Imaging Modalities . . . . 129
Elisa Konofagou
7. Principles of Image Reconstruction Methods. . . . . 151
Atam P Dhawan
8. Principles of Image Processing Methods . . . . . . . . . 173
Atam P Dhawan
9. Image Segmentation and Feature Extraction . . . . . 197
Atam P Dhawan
10. Clustering and Pattern Classification . . . . . . . . . . . . 229
Atam P Dhawan and Shuangshuang Dai
Part II. Recent Advances in Medical Imaging and
Image Analysis
11. Recent Advances in Functional Magnetic
Resonance Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Dae-Shik Kim
12. Recent Advances in Diffusion Magnetic
Resonance Imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Dae-Shik Kim and Itamar Ronen
13. Fluorescence Molecular Imaging: Microscopic to
Macroscopic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Sachin V Patwardhan, Walter J Akers and
Sharon Bloch
14. Tracking Endocardium Using Optical Flow
Along Iso-Value Curve . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Qi Duan, Elsa Angelini, Shunichi Homma and
Andrew Laine
15. Some Recent Developments in Reconstruction
Algorithms for Tomographic Imaging . . . . . . . . . . . 361
Chien-Min Kao, Emil Y Sidky, Patrick La Rivière
and Xiaochuan Pan
16. Shape-Based Reconstruction from Nevoscope
Optical Images of Skin Lesions . . . . . . . . . . . . . . . . . . 393
Song Wang and Atam P Dhawan
17. Multimodality Image Registration and Fusion . . . 413
Pat Zanzonico
18. Wavelet Transform and Its Applications in
Medical Image Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 437
Atam P Dhawan
19. Multiclass Classification for Tissue
Characterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Atam P Dhawan
20. From Pairwise Medical Image Registration to
Populational Computational Atlases. . . . . . . . . . . . . 481
M De Craene and AF Frangi
21. Grid Methods for Large Scale Medical Image
Archiving and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 517
HK Huang, Zheng Zhou and Brent Liu
22. Image-Assisted Knowledge Discovery
and Decision Support in Radiation
Therapy Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
Brent J Liu
23. Lossless Digital Signature Embedding
Methods for Assuring 2D and 3D Medical
Image Integrity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
Zheng Zhou, HK Huang and Brent J Liu
Part III. Medical Imaging Applications, Case Studies
and Future Trends
24. The Treatment of Superficial Tumors Using
Intensity Modulated Radiation Therapy and
Modulated Electron Radiation Therapy . . . . . . . . . 599
Yulin Song and Maria Chan
25. Image Guidance in Radiation Therapy. . . . . . . . . . . 635
Maria YY Law
26. Functional Brain Mapping and Activation
Likelihood Estimation Meta-Analysis. . . . . . . . . . . . 663
Angela R Laird, Jack L Lancaster and Peter T Fox
27. Dynamic Human Brain Mapping and Analysis:
From Statistical Atlases to Patient-Specific
Diagnosis and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 677
Christos Davatzikos
28. Diffusion Tensor Imaging Based Analysis
of Neurological Disorders . . . . . . . . . . . . . . . . . . . . . . . 703
Tianming Liu and Stephen TC Wong
29. Intelligent Computer Aided Interpretation
in Echocardiography: Clinical Needs
and Recent Advances . . . . . . . . . . . . . . . . . . . . . . . . . . . 725
Xiang Sean Zhou and Bogdan Georgescu
30. Current and Future Trends in Radiation
Therapy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 745
Yulin Song and Guang Li
31. IT Architecture and Standards for a
Therapy Imaging and Model Management
System (TIMMS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 783
Heinz U Lemke and Leonard Berliner
32. Future Trends in Medical and Molecular
Imaging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 829
Atam P Dhawan, HK Huang and Dae-Shik Kim
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 845
CHAPTER 1
Introduction to Medical Imaging
and Image Analysis:
A Multidisciplinary Paradigm
Atam P Dhawan, HK Huang and Dae-Shik Kim
Recent advances in medical imaging, with significant contributions from
electrical and computer engineering, medical physics, chemistry, and
computer science, have led to a revolutionary growth in diagnostic
radiology. Rapid improvements in engineering and computing technologies
have made it possible to acquire high-resolution multidimensional
images of complex organs and to analyze structural and functional
information of human physiology for computer-assisted diagnosis,
treatment evaluation, and intervention. Through large databases of vast
amounts of information, such as standardized atlases of images,
demographics, genomics, etc., new knowledge about physiological processes
and associated pathologies is continuously being derived to improve our
understanding of critical diseases for better diagnosis and management.
This chapter provides an introduction to this ongoing knowledge quest
and the contents of the book.
1.1 INTRODUCTION
In a general sense, medical imaging refers to the process involving
specialized instrumentation and techniques to create images or rel-
evant information about the internal biological structures and func-
tions of the body. Medical imaging is sometimes categorized, in a
wider sense, as a part of radiological sciences. This is particularly
relevant because of its most common applications in diagnostic radiology.
In a clinical environment, medical images of a specific organ or
part of the body are obtained for clinical examination for the diagnosis
of a disease or pathology. However, medical imaging tests are
also performed to obtain images and information to study anatomical
and functional structures for research purposes with normal as
well as pathological subjects. Such studies are very important for
understanding the characteristic behavior of physiological processes in
the human body and for detecting the onset of a pathology. Such
an understanding is extremely important for early diagnosis as well
as for developing a knowledge base to study the progression of a disease
associated with physiological processes that deviate from their
normal counterparts. The significance of the medical imaging paradigm
is its direct impact on healthcare through diagnosis, treatment
evaluation, intervention and prognosis of a specific disease.
From a scientific point of view, medical imaging is highly multidisciplinary
and interdisciplinary, with a wide coverage of physical,
biological, engineering and medical sciences. The overall technology
requires direct involvement of expertise in physics, chemistry, biology,
mathematics, engineering, computer science and medicine so
that useful procedures and protocols for medical imaging tests with
appropriate instrumentation can be developed. The development
of a specific imaging modality system starts with the physiological
understanding of the biological medium and its relationship to
the targeted information to be obtained through imaging. Once
such a relationship is determined, a method for obtaining the targeted
information using a specific energy transformation process,
often known as the physics of imaging, is investigated. Once a method
for imaging is established, proper instrumentation with energy
source(s), detectors, and data acquisition systems is designed
and integrated to physically build an imaging system for imaging
patients in the context of a pathological investigation. For example,
to obtain anatomical information about internal organs of the body,
X-ray energy may be used. The X-ray energy, while transmitted
through the body, undergoes attenuation based on the density of
the internal structures. Thus,
the attenuation of the X-ray energy carries the target information
about the density of internal structures, which is then displayed as a
two-dimensional (in the case of radiography or mammography) or
multidimensional (3D in the case of computed tomography (CT); 4D in the
case of cine-CT) image. This information (image) can be directly
interpreted by a radiologist, or further processed by a computer using
image processing and analysis for better interpretation.
With the evolutionary progress in engineering and computing
technologies in the last century, medical imaging technologies have
witnessed a tremendous growth that has made a major impact in
diagnostic radiology. These advances have revolutionized healthcare
through fast imaging techniques; data acquisition, storage and
analysis systems; high-resolution picture archiving and communication
systems; and information mining with modeling and simulation
capabilities to enhance our knowledge base about the diagnosis,
treatment and management of critical diseases such as cancer, cardiac
failure, brain tumors and cognitive disorders.
Figure 1 provides a conceptual view of the medical imaging
process, from determination of the principle of imaging based on the
target pathological investigation to acquisition of data for image
reconstruction, processing and analysis for diagnostic, treatment
evaluation, and/or research applications.
Many medical imaging modalities and techniques
have been developed over the past years. Anatomical structures
can be effectively imaged today with X-ray computed tomography
(CT), magnetic resonance imaging (MRI), ultrasound, and optical
imaging methods. Furthermore, information about physiological
structures with respect to metabolism and/or function can be
obtained through nuclear medicine [single photon emission computed
tomography (SPECT) and positron emission tomography
(PET)], ultrasound, optical fluorescence, and several derivative
protocols of MRI such as fMRI, diffusion-tensor MRI, etc.
The selection of an appropriate medical imaging modality is
important for obtaining the target information for a successful
pathological investigation. For example, if information has to be
obtained about the cardiac volumes and functions associated with
[Figure 1 block diagram: Target Investigation or Pathology → Physiology and Understanding of Imaging Medium → Principle of Imaging → Physics of Imaging (Energy-Source Physics, Detector Physics) → Imaging Instrumentation → Data Acquisition → Image Reconstruction → Image Processing → Database and Computerized Analysis → Interpretation (Diagnosis, Evaluation, Intervention) → New Knowledge]
Fig. 1. A conceptual block diagram of the medical imaging process for diagnostic,
treatment evaluation and intervention applications.
a beating heart, one has to determine the requirements and limita-
tions about the spatial and temporal resolution for the target set of
images. It is also important to keep in mind the type of pathology
being investigated for the imaging test. Depending on the investi-
gation, such as metabolism of cardiac walls, or opening and closing
measurements of mitral valve, a specific medical imaging modality
(e.g. PET) or a combination of different modalities (e.g. stress-PET
and ultrasound) can be selected.
1.1.1 Book Chapters
In this book, we present a collection of carefully written chapters to
describe principles and recent advances of major medical imaging
modalities and techniques. Case studies and data analysis protocols
are also described for investigating selected critical pathologies. We
hope that this book will be useful for engineering as well as clinical
students and researchers. The book presents a natural progression
of technology development and applications through the chapters
that are written by leading and renowned researchers and educa-
tors. The book is organized in three parts: Principles of Imaging
and Image Analysis (Chapters 2–10); Recent Advances in Medical
Imaging and Image Analysis (Chapters 11–23); and Medical Imaging
Applications, Case Studies and Future Trends (Chapters 24–32).
Chapter 2 describes some basic principles of medical imaging
and image formation. In this chapter, Atam Dhawan focuses on
a basic mathematical model of image formation for a linear spatially
invariant imaging system.
In Chapter 3, Brent Liu and HK Huang present basic principles of
X-ray imaging modalities. X-ray radiography, mammography, com-
puted tomography (CT) and more recent PET-XCT fusion imaging
systems are described.
Principles of nuclear medicine imaging are described by Lionel
Zuckier in Chapter 4 where he provides foundation and clinical
applications of single photon emission tomography (SPECT) and
positron emission tomography (PET).
In Chapter 5, Itamar Ronen and Dae-Shik Kim describe
sophisticated principles and imaging techniques of magnetic resonance
imaging (MRI). Imaging parameters and pulse techniques for useful
MR imaging are presented.
Elisa Konofagou presents the principles of ultrasound imaging
in Chapter 6. Instrumentation and various imaging methods with
examples are described.
In Chapter 7, Atam Dhawan describes the foundation of multi-
dimensional image reconstruction methods. A brief introduction to
different types of transform and estimation methods is presented.
Atam Dhawan presents a spectrum of image enhancement,
restoration and filtering operations in Chapter 8. Image processing
methods in spatial (image) domain as well as frequency (Fourier)
domain are described. In Chapter 9, Atam Dhawan describes basic
image segmentation and feature extraction methods for representa-
tion of regions of interest for classification.
In Chapter 10, Atam Dhawan and Shuangshuang Dai present
principles of pattern recognition and classification. Genetic-algorithm-based
feature selection and nonparametric classification methods
are also described for image/tissue classification for diagnostic
applications.
Advances in MR imaging with respect to new methods and pulse
sequences associated with functional imaging of the brain are described
by Dae-Shik Kim in Chapter 11. Diffusion and diffusion-tensor based
magnetic resonance imaging methods are described by Dae-Shik
Kim and Itamar Ronen in Chapter 12. These two chapters bring the
most recent developments in functional brain imaging to investigate
neuronal information, including the hemodynamic response and axonal
pathways.
Chapter 13 provides a spectrum of optical and fluorescence
imaging for 3D tomographic applications. Through specific contrast
imaging methods, Sachin Patwardhan, Walter Akers and Sharon
Bloch explore molecular imaging applications.
In Chapter 14, Qi Duan, Elsa Angelini, Shunichi Homma and
Andrew Laine present recent investigations in dynamic ultrasound
image analysis for tracking endocardium in 4D cardiac imaging.
Chien-Min Kao, Emil Y. Sidky, Patrick LaRiviere, and Xiaochuan
Pan describe recent advances in model based multidimensional
image reconstruction methods for medical imaging applications in
Chapter 15. These methods use multivariate statistical estimation
methods in image reconstruction.
Shape-based optical image reconstruction of specific entities
from multispectral images of skin lesions is presented by Song Wang
and Atam Dhawan in Chapter 16.
Clinical multimodality image registration and fusion methods
with nuclear medicine and optical imaging are described by Pat
Zanzonico in Chapter 17. Pat emphasizes the clinical need for
localization of metabolic information with real-time processing and
efficiency requirements.
Recently, the wavelet transform has been extensively investigated
for obtaining localized spatio-frequency information. The use of the
wavelet transform in medical image processing and analysis is
described by Atam Dhawan in Chapter 18.
Medical image processing and analysis often require a multi-
class characterization for image contents. Atam Dhawan presents a
probabilistic multiclass tissue characterization method for MR brain
images in Chapter 19.
In Chapter 20, Mathieu De Craene and Alejandro F Frangi
present a review of advances in image registration methods for con-
structing standardized computational atlases.
In Chapter 21, HK Huang, Zheng Zhou and Brent Liu describe
information processing and computational methods to deal with
large image archiving and communication corresponding to large
medical image databases.
Brent Liu, in Chapter 22, describes knowledge mining and deci-
sion making strategies for medical imaging applications in radiation
therapy planning and treatment.
With large image archiving and communication systems linked
with large image databases, information integrity becomes a critical
issue. In Chapter 23, Zheng Zhou, HK Huang and Brent J Liu present
lossless digital signature embedding methods in multidimensional
medical images for authentication and integrity.
Medical imaging applications in intensity modulated radiation
therapy (IMRT), a radiation treatment protocol, are discussed by
Yulin Song in Chapter 24.
In Chapter 25, Maria Law presents the detailed role of medical
imaging based computer assisted protocols for radiation treatment
planning and delivery.
Recently developed fMR and diffusion-MR imaging meth-
ods provide overwhelming volumes of image data. A produc-
tive and useful analysis of targeted information extracted from
such MR images of the brain is a challenging problem. In Chapter 26,
Angela Laird, Jack Lancaster and Peter Fox describe recently developed
activation likelihood estimation based “meta” analysis algorithms
for the investigation of a specific pathology. In Chapter 27,
Christos Davatzikos presents dynamic brain mapping methods for
analysis of patient specific information for better pathological char-
acterization and diagnosis. Tianming Liu and Stephen Wong, in
Chapter 28, explore recently developed model-based image analysis
algorithms for analyzing diffusion-tensor MR brain images for
the characterization of neurological disorders.
Model-based intelligent analysis and decision-support tools are
important in medical imaging for computer-assisted diagnosis and
evaluation. Xiang Sean Zhou, in Chapter 29, presents specific chal-
lenges of intelligent medical image analysis, specifically for the inter-
pretation of cardiac ultrasound images. However, the issues raised
in this chapter could be extended to other modalities and applica-
tions. In Chapter 30, Yulin Song and Guang Li present an overview
of future trends and challenges in radiation therapy methods
that are closely linked with high-resolution multidimensional medical
imaging.
Heinz U Lemke and Leonard Berliner, in Chapter 31, describe
specific methods and information technology (IT) issues in dealing
with image management systems involving very large databases
and widely networked image communication systems.
To conclude, Chapter 32 presents a glimpse of future trends and
challenges in high-resolution medical imaging, intelligent image
analysis, and smart data management systems.
CHAPTER 2
Medical Imaging and Image Formation
Atam P Dhawan
Medical imaging involves a good understanding of the imaging medium
and object, the physics of imaging, instrumentation, and often computerized
reconstruction and visual display methods. Though there are a number of
medical imaging modalities available today involving ionizing radiation,
nuclear medicine, magnetic resonance, ultrasound, and optical methods,
each modality offers a characteristic response to structural or metabolic
parameters of the tissues and organs of the human body. This chapter provides
an overview of the principles of medical imaging modalities and a basic
linear spatially invariant image formation model used for most common
image processing tasks.
2.1 INTRODUCTION
Medical imaging is a process of collecting information about a
specific physiological structure (an organ or tissue) using a pre-
defined characteristic property that is displayed in the form of
an image. For example, in X-ray radiography, mammography and
computed tomography (CT), tissue density is the characteristic
property that is displayed in images to show anatomical struc-
tures. The information about tissue density of anatomical struc-
tures is obtained by measuring the attenuation of X-ray energy when
it is transmitted through the body. On the other hand, a nuclear
medicine positron emission tomography (PET) image may show
glucose metabolism information in the tissue or organ. A PET
image is obtained by measuring gamma-ray emission from the body
when a radioactive pharmaceutical material, such as fluorodeoxyglucose
(FDG), is injected into the body. FDG metabolizes with the tissue
through blood circulation, eventually making the tissue a source of
emission of gamma-ray photons. Thus, medical images may provide
anatomical, metabolic or functional information related to
an organ or tissue. Through proper instrumentation
and data collection methods, these images can be reconstructed in
two or three dimensions and then displayed as multidimensional
data sets.
The basic process of image formation requires an energy source
to obtain information about the object that is displayed in the form
of an image. Some form of radiation such as optical light, X-ray,
gamma-ray, RF or acoustic waves, interacts with the object tissue
or organ to provide information about its characteristic property.
The energy source can be external (X-ray radiography, mammog-
raphy, CT, ultrasound), internal [nuclear medicine: single photon
emission computed tomography (SPECT); positron emission tomography
(PET)], or a combination of both internal and external, such as
in magnetic resonance imaging, where proton nuclei available
in the tissue of the body provide electromagnetic RF-energy-based
signals in the presence of an external magnetic field and a
resonating RF energy source.
As described above, image formation requires an energy source,
a mechanism of interaction of energy with the object, instrumentation
to collect the data with the measurement of energy after
the interaction, and a method of reconstructing images of infor-
mation about the characteristic property of the object from the
collected data.
The imaging modalities most commonly used for medical
applications today are briefly described below with their respective
principles of imaging.
2.2 X-RAY IMAGING
X-rays were discovered by Wilhelm Conrad Röntgen in 1895, who
described them as a new kind of rays that can penetrate almost anything.
He described the diagnostic capabilities of X-rays for imaging the human
body and received the Nobel Prize in Physics in 1901. X-ray radiography
is the simplest form of medical imaging, with the transmission of X-rays
through the body which are then collected on a film or an array of
detectors. The attenuation or absorption of X-rays is described by
the photoelectric and Compton effects, providing more attenuation
through bones than soft tissues or air [1–5].
The diagnostic range of X-rays lies between 0.5 Å and 0.01 Å,
a wavelength range which corresponds to photon energies of approximately
20 keV to 1.0 MeV. In this range, the attenuation provides reasonable
discrimination between bone, soft tissue and air. In addition,
the wavelength is short enough to provide excellent image resolution,
even with sub-mm accuracy. Wavelengths shorter than the diagnostic
range provide much higher photon energy and therefore less attenuation;
increasing the photon energy makes the human body transparent, with the
loss of any contrast in the image. The diagnostic X-ray wavelength range
provides high energy per photon and a refractive index of unity for
almost all materials in the body. This guarantees that diffraction will
not distort the image and that rays will travel in straight lines [1–8].
X-ray medical imaging uses an external ionizing radiation source,
an X-ray tube, to generate an X-ray beam that is transmitted
through the human body. The attenuation of the X-ray beam is
measured to provide information about variations in tissue density,
which is displayed in X-ray 2D radiographs or 3D computed
tomography (CT) images. The output intensity of a radiation beam
parallel to the x-direction for a specific y-coordinate location in the
selected z-axial planar cross-section, $I_{out}(y; x, z)$, is given by:

$$I_{out}(y; x, z) = I_{in}(y; x, z)\, e^{-\int \mu(x, y; z)\, dx},$$

where $\mu(x, y; z)$ represents the attenuation coefficient for the
transmitted X-ray energy.
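As a simple numerical illustration of this attenuation model (a minimal sketch, not from the book; the path discretization and the sample µ values are assumed for illustration), the line integral can be approximated by a discrete sum along the beam path:

```python
import numpy as np

# Hypothetical attenuation map mu(x) along one beam path, in 1/cm.
# Rough values for soft tissue (~0.2/cm) and bone (~0.5/cm) at CT energies.
dx = 0.1                                             # path step in cm
mu = np.array([0.2] * 50 + [0.5] * 20 + [0.2] * 50)  # tissue, bone, tissue

I_in = 1.0                                  # normalized input intensity
I_out = I_in * np.exp(-np.sum(mu * dx))     # discrete Beer-Lambert line integral
ray_sum = -np.log(I_out / I_in)             # projection value used by CT reconstruction

print(f"I_out/I_in = {I_out:.4f}, ray sum = {ray_sum:.4f}")
```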
Fig. 1. An X-ray mammography image with microcalcification areas.
Conventional X-ray radiography creates a 2D image of a 3D object
projected on the detector plane. Figure 1 shows a 2D mammography
image of a female breast. Several microcalcification areas can be seen
in this image.
While 2D projection radiography may be adequate for many
diagnostic applications, it does not provide the 3D qualitative and
quantitative information about anatomical structures and associated
pathology that is necessary for diagnosing and treating a number
of diseases or abnormalities. Combining the Radon transform for
acquiring ray-integral measurements with 3D scanning geometry, X-ray
computed tomography (CT) provides a three-dimensional reconstruction
of internal organs and structures [9–11]. X-ray CT has proven
to be a very useful and sophisticated imaging tool in diagnostic
radiology and therapeutic intervention protocols. The basic principle of
X-ray CT is the same as that of X-ray digital radiography: X-rays are
transmitted through the body and collected by an array of detectors
to measure the total attenuation along the X-ray path [8–11]. Figure 2
shows a pathological axial image of the cardiovascular cavity of a
cadaver. The corresponding image obtained from X-ray CT is shown
at the bottom of Fig. 2.
Fig. 2. Top: a pathological axial image of the cardiovascular cavity of a cadaver,
bottom: the corresponding image obtained from X-ray CT.
2.3 MAGNETIC RESONANCE IMAGING
The principle of nuclear magnetic resonance for medical imaging
was first demonstrated by Raymond Damadian in 1971 and
Paul Lauterbur in 1973. Nuclear magnetic resonance (NMR) is a
phenomenon of magnetic systems that possess both a magnetic
moment and an angular momentum. In magnetic resonance imaging
(MRI), electromagnetic induction based signals at the magnetic
resonance frequency in the radio frequency (RF) range are collected
through nuclear magnetic resonance from excited nuclei with
magnetic moment and angular momentum present in the body [4–7].
All materials consist of nuclei, which are protons, neutrons or
a combination of both. Nuclei that contain an odd number of protons,
neutrons or both in combination possess a nuclear spin and a
magnetic moment. Most materials are composed of several nuclei
which have magnetic moments, such as ¹H, ²H, ¹³C, ²³Na, etc.
When such a material is placed under a magnetic field, randomly
oriented nuclei experience an external magnetic torque which aligns
the nuclei either in a parallel or an antiparallel direction in reference
to the external magnetic field. The number of nuclei aligned in par-
allel is greater by a fraction than the number of nuclei aligned in
an antiparallel direction and is dependent on the strength of the
applied magnetic field. Thus, a net vector results in the parallel
direction. The nuclei under the magnetic field rotate or precess like
spinning tops precessing around the direction of the gravitational
field. The rotating or precessional frequency of the spins is called
the Larmor precession frequency and is proportional to the mag-
netic field strength. The energy state of the nuclei in the antiparallel
direction is higher than the energy state of the nuclei in the parallel
direction. When external electromagnetic radiation at the Larmor
frequency is applied through the RF coils (because the natural magnetic
frequency of these nuclei falls within the radiofrequency range),
some of the nuclei aligned in the parallel direction get excited and
go to the higher energy state, flipping to the direction antiparallel
to the external magnetic field. The lower energy state has a larger
population of spins than the higher energy states. Thus, through the
application of the RF signal, the spin population is also affected.
When the RF excitation signal is removed, the excited spins
tend to return to their lower energy states through relaxation, resulting
in the recovery of the net vector and the spin population. The
relaxation process causes the emission of an RF signal at the
same Larmor frequency, which is received by the RF coils to generate
an electric potential signal called the free induction decay (FID).
This signal becomes the basis of MR imaging.
Given an external magnetic field $H_0$, the angular (Larmor) frequency
$\omega_0$ of nuclear precession can be expressed as:

$$\omega_0 = \gamma H_0. \tag{1}$$
Thus, the precession frequency depends on the type of nuclei, with a
specific gyromagnetic ratio $\gamma$, and the intensity of the external
magnetic field. This is the frequency at which the nuclei can receive
radio frequency (RF) energy to change their states and exhibit
nuclear magnetic resonance. The excited nuclei return to thermal
equilibrium through a process of relaxation, emitting energy at the
same precession frequency $\omega_0$.
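To make the numbers concrete (a small sketch, not from the book; the gyromagnetic ratios are standard published constants), the Larmor frequency $f_0 = \gamma H_0 / 2\pi$ can be computed for common MR-active nuclei at typical field strengths:

```python
# Larmor frequency f0 = (gamma/2pi) * B0 for common MR-active nuclei.
# gamma/2pi values in MHz per tesla (standard published constants).
GAMMA_OVER_2PI = {
    "1H": 42.577,    # proton
    "13C": 10.708,
    "23Na": 11.262,
    "31P": 17.235,
}

def larmor_mhz(nucleus: str, b0_tesla: float) -> float:
    """Return the Larmor precession frequency in MHz at field b0_tesla."""
    return GAMMA_OVER_2PI[nucleus] * b0_tesla

for b0 in (1.5, 3.0, 7.0):   # common clinical and research field strengths
    print(f"B0 = {b0} T: 1H resonates at {larmor_mhz('1H', b0):.1f} MHz")
```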
It can be shown that during the RF pulse (nuclear excitation
phase), the rate of change of the net stationary magnetization vector
$\vec{M}$ can be expressed as (the Bloch equation):

$$\frac{d\vec{M}}{dt} = \gamma \vec{M} \times \vec{H}, \tag{2}$$

where $\vec{H}$ is the net effective magnetic field.

Considering the total response of the spin system in the presence
of an external magnetic field, along with the RF pulse for nuclear
excitation followed by the nuclear relaxation phase, the change of
the net magnetization vector can be expressed as (Ref. [5]):

$$\frac{d\vec{M}}{dt} = \gamma \vec{M} \times \vec{H} - \frac{M_x \hat{i} + M_y \hat{j}}{T_2} - \frac{(M_z - M_z^0)\,\hat{k}}{T_1}, \tag{3}$$

where $M_z^0$ is the net magnetization in thermal equilibrium in the
presence of the external magnetic field $H_0$ only, and $T_1$ and $T_2$
are, respectively, the longitudinal (spin-lattice) and transverse
(spin-spin) relaxation times in the nuclear relaxation phase, when
excited nuclei return to their thermal equilibrium state.
In other words, the longitudinal relaxation time $T_1$ represents
the return of the net magnetization vector in the z direction to its
thermal equilibrium state, while the transverse relaxation time $T_2$
represents the loss of coherence or dephasing of the spins, leading to
a net zero vector in the x-y plane. The longitudinal and transverse
magnetization vectors, with respect to the relaxation times in the
actual stationary coordinate system, can be given by:

$$\vec{M}_{x,y}(t) = \vec{M}_{x,y}(0)\, e^{-t/T_2}\, e^{-i\omega_0 t}$$
$$\vec{M}_z(t) = \vec{M}_z^0 \left(1 - e^{-t/T_1}\right) + \vec{M}_z(0)\, e^{-t/T_1} \tag{4}$$

where $\vec{M}_{x,y}(0) = \vec{M}_{x',y'}(0)\, e^{-i\omega_0 \tau_p}$
represents the initial transverse magnetization vector, with time set
to zero at the end of the RF pulse of duration $\tau_p$.
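A brief sketch of Eq. (4) (illustrative only; the $T_1$/$T_2$ values are rough gray-matter-like numbers assumed for the example, and only the magnitude of the transverse component is shown, omitting the $e^{-i\omega_0 t}$ carrier):

```python
import numpy as np

# Illustrative relaxation constants (roughly gray-matter-like at 1.5 T; assumed values).
T1, T2 = 0.9, 0.1      # seconds
M0 = 1.0               # equilibrium longitudinal magnetization
Mxy0, Mz0 = 1.0, 0.0   # state right after a 90-degree excitation pulse

t = np.linspace(0.0, 3.0, 7)                # seconds after the pulse
Mxy = Mxy0 * np.exp(-t / T2)                # transverse decay (Eq. 4, magnitude)
Mz = M0 * (1.0 - np.exp(-t / T1)) + Mz0 * np.exp(-t / T1)  # longitudinal recovery

for ti, mxy, mz in zip(t, Mxy, Mz):
    print(f"t = {ti:.1f} s: |Mxy| = {mxy:.3f}, Mz = {mz:.3f}")
```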
During imaging, the RF pulse transmitted through an RF coil
causes nuclear excitation, changing the longitudinal and transverse
magnetization vectors. After the RF pulse is turned off, the excited
nuclei go through the relaxation phase, emitting the absorbed energy
at the same Larmor frequency, which can be detected as an electrical
signal called the free induction decay (FID). The FID is the raw NMR
signal and can be acquired through the same RF coil tuned to the
Larmor frequency.
Representing a spatial location vector $r$ in the spinning nuclei
system, with a net magnetic field vector $\vec{H}_r(r)$ and the corresponding
net magnetization vector $\vec{M}(r, t)$, the magnetic flux $\phi(t)$ through the
RF coil can be given as (Ref. [5]):

$$\phi(t) = \int_{\text{object}} \vec{H}_r(r) \cdot \vec{M}(r, t)\, dr, \tag{5}$$

where the voltage induced in the RF coil, $V(t)$, is the raw NMR signal
and can be expressed (using Faraday's law) as:

$$V(t) = -\frac{\partial \phi(t)}{\partial t} = -\frac{\partial}{\partial t} \int_{\text{object}} \vec{H}_r(r) \cdot \vec{M}(r, t)\, dr. \tag{6}$$
Figure 3 provides axial, coronal and sagittal cross-section MR images
of a brain. Details of the gray and white matter structures are evident
in these images.
2.4 SINGLE PHOTON EMISSION COMPUTED
TOMOGRAPHY
In 1934, Frédéric Joliot-Curie and Irène Joliot-Curie discovered
radiophosphorus ³²P, a radioisotope used to demonstrate radioactive decay.
Fig. 3. (from left to right): axial, coronal and sagittal cross-section MR images of
a human brain.
In 1951, radionuclide imaging of the thyroid was demonstrated by
Cassen through administration of the iodine radioisotope ¹³¹I. In 1952,
Anger developed a scintillation camera, also known as the Anger
camera, with sodium iodide crystals coupled to photomultiplier
tubes. Kuhl and Edwards developed a transverse-section tomography
gamma-ray scanner for radionuclide imaging in the 1960s [12–15].
Their imaging system included an array of multiple collimated
detectors surrounding a patient, with rotate-translate motion to
acquire projections for emission tomography. With advances in
computer reconstruction algorithms and detector instrumentation,
gamma-ray imaging is now known as single photon emission
computed tomography (SPECT) for 3D imaging of human organs,
and has been extended even to full-body imaging. Radioisotopes are
injected into the body through the administration of radiopharmaceutical
drugs that metabolize with the tissue, making the tissue a source of
gamma-ray emissions. The gamma rays from the tissue pass through
the body and are captured by detectors surrounding the body to
acquire raw data defining projections. The projection data are then
used in reconstruction algorithms to display images with the help
of a computer and high-resolution displays. In SPECT imaging, the
commonly used radionuclides are thallium (²⁰¹Tl), technetium (⁹⁹ᵐTc),
iodine (¹²³I) and gallium (⁶⁸Ga). These radionuclides decay by emitting
gamma rays with photon energies ranging from 135 keV to 511 keV.
The attenuation of gamma rays is similar in nature to that of X-rays and
can be expressed as:

$$I_d = I_0\, e^{-\mu x},$$

where $I_0$ is the intensity of the gamma rays at the source and $I_d$ is
the intensity at the detector after the gamma rays have traversed a
distance $x$ in the body, with a linear attenuation coefficient $\mu$ that
depends on the density of the medium and the energy of the gamma-ray
photons.
Figure 4 shows ⁹⁹ᵐTc SPECT images of a human brain. It can be
noticed that SPECT images are poor in resolution and anatomical
detail as compared to CT or MR images. However, SPECT
images show the radioactivity distribution in the tissue, representing a
specific metabolism or blood flow.
Fig. 4. SPECT image of a human brain.
2.5 POSITRON EMISSION TOMOGRAPHY
Positron emission tomography (PET) imaging methods were developed
in the 1970s by a number of researchers, including Phelps,
Robertson, Ter-Pogossian, Brownell and several others [14, 16]. The
concept of PET imaging is based on the simultaneous detection of
two 511 keV photons traveling in opposite directions. The
distinct feature of PET imaging is its ability to trace radioactive
material metabolized in the tissue to provide specific information about
its biochemical and physiological behavior.
Some radioisotopes decay by emitting positively charged particles
called positrons. The emission of a positron is accompanied by
a significant amount of kinetic energy. After emission, a positron
typically travels for 1 mm–3 mm, losing some of its kinetic energy.
The loss of energy makes the positron suitable for interaction with a
loosely bound electron within the material for annihilation. The
annihilation of the positron with the electron causes the formation of
two 511 keV gamma photons traveling in opposite directions
(close to 180° apart). The two photons can be detected simultaneously
by two surrounding scintillation detectors within a small time
window. This simultaneous detection within a small time window
(typically on the order of nanoseconds) is called a coincidence detection,
indicating that the annihilation originated along the line joining the
two detectors involved in the coincidence detection. Thus, by detecting
a large number of coincidences, the source location and distribution
can be reconstructed through image reconstruction algorithms. It
should be noted that the point of emission of a positron is different
from the point of annihilation with an electron. Though the imaging
process is aimed at reconstructing the source representing the
locations of positron emission, it is the locations of the annihilation
events that are reconstructed as an image in positron emission
tomography (PET). However, the distribution of positron emission
events is considered to be close enough to the distribution of
annihilation events within a resolution limit.
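As a rough sketch of this idea (purely illustrative; the ring geometry, source position, and event counts are assumptions, and physical effects such as positron range, scatter and attenuation are ignored), one can simulate back-to-back photon pairs and bin coincidences by detector pair, each pair defining a line of response:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

N_DET = 64                  # detectors evenly spaced on a ring (assumed geometry)
R = 40.0                    # ring radius in cm
src = np.array([5.0, 2.0])  # assumed point-source location inside the ring

def detector_hit(origin, angle):
    """Index of the ring detector hit by a ray leaving `origin` at `angle`."""
    d = np.array([np.cos(angle), np.sin(angle)])
    b = origin @ d
    # Solve |origin + t*d| = R for the positive root t (ray-circle intersection).
    t = -b + np.sqrt(b * b - (origin @ origin - R * R))
    hit = origin + t * d
    return int(np.floor(np.arctan2(hit[1], hit[0]) / (2 * np.pi / N_DET))) % N_DET

coincidences = Counter()
for _ in range(10000):                   # simulated annihilation events
    theta = rng.uniform(0.0, 2 * np.pi)  # random emission direction
    pair = tuple(sorted((detector_hit(src, theta),
                         detector_hit(src, theta + np.pi))))  # back-to-back photons
    coincidences[pair] += 1              # one count along this line of response

print("Most frequent lines of response:", coincidences.most_common(3))
```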
The main advantage of PET imaging is its ability to extract
metabolic and functional information of the tissue because of the
unique interaction of the positron within the matter of the tissue. The
most common positron-emitting radionuclide used in PET imaging
is fluorine-18 (¹⁸F), which is administered as a fluorine-labeled
radiopharmaceutical called fluorodeoxyglucose (FDG). The FDG images
obtained through PET imaging show very significant information about the
glucose metabolism and blood flow of the tissue. Such metabolism
information has proven to be critical in determining the heterogeneity
and invasiveness of tumors.

Figure 5 shows a set of axial cross-sections of brain PET images
showing glucose metabolism. Streaking artifacts and low-resolution
detail can be noticed in these images. The artifacts seen in PET
images are primarily due to the low volume of data caused by the
nature of the radionuclide-tissue interaction and the electronic
collimation necessary to reject scattered events.
Fig. 5. Serial axial images of a human brain acquired using FDG PET.

2.6 ULTRASOUND IMAGING

Sound or acoustic waves were successfully used in sonar technology
in military applications during World War II. The potential of ultrasound
waves in medical imaging was explored and demonstrated by sev-
eral researchers in the 1970s and 1980s, including Wild, Reid, Frey,
Greenleaf and Goldberg [17–20]. Today, ultrasound imaging is
successfully used in diagnostic imaging of anatomical structures, blood
flow measurements and tissue characterization. Safety, portability and
low cost have made ultrasound imaging a significantly
successful diagnostic imaging modality.
Sound waves are characterized by wavelength and frequency.
Sound waves audible to the human ear comprise frequencies
ranging from 15 Hz to 20 kHz. Sound waves with frequencies
above 20 kHz are called ultrasound waves. The velocity of propagation
of sound in water and in most body tissues is about 1500 m/sec.
Thus, the wavelength-based resolution criterion from the
electromagnetic radiation concept is not satisfied; the resolution
capability of acoustic energy is instead dependent on the frequency
spectrum. The attenuation coefficient in body tissues varies
approximately in proportion to the acoustic frequency, at about
1.5 dB/cm/MHz. Thus, at much higher frequencies, imaging is not
meaningful because of excessive attenuation. In diagnostic ultrasound,
imaging resolution is limited by the wavelength: shorter wavelengths
provide better imaging resolution, but longer wavelengths penetrate
deeper into tissue. Since the velocity of sound waves in a specific
medium is fixed, the wavelength is inversely proportional to the
frequency. In medical ultrasound imaging, sound waves of 2 MHz
to 10 MHz can be used, but 2 MHz to 5 MHz frequencies are more
common.
Let us assume that a transducer provides an acoustic signal of
intensity $s(x, y)$ with a pulse $\omega(t)$ that is transmitted in a medium
with an attenuation coefficient $\mu$ and reflected by a biological tissue
of reflectivity $R(x, y, z)$ at a distance $z$ from the transducer. The
recorded reflected intensity of the time-varying acoustic signal, $J_r(t)$,
over the region $R$ can then be expressed as:

$$J_r(t) = K \left[ \int \frac{e^{-2\mu z}}{z}\, R(x, y, z)\, s(x, y)\, \bar{\omega}\!\left(t - \frac{2z}{c}\right) dx\, dy\, dz \right], \tag{7}$$

where $K$, $\bar{\omega}(t)$ and $c$, respectively, represent a normalizing constant,
the received pulse, and the velocity of the acoustic signal in the medium.

Using an adaptive time-varying gain to compensate for the attenuation
of the signal, Eq. (7) for the compensated recorded reflected
signal from the tissue, $J_{cr}(t)$, can be simplified to:

$$J_{cr}(t) = K \left[ \int R(x, y, z)\, s(x, y)\, \bar{\omega}\!\left(t - \frac{2z}{c}\right) dx\, dy\, dz \right],$$

or, in terms of a convolution, as:

$$J_{cr}(t) = K \left[ R\!\left(x, y, \frac{ct}{2}\right) \otimes s(-x, -y)\, \bar{\omega}(t) \right], \tag{8}$$
where ⊗ represents a 3D convolution. This is a convolution of a
reflectivity term characterizing the tissue and an impulse response
characterizing the source parameters.
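A toy illustration of this pulse-echo model (a minimal sketch with assumed reflector depths, attenuation coefficient, and sampling rate; not the book's implementation) converts echo arrival times to depth via $z = ct/2$ and applies the time-varying gain $e^{2\mu z}$ described above:

```python
import numpy as np

c = 1540.0e2        # speed of sound in tissue, cm/s (~1540 m/s)
mu = 0.5            # assumed amplitude attenuation coefficient, 1/cm
fs = 20e6           # sampling rate, Hz

# Assumed point reflectors: (depth in cm, reflectivity)
reflectors = [(2.0, 1.0), (5.0, 0.8), (8.0, 0.6)]

t = np.arange(int(12e-5 * fs)) / fs          # 120 microseconds of echo data
rf = np.zeros_like(t)
for z, refl in reflectors:
    idx = int(2.0 * z / c * fs)              # round-trip time of flight -> sample index
    rf[idx] += refl * np.exp(-2.0 * mu * z)  # echo weakened by two-way attenuation

# Time-gain compensation: amplify by exp(+2*mu*z(t)) with z = c*t/2.
tgc = np.exp(2.0 * mu * (c * t / 2.0))
compensated = rf * tgc

for z, refl in reflectors:
    idx = int(2.0 * z / c * fs)
    print(f"depth {z} cm: raw {rf[idx]:.4f}, compensated {compensated[idx]:.3f} (true {refl})")
```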
Backscattered echo and Doppler shift principles are more com-
monly used with the interaction of sound waves with human tissue.
Sometimes, the scattering information is complemented with trans-
mission or attenuation related information such as velocity in the
tissue. Figure 6 shows a diastolic color Doppler flow convergence in
the apical four-chamber view of mitral stenosis.
Fig. 6. A diastolic color Doppler flow image showing an apical four-chamber view
of mitral stenosis.
2.7 PRINCIPLES OF IMAGE FORMATION
It is usually desirable for an imaging system to behave like a lin-
ear spatially invariant system. In other words, the response of the
imaging system should be consistent, scalable and independent of
the spatial position of the object being imaged. A system is said to
be linear if it follows two properties: scaling and superposition [1–3].
In mathematical representation, this can be expressed as:

$$h\{a I_1(x, y, z) + b I_2(x, y, z)\} = a\, h\{I_1(x, y, z)\} + b\, h\{I_2(x, y, z)\}, \tag{9}$$

where $a$ and $b$ are scalar multiplication factors, and $I_1(x, y, z)$ and
$I_2(x, y, z)$ are two inputs to the system represented by the response
function $h$.
It should be noted that in real-world situations, it is difficult to
find a perfectly linear image formation system. For example, the
response of photographic film or X-ray detectors cannot be linear
over the entire operating range. Nevertheless, under constrained
conditions and limited exposures, the response can be practically
linear. Also, a nonlinear system can be modeled with piecewise linear
properties under specific operating considerations.
In general, image formation is a neighborhood process. One can
assume a radiant energy source, such as light, illuminating an object
represented by the function $f(\alpha, \beta, \gamma)$. Using the additive property
of the radiating energy distribution to form an image, $g(x, y, z)$ can be
written as:

$$g(x, y, z) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} h(x, y, z, \alpha, \beta, \gamma, f(\alpha, \beta, \gamma))\, d\alpha\, d\beta\, d\gamma, \tag{10}$$

where $h(x, y, z, \alpha, \beta, \gamma, f(\alpha, \beta, \gamma))$ is called the response function of the
image formation system. If the image formation system is assumed
to be linear, the image expression becomes:

$$g(x, y, z) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} h(x, y, z, \alpha, \beta, \gamma)\, f(\alpha, \beta, \gamma)\, d\alpha\, d\beta\, d\gamma. \tag{11}$$
The response function $h(x, y, z, \alpha, \beta, \gamma)$ is called the point spread
function (PSF) of the image formation system. The PSF depends
on the spatial extent of the object and image coordinate systems.
The expression $h(x, y, z, \alpha, \beta, \gamma)$ is the generalized version of the PSF
described for the linear image formation system, which can be further
characterized as spatially invariant (SI) or spatially variant (SV).
If a linear image formation system is such that the PSF is
uniform across the entire spatial extent of the object and image
coordinates, the system is called a linear spatially invariant (LSI) system.
In such a case, the image formation can be expressed as:

$$g(x, y, z) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} h(x-\alpha, y-\beta, z-\gamma)\, f(\alpha, \beta, \gamma)\, d\alpha\, d\beta\, d\gamma. \tag{12}$$

In other words, for an LSI image formation system, the image is
represented as the convolution of the object radiant energy distribution
and the PSF of the image formation system. It should be noted that
the PSF is basically a degrading function that causes blur in the
image and can be compared to the unit impulse response, a common
term used in signal processing. In other words, the acquired
image $g(x, y, z)$ for an LSI imaging system can be expressed as the
convolution of the object distribution with the PSF as:

$$g(x, y, z) = h(x, y, z) \otimes f(x, y, z) + n(x, y, z), \tag{13}$$

where $n(x, y, z)$ represents an additive noise term.

Taking the Fourier transform, the above equation can be
represented in the frequency domain as:

$$G(u, v, w) = H(u, v, w)\, F(u, v, w) + N(u, v, w), \tag{14}$$

where $G(u, v, w)$, $H(u, v, w)$ and $N(u, v, w)$ are, respectively, the Fourier
transforms of $g(x, y, z)$, $h(x, y, z)$ and $n(x, y, z)$:
$$G(u, v, w) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} g(x, y, z)\, e^{-j2\pi(ux+vy+wz)}\, dx\, dy\, dz,$$
$$H(u, v, w) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} h(x, y, z)\, e^{-j2\pi(ux+vy+wz)}\, dx\, dy\, dz,$$
$$N(u, v, w) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} n(x, y, z)\, e^{-j2\pi(ux+vy+wz)}\, dx\, dy\, dz. \tag{15}$$
Image processing and enhancement operations can be performed
easily and more effectively on the above representation
of image formation through an LSI imaging system. However, the
validity of such an assumption for imaging systems in the real world
may be limited.
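The LSI model of Eqs. (13)–(14) can be sketched numerically (an illustrative 2D example with an assumed Gaussian PSF and noise level, using NumPy/SciPy; the book does not prescribe an implementation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(1)

# Assumed object: a 64x64 "phantom" with two bright rectangles.
f = np.zeros((64, 64))
f[20:28, 20:28] = 1.0
f[40:52, 36:44] = 0.7

# LSI image formation g = h (*) f + n, with a Gaussian PSF as h and additive noise n.
blurred = gaussian_filter(f, sigma=2.0, mode='wrap')   # spatial-domain convolution with PSF
g = blurred + 0.01 * rng.standard_normal(f.shape)

# Equivalent frequency-domain view G = H F, by the convolution theorem (Eq. 14).
F = np.fft.fft2(f)
psf = np.zeros_like(f)
psf[0, 0] = 1.0                                        # delta at the origin...
H = np.fft.fft2(gaussian_filter(psf, sigma=2.0, mode='wrap'))  # ...blurred to give h, then H
g_freq = np.real(np.fft.ifft2(H * F))

print("max |spatial - frequency| =", np.abs(blurred - g_freq).max())
```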
2.8 RECEIVER OPERATING CHARACTERISTICS (ROC)
ANALYSIS AS A PERFORMANCE MEASURE
Receiver operating characteristic (ROC) analysis is a statistical
measure for studying the performance of an imaging or diagnostic
system with respect to its ability to detect abnormalities accurately
and reliably (true positives) without producing false detections. In
other words, ROC analysis provides a systematic analysis of the
sensitivity and specificity of a diagnostic test [1, 8, 21].
Let us assume the total number of examination cases to be N_tot, out of which N_tp cases have a positive true-condition with the actual presence of the object, and the remaining cases, N_tn, have a negative true-condition with no object present. Let us suppose these cases are examined through the test for which we need to evaluate accuracy, sensitivity and specificity factors. Considering that the observer does not cause any loss of information or misinterpretation, let N_otp (true positive) be the number of positive observations from the N_tp positive true-condition cases and N_ofn (false negative) be the number of negative observations from the N_tp positive true-condition cases. Also, let N_otn (true negative) be the number of negative observations from the N_tn negative true-condition cases and N_ofp (false positive) be the number of positive observations from the N_tn negative true-condition cases.
Thus,
N_tp = N_otp + N_ofn   and   N_tn = N_ofp + N_otn.   (16)
The following relationships can be easily derived from the above definitions:
(1) True positive fraction (TPF) is the ratio of the number of positive observations to the number of positive true-condition cases:
TPF = N_otp/N_tp.   (17)
(2) False negative fraction (FNF) is the ratio of the number of negative observations to the number of positive true-condition cases:
FNF = N_ofn/N_tp.   (18)
(3) False positive fraction (FPF) is the ratio of the number of positive observations to the number of negative true-condition cases:
FPF = N_ofp/N_tn.   (19)
(4) True negative fraction (TNF) is the ratio of the number of negative observations to the number of negative true-condition cases:
TNF = N_otn/N_tn.   (20)
It should be noted that:
TPF + FNF = 1   and   TNF + FPF = 1.   (21)
Fig. 7. ROC curves (TPF versus FPF), with curve “a” indicating better overall classification ability than curve “b,” while curve “c” shows random probability.
A graph between TPF and FPF is called a receiver operating characteristic (ROC) curve for a specific medical imaging or diagnostic test for detection of an object. It should also be noted that statistical random trials with equal probability of positive and negative observations would lead to the diagonally placed straight line as the ROC curve. Different tests and different observers may lead to different ROC curves for the same object detection. Figure 7 shows three different ROC curves for a hypothetical detection/diagnosis. It can be noted that the observer corresponding to curve “a” is far better than the observer corresponding to curve “b.”
The true positive fraction (TPF) is also called the sensitivity, while the true negative fraction (TNF) is known as the specificity of the test for detection of an object. The accuracy of the test is given by the ratio of correct observations to the total number of examination cases. Thus,
Accuracy = (N_otp + N_otn)/N_tot.   (22)
In other words, different imaging modalities and observers may lead to different accuracy, sensitivity and specificity levels.
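The fractions above map directly to a few lines of code. The sketch below (illustrative only; the observation counts are hypothetical) computes sensitivity, specificity and accuracy from the counts defined in Eq. (16):

    # ROC fractions from observation counts, following Eqs. (16)-(22).
    def roc_fractions(n_otp, n_ofn, n_otn, n_ofp):
        n_tp = n_otp + n_ofn                # positive true-condition cases
        n_tn = n_ofp + n_otn                # negative true-condition cases
        tpf = n_otp / n_tp                  # sensitivity, Eq. (17)
        fnf = n_ofn / n_tp                  # Eq. (18)
        fpf = n_ofp / n_tn                  # Eq. (19)
        tnf = n_otn / n_tn                  # specificity, Eq. (20)
        accuracy = (n_otp + n_otn) / (n_tp + n_tn)   # Eq. (22)
        return tpf, fnf, fpf, tnf, accuracy

    # Hypothetical test: 80 of 100 positives and 90 of 100 negatives observed.
    print(roc_fractions(n_otp=80, n_ofn=20, n_otn=90, n_ofp=10))
    # (0.8, 0.2, 0.1, 0.9, 0.85); note TPF + FNF = 1 and TNF + FPF = 1, Eq. (21)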
2.9 CONCLUDING REMARKS
This chapter presented basic principles of major medical imaging modalities and a linear spatially invariant model of image formation that makes post-processing operations for image enhancement and analysis practically easier. Though these assumptions may not be strictly satisfied by real-world imaging scanners, medical imaging systems often perform close to them. Medical imaging
modalities and image analysis systems are evaluated on the basis of their capability to produce true detections of pathologies while minimizing false detections. Such performance evaluations are often conducted through receiver operating characteristic (ROC) curves, which provide a very useful way of understanding detection capability in terms of sensitivity and specificity and the potential tradeoffs between true positive and false positive detections. Quantitative data analysis with appropriate models can improve image presentation (through better image reconstruction methods) and image analysis with feature detection, analysis and classification, improving the true positive rate while minimizing the false positive rate of detection of the specific pathology for which the imaging tests are performed.
References
1. Dhawan AP, Medical Image Analysis, John Wiley & Sons, Hoboken, 2003.
2. Barrett H, Swindell W, Radiological Imaging: The Theory of Image Forma-
tion, Detection and Processing Volumes 1–2, Academic Press, New York,
1981.
3. Bushberg JT, Seibert JA, Leidholdt EM, Boone JM, The Essentials of Med-
ical Imaging, Williams & Wilkins, 1994.
4. Cho ZH, Jones JP, Singh M, Fundamentals of Medical Imaging, John
Wiley & Sons, New York, 1993.
5. Liang Z, Lauterbur PC, Principles of Magnetic Resonance Imaging, IEEE
Press, 2000.
6. Lev MH, Hochberg F, Perfusion magnetic resonance imaging to assess
brain tumor responses to new therapies, J Moffit Cancer Center 5:
446–450, 1998.
7. Stark DD, Bradley WG, Magnetic Resonance Imaging, 3rd edn., Mosby,
1999.
8. Shung KK, Smith MB, Tsui B, Principles of Medical Imaging, Academic
Press, 1992.
9. Hounsfield GN, A method and apparatus for examination of a body
by radiation such as X or gamma radiation, Patent 1283915, The Patent
Office, London, England, 1972.
10. Hounsfield GN, Computerized transverse axial scanning tomography: Part 1, description of the system, Br J Radiol 46: 1016–1022, 1973.
11. Cormack AM, Representation of a function by its line integrals with
some radiological applications, J Appl Phys 34: 2722–2727, 1963.
12. Cassen B, Curtis L, Reed C, Libby R, Instrumentation for ¹³¹I used in medical studies, Nucleonics 9: 46–48, 1951.
13. Anger H, Use of gamma-ray pinhole camera for in vivo studies, Nature
170: 200–204, 1952.
14. Brownell G, Sweet HW, Localization of brain tumors, Nucleonics 11:
40–45, 1953.
15. Casey ME, Eriksson L, Schmand M, Andreaco M, Paulus M, Dahlborn
M, Nutt R, Investigation of LSO crystals for high spatial resolution
positron emission tomography, IEEE Trans Nucl Sci 44: 1109–1113, 1997.
16. Kuhl E, Edwards RQ, Reorganizing data from transverse section scans using digital processing, Radiology 91: 975–983, 1968.
17. Fish P, Physics and Instrumentation of Diagnostic Medical Ultrasound, John
Wiley & Sons, Chichester, 1990.
18. Kremkau FW, Diagnostic Ultrasound Principles and Instrumentation,
Saunders, Philadelphia, 1995.
19. Kremkau FW, Doppler Ultrasound: Principles and Instruments, Saunders,
Philadelphia, 1991.
20. Hykes D, Ultrasound Physics and Instrumentation, Mosby, New York,
1994.
21. Swets JA, Pickett RM, Evaluation of Diagnostic Systems, Academic Press,
Harcourt Brace Jovanovich Publishers, New York, 1982.
CHAPTER 3
Principles of X-ray Anatomical Imaging
Modalities
Brent J Liu and HK Huang
This chapter provides basic concepts of various X-ray imaging modal-
ities. The first part of the chapter addresses digital X-ray projection
radiography which includes digital fluorography, computed radiogra-
phy, X-ray mammography, and digital radiography. The key compo-
nents belonging to each of these imaging modalities will be discussed
along with basic principles to reconstruct the 2D image. The second part
of the chapter focuses on 3D volume X-ray acquisition which includes
X-ray CT, multislice, cine, and 4D CT. The image reconstruction methods
will be discussed along with key components which have advanced the
CT technology to the present day.
3.1 INTRODUCTION
This chapter will present X-ray anatomical imaging modalities, which cover a large share of all diagnostic imaging procedures; X-ray projection radiography alone accounts for 70% of them. In this chapter, we will only focus on digital X-ray anatomical imaging modalities, which include digital fluorography, computed radiography, X-ray mammography, digital radiography, X-ray CT, and multislice, cine, and 4D X-ray CT.
There are two approaches to convert a film-based image to digital form. The first is to utilize existing equipment in the radiographic procedure room and only change the image receptor component. Two technologies, computed radiography (CR) using the photostimulable phosphor imaging plate technology, and digital fluorography, are in this category. This approach does not require any modification in the procedure room and is therefore more easily adopted for daily clinical practice. The second approach is to redesign the conventional radiographic procedure equipment, including the geometry of the X-ray beams and the image receptor. This method is therefore more expensive to adopt, but the advantage is that it offers special features like low X-ray scatter which would not otherwise be achievable in the conventional procedure.
3.2 DIGITAL FLUOROGRAPHY
Since 70% of the radiographic procedures still use film as an output
medium, it is necessary to develop methods to convert images on
films to digital format. This section discusses digital fluorography
which converts images to digital format utilizing a video camera
and A/D converter.
The video scanning system is a low cost X-ray digitizer which produces either a 512 K or 1 K digitized image with 8 bits/pixel. The system consists of three major components: a scanning device with a video or a charge-coupled device (CCD) camera that scans the X-ray film, an analog/digital converter that converts the video signals from the camera to gray level values, and an image memory to store the digital signals from the A/D converter. The image stored in the image memory is the digital representation of the X-ray film or image in the image intensifier tube obtained by using the video scanning system. If the image memory is connected to a digital-to-analog (D/A) conversion circuitry and to a TV monitor, this image can be displayed back on the monitor (which is a video image). The memory can be connected to a peripheral storage device for long-term image archive. Figure 1 shows a block diagram of a video scanning
Fig. 1. Block diagram of a video scanning system; the digital chain is a standard component in all types of scanners.
system. The digital chain shown is a standard component in all types of scanners.
A video scanner system can be connected to an image intensifier tube to form a digital fluoroscopic system. Digital fluorography is a method that can produce dynamic digital X-ray images without changing the radiographic procedure room drastically from conventional fluorography. This technique requires an add-on unit in the conventional fluorographic system. Figure 2 shows a schematic of the digital fluorographic system with the following major components:
(1) X-ray source: The X-ray tube and a grid to minimize X-ray scatter.
(2) Image receptor: The image receptor is an image intensifier tube.
(3) Video camera plus optical system: The output light from the image intensifier goes through an optical system, which allows the video camera to be adjusted for focusing. The amount of light going into the camera is controlled by means of a light diaphragm. The camera used is usually a plumbicon or a CCD (charge-coupled device) with 512 or 1024 scan lines.
Fig. 2. Schematic of a digital fluorographic system coupling the image intensifier and the digital chain. See text for key to numbers.
(4) Digital chain: The digital chain consists of an A/D converter, image memories, an image processor, digital storage, and a video display. The A/D converter, the image memory, and the digital storage can handle a 512 × 512 × 8 bit image at 30 frames per second, or a 1024 × 1024 × 8 bit image at 7.5 frames per second (see the data-rate sketch below). Sometimes a RAID (redundant array of inexpensive disks) is used to handle the high speed data transfer.
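As a rough check on these figures, both modes imply the same sustained data rate, as the short sketch below shows (a back-of-the-envelope calculation, assuming 1 byte per 8-bit pixel):

    # Sustained data rate the digital chain must handle (Sec. 3.2, item 4).
    def rate_mb_per_s(width, height, bytes_per_pixel, frames_per_s):
        return width * height * bytes_per_pixel * frames_per_s / 1e6

    print(rate_mb_per_s(512, 512, 1, 30))      # ~7.9 MB/s, 512 x 512 x 8 bit
    print(rate_mb_per_s(1024, 1024, 1, 7.5))   # ~7.9 MB/s, 1 K x 1 K x 8 bit

Both modes work out to roughly 7.9 MB/s, which is why a fast storage path such as a RAID is attractive.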
Fluorography is used to visualize the motion of body compartments (e.g. blood flow, heart beat) and the movement of a catheter, as well as to pinpoint an organ in a body region for subsequent detailed diagnosis. Each exposure required in a fluorography procedure is very small compared with a conventional X-ray procedure.
Digital fluorography is considered to be an add-on system
because a digital chain is added to an existing fluorographic
unit. This method utilizes the established X-ray tube assembly,
image intensifier, video scanning, and digital technologies. The out-
put from a digital fluorographic system is a sequence of digital
images displayed on a video monitor. Digital fluorography has an
advantage over conventional fluorography in that it gives a larger
dynamic range image and can remove uninteresting structures in
the images by performing digital subtraction.
When image processing is introduced to the digital fluorographic system, other names are used depending on the application, for example, digital subtraction angiography (DSA), digital subtraction arteriography (DSA), digital video angiography (DVA), intravenous video arteriography (IVA), computerized fluoroscopy (CF), and digital video subtraction angiography (DVSA).
3.3 IMAGING PLATE TECHNOLOGY
The imaging plate system, commonly called computed radiography (CR), consists of two components: the imaging plate and the scanning mechanism. The imaging plate (laser-stimulated luminescence phosphor plate), used for X-ray detection, is similar in principle to the phosphor intensifier screen used in the standard screen/film receptor. The scanning of a laser-stimulated luminescence phosphor imaging plate also uses a scanning mechanism (reader) similar to that of a laser film scanner. The only difference is that instead of scanning an X-ray film, the laser scans the imaging plate. This section describes the principle of the imaging plate, specifications of the system, and system operation.
3.3.1 Principle of the Laser-Stimulated Luminescence
Phosphor Plate
The physical size of the imaging plate is similar to that of a conventional radiographic screen; it consists of a support coated with a photostimulable phosphor layer made of BaFX:Eu²⁺ (X = Cl, Br, I), europium-activated barium fluorohalide compounds. After the X-ray exposure, the photostimulable phosphor crystal is able to store a part of the absorbed X-ray energy in a quasistable state. Stimulation of the plate by a 633 nanometer wavelength helium-neon (red) laser beam leads to emission of luminescence radiation of a different wavelength (400 nanometer), the amount of which is a function of the absorbed X-ray energy [Fig. 3(B)].
The luminescence radiation stimulated by the laser scanning is collected through a focusing lens and a light guide into a photomultiplier tube, which converts it into electronic signals. Figure 3(A)
[Figure 3(A) panels: an unused imaging plate is exposed to X-ray photons, recording the X-ray image; laser-beam scanning extracts the X-ray image from the plate by converting it to light photons, which form a light image; the small residual image is then thoroughly erased by flooding the plate with light, and the erased plate can be used again.]
Fig. 3. Physical principle of the laser-stimulated luminescence phosphor imaging plate. Above: (A) From the X-ray photons exposing the imaging plate to the formation of the light image. Below: (B) The wavelength of the scanning laser beam (b) is different from that of the emitted light (a) from the imaging plate after stimulation (courtesy of J Miyahara, Fuji Photo Film Co Ltd).
shows the physical principle of the laser-stimulated luminescence phosphor imaging plate. The size of the imaging plate can be 8 × 10, 10 × 12, 14 × 14, or 14 × 17 square inches. The image produced is 2 000 × 2 500 × 10 bits.
3.3.2 Computed Radiography System Block Diagram
and its Principle of Operation
The imaging plate is housed inside a cassette just like a screen/film receptor. Exposure of the imaging plate (IP) to X-ray radiation results in the formation of a latent image on the plate (similar to the latent image formed in a screen/film receptor). The exposed plate is processed through a CR reader to extract the latent image — analogous to the exposed film developed by a film developer. The processed imaging plate can be erased by bright light and be used again. The imaging plate can either be removable or nonremovable. An image processor is used to optimize the display (e.g. lookup tables) based on types of exam and body regions.
The output of this system can be one of two forms — a printed
film or a digital image — the latter can be stored in a digital storage
device and be displayed on a video monitor. Figure 4 illustrates the
Fig. 4. Dataflow of an upright CR system with nonremovable imaging plates (IP). (1) Formation of the latent image on the IP. (2) The IP is scanned by the laser beam. (3) Light photons are converted to electronic signals. (4) Electronic signals are converted to digital signals which form a CR image (courtesy of Konica Corporation, Japan).
dataflow of an upright CR system with three nonremovable imaging plates. Figure 5 shows the latest XG-5000 multiplate reader system with removable imaging plates and its components.
3.3.3 Operating Characteristics of the CR System
A major advantage of the CR system compared to the conven-
tional screen/film system is that the imaging plate is linear and
has a large dynamic range between the X-ray exposure and the
relative intensity of the stimulated phosphors. Hence, under a
similar X-ray exposure condition, the image reader is capable of
producing images with density resolution comparable or superior
to those from the conventional screen/film system. Since the image
reader automatically adjusts the amount of exposure received by
the plate, over- or underexposure within a certain limit would not
affect the appearance of the image. This useful feature can best be
explained by the two examples given in Fig. 6.
In quadrant A of Fig. 6, example I represents the plate exposed to a higher relative exposure level but with a narrower exposure range (10³–10⁴). The linear response of the plate after laser scanning yields a high level but narrow light intensity (photostimulable luminescence, PSL) range from 10³–10⁴. These light photons are converted into electronic output signals representing the latent
Fig. 5. A Fuji XG-5000 CR System with the multiimaging plate reader and two
QA/Image Processing workstations (IIP and IIP Lite). Note that the second work-
station shares the same database as the first workstation so that an X-ray technician
can perform QA and image processing while another is operating the plate reader
and processing the imaging plates.
image stored on the image plate. The image processor senses a narrow range of electronic signals and selects a special look-up table [the linear line in Fig. 6(B)], which converts the narrow dynamic range 10³–10⁴ to a large relative light exposure of 1 to 50 [Fig. 6(B)]. If hardcopy is needed, a large latitude film can be used that covers the dynamic range of the light exposure from 1 to 50; as shown in quadrant C, these output signals will register the entire optical density (OD) range from OD 0.2 to OD 2.8 on the film. The total system response including the imaging plate, the look-up table, and the
Fig. 6. Two examples, I and II, illustrate the operating characteristics of the CR system and explain how it compensates for over- and underexposures.
film subject to this exposure range is depicted as curve I in quadrant D. The system-response curve, relating the relative exposure on the plate and the OD of the output film, shows a high gamma value and is quite linear. This example demonstrates how the system accommodates a high exposure level with a narrow exposure range.
Consider example II, in which the plate receives a lower exposure level but with a wider exposure range. The CR system automatically selects a different look-up table in the image processor to accommodate this range of exposure so that the output signals again span the entire light exposure range from 1 to 50.
The system-response curve is shown as curve II in quadrant D. The key in selecting the correct look-up table is that the range of the exposure has to span the total light exposure of the film, namely from 1 to 50. It is noted that in both examples, the entire useful optical density range for diagnostic radiology is utilized.
If a conventional screen/film combination system were used, the exposure in example I of Fig. 6 would only utilize the higher optical density region of the film, whereas in example II it would utilize the lower region. Neither case would utilize the full dynamic range of the optical density in the film. From these two examples, it is seen that the CR system allows the utilization of the full optical density dynamic range, regardless of whether the plate is overexposed or underexposed. Figure 7 shows an example comparing the results of using screen/film versus CR under identical X-ray exposures.
Fig. 7. Comparison of quality of images obtained by using (A) the conventional screen/film method and (B) CR techniques. Exposures were 70 kVp; 10 mAs, 40 mAs, 160 mAs, 320 mAs on a skull phantom. It is seen in this example that the CR technique is almost dose independent (courtesy of Dr S Balter).
The same effect is achieved if the image signals are for digital out-
put, and not for hard copy film. That is, the digital image produced
from the image reader and the image processor will also utilize
the full dynamic range from quadrant D to produce 10-bit digital
numbers.
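The exposure-compensation behavior of Fig. 6 can be mimicked with a simple look-up-table rule: whatever exposure range the reader senses is remapped, linearly in log-exposure, onto the full output range. The sketch below is an illustrative simplification of that idea (the function name and the log-linear form are assumptions for illustration, not the manufacturer's actual algorithm):

    import numpy as np

    # Illustrative CR look-up-table logic (Fig. 6): the sensed exposure range
    # is remapped to the full output range, so over- or underexposure
    # (within limits) does not change the appearance of the image.
    def cr_lookup(exposure, sensed_min, sensed_max, out_min=1.0, out_max=50.0):
        # Linear in log-exposure, as suggested by quadrant B of Fig. 6.
        t = (np.log10(exposure) - np.log10(sensed_min)) / (
            np.log10(sensed_max) - np.log10(sensed_min))
        return out_min + np.clip(t, 0.0, 1.0) * (out_max - out_min)

    # Example I (high, narrow range) and II (low, wide range) both fill 1..50:
    print(cr_lookup(np.array([1e3, 1e4]), 1e3, 1e4))   # [ 1. 50.]
    print(cr_lookup(np.array([1e1, 1e3]), 1e1, 1e3))   # [ 1. 50.]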
3.4 FULL-FIELD DIRECT DIGITAL MAMMOGRAPHY
3.4.1 Screen/Film and Digital Mammography
Conventional screen/film mammography produces a very high quality mammogram on an 8 in × 10 in film. Some abnormalities in the mammogram require 50 µm spatial resolution to be recognized. For this reason, it is difficult to use CR or a laser film scanner to convert a mammogram to a digital image, hindering the integration of the modality images to PACS. Yet, mammography examinations account for about 8% of all diagnostic procedures in a typical radiology department. During the past several years, due to much support from the National Cancer Institute and the United States Army Medical Research and Development Command, some direct digital mammography systems have been developed by joint efforts between academic institutions and private industry. Some of these systems are in clinical use. In the next section, we describe the principle of digital mammography, a very critical component in a totally digital imaging system in a hospital.
3.4.2 Full Field Direct Digital Mammography
There are two methods of obtaining a full field direct digital mammogram. One is the imaging plate technology described in Sec. 3.3 but with a higher resolution imaging plate of different materials and higher quantum efficiency detector systems. The other is the slot-scanning method. This section summarizes the slot-scanning method.
The slot-scanning technology modifies the image receptor of
a conventional mammography system by using a slot-scanning
mechanism and detector system. The slot-scanning mechanism scans a breast with an X-ray fan beam and the image is recorded by a charge-coupled device (CCD) camera encompassed in the Bucky antiscatter grid of the mammography unit. Figure 8 shows a picture of an FFDDM system. The X-ray photons emitted from the X-ray tube are shaped by a collimator to become a fan beam. The width of the fan beam covers one dimension of the image area (e.g. x-axis) and the fan beam sweeps in the other direction (y-axis). The movement of the detector system is synchronous with the scan of the fan beam. The detector system of the FFDDM shown is composed of a thin phosphor screen coupled with four CCD detector arrays via a tapered fiber optic bundle. Each CCD array is composed of 1 100 × 300 CCD cells. The gap between any two adjacent CCD arrays
Fig. 8. A slot-scanning digital mammography system. The slot, 300 pixels wide, covers the x-axis (4 400 pixels). The X-ray beam sweeps (arrow) in the y-direction producing over 5 500 pixels. X: X-ray and collimator housing; C: breast compressor.
requires a procedure called “butting” to minimize the loss of pixels. The phosphor screen converts the penetrating X-ray photons (i.e. the latent image) to light photons. The light photons pass through the fiber optic bundle, reach the CCD cells, and are then transformed to electronic signals. The more light photons received by each CCD cell, the larger the signal produced. The electronic signals are quantized by an analog-to-digital converter to create a digital image. Finally, the image pixels travel through a data channel to the system memory of the FFDDM acquisition computer. Figure 9 shows a 4 K × 5 K × 12 bit digital mammogram obtained with the system shown in Fig. 8. A screening mammography examination requires four images, two for each breast, producing a total of 160 Mbytes of image data.
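The quoted 160 Mbytes follows directly from the image dimensions, as this back-of-the-envelope sketch shows (assuming each 12-bit pixel is stored in 2 bytes and taking 4 K × 5 K as 4096 × 5120):

    # Storage for a four-view screening exam (Sec. 3.4.2).
    pixels = 4096 * 5120                  # one 4 K x 5 K mammogram
    mb_per_image = pixels * 2 / 2**20     # 12-bit pixels in 2 bytes -> 40 MB
    print(4 * mb_per_image)               # 160.0 MB for the four-image exam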
Fig. 9. A 4 K × 5 K × 12 bit digital mammogram obtained with the slot-scanning FFDDM shown on a 2 K × 2.5 K monitor. The window at the upper part of the image is the magnifying glass showing a true 4 K × 5 K region (courtesy of Drs E Sickles and SL Lou).
3.5 DIGITAL RADIOGRAPHY
During the past five years, research laboratories and manufacturers have devoted tremendous energy and resources to investigating new digital radiography systems other than CR. The main emphases are to improve image quality and operational efficiency, and to reduce the cost of projection radiography examinations. Digital radiography (DR) is an ideal candidate. In order to compete with conventional screen/film and CR, a good DR system should:
• Have a high detective quantum efficiency (DQE) detector with 2–3 or higher line pairs/mm spatial resolution, and a high signal-to-noise ratio.
• Produce digital images of high quality.
• Deliver a low dose to patients.
• Produce the digital image within seconds after X-ray exposure.
• Comply with industrial standards.
• Have an open architecture for connectivity.
• Be easy to operate.
• Be compact in size.
• Offer competitive cost savings.
Depending on the method used for the X-ray photon conversion, DR can be categorized into direct and indirect image capture methods. In indirect image capture, attenuated X-ray photons are first converted to light photons by the phosphor or the scintillator, from which the light photons are converted to electronic signals to form the DR image. The direct image capture method generates a digital image without going through the light photon conversion process. Figure 10 shows the difference between the direct and the indirect digital capture methods. The advantage of the direct image capture method is that it eliminates the intermediate step of light photon conversion. The disadvantages are that the engineering involved in direct digital capture is more elaborate, and that it is inherently difficult to use the detector for dynamic image acquisition due to the necessity of recharging the detector after each readout. The indirect
[Figure 10 panels: (A) direct image capture: X-rays are converted by a selenium semiconductor directly to electrical signals, yielding the direct digital radiograph; (B) indirect image capture: a scintillator or phosphor converts X-rays to light photons, which are then converted to electrical signals, yielding the indirect digital radiograph.]
Fig. 10. Direct and indirect image capture methods in digital radiography.
capture method uses either amorphous silicon phosphor or scintillator panels. The direct capture method uses the amorphous selenium panel. It appears that the direct capture method has the advantage over the indirect capture method since it eliminates the intermediate step of light photon conversion.
Two prevailing scanning modes in digital radiography are slot and areal scanning. The digital mammography system discussed in the last section uses the slot-scanning method. Current technology for the areal detection mode uses flat-panel sensors. The flat panel can be one large panel or several smaller panels put together. The areal scan method has the advantage of being fast in image capture, but it also has two disadvantages: one is the high X-ray scattering; the second is that manufacturing the large flat panels is technically difficult.
Digital radiography (DR) design is flexible: it can be used as an add-on unit in a typical radiography room or as a dedicated system. In the dedicated system, some designs can be used either as a tabletop unit attached to a C-arm radiographic device or as an upright unit, as shown in Fig. 11. Figure 12 illustrates the formation of a DR image, comparing it with that of a CR image in Fig. 4. A typical DR unit produces a 2 000 × 2 500 × 12 bit image instantaneously after the exposure.
Fig. 11. Three configurations of digital radiography design: (A) dedicated C-arm system; (B) dedicated chest unit; (C) add-on unit.
[Figure 12 shows the imaging plate passing from unused IP, to IP with latent image after X-ray exposure, to readout by the DR laser reader (stimulated and emitted light forming the digital image), to erasure of the residual image with high intensity light.]
Fig. 12. Steps in the formation of a DR image, comparing it with that of a CR image shown in Fig. 4.
3.6 X-RAY CT AND MULTISLICE CT
3.6.1 Image Reconstruction from Projections
Since most sectional images, like CT, are generated based on image reconstruction from projections, we first summarize the Fourier projection theorem, the algebraic reconstruction method, and the filtered back-projection method before the discussion of imaging modalities.
3.6.1.1 The Fourier Projection Theorem
Let f (x, y) be a 2D cross-sectional image of a three-dimensional object. The image reconstruction theorem states that f (x, y) can be reconstructed from its cross-sectional one-dimensional projections. In general, 180 different projections in one-degree increments are necessary to produce a satisfactory image, and using more projections always results in a better reconstructed image.
Mathematically, the image reconstruction theorem can be described with the help of the Fourier transform (FT). Let f (x, y) represent the two-dimensional image to be reconstructed and let p(x) be the one-dimensional projection of f (x, y) onto the horizontal axis, which can be measured experimentally (see Fig. 13, the zero degree projection). In the case of X-ray CT, we can consider p(x) as the total linear attenuation of the tissues traversed by a collimated X-ray beam at location x.
Then:
p(x, 0) = ∫_{-∞}^{+∞} f (x, y) dy.   (1)
The 1D Fourier transform of p(x) has the form:
P(u) = ∫_{-∞}^{+∞} [ ∫_{-∞}^{+∞} f (x, y) dy ] exp(−i2πux) dx.   (2)
Equations (1) and (2) imply that the 1D Fourier transform of a one-
dimensional projection of a two-dimensional image is identical to
the corresponding central section of the two-dimensional Fourier
transform of the object. For example, the two-dimensional image
can be a transverse (cross) sectional X-ray image of the body, and
Fig. 13. Principle of the Fourier projection theorem for image reconstruction from projections. F(0, 0) is at the center of the 2D FT; low frequency components are represented at the center region. The numerals represent the steps described in the text. P(x, θ): X-ray projection at angle θ, θ = 0°…180°; F(u, θ) = FT(P(x, θ)): 1D Fourier transform of p(x, θ); IFT: inverse Fourier transform.
the one-dimensional projections can be the X-ray attenuation profiles (projections) of the same section obtained from a linear X-ray scan at certain angles. If 180 projections at one-degree increments are accumulated and their 1D FTs performed, each of these 180 1D Fourier transforms represents a corresponding central line of the two-dimensional Fourier transform of the X-ray cross-sectional image. The collection of all these 180 1D Fourier transforms is the 2D Fourier transform of f (x, y).
The steps of a 2D image reconstruction from its 1D projections
shown in Fig. 13 are as follows:
(1) Obtain 180 1D projections of f (x, y), p(x, θ) where θ = 1, . . . , 180.
(2) Perform the FT on each 1D projection.
(3) Arrange all these 1D FTs according to their corresponding angles in the frequency domain. The result is the 2D FT of f (x, y).
(4) Perform the inverse 2D FT of (3), which gives f (x, y).
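The theorem is easy to verify numerically for the zero-degree projection, where the central section lies along the u-axis of the 2D FT. The following sketch (a minimal check on a synthetic rectangular object; the sizes are arbitrary) implements Eqs. (1) and (2) discretely:

    import numpy as np

    # Discrete check of the Fourier projection theorem at theta = 0.
    N = 128
    f = np.zeros((N, N))
    f[40:88, 56:72] = 1.0                 # simple rectangular object

    p = f.sum(axis=0)                     # p(x, 0): projection onto the x-axis
    P = np.fft.fft(p)                     # 1D FT of the projection, Eq. (2)
    F = np.fft.fft2(f)                    # 2D FT of the image
    assert np.allclose(P, F[0, :])        # central section F(u, 0) equals P(u)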
The Fourier projection theorem forms the basis of tomographic
image reconstruction. Other methods that can also be used to recon-
struct a 2D image from its projections are discussed later in this
chapter. We emphasize that the reconstructed image from projec-
tions is not always exact; it is only an approximation of the original
image. Adifferent reconstruction method will give a slightly differ-
ent version of the original image. Since all these methods require
extensive computation, specially designed image reconstruction
hardware is normally used to implement the algorithm. The term
“computerized (computed) tomography” (CT) is often used to rep-
resent that the image is obtained from its projections using a recon-
struction method. If the 1D projections are obtained from X-ray
transmission (attenuation) profiles, the procedure is called XCT or
X-ray CT. In the following sections, we summarize the algebraic and
filtered back-projection methods with simple numerical examples.
3.6.1.2 The Algebraic Reconstruction Method
The algebraic reconstruction method is often used for the reconstruction of images from an incomplete number of projections (i.e. fewer than 180°). The result is an exact reconstruction of the original image f (x, y) only by pure chance. For a 512 × 512 image, it will require over 180 projections, each with sufficient data points in the projection, to render a good quality image.
3.6.1.3 The Filtered (Convolution) Back-Projection Method
The filtered back-projection method requires two components: the back-projection algorithm and the selection of a filter to modify the projection data. The selection of a proper filter for a given anatomical region is the key to obtaining a good reconstruction from the filtered (convolution) back-projection method. This is the method of choice for almost all XCT scanners. The result of this method is an exact reconstruction (again, only by pure chance) of the original f (x, y). The mathematical formulation of the filtered back-projection method is
given in Eq. (3):
f (x, y) = ∫_0^π h(t) ⊗ m(t, θ) dθ,   (3)
where m(t, θ) is the “t” sampling point of the projection at angle “θ,” h(t) is the filter function, and “⊗” is the convolution operator.
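A compact discrete version of Eq. (3) can be written in a few lines. The sketch below is illustrative only: the simple ramp filter, the bilinear rotation-based back-projector, and the approximate normalization are textbook simplifications, not the optimized, apodized filters of commercial scanners. It filters each projection and accumulates its back-projection over 180 one-degree views:

    import numpy as np
    from scipy.ndimage import rotate

    def radon(image, angles_deg):
        # Forward projection: rotate the image and sum columns (cf. Fig. 13).
        return np.stack([rotate(image, -a, reshape=False, order=1).sum(axis=0)
                         for a in angles_deg], axis=1)

    def fbp(sinogram, angles_deg):
        n_det = sinogram.shape[0]
        # Ramp filter |w| applied in the Fourier domain of each projection.
        ramp = np.abs(np.fft.fftfreq(n_det))[:, None]
        filtered = np.real(np.fft.ifft(np.fft.fft(sinogram, axis=0) * ramp,
                                       axis=0))
        recon = np.zeros((n_det, n_det))
        for proj, a in zip(filtered.T, angles_deg):
            smear = np.tile(proj, (n_det, 1))          # back-project one view
            recon += rotate(smear, a, reshape=False, order=1)
        return recon * np.pi / (2 * len(angles_deg))   # approximate scaling

    phantom = np.zeros((128, 128))
    phantom[48:80, 48:80] = 1.0
    angles = np.arange(180.0)             # 180 projections, 1-degree steps
    reconstruction = fbp(radon(phantom, angles), angles)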
3.6.2 Transmission X-ray Computed Tomography (XCT)
3.6.2.1 Conventional XCT
A CT scanner consists of a scanning gantry housing an X-ray tube
and a detector unit, and a movable bed which can align a specific
cross section of the patient with the gantry. The gantry provides a
fixed relative position between the X-ray tube and the detector unit.
A scanning mode is the procedure of collecting X-ray attenuation
profiles (projections) from a transverse (cross) section of the body.
From these projections, the CT scanner's computer program or back-projector hardware reconstructs the corresponding cross-sectional image of the body. Figures 14 and 15 show the schematics of the two most popular XCT scanner generations (third and fourth), both using an X-ray fan beam. These types of XCT take about 5 seconds for one sectional scan, and more time for image reconstruction.
3.6.2.2 Spiral (Helical) XCT
Three other configurations can improve the scanning speed: the helical (spiral) CT, the cine CT (Sec. 3.6.2.3), and the multislice CT (Sec. 3.6.2.4). The helical CT is based on the design of the third- or fourth-generation scanner, the cine CT uses a scanning electron beam X-ray tube, and multislice CT uses a cone beam instead of a fan beam.
The CT configurations shown in Figs. 14 and 15 have one common characteristic: the patient's bed remains stationary during the scanning; after a complete scan, the patient's bed advances a certain distance and the next scan begins. The start-and-stop motions of the bed slow down the scanning operation. If the patient's
Fig. 14. Schematic of the rotation scanning mode using a fan-beam X-ray. The
detector array rotates with the X-ray tube as a unit.
bed can assume a forward motion at a constant speed while the scanning gantry rotates continuously, the total scanning time of a multiple section examination can be reduced. Such a configuration is not possible in conventional designs, however, because the scanning gantry is connected to the external high energy transformer and power supply through cables. The spiral or helical CT design does not involve cables.
Figure 16 illustrates the principle of spiral CT. There are two pos-
sible scanning modes: single helical and cluster helical. In the single
helical mode, the bed advances linearly while the gantry rotates
in sync for a period of time, say 30 seconds. In the cluster helical
Fig. 15. Schematic of the rotation scanning mode with a stationary scintillation detector array; only the X-ray source rotates.
Fig. 16. Helical (spiral) CT scanning modes.
mode, the simultaneous rotation and translation lasts only 15 sec-
onds, whereupon both motions stop for 7 seconds before resuming
again. The single helical mode is used for patients who can hold their breath for a longer period of time, while the cluster helical mode is for patients who need to take a breath after 15 seconds.
The design of the helical XCT was introduced in the late 1980s. It is based on three technological advances: the slip-ring gantry, improved detector efficiency, and greater X-ray tube cooling capability. The slip-ring gantry contains a set of rings and electrical components that rotate, slide and make contact to deliver both high energy (to supply the X-ray tube and generator) and standard energy (to supply power to other electrical and computer components). For this reason, no electrical cables are necessary to connect the gantry and external components. During helical scanning, the term “pitch” is used to define the relationship between the X-ray beam collimation and the velocity of the bed movement:
Pitch = table movement in mm per gantry rotation/slice thickness.
Thus, with a slice thickness of 1.5 mm, a pitch equal to “1” means that the gantry rotates a complete 360° while the bed advances 1.5 mm. During this time, raw data are collected covering 360° and 1.5 mm. Assuming one rotation takes one second, then for the single helical scan mode, 30 seconds of raw data are continuously collected while the bed moves 45 mm. After the data collection phase, the raw data are interpolated and/or extrapolated into sectional projections. These organized projections are used to reconstruct individual sectional images; in this case, they are 1.5 mm contiguous slices. The reconstructed slice thickness can be from 1.5 mm to 1 cm, depending on the interpolation and extrapolation methods used.
The advantages of spiral CT scans are the speed of scanning, the ability to select slices from the continuous data to reconstruct slices at peak contrast medium, the retrospective creation of overlapping or thin slices, and volumetric data collection. The disadvantages are helical reconstruction artifacts and potential object boundary unsharpness.
3.6.2.3 Cine XCT
Cine XCT, introduced in the early 1980s, uses a completely different X-ray technology, namely, an electron beam X-ray tube: this scanner is fast enough to capture the motion of the heart. The detector array of the system is based on the fourth-generation stationary detector array (scintillator and photodiode). As shown schematically in Fig. 17, an electron beam (1) is accelerated through the X-ray tube and bent by the deflection coil (2) toward one of the four target rings (3). Collimators at the exit of the tube restrict the X-ray beam to a 30° fan beam, which forms the energy source of scanning. Since there are four tungsten target rings, each of which has a fairly large area (210° tungsten arc, 90 cm radius) for heat dissipation, the X-ray fan beam can sustain the energy level required for scanning continuously in various scanning modes. In addition, the detector and data collection technologies used in this system allow very rapid data acquisition. Two detector rings (indicated by 4 in Fig. 17) allow data acquisition for two consecutive sections simultaneously. For example, in the slow acquisition mode with a 100 ms scanning time and an 8 ms interscan delay, cine XCT can provide 9 scans/s, or in the fast acquisition mode with a 20 ms scanning time, 34 scans/s.
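The scan rates follow from the cycle time per section, as the sketch below illustrates (a simple estimate; the fast-mode figure suggests the effective interscan delay there is slightly longer than 8 ms):

    # Cine XCT frame rate: one scan per (scan time + interscan delay).
    def scans_per_second(scan_ms, delay_ms=8.0):
        return 1000.0 / (scan_ms + delay_ms)

    print(scans_per_second(100))   # ~9.3  -> the quoted 9 scans/s
    print(scans_per_second(20))    # ~35.7 -> vs the quoted 34 scans/s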
The scanning can be done continuously on the same body section (to collect dynamic motion data of the section) or along the axis of the patient (to observe the vascular motion). Because of its fast scanning speed, cine XCT is used for cardiac motion and vascular studies and
Fig. 17. Schematic of the cine XCT. Source: Diagram adapted from a technical
brochure of Imatron Inc.
emergency room scans. Until the availability of the multislice XCT, cine XCT was the fastest scanner for dynamic studies.
3.6.2.4 Multislice XCT
In spiral XCT, the patient's bed moves during the scan, but the X-ray beam is a fan beam perpendicular to the patient axis, and the detector system is built to collect data for the reconstruction of one slice. If the X-ray beam is shaped into a three-dimensional cone beam with the z-axis parallel to the patient's axis, and if a multiple detector array (in the z-direction) system is used to collect the data, then we have a multislice XCT scanner (see Fig. 18). Multislice XCT, in essence, is also a spiral scan except that the X-ray beam is shaped into a cone beam geometry. Multislice XCT can obtain many images in one examination with a very rapid acquisition time, for example, 160 images in 20 seconds, or 8 images/sec, or 4 MB/sec of raw data. Figure 18 shows the schematic. It is seen from this figure that a full rotation of the cone beam is necessary to collect sufficient projection data to reconstruct the number of slices equal to the z-axis collimation of the detector system (see below for definition). Multislice XCT uses
several new technologies:
(1) New detector: Ceramic detectors are used to replace traditional crystal technology. Ceramic detectors have the advantages of more light photons in the output, less afterglow time, and higher resistance to radiation and mechanical damage, and can be made much thinner (about half the thickness) for an equivalent amount of X-ray absorption, compared with crystal scintillators.
(2) Real-time dose modulation: A method to minimize the dose delivered to the patient using the cone beam geometry by modulating the mAs (milliampere-seconds) of the X-ray beam during the scan.
(3) Cone beam geometry image reconstruction algorithm: Efficient collection and recombination of cone beam X-ray projections (raw data) for sectional reconstruction.
Fig. 18. Geometry of the multislice XCT. The patient axis is in the z-direction. The X-rays (X), shaped as a collimated cone beam, rotate 360° around the z-axis continuously in sync with the patient's bed moving linearly in the z-direction. The detector system (D) is a combination of detector arrays shaped in a concave surface facing the X-ray beam. The number of slices per 360° rotation is determined by two factors: the number of detector arrays in the z-direction, and the method used to recombine the cone beam projection data into transverse sectional projections (Fig. 13). The reconstructed images are transverse views perpendicular to the z-axis. If the cone beam does not rotate while the patient's bed is moving, the reconstructed image is equivalent to a digital fluorographic image.
(4) High speed data output channel: During one examination, say,
for 160 images, much more data have to be collected during the
scanning. Fast I/O data channels from the detector system to
image reconstruction are necessary.
If the patient bed is moving linearly, but the gantry does not
rotate, the result is a digital fluorographic image with better image
quality than that discussed in Sec. 3.2. Currently, 16-slice and 32-slice detector CT scanners are in heavy clinical use, with 64-, 128-, and 256-slice detector CT scanners on the near horizon.
Figure 19 shows a 3D rendered volume image of acquisition data
from a 64-slice detector CT scan of a human heart. With the advent
of the 256-slice detector CT scanner, it will be feasible to acquire
image data for an entire organ such as the heart in a single scan
cycle as shown in the figure.
3.6.3 Some Standard Terminology Used in Multi-Slice XCT
Recall the term “pitch” defined for spiral XCT. With cone beam multidetector scanning, because of the multiple detector arrays in the z-direction (Fig. 18), the table movement can be many times the thickness of an individual slice. For example, take a 16 × 1.5 mm detector system (16 arrays with 1.5 mm thickness per array), and
Fig. 19. A 3D volume rendered image of the heart from data acquired by a 64-slice CT scanner. Note that a 256-slice CT scanner would be able to scan the entire heart in one single rotation (courtesy of Toshiba Medical Systems).
with the slice thickness of an individual image being 1.5 mm, then use the definition of “pitch” for the spiral scan:
Pitch = table movement in mm per gantry rotation/slice thickness = (16 × 1.5 mm/rotation)/1.5 mm = (24 mm/rotation)/1.5 mm = 16.
That is, a table movement of 24 mm/rotation with a reconstructed slice thickness of 1.5 mm would give a pitch of 16 under the definition of Sec. 3.6.2.2, even though this case represents contiguous scans. Comparing this example with that shown in Sec. 3.6.2.2 for the single slice scan, the definition of “pitch” shows a discrepancy due to the size of the multidetector arrays. Since different manufacturers produce different sizes of multidetector arrays, the word “pitch” becomes confusing. For this reason, the International Electrotechnical Commission (IEC) accepts the following definition of pitch (now often referred to as the IEC pitch):
z-axis collimation (T) = the width of the tomographic section along the z-axis imaged by one data channel (array). In multidetector row (multislice) CT scanners, several detector elements may be grouped together to form one data channel (array).
Number of data channels (N) = the number of tomographic sections imaged in a single axial scan.
Table speed or increment (I) = the table increment per axial scan, or the table increment per rotation of the X-ray tube in a helical (spiral) scan.
Pitch (P) = table speed (I mm/rotation)/(N · T).
Thus, for a 16 detector scanner in a 16 × 1.5 mm scan mode, N = 16 and T = 1.5 mm, and if the table speed = 24 mm/rotation, then P = 1, a contiguous scan. If the table speed is 36 mm/rotation, then the pitch is 36/(16 × 1.5) = 1.5.
3.6.4 Four-Dimensional (4D) XCT
Referring to Fig. 18, if the bed is stationary but the gantry rotates continuously, we would have a four-dimensional XCT, with the fourth dimension being time. In this scanning mode, human body physiological dynamics can be visualized in 3D over time. Current multislice XCT, with a limited size of detector arrays in the z-direction and a data collection system of 100 MB/sec, can only visualize a limited segment of the body. In order to realize the potential clinical applications of 4D XCT, several challenges must be met:
(1) Extend the cone beam X-ray and the length of the detector array in the z-direction. Currently, a detector system with 256 arrays and 912 detectors per array is available in some prototype 4D XCT systems.
(2) Improve the efficiency and performance of the A/D conversion at the detector.
(3) Increase the data transfer rate between the data acquisition system and the display system from 100 MB/sec to 1 GB/sec.
(4) Develop revolutionary display methods for 4D images.
4D XCT can produce image data in the gigabyte range per examination, so methods of archiving, communication, and display become challenging issues.
3.6.4.1 PET/XCT Fusion Scanner
XCT is excellent for anatomical delineation with a fast scanning time, while positron emission tomography (PET) is slow in obtaining physiological images of poorer resolution, but good for the differentiation between benign and malignant tumors. PET requires attenuation correction in image reconstruction, and the fast CT scan can provide the anatomical tissue attenuation in seconds, which can be used as a base for PET data correction. Thus, the combination of a CT and a PET scanner during a scan gives a very powerful tool for improving clinical diagnostic accuracy where neither alone would be able to provide such a result. Yet, the two scanners have to be combined as one system, otherwise misregistration between the CT and PET images would sometimes give misinformation. The CT/PET fusion scanner is such a hybrid scanner, which can
Fig. 20. Reconstructed image showing fusion of both CT image data and positron
emission tomography (PET) image data into a single image. Note that PET image
data shows physiological function while the CT image data shows the anatomical
features. Tools allow the user to dynamically change how much PET or CT data
is displayed in the fused image. Note the areas in the body such as the heart with
high activity signal from the acquired PET data.
obtain the CT images as well as the PET images during one examination. The PET images so obtained actually have better resolution than those obtained without the CT attenuation correction. The output of a PET/CT fusion scanner is two sets of images, CT and PET, with the same coordinate system for easy fusing of the images together. Figure 20 shows an example of a CT/PET fused image data set showing both anatomical features as well as physiological function.
3.6.4.2 Components and Data Flow of an XCT Scanner
The major components and data flow of an XCT include a gantry
housing the X-ray tube, the detector system, and signal pro-
cessing/conditioning circuits; a front-end preprocessor unit for
cone/fan beam projection data corrections and recombination to
transverse sectional projection data; a high-speed computational
processor; a hardware back-projector unit; and a video controller for
displaying images. In XCT, the CT number (or pixel/voxel value, or Hounsfield number), which represents the relative X-ray attenuation coefficient of the tissue in the pixel/voxel, is defined as follows:
CT number = K(µ − µ_w)/µ_w,
where µ is the attenuation coefficient of the material under consideration, µ_w is the attenuation coefficient of water, and K is a constant set by the manufacturer.
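As a small illustration, the sketch below evaluates this definition with K = 1000 (the value conventionally used for the Hounsfield scale) and an assumed µ_w ≈ 0.19 cm⁻¹, a typical order of magnitude at diagnostic CT energies; both numbers are illustrative rather than taken from the text:

    # CT (Hounsfield) number: K * (mu - mu_w) / mu_w.
    MU_WATER = 0.19          # assumed mu of water in cm^-1 (illustrative)

    def ct_number(mu, k=1000.0, mu_water=MU_WATER):
        return k * (mu - mu_water) / mu_water

    print(ct_number(MU_WATER))   # water ->     0
    print(ct_number(0.0))        # air   -> -1000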
3.6.5 XCT Image Data
3.6.5.1 Slice Thickness
Current multislice CT scanners can feature up to 32 detectors in an
array. In a spiral scan, multiple slices of data can be acquired simul-
taneously for different detector sizes, and 0.75 mm, 1 mm, 2 mm, 3 mm, 4 mm, 5 mm, 6 mm, 7 mm, 8 mm, and 10 mm slice thicknesses can be reconstructed.
3.6.5.2 Image Data Size
A standard chest CT with coverage between 300 mm and 400 mm can yield image sets from 150–200 images all the way up to 600–800 images depending on the slice thickness, or data sizes from 75 MB up to 400 MB. Performance-wise, that same standard chest CT can be acquired in 0.75 mm slices in 10 seconds. A whole body CT can produce up to 2 500 images, or 1 250 MB (1.25 GB) of data. Each image is 512 × 512 × 2 bytes in size.
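These sizes follow from simple arithmetic, as the short sketch below shows (taking 512 × 512 × 2 bytes = 0.5 MB per image):

    # Back-of-the-envelope CT data volumes (Sec. 3.6.5.2).
    mb_per_image = 512 * 512 * 2 / 2**20        # 0.5 MB per image

    for n_images in (150, 800, 2500):
        print(n_images, "images ->", n_images * mb_per_image, "MB")
    # 150 -> 75.0 MB, 800 -> 400.0 MB, 2500 -> 1250.0 MB (1.25 GB)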
3.6.5.3 Data Flow/Post-Processing
The fan/cone beam raw data are obtained by the acquisition host computer. Slice thickness reconstructions are performed on the raw data. Once the set of images is acquired in DICOM format, any post-processing is performed on the DICOM data. This includes sagittal,
coronal, and off-axis slice reformats as well as 3D post processing.
Sometimes the cone beam raw data are saved for future reconstruc-
tion of different slice thicknesses.
Some newer scanners feature a secondary computer, which
shares the same database as the acquisition host computer. This sec-
ondary computer can perform the same post-processing functions
while the scanner is acquiring new patient data. This secondary
computer also can perform network send jobs to data storage or
another DICOM destination (e.g. highly specialized 3D processing
workstation) andmaintains a sendqueue, thus alleviating the acqui-
sition host computer from these functions and improving system
throughput.
CHAPTER 4
Principles of Nuclear Medicine
Imaging Modalities
Lionel S Zuckier
Nuclear medicine utilizes radioactive molecules (radiopharmaceuticals)
for the diagnosis and treatment of disease. The diagnostic information
obtained from imaging the distribution of radiopharmaceuticals is fun-
damentally functional and thus differs from other imaging disciplines
within radiology, which are primarily anatomic in nature. Imaging using
radiopharmaceuticals can be subdivided into single- and dual-photon
modalities. A wide selection of radiopharmaceuticals is available for
single-photon imaging designed to study numerous physiologic pro-
cesses within the body. Static, dynamic, gated and tomographic modes
of single-photon acquisition can be performed. Dual-photon imaging is
the principle underlying positron emission tomography (PET) and is fun-
damentally tomographic. PET has expanded rapidly due to the clinical
impact of the radiopharmaceutical
18
F-fluorodeoxyglucose, a glucose ana-
logue used for imaging of malignancy. The fusion of nuclear medicine
tomographic images with anatomic CT is evolving into a dominant imag-
ing technique. The current chapter will review physical, biological and
technical concepts underlying nuclear medicine.
4.1 INTRODUCTION
4.1.1 Physical Basis of Nuclear Medicine
Nuclear medicine is a branch of medicine which utilizes molecules
containing radioactive atoms (radiopharmaceuticals) for the diag-
nosis and treatment of disease. Radioactive atoms have structurally
unstable nuclei and seek to achieve greater stability by the release
of energy and/or particles in a process termed radioactive decay.[1]
Atoms with unstable arrangements of protons and neutrons are
termed radionuclides. This stochastic process is governed by first-
order kinetics such that for N atoms, the rate of decay dN/dt is
equal to −λN, where t is time and λ is the physical decay constant.
It follows that N(t) = N_0 e^(−λt), where N_0 is the number of radioactive atoms present at time 0. The time for half of a sample of atoms to decay is a constant termed the physical half-life (T_1/2) and is characteristic for each radionuclide. The physical decay constant λ can be expressed as 0.693/T_1/2. It is customary to quantify the amount of a radioactive substance by its rate of decay, or activity. The S.I. unit becquerel (Bq) is equal to 1 disintegration per second (dps), while the traditional unit curie (Ci) is equal to 3.7 × 10^10 dps.
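The decay law above is easy to exercise numerically; the following sketch (illustrative only, using the 99mTc half-life quoted later in this chapter) computes the remaining activity:

    import math

    def activity(a0, t_hours, half_life_hours):
        # A(t) = A0 * e^(-lambda * t), with lambda = ln 2 / T1/2 (= 0.693 / T1/2)
        lam = math.log(2) / half_life_hours
        return a0 * math.exp(-lam * t_hours)

    # 10 mCi of 99mTc (T1/2 = 6.02 h): half remains after one half-life.
    print(activity(10.0, 6.02, 6.02))   # -> 5.0 mCi
    print(activity(10.0, 24.0, 6.02))   # -> ~0.63 mCi after one day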
A second important feature in characterizing a radionuclide is
the nature, frequency, and energy of its emitted radiations. Various
types of radiation may be emitted from the atomic nucleus (Table 1). Alpha (α), beta (β−) and positron (β+) radiations are particulate and penetrate relatively short distances in tissue. Gamma (γ) radiation
is non-particulate and penetrating, making it useful for diagnostic
imaging purposes, where it can be detected by instruments external
to the body. Other types of penetrating radiations which are imaged
in nuclear medicine include X-rays that are emitted as a consequence of rearrangement of shell electrons, and 511 keV photons (annihila-
tion radiations) that result from positron-electron annihilation.
Table 1. Ionizing Radiations[33]

Type                                   Rest Mass             Charge   Origin
                                       (Atomic Mass Units)
Alpha (α)                              4.002                 +2       Nucleus
Beta negative (β−) or electron (e−)    5.486 × 10^−4         −1       Nucleus
Gamma (γ)                              0                     0        Nucleus
Beta positive (β+) or positron (e+)    5.486 × 10^−4         +1       Nucleus
X-ray                                  0                     0        Shell electrons
Annihilation photons                   0                     0        Annihilation of e+ and e−
Penetrating radiation is subject to attenuation in soft tissues in an exponential manner. As photons travel a distance x through matter, the intensity of radiation decreases as e^(−µx), where µ is the linear attenuation coefficient, which depends on the mass density of the attenuator, its atomic number (Z), and the energy of the radiation. For photons typically used in nuclear medicine, the predominant interaction with soft tissue is Compton scattering, which potentially degrades the image by redirecting photons. Attenuation is also a fundamental obstacle to quantitative analysis, as the radiation measured by detectors external to the body is reduced to a variable degree, depending on the nature and amount of intervening attenuator, and is no longer directly proportional to the activity at the source being imaged. Methods are available to estimate and compensate for attenuation in nuclear medicine imaging and will be discussed where relevant.
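A small numeric sketch of the exponential attenuation law (the µ value is an assumed, representative soft-tissue figure for 140 keV photons, not taken from this chapter):

    import math

    def transmitted_fraction(mu_per_cm, depth_cm):
        # I/I0 = e^(-mu * x)
        return math.exp(-mu_per_cm * depth_cm)

    MU_SOFT_TISSUE = 0.15   # 1/cm at ~140 keV (assumed illustrative value)
    for depth in (1.0, 5.0, 10.0):
        print(depth, round(transmitted_fraction(MU_SOFT_TISSUE, depth), 3))

Even 10 cm of soft tissue removes roughly three quarters of the photons, which is why source depth so strongly affects measured counts.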
The types of radiation enumerated above are all ionizing; when
they pass through tissues they deposit energy leading to potential
chemical and biologic effects. In addition to man-made radiation from medical, industrial and military sources, inevitable exposure to ionizing radiation also results from natural radiation emanating from outer space and radionuclides in the earth's crust. Mammalian cells possess repair mechanisms which, at least in part, repair such damage. The study of the interaction of radiation and living organisms comprises the discipline of radiation biology.[2] Much has been learned regarding the potential toxicity of radiation, and this has informed the field of radiation safety.[3] Controversy persists to the present day as to whether there is a minimal amount of radiation necessary to cause cell damage. However, for safety and regulatory purposes we assume that exposure to even a small amount of radiation is associated with a finite risk (nonthreshold model). As a general rule, particulate radiation is significantly more injurious than nonparticulate, as the energy is deposited in a more concentrated track. Physicians must weigh the potential benefits of radiologic studies against their inherent risks. Radiopharmaceuticals are therefore administered in the smallest amounts necessary to effect a diagnosis or therapy.
4.1.2 Conceptual Basis of Nuclear Medicine
Nuclear medicine owes much of its approach to the Tracer Principle, developed by George de Hevesy, who was awarded the Nobel Prize in Chemistry in 1943 for his "work on the use of isotopes as tracer elements in researches on chemical processes."[4] In de Hevesy's method, a radioactive atom is introduced within a molecule under study, thereby allowing the newly formed radioactive moiety to serve as an easily identifiable version of the compound, a radioactive tracer.[5] The use of radioactive molecules to elucidate physiologic pathways has been adopted by nuclear medicine. Radionuclides with appropriate physical properties are substituted into molecules of biological interest, thereby creating radiopharmaceuticals which, combined with the selectivity of the body's physiologic processes, can be used to identify and target cells and organs of interest (Fig. 1). Techniques
Fig. 1. 18F-FDG PET scan demonstrates foci of malignant tumor within lymph nodes in the patient's right neck (arrow). Malignant tumor, in contrast to most other tissues, tends to preferentially metabolize glucose; FDG can therefore be used to identify sites of malignancy, as in the present case. The distribution of FDG also reflects normal tissues of the body that avidly utilize glucose, including brain (Br), and to a lesser degree liver (L) and spleen (S). In the resting state, uptake of FDG by muscles (M) is minimal. Uptake by the heart (H), tonsils (T) and bowel (arrowheads) noted in the current study tends to be variable in intensity. Unlike glucose, a large proportion of FDG is excreted by the kidneys (K) into the urinary bladder (Bl).
within nuclear medicine are unique amongst the radiologic modal-
ities in that they primarily yield information regarding the function
of tissues, rather than anatomic or structural detail.
Optimally, one of the atoms within a biologically relevant molecule is substituted with a suitable radioactive isotope of the same element. The difference in atomic mass of isotopes is due to a variation in the number of neutrons while the number of protons is unchanged, the latter guaranteeing virtually identical chemical behavior of the moieties. For example, 123I or 131I can be substituted for stable (nonradioactive) 127I within sodium iodide (NaI), which is still taken up by the thyroid gland in a manner identical to the nonradioactive substrate. More commonly, because of limitations in available radionuclides and their imaging properties, an analog of the molecule of interest is created which shares critical biochemical features, although its chemical structure differs and its biological fate is not identical to that of the original compound. 18F-Fluorodeoxyglucose (FDG) represents a radiopharmaceutical which shares some, but not all, features of glucose, yet is of immense clinical utility (Fig. 2).
Fig. 2. Structures of glucose and 18F-fluorodeoxyglucose (FDG) are illustrated on the left side of the diagram; in the latter, a hydroxyl group has been replaced with radioactive 18F-fluorine to form FDG. Although glucose and FDG are taken up similarly into cells by the Glut1 and Glut3 glucose transporters, and both are phosphorylated by the enzyme hexokinase, FDG-6-phosphate (FDG-6P), in contrast to glucose-6-phosphate (glucose-6P), cannot proceed to glycolysis and in effect becomes trapped within the cells.
The methods of tracking distribution of radiopharmaceuticals
in clinical nuclear medicine are varied. Infrequently, radiophar-
maceuticals are used in non-imaging quantitative assays where
samples of blood or urine are measured in sensitive well type scin-
tillation detectors as a means of deriving information regarding
physiologic function and metabolic clearance. Examples include the measurement of the absorption and subsequent excretion of 57Co-labeled vitamin B12 (used in the evaluation of vitamin B12 deficiency), and the determination of the rate of renal excretion of 51Cr-EDTA (used in the measurement of renal function). Non-
imaging probe-type radiation detectors are used for organ-based
counting such as in the measurement of thyroid uptake of radio-
active iodine. While no image is generated by the thyroid probe, the
data are fundamentally spatial in that the detector interrogates a
defined volume of tissue. The use of nonimaging probes has also
spread to the surgical suite, in order to identify lymph nodes which
have been rendered radioactive by virtue of draining an anatomic
region where radiopharmaceutical has been injected into the subcu-
taneous tissue. These collimated solid-state hand-held scintillation
detectors, used in sentinel lymph node biopsy, have a well-defined
field-of-view and can be slowly translated over the surgical bed
to reveal the location of radioactive lymph nodes or other targets.
A third and most common method of assaying the distribution of radiopharmaceuticals in the current practice of clinical nuclear medicine is radionuclide imaging. This noninvasive technique has evolved as an integral component of medical imaging for over five decades, often serving as the proving ground for concepts that were subsequently introduced into other radiologic modalities.[6]
An important facet of nuclear medicine is the administration of therapeutic radiopharmaceuticals designed to destroy targeted cells.[7] Therapeutic radionuclides in clinical use today emit β− particles. For example, 131I-NaI is used for treatment of thyroid cancer, while 90Y- or 131I-labeled antibodies are administered to destroy lymphoma cells. While these therapies, per se, are not imaging examinations, therapeutic applications are frequently preceded by imaging using γ-emitting analogues in order to predict
efficacy and toxicity, a discipline within medical physics termed
dosimetry.
4.1.3 Radiopharmaceuticals in Nuclear Medicine
Radiopharmaceuticals for diagnostic purposes are generally labeled
with γ-emitting radionuclides; γ photons readily exit the body, are detectable by a variety of instruments discussed within this chapter, and pose the lowest radiation risk to the patient. The γ-emitting radionuclides used emit photons with one or more principal energies in the range of 69 keV–394 keV (Table 2). The most common radionuclide used today is 99mTc, which possesses the nearly optimal characteristics of 140 keV γ-radiation, a physical T_1/2 of six hours, and the absence of energetic particulate emissions. Determi-
nation of the spatial distribution of a radiopharmaceutical based
on the detection of individual photons emitted from the patient’s
body is termed single-photon imaging. A contrasting imaging pro-
cess prevails in positron emission tomography (PET) where radio-
pharmaceuticals incorporate radionuclides that emit positrons in
the course of their radioactive decay (Table 3). In PET, the spatial
distribution of the radiopharmaceutical is determined by detect-
ing a pair of simultaneously-emitted photons resulting from the
Table 2. Radionuclides Used in Single-Photon Radionuclide Imaging[34]

Radionuclide   Principal Photon     Half-Life   Common Radiopharma-    Clinical Application
               Energies (keV)       (Hours)     ceutical Forms
Tc-99m         140                  6.02        Numerous               Numerous
I-131          364                  193         NaI, MIBG              Thyroid, tumors (1)
Ga-67          93, 185, 300, 394    78.3        Ga-citrate             Inflammation, infection
In-111         171, 245             67.9        Labeled leukocytes,    Infection, tumors (2)
                                                octreotide
Tl-201         69–80 (Hg X-rays)    73.1        Thallous chloride      Cardiac perfusion
I-123          159                  13.2        NaI, MIBG              Thyroid, tumors (1)

Legend: MIBG = metaiodobenzylguanidine; (1) tumors of the APUD family; (2) somatostatin receptor bearing tumors.
Table 3. Radionuclides Commonly Used in Positron Emission Tomography[26]

Radionuclide   Method of        Half-life   Max β+ Energy   Maximal Range in
               Production       (Minutes)   (MeV)           Water (mm)
C-11           Cyclotron        20.4        0.96            3.9
N-13           Cyclotron        9.96        1.2             5.1
O-15           Cyclotron        2.05        1.7             8.0
F-18           Cyclotron        110         0.64            2.3
Rb-82          Strontium-82     1.3         3.4             18
               generator
annihilation of a positron and electron in a process termed dual-
photon or coincidence imaging. In the discussion that follows, instru-
ments for both single-photon and dual-photon imaging will be
reviewed. Discussion of nonimaging detectors will serve as an
introduction to the principles of equipment used in nuclear
medicine imaging.
4.2 NUCLEAR MEDICINE EQUIPMENT[8]
4.2.1 Nonimaging
Basic to understanding the techniques and equipment used for
imaging in nuclear medicine are the principles underlying the scin-
tillation detector.[9] When used for in vivo assay of a radiopharma-
ceutical within an organ, such as the amount of radioactive iodine
uptake within the thyroid gland, the scintillation detector is collo-
quially termed a scintillation probe. Components of the scintillation
probe include a collimator, scintillation crystal, photomultiplier tube (PMT), and electronics (Fig. 3). The collimator restricts the field-of-view of the crystal to a finite region opposite the collimator opening (aperture). The scintillation crystal effectively shifts the wavelength of photon energy from γ rays to visible light, in quanta proportional
to the energy of the incident photon. In clinical systems, crystals
of sodium iodide, purposely contaminated or “doped” with thal-
lium ions (NaI(Tl)), are commonly used. The crystals are optically
Fig. 3. Scintillation crystal. In the illustration, a 159 keV photon, emitted from the decay of 123I within the thyroid gland, enters the aperture of the collimator and is absorbed within the NaI(Tl) crystal, resulting in conversion to visible light (a scintillation). The light is incident on the photocathode within the photomultiplier tube (PMT), which dislodges electrons that are subsequently amplified manyfold by a series of dynodes held at progressively greater voltages. Pulses produced by the PMT for each photon absorbed are sorted by the pulse-height analyzer (PHA), resulting in an energy spectrum (actual 123I spectrum shown). Counts within a defined range of energies (i.e. the photopeak energy window) on the PHA are integrated over time by the scaler/ratemeter. In a related design, the scintillation crystal is fabricated with a well into which samples are placed for analysis and counting with high geometric efficiency (lower right).
coupled to PMTs which convert the scintillations of light into current which is amplified to detectable levels. PMTs consist of a photocathode, designed to emit photoelectrons upon being struck by incident light; multiple dynodes held at progressively increasing voltages, producing an amplified cascade of electrons; and an anode which collects the current. Voltage across PMTs is in the 1000 volt range.
Each γ photon originally absorbed in the crystal results in a dis-
crete pulse of current exiting the PMT; the amplitude of this pulse
is proportional to the incident photon energy. A device similar to
the scintillation probe is used to characterize and count radioactive
samples which are placed within a specialized scintillation crystal
with an indentation or well which serves to increase the efficiency of
counting (Fig. 3).
Electronics in counting systems typically include a pulse-
height analyzer (PHA), which can discriminate pulses of differing
amplitudes originating from photons of differing energies. Scat-
tered photons loose a portion of their initial energy and can
thereby be differentiated from unscattered photons and excluded
from counting if so desired. Photons emitted from multiple radio-
nuclides can also be discriminated using the PHA. As a general
rule, counts are integrated over a fixed period of time and dis-
played on a scaler/ratemeter. Modern devices can estimate and
compensate for the fraction of counts lost due to dead time (pulse
resolving time).
4.2.2 The Rectilinear Scanner
When collimated, the scintillation probe effectively interrogates a
well-defined volume, and therefore conveys spatial information.
Early efforts to provide a distribution map of radiopharmaceuti-
cals within the body were obtained by manually translating the
scintillation crystal over a region of interest.[5] A mechanical sys-
tem of rectilinear scanning, developed by Benedict Cassen in the
early 1950s, incorporated a systematic method of measuring count
rates in a raster pattern over a region of interest.[10] Images were
recorded on paper or film as dots whose intensity was proportional
to the count rate sampled at the corresponding locations over the
patient. One of the advantages of the rectilinear scanner was that
it allowed for a simple method of mapping palpable abnormalities
on the patient to locations on the printed image. A disadvantage
of rectilinear scanning was the relatively protracted time needed
to scan an area of interest, since the data were collected serially.[11]
This precluded imaging of dynamic processes, such as the flow of
blood to an organ, leading to the development of alternate imaging
methods.
4.2.3 The Anger Gamma Camera
4.2.3.1 Background
The need to simultaneously, rather than sequentially, sample counts
from within the volume of interest was a factor that led to
development of the gamma-camera (γ-camera) by Hal Anger in
1952.[12,13] Basic principles introduced by Anger remain operative in
nuclear medicine imaging systems used today, albeit with refine-
ments in image acquisition, storage and retrieval made possible by
widespread availability of microprocessors. Elements of the Anger
scintillation camera design include the collimator, crystal, PMTs, and
electronics (Fig. 4).
4.2.3.2 Collimation[14,15]
The purpose of collimation in γ-cameras is to map the distribution of activity onto the crystal surface in an orderly and interpretable manner. The majority of collimators used today are parallel hole, consisting of multiple lead septa (or partitions) arranged perpendicularly to the crystal face so that they only permit passage of γ photons normal to the crystal. Collimators are designed to be specific to a particular range of radionuclide energies. Collimators are also designed based on preferences between the competing goals of count-rate sensitivity and spatial resolution, as determined by the length, thickness, and spacing (aperture width) of the septa.[16] Clinical systems sold today are typically equipped with a selection of low energy collimators designated for "high-resolution," "high-sensitivity," and "all-purpose" applications. For nuclear imaging laboratories that utilize 67Ga, 111In or 131I, collimators designed for medium-energy (67Ga, 111In) and high-energy (131I) imaging are also required.
There is a relationship between the source-to-collimator distance (r), image spatial resolution, and count-rate sensitivity for parallel-hole collimators. As r increases, spatial resolution worsens while the count-rate sensitivity remains constant (Fig. 5). While this latter observation appears to contradict the so-called inverse square law, in fact, the intensity of radiation at each collimator aperture does
Fig. 4. Basic principle of the Anger scintillation camera. In the current illustra-
tion, an area of concentrated radiopharmaceutical within the patient’s body emits
γ photons in an isotropic manner. The fate of various emitted photons is illus-
trated. Photon 1 exits the body but does not intersect the γ camera, while photon 2
is completely absorbed within the patient. Photon 3 exits the body and intersects
the γ camera, but the angle of incidence is such that the photon is absorbed by the
lead septa partition of the parallel-hole collimator. Photon 4 travels in a direction
such that it is able to pass through a collimator aperture, strike and be absorbed by
the NaI(Tl) crystal. The energy of the photon is transformed to visible light emitted
isotropically within the crystal which travels or is reflected towards the photomul-
tiplier tubes (PMTs). These convert the light signal to an electronic pulse which is
amplified and then analyzed by the positioning and summing circuits to determine
the apparent position of absorption. If the total energy, or Z signal, of the ampli-
fied photon (as indicated on the illustrated 99mTc energy spectrum by the number 4) falls within a 20% window centered on the 140 keV energy peak, the event is
accepted and x and y coordinates are stored within the image matrix. If a γ photon
is scattered within the patient as in photon 5, the lower energy of its pulse will be
rejected by the PHA, and the erroneous projection of this photon will not be added
to the image matrix. Less commonly, diverging, converging or pinhole collimators
may be used in place of the parallel hole collimator.
decrease as 1/r^2; however, the number of elements which admit photons increases as r^2, based on geometric considerations. This in turn explains the loss of resolution with increasing distance, with radiation emitted from a focal source passing through the collimator to interact with ever-larger regions of the crystal.
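A commonly cited geometric approximation (not derived in this chapter) expresses parallel-hole collimator resolution as R ≈ d(l + r)/l, where d is the aperture width and l the septal length. The sketch below, with hypothetical collimator dimensions, reproduces the qualitative behavior of Fig. 5:

    def collimator_resolution_mm(aperture_mm, septal_len_mm, r_mm):
        # Standard geometric approximation R ~ d * (l + r) / l (assumed form,
        # ignoring intrinsic resolution and effective-length corrections).
        return aperture_mm * (septal_len_mm + r_mm) / septal_len_mm

    # Hypothetical low-energy collimator: 1.5 mm apertures, 25 mm septa.
    for r_in in (2, 4, 8, 16):
        r_mm = r_in * 25.4
        print(r_in, "in:", round(collimator_resolution_mm(1.5, 25.0, r_mm), 1), "mm")

Resolution degrades roughly linearly with distance, from about 4.5 mm at 2 inches to about 26 mm at 16 inches under these assumed dimensions.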
Fig. 5. Four one-minute images taken anteriorly over the chest of a patient with
the collimator placed 2, 4, 8 and 16 inches from the subject. While the count rate
remains relatively constant over these distances (total counts noted in the lower
right corner of each image), degradation of spatial resolution is readily apparent.
An additional collimator design is used to image small objects
with superior resolution. The pinhole collimator (Fig. 4) consists
of a single small aperture (3 mm–5 mm in diameter) within a lead
housing which is offset from the crystal, thereby restricting inci-
dent photons to those that pass through the aperture. This creates
an inverted and potentially magnified image on the crystal face in a
manner analogous to that used in pinhole photography. The pinhole collimator has the capacity to increase resolution (through its magnifying effect), especially at close distances, but does so at the expense of field-of-view and parallax distortion. As source-to-collimator distance increases with this collimator, the field of view of the camera increases while magnification and resolution lessen and count-rate efficiency markedly decreases.
Additional collimator designs include diverging and converg-
ing collimators, the former allowing imaging of an area larger than
the collimator face and the latter allowing the enlargement of small
regions of interest (Fig. 4). These are infrequently used today but
may still find application in portable cameras with small field-of-
view crystals.
4.2.3.3 Crystal
As in the non-imaging probe, γ-camera crystals are generally com-
posed of NaI(Tl). Features that make this crystal desirable include
high mass density and atomic number (Z), thereby effectively
stopping γ photons, and high efficiency of light output. Most
current cameras incorporate large (50 cm × 60 cm) rectangular detec-
tors. While expensive, the larger field of view results in increased
efficiency.[17] In early designs, crystals were often 0.5 inches thick,
which was well-suited for high energy γ photons. In more recent
implementations of the γ-camera, crystals only 3/8-inch or 1/4-inch
thick are used, which is more than adequate for stopping the pre-
dominantly low-energy photons in common use today and which
also results in superior intrinsic spatial resolution. In the Anger cam-
era design, the NaI(Tl) crystal is optically coupled to an array of
PMTs which is packed against the undersurface of the crystal. Light-
pipes may be used to redirect light from the crystal into the PMTs.
4.2.3.4 Positioning Circuitry and Electronics
Early positioning circuitry in the Anger γ-camera was analog in
nature. γ photons incident on the NaI crystal resulted in production of light which propagated throughout the crystal and was converted to an amplified electrical pulse by the PMTs. Output of the PMTs was summed to produce aggregate X and Y signals which reflected the location of the scintillation event in the crystal, and which were used to deflect the beam of a cathode ray tube (CRT) in order to produce a single spot on the image. The sum of the PMT signals (Z signal) was
proportional to the γ photon energy and was used to exclude lower
energy scattered photons. In order to accurately superimpose the
distribution of multiple γ photons of different energies emanating
from one or more radionuclides (such as the 171 and 245 keV photons of 111In and the 93, 185, 300 and 394 keV photons of 67Ga), the Z pulse
is also used to normalize the X and Y signals so that the images
described by each photopeak are superimposable and image size is
energy-invariant (Z pulse normalization).
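The positioning logic can be summarized as a signal-weighted centroid. The following is a schematic sketch of that idea only (the PMT coordinates and amplitudes are hypothetical; real cameras implement this with analog or digital weighting networks plus correction maps):

    def anger_position(pmt_signals):
        # pmt_signals: list of (x, y, amplitude) tuples, one per PMT.
        # The summed amplitude (Z signal) is proportional to deposited energy;
        # X and Y are amplitude-weighted centroids normalized by Z.
        z = sum(a for _, _, a in pmt_signals)
        x = sum(px * a for px, _, a in pmt_signals) / z
        y = sum(py * a for _, py, a in pmt_signals) / z
        return x, y, z

    # Toy event seen mainly by the PMT at x = 3 (hypothetical amplitudes).
    print(anger_position([(0.0, 0.0, 10.0), (3.0, 0.0, 30.0), (6.0, 0.0, 5.0)]))

Dividing by the summed amplitude is exactly the Z-pulse normalization described above: it makes the estimated position independent of the photon energy.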
As microcomputers became faster, less expensive, and more
widely available, successive versions of γ-cameras increasingly
incorporated microprocessors.[18] Initially, the x and y signals of
the γ-camera were first processed by analog means and subse-
quently converted to digital signals by analog-to-digital convert-
ers (ADCs). Computers were then used for image storage, viewing,
image correction[19] and various quantitative analyses. γ-cameras
eventually became fully “digital” in that the output of each PMT
was digitized. Most current digital γ-cameras have the ability to adjust the gain of each PMT individually, leading to improved overall camera performance.[20] Individual events, as detected by the PMTs,
are then corrected for local differences in pulse-height spectra and
for positioning. These refinements have led to improvement in spa-
tial resolution and image uniformity.
4.2.3.5 Modes of Acquisition
Prior to image acquisition, the operator must specify acquisition
parameters such as size of the image matrix, number of brightness
levels, photopeak and window width. Typically, for a 99mTc acquisition, a 128 × 128 matrix is used with 2^8 or 2^16 levels of brightness, corresponding to a maximum of 256 or 65 536 counts per pixel, respectively. The acquisition window refers to the range of photon energies which will be accepted by the PHA. The peak and window width selected for 99mTc are 140 keV and 20%, respectively. As
mentioned earlier, scattered photons have decreased energy, and
the energy window excludes most, though not all, of these lower
energy photons. Modern cameras allow concurrent acquisition of
photons in multiple energy windows, whether emanating from a
single radionuclide with multiple γ emissions (such as 67Ga), or
multiple radionuclides with one or more energy peaks each (Fig. 6).
Fig. 6. Dual-isotope acquisition. 24 hours prior to imaging, 0.5 mCi of 111In-labeled autologous white blood cells (111In-WBCs) were injected intravenously into the patient to localize sites of infection. 30 minutes prior to imaging, 10 mCi of 99mTc-sulfur colloid were injected intravenously to localize marrow. Coimaging of the 99mTc window (140 keV, 20% window) and dual 111In windows (171 keV and 245 keV, 15% windows each) was performed, thereby producing simultaneous images of the marrow (left panel) and white blood cell distribution (right panel). In spite of differing energies, Z-pulse normalization has resulted in superimposable images. Marrow, liver and spleen are visible on both marrow and WBC studies. The 99mTc study is used to estimate WBC activity which is due to normal marrow distribution and of no pathologic consequence. To signify infection, 111In-WBC activity must be in areas other than the visualized marrow distribution.
In the latter case, photons derived from each radionuclide can subsequently be sorted into separate images, each reflecting the dis-
tribution of a single radiopharmaceutical. Multi-isotope imaging is
especially helpful when the two sets of images are used for com-
parison purposes. Depending on the relative activities of the radio-
pharmaceuticals and other considerations, the images of the isotope
emitting the lower energy photons may have to be corrected for
down-scatter of higher energy photons into its energy window.
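In software terms, each acquisition window is simply an acceptance test applied to every event's measured energy. A minimal sketch (the helper name in_window is ours, and the 20% window is interpreted, as is conventional, as ±10% about the photopeak):

    def in_window(energy_kev, peak_kev=140.0, window_frac=0.20):
        # A 20% window centered on 140 keV accepts 126-154 keV.
        half = peak_kev * window_frac / 2.0
        return peak_kev - half <= energy_kev <= peak_kev + half

    for e_kev in (140.0, 128.0, 118.0):    # photopeak, shallow scatter, scatter
        print(e_kev, in_window(e_kev))     # 118 keV is rejected

A dual-isotope study such as that of Fig. 6 amounts to running several such tests in parallel, one per configured window.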
Current clinical γ-cameras typically acquire several types of data
(Fig. 7). To acquire a static (or spot) view, the camera is placed
over a region of the body and an acquisition is performed for a
predetermined number of counts or interval of time. The latter is
appropriate when intensities of different parts of the body are being
compared. Dynamic imaging refers to the acquisition of multiple
sequential images at defined intervals. These may be of short dura-
tion, such as a series of two-second images to portray blood flow, or
Fig. 7. Images taken from a bone scan illustrate various modes of acquisition. Initial anterior images from the dynamic flow study (top panel) consist of sequential 2-second images taken over the feet subsequent to injection of 25 mCi of 99mTc-labeled MDP. Static (spot) images were taken two hours thereafter in anterior, left lateral, and plantar (sole of foot) projections for five minutes each. Sweep images were also taken at that time, in anterior and posterior projections, where the detector and patient table move with respect to each other in order to produce an extended field-of-view image. Increased flow and bone uptake within the left foot are highly suggestive of osteomyelitis (infected bone).
of longer duration, such as multiple one-minute images to demon-
strate uptake and excretion of radiopharmaceuticals by the kidneys
or liver. Many cameras have the ability to acquire whole-body views,
where the detector and patient bed move with respect to each other
during the acquisition, allowing an elongated sweep image to be
obtained. Gated images are obtained during the course of a cyclical
physiologic process, where a series of dynamic images is acquired
based on repetitive sampling synchronized by a physiologic trigger
(Fig. 8). This is commonly used to obtain a series of images dur-
ing phases of the cardiac cycle, thereby portraying the change in
left-ventricular volume during this period. In this method, the R-
wave of the electrocardiograph (ECG) is used to repetitively trigger
acquisition into a series of 16 brief frames into which counts are accrued. When summed over the course of several hundred cardiac beats, the limitation of statistical "noise" imposed by the few counts collected over each sub-second segment of the physiologic cycle is overcome.
In general, two modes of data collection are possible, frame mode and list mode. In the former and more common method of imaging, events detected by the PHA that fall within predetermined parameters are incremented into a specific image matrix. For example, in dynamic or gated imaging, frame length is prescribed a priori, and the counts are parsed into the appropriate matrices in real time. Frame mode is an efficient method of memory utilization, and images are retrievable immediately following completion of acquisition. A disadvantage of frame mode is that the acquisition parameters must be selected prior to the acquisition and cannot be retrospectively changed. For example, if the patient's heart rate changes during the acquisition, or if we wish to adjust the energy window, there is no way to alter the acquisition parameters. Alternatively, when acquiring in list mode, the time, location, and even energy of each scintillation event over the entire duration of the acquisition, in addition to any relevant physiologic triggers, are stored. At the conclusion of the acquisition, each event can be retrospectively sorted from the list into specific time bins or energy windows. As the data list remains intact, this exercise can be repeated as many times as desired. List mode
Fig. 8. Gated cardiac study performed after labeling of the patient's red blood cells with 25 mCi of 99mTc. The electrocardiograph tracing (top) illustrates the division of the cardiac cycle into 16 frames, marked A-P for illustrative purposes. Counts from the γ-camera during each portion of the cycle are assigned to a corresponding image matrix (labeled A-P). After counts from several hundred beats have been summed, count statistics are adequate to portray the change in volume of blood within the heart during a cardiac cycle. These can be shown as sequential images, as illustrated, or as a cine loop. Quantitative analysis can also be performed. As illustrated, regions-of-interest (ROIs) are placed over the left ventricle (LV) at both end diastole and end systole (black and white curves, respectively). Non-specific activity overlying the heart is estimated from an adjacent background (BK) region and subtracted from the LV regions. A time-activity curve (solid line) depicts the change in LV volume during the course of an average heartbeat. The dotted line illustrates the first derivative. In the current illustration, the percent change in LV volume during contraction (ejection fraction) is 45%, slightly below normal. RV and SP indicate locations of the right ventricle and spleen, respectively.
necessitates increased storage requirements, but is especially useful
in research applications where data may be analyzed in multiple
ways.
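As an illustration of why list mode is so flexible, the sketch below retrospectively sorts stored events into gated frames; the event and trigger formats are hypothetical, and the phase-binning rule is one reasonable choice rather than a vendor algorithm:

    import bisect

    def bin_list_mode(events, r_wave_times, n_frames=16):
        # events: (timestamp, x, y) tuples; r_wave_times: sorted ECG triggers.
        frames = [[] for _ in range(n_frames)]
        for t, x, y in events:
            i = bisect.bisect_right(r_wave_times, t) - 1
            if i < 0 or i + 1 >= len(r_wave_times):
                continue              # event not inside a complete R-R interval
            phase = (t - r_wave_times[i]) / (r_wave_times[i + 1] - r_wave_times[i])
            frames[min(int(phase * n_frames), n_frames - 1)].append((x, y))
        return frames

Because the raw list is preserved, the same data can be re-binned with a different frame count, heart-rate acceptance criterion, or energy window, which frame-mode acquisition cannot do.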
Tomographic imaging refers to the acquisition of 3D data. The
initial development of tomography occurred in nuclear medicine[6];
subsequently this technique was extended to computed transmis-
sion tomography (CT) and magnetic resonance imaging (MRI).
Single- and dual-photon methods of tomography will be discussed
separately below.
4.2.3.6 Analysis of Data
A strength of nuclear medicine is quantitative analysis. Regions of
interest (ROIs) can be defined on an image, or series of images, and used to extract areal count rates. When applied to a dynamic series of images, the result is called a time-activity curve (TAC). For example, an ROI over the left ventricle, in conjunction with gating by the electrocardiograph (ECG), can be applied to obtain a TAC of ventricular volume during systole from which we can derive a left-ventricular ejection fraction (Fig. 8). Routine applications in common use today which utilize ROIs include studies of renal function, gastric emptying, gall-bladder ejection fraction, and cardiac ejection fraction.
Two factors complicate the analysis of ROIs in planar scintigra-
phy. Attenuation of overlying soft tissues may vary across a single
image, among multiple images of the same patient, and certainly
from patient to patient. Attempts to compare relative uptake by
left and right kidneys within a single image may therefore be con-
founded by differences in attenuation of the overlying soft tissues.
To a certain degree, the approximate depth of organs, estimated from orthogonal views or other anatomic imaging modalities, can be used to compensate for attenuation based on the assumption that soft tissue is equivalent to water as an attenuating medium.
The second factor which confounds quantitative analysis is the activity residing within tissues above or below a structure of interest. With reference to the example cited above, attempts to compare left and right renal activity may be confounded by activity in overlying soft tissues, such as the liver. To compensate, a background ROI is typically defined adjacent to the area of interest and is then used to estimate and correct for these non-specific counts. A similar method is used to correct the left ventricular ROI for blood pool activity
originating in the overlying chest wall and lungs in calculation of
left ventricular ejection fraction (Fig. 8).
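The background-corrected ejection-fraction computation described above reduces to a few lines. In the sketch below, the counts, pixel areas, and background level are hypothetical numbers chosen to reproduce the 45% result of Fig. 8:

    def ejection_fraction(ed_counts, es_counts, bk_per_pixel, ed_pixels, es_pixels):
        # Net ROI counts = raw counts - background estimate scaled to ROI area.
        ed_net = ed_counts - bk_per_pixel * ed_pixels
        es_net = es_counts - bk_per_pixel * es_pixels
        return (ed_net - es_net) / ed_net

    print(ejection_fraction(10000.0, 5875.0, 5.0, 500.0, 350.0))   # -> 0.45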
4.2.4 Alternate Scintigraphic Camera Designs
By far, the Anger-style γ-camera has predominated in clinical
radionuclide imaging. However, other camera designs have been
developed and commercialized, especially for niche applications.
One such camera, designed by Bender and Blau in the early 1960s,
utilized 294 small NaI(Tl) crystals that were monitored by 35 PMTs
assigned to 14 rows and 21 columns.[21] In contrast to the Anger scin-
tillation camera, in which imaging is predicated upon localizing the
position of scintillation events in a large crystal, this scintillation
camera decodes position based on the photon’s interaction with
specific crystals, each of which represents a finite spatial location.
A major advantage of this design, which found application in first
pass cardiac studies, is a higher count-rate capability than that of the
Anger gamma camera. Recently, development of solid state detec-
tors has resulted in reemergence of multicrystal cameras, especially
for portable or dedicated cardiac applications (Fig. 9).
4.3 TOMOGRAPHY
4.3.1 Single-Photon
In current methods of single-photon tomography, the γ-camera
describes a circular or elliptical orbit around the patient as it acquires
projection images at multiple angles. Data are then reconstructed
using either filtered backprojection or iterative algorithms to esti-
mate the original 3D distribution of activity[22] (Fig. 10). Tomography
does not increase spatial resolution. It is used to increase contrast by
eliminating overlying activity and is helpful in improving anatomic
localization. Tomographic images are also amenable to fusion with
CT or MRI data.
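For intuition, an unfiltered backprojection can be written in a few lines of NumPy. This is an illustrative sketch only: clinical reconstruction uses filtered backprojection or iterative methods, and attenuation is ignored here.

    import numpy as np

    def backproject(sinogram, angles_deg):
        # sinogram: (n_angles, n_bins) array, one parallel projection per angle.
        n_bins = sinogram.shape[1]
        c = (n_bins - 1) / 2.0
        ys, xs = np.mgrid[0:n_bins, 0:n_bins] - c
        image = np.zeros((n_bins, n_bins))
        for proj, theta in zip(sinogram, np.deg2rad(angles_deg)):
            # Project each pixel onto the detector axis for this view,
            # then smear the measured profile back across the image.
            s = np.clip(np.round(xs * np.cos(theta) + ys * np.sin(theta) + c)
                        .astype(int), 0, n_bins - 1)
            image += proj[s]
        return image / len(angles_deg)

Summing the smeared views recovers a blurred estimate of the activity distribution; the ramp filter of filtered backprojection removes exactly this blurring.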
In theory, projection images obtained subtending 180° around the patient should be sufficient to reconstruct the three-dimensional
Fig. 9. Pediatric bone scan image using a portable solid state multicrystal camera
(Digirad, Poway, CA) to visualize detailed uptake of radiopharmaceutical in bones
of the hand. The detector consists of 4096 3 mm × 3 mm crystals of thallium-doped
cesium iodide [CsI(Tl)].
distribution of activity, and this is the standard acquisition method used in cardiac perfusion tomography. The heart is situated anterolaterally in the left chest and a 180° acquisition, centered on the heart, is obtained extending from the right anterior oblique to the left posterior oblique projection. Cameras with multiple detector heads are useful for tomographic acquisitions because they reduce the acquisition time required. An efficient method of using two detectors for cardiac imaging is to arrange the detectors at 90° to one another, in a so-called "L" configuration. In this way, the assembly need rotate only 90° to complete the entire 180° data acquisition.
Attenuation of photons in tissue and loss of resolution with dis-
tance from the collimator act to degrade photons originating on the
far side of the body. As a result, in most noncardiac tomographic
applications, acquisitions are performed using projections over a
Fig. 10. SPECT imaging. In this example, red blood cells from the patient have been labeled with 99mTc and reinjected into the patient. An acquisition is performed consisting of 64 projections taken about the upper abdomen. Eight representative projection images are displayed in panel A. The 3D distribution of activity has been iteratively reconstructed from the projection data; selected axial, sagittal and coronal images are shown (panel B). Note the relatively greater intensity of the periphery of the liver as compared to its center due to the effect of soft-tissue attenuation. Legend: A aorta; H heart; I inferior vena cava; K kidney; L liver; S spleen; V vertebral bodies.
full 360° rotation. In this case, dual-headed cameras are configured with the heads opposing each other at 180°. The assembly rotates 180° to complete the entire 360° data acquisition. For a three-headed camera, heads are spaced 120° apart and a complete acquisition can take place following a 120° rotation.
Attenuation interferes with quantitative analysis of images in
SPECT. This problem is manifested in cardiac perfusion imaging,
where variable soft-tissue attenuation leads to apparent regional
defects in the myocardium which simulate lack of perfusion. In the
imaging of the brain, it has been possible to correct for attenuation
by assuming the cranial contour conforms to an oval and consists of
water density; however, this method cannot be generalized to more
complicated and heterogeneous parts of the body. Attenuation cor-
rection can also be based on actual attenuation measurements using
radioactive sources which are rotated around the patient, thereby
obtaining a crude transmission-based attenuation map. The mea-
sured attenuation map is then typically segmented, to minimize
stochastic noise, and used to derive an energy-appropriate correc-
tionfor the emissionscan.
23
Most recently, SPECTcameras have been
manufactured with integrated inline CT scanners.
24
Using lookup
tables, it is thenpossibletotranslatetheattenuationof thelowenergy
X-ray beamto the energy-appropriate attenuation correction. At the
present time, a minority of clinical SPECT is performed with atten-
uation correction however its use appears to be increasing.
4.3.2 Dual-Photon
Radionuclides for dual-photon imaging emit positrons.
25
As dis-
cussed earlier, these particles do not exit the body and therefore
cannot be directly imaged. Each positron travels only several mil-
limeters or less within the patient (depending on its kinetic energy
and location), comes to rest, and combines with an electron. In
this process, the electron and positron annihilate each other and
their rest masses are converted into energy, resulting in creation
of two 511 keV photons which travel in nearly opposite direc-
tions (Fig. 11). Imaging systems in positron emission tomography
(PET) are designed to identify these paired photons and deter-
mine their line of origin, the line-of-response.[26] In some early clinical
systems, modified dual-headed γ-cameras with rectangular detec-
tors were used to detect coincident photons in PET imaging. How-
ever, because these detectors only subtended a small fraction of
the angles surrounding the patient, count rate sensitivity was poor
and use of this method has declined. Currently, clinical systems
utilize multiple dedicated rings of detectors which surround the
patient in a 360° geometry. A typical clinical PET system consists
of 3–4 rings of detectors, each subdivided into 6–8 planes and with
1000 elements per transaxial plane. The large number of detector
Fig. 11. Principle of PET imaging. A patient with lymphoma is noted to have intense 18F-FDG concentration in an axillary lymph node. 18F-FDG localizes in the tumor, presumably due to the increased rate of glycolysis (glucose metabolism) in malignant tissue. With radioactive decay of the 18F, a positron (β+) is emitted, travels on the order of 1 mm, comes to rest, and combines with a ubiquitous electron (β−) to produce a pair of nearly-opposed 511 keV photons (dotted lines). In the illustration, the pair of annihilation radiation photons nearly simultaneously intersect two elements within the ring of detectors (asterisks). The line that is defined by the two detectors is termed the line-of-response (dashed line). Millions of such coincidences are used to reconstruct the original distribution of 18F, thereby identifying the location of the tumor.
elements makes a PET scanner expensive and difficult to design. A method which has been used to simplify the scanner is to score a
large block detector crystal in such a way as to have it function as up
to 64 individual detector elements, backed by only 4 PMTs. In some
designs, each PMT is shared among four adjacent detectors, further
reducing their number and the overall cost. The exact location of
each photon interaction within the block detector is encoded by the
intensity of light recorded at the PMTs, which is unique for each
crystal element.
Optimal scintillators for PET are different from those for single-
photon imaging. The 511 keV photons are difficult to stop, and lack
Table 4. Scintillators Used in PET, after Zanzonico[26]

Scintillator                       Mass Density ρ   Effective Atomic   Light Output    Scintillation
                                   (g/cm^3)         Number, Z          (Photons/MeV)   Decay Time (ns)
NaI(Tl)                            3.7              51                 41 000          230
Bismuth germanate, BGO             7.1              75                 9 000           300
Lutetium oxyorthosilicate, LSO     7.4              66                 30 000          40
Gadolinium oxyorthosilicate, GSO   6.7              59                 8 000           60
of absorptive collimation in PET leads to high count rates and poten-
tially high dead-times. Crystals with high densities, high atomic
numbers, and rapid light output are favored (Table 4). High light
output is also desirable in that it reduces statistical uncertainty and
therefore improves scatter rejection.
In contrast to single-photon imaging, where collimation is required to relate a photon to its line of origin, no absorptive collimation is required in PET. This allows for far greater count-rate sensitivity than in single-photon systems. Furthermore, no degradation of resolution occurs with increasing distance from the detectors. Detec-
tor elements are usually operated in coincidence with only a subset
of all the other remaining detector elements, eliminating the consid-
eration of adjacent elements where the lines of response would lie
outside of the patient’s body. When two photons are detected by the
scanner within a finite time interval τ, typically only 6 ns–12 ns, the
detectors involved define a line-of-response for subsequent recon-
struction of the in vivo distribution of radionuclide. The energies of
the photons are usually windowed to reduce scattering. Millions of
lines-of-response are then used to calculate the distribution of radio-
pharmaceutical within the patient.
The paired 511 keV annihilation photons may interact with
the PET camera in several different ways (Fig. 13). Depending
on the orientation of the 511 keV photons, some positron annihi-
lations are missed completely. In a large number of others, only
one of the pair of annihilation photons interacts with a detector,
resulting in an unpaired single. Of the pairs of photons detected by
the camera within the timing window τ and which therefore define
a line-of-response, some reflect accurate or true events, correspond-
ing to absorption of non-scattered 511 keV photons that originate
from a single positron-electron annihilation. Other pairs of pho-
tons detected occur following the scattering of one or both photons
through shallow angles, thereby remaining within an acceptable
energy window but defining an erroneous line-of-response (scattered
coincidences). A third category of paired photons is designated as ran-
doms, due to the mistaken pairing of two independent 511 keV pho-
tons which are incident on the detectors within the specified timing
window τ despite originating from two separate positron-electron
annihilations. The narrower the timing window τ, the smaller the
number of random coincidences accepted. However, as τ is made
too short, true coincidences are also lost because of the nontriv-
ial time required for photon flight, scintillation within the crystal,
and electronic processing. Analogously, a narrower energy win-
dow will decrease scattered photons but at a cost of decreased
true coincidences due to limitations in energy resolution of the
system.
In the past, a major limitation of PET systems has been the high count rate and the demands placed upon the electronics and processors. Additionally, the true count rate increases in proportion to the activity present within the patient, while the random coincidence rate increases as the square of the activity and becomes critical at high count rates. In order to decrease the number of random and scattered coincidences and to reduce the huge computational demand, which increases as the square of the number of detectors, many systems introduce lead or tungsten septa between the rings of detectors, which prevent oblique lines-of-response. These systems are termed 2D in that only events within rings or between adjacent rings are acquired (Fig. 12). The transaxial images so derived are stacked to
constitute a three-dimensional volume of distribution from which
coronal, sagittal and maximum-intensity projection (MIP) images
are derived.
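The quadratic growth of randoms follows from the standard coincidence-counting estimate R_random ≈ 2τ S_1 S_2, where S_1 and S_2 are the singles rates on the two detectors (a textbook relation, not derived in this chapter). A short numeric sketch:

    TAU = 10e-9                          # 10 ns coincidence window (illustrative)
    for singles in (1e5, 2e5, 4e5):      # doubling activity doubles singles rates
        randoms = 2 * TAU * singles * singles
        print(int(singles), int(randoms))   # randoms quadruple per doubling

Trues grow only linearly with activity, so at high count rates randoms eventually dominate; this is what the interplane septa (and a narrow τ) are meant to limit.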
Fig. 12. 2D versus 3D PET. In 2D acquisition, lead or tungsten septa are extended
between adjacent rings of scintillation detectors. Only lines-of-response within a
single ring or possibly adjacent rings are permitted (solid line) while paired pho-
tons with greater obliquity are absorbed by the septa (broken lines). In 3D acquisi-
tion, the septa are retracted and each detector element may be in coincidence with
opposite detector elements in any of the rings. This results in increased sensitivity
but markedly increased random coincidences and potentially dead time. Sequen-
tial scans in a patient with previously treated tumor in the frontal lobe demonstrate
the improved quality of the 3D scan relative to the 2D scan, which is achievable when
imaging small body parts such as the head where scatter is minimal.
By removing the lead or tungsten septa between adjacent rings
of detectors, it is possible to perform an acquisition where lines-of-
response between all the detector rings are potentially available to
detect coincident pairs of photons, an approach termed 3D (Fig. 12).
3D-acquisition has been facilitated by faster computer processors
and improvements in detectors which allow better temporal and
energy resolution. Because of the deleterious effect of high count
rate, the administered activities are decreased in 3D acquisitions.
Imaging times may also be significantly reduced due to the greater
count-rate sensitivity.
Fig. 13. Variety of photon interactions in PET imaging. Positrons emitted from the tumor in the patient's left axilla result in emission of pairs of annihilation photons. Representative interactions are illustrated. (1) A pair of annihilation photons is detected by the ring of detectors, resulting in a "true" coincidence which is recorded as the line-of-response "A" (dashed line). (2) Only one of the pair of photons is recorded by the ring of detectors, a "single." The second photon escapes by passing through the ring of detectors, or by passing out of plane or into a lead septum. (3a) and (4a) Two unrelated single photons are recorded within the finite coincidence timing window τ, resulting in the false line-of-response "B" (random). (5) One of the 511 keV photons is directly detected by a detector (5a) while the second undergoes Compton scattering within the patient (5b). Depending on the scattering angle, if this latter photon remains within the acceptable energy window, the result is a malpositioned coincidence and the erroneous line-of-response "C."
A number of sources of error impact the accuracy of dual-photon imaging. Depending on its kinetic energy, the positron may travel up to several millimeters prior to coming to rest, which displaces the line-of-response from the actual site of the annihilation event. Secondly, because positrons may actually have nonzero momenta immediately prior to annihilation, the emitted 511 keV photons may not be exactly collinear (at 180° to each other). This,
too, leads to errors in the designated line-of-response. Each detector
element also has a finite size, which introduces further uncertainty
in the line-of-response. As mentioned above, random and scattered coincidences can lead to erroneous lines-of-response as well. Modern PET scanners in clinical use have spatial resolution on the order of 3 mm–4 mm.
In PET imaging, attenuation correction is routinely performed for clinical interpretation and is required for quantitative analysis. In uniform and geometrically simple regions of the body such as the skull, attenuation correction may be based on the assumption that the body contour conforms to a water-equivalent regular geometric shape, as discussed with regard to SPECT scanners. Measurement-based attenuation correction in PET utilizes radioactive sources such as ^68Ge (energy 511 keV) and ^137Cs (energy 662 keV), which are rotated around the patient and used to yield a crude attenuation map for correcting the emission data. Often, the ^68Ge and ^137Cs attenuation maps are segmented into lung, bone, and soft tissue densities to overcome the degradative effects of the low-count ("noisy") transmission data. These are then used to derive a 511 keV-appropriate attenuation map with which to correct the emission scans. Most recently, PET scanners have been manufactured with integrated inline CT scanners.^27 Using segmentation and lookup tables, it is also possible to translate the energy-specific attenuation of the X-ray beam to that of the appropriate 511 keV photon energy (Fig. 14).
A widely used measure of radiopharmaceutical uptake in clinical PET imaging is the standardized uptake value (SUV), which is a dimensionless ratio of the radiotracer concentration in the structure of interest to the average concentration in the body (administered activity divided by patient mass). Quantitation is most useful when values are compared in a single subject before and after therapy. Accurate measurement of SUV is dependent on accurate attenuation correction, and its clinical utility also requires standardization of the interval between FDG administration and imaging, fasting of the patient before FDG administration, and other biologically relevant variables.^28 Most studies have shown that a semiquantitative
Fig. 14. Non-attenuation corrected (NAC) and attenuation corrected (AC) scans in the same patient as Fig. 12. The original data from a PET-CT scan consist of NAC and CT images. The CT scan is used to create an energy-appropriate attenuation correction map, which is applied to create the AC images. The CT scan is often fused with the AC images in order to improve anatomic localization of abnormalities. Note the distribution of activity in the NAC image, where central structures are relatively lower in activity than peripheral structures; this has been corrected in the AC PET images.
visual grading system is as effective as SUV-based diagnostic criteria in differentiating malignant from benign tissue.
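To make the SUV definition concrete, the following is a minimal sketch in Python; the function name and the example numbers are illustrative assumptions, not values from this chapter or from any specific PET software.

    def standardized_uptake_value(tissue_conc_bq_per_ml,
                                  injected_activity_bq,
                                  patient_mass_g):
        """Dimensionless SUV: tissue concentration divided by the
        average whole-body concentration (activity / mass).
        Assumes ~1 g/mL tissue density, so Bq/mL ~ Bq/g."""
        avg_conc = injected_activity_bq / patient_mass_g  # Bq/g
        return tissue_conc_bq_per_ml / avg_conc

    # Example: 370 MBq injected into a 70 kg patient; a lesion measuring
    # 21 kBq/mL yields SUV = 21e3 / (370e6 / 70e3), i.e. about 4.0
    print(standardized_uptake_value(21e3, 370e6, 70e3))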
4.3.3 Fusion Imaging in Nuclear Medicine
A development in nuclear medicine which continues to evolve is
the combination or fusion of scintigraphic images with anatomic
radiologic modalities, chiefly CT. To a certain degree, this has
been facilitated by development of standards for image storage
and retrieval (Digital Imaging and Communications in Medicine
[DICOM]) and the interface of multiple modalities to common data and storage systems (Picture Archiving and Communication Systems, or PACS).^{29,30}
Nuclear medicine images contain unique func-
tional information but are often limited in anatomic content. CT and
anatomic MR imaging have superior spatial resolution, but gener-
ally lack functional information. By fusing nuclear medicine and
anatomic images, it is possible to correlate changes in function with
particular anatomic structures. Initially, this was attempted by using software to retrospectively register^31 and then fuse (overlay) studies performed on separate scanners at different times (Fig. 15). Methods
Fig. 15. Retrospective SPECT fusion. A contrast-enhanced CT scan, performed several days prior to the blood pool study illustrated in Fig. 10, has been registered and fused to the nuclear medicine study using a mutual information method (MIMvista version 4.0, MIMvista Corp, Cleveland, OH). Note the excellent registration of the inferior vena cava (arrows) and spleen (arrowhead) on the blood pool and CT images. The study was done to evaluate the blood volume corresponding to an area of contrast enhancement in the periphery of the left lobe of the liver on CT (dashed circle). No corresponding increase in blood volume is noted and the lesion therefore does not represent a hemangioma.
of retrospective registration were frequently hampered by inevitable differences in patient positioning, bowel preparation and other variable factors. To minimize this problem, a hardware approach was developed where SPECT and PET scanners were combined inline with CT scanners, thereby permitting both studies to be sequentially acquired on the same gantry with little or no time and patient motion between them (Fig. 16). An additional benefit of combined devices has been the ability to correct for attenuation as described above. Currently, the vast majority of PET scanners sold for clinical use incorporate CT; while less common on SPECT scanners, this feature is increasing in frequency. Indeed, the success of CT fusion in clinical
Fig. 16. Hardware fusion of SPECT and multislice CT coronary angiography (CTA) in a 50 year old man with chest pain, performed on an inline dedicated scanner (Research SPECT/16-CT Infinia LS, General Electric Healthcare Technologies). (A) Selected tomographic perfusion images of the heart are displayed in short axis and vertical long axis. An area of diminished perfusion at stress (odd rows, arrows) exhibits improved perfusion at rest (even rows, arrowheads). (B) Fused SPECT/CTA data combining an epicardial display of myocardial perfusion at stress (upper image) and rest (lower image) with the coronary tree derived from the CTA study illustrates the relationship of the ischemic territory and the right coronary artery (RCA). The course of the left anterior descending (LAD) artery is also demonstrated. CTA images illustrate luminal narrowing in the RCA (not shown). (Images courtesy of Dr R Bar-Shalom, Rambam Health Care Campus, Haifa, Israel.)
single- and dual-photon tomography has led to the development of parallel techniques for small animal imaging and research.^32
4.4 CONCLUDING REMARKS
In nuclear medicine, molecules containing radioactive atoms,
termed radiopharmaceuticals, are used to diagnose and treat dis-
ease; the interaction of the radiopharmaceuticals with physiologic
processes within the body reveals unique functional information.
Methods of diagnosis in nuclear medicine include imaging of single photons originating from the radiopharmaceuticals, and in the case of PET, imaging of dual photons that derive from annihilation of positrons. Current developments in nuclear medicine include fusion of SPECT and PET with anatomic modalities such as CT.
References
1. Groch MW, Radioactive decay, Radiographics 18(5): 1245–1246, 1998.
2. Hall EJ, Giaccia AJ, Radiobiology for the Radiologist, 6th edn., Lippincott
Williams & Wilkins, Philadelphia, PA, 2006.
3. Brodsky A, Kathren RL, Historical development of radiation safety
practices in radiology, Radiographics 9(6): 1267–1275, 1989.
4. Schück H, Sohlman R, Österling A et al., Nobel: The Man and His Prizes, 2nd edn., Elsevier Publishing Company, Amsterdam, 1962.
5. Graham LS et al., Nuclear medicine from Becquerel to the present,
Radiographics 9(6): 1189–1202, 1989.
6. Kuhl DE, Edwards RQ, Reorganizing data from transverse section
scans of the brain using digital processing, Radiology 91(5): 975–983,
1968.
7. Kassis AI, Adelstein SJ, Radiobiologic principles in radionuclide ther-
apy, Journal of Nuclear Medicine 46 (Suppl 1): 4S–12S, 2005.
8. Budinger TF, Rollo FD, Physics and instrumentation, Prog Cardiovasc
Dis 20(1): 19–53, 1977.
9. Ranger NT, Radiation detectors in nuclear medicine, Radiographics
19(2): 481–502, 1999.
10. Blahd WH, Ben Cassen and the development of the rectilinear scanner,
Semin Nucl Med 26(3): 165–170, 1996.
11. Gottschalk A, Anger HO, Use of the scintillation camera to reduce radioisotope scanning time, JAMA 192: 448–452, 1965.
12. Anger HO, Scintillation Camera, The Review of Scientific Instruments
29(1): 27–33, 1958.
13. Collica CJ, Robinson T, Hayt DB, Comparative study of the gamma
camera and rectilinear scanner, Am J Roentgenol Radium Ther Nucl Med
100(4): 761–779, 1967.
14. Formiconi AR, Collimators, Q J Nucl Med 46(1): 8–15, 2002.
15. Swann S et al., Optimized collimators for scintillation cameras, J Nucl
Med 17(1): 50–53, 1976.
16. Anger HO, Scintillation camera with multichannel collimators, J Nucl
Med 5: 515–531, 1964.
17. Murphy PH, Burdine JA, Large-field-of-view (LFOV) scintillation cameras, Semin Nucl Med 7(4): 305–313, 1977.
18. Todd-Pokropek A, Advances in computers and image processing with
applications in nuclear medicine, Q J Nucl Med 46(1): 62–69, 2002.
19. Muehllehner G, Colsher JG, Stoub EW, Correction for field nonuniformity in scintillation cameras through removal of spatial distortion, J Nucl Med 21(8): 771–776, 1980.
20. Genna S, Pang SC, Smith A, Digital scintigraphy: Principles, design,
and performance, J Nucl Med 22(4): 365–371, 1981.
21. Bender MA, Blau M, The autofluoroscope, Nucleonics 21(10): 52–56,
1963.
22. Larsson SA, Gamma camera emission tomography. Development and properties of a multisectional emission computed tomography system, Acta Radiol Suppl 363: 1–75, 1980.
23. Xu EZ et al., A segmented attenuation correction for PET, J Nucl Med
32(1): 161–165, 1991.
24. Bocher M et al., Gamma camera-mounted anatomical X-ray tomogra-
phy: Technology, system characteristics and first images, Eur J Nucl
Med 27(6): 619–627, 2000.
25. Votaw JR, The AAPM/RSNA physics tutorial for residents. Physics of PET, Radiographics 15(5): 1179–1190, 1995.
26. Zanzonico P, Positron emission tomography: A review of basic prin-
ciples, scanner design and performance, and current systems, Semin
Nucl Med 34(2): 87–111, 2004.
27. Beyer T et al., A combined PET/CT scanner for clinical oncology, J Nucl Med 41(8): 1369–1379, 2000.
28. Keyes JW, Jr, SUV: Standard uptake or silly useless value? J Nucl Med
36(10): 1836–1839, 1995.
29. Alyafei S et al., Image fusion system using PACS for MRI, CT, and PET
images, Clin Positron Imaging 2(3): 137–143, 1999.
30. Graham RN, Perriss RW, Scarsbrook AF, DICOM demystified: A review of digital file formats and their use in radiological practice, Clin Radiol 60(11): 1133–1140, 2005.
31. Zitova B, Flusser J, Image registration methods: A survey, Image and
Vision Computing 21(11): 977–1000, 2003.
32. Lewis JS et al., Small animal imaging. Current technology and perspec-
tives for oncological imaging, Eur J Cancer 38(16): 2173–2188, 2002.
33. National Institute of Standards and Technology [cited February 11, 2007]; Available from: http://physics.nist.gov/cuu/index.html.
34. Cherry SR, Sorenson JA, Phelps ME, Physics in Nuclear Medicine, 3rd edn., Saunders, Philadelphia, PA, 2003.
CHAPTER 5
Principles of Magnetic Resonance
Imaging
Itamar Ronen and Dae-Shik Kim
The phenomenon of nuclear magnetic resonance was first described by Felix Bloch^1 and independently by EM Purcell in 1946.^2 Both scientists shared the Nobel Prize in 1952 for this pivotal discovery. The phenomenon is tightly linked to the broader field of interaction between matter and radiation commonly known as spectroscopy. It is within the frame of nuclear magnetic resonance (NMR) spectroscopy that the field developed in leaps and bounds, leading to the discovery of Fourier transform NMR by RR Ernst (Nobel Prize in Chemistry, 1991) and to an astonishingly wide range of applications, from the determination of protein structure in solution to the investigation of metabolic processes in live organisms, from solid state research to myriad applications in organic chemistry. The unexpected paradigm shift in NMR research came in the early 1970s, when independent research by two ingenious scientists, Paul Lauterbur, then at SUNY Stony Brook, and Peter Mansfield, at Nottingham University, UK, raised the possibility of obtaining images based on the signal generated by nuclear magnetic resonance.^{3-5} The humble beginnings, namely the projection-reconstruction maps of two water-filled test tubes shown by Lauterbur in his first publication on the matter in the journal Nature, were soon followed by the first applications of this new technique to obtain images of the human body, and this spawned a new field, that of magnetic resonance imaging (MRI). Both scientists shared the Nobel Prize in Physiology or Medicine in 2003. MRI effectively revolutionized the biomedical sciences, allowing the noninvasive imaging of practically every organ of the human body in health and disease. MRI methodology began covering a broad range of diagnostic tools, and invaded an astonishing variety of basic research fields, including the ability to visualize brain function, characterize tissue microscopic
structure, reconstruct neural connections, and more — all in a nonin-
vasive and harmless manner. In this chapter, we will explore the basic
principles of nuclear magnetic resonance — the way in which the NMR
signal is generated and detected, the properties of the NMR signal
and the way in which this signal is manipulated to provide us with
images.
5.1 PHYSICAL AND CHEMICAL FOUNDATIONS OF MRI
5.1.1 Angular Momentum of Atomic Nuclei
Atomic nuclei possess intrinsic angular momentum, associated with precession of the nucleus about its own axis (spin). In classical physics, angular momentum is a vector associated with a body rotating or orbiting around an axis of rotation, and is given by

$\vec{L} = \vec{r} \times \vec{P}$,

where $\vec{L}$ is the angular momentum vector, $\vec{r}$ is the radius vector from the center of rotation and $\vec{P}$ is the linear momentum vector. The vector multiplication operator, $\times$, constrains the angular momentum to be perpendicular to the plane defined by $\vec{P}$ and $\vec{r}$, as can be seen in Fig. 1.

In quantum mechanics, the angular momentum for particles such as electrons, protons or atomic nuclei is given by $L = \sqrt{I(I+1)}\,\hbar$, where $L$ stands for the total angular momentum of the

Fig. 1. The relationship between the angular momentum $\vec{L}$, the linear momentum $\vec{P}$ and the radius vector $\vec{r}$.
particle, $I$ is the spin number, or simply the spin of the particle, and $\hbar$ is the reduced Planck constant. The spin quantum number $I$ is characteristic of every nuclear species. $I$ can be zero or can take positive integer or half-integer values. The differences in the value of the spin quantum number reflect differences in the nuclear composition and charge distribution. For instance, the nucleus of ^1H ($I = 1/2$) consists of one proton only, while the nucleus of ^2H ($I = 1$) consists of a proton and a neutron. For ^12C and ^16O, $I = 0$, and these nuclei have zero angular momentum. Table 1 lists some of the stable isotopes of common elements in the periodic table together with their spin numbers.
5.1.2 Energy States of a Nucleus with a Spin I

A nucleus with a spin $I$ possesses $2I + 1$ possible states, defined by a quantum number $m_I$. $m$ can take the values $-I, -I+1, \ldots, I-1, I$. These states are associated with the different projections of $L$ on the (arbitrarily chosen) z-axis. The projection is then given by $L_z(m) = m\hbar$. The relationship between $L$ and the different $L_z$ for the case of $I = 3/2$ is given in Fig. 2. It should be noted that the projection of the angular momentum is well defined on one axis only, while as a result of Heisenberg's uncertainty principle, $L_x$ and $L_y$ are not well defined, causing $L$ to lie on an uncertainty cone. In the absence of an
Table 1.

    Isotope   Natural          Spin (I)   Magnetic       Gyromagnetic
              Abundance (%)               Moment (µ)*    Ratio (γ)**
    ^1H        99.9844         1/2         2.7927         26.753
    ^2H         0.0156         1           0.8574          4.107
    ^11B       81.17           3/2         2.6880            —
    ^13C        1.108          1/2         0.7022          6.728
    ^17O        0.037          5/2        −1.8930         −3.628
    ^19F      100.0            1/2         2.6273         25.179
    ^29Si       4.700          1/2        −0.5555         −5.319
    ^31P      100.0            1/2         1.1305         10.840

    *  µ in units of nuclear magnetons = 5.05078 × 10^−27 J T^−1
    ** γ in units of 10^7 rad T^−1 sec^−1
Fig. 2. Quantization of the angular momentum for $I = 3/2$. The four possible $m$ states represent four projections on the z-axis.

external magnetic field, states with different $m$ have the same energy — they are degenerate states.
5.1.3 Nuclear Magnetic Moment

The overall spin of the (charged) nucleus generates a magnetic dipole moment, or a magnetic moment, along the spin axis. The nuclear magnetic moment, $\mu$, is again an intrinsic property of the specific nucleus. The magnetic moment $\mu$ results from the motion of a charged particle, similar to the generation of a magnetic moment by a loop current. The magnetic moment $\mu$ is also a vector, and it is proportional to the angular momentum $L$ through the gyromagnetic ratio $\gamma$:

$\mu = \gamma L, \qquad \mu_z = \gamma L_z = \gamma m \hbar$.

The gyromagnetic ratio is one of the most useful constants in MR physics, and we will encounter it on several
occasions in our discussion. Table 1 lists the gyromagnetic ratio for
several nuclei of stable isotopes, as well as for the electron.
5.1.4 The Interaction with an External Magnetic Field

The nuclear magnetic moment can interact with an external magnetic field $B_0$. The energy of this interaction is given by

$E = -\vec{\mu} \cdot \vec{B}_0 = -(\mu_x B_{0,x} + \mu_y B_{0,y} + \mu_z B_{0,z})$,

the scalar product between the magnetic field and the magnetic dipole. If the field is oriented solely along the z-axis, the energy of this interaction is thus given by

$E = -\vec{\mu} \cdot \vec{B}_0 = -\mu_z B_0 = -\gamma_I m_I \hbar B_0, \qquad m_I = -I, -I+1, \ldots, I-1, I$,

where $\gamma_I$ is the gyromagnetic ratio of the nucleus $I$. As can be appreciated, the effect of the external magnetic field is the removal of the degeneracy of the energy levels. Each $m$ state is now characterized by a distinct energy. In the case of spin 1/2, $m$ is equal to either $-1/2$ or $+1/2$, and the energy levels associated with these two states are $E_{m=-1/2} = E_0 + \frac{1}{2}\gamma\hbar B_0$ and $E_{m=+1/2} = E_0 - \frac{1}{2}\gamma\hbar B_0$. Figure 3 describes the energy level diagram of a spin 1/2 particle in the presence of an external magnetic field. The
Fig. 3. The effect of $B_0$ on the two degenerate $m$ states for an $I = 1/2$ particle.
energy gap between the two levels is given by $\Delta E = E_{-1/2} - E_{+1/2} = \gamma\hbar B_0$. This energy can be expressed as a frequency using the Planck relation $E = h\nu$, where $\nu_0$ is the Larmor frequency:

$\omega_0 = 2\pi\nu_0 = \gamma B_0$.
The removal of energy level degeneracy can be viewed as a result of the break of spherical symmetry introduced by the external magnetic field $B_0$. As a result of the interaction with $B_0$, $L$ is now confined to $2I + 1$ orientations with respect to $B_0$, dictated by the possible $2I + 1$ values of $L_z$, which now reflect the projections of $L$ on the axis defined by the direction of $B_0$. In the case of $I = 1/2$, the state $m = 1/2$ has $L_z$ lying on the positive z-axis, and $m = -1/2$ on the negative z-axis, similarly to what is seen in Fig. 2 for the case $I = 3/2$.
5.1.5 The Classical Picture

For spin 1/2, the same result can be obtained solely from classical considerations. One can envision the nucleus as a small magnetic dipole with a magnetic moment $\mu$. When placed in an external magnetic field $B_0$, the nucleus will experience a torque in a plane perpendicular to $B_0$ and precess along the magnetic field lines with a precession frequency $\omega_0$, proportional to the external magnetic field through the gyromagnetic ratio: $\omega_0 = \gamma B_0$. The dipole precesses along a cone that forms an angle $\theta$ with the z-axis. This is an equivalent of the uncertainty cone described for $L_x$ and $L_y$ of the quantum particle. It should be noted that the classical picture converges with the quantum-mechanical picture only for $I = 1/2$.
5.1.6 Distribution Among m States

The removal of energy degeneracy thus creates a uniformly spaced energy ladder with $2I + 1$ distinct states, separated by $\Delta\omega = \gamma B_0$. The energy of the states increases as $m$ decreases. In the case of $I = 1/2$, the state $m = 1/2$ has the lower energy and is typically designated as the α state, and the state with $m = -1/2$ is higher in energy and is designated as the β state. Most importantly for MRI, the energy gap is linearly proportional to the external magnetic field, and so is the Larmor frequency. At room temperature, one can use the Maxwell-Boltzmann distribution to estimate the distribution of the particles, in our case nuclei, among the various energy levels. For $I = 1/2$, the MB distribution among the states α and β is given by:

$\frac{n_{\alpha,\beta}}{N} = \frac{\exp\left(-\frac{E_0 \mp \frac{1}{2}h\nu_0}{kT}\right)}{\exp\left(-\frac{E_0 - \frac{1}{2}h\nu_0}{kT}\right) + \exp\left(-\frac{E_0 + \frac{1}{2}h\nu_0}{kT}\right)}$

(upper sign for α, lower sign for β), where $k$ is the Boltzmann constant, $\nu_0$ is the Larmor frequency and $T$ is the temperature. For $T = 300$ K and a magnetic field of 3 Tesla, the Larmor frequency is roughly 127.7 MHz, and the α state is more populated than the β state such that:

$\frac{n_\beta}{n_\alpha} = \frac{\exp\left(-\frac{E_0 + \frac{1}{2}h\nu_0}{kT}\right)}{\exp\left(-\frac{E_0 - \frac{1}{2}h\nu_0}{kT}\right)} = \exp\left(-\frac{h\nu_0}{kT}\right) = 0.99998.$
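As a quick numerical check (a sketch, not software from the text), the Larmor frequency and the Boltzmann population ratio at 3 T and 300 K can be computed directly; the constants are standard values:

    import math

    gamma = 2.6753e8      # 1H gyromagnetic ratio, rad s^-1 T^-1 (26.753 x 10^7, Table 1)
    h = 6.626e-34         # Planck constant, J s
    k = 1.381e-23         # Boltzmann constant, J K^-1
    B0, T = 3.0, 300.0    # field (tesla) and temperature (kelvin)

    nu0 = gamma * B0 / (2 * math.pi)        # Larmor frequency, Hz
    ratio = math.exp(-h * nu0 / (k * T))    # n_beta / n_alpha

    print(f"nu0 = {nu0/1e6:.1f} MHz")        # ~127.7 MHz
    print(f"n_beta/n_alpha = {ratio:.5f}")   # ~0.99998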
5.1.7 Macroscopic (Bulk) Magnetization

In a macroscopic sample, the total magnetic moment of the sample is called the macroscopic or bulk magnetization. The bulk magnetization, or simply the magnetization, at thermal equilibrium is denoted by $M^{(eq)}$ or simply by $M$, and it is the sum of the individual moments in the sample. The sums of all moments in state α or β are given by $M_{\alpha,\beta}$, where $M_{\alpha,\beta}(x) = M_{\alpha,\beta}(y) = 0$ and $M_{\alpha,\beta}(z) > 0$. The x and y projections of $M_{\alpha,\beta}$ vanish because of the uncertainty cone (in the quantum mechanical picture) or the lack of phase coherence among the individual precessions (in the classical picture). As a result of the thermal distribution among states, $M_\alpha > M_\beta$, and the total magnetization $M$ is given by $M = M_\alpha - M_\beta$. It can easily be shown that at high temperatures, $M = \frac{1}{4}N(\gamma\hbar)^2 B_0 / kT$, known also as Curie's law.
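To get a feel for the magnitudes involved, Curie's law can be evaluated for water protons; the spin density below is an assumed textbook-style value, not a number given in this chapter:

    hbar = 1.0546e-34     # reduced Planck constant, J s
    gamma = 2.6753e8      # 1H gyromagnetic ratio, rad s^-1 T^-1
    k = 1.381e-23         # Boltzmann constant, J K^-1
    N = 6.7e28            # assumed proton density of water, spins per m^3
    B0, T = 3.0, 300.0

    # Curie's law for spin-1/2: M = N (gamma*hbar)^2 B0 / (4 k T)
    M = N * (gamma * hbar)**2 * B0 / (4 * k * T)
    print(f"M = {M:.3e} A/m")   # on the order of 1e-2 A/m

The smallness of this equilibrium magnetization, relative to the fields involved, is one reason NMR is an intrinsically low-sensitivity technique.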
5.1.8 The Interaction with Radiofrequency Radiation — the Resonance Phenomenon

At this point, an important point has been reached — the creation of an energy gap between two unevenly populated states. A radiofrequency (RF) radiation at a frequency equal to the frequency gap between the states will result in transitions of particles from the α to the β states and in absorption of energy quanta. This is the nuclear magnetic resonance phenomenon, and thus the resonance condition is $\omega_{RF} = \omega_0$. The simplest experimental setting that can be envisioned is that of a magnet that generates a static homogeneous magnetic field $B_0$ and a radiofrequency source that generates RF radiation at $\omega_{RF}$. If the sample inside the homogeneous $B_0$ contains nuclei with $I > 0$ (e.g. a water sample, where the hydrogen atoms have nuclei with $I = 1/2$), by slowly varying either the external magnetic field $B_0$ or $\omega_{RF}$, the resonance condition will be met at some point, resulting in absorption of RF. This absorption can be detected by an RF detector. One of the first NMR spectra ever obtained was of ethanol (CH3CH2OH). The three resonances that were visible on the spectrum were those of the three hydrogen "types" (the CH3 group, the CH2 group and the OH group), and the slight variations in resonance frequencies among the three stem from slight differences in the electron shielding around the different ^1H nuclei.
5.2 THE BLOCH EQUATIONS

A phenomenological description of the equations of motion for $M$, the bulk magnetization, was given by Felix Bloch, and it is known as the Bloch equations. The Bloch equations are extremely useful for understanding the various effects that experimental manipulations of $M$ have, and in particular the effects of radiofrequency radiation. The Bloch equations for the three components of the magnetization are:

$\frac{dM_x}{dt} = -\gamma(B_{0,y}M_z - B_{0,z}M_y) - \frac{M_x}{T_2}$,

$\frac{dM_y}{dt} = -\gamma(B_{0,z}M_x - B_{0,x}M_z) - \frac{M_y}{T_2}$,

$\frac{dM_z}{dt} = -\gamma(B_{0,x}M_y - B_{0,y}M_x) - \frac{M_z - M_{z,eq}}{T_1}$,

where the first term in each equation represents the torque exerted on each magnetization component by the components of $B_0$ perpendicular to it. The second term in each equation is a relaxation term
that allows the magnetization to regain its equilibrium value. Since we assume $B_0$ lies along the z-axis, the relaxation along the z-axis is called the longitudinal relaxation, whereas the relaxation on the xy plane is called the transverse relaxation. The sources of these relaxation processes are different, and will be discussed later. If the magnetic field $B_0$ is aligned along the z-axis, then, taking into account the relation $\omega_0 = \gamma B_0$, the Bloch equations in the presence of a static magnetic field take the form:

$\frac{dM_x}{dt} = \omega_0 M_y - \frac{M_x}{T_2}$,

$\frac{dM_y}{dt} = -\omega_0 M_x - \frac{M_y}{T_2}$,

$\frac{dM_z}{dt} = -\frac{M_z - M_{z,eq}}{T_1}$.
The solution of the Bloch equations for the transverse (xy) components of the magnetization is a clockwise precession accompanied by decay at a rate $1/T_2$ until $M_{xy} \to 0$. The longitudinal component of $M$ recovers at a rate $1/T_1$, approaching $M_{z,eq}$:

$M_x(t) = M_{xy}(0)\cos(\omega t)\cdot\exp\left(-\frac{t}{T_2}\right)$,

$M_y(t) = -M_{xy}(0)\sin(\omega t)\cdot\exp\left(-\frac{t}{T_2}\right)$,

$M_z(t) = M_{z,eq} + \left[M_z(0) - M_{z,eq}\right]\cdot\exp\left(-\frac{t}{T_1}\right)$.
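These closed-form solutions are easy to sanity-check numerically; the sketch below (all parameter values are illustrative assumptions) evaluates them on a time grid and verifies that the transverse magnitude decays purely with $T_2$:

    import numpy as np

    omega = 2 * np.pi * 100.0   # precession in the demodulated frame, rad/s (assumed)
    T1, T2 = 1.0, 0.08          # assumed relaxation times, s
    Mz_eq, Mxy0, Mz0 = 1.0, 1.0, 0.0

    t = np.linspace(0, 0.5, 1000)          # s
    Mx = Mxy0 * np.cos(omega * t) * np.exp(-t / T2)
    My = -Mxy0 * np.sin(omega * t) * np.exp(-t / T2)
    Mz = Mz_eq + (Mz0 - Mz_eq) * np.exp(-t / T1)

    # Transverse magnitude decays with T2, independent of the precession
    assert np.allclose(np.hypot(Mx, My), Mxy0 * np.exp(-t / T2))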
5.2.1 The Inclusion of the RF Field in the Bloch Equations

As mentioned earlier, the actual MR experiment involves the perturbation of $M_{eq}$ with a radiofrequency irradiation. The RF irradiation generates an EM field oscillating at a frequency which we denote by $\omega$; the magnetic part of this field is $B_1$, with an associated nutation frequency $\omega_1 = \gamma B_1$. In order to drive $M$ out of equilibrium, $B_1$ must operate perpendicular to the z-axis. For simplicity, in our discussion we will assume $B_{1,x} > 0$, $B_{1,y,z} = 0$, or in other words, $B_1$ exerts torque on the yz plane, tilting the equilibrium magnetization away from the z-axis toward the y-axis. Typically we apply a linearly polarized oscillating field. The linear polarization can be decomposed into two circularly polarized counter-rotating fields with a frequency difference of $2\omega$:

$B_{RF} = [B_{1,x}\cos(\omega t) + B_{1,y}\sin(\omega t)] + [B_{1,x}\cos(\omega t) - B_{1,y}\sin(\omega t)]$.

The first component is a counter-clockwise rotating component. We are interested in irradiation frequencies close to resonance, and at resonance $\omega = \omega_0$. Since $M$ is precessing clockwise under $B_0$, the counter-clockwise component is $2\omega_0$ away from resonance, and its influence on $M$ can be neglected. Thus $B_{RF}$ can be viewed as a circularly polarized field, where the polarization rotates at a frequency $\omega$: $B_{RF} = B_{1,x}\cos(\omega t) - B_{1,y}\sin(\omega t)$.
When a $B_{RF}$ field is applied, the total magnetic field $\vec{B}$ is thus:

$\vec{B} = \begin{pmatrix} B_1\cos(\omega t) \\ -B_1\sin(\omega t) \\ B_0 \end{pmatrix}$.

The Bloch equations thus assume the following form:

$\frac{dM_x}{dt} = \omega_1\sin(\omega t)M_z + \omega_0 M_y - \frac{M_x}{T_2}$,

$\frac{dM_y}{dt} = \omega_1\cos(\omega t)M_z - \omega_0 M_x - \frac{M_y}{T_2}$,

$\frac{dM_z}{dt} = -\omega_1\cos(\omega t)M_y - \omega_1\sin(\omega t)M_x - \frac{M_z - M_{z,eq}}{T_1}$,

where $\omega_1 = \gamma B_1$.
5.2.2 The Rotating Frame of Reference

This is a rather complicated picture, since the effective field $\vec{B}$ is a combination of a static magnetic field $B_0$ and a rotating field $B_1$. In order to simplify the picture, we move from the laboratory frame of reference to a frame of reference that moves along with the rotating field $B_1$, i.e. rotates clockwise at a frequency $\omega$. Since $B_1$ is perpendicular to $B_0$, and the frame of reference rotates at exactly the frequency of rotation of the RF field, both fields now appear to be static in this frame. Expressing the components of the transverse magnetization in the rotating frame, $M_{x'}$ and $M_{y'}$, in terms of the components in the laboratory frame yields:

$M_{x'} = M_x\cos(\omega t) - M_y\sin(\omega t); \qquad M_{y'} = M_x\sin(\omega t) + M_y\cos(\omega t)$.

Rewriting the Bloch equations for $M_{x'}$ and $M_{y'}$ gives:

$\frac{dM_{x'}}{dt} = (\omega_0 - \omega)M_{y'} - \frac{M_{x'}}{T_2}$,

$\frac{dM_{y'}}{dt} = \omega_1 M_z - (\omega_0 - \omega)M_{x'} - \frac{M_{y'}}{T_2}$,

$\frac{dM_z}{dt} = -\omega_1 M_{y'} - \frac{M_z - M_{z,eq}}{T_1}$.
In this frame, the effective magnetic field is now

$\vec{B}_{eff} = \begin{pmatrix} B_1 \\ 0 \\ B_0 - \omega/\gamma \end{pmatrix}$.

This is a static magnetic field that is the sum of the RF field $B_1$, which operates along the x'-axis, and a reduced static magnetic field $B_0 - \omega/\gamma$ that operates along the z-axis. The magnetic field along the z-axis appears reduced because the rotating frame now follows, at a frequency $\omega$, the magnetization, which precesses at a frequency $\omega_0$. The relative frequency between them is thus $\omega_0 - \omega$, and thus the static magnetic field seems "reduced." The precession of $M$ about the axis defined by the effective magnetic field is depicted in Fig. 4(a).
Fig. 4. The effective field in the rotating frame of reference (A) off resonance and
(B) on resonance.
At resonance, $\omega = \omega_0$ and the z component of $B_{eff}$ vanishes. This creates a particularly simple picture, where the motion of the magnetization is solely dictated by $B_1$. This is an extremely important achievement in our discussion, because it makes the description of the effects of pulse sequences on the magnetization extremely intuitive. In Fig. 4(b), the resonance condition in the rotating frame is described. As can be seen, with the application of $B_{1,x}$, $M$ will precess about the axis defined by $B_{1,x}$, i.e. in the zy plane, moving from the positive z-axis towards the positive y-axis and so on.
5.2.3 RF Pulses

RF can be applied in a constant manner (CW) or a transient one (pulses). An RF pulse will cause $M$ to precess around $B_1$, but only for the period of its duration. After the pulse has ended, $M$ will obey the Bloch equations where the field consists only of $B_0$. When an RF pulse is applied to $M$, the angle between $M$ and the positive z-axis achieved at the end of the pulse is defined as the flip, or tilt, angle. A simple formula for the tilt angle is $\theta = \gamma B_1 \tau$, where $\theta$ is the tilt angle, $B_1$ is the amplitude of the RF field, and $\tau$ is the pulse duration. Figure 5 describes a 90° tilt angle for $B_1$ applied on the x-axis, or 90°x, and a 180° pulse when $B_1$ is applied on the y-axis, or 180°y.
Fig. 5. The effect of (A) a 90 degree RF pulse when $B_1$ is along the x-axis; (B) a 180 degree RF pulse when $B_1$ is along the y-axis.
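As a numerical illustration of $\theta = \gamma B_1 \tau$ (the field amplitude below is an assumed, typical value, not one quoted in the text), one can solve for the duration of a rectangular 90° pulse:

    import math

    gamma = 2.6753e8          # 1H gyromagnetic ratio, rad s^-1 T^-1
    B1 = 5.9e-6               # assumed RF amplitude, tesla

    # theta = gamma * B1 * tau  ->  tau = theta / (gamma * B1)
    tau_90 = (math.pi / 2) / (gamma * B1)
    print(f"90-degree rectangular pulse: {tau_90*1e3:.2f} ms")  # ~1 ms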
5.3 THE FREE INDUCTION DECAY

The simplest pulse-NMR experiment involves an excitation of the magnetization by an RF pulse, and the detection of the precession of the magnetization about the $B_0$ axis in the absence of an RF field. The signal picked up by the RF receiver coil (which may, or may not, be the one used for the RF transmission) is the one induced by the oscillations of the x and y components of $M$. This signal is called the free induction decay, or the FID. The FID should be identical to the solution of the Bloch equations given previously. However, since the signal undergoes demodulation, in other words the RF frequency is subtracted from the actual frequency of the detected signal, the FID is analogous to the solution of the Bloch equations in the rotating frame. The solution is given by:

$S_{x'}(t) = S(0)\cos[(\omega_0 - \omega_{ref})t]\cdot\exp\left(-\frac{t}{T_2}\right)$,

$S_{y'}(t) = -S(0)\sin[(\omega_0 - \omega_{ref})t]\cdot\exp\left(-\frac{t}{T_2}\right)$,

where $S(0)$ is proportional to the projection of $M$ on the xy plane immediately after the RF pulse is given, and is thus proportional to $M_z\sin\theta$, where $\theta$ is the flip angle. The reference or demodulation frequency $\omega_{ref}$ is essentially the rotation frequency of the rotating frame. The projection is of course maximized when $\theta = 90°$. The FID is in fact a complex signal, and with a quadrature detection coil both the real and imaginary parts of the signal, separated by a phase of $\pi/2$, are detected. The FID for a simple signal that consists of one resonance at $\omega_0$ is thus a damped oscillation at a frequency $\omega_0 - \omega_{ref}$ with a decay time constant of $T_2$, as can be seen in Fig. 6.
5.3.1 The NMR Spectrum

The typical NMR experiment, and as we will see later, the MRI image, carries the FID data onto the frequency domain via the Fourier transformation. The Fourier transformation of the FID yields the real and imaginary parts of the NMR spectrum, also known as the absorption and dispersion modes. It should be noted that the phase
Fig. 6. The real part of the FID (off resonance).
associated with the detection of the FID is arbitrary, and thus can be modified in the processing phase to yield a "pure" absorption (in-phase) spectrum, or any desired phase. The explicit expressions for the real and imaginary parts of the NMR spectrum are given by:

$\hat{S}_{y'}(\omega) = \int_0^\infty S_{y'}(t)\exp(i\omega t)\,dt = \frac{S(0)\,T_2}{1 + T_2^2(\Delta\omega - \omega)^2}$,

$\hat{S}_{x'}(\omega) = \int_0^\infty S_{x'}(t)\exp(i\omega t)\,dt = \frac{S(0)\,T_2^2(\Delta\omega - \omega)}{1 + T_2^2(\Delta\omega - \omega)^2}$,

where $\Delta\omega = \omega_0 - \omega_{ref}$. The real and imaginary parts of the spectrum are seen in Fig. 7. The real part is a spectral line with a Lorentzian line shape. The full width at half maximum is inversely proportional to the characteristic decay time of the FID. Here, it is given by the relaxation constant $T_2$, and the relation between the full width at half maximum (FWHM) and $T_2$ is $\Delta\nu_{1/2} = 1/\pi T_2$. Later on, we will see that relaxation is enhanced by experimental factors that are not necessarily intrinsic to the sample, and thus a new term, $T_2^*$, will be added to describe the apparent transverse relaxation.
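The FID-to-spectrum relationship is easy to demonstrate numerically; this sketch (all parameter values are illustrative) synthesizes a single-resonance complex FID and Fourier transforms it, recovering an absorption-mode Lorentzian whose FWHM is close to $1/\pi T_2$:

    import numpy as np

    T2 = 0.05                           # assumed decay constant, s
    delta = 2 * np.pi * 40.0            # offset (omega_0 - omega_ref), rad/s
    dt, n = 1e-4, 2**16                 # dwell time (s) and number of points
    t = np.arange(n) * dt

    fid = np.exp(1j * delta * t - t / T2)             # complex FID
    spectrum = np.fft.fftshift(np.fft.fft(fid)) * dt  # approximates the continuous FT
    freq = np.fft.fftshift(np.fft.fftfreq(n, dt))     # Hz

    absorption = spectrum.real                        # Lorentzian absorption line
    above = freq[absorption > absorption.max() / 2]
    print(f"peak at {freq[np.argmax(absorption)]:.1f} Hz, "
          f"FWHM ~ {above.max() - above.min():.1f} Hz")  # ~40 Hz; ~6.4 Hz = 1/(pi*T2)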
Fig. 7. The real part or absorption mode (left) and the imaginary part or dispersion mode (right) of the NMR spectrum of a single resonance.
5.3.2 Relaxation in NMR

The NMR signal is governed by two distinct relaxation times. $T_1 = 1/R_1$ is the longitudinal relaxation time, which is dictated by the energy exchange between the system and the "lattice," which contains all other degrees of freedom to which the spin system is coupled. $T_1$ describes the rate of the return of $M_z$ to its equilibrium value, $M_{z,eq}$. $T_2 = 1/R_2$ is the transverse relaxation time. $T_2$ is associated with the loss of coherent spin motion in the xy plane, which results in a net decrease of $M_{xy}$ and ultimately in its vanishing. Both relaxation times are intimately related to molecular motion and the interactions of the spins with neighboring spins and their surroundings. Relaxation is induced by randomly fluctuating magnetic fields, typically associated with the modulation of nuclear interactions by the random, or stochastic, molecular motion.
5.3.2.1 $T_1$ Relaxation

The magnetization at equilibrium, $M_{eq}$, is governed by the distribution of the spins among the two magnetic states, α and β: $M_{eq} = M_\alpha - M_\beta$. At equilibrium, this distribution is given by the Boltzmann distribution. When the magnetization is out of equilibrium, what drives it back to equilibrium are fluctuations in the magnetic field whose frequency content is near $\omega_0$ and which thus allow for energy exchange. In the case of $T_1$, which operates on $M_z$, the fluctuations have to induce changes in $M_z$, and thus will be generated by fluctuations in $B_{x,y}$. Fluctuations in the magnetic field will be generated by interactions that are modulated by, e.g., molecular motion. Many of those mechanisms involve interactions between two neighboring spins. If the interaction between two spins depends on their orientation in the magnetic field, this interaction will be modulated, for example, by rotation of the molecule in which these spins are incorporated.
5.3.2.2 Example — the Dipole-Dipole Interaction

Two neighboring magnetic dipoles interact with each other (think of two magnets). The interaction strength when the two are in an external magnetic field $B_0$ depends, among other things, on the angle $\theta$ between the axis that connects the two dipoles and the magnetic field. Specifically, this interaction is proportional to

$\frac{1}{r_{ij}^3}\left(3\cos^2\theta_{ij} - 1\right)$,

where $r_{ij}$ is the internuclear distance between nuclei $i$ and $j$, and $\theta_{ij}$ is the angle described above. In liquids, random motion will
modulate both $r$ and $\theta$, and if the two nuclei belong to the same molecule (e.g. the two hydrogen atoms in a water molecule), $\theta$ is primarily modulated by molecular rotational motion. Rotation is a random motion, but a typical rotation time will be closely related to, e.g., the size of the molecule at hand: the larger the molecule, the slower its characteristic rotation time. The characteristic time for such random motion is given by a correlation time, $\tau_c$. If the characteristic motional/rotational time constant $\tau_c$ corresponds to a frequency $2\pi/\tau_c$ that is similar to $\omega_0$, a new kind of resonance is achieved between the Larmor precession and a random process (e.g. molecular rotation). This allows for energy exchange between the spin system and the "lattice," here characterized only by its rotational degree of freedom. This energy exchange is irreversible and leads to loss of energy in the spin system, which eventually returns to thermal equilibrium where $M = M_{eq}$. It is thus the rapport between $\tau_c$, governed among other things by the molecular size, and $\omega_0$ that primarily defines $T_1$ in most situations.
5.3.2.3 $T_2$ Relaxation

$T_2$ governs the decay of the x and y components of the magnetization. For argument's sake, let us assume $M = M_x$. The decay will result from random fluctuations in $B_y$ and $B_z$. Fluctuations in $B_y$ induce energy level changes since they act on $M_z$, similarly to what we saw previously. Only this time we do not have the contribution from $B_x$, and thus the energy-exchange component in $T_2$ is 1/2 of that of $T_1$. Fluctuations in $B_z$ are tantamount to randomly varying the Larmor frequency $\omega_0$. This broadening of the resonance from a single frequency $\omega_0$ to a distribution of frequencies will result in loss of coherence in the precession of the transverse components of $M$ about the z-axis. This irreversible phase loss will gradually decrease the magnitude of $M_{xy}$ until phase coherence is completely lost and $M_{xy} \to 0$. This effect increases with $B_0$, and thus $T_2$ decreases monotonically with $B_0$. $T_2$ is referred to as the transverse relaxation time or as the spin-spin relaxation time.
5.3.2.4 $T_2^*$ — The Effects of Field Inhomogeneity

As we saw earlier, $T_2$ contributes to the decay in the xy plane even when the external field is completely homogeneous. Inhomogeneity of $B_0$ will contribute further to the loss of coherence, or dephasing, of $M_{xy}$, simply because different parts of the sample "feel" a different field $B_0$, resulting in a spread of frequencies. The total amount of decay is given by an additional decay constant, $T_2^*$. The relaxation rate due to $T_2^*$ combines the contribution from the "pure" $T_2$ relaxation and that which stems from $B_0$ inhomogeneity: $1/T_2^* = 1/T_2 + 1/T_2'$, where $T_2'$ denotes the contribution to transverse relaxation from $B_0$ inhomogeneity.
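As a quick worked example of this rate addition (the values are assumed, for illustration only): with $T_2 = 80$ ms and an inhomogeneity contribution $T_2' = 50$ ms,

$\frac{1}{T_2^*} = \frac{1}{0.080\ \mathrm{s}} + \frac{1}{0.050\ \mathrm{s}} = 12.5 + 20 = 32.5\ \mathrm{s^{-1}}, \qquad T_2^* \approx 31\ \mathrm{ms}$,

so the observed decay is always faster than either contribution alone.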
5.3.2.5 Refocusing the Effects of Static $B_0$ Inhomogeneity — The Spin Echo

Erwin Hahn first noticed that if an excitation pulse is followed by another pulse after a time period $\tau$, a FID is regenerated after another $\tau$ period has elapsed from the second pulse, even though the original FID has completely vanished.^2 This phenomenon was later called the spin echo, and it became a staple feature in numerous pulse sequences, the most basic of which is used to measure $T_2$. The spin echo is best explained using the concept of two isochromats, or two spin populations with distinctly different resonance frequencies, $\omega_s$ and $\omega_f$, i.e. a "slow" and a "fast" frequency stemming from the different $B_0$ felt by these populations. A diagrammatic description of the sequence of events in a spin echo is given in Fig. 8. Following the first 90°(x) pulse, both isochromats create an initial transverse magnetization (a). After a period $\tau$, as a result of the frequency difference between the two, $\omega_s$ is lagging behind $\omega_f$, as seen in (b). If a 180°(x) pulse is given, the isochromats are flipped around the x-axis, and the "mirror image" of the two isochromats is such that now $\omega_s$ is in the lead and $\omega_f$ is lagging behind (c). After the same period $\tau$, the two isochromats will converge, or refocus, on the negative y-axis (d). The phase between the two isochromats that was created by the inhomogeneity of $B_0$ is now restored to 0. By generalization, the spin echo sequence refocuses phase loss that is due to static $B_0$
Fig. 8. Spin dynamics of two isochromats during a Hahn spin echo sequence: (A) immediately after excitation; (B) following a delay τ; (C) after the 180° (refocusing) pulse and (D) after a second delay τ.

inhomogeneities. One should note that phase losses due to $T_2$ relaxation are not restored, and neither are losses due to spin motion in an inhomogeneous $B_0$. Figure 9 shows the FID and the echo that is generated by a spin echo sequence. It should be noted that although the intensity of the echo is weighted by $T_2$, the envelopes of both the original FID and the echo still decay as a function of $T_2^*$.
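The refocusing logic can be demonstrated with a toy simulation (a sketch; the spread of off-resonance frequencies and the timing are arbitrary assumptions): each isochromat accumulates phase $\varphi = \Delta\omega\tau$, the 180°(x) pulse maps $\varphi \to -\varphi$, and after a second $\tau$ every phase returns to zero regardless of $\Delta\omega$:

    import numpy as np

    rng = np.random.default_rng(0)
    dw = rng.normal(0.0, 2 * np.pi * 50.0, size=10000)  # assumed off-resonance spread, rad/s
    tau = 0.01                                          # s

    phase = dw * tau              # dephasing during the first tau
    phase = -phase                # 180(x) pulse: phase conjugation
    phase += dw * tau             # rephasing during the second tau

    echo = np.abs(np.mean(np.exp(1j * phase)))   # net transverse coherence
    print(f"coherence at echo time: {echo:.3f}") # 1.000 (full refocusing)

Only static off-resonance is modeled here; true $T_2$ losses and motion through gradients would not refocus, as the text notes.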
5.3.2.6 The Effect of $T_1$

Since $T_1$ operates on the z-axis, its effects are not directly visible in the FID, or the NMR spectrum for that matter. Since the amount of

Fig. 9. The FID and echo formation for a Hahn spin echo.
magnetization available for detection is dictated by $M_z$, the intensity of the detected signal will depend on how large $M_z$ was prior to the excitation pulse. If the time between subsequent excitations, also known as TR (time-to-repetition), is too short to let $M_z$ from the previous excitation reach its equilibrium value $M_{z,eq}$, then a reduction in signal intensity occurs. This reduction is more severe for spin populations with a longer $T_1$, and this is the basis for obtaining $T_1$-based contrast in MR images. Since $T_1$ affects $M_z$, an inversion pulse (a 180° pulse) applied first to the sample inverts the magnetization to yield $M(0) = -M_{z,eq}$. From this point on, the magnetization, which does not possess transverse components, will relax along the z-axis according to $M_z(t) = M_{z,eq}\left(1 - 2\exp\left(-\frac{t}{T_1}\right)\right)$ (see the solution for $M_z$ in the section on the Bloch equations). To make the magnetization detectable, another pulse, a detection pulse, is needed to flip the magnetization to the xy plane. This is the inversion-recovery sequence, used both for measuring $T_1$ in a sample as well as for generating contrast based on $T_1$ and on other mechanisms that will be briefly mentioned later.
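One practical consequence of the inversion-recovery expression is the null point: setting $M_z(t) = 0$ gives $t_{null} = T_1\ln 2$. A small sketch (the tissue $T_1$ values here are assumed round numbers, not values from this chapter):

    import math

    def null_time(T1):
        """Inversion time at which Mz(t) = Mz_eq*(1 - 2*exp(-t/T1)) crosses zero."""
        return T1 * math.log(2)

    for name, T1 in [("assumed white matter", 0.8), ("assumed CSF", 4.0)]:
        print(f"{name}: T1 = {T1:.1f} s -> null at TI = {null_time(T1)*1e3:.0f} ms")

Choosing the inversion time TI at a tissue's null point suppresses that tissue's signal, which is the idea behind fluid- or fat-nulling inversion-recovery sequences.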
5.4 SPATIAL ENCODING — DIFFERENTIATING THE NMR SIGNAL ACCORDING TO ITS SPATIAL ORIGINS

Let us revisit some of the simplest principles we know so far through a simple example: in a homogeneous field, a couple of test tubes filled with water, set apart from each other on the x-axis, will generate a single peak, whose frequency is dictated by $B_0$ and $\gamma$: $\omega_0 = \gamma B_0$. In other words, the FID (and thus the spectrum) will have one characteristic frequency, defined by the chemical species in our sample (e.g. water protons). As long as the field $B_0$ is homogeneous, $\omega_0$ is constant across the sample.

If variability is introduced in $B_0$ along a certain axis, e.g. the x-axis, the same variability will be expressed in $\omega_0$, and each point in space with the same x coordinate will have the same $\omega_0$, only that now $\omega_0 = \omega_0(x)$. The simplest such variability is one that is linear with distance from an arbitrary point — a linear magnetic field gradient
(MFG). It should be emphasized that in the MR convention, the $B_0$ field is always oriented along the z-axis, but the variation in $B_0$ is linear along the axis of choice. The resulting magnetic field in the presence of a MFG, e.g. along the x-axis, is then:

$B_0(x) = B_0(0) + \Delta B_0(x) = B_0(0) + \frac{dB_0}{dx}x = B_0(0) + g_x \cdot x$.

$g_x$ thus has units of G·cm$^{-1}$ (cgs), and it is the slope of the variation of $B_0$ with $x$.
5.4.1 Acquisition in the Presence of a MFG

As can be seen in Fig. 10, application of a MFG on the x-axis assigns a resonance frequency to each position on that axis. In other words, the FID now consists of a range of resonance frequencies, and this range is a result of the variability of $B_0$ across the sample. Acquisition of a FID in the presence of the MFG, and a following FT, yields a spectrum in which frequency is proportional to position on the x-axis. The intensity of the "peak" is proportional to the total $M(0)$ at that specific location on the x-axis. This 1D image is thus a projection of the 3D spin distribution in our sample onto the x-axis.

Fig. 10. The sample in the presence of a magnetic field gradient. Each point in the sample along the x-axis feels a different magnetic field (left). The result (right) is a frequency encoded one-dimensional image (projection) of the object.
Historically speaking, Paul Lauterbur (Nature, 1973) first suggested
the use of MFG for spatial encoding. His idea was to measure
the projections in different radial directions, and reconstruct the
object from them (projection-reconstruction). He called his method
Zeugmatography.
5.4.2 MFG, Spectral Width and Field-of-View

The range of frequencies, or the spectral width SW, that is spanned by the MFG is related to the spatial range D on the axis of interest through the gradient strength and the gyromagnetic ratio. In the case where the gradient is applied along the x-axis, $SW(x) = \gamma \cdot g(x) \cdot D(x)$. The implication is that the stronger the gradient, the broader the frequency span for a specific desired field-of-view (FOV) on a desired axis. Conversely, increasing the FOV on a specific axis increases the SW on that axis.
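Numerically (the gradient strength and FOV below are assumed, typical values), using $\gamma/2\pi \approx 42.58$ MHz/T for protons:

    gamma_bar = 42.58e6     # 1H gyromagnetic ratio / 2pi, Hz/T
    g = 10e-3               # assumed gradient strength, T/m
    fov = 0.25              # assumed field of view along x, m

    sw = gamma_bar * g * fov        # spectral width, Hz
    print(f"SW = {sw/1e3:.1f} kHz") # ~106.5 kHz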
5.4.3 Another Way to Look at the Effect of MFG

If two locations a and b on an object along the gradient axis are designated $x_a$ and $x_b$, respectively, then in the rotating frame the frequencies generated at those two locations in the presence of a MFG are two non-identical frequencies, $\omega_a$ and $\omega_b$, respectively. By doubling the gradient strength, the frequencies are also doubled, to become $2\omega_a$ and $2\omega_b$, respectively. This means that the evolution of the FID in the presence of a magnetic field gradient, or more specifically, of the phase of each frequency component of the FID, is a function of both time (t) and the gradient strength (g). Twice the time — twice the phase for the same g; twice the gradient strength — twice the phase for the same t. Thus the detected signal $S$ is $S(\gamma, t, g)$. A new variable $k$ can be introduced, which has units of cm$^{-1}$ (inverse space). $k$ is defined as $k = \gamma\int g(t)\,dt$, and if the gradient is constant with time, then $S = S(k) = S(\gamma \cdot g \cdot t)$. In a similar way in which the time and frequency domains are related to each other via the Fourier transformation, so are the coordinate vector $\vec{r}$ and the vector $\vec{k}$:

$f(\vec{r}) = 2\pi\int F(\vec{k})\exp(i\vec{k}\cdot\vec{r})\,d\vec{k}, \qquad F(\vec{k}) = 2\pi\int f(\vec{r})\exp(-i\vec{k}\cdot\vec{r})\,d\vec{r}$,

or in other words, k-space and coordinate space are Fourier conjugates.

The effect that gradients and RF pulses have on the phase of the transverse magnetization can thus be efficiently described as a trajectory in k-space. It is instructive to consider the case where a slice-selective excitation pulse (e.g. a 90°x pulse) is applied to the equilibrium magnetization, and only manipulations of the magnetization in the xy plane are considered. This covers the common situation of trajectories in a 2D k-space, encountered in all multislice schemes. At t = 0 (right after the excitation pulse), $M_{xy}$ is in phase and thus $k_x = k_y = 0$.

This is all very similar to spectroscopy, only that instead of having different resonance frequencies that originate from different chemical shifts, the different frequencies originate from different positions in a nonhomogeneous field!
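The Fourier-conjugate relationship is the essence of MRI reconstruction, and it can be demonstrated in one dimension (a sketch with an assumed two-tube object; units are arbitrary): sampling S(k) at uniform k steps and applying the conjugate transform recovers the projection of the object:

    import numpy as np

    # Assumed 1D object: two "test tubes" of water along a unit field of view
    n = 256
    x = np.arange(n) / n
    rho = ((np.abs(x - 0.3) < 0.05) | (np.abs(x - 0.7) < 0.05)).astype(float)

    # Sample S(k) = sum_x rho(x) exp(-i k x) at uniform steps k = 2*pi*q
    k = 2 * np.pi * np.arange(n)
    E = np.exp(-1j * np.outer(k, x))      # encoding matrix (a DFT here)
    S = E @ rho                           # simulated k-space data

    # The conjugate transform recovers the projection of the object
    recon = (E.conj().T @ S).real / n
    print(np.allclose(recon, rho))        # True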
5.4.4 Flexibility in Collecting Data in K-Space

Since $S = S(k) = S(\gamma \cdot t \cdot g)$, the signal (encoded for position on, e.g., the x-axis) can be acquired in two different ways:

Keep the gradient constant: let the signal evolve with time, and sample the FID at different time points. This is typically referred to as frequency encoding.

Keep the time constant: sample the FID following application of short gradients of the same duration, but with different gradient strengths. This is referred to as phase encoding.

Thus, if a rectangular portion of k-space needs to be sampled, a logical way to achieve this goal is to acquire the data following sequential excitations of the spin system, where each excitation is frequency encoded in one direction (say, the x-axis) and phase encoded in the perpendicular direction (e.g. the y-axis).
5.4.5 The Gradient Echo

Typically, in order to allow for efficient time management of other pulse sequence elements and the acquisition of a full echo signal, the magnetization is first dephased along the frequency encoding direction, and then rephased using a gradient of opposite polarity. If the dephasing gradient amplitude equals the rephasing gradient amplitude, then magnetization components that gained phase, created by a local field $B_0 - \Delta B$ for a time period $t$, will recover the same phase if the local field at the same point is $B_0 + \Delta B$ for the same time $t$. More generally, the refocusing condition is that the area of the dephasing gradient be equal and opposite to that of the rephasing gradient:

$\int_{t(start)}^{t(end)} g_{deph.}\,dt = -\int_{t(start)}^{t(end)} g_{reph.}\,dt$.

This allows for flexibility in choosing the time it takes the magnetization to refocus. The time between excitation and refocusing is the gradient echo time (TE). The principle of the gradient echo is illustrated in Fig. 11.

Fig. 11. The gradient echo.
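The area-balance condition is simple bookkeeping, as in this sketch (rectangular lobes assumed; the amplitudes and timings are illustrative values):

    # Rectangular dephasing lobe: amplitude -10 mT/m applied for 2 ms
    g_deph, t_deph = -10e-3, 2e-3          # T/m, s
    area_deph = g_deph * t_deph            # T.s/m

    # Readout (rephasing) lobe at +10 mT/m: the echo forms when its
    # accumulated area cancels the dephasing area
    g_read = 10e-3
    t_echo_center = -area_deph / g_read
    print(f"echo center {t_echo_center*1e3:.1f} ms into the readout")  # 2.0 ms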
5.4.6 Encoding for the Third Dimension: Slice Selection or Additional Phase Encoding

There are two main options for spatially encoding the out-of-plane dimension. One is to add a phase-encoding loop on the third dimension. This choice is popular with imaging modalities that aim for high spatial resolution. The other option is to combine frequency selective RF pulses with magnetic field gradients for a spatially selective excitation. For example, for an RF pulse with a sinc-shaped envelope $\frac{\sin\tau}{\tau}$, the frequency response function is rectangular, with a bandwidth of $1/\tau$. In the presence of a gradient, the external magnetic field is given by $B(x) = B(0) + \Delta B = B(0) + g \cdot x$. The range of the magnetic field $\Delta B$ can be expressed in terms of a range of frequencies $\Delta\omega/\gamma$, and it is this range of frequencies that is in resonance with those contained in the sinc pulse bandwidth. This provides a simple relationship between the bandwidth of the pulse, the gradient applied in conjunction with the pulse and the spatial extent of the excitation, or slice thickness: $BW = \gamma g \Delta x$, where BW is the bandwidth of the RF pulse, $\gamma$ is the gyromagnetic ratio, $g$ is the gradient strength and $\Delta x$ is the slice thickness. The carrier frequency of the RF pulse can be adjusted to shift the location of the center of the slice. In a typical multislice MRI experiment, the slice location is varied cyclically, and in order to avoid artifacts associated with slice overlaps, the cycle is performed on odd and even slices sequentially.
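For example (assumed, typical numbers), a pulse bandwidth of 1 kHz together with a 5 mT/m slice-select gradient gives:

    gamma_bar = 42.58e6      # 1H gyromagnetic ratio / 2pi, Hz/T
    bw = 1000.0              # assumed RF pulse bandwidth, Hz
    g = 5e-3                 # assumed slice-select gradient, T/m

    # BW = gamma_bar * g * dx  ->  dx = BW / (gamma_bar * g)
    dx = bw / (gamma_bar * g)
    print(f"slice thickness = {dx*1e3:.1f} mm")   # ~4.7 mm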
5.4.7 Intraslice Phase Dispersion

One problem associated with slice selection stems from the fact that due to the presence of the gradient during the application of the RF pulse, a range of frequencies is being created. This means that except for the one frequency $\omega_0$, i.e. the reference frequency for that particular nucleus, all other frequencies excited by the pulse are off resonance. Off-resonance effects are marginal when the pulse duration is short with respect to the inverse of the frequency offset caused by the gradients, but this is typically not the case. The result is that frequency components within the slice gain a phase component that is proportional to their distance from the point in the slice excited by $\omega_0$ (typically the center of the slice for symmetric frequency responses). This in turn causes signal loss, which can at times be quite severe. In order to refocus this phase dispersion, a gradient with the opposite polarity to that of the slice selection gradient is applied immediately at the end of the pulse. It can be shown that for complete refocusing the following condition has to be met: $S(g_{refocusing}) = S(g_{slice\ selection})/2$, where $S$ is the "area," or the integral over time, of the given gradient.
5.4.8 A Complete Pulse Sequence

The first MRI pulse sequence that incorporated all three elements of spatial encoding (frequency encoding, phase encoding and slice selection) was the "spin warp," suggested by W Edelstein in 1980. The schematics of the spin warp are given in Fig. 12. Many of the MRI pulse sequences that were subsequently developed are conceptually similar to the spin warp. Notable exceptions are sequences that are based on a single excitation, such as echo planar imaging (EPI) and multiple spin-echo sequences.

Fig. 12. The spin warp pulse sequence.
5.4.9 Contrast in MRI Sequences
Contrast is the visual differentiation between different parts of the image. In MRI, contrast is typically based on a physical property related to a specific spin population. The physical property is thus called a contrast mechanism. Contrast can be based on relaxation properties: T₁, T₂, T₂*. Additionally, contrast can be based on spin mobility: flow (e.g. blood flow), perfusion (mobility of water through the capillary bed into tissue), and self-diffusion (random motion of water molecules). A different type of contrast is based on chemical environment effects: proton or water chemical exchange between different environments (e.g. binding sites on heavy macromolecules) gives rise to contrast through the magnetization transfer mechanism.
Relaxation-based contrast is the most basic way to obtain contrast in MRI. The contrast is achieved by sensitizing the image to one (or more) of the relaxation mechanisms previously mentioned. By examining the simple spin-warp sequence, it is already possible to get a sense of how contrast is achieved. First, since this is a gradient echo sequence, the image will be primarily T₂*-weighted: the intensity of the echo is given by S(TE) = S(0) exp(−TE/T₂*). This is not entirely correct, since there are two other main factors that influence the contrast. One is explicitly present in the equation above: S(0), or spin density. The other is caused by the finite time between consecutive excitations (TR), which affects the amount of longitudinal magnetization available for the next excitation. This contrast is based on T₁ and becomes more pronounced as TR becomes shorter or as the flip angle is closer to 90°. The possibility of obtaining contrast based on T₂ rests on the introduction of a spin-echo element in the pulse sequence. This can be easily done by inserting a 180° pulse between the excitation and the center of the acquisition, and accounting for the polarity of the gradient echo gradients. This modification converts the spin-warp sequence into a T₂-weighted sequence. Other mechanisms will be described in detail in other chapters of this book.
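To make the interplay of TE, TR, T₁ and T₂* concrete, here is a small sketch that evaluates a standard relaxation-weighted gradient-echo signal model; it assumes a 90° flip angle, and the tissue values are illustrative, not figures from the text:

```python
import numpy as np

# S = S0 * (1 - exp(-TR/T1)) * exp(-TE/T2star), assuming a 90-degree flip.
def gre_signal(s0, t1, t2star, tr, te):
    return s0 * (1.0 - np.exp(-tr / t1)) * np.exp(-te / t2star)

# two hypothetical tissues (all times in ms)
tissue_a = gre_signal(s0=1.0, t1=900.0, t2star=50.0, tr=500.0, te=30.0)
tissue_b = gre_signal(s0=0.9, t1=600.0, t2star=45.0, tr=500.0, te=30.0)
print(f"contrast (A - B): {tissue_a - tissue_b:+.3f}")
```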
5.4.10 Echo Planar Imaging (EPI)
In our review of pulse sequence principles, we assumed that each k-space line requires a separate excitation. This puts a severe limit on the minimum time required for obtaining an image: the need to introduce a delay between excitations (TR) is the single most time consuming element in the entire pulse sequence. P Mansfield⁵ suggested the possibility of obtaining an image with a single excitation. The trick is to find a trajectory in k-space that will cover the portion of k-space we are interested in. The way this is done is demonstrated in Fig. 13. Following the excitation pulse, pre-encoding gradients are applied in the phase encoding and read-out (frequency encoding) directions (a). From then on, the read-out gradients switch polarity back and forth to allow for a "zig-zag" scan of k-space. Phase-encoding "blips" are introduced between read-out lines to bring the magnetization to consecutive k-space lines. Since the read-out time is typically very short (on the order of 1 ms–2 ms), the acquisition of an entire image of a single slice typically takes less than 100 ms. An entire volume that consists of several slices can thus be acquired in a couple of seconds. EPI is thus the natural sequence to be used in applications where temporal (and not spatial) resolution is required.
Fig. 13. The EPI pulse sequence.
The most notable application of this class is that of functional MRI. In the version of EPI shown above, the signal is T₂*-weighted. The introduction of a 180° pulse, similar to what has been previously mentioned, will result in a T₂-weighted EPI image.
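A back-of-the-envelope sketch of the EPI timing argument above, with assumed line counts and read-out times:

```python
# Single-shot EPI timing; the line count, per-line read-out time and slice
# count are assumptions for illustration.
n_lines = 64            # phase-encoding lines per image
t_readout = 1.5e-3      # time per read-out line, s (text: ~1-2 ms)
t_image = n_lines * t_readout
n_slices = 20
print(f"one slice:  {t_image*1e3:.0f} ms")        # well under 100 ms
print(f"one volume: {n_slices * t_image:.2f} s")  # a couple of seconds
```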
5.5 CONCLUSION
In this chapter, we developed the theoretical foundations of NMR and subsequently of MRI. The principles of nuclear magnetic resonance, nuclear relaxation, spatial encoding and MRI image contrast have been discussed and amply illustrated. This basis should give the reader a strong tool for understanding the sophisticated applications of MRI in the biomedical sciences, such as functional MRI of the brain using the blood oxygenation level dependent (BOLD) effect, diffusion weighted and diffusion tensor imaging (DTI), and more.
References
1. Bloch F, Nuclear induction, Phys Rev 70(7–8): 460, 1946.
2. Hahn EL, Spin echoes, Phys Rev 80: 580–594, 1950.
3. Edelstein WA, Hutchison JMS, Johnson G, Redpath T, Spin warp NMR
imaging and applications to human whole body imaging, Phys Med Biol
25: 751–756, 1980.
4. Lauterbur PC, Image formation by induced local interactions: Examples
employing nuclear magnetic resonance, Nature 242: 190–191, 1973.
5. Mansfield P, Multi-planar image formation using NMR spin echoes,
J Phys C 10: L55–L58, 1977.
6. Purcell EM, Torrey HC, Pound RV, Resonance absorption by nuclear
magnetic moments in a solid, Phys Rev 69(1–2): 37, 1946.
CHAPTER 6
Principles of Ultrasound Imaging
Modalities
Elisa Konofagou
Despite the fact that medical ultrasound preceded MRI and PET, ongoing advances have allowed it to continuously expand as a field in its numerous applications. In the past decade, with the advent of faster processing, specialized contrast agents, a better understanding of nonlinear wave propagation, novel real-time signal and image processing, and complex ultrasound transducer manufacturing, ultrasound imaging and ultrasound therapy have enjoyed a multitude of new features and clinical applications. Ultrasound has thus become a very powerful imaging modality, mainly due to its unique temporal resolution, low cost, nonionizing radiation and portability. Lately, unique features such as harmonic imaging, coded excitation, 3D visualization and elastic imaging have added to the higher quality and wider range of applications of diagnostic ultrasound images. In this chapter, a short overview of the fundamentals of diagnostic ultrasound and a brief summary of its many applications and methods are provided. The first part of this chapter provides a short background on ultrasound physics, and the second part constitutes a short overview of ultrasound imaging and image formation.
6.1 INTRODUCTION
Sounds with a frequency above 20 kHz are called ultrasonic, since they occur at frequencies inaudible to the human ear. When ultrasonic waves are emitted in short bursts, propagate through media with low reflection coefficients (such as water), and are reflected by obstacles along their propagation path, detecting the reflection, or echo, can help localize the obstacle. This principle is used by sonar (SOund NAvigation and Ranging) and inherently by marine mammals, such as dolphins and whales, to localize prey, obstacles or predators. In fact, the frequencies used for "imaging" vary significantly depending upon the application: from underwater sonar (up to 300 kHz), diagnostic ultrasound (1 MHz–40 MHz), therapeutic ultrasound (0.8 MHz–4 MHz) and industrial nondestructive testing (0.8 MHz–20 MHz) to acoustic microscopy (up to 2 GHz).
6.2 BACKGROUND
6.2.1 The Wave Equation
As the ultrasonic wave propagates through the tissue, its energy and momentum are transferred to the tissue. No net transfer of mass occurs at any particular point in the medium unless it is induced by the momentum transfer. As the ultrasonic wave passes through the medium, the peak local pressure in the medium increases. The oscillations of the particles result in harmonic pressure variations within the medium and in a pressure wave that propagates through the medium as neighboring particles move with respect to one another (Fig. 1). The particles of the medium can move back and forth in a direction parallel (longitudinal wave) or perpendicular (transverse wave) to the traveling direction of the wave.
Fig. 1. Particle displacement and particle distribution for a traveling longitudinal wave. The direction of propagation is from left to right, namely the longitudinal (or, axial) direction. A shear wave can be created in the perpendicular direction, in which case the particles would also be moving in a direction orthogonal to the direction of propagation (not shown here).
Fig. 2. A small volume of the medium of impedance Z (1) at equilibrium and (2) undergoing oscillatory motion when an oscillatory force F is applied.
Let's consider the first case.
Assume that a small volume of the medium, which can be modeled as a nonviscous fluid (no shear waves can be generated), is shown in Fig. 2. An applied force δF produces a displacement of u + δu at the right-hand side of the small volume. A gradient of force ∂F/∂z is thus generated across the element in question and, assuming that the element is small enough that the measured quantities within the medium are constant, the gradient can be taken as linear, or:

δF = (∂F/∂z) δz,   (1)

and according to Hooke's Law,

F = KS (∂u/∂z),   (2)

where K is the adiabatic bulk modulus of the liquid and S is the area of the region on which the force is exerted. By taking the derivative of both sides of Eq. 2 with respect to z and following Newton's Second Law, from Eq. 1 we obtain the so-called "wave equation":
∂²u/∂z² − (1/c²) ∂²u/∂t² = 0,   (3)

where c is the speed of sound, given by c = √(K/ρ) = √(1/(ρκ)), where ρ is the density of the medium and κ is the compressibility of the medium. Equation 3 relates the second differential of the particle displacement with respect to distance to the acceleration of a simple harmonic oscillator. Note that the average speed of sound in most soft tissues is about 1540 m/s, with a total range of ±6%. For the shear wave derivation of this equation, please refer to Wells¹ or Kinsler and Frey,² among others.
The solution of the wave equation is given by a function u, where:

u = u(ct − z).   (4)

An appropriate choice of function for u in Eq. 4 can be:

u(t, z) = u₀ exp[jk(ct − z)],   (5)

where k is the wavenumber, equal to 2π/λ, with λ denoting the wavelength (Fig. 1).
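A small numerical sketch of the speed of sound from Eq. 3 and the plane-wave solution of Eq. 5, using water-like values (illustrative only):

```python
import numpy as np

# c = sqrt(K/rho) (Eq. 3) and u(t, z) = u0 * exp[jk(ct - z)] (Eq. 5).
K = 2.2e9        # adiabatic bulk modulus, Pa (roughly water-like)
rho = 1000.0     # density, kg/m^3
c = np.sqrt(K / rho)           # ~1483 m/s; soft tissue averages ~1540 m/s

f = 5e6                        # a 5 MHz wave
lam = c / f                    # wavelength (cf. Eq. 12 below)
k = 2 * np.pi / lam            # wavenumber
z = np.linspace(0, 4 * lam, 9)
u = 1e-9 * np.exp(1j * k * (c * 0.0 - z))   # displacement snapshot at t = 0
print(f"c = {c:.0f} m/s, wavelength = {lam * 1e3:.3f} mm")
print(np.real(u[:3]))          # first few samples of the real displacement
```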
6.2.1.1 Impedance, Power and Reflection
The pressure wave that results from the displacement generated and given by Eq. 5 is given by:

p(t, z) = p₀ exp[jk(ct − z)],   (6)

where p₀ is the pressure wave amplitude and j is equal to √−1. The particle speed and the resulting pressure wave are related through the following relationship:

u = p/Z,   (7)

where Z is the acoustic impedance, defined as the ratio of the acoustic pressure wave at a point in the medium to the speed of the particle at the same point. The impedance is thus characteristic of the medium and given by:

Z = ρc.   (8)
The acoustic wave intensity is defined as the average flow of energy through a unit area in the medium perpendicular to the direction of propagation.² By following that definition, the intensity can be found equal to³:

I = p₀²/(2Z),   (9)

and is usually measured in units of mW/cm² in diagnostic ultrasound.
A first step toward understanding the generation of ultrasound images is to follow the interaction of the propagating wave with the tissue. Owing to the varying mean acoustic properties of tissues, a wave transmitted into the tissue will be partly reflected at areas where the properties of the tissue, and thus its impedance, are changing. These areas constitute a so-called "impedance mismatch" (Fig. 3).
Fig. 3. An incident wave at an impedance mismatch (interface): A reflected and a transmitted wave with certain velocities and pressure amplitudes are created, ensuring continuity at the boundary.
The reflection coefficient R of the pressure wave at an incidence angle ϑᵢ is given by:

R = pᵣ/pᵢ = (Z₂ cos ϑₜ − Z₁ cos ϑᵢ)/(Z₂ cos ϑₜ + Z₁ cos ϑᵢ),   (10)
where ϑₜ is the angle of the transmitted wave (Fig. 3), also related to the incidence angle through Snell's Law:

λ₁ cos ϑᵢ = λ₂ cos ϑₜ,   (11)

where λ₁ and λ₂ are the wavelengths of the waves in medium 1 and medium 2, respectively, and are related to the speeds in the two media through:

c = λf,   (12)

where f is the frequency of the propagating wave.
As Fig. 3 also shows, the wave impingent upon the impedance mismatch also generates a transmitted wave, i.e. a wave that propagates through. The transmission coefficient is defined as:

T = pₜ/pᵢ = 2Z₂ cos ϑᵢ/(Z₂ cos ϑᵢ + Z₁ cos ϑₜ).   (13)
According to the parameters reported by Jensen³ on the impedance and speed of sound of air, water and certain tissues, the reflection coefficient at a fat-air interface is equal to −99.94%, showing that virtually all of the energy incident on the interface is reflected back in tissues such as the lung. A more realistic example found in the human body is the muscle-bone interface, where the reflection coefficient is 49.25%, demonstrating the challenges encountered when using ultrasound for the investigation of bone structure. On the other hand, given the overall similar acoustic properties of different soft tissues, the reflection coefficient between soft tissue structures is quite low, ranging only between −10% and 0.
The values mentioned above determine both the interpretation of ultrasound images, or sonograms, and the design of transducers, as discussed in the sections below.
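The following sketch evaluates Eqs. 10 and 13 at normal incidence. The impedance values are illustrative textbook figures, not values from this chapter (bone impedance varies widely in the literature; 5.0 MRayl is chosen here so the muscle-bone case reproduces the 49.25% quoted above):

```python
# Reflection and transmission coefficients at normal incidence (cosines = 1).
def reflection(z1, z2):
    return (z2 - z1) / (z2 + z1)

def transmission(z1, z2):
    return 2 * z2 / (z2 + z1)   # pressure transmission can exceed 100%

z_air, z_fat, z_muscle, z_bone = 0.0004, 1.38, 1.70, 5.00   # MRayl, assumed
print(f"fat-air:     R = {reflection(z_fat, z_air) * 100:+.2f}%")      # ~ -99.94%
print(f"muscle-bone: R = {reflection(z_muscle, z_bone) * 100:+.2f}%")  # ~ +49.25%
print(f"muscle-bone: T = {transmission(z_muscle, z_bone) * 100:+.2f}%")
```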
6.2.1.2 Tissue Scattering
In the previous section, the notions of reflection, transmission and propagation were discussed in the simplistic scenario of plane wave propagation and its impingement on plane boundaries. In tissues, however, such a situation is rarely encountered. In fact, tissues are constituted by cells and groups of cells that serve as complex boundaries to the propagating wave. As the wave propagates through all these complex structures, reflected and transmitted waves are generated at each one of these interfaces, dependent on the local density, compressibility and absorption of the tissue. The groups of cells are called "scatterers" as they scatter acoustic energy. The backscattered field, or what is "scattered back" to the transducer, is used to generate the ultrasound image. In fact, the backscattered echoes are usually coherent and can be used as "signatures" of tissues that are, e.g., in motion or under compression, as applied in elasticity imaging methods.
An example of such an ultrasound image can be seen in Fig. 4. The capsule of the prostate is shown to have a strong echo, mainly due to the high impedance mismatch between the surrounding medium, gel in this case, and the prostate capsule. However, the remaining area of the prostate is depicted as a grainy region surrounding the fluid-filled area of the urethra (the dark, or low scattering, area in the middle of the prostate). This grainy appearance is called "speckle," a term borrowed from the laser literature.⁴
Fig. 4. Sonogram of (A) an in vitro canine prostate and (B) its corresponding anatomy at the same plane as that scanned.
Speckle
is produced by the constructive and destructive interference of the scattered signals from structures smaller than the wavelength; hence the appearance of bright and dark echoes, respectively. So, speckle does not necessarily relate to a particular structure in the tissue. Given its statistical nature, in its simplest representation, the amplitude of speckle has been represented as having a Gaussian distribution with a certain mean and variance.⁵ In fact, these same parameters have been used to show that the signal-to-noise ratio of an ultrasound image is fundamentally limited to only 1.91.⁵ As a result, in the past, several authors have tried different speckle cancellation techniques⁶ in an effort to increase the image quality of diagnostic ultrasound. However, speckle offers one important advantage that has rendered it vital in the current applications of ultrasound (Sec. 5). Despite being described solely by statistics, speckle is not a random signal. As mentioned earlier, speckle is coherent, i.e. it preserves its characteristics when shifting position. Consequently, motion estimation techniques that can determine anything from blood flow to tissue elasticity are made possible in a field that is widely known as "speckle tracking."
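A minimal sketch of the speckle-tracking idea on a synthetic 1D RF line: the displacement is estimated from the peak of the cross-correlation between two acquisitions. Real implementations use windowed, subsample estimators; this toy uses a circular shift for simplicity:

```python
import numpy as np

rng = np.random.default_rng(0)
rf_pre = rng.standard_normal(512)          # stand-in for a speckle RF line
true_shift = 7
rf_post = np.roll(rf_pre, true_shift)      # same speckle pattern, displaced

xcorr = np.correlate(rf_post, rf_pre, mode="full")
est_shift = int(np.argmax(xcorr)) - (len(rf_pre) - 1)
print(f"estimated shift: {est_shift} samples (true: {true_shift})")
```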
6.2.1.3 Attenuation
As the ultrasound wave propagates inside the tissue, it undergoes a loss of power dependent on the distance traveled in the tissue. Attenuation of the ultrasonic signal can be attributed to a variety of factors, such as divergence of the wavefront, reflection at planar interfaces, scattering from irregularities or point scatterers, and absorption of the wave energy.⁷ In this section, we will concentrate on the latter, it being the strongest factor in soft (other than lung) tissues. In this case, the absorption of the wave's energy leads to a heat increase. The actual cause of absorption is still relatively unknown, but simple models have been developed to relate the decrease in wave pressure amplitude to the viscosity of the tissue.⁸
Without going into detail concerning the derivation of such a relationship, an explanation of the phenomenon is provided here. Let's consider a fluid with a certain viscosity that provides a certain resistance to a wave propagating through its different layers. In order to overcome the resistance, a certain force per unit area, or pressure, needs to be applied that is proportional to the shear viscosity of the fluid η as well as to the spatial gradient of the velocity,⁷ or:

p ∝ η ∂u/∂z.   (14)
Equation 14 shows that a fluid with higher viscosity will require a higher force to experience the same velocity gradient compared to a less viscous fluid. By considering Eqs. 2 and 14, an extra term can be added to the wave equation that includes both the viscosity and the compressibility of the medium,⁷ or:

∂²u/∂z² + (4η/3 + ξ)κ ∂³u/(∂z²∂t) − (1/c²) ∂²u/∂t² = 0,   (15)
where ξ denotes the dynamic coefficient of compressional viscosity. The solution to this equation is given by:

u(t, z) = u₀ exp(−αz) exp[jk(ct − z)],   (16)

where α is the attenuation coefficient, also given by (for α ≪ k):

α = (4η/3 + ξ) k²/(2ρc).   (17)
From Eq. 16, the effect of attenuation on the amplitude of the wave is clearly depicted (Fig. 5). An exponential decay of the envelope of the pressure wave, highly dependent on the distance, results from the tissue attenuation. The intensity of the wave will decrease at twice the rate, given that, from Eq. 9:

I(t, z) = (p₀²/Z) exp(−2αz) exp[2jk(ct − z)],   (18)

or, the average intensity is equal to:

I = I₀ exp(−2αz).   (19)
Fig. 5. The attenuated wave of Fig. 1. Note that the envelope of the wave is dependent on the attenuation of the medium.
Another important effect that the tissue attenuation can have on the propagating wave is a frequency shift. This is because a more complex form for the attenuation α is:

α = β₀ + β₁f,   (20)

where β₀ and β₁ are the frequency-independent and frequency-dependent attenuation coefficients. In fact, the frequency-dependent term is the largest source of attenuation and increases linearly with frequency. As a result, the spectrum of the received signal changes as the pulse propagates through the tissue in such a way that a shift to smaller frequencies, or downshift, occurs. In addition, the downshift is dependent on the bandwidth of the pulse propagating in the tissue, and the mean frequency of a spectrum (in this case Gaussian³) can be given by:

f̄ = f₀ − (β₁B²f₀²)z,   (21)

where f₀ and B denote the center frequency and bandwidth of the pulse. Thus, according to Eq. 21, the downshift due to attenuation depends on the tissue frequency-dependent attenuation coefficient, and on the pulse center frequency and bandwidth. A graph showing typical values of frequency-dependent attenuation coefficients (measured in dB/cm/MHz) in biological tissue is given in Fig. 6.
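A short sketch combining Eq. 19 with simple dB bookkeeping of frequency-dependent attenuation; the coefficient and pulse parameters are illustrative assumptions:

```python
import numpy as np

alpha_db_cm_mhz = 0.5          # frequency-dependent attenuation, dB/cm/MHz
f0_mhz = 5.0                   # pulse center frequency, MHz
depth_cm = 10.0

# one-way amplitude loss in dB at the center frequency
loss_db = alpha_db_cm_mhz * f0_mhz * depth_cm
print(f"one-way loss at {f0_mhz} MHz over {depth_cm} cm: {loss_db:.0f} dB")

# average intensity decay, I = I0 * exp(-2 * alpha * z), with alpha in Np/cm
alpha_np_cm = alpha_db_cm_mhz * f0_mhz / 8.686   # dB -> Np conversion
print(f"I/I0 = {np.exp(-2 * alpha_np_cm * depth_cm):.2e}")
```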
Fig. 6. Attenuation values of certain fluids and soft tissues.⁹
6.3 KEY TOPICS WITH RESULTS AND FINDINGS
6.3.1 Transducers
The pressure wave that was discussed in the previous section is generated using an ultrasound transducer, which is typically made of a piezoelectric material. "Piezoelectric" denotes the particular property of certain crystals and polymers of generating a pressure wave when an electrical potential is applied across the material ("piezo" means "to press" in Greek). Most importantly, since the piezoelectric effect is reversible, i.e. a piezoelectric crystal will convert an impinging pressure wave to an electric potential, the same transducer can also be used as a receiver. Such crystalline or semicrystalline materials include polyvinylidene fluoride (PVDF), quartz, barium titanate and lead zirconate titanate (PZT).
A single-element ultrasound transducer is shown in Fig. 7. Dependent upon its thickness (l) and propagation speed (c), the piezoelectric material has a resonance frequency given by:

f₀ = c/(2l).   (22)

The speed in the PZT material is around 4 000 m·s⁻¹, so for a 5 MHz transducer the thickness should be 0.4 mm. The matching layer is usually coated onto the piezoelectric crystal in order to minimize the impedance mismatch between the crystal and the skin surface and, thus, maximize the transmission coefficient (Eq. 13). In order to overcome the aforementioned impedance mismatch, the ideal impedance Z_m and thickness d_m of the matching layer are respectively given by:

Z_m = √(Z_T Z)   (23)

and

d_m = λ/4,   (24)

with Z_T denoting the transducer impedance and Z the impedance of the medium.
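A small sketch of Eqs. 22–24 with assumed material properties (the impedance and sound-speed values are illustrative):

```python
import numpy as np

c_pzt = 4000.0           # speed of sound in PZT, m/s
f0 = 5e6                 # desired resonance frequency, Hz
l = c_pzt / (2 * f0)     # crystal thickness (Eq. 22): 0.4 mm

z_pzt, z_tissue = 33e6, 1.5e6        # impedances in Rayl (assumed)
z_match = np.sqrt(z_pzt * z_tissue)  # matching-layer impedance (Eq. 23)
c_match = 2500.0                     # speed in the matching layer, m/s (assumed)
d_match = (c_match / f0) / 4         # quarter-wavelength thickness (Eq. 24)
print(f"l = {l*1e3:.2f} mm, Zm = {z_match/1e6:.1f} MRayl, dm = {d_match*1e6:.0f} um")
```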
The backing layers behind the piezoelectric crystal are used in order to increase the bandwidth and the energy output. If the backing layer contains air, then the air-crystal interface yields a maximum reflection coefficient, given the high impedance mismatch. Another by-product of an air-backed crystal element is that the crystal remains relatively undamped, i.e. the signal transmitted will have a low bandwidth and a longer duration. On the other hand, the axial resolution of the transducer depends on the signal duration, or pulse width, transmitted. As a result, there is a tradeoff between transmitted power and resolution of an ultrasound system. Depending on the application, different backing layers are therefore used. Air-backed transducers are used in continuous-wave and ultrasound therapy applications. Heavily-backed transducers are utilized in order to obtain high resolution, e.g. for high quality imaging, at the expense of lower sensitivity and reduced penetration. Coded-excitation techniques have recently been successfully applied to circumvent such tradeoffs.
Fig. 7. Typical construction of a single-element transducer.³
For imaging purposes, an assembly of elements such as that in Fig. 7 is usually used, called an "array" of such elements. In an array, the elements are stacked next to each other at a distance of less than a wavelength, for minimum interference and reduced grating lobes. The linear array has the simplest geometry. It selects the region of interest by firing elements above that region. The beam can then be moved on a line by firing groups of adjacent elements, and the rectangular image obtained is formed by combining the signals received by all the elements. A curved array is used when the transducer is smaller than the area scanned. A phased array can be used to change the "phase," or delay, between the fired elements and thus achieve steering of the beam. The phased array is usually the choice for cardiovascular exams, where the window between the ribs allows for a very small transducer to image the whole heart. Focusing and steering can both be achieved by modifying the profile of firing delays between elements (Fig. 8).
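A minimal sketch of the linear delay profile used for beam steering in a phased array, where element n fires with delay n·d·sin(θ)/c; the pitch, element count and angle are assumptions:

```python
import numpy as np

n_elem, pitch, c = 64, 0.3e-3, 1540.0     # elements, element pitch (m), m/s
theta = np.deg2rad(20.0)                  # steering angle
delays = np.arange(n_elem) * pitch * np.sin(theta) / c
delays -= delays.min()                    # make all delays non-negative
print(f"max steering delay: {delays.max()*1e6:.2f} us")   # ~4.2 us here
```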
6.3.2 Ultrasonic Instrumentation
Figure 9 shows a block diagram of the different steps that are used in order to acquire, process and display the received signal from the tissue.
Fig. 8. Electronic (A) beamforming, (B) focusing, and (C) focusing and beam steering, as achieved in phased arrays. The time delay between the firings of different elements is denoted here by τ.
Fig. 9. Block diagram of a pulsed-wave system and the resulting signal or image
at three different steps.
6.3.2.1 Transducer Frequency
In ultrasound imaging, a pulse of a given duration, frequency and bandwidth is first transmitted. As mentioned before, a tradeoff between penetration (or, low attenuation) and resolution exists. Therefore, the chosen frequency will depend on the application. Usually, for deeper organs, such as the heart, the uterus and the liver, the frequencies are restricted to the range of 3 MHz–5 MHz, while for more superficial structures, such as the thyroid, the breast, the testis and applications on infants, a higher range of 4 MHz–10 MHz is applied. Finally, for ocular applications, a range of 7 MHz–25 MHz is determined by the low attenuation, low depth and higher resolution required.
The pulse is usually a few cycles of that frequency long (usually 3–4 cycles), so as to ensure high resolution, and is generated by the transmitter as a voltage signal with an amplitude of 100 V–500 V and a frequency equal to the resonance frequency of the transducer elements. For static structures, a single pulse or multiple pulses (usually used for averaging later) can be used at an arbitrary repetition frequency. However, for moving structures, such as blood, the liver and the heart, a fundamental limit on the maximum pulse repetition frequency (PRF) is set by the maximum depth of the structure, or PRF = c/(2D_max). Typically, the PRF is in the range of 1 kHz–3 kHz.
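A one-line illustration of the PRF limit, with an assumed imaging depth:

```python
# PRF = c / (2 * D_max); the depth value is illustrative.
c = 1540.0            # speed of sound in soft tissue, m/s
d_max = 0.15          # maximum imaging depth, m
prf_max = c / (2 * d_max)
print(f"max PRF for {d_max*100:.0f} cm depth: {prf_max:.0f} Hz")  # ~5.1 kHz
```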
6.3.2.2 RF Amplifier
The received signal needs to be initially amplified so as to guarantee a good signal-to-noise ratio. At the same time, the input of the amplifier should be protected from the high voltage pulse in order to safeguard the circuits, while maintaining low noise and high gain. A typical dynamic range expected at the output is on the order of 70 dB–80 dB.
6.3.2.3 Time-Gain Compensation (TGC)
As indicated above, attenuation is unavoidable as the wave travels through the medium, and it increases with depth. In order to avoid artificial darkening of deeper structures as a result, a voltage-controlled attenuator is usually employed, where a control voltage is utilized to manually adjust the system gain accordingly after reception of an initial scan. A logarithmic voltage ramp is usually applied that compensates for a mean attenuation level with depth.⁶ The dynamic range becomes further reduced to 40 dB–50 dB.
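A small sketch of a depth-dependent gain curve in the spirit of TGC, offsetting the round-trip attenuation; the attenuation coefficient and frequency are illustrative:

```python
import numpy as np

alpha_db_cm_mhz, f0_mhz = 0.5, 5.0        # assumed attenuation and frequency
depth_cm = np.linspace(0, 15, 4)
tgc_db = 2 * alpha_db_cm_mhz * f0_mhz * depth_cm   # round-trip compensation
for z, g in zip(depth_cm, tgc_db):
    print(f"depth {z:5.1f} cm -> gain {g:5.1f} dB")
```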
6.3.2.4 Compression Amplifier
The signals will ultimately be displayed in greyscale on a cathode ray tube (CRT), where the dynamic range is typically only 20 dB–30 dB. For this purpose, an amplifier with a logarithmic response is utilized.
6.3.3 Ultrasonic Imaging
Ultrasonic imaging is usually known as echography or sonography,
depending on which side of the Atlantic ocean one is scanning from.
As mentioned earlier, the signal acquired by the scanner can be pro-
cessed and displayed in several different fashions. In this section,
the most typical and routinely used ones are discussed.
6.3.3.1 A-Mode
Since the image is a grayscale picture, the amplitude of the signal is displayed. For this, the envelope of the RF signal needs to be calculated. This can be achieved, for example, by applying the Hilbert transform. The resulting signal is called a detected A-scan, A-line or A-mode scan (A- for Amplitude). An example is shown in Fig. 9.
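A minimal sketch of envelope detection via the Hilbert transform (SciPy's hilbert returns the analytic signal); the RF line below is synthetic:

```python
import numpy as np
from scipy.signal import hilbert

fs, f0 = 40e6, 5e6
t = np.arange(2048) / fs
# a single reflected pulse at 10 us, plus a little noise
rf = np.exp(-((t - 10e-6) / 0.5e-6) ** 2) * np.sin(2 * np.pi * f0 * t)
rf += 0.01 * np.random.default_rng(0).standard_normal(t.size)

envelope = np.abs(hilbert(rf))     # detected A-line
print(f"echo located at {t[np.argmax(envelope)]*1e6:.1f} us")
```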
6.3.3.2 B-Mode
When the received A-scans are spatially combined after acquisition, using either a mechanically moved transducer or the previously mentioned arrays, and used to brightness-modulate the display in a 2D format, the brightness or B-mode is created, which has a true image format and is by far the most widely used diagnostic ultrasound mode. By default, sonogram or echogram refers to B-mode.
Fig. 10. Top: B-scan of an abdominal aorta in a mouse at 30 MHz; Bottom: M-mode image over several cardiac cycles taken along the dashed line in the B-scan.
Figure 10 shows a longitudinal B-mode image of an abdominal aorta. One of the biggest advantages of ultrasound scanning is real-time scanning, and this is achieved thanks to the shallow depth of scanning in most tissues and the high speed of sound. The frame rate is usually on the order of 30 Hz–100 Hz (while in the M-mode version it can be as fast as the PRF itself, see below). The frame rate is limited by the number of A-mode scans acquired, N_A, and the maximum depth, i.e. the maximum frame rate is given by PRF_F = c/(2 D_max N_A).
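A quick evaluation of this frame-rate limit with assumed parameters:

```python
# PRF_F = c / (2 * D_max * N_A); the depth and line count are illustrative.
c, d_max, n_lines = 1540.0, 0.10, 128
frame_rate_max = c / (2 * d_max * n_lines)
print(f"max frame rate: {frame_rate_max:.0f} Hz")   # ~60 Hz
```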
6.3.3.3 M-Mode
Another way of displaying the A-scans is as a function of time, especially in cases where tissue motion needs to be monitored and analyzed, as with heart valves or other cardiac structures. In the case of Fig. 10, only one A-scan from a particular tissue structure is displayed in brightness mode and followed in time, called a motion-, or M-mode, scan. A depth-time display is then generated. A typical application of the M-mode display is the examination of heart valve leaflet motion, often in conjunction with Doppler displays.
6.4 DISCUSSION
One of the main problems with the standard use of ultrasound arises from high attenuation in some tissues, especially in small vessels and blood cavities. In order to overcome this limitation, contrast agents are routinely used. Contrast agents are typically microspheres of encapsulated gas or liquid coated by a shell, usually albumin. Due to the high impedance mismatch created by the gas or liquid contained, the resulting backscatter generated by the contrast agents is much higher than that of the blood echoes.
An alternative method of generating higher backscatter, beyond the increased impedance mismatch, is based on the harmonics generated by the bubble's interaction with the ultrasonic wave. The bubble vibration generates harmonics above and below the fundamental frequency, with the second harmonic possibly exceeding the first. In other words, the contrast agent introduces nonlinear backscattering properties into the medium where it lies. Several processes for filtering out undesired echoes from stationary media surrounding the region where flow characteristics are assessed result in a weakening of the overall signal at the fundamental frequency. Therefore, since residual harmonics will result from moving scatterers, motion characteristics can be obtained from the higher harmonic echoes by using a high-pass filter and filtering out the fundamental frequency spectrum, which also contains the undesired stationary echoes. Another method for distilling the harmonic echo information is the more widely used phase, or pulse, inversion method, in which two pulses (instead of one) are sequentially transmitted with their phases reversed. Upon reception, the echoes resulting from the two pulses are added, and only the higher harmonics remain.
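A toy sketch of pulse inversion using a weakly nonlinear echo model y = ax + bx²; the model and parameters are illustrative assumptions, not from the text. Summing the echoes of x and −x cancels the linear (fundamental) term and leaves the second harmonic:

```python
import numpy as np

fs, f0 = 50e6, 2e6
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * f0 * t)

def echo(tx, a=1.0, b=0.2):
    return a * tx + b * tx**2      # toy nonlinear scatterer

summed = echo(x) + echo(-x)        # pulse-inversion sum = 2*b*x**2
spec = np.abs(np.fft.rfft(summed))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
print(f"dominant frequency: {freqs[np.argmax(spec[1:]) + 1]/1e6:.1f} MHz")  # ~2*f0
```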
Despite the fact that the idea of contrast agent use originated for blood flow measurements, the same type of approach can be applied in the case of soft tissues as well. After being injected into the bloodstream, the contrast agents can also appear and remain in the tissues, and offer the same advantages of motion detection and characterization as in the case of blood flow. However, it turns out that contrast agents are not always needed for imaging tissues at higher harmonics, especially since tissue scattering can be up to two orders of magnitude higher than blood scattering. The nonlinear wave characteristics of the tissues themselves are thus sufficient to allow imaging of tissues, despite the resulting higher attenuation at those frequencies. The avoidance of patient discomfort following contrast agent injection is one of the major advantages of this approach in tissues. Imaging using the harmonic approach (whether with or without contrast agents) is generally known as harmonic imaging. Compared to the standard approach, harmonic imaging in tissues offers the ability to distinguish between noise and fluid-filled structures, e.g. cysts and the gall bladder. In addition, harmonic imaging allows for better edge definition in structures and is thus generally known to increase image clarity, mainly due to the much smaller influence of the transmitted pulse on the received spectrum. Harmonic imaging is now available in most commercially available ultrasound systems. One of the main requirements for harmonic imaging is a large bandwidth of the transducer at receive, so as to allow reception of the higher frequency components. This is in very good agreement with the higher resolution requirement for diagnostic imaging.
Another field that has emerged out of ultrasonic imaging in the past decade is elasticity imaging. Its premise is built on two proven facts: (1) that significant differences exist between the mechanical properties of several tissue components, and (2) that the information contained in the coherent scattering, or speckle, is sufficient to depict these differences following an external or internal mechanical stimulus. For example, in the breast, not only is the hardness of fat different from that of glandular tissue but, most importantly, the hardness of normal glandular tissue differs from that of tumorous tissue (benign or malignant) by up to one order of magnitude. This is also the reason why palpation has proven a crucial tool in the detection of cancer.
The second observation is based on the fact that coherent echoes can be tracked while or after the tissue in question undergoes motion and/or deformation caused by the mechanical stimulus, e.g. an external vibration or a quasi-static compression in a method called Elastography. Speckle tracking techniques are also employed here for the motion estimation. In fact, Doppler techniques, such as those used for blood velocity estimation, were initially applied in order to track motion during vibration (Sonoelasticity imaging, or Sonoelastography). Parameters such as velocity and strain are estimated and imaged in conjunction with the mechanical properties of the underlying tissue: the higher the estimated velocity or strain, the softer the material, and vice versa. Numerous applications ranging from the breast to the thyroid and the heart have been implemented in clinical settings.
6.5 CONCLUDING REMARKS
Despite the fact that diagnostic ultrasound is an older imaging modality compared to MRI and PET, it is very intriguing to see that it continues to expand as a field, offering numerous and diverse applications. In this chapter, we have described some of the fundamental aspects of ultrasound physics and ultrasonic imaging, as well as referred to examples of more recent methods and applications.
References
1. Wells PNT, Biomedical Ultrasonics, Medical Physics Series, Academic
Press, London NW1, 1977.
2. Kinsler LE, Frey AR, Fundamentals of Acoustics, 2nd edn., John Wiley &
Sons, NY, 1962.
3. Jensen JA, Estimation of Blood Velocities Using Ultrasound, Cambridge
University Press, Cambridge, U.K., 1996.
4. Burckhardt CB, Speckle in ultrasound B-mode scans, IEEE Trans on Son
and Ultras SU-25: 1–6, 1978.
5. Wagner RF, Smith SW, Sandrik JM, Lopez H, Statistics of speckle in
ultrasound B-scans, IEEE Trans on Son and Ultras 30: 156–163, 1983.
6. Bamber JC, Tristam M, in Webb S (ed.), Diagnostic Ultrasound, IOP
Publishing Ltd., pp. 319–386, 1988.
7. Christensen DA, Ultrasonic Bioinstrumentation, 1st edn., John Wiley &
Sons, 1988.
8. Morse PM, Ingard KU, Theoretical Acoustics, McGraw-Hill, New York, 1968.
9. Haney MJ, O'Brien WD Jr, Temperature dependence of ultrasonic
propagation in biological materials, in Greenleaf JF (ed.), Tissue
Characterization with Ultrasound, CRC Press, Boca Raton, FL, pp. 15–55,
1986.
CHAPTER 7
Principles of Image Reconstruction
Methods
Atam P Dhawan
Multidimensional medical imaging in most radiological applications
involves three major tasks: (1) raw data acquisition using imaging
instrumentation; (2) image reconstruction from the raw data; and (3)
image display and processing operations as needed. Image recon-
struction in multidimensional space is generally an ill posed prob-
lem, where a unique solution representing an ideal reconstruction
of the true object from the acquired raw data may not be possible
due to limitations on data acquisition. However, using specific fil-
tering operations on the acquired raw data along with appropriate
assumptions and constraints in the reconstruction methods, a feasi-
ble solution for image reconstruction can be obtained. Radon trans-
form has been most extensively used in image reconstruction from
acquired projection data in medical imaging applications such as
X-ray computed tomography. The Fourier transform is directly applied to
the raw data for reconstructing images in medical imaging applications,
such as magnetic resonance imaging (MRI), where the raw data is acquired
in the frequency domain. Statistical estimation and optimization methods
often show advantages in obtaining better results in image reconstruction
when dealing with the ill posed problems of imaging.
This chapter describes principles of image reconstruction in multidi-
mensional space from raw data using basic transform and estimation
methods.
7.1 INTRODUCTION
Diagnostic radiology has evolved into multidimensional imaging in the second half of the twentieth century, in terms of X-ray computed tomography (CT), nuclear magnetic resonance imaging (NMRI/MRI), nuclear medicine (single photon emission computed tomography (SPECT) and positron emission tomography (PET)), ultrasound computed tomography, and optical tomographic imaging. The foundation of these and many other multidimensional tomographic imaging techniques started from a basic theory of image reconstruction from projections that was first published by J Radon in 1917¹ and later explored by a number of researchers including Cramer and Wold,² Renyi,³ Gilbert,⁴ Bracewell,⁵ Cormack⁶ and Hounsfield,⁷,⁸ and many others, for imaging applications in many areas including medicine, astronomy, microscopy and geophysics.⁹⁻¹¹
The implementation of the Radon transform for reconstructing medical images from the data collected from imaging instrumentation was only realized in the 1960s. Cormack, in 1963,⁶ showed the radiological applications of Radon's work for image reconstruction from projections using a set of measurements defining line integrals. In 1972, GN Hounsfield developed the first commercial X-ray computed tomography (CT) scanner, which used a computerized image reconstruction algorithm based on the Radon transform. GN Hounsfield and AM Cormack jointly received the 1979 Nobel Prize for their contributions to the development of computerized tomography for radiological applications.⁶⁻⁸
Image reconstruction algorithms have been continuously devel-
oped to reconstruct the true structural characteristics such as shape,
density, etc. of an object in the image. Image reconstruction from
projections or data collected from a scanner is an ill posed problem
because of the finite amount of data used to reconstruct the char-
acteristics of the object. Furthermore, the acquired data is severely
degraded because of occlusion, detector noise, radiation scattering
and inhomogeneities of the medium.
The classical image reconstruction from projections method based on the Radon transform is popularly known as the "backprojection" method. The backprojection method has been modified to incorporate specific data collection schemes and to improve quality. Fourier transform and iterative series expansion based methods have also been developed for reconstructing images from projections. With the fast developments in computer technology, advanced image reconstruction algorithms using statistical and estimation methods have been developed and implemented for several medical imaging modalities.
7.2 RADON TRANSFORM
The Radon transform first defines ray- or line-integrals to form projections of an unknown object, and then uses an infinite number of projections to reconstruct an image of the object. It should be noted that though the early evolution of computed tomography was based on image reconstruction using parallel beam geometry for data acquisition, more sophisticated geometrical configurations and scanning instrumentation are used today for faster data collection and image reconstruction. New computed tomography (CT) image scanners (often called fourth generation CT scanners) utilize a cone-beam of X-ray radiation and multiple rings of detectors for fast 3D multislice scanning. Also, the basic Radon transform that established the foundation of image reconstruction from projections has been extended to a spectrum of exciting applications of image reconstruction in multidimensional space using a variety of imaging modalities. However, the discussion in this chapter is focused on the two-dimensional representation of the Radon transform only, for image reconstruction from projections obtained through parallel beam scanning geometry in computed tomography.
Let us define a two-dimensional object function f (x, y) and its Radon transform by R{f (x, y)}. Let us use the rectangular coordinate system (x, y) in the spatial domain. The Radon transform is defined by the line integral ∫_L along the path L such that:

R{f (x, y)} = J_θ(p) = ∫_L f (x, y) dl,   (1)
where the projection J_θ(p), acquired at angle θ in the polar coordinate system, is a one-dimensional symmetric and periodic function with a period of 2π. The polar coordinate system (p, θ) can be expressed in rectangular coordinates in the Radon domain by using a rotated coordinate system (p, q) that is obtained by rotating the (x, y) coordinate system (Fig. 1) by an angle θ, as:

x cos θ + y sin θ = p,
−x sin θ + y cos θ = q.   (2)

Fig. 1. Line integral projection J_θ(p) of a two-dimensional object f (x, y) at an angle θ.

A set of line integrals or projections can be obtained for different angles θ as:

R{f (x, y)} = J_θ(p) = ∫_{−∞}^{∞} f (p cos θ − q sin θ, p sin θ + q cos θ) dq.   (3)
A higher-dimensional Radon transform can be defined in a simi-
lar way. For example, the projection space for a three-dimensional
Radon transform would be defined by 2D planes instead of
lines.
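A minimal numerical sketch of Eq. 3 for parallel-beam geometry, approximating the line integrals by rotating a synthetic phantom and summing along one axis (the disk phantom and discretization are assumptions for illustration):

```python
import numpy as np
from scipy.ndimage import rotate

n = 128
yy, xx = np.mgrid[-1:1:n*1j, -1:1:n*1j]
phantom = (xx**2 + yy**2 < 0.4**2).astype(float)   # synthetic disk

def projection(image, theta_deg):
    rotated = rotate(image, theta_deg, reshape=False, order=1)
    return rotated.sum(axis=0)     # line integrals along the q direction

sinogram = np.stack([projection(phantom, th) for th in np.arange(0, 180, 1)])
print(sinogram.shape)              # (180, 128): one row per angle
```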
The significance of using the Radon transform for computing projections in medical imaging is that an image of a human organ can be reconstructed by backprojecting the projections acquired through the imaging scanner. Figure 2 shows an illustration of the backprojection method for image reconstruction using projections.
Fig. 2. A schematic diagram for reconstructing images from projections. Three projections are backprojected to reconstruct objects A and B.
Three simulated projections of two objects A and B are backprojected into the reconstruction space. Each projection has two segments of values corresponding to the objects A and B. When the projections are backprojected, the areas of higher values, resulting from the intersection of the backprojected data, represent the two reconstructed objects. It should be noted that the reconstructed objects may have geometrical or aliasing artifacts because of the limited number of projections used in the imaging and reconstruction processes. In the early development of the first and second generations of CT scanners, only parallel beam scanning geometry was used, for direct implementation of the Radon transform for image reconstruction from projections. To improve the geometrical shape and accuracy of the reconstructed objects, a large number of projections is needed, and these must be acquired in a fast and efficient way. Today, fourth generation CT scanners utilize a cone-beam of X-ray radiation and multiple rings of detectors for fast 3D multislice scanning. More advanced imaging protocols, such as spiral CT, use even faster scanning and data manipulation techniques. Figure 3 shows a fourth generation X-ray CT scanner geometry, in which a divergent cone-beam X-ray source is rotated to produce multiple projections at various angles for multislice 3D scanning. Modern CT scanners are used in many biomedical, industrial, and other commercial applications, with a large spectrum of imaging modalities in multidimensional image reconstruction space.
Fig. 3. An advanced X-ray CT scanner geometry with rotating source and ring of detectors.
To establish a fundamental understanding of the Radon transform and image reconstruction from projections, only the 2D representation of the Radon transform, with image reconstruction from projections defined through a parallel beam scanning geometry, is discussed below.
7.2.1 Reconstruction with Fourier Transform
The projection theorem, also called the central slice theorem, pro-
vides a relationship between the Fourier transform of the object
function and the Fourier transform of its Radon transform or
projection.
The Fourier transform of the Radon transform of the object function f (x, y) can be written as¹,⁹⁻¹³:

F{R{f (x, y)}} = F{J_θ(p)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f (p cos θ − q sin θ, p sin θ + q cos θ) e^{−j2πωp} dq dp,   (4)

where ω represents the frequency component in the Fourier domain.
The Fourier transform S_θ(ω) of the projection J_θ(p) can also be expressed as:

S_θ(ω) = ∫_{−∞}^{∞} J_θ(p) e^{−j2πωp} dp.   (5)

From Eqs. 4–5, the Fourier transform of the Radon transform of the object function can be written as:

S_θ(ω) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f (x, y) e^{−j2πω(x cos θ + y sin θ)} dx dy = F(ω, θ).   (6)
Equation 6 can be considered as the two-dimensional Fourier transform of the object function f (x, y), and can be represented as F(u, v) with:

u = ω cos θ,
v = ω sin θ,   (7)

where u and v represent the frequency components along the x- and y-directions in a rectangular coordinate system.
It should be noted that S_θ(ω) represents the Fourier transform of the projection J_θ(p) that is taken at an angle θ in the space domain with a rotated coordinate system (p, q). The frequency spectrum S_θ(ω) is placed along a line or slice at an angle θ in the frequency domain of F(u, v).
If several projections are obtained using different values of the angle θ, their Fourier transforms can be computed and placed along the respective radial lines in the frequency domain of the Fourier transform F(u, v) of the object function f (x, y). Additional projections acquired in the space domain provide more spectral information in the frequency domain, leading to filling up the entire frequency domain. Now the object function can be reconstructed using the two-dimensional inverse Fourier transform of the spectrum F(u, v).
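The central slice theorem can be checked numerically in a few lines. The sketch below compares the 1D FFT of a projection at θ = 0 with the corresponding line of the 2D FFT, using a synthetic phantom (all parameters are assumptions):

```python
import numpy as np

n = 128
yy, xx = np.mgrid[-1:1:n*1j, -1:1:n*1j]
f = (xx**2 + yy**2 < 0.4**2).astype(float)     # synthetic disk phantom

proj0 = f.sum(axis=0)                          # projection at theta = 0
slice_fft = np.fft.fftshift(np.fft.fft2(f))[n // 2, :]   # v = 0 line of F(u, v)
proj_fft = np.fft.fftshift(np.fft.fft(proj0))
print(np.allclose(np.abs(proj_fft), np.abs(slice_fft)))  # True
```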
7.2.2 Reconstruction using Inverse Radon Transform
The forward Radon transform is used to obtain projections of an object function at different viewing angles. Using the central slice theorem, an object function can be reconstructed by taking the inverse Fourier transform of the spectral information in the frequency domain that is assembled with the Fourier transforms of the individual projections. Thus, the reconstructed object function f̂(x, y) can be obtained by taking the two-dimensional inverse Fourier transform of F(u, v) as:

f̂(x, y) = F⁻¹{F(u, v)} = ∫_{−∞}^{∞} ∫_{−∞}^{∞} F(u, v) e^{j2π(xu + vy)} du dv.   (8)
With the change of variables

u = ω cos θ,  v = ω sin θ,

Equation 8 can be rewritten as:

f̂(x, y) = ∫₀^π ∫_{−∞}^{∞} F(ω, θ) e^{j2πω(x cos θ + y sin θ)} |ω| dω dθ.   (9)
In Eq. 9, the frequency variable |ω| appears because of the Jacobian of the change of variables. Replacing F(ω, θ) with S_θ(ω), the reconstructed image f̂(x, y) can be expressed as the backprojected integral (sum) of the modified projections J*_θ(p) as:

f̂(x, y) = ∫₀^π ∫_{−∞}^{∞} |ω| S_θ(ω) e^{j2πω(x cos θ + y sin θ)} dω dθ = ∫₀^π ∫_{−∞}^{∞} |ω| S_θ(ω) e^{j2πωp} dω dθ = ∫₀^π J*_θ(p) dθ,

where

J*_θ(p) = ∫_{−∞}^{∞} |ω| S_θ(ω) e^{j2πω(x cos θ + y sin θ)} dω.   (10)
7.3 BACKPROJECTION METHOD FOR IMAGE
RECONSTRUCTION
The classical image reconstruction from projections method based on the Radon transform is popularly known as the backprojection method. The backprojection method has been modified by a number of investigators to incorporate specific data collection schemes and to improve the quality of reconstructed images.
Though the object function can be reconstructed using the inverse Fourier transform of the spectral information in the frequency domain F(u, v) obtained using the central slice theorem, an easier implementation of Eq. 10 can be obtained through its realization with the modified projections J*_θ(p). This realization leads to the convolution backprojection, also known as the filtered backprojection method, for image reconstruction from projections.
The modified projection J*_θ(p) can be expressed in terms of a convolution:

J*_θ(p) = ∫_{−∞}^{∞} |ω| S_θ(ω) e^{j2πωp} dω = F⁻¹{|ω| S_θ(ω)} = F⁻¹{|ω|} ⊗ J_θ(p),   (11)
where ⊗ represents the convolution operator.
Equation 11 presents some interesting challenges for implementation. The integration over the spatial frequency variable ω should be carried out from −∞ to ∞. But in practice, the projections are considered to be bandlimited. This means that any spectral energy beyond a spatial frequency, say Ω, must be ignored. Using Eqs. 10–11, it can be shown that the reconstruction function or image f̂(x, y) can be computed as:

f̂(x, y) = (1/π) ∫₀^π dθ ∫_{−∞}^{∞} dp′ J_θ(p′) h(p − p′),   (12)

where h(p) is a filter function that is convolved with the projection function.
Ramachandran and Lakshminarayanan⁹ computed the filter function h(p) strictly from Eq. 11 in the Fourier domain as:

H_{R-L}(ω) = |ω| for |ω| ≤ Ω, and 0 otherwise,   (13)

where H_{R-L} is the Fourier transform of the filter kernel function h_{R-L}(p) in the spatial domain and is bandlimited.
In general, a bandlimited filter function H(ω) in the frequency domain (Fig. 4) can be expressed as:

H(ω) = |ω| B(ω),

where B(ω) denotes the bandlimiting function,

B(ω) = 1 for |ω| ≤ Ω, and 0 otherwise.   (14)

For the convolution operation with the projection function in the spatial domain (Eqs. 10–11), the filter kernel function h(p) can be obtained from H(ω) by taking the inverse Fourier transform as:

h(p) = ∫_{−∞}^{∞} H(ω) e^{j2πωp} dω.   (15)

Fig. 4. A bandlimited filter function H(ω).

If the projections are sampled with a sampling interval of τ, the projections can be represented as J_θ(kτ), where k is an integer. Using the sampling theorem and the bandlimited constraint, all spatial frequency components beyond Ω are ignored, such that:

Ω = 1/(2τ).   (16)
For the bandlimited projections with a sampling interval of τ, Eq. 15 can be expressed, with some simplification, as:

h(p) = (1/(2τ²)) [sin (πp/τ)/(πp/τ)] − (1/(4τ²)) [sin (πp/2τ)/(πp/2τ)]².   (17)

Thus, the modified projection J*_θ(p) and the reconstructed image can be computed as:

J*_θ(p) = ∫_{−∞}^{∞} J_θ(p′) h(p − p′) dp′,

f̂(x, y) = (π/L) Σᵢ₌₁ᴸ J*_{θᵢ}(p),   (18)

where L is the total number of projections acquired during the imaging process at viewing angles θᵢ, for i = 1, . . . , L.
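A compact sketch of the resulting filtered backprojection procedure, applying the |ω| (Ram-Lak) response of Eq. 13 in the Fourier domain and accumulating the backprojections of Eq. 18. It reuses the sinogram convention of the earlier projection sketch and is a simplified illustration, not a production reconstructor:

```python
import numpy as np
from scipy.ndimage import rotate

def filtered_backprojection(sinogram, thetas_deg):
    n_angles, n_det = sinogram.shape
    freqs = np.fft.fftfreq(n_det)
    ram_lak = np.abs(freqs)                       # bandlimited |omega| filter
    recon = np.zeros((n_det, n_det))
    for row, theta in zip(sinogram, thetas_deg):
        filtered = np.real(np.fft.ifft(np.fft.fft(row) * ram_lak))
        smear = np.tile(filtered, (n_det, 1))     # backproject along q
        recon += rotate(smear, -theta, reshape=False, order=1)
    return recon * np.pi / n_angles               # Eq. 18 scaling

# usage with the disk sinogram from the earlier sketch:
# recon = filtered_backprojection(sinogram, np.arange(0, 180, 1))
```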
The quality of the reconstructed image depends heavily on
the number of projections and the spatial sampling interval of the
acquired projection. For better quality images to be reconstructed,
it is essential to acquire a large number of projections covering the
entire range of viewing angles around the object. Higher resolu-
tion images with fine details can only be reconstructed if the projec-
tions are acquired with a high spatial sampling rate satisfying the
basic principle of the sampling theorem. If the raw projection data
is acquired at a sampling rate lower than the Nyquist sampling rate,
aliasing artifacts would occur in the reconstructed image because of
the overlapping spectra in the frequency domain. The fine details in
the reconstructed images represent high frequency components.
The maximum frequency component that can be reconstructed in the image is thus limited by the detector size and the scanning procedure used in the acquisition of raw projection data. To reconstruct images of higher resolution and quality, the detector size should be small. On the other hand, the projection data may suffer from poor
signal-to-noise ratio if there is an insufficient number of photons
collected by the detector due to its smaller size.
There are several variations in the design of the filter function H(ω) investigated in the literature. The acquired projection data is discrete in the spatial domain. To implement the convolution backprojection method in the spatial domain, the filter function has to be realized as discrete in the spatial domain. The major problem of the Ramachandran–Lakshminarayanan filter$^{9}$ is that it has sharp cut-offs in the frequency domain at ω = 1/2τ and ω = −1/2τ, as shown in Fig. 4. The sharp cut-offs produce sinc functions in the spatial-domain filter, as shown in Eq. 17, causing modulated ringing artifacts in the reconstructed image. To avoid such artifacts, the filter function must have smooth cut-offs, such as those obtained from a Hamming window function. A bandlimited generalized Hamming window can be represented as:
$$H_{\text{Hamming}}(\omega) = |\omega|\left[\alpha + (1 - \alpha)\cos(2\pi\omega\tau)\right] B(\omega), \quad \text{for } 0 \le \alpha \le 1, \qquad (19)$$

where the parameter α can be adjusted to provide the appropriate characteristic shape of the function.
The Hamming window based filter kernel function provides smoother cut-offs, as shown in Fig. 5. The Hamming window based convolution function provides a smoother function in the spatial domain that reduces the ringing artifacts and improves the signal-to-noise ratio in the reconstructed image. Other smoothing functions can be used for reducing ringing artifacts and improving the quality of the reconstructed image.$^{12,13}$

Fig. 5. A Hamming window based filter kernel function in the frequency domain.

Fig. 6. (A) A reconstructed image of a cross-sectional slice of the chest of a cadaver using the Radon transform based backprojection method; (B) the actual pathologically stained slice of the respective cross section.
Figure 6(A) shows a reconstructed image of a cross-sectional slice of the chest of a cadaver using the Radon transform based backprojection method. The actual pathologically stained slice of the respective cross section is shown in Fig. 6(B).
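To make the difference between the two filter designs concrete, the frequency responses of Eqs. 13–14 and Eq. 19 can be tabulated on a discrete grid; the grid size and α = 0.54 (the conventional Hamming weight) are illustrative assumptions:

```python
import numpy as np

tau = 1.0
omega = np.linspace(-0.5 / tau, 0.5 / tau, 257)        # frequency axis
B = (np.abs(omega) <= 0.5 / tau).astype(float)         # bandlimiting window, Eq. 14
H_ram_lak = np.abs(omega) * B                          # sharp cut-off, Eq. 13
alpha = 0.54                                           # conventional Hamming weight
H_hamming = np.abs(omega) * (alpha + (1 - alpha) * np.cos(2 * np.pi * omega * tau)) * B  # Eq. 19
# H_hamming rolls off smoothly near +/- 1/(2 tau), which is what suppresses
# the ringing artifacts associated with the sharp cut-off of H_ram_lak.
```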
7.4 ITERATIVE ALGEBRAIC RECONSTRUCTION
TECHNIQUES (ART)
The iterative reconstruction methods are based on optimization
strategies incorporating specific constraints about the object domain
and the reconstruction process. Algebraic reconstruction techniques (ART)$^{11-14}$ are popular algorithms used in iterative image reconstruction. In the algebraic reconstruction methods, the raw
projection data from the scanner is distributed over a prespec-
ified image reconstruction grid such that the error between the
computed projections from the reconstructed image and the actual
acquired projections is minimized. Such methods provide a mecha-
nism to incorporate additional specific optimization criteria such as
smoothing and entropy maximization in the reconstruction process
to improve the image quality and signal-to-noise ratio. The alge-
braic reconstruction methods are based on the series expansion rep-
resentation of a function and were used by Gordon and Herman for medical image reconstruction.$^{12-14}$
Let us assume a two-dimensional image reconstruction grid of N pixels. Let us define $p_i$ representing the projection data as a set of ray sums that are collected by M scanning rays passing through the image at specific angles (rays as defined in Fig. 1). Let $f_j$ be the value of the j-th pixel of the image that is weighted by $w_{i,j}$ to meet the projection measurements. Thus the ray sum $p_i$ in the projection data can be expressed as:

$$p_i = \sum_{j=1}^{N} w_{i,j} f_j \quad \text{for } i = 1, \ldots, M. \qquad (20)$$
The representation in Eq. 20 provides M equations of N unknown variables to be determined. The weight $w_{i,j}$ represents the contribution of the pixel value in determining the ray sum and can be determined by geometrical consideration as the ratio of the area overlapping with the scanning ray to the total area of the pixel. The problem of determining $f_j$ for image reconstruction can be solved iteratively using the ART algorithm. Alternately, it can be solved through matrix inversion since the measured projection data $p_i$ is known. The set of equations can also be solved using dynamic programming methods.$^{12}$
In algebraic reconstruction methods, each pixel is assigned a predetermined value, such as the average of the raw projection data per pixel, to start the iterative process. Any time during the reconstruction process, a computed ray sum from the image under reconstruction is obtained by passing a ray. In each iteration, an error between
the measured projection ray sum and the computed ray sum is evaluated and distributed over the corresponding pixels in a weighted manner. The correction to the pixel values can be obtained in an additive or multiplicative manner, i.e. the correction value is either added to the current pixel value or multiplied with it to obtain the next value. The iterative process continues until the error between the measured and computed ray sums is minimized or meets a prespecified criterion. The $f_j$ values from the last iteration provide the final reconstructed image.
Let $q_i^k$ be the computed ray sum in the k-th iteration that is projected over the reconstruction grid in the next iteration. The iterative procedure can then be expressed as:

$$q_i^k = \sum_{l=1}^{N} f_l^{k-1} w_{i,l} \quad \text{for all } i = 1, \ldots, M,$$

$$f_j^{k+1} = f_j^k + \left(\frac{p_i - q_i^k}{\sum_{l=1}^{N} w_{i,l}^2}\right) w_{i,j}. \qquad (21)$$
Gordon$^{14}$ used an easier way to avoid the large computation of the weight matrix by replacing the weights with 1 or 0. If the ray passes through the center of the pixel, the corresponding weight is assigned as 1, otherwise 0. This simplification provides an efficient implementation of the algorithm and is known as additive ART. Other versions of ART, including multiplicative ART, have been developed to improve the reconstruction efficacy and quality.$^{12}$
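A minimal sketch of the additive ART update of Eq. 21 is given below; the dense weight matrix, the zero starting image, and the fixed iteration count are simplifying assumptions for illustration (a practical implementation would exploit the sparsity of the weights):

```python
import numpy as np

def additive_art(W, p, n_iter=10):
    """Sketch of additive ART (Eq. 21).

    W : (M, N) weight matrix; W[i, j] is w_{i,j} (1 or 0 under Gordon's
        simplification).
    p : (M,) measured ray sums p_i.
    """
    M, N = W.shape
    f = np.zeros(N)                          # starting image (assumption)
    row_norm = (W ** 2).sum(axis=1)          # sum_l w_{i,l}^2 in Eq. 21
    for _ in range(n_iter):
        for i in range(M):                   # sweep over all rays
            if row_norm[i] == 0.0:
                continue
            q_i = W[i] @ f                   # computed ray sum q_i^k
            f += ((p[i] - q_i) / row_norm[i]) * W[i]   # distribute the error
    return f
```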
Iterative ART methods offer an attractive alternative to the fil-
tered backprojection method because of their abilities to deal with
the noise and random fluctuations in the projection data caused
by detector inefficiency and scattering. These methods are par-
ticularly suitable for limited view image reconstruction as more
constraints defining the imaging geometry and prior information
about the object can easily be incorporated into the reconstruction
process.
7.5 ESTIMATION METHODS
Though the filtered backprojection methods are most commonly used in medical imaging, in practice, a significant number of approaches using statistical estimation methods have been investigated for image reconstruction for transmission as well as emission computed tomography.$^{15-26}$ These methods assume a certain distribution of the measured photons and then find the parameters for the attenuation function (in the case of transmission scans such as X-ray CT) or emitter density (in the case of emission scans such as PET).
The photon detection statistics of a detector is usually characterized by a Poisson distribution. Let us define a measurement vector $\vec{J} = [J_1, J_2, \ldots, J_N]$ with $J_i$ the random variable representing the number of photons collected by the detector for the i-th ray such that$^{17}$:

$$E[J_i] = m_i\, e^{-\int_L \mu(x,y,z)\, dl} \quad \text{for } i = 1, 2, \ldots, N, \qquad (22)$$
where L defines the ray along which the photons with monochro-
matic energy have been attenuated with the attenuation coefficients
denoted by µ, and $m_i$ is the mean number of photons collected by
the detector for the i-th ray position. Also, in the above formulation,
the noise, scattering and random coincidence effects are ignored.
The attenuation parameter vector $\vec{\mu}$ can be expressed in terms of a series expansion as a weighted sum of individual attenuation coefficients of corresponding pixels (for 2D reconstruction) or voxels (for 3D reconstruction). If the parameter vector $\vec{\mu}$ has $N_p$ individual elements (pixels or voxels), it can be represented as:

$$\vec{\mu} = \sum_{j=1}^{N_p} \mu_j w_j, \qquad (23)$$
where $w_j$ is the basis function that is the weight associated with the individual $\mu_j$ belonging to the corresponding pixel or voxel.
One simple solution to obtain $w_j$ is to assign it a value 1 if the ray contributing to the corresponding photon measurement vector passes through the pixel (or voxel) and 0 otherwise. It can be shown
that a line integral or ray sum for the i-th ray is given by:

$$\int_{L_i} \mu(x, y, z)\, dl = \sum_{k=1}^{N_p} a_{ik}\, \mu_k, \qquad (24)$$
where $a_{ik} = \int_{L_i} w_k(\vec{x})\, dl$, with $\vec{x}$ representing the position vector for the (x, y, z) coordinate system.
The weight matrix $A = \{a_{ik}\}$ is defined to rewrite the measurement vector as:

$$J_i(\vec{\mu}) = m_i\, e^{-[A\vec{\mu}]_i}, \quad \text{where} \quad [A\vec{\mu}]_i = \sum_{k=1}^{N_p} a_{ik}\, \mu_k. \qquad (25)$$
The reconstruction problem is to estimate $\vec{\mu}$ from a measured set of detector counts realizing the random vector $\vec{J}$. The maximum likelihood (ML) estimate can be expressed as$^{17-19}$:

$$\hat{\mu} = \arg\max_{\vec{\mu} \ge \vec{0}}\; L(\vec{\mu}), \qquad L(\vec{\mu}) = \log P[\vec{J} = \vec{j}; \vec{\mu}], \qquad (26)$$

where $L(\vec{\mu})$ is the likelihood function defined as the logarithm of the probability function $P[\vec{J} = \vec{j}; \vec{\mu}]$. The ML reconstruction methods are developed to obtain an estimate of the parameter vector $\vec{\mu}$ that maximizes the probability of observing the measured data (photon counts).
Using the Poisson distribution model for the photon counts, the measurement joint probability function $P[\vec{J} = \vec{j}; \vec{\mu}]$ can be expressed as:

$$P[\vec{J} = \vec{j}; \vec{\mu}] = \prod_{i=1}^{N} P[J_i = j_i; \vec{\mu}] = \prod_{i=1}^{N} \frac{e^{-J_i(\vec{\mu})}\,[J_i(\vec{\mu})]^{j_i}}{j_i!}. \qquad (27)$$
If the measurements are obtained independently through defining
ray sums, the log likelihood function can be expressed combining
Eqs. 25, 26 and 27 as:

$$L(\vec{\mu}) = \sum_{i=1}^{N} h_i([A\vec{\mu}]_i), \quad \text{where} \quad h_i(l) = j_i \log(m_i e^{-l}) - m_i e^{-l}. \qquad (28)$$
If we consider an additive nonnegative function $r_i$ representing the background photon count for the i-th detector due to scattering and random coincidences, the likelihood function can then be expressed as$^{17}$:

$$L(\vec{\mu}) = \sum_{i=1}^{N} h_i([A\vec{\mu}]_i), \quad \text{where} \quad h_i(l) = j_i \log(m_i e^{-l} + r_i) - (m_i e^{-l} + r_i). \qquad (29)$$
Several algorithms have been investigated in the literature to obtain an estimate of the parameter vector that maximizes the log likelihood function given in Eq. 29. However, it is unlikely that there is a unique solution to this problem. There may be several solutions of the parameter vector that can maximize the likelihood function, and not all of them may be appropriate or even feasible for image reconstruction. To improve the quality of reconstructed images, a number of methods imposing additional constraints such as smoothness are applied by incorporating penalty functions in the optimization process. Several iterative optimization processes incorporating a roughness penalty function on the neighborhood values of the estimated parameter vector have been investigated in the literature.$^{17-19}$
Let us represent a general roughness penalty function $R(\vec{\mu})$$^{17-19}$ such that:

$$R(\vec{\mu}) = \sum_{k=1}^{K} \psi([C\vec{\mu}]_k), \quad \text{where} \quad [C\vec{\mu}]_k = \sum_{l=1}^{N_p} c_{kl}\, \mu_l, \qquad (30)$$
where the $\psi_k$'s are potential functions working as a norm on the smoothness constraints $C\vec{\mu} \approx 0$ and K is the number of such constraints. The matrix C is a $K \times N_p$ penalty matrix. It should be noted that the $\psi_k$'s are convex, symmetric, nonnegative and differentiable functions.$^{17}$ A potential choice for a quadratic penalty function is $\psi_k(t) = w_k t^2/2$ with nonnegative weights, i.e. $w_k \ge 0$. Thus the roughness penalty function $R(\vec{\mu})$ is given by:
$$R(\vec{\mu}) = \sum_{k=1}^{K} w_k\, \frac{1}{2}\left([C\vec{\mu}]_k\right)^2. \qquad (31)$$
The objective function for optimization using the penalized ML approach can now be revised as:

$$\hat{\mu} = \arg\max_{\vec{\mu}}\; \Phi(\vec{\mu}), \quad \text{where} \quad \Phi(\vec{\mu}) = L(\vec{\mu}) - \beta R(\vec{\mu}). \qquad (32)$$

The parameter β controls the level of smoothness in the final reconstructed image.
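As a sketch, the penalized objective of Eq. 32, with the Poisson likelihood of Eq. 29 and the quadratic roughness penalty of Eq. 31, can be evaluated as follows; the function only computes the objective for a candidate µ and leaves the choice of optimizer (EM, conjugate gradient, etc.) open, and all array names are illustrative:

```python
import numpy as np

def penalized_objective(mu, A, m, j, r, C, w, beta):
    """Sketch: evaluate Phi(mu) = L(mu) - beta * R(mu) of Eq. 32.

    mu : (Np,) candidate attenuation vector; A : (N, Np) system matrix (Eq. 25).
    m  : (N,) mean photon counts; j : (N,) measured counts;
    r  : (N,) background counts (Eq. 29).
    C  : (K, Np) penalty matrix with weights w : (K,) (Eqs. 30-31).
    """
    l = A @ mu                               # line integrals [A mu]_i
    ybar = m * np.exp(-l) + r                # expected counts (Eq. 29)
    L = np.sum(j * np.log(ybar) - ybar)      # Poisson log likelihood (Eq. 29)
    R = np.sum(0.5 * w * (C @ mu) ** 2)      # quadratic roughness (Eq. 31)
    return L - beta * R
```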
Several methods for obtaining the ML estimate have been investigated in the literature. These optimization methods include expectation maximization (EM), complex conjugate gradient, gradient descent optimization, grouped coordinated ascent, fast gradient based Bayesian reconstruction and ordered subsets algorithms.$^{28-30}$ Such iterative algorithms have been applied to obtain a solution for the parameter vector for reconstructing an image from both transmission and emission scans. In addition, multigrid EM methods have also been applied for image reconstruction in positron emission tomography (PET).$^{23,24}$ Figure 7(A) shows axial PET images of the brain reconstructed using filtered backprojection methods while Fig. 7(B) shows the same cross-sectional images reconstructed using a multigrid EM method.
Fig. 7. (A) Axial PET images of the brain reconstructed using filtered backprojection methods; (B) cross sectional images reconstructed using a multigrid EM method.
7.6 CONCLUDING REMARKS
Image reconstruction is an integral and probably the most important part of medical imaging. By utilizing more information about the imaging geometry and the physics of imaging, the quality of reconstruction can be improved. Furthermore, a priori and model based information can be used with constrained optimization methods for better reconstruction. In this chapter, basic image reconstruction approaches are presented that are based on the Radon transform, Fourier transform, filtered backprojection, iterative ART, and statistical estimation and optimization methods. More details and advanced image reconstruction methods are presented in Chapter 15 of this book.
References
1. Radon J, Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten, Ber Verh Saechs Akad Wiss, Math Phys Kl 69: 262–277, 1917.
2. Cramer H, Wold H, Some theorems on distribution functions, J London Math Soc 11: 290–294, 1936.
3. Renyi A, On projections of probability distributions, Acta Math Acad Sci Budapest 3: 131–141, 1952.
4. Gilbert WM, Projections of probability distributions, Acta Math Acad Sci Budapest 6: 195–198, 1955.
5. Bracewell RN, Strip integration in radio astronomy, Aust J Physics 9: 198–217, 1956.
6. Cormack AM, Representation of a function by its line integrals with some radiological applications, J Appl Phys 34: 2722–2727, 1963.
7. Hounsfield GN, Computerized transverse axial scanning tomography: Part 1, description of the system, Br J Radiol 46: 1016–1022, 1973.
8. Hounsfield GN, A method and apparatus for examination of a body by radiation such as X or gamma radiation, Patent 1283915, The Patent Office, London, England, 1972.
9. Ramachandran GN, Lakshminarayanan AV, Three-dimensional reconstruction from radiographs and electron micrographs, Proc Nat Acad Sci USA 68: 2236–2240, 1971.
10. Deans SR, The Radon Transform and Some of Its Applications, John Wiley & Sons, 1983.
11. Dhawan AP, Medical Image Analysis, John Wiley and Sons, 2003.
12. Herman GT, Image Reconstruction from Projections, Academic Press, 1980.
13. Rosenfeld A, Kak AC, Digital Picture Processing, Vol 1, Academic Press, 1982.
14. Gordon R, A tutorial on ART (Algebraic Reconstruction Techniques), IEEE Trans Nucl Sci 21: 78–93, 1974.
15. Dempster AP, Laird NM, Rubin DB, Maximum likelihood from incomplete data via the EM algorithm, J R Stat Soc Ser B 39: 1–38, 1977.
16. Shepp LA, Vardi Y, Maximum likelihood reconstruction for emission tomography, IEEE Trans Med Imag 1: 113–121, 1982.
17. Fessler JA, Statistical image reconstruction methods for transmission tomography, in Sonka M, Fitzpatrick JM (eds.), Handbook of Medical Imaging, Vol 2, Medical Image Processing and Analysis, SPIE Press, pp. 1–70, 2000.
18. Erdogan H, Fessler JA, Monotonic algorithms for transmission tomography, IEEE Trans Med Imag 18: 801–814, 1999.
19. Yu DF, Fessler JA, Ficaro EP, Maximum likelihood transmission image reconstruction for overlapping transmission beams, IEEE Trans Med Imag 19: 1094–1105, 2000.
20. Lange K, Carson R, EM reconstruction algorithms for emission and transmission tomography, J Comp Asst Tomogr 8: 306–316, 1984.
21. Olinger JM, Maximum likelihood reconstruction of transmission images in emission computed tomography via the EM algorithm, IEEE Trans Med Imag 13: 89–101, 1994.
22. Welch A, Clack R, Natterer F, Gullberg G, Toward accurate attenuation correction in SPECT without transmission measurements, IEEE Trans Med Imag 16: 532–541, 1997.
23. Ranganath MV, Dhawan AP, Mullani N, A multigrid expectation maximization reconstruction algorithm for positron emission tomography, IEEE Trans Med Imag 7: 273–278, 1988.
24. Raheja A, Dhawan AP, Wavelet based multiresolution expectation maximization reconstruction algorithm for emission tomography, Comp Med Imag and Graph 24: 87–98, 2000.
25. Solo V, Purdon P, Weisskoff R, Brown E, A signal estimation approach to functional MRI, IEEE Trans Med Imag 20: 26–35, 2001.
26. Basu S, Bresler Y, O(N³ log N) backprojection algorithm for the 3D Radon transform, IEEE Trans Med Imag 21: 76–88, 2002.
27. Bouman CA, Sauer K, A unified approach to statistical tomography using coordinate descent optimization, IEEE Trans Image Process 5: 480–492, 1996.
28. Erdogan H, Gualtiere G, Fessler JA, Ordered subsets algorithms for transmission tomography, Phys Med Biol 44: 2835–2851, 1999.
29. Mumcuoglu EU, Leahy R, Cherry SR, Zhou Z, Fast gradient-based methods for Bayesian reconstruction of transmission and emission PET images, IEEE Trans Med Imag 13: 687–701, 1994.
30. Green PJ, Bayesian reconstructions from emission tomography data using a modified EM algorithm, IEEE Trans Med Imag 9: 84–93, 1990.
CHAPTER 8
Principles of Image Processing
Methods
Atam P Dhawan
Medical image processing methods including image restoration and
enhancement methods are very useful for effective visual examination
and computerized analysis. Image processing methods enhance features
of interest for better analysis and characterization. Though there have
been more advanced model-based image processing methods investi-
gated and developed recently, this chapter presents the principles of
selected basic image processing methods. Advanced image process-
ing and reconstruction methods are described in other chapters in this
book.
8.1 INTRODUCTION
Medical images are examined through visual inspection by expert
physicians or analyzed through computerized methods for spe-
cific feature extraction, classification, and statistical analysis. In both
of these approaches, image processing operations such as image
restoration (such as smoothing operations for noise removal) and
enhancement for better feature representation, extraction and anal-
ysis are very useful. The principles of some of the most commonly
used basic image processing methods for noise removal, image
smoothing, and feature enhancement are described in this chap-
ter. These methods are usually available in any image processing
software such as MATLAB through the image processing toolbox.
Medical images show characteristic information about the
physiological properties of the structures and tissues. However,
the quality and visibility of information depend on the imaging
modality and the response functions (such as point spread func-
tion) of the imaging scanner. Medical images from specific modal-
ities need to be processed using methods suitable to enhance
the features of interest. For example, a chest X-ray radiographic
image shows the anatomical structure of the chest based on the
total attenuation coefficients. If the radiograph is being examined
for a possible fracture in the ribs, an image enhancement method
is required to improve the visibility of hard bony structure. But, if
an X-ray mammogram is obtained for the examination of potential
breast cancer, an image processing method is required to enhance
visibility of microcalcifications, spiculated masses and soft tis-
sue structures such as parenchyma. A single image enhancement
method may not serve both of these applications. Image enhance-
ment methods for improving the soft tissue contrast in MR brain
images may be entirely different than those used for PET brain
images. Thus, image enhancement tasks and methods are very much application dependent.
Image enhancement methods may also include image restoration methods, which are generally based on minimum mean-squared
error operations, such as Wiener filtering and other constrained
deconvolution methods incorporating some a priori knowledge of
degradation.$^{1-5}$ Since the main objective is to enhance features of
interest, a suitable combination of both restoration and contrast
enhancement algorithms is the integral part of pre-processing in
image analysis. The selection of a specific restoration algorithm for
noise removal is highly dependent on the image acquisition system.
For example, in the filtered-backprojection method for reconstruct-
ing images in computed tomography (CT), the raw data obtained
from the scanner is first deconvolved with a specific filter. Filter
functions such as Hamming window, as described in chapter 7, may
also be used to reduce noise in the projection data. On the other
hand, several image enhancement methods, such as neighborhood
based operations, frequency filtering operations, etc., implicitly de-
emphasize noise for feature enhancement.
Image processing methods are usually performed in one of two domains: (1) the spatial domain or (2) the spectral domain. The image or spatial domain provides a distribution of an image feature, such as brightness, over the spatial grid of samples. The spectral or frequency domain provides spectral information in a transformed domain, such as the one obtained through the Fourier transform. In addition, specific transform based methods such as the Hough transform, neural networks and model-based methods have also been used for image processing operations.$^{1-7}$
8.2 IMAGE PROCESSING IN SPATIAL DOMAIN
Spatial domain methods process an image with pixel-by-pixel
transformation based on the histogram statistics or neighborhood
operations. These methods are usually faster in computer imple-
mentation as compared to frequency filtering methods that require
computation of Fourier transform for frequency domain represen-
tation. However, frequency filtering methods may provide better
results in some applications if a priori information about the char-
acteristic frequency components of the noise and features of inter-
est is available. For example, specific spike based degradations
due to mechanical stress and vibration on the gradient coils in
the raw signal often cause striation artifacts in fast MR imaging
techniques. The spike degradations based noise in the MR signal
can be modeled with their characteristic frequency components
and can be removed by selective filtering and wavelet processing
methods.$^{7}$ Wiener filtering methods have been applied for signal enhancement to remove frequency components related to the undesired resonance effects of the nuclei and for noise suppression in MR imaging.$^{8-10}$
8.2.1 Image Histogram Representation
A histogram of an image provides information about the intensity
distribution of pixels in the image. The simplest form of a histogram
is the plot of occurrence of specific gray-level values of the pixels in
the image. The occurrence of gray levels can be provided in terms of
the absolute values, i.e. the number of times a specific gray-level has
occurred in the image, or the probability values, i.e. the probability
of occurrence of a specific gray-level in the image. In mathematical
terms, a histogram $h(r_i)$ is expressed as:

$$h(r_i) = n_i \quad \text{for } i = 0, 1, \ldots, L-1, \qquad (1)$$

where $r_i$ is the i-th gray level in the image for a total of L gray values and $n_i$ is the number of occurrences of gray level $r_i$ in the image.
If a histogram is expressed in terms of the probability of occurrence of gray levels, it can be expressed as:

$$p(r_i) = \frac{n_i}{n}, \qquad (2)$$
where n is the total number of pixels.
Thus, a histogram is a plot of $h(r_i)$ or $p(r_i)$ versus $r_i$. Figure 1(A) shows an X-ray mammogram image while Fig. 1(B) shows its gray-level histogram.
Fig. 1. (A) An X-ray mammogram image with (B) its histogram.
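For an 8-bit image stored as an unsigned integer array, the histogram of Eqs. 1–2 amounts to counting occurrences of each gray level; a minimal NumPy sketch:

```python
import numpy as np

def gray_level_histogram(image, L=256):
    """Sketch of Eqs. 1-2 for an unsigned integer image with L gray levels."""
    h = np.bincount(image.ravel(), minlength=L)   # h(r_i) = n_i
    p = h / image.size                            # p(r_i) = n_i / n
    return h, p
```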
8.2.2 Histogram Equalization
A popular general-purpose method of image enhancement is histogram equalization. In this method, a monotonically increasing transformation function, T(r), is used to map the original gray values $r_i$ of the input image into new gray values $s_i$ of the output image such that:

$$s_i = T(r_i) = \sum_{j=0}^{i} p_r(r_j) = \sum_{j=0}^{i} \frac{n_j}{n} \quad \text{for } i = 0, 1, \ldots, L-1, \qquad (3)$$

where $p_r(r_i)$ is the probability based histogram of the input image that is transformed into the output image with the histogram $p_s(s_i)$.
The transformation function $T(r_i)$ in Eq. 3 stretches the histogram of the input image such that the gray values occur in the output image with equal probability of occurrence. It should be noted that the uniform distribution of the histogram of the output image is limited by the discrete computation of the gray-level transformation. The histogram equalization method forces image intensity levels to be redistributed with an equal probability of occurrence.

Figure 2 shows the original mammogram image and its histogram equalized image with their respective histograms. Image saturation around the middle of the image can be noticed in the histogram equalized image.
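A minimal sketch of the mapping in Eq. 3 for an 8-bit image is shown below; scaling the cumulative sum by L − 1 and rounding to integers is the usual discrete realization, which is also why the output histogram is only approximately uniform:

```python
import numpy as np

def histogram_equalize(image, L=256):
    """Sketch of histogram equalization (Eq. 3) for an 8-bit image."""
    p = np.bincount(image.ravel(), minlength=L) / image.size
    T = np.cumsum(p)                              # s_i = sum_{j<=i} p_r(r_j)
    s = np.round(T * (L - 1)).astype(np.uint8)    # rescale to gray levels
    return s[image]                               # map every pixel through T
```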
8.2.3 Histogram Modification
The histogram equalization method stretches the contrast of an
image by redistributing the gray values to achieve a uniform dis-
tribution. This general method may not provide good results in
many applications. It can be noted from Fig. 2 that the histogram
equalization method can cause saturation in some regions of the
image resulting in loss of details and high-frequency information
that may be necessary for interpretation. Sometimes, local histogram
Fig. 2. Top left: original X-ray mammogram image; bottom left: histogram of the original image; top right: the histogram equalized image; bottom right: histogram of the equalized image.
equalization is applied separately on predefined local neighborhood regions, such as 7 × 7 pixels, to provide better results.$^{1}$
If a desired distribution of gray values is known a priori, a his-
togram modification method is used to apply a transformation that
changes the gray values to match the desired distribution. The tar-
get distribution can be obtained from a good contrast image that is
obtained under similar imaging conditions. Alternatively, an orig-
inal image from a scanner can be interactively modified through
regional scaling of gray values to achieve the desired contrast. This
image can now provide a target distribution to the rest of the images,
obtained under similar imaging conditions, for automatic enhance-
ment using the histogram modification method.
The conventional scaling method of changing gray values from the range [a, b] to [c, d] can be given by a linear transformation as:

$$z_{\text{new}} = \frac{d - c}{b - a}(z - a) + c, \qquad (4)$$

where z and $z_{\text{new}}$ are, respectively, the original and new gray values of a pixel in the image.
Let us assume that $p_z(z_i)$ is the target histogram, and $p_r(r_i)$ and $p_s(s_i)$ are, respectively, the histograms of the input and output images. A transformation is needed such that the output image $p_s(s_i)$ should have the desired histogram $p_z(z_i)$. The first step in this process is to equalize $p_r(r_i)$ using Eq. 3 such that$^{1,6}$:

$$u_i = T(r_i) = \sum_{j=0}^{i} p_r(r_j) \quad \text{for } i = 0, 1, \ldots, L-1, \qquad (5)$$

where $u_i$ represents the equalized gray values of the input image.
A new transformation V can be defined to equalize the target histogram such that:

$$v_i = V(z_i) = \sum_{k=0}^{i} p_z(z_k) \quad \text{for } i = 0, 1, \ldots, L-1. \qquad (6)$$
Putting $V(z_i) = T(r_i) = u_i$ to achieve the target distribution, new gray values $s_i$ for the output image are computed from the inverse transformation $V^{-1}$ as:

$$s_i = V^{-1}[T(r_i)] = V^{-1}(u_i). \qquad (7)$$

With the transformation defined in Eq. 7, the histogram distribution of the output image $p_s(s_i)$ becomes similar to that of $p_z(z_i)$.
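The two equalizations and the inversion of Eqs. 5–7 can be sketched as below; using a search over the cumulative target distribution is one simple way to realize the inverse transformation $V^{-1}$ in the discrete case:

```python
import numpy as np

def histogram_match(image, p_z, L=256):
    """Sketch of histogram modification (Eqs. 5-7).

    p_z : (L,) desired gray-level probabilities p_z(z_i).
    """
    u = np.cumsum(np.bincount(image.ravel(), minlength=L) / image.size)  # T(r_i), Eq. 5
    v = np.cumsum(np.asarray(p_z, dtype=float))                          # V(z_i), Eq. 6
    # Discrete inverse of Eq. 7: smallest z with V(z) >= u_i for each level.
    s = np.searchsorted(v, u).clip(0, L - 1).astype(np.uint8)
    return s[image]
```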
8.2.4 Image Averaging
Signal averaging is a well known method for enhancing the signal-to-noise ratio. In medical imaging, data from the detector is often averaged over time or space for signal enhancement. However, such
signal enhancement is achieved at the cost of some loss of temporal
or spatial resolution. Sequence images, if properly registered and
acquired in non-dynamic applications, can be averaged for noise
reduction leading to smoothing effects. Selective weighted averag-
ing can also be performed over a specified neighborhood of pixels
in the image.
Let us assume that an ideal image f(x, y) suffers through an additive noise n(x, y). The acquired image g(x, y) can then be represented as:

$$g(x, y) = f(x, y) + n(x, y). \qquad (8)$$

In a general imaging process, the noise is assumed to be uncorrelated and random with a zero average value. If a sequence of K images is acquired for the same object under the same imaging conditions, the average image $\bar{g}(x, y)$ can be obtained as:

$$\bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y), \qquad (9)$$
where $g_i(x, y)$; $i = 1, 2, \ldots, K$ represents the sequence of images to be averaged.

As the number of images K increases, the expected value of the average image $\bar{g}(x, y)$ approaches f(x, y), reducing the noise per pixel in the averaged image as:

$$E\{\bar{g}(x, y)\} = f(x, y), \qquad \sigma_{\bar{g}(x,y)} = \frac{1}{\sqrt{K}}\, \sigma_{n(x,y)}, \qquad (10)$$

where σ represents the standard deviation of the respective random field.
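Assuming K co-registered acquisitions stacked into a single array, Eq. 9 is a one-line operation; the stack layout is an illustrative assumption:

```python
import numpy as np

def average_images(stack):
    """Sketch of Eq. 9: stack is a (K, H, W) array of registered images g_i.
    Per Eq. 10, the noise standard deviation falls roughly as 1/sqrt(K)."""
    return stack.mean(axis=0)
```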
8.2.4.1 Neighborhood Operations
The spatial filtering methods using neighborhood operations
involve the convolution of the input image with a specific mask
(such as Laplacian based high frequency emphasis filtering mask)
to enhance an image. The gray value of each pixel is replaced by the
new value computed according to the mask applied in the neigh-
borhood of the pixel. The neighborhood of a pixel may be defined
in any appropriate manner based on a simple connectedness or any other adaptive criterion.$^{13}$
Let us assume a general weight mask of (2p+1) × (2p+1) pixels, where p can take integer values 1, 2, ..., depending upon the size of the mask. For p = 1, the size of the weight mask is 3 × 3 pixels. A discrete convolution of an image f(x, y) with a spatial filter represented by a weight mask w(x, y) is given by:

$$g(x, y) = \frac{1}{\sum_{x'=-p}^{p} \sum_{y'=-p}^{p} w(x', y')} \sum_{x'=-p}^{p} \sum_{y'=-p}^{p} w(x', y')\, f(x + x', y + y'), \qquad (11)$$
where the convolution is performed for all values of x and y in the
image. In other words, the weight mask of the filter is translated and
convolved over the entire extent of the input image to provide the
output image.
The values of the weight mask are derived from a discrete representation of the selected filter. Based on the filter, the characteristics of the input image are changed in the output image. For example, Fig. 3 shows a weighted averaging mask that can be used for image smoothing and noise reduction. In this mask, the pixels in the 4-connected neighborhood are weighted twice as much as the other pixels, as they are closer to the central pixel. The mask is used with a scaling factor of 1/16 that is multiplied with the values obtained by convolution of the mask with the image (Eq. 11).
Figure 4 shows an X-ray mammogram image smoothed by spa-
tial filtering using the weighted averaging mask shown in Fig. 3.
Some loss of details can be noted in the smoothed image because of
1 2 1
2 4 2
1 2 1

Fig. 3. A weighted averaging mask for image smoothing.
Fig. 4. Left: an original X-ray mammogram image; right: a smoothed image using the weight mask shown in Fig. 3.
the averaging operation. In order to minimize the loss of details, an
adaptive median filtering may be applied.$^{1-4}$
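For illustration, the convolution of Eq. 11 with the mask of Fig. 3 (including its 1/16 scaling factor) can be carried out with SciPy; the border handling mode is an illustrative choice:

```python
import numpy as np
from scipy import ndimage

# Weighted averaging mask of Fig. 3, with its 1/16 scaling factor.
mask = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]], dtype=float) / 16.0

def smooth(image):
    """Sketch of the smoothing convolution of Eq. 11."""
    return ndimage.convolve(image.astype(float), mask, mode='reflect')
```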
8.2.4.2 Median Filter
The median filter is a well known order-statistics filter that replaces the original gray value of a pixel by the median of the gray values of pixels in the specified neighborhood. For example, for a 3 × 3 pixel fixed neighborhood, the gray value of the central pixel f(0, 0) is replaced by the median of the gray values of all nine pixels in the neighborhood. Instead of replacing the gray value of the central pixel by the median operation of the neighborhood pixels, other operations such as midpoint, arithmetic mean, and geometric mean can also be used in order-statistics filtering methods.$^{1-5}$
A median filter operation for a smoothed image $\hat{f}(x, y)$ computed from the acquired image g(x, y) is defined as:

$$\hat{f}(x, y) = \underset{(i,j) \in N}{\mathrm{median}}\; \{g(i, j)\}, \qquad (12)$$
where N is the prespecified neighborhood of the pixel (x, y).
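A sketch of Eq. 12 using SciPy's order-statistics filter, with a 3 × 3 neighborhood as an illustrative choice:

```python
from scipy import ndimage

def median_smooth(g, size=3):
    """Sketch of the median filter of Eq. 12 over a size x size neighborhood."""
    return ndimage.median_filter(g, size=size)
```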
8.2.4.3 Adaptive Arithmetic Mean Filter
Adaptive local noise reduction filtering can be applied using the
variance information of the selected neighborhood and an estimate
of the overall variance of noise in the image. If the noise variance of
the image is similar to the variance of gray values in the specified
neighborhood of pixels, the filter provides an arithmetic mean value
of the neighborhood. Let $\sigma_n^2$ be an estimate of the variance of the noise in the image and $\sigma_s^2$ be the variance of gray values of pixels in the specified neighborhood; an adaptive local noise reduction filter can then be implemented as:

$$\hat{f}(x, y) = g(x, y) - \frac{\sigma_n^2}{\sigma_s^2}\left[g(x, y) - \bar{g}_{ms}(x, y)\right], \qquad (13)$$
where $\bar{g}_{ms}(x, y)$ is the mean of the gray values of pixels in the specified neighborhood. It should be noted that if the noise variance is zero in the image, the resultant image is the same as the input image. If an edge is present in the neighborhood, the local variance would be higher than the noise variance of the image. In such cases, the estimate in Eq. 13 returns a value close to the original gray value of the central pixel.
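A sketch of Eq. 13 is given below; the local mean and variance are estimated with uniform filters over the chosen neighborhood, and clipping the variance ratio at 1 (so that the correction never overshoots the local mean) is an added safeguard, not part of Eq. 13 itself:

```python
import numpy as np
from scipy import ndimage

def adaptive_mean_filter(g, noise_var, size=7):
    """Sketch of the adaptive local noise reduction filter of Eq. 13.

    noise_var : estimate of the global noise variance sigma_n^2.
    """
    g = g.astype(float)
    local_mean = ndimage.uniform_filter(g, size)                        # g_ms
    local_var = ndimage.uniform_filter(g ** 2, size) - local_mean ** 2  # sigma_s^2
    ratio = np.clip(noise_var / np.maximum(local_var, 1e-12), 0.0, 1.0) # safeguard
    return g - ratio * (g - local_mean)
```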
8.2.4.4 Image Sharpening and Edge Enhancement
Edges in an image are basically defined by the change in gray values
of pixels in the neighborhood. The change of gray values of adjacent
pixels in the image can be expressed by a derivative (in continuous
domain) or a difference (in discrete domain) operation.
A first-order derivative operator, such as Sobel, computes the
gradient information in a specific direction. The derivative operator
can be encoded into a weight mask. Figure 5 shows two Sobel weight
-1 -2 -1        -1  0  1
 0  0  0        -2  0  2
 1  2  1        -1  0  1

Fig. 5. Weight masks for the first derivative (Sobel) operator. The mask at the left is for computing the gradient in the x-direction while the mask at the right computes the gradient in the y-direction.
masks that are used, respectively, in computing the first-order gradient in the x- and y-directions (defined by $\partial f(x, y)/\partial x$ and $\partial f(x, y)/\partial y$).
These weight masks of 3 ×3 pixels each are used for convolution to
compute respective gradient images. For spatial image enhancement
based on the first-order gradient information, the resultant gradient
image can simply be added to the original image and rescaled using
the full dynamic range of gray values.
A second-order derivative operator, known as the Laplacian, can be defined as:

$$\nabla^2 f(x, y) = \frac{\partial^2 f(x, y)}{\partial x^2} + \frac{\partial^2 f(x, y)}{\partial y^2} = [f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)], \qquad (14)$$
where $\nabla^2 f(x, y)$ represents the second-order derivative or Laplacian of the image f(x, y).
An image can be sharpened with enhanced edge information by
adding the Laplacian of the image to the original image itself. Such a
mask with Laplacian added to the image is shown in Fig. 6. Figure 7
shows the enhanced version of the original mammographic image
shown in Fig. 4.
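A sketch of this sharpening operation using the combined mask of Fig. 6, which adds the 8-connected Laplacian to the image in a single convolution; the clamping to the 8-bit range is an added assumption:

```python
import numpy as np
from scipy import ndimage

# Combined sharpening mask of Fig. 6: image plus its (8-connected) Laplacian.
sharpen_mask = np.array([[-1, -1, -1],
                         [-1,  9, -1],
                         [-1, -1, -1]], dtype=float)

def sharpen(image):
    """Sketch of Laplacian-based edge enhancement (Eq. 14 added to the image)."""
    out = ndimage.convolve(image.astype(float), sharpen_mask, mode='reflect')
    return np.clip(out, 0, 255)    # clamp to the 8-bit range (assumption)
```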
8.3 FREQUENCY DOMAIN FILTERING
-1 -1 -1
-1  9 -1
-1 -1 -1

Fig. 6. Weight mask for image enhancement through addition of the Laplacian gradient information to the image.

Fig. 7. The original mammogram image on the left, also shown in Fig. 4 (left), with the Laplacian gradient based image enhancement at the right.

Frequency domain filtering methods process an acquired image in the Fourier domain to emphasize or de-emphasize specified frequency components. In general, the frequency components can be
expressed in low and high ranges. The low-frequency components usually represent shapes and blurred structures in the image, while high-frequency information belongs to sharp details, edges and noise. Thus, a low-pass filter with attenuation of high-frequency components provides image smoothing and noise removal. A high-pass filter with attenuation of low-frequency components extracts edges and sharp details for image enhancement and sharpening effects.
8.3.1 Inverse Filtering
As presented in Chapter 2, an acquired image g(x, y) can be expressed as a convolution of the object f(x, y) with a point spread function (PSF) h(x, y) of a linear spatially invariant imaging system with additive noise n(x, y) as:

$$g(x, y) = h(x, y) \otimes f(x, y) + n(x, y). \qquad (17)$$
The Fourier transform of Eq. 17 provides a multiplicative relationship between F(u, v), the Fourier transform of the object, and H(u, v), the Fourier transform of the PSF:

$$G(u, v) = H(u, v)\, F(u, v) + N(u, v), \qquad (18)$$

where u and v represent the frequency domain along the x- and y-directions, and G(u, v) and N(u, v) are, respectively, the Fourier transforms of the acquired image g(x, y) and the noise n(x, y).
The object information in the Fourier domain can be recovered by inverse filtering as:

$$\hat{F}(u, v) = \frac{G(u, v)}{H(u, v)} - \frac{N(u, v)}{H(u, v)}, \qquad (19)$$
where $\hat{F}(u, v)$ is the restored image in the frequency domain.
The inverse filtering operation represented in Eq. 19 provides a basis for image restoration in the frequency domain. The inverse Fourier transform of $\hat{F}(u, v)$ provides the restored image in the spatial domain. The PSF of the imaging system can be experimentally determined or statistically estimated.$^{1}$
8.3.2 Wiener Filtering
The image restoration approach presented in Eq. 19 appears to be
simple but poses a number of challenges in practical implementa-
tion. Besides the difficulties associated with the determination of the
PSF, low values or zeros in H(u, v) cause computational problems. Constrained deconvolution approaches and weighted filtering have been used to avoid the “division by zero” problem in Eq. 19.$^{1-3}$
Wiener filtering is a well known and effective method for image restoration that performs weighted inverse filtering as:

$$\hat{F}(u, v) = \left[\frac{1}{H(u, v)}\right]\left[\frac{|H(u, v)|^2}{|H(u, v)|^2 + \dfrac{S_n(u, v)}{S_f(u, v)}}\right] G(u, v), \qquad (20)$$

where $S_f(u, v)$ and $S_n(u, v)$ are, respectively, the power spectra of the signal and noise.
The Wiener filter, also known as the minimum mean square error filter, reduces to exact inverse filtering if the noise spectrum is zero. In cases of a nonzero signal-to-noise spectrum ratio, the division is appropriately weighted. If the noise can be assumed to be spectrally white, Eq. 20 reduces to a simple parametric filter with a constant K as:

$$\hat{F}(u, v) = \left[\frac{1}{H(u, v)}\right]\left[\frac{|H(u, v)|^2}{|H(u, v)|^2 + K}\right] G(u, v). \qquad (21)$$
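A sketch of the parametric Wiener filter of Eq. 21 in the discrete Fourier domain is shown below; note that $(1/H)\,|H|^2/(|H|^2 + K)$ simplifies to $H^*/(|H|^2 + K)$, which avoids division by small values of H. The PSF layout (same size as the image, origin at index [0, 0]) and the value of K are illustrative assumptions:

```python
import numpy as np

def wiener_restore(g, h_psf, K=0.01):
    """Sketch of the parametric Wiener filter of Eq. 21.

    g     : degraded image.
    h_psf : PSF sampled on the same grid as g, origin at [0, 0].
    K     : constant noise-to-signal spectrum ratio (assumed known).
    """
    G = np.fft.fft2(g)
    H = np.fft.fft2(h_psf)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + K) * G   # H*/(|H|^2 + K), Eq. 21
    return np.real(np.fft.ifft2(F_hat))
```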
In implementing inverse filtering based methods for image restoration, the major issue is the estimation of the PSF and noise spectra. The estimation of the PSF is dependent on the instrumentation and parameters of the imaging modality. For example, in the EPI method of MR imaging, the image formation process can be described in a discrete representation by$^{16}$:

$$g(x, y) = \sum_{x'=0}^{M-1} \sum_{y'=0}^{N-1} f(x', y')\, H(x', y'; x, y), \qquad (22)$$
where g(x, y) is the reconstructed image of M × N pixels, f(x, y) is the ideal image of the object, and H(x', y'; x, y) is the PSF of the image formation process in EPI. The MR signal s(k, l) at a location (k, l) in the k-space for the EPI method can be represented as:
$$s(k, l) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, A(x, y; k, l), \qquad (23)$$

where

$$A(x, y; k, l) = e^{-2\pi j\left((kx/M) + (ly/N) - (\gamma/2\pi)\, B_{x,y}\, t_{k,l}\right)}, \qquad (24)$$
where $B_{x,y}$ is the spatially variant field inhomogeneity and $t_{k,l}$ is the time between the sampling of the k-space location (k, l) and the RF excitation.
With the above representation, the PSF H(x', y'; x, y) can be obtained from the 2D inverse FFT of the function A(x, y; k, l) as:

$$H(x', y'; x, y) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} A(x, y; k, l)\, e^{2\pi j\left((kx'/M) + (ly'/N)\right)} = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} e^{2\pi j\left((k(x'-x)/M) + (l(y'-y)/N) - (\gamma/2\pi)\, B_{x,y}\, t_{k,l}\right)}. \qquad (25)$$
8.4 CONSTRAINED LEAST SQUARE FILTERING
The constrained least square filtering method uses optimization
techniques on a set of equations representing the image formation
process. Eq. 18 can be rewritten in matrix form as:

$$\mathbf{g} = \mathbf{H}\mathbf{f} + \mathbf{n}, \qquad (26)$$

where g is a column vector representing the acquired image g(x, y), f is a column vector of MN × 1 dimension representing the ideal image f(x, y), and n represents the noise vector. The PSF is represented by the matrix H of MN × MN elements.
For image restoration using the above equation, an estimate $\hat{\mathbf{f}}$ needs to be computed such that the mean-square error between the ideal image and the estimated image is minimized. The overall problem may not have a unique solution. Also, small variations in the matrix H may have a significant impact on the noise content of the restored image. To overcome these problems, regularization methods involving constrained optimization techniques are used. Thus, the optimization process is subjected to specific constraints, such as smoothness, to avoid noisy solutions for the vector $\hat{\mathbf{f}}$. The smoothness constraint can be derived from the Laplacian of the estimated image. Using the theory of random variables, the optimization process is defined to estimate $\hat{\mathbf{f}}$ such that the mean square
error $e^2$, given by:

$$e^2 = \mathrm{Trace}\; E\{(\mathbf{f} - \hat{\mathbf{f}})\,\mathbf{f}^t\},$$

is minimized subject to the smoothness constraint involving the minimization of the roughness or Laplacian of the estimated image as

$$\min \{\hat{\mathbf{f}}^t [C]^t [C]\, \hat{\mathbf{f}}\},$$

where [C] is the banded second-difference matrix

$$[C] = \begin{bmatrix} 1 & & & & \\ -2 & 1 & & & \\ 1 & -2 & 1 & & \\ & 1 & -2 & \ddots & \\ & & 1 & \ddots & \\ & & & \ddots & -2 \\ & & & & 1 \end{bmatrix}. \qquad (27)$$
It can be shown that the estimated image $\hat{\mathbf{f}}$ can be expressed as$^{4}$:

$$\hat{\mathbf{f}} = \left([H]^t[H] + \frac{1}{\lambda}[C]^t[C]\right)^{-1} [H]^t\, \mathbf{g}, \qquad (28)$$

where λ is a Lagrange multiplier.
8.4.1 Low-Pass Filtering
The ideal low-pass filter suppresses noise and high-frequency
information providing a smoothing effect to the image. A two-
dimensional low-pass filter function H(u, v) is multiplied with the
Fourier transform G(u, v) of the image to provide a smoothed
image as:
$$\hat{F}(u, v) = H(u, v)\, G(u, v), \qquad (29)$$

where $\hat{F}(u, v)$ is the Fourier transform of the filtered image $\hat{f}(x, y)$, which can be obtained by taking an inverse Fourier transform.
An ideal low-pass filter can be designed by assigning a frequency cut-off value $\omega_0$. The frequency cut-off value can also be expressed as a distance $D_0$ from the origin in the Fourier (frequency) domain:

$$H(u, v) = \begin{cases} 1 & \text{if } D(u, v) \le D_0 \\ 0 & \text{otherwise} \end{cases}, \qquad (30)$$
where D(u, v) is the distance of a point in the Fourier domain from
the origin representing the dc value.
An ideal low-pass filter has sharp cut-off characteristics in the
Fourier domain causing a rectangular window for the pass band.
From Chapter 2, it can be shown that a rectangular function in the
frequency domain provides a sinc function in the spatial domain.
Also, the multiplicative relationship of the filter model in Eq. 29
leads to a convolution operation in the spatial domain. The rectan-
gular pass-band window in the ideal low-pass filter causes ringing
artifacts in the spatial domain. To reduce ringing artifacts the pass
band should have a smooth fall-off characteristic. A Butterworth
low-pass filter of nth order can be used to provide smoother fall-off
characteristics and is defined as:
$$H(u, v) = \frac{1}{1 + [D(u, v)/D_0]^{2n}}. \qquad (31)$$
As the order n increases, the fall off characteristics of the pass band
become sharper. Thus, a first-order Butterworth filter provides the
least amount of ringing artifacts in the filtered image.
A Gaussian function is also commonly used for low-pass filtering to provide smoother fall-off characteristics of the pass band and is defined by:

$$H(u, v) = e^{-D^2(u, v)/2\sigma^2}, \qquad (32)$$

where D(u, v) is the distance from the origin in the frequency domain and σ represents the standard deviation of the Gaussian function, which can be set to the cut-off distance $D_0$ in the frequency domain.
In this case, the gain of the filter is down to 0.607 of its maximum value at the cut-off frequency. Figure 8 shows a CT axial image of the chest cavity with its Fourier transform. The image was processed with a low-pass filter with the frequency response shown in the middle column of Fig. 8. The resultant low-pass filtered image with its Fourier transform is shown in the right column.
Fig. 8. Left column: the original CT image with its Fourier transform; middle column: frequency response of the desired and
actual low-pass filter; right column: the resultant low-pass filtered image with its Fourier transform.
It can be seen that low-frequency information is preserved while some of the high-frequency information is removed from the filtered image. The filtered image appears smoother.
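The three low-pass designs of Eqs. 30–32 differ only in the window H(u, v) applied to the spectrum; a sketch covering all three follows, with the distance grid measured in discrete frequency samples as an illustrative convention:

```python
import numpy as np

def lowpass_filter(g, D0=30.0, kind='butterworth', n=1):
    """Sketch of frequency-domain low-pass filtering (Eqs. 29-32)."""
    rows, cols = g.shape
    u = np.fft.fftfreq(rows)[:, None] * rows     # frequency indices, row axis
    v = np.fft.fftfreq(cols)[None, :] * cols     # frequency indices, column axis
    D = np.sqrt(u ** 2 + v ** 2)                 # D(u, v): distance from dc
    if kind == 'ideal':
        H = (D <= D0).astype(float)              # Eq. 30
    elif kind == 'butterworth':
        H = 1.0 / (1.0 + (D / D0) ** (2 * n))    # Eq. 31
    else:
        H = np.exp(-D ** 2 / (2.0 * D0 ** 2))    # Gaussian, Eq. 32
    return np.real(np.fft.ifft2(np.fft.fft2(g) * H))   # Eq. 29
```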
8.4.2 High-Pass Filtering
High-pass filtering is used for image sharpening and the extraction of high-frequency information such as edges. The low-frequency information is attenuated or blocked depending on the design of the filter. An ideal high-pass filter has a rectangular window function for the high-frequency pass band. Since the noise in the image usually carries high-frequency components, high-pass filtering also shows the noise along with the edge information. An ideal 2D high-pass filter with a cut-off frequency at a distance $D_0$ from the origin in the frequency domain is defined as:

$$H(u, v) = \begin{cases} 1 & \text{if } D(u, v) \ge D_0 \\ 0 & \text{otherwise} \end{cases}. \qquad (33)$$
As described above for the ideal low-pass filter, the sharp cut-off characteristic of the rectangular window function in the frequency domain, as defined in Eq. 33, causes ringing artifacts in the filtered image in the spatial domain. To avoid ringing artifacts, filter functions with smoother fall-off characteristics, such as Butterworth and Gaussian, are used. A Butterworth high-pass filter of n-th order is defined in the frequency domain as:

$$H(u, v) = \frac{1}{1 + [D_0/D(u, v)]^{2n}}. \qquad (34)$$
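On the same distance grid used in the low-pass sketch above, the Butterworth high-pass response of Eq. 34 is obtained by inverting the distance ratio; the small-value guard at the dc point is an implementation detail, not part of Eq. 34:

```python
import numpy as np

def butterworth_highpass(D, D0=30.0, n=1):
    """Sketch of the Butterworth high-pass response of Eq. 34.

    D : array of distances D(u, v) from the frequency-domain origin.
    """
    D_safe = np.maximum(D, 1e-12)    # avoid division by zero at the dc point
    return 1.0 / (1.0 + (D0 / D_safe) ** (2 * n))
```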
Figure 9 shows a CT axial image of the chest cavity with its
Fourier transform. The image was processed with a high-pass filter
with the frequency response shown in the middle column of Fig. 9.
The resultant high-pass filtered image with its Fourier transform
is shown in the right column. It can be seen that the low-frequency
information is attenuated or de-emphasized in the high-pass filtered
image. High-frequency information belonging to the edges can be
seen in the filtered image.
Fig. 9. Left column: the original CT image with its Fourier transform; middle column: frequency response of the desired and
actual high-pass filter; right column: the resultant high-pass filtered image with its Fourier transform.
8.5 CONCLUDING REMARKS
Image processing operations such as noise removal, averaging,
filtering and feature enhancement are critically important in com-
puterized image analysis for feature characterization, analysis
and classification. These operations are also important to help
visual examination and diagnostic evaluation for medical applica-
tions. Though the basic image processing operations as described
in this chapter are quite efficient and effective, more sophis-
ticated model-based methods have been developed for image-
specific feature enhancement operations. These methods utilize
a priori information about the statistical distribution of gray-level
features in the context of a specific application. Such methods
are useful in enhancing the signal-to-noise ratio of the acquired
image for better analysis and classification of medical images.
Some of the recently developed image processing methods are
described in various chapters in the second and third part of
this book.
References
1. Jain AK, Fundamentals of Digital Image Processing, Prentice Hall, 1989.
2. Gonzalez RC, Woods RE, Digital Image Processing, Prentice Hall, 2002.
3. Jain R, Kasturi R, Schunck BG, Machine Vision, McGraw-Hill, 1995.
4. Rosenfeld A, Kak AC, Digital Picture Processing, Vols 1 & 2, 2nd edn., Academic Press, 1982.
5. Russ JC, The Image Processing Handbook, 2nd edn., CRC Press, 1995.
6. Schalkoff RJ, Digital Image Processing and Computer Vision, John Wiley & Sons, 1989.
7. Kao YH, MacFall JR, Correction of MR k-space data corrupted by spike noise, IEEE Trans Med Imag 19: 671–680, 2000.
8. Ahmed OA, Fahmy MM, NMR signal enhancement via a new time-frequency transform, IEEE Trans Med Imag 20: 1018–1025, 2001.
9. Goutte C, Nielsen FA, Hansen LK, Modeling of hemodynamic response in fMRI using smooth FIR filters, IEEE Trans Med Imag 19: 1188–1201, 2000.
10. Zaroubi S, Goelman G, Complex denoising of MR data via wavelet analysis: Applications for functional MRI, Mag Reson Imag 18: 59–68, 2000.
11. Davis GW, Wallenslager ST, Improvement of chest region CT images through automated gray-level remapping, IEEE Trans Med Imag 5: 30–35, 1986.
12. Pizer SM, Zimmerman JB, Staab EV, Adaptive gray-level assignment in CT scan display, J Comput Assist Tomog 8: 300–306, 1984.
13. Dhawan AP, LeRoyer E, Mammographic feature enhancement by computerized image processing, Comp Methods & Programs in Biomed 27: 23–29, 1988.
14. Kim JK, Park JM, Song KS, Park HW, Adaptive mammographic image enhancement using first derivative and local statistics, IEEE Trans Med Imag 16: 495–502, 1997.
15. Chen G, Avram H, Kaufman L, Hale J, et al., T2 restoration and noise suppression of hybrid MR images using Wiener and linear prediction techniques, IEEE Trans Med Imag 13: 667–676, 1994.
16. Munger P, Crelier GR, Peters TM, Pike GB, An inverse problem approach to the correction of distortion in EPI images, IEEE Trans Med Imag 19: 681–689, 2000.
17. Dhawan AP, Medical Image Analysis, Wiley Interscience, John Wiley and Sons, Hoboken, NJ, 2003.
CHAPTER 9
Image Segmentation and Feature
Extraction
Atam P Dhawan
Medical image segmentation tasks are important to visualize features of interest, such as lesions, with boundary and volume information. Similar information is required in computerized quantitative analysis and classification for diagnostic evaluation and characterization. This chapter presents some of the most effective and commonly used edge and region segmentation methods. Statistical quantitative features from the gray level distribution, segmented regions, and texture in the image are also presented.
9.1 INTRODUCTION
After an image is processed for noise removal, restoration and feature enhancement as needed, it is important to analyze the image for the extraction of features of interest involving edges, regions, texture, etc. This goal is accomplished by the image segmentation task. Image segmentation refers to the process of partitioning
an image into distinct regions by grouping together neighborhood
pixels based on a predefined similarity criterion. The similarity cri-
terion can be determined using specific properties or features of
pixels representing objects in the image. Thus, image segmentation
can also be considered as a pixel classification technique that allows
an edge or region based representation towards the formation of
regions of similarities in the image. Once the regions are defined, sta-
tistical and other features can be computed to represent regions for
characterization, analysis and classification. This chapter describes
major image segmentation methods for medical image analysis and
classification.
9.2 EDGE-BASED IMAGE SEGMENTATION
Edge-based approaches use spatial filtering methods to compute the first-order or second-order gradient information of the image. There are a number of gradient operators that can be used for edge-based segmentation. These operators include Roberts, Sobel, Laplacian, Canny and others.$^{1-5}$ Some involve directional derivative masks that are used to compute gradient information. The Laplacian mask can be used to compute second-order gradient information of the image. For segmentation purposes, after edges are extracted, an edge linking algorithm is applied to form closed regions.$^{1-3}$ Gradient infor-
Gradient infor-
mation of the image can be used to track and link relevant edges.
This step is usually very tedious for it needs to deal with the noise
and irregularities in the gradient information.
9.2.1 Edge Detection Operations
The gradient magnitude and directional information from the Sobel horizontal and vertical direction masks can be obtained by convolving the respective $G_x$ and $G_y$ masks with the image as$^{1,2}$:

$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}, \qquad (1)$$

$$M = \sqrt{G_x^2 + G_y^2} \approx |G_x| + |G_y|,$$
where M represents the magnitude of the gradient that can be
approximated as the sum of the absolute values of the horizontal
and vertical gradient images obtained by convolving the image with the horizontal and vertical masks, $G_x$ and $G_y$.
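A sketch of Eq. 1 with the two Sobel masks; the border handling and the use of the absolute-value approximation of the magnitude are illustrative choices:

```python
import numpy as np
from scipy import ndimage

Gx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
Gy = np.array([[ 1, 2, 1], [ 0, 0, 0], [-1, -2, -1]], dtype=float)

def sobel_magnitude(image):
    """Sketch of the Sobel gradient magnitude of Eq. 1."""
    gx = ndimage.convolve(image.astype(float), Gx, mode='reflect')
    gy = ndimage.convolve(image.astype(float), Gy, mode='reflect')
    return np.abs(gx) + np.abs(gy)     # |G_x| + |G_y| approximation
```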
The second order gradient operator, the Laplacian, can be computed by convolving one of the following masks, $G_{L(4)}$ and $G_{L(8)}$, which, respectively, use a 4- and 8-connected neighborhood:

$$G_{L(4)} = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{bmatrix} \quad \text{or} \quad G_{L(8)} = \begin{bmatrix} -1 & -1 & -1 \\ -1 & 8 & -1 \\ -1 & -1 & -1 \end{bmatrix}. \qquad (2)$$
The second-order derivative Laplacian is very sensitive to noise, as can be seen from the distribution of weights in the masks in Eq. 2. The Laplacian mask provides a non-zero output even for single-pixel speckle noise in the image. Therefore, it is usually beneficial to apply a smoothing filter before taking the Laplacian of the image. The image can be smoothed using Gaussian weighted spatial averaging as the first step. The second step then uses a Laplacian mask to determine edge information. Marr and Hildreth^{3} combined these two steps into a single Laplacian of Gaussian function as:

h(x, y) = \nabla^2 [g(x, y) \otimes f(x, y)] = \nabla^2 [g(x, y)] \otimes f(x, y),    (3)

where \nabla^2 [g(x, y)] is the Laplacian of the Gaussian function that is used for spatial averaging and is commonly expressed as the Mexican Hat operator:

\nabla^2 [g(x, y)] = \left( \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \right) e^{-(x^2 + y^2)/2\sigma^2},    (4)

where \sigma^2 is the variance of the Gaussian function.
A Laplacian of Gaussian (LOG) mask for computing the second-order gradient information of the smoothed image can be computed from Eq. 4. With \sigma = 2, the LOG mask G_{LOG} of 5 × 5 pixels is given by:

G_{LOG} = \begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & -1 & -2 & -1 & 0 \\ -1 & -2 & 16 & -2 & -1 \\ 0 & -1 & -2 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix}.    (5)
The image obtained by convolving the LOG mask with the original image is analyzed for zero crossings to detect edges, since the output image contains values ranging from negative to positive. One simple method to detect zero crossings is to threshold the output image at zero. This operation provides a new binary image such that a "0" gray value is assigned to the binary image if the output image has a negative or zero value for the corresponding pixel. Otherwise, a high gray value (such as "255" for an 8-bit image) is assigned to the binary image. The zero crossings of the output image can now be easily determined by tracking the pixels with a transition from black ("0" gray value) to white ("255" gray value).
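A minimal sketch of this two-step procedure follows, assuming NumPy and SciPy are available. The call scipy.ndimage.gaussian_laplace combines the Gaussian smoothing and the Laplacian of Eq. 3; its sign convention follows the analytical LOG rather than the mask of Eq. 5, which does not affect the location of the zero crossings.

import numpy as np
from scipy import ndimage

def log_zero_crossings(f, sigma=2.0):
    # LOG-filtered image (Eqs. 3-4)
    log = ndimage.gaussian_laplace(f.astype(float), sigma=sigma)
    # threshold at zero: True ("255") for positive values, False ("0") otherwise
    binary = log > 0
    # a zero crossing exists wherever a pixel and a 4-neighbor differ
    edges = np.zeros_like(binary)
    edges[:-1, :] |= binary[:-1, :] != binary[1:, :]
    edges[:, :-1] |= binary[:, :-1] != binary[:, 1:]
    return edges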
9.2.1.1 Boundary Tracking
Edge detection operations are usually followed by edge-linking procedures to assemble meaningful edges into closed regions. Edge-linking procedures are based on a pixel-by-pixel search to find connectivity among the edge segments. The connectivity can be defined using a similarity criterion among edge pixels. In addition, geometrical proximity or topographical properties are used to improve edge-linking operations for pixels that are affected by noise, artifacts or geometrical occlusion. Estimation methods based on probabilistic approaches, graphs and rule-based methods for model-based segmentation have also been used.^{4–12}
In the neighborhood search methods, the simplest approach is to follow the edge detection operation with a boundary-tracking algorithm. Let us assume that the edge detection operation produces an edge magnitude e(x, y) and an edge orientation \phi(x, y). The edge orientation information can be obtained directly from the directional masks, as described in Chapter 6, or computed from the horizontal and vertical gradient masks. Let us start with a list of edge pixels that can be selected from scanning the gradient image obtained from the edge detection operation. Assuming the first edge pixel is a boundary pixel b_j, a successor boundary pixel b_{j+1} can be found in the 4- or 8-connected neighborhood if the following conditions are satisfied:

|e(b_j)| > T_1
|e(b_{j+1})| > T_1
|e(b_j) - e(b_{j+1})| < T_2
|\phi(b_j) - \phi(b_{j+1})| \bmod 2\pi < T_3,    (6)

where T_1, T_2, and T_3 are pre-determined thresholds.
If more than one neighboring pixel satisfies these conditions, the pixel that minimizes the differences is selected as the next boundary pixel. The algorithm is applied recursively until all neighbors have been searched. If no neighbor satisfying these conditions is found, the boundary search for the starting edge pixel is stopped and a new edge pixel is selected. It can be noted that such a boundary-tracking algorithm may leave many edge pixels and partial boundaries unconnected. Some a priori knowledge about the object boundaries is often needed to form regions with closed boundaries. Relational tree structures or graphs can also be used to help the formation of closed regions.^{13,14}
A graph-based search method attempts to find paths between the start and end nodes that minimize a cost function, which may be established based on distance and transition probabilities. The start and end nodes are determined from scanning the edge pixels based on some heuristic criterion. For example, an initial search may label the first edge pixel in the image as the start node and all the other edge pixels in the image, or in a part of the image, as potential end nodes. Among several graph-based search algorithms, the A* algorithm is widely used.^{13–15}
9.3 PIXEL-BASED DIRECT CLASSIFICATION METHODS
The pixel-based direct classification methods use histogram statistics to define single or multiple thresholds to classify an image pixel by pixel. The threshold for classifying pixels into classes is obtained from the analysis of the histogram of the image. A simple approach is to examine the histogram for a bimodal distribution. If the histogram is bimodal, the threshold can be set to the gray value corresponding to the deepest point in the histogram valley. If not, the image can be partitioned into two or more regions using some heuristics about the properties of the image. The histogram of each partition can then be used for determining thresholds. By comparing the gray value of each pixel to the selected threshold, a pixel can be classified into one of the two classes.

Let us assume that an image, or a part of the image, has a bimodal histogram of gray values. The image f(x, y) can be segmented into two classes using a gray value threshold T such that:

g(x, y) = \begin{cases} 1 & \text{if } f(x, y) > T \\ 0 & \text{if } f(x, y) \le T \end{cases}    (7)

where g(x, y) is the segmented image with two classes of binary gray values "1" and "0", and T is the threshold selected at the valley point of the histogram. A simple approach to determining the gray value threshold T is to analyze the histogram for peak values and then find the deepest valley point between the two consecutive major peaks.
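A minimal Python sketch of this valley-seeking threshold selection follows, assuming NumPy and a roughly bimodal histogram; the bin count and smoothing width are illustrative choices.

import numpy as np

def valley_threshold(f, bins=256):
    # pick T at the deepest valley between the two largest histogram peaks
    hist, edges = np.histogram(f, bins=bins)
    h = np.convolve(hist, np.ones(5) / 5.0, mode="same")   # light smoothing
    peaks = [i for i in range(1, bins - 1)
             if h[i] >= h[i - 1] and h[i] >= h[i + 1]]
    p1, p2 = sorted(sorted(peaks, key=lambda i: h[i])[-2:])  # two major peaks
    valley = p1 + int(np.argmin(h[p1:p2 + 1]))
    return edges[valley]

# the segmented image of Eq. 7 is then simply: g = (f > T).astype(np.uint8)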
9.3.1 Optimal Global Thresholding
To determine an optimal global gray value threshold for image segmentation, parametric distribution-based methods can be applied to the histogram of an image.^{1,2,5,15} Let us assume that the histogram of an image to be segmented has two Gaussian distributions belonging to two respective classes, such as background and object. Thus, the histogram can be represented by a mixture probability density function p(z) as:

p(z) = P_1 p_1(z) + P_2 p_2(z),    (8)

where p_1(z) and p_2(z) are the Gaussian distributions of classes 1 and 2, respectively, with class probabilities P_1 and P_2 such that:

P_1 + P_2 = 1.    (9)
Using a gray value threshold T, a pixel in the image f(x, y) can be classified to class 1 or class 2 in the segmented image g(x, y) as:

g(x, y) = \begin{cases} \text{Class 1} & \text{if } f(x, y) > T \\ \text{Class 2} & \text{if } f(x, y) \le T \end{cases}.    (10)
Let us define the error probabilities of misclassifying a pixel as:

E_1(T) = \int_{T}^{\infty} p_2(z)\,dz \quad \text{and} \quad E_2(T) = \int_{-\infty}^{T} p_1(z)\,dz,    (11)

where E_1(T) and E_2(T) are, respectively, the probability of erroneously classifying a class 2 pixel to class 1 and a class 1 pixel to class 2 under the rule of Eq. 10.

The overall probability of error in pixel classification using the threshold T is then expressed as:

E(T) = P_2 E_1(T) + P_1 E_2(T).    (12)
For image segmentation, the objective is to find an optimal threshold T that minimizes the overall probability of error in pixel classification. The optimization process requires parameterization of the probability density distributions and the likelihood of both classes. These parameters can be determined from a model or a set of training images.^{1,2,15,19,24}
Let us assume \sigma_i and \mu_i to be the standard deviation and mean of the Gaussian probability density function of class i (i = 1, 2 for two classes) such that:

p(z) = \frac{P_1}{\sqrt{2\pi}\,\sigma_1} e^{-(z-\mu_1)^2/2\sigma_1^2} + \frac{P_2}{\sqrt{2\pi}\,\sigma_2} e^{-(z-\mu_2)^2/2\sigma_2^2}.    (13)
The optimal global threshold T can be determined by finding a general solution that minimizes Eq. 12 with the mixture distribution in Eq. 13, and thus satisfies the following quadratic expression^{2}:

AT^2 + BT + C = 0,

where

A = \sigma_1^2 - \sigma_2^2
B = 2(\mu_1 \sigma_2^2 - \mu_2 \sigma_1^2)
C = \sigma_1^2 \mu_2^2 - \sigma_2^2 \mu_1^2 + 2 \sigma_1^2 \sigma_2^2 \ln(\sigma_2 P_1 / \sigma_1 P_2).    (14)
If the variances of both classes can be assumed to be equal to \sigma^2, the optimal threshold T can be determined as:

T = \frac{\mu_1 + \mu_2}{2} + \frac{\sigma^2}{\mu_1 - \mu_2} \ln\left(\frac{P_2}{P_1}\right).    (15)
It should be noted that, in the case of equal class likelihoods, the above expression for determining the optimal threshold simply reduces to the average of the mean values of the two classes. Figure 1 shows the results of the optimal thresholding method applied to a T-2 weighted MR brain image. It can be seen that such a segmentation method is quite effective in determining the intracranial volume.

Fig. 1. Segmentation of a T-2 weighted MR brain image (shown at the left) using the optimal thresholding method at T = 54, yielding the binary segmented image shown at the right.
9.3.2 Pixel Classification Through Clustering
In the histogram-based pixel classification method for image segmentation, the gray values are partitioned into two or more clusters, depending on the peaks in the histogram, to obtain thresholds. The basic concept of segmentation by pixel classification can be extended to clustering the gray values or feature vectors of pixels in the image. This approach is particularly useful when images whose pixels represent a feature vector consisting of multiple parameters of interest are to be segmented. For example, a feature vector may consist of gray value, contrast and local texture measures for each pixel in the image. A color image may have additional color components in a specific representation, such as the red, green and blue components in the R-G-B color coordinate system, that can be added to the feature vector. Magnetic resonance (MR) or multimodality medical images may also require segmentation using a multidimensional feature space with multiple parameters of interest.

Images can be segmented by pixel classification through clustering of all features of interest. The number of clusters in the multidimensional feature space thus represents the number of classes in the image. As the image is classified into cluster classes, segmented regions are obtained by checking the neighborhood pixels for the same class label. However, clustering may produce disjoint regions with holes, or regions consisting of a single pixel. After the image data is clustered and pixels are classified, a post-processing algorithm such as region growing, pixel connectivity or a rule-based algorithm is usually applied to obtain the final segmented regions.^{21,37} A number of clustering algorithms have been developed in the literature and used for a wide range of applications.^{15,20,21,36–41}
Clustering is the process of grouping data points with similar feature vectors into a single cluster, while data points with dissimilar feature vectors are placed in different clusters. Thus, the data points that are close to each other in the feature space are clustered together. The similarity of feature vectors can be represented by an appropriate distance measure such as the Euclidean or Mahalanobis distance.^{42} Each cluster is represented by its mean (centroid) and variance (spread) associated with the distribution of the corresponding feature vectors of the data points in the cluster. The formation of clusters is optimized with respect to an objective function involving prespecified distance and similarity measures, along with additional constraints such as smoothness.
9.3.2.1 k-Means Clustering
The k-means clustering method is a popular approach to partitioning d-dimensional data into k clusters such that an objective function, providing the desired properties of the distribution of feature vectors of clusters in terms of similarity and distance measures, is optimized. A generalized k-means clustering algorithm initially places k clusters at arbitrarily selected cluster centroids v_i, i = 1, 2, . . . , k, and modifies the centroids for the formation of new cluster shapes, optimizing the objective function. The k-means clustering algorithm includes the following steps:

(1) Select the number of clusters k with initial cluster centroids v_i, i = 1, 2, . . . , k.
(2) Partition the input data points into k clusters by assigning each data point x_j to the closest cluster centroid v_i using the selected distance measure, e.g. the Euclidean distance, defined as:

d_{ij} = \| x_j - v_i \|,    (16)

where X = \{x_1, x_2, . . . , x_n\} is the input data set.
(3) Compute a cluster assignment matrix U representing the partition of the data points, with the binary membership value of the j-th data point to the i-th cluster, such that U = [u_{ij}], where

u_{ij} \in \{0, 1\} for all i, j
\sum_{i=1}^{k} u_{ij} = 1 for all j, and 0 < \sum_{j=1}^{n} u_{ij} < n for all i.    (17)
(4) Recompute the centroids using the membership values as:

v_i = \frac{\sum_{j=1}^{n} u_{ij} x_j}{\sum_{j=1}^{n} u_{ij}} for all i.    (18)

(5) If the cluster centroids or the assignment matrix does not change from the previous iteration, stop; otherwise go to step 2.
The k-means clustering method optimizes the sum-of-squared-error-based objective function J_w(U, v) such that:

J_w(U, v) = \sum_{i=1}^{k} \sum_{j=1}^{n} \| x_j - v_i \|^2.    (19)
It can be noted from the above algorithm that the k-means clustering method is quite sensitive to the initial cluster assignment and the choice of the distance measure. Additional criteria, such as within-cluster and between-cluster variances, can be included in the objective function as constraints to force the algorithm to adapt the number of clusters k (as needed for optimization of the objective function).
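The five steps above map directly onto a few lines of NumPy. A minimal sketch follows; initialization from randomly chosen data points is one common, illustrative choice.

import numpy as np

def k_means(X, k, n_iter=100, seed=0):
    # plain k-means on an (n, d) data matrix X; returns centroids and labels
    rng = np.random.default_rng(seed)
    v = X[rng.choice(len(X), size=k, replace=False)]   # step 1: initial centroids
    for _ in range(n_iter):
        # step 2: assign each x_j to the closest centroid (Euclidean, Eq. 16)
        d = np.linalg.norm(X[:, None, :] - v[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # step 4: recompute centroids from the hard memberships (Eq. 18)
        new_v = np.array([X[labels == i].mean(axis=0) if np.any(labels == i) else v[i]
                          for i in range(k)])
        if np.allclose(new_v, v):       # step 5: stop when centroids are stable
            break
        v = new_v
    return v, labels

For gray-level segmentation, X can simply be the image reshaped into a column of pixel values, e.g. X = f.reshape(-1, 1).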
9.3.2.2 Fuzzy c-Means Clustering
The k-means clustering method utilizes hard binary values for the membership of a data point in a cluster. The fuzzy c-means clustering method utilizes an adaptable membership value that can be updated based on the distribution statistics of the data points assigned to the cluster, minimizing the following objective function J_m(U, v):

J_m(U, v) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m d_{ij}^2 = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^m \| x_j - v_i \|^2,    (20)
where c is the number of clusters, n is the number of data vectors, u_{ij} is the fuzzy membership and m is the fuzziness index. Based on the constraints defined on the distribution statistics of the data points in the clusters, the fuzziness index can be defined between 1 and a very large value for the highest level of fuzziness (maximum allowable variance within a cluster). The membership values in the fuzzy c-means algorithm can be defined as^{36}:

0 \le u_{ij} \le 1 for all i, j
\sum_{i=1}^{c} u_{ij} = 1 for all j, and 0 < \sum_{j=1}^{n} u_{ij} < n for all i.    (21)
The algorithm described for k-means clustering can be used for fuzzy c-means clustering with the update of the fuzzy membership values as defined in Eq. 21, minimizing the objective function as defined in Eq. 20.
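A minimal NumPy sketch of this alternation between the membership update and the fuzzy centroid computation follows; the random initialization and the small constant added to the distances (to avoid division by zero) are illustrative choices.

import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    # fuzzy c-means on an (n, d) data matrix X; returns centroids and memberships
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                    # memberships sum to 1 per point (Eq. 21)
    for _ in range(n_iter):
        Um = U ** m
        v = (Um @ X) / Um.sum(axis=1, keepdims=True)       # fuzzy centroids
        d = np.linalg.norm(X[None, :, :] - v[:, None, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2.0 / (m - 1)))  # standard membership update minimizing Eq. 20
        U /= U.sum(axis=0)
    return v, U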
Figure 2 shows the results of k-means clustering on a T-2 weighted MR brain image with k = 9. Different regions segmented from selected clusters are shown in Fig. 2.

Fig. 2(A). A T-2 weighted MR brain image used for segmentation in Fig. 2(B).

Fig. 2(B). Results of segmentation of the image shown in Fig. 2(A) using the k-means clustering algorithm with k = 9; top left: all segmented regions belonging to all 9 clusters; top middle: regions segmented from cluster k = 1; top right: regions segmented from cluster k = 4; bottom left: regions segmented from cluster k = 5; bottom middle: regions segmented from cluster k = 6; bottom right: regions segmented from cluster k = 9. (Courtesy Don Adams, Arwa Gheith and Valerie Rafalko from their class project.)
9.4 REGION-BASED SEGMENTATION
Region-growing based segmentation algorithms examine pixels in
the neighborhood based on a predefined similarity criterion and
then assign pixels into groups to form regions. The neighborhood
pixels with similar properties are merged to form closed regions
for segmentation. The region growing approach can be extended to
merging regions instead of merging pixels to form larger meaning-
ful regions of similar properties. Such a region merging approach is
quite effective when the original image is segmented into a large
number of regions in the preprocessing phase. Large meaning-
ful regions may provide a better correspondence and matching to
the object models for recognition and interpretation. An alternate
approach is region splitting in which either the entire image or large
regions are split into two or more regions based on a heterogeneity or dissimilarity criterion. For example, if a region has a bimodal distribution of its gray value histogram, it can be split into two regions of connected pixels with gray values falling in their respective distributions. The basic difference between the region-based and thresholding-based segmentation approaches is that region-growing methods guarantee that segmented regions consist of connected pixels. On the other hand, pixel thresholding-based segmentation methods, as defined in the previous section, may yield regions with holes and disconnected pixels.
9.4.1 Region-growing
Region-growing methods merge pixels of similar properties by examining the neighborhood pixels. The process of merging pixels continues, with the growing region adapting a new shape and size, until there is an insufficient number of neighborhood pixels to be added to the current region. Thus, the region-growing process requires a similarity criterion that defines the basis for including pixels in the growth of the region, and a stopping criterion that stops the growth. The stopping criterion is usually based on the minimum number or percentage of neighborhood pixels required to satisfy the similarity criterion for inclusion in the growth of the region.
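A minimal Python sketch of such a procedure follows, assuming NumPy; here the similarity criterion is an illustrative fixed tolerance around the running region mean, and growth stops when no 4-neighbor satisfies it.

import numpy as np
from collections import deque

def region_grow(f, seed, tol):
    # grow a region from a seed while 4-neighbors stay within tol of the region mean
    region = np.zeros(f.shape, dtype=bool)
    region[seed] = True
    mean, count = float(f[seed]), 1
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < f.shape[0] and 0 <= nx < f.shape[1]
                    and not region[ny, nx] and abs(float(f[ny, nx]) - mean) <= tol):
                region[ny, nx] = True
                frontier.append((ny, nx))
                count += 1
                mean += (float(f[ny, nx]) - mean) / count   # running mean of the region
    return region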
In the region-merging algorithms, an image may first be partitioned into a large number of potentially homogeneous regions. For example, an image of 1024 × 1024 pixels can be partitioned into regions of 8 × 8 pixels. Each region of 8 × 8 pixels can then be examined for homogeneity of a predefined property such as gray values, contrast, texture, etc. If the histogram of the predefined property for the region is unimodal, the region is said to be homogeneous. Two neighborhood regions can be merged if they are homogeneous and satisfy a predefined similarity criterion. The similarity criterion imposes constraints on the value of the property with respect to its mean and variance values. For example, two homogeneous regions can be merged if the difference in their mean gray values is within 10% of the entire dynamic range and the difference in their variances is within 10% of the variance of the image. These thresholds may be selected heuristically or through probabilistic models.^{10,15} It is interesting to note that the above criterion can be easily implemented as a conditional rule in a knowledge-based system. Region-merging or region-splitting (described in the next section) methods have been implemented using a rule-based system for image segmentation.^{25}
Model-based systems typically encode knowledge of anatomy and image acquisition parameters. Anatomical knowledge can be modeled symbolically, describing the properties and relationships of individual structures; geometrically, either as masks or templates of anatomy; or using an atlas.^{18,19,25,26} Figure 3 shows an MR brain image and the segmented regions for the ventricles. The knowledge of the anatomical locations of the ventricles was used to establish initial seed points for region growing. A feature-adaptive region growing method was used for segmentation.
Fig. 3(A). A T-2 weighted MR brain image used for ventricle segmentation using a region growing approach.^{26}

Fig. 3(B). Segmented ventricle regions of the image shown in Fig. 3(A) using a model-based region growing algorithm.^{26}

9.4.2 Region-splitting

Region-splitting methods examine the heterogeneity of a predefined property of the entire region in terms of its distribution and its mean, variance, minimum and maximum values. If the region is evaluated as heterogeneous, that is, if it fails the similarity or homogeneity criterion, the original region is split into two or more regions. The region-splitting process continues until all regions individually satisfy the homogeneity criterion. In the region-splitting process, the original region R is split into subregions R_1, R_2, . . . , R_n such that the following conditions are met^{2,5}:
(1) Each region R_i, i = 1, 2, . . . , n, is connected.
(2) \bigcup_{i=1}^{n} R_i = R.
(3) R_i \cap R_j = \emptyset for all i \ne j.
(4) H(R_i) = TRUE for i = 1, 2, . . . , n.
(5) H(R_i \cup R_j) = FALSE for i \ne j,

where H(R_i) is a logical predicate for the homogeneity criterion on the region R_i.
Region-splitting methods can also be implemented using rule-based systems and quad-trees. In the quad-tree based region-splitting method, the image is partitioned into four regions that are represented by nodes in a quad-tree. Each region is checked for homogeneity and evaluated with the logical predicate H(R_i). If the region is homogeneous, no further action is taken for the respective node. If the region is not homogeneous, it is further split into four regions.
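A minimal recursive sketch of quad-tree splitting in Python follows, assuming NumPy; the homogeneity predicate H is passed in as a function, with a gray-level variance test shown as an illustrative example.

import numpy as np

def split_region(f, y, x, h, w, homogeneous, min_size=8, leaves=None):
    # recursive quad-tree split: keep a block if the predicate H holds, else split
    if leaves is None:
        leaves = []
    block = f[y:y + h, x:x + w]
    if homogeneous(block) or min(h, w) <= min_size:
        leaves.append((y, x, h, w))
    else:
        h2, w2 = h // 2, w // 2
        for dy, dx, hh, ww in ((0, 0, h2, w2), (0, w2, h2, w - w2),
                               (h2, 0, h - h2, w2), (h2, w2, h - h2, w - w2)):
            split_region(f, y + dy, x + dx, hh, ww, homogeneous, min_size, leaves)
    return leaves

# illustrative predicate: low gray-level variance means "homogeneous"
is_uniform = lambda block: block.var() < 25.0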
9.5 RECENT ADVANCES IN SEGMENTATION
The problem of segmenting medical images into anatomically and pathologically meaningful regions has been addressed using various approaches, including model-based estimation methods and rule-based systems.^{17–27} Nevertheless, automatic (or semi-automatic, with minimal operator interaction) segmentation methods for specific applications are still current topics of research. This is due to the large variability in anatomical structures and the challenging need for reliable, accurate, and diagnostically useful segmentation. A rule-based low-level segmentation system for the automatic identification of brain structures from MR images has been described by Raya.^{17} Neural network based classification approaches have also been applied to medical image segmentation.^{10,28} A multi-level adaptive segmentation (MAS) method was used to segment and classify multiparameter MR brain images into a large number of classes of physiological and pathological interest.^{24} The MAS method is based on estimation of signatures for each segmentation class for pixel-by-pixel classification.
9.6 IMAGE SEGMENTATION USING NEURAL NETWORKS
Neural networks provide another pixel classification paradigm that can be used for image segmentation.^{10,28,29} Neural networks do not require the underlying class probability distributions for accurate classification. Rather, the decision boundaries for pixel classification are adapted through an iterative training process. Neural network based segmentation approaches may provide good results for medical images with considerable variance in the structures of interest. For example, angiographic images show a significant variation in arterial structures and are therefore difficult to segment. The variation in image quality among various angiograms and the introduction of noise in the course of image acquisition emphasize the importance of an adaptive, non-parametric segmentation method. Neural network paradigms such as backpropagation, radial basis function networks and self-organizing feature maps have been used to segment medical images.^{10,28–35}
Neural networks learn from examples in a training set in which the pixel classification task has already been performed using manual methods. A non-linear mapping function between the input features and the desired output for labeled examples is learned by neural networks without any parameterization. After the learning process, a pixel in a new image can be classified for segmentation by the neural network.

It is important to select a meaningful set of features to provide as input to the neural network for classification. The selection of training examples is also very important, as they should represent a reasonably complete statistical distribution of the input data. The architecture of the network and the distribution of training examples play a major role in determining its performance in terms of accuracy, generalization and robustness. In its simplest form, the input to a neural network can be the gray values of pixels in a predefined neighborhood in the image. Thus, the network can classify the center pixel of the neighborhood based on the information of the entire set of pixels in the corresponding neighborhood. As the neighborhood window is translated across the image, the pixels at the central locations of the translated neighborhoods are classified. Neural network architectures and learning methods that can be used for pixel-based classification for image segmentation are described in Chapter 10 on pattern classification.^{28–35}
9.7 FEATURE EXTRACTION AND REPRESENTATION
Gray-level statistics of the image, gray-level statistics and shape of the segmented regions, and texture can be used in the feature representation of the image for characterization, analysis and classification. Selection of features correlated with a specific classification task is very important. Details about clustering and classification are provided in Chapter 10 of this book. Various commonly used features in image analysis and classification are briefly described below.
9.7.1 Statistical Pixel-Level Image Features
Once the regions are segmented in the image, the gray values of pixels within each region can be used for computing the following statistical pixel-level (SPL) features^{1,2}:

(1) The histogram of the gray values of pixels in the region as:

p(r_i) = \frac{n(r_i)}{n},    (22)

where p(r_i) and n(r_i) are, respectively, the probability and the number of occurrences of a gray value r_i in the region, and n is the total number of pixels in the region.
(2) The mean m of the gray values of the pixels in the region can be computed as:

m = \sum_{i=0}^{L-1} r_i\, p(r_i),    (23)

where L is the total number of gray levels in the image, 0, 1, . . . , L - 1.
(3) The variance and central moments of the region can be computed as:

\mu_n = \sum_{i=0}^{L-1} p(r_i)(r_i - m)^n,    (24)

where the second central moment \mu_2 is the variance of the region. The third and fourth central moments can be computed, respectively, for n = 3 and n = 4. The third central moment is a measure of the asymmetry of the histogram, while the fourth central moment is a measure of its flatness.
(4) Energy: the total energy E of the gray values of pixels in the region is given by:

E = \sum_{i=0}^{L-1} [p(r_i)]^2.    (25)
(5) Entropy: the entropy Ent, as a measure of the information represented by the distribution of gray values in the region, is given by:

Ent = -\sum_{i=0}^{L-1} p(r_i) \log_2 p(r_i).    (26)
(6) The local contrast corresponding to each pixel can be computed as the difference between the gray value of the center pixel and the mean of the gray values of the neighborhood pixels. The normalized local contrast C(x, y) for the center pixel can also be computed as:

C(x, y) = \frac{|P_c(x, y) - P_s(x, y)|}{\max\{P_c(x, y), P_s(x, y)\}},    (27)

where P_c(x, y) and P_s(x, y) are the average gray-level values of the pixels corresponding to the "center" and "surround" regions that are grown around the centered pixel through a region growing method.^{5,45}
(7) Additional features, such as the maximum and minimum gray values, can also be used for representing regions.
(8) Features based on the statistical distribution of local contrast values in the region also provide useful characteristic information about the regions representing objects.
(9) Features based on the gradient information of the boundary pixels of a region are also an important consideration in defining the nature of edges. For example, fading edges with low gradients form a characteristic feature of malignant melanoma and must be included in the classification analysis of images of skin lesions.^{9}
9.7.2 Shape Features
Shape features of the segmented region can also be used in classification analysis. The shape of a region is basically defined by the spatial distribution of its boundary pixels. A simple approach to computing shape features for a 2D region is to represent circularity, compactness and elongatedness through the minimum bounding rectangle that covers the region.^{1–5}
Several shape features using the boundary pixels of the segmented region can be computed as:

(1) Longest axis.
(2) Shortest axis.
(3) Perimeter and area of the minimum bounding rectangle.
(4) Elongation ratio.
(5) Perimeter p and area A of the segmented region.
(6) Hough transform of the region using the gradient information of the boundary pixels of the region^{1–5} (also described later in this chapter).
(7) Circularity (C = 1 for a circle) of the region, computed as:

C = \frac{4\pi A}{p^2}.    (28)

(8) Compactness C_p of the region, computed as:

C_p = \frac{p^2}{A}.    (29)

(9) Chain code of the boundary contour, obtained using a set of orientation primitives on the boundary segments derived from a piecewise linear approximation.
(10) Fourier descriptor of the boundary contour, obtained using the Fourier transform of the sequence of boundary segments derived from a piecewise linear approximation.
(11) Central moment based shape features of the segmented region.
(12) Morphological shape descriptors, obtained through morphological processing of the segmented region.^{46–51}
9.7.3 Moments for Shape Description
The shape of a boundary or contour can be represented quantitatively by central moments for matching. The central moments represent specific geometrical properties of the shape and are invariant to translation, rotation and scaling. The central moments \mu_{pq} of a segmented region or binary image f(x, y) are given by^{1,2,5,52}:

\mu_{pq} = \sum_{i=1}^{L} \sum_{j=1}^{L} (x_i - \bar{x})^p (y_j - \bar{y})^q f(x_i, y_j),

where

\bar{x} = \frac{\sum_{i=1}^{L} \sum_{j=1}^{L} x_i f(x_i, y_j)}{\sum_{i=1}^{L} \sum_{j=1}^{L} f(x_i, y_j)}, \quad \bar{y} = \frac{\sum_{i=1}^{L} \sum_{j=1}^{L} y_j f(x_i, y_j)}{\sum_{i=1}^{L} \sum_{j=1}^{L} f(x_i, y_j)}.    (30)
For example, the central moment \mu_{21} represents the vertical divergence of the shape of the region, indicating the relative extent of the bottom of the region compared to the top. The normalized central moments can be computed as:

\eta_{pq} = \frac{\mu_{pq}}{(\mu_{00})^{\gamma}}, \quad \text{where} \quad \gamma = \frac{p + q}{2} + 1.    (31)
The seven invariant moments \phi_1 - \phi_7 for shape matching are defined as^{52}:

\phi_1 = \eta_{20} + \eta_{02}
\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2
\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2
\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2
\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
         + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]
\phi_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})
\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
         + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2].    (32)
The invariant moments are used extensively in the literature for shape matching and pattern recognition.^{1,52}
9.7.4 Texture Features
Texture is an important spatial property that can be used in region segmentation as well as description. There are three major approaches to representing texture: statistical, structural and spectral. Since texture is a property of the spatial arrangement of the gray values of pixels, the first-order histogram of gray values provides no information about it. Statistical methods representing the higher-order distribution of gray values in the image are therefore used for texture representation. The second approach uses structural methods, such as arrangements of prespecified primitives, for texture representation. For example, a repetitive arrangement of square and triangular shapes can produce a specific texture. The third approach is based on spectral analysis methods such as the Fourier and wavelet transforms. Using spectral analysis, texture is represented by a group of specific spatio-frequency components.^{53,54}
The gray-level co-occurrence matrix (GLCM) exploits the higher-order distribution of gray values of pixels that are defined with a specific distance or neighborhood criterion. In its simplest form, the GLCM P(i, j) is the distribution of the number of occurrences of a pair of gray values i and j separated by a distance vector d = [dx, dy].

The GLCM can be normalized by dividing each value in the matrix by the total number of occurrences, providing the probability of occurrence of a pair of gray values separated by a distance vector. Statistical texture features are computed from the normalized GLCM as the second-order histogram H(y_q, y_r, d), representing the probability of occurrence of a pair of gray values y_q and y_r separated by a distance vector d. Texture features can also be described by a difference histogram, H_d(y_s, d), where y_s = |y_q - y_r|. H_d(y_s, d) indicates the probability that a difference in gray levels exists between two distinct pixels. Commonly used texture features based on the second-order histogram statistics are as follows:
(1) Entropy of H(y_q, y_r, d), S_H:

S_H = -\sum_{y_q=y_1}^{y_t} \sum_{y_r=y_1}^{y_t} H(y_q, y_r, d) \log_{10} [H(y_q, y_r, d)].    (33)

The entropy is a measure of texture nonuniformity. Lower entropy values indicate greater structural variation among the image regions.
(2) Angular second moment of H(y_q, y_r, d), ASM_H:

ASM_H = \sum_{y_q=y_1}^{y_t} \sum_{y_r=y_1}^{y_t} [H(y_q, y_r, d)]^2.    (34)

The ASM_H indicates the degree of homogeneity among textures, and is also representative of the energy in the image.^{11} A lower value of ASM_H is indicative of finer textures.
(3) Contrast of H(y_q, y_r, d):

Contrast = \sum_{y_q=y_1}^{y_t} \sum_{y_r=y_1}^{y_t} \partial(y_q, y_r) H(y_q, y_r, d),    (35)

where \partial(y_q, y_r) is a measure of intensity similarity defined by \partial = (y_q - y_r)^2. Thus the contrast characterizes the extent of variation in pixel intensity.
(4) Inverse difference moment of H(y_q, y_r, d), IDM_H:

IDM_H = \sum_{y_q=y_1}^{y_t} \sum_{y_r=y_1}^{y_t} \frac{H(y_q, y_r, d)}{1 + \partial(y_q, y_r)},    (36)

where \partial is defined as before. The IDM_H provides a measure of the local homogeneity among textures.
(5) Correlation of H(y_q, y_r, d):

Cor_H = \frac{1}{\sigma_{y_q} \sigma_{y_r}} \sum_{y_q=y_1}^{y_t} \sum_{y_r=y_1}^{y_t} (y_q - \mu_{y_q})(y_r - \mu_{y_r}) H(y_q, y_r, d),    (37)

where \mu_{y_q}, \mu_{y_r}, \sigma_{y_q}, \sigma_{y_r} are the respective means and standard deviations of y_q and y_r. The correlation can also be expanded and written in terms of the marginal distributions of the second-order histogram, which are defined as:

H_m(y_q, d) = \sum_{y_r=y_1}^{y_t} H(y_q, y_r, d), \quad \text{and} \quad H_m(y_r, d) = \sum_{y_q=y_1}^{y_t} H(y_q, y_r, d).    (38)

The correlation attribute is large for similar elements of the second-order histogram.
(6) Mean of H_m(y_q, d), \mu_{Hm}:

\mu_{Hm} = \sum_{y_q=y_1}^{y_t} y_q H_m(y_q, d).    (39)

The mean characterizes the nature of the gray-level distribution. Its value is typically small if the distribution is localized around y_q = y_1.
(7) Deviation of H_m(y_q, d), \sigma_{Hm}:

\sigma_{Hm} = \sqrt{ \sum_{y_q=y_1}^{y_t} \left[ y_q - \sum_{y_r=y_1}^{y_t} y_r H_m(y_r, d) \right]^2 H_m(y_q, d) }.    (40)

The deviation indicates the amount of spread around the mean of the marginal distribution. The deviation is small if the histogram is densely clustered about the mean.
(8) Entropy of H_d(y_s, d), S_{Hd(y_s,d)}:

S_{Hd(y_s,d)} = -\sum_{y_s=y_1}^{y_t} H_d(y_s, d) \log_{10} [H_d(y_s, d)].    (41)
(9) Angular second moment of H_d(y_s, d), ASM_{Hd(y_s,d)}:

ASM_{Hd(y_s,d)} = \sum_{y_s=y_1}^{y_t} [H_d(y_s, d)]^2.    (42)
(10) Mean of H_d(y_s, d), \mu_{Hd(y_s,d)}:

\mu_{Hd(y_s,d)} = \sum_{y_s=y_1}^{y_t} y_s [H_d(y_s, d)].    (43)

The features computed using the difference histogram, H_d(y_s, d), have the same significance as the attributes determined from the second-order statistics.
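A minimal NumPy sketch of constructing a normalized GLCM for a displacement vector d and computing the features of Eqs. 33-36 from it follows; the image is assumed to hold non-negative integer gray levels.

import numpy as np

def glcm(f, d=(0, 1), levels=256):
    # normalized co-occurrence matrix H(y_q, y_r, d) for displacement d = (dy, dx)
    dy, dx = d
    a = f[max(0, -dy):f.shape[0] - max(0, dy), max(0, -dx):f.shape[1] - max(0, dx)]
    b = f[max(0, dy):, max(0, dx):][:a.shape[0], :a.shape[1]]
    H = np.zeros((levels, levels))
    np.add.at(H, (a.ravel(), b.ravel()), 1.0)
    return H / H.sum()

def texture_features(H):
    # entropy, ASM, contrast and IDM of Eqs. 33-36 from a normalized GLCM
    q, r = np.indices(H.shape)
    nz = H > 0
    return {"entropy": -(H[nz] * np.log10(H[nz])).sum(),        # Eq. 33
            "asm": (H ** 2).sum(),                              # Eq. 34
            "contrast": (((q - r) ** 2) * H).sum(),             # Eq. 35
            "idm": (H / (1.0 + (q - r) ** 2)).sum()}            # Eq. 36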
9.7.5 Hough Transform
The Hough transform is used to detect straight lines and other parametric curves such as circles, ellipses, etc.^{1,2,5} It can also be used to detect the boundaries of an arbitrarily shaped object if the parameters of the object are known. The basic concept of the generalized Hough transform is that an analytical function, such as a straight line, circle or closed shape, represented in the image space (spatial domain), has a dual representation in the parameter space. For example, the general equation of a straight line can be given as:

y = mx + c,    (44)

where m is the slope and c is the y-intercept.

As can be seen from Eq. 44, the locus of points is described by two parameters, slope and y-intercept. Therefore, a line in the image space forms a point (m, c) in the parameter space. Likewise, a point in the image space forms a line in the parameter space. Therefore, a locus of points forming a line in the image space will form a set of lines in the parameter space, whose intersection represents the parameters of the line in the image space. If a gradient image is thresholded to provide edge pixels, each edge pixel can be mapped into the parameter space. The mapping can be implemented using bins of points in the parameter space. For each edge pixel of the straight line in the image space, the corresponding bin in the parameter space is updated. At the end, the bin with the maximum count represents the parameters of the straight line detected in the image. The concept can be extended to map and detect the boundaries of a predefined curve. In general, the points in the image space become hyperplanes in the N-dimensional parameter space, and the parameters of the object function in the image space can be found by searching for the peaks in the parameter space caused by the intersection of the hyperplanes.
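A minimal NumPy sketch of the accumulator scheme described above for straight lines follows. It uses the normal form \rho = x \cos\theta + y \sin\theta rather than the slope-intercept form of Eq. 44, as is common in practice, because the slope m is unbounded for near-vertical lines; the binning-and-voting logic is otherwise the same.

import numpy as np

def hough_lines(edges, n_theta=180, n_rho=200):
    # accumulate votes for lines parameterized as rho = x*cos(theta) + y*sin(theta)
    ys, xs = np.nonzero(edges)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho_max = np.hypot(*edges.shape)
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for t, theta in enumerate(thetas):
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        idx = np.clip(np.digitize(rho, rhos) - 1, 0, n_rho - 1)
        np.add.at(acc[:, t], idx, 1)          # update the corresponding bins
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    return rhos[r], thetas[t]                 # parameters of the strongest line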
To detect object boundaries using the Hough transform, it is necessary to create a parametric model of the object. The object model is transferred into a table called an R-table. The R-table can be considered a one-dimensional array where each entry of the array is a list of vectors. For each point in the model description, a gradient, along with the corresponding vector extending from the boundary point to the centroid, is computed. The gradient acts as an index into the R-table.

For object recognition, a 2D parameter space of possible x-y coordinate centers is initialized, with the accumulator value associated with each location set to zero. An edge pixel from the gradient image is selected. The gradient information is indexed into the R-table. Each vector in the corresponding list is added to the location of the edge pixel. The endpoint of the vector should now point to a new edge pixel in the gradient image. The accumulator of the corresponding location in the parameter space is then incremented by one count. As each edge pixel is examined, the accumulator of the correct location receives the highest count. If the model object is translated in the image, the accumulator of the correct translation location would receive the highest count. To deal with rotation and scaling, the process must be repeated for all possible rotations and scales. Thus, the complete process could become very tedious if a large number of rotations and scales are examined. To avoid this complexity, simple transformations can be made in the R-table.^{16}
9.8 CONCLUDING REMARKS
Segmenting an image into regions of interest and extracting features from the image and its segmented regions are essential for analyzing and classifying the information represented in the image. In this chapter, commonly used edge-based and region-based segmentation methods are described. Image and region statistics based features extracted from the gray-level distributions, along with shape and texture features, are also presented. Depending on the contextual knowledge, a number of relational features with adjacency graphs and relational attributes can also be included in the analysis and classification of medical images. Model-based methods representing anatomical knowledge from standardized atlases can be introduced into segmentation and feature analysis to help computerized classification and interpretation of medical images. Pattern classification methods are described in Chapter 10, while model-based registration and medical image analysis methods are described in various chapters in the second and third parts of this book. Recent developments in model-based medical image analysis include probabilistic and knowledge-based approaches and can be found in detail in the published literature.^{22–24,53–62} This trend of using multifeature analysis incorporating a priori and model-based knowledge is expected to continue in medical image analysis for diagnostic applications, as well as for the understanding of physiological processes linked with critical diseases and the design of better treatment intervention protocols for better healthcare.
References
1. Jain R, Kasturi R, Schunck BG, Machine Vision, McGraw-Hill Inc, 1995.
2. Gonzalez RC, Woods RE, Digital Image Processing, Prentice Hall, 2nd edn., 2002.
3. Marr D, Hildreth EC, Theory of edge detection, Proc R Soc Lond B 207: 187–217, 1980.
4. Haralick RM, Shapiro LG, Image segmentation techniques, Comp Vis Graph Imag Process 7: 100–132, 1985.
5. Dhawan AP, Medical Image Analysis, Wiley Interscience, John Wiley and Sons, Hoboken, NJ, 2003.
6. Stansfield SA, ANGY: A rule-based expert system for automatic segmentation of coronary vessels from digital subtracted angiograms, IEEE Trans Patt Anal Mach Intel 8: 188–199, 1986.
7. Ohlander R, Price K, Reddy DR, Picture segmentation using a recursive region splitting method, Comp Vis Graph Imag Process 8: 313–333, 1978.
8. Zucker S, Region growing: Childhood and adolescence, Comp Vis Graph Imag Process 5: 382–399, 1976.
9. Dhawan AP, Sicsu A, Segmentation of images of skin lesions using color and texture information of surface pigmentation, Comp Med Imag Graph 16: 163–177, 1992.
10. Dhawan AP, Arata L, Segmentation of medical images through competitive learning, Comp Methods and Prog in Biomed 40: 203–215, 1993.
11. Raya SR, Low-level segmentation of 3D magnetic resonance brain images, IEEE Trans Med Imag 9: 327–337, 1990.
12. Liang Z, Tissue classification and segmentation of MR images, IEEE Eng Med Biol Mag 12: 81–85, 1993.
13. Nilsson NJ, Principles of Artificial Intelligence, Springer Verlag, 1982.
14. Winston PH, Artificial Intelligence, Addison Wesley, 3rd edn., 1992.
15. Dawant BM, Zijdenbos AP, Image segmentation, in Sonka M, Fitzpatrick JM (eds.), Handbook of Medical Imaging, Vol 2: Medical Image Processing and Analysis, SPIE Press, 2000.
16. Ballard DH, Generalizing the Hough transform to detect arbitrary shapes, Pattern Recognition 13: 111–122, 1981.
17. Bomans M, Hohne KH, Tiede U, Riemer M, 3D segmentation of MR images of the head for 3D display, IEEE Trans Medical Imaging 9: 177–183, 1990.
18. Raya SR, Low-level segmentation of 3D magnetic resonance brain images: A rule based system, IEEE Trans Med Imaging 9(1): 327–337, 1990.
19. Cline HE, Lorensen WE, Kikinis R, Jolesz F, Three-dimensional segmentation of MR images of the head using probability and connectivity, Journal of Computer Assisted Tomography 14: 1037–1045, 1990.
20. Clarke L, Velthuizen R, Phuphanich S, Schellenberg J, et al., MRI: Stability of three supervised segmentation techniques, Magnetic Resonance Imaging 11: 95–106, 1993.
21. Hall LO, Bensaid AM, Clarke LP, Velthuizen RP, et al., A comparison of neural network and fuzzy clustering techniques in segmenting magnetic resonance images of the brain, IEEE Trans on Neural Networks 3: 672–682, 1992.
22. Vannier M, Pilgram T, Speidel C, Neumann L, et al., Validation of magnetic resonance imaging (MRI) multispectral tissue classification, Computerized Medical Imaging and Graphics 15: 217–223, 1991.
23. Choi HS, Haynor DR, Kim Y, Partial volume tissue classification of multichannel magnetic resonance images — A mixed model, IEEE Transactions on Medical Imaging 10: 395–407, 1991.
24. Zavaljevski A, Dhawan AP, Holland S, Ball W, et al., Multispectral MR brain image classification, Computerized Medical Imaging, Graphics and Image Processing 24: 87–98, 2000.
25. Nazif AM, Levine MD, Low-level image segmentation: An expert system, IEEE Trans Pattern Anal Mach Intell 6: 555–577, 1984.
26. Arata LK, Dhawan AP, Levy AV, Broderick J, et al., Three-dimensional anatomical model-based segmentation of MR brain images through principal axes registration, IEEE Trans Biomed Eng 42: 1069–1078, 1995.
27. Xu L, Jackowski M, Goshtasby A, Yu C, et al., Segmentation of skin cancer images, Image and Vision Computing 17: 65–74, 1999.
28. Sarwal A, Dhawan AP, Segmentation of coronary arteriograms through Radial Basis Function neural network, Journal of Computing and Information Technology, 135–148, 1998.
29. Ozkan M, Dawant BM, Maciunas RJ, Neural-network-based segmentation of multi-modal medical images: A comparative and prospective study, IEEE Trans on Medical Imaging 12: 1993.
30. Xie XL, Beni G, A validity measure for fuzzy clustering, IEEE Trans on Pattern Anal Mach Intell 13(8): 1991.
31. Bezdek JC, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981.
32. Chen C, Cowan CFN, Grant PM, Orthogonal least squares learning for radial basis function networks, IEEE Trans on Neural Networks 2(2): 302–309, 1991.
33. Poggio T, Girosi F, Networks for approximation and learning, Proceedings of the IEEE 78(9): 1481–1497, 1990.
34. Jacobson IRH, Radial basis functions: A survey and new results, in Handscomb DC (ed.), The Mathematics of Surfaces III, pp. 115–133, Clarendon Press, 1989.
35. Sarwal A, Dhawan AP, Segmentation of coronary arteriograms through Radial Basis Function neural network, Journal of Computing and Information Technology, 135–148, 1998.
36. Xie XL, Beni G, A validity measure for fuzzy clustering, IEEE Trans on Pattern Anal Mach Intell 13(8): 1991.
37. Loncaric S, Dhawan AP, Brott T, Broderick J, 3D image analysis of intracerebral brain hemorrhage, Computer Methods and Programs in Biomed 46: 207–216, 1995.
38. Broderick J, Narayan S, Dhawan AP, Gaskil M, et al., Ventricular measurement of multifocal brain lesions: Implications for treatment trials of vascular dementia and multiple sclerosis, Neuroimaging 6: 36–43, 1996.
39. Schmid P, Segmentation of digitized dermatoscopic images by two-dimensional color clustering, IEEE Trans Med Imag 18: 164–171, 1999.
40. Pham DL, Prince JL, Adaptive fuzzy segmentation of magnetic resonance images, IEEE Trans Med Imag 18: 737–752, 1999.
41. Kanungo T, Mount DM, Netanyahu NS, Piatko CD, et al., An efficient k-means algorithm: Analysis and implementation, IEEE Trans on Pattern Anal Mach Intell 24: 881–892, 2002.
42. Duda RO, Hart PE, Pattern Classification and Scene Analysis, Wiley, 1973.
43. Zurada JM, Introduction to Artificial Neural Systems, West Publishing Co, 1992.
44. Fahlman SE, Lebiere C, The cascade-correlation learning architecture, Tech Report, School of Computer Science, Carnegie Mellon University, 1990.
45. Dhawan AP, LeRoyer E, Mammographic feature enhancement by computerized image processing, Comp Methods & Programs in Biomed 27: 23–29, 1988.
46. Serra J, Image Analysis and Mathematical Morphology, Academic Press, 1982.
47. Sternberg S, Shapiro L, MacDonald R, Ordered structural shape matching with primitive extraction by mathematical morphology, Pattern Recognition 20: 75–90, 1987.
48. Maragos P, Pattern spectrum and multiscale shape representation, IEEE Trans on Pattern Anal Mach Intell 11: 701–716, 1989.
49. Loncaric S, Dhawan AP, A morphological signature transform for shape description, Pattern Recognition 26(7): 1029–1037, 1993.
50. Loncaric S, Dhawan AP, Brott T, Broderick J, 3D image analysis of intracerebral brain hemorrhage, Computer Methods and Programs in Biomed 46: 207–216, 1995.
51. Loncaric S, Dhawan AP, Optimal MST-based shape description via genetic algorithms, Pattern Recognition 28: 571–579, 1995.
52. Flusser J, Suk T, Pattern recognition by affine moment invariants, Pattern Recognition 26: 167–174, 1993.
53. Loew MH, Feature extraction, in Sonka M, Fitzpatrick JM (eds.), Handbook of Medical Imaging, Vol. 2: Medical Image Processing and Analysis, SPIE Press, 2000.
54. Dhawan AP, Chitre Y, Kaiser-Bonasso C, Moskowitz M, Analysis of mammographic microcalcifications using gray-level image structure features, IEEE Trans Med Imaging 15: 246–259, 1996.
55. Xu L, Jackowski M, Goshtasby A, Yu C, et al., Segmentation of skin cancer images, Image and Vision Computing 17: 65–74, 1999.
56. Dhawan AP, Sicsu A, Segmentation of images of skin lesions using color and texture information of surface pigmentation, Comp Med Imag Graph 16: 163–177, 1992.
57. Staib LH, Duncan JS, Boundary finding with parametrically deformable models, IEEE Trans Pattern Anal Mach Intel 14: 1061–1075, 1992.
58. Fan Y, Shen D, Gur RC, Gur RE, et al., COMPARE: Classification of morphological patterns using adaptive regional elements, IEEE Transactions on Medical Imaging, 2006.
59. Grosbras MH, Laird AR, Paus T, Cortical regions involved in gaze production, attention shifts and gaze perception, Hum Brain Mapp 25: 140–154, 2005.
60. Laird AR, Fox PM, Price CJ, Glahn DC, et al., ALE meta-analysis: Controlling the false discovery rate and performing statistical contrasts, Hum Brain Mapp 25: 155–164, 2005.
61. Zhang Y, Brady M, Smith S, Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm, IEEE Trans Med Imaging 20(1): 45–57, 2001.
62. Scherfler C, Schocke MF, Seppi K, Esterhammer R, et al., Voxel-wise analysis of diffusion weighted imaging reveals disruption of the olfactory tract in Parkinson's disease, Brain 129(Pt 2): 538–542, 2006.
CHAPTER 10
Clustering and Pattern Classification
Atam P Dhawan and Shuangshuang Dai
Clustering is a method for arranging data points into groups, or clusters, based on a predefined similarity criterion. Classification maps the data points, or their representative features, into predefined classes to help the interpretation of the input data. Several methods are available for clustering and classification in computer-aided diagnostic or decision-making systems for medical applications. This chapter reviews some of the clustering and classification methods using deterministic as well as fuzzy approaches for data analysis.
10.1 INTRODUCTION
Image classification is an important task in computer-aided diagnosis. An image, after any preprocessing needed to enhance features of interest, is processed to extract features for further analysis. The computed features are then arranged as a feature vector. Since features may utilize different dynamic ranges of values, normalization may be required before they are analyzed for classification into various categories. For example, a mammography image may be processed to extract features related to microcalcifications, e.g. the number of microcalcification clusters, the number of microcalcifications in each cluster, the size and shape of microcalcifications, the spatial distribution of microcalcifications, spatial-frequency and texture information, the mean and variance of the gray-level values of microcalcifications, etc. These features are then used in a classification method, such as a statistical pattern classifier, Bayesian classifier, or neural network, to classify the image into two classes: benign and malignant.
Let us review some terms commonly used in pattern classification.

Pattern: A pattern (feature vector, observation, or datum) \chi is a vector of measurements used by the clustering algorithm. It typically consists of a vector of d measurements: \chi = (x_1, . . . , x_d).

Feature: A feature is an individual scalar component x_i of a pattern \chi.
Dimensionality: The dimensionality d usually refers to the number of variables in the pattern or feature vector.

Pattern set: A pattern set is denoted \aleph = \{\chi_1, . . . , \chi_n\}. The i-th pattern in \aleph is denoted \chi_i = (x_{i,1}, . . . , x_{i,d}). In many cases, a pattern set to be clustered can be viewed as an n \times d pattern matrix.
Class: A class, in the abstract, refers to a state of nature that governs the pattern generation process. More concretely, a class can be viewed as a source of patterns whose distribution in feature space is governed by a probability density specific to the class.

Clustering: Clustering is a specific method that attempts to group patterns into various classes on the basis of a similarity criterion.

Hard clustering: Hard clustering techniques assign a class label l_i to each pattern \chi_i using a deterministic similarity criterion or crisp membership function.

Fuzzy clustering: Fuzzy clustering methods assign a class to each input pattern \chi_i based on a fuzzy membership criterion, with a fractional degree of membership f_{ij} for each cluster j.

Distance measure: A distance measure is a metric on the feature space used to quantify the similarity of patterns.
A traditional pattern classification system can be viewed as a mapping from input variables, representing the raw data or a feature set, to an output variable representing one of the categories or classes. To obtain a reasonable dimensionality, it is usually advantageous to apply preprocessing transformations to the raw data before it is fed into a classification system. Preprocessing usually involves feature extraction and/or feature selection to reduce the dimensionality to a reasonable number. Feature selection is the process of identifying the most effective subsets of the original features to be used in the clustering. The selected features are expected to be correlated with the classification task for better results.

After the preprocessing and pattern (feature) representation are established, interpattern similarity should be defined on pairs of patterns, and it is often measured by a distance function. Finally, the output of the clustering task is a set of clusters, and it can be hard (a deterministic partition of the data into clusters) or fuzzy, where each pattern has a variable degree of membership in each of the output clusters. Figure 1 shows a schematic diagram of a typical classification system.

Fig. 1. A typical classification system (input data → feature selection/extraction → interpattern similarity → clustering and/or classification).
10.2 DATA CLUSTERING
Clustering is assigning data points or patterns (usually represented
as a vector of measurements in a multidimensional space) into
groups or clusters based on a predefined similarity measure. Intu-
itively, patterns within a valid cluster are more similar to each other
than they are to a pattern belonging to a different cluster. Data clus-
tering is an efficient method to organize a large set of data for sub-
sequent classification. Except in certain advanced fuzzy clustering
techniques, each data point should belong to a single cluster, and no
point should be excluded from membership in the complete set of
clusters.
Since similarity is fundamental to the definition of a cluster, a
measure of the similarity between two patterns drawn from the
same feature space is essential to most clustering procedures.1–10 Because of the variety of feature types and scales, the proper choice of distance measure is of great importance. It is common to calculate dissimilarity between two patterns using a distance measure defined on the feature space. Euclidean distance is the most popular metric1,2 and is defined as:
d_2(x_i, x_j) = [ Σ_{k=1}^{d} (x_{i,k} − x_{j,k})^2 ]^{1/2} = ||x_i − x_j||_2.  (1)
It is noted that the Euclidean distance is actually a special case (p = 2) of the Minkowski metric1,2:

d_p(x_i, x_j) = [ Σ_{k=1}^{d} |x_{i,k} − x_{j,k}|^p ]^{1/p} = ||x_i − x_j||_p.  (2)
The Euclidean distance has an intuitive appeal as it is commonly used to evaluate the proximity of objects in two- or three-dimensional space. It works well when a data set has "compact" or "isolated" clusters.11 The drawback to the direct use of the Minkowski metrics is the tendency of the largest-scaled feature to dominate all others. Solutions to this problem include normalization of the continuous features or other weighting schemes. Linear correlation among features can also distort distance measures. This distortion can be alleviated by applying a whitening transformation to the data or by using the squared Mahalanobis distance:
d_M(x_i, x_j) = (x_i − x_j) A^{−1} (x_i − x_j)^T,  (3)

where A is the sample covariance matrix of the patterns.
In this process, d_M(x_i, x_j) assigns different weights to different features based on their variances and pairwise linear correlations. It is implicitly assumed here that the class-conditional densities are unimodal and characterized by multidimensional spread, i.e. that the densities are multivariate Gaussian. The regularized Mahalanobis distance was used by Mao and Jain11 to extract hyperellipsoidal clusters.
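As an illustration (our own sketch, not part of the original text), the three distance measures above can be computed directly in Python; all function and variable names below are our own:

import numpy as np

def minkowski(xi, xj, p=2):
    # Minkowski metric, Eq. (2); p = 2 gives the Euclidean distance of Eq. (1).
    return float(np.sum(np.abs(xi - xj) ** p) ** (1.0 / p))

def mahalanobis_sq(xi, xj, A):
    # Squared Mahalanobis distance, Eq. (3); A is the sample covariance matrix.
    diff = xi - xj
    return float(diff @ np.linalg.inv(A) @ diff)

X = np.random.rand(100, 3)                      # 100 patterns with d = 3 features
A = np.cov(X, rowvar=False)                     # sample covariance matrix of the patterns
d_euclidean = minkowski(X[0], X[1])             # Eq. (1)
d_mahalanobis = mahalanobis_sq(X[0], X[1], A)   # Eq. (3)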
Traditional clustering algorithms can be classified into two main categories1,2: hierarchical and partitional. In hierarchical clustering, the number of clusters need not be specified a priori, and problems due to initialization and local minima do not arise. However, since hierarchical methods consider only local neighbors in each step, they cannot incorporate a priori knowledge about the global shape or size of clusters. As a result, they cannot always separate overlapping clusters. Moreover, hierarchical clustering is static: points committed to a given cluster in the early stages cannot move to a different cluster.
Partitional clustering obtains a single partition of the data instead of a clustering structure by optimizing a criterion function defined either locally (on a subset of the patterns) or globally (over all of the patterns). Partitional clustering can be further divided into two classes: crisp clustering and fuzzy clustering. In crisp clustering, every data point belongs to only one cluster, while in fuzzy clustering every data point belongs to every cluster to a certain degree as determined by the membership function.3 Partitional algorithms are dynamic, and points can move from one cluster to another. They can incorporate knowledge about the shape or size of clusters by using appropriate prototypes and distance measures.
Hierarchical clustering is inflexible due to its greedy approach: after a merge or a split is selected, it is not refined. Fisher4 studied iterative hierarchical cluster redistribution to improve already-constructed dendrograms. Karypis et al.5 also researched refinements for hierarchical clustering. A problem with partitional algorithms is the initial guess of the number of clusters. A simple way to mitigate the effects of cluster initialization was suggested by Bradley and Fayyad.6 First, k-means is performed on several small samples of the data with a random initial guess. Each of the constructed systems is then used as a potential initialization for a union of all the samples. Centroids of the best system constructed this way are suggested as an intelligent initial guess to ignite the k-means algorithm on the full data. Zhang7 suggested another way to rectify the optimization process, by soft assignment of points to different clusters with appropriate weights, rather than by moving them decisively from
one cluster to another. More recently, probabilistic models have been proposed as a basis for cluster analysis. In this approach, the data are viewed as coming from a mixture of probability distributions, each representing a different cluster. Methods of this type have shown promise in a number of practical applications.8–10
10.2.1 Hierarchical Clustering with the Agglomerative Method
In hierarchical clustering, the number of clusters need not be specified in advance. It builds a cluster hierarchy or, in other words, a tree of clusters. Every cluster node contains child clusters; sibling clusters partition the points covered by their common parent. Such an approach allows exploring the data at different levels of granularity. Hierarchical clustering methods are divided into agglomerative and divisive.2,10,11 An agglomerative clustering method starts with one-point (singleton) clusters and recursively merges two or more of the most appropriate clusters. A divisive clustering method starts with one cluster of all data points and recursively splits the most appropriate cluster. The process continues until a stopping criterion is satisfied, providing a reasonable number of clusters.
Hierarchical methods of cluster analysis permit a convenient graphical display in which the entire sequence of merging (or splitting) is shown. Because of its tree-like nature, the display is called a dendrogram. The agglomerative method is usually chosen because it is more widely used. One reason for the popularity of the agglomerative method is that, during the merging process, the choice of threshold is not a major concern, as illustrated in the algorithm given below. In contrast, divisive methods are more computationally intensive and face the difficulty of choosing potential allocations of points to clusters during the splitting stages.
To merge or split subsets of points rather than individual points, the distance between individual points has to be generalized to the distance between subsets. Such a derived proximity measure is called a linkage metric. The type of linkage metric used significantly affects hierarchical algorithms, since it reflects the particular concept of closeness and connectivity. Major intercluster linkage metrics include single link, average link and complete link.2,10–13 The underlying dissimilarity measure (usually distance) is computed for every pair of points with one point in the first set and another point in the second set. A specific operation such as minimum (single link), average (average link), or maximum (complete link) is applied to the pairwise dissimilarity measures:
d(C_1, C_2) = operation{ d(x, y) | x ∈ C_1, y ∈ C_2 }.  (4)
For example, the SLINK algorithm,12 based on the single-link metric representation, provides the Euclidean minimal spanning tree with O(N^2) computational complexity.
As described above, the agglomerative methods are based on measures of distance between clusters. Starting from a representation of single-point clusters, the two clusters that are nearest and satisfy the similarity criterion are merged to form a reduced number of clusters. This is repeated until just one cluster is obtained. Let us suppose that n sample (data) points are to be clustered; the initial number of clusters will then be equal to n as well. Let us represent the data vector D with n data points as D = {x(1), ..., x(n)} and a function D(C_i, C_j) as the distance measure between two clusters C_i and C_j. An agglomerative algorithm for clustering can be defined as follows:
Algorithm (agglomerative hierarchical clustering)
Step 1: for i = 1, ..., n let C_i = {x(i)};
Loop: while there is more than one cluster left do
    minimize the distance D(C_k, C_h) over all pairs of clusters;
    let C_i and C_j be the two clusters with minimum distance;
    C_i = C_i ∪ C_j;
    remove cluster C_j;
End
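A minimal sketch of the above procedure (our own illustration, not the authors' code), using the single-link metric of Eq. (4), is:

import numpy as np

def agglomerate(X, n_clusters=1):
    # Step 1: one singleton cluster per data point.
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None                      # (distance, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single-link metric: minimum pairwise distance, Eq. (4).
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]       # C_i = C_i ∪ C_j
        del clusters[b]                  # remove cluster C_j
    return clusters

X = np.random.rand(20, 2)
print(agglomerate(X, n_clusters=3))

Recording the merge distances at each iteration yields the dendrogram of Fig. 2.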
In the above algorithm, the distance measure should be carefully chosen. Normally, Euclidean distance is employed, which assumes some degree of commensurability between the different variables. It makes less sense if the variables are non-commensurate, that is,
Fig. 2. A sample dendrogram.
variables are measured in different units. A common strategy is to
standardize the data by dividing the sample value of each of the
variables by its sample standard deviation, so that they are equally
important. Figure 2 shows a sample dendrogram produced by the
agglomerative hierarchical clustering method for a given data set.
Linkage-metric-based hierarchical clustering suffers from high time complexity. Under reasonable assumptions, such as the reducibility condition, linkage metric methods have O(N^2) complexity.10–14 Chiu et al.15 proposed another hierarchical clustering algorithm using a model-based approach in which maximum likelihood estimates were introduced.
Traditional hierarchical clustering is inflexible due to its greedy approach: after a merge or a split is selected, it is not refined. In addition, since these methods consider only local neighbors in each step, it is difficult to incorporate a priori knowledge about the global shape or size of clusters. Moreover, hierarchical clustering is static in the sense
that points assigned to a cluster in the early stages cannot be moved
to a different cluster in later stages.
10.2.2 Non-hierarchical or Partitional Clustering
A non-hierarchical or partitional clustering algorithm obtains a single partition of the data instead of a hierarchical clustering representation such as the dendrogram. Partitional methods have advantages in applications involving large data sets for which the construction of a dendrogram is computationally problematic. The partitional techniques usually produce clusters by optimizing an objective function defined either locally (on a subset of the patterns) or globally (over all of the patterns).
10.2.2.1 K-Means Clustering Approach
K-means2 is the simplest and most commonly used algorithm employing a squared error criterion, defined as:

e^2(ℵ, ℒ) = Σ_{j=1}^{K} Σ_{i=1}^{n_j} || x_i^{(j)} − c_j ||^2,  (5)

where ℒ denotes the partition of the pattern set ℵ into K clusters, x_i^{(j)} is the i-th pattern of the j-th cluster, c_j is the centroid of the j-th cluster, and n_j is the number of patterns in the j-th cluster.
The K-means algorithm starts with a random initial partition and keeps reassigning the patterns to clusters, based on the similarity between each pattern and the cluster centers, until a convergence criterion is met; e.g. there is no reassignment of any pattern from one cluster to another, or the squared error ceases to decrease significantly after some number of iterations. The k-means algorithm is popular because it is easy to implement, with a computational complexity of O(N), where N is the number of patterns.
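For illustration, a sketch of our own (assuming no cluster becomes empty during iteration) is:

import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), K, replace=False)]   # random initial centroids
    for _ in range(n_iter):
        # Assign each pattern to the nearest centroid (Euclidean distance).
        labels = np.argmin(np.linalg.norm(X[:, None, :] - centers[None, :, :],
                                          axis=2), axis=1)
        # Recompute each centroid as the mean of its assigned patterns.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(K)])
        if np.allclose(new_centers, centers):           # convergence criterion
            break
        centers = new_centers
    return labels, centers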
A major problem with this algorithm is that it is sensitive to the selection of the initial partition and may converge to a local minimum of the criterion function if the initial partition is not properly chosen. Bradley and Fayyad6 suggested a way to mitigate the effects of cluster initialization. One variation of the k-means algorithm is to permit the splitting and merging of the resulting clusters. Typically, a cluster is split when its variance is above a prespecified
threshold, and two clusters are merged when the distance between their centroids is below another prespecified threshold. Under such a scheme, it is possible to obtain the optimal partition starting from any arbitrary initial partition, provided proper threshold values are specified. Another variation of the k-means algorithm involves selecting a different criterion function altogether. Diday16 and Symon17 described a dynamic clustering approach obtained by formulating the clustering problem in the framework of maximum-likelihood estimation. The regularized Mahalanobis distance was used by Mao and Jain11 to obtain hyperellipsoidal clusters.
Partitional clustering algorithms can be divided into two classes: crisp (or hard) clustering and fuzzy clustering. Hard clustering is the traditional approach in which each pattern belongs to one and only one cluster; hence, the clusters are disjoint. Fuzzy clustering extends this notion to associate each pattern with every cluster using a membership function.2 Fuzzy set theory was initially applied to clustering by Ruspini.28 The most popular fuzzy clustering algorithm is the fuzzy k-means (FCM) algorithm. A generalization of the FCM algorithm was proposed by Bezdek18 through a family of objective functions. A fuzzy c-shell algorithm and an adaptive variant for detecting circular and elliptical boundaries were presented by Dave.19 It was also extended in medical image analysis to segment magnetic resonance images.20 Even though it is better than the hard k-means algorithm at avoiding local minima, FCM can still converge to local minima of the squared error criterion. The design of the membership function is the most important problem in fuzzy clustering; different choices include those based on similarity decomposition and centroids of clusters.
10.2.3 Fuzzy Clustering
Conventional clustering and classification approaches assign a data point to a cluster or class with a well-defined metric. In other words, the membership of a data point in a cluster is deterministic and can be represented by a crisp membership function. In many real-world applications, setting up a crisp membership function for clustering or classification often makes the result intuitively unreasonable. A less deterministic approach, with probabilistic membership functions providing fuzzy overlapping boundaries in the feature space, has provided very useful results in many applications.18–20
10.2.3.1 Fuzzy Membership Function
A fuzzy set is a set without a crisp boundary for its membership. If X is a space of input data points denoted generically by x, then a fuzzy set A in X is defined as a set of ordered pairs:

A = {(x, µ_A(x)) | x ∈ X},  (6)

where µ_A(x) is called the membership function (MF) for the fuzzy set A and its value ranges from 0 to 1. In other words, a membership function can be represented as a mapping that provides each point in the input space with a membership value (or degree of membership) between 0 and 1. For example, the age of a person can be placed into predefined deterministic groups in intervals of 10 years, e.g. 21–30, 31–40, 41–50, etc. However, defining a "middle-aged" group of people is quite subjective to individual perception. If we consider a range, say, between 40 and 50, as "middle-aged," a probabilistic membership function can be determined to represent the degree of belongingness to the group of middle-aged people.
A membership function may be expressed as the generalized Cauchy distribution18 as:

µ_A(x) = bell(x; a, b, c) = 1 / [1 + |(x − c)/a|^{2b}],  (7)

where c is the median value of the range (for example, 45 in the middle-aged group described above), and a and b are parameters that adjust the width and sharpness of the curve. The membership function for a = 15, b = 3, and c = 45 is shown in Fig. 3 as µ_A(x) = bell(x; 15, 3, 45).
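For illustration (our own sketch, not from the text), the bell membership function of Eq. (7) can be evaluated as:

import numpy as np

def bell(x, a, b, c):
    # Generalized Cauchy ("bell") membership function, Eq. (7).
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

ages = np.array([25, 35, 45, 55, 65])
print(bell(ages, a=15, b=3, c=45))   # degrees of membership in the middle-aged set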
Fig. 3. A plot of the "bell-shaped" membership function bell(x; 15, 3, 45).
It can be noted that the definition of "middle-aged" as represented by a membership function becomes more reasonable than a crisp representation. If a person is between 40 and 50, the
membership function value is 1 and the person is considered middle-aged. Extending this concept to three groups, "young," "middle-aged," and "old," with three membership function (MF) based representations, a probabilistic interpretation of the age group can be obtained, as shown in Fig. 4. A person 35 years of age is more likely to be considered middle-aged than young, because the corresponding MF value is around 0.8 for the middle-aged group versus 0.2 for the young group. Therefore, a particular age has three corresponding MF values in the different categories. As mentioned above, the three MFs together cover the value range of X, and the transition from one MF to another is smooth and gradual.
10.2.3.2 Membership Function Formulation
Fig. 4. A plot of three bell MFs for "young," "middle-aged" and "old."
The parameterized functions can be used to define membership functions (MFs) with different transition properties. For example, triangular, trapezoidal, Gaussian and bell-shaped functions have different transition curves, and the corresponding probability functions therefore provide different mappings to the data distribution. Further, multidimensional MFs of a desired shape (triangular, Gaussian, bell, etc.) may be needed to deal with multidimensional data. A multidimensional Gaussian MF can be represented as:
µ_A(X) = gaussian(X; M, K) = exp[ −(1/2) (X − M)^T K^{−1} (X − M) ],  (8)

where X and M are column vectors defined by X = [x_1, x_2, ..., x_n]^T and M = [m_1, m_2, ..., m_n]^T = [E(x_1), E(x_2), ..., E(x_n)]^T, m_i is the mean value of the variable x_i, and K is the covariance matrix of the variables x_i, defined as:
K = [ var(x_1)       cov(x_1, x_2)   · · ·   cov(x_1, x_n)
      cov(x_2, x_1)  var(x_2)        · · ·   cov(x_2, x_n)
      · · ·
      cov(x_n, x_1)  cov(x_n, x_2)   · · ·   var(x_n) ].  (9)
10.2.3.3 Fuzzy k-Means Clustering
The fuzzy k-means algorithm18 is based on the minimization of an appropriate objective function J, with respect to U, a fuzzy K-partition of the data set, and to V, a set of K prototypes:

J_q(U, V) = Σ_{j=1}^{N} Σ_{i=1}^{K} (u_{ij})^q d^2(X_j, V_i),  K ≤ N,  (10)
where q is any real number greater than 1, X_j is the j-th m-dimensional feature vector, V_i is the centroid of the i-th cluster, u_{ij} is the degree of membership of X_j in the i-th cluster, d^2(X_j, V_i) is any inner-product metric (distance between X_j and V_i), N is the number of data points, and K is the number of clusters. The parameter q is the weighting exponent for u_{ij} and controls the "fuzziness" of the resulting clusters.18 Fuzzy partitioning may be carried out through an iterative optimization of the above objective function using the following algorithm.
Step 1: Choose primary centroids V_i (prototypes);
Step 2: Compute the degree of membership of all feature vectors in all the clusters:

u_{ij} = [1/d^2(X_j, V_i)]^{1/(q−1)} / Σ_{k=1}^{K} [1/d^2(X_j, V_k)]^{1/(q−1)};  (11)
Step 3: Compute new centroids V̂_i:

V̂_i = Σ_{j=1}^{N} (u_{ij})^q X_j / Σ_{j=1}^{N} (u_{ij})^q,  (12)

and update the degrees of membership u_{ij} to û_{ij} according to Eq. (11).
Step 4: If max_{ij} |û_{ij} − u_{ij}| < ε, stop; otherwise go to Step 3, where ε is a termination criterion between 0 and 1.
Computation of the degree of membership u_{ij} depends on the definition of the distance measure d^2(X_j, V_i)18:

d^2(X_j, V_i) = (X_j − V_i)^T A (X_j − V_i).  (13)
The inclusion of A (an m × m positive-definite matrix) in the distance measure results in weighting according to the statistical properties.2,18
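A minimal sketch of this iteration (our own, using the Euclidean metric, i.e. A = I in Eq. (13)) is:

import numpy as np

def fuzzy_kmeans(X, K, q=2.0, n_iter=100, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), K, replace=False)]       # Step 1: initial prototypes
    U = np.zeros((len(X), K))
    for _ in range(n_iter):
        # Squared Euclidean distances d^2(X_j, V_i); small offset avoids 0/0.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = (1.0 / d2) ** (1.0 / (q - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # Step 2: memberships, Eq. (11)
        W = U_new ** q
        V = (W.T @ X) / W.sum(axis=0)[:, None]        # Step 3: centroids, Eq. (12)
        if np.max(np.abs(U_new - U)) < eps:           # Step 4: termination test
            U = U_new
            break
        U = U_new
    return U, V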
10.3 NEAREST NEIGHBOR CLASSIFIER
A popular statistical method for classification is the nearest neighbor classifier, which assigns a data point to the nearest class model in the feature space. The nearest neighbor classifier is a supervised method, as it uses labeled clusters of training samples in the feature space as models of classes. Let us assume that there are C classes represented by c_j, j = 1, 2, ..., C. An unknown feature vector f is to be assigned to the class that is closest to the class model developed from clustering the labeled feature vectors during training. A distance measure D_j(f) is defined by the Euclidean distance in the feature space as2:
D_j(f) = || f − u_j ||,  (14)

where

u_j = (1/N_j) Σ_{f ∈ c_j} f,  j = 1, 2, ..., C,

is the mean of the feature vectors for the class c_j and N_j is the total number of feature vectors in the class c_j.
The unknown feature vector is assigned to the class c_i if:

D_i(f) = min_{j=1,...,C} [D_j(f)].  (15)
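A minimal sketch (ours) of this nearest-mean rule, Eqs. (14)-(15), is:

import numpy as np

def fit_class_means(X_train, y_train):
    # Compute the mean feature vector u_j of each labeled class.
    classes = np.unique(y_train)
    means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes, means

def classify(f, classes, means):
    d = np.linalg.norm(means - f, axis=1)   # D_j(f), Eq. (14)
    return classes[np.argmin(d)]            # assign to the nearest class, Eq. (15)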
A probabilistic approach can be applied to the task of classification to incorporate a priori knowledge and improve performance. Bayesian and maximum likelihood methods have been widely used in object recognition and classification for different applications. Let us assume that the probability of a feature vector f belonging to the class c_i is denoted by p(c_i/f). Let the average risk of wrong classification for assigning the feature vector to the class c_j be expressed by r_j(f) as:

r_j(f) = Σ_{k=1}^{C} Z_{kj} p(c_k/f),  (16)

where Z_{kj} is the penalty of classifying a feature vector into the class c_j when it belongs to the class c_k.
It can be shown that:

r_j(f) = Σ_{k=1}^{C} Z_{kj} p(f/c_k) P(c_k),  (17)

where P(c_k) is the probability of occurrence of the class c_k.
A Bayes classifier assigns an unknown feature vector to the class c_i if:

r_i(f) < r_j(f),

or

Σ_{k=1}^{C} Z_{ki} p(f/c_k) P(c_k) < Σ_{q=1}^{C} Z_{qj} p(f/c_q) P(c_q)  for all j = 1, 2, ..., C, j ≠ i.  (18)
Other versions of Bayesian classification as applied to medical image classification can be found in many papers20–25 for radiological image analysis and computer-aided diagnosis.
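For illustration (our own sketch, not the authors' implementation), the minimum-risk rule of Eqs. (17)-(18) can be written as:

import numpy as np

def bayes_classify(p_f_given_c, priors, Z):
    # p_f_given_c: length-C vector of class-conditional densities p(f/c_k) at f
    # priors: length-C vector of class probabilities P(c_k)
    # Z[k, j]: penalty of deciding class c_j when the true class is c_k
    risks = Z.T @ (p_f_given_c * priors)    # r_j(f) for all j, Eq. (17)
    return int(np.argmin(risks))            # class with minimum risk, Eq. (18)

# With 0/1 penalties, minimum risk reduces to the maximum-posterior rule.
Z = 1 - np.eye(3)
print(bayes_classify(np.array([0.2, 0.5, 0.3]), np.array([1/3, 1/3, 1/3]), Z))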
10.4 DIMENSIONALITY REDUCTION
As described above, the goal of clustering is to group the data or feature vectors into meaningful categories for better classification and decision making, without assigning a data vector to a wrong class. For example, in computer-aided analysis of mammograms, mammography image feature vectors may need to be classified into "benign" or "malignant" classes by a pattern classification system. An error in classification may assign a normal patient to the "malignant" class (creating a false positive) or may assign a cancer patient to the "benign" class (missing a cancer). If the data (or features) are assumed to be statistically independent, the probability of classification error decreases as the distance between the classes increases. This distance is defined as14:
d^2 = Σ_{i=1}^{n} (µ_{i1} − µ_{i2})^2 / σ_i^2,  (19)
where µ_{i1} and µ_{i2} are the means of the i-th feature for the two classes, and σ_i is its standard deviation. Thus, the most useful features are those with large differences in
mean as compared to their standard deviation. The performance should continue to improve with the addition of new features as long as the means for the two classes differ (thereby increasing d) and the number of observations is increased accordingly. Classifier performance may be degraded by unnecessary or noisy observations, or by features that are not well correlated with the required classes. Therefore, it is useful to reduce the number of features to those that provide maximum separation in the feature space among the required classes. In addition, by reducing the number of features, a significant gain may be achieved in computational efficiency. This process is usually called dimensionality reduction. Though a number of approaches have been investigated for dimensionality reduction and for improving the performance of a classifier in the feature space, two useful approaches, principal component analysis (PCA) and genetic algorithms (GA), are described here.
10.4.1 Principal Component Analysis
Principal component analysis (PCA) is an efficient method to reduce the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variation present in the data set.2 The goal here is to map vectors X_d in a d-dimensional space (x_1, x_2, ..., x_d) onto vectors Z_M in an M-dimensional space (z_1, z_2, ..., z_M), where M < d. Without loss of generality, we express the vector X as a linear combination of a set of d orthonormal vectors u_i:

X = Σ_{i=1}^{d} x_i u_i,  (20)
where the vectors u_i satisfy the orthonormality relation:

u_i^T u_j = δ_{ij}.  (21)
Therefore the coefficients in Eq. (20) can be expressed as:

x_i = u_i^T X.  (22)
Let us suppose that only a subset of M < d of the basis vectors u_i is to be retained, so that only M coefficients x_i are used. In general, PCA
does not retain a subset of the original set of basis vectors. It finds a new set of basis vectors that spans the original d-dimensional space such that the data can be well represented by a subset of these new basis vectors. Here, v_i denotes the new basis vectors, which meet the orthonormality requirement. As above, only M coefficients x_i are used, and the remaining coefficients are replaced by constants b_i. Now each vector X is approximated by an expression of the form:

X̃ = Σ_{i=1}^{M} x_i v_i + Σ_{i=M+1}^{d} b_i v_i,  (23)

x_i = v_i^T X.  (24)
We need to choose the basis vectors v_i and the coefficients b_i such that the approximation given by Eq. (23), with the values of x_i determined by Eq. (24), provides the best approximation to the original vector X on average over the whole data set. The next step is to minimize the sum of squares of the errors over the whole data set. The sum-of-squares error can be written as follows:
E_M = (1/2) Σ_{i=M+1}^{d} v_i^T A v_i,  (25)

where A is the covariance matrix of the set of vectors X_n, defined as follows:

A = Σ_n (x_n − x̄)(x_n − x̄)^T.  (26)
Now the problem is converted to minimizing E_M with respect to the choice of the basis vectors v_i. A minimum value is obtained when the basis vectors satisfy the following condition:

A v_i = β_i v_i.  (27)
Thus, the v_i (i = M+1, ..., d) are eigenvectors of the covariance matrix. Note that, since the covariance matrix is real and symmetric,
its eigenvectors can indeed be chosen to be orthonormal. Finally, the minimum error takes the form:

E_M = (1/2) Σ_{i=M+1}^{d} β_i.  (28)
Therefore, the minimum error is achieved by rejecting the (d − M) smallest eigenvalues and their corresponding eigenvectors and retaining the M largest eigenvalues. Each of the associated eigenvectors v_i is called a principal component.
With a matrix representation, the singular value decomposition (SVD) algorithm can be employed to calculate the eigenvalues and their corresponding eigenvectors. The use of SVD has two important implications: first, it is computationally efficient, and second, it provides additional insight into what a PCA actually does. It also provides a way to represent the results of PCA graphically and analytically.
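A minimal PCA sketch (ours, under the usual mean-centering convention) using SVD is:

import numpy as np

def pca(X, M):
    Xc = X - X.mean(axis=0)                 # center the data
    # Rows of Vt are eigenvectors of the covariance matrix, Eq. (27);
    # the singular values s give its eigenvalues as s**2 / (n - 1).
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:M]                     # M leading principal components
    return Xc @ components.T, components    # projected data and basis vectors

X = np.random.rand(200, 10)
Z, V = pca(X, M=3)                          # reduce from d = 10 to M = 3 dimensions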
10.4.2 Genetic Algorithms Based Optimization
In nature, the features that characterize an organism determine its
ability to endure in a competition for limited resources. These fea-
tures are fixed by the building block of genetics, or the gene. These
genes form chromosomes, the genetic structures which ultimately
define the survival capability of an organism. Thus, the most superior organisms survive and pass their genes on to future generations, while the genes of less fit individuals are eventually eliminated from the population.
Reproduction introduces diversity into a population of individuals through the exchange of genetic material. Repeated selection of the fittest individuals and recombination of chromosomes promote evolution in the gene pool of a species, creating ever better population members.
A genetic algorithm (GA) is a robust optimization and search method based on the natural selection principles outlined above. Genetic algorithms provide improved performance by exploiting past information and promoting competition for survival. GAs generate a population of individuals through selection, and search for the fittest individuals through crossover and mutation. A fundamental feature of GAs is that they operate on a representation of the problem parameters, rather than manipulating the parameters themselves. These parameters are typically encoded as binary strings that are associated with a measure of goodness, or fitness value. As in natural evolution, GAs encourage the survival of the fittest through selection and recombination. Through the process of reproduction, individual strings are copied according to their degree of fitness. In crossover, strings are probabilistically mated by swapping all characters located after a randomly chosen bit position. Mutation is a secondary genetic operator that randomly changes the value of a string position to introduce variation into the population and recover lost genetic information.31,32
GAs maintain a population of structures that are potential solutions to an objective function. Let us assume that features are encoded into binary strings that can be represented as A = a_1 a_2 ... a_L, where L is the specified string length, or the number of representative bits. A simple genetic algorithm operates on these strings according to the following iterative procedure (a brief illustrative code sketch follows the list):
(1) Initialize a population of binary strings.
(2) Evaluate the strings in the population.
(3) Select candidate solutions for the next population and apply
mutation and crossover operators to the parent strings.
(4) Allocate space for new strings by removing members from the
population.
(5) Evaluate the new strings and add them to the population.
(6) Repeat steps 3–5 until the stopping criterion is satisfied.
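A toy sketch of this loop (ours; the objective simply counts 1-bits, as in Table 1 below) is:

import numpy as np

rng = np.random.default_rng(0)
L, POP, PC, PM = 10, 4, 0.5, 0.05            # string length, population size,
pop = rng.integers(0, 2, size=(POP, L))      # crossover and mutation rates

def fitness(pop):
    return pop.sum(axis=1) / L               # normalized count of 1s

for generation in range(20):
    f = fitness(pop)
    # Proportionate (roulette wheel) selection of the parent strings.
    parents = pop[rng.choice(POP, size=POP, p=f / f.sum())]
    # Single-point crossover on random pairs with probability PC.
    for i in range(0, POP - 1, 2):
        if rng.random() < PC:
            cut = int(rng.integers(1, L))
            parents[i, cut:], parents[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
    # Mutation: flip each bit with probability PM.
    flip = rng.random(parents.shape) < PM
    pop = np.where(flip, 1 - parents, parents)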
Detailed knowledge of the encoding mechanism, the objective function, the selection procedure, and the genetic operators, crossover and mutation, is essential for a firm understanding of the above procedure as applied to a specific problem. These components are considered below.
The structure of the GA is based on the encoding mechanism used to represent the variables in the given optimization problem.
The candidate solutions may encode any number of variable types, including continuous, discrete, and boolean variables. Although alternate string codings exist,31,32 a simple binary encoding mechanism is considered here. Thus, the allele of a gene in the chromosome indicates whether or not a feature is significant in microcalcification description. The objective function evaluates each chromosome in a population to provide a measure of the fitness of a given string. Since the value of the objective function can vary widely between problems, a fitness function is used to normalize the objective function to the range of 0 to 1. The selection scheme uses this normalized value, or fitness, to evaluate a string.
One of the most basic reproduction techniques is proportionate selection, which is carried out by the roulette wheel selection scheme. In roulette wheel selection, each chromosome is given a segment of a roulette wheel whose size is proportional to the chromosome's fitness. A chromosome is reproduced if a randomly generated number falls in the chromosome's corresponding roulette wheel slot. Since more fit chromosomes are allocated larger wheel portions, they are more likely to generate offspring after a spin of the wheel. The process is repeated until the population for the next generation is completely filled. However, due to sampling errors, the population must be very large in order for the actual number of offspring produced for an individual chromosome to approach the expected value for that chromosome.
In proportionate selection, a string is reproduced according to how its fitness compares to the population average, in other words, as f_i/f̄, where f_i is the fitness of the string and f̄ is the average fitness of the population. This proportionate expression is also known as the selective pressure on an individual. The mechanics of proportionate selection can be expressed as follows: A_i receives more than one offspring on average if f_i > f̄; otherwise, A_i receives less than one offspring on average. Since the result of applying the proportionate fitness expression will generally be a fraction, this value represents the expected number of offspring allocated to each string, not the actual number.
Once the parent population is selected through reproduction, the offspring population is created by applying the genetic operators. The purpose of recombination, also referred to as crossover, is to discover new regions of the search space, rather than relying on the same population of strings. In recombination, strings are randomly paired and selected for crossover. If the crossover probability condition is satisfied, then a crossover point along the length of the string pair is randomly chosen. The offspring are generated by exchanging the portions of the parent strings beyond the crossover position. For a string of length l, the l − 1 possible crossover positions are chosen with equal probability.
Mutation is a secondary genetic operator that preserves the random nature of the search process and regenerates fit strings that may have been destroyed or lost during crossover or reproduction. The mutation rate controls the probability that a bit value will be changed. If the mutation probability condition is satisfied, the selected bit is inverted.
An example of a complete cycle of the simple genetic algorithm is shown in Table 1.31 The initial population contains four strings composed of ten bits. The objective function counts the number of 1s in a chromosome, and the fitness function normalizes this value to lie in the range of 0 to 1.
The proportional selection scheme allocates 0, 1, 1, and 2 offspring to the initial strings in their respective order. After selection, the offspring are randomly paired for crossover so that strings 1 and 3 and strings 2 and 4 are mated. However, since the crossover rate is 0.5, only strings 1 and 3 are selected for crossover; the other strings are left intact. The selected pair of chromosomes then exchange their genetic material after the fifth bit position, which is the randomly chosen crossover point. The final step in the cycle is mutation. Since the mutation rate is selected to be 0.05, only two bits out of the forty present in the population are mutated: the second bit of string 2 and the fourth bit of string 4 are randomly selected for mutation. As can be seen from the table, the average fitness of population P_4 is significantly better than the initial fitness after only one generational cycle.
Table 1. A Sample Generational Cycle of the Simple Genetic Algorithm

                          Chromosome     Fitness Value    Average Fitness
Population P_1            0001000010     0.2              0.50
(initial population)      0110011001     0.5
                          1010100110     0.5
                          1110111011     0.8
Population P_2            0110011001     0.5              0.65
(after selection)         1010100110     0.5
                          1110111011     0.8
                          1110111011     0.8
Population P_3            01100|11011    0.6              0.65
(after crossover)         1010100110     0.5
                          11101|11001    0.7
                          1110111011     0.8
Population P_4            0110011011     0.6              0.70
(after mutation)          1110100110     0.6
                          1110111001     0.7
                          1111111011     0.9
Although roulette wheel selection is the simplest method to implement proportionate reproduction, it is highly inefficient, since it requires n spins of the wheel to fill a population with n members. Stochastic universal selection (SUS) is an efficient alternative to roulette wheel selection. SUS also uses a weighted roulette wheel, but adds equally spaced markers along the outside rim of the wheel. The wheel is spun only once, and each individual receives as many copies of itself as there are markers in its slot.32
The average fitness value in the initial stages of a GA is typically low. Thus, during the first few generations the proportionate selection scheme may assign a large number of copies to a few strings with relatively superior fitness, known as super individuals. These strings will eventually dominate the population and cause the GA to converge prematurely. The proportionate selection procedure also suffers from decreasing selective pressure during the last generations, when the average fitness value is high. Scaling techniques and ranking selection can help alleviate the problems of inconsistent selective pressure and domination by superior individuals.
In linear scaling, the fitness value is adjusted by:

f' = af + b,  (29)

where f is the original fitness value and f' is the scaled fitness value. The coefficients a and b are chosen so that the fittest individuals do not receive too many copies and average individuals typically receive one copy. These coefficients should also be adjusted to avoid negative fitness values.
Ranking selection techniques assign offspring to individuals by qualitatively comparing levels of fitness. The population is sorted according to fitness values and allotted offspring based on rank. In ranking selection, subsequent populations are not influenced by the balance of the current fitness distribution, so that selective pressure is uniform. Each cycle of the simple GA produces a completely new population of offspring from the previous generation, known as generational replacement. Thus, the simple GA is naturally slower in exploiting useful areas of the search space for a large population. Steady-state replacement is an alternative method that typically replaces one or more of the worst members of the population each generation. Steady-state replacement can be combined with an elitist strategy, which retains the best strings in the population.32
GAs are efficient global optimization techniques that are highly suited to searching nonlinear, multidimensional problem spaces.32 The most widely accepted theory on the operation of the GA search mechanism in global optimization is the Schema Theorem. This theorem states that the search for the fittest individuals is guided by exploiting similarities among the superior strings in a population. These similarities are described by schemata, which are composed of strings with identical alleles at the same positions on each string. The order of a particular schema is the number of fixed positions among the strings, and the defining length is the distance between the first and last fixed positions on a string. Schemata with superior fitness, low order and small defining length increase with each passing generation.
From a set of coded parameters, GAs use a population of points to search for the optimal solution, not just a single point in the search space. The GA thus has a high probability of discovering the globally optimal solution in a multimodal search space, since it is less likely to be trapped by false optima. This ability becomes a tremendous advantage over traditional methods in more complex problems.
10.5 NON-PARAMETRIC CLASSIFIERS
Artificial neural network based classifiers have been explored extensively in the literature for non-parametric classification, using a set of training vectors that provides the relationship between input features or measurements and output classes. Such classification methods do not require any prior probabilistic model of the class distributions of input vectors; they learn this relationship during training. Though a number of such classifiers have been used for different applications, the more common networks, such as the backpropagation and radial basis function neural networks, are described here.33,34
10.5.1 Backpropagation Neural Network for Classification
The backpropagation network is the most commonly used neural network in signal processing and classification applications. It uses a set of interconnected neural elements that process information in a layered manner. A computational neural element, also called a perceptron, provides an output as a thresholded weighted sum of all its inputs. The basic function of the neural element, as shown in Fig. 5, is analogous to the synaptic activity of a biological neuron. In a layered network structure, the neural element may receive its input from an input vector or from other neural elements. A weighted sum of these inputs constitutes the argument of a non-linear activation function such as a sigmoid function. The resulting thresholded value of the activation function is the output of the neural element. The output is distributed along weighted connections to other neural elements.
Fig. 5. A computational neuron model with linear synapses.
In order to learn a specific pattern of input vectors for classification, an iterative learning algorithm such as the LMS algorithm, often called the Widrow-Hoff delta rule,34 is used with a set of preclassified training examples that are labeled with the input vectors and their respective class outputs. For example, if there are two output classes for the classification of input vectors, the weighted sum of all inputs may be thresholded to a binary value, 0 or 1. The output 0 represents class 1, while the output 1 represents class 2. The learning algorithm repeatedly presents the input vectors of the training set to the network and forces the network output to produce the respective classification output. Once the network converges on all training examples to produce the respective desired classification outputs, the network can be used to classify new input vectors into the learned classes.
The computational output of a neural element can be expressed as:

y = F( Σ_{i=1}^{n} w_i x_i + w_{n+1} ),  (30)
where F is a non-linear activation function that is used to threshold the weighted sum of the inputs x_i, and w_i is the respective weight. A bias is added to the element as w_{n+1}, as shown in Fig. 5.
Let us assume a multilayer feed-forward neural network with L
layers of N neural elements (Perceptrons) in each layer, as shown in
Fig. 6. A feedforward backpropagation neural network.
Fig. 6, such that:
y^{(k)} = F( W^{(k)} y^{(k−1)} )  for k = 1, 2, ..., L,  (31)
where y^{(k)} is the output of the k-th layer of neural elements, with k = 0 representing the input layer, and W^{(k)} is the weight matrix for the k-th layer, such that:

y^{(0)} = [x_1, x_2, ..., x_{n_1}]^T;   y^{(k)} = [y_1^{(k)}, y_2^{(k)}, ..., y_n^{(k)}, y_{n+1}^{(k)}]^T
and
W^{(k)} = [ w_{11}^{(k)}       w_{12}^{(k)}       · · ·   w_{1n}^{(k)}       w_{1(n+1)}^{(k)}
            w_{21}^{(k)}       w_{22}^{(k)}       · · ·   w_{2n}^{(k)}       w_{2(n+1)}^{(k)}
            · · ·
            w_{n1}^{(k)}       w_{n2}^{(k)}       · · ·   w_{nn}^{(k)}       w_{n(n+1)}^{(k)}
            w_{(n+1)1}^{(k)}   w_{(n+1)2}^{(k)}   · · ·   w_{(n+1)n}^{(k)}   w_{(n+1)(n+1)}^{(k)} ].  (32)
The neural network is trained by presenting classified examples of input and output patterns. Each example consists of the input and output vectors {y^{(0)}, y^{(L)}}, or {x, y^{(L)}}, encoded for the desired classes. The objective of the training is to determine a weight matrix that provides the desired output for each input vector in the training set. The least mean squared (LMS) error algorithm43,44 can be implemented to train a feedforward neural network using the following steps:
(1) Assign random weights in the range [−1, +1] to all weights w_{ij}^{(k)}.
(2) For each classified pattern pair {y^{(0)}, y^{(L)}} in the training set, do the following steps:
    a. Compute the output values of each neural element using the current weight matrix.
    b. Find the error e^{(k)} between the computed output vector and the desired output vector for the classified pattern pair.
    c. Adjust the weight matrix using a change ΔW^{(k)} computed as ΔW^{(k)} = α e^{(k)} [y^{(k−1)}]^T for all layers k = 1, ..., L, where α is the learning rate, which can be set between 0 and 1.
(3) Repeat step 2 for all classified pattern pairs in the training set until the error vector for each training example is sufficiently low or zero.
The non-linear activation function is an important consideration in computing the error vector for each classified pattern pair in the training set. A sigmoidal activation function can be used:

F(y) = 1 / (1 + e^{−y}).  (33)
The gradient descent algorithm described above for training a feedforward neural network, also called a backpropagation neural network (BPNN), is sensitive to the selection of the initial weights and to noise in the training set, which can cause the algorithm to become stuck in local minima of the solution space. This leads to poor generalization performance when the network is used to classify new patterns. Another problem with the BPNN is finding the optimal network architecture, i.e. the optimal number of hidden layers and of neural elements in each hidden layer. Several solutions for finding the best architecture and generalization performance have been explored in the literature.34
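A minimal sketch (ours) of this gradient-descent training for one hidden layer, with the sigmoid of Eq. (33), is:

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))          # activation function, Eq. (33)

rng = np.random.default_rng(0)
n_in, n_hid, n_out, alpha = 4, 8, 2, 0.5
W1 = rng.uniform(-1, 1, (n_hid, n_in + 1))   # step (1): random weights, with bias
W2 = rng.uniform(-1, 1, (n_out, n_hid + 1))

def train_step(x, t):
    global W1, W2
    x1 = np.append(x, 1.0)                   # augment input with the bias term
    h = sigmoid(W1 @ x1)
    h1 = np.append(h, 1.0)
    y = sigmoid(W2 @ h1)
    # Error terms backpropagated through the sigmoid derivative (step 2b).
    e2 = (t - y) * y * (1 - y)
    e1 = (W2[:, :-1].T @ e2) * h * (1 - h)
    # Outer-product weight updates, as in step (2c).
    W2 += alpha * np.outer(e2, h1)
    W1 += alpha * np.outer(e1, x1)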
10.5.2 Classification Using Radial Basis Functions
Radial basis function (RBF) classifiers are useful interpolation methods for multidimensional tasks. One major advantage of RBFs is their structural simplicity, as seen in Fig. 7. The response of each node in the single hidden layer is weighted and linearly summed at the output.
The RBF network is configured by finding the centers and widths of the basis functions and then determining the weights at
Fig. 7. The radial basis function neural network representation.
the output of each node. The goal in selecting the unit width, or variance, is to minimize the overlap between nearest neighbors and to maximize the network's generalization ability. For good generalization, the eigenvalues of the covariance matrix of each basis function are chosen as large as possible. Typically, the kernel function is a Gaussian with unit normalization, given by34:
ϕ_i(x) = exp( −||x − c_i||^2 / σ_i^2 ),  (34)

where c_i is the center of the i-th kernel, and σ_i^2 is the corresponding variance. Basis functions with less than exponential decay should be avoided because of their inferior local response.
The network output can be written in terms of the above Gaussian basis functions and the hidden-to-output connection weights w_i as:

f(x) = Σ_{i=1}^{K} w_i ϕ_i(x).  (35)
To account for large variances among the nodal outputs, the network output is usually normalized. The normalized result is specified as:

f(x) = Σ_{i=1}^{K} w_i ϕ_i(x) / Σ_{i=1}^{K} ϕ_i(x),  (36)

where K is the total number of basis functions.
After the centers and widths of the basis functions are determined, the network weights can be computed from the following:

y = F_{n×p} w,  (37)

where the elements of F_{n×p} are the activation functions ϕ_{ij}, found by evaluating the j-th Gaussian function at the i-th input vector. Typically, F_{n×p} is rectangular with more rows than columns, so that w is overdetermined and no exact solution exists. Thus, instead of solving for the weights by matrix inversion, w is determined by
solving a sum-of-squared-error functional as:

F^T F w = F^T y,  such that  w = (F^T F)^{−1} F^T y = F̃ y,  (38)
where F̃ is called the pseudoinverse of F. In order to guarantee a unique solution to Eq. (38), F̃ is better expressed as:

F̃ = (F^T F + εI)^{−1} F^T,  (39)
where ε is a small constant known as the regularization parameter and I is the identity matrix. If F were square and nonsingular, simple matrix inversion could be used to solve for the network weights.
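A minimal sketch (ours) of the weight computation in Eqs. (38)-(39) is:

import numpy as np

def rbf_design_matrix(X, centers, sigma2):
    # F[i, j]: Gaussian basis j, Eq. (34), evaluated at input vector i.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def rbf_weights(F, y, eps=1e-3):
    # w = (F^T F + eps I)^(-1) F^T y, the regularized pseudoinverse of Eq. (39).
    return np.linalg.solve(F.T @ F + eps * np.eye(F.shape[1]), F.T @ y)

X = np.random.rand(50, 2)
centers = X[:10]                              # e.g. centers chosen from the data
F = rbf_design_matrix(X, centers, sigma2=0.5)
w = rbf_weights(F, (X ** 2).sum(axis=1))      # fit weights to a sample target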
When the amount of data is insufficient for complete approximation and the data are inherently noisy, it becomes necessary to impose additional a priori constraints in order to manage the problem of learning by approximation. The typical a priori supposition is that of smoothness, or at least piecewise smoothness. The smoothness condition assumes that the response to an unknown data point should be similar to the responses from its neighboring points. Without the smoothness criterion, it would be infeasible to approximate any function because of the large number of examples required.34
Standard regularization is the method of learning by approximation that utilizes a smoothness criterion. A regularization function accomplishes two separate tasks: it minimizes the distance between the actual data and the desired solution, and it minimizes the deviation from the learning constraint, which can be piecewise smoothness in classification problems. The general functional to be minimized is as follows:

H[f] = Σ_{i=1}^{N} [ f(x_i) − y_i ]^2 + ε ||Df||^2,  ε ∈ ℝ^+,  (40)
where N is the dimension of the regularization solution, ε is the positive regularization parameter, y_i is the actual solution, f(x_i) is the desired solution, and ||Df||^2 is a stabilizer term with D a first-order differential operator.
The solution to the above regularization functional is given by:

f(x) = Σ_{i=1}^{N} b_i G(x; c_i),  (41)

where G is the basis for the solution to the regularization problem centered at c_i, and b_i = [y_i − f(x_i)]/ε. Under conditions of rotational and translational invariance, the solution can be written as:
f(x) = Σ_{i=1}^{N} b_i G( ||x − c_i|| ).  (42)
10.6 EXAMPLE CLASSIFICATION ANALYSIS USING FUZZY
MEMBERSHIP FUNCTION
Skin lesion images obtained using the Nevoscope were classified using different techniques into two classes, melanoma and dysplastic nevus.36 The combined set of epi-illuminance and multispectral transilluminance images was classified using a wavelet decomposition based ADWAT method37 and fuzzy membership function based classification.36 Wavelet transform based bimodal channel energy features obtained from the images were used in the analysis. Methods using both crisp and fuzzy membership based partitioning of the feature space were evaluated. For this purpose, the ADWAT classification method using crisp partitioning was extended to handle multispectral image data. Also, multidimensional fuzzy membership functions with Gaussian and bell profiles were used for classification. Results show that the fuzzy membership functions with a bell profile are more effective than the extended ADWAT method in discriminating melanoma from dysplastic nevus. The sensitivity and specificity of melanoma diagnosis can be improved by adding the lesion depth and structure information obtained from the multispectral transillumination images to the surface characteristic information obtained from the epi-illumination images.
Bimodal features were obtained from the epi-illumination images and the multispectral transillumination images using wavelet decomposition and statistical analysis of the channel energy and energy ratios for the extended ADWAT classification method. All these features were combined to form a composite feature set. In this composite feature set, the dynamic range of the channel energy ratio features is far smaller than that of the channel energy features. For classification, it is necessary to normalize the feature set so that all the features have a similar dynamic range. Using linear transformations, all the features in the composite feature set were normalized to a dynamic range between zero and one. Using the covariance information obtained from the feature distribution of the learning data set, the values of the dysplastic and melanoma membership functions were calculated. The decision as to whether an unknown image set belongs to the melanoma or the dysplastic nevus class was made based on the "winner takes all" criterion: the unknown image set was assigned to the class with the maximum membership function value. Although the membership functions can be thought of as multivariate conditional densities similar to those used in the Bayes classifier, making the decision based on the memberships of all the image classes for the candidate gives the classifier its fuzzy nature.
Out of the 60 unknown images (15 melanoma and 45 dysplastic nevus cases) used in the classification phase, 52 cases were correctly classified using the Gaussian membership function.36 All the cases of melanoma and 37 cases of dysplastic nevus were identified, giving a true positive fraction of 100 percent with a false positive fraction of 17.77 percent. For the eight dysplastic nevus cases that were misclassified, the values of both the melanoma and dysplastic nevus membership functions were equal to zero. These cases were assigned to the melanoma category, since no decision about the class can be made when both membership function values are the same. Classification results were also obtained for the bell membership function using different values of the weighting constant W. Of all the values of W used, the best classification results were obtained for a value of 0.6, with a true positive fraction of 100 percent and a false positive fraction of 4.44 percent. The results obtained from all these classification techniques are summarized in Table 2.
Table 2. Results of Classification of Optical Images Using Different Classification Methods for Detection of Melanoma36

Type of Images Used      Method                        Images Correctly Classified   True       False
                                                       Melanoma      Dysplastic      Positive   Positive
Epi-illuminance          Neural Network                13/15         34/45           86.66%     24.44%
Images                   Bayesian Classifier           13/15         40/45           86.66%     11.11%
Multispectral and        Fuzzy Classifier with         15/15         37/45           100%       17.77%
Epi-illuminance          Gaussian Membership Function
Images                   Fuzzy Classifier with         15/15         43/45           100%       4.44%
                         Bell Membership Functions
10.7 CONCLUDING REMARKS
Clustering and image classification methods are critically impor-
tant in medical imaging for computer-aided analysis and diagnosis.
Though a wide spectrum of pattern analysis and classification
methods has been explored for medical image analysis, clustering
and classification methods have to be customized and carefully
implemented for specific medical image analysis and decision-making
applications. A good understanding of the involved features and of
the contextual information may be incorporated into model-based
approaches utilizing deterministic or fuzzy classification methods.
References
1. Jain AK, Dubes RC, Algorithms for Clustering Data, Prentice Hall,
Englewood Cliffs, NJ, 1988.
2. Duda RO, Hart PE, Stork DG, Pattern Classification (2nd edn.),
John Wiley & Sons, 2001.
3. Zadeh LA, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets
and Systems 1: 3–28, 1978.
4. Fisher D, Iterative optimization and simplification of hierarchical clus-
tering, Journal of Artificial Intelligence Research 4: 147–179, 1996.
5. Karypis G, Han EH, Multilevel refinement for hierarchical clustering,
Technical Report #99–020, 1999.
6. Bradley P, Fayyad U, Refining initial points for k-means clustering, in
Proceedings of the 15th ICML, pp. 91–99, Madison, WI, 1998.
7. Zhang B, Generalized k-harmonic means — Dynamic weighting of data
in unsupervised learning, in Proceedings of the 1st SIAM ICDM, Chicago,
IL, 2001.
8. Campbell JG, Fraley C, Murtagh F, Raftery AE, Linear flaw detection in
woven textiles using model-based clustering, Pattern Recognition Letters
18: 1539–1548, 1997.
9. Celeux G, Govaert G, Gaussian parsimonious clustering models,
Pattern Recognition 28: 781–793, 1995.
10. Olson C, Parallel algorithms for hierarchical clustering, Parallel Com-
puting 21: 1313–1325, 1995.
11. Mao J, Jain AK, A self-organizing network for hyperellipsoidal clustering
(HEC), IEEE Trans Neural Networks 7: 16–29, 1996.
12. Sibson R, SLINK: An optimally efficient algorithm for the single link
cluster method, Computer Journal 16: 30–34, 1973.
13. Voorhees EM, Implementing agglomerative hierarchical clustering
algorithms for use in document retrieval, Information Processing and
Management 22(6): 465–476, 1986.
14. Dai S, Adaptive learning for event modeling and pattern classification,
PhD dissertation, New Jersey Institute of Technology, Jan 2004.
15. Chiu T, Fang D, Chen J, Wang Y, A robust and scalable clustering
algorithm for mixed type attributes in large database environments, in
Proceedings of the 7th ACM SIGKDD, pp. 263–268, San Francisco, CA,
2001.
16. Diday E, The dynamic cluster method in non-hierarchical clustering,
J Comput Inf Sci 2: 61–88, 1973.
17. Symon MJ, Clustering criterion and multivariate normal mixture,
Biometrics 77: 35–43, 1977.
18. Bezdek JC, Pattern Recognition With Fuzzy Objective Function Algorithms,
Plenum Press, New York, NY, 1981.
19. Dave RN, Generalized fuzzy C-shells clustering and detection of cir-
cular and elliptic boundaries, Pattern Recogn 25: 713–722, 1992.
20. Pham DL, Prince JL, Adaptive fuzzy segmentation of magnetic reso-
nance images, IEEE Trans on Med Imaging 18(9): 737–752, 1999.
21. Vannier M, Pilgram T, Speidel C, et al., Validation of magnetic resonance
imaging (MRI) multispectral tissue classification, Computerized Medical
Imaging and Graphics 15: 217–223, 1991.
22. Choi HS, Haynor DR, Kim Y, Partial volume tissue classification of
multichannel magnetic resonance images — A mixed model, IEEE
Transactions on Medical Imaging 10: 395–407, 1991.
23. Zavaljevski A, Dhawan AP, Holland S, et al., Multispectral MR brain
image classification, Computerized Medical Imaging, Graphics and Image
Processing 24: 87–98, 2000.
24. Nazif AM, Levine MD, Low-level image segmentation: An expert sys-
tem, IEEE Trans Pattern Anal Mach Intell 6: 555–577, 1984.
25. Arata LK, Dhawan AP, Levy AV, et al., Three-dimensional anatomical
model based segmentation of MR brain images through principal axes
registration, IEEE Trans Biomed Eng 42: 1069–1078, 1995.
26. Dhawan AP, Chitre Y, Kaiser-Bonasso C, Moskowitz M, Analysis of
mammographic microcalcifications using gray-level image structure
features, IEEE Trans Med Imaging 15: 246–259, 1996.
27. Hall LO, Bensaid AM, Clarke LP, Velthuizen RP, et al., A comparison of
neural network and fuzzy clustering techniques in segmenting magnetic
resonance images of the brain, IEEE Trans on Neural Networks 3:
672–682, 1992.
28. Xu L, Jackowski M, Goshtasby A, et al., Segmentation of skin cancer
images, Image and Vision Computing 17: 65–74, 1999.
29. Huo Z, Giger ML, Vyborny CJ, Computerized analysis of multiple
mammographic views: Potential usefulness of special view mam-
mograms in computer aided diagnosis, IEEE Trans Med Imaging 20:
1285–1292, 2001.
30. Grohman W, Dhawan AP, Fuzzy convex set based pattern classifica-
tion of mammographic microcalcifications, Pattern Recognition 34(7):
119–132, 2001.
31. Bonasso C, GA based selection of mammographic microcalcifica-
tion features for detection of breast cancer, MS Thesis, University of
Cincinnati, 1995.
32. Peck C, Dhawan AP, A review and critique of genetic algorithm
theories, J of Evolutionary Computation, MIT Press 3(1): 39–80, 1995.
33. Dhawan AP, Medical Image Analysis, John Wiley Publications and IEEE
Press June 2003, Reprint, 2004.
34. Zurada JM, Introduction to Artificial Neural Systems, West Publishing
Co., 1992.
35. Mitra S, Pal SK, Fuzzy Multi-layer perceptron, inferencing and rule
generation, IEEE Trans Neural Networks 6(1): 51–63, 1995.
36. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and
classification of melanoma using fuzzy membership based partitions,
Computerized Medical Imaging and Graphics 29: 287–296, 2005.
37. Patwardhan SV, Dhawan AP, Relue PA, Classification of melanoma
using tree-structured wavelet transforms, Computer Methods and Pro-
grams in Biomedicine 72: 223–239, 2003.
CHAPTER 11
Recent Advances in Functional
Magnetic Resonance Imaging
Dae-Shik Kim
While functional imaging of the brain using magnetic resonance
imaging (fMRI) has gained wide acceptance as a useful tool in basic and
clinical neurosciences, its ultimate utility remains elusive due to our lack
of understanding of its basic physiological processes and limitations. In the
present chapter, we will discuss recent advances that are shedding light on
the relationship between the observable blood oxygenation level depen-
dent (BOLD) fMRI contrast and the underlying neuroelectrical activities.
Finally, we will discuss topical issues that remain to be solved in the future.
11.1 INTRODUCTION
The rapid progress of blood oxygenation level dependent (BOLD)
functional magnetic resonance imaging (fMRI) in recent years1−3
has raised the hope that, unlike most existing neuroimaging techniques,
the functional architecture of the human brain can be studied
directly in a noninvasive manner. The BOLD technique is based on
the use of deoxyhemoglobin as nature's own intravascular paramagnetic
contrast agent.4−6 When placed in a magnetic field, deoxyhemoglobin
alters the magnetic field in its vicinity, particularly when it is
compartmentalized as it is within red blood cells and vasculature.
The effect increases as the concentration of deoxyhemoglobin
increases. At concentrations found in venous blood vessels,
a detectable local distortion of the magnetic field
surrounding the red blood cells and the surrounding blood vessel is
produced. This affects the magnetic resonance behavior of the water
proton nuclei within and surrounding the vessels, which in turn
results in decreases in the transverse relaxation times T2 and T2*.4,6
During activation of the brain, this process is reduced: an increase in
neuronal and metabolic activity results in a reduction of the relative
deoxyhemoglobin concentration, due to the increase of blood flow
(and hence increased supply of fresh oxyhemoglobin) that follows.
Consequently, in conventional BOLD fMRI, brain "activity" can be
measured as an increase in T2- or T2*-weighted MR signals.1−3 Since
its introduction about 10 years ago, BOLD fMRI has been successfully
applied, among numerous other examples, to precisely localize the
cognitive,7 motor,8 and perceptual9−11 functions of the human cortex
cerebri (Figs. 1 and 2). The explanatory power of BOLD fMRI is
being further strengthened in recent years through the introduction
of high (∼3T) and ultrahigh (∼7T) MRI scanners.12 This is based on
the fact that a stronger magnetic field will not only increase the fMRI
signal per se but will also specifically enhance the signal components
originating from parenchymal capillary tissue. Conventional low-field
magnets, on the other hand, can be expected to "over-represent"
macrovascular signals.
11.2 NEURAL CORRELATE OF fMRI
BOLD fMRI contrast does not measure neuronal activity per se.
Rather, it reflects a complex convolution of changes, ranging from
the cerebral metabolic rate of oxygen (CMRO2) to cerebral blood flow
(CBF) and cerebral blood volume (CBV), following focal neuronal
activity (Fig. 1). This poses a fundamental problem for the accuracy
and validity of BOLD fMRI for clinical and basic neurosciences:
while the greatest body of existing neurophysiological data provides
spiking and/or subthreshold measurements from a small number
of neurons (10⁰–10²), fMRI on the other hand labels the local
hemodynamics of a parenchymal lattice consisting of millions of
neurons (10⁶–10⁸) and a dense network of microcapillaries.
Fig. 1. Hemodynamic basis of functional MRI. Note that fMRI is an indirect measure
of the neuronal activity elicited by an external stimulus ("visual stimulation"),
mediated through hemodynamic processes occurring in the dense network of veins
("V"), arteries ("A") and capillaries.
How can we bridge this gap from micron-scale neuronal receptive
field properties to millimeter-scale voxel behaviors? The problem of
bridging this conceptual gap is greatly compounded by the presence
of substantial differences between neuronal and fMRI voxel properties:
small numbers (10⁰–10²) versus large numbers (10⁶–10⁸) of neurons
underlying the observed activation; point-like individual neurons
versus a neurovascular lattice grid; largely spiking versus largely
subthreshold activities; excitatory or inhibitory versus excitatory
and/or inhibitory (see Fig. 3 for differences in time scale between
fMRI and electrophysiological signals). The crucial questions we need
to address are discussed below.
11.2.1 Do BOLD Signal Changes Reflect the Magnitude
of Neural Activity Change Linearly?
Fig. 2. Functional MRI of the human visual cortex using BOLD contrast at 3T.
Here, the receptive field properties for isoeccentricity were mapped using standard
stimuli. Color-coded activation areas responded to the eccentricities represented
by the colored rings in the upper right corner. Regions of activity were superimposed
on the reconstructed and inflated brain surfaces.

The amplitude of the fMRI signal intensity change has been employed
by itself to obtain information beyond simple identification of the
spatial compartmentalization of brain function, by correlating variations
in this amplitude with behavioral (e.g. Refs. 13–15) or
electroencephalography (EEG) responses.16 However, extracting such
information requires the deconvolution of the compounded fMRI
response,17 assuming that fMRI signals are additive. This assumption,
however, appears not to be generally valid (e.g. Refs. 18–20).
Tight and highly quantitative coupling between the EEG and T2*
BOLD signals in the rat model was reported where the frequency of
forepaw stimulation was varied under steady-state conditions.21
A linear relationship between the BOLD response and somatosensory
evoked potentials was demonstrated for brief stimuli, but the nature
of the relationship depended on the stimulation duration and
ultimately became nonlinear;22 in this study, the linearity was used
in a novel way to extract temporal information on the millisecond
time scale. More recently, local field potentials and spiking activity
were recorded for the first time simultaneously with T2* BOLD
fMRI signals in the monkey cortex, showing a linear relationship
between local field potentials and spiking rate, but displaying better
correlation with the former.23 In a recent study, recording from
multiple sites for the first time, spiking activity was shown to be
linearly correlated with the T2* BOLD response in the cat visual
cortex using a single orientation of a moving grid but with different
spatial frequencies of the grid lines.24 However, the correlation varied
from point to point on the cortical surface and was generally valid
only when the data were averaged over at least a 4 mm–5 mm spatial
scale,24 demonstrating that T2* BOLD responses are not spatially
accurate at the level of orientation columns in the visual system,
as discussed previously. A detailed set of studies was performed
asking the same type of questions and using laser Doppler techniques
to measure cerebral blood flow (CBF)25,26,28; these studies concluded
that linear domains exist between CBF increases and aspects of
electrical activity, and that hemodynamic changes evoked by neuronal
activity depend on the afferent input function but do not necessarily
reflect the output level of activity of a region.
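The additivity assumption at stake here is easy to state as a linear systems sketch: if BOLD were linear, the measured signal would simply be the stimulus time course convolved with a fixed hemodynamic response function (HRF), and responses to combined stimuli would sum. The gamma-shaped HRF and all parameters below are illustrative textbook choices, not those of the studies cited.

```python
import numpy as np

dt = 0.5                                  # sampling interval (s), one TR
time = np.arange(0.0, 120.0, dt)

# Gamma-shaped HRF peaking roughly 6 s after onset (illustrative only)
t = np.arange(0.0, 30.0, dt)
hrf = t**6 * np.exp(-t)
hrf /= hrf.sum() * dt                     # normalize to unit area

def bold_response(stimulus):
    """Predicted BOLD time course under the linear (additive) model:
    the stimulus convolved with a fixed HRF."""
    return np.convolve(stimulus, hrf)[: len(time)] * dt

s1 = (time < 10).astype(float)                       # stimulus 1
s2 = ((time >= 40) & (time < 50)).astype(float)      # stimulus 2

# Additivity holds exactly for this model; measured BOLD signals obey
# it only approximately, which is the point made in the text.
print(np.allclose(bold_response(s1 + s2),
                  bold_response(s1) + bold_response(s2)))   # True
```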
11.2.2 Small Versus Large Number
Given the nominal voxel size of most fMRI scans (several millimeters
at best), it is safe to conclude that BOLD reflects the activity of many
neurons (say, around 10⁵ neurons for a voxel of 1 mm³–2 mm³).28
The overwhelming body of existing electrophysiological data,
however, is based on electrode recordings from single (single unit
recording, SUA) or a handful of neurons (multiunit recording, MUA).
The real question is hence how accurately the responses of single
cells (our "gold standard" given the existing body of data) are
reflected by a population response, such as in BOLD fMRI.
Theoretically, if each neuron were to "fire" independently of its
neighbors' behavior, this would be an ill-posed problem, as fMRI
would not be able to distinguish small activity changes in a large
cellular population from large changes in a small population.
Fortunately, however, neurons are embedded in tight local circuitries,
forming functional clusters with similar receptive field properties
ranging from "micro-columns" and "columns" to "hyper-columns." Both the
neuronal firing rate and phase are correlated between neighboring
neurons,75 and in most sensory areas there is a good correlation
between local field potentials (LFP), which are assumed to reflect the
average activity of a large number of neurons, and the responses of
individual spiking neurons. In fact, it is difficult to imagine how
BOLD contrast could be detectable at all if it were sensitized to the
behavior of uncorrelated individual neurons, as the metabolic demand
of a single neuron would hardly be sufficient to initiate the chain of
hemodynamic events giving rise to BOLD.
11.2.3 Relationship between Voxel Size and Neural
Correspondence
Clearly, the MRI voxel size is a key element in determining the
spatial dependence of the correlation between BOLD and electrode
data. A large voxel will improve the relationship to the neuronal
event, since a voxel that displays BOLD signal changes will have a
much higher probability of including the site of the electrically
active column as its size increases, for example to sizes often used
in human studies (e.g. 3 mm × 3 mm × 3 mm). However, such a large
voxel will provide only limited information about the pattern of
activation, due to its low spatial resolution. Smaller voxels (i.e. at
the size of individual single unit recording sites), which could
potentially yield a much better spatial resolution, will result in a
large variability in the correspondence between neuronal activity and
the BOLD signal, and a large number of "active" voxels will actually
originate from positions beyond the site of electrical activity (Fig. 4).
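A toy simulation makes the voxel-size trade-off concrete: if each recording site mixes a shared columnar signal with independent local variability, then averaging more sites into one simulated "voxel" raises the squared correlation with the shared signal, at the price of spatial resolution. All numbers are invented purely to illustrate the statistical effect, not to reproduce the analysis behind Figs. 4 and 5.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_sites = 200, 64                  # time points, recording sites

shared = rng.standard_normal(T)       # common neuronal modulation
local = 2.0 * rng.standard_normal((n_sites, T))
sites = shared + local                # each site: shared + local noise

for k in (1, 4, 16, 64):              # sites averaged per "voxel"
    voxel = sites[:k].mean(axis=0)
    r = np.corrcoef(voxel, shared)[0, 1]
    print(f"{k:3d} sites averaged: R^2 = {r**2:.2f}")
# R^2 rises toward an asymptote as the voxel grows, mirroring the
# saturating neuronal correspondence described for Fig. 5.
```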
11.2.4 Spiking or Subthreshold?
According to the standard "integrate-and-fire" model of neurons,
an action potential is generated when the membrane potential reaches
threshold by depolarization, which in turn is determined by the
integration of incoming excitatory (EPSP) and inhibitory (IPSP)
post-synaptic potentials.

Fig. 3. Time course of BOLD and single unit recordings from the same cortical
location. Identical visual stimuli were used for the fMRI and subsequent single unit
recording sessions. Blue trace: peristimulus histogram of the spike activity; bin size
for the histogram = 0.5 sec = TR for fMRI. Red trace: BOLD percent changes during
visual stimulation. X-axis: time after stimulus onset. Left Y-axis: spikes per second.
Right Y-axis: BOLD percent changes. Gray box: stimulus duration. The black trace
above indicates the original low-frequency analog electrode signals (100 Hz–300 Hz)
underlying the depicted spike counts.

Action potentials are usually generated only around the axon hillock,
while synaptic potentials can be generated all across the dendritic
tree (mostly on dendritic spines) and cell soma. The threshold-dependent
action potential firing
means that much more sub- and suprathreshold synaptic activity
than action potential activity is likely at any one time. And the much
larger neural surface area associated with synaptic activity means
that the total metabolic demand (i.e. the number of Na⁺/K⁺ pumps
involved, etc.) for synaptic activity ought to be significantly higher
than that required for generating action potentials. It therefore seems
likely that BOLD contrast, like other methods based on cortical
metabolism such as 2-DG (¹⁴C-2-deoxyglucose)49 and optical imaging,
is dominated by synaptic subthreshold activity. However, the precise
contributions of synaptic and spiking activities are hard to quantify,
since with conventional stimuli one would expect synaptic input and
spiking output activity to be roughly correlated with each other.29−31
Indeed, it is not trivial to imagine an experiment in which input and
output activities would not correlate with each other. One way this
has been proposed in the past is to look, in a visual area, at the
spatial activity resulting from the edge of a visual stimulus.32−34
Since "extra-classical" receptive fields across such an edge are by
definition subthreshold activity, it follows that a stimulus with an
edge in it creates regions of cortex where activity is only subthreshold
in origin. Existing optical imaging studies32,35 have concluded that
subthreshold activity does indeed contribute significantly to the
optical signal, suggesting that it might contribute to the BOLD signal
as well. The results of our combined BOLD and single unit studies
suggest that both local field potentials (LFP) and single-unit activity
correlate well with the BOLD signal (see Figs. 3 and 4).

Fig. 4. Results of direct comparison between BOLD and single unit recordings
across all sites (n = 58). X-axis: neural modulation of the single unit response in
spikes per second. Y-axis: BOLD modulation in percent. The six data points indicate
the BOLD/single unit responses for the six different spatial frequencies used in this
study. The thick black line is the regression line for the depicted data points
(y = 0.12x + 0.085, i.e. roughly 7.95 spikes per 1% BOLD); coefficient of
determination of the regression line, R² = 0.85.

We have used LFP on
the assumption that it represents the average activity of thousands
of neurons. In agreement with previous findings,36 LFP signals may
provide a better estimate of BOLD responses than suprathreshold
spike rate. However, whether intracellular or extracellular activity is
better correlated with BOLD is harder to address, since with a grating
stimulus (and in fact with many types of visual stimuli) one would
expect intracellular and extracellular activity to be roughly correlated
with each other.29−31 Separating intracellular and extracellular
activity would have to be accomplished using a visual stimulus known
to do so. One imaging experiment presumptively showing a large
contribution of intracellular activity to the optical imaging signal
uses focal iontophoresis of the GABA-A antagonist bicuculline
methiodide37,38 to generate a mismatch between intracellular and
extracellular activity. This is a rare case where a blood-dependent
signal could be reversibly altered by an artificial manipulation of
neural activity. We are currently repeating these studies using fMRI
techniques to elucidate the spatial contributions of intracellular and
extracellular activity to BOLD functional MRI signals.
11.2.5 Excitatory or Inhibitory Activity?
Although the neuro- and cognitive-science communities have
embraced fMRI with exuberance, numerous issues remain poorly
understood regarding this technique. Because fMRI maps are based
on secondary metabolic and hemodynamic events that follow neuronal
activity, and not on the electrical activity itself, it remains mostly
unclear what the spatial specificity of fMRI is (i.e. how accurate are
the maps generated by fMRI compared to the actual sites of neuronal
activity?). In addition, the nature of the link between the magnitudes
of neuronal activity and fMRI signals is not well understood (i.e.
what does a change of a particular magnitude in fMRI signals mean
with respect to the change in magnitude of the processes that define
neuronal signaling, such as action potentials or neurotransmitter
release?). fMRI is often used without considering these unknowns.
For example, modulating the intensity of fMRI signals by means
of different paradigms and interpreting the intensity changes as
changes in neuronal activity of corresponding magnitude is a common
practice that is not fully justified under most circumstances. To the
best of our knowledge, there is currently no evidence that the
metabolic demands differ greatly between excitatory and inhibitory
synapses. Therefore, fundamentally, both excitatory (EPSP) and
inhibitory (IPSP) synaptic inputs can be expected to cause similar
metabolic and hemodynamic events, ultimately giving rise to similar
BOLD contrasts. On the side of the spiking output activity, however,
they have opposite effects: accumulation of EPSPs will increase the
probability of spike generation (and therefore also the metabolic
demand), while IPSPs will decrease it. Assuming that the BOLD
response predominantly reflects changes in synaptic subthreshold
activity, it remains elusive whether excitatory and inhibitory cortical
events can be differentiated using the BOLD response in any single
region. Recently, one group proposed that inhibition, unlike
excitation, elicits no measurable change in the BOLD signal.39 They
hypothesized that because of the lower number of inhibitory
synapses,40 their strategically superior location (inhibitory receptors:
basal cell body; excitatory receptors: distal dendrites), and increased
efficiency,41 there could be a lower metabolic demand during
inhibition compared to excitation. The validity of this claim
notwithstanding, both empirical and theoretical studies suggest that
excitatory and inhibitory neurons in the cortex are so tightly
interconnected in local circuits (see e.g. Ref. 42 for details of the
local circuitry in cat primary visual cortex; see also Ref. 43 for the
anatomy of local inhibitory circuits in cats) that one is unlikely to
observe an increase in excitation without an increase in inhibition.
After all, for an inhibitory neuron to increase its firing rate, it must
be receiving more excitatory input, and most of the excitatory input
comes from the local cortical neighborhood (see Refs. 42 and 44 for
an overview). Naturally, excitation and inhibition would not occur in
temporal unison, as otherwise no cell would reach threshold. On the
temporal scale of several hundred milliseconds to seconds during
which BOLD contrast emerges,3 however, such potential temporal
differences would most likely be rendered indistinguishable. One
viable hypothesis is therefore that BOLD contrast reflects a
steady-state balance of local excitation and inhibition, in particular
if BOLD is more sensitive to subthreshold than to spiking activity.
11.3 NON-CONVENTIONAL fMRI
BOLD fMRI at the conventional low magnetic field of 1.5T can
commonly achieve a spatial resolution of 3–5 millimeters. This is
sufficient for labeling cortical organization at the hypercolumn
(several millimeters) or area (several centimeters) scale, but
functional images at this resolution fail to accurately label the
columnar organization of the brain. Studies at higher magnetic fields
(such as 3T or 7T) can produce significant enhancements of the
spatial resolution and specificity of fMRI. Theoretical and
experimental studies have shown at least a linear increase in
signal-to-noise ratio (SNR) with magnetic field strength. The increase
of the static MR signal can be used to reduce the volume needed for
signal averaging. Furthermore, as the field strength increases, the
field gradient around the capillaries becomes larger and extends
further into the parenchyma, thus increasing the participation of the
brain tissue in the functional signal. Concurrently, the shortened T2*
of the blood at high B0 reduces the relative contribution from the
large veins.
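To get a feel for what an (at least) linear SNR gain buys, recall that SNR also scales with voxel volume, so at constant SNR the admissible voxel volume shrinks roughly in proportion to the field increase. The following back-of-envelope sketch assumes exactly that linear scaling and nothing else; real gains depend on relaxation changes, coil design and physiological noise.

```python
# Crude scaling sketch: SNR ~ B0 * voxel_volume (the linear SNR-field
# assumption from the text; actual scaling is more complicated).
def equal_snr_volume(v_ref_mm3, b_ref, b_new):
    """Voxel volume at field b_new giving the same SNR as v_ref_mm3
    at field b_ref, under the linear-scaling assumption."""
    return v_ref_mm3 * b_ref / b_new

v_15t = 3.0 ** 3                      # 27 mm^3 voxel at 1.5T
for b in (3.0, 7.0):
    v = equal_snr_volume(v_15t, 1.5, b)
    print(f"{b}T: {v:.1f} mm^3 (~{v ** (1 / 3):.1f} mm isotropic)")
# 3T: 13.5 mm^3 (~2.4 mm); 7T: 5.8 mm^3 (~1.8 mm)
```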
While these results suggest that a stronger magnetic field per se
will specifically enhance the signal components originating from
parenchymal capillary tissue, recent optical spectroscopy and
functional MRI data45−48 suggest that the spatial specificity of BOLD
could be further and more dramatically improved if a hypothesized
initial decrease of MR signals can be utilized for functional image
formation. To this end, it is suggested that the first event following
focal neuronal activity is a prolonged increase in oxygen consumption,
caused by an elevation in the oxidative metabolism of active neurons.
Based on 2-DG data,49 one can assume the increase in oxidative
metabolism in mammalian cortex to be colocalized with the site of
electrical activity. The increase in oxidative metabolism will naturally
elevate the local deoxyhemoglobin content in the parenchyma of
active neurons, assuming there is no immediate commensurate
change in cerebral blood flow.50 In T2- or T2*-
weighted BOLD fMRI images, such an increase in paramagnetic deoxy-
hemoglobin should therefore be detectable as a transient decrease in
observable MR signals. Such an initial deoxygenation of the local
cortical tissue will last only for a brief period, as fresh blood (fresh
oxyhemoglobin) rushes into the capillaries in response to the
increased metabolism, thus reversing the local ratio of hemoglobin
in favor of oxyhemoglobin and hence resulting in a delayed increase
in observable MR signals (i.e. the conventional BOLD signal). The
crucial question here is the "where" of the above-described "biphasic"
hemodynamic processes. Grinvald and coauthors51,46 hypothesized
a fundamentally distinct functional specificity for these two events:
the initial deoxygenation, as a consequence of an increase in oxidative
metabolism, should be coregistered with the site of electrical activity
up to the level of individual cortical columns (in fact, the
well-established "optical imaging of intrinsic signals,"52,53 which has
been cross-validated with single unit techniques,54,55 is similarly
based on measuring the local transient increase of deoxyhemoglobin).
The delayed oxygenation of the cortical tissue, on the other hand,
is suggested to be far less specific due to the spread of hemodynamic
activity beyond the site of original neural activity. Both the existence
of the "biphasic" BOLD response per se and the suggested differences
in functional specificity have been the subject of heated controversy
in recent years (see Ref. 56 for a comprehensive update of this saga).
While the initial deoxygenation signal in fMRI (termed the "initial
dip") has been reported in awake behaving humans57,58 and
anesthetized monkeys,59 studies in rodents failed to detect any
significant initial decrease in BOLD signal following sensory
stimulation60−62 (but see Ref. 63). The question of whether the use
of the initial dip would indeed improve the spatial specificity of
BOLD has been far more difficult to address experimentally. This is
largely because most fMRI studies examining this phenomenon so
far have been conducted in humans (e.g. Refs. 57 and 64), and
therefore, by necessity, have used relatively coarse nominal spatial
resolution above the level of individual cortical columns. In animal
studies using ultra-high magnetic fields (e.g. 9.4T), in which
functional images at submillimeter scale can be acquired, the results
of our own group45 (Fig. 6) suggest that the use of the "initial dip"
can indeed significantly improve the spatial specificity of BOLD.
This result has been questioned afterwards65; see Ref. 66 for our
reply. On the other hand, in a recent pioneering study, preoperative
functional MRI and intraoperative optical imaging were performed
in the same human subject.67 While the spatial overlap between
optical imaging and conventional (positive) fMRI was poor, there
was a dramatic improvement in spatial correspondence between the
two datasets when the initial dip portion of the MRI signal was used.
Furthermore, combined single unit and oxygen tension probe
measurements68 convincingly demonstrated both the presence and
the functional significance of the initial deoxygenation signal
component.

As an alternative to the initial deoxygenation signal, the spatial
specificity of T2*-based fMRI can be further improved if only the
arterial contribution is utilized for functional image construction
and/or the draining vessel artifacts are attenuated. For example,
perfusion weighted images based on arterial spin labeling can be
made sensitive to the cerebral blood flow (CBF) changes from
upstream arterial networks to the capillaries, thus providing better
spatial localization ability69,70 than T2* BOLD imaging methods.
11.4 CONCLUSIONS AND FUTURE PROBLEMS OF fMRI
In less than a decade since the first noninvasive measurements of
functional blood oxygenation level signals from the human brain,
fMRI has developed into an indispensable neuroimaging tool that is
ubiquitous in both clinical and basic neuroscience settings. The
explanatory power of fMRI, however, is currently limited due to the
presence of major theoretical and practical shortcomings. These
include (but are not limited to): (a) the lack of a detailed
understanding of its neural correlate; (b) limited spatial resolution;
and (c) the difficulty of combining fMRI with other
imaging/measurement techniques. Furthermore, it is important to
note that conventional functional MRI data analysis techniques (e.g.
the General Linear Model, t-test, cross-correlation, etc.) implicitly
assume a modularity of cortical functions: parametric statistical
methods test the hypothesis that certain areas of the brain are
significantly more active than others, with a non-vanishing residual
false positive detection error (represented as a p-value). However,
such techniques assume that the brain consists of individual
computational modules (similar to "phrenological" ideas) that are
spatially distinct from each other.

Fig. 5. Neuronal correspondence (R² between BOLD and single unit responses)
as a function of the reshuffled voxel sizes. For each voxel size, the distribution of
the neuronal correspondence is indicated by the standard deviation; the red curve
marks the mean neuronal correspondence for each voxel size. For curve fitting,
conventional sigmoidal fitting was used. The results predict that the neuronal
correspondence saturates around R² = 0.7 at a voxel size of around 4.7 × 4.7 mm²;
larger voxel sizes are suggested to be ineffective in further improving the level of
neuronal correspondence. That is, the maximum amount of variance in the
underlying neuronal modulation that can be explained by the variance of
conventional T2*-based positive BOLD is about 70%. Once the voxel size is reduced
below ∼2.8 × 2.8 mm², less than 50% of the variance in the underlying neuronal
modulation can be explained by the observed BOLD responses.

Interestingly, an increasing body of evidence in recent years suggests an
alternative representational model: that information in the brain is
represented in a more distributed fashion.71,72 In the latter case,
the conventional statistical techniques may fail to detect the correct
pattern of neuronal activation, because they attempt to detect the
areas of "strongest" activation, while the information may be
represented using a much larger area of cortical tissue than
conventionally assumed.

Fig. 6. Improvement of BOLD spatial specificity by using nonconventional
functional MRI signals. The time course on the left side shows the biphasic evolution
of MR signals, resulting in the early deoxygenation contrast. If used, such
deoxygenation signals produce high-resolution images of exceedingly high
functional specificity (termed BOLD−), in contrast with conventional BOLD fMRI
signals (termed BOLD+).

In their original work, Haxby and colleagues71 used simple
voxel-to-voxel comparison methods to look for
activity patterns in the human brain. Linear pattern discrimination
techniques, such as support vector machines (SVM) or Fisher's linear
discriminant (FLD), are inherently better suited for classifying
observed activation patterns into separable categories. For example,
when applied to discriminating the orientation tuning behavior of
voxels from primary visual areas, SVM was able to detect minute
differences in the orientation selectivity of individual voxels in
human V1.73
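A minimal sketch of such multivoxel pattern classification, using a linear SVM from scikit-learn on synthetic data; the voxel responses, class structure and resulting accuracy are entirely made up for illustration and have no relation to the data of Ref. 73.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n_trials, n_voxels = 120, 50

# Synthetic voxel patterns for two stimulus orientations: each voxel
# carries only a tiny orientation preference buried in noise.
preference = rng.normal(0.0, 0.2, n_voxels)
labels = np.repeat([0, 1], n_trials // 2)
X = rng.standard_normal((n_trials, n_voxels))
X[labels == 1] += preference          # weak, distributed class signal

# The linear SVM decodes orientation from the distributed pattern even
# though no single voxel is reliably informative on its own.
scores = cross_val_score(LinearSVC(dual=False), X, labels, cv=5)
print(f"decoding accuracy: {scores.mean():.2f}")   # well above 0.5 chance
```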
Finally, while fMRI provides detailed information about the "where"
of the brain's functional architecture noninvasively, such localization
information alone must leave pivotal questions about the brain's
information processing (the "how" of the processing) unanswered,
as long as the underlying pattern of neuronal connectivity cannot
be mapped in an equally noninvasive manner. Future fMRI studies
in cognitive neuroimaging will have to embrace a significantly more
multimodal approach. For example, combining fMRI with diffusion
tensor imaging74,75 will label the pattern of structural connectivity
between functionally active areas. The direction of the flow of
functional information within this mesh of neural networks could
then be elucidated by performing time-resolved fMRI, effective
connectivity analyses, and possibly also repetitive transcranial
magnetic stimulation (rTMS) together with high-resolution fMRI
experiments.
11.5 ACKNOWLEDGMENTS
We thank Drs Louis Toth, Itamar Ronen, Mina Kim and Kamil
Ugurbil for their help during the studies. This work was supported
by grants from NIH (MH67530, NS44820).
References
1. Bandettini PA, Wong EC, Hinks RS, Tikofsky RS, et al., Time course EPI
of human brain function during task activation, Magn Reson Med 25:
390–397, 1992.
2. Kwong KK, Belliveau J, Chesler DA, Goldberg IE, et al., Dynamic
magnetic resonance imaging of human brain activity during primary
sensory stimulation, Proc Natl Acad Sci USA 89: 5675–5679, 1992.
3. Ogawa S, Tank DW, Menon R, Ellermann JM, et al., Intrinsic signal
changes accompanying sensory stimulation: Functional brain map-
ping with magnetic resonance imaging, Proc Natl Acad Sci USA 89:
5951–5955, 1992.
4. Ogawa S, Lee TM, Nayak AS, Glynn P, Oxygenation-sensitive contrast
in magnetic resonance image of rodent brain at high magnetic fields,
Magn Reson Med 14: 68–78, 1990.
5. Pauling L, Coryell CD, The magnetic properties and structures of
hemoglobin, oxyhemoglobin, and carbonmonoxyhemoglobin, Proc
Natl Acad Sci USA 22: 210–216, 1936.
6. Thulborn KR, Waterton JC, Matthews PM, Radda GK, Dependence of
the transverse relaxation time of water protons in whole blood at high
field, Biochim Biophys Acta 714: 1982.
7. Wagner AD, Schacter DL, Rotte M, Koutstaal W, et al., Building mem-
ories: Remembering and forgetting of verbal experiences as predicted
by brain activity, Science 281: 1188–1191, 1998.
8. Kim SG, Ashe J, Hendrich K, Ellermann JM, et al., Functional mag-
netic resonance imaging of motor cortex: Hemispheric asymmetry and
handedness, Science 261: 615–617, 1993.
9. Engel SA, Glover GH, Wandell BA, Retinotopic organization in human
visual cortex and the spatial precision of functional MRI, Cereb Cortex
7: 181–192, 1997.
10. Sereno MI, Dale AM, Reppas JB, Kwong KK, et al., Borders of multi-
ple visual areas in humans revealed by functional magnetic resonance
imaging, Science 268: 889–893, 1995.
11. Tootell RB, Mendola JD, Hadjikhani NK, Ledden PJ, et al., Functional
analysis of V3A and related areas in human visual cortex, J Neurosci 17:
7060–7078, 1997.
12. Ugurbil K, Toth L, Kim DS, How accurate is magnetic resonance
imaging of brain function? Trends Neurosci 26: 108–114, 2003.
13. Gandhi SP, Heeger DJ, Boynton GM, Spatial attention affects brain
activity in human primary visual cortex, Proc Natl Acad Sci USA 96:
3314–3319, 1999.
14. Salmelin R, Schnitzler A, Parkkonen L, Biermann K, et al., Native lan-
guage, gender, and functional organization of the auditory cortex, Proc
Natl Acad Sci USA 96: 10460–10465, 1999.
15. Tagaris GA, Kim SG, Strupp JP, Andersen P, et al., Mental rotation
studied by functional magnetic resonance imaging at high field (4 tesla):
Performance and cortical activation, J of Cogn Neurosci 9: 419–432, 1997.
16. Dehaene S, Spelke E, Pinel P, Stanescu R, et al., Sources of mathematical
thinking: Behavioral and brain-imaging evidence, Science 284: 970–974,
1999.
17. Glover GH, Deconvolution of impulse response in event-related BOLD
fMRI, Neuroimage 9: 416–429, 1999.
18. Boynton GM, Engel SA, Glover GH, Heeger DJ, Linear systems analysis
of functional magnetic resonance imaging in human V1, J Neurosci 16:
4207–4221, 1996.
19. Sidtis JJ, Strother SC, Anderson JR, Rottenberg DA, Are brain functions
really additive? Neuroimage 9: 490–496, 1999.
20. Vazquez AL, Noll DC, Nonlinear aspects of the BOLD response in
functional MRI, Neuroimage 7: 108–118, 1998.
21. Brinker G, Bock C, Busch E, Krep H, et al., Simultaneous recording of
evoked potentials and T2*-weighted MR images during somatosensory
stimulation of rat, Magn Reson Med 41: 469–473, 1999.
22. Ogawa S, Lee TM, Stepnoski R, Chen W, et al., An approach to probe
some neural systems interaction by functional MRI at neural time scale
down to milliseconds, Proc Natl Acad Sci USA 97: 11026–11031, 2000.
23. Logothetis NK, Pauls J, Augath M, Trinath T, et al., Neurophysiologi-
cal investigation of the basis of the fMRI signal, Nature 412: 150–157,
2001.
24. Toth LJ, Ronen I, Olman C, Ugurbil K, et al., Spatial correlation of BOLD
activity with neuronal responses, paper presented at Soc Neurosci
Abstracts, 2001.
25. Lauritzen M, Relationship of spikes, synaptic activity, and local
changes of cerebral blood flow, J Cereb Blood Flow Metab 21: 1367–
1383, 2001.
26. Mathiesen C, Caesar K, Akgoren N, Lauritzen M, Modification of
activity-dependent increases of cerebral blood flow by excitatory
synaptic activity and spikes in rat cerebellar cortex, J Physiol 512(Pt 2):
555–566, 1998.
27. Mathiesen C, Caesar K, Lauritzen M, Temporal coupling between
neuronal activity and blood flow in rat cerebellar cortex as indi-
cated by field potential analysis, J Physiol 523(Pt 1): 235–246,
2000.
28. Braitenberg V, Brain size and number of neurons: An exercise in syn-
thetic neuroanatomy, J Comput Neurosci 10: 71–77, 2001.
29. Ferster D, Linearity of synaptic interactions in the assembly of receptive
fields in cat visual cortex, Curr Opin Neurobiol 4: 563–568, 1994.
30. Jagadeesh B, Wheat HS, Ferster D, Linearity of summation of synaptic
potentials underlying direction selectivity in simple cells of the cat
visual cortex, Science 262: 1901–1904, 1993.
31. Nelson S, Toth L, Sheth B, Sur M, Orientation selectivity of corti-
cal neurons during intracellular blockade of inhibition, Science 265:
774–777, 1994.
32. Grinvald A, Lieke EE, Frostig RD, Hildesheim R, Cortical point-spread
function and long-range lateral interactions revealed by real-time opti-
cal imaging of macaque monkey primary visual cortex, J Neurosci 14:
2545–2568, 1994.
33. Gulyas B, Orban GA, Duysens J, Maes H, The suppressive influence
of moving textured backgrounds on responses of cat striate neurons to
moving bars, J Neurophysiol 57: 1767–1791, 1987.
34. Knierim JJ, van Essen DC, Neuronal responses to static texture patterns
in area V1 of the alert macaque monkey, J Neurophysiol 67: 961–980, 1992.
35. Toth LJ, Rao SC, Kim DS, Somers D, et al., Subthreshold facilitation
and suppression in primary visual cortex revealed by intrinsic signal
imaging, Proc Natl Acad Sci USA 93: 9869–9874, 1996.
36. Logothetis NK, Pauls J, Augath M, Trinath T, et al., Neurophysiological
investigation of the basis of the fMRI signal, Nature 412: 150–157, 2001.
37. Ajima A, Matsuda Y, Ohki K, Kim DS, et al., GABA-mediated repre-
sentation of temporal information in rat barrel cortex, Neuroreport 10:
1973–1979, 1999.
38. Toth LJ, Kim DS, Rao SC, Sur M, Integration of local inputs in visual
cortex, Cereb Cortex 7: 703–710, 1997.
39. Waldvogel D, van Gelderen P, Muellbacher W, Ziemann U, et al., The
relative metabolic demand of inhibition and excitation, Nature 406:
995–998, 2000.
40. Beaulieu C, Colonnier M, A laminar analysis of the number of
round-asymmetrical and flat-symmetrical synapses on spines, dendritic
trunks, and cell bodies in area 17 of the cat, J Comp Neurol 231: 180–189,
1985.
41. Koos T, Tepper JM, Inhibitory control of neostriatal projection neurons
by GABAergic interneurons, Nat Neurosci 2: 467–472, 1999.
42. Payne BR, Peters A, The cat primary visual cortex, Academic Press,
San Diego, 2001.
43. Kisvarday ZF, Kim DS, Eysel UT, Bonhoeffer T, Relationship between
lateral inhibitory connections and the topography of the orientation
map in cat visual cortex, Eur J Neurosci 6: 1619–1632, 1994.
44. Shepherd GM, The synaptic organization of the brain, Oxford Univer-
sity Press, Oxford, 1990.
45. Kim DS, Duong TQ, Kim SG, High-resolution mapping of isoorienta-
tion columns by fMRI, Nat Neurosci 3: 164–169, 2000.
46. Malonek D, Grinvald A, Interactions between electrical activity and
cortical microcirculation revealed by imaging spectroscopy: Implica-
tions for functional brain mapping, Science 272: 551–554, 1996.
47. Malonek D, Grinvald A, Vascular regulation at sub-millimeter range:
Sources of intrinsic signals for high-resolution optical imaging, Adv Exp
Med Biol 413: 215–220, 1997.
48. Vanzetta I, Grinvald A, Increased cortical oxidative metabolism due to
sensory stimulation: Implications for functional brain imaging, Science
286: 1555–1558, 1999.
49. Sokoloff L, Reivich M, Kennedy C, Des Rosiers MH, et al., The
[14C]deoxyglucose method for the measurement of local cerebral glu-
cose utilization: Theory, procedure, and normal values in the conscious
and anesthetized albino rat, J Neurochem 28: 897–916, 1977.
50. Fox PT, Raichle ME, Focal physiological uncoupling of cerebral blood
flow and oxidative metabolism during somatosensory stimulation in
human subjects, Proc Natl Acad Sci USA 83: 1140–1144, 1986.
51. Malonek D, Dirnagl U, Lindauer U, Yamada K, et al., Vascular imprints
of neuronal activity: Relationships between the dynamics of cortical
blood flow, oxygenation, and volume changes following sensory stim-
ulation, Proc Natl Acad Sci USA 94: 14826–14831, 1997.
52. Frostig RD, Lieke EE, Ts'o DY, Grinvald A, Cortical functional architec-
ture and local coupling between neuronal activity and the microcircu-
lation revealed by in vivo high-resolution optical imaging of intrinsic
signals, Proc Natl Acad Sci USA 87: 6082–6086, 1990.
53. Grinvald A, Lieke E, Frostig RD, Gilbert CD, et al., Functional architec-
ture of cortex revealed by optical imaging of intrinsic signals, Nature
324: 361–364, 1986.
54. Crair MC, Gillespie DC, Stryker MP, The role of visual experience in
the development of columns in cat visual cortex, Science 279: 566–570,
1998.
55. Shmuel A, Grinvald A, Functional organization for direction of motion
and its relationship to orientation maps in cat area 18, J Neurosci 16:
6945–6964, 1996.
56. Buxton RB, The elusive initial dip, Neuroimage 13: 953–958, 2001.
57. Hu X, Le TH, Ugurbil K, Evaluation of the early response in fMRI in
individual subjects using short stimulus duration, Magn Reson Med 37:
877–884, 1997.
58. Menon RS, Ogawa S, Hu X, Strupp JP, et al., BOLD-based functional MRI
at 4 Tesla includes a capillary bed contribution: Echo-planar imaging
correlates with previous optical imaging using intrinsic signals, Magn
Reson Med 33: 453–459, 1995.
59. Logothetis NK, Guggenberger H, Peled S, Pauls J, Functional imaging
of the monkey brain, Nat Neurosci 2: 555–562, 1999.
60. Lindauer U, Royl G, Leithner C, Kuhl M, et al., No evidence for early
decrease in blood oxygenation in rat whisker cortex in response to
functional activation, Neuroimage 13: 988–1001, 2001.
61. Marota JJ, Ayata C, Moskowitz MA, Weisskoff RM, et al., Investigation
of the early response to rat forepaw stimulation, Magn Reson Med 41:
247–252, 1999.
62. Silva AC, Kim SG, Pseudo-continuous arterial spin labeling technique
for measuring CBF dynamics with high temporal resolution, Magn
Reson Med 42: 425–429, 1999.
63. Mayhew J, Johnston D, Martindale J, Jones M, et al., Increased oxygen
consumption following activation of brain: Theoretical footnotes using
spectroscopic data from barrel cortex, Neuroimage 13: 975–987, 2001.
64. Yacoub E, Le TH, Ugurbil K, Hu X, Further evaluation of the initial
negative response in functional magnetic resonance imaging, Magn
Reson Med 41: 436–441, 1999.
65. Logothetis N, Can current fMRI techniques reveal the micro-
architecture of cortex? Nat Neurosci 3: 413–414, 2000.
66. Kim DS, Duong TQ, Kim SG, Reply to “Can current fMRI techniques
reveal the micro-architecture of cortex?” Nat Neurosci 3: 414, 2000.
67. Cannestra AF, Pouratian N, Bookheimer SY, Martin NA, et al., Temporal
spatial differences observed by functional MRI and human intraoperative
optical imaging, Cereb Cortex 11: 2001.
68. Thompson JK, Peterson MR, Freeman RD, Single-neuron activity and
tissue oxygenation in the cerebral cortex, Science 299: 1070–1072, 2003.
69. Duong TQ, Kim DS, Ugurbil K, Kim SG, Localized cerebral blood flow
response at submillimeter columnar resolution, Proc Natl Acad Sci USA
98: 10904–10909, 2001.
70. Luh WM, Wong EC, Bandettini PA, Ward BD, et al., Comparison of
simultaneously measured perfusion and BOLD signal increases during
brain activation with T(1)-based tissue identification, Magn Reson Med
44: 137–143, 2000.
71. Haxby JV, Gobbini MI, Furey ML, Ishai A, et al., Distributed and over-
lapping representations of faces and objects in ventral temporal cortex,
Science 293: 2425–2430, 2001.
72. Ishai A, Ungerleider LG, Haxby JV, Distributed neural systems for the
generation of visual images, Neuron 28: 979–990, 2000.
73. Kim DS, Kim M, Ronen I, Formisano E, et al., In vivo mapping of func-
tional domains and axonal connectivity in cat visual cortex using mag-
netic resonance imaging, Magn Reson Imaging 21: 1131–1140, 2003.
74. Kim M, Ducros M, Carlson T, Ronen I, et al., Anatomical correlates
of the functional organization in the human occipitotemporal cortex,
Magn Reson Imaging 24: 583–590, 2006.
75. Singer W, Time as coding space? Curr Opin Neurobiol 9: 189–194, 1999.
CHAPTER 12
Recent Advances in Diffusion
Magnetic Resonance Imaging
Dae-Shik Kim and Itamar Ronen
Diffusion weighted magnetic resonance imaging (DWI) plays an increas-
ingly important role in clinical and basic neurosciences. This is thanks
to DWI’s exceptional capability in representing structural properties of
neural tissue as local water molecular displacements: changes in mean
diffusivity reflect changes in macroscopic structural properties, while
gradient-direction encoded diffusion tensor imaging (DTI) can reveal
neuroanatomical connections in a noninvasive manner. Finally, recent
advances in compartment-specific diffusion MRI suggest that
microscopic cellular tissue properties might be measurable as well
using diffusion MRI.
12.1 INTRODUCTION
Magnetic resonance imaging has paved the way for accurately map-
ping the structural and functional properties of the brain in vivo.
In particular, the intrinsic noninvasiveness of magnetic resonance
(MR) methods and the sensitivity of the MRsignal to subtle changes
in the structural and physiological neuronal tissue fabric make
it an all but ideal research and diagnostic tool for characterizing
intact neural tissue and studying processes that affect neural tis-
sue properties such as cortical thinning, demyelination, and nerve
degeneration/regeneration following injury. To this end, the tech-
nique of diffusion weighted MRI (DWI) has become one of the pri-
mary research and diagnostic tools in evaluating tissue structure
thanks to its ability to represent structural properties of neural
tissue as local water molecular displacements. For example, the sharp
differences in structural characteristics between tissue properties in
the central nervous system have been extensively exploited in
countless DWI applications, ranging from the characterization of
ischemia1,2 and the demarcation of brain tumors3 to the extensive
investigation of connectivity through the use of diffusion tensor
imaging (DTI).4,5 In addition, recent advances in DTI promise to
label axonal connectivity patterns in a noninvasive manner by
utilizing directionally encoded local water diffusivity. Finally, recent
advances in compartment-specific diffusion MRI suggest that
diffusion MRI might also be able to provide semiquantitative
information about microscopic cellular tissue properties.
12.1.1 Brownian Motion and Molecular Diffusion
The essential nature of diffusion is that a group of molecules that
start at the same location will spread out over time. Each molecule
experiences a series of random displacements, so that after a time T
the spread of position along a spatial axis x has a variance of

    σₓ² = 2DT,    (1)

where D is the diffusion coefficient, a constant characteristic of the
medium. The diffusion of water molecules in most biological tissues
is recognized as being smaller than the value in pure water. In brain
tissue, the diffusion coefficient is two to ten times lower than in pure
water.6 It has been shown that in brain gray matter, the diffusion
properties are relatively independent of orientation (i.e. isotropic).
Conversely, in fibrous tissues such as brain white matter, the diffusion
properties vary with orientation. A very important empirical
observation is that the diffusion parallel to a fiber is much greater
than the diffusion perpendicular to it.7 This variation with orientation
is termed diffusion anisotropy (Fig. 1). Isotropic diffusion may
indicate either a structurally isotropic medium or the existence of
multiple anisotropic structures that are randomly oriented in the
same sample volume.
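Equation (1) is straightforward to verify numerically with a one-dimensional random walk whose step variance is chosen to give the prescribed diffusion coefficient; the numbers below are illustrative (D is of the order of free water at body temperature).

```python
import numpy as np

rng = np.random.default_rng(3)

D = 3.0e-9              # diffusion coefficient (m^2/s), ~free water, 37 C
T = 0.05                # total diffusion time (s)
n_steps, n_molecules = 1000, 100_000
dt = T / n_steps

# Each step is Gaussian with variance 2*D*dt, so after n_steps the
# displacement variance should accumulate to 2*D*T, as in Eq. (1).
x = np.zeros(n_molecules)
step_sigma = np.sqrt(2.0 * D * dt)
for _ in range(n_steps):
    x += rng.normal(0.0, step_sigma, n_molecules)

print(f"simulated variance: {x.var():.3e} m^2")
print(f"2*D*T             : {2.0 * D * T:.3e} m^2")   # closely agree
```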
Fig. 1. The upper panel shows a schematic representation of a typical white matter
voxel. The voxel is mostly occupied by closely packed myelinated axons. Water
molecule diffusion is restricted in the direction perpendicular to the axonal fibers,
leading to an anisotropic diffusion pattern. In the lower panel, a schematic
representation of a gray matter voxel is shown. Although the presence of cell
membranes still poses a restriction on diffusion, the well-oriented structure of white
matter fiber tracts no longer exists, and thus the diffusion pattern is more isotropic.
12.1.2 Anisotropic Diffusion
Whereas the factors that determine the lower diffusion coefficient in
brain tissue and the anisotropic water diffusion in white matter are
not completely understood, it is assumed that increased viscosity,
tissue compartmentalization, and interactions with structural
components of the tissue such as macromolecules, membranes and
intracellular organelles contribute to this phenomenon. One
hypothesis of biological diffusion properties is related to the restriction of
diffusion by obstacles such as membranes.6,8 For very short diffusion
times (i.e. if the diffusion path is short relative to the structural
dimensions), the molecular diffusion should resemble free diffusion
in a homogeneous medium. As the diffusion time increases, the water
molecules diffuse far enough to encounter obstacles that may
obstruct their movement. In certain media where the diffusion is
restricted by impermeable barriers, it has been shown that as the
diffusion time increases, the diffusion coefficient decreases when the
diffusion distance is comparable with the structure dimensions.6
Another hypothesis is that the behavior of water diffusion in tissue
may reflect hindered rather than restricted diffusion.6,9 The movement
of water molecules may be hindered by much slower moving
macromolecules and by membranes, resulting in complicated,
tortuous pathways. The anisotropic behavior of diffusion in white
matter may also be due to the intrinsic order of the axoplasmatic
medium.6 The presence of microtubules and neurofilaments
associated with axonal transport and the lamellar structure of the
myelin sheath may inhibit motion perpendicular to the axons, but
does not restrict motion parallel to the fiber. When diffusion is
hindered, the observed or apparent diffusion coefficient relates to the
inherent diffusion coefficient, D₀, through a tortuosity factor, λ9:

    D_app = D₀ / λ²,    (2)
12.1.3 Data Acquisition for DWI and DTI
As suggested by Stejskal and Tanner,[10] the MR image is sensitized to diffusion in a given direction using a pair of temporally separated pulsed $B_0$ field gradients in the desired direction. The application of a magnetic field gradient pulse along, e.g., one of the three spatial dimensions (x, y, z) dephases the protons (spins) along the respective dimension (Fig. 2). A second pulse in the same direction, but of opposite polarity ("refocusing pulse"), will rephase these spins. However, such rephasing cannot be perfect if the protons moved between the two gradient pulses.
Fig. 2. MRI pulse sequence for diffusion tensor imaging (DTI), showing the RF pulses (90°, 180°), the echo time TE, the MR signal, and the diffusion gradients of duration δ and separation Δ on the gx, gy and gz axes. The direction of the magnetic field gradient shown is the one in which g(x) = g(y) and g(z) = 0, i.e. g = (1, 1, 0). See text for further details.
That is to say, the signal loss, which cannot be recovered after the application of the second gradient pulse, is a function of the local molecular motion. The amount of molecular diffusion is known to obey Eq. (3), assuming the sample is isotropic (no directionality in water diffusion):
$\frac{S}{S_0} = e^{-\gamma^2 G^2 \delta^2 (\Delta - \delta/3) D}$,  (3)
where S and $S_0$ are the signal intensities with and without the diffusion weighting, $\gamma$ is a constant (the gyromagnetic ratio), G and $\delta$ are the gradient strength and duration, and $\Delta$ is the separation between the pair of gradient pulses. Because these parameters are all known, the diffusion constant at each voxel can be derived from the amount of signal decrease ($S/S_0$). Such measurements have revealed that the diffusion of brain water has strong directionality (anisotropy), which
is attributed to the existence of natural boundaries, such as axons and/or myelination. The properties of such water diffusion can be expressed as an ellipsoid, the "diffusion ellipsoid."[11,12] This ellipsoid can be characterized by six parameters: the diffusion constants along the longest, middle, and shortest axes ($\lambda_1$, $\lambda_2$, and $\lambda_3$, called principal axes) and the directions of the three principal axes, perpendicular to each other. Once the diffusion ellipsoid is fully characterized at each pixel of the brain images, the local fiber structure can be derived. For example, if $\lambda_1 \gg \lambda_2 \geq \lambda_3$ (diffusion is anisotropic), it suggests the existence of dense and aligned fibers within each pixel, whereas isotropic diffusion ($\lambda_1 \approx \lambda_2 \approx \lambda_3$) suggests sparse or unaligned fibers. When diffusion is anisotropic, the direction of $\lambda_1$ indicates the direction of the fibers.
12.1.4 Measures of Anisotropy Using Diffusion Tensors
One important application of the diffusion tensor is the quantitative characterization of brain tissue structure and of the degree of anisotropy in brain white matter. Several scalar measures, which emphasize different tensor features, have been derived from the diffusion tensor by different groups.[7,13,14] To this end, the diffusion tensor elements can be calculated from:

$b = \gamma^2 \delta^2 (\Delta - \delta/3) G^2$,  (4)

$S = S_0 \exp(-bD)$,  (5)

$D = \frac{1}{b} \ln \frac{S_0}{S}$.  (6)
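As a minimal illustration of Eqs. (4)-(6) (not from the chapter; the sequence parameters and signal values below are made up), the b-value and the apparent diffusion coefficient (ADC) can be computed as:

```python
import numpy as np

# Sketch of Eqs. (4)-(6): b-value from the gradient timing parameters and a
# voxelwise ADC estimate. All sequence parameters and signals are illustrative.
gamma = 2.675e8       # 1H gyromagnetic ratio, rad s^-1 T^-1
G     = 30e-3         # gradient strength, T/m
delta = 20e-3         # gradient duration, s
Delta = 40e-3         # separation of the gradient pair, s

b = gamma**2 * G**2 * delta**2 * (Delta - delta / 3.0)   # Eq. (4), in s/m^2
b_mm = b * 1e-6                                          # convert to s/mm^2

S0 = np.array([1000.0, 950.0, 1020.0])   # unweighted signals per voxel
S  = np.array([ 420.0, 510.0, 180.0])    # diffusion-weighted signals

D = np.log(S0 / S) / b_mm                # Eq. (6), ADC in mm^2/s
print("b =", round(b_mm), "s/mm^2, ADC =", D)
```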
While the diffusion D is a scalar for conventional DWI, it is a tensor in the case of DTI data. That is, instead of being characterized by a single number, it is described by a 3×3 matrix of numbers. For example, if the diffusion-sensitizing gradient pulses are applied along the x-axis, u = (1, 0, 0), or if the measurement axis is at an angle θ to the x-axis and in the x-y plane, u = (cos θ, sin θ, 0), then the measured
value of D along any axis u is given by:

$D = \begin{pmatrix} u_x & u_y & u_z \end{pmatrix} \begin{pmatrix} D_{xx} & D_{xy} & D_{xz} \\ D_{xy} & D_{yy} & D_{yz} \\ D_{xz} & D_{yz} & D_{zz} \end{pmatrix} \begin{pmatrix} u_x \\ u_y \\ u_z \end{pmatrix}$,  (7)

$D = u_x^2 D_{xx} + u_y^2 D_{yy} + u_z^2 D_{zz} + 2u_x u_y D_{xy} + 2u_y u_z D_{yz} + 2u_z u_x D_{zx}$,  (8)

$\frac{1}{b} \ln \frac{S_0}{S} = u_x^2 D_{xx} + u_y^2 D_{yy} + u_z^2 D_{zz} + 2u_x u_y D_{xy} + 2u_y u_z D_{yz} + 2u_z u_x D_{zx}$.  (9)
For example, for 12 directions,

$\begin{pmatrix} \frac{1}{b}\ln\frac{S_0}{S_1} \\ \vdots \\ \frac{1}{b}\ln\frac{S_0}{S_{12}} \end{pmatrix} = U\,\tilde{D}$,  (10)

where

$U = \begin{pmatrix} u_{x1}^2 & u_{y1}^2 & u_{z1}^2 & u_{x1}u_{y1} & u_{y1}u_{z1} & u_{z1}u_{x1} \\ \vdots & & & & & \vdots \\ u_{x12}^2 & u_{y12}^2 & u_{z12}^2 & u_{x12}u_{y12} & u_{y12}u_{z12} & u_{z12}u_{x12} \end{pmatrix}$ and $\tilde{D} = \begin{pmatrix} D_{xx} \\ D_{yy} \\ D_{zz} \\ 2D_{xy} \\ 2D_{yz} \\ 2D_{zx} \end{pmatrix}$.
Now, if we assume that the columns of U are linearly independent, then the matrix $U^T U$ is invertible and the least squares solution is:

$\tilde{D} = (U^T U)^{-1} U^T \begin{pmatrix} \frac{1}{b}\ln\frac{S_0}{S_1} \\ \vdots \\ \frac{1}{b}\ln\frac{S_0}{S_{12}} \end{pmatrix}$.  (11)
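A compact sketch of this least-squares fit, assuming numpy and placeholder gradient directions and signals (none of the values come from the chapter), might look as follows; note that the factor of 2 on the cross terms is folded into the design matrix so that the solution vector holds the plain tensor elements:

```python
import numpy as np

# Sketch of Eqs. (9)-(11): least-squares estimate of the six unique tensor
# elements from N >= 6 gradient directions. Directions, b and signals are
# placeholders, not data from the chapter.
def fit_tensor(signals, S0, b, directions):
    """Return [Dxx, Dyy, Dzz, Dxy, Dyz, Dzx]."""
    u = np.asarray(directions, dtype=float)
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    ux, uy, uz = u[:, 0], u[:, 1], u[:, 2]
    # One row of the design matrix per direction (cf. Eq. (10)); the factor 2
    # of the cross terms is folded into U so x holds the plain tensor elements.
    U = np.column_stack([ux**2, uy**2, uz**2,
                         2 * ux * uy, 2 * uy * uz, 2 * uz * ux])
    y = np.log(S0 / np.asarray(signals)) / b      # left-hand side of Eq. (9)
    x, *_ = np.linalg.lstsq(U, y, rcond=None)     # least-squares solution, Eq. (11)
    return x
```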
Since the 3×3 tensor matrix

$D = \begin{pmatrix} D_{xx} & D_{xy} & D_{xz} \\ D_{xy} & D_{yy} & D_{yz} \\ D_{xz} & D_{yz} & D_{zz} \end{pmatrix}$

is symmetric along the diagonal, the eigenvalues and eigenvectors can be obtained by diagonalizing the matrix using the Jacobi transformation. The resulting eigenvalues

$\Lambda = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}$

and corresponding eigenvectors $P = (\vec{p}_1\ \vec{p}_2\ \vec{p}_3)$ can then be used to describe the diffusivity and directionality (or anisotropy) of water diffusion within a given voxel. An important measure associated with the diffusion tensor is its trace:

$\mathrm{tr}\{D\} = D_{xx} + D_{yy} + D_{zz} = 3\bar{\lambda} = \lambda_1 + \lambda_2 + \lambda_3$.  (12)
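In practice, the diagonalization can be delegated to a standard symmetric eigensolver. The sketch below (illustrative tensor values only) uses numpy's eigh in place of an explicit Jacobi transformation and verifies the trace relation of Eq. (12):

```python
import numpy as np

# Sketch: diagonalize the symmetric tensor with a standard eigensolver and
# verify Eq. (12). Tensor values are illustrative.
D = np.array([[1.7e-3, 0.1e-3, 0.0    ],
              [0.1e-3, 0.4e-3, 0.0    ],
              [0.0,    0.0,    0.3e-3]])

eigvals, eigvecs = np.linalg.eigh(D)                 # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]   # lambda1 >= lambda2 >= lambda3

print("trace             :", np.trace(D))            # Dxx + Dyy + Dzz
print("sum of eigenvalues:", eigvals.sum())          # = lambda1 + lambda2 + lambda3
```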
The trace has similar values in healthy white and gray matter ($\mathrm{tr}\{D\} \sim 2.1 \times 10^{-3}\ \mathrm{mm}^2/\mathrm{s}$). However, the trace value drops considerably in brain tissue affected by acute stroke.[15] This drop is attributed to an increase in the tortuosity factor due to the shrinkage of the extracellular space.[15] Consequently, the trace of the diffusion tensor can be used as an early indicator of ischemic brain injury. Finally, the anisotropy of the diffusion tensor characterizes the amount of diffusion variation as a function of direction (i.e. the deviation from isotropy). Several of these anisotropy measures are normalized to a range from 0 to 1. One of the most commonly used measures of anisotropy is the fractional anisotropy (FA)[7]:
$FA = \sqrt{\frac{1}{2}}\ \frac{\sqrt{(\lambda_1-\lambda_2)^2 + (\lambda_2-\lambda_3)^2 + (\lambda_3-\lambda_1)^2}}{\sqrt{\lambda_1^2 + \lambda_2^2 + \lambda_3^2}}$,  (13)
which is the root-mean-square (RMS) deviation of the eigenvalues from their mean, normalized by the Euclidean norm of the eigenvalues. FA has been shown to provide the best contrast between different classes of brain tissue.[16] A useful way to display tract orientation is to use color to encode the direction of the tensor's major eigenvector.[17,18] The 3D eigenvector space is associated with the 3D RGB (Red-Green-Blue) color space by assigning a color to each component of the eigenvector (e.g. red to x, green to y, and blue to z). Consequently, the fibers that are oriented from left to right
of the brain appear red, the fibers oriented anteriorly-posteriorly (front-back) appear green, and those oriented superiorly-inferiorly (top-bottom) appear blue (Fig. 3). All other orientations are combinations of these three colors. Color maps allow the identification of different white matter structures. Eigenvector color maps for three orthogonal planes in a 3D brain volume are presented in Fig. 3. The color intensities are weighted by FA to emphasize white matter anatomy.

Fig. 3. Color maps of several brain slices. (Left) axial, (middle) coronal, and (right) sagittal slices. See text for further details.
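A short sketch of Eq. (13) and of the FA-weighted directional color coding described above (illustrative, with a made-up eigensystem) is:

```python
import numpy as np

# Sketch of Eq. (13) and of the FA-weighted directional color coding:
# |x| -> red, |y| -> green, |z| -> blue. Eigensystem values are made up.
def fractional_anisotropy(l1, l2, l3):
    num = (l1 - l2)**2 + (l2 - l3)**2 + (l3 - l1)**2
    den = l1**2 + l2**2 + l3**2
    return np.sqrt(0.5 * num / den)

def direction_color(e1, fa):
    e1 = np.abs(e1) / np.linalg.norm(e1)   # eigenvector sign is arbitrary
    return fa * e1                         # FA-weighted RGB triple in [0, 1]

fa = fractional_anisotropy(1.7e-3, 0.4e-3, 0.3e-3)
print(fa, direction_color(np.array([0.95, 0.25, 0.10]), fa))
```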
12.1.5 White Matter Tractography
White matter tractography (WMT) is based on the estimation of
white matter tract orientation using measured diffusion proper-
ties of water as described in the previous sections. Some of the
major techniques for DTI based fiber tractography are discussed
below:
12.1.6 Propagation Algorithms
In algorithms developed by many groups,[11] a continuous representation of the diffusion tensor and the principal eigenvector $\varepsilon_1$ are interpolated from the discrete voxel data. The fiber track direction at any location along the tract is given by the continuous $\varepsilon_1$. Typically, the tracking algorithm stops when the fiber radius of curvature or the anisotropy factor falls below a threshold (Figs. 4 and 5).
Fig. 4. In vivo high-resolution diffusion tensor imaging (DTI) of the human corpus
callosum. The left panel depicts the user-defined seeding ROI for fiber reconstruc-
tion, and the right panel shows the result of DTI based fiber tractography of human
corpus callosum.
With this approach, the fiber is not represented by a succession of line segments but by a relatively smooth curve that follows the local diffusion direction and is more representative of the behavior of real fibers. These approaches, often designated "streamline" approaches, are based on the assumption that diffusion is locally uniform and can be accurately described by a single vector $\varepsilon_1$. Unfortunately, this fails to describe voxels occupied by fibers with different diffusion tensors.[19] Furthermore, the presence of noise in the diffusion MRI data induces a small uncertainty in the direction of the vectors $\varepsilon_1$ that can lead to significant fiber tract propagation error. To try to overcome these problems, tensorline approaches have been developed in which the entire tensor information is used instead of reducing it to a single eigenvector.[20,21] A recently proposed approach is a continuous approximation of the tensor field using B-splines to derive fiber tracts. Tensorline algorithms seem to perform better than streamline algorithms for reconstructing low curvature fibers and, in general, achieve better reproducibility. Poupon et al.[22,23] developed an algorithm based on a probabilistic approach aimed at minimizing fiber bending along the fiber tract. A regularization step based on the analogy between fiber pathways in white matter and so-called "spaghetti plates" is used to improve robustness. A consequence of this approach is that it can represent fiber branching and forks that are typically present in white matter fascicles, a clear advantage over previously published methods.

Fig. 5. The explanatory power of DTI can further be increased by combining DTI fiber tractography with conventional functional imaging. Here, the areas of high functional MRI (fMRI) activity during visual stimulation along the human ventro-temporal cortex are used as seeding points for DTI based fiber reconstructions.
12.1.6.1 Fiber assignment by continuous tracking
Mori et al.[24] developed one of the earliest and most commonly employed algorithms: fiber assignment by continuous tracking (FACT). FACT is based on extrapolation of continuous vector lines from discrete DTI data. The reconstructed fiber direction within each voxel is parallel to the diffusion tensor eigenvector ($\varepsilon_1$) associated with the greatest eigenvalue ($\lambda_1$). Within each voxel, the fiber tract is a line segment defined by the input position, the direction of $\varepsilon_1$ and an output position at the boundary with the next voxel. The track is propagated from voxel to voxel and terminated when a sharp turn in the fiber orientation occurs. FACT uses as propagation direction the value corresponding to the current voxel: $v'_{prop} = v_{prop}$. The step size p is chosen such that the current step crosses the entire voxel and reaches its boundary. To this end, the trajectory is formed by a series of segments of variable length. FACT integration has the advantage of high computational efficiency.
12.1.6.2 Streamline tracking
The streamline tracking (STT) technique[11,25,26] approximates $v_{prop}$ by the major eigenvector of the tensor:

$v_{prop} = e_1$.  (14)

This approach is analogous to simulated flow propagation in fluid dynamics, including the study of blood flow phenomena from MRI flow measurements with 3D phase contrast.[27]
12.1.6.3 Tensor deflection
An alternative approach for determining the tract direction is to use the entire diffusion tensor to deflect the incoming vector ($v_{in}$) direction[14,28]:

$v_{out} = D \cdot v_{in}$.  (15)

The incoming vector represents the propagation direction from the previous integration step. The tensor operator deflects the incoming vector towards the major eigenvector direction, but limits the curvature of the deflection, which should result in smoother tract reconstructions. Tensor deflection (TEND) was proposed in order to improve propagation in regions with low anisotropy, such as crossing fiber regions, where the direction of fastest diffusivity is not well defined.[29]
12.1.6.4 Tensorline algorithms
The tensorline algorithm, described by Weinstein et al.,[30] dynamically modulates the STT and TEND contributions to steer the tract:

$v_{out} = f e_1 + (1 - f)\bigl((1 - g) v_{in} + g D v_{in}\bigr)$,  (16)

where f and g are user-defined weighting factors that vary between 0 and 1. The algorithm has three terms: (a) an STT term ($e_1$) weighted by f, (b) a TEND term ($D \cdot v_{in}$) weighted by (1 − f)g, and an undeviated $v_{in}$ term weighted by (1 − f)(1 − g). The vectors are normalized to unity before being used in Eq. (16). Estimated trajectories with different properties can be achieved by changing f and g. Tensorline may be considered a family of tractography algorithms that can be tuned to accentuate specific behavior. In the original implementation of this algorithm, Weinstein et al. used a measure of prolate tensor shape, $f = c_L$,[14] to weight the STT term. Note that for f = 1, the tensorline algorithm is equivalent to STT.
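The three propagation rules of Eqs. (14)-(16) can be summarized in a few lines of code. The sketch below is illustrative (the tensor, incoming vector, f and g are placeholders), and it normalizes the deflected vectors to unit length as the text prescribes:

```python
import numpy as np

# Sketch of Eqs. (14)-(16): a single propagation step for STT, TEND and the
# tensorline rule. D, v_in, f and g are placeholders.
def e1(D):
    w, V = np.linalg.eigh(D)
    return V[:, np.argmax(w)]              # major eigenvector

def stt(D, v_in):
    return e1(D)                           # Eq. (14)

def tend(D, v_in):
    v = D @ v_in                           # Eq. (15)
    return v / np.linalg.norm(v)

def tensorline(D, v_in, f=0.5, g=0.5):
    # Eq. (16); the deflected vectors are normalized to unity before mixing.
    v = f * e1(D) + (1 - f) * ((1 - g) * v_in + g * tend(D, v_in))
    return v / np.linalg.norm(v)
```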
12.1.6.5 Probabilistic mapping algorithm
Diffusion tensor imaging is based on the assumption that the local orientation of nerve fibers is parallel to the first eigenvector of the diffusion tensor. However, due to issues such as imaging noise, limited spatial resolution and partial volume effects, the fiber orientation cannot be determined without uncertainty. Probabilistic methods for determining the connectivity between brain regions using information obtained from DTI have recently been introduced.[31-33] These approaches utilize probability density functions (PDFs) defined at each point within the brain to describe the local uncertainty in fiber orientation. Probabilistic tractography algorithms can reveal fiber connectivity that progresses into the gray matter, where conventional streamline algorithms fail to yield acceptable results. The goal of probabilistic tracking approaches is to determine the probability that fibers project from a starting point (or group of points) to regions of interest. In the data analysis performed in this research, the local fiber orientation is given by the first eigenvector of the diffusion tensor, which we call $\varepsilon_1$. To perform probabilistic tracking, we need to introduce an uncertainty in the $\varepsilon_1$ orientation at every point along a fiber created by a streamline tracking method. Then we repeat the tracking a great number of times to generate a 3D probability map. The uncertainty in the $\varepsilon_1$ orientation can be described by the probability that it is deflected about its original position. The result of this deflection is a vector $\varepsilon'_1$; $\theta$ is the angle between $\varepsilon_1$ and $\varepsilon'_1$, and $\phi$ is the rotation of $\varepsilon'_1$ about $\varepsilon_1$. The PDFs for $\theta$ and $\phi$ are given by the zeroth-order model of uncertainty described in Ref. 32: $\phi$ is uniformly distributed between 0 and 2π, and $\theta$ is normally distributed about 0 with a standard deviation $\sigma$ linked to FA. Indeed, the smaller the FA, the greater the fiber orientation uncertainty. We define $\sigma$ = S(FA), S being a sigmoid function. In our computation, we can modify the sigmoid function parameters: $\sigma_{max}$, the standard deviation of $\theta$ as FA tends to 0; $\sigma_0$, the standard deviation of $\theta$ as FA tends to 1 (i.e. a residual uncertainty); $FA_0$, the value of FA for which $\sigma = (\sigma_0 + \sigma_{max})/2$; and the slope of the sigmoid. To create a probabilistic map, a great number of fibers are generated using the streamline tracking algorithm. At each point along the fiber propagation, $\varepsilon_1$ is modified into $\varepsilon'_1$ using a random number generator and the PDFs for $\phi$ and $\theta$ described above. The probability map is the number of fibers reaching a voxel divided by the total number of fibers that were generated. When probabilistic tracking is performed from multiple starting points (such as an entire ROI), the probability is multiplied by the number of starting points.
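The perturbation step described above can be sketched as follows; the sigmoid parameters stand in for the $\sigma_0$, $\sigma_{max}$, $FA_0$ and slope values discussed in the text and are purely illustrative:

```python
import numpy as np

# Sketch of the perturbation step: deflect e1 by theta ~ N(0, sigma(FA))
# about a uniformly distributed azimuth phi. Parameter values are placeholders.
def sigma_of_fa(fa, sigma_0=0.05, sigma_max=0.6, fa_0=0.3, slope=10.0):
    # Decreasing sigmoid: the smaller FA, the larger the angular uncertainty.
    return sigma_0 + (sigma_max - sigma_0) / (1.0 + np.exp(slope * (fa - fa_0)))

def perturb(e1, fa, rng):
    theta = rng.normal(0.0, sigma_of_fa(fa))
    phi = rng.uniform(0.0, 2.0 * np.pi)
    # Build an orthonormal basis (e1, a, b); rotate e1 by theta toward azimuth phi.
    a = np.cross(e1, [1.0, 0.0, 0.0])
    if np.linalg.norm(a) < 1e-8:           # e1 parallel to x: use another axis
        a = np.cross(e1, [0.0, 1.0, 0.0])
    a /= np.linalg.norm(a)
    b = np.cross(e1, a)
    return np.cos(theta) * e1 + np.sin(theta) * (np.cos(phi) * a + np.sin(phi) * b)
```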
12.1.7 Limitations of DTI Techniques
Despite its great promise for visualizing and quantitatively characterizing white matter connections, DTI has some important limitations. It is not clear what is actually being measured with the anisotropy index. For example, the precise contributions of two factors, fiber density and myelination, to the anisotropy index are not completely understood. Thus, it is not clear to what degree the results of DTI correspond to the actual density and orientation of the local axonal fiber bundles. It is also important to understand how white matter is, in general, organized. The most basic shortcoming of DTI is that it can only determine a single fiber orientation at any given location in the brain. This is clearly inadequate in regions with complex white matter architecture, where different axonal pathways crisscross through each other. The crossing fibers create multiple fiber orientations within a single MRI voxel, where a voxel refers to a 3D pixel, the individual element of the MR image. Since the diffusion tensor assumes only a single preferred direction of diffusion within each voxel, DTI cannot adequately describe regions of crossing fibers, or of converging or diverging fibers. 3D DTI fiber tracking techniques are also confounded in these regions of complex white matter architecture, since there is no well-defined single dominant fiber orientation for them to follow.
In recent years, some of these problems have been addressed by measuring the full 3D dispersion of water diffusion in each MRI voxel at high angular resolution. Thus, instead of obtaining diffusion measurements in only a few independent directions to determine a single fiber orientation as in DTI, dozens or even hundreds of uniformly distributed diffusion directions in 3D space are acquired to resolve multiple fiber orientations in high angular resolution diffusion imaging (HARDI). Each distinct fiber population can be visualized on maps of the orientation distribution function (ODF), which are computed from the 3D high angular resolution diffusion data through a projection reconstruction technique known as the Funk-Radon transform. This 3D projection reconstruction is mathematically very similar to the 2D method by which CT images are calculated from X-ray attenuation data. Unlike DTI, HARDI has the advantage of being model-independent: it does not assume any particular 3D distribution of water diffusion or any specific number of fiber orientations within a voxel.
12.1.8 The Use of High b-value DWI for Tissue Structural
Characterization
As a result of the structural heterogeneity of tissue on a spatial scale significantly smaller than the typical image voxel size, the diffusion-weighted signals display a multiexponential dependence on the diffusion weighting magnitude, quantified with the parameter b, where $b = \gamma^2 \delta^2 g^2 (\Delta - \delta/3)$ in a spin-echo diffusion experiment; here $\gamma$ is the gyromagnetic ratio, g is the magnitude of the Stejskal-Tanner gradient pair, each of which is of duration $\delta$, and $\Delta$ is the temporal separation of the gradient pair. The complexity of this multiexponential behavior of the signal led to a more detailed inspection of diffusion properties in matter, as proposed in Refs. 34 and 35. The method, known as q-space imaging, is based on the acquisition of data with multiple gradient strength values g. When a Fourier transformation is performed pixel by pixel with respect to the variable $q = \gamma \delta g / 2\pi$:

$P(\vec{R}, \Delta) = \int_{-\infty}^{\infty} S(\vec{q}, \Delta) \cdot \exp(-i 2\pi \vec{q} \cdot \vec{R})\, d\vec{q}$,  (17)
the transformed data set P represents the displacement probability of the water molecules with respect to the axis which was sensitized to diffusion, at a given diffusion time $\Delta$. This concept has been successfully applied in various in vitro and in vivo applications,[36-40] where the use of long diffusion times combined with gradation of b-values and Fourier transformation has yielded displacement maps with exquisite accuracy.
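A one-dimensional illustration of Eq. (17), assuming a Gaussian signal decay in q (free diffusion) so that the recovered displacement profile should be Gaussian with RMS width $\sqrt{2D\Delta}$, is sketched below (all values are illustrative):

```python
import numpy as np

# 1D illustration of Eq. (17) for free diffusion: a Gaussian decay in q
# Fourier-transforms into a Gaussian displacement profile of RMS width
# sqrt(2*D*Delta). Values are illustrative.
D, Delta = 1.0e-3, 0.1                        # mm^2/s, s
sigma_r = np.sqrt(2 * D * Delta)              # expected RMS displacement, mm

q = np.linspace(-300.0, 300.0, 1024)          # q axis, mm^-1
S = np.exp(-2 * np.pi**2 * q**2 * sigma_r**2) # Gaussian q-space signal

dq = q[1] - q[0]
P = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(S))).real * dq
r = np.fft.fftshift(np.fft.fftfreq(q.size, d=dq))   # displacement axis, mm

print("expected FWHM of P(r):", 2.355 * sigma_r, "mm")
```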
Although q-space imaging potentially yields detailed diffusion data on heterogeneous tissue, the straightforward use of q-space data for imaging purposes has been mostly limited to displaying one of the main parameters of the displacement distribution function, i.e. the zero displacement probability (the amplitude at displacement = 0) and the displacement probability RMS (the FWHM of the distribution function). This particular use clusters together the various diffusion components, and thus it is particularly suitable for applications in which diffusion in a voxel is dominated by one component, either because of the nature of the tissue or by eliminating nonrestricted diffusion components by means of a large $\Delta$ value.
The other approach to using diffusion data acquired with multiple b-values is to model the data according to a plausible model that governs the diffusion pattern in each voxel. In this approach, the data is fitted to a multiparametric model function that best represents the expected behavior of the signal with respect to b. The advantage of modeling the diffusion data is the possibility of extracting information about the diffusion characteristics of water in various compartments from the same data set, and thus simultaneously obtaining volumetric and structural information about those compartments. The most common and useful model for that matter is a biexponential decay diffusion model, which partitions the diffusion data into slow and fast diffusing components.[41-46] It is now accepted that there is no stoichiometric relation between the two components in the biexponential model and two distinct tissue compartments. However, it is widely accepted that the largest contribution to the nonmonoexponential behavior stems from restriction imposed on diffusion, mostly on the intracellular and intra-axonal water pools.[44] This view gains support from studies that measured the diffusion of intracellular metabolites such as N-acetyl aspartate (NAA), for which the diffusion attenuation curve as a function of b-value was shown to be nonmonoexponential.[47,48]
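As a hedged illustration of such biexponential modeling (synthetic data and made-up volume fractions, not results from the cited studies), the two components can be fitted with a standard nonlinear least-squares routine:

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch of the biexponential diffusion model with synthetic data; the volume
# fraction and diffusivities below are illustrative.
def biexp(b, f_fast, D_fast, D_slow):
    return f_fast * np.exp(-b * D_fast) + (1.0 - f_fast) * np.exp(-b * D_slow)

b = np.linspace(0.0, 6000.0, 25)          # s/mm^2, high b-value range
rng = np.random.default_rng(1)
signal = biexp(b, 0.7, 1.2e-3, 0.2e-3) + rng.normal(0.0, 0.005, b.size)

popt, _ = curve_fit(biexp, b, signal, p0=[0.5, 1.0e-3, 0.1e-3],
                    bounds=([0.0, 0.0, 0.0], [1.0, 5e-3, 1e-3]))
print("fast fraction, D_fast, D_slow:", popt)
```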
12.2 SUMMARY AND CONCLUSIONS
Diffusion weighted magnetic resonance imaging (DWI) already plays a crucial role in detecting neurostructural deviations at the macroscopic level. With recent advances in DTI, multimodal imaging and compartment-specific imaging, the importance of diffusion MRI for clinical and basic neuroscience is likely to increase exponentially.
12.3 ACKNOWLEDGMENTS
Drs Mina Kim and Susumu Mori provided crucial help in DWI/DTI
data acquisition and analyses. We also thank Mathieu Ducros, Sahil
Jain and Keun-Ho Kim for their help with DTI postprocessing. This
work was supported by grants from NIH (RR08079, NS44825), The
MIND institute, Keck Foundation, and Human Frontiers Science
Program.
References
1. Moseley ME, Cohen Y, et al., Early detection of regional cerebral
ischemia in cats: Comparison of diffusion- and T2-weighted MRI and
spectroscopy, Magn Reson Med 14(2): 330–346, 1990.
2. Moseley ME, Kucharczyk J, et al., Diffusion-weighted MR imag-
ing of acute stroke: Correlation with T2-weighted and magnetic
susceptibility-enhanced MR imaging in cats, AJNR Am J Neuroradiol
11(3): 423–429, 1990.
3. Eis M, Els T, et al., Quantitative diffusion MR imaging of cerebral tumor
and edema, Acta Neurochir Suppl (Wien) 60: 344–346, 1994.
4. Basser PJ, Mattiello J, et al., Estimation of the effective self-diffusion
tensor from the NMR spin echo, J Magn Reson B 103(3): 247–254,
1994.
5. Pierpaoli C, Jezzard P, et al., Diffusion tensor MR imaging of the human
brain, Radiology 201(3): 637–648, 1996.
6. Le Bihan D, Turner R, et al., Imaging of diffusion and microcirculation
with gradient sensitization: Design, strategy, and significance, J Magn
Reson Imaging 1(1): 7–28, 1991.
7. Basser PJ, Pierpaoli C, Microstructural and physiological features of
tissues elucidated by quantitative-diffusion-tensor MRI, J Magn Reson
B 111(3): 209–219, 1996.
8. Hajnal JV, Doran M, et al., MR imaging of anisotropically restricted
diffusion of water in the nervous system: Technical, anatomic, and
pathologic considerations, J Comput Assist Tomogr 15(1): 1–18, 1991.
9. Norris DG, The effects of microscopic tissue parameters on the diffu-
sion weighted magnetic resonance imaging experiment, NMR Biomed
14(2): 77–93, 2001.
10. Stejskal EO, Tanner JE, Restricted self-diffusion of protons in colloidal
systems by the pulse-gradient, spin-echo method, J Chem Phys 49(4):
1768–1777, 1968.
11. Conturo TE, Lori NF, et al., Tracking neuronal fiber pathways in the
living human brain, Proc Natl Acad Sci USA 96(18): 10422–10427, 1999.
12. Basser PJ, Mattiello J, et al., MR diffusion tensor spectroscopy and imaging, Biophys J 66(1): 259–267, 1994.
13. Conturo TE, McKinstry RC, et al., Encoding of anisotropic diffusion
with tetrahedral gradients: A general mathematical diffusion formal-
ism and experimental results, Magn Reson Med 35(3): 399–412, 1996.
14. Westin CF, Maier SE, et al., Image Processing for Diffusion Tensor Magnetic
Resonance Imaging, Springer, Cambridge, 1999.
15. Sotak CH, The role of diffusion tensor imaging in the evaluation
of ischemic brain injury — A review, NMR Biomed 15(7–8): 561–569,
2002.
16. Alexander AL, Hasan K, et al., A geometric analysis of diffusion tensor
measurements of the human brain, Magn Reson Med 44(2): 283–291,
2000.
17. Makris N, Worth AJ, et al., Morphometry of in vivo human white mat-
ter association pathways with diffusion-weighted magnetic resonance
imaging, Ann Neurol 42(6): 951–962, 1997.
18. Pajevic S, Pierpaoli C, Color schemes to represent the orientation of
anisotropic tissues from diffusion tensor data: Application to white
matter fiber tract mapping in the human brain, Magn Reson Med 43(6):
921, 2000.
19. Alexander AL, Hasan KM, et al., Analysis of partial volume effects in
diffusion-tensor MRI, Magn Reson Med 45(5): 770–780, 2001.
20. Weinstein D, Rabinowitz R, et al., Ovarian hemorrhage in women with
Von Willebrand’s disease. A report of two cases, J Reprod Med 28(7):
500–502, 1983.
21. Pajevic S, Basser P, A continuous tensor field approximation for DT-
MRI data, 9th Annual Conference of the ISMRM, 2001.
22. Poupon C, Clark CA, et al., Regularization of diffusion-based direction
maps for the tracking of brain white matter fascicles, Neuroimage 12(2):
184–195, 2000.
23. Poupon C, Mangin J, et al., Towards inference of human brain connec-
tivity from MR diffusion tensor data, Med Image Anal 5(1): 1–15, 2001.
24. Mori S, Crain BJ, et al., Three-dimensional tracking of axonal projections
in the brain by magnetic resonance imaging, Ann Neurol 45(2): 265–269,
1999.
25. Basser PJ, Pajevic S, et al., In vivo fiber tractography using DT-MRI data,
Magn Reson Med 44(4): 625–632, 2000.
26. Lori NF, Akbudak E, et al., Diffusion tensor fiber tracking of human
brain connectivity: Acquisition methods, reliability analysis and bio-
logical results, NMR Biomed 15(7–8): 494–515, 2002.
27. Napel S, Lee DH, et al., Visualizing three-dimensional flow with simu-
lated streamlines and three-dimensional phase-contrast MR imaging,
J Magn Reson Imaging 2(2): 143–153, 1992.
28. Lazar M, Weinstein DM, et al., White matter tractography using diffu-
sion tensor deflection, Hum Brain Mapp 18(4): 306–321, 2003.
29. Westin CF, Maier SE, et al., Image Processing for Diffusion Tensor Magnetic
Resonance Imaging, Springer, Cambridge, 1999.
30. Weinstein DM, Kindlmann GL, et al., Tensorlines: Advection-diffusion
based propagation through diffusion tensor fields, IEEE Visualization
Proc, San Francisco, 1999.
31. Behrens TE, Johansen-Berg H, et al., Noninvasive mapping of connec-
tions between human thalamus and cortex using diffusion imaging,
Nat Neurosci 6(7): 750–757, 2003.
32. Parker GJ, Haroon HA, et al., A framework for a streamline-based prob-
abilistic index of connectivity (PICo) using a structural interpretation
of MRI diffusion measurements, J Magn Reson Imaging 18(2): 242–254,
2003.
33. Jones DK, Pierpaoli C, Confidence mapping in diffusion tensor mag-
netic resonance imaging tractography using a bootstrap approach,
Magn Reson Med 53(5): 1143–1149, 2005.
34. Callaghan PT, Eccles CD, et al., NMR microscopy of dynamic displacements: k-space and q-space imaging, J Phys E: Sci Instrum 21(8): 820–822, 1988.
35. Cory DG, Garroway AN, Measurement of translational displacement
probabilities by NMR: An indicator of compartmentation, Magn Reson
Med 14(3): 435–444, 1990.
36. King MD, Houseman J, et al., q-Space imaging of the brain, Magn Reson
Med 32(6): 707–713, 1994.
37. King MD, Houseman J, et al., Localized q-space imaging of the mouse
brain, Magn Reson Med 38(6): 930–937, 1997.
38. Assaf Y, Cohen Y, Structural information in neuronal tissue as revealed
by q-space diffusion NMR spectroscopy of metabolites in bovine optic
nerve, NMR Biomed 12(6): 335–344, 1999.
39. Assaf Y, Cohen Y, Assignment of the water slow-diffusing compo-
nent in the central nervous system using q-space diffusion MRS:
Implications for fiber tract imaging, Magn Reson Med 43(2): 191–199,
2000.
40. Assaf Y, Ben-Bashat D, et al., High b-value q-space analyzed diffusion-weighted MRI: Application to multiple sclerosis, Magn Reson Med 47(1):
115–126, 2002.
41. Niendorf T, Dijkhuizen RM, et al., Biexponential diffusion attenuation
in various states of brain tissue: Implications for diffusion-weighted
imaging, Magn Reson Med 36(6): 847–857, 1996.
42. Mulkern RV, Gudbjartsson H, et al., Multicomponent apparent diffu-
sion coefficients in human brain, NMR Biomed 12(1): 51–62, 1999.
43. Clark CA, Le Bihan D, Water diffusion compartmentation and
anisotropy at high b values in the human brain, Magn Reson Med 44(6):
852–859, 2000.
44. Inglis BA, Bossart EL, et al., Visualization of neural tissue water com-
partments using biexponential diffusion tensor MRI, Magn Reson Med
45(4): 580–587, 2001.
45. Mulkern RV, Vajapeyam S, et al., Biexponential apparent diffusion coef-
ficient parametrization in adult vs newborn brain, Magn Reson Imaging
19(5): 659–668, 2001.
46. Clark CA, Hedehus M, et al., In vivo mapping of the fast and slow
diffusion tensors in human brain, Magn Reson Med 47(4): 623–628, 2002.
47. Assaf Y, Cohen Y, In vivo and in vitro bi-exponential diffusion of N-acetyl
aspartate (NAA) in rat brain: A potential structural probe?, NMR
Biomed 11(2): 67–74, 1998.
48. Assaf Y, Cohen Y, Non-mono-exponential attenuation of water and
N-acetyl aspartate signals due to diffusion in brain tissue, J Magn Reson
131(1): 69–85, 1998.
CHAPTER 13
Fluorescence Molecular Imaging:
Microscopic to Macroscopic
Sachin V Patwardhan, Walter J Akers and Sharon Bloch
Medical imaging has revolutionized our understanding and ability to
monitor specific macroscopic physical, physiological, and metabolic func-
tions at cellular and subcellular levels. In the years to come, it will enable
detection and characterization of disease even before anatomic changes
become apparent. Fluorescence molecular imaging is revolutionizing
drug discovery and development with real-time in vivo monitoring in
intact tissues. Technological advancements have taken fluorescence based
imaging from microscopy to preclinical and clinical instruments for med-
ical imaging. This chapter describes the current state of technology associ-
ated with in vivo noninvasive or minimally invasive fluorescence imaging
along with the underlying principles. An overview of microscopic and
macroscopic fluorescence imaging techniques is presented and their role
in the development and applications of exogenous fluorescence contrast
agents is discussed.
13.1 INTRODUCTION
Present medical imaging technologies rely on macroscopic physical, physiological, or metabolic changes that differentiate pathological from normal tissue, rather than identifying the specific molecular events (e.g. gene expression) responsible for disease.[1] The human genome project is making molecular medicine an exciting reality. Developments in quantum chemistry, molecular genetics and high speed computers have created unparalleled capabilities for understanding
complex biological systems. Current research has indicated that many diseases such as cancer occur as the result of the gradual buildup of genetic changes in single cells.[1-4] Molecular imaging exploits specific molecular probes as the source of image contrast for studying such genetic changes at the subcellular level. Molecular imaging is capable of yielding the critical information bridging molecular structure and physiological function for understanding integrative biology, which is the most important process in disease characterization, prevention, earlier detection, treatment, and the evaluation of treatment.
The use of contrast agents for disease diagnostics and functional imaging is very common in established imaging modalities like positron emission tomography (PET), magnetic resonance imaging (MRI), and X-ray computed tomography (CT). Contrast agents provide accurate difference images under nearly identical biological conditions and yield superior diagnostic information. Fluorescence molecular imaging is a novel multidisciplinary field in which fluorescence contrast agents are used to produce images that reflect cellular and molecular pathways and in vivo mechanisms of disease present within the context of physiologically authentic environments. The limitation of fluorescence imaging is that the excitation light must reach the fluorescent molecule, which is governed by the absorption-dependent penetration depth of the light within the tissue. However, fluorophores can be excited continuously, and the signal is not governed by inherent properties of the probe such as radioactive decay. Further, a set of photophysical properties is accessible, such as fluorophore concentration, fluorescence quantum yield and fluorescence lifetime. Some of these parameters are influenced by the local environment, such as pH, ions, and oxygen, and therefore provide more relevant information about the physiological and molecular condition. Most importantly, light is a nonionizing radiation, rendering it harmless and nontoxic.
Biophotonics can provide tools capable of identifying the specific subset of genes encoded within the human genome that can cause the development of cancer and other diseases. Photonic techniques are being developed to image and identify the molecular alterations that distinguish a diseased cell from a normal cell. Such technologies will ultimately aid in characterizing and predicting the pathological behavior of the cell, as well as its responsiveness to drug treatment. The rapid development of laser and imaging technology has yielded powerful tools for the study of disease on all scales: single molecules, tissue materials and whole organs. Biochemical analyses of individual compounds characterize the basic fluorescence properties of common fluorophores within the tissue. Additional information associated with complex systems such as cell and tissue structure can be obtained from in vitro measurements. The study of in vivo animal disease models provides information about intercellular interactions and regulatory processes. Human clinical trials will then lead to optical diagnostic, monitoring, and treatment procedures.

The purpose of this chapter is to provide an overview of microscopic and macroscopic fluorescence imaging techniques. Fluorescence confocal microscopy, planar reflectance imaging, and diffuse optical tomography techniques are discussed along with their role in the development of exogenous fluorescence contrast agents for cellular level to in vitro and in vivo tissue imaging. For more specific details on fluorescence contrast agents, measurement setups, image reconstruction techniques and applications, the reader is encouraged to tap the extensive literature available on these subjects.
13.2 FLUORESCENCE CONTRAST AGENT:
ENDOGENOUS AND EXOGENOUS
Light induced fluorescence is a powerful noninvasive method for tissue pathology recognition and monitoring.[4-7] The attractiveness of fluorescence imaging is that fluorescent dyes can be detected at low concentrations using nonionizing, harmless radiation that can be applied repeatedly to the patient. In fluorescence imaging, the energy from an external source of light is absorbed and almost immediately re-emitted at a longer, lower energy wavelength that is related to the electronic transition from the excited state to the ground state of the fluorescent molecule. Fluorescence that originates from chromophores naturally present in the tissue (endogenous) is known as autofluorescence. Synthesized chromophores (exogenous) may also be administered that target a specific tissue type, or may be activated by functional changes in the tissue.
13.2.1 Endogenous Fluorophores
These fluorophores are generally associated with the structural matrix of tissue (e.g. collagen and elastin)[8] or with cellular metabolic pathways (e.g. NAD and NADH).[9] Cells in various disease states often undergo different rates of metabolism or have different structures associated with distinct fluorescent emission spectra. Fluorescence emission generally depends on the fluorophore concentration, its spatial distribution throughout the tissue, the local microenvironment, and light attenuation due to differences in the amount of nonfluorescing chromophores. The autofluorescence of proteins is associated with amino acids such as tryptophan, tyrosine and phenylalanine, with absorption maxima at 280 nm, 275 nm, and 257 nm respectively, and emission maxima between 280 nm (phenylalanine) and 350 nm (tryptophan). One of the main imaging applications of fluorescent proteins is in monitoring tumor growth[10,11] and metastasis formation,[12,13] as well as occasionally gene expression.[4] Structural fluorophores like collagen or elastin have absorption maxima between 300 nm and 400 nm and show broad emission bands between 400 nm and 600 nm with maxima around 400 nm. The fluorescence of collagen or elastin has been used to distinguish between various tissue types, e.g. epithelial and connective tissue.[14-20] NADH is excited in the 330 nm-370 nm wavelength range and is most concentrated within the mitochondrial membrane, where it is oxidized within the respiratory chain. Its fluorescence is an appropriate parameter for the detection of ischemic or neoplastic tissue. The fluorescence of free and protein-bound NADH has been shown to be sensitive to oxygen concentration.[21] The main drawback of endogenous fluorophores is their low excitation and emission wavelengths. In this spectral range, the tissue absorption is relatively high, limiting the light penetration.
13.2.2 Exogenous Fluorophores
Various fluorescent dyes can be used for probing cell anatomy and cell physiology. Exogenous fluorescence probes target specific cellular and subcellular events, and this ability differentiates them from nonspecific dyes, such as indocyanine green (ICG), which reveal generic functional characteristics such as vascular volume and permeability. These fluorescence probes typically consist of the active component, which interacts with the target (i.e. the affinity ligand or enzyme substrate); the reporting component (i.e. the fluorescent dye); and possibly a delivery vehicle (for example, a biocompatible polymer), which ensures optimal biodistribution. An important characteristic in the design of active and activatable probes for in vivo applications is the use of fluorochromes that operate in the NIR spectrum of optical energy. This is due to the low light absorption that tissue exhibits in this spectral window, which makes light penetration of several centimeters possible.
Exogenous targeted and activatable imaging probes yield particularly high tumor/background signal ratios because of their nondetectability in the native state. In activatable probes, the fluorochromes are usually arranged in close proximity to each other so that they self-quench, or they are placed next to a quencher using enzyme-specific peptide sequences.[22] These peptide sequences can be cleaved in the presence of the enzyme, thus freeing the fluorochromes, which can then emit light upon excitation. In contrast to active probes, activatable probes minimize background signals because they are essentially dark in the absence of the target, and can improve contrast and detection sensitivity. A variety of exogenous reporter probes have been used for enhanced detection of early cancers, including somatostatin receptor targeted probes[23,24]; folate receptor targeted agents[25]; tumor cell targeted agents[26-29]; agents that incorporate into areas of calcification, bone formation or both[30]; and agents activated by tumor-associated proteases.[31] Dyes like fluorescein and indocyanine green are commonly used for fluorescence angiography or blood volume determination in a clinical setup. Extensive research is also being carried out on the development of
exogenous fluorophores with applications as activatable probes that carry quenched fluorochromes[24,33] and as photosensitizers or tumor killing agents for cancer treatment using photodynamic therapy. A photosensitizer is a drug that is preferentially taken up by malignant tissue and can be photoactivated. After an optimal time from administration, light is shone on the tissue area of interest and absorbed by the sensitizer. The sensitizer then kills the surrounding tumor tissue, leaving the healthy tissue undamaged. Tissue localization, effectiveness in promoting cell death, and toxicity are some of the parameters that need to be characterized before human trials.
13.3 FLUORESCENCE IMAGING
Fluorescence imaging can provide information at different resolutions and depth penetrations, ranging from micrometers (microscopy) to centimeters (fluorescence reflectance imaging and fluorescence molecular tomography).[2,3] At the microscopic level, fluorescent reporter dyes are typically used for monitoring the distribution of important chemical species throughout the cell by obtaining fluorescence microscopy images of the cell after injecting it with the dye. The viability of the cell or the permeability of its membrane can also be determined using fluorescence microscopy. Compared to microscopic cellular imaging, macroscopic in vitro tissue imaging allows us to study interactions between cells and provides a platform much closer to true in vivo analysis in terms of structural architecture on microscopic and macroscopic scales. There is a significant difference in tissue uptake and storage of various exogenous fluorophores between in vitro and in vivo specimens. However, in vitro measurements can provide information associated with complex systems, such as the interaction of various biochemicals that are present in functional systems. Further, the effect of the local environment on tissue optical properties, and properties such as reactivity to a specific chemical, can be investigated prior to involving live subjects. For diagnostic purposes, the actual location and kinetics of tissue uptake are important. This information cannot be obtained using in vitro tissue analysis. The pharmacokinetics, tissue discrimination capabilities, toxicity, and clearance pathways of fluorescence probes need to be studied prior to use in human trials. Such studies are performed in vivo using animal models.
13.3.1 Fluorescence Microscopic Imaging
Fluorescence microscopy using endogenous fluorophores finds applications in discriminating normal tissue from cancerous or even precancerous tissue in a real-time clinical setting. Unique fluorescence spectral patterns associated with cell proliferation, and differences between rapidly growing and slowly growing cells, have been studied. Autofluorescence has been used to identify terminal squamous differentiation of normal oral epithelial cells in culture and to discriminate proliferating from nonproliferating cell populations. Fluorescence microscopy using exogenous dyes is the most common technique used for monitoring the spatial distribution of a particular analyte throughout a cell. One or more exogenous dyes are introduced into the cell and allowed to disperse. These dyes then interact with the analyte of interest, which in turn changes their fluorescence properties. By obtaining a fluorescence image of the cell using excitation at specific wavelengths, relative concentrations of the analyte can be determined. Another important application of exogenous dyes is in elucidating the role of a particular chemical in cellular biology.

In epifluorescence microscopy, the specimen is typically excited using a mercury or xenon lamp along with a set of monochromator filters. The excitation light, after reflecting from a dichromatic mirror, shines onto the sample through a microscope objective. The dichromatic mirror reflects light shorter than a certain wavelength (excitation), and passes light longer than that wavelength (emission). Thus only the emitted fluorescence light passes to the eyepiece or is projected onto an electronic array detector positioned behind the dichroic mirror. While imaging thick specimens, the emitted fluorescent signal must pass through the volume of the specimen, which decreases the resolution of objects in the focal plane. Additionally,
fluorescence emitted from excited objects that lie above and below the focal plane obscures the emission from the in-focus objects.

Laser-scanning confocal microscopy offers distinct advantages over epifluorescence microscopy by using a pinhole aperture, as shown in Fig. 1. The laser excitation light reflects off a dichromatic mirror and is focused on a single point within the tissue of interest, rather than broadly illuminating the entire specimen, using a computer-controlled X-Y scanning mirror pair. With only a single point illuminated, the illumination intensity rapidly falls off above and below the plane of focus as the beam converges and diverges, thus reducing excitation of fluorescence from interfering objects situated out of the focal plane being examined.
Fig. 1. The principle of operation of a confocal microscope is shown on the left. The pinhole aperture placed at the focal length of the lens blocks the light coming from out-of-focus planes (green and blue lines), while allowing the light coming from the plane in focus to reach the detector. A schematic of a point-scanning fluorescence confocal microscope is shown on the right. The dichromatic mirror reflects the emission light while allowing the excitation light to pass through. A motorized X-Y scanning mirror pair is used to collect the data from the selected sample area.
The emitted fluorescence light from the sample is descanned by the same mirrors that are used to scan the excitation light from the laser. The emitted light passes through the dichromatic mirror and is focused onto a pinhole aperture. The light that passes through the pinhole is measured by a detector, i.e. a photomultiplier tube. Any light emitted from regions away from the vicinity of the illuminated point is blocked by the pinhole aperture, thus providing attenuation of out-of-focus interference. Most confocal imaging systems provide adjustable pinhole blocking apertures. This enables a tradeoff to be made between vertical resolution and sensitivity. A small pinhole gives the highest resolution and lowest signal, and vice versa. With point-by-point scanning, there is never a complete image of the sample at any given instant. The detector is attached to a computer which builds up the image, one pixel at a time.
Point-scanning microscopes, when used with high numerical aperture lenses, have an inherent speed limitation in fluorescence. This arises because of a limitation in the amount of light that can be obtained from the small volume of fluorophore contained within the focus of the scanned beam (less than a cubic micron). At moderate levels of excitation, the amount of light emitted will be proportional to the intensity of the incident excitation. However, fluorophore excited states have significant lifetimes (on the order of a few nanoseconds). Therefore, as the level of excitation is increased, the situation eventually arises when most of the fluorophore molecules are pumped up to their excited state and the ground state becomes depleted. At this stage, the fluorophore is saturated and no more signal may be obtained from it by increasing the flux of the excitation source.

Despite their success, conventional microscopy methods suffer significant limitations when used in biological experimentation. They usually require chemical fixation of removed tissues, involve the observation of biological samples under nonphysiological conditions, can generally not resolve the dynamics of cellular processes, and, most importantly, make it very difficult to generate quantitative data.
13.3.2 Fluorescence Macroscopic Imaging
Planar fluorescence imaging, transillumination and fluorescence molecular tomography (FMT) are the most common imaging techniques used for obtaining fluorescence information at macroscopic resolution. Collapsing the volume of an animal or tissue into a single image, known as planar imaging, is generally fast, the data sets generated are small, and imaging can be done in a high throughput fashion, at the expense of internal resolution. Tomographic imaging, on the other hand, allows a virtual slice of the subject to be obtained and is more quantitative and capable of displaying internal anatomic structures and/or functional information. However, FMT requires longer acquisition times, generates a very large data set and is computationally expensive. Further, light becomes diffuse within a few millimeters of propagation within tissues, owing to the elastic scattering experienced by photons when they interact with various cellular components, such as the membranes and different organelles. Diffusion results in the loss of imaging resolution. Therefore, macroscopic fluorescence imaging largely depends on spatially resolving and quantifying bulk signals from specific fluorescent entities reporting on cellular and molecular activity.
13.3.3 Planar Fluorescence Imaging
The most common technique to record fluorescence within a large tissue volume is to illuminate the tissue with a plane wave, i.e. an expanded light beam, and then collect the fluorescence signals emitted towards a CCD camera.[37] These methods can be generally referred to as planar methods and can be applied in epi-illumination or transillumination mode. Figure 2 shows a typical setup of a planar reflectance imaging system. The imaging plane is uniformly illuminated using a particular wavelength light source and the light emitted by the fluorophore is captured using a CCD camera. An illustrative image of a nude mouse with a subcutaneous human breast cancer xenograft obtained using a near-infrared fluorescent probe is also shown.
Fig. 2. Schematic diagram of a typical planar reflectance imaging system. The imaging plane is uniformly illuminated using a particular wavelength light source and the light emitted by the fluorophore is captured using a CCD camera. An illustrative image of a nude mouse with a subcutaneous human breast cancer xenograft MDA MB 361 obtained using a near-infrared fluorescent probe is also shown.
Planar imaging has the added advantage that the same instrumentation can be used to image fluorescence in solutions and excised tissues. However, a significant drawback of this method is that it cannot resolve depth and does not account for the nonlinear dependence of the detected signal on propagation depth and the surrounding tissue. Superficial fluorescence activity may reduce the contrast of underlying activity and prevent it from being detected, owing to the simple projection viewing. Despite these drawbacks, planar imaging remains popular because setting up a reflectance imaging system is comparatively easy and inexpensive. Planar fluorescence imaging is a very useful technique when probing superficial structures (<5 mm deep), for example during endoscopy,[41,42] dermatological imaging,[43] intraoperative imaging,[44] probing tissue autofluorescence[45,46] or small animal imaging,[47] with very high throughputs.
13.3.3.1 Fluorescence molecular tomography
A recent technological evolution has been the development of fluorescence tomography for investigations at the whole-animal or tissue level. These technologies allow three-dimensional imaging of fluorescence biodistribution in whole animals and account for tissue optical heterogeneity and the nonlinear dependence of fluorescence intensity on depth and optical properties. FMT can localize and quantify fluorescent probes three-dimensionally in deep tissues with high sensitivity.48,49 The diffuse optical tomography (DOT) methods account for partial volume effects, reduce the influence of superficial tissues and improve the contrast-to-noise ratio (CNR) of buried targets,50−54 thereby overcoming the shortcomings of planar reflectance imaging.

Optical tomography is far more complex than X-ray CT. In X-ray CT, the radiation propagates through the medium in a straight line from the source to the detector. The forward problem then becomes a set of integrals (the Radon transform) and the inverse problem is linear and well posed (back-projection methods). In optical imaging, on the other hand, by the time the light reaches the detector it has lost all information about the originating source due to multiple scattering. Each measurement is therefore sensitive to the whole tissue volume, resulting in an ill-posed, underdetermined inverse problem. Mathematical models based on radiative transport (e.g. Monte Carlo techniques) or the diffusion equation are required to reconstruct the most probable photon propagation path through tissue for a given source-detector geometry (the forward problem).55,56 Algorithms based on linear numerical inversion methods (the inverse solution) start with the diffusion equation, which is then transformed into an integral equation via Green's theorem. A linear version of the equation is then obtained using Born's (or Rytov's) approximation and then discretized into a system of linear equations as follows:
The fluorophore concentration is reconstructed by inverting
ratiometric data derived from the intensities of the excitation and
fluorescence light measured on the detector plane for each source
position. The light intensity at the excitation wavelength is written as Φ(r_s(i), r_d(i), λ_exc), where r_s(i) and r_d(i) are the positions of the i-th source and i-th detector respectively, and λ_exc is the excitation wavelength. Similarly, the fluence at the emission wavelength λ_emi is written as Φ(r_s(i), r_d(i), λ_emi). Following the normalized Born approach, the formulation of the ratiometric fluorescence/excitation measurements is written in discrete notation as y = Ax with the following definitions21,22:

$$y_i = \frac{\Phi(r_{s(i)}, r_{d(i)}, \lambda_{emi}) - \theta_f\,\Phi_o(r_{s(i)}, r_{d(i)}, \lambda_{exc})}{\Phi_o(r_{s(i)}, r_{d(i)}, \lambda_{exc})} \qquad (1)$$

$$A_{i,j} = -\frac{S_o v h^3}{D_o}\,\frac{G(r_{s(i)}, r_j, \lambda_{exc})\,G(r_j, r_{d(i)}, \lambda_{emi})}{G(r_{s(i)}, r_{d(i)}, \lambda_{exc})} \qquad (2)$$

$$x_j = \partial N_j \qquad (3)$$
Here, the two-point Green's function, G, models light transport for given boundary conditions and optical properties. Each image voxel x_j has concentration ∂N_j and position r_j. These equations are then solved numerically using some type of regularization scheme; for example, singular value decomposition, the algebraic reconstruction technique or conjugate gradient algorithms are used with Tikhonov regularization.57−60
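To make this inversion step concrete, the following is a minimal NumPy sketch (an illustration under stated assumptions, not the implementation of any system described here): it assumes a precomputed weight matrix A, assembled from the Green's functions of Eq. (2), and the normalized Born measurement vector y of Eq. (1), and solves the Tikhonov-regularized least-squares problem for the fluorophore distribution x.

```python
import numpy as np

def tikhonov_reconstruct(A, y, alpha=1e-2):
    """Solve min_x ||A x - y||^2 + alpha ||x||^2 via the normal equations.

    A     : (n_measurements, n_voxels) weight matrix from Eq. (2)
    y     : (n_measurements,) normalized Born data from Eq. (1)
    alpha : Tikhonov regularization parameter (problem dependent)
    """
    n_vox = A.shape[1]
    lhs = A.T @ A + alpha * np.eye(n_vox)   # (A^T A + alpha I)
    rhs = A.T @ y
    return np.linalg.solve(lhs, rhs)        # reconstructed x_j

# Stand-in example; a real A would come from the diffusion forward model.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 500))
x_true = np.zeros(500)
x_true[240:260] = 1.0                       # a small fluorescent target
x_rec = tikhonov_reconstruct(A, A @ x_true)
```

For large voxel counts, the explicit normal equations would be replaced by an iterative solver such as conjugate gradients, in line with the algorithms cited above.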
The linear formulation works well when perturbations are small and isolated, and when the background medium is relatively uniform. However, the diffusion equation is inherently nonlinear because both the photon fluence rate and the Green's function depend upon the unknown quantities we are trying to solve for. In algorithms based on nonlinear iterative methods, a global norm, such as the mean square error, is iteratively minimized: the unknown inhomogeneity that best predicts the measurement data, subject to some a priori knowledge, is sought. At every iteration, the prediction based on the current estimate of the inhomogeneity is compared to the measurements.61−64
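The generic structure of such a nonlinear scheme can be sketched as below; forward and jacobian are hypothetical stand-ins for a diffusion-equation solver and its sensitivity computation, and the simple gradient-descent update is only one of the possible minimization strategies covered by the methods cited above.

```python
import numpy as np

def nonlinear_reconstruct(forward, jacobian, y_meas, x0,
                          alpha=1e-2, step=1e-3, n_iter=50):
    """Sketch of iterative nonlinear DOT reconstruction.

    forward(x)  : predicted measurements for the current inhomogeneity x
    jacobian(x) : sensitivity of those predictions, recomputed per iteration
    Minimizes ||forward(x) - y_meas||^2 + alpha ||x||^2 by gradient descent.
    """
    x = x0.copy()
    for _ in range(n_iter):
        residual = forward(x) - y_meas      # compare prediction to data
        J = jacobian(x)                     # based on the current estimate
        grad = 2.0 * (J.T @ residual) + 2.0 * alpha * x
        x -= step * grad                    # a priori constraints go here
    return x
```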
The fluorescence optical data can be obtained before and after administration of the absorbing fluorescence contrast agent, and the DOT images can be reconstructed and subtracted. However, a more robust approach is to use differential measurements due to the extrinsic perturbation. The two data sets (excitation and emission) are obtained within a short time of one another, thereby minimizing positional and movement errors and instrumental drift. The use of emission/excitation differential measurements eliminates systematic errors associated with operational parameters and provides a baseline measurement for independent reconstruction.3,65 Further, these ratio measurements reduce the influence of heterogeneous optical properties and path lengths.65 Fluorescence DOT images have recently been demonstrated in vivo using both fiber-coupled cylindrical geometries66−68 and lens-coupled planar geometries.69−71
The use of a lens to relay light from the tissue surface to a charge-coupled device (CCD) for detection69,71−73 permits dense spatial sampling and large imaging domains on the detection surface. Fiber-coupled illumination systems introduce an undesired asymmetry between the illumination plane (sparsely sampled by discrete fibers) and the detection plane (densely sampled by a CCD array detector) and force tradeoffs between sampling density and field of view on the illumination plane. In addition, fiber optic switching times (>0.1 seconds) limit data acquisition speeds. Rather than direct lens coupling, other systems have used arrays of detector fibers to relay light from tissue to a CCD.66−68,74−76 While providing source-detector symmetry, this approach does not provide the dense sampling of lens-coupled detection. The source plane can instead be sampled with a fast, flexible, high-density and large field-of-view arrangement by raster scanning the source laser.
A schematic of the small animal continuous-wave fluorescence DOT system is shown in Fig. 3.77 Here, the source illumination is provided by a laser diode. The collimated output of the laser passes through a beam splitter that deflects 5% of the beam to a photodiode for a reference measure of the laser intensity. The remainder of the collimated beam (95%) passes through a lens, L, into a dual-axis XY galvanometer mirror system. The mirror pair samples the source plane in a flexible, high-density and large field-of-view arrangement by raster scanning the focused illumination (spot size = 100 µm) in two dimensions, with a position-A-to-position-B switch time of <0.5 ms. The 100 µm source spot size is similar to the multimode fiber sizes used in a wide variety of DOT systems.66−69,71−76 The use of the galvanometer mirror pair permits the system to scan an adjustable area of up to 8 cm×8 cm with flexible source positioning and source separations. After propagating through the sample volume, transmitted light passes through a selectable filter element and is detected on the opposite plane using a lens-coupled CCD camera. The typical scanning protocol consists of two separate excitation and fluorescence scans. The excitation light intensity profile is measured for each source position using a neutral density filter. The fluorescence emission light intensity profile is then measured using a narrowband interference filter. The excitation and emission images obtained from the CCD camera are normalized using the mean source intensity values obtained from the photodiode. This normalization compensates for the differences in light levels between the excitation and emission scans. A full-frame 4×4 binned image (128×128) is collected for all the source positions. The full detector images are cropped and binned to generate a set of detector measurement positions symmetrically arranged in the x-y plane, such that for each x-y source position there is a matched x-y detector position. With a total small animal whole-body scan time of ∼2.2 min, this fluorescence DOT system provides a 10× larger imaging domain (5 cm×5 cm×1.5 cm) than an equivalent fiber-switched system while maintaining the same resolution (small object FWHM ≤ 2.2 mm) and sensitivity (<0.1 pmole).69,77

Fig. 3. Fluorescence tomography system. The mouse subject is suspended and held in light compression between two movable windows (W1 and W2). Light from a laser diode at 785 nm (LD) is collimated and passes through a 95/5 beam splitter (BS). A reference photodiode (PD) collects 5% of the beam. The main 95% beam passes through lens (L1) into an XY galvo scanning system (XYGal). The mirror pair scans the beam onto the illumination window (W1) of the imaging tank. Light emitted from W2 is detected by an EMCCD via a filter (F1) and lens system (L2).77
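A minimal sketch of this normalization and ratio step is given below (array names are hypothetical, and the bleed-through correction term θ_f of Eq. (1) is omitted for brevity): each binned CCD frame is divided by its mean photodiode reading, and the emission-to-excitation ratio is then formed pixelwise for every source position.

```python
import numpy as np

def ratiometric_data(em_frames, ex_frames, em_ref, ex_ref, eps=1e-9):
    """Form normalized Born ratio data for each source position.

    em_frames, ex_frames : (n_sources, 128, 128) binned CCD images
    em_ref, ex_ref       : (n_sources,) mean photodiode readings per scan
    """
    # Compensate for laser-intensity differences between the two scans
    em = em_frames / em_ref[:, None, None]
    ex = ex_frames / ex_ref[:, None, None]
    # Pixelwise emission/excitation ratio, guarded against division by zero
    return em / np.maximum(ex, eps)
```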
Imaging the distribution of tumor-targeted molecular probes simultaneously in the liver, kidneys and tumors is demonstrated in Fig. 4, which shows the uptake of a breast tumor-specific polypeptide in nude mice bearing subcutaneously implanted human breast carcinoma (MDA-MB-361) xenografts.

Fig. 4. Representative slices from a 3D tomographic reconstruction of a nude mouse with a subcutaneous human breast cancer xenograft MDA-MB-361: (A) an xy slice parallel to the detector plane at a depth of z = 2.5 mm and (B) an xz slice extending from the source plane to the detector plane at y = 12 mm.77

The polypeptide was conjugated
with a near-infrared fluorescent probe, cypate, which serves as the fluorescent contrast agent for optical imaging. For imaging, anesthetized nude mice (ketamine/xylazine via intraperitoneal injection) were suspended between the source and detector windows. A matching fluid (µ_a = 0.3 cm⁻¹, µ′_s = 10 cm⁻¹) surrounds the animal. With warmed matching fluid (T = 38°C), the mice can be imaged multiple times over the normal course of an anesthetic dose (30 minutes–60 minutes). The full protocol for a combined fluorescence/excitation scan (a 24 × 36 array with 2 mm square spacing between source positions, x = −24 mm to 24 mm, y = −36 mm to 36 mm) took 5 minutes–6 minutes, including animal positioning, emission and excitation scanning, retrieval of the animal from the scanner and reconstruction of the data. Figure 4 shows a 2D slice parallel to the detector plane at a depth of z = 2.5 mm and a 2D slice extending from the source plane to the detector plane at y = 12 mm, obtained from the 3D tomographic reconstruction. The tumor (breast cancer) shows uptake of a fluorescing near-infrared cypate-derivative probe with a polypeptide that targets a protein receptor expressed in breast cancer. The kidneys also show contrast. The maximum values of probe concentration obtained from the tumor, liver and kidney volumes, as ratios to the background, are 54.7, 32.4, and 58.3 respectively.

Fig. 5. Retinal angiography images of a diabetic fundus (FA) showing loss of normal retinal capillaries and growth of abnormal ones that leak the fluorescein dye. (Source: Dr Levent Akduman, Saint Louis University Eye Center.)
Besides applications in disease diagnosis and monitoring, molecular imaging assays in intact living animals can also help resolve biological questions raised by pharmaceutical scientists. Transgenic animals are useful in guiding early drug discovery by "validating" the target protein, evaluating test compounds, determining whether the target is involved in any toxicological effects of test compounds, and testing the efficacy of compounds to ensure that they will act as expected in man (Livingston, 1999). The implementation of molecular imaging approaches in this drug discovery process offers the strong advantage of being able to meaningfully study a potential drug labeled for imaging in an animal model, often before phenotypic changes become obvious, and then quickly move into human studies. It is likely that preclinical trials can be accelerated to rule out drugs with unfavorable biodistribution and/or pharmacokinetics prior to human studies. A further advantage over in vitro and cell culture experimentation may be achieved by repetitive study of the same animal model, using identical or alternative biological imaging assays at different time points. This reveals a dynamic and more meaningful picture of the progressive changes in the biological parameters under scrutiny, as well as possible temporal assessment of therapeutic responses, all in the same animal without recourse to its death. This yields better quality results from far fewer experimental animals. Another benefit of molecular imaging assays is their quantitative nature. The images obtained are usually not just subjective or qualitative, as is the case with the standard use of several conventional medical imaging modalities; instead, they usually provide meaningful numerical measures of biological phenomena (exemplified above). Such quantitative data could even be considered more useful than similar data obtainable in vitro or ex vivo, on account of preserving the intactness and physiology of the experimental subject.
13.4 CONCLUSIONS
With the completion of several genome sequences, the next cru-
cial step is to understand the function of gene products and their
role in the development of disease. This knowledge will potentially
facilitate the discovery of informative biomarkers that can be used
for the earliest detection of disease and for the creation of new
classes of drugs directed at new therapeutic targets. Thus, one of
the capabilities most highly sought after is the noninvasive visu-
alization of specific molecular targets, pathways and physiological
effects in vivo. Revolutionary advances in fluorescent probes, pho-
toproteins and imaging technologies have allowed cell biologists
to carry out quantitative examination of cell structure and function
at high spatial and temporal resolution. Indeed, whole cell assays
have become an increasingly important tool in screening and drug
discovery.
Fluorescence molecular imaging now creates the possibility of achieving several important goals in biomedical research, namely: (1) to develop noninvasive in vivo imaging methods that reflect specific cellular and molecular processes, for example gene expression, or more complex molecular interactions such as protein-protein interactions; (2) to monitor multiple molecular events near-simultaneously; (3) to follow trafficking and targeting of cells; (4) to optimize drug and gene therapy; (5) to image drug effects at a molecular and cellular level; (6) to assess disease progression at a molecular pathological level; and (7) to create the possibility of achieving all of the above goals of imaging in a rapid, reproducible, and quantitative manner, so as to be able to monitor time-dependent experimental, developmental, environmental, and therapeutic influences on gene products in the same animal or patient.1
Fluorescein and ICG are FDA-approved fluorescent dyes for human medical applications and are routinely used in clinical retinal angiography78 and liver function testing.79 Sample retinal angiography images of a diabetic fundus (FA), showing loss of normal retinal capillaries and growth of abnormal ones that leak the fluorescein dye, are shown in Fig. 5. ICG exhibits favorable pharmacokinetic properties for the assessment of hepatic function and cardiac output and has been applied in clinical settings.80 ICG has also been reported as a NIR contrast agent for the detection of tumors in animal research81,82 and at the clinical level.83 The first fluorescence contrast-enhanced imaging in a clinical setting was reported by Ntziachristos et al.,83 who demonstrated uptake and localization of ICG in breast lesions using DOT. Fluorescence imaging has shown very promising results as a potential imaging modality that will provide specific macroscopic physical, physiological, or metabolic information at the molecular level. With the current resources and research efforts, it will not be long before a library of fluorescence biomarkers and photosensitizers for the diagnosis, monitoring and treatment of various diseases is formed. Technological advances will soon take fluorescence-based imaging devices from preclinical to clinical settings.
13.5 ACKNOWLEDGMENT
The authors acknowledge the help and support of Joseph P Culver,
Samuel Achilefu and the entire team of the Optical Radiology Lab-
oratory in the Department of Radiology at Washington University
School of Medicine, Saint Louis, Missouri. The authors are thankful
to Dr Levent Akduman, Saint Louis University Eye Center, Saint
Louis, Missouri, for providing the retinal angiography images illus-
trated in this chapter. Some of the work presented here was sup-
ported in part by the following research grants: National Institutes
of Health, K25-NS44339, BRGR01 CA109754, Small Animal Imaging
Resource Program (SAIRP) grant, R24 CA83060.
References
1. Massoud TF, Gambhir SS, Molecular imaging in living subjects: Seeing
fundamental biological processes in a new light, Genes & Dev 17: 545–
580, 2003.
2. Weissleder R, Ntziachristos V, Shedding light onto live molecular
targets, Nature Medicine 9(1): 123–128, 2003.
3. Ntziachristos V, Fluorescence molecular imaging, Annu Rev Biomed Eng
8: 1–33, 2006.
4. Yang M, Baranov E, Moossa AR, et al., Visualizing gene expression by
whole body fluorescence imaging, Proc Natl Acad Sci USA 97: 12278–
12282, 2000.
5. Tuchin VV, Handbook of Optical Biomedical Diagnostics, PM107, SPIE
Press, Bellingham, WA, 2002.
6. Das BB, Liu F, Alfano RR, Time-resolved fluorescence and photon migration studies in biomedical and random media, Rep Prog Phys 60: 227, 1997.
7. Lakowicz JR, Principles of Fluorescence Spectroscopy, 2nd edn., Kluwer Academic, New York, 1999.
8. Denk W, Two-photon excitation in functional biological imaging,
J Biomed Opt 1: 296, 1996.
9. Fujimoto D, Akiba KY, Nakamura N, Isolation and characterization
of a fluorescent material in bovine Achilles-tendon collagen, Biochem
Biophys Res Commun 76: 1124, 1977.
10. Wagnieres GA, Star WM, Wilson BC, In vivo fluorescence spectroscopy
and imaging for oncological applications, Photochem Photobiol 68: 603,
1998.
11. Hoffman RM, Visualization of GFP-expressing tumors and metastasis in vivo, Biotechniques 30: 1016–1022, 1024–1026, 2001.
12. Yang M, et al., Whole-body optical imaging of green fluorescent protein-expressing tumors and metastases, Proc Natl Acad Sci USA 97: 1206–1211, 2000.
13. Moore A, Sergeyev N, Bredow S, et al., Model system to quantitate tumor burden in locoregional lymph nodes during cancer spread, Invasion Metastasis 18: 192–197, 1998.
14. Wunderbaldinger P, Josephson L, Bremer C, et al., Detection of lymph node metastases by contrast-enhanced MRI in an experimental model, Magn Reson Med 47: 292–297, 2002.
15. Das BB, Liu F, Alfano RR, Time-resolved fluorescence and photon migration studies in biomedical and random media, Rep Prog Phys 60: 227, 1997.
16. Lakowicz JR, Principles of Fluorescence Spectroscopy, 2nd edn., Kluwer Academic, New York, 1999.
17. Schneckenburger H, Steiner R, Strauss W, et al., Fluorescence technolo-
gies in biomedical diagnostics, in Tuchin VV (ed.), Optical Biomedical
Diagnostics, SPIE Press, Bellingham, WA, 2002.
18. Sinichkin Yu P, Kollias N, Zonios G, et al., Reflectance and fluorescence
spectroscopy of human skin in vivo, in Tuchin VV (ed.), Handbook of
Optical Biomedical Diagnostics, SPIE Press, Bellingham, WA, 2002.
19. Sterenborg HJ, Motamedi M, Wagner RF, et al., In vivo fluorescence
spectroscopy and imaging of human skin tumors, Lasers Med Sci 9:
344, 1994.
20. Zeng H, MacAulay C, McLean DI, et al., Spectroscopic and microscopic characteristics of human skin autofluorescence emission, Photochem Photobiol 61: 645, 1995.
21. Schneckenburger H, Steiner R, Strauss W, et al., Fluorescence technolo-
gies in biomedical diagnostics, in Tuchin VV (ed.), Optical Biomedical
Diagnostics, SPIE Press, Bellingham, WA, 2002.
22. Tung CH, Fluorescent peptide probes for in vivo diagnostic imaging,
Biopolymers 76: 391–403, 2004.
23. Zaheer A, Lenkinski RE, Mahmood A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
24. Weissleder R, Tung CH, Mahmood U, et al., In vivo imaging of tumors
with protease-activated near-infrared fluorescent probes, Nat Biotech-
nol 17: 375–378, 1999.
25. Tung CH, Fluorescent peptide probes for in vivo diagnostic imaging,
Biopolymers 76: 391–403, 2004.
26. Ballou B, et al., Tumor labeling in vivo using cyanine-conjugated mon-
oclonal antibodies, Cancer Immunol Immunother 41: 257–263, 1995.
27. Neri D, et al., Targeting by affinity-matured recombinant antibody fragments of an angiogenesis-associated fibronectin isoform, Nat Biotechnol 15: 1271–1275, 1997.
28. Muguruma N, et al., Antibodies labeled with fluorescence-agent excitable by infrared rays, J Gastroenterol 33: 467–471, 1998.
29. Folli S, et al., Antibody-indocyanine conjugates for immunophotodetection of human squamous cell carcinoma in nude mice, Cancer Res 54: 2643–2649, 1994.
30. Zaheer A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
31. Tung CH, Mahmood U, Bredow S, et al., In vivo imaging of proteolytic
enzyme activity using a novel molecular reporter, Cancer Res 60: 4953–
4958, 2000.
32. Bogdanov AA Jr, Lin CP, Simonova M, et al., Cellular activation of the self-quenched fluorescent reporter probe in tumor microenvironment, Neoplasia 4: 228–236, 2002.
33. Funovics M, Weissleder R, Tung CH, Protease sensors for bioimaging,
Anal Bioanal Chem 377: 956–963, 2003.
34. Phair RD, Misteli T, Kinetic modeling approaches to in vivo imaging,
Nat Rev Mol Cell Biol 2: 898–907, 2003.
35. Ke S, Wen XX, Gurfinkel M, et al., Near infrared optical imaging of
epidermal growth factor receptor in breast cancer xenografts, Cancer
Res 63: 7870–7875, 2003.
36. Zaheer A, Lenkinski RE, Mahmood A, et al., In vivo near-infrared fluorescence imaging of osteoblastic activity, Nat Biotechnol 19: 1148–1154, 2001.
37. Weissleder R, Tung CH, Mahmood U, et al., In vivo imaging of tumors
with protease-activated near-infrared fluorescent probes, Nat Biotech-
nol 17: 375–378, 1999.
38. Wunder A, Tung CH, Müller-Ladner U, et al., In vivo imaging of protease activity in arthritis: A novel approach for monitoring treatment response, Arthritis Rheum 50: 2459–2465, 2004.
39. Mahmood U, Tung C, Bogdanov A, et al., Near infrared optical imaging system to detect tumor protease activity, Radiology 213: 866–870, 1999.
40. Yang M, Baranov E, Jiang P, et al., Whole-body optical imaging of green fluorescent protein-expressing tumors and metastases, Proc Natl Acad Sci USA 97: 1206–1211, 2000.
41. Ito S, et al., Detection of human gastric cancer in resected speci-
mens using a novel infrared fluorescent anti-human carcinoembry-
onic antigen antibody with an infrared fluorescence endoscope in vitro,
Endoscopy 33: 849–853, 2001.
42. Marten K, et al., Detection of dysplastic intestinal adenomas using
enzyme-sensing molecular beacons in mice, Gastroenterology 122: 406–
414, 2002.
43. Zonios G, Bykowski J, Kollias N, Skin melanin, hemoglobin, and light scattering properties can be quantitatively assessed in vivo using diffuse reflectance spectroscopy, J Invest Dermatol 117: 1452–1457, 2001.
44. Kuroiwa T, Kajimoto Y, Ohta T, Development and clinical application of near-infrared surgical microscope; preliminary report, Minim Invasive Neurosurg 44: 240–242, 2001.
45. Richards-Kortum R, Sevick-Muraca E, Quantitative optical spectroscopy for tissue diagnosis, Annu Rev Phys Chem 47: 555–606, 1996.
46. Wang TD, et al., In vivo identification of colonic dysplasia using fluo-
rescence endoscopic imaging, Gastrointest Endosc 49: 447–455, 1999.
47. Mahmood U, Tung C, Bagdanov A Jr, et al., Near-infrared optical
imaging of protease activity for tumor detection, Radiology 213: 866–
870, 1999.
48. Ntziachristos V, Bremer C, Weissleder R, Fluorescence-mediated tomography resolves protease activity in vivo, Nat Med 8: 757–760, 2002.
49. Ntziachristos V, Weissleder R, Charge coupled device based scanner
for tomography of fluorescent near-infrared probes in turbid media,
Med Phys 29: 803–809, 2002.
50. Hebden JC, Wong KS, Time-resolved optical tomography, Appl Opt 32(4): 372–380, 1993.
51. Barbour RL, Graber HL, Chang JW, et al., MRI-guided optical tomography: Prospects and computation for a new imaging method, IEEE Comput Sci Eng 2(4): 63–77, 1995.
52. Pogue BW, Patterson MS, Jiang H, et al., Initial assessment of a simple
system for frequency-domain diffuse optical tomography, Physics in
Medicine and Biology 40(10): 1709–1729, 1995.
53. O'Leary MA, Boas DA, Chance B, et al., Experimental images of heterogeneous turbid media by frequency-domain diffusing-photon tomography, Optics Letters 20(5): 426–428, 1995.
54. Gonatas CP, Ishii M, Leigh JS, et al., Optical diffusion imaging using a
direct inversion method, Phys Rev E 52(4): 4361–4365, 1995.
55. Gibson AP, Hebden JC, Arridge SR, Recent advances in diffuse optical
imaging, Phys Med Biol 50: R1–R43, 2005.
56. Arridge SR, Optical tomography in medical imaging, Inverse Problems
15: R41–R93, 1999.
57. O’Leary MA, Boas DA, Chance B, et al., Experimental images of hetero-
geneous turbid media by frequency-domain diffusing-photon tomog-
raphy, Opt Lett 20: 426, 1995.
58. Yao Y, Wang Y, Pei Y, et al., Frequency-domain optical imaging of absorption and scattering distributions by a Born iterative method, J Opt Soc Am A 14: 325, 1997.
59. Gaudette RJ, Brooks DH, DiMarzio CA, et al., A comparison study of linear reconstruction techniques for diffuse optical tomography imaging of absorption coefficient, Phys Med Biol 45: 1051, 2000.
60. Pogue B, McBride T, Prewitt J, et al., Spatially variant regularization
improves diffuse optical tomography, Appl Opt 38: 2950, 1999.
61. Ye JC, Webb KJ, Millane RP, et al., Modified distorted Born iterative method with an approximate Frechet derivative for optical diffusion tomography, J Opt Soc Am A 16: 1814, 1999.
62. Hielscher AH, Klose AD, Hanson KM, Gradient-based iterative image reconstruction scheme for time-resolved optical tomography, IEEE Trans Med Imag 18: 262, 1999.
63. Bluestone AY, Abdoulaev G, Schmitz CH, et al., Three-dimensional optical tomography of hemodynamics in the human head, Opt Express 9: 272, 2001.
64. Roy R, Sevick-Muraca EM, A numerical study of gradient-based nonlinear optimization methods for contrast-enhanced optical tomography, Opt Express 9: 49, 2001.
65. Soubret A, Ripoll J, Ntziachristos V, Accuracy of fluorescent tomogra-
phy in the presence of heterogeneities: Study of the normalized Born
ratio, IEEE Trans Med Imaging 24(10): 1377–1386, 2005.
66. Ntziachristos V, Weissleder R, Charge-coupled-device based scanner
for tomography of fluorescent near-infrared probes in turbid media,
Med Phys 29(5): 803–809, 2002.
67. Ntziachristos V, Tung CH, Bremer C, Weissleder R, et al., Fluorescence
molecular tomography resolves protease activity in vivo, Nat Med 8(7):
757–760, 2002.
68. Ntziachristos V, Bremer C, Tung C, et al., Imaging cathepsin B up-
regulation in HT-1080 tumor models using fluorescence-mediated
molecular tomography (FMT), Acad Radiol 9: S323–S325, 2002.
69. Graves EE, Ripoll J, Weissleder R, et al., A submillimeter resolution fluorescence molecular imaging system for small animal imaging, Med Phys 30(5): 901–911, 2003.
70. Graves EE, Weissleder R, Ntziachristos V, Fluorescence molecular imaging of small animal tumor models, Current Molecular Medicine 4(4): 419–430, 2004.
71. Ntziachristos V, Schellenberger EA, Ripoll J, et al., Visualization of antitumor treatment by means of fluorescence molecular tomography with an annexin V-Cy5.5 conjugate, Proc Natl Acad Sci USA 101(33): 12294–12299, 2004.
72. Culver JP, Choe R, Holboke MJ, et al., Three-dimensional diffuse optical
tomography in the parallel plane transmission geometry: Evaluation
of a hybrid frequency domain/continuous wave clinical system for
breast imaging, Med Phys 30(2): 235–247, 2003.
73. Schulz RB, Ripoll J, Ntziachristos V, Experimental fluorescence tomog-
raphy of tissues with noncontact measurements, IEEE Transactions on
Medical Imaging 23(4): 492–500, 2004.
74. Godavarty A, Eppstein MJ, Zhang CY, et al., Fluorescence-enhanced
optical imaging in large tissue volumes using a gain-modulated ICCD
camera, Physics in Medicine and Biology 48(12): 1701–1720, 2003.
75. Ntziachristos V, Weissleder R, Experimental three-dimensional fluo-
rescence reconstruction of diffuse media by use of a normalized Born
approximation, Optics Letters 26(12): 893–895, 2001.
76. O'Leary MA, Boas DA, Li XD, et al., Fluorescence lifetime imaging in turbid media, Optics Letters 21(2): 158–160, 1996.
77. Patwardhan SV, et al., Time-dependent whole-body fluorescence
tomography of probe bio-distributions in mice, Optics Express 13(7):
2564–2577, 2005.
78. Richards G, Soubrane G, Yanuzzi L, Fluorescein and ICG Angiography,
Thieme, Stuttgart, Germany, 1998.
79. Flanagan JH Jr, Khan S, Menchen S, et al., Functionalized tricarbocya-
nine dyes as near-infrared fluorescent probes for biomolecules, Biocon-
jug Chem 8: 751, 1997.
80. Caesar J, Shaldon S, Chiandussi L, et al., The use of indocyanine green in the measurement of hepatic blood flow and as a test of hepatic function, Clin Sci 21: 43, 1961.
81. Gurfinkel M, Thompson AB, Ralston W, et al., Pharmacokinetics of ICG and HPPH-car for the detection of normal and tumor tissue using fluorescence, near-infrared reflectance imaging: A case study, Photochem Photobiol 72: 94, 2000.
82. Licha K, Riefke B, Ntziachristos V, et al., Hydrophilic cyanine
dyes as contrast agents for near-infrared tumor imaging: Synthesis,
photo-physical properties and spectroscopic in vivo characterization,
Photochem Photobiol 72: 392, 2000.
83. Ntziachristos V, Yodh AG, Schnall M, et al., Concurrent MRI and diffuse optical tomography of breast after indocyanine green enhancement, Proc Natl Acad Sci USA 97: 2767, 2000.
CHAPTER 14
Tracking Endocardium Using Optical
Flow along Iso-Value Curve
Qi Duan, Elsa Angelini, Shunichi Homma
and Andrew Laine
In cardiac image analysis, optical flow techniques are widely used to track ventricular borders as well as to estimate myocardial motion fields. The optical flow computation is typically performed in Cartesian coordinates and is not constrained by a priori knowledge of normal myocardium deformation patterns. However, for cardiac motion analysis, displacements along specific directions and their derivatives are usually more interesting than the 2D or 3D displacement fields themselves. In this context, we propose two general frameworks for optical flow estimation along iso-value curves. We applied the proposed frameworks in several specific applications: endocardium tracking on cine cardiac MRI series and real-time 3D ultrasound, and thickening computation in 2D ultrasound images. The endocardial surfaces tracked with the proposed algorithm were quantitatively compared with manual tracing at each frame. The proposed method was also compared to the traditional Lucas-Kanade optical flow method directly applied to MRI image data in Cartesian coordinates, and to standard correlation-based optical flow estimation on real-time 3D echocardiography. Quantitative comparison showed an improvement in average tracking error or efficiency throughout the whole cardiac cycle.
14.1 INTRODUCTION
Cardiac imaging techniques, including echocardiography, cardiac
MRI, cardiac CT, and cardiac PET/SPECT, are widely used in clini-
cal screening and diagnosis examinations as well as in research for
in vivo studies. These imaging techniques provide structural and functional information. In most clinical studies, quantitative evaluation of cardiac function requires endocardial border segmentation throughout the whole cardiac cycle.
Recent advances in cardiac imaging technology have greatly improved the spatial and temporal resolution of acquired data, for example with real-time three-dimensional echocardiography1 and high temporal resolution MRI.2 However, as the information content becomes more detailed, the amount of data to be analyzed for one cardiac cycle also increases dramatically, making manual analysis of these data sets prohibitively labor-intensive in clinical diagnosis centers. In this context, many computer-aided methods have been developed to automate or semi-automate endocardial segmentation or tracking tasks throughout the whole cardiac cycle. These computer-based techniques can be divided into two classes: segmentation methods and motion tracking methods.
Today, cardiac image segmentation is a very active research area. Many techniques have been proposed, including active contours,3,4 level-set methods and deformable models,5–9 classification,10 active appearance models,11 and other methods.12 Optical flow algorithms for tracking the endocardial borders or other anatomical landmarks throughout whole sequences have been studied in several recent works.13–18 Optical-flow based tracking techniques offer the possibility of computing the myocardium motion field. Usually, these methods require initialization of the tracked points, either by manual tracing or with other segmentation techniques (as a preprocessing step).
However, in cardiac motion analysis, displacements along specific directions are usually better indicators of wall motion abnormality. In this context, we propose a general framework for optical flow estimation along iso-value curves. An additional constraint related to a specific motion direction is incorporated into the original optical flow system of equations to properly constrain the problem. A least-squares fitting method is applied to small neighborhoods around each point of interest to increase the robustness of the method. The proposed method was then applied to endocardium tracking, and results were quantitatively compared with those obtained by manual tracing as well as tracking with the original Lucas-Kanade optical flow method.19
14.2 MATHEMATICAL ANALYSIS
14.2.1 Optical Flow Constraint Equation
Optical flow (OF) tracking refers to the computation of the displacement field of objects in an image, based on the assumption that the intensity of the object remains constant. This notion was first proposed by Horn20 and drove the active area of motion analysis in the 1990s. Barron et al.21 wrote an extensive survey of the major optical-flow techniques at that time and drew the conclusion that the Lucas-Kanade and the Fleet-Jepson methods were the most reliable among the nine techniques they implemented and tested on several image motion sequences.

Assuming the intensity at time frame t of the image point (x, y) is I(x, y, t), with u(x, y) and v(x, y) being the corresponding x and y components of the optical flow vector at that point, it is assumed that the image intensity will remain constant at point (x + dx, y + dy) at time t + dt, where dx = u dt and dy = v dt are the actual displacements of the point during the time period dt, leading to the following equation:

$$I(x + dx, y + dy, t + dt) = I(x, y, t) \qquad (1)$$

If the image intensity is smooth with respect to x, y, and t, the left-hand side of Eq. (1) can be expanded into a Taylor series.20 Simplifications, as detailed in Ref. 20, performed by ignoring the higher order terms and taking limits as dt → 0, lead to the following equation:
$$\frac{\partial I}{\partial x}\frac{dx}{dt} + \frac{\partial I}{\partial y}\frac{dy}{dt} + \frac{\partial I}{\partial t} = 0 \qquad (2)$$
Using the notations:

$$u = \frac{dx}{dt}, \quad v = \frac{dy}{dt}, \quad I_x = \frac{\partial I}{\partial x}, \quad I_y = \frac{\partial I}{\partial y}, \quad I_t = \frac{\partial I}{\partial t}, \qquad (3)$$
Eq. (2) can be simplified as:

$$I_x u + I_y v + I_t = 0, \qquad (4)$$
Eq. (4) is called the optical flow constraint equation, as it expresses a constraint on the components u and v of the optical flow. This system is under-constrained, and with this equation alone the optical flow problem cannot be uniquely solved. All gradient-based optical flow methods add further constraints to make the system sufficiently constrained or even over-constrained. For example, the Lucas-Kanade method19 solves Eq. (2) through a weighted least-squares fit in each small spatial neighborhood Ω, by minimizing the following energy under the assumption of constant motion within the neighborhood:

$$\sum_{(x,y)\in\Omega} W^2(x, y)\,[I_x u + I_y v + I_t]^2 \qquad (5)$$

where W(x, y) denotes a window function applied to the neighborhood. The solution to Eq. (5) is given by the following linear system:
$$A^T W^2 A \begin{bmatrix} u \\ v \end{bmatrix} = A^T W^2 b \qquad (6)$$
where, for n points in the neighborhood at a single time t,

$$A = \begin{bmatrix} I_{x_1} & \cdots & I_{x_n} \\ I_{y_1} & \cdots & I_{y_n} \end{bmatrix}^T, \quad W = \mathrm{diag}\big(W(x_1, y_1), \ldots, W(x_n, y_n)\big), \quad b = -\begin{bmatrix} I_t(x_1, y_1) \\ \vdots \\ I_t(x_n, y_n) \end{bmatrix}. \qquad (7)$$
The system described in Eq. (6) can be solved by matrix inversion when the 2-by-2 matrix A^T W^2 A is non-singular. The intrinsic least-squares fitting property increases the robustness of the optical flow estimation for the Lucas-Kanade method.
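A minimal NumPy sketch of this per-neighborhood solve (an illustration, not the authors' implementation) assembles A and b from precomputed image gradients in a small window around an interior point and solves Eq. (6); a uniform window W is assumed here, though a Gaussian window is also common.

```python
import numpy as np

def lucas_kanade_at(Ix, Iy, It, xc, yc, half=2):
    """Solve Eq. (6) in a (2*half+1)^2 window centered at (xc, yc).

    Ix, Iy, It : spatial and temporal image gradients (2D arrays).
    Assumes (xc, yc) is far enough from the image border and that the
    2x2 matrix A^T W^2 A is non-singular.
    """
    ys, xs = np.mgrid[yc - half:yc + half + 1, xc - half:xc + half + 1]
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)  # n x 2
    b = -It[ys, xs].ravel()
    lhs = A.T @ A          # A^T W^2 A with W = identity (uniform window)
    rhs = A.T @ b
    return np.linalg.solve(lhs, rhs)   # the flow vector (u, v)
```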
14.2.2 Optical Flow along Iso-Value Curves
In cardiac motion analysis, motion along some iso-value curves is usually more interesting than the full 2D or 3D displacement itself. In both cardiac biomechanics22 and cardiac imaging analysis, such as Ref. 23, 2D or 3D displacement vectors are usually decomposed into radial and circumferential displacement components. These components and their derivatives (strains) are usually good indicators of ventricular abnormalities. For example, myocardium thickening, computed via radial derivatives of radial displacements, is the best indicator of ischemia according to a recent biomechanics study.24 With the correct choice of coordinate system, such as polar coordinates25 in 2D and cylindrical coordinates23 in 3D, displacements along some directions (e.g. along radial directions) can be mathematically formulated as motion along iso-value curves (e.g. θ = const). In this context, investigating optical flow along iso-value curves becomes important.
Given a time-varying N-dimensional image series $I(\vec{X}, t)$, where $\vec{X} = [x_1, \ldots, x_N]^T$ are the spatial coordinates and t is the temporal dimension, the constant intensity constraint is

$$I(\vec{X}, t) = I(\vec{X} + \vec{dX}, t + dt) \qquad (8)$$

where $\vec{dX}$ is the N-D displacement vector within the time period dt for the pixel located at $\vec{X}$ at time t. Using a Taylor series expansion and omitting higher order terms, we have

$$\nabla I(\vec{X}, t) \cdot \vec{dX} + \frac{\partial I(\vec{X}, t)}{\partial t}\,dt = 0 \qquad (9)$$

where $\nabla I(\vec{X}, t) = \big[ \partial I/\partial x_1, \ldots, \partial I/\partial x_N \big]^T$ is the image spatial gradient vector and "·" represents the vector dot product.

By defining the velocity vector (i.e. the optical flow vector) as $\vec{v} = d\vec{X}/dt = [dx_1/dt, \ldots, dx_N/dt]^T$, the optical flow constraint equation for an N-dimensional time series can be derived as

$$\nabla I(\vec{X}, t) \cdot \vec{v} + \frac{\partial I(\vec{X}, t)}{\partial t} = 0 \qquad (10)$$

by taking limits as dt → 0.
Assume the optical flow estimation is performed along iso-value curves $G(\vec{X}, \vec{dX}) = \text{const}$. Note that in N-dimensional space more than one equation may be needed to represent iso-value curves or hyper-surfaces, so G could be a vector of functions and const a constant vector of the same length as G. By letting $F(\vec{X}, \vec{dX}) = G(\vec{X}, \vec{dX}) - \text{const}$, the problem can be converted into an optical flow estimation along the zero-value curve(s) $F(\vec{X}, \vec{dX}) = 0$ (F could be a vector for the same reason as G). Thus, for a point $\vec{X}$, two general constraints are imposed on the optical flow vector $\vec{v}$:

$$\begin{cases} \nabla I(\vec{X}, t) \cdot \vec{v} + \dfrac{\partial I(\vec{X}, t)}{\partial t} = 0 \\ F(\vec{X}, \vec{dX}) = 0 \end{cases} \qquad (11)$$
There are many ways to solve the system described by Eq. (11). Here, we propose a framework that solves this system via energy minimization, since this framework can easily be extended to image spaces of different dimensionalities, can easily incorporate neighborhood information, and can easily accommodate additional constraints.
One straightforward way to solve the optical flow along iso-value curves, as in Eq. (11), is to follow the rationale of the Lucas-Kanade method. To increase the robustness of the optical flow estimation, for each point $\vec{X}_c$ the final optical flow estimate is obtained by minimizing the energy defined in Eq. (12), in the least-squares fitting sense, over an n-point neighborhood Ω centered at $\vec{X}_c$, assuming a constant motion within the neighborhood:

$$\vec{v} = \arg\min_{\vec{v}} E_1 = \arg\min_{\vec{v}} (E_{OF} + E_{ISO}) = \arg\min_{\vec{v}} \Big( \big\| W(\vec{X}) \cdot OF(\vec{X}) \big\|^2 \big|_{\vec{X} \in \Omega} + \big\| F(\vec{X}_c, \vec{dX}_c) \big\|^2 \Big), \qquad (12)$$
where $\|\cdot\|$ represents the $l_2$-norm, and the weighting vector $W(\vec{X})$ and optical flow constraint vector $OF(\vec{X})$ are defined as follows in the neighborhood Ω:

$$W(\vec{X}) = \big[ W(\vec{X}_1), \ldots, W(\vec{X}_n) \big]^T, \quad OF(\vec{X}) = \begin{bmatrix} \nabla I(\vec{X}_1, t) \cdot \vec{v} + \partial I(\vec{X}_1, t)/\partial t \\ \vdots \\ \nabla I(\vec{X}_n, t) \cdot \vec{v} + \partial I(\vec{X}_n, t)/\partial t \end{bmatrix} \quad \text{given } (\vec{X}_1, \ldots, \vec{X}_n) \in \Omega. \qquad (13)$$
In general, solving the energy minimization problem in Eq. (12) is not trivial, depending upon the nonlinearity of the function $F(\vec{X}, \vec{dX})$.

One important feature of the proposed framework in Eq. (12) is that everything is formulated in the original coordinate system of the input image series. There is no need to resample the image data to another coordinate system, e.g. polar coordinates, as is usually done in motion analysis or segmentation along one direction, such as in Ref. 26. The main advantage of the proposed framework over these image-resampling frameworks is precisely that it avoids resampling, which is a relatively expensive step, especially for 3D image volumes, and which may introduce artifacts depending upon the interpolation scheme used. This saves considerable computational power when dealing with higher dimensional image series.

It should also be pointed out that Eq. (12) is not the only way to formulate the optical flow along iso-value curves. Another framework with an identical optimum solution in the ideal case will be proposed for the real-time 3D ultrasound application, in a constrained energy minimization fashion.
In the following section, the proposed framework is applied to different applications. Specific zero-value curve(s) $F(\vec{X}, \vec{dX})$ are derived, and the corresponding instances of Eq. (12) or other energy minimization schemes are derived as well. The tracking results are quantitatively compared to the results derived from manual tracing through an area-based index or finite-element model based contour/surface comparison.
14.3 METHODS AND RESULTS
14.3.1 Example I: Tracking Radial Displacements of the
Endocardium in 2D Cardiac MRI Series
A direct application of the proposed framework was tested by tracking the endocardium motion along radial displacements. Previous work on tracking endocardial borders using optical flow, such as Ref. 27, usually applied the optical flow algorithm directly to the Cartesian image data without additional constraints on the motion direction. Since radial displacements and their derivatives are the most interesting components of endocardial motion, we focused on OF radial displacement computation only.
14.3.1.1 Mathematical analysis
Usually in 2D cardiac images, a polar coordinate system is used to decompose the endocardium displacement field into radial and circumferential directions. We followed the same coordinate system convention. The center of the polar coordinate system cannot simply be the centroid of the blood pool because of the well-known "floating centroid" problem in cardiac biomechanics.28 Following the proper ventricle axis selection protocol described in Ref. 28, the long axis of the left ventricle was first selected, and the center of the polar coordinate system was then set as the intersection of the LV long axis and the imaging plane. In this coordinate system, radial displacements can be defined as displacements along the iso-value lines θ = const. The corresponding zero-value function $F(\vec{X}, \vec{dX}) = f(x_c, y_c, u, v) = 0$, expressing the fact that the point $(x_c, y_c)$ and its motion vector $(u, v)$ lie along the line θ = const, is given by:

$$\begin{cases} x_c \sin\theta - y_c \cos\theta = 0 \\ u \sin\theta - v \cos\theta = 0, \end{cases} \qquad (14)$$
which can be simplified into

$$f(x_c, y_c, u, v) = y_c u - x_c v = 0. \qquad (15)$$
So the total energy associated with the optical flow along the zero-value curve is:

$$E_1 = \sum_{(x,y)\in\Omega} W^2(x, y)\,[I_x u + I_y v + I_t]^2 + f^2(x_c, y_c, u, v). \qquad (16)$$
Similar to the original Lucas-Kanade method, the energy minimization problem described by Eq. (16) can be solved by least-squares fitting of the following equivalent over-constrained system:

$$\begin{bmatrix} \sum W^2 I_x^2 & \sum W^2 I_x I_y \\ \sum W^2 I_x I_y & \sum W^2 I_y^2 \\ y_c & -x_c \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} A^T W^2 b \\ 0 \end{bmatrix}, \qquad (17)$$
where W, A, and b are defined in Eq. (7).
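As a sketch of this constrained solve (reusing the conventions of the Lucas-Kanade sketch in Sec. 14.2.1, with a uniform window and hypothetical argument names), the radial constraint of Eq. (15) is stacked as a third row and the over-constrained system of Eq. (17) is solved in the least-squares sense:

```python
import numpy as np

def radial_lk_at(Ix, Iy, It, xc, yc, center, half=2):
    """Least-squares solve of the over-constrained system in Eq. (17).

    center : (x0, y0), intersection of the LV long axis with the image
             plane; the constraint row [y_c, -x_c] uses coordinates
             relative to this polar-coordinate center, per Eq. (15).
    """
    ys, xs = np.mgrid[yc - half:yc + half + 1, xc - half:xc + half + 1]
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)
    b = -It[ys, xs].ravel()
    x0, y0 = center
    M = np.vstack([A.T @ A, [yc - y0, -(xc - x0)]])   # 3 x 2 system matrix
    rhs = np.append(A.T @ b, 0.0)
    return np.linalg.lstsq(M, rhs, rcond=None)[0]     # the flow vector (u, v)
```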
14.3.1.2 Data and evaluation methods
The endocardial border tracking scheme developed in the previous
section was tested on two cardiac MRI protocols:
• A 2D cardiac MRI series with ECG gating acquired on a GE 1.5T system using the FIESTA protocol for 2D short axis stacks, from an IRB-approved experiment of LAD occlusion in sheep hearts. This protocol, also called SSFP by other vendors, generates clear anatomical images of the heart and is therefore widely used in cardiac MRI. This data set was selected to test the performance of the optical flow on clear images with standard temporal resolution in cardiac MRI.
• A 2D cardiac MRI series with ECG gating acquired on a Siemens 1.5T system using a novel high-temporal-resolution Phase Train Imaging (PTI) protocol proposed by Pai et al.2 for 2D short axis stacks, from a volunteer heart. This novel high-speed protocol provides 2 ms temporal resolution on average and about four hundred frames per cardiac cycle. The image quality is worse than with the FIESTA or SSFP protocol. This data set was selected to test the performance of the optical flow on high-speed, low-quality image series, and also its robustness in long-term tracking.
Endocardial border points for each time frame of the FIESTA data, and for the last frame of the PTI data, were traced by an experienced expert. The optical flow algorithm was initialized with the manually traced points on the first frame (end-diastole) and then automatically run to track those points throughout the whole cardiac cycle (20 frames in total for the FIESTA data and 412 frames for the PTI data). Two error measurements were used to evaluate the performance of the optical flow: (1) the Tanimoto index TI = TP/(1 + FP) = |Seg_1 ∩ Seg_2| / |Seg_1 ∪ Seg_2|,29 which is widely used in the comparison of segmentation results; and (2) relative errors in radial coordinates. A 24-element finite element model was used to fit the manually traced points or the optical flow tracked points for each frame of interest. The relative errors in radial coordinates of each element were then computed, with their mean serving as a performance indicator for each frame. The original Lucas-Kanade optical flow method, without the iso-value curve constraint, was also implemented and applied to the same data as a comparison method for endocardium tracking.
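For reference, the Tanimoto index is straightforward to compute from two binary segmentation masks (a sketch, assuming the traced and tracked contours have been rasterized into non-empty boolean arrays):

```python
import numpy as np

def tanimoto_index(seg1, seg2):
    """TI = |seg1 AND seg2| / |seg1 OR seg2| for non-empty boolean masks."""
    inter = np.logical_and(seg1, seg2).sum()
    union = np.logical_or(seg1, seg2).sum()
    return inter / union
```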
14.3.1.3 Results
On the FIESTA data, radial lengths of the endocardial border points tracked by our method at end-diastole (ED) and end-systole (ES) are plotted in Fig. 1(a). When compared to the endocardium obtained by manual tracing, our proposed method has a TI value of 74.62% ± 8.54%, compared with 72.06% ± 9.13% for the original Lucas-Kanade method. These results show that our proposed method is more accurate and robust than the original Lucas-Kanade method. Example tracking results at frame 10 are shown in Fig. 1(b), showing that our method is less likely to fail than the original method. A similar conclusion can be drawn from
Fig. 1. On FIESTA sheep data: (a) radial length of endocardium points at ED (solid line) and ES (dashed line) versus θ (degrees); (b) tracking result at frame 10 with the proposed method (center of red circle) and the Lucas-Kanade method (center of green cross); (c-d) relative radial coordinate errors versus frame number: (c) relative error and (d) standard deviation. In (c-d), solid line: proposed method; dashed line: Lucas-Kanade method.
the comparison of the relative errors in radial coordinates plotted in Figs. 1(c) and 1(d), for which the proposed method has lower average errors as well as lower standard deviations. The additional constraint of the OF motion along iso-value curves improved the robustness and accuracy of endocardium tracking.

Error accumulation over consecutive frames in the OF estimation can be noticed in the plots, which suggests that applying forward and backward tracking or adding more reference points may improve the performance of the OF estimation.
Fig. 2. Tracking result at frame 412 with proposed method (red circle) and Lucas-
Kanade method (green cross) on the high-speed PTI data.
On the PTI data, after tracking the endocardium through the
whole cardiac cycle, the TI values at the last frame are 85.40% for
our method and 63.70% for the original Lucas-Kanade method.
The relative errors are 7.10%±9.52% for our method and 96.49%±
126.91% for the original Lucas-Kanade method. Tracking results for
the last frame (the 412th frame) are shown in Fig. 2, which shows
that the additional constraint derived from the iso-value curve
increases the robustness of our method for high temporal resolution
tracking.
14.3.2 Example II: Tracking the Endocardium in Real-Time
3D Ultrasound
Development of real-time 3D (RT3D) echocardiography started in the late 1990s,30 based on matrix phased-array transducers. Recently, a new generation of RT3D transducers was introduced by Philips Medical Systems (Best, The Netherlands) with the SONOS 7500 transducer, followed by the iE33, which can acquire a fully sampled cardiac volume in four cardiac cycles. This technical design enabled a dramatic increase in spatial resolution and image quality, which makes such 3D ultrasound techniques increasingly attractive for daily cardiac clinical diagnosis. Since RT3D ultrasound acquires volumetric ultrasound sequences with fairly high temporal resolution and a stationary transducer, it can capture the complex 3D cardiac motion very well. Advantages of using three-dimensional ultrasound in cardiology include the possibility of displaying a three-dimensional dynamic view of the beating heart, and the ability for the cardiologist to explore the three-dimensional anatomy at arbitrary angles, to localize abnormal structures and to assess wall deformation. Over the past decade, this technology has been shown to provide more accurate and reproducible screening for quantification of cardiac function, for two main reasons: the absence of geometrical assumptions about ventricular shape, and the accuracy of the visualization planes for performing ventricular volume measurements. It was validated through several clinical studies for quantification of LV function, as reviewed in Ref. 31 and in Ref. 5. The development of computer-aided tools for RT3D ultrasound is relatively limited compared with the development of image processing techniques for other modalities. Early studies17 used simple simulated phantoms, while recent research32 used 3D ultrasound data sequences for LV volume estimation. In Ref. 27, we proposed a framework based on correlation-based optical flow estimation to track the endocardium; the result was quantitatively validated against manual tracing. In a recent study,33 3D speckle tracking techniques, which are similar to our method in Ref. 27, were tested mainly on simulated data. All the tracking in these previous studies was performed directly in 3D Cartesian coordinates. However, for the purpose of tracking the endocardium, the problem can be reformulated as an optical flow along iso-value curve problem, which is much more efficient with comparable tracking results. An example frame of RT3D ultrasound is shown in Fig. 3 in the Philips QLAB interface.
14.3.2.1 Mathematical analysis
Fig. 3. Example frame of RT3D ultrasound at ED for a patient with a transplanted heart: (a) axial, (b) elevation and (c) azimuth views.

As mentioned in the previous section, Eq. (12) is not the only way to formulate the optical flow along iso-value curves. An equivalent framework to Eq. (12) can be formulated through constrained energy minimization:

$$\begin{cases} \vec{v} = \arg\min_{\vec{v}} E_{OF} = \big\| W(\vec{X}) \cdot OF(\vec{X}) \big\|^2 \big|_{\vec{X} \in \Omega} \\ F(\vec{X}_c, \vec{dX}_c) = 0. \end{cases} \qquad (18)$$
Equation (18) is an equivalent system to Eq. (12), in the sense that both systems have the same optimum solution in the ideal case, i.e. when the minimum value of E_OF is zero.
In order to show that our framework is not limited to gradient
based optical flow framework, in this example, we will derive an
energy term that is equivalent to correlation based optical flow as
used in Ref. 27. Since maximizing correlation coefficient is equiva-
lent to minimizing the sum squared difference between two neigh-
borhoods, the optical flow energy E
OF
can be simply defined with
this error energy. To properly define “radial displacement,” a prolate
January 22, 2008 12:3 WSPC/SPI-B540:Principles and Recent Advances ch14 FA
Tracking Endocardium Using Optical Flow along Iso-Value Curve
351
spheroidal coordinate system $(\lambda, \mu, \theta)$ with focus $d$ was established as
described in Refs. 27 and 34. So for each point $\vec{X}_c$ with an $n$-point
neighborhood $\Omega$ centered at $\vec{X}_c$, the tracking problem can be formulated as:
\[
\begin{cases}
\vec{v} = \arg\min_{\vec{v}} E_{OF} = \displaystyle\sum_{\vec{X} \in \Omega} \left[ I(\vec{X}, t) - I(\vec{X} + \vec{v}\,dt,\, t + dt) \right]^2 \\
F(\vec{X}_c, \vec{dX}_c) = \begin{bmatrix} \mu_c^{t+dt} - \mu_c^{t} \\ \theta_c^{t+dt} - \theta_c^{t} \end{bmatrix} = 0,
\end{cases} \tag{19}
\]
where
\[
\vec{X} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} d \sinh\lambda \,\sin\mu \,\cos\theta \\ d \sinh\lambda \,\sin\mu \,\sin\theta \\ d \cosh\lambda \,\cos\mu \end{bmatrix}. \tag{20}
\]
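As a concrete illustration of Eq. (20), the minimal sketch below converts prolate spheroidal coordinates to Cartesian coordinates; the function name and the choice of NumPy are our own assumptions, not the authors' implementation.

```python
import numpy as np

def prolate_to_cartesian(lam, mu, theta, d):
    """Map prolate spheroidal coordinates (lambda, mu, theta) with
    focus d to Cartesian (x, y, z), following Eq. (20)."""
    x = d * np.sinh(lam) * np.sin(mu) * np.cos(theta)
    y = d * np.sinh(lam) * np.sin(mu) * np.sin(theta)
    z = d * np.cosh(lam) * np.cos(mu)
    return np.array([x, y, z])

# Example: a point at lambda=0.5, mu=pi/3, theta=pi/4 with a 30 mm focus
print(prolate_to_cartesian(0.5, np.pi / 3, np.pi / 4, 30.0))
```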
14.3.2.2 Data and evaluation method
The tracking approach was tested on one data set acquired with a
SONOS 7500 3D ultrasound machine (Philips Medical Systems, Best,
The Netherlands): one transthoracic clinical data set was acquired
from a heart transplant patient. The spatial resolution of the analyzed
data was (0.8 mm$^3$) and 16 frames were acquired for one cardiac
cycle. The endocardial surfaces were manually traced by one experienced
expert for every frame between end-diastole and end-systole.
The optical flow algorithms were initialized using the manual
tracing at the ED and ES frames. The endocardial surfaces in between
were generated by averaging the results from forward and backward
tracking by both methods. The manual tracing for each frame was
used as the gold standard for surface comparison.
We evaluated OF tracking performance via visualization and
quantification of dynamic ventricular geometry compared to segmented
surfaces. Usually, comparison of segmentation results is
performed via global measurements such as volume difference or
mean-squared error. In order to provide local comparison, we proposed
a novel comparison method in Ref. 35 based on a parameterization
of the endocardial surface in prolate spheroidal coordinates (Ref. 36)
and previously used for comparison of ventricular geometries from
two 3D ultrasound machines in Ref. 37. The endocardial surfaces
were registered using three manually selected anatomical landmarks:
the center of the mitral orifice, the endocardial apex, and the
equatorial mid-septum. The data was fitted in prolate spheroidal
coordinates $(\lambda, \mu, \theta)$, projecting the radial coordinate $\lambda$ onto a 64-element
surface mesh with bicubic Hermite interpolation, yielding a realistic
3D endocardial surface. The fitting process was performed using
the custom finite element package Continuity 5.5 developed at the
University of California San Diego (http://cmrg.ucsd.edu). The fitted
nodal values and spatial derivatives of the radial coordinate, $\lambda$,
were then used to map relative differences between two surfaces,
$\varepsilon = (\lambda_{seg} - \lambda_{OF})/\lambda_{seg}$, using custom software. A Hammer
mapping was used to flatten the endocardial surface via an area-preserving
mapping (Ref. 28), through which relative $\lambda$ difference maps
were generated for end-systole (ES), providing a direct quantitative
comparison of ventricular geometry. These maps are visualized with
iso-level lines, quantified in percentage values of radial difference.
The area under 10% difference is used as the criterion for quantitative
comparison.
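A minimal sketch of the relative radial-difference criterion is given below; the array names and NumPy-based implementation are our own assumptions, not the authors' custom software.

```python
import numpy as np

def relative_lambda_difference(lambda_seg, lambda_of):
    """Relative radial difference between a manually segmented surface
    and an optical-flow-tracked surface, both sampled on the same
    (mu, theta) parameterization grid of nodal lambda values."""
    eps = (lambda_seg - lambda_of) / lambda_seg
    # Percentage of samples differing by less than 10%, assuming an
    # approximately area-uniform sampling of the flattened surface
    area_under_10pct = np.mean(np.abs(eps) < 0.10) * 100.0
    return eps, area_under_10pct
```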
Average intra-observer variance and inter-observer variance
were also computed by a similar scheme, using two tracings
from a single user one month apart and two tracings from two
different users at the same time.
14.3.2.3 Results
The area percentages under 10% difference are plotted in Fig. 4(a).
The mean values are 69.66% ± 21.42% for the proposed method and
87.29% ± 10.38% for the direct tracking scheme. Example Hammer maps
from both methods at one frame are shown in Figs. 4(b) and 4(c),
respectively. The average intra- and inter-observer differences are
79.38% and 55.33%, respectively, in terms of the same surface comparison
criterion. Both methods are comparable to the inter-observer variance,
and direct tracking has better performance than the proposed
method.
Fig. 4. Optical flow tracking results on RT3D ultrasound: (a) area percentage under
10% difference from manual tracing for each frame generated by the proposed
method (blue) and the direct tracking method (green); average intra-observer variance
(red) and inter-observer variance (cyan) are also plotted for reference; (b) Hammer
map of the direct tracking result at frame 5; (c) Hammer map of the constrained tracking
result at frame 5.
From a computational cost point of view, the proposed method
used 5.3767 seconds on average to track the surface between
two frames, whereas the direct tracking method needed 112.78 seconds
on average for the same task, which amounts to about a 20-fold
saving in computational power for our method compared with the
direct tracking scheme. With performance comparable to the inter-user
difference and a much shorter computation time than the direct
tracking scheme, our method may be more suitable in clinical applications,
where the total analysis time is limited to 5–10 minutes for
each data set.
14.3.3 Example III: Thickening Computation
on 2D Ultrasound Slices
In the previous two applications, optical flow along iso-value curves
was used mainly as a tracking tool. In this example, we will show
that the displacement estimated in this framework can also be used
in motion analysis and strain computation.
14.3.3.1 Data and method
One basal short-axis cross-section view was extracted from the RT3D
clinical data used in the previous section. 2D versions of the optical flow
Fig. 5. Results on thickening computation: (a) example 2D slice; (b) segmental
average of thickening from the direct tracking scheme; (c) segmental average of
thickening from our method.
methods detailed in the previous section were also implemented. Both
optical flow methods were initialized at end-diastole with manual
tracing. Segmental average thickening from ED to ES was
computed.
14.3.3.2 Results
Segmental average results of the thickening computation from both
methods are shown in Fig. 5. Both methods generated similar results
and correctly indicated the reduced motion at the septum in the
original data set.
14.4 DISCUSSION
From the three examples we have shown, we can conclude that, with the
additional energy term or constraint from the iso-value curve, the
optical flow algorithm can either perform better at roughly the same
computational cost, or perform much more efficiently without
substantially degrading accuracy, especially for tracking tasks such as
tracking the endocardium in cardiac imaging. Radial displacements and
thickening estimates derived from the constrained scheme were similar
to those obtained by direct tracking.
The frameworks proposed in Eqs. (12) and (18) are generic.
They can be easily extended to higher dimensional spaces, and the
energy for optical flow estimation can be chosen differently from
the optical flow constraint equation. In fact, with a proper choice of
the optical flow energy term, the well-known intensity constancy
assumption can be loosened, which could increase the robustness
of the estimation. Moreover, in addition to the direct benefit from the
energy minimization framework, additional constraints, such as a
smoothness constraint, can be seamlessly incorporated by simply
adding weighted energy terms associated with these constraints.
This framework could also be merged with variational optical flow
approaches, such as the works in Refs. 38 and 39.
The proposed frameworks were formulated directly in the same
coordinate system as the input image, so no data resampling is
required, which reduces the overall computational cost and the
dependency of the accuracy on the interpolation methods. The
key points in these frameworks are to properly define the zero-value
curve function vector F and to properly minimize the energy. The
latter could be formulated as a non-linear problem for some
applications.
Although the ideal systems described by Eqs. (12) and (18) have
the same optimal solutions, the results on real image series from these
two frameworks may differ if the zero minimum of the optical
flow energy cannot be reached. In this case, the framework defined
by Eq. (18) will still give an optical flow displacement vector along the
iso-value curves, whereas the framework defined by Eq. (12) may
loosen this constraint to obtain an estimate with even lower energy,
which explains why the framework defined by Eq. (12) outperforms
its constrained counterpart defined by Eq. (18). Considering
computational cost, the framework defined by Eq. (12) slightly
increases the cost compared to the direct tracking method because of
the additional energy term; on the contrary, the framework defined by
Eq. (18) usually offers a huge saving in computational power due to
dimensionality reduction. So for tracking purposes, if accuracy is
more important than efficiency, we would suggest the use of the
unconstrained version as in Eq. (12). If efficiency is more important or
the displacement is required to strictly follow the iso-value curves, the
constrained version as in Eq. (18) would be a good choice.
The last thing that needs to be pointed out is that the displacements
estimated by the proposed frameworks cannot be used for
motion analysis along directions other than the given iso-value
curve. For example, if the displacement of the endocardium is estimated
by optical flow along the radial direction (θ = const) in 2D, this
estimation cannot be directly used to estimate the circumferential
displacement or cardiac twist. This is a limitation of the proposed
frameworks, since in some sense we are trading the universality of
free motion estimation for much better accuracy or efficiency in
motion estimation along specific iso-value curves. Fortunately, this
limitation does not greatly restrict the usefulness of the proposed
framework, since in most cardiac applications, landmark or surface
tracking and motion analysis along specific directions are more common
than free motion analysis.
14.5 CONCLUSION
Two generic frameworks for optical flow were proposed as an energy
minimization problem with local constraints related to iso-value
curves. Three applications of these frameworks were presented for
tracking of the endocardium on 2D MRI data series (both FIESTA and
PTI protocols) and real-time 3D ultrasound series. The endocardium
borders tracked by the proposed method as well as the Lucas-Kanade
method were quantitatively compared to manual tracing on
each frame through the Tanimoto index and relative errors in radial
coordinates after FEM fitting. The results showed superior performance
for the proposed method in tracking the endocardium. The
constrained version was applied on real-time 3D ultrasound data.
Quantitative evaluation results yielded performance comparable to
inter-observer variance with about a 20-fold saving in computational
cost compared to the direct tracking scheme. Thickening computations
from the proposed method and the direct tracking method were
compared, with similar results. These frameworks are generic and can
be readily extended to n-dimensional spaces and can seamlessly
incorporate additional constraints via a similar energy minimization
framework.
14.6 ACKNOWLEDGMENT
This work was funded by National Science Foundation grant BES-02-01617,
American Heart Association #0151250T, Philips Medical Systems,
and the New York State NYSTAR/CAT Technology Program.
Dr Andrew McCulloch at the University of California, San Diego
provided the finite element software "Continuity" through the
National Biomedical Computation Resource (NIH P41RR08605).
The authors also would like to thank Dr Todd Pulerwitz
(Department of Medicine, Columbia University), Susan L. Herz,
Christopher M. Ingrassia, Drs Jeffrey W. Holmes and Kevin D. Costa
(Department of Biomedical Engineering), and Dr Vinay M. Pai
(Radiology, New York University).
References
1. Ramm OTV, Pavy JHG, Smith SW, Kisslo J, Real-time, three-
dimensional echocardiography: The first human images, Circulation
84: 685, 1991.
2. Pai V, Axel L, Kellman P, Phase train approach for very high temporal
resolution cardiac imaging, J Cardiovasc Magn Reson 7: 98–99, 2005.
3. Drezek R, Stetten GD, Ota T, Fleishman C, et al., Active contour based
on the elliptical Fourier series, applied to matrix-array ultrasound of
the heart, presented at 25th AIPR Workshop: Emerging Applications
of Computer Vision, 1997.
4. Chalana V, Linker DT, Haynor DR, Kim Y, A multiple active con-
tour model for cardiac boundary detection on echocardiographic
sequences, IEEE Transactions on Medical Imaging 15: 290–298, 1996.
5. Angelini ED, Homma S, Pearson G, Holmes JW, et al., Segmentation of
real-time three-dimensional ultrasound for quantification of ventricular
function: A clinical study on right and left ventricles, Ultrasound in
Med & Biol 31: 1143–1158, 2005.
6. Paragios N, A level set approach for shape-driven segmentation and
tracking of the left ventricle, IEEE Transactions on Medical Imaging 22:
773–776, 2003.
7. Lin N, Duncan JS, Generalized robust point matching using an
extended free-form deformation model: Application to cardiac images,
presented at 2004 2nd IEEE International Symposium on Biomedical
Imaging: Macro to Nano, 2004.
8. Rueckert D, Burger P, Geometrically Deformable Templates for Shape-based
Segmentation and Tracking in Cardiac MR Images, presented at Energy
Minimization Methods in Computer Vision and Pattern Recognition,
Venice, Italy, 1997.
9. Montagnat J, Delingette H, Spatial and Temporal Shape Constrained
Deformable Surfaces for 3D and 4D Medical Image Segmentation, INRIA,
Sophia Antipolis RR-4078, 2000.
10. van Assen HC, Danilouchkine MG, Frangi AF, Ordas S, et al., SPASM:
A 3D-ASM for segmentation of sparse and arbitrarily oriented cardiac
MRI data, Medical Image Analysis 10: 286–303, 2006.
11. Mitchell SC, Lelieveldt BPF, van der Geest R, Schaap J, et al., Segmenta-
tion of Cardiac MR Images: An Active Appearance Model Approach,
presented at SPIE-The International Society for Optical Engineering,
2000.
12. Setarehdan SK, Soraghan JJ, Automatic cardiac LV boundary detection
and tracking using hybrid fuzzy temporal and fuzzy multiscale
edge detection, IEEE Transactions on Biomedical Engineering 46: 1364–1378, 1999.
13. Veronesi F, Corsi C, Caiani EG, Sarti A, et al., Tracking of left ventricular
long axis from real-time three-dimensional echocardiography using
optical flow techniques, IEEE Transactions on Information Technology in
Biomedicine 10: 174–181, 2006.
14. Duan Q, Angelini E, Herz SL, Ingrassia CM, et al., Dynamic Cardiac
Information From Optical Flow Using Four Dimensional Ultrasound, pre-
sented at 27th Annual International Conference IEEE Engineering in
Medicine and Biology Society (EMBS), Shanghai, China, 2005.
15. Loncaric S, Majcenic Z, Optical Flow Algorithm for Cardiac Motion Estimation,
presented at 22nd Annual International Conference of the IEEE
Engineering in Medicine and Biology Society, Jul 23–28 2000, Chicago,
IL, 2000.
16. Gindi GR, Gmitro AF, Delorie DHJ, Velocity Flow-Field Analysis of Car-
diac Dynamics, presented at Proceedings of the Thirteenth Annual
Northeast Bioengineering Conference, Philadelphia, PA, USA, 1987.
17. Gutierrez MA, Moura L, Melo CP, Alens N, Computing Optical Flow
in Cardiac Images for 3D Motion Analysis, presented at Proceed-
ings of the 1993 Conference on Computers in Cardiology, London,
UK, 1993.
18. Suhling M, Arigovindan M, Jansen C, Hunziker P, et al., Myocardial
motion analysis from B-mode echocardiograms, IEEE Transactions on
Image Processing 14: 525–536, 2005.
19. Lucas BD, Kanade T, An Iterative Image Registration Technique with an
Application to Stereo Vision, presented at International Joint Conference
on Artificial Intelligence (IJCAI), 1981.
20. Horn BKP, Robot Vision, MIT Press, Cambridge, 1986.
21. Barron JL, Fleet D, Beauchemin S, Performance of optical flow tech-
niques, Int Journal of Computer Vision 12: 43–77, 1994.
22. Humphrey JD, Cardiovascular Solid Mechanics: Cells, Tissues, and Organs,
Springer, New York, USA, 2002.
23. Papademetris X, Sinusas AJ, Dione DP, Duncan JS, Estimation of 3D left
ventricular deformation from echocardiography, Medical Image Analysis
8: 285–294, 2004.
24. Azhari H, Sideman S, Weiss JL, Shapiro EP, et al., Three-dimensional
mapping of acute ischemic regions using MRI: Wall thickening versus
motion analysis, American Journal of Physiology 259: H1492–H1503, 1990.
25. Suhling M, Arigovindan M, Jansen C, Hunziker P, et al., Myocardial
motion analysis from B-mode echocardiograms, IEEE Transactions on
Image Processing 14: 525–536, 2005.
26. Noble N, Hill D, Breeuwer M, Schnabel J, et al., Myocardial delineation
via registration in a polar coordinate system, Acad Radiol 10: 1349–1358,
2003.
27. Duan Q, Angelini ED, Herz SL, Gerard O, et al., Tracking of LV Endo-
cardial Surface on Real-Time Three-Dimensional Ultrasound with Optical
Flow, presented at Third International Conference on Functional Imag-
ing and Modeling of the Heart 2005, Barcelona, Spain, 2005.
28. Herz S, Pulerwitz T, Hirata K, Laine A, et al., Novel Technique for
Quantitative Wall Motion Analysis Using Real-Time Three-Dimensional
Echocardiography, presented at Proceedings of the 15th Annual Scien-
tific Sessions of the American Society of Echocardiography, 2004.
29. Theodoridis S, Koutroumbas K, Pattern Recognition, Academic Press,
USA, 1999.
30. Ramm OTV, Smith SW, Real-time volumetric ultrasound imaging sys-
tem, Journal of Digital Imaging 3: 261–266, 1990.
31. Krenning BJ, Voormolen MM, Roelandt JRTC, Assessment of left ven-
tricular function by three-dimensional echocardiography, Cardiovasc
Ultrasound 1(1): 2003.
32. Shin I-S, Kelly PA, Lee KF, Tighe DA, Left Ventricular Volume Estimation
From Three-Dimensional Echocardiography, presented at Proceedings of
SPIE, Medical Imaging 2004 — Ultrasonic Imaging and Signal Processing,
San Diego, CA, United States, 2004.
33. Yu W, Yan P, Sinusas AJ, Thiele K, et al., Towards pointwise motion
tracking in echocardiographic image sequences: Comparing the relia-
bility of different features for speckle tracking, Medical Image Analysis
10: 495–508, 2006.
34. Herz S, Ingrassia C, Homma S, Costa K, et al., Parameterization of left
ventricular wall motion for detection of regional ischemia, Annals of
Biomedical Engineering 33: 912–919, 2005.
35. Duan Q, Angelini ED, Herz SL, Ingrassia CM, et al., Evaluation of Optical
Flow Algorithms for Tracking Endocardial Surfaces on Three-Dimensional
Ultrasound Data, presented at SPIE International Symposium, Medical
Imaging 2005, San Diego, CA, USA, 2005.
36. Ingrassia CM, Herz SL, Costa KD, Holmes JW, Impact of Ischemic
Region Size on Regional Wall Motion, presented at Proceedings of the
2003 Annual Fall Meeting of the Biomedical Engineering Society,
2003.
37. Angelini ED, Hamming D, Homma S, Holmes J, et al., Comparison of Segmentation
Methods for Analysis of Endocardial Wall Motion with Real-Time
Three-Dimensional Ultrasound, presented at Computers in Cardiology,
Memphis, TN, USA, 2002.
38. Bruhn A, Weickert J, Feddern C, Kohlberger T, et al., Variational optical
flow computation in real-time, IEEE Transactions on Image Processing 14:
608–615, 2005.
39. Ruhnau P, Kohlberger T, Schnorr C, Nobach H, Variational optical
flow estimation for particle image velocimetry, Experiments in Fluids 38:
21–32, 2005.
CHAPTER 15
Some Recent Developments in
Reconstruction Algorithms for
Tomographic Imaging
Chien-Min Kao, Emil Y Sidky, Patrick La Rivière
and Xiaochuan Pan
Ionizing-radiation based imaging techniques play an extremely important
role in non-invasively yielding information about the internal anatomic
structure and functional information within a subject under study. Computed
tomography (CT), positron emission tomography (PET), and single
photon emission computed tomography (SPECT) are the main imaging
modalities based upon ionizing radiation, and they have found applications
in virtually every discipline in science, engineering, biology, chemistry,
and, more notably, medicine. In these imaging techniques, one needs
to develop algorithms for accurately reconstructing the underlying object
function from acquired projection data. In the last decade or so, in parallel
to tremendous tomographic hardware advancement for data acquisition,
there have also been important breakthroughs in the development of innovative
algorithms for reconstructing the underlying object function. In this
chapter, we briefly review some of the recent developments in reconstruction
algorithms for tomographic imaging in CT, PET, and SPECT.
15.1 IMAGE RECONSTRUCTION IN COMPUTED
TOMOGRAPHY
15.1.1 Introduction
X-ray projection imaging is the most common non-invasive scan
employed for probing the interior of a subject, and it found wide
application very quickly after its initial discovery in 1895. For many
purposes, the projection of the subject’s X-ray attenuation coeffi-
cient yields important diagnostic information in medical imaging,
or structural and compositional information in industrial imaging.
There are, however, an increasing number of imaging applications,
where it is desirable to have full 3D information of the X-ray atten-
uation coefficient. Such information can be provided by combining
and processing X-ray projections of a subject taken from multiple
view angles surrounding the subject. In the 1970s, as computer technology
began its rapid ascension, computed tomography (CT) was
developed to address the need for internal 3D structural information.
The early CT scanners obtained 3D images slice-by-slice by illuminating
the subject with a fan of X-rays, rotating the X-ray source
and detector to obtain complete information to reconstruct the 2D
slice image. By translating the subject, subsequent slices could
be obtained. The theory of image reconstruction led to the filtered
back projection (FBP) and fan-beam filtered back projection (FFBP)
algorithms, corresponding respectively to parallel- and diverging-ray
illumination. This step-and-shoot process was streamlined by
introducing the helical source trajectory. This trajectory is what is
seen from the subject reference frame as the subject is translated at a
constant rate through a rotating gantry that carries the X-ray source
and detector on a circular trajectory. If the helical pitch, the distance
covered by the subject during a single turn of the gantry, is not too
great, then variations on the 2D FFBP algorithm can be utilized to
obtain accurate reconstruction of the subject's 3D X-ray attenuation
coefficient.
The trend in the technical development of CT scanners is to
include more and more rows on the detector, extending its dimen-
sion along the longitudinal axis of the helical scan (referred to as
simply the longitudinal direction for short). Currently, commercial
scanners employ up to 64 detector rows and this number will certainly
increase, because more detector rows allow for higher helical
pitches and more rapid coverage of 3D volumes. As the detector
size increases longitudinally, the X-ray source slit is opened up to
illuminate the subject with an X-ray cone beam. This evolution of
CT scanners has increased the sense of urgency for the development
of practical algorithms that can yield accurate image reconstruction
from cone beam CT projection data.

A theory of 3D image reconstruction for cone beam CT has been
known since the work by Tuy (Ref. 1), who derived an inversion formula that
yields a 3D distribution from its cone beam projection at views along
a general class of X-ray source trajectories. The helical trajectory falls
into this class. Although the Tuy formula represents an important
advance in the theory of cone beam CT image reconstruction, it has
two major practical shortcomings. First, direct implementation of
this formula is numerically inefficient. Second, the projection data
cannot be truncated; a complete projection of the subject is needed
from all of the view angles. This is particularly impractical in helical
cone beam CT for human subjects, as the detector would have to have
an extent larger than the body's projection at all the sampled views.
During the 1990s and early 2000s, much effort was devoted to deriving
a practical image reconstruction algorithm, using a relation between
cone beam projection data and the 3D Radon transform of the object
function derived by Grangeat (Ref. 2). These algorithms sought to solve the
so called long object problem, where the cone beam projection data
are truncated only in the longitudinal direction (Ref. 2). A breakthrough
in image reconstruction theory occurred in 2001 when Katsevich
published an exact formula for image reconstruction directly from
helical cone beam projection data (Ref. 3). This algorithm, though related to
the Tuy formula (Ref. 4), could support longitudinal truncation of the cone
beam projection data, and requires only 1D filtering of the projection
data, thereby improving numerical efficiency.
The ideas of the Katsevich algorithm, combined with the geometrical
construct of the so called π-line in helical cone beam scanning (Ref. 5),
led Zou and Pan to develop a new class of cone beam CT image
reconstruction algorithms (Refs. 6–10). These algorithms obtain the image
in a curvilinear coordinate system that is defined by the chords of
a general source trajectory. In helical cone beam CT, the π-lines can
be interpreted as a special set of chords. The new algorithms are
efficient and create opportunities to design novel data acquisition
configurations that allow for dose reduction and increased scanning
speed. The Zou-Pan image reconstruction formula involves
reversing the usual data processing steps of data filtration followed
by back projection to the image array. These algorithms instead
perform the back projection step first, and are hence called back
projection filtration (BPF) algorithms. The reversal of these operations
improves algorithm efficiency, because the filtration in the image space
is less time consuming than in the data space. More importantly, BPF can
perform exact image reconstruction for projection data that are
truncated both longitudinally and transversely. In the following sections,
we introduce the data model for helical cone beam CT; we then
explain the BPF image reconstruction algorithm; and finally we discuss
the implications for region-of-interest (ROI) imaging.
15.1.2 The Data Model of Helical Cone Beam CT
In helical cone beam CT, the X-ray source travels along a helical
trajectory along with the 2D detector array. The detector shown in the
image is a flat-panel array, while current helical cone beam systems
generally use curved detector arrays. The image reconstruction theory
below is presented in a detector-independent formulation which
can be easily adapted to either detector geometry. The data model for
the helical cone beam system assumes that the line integral of the
X-ray attenuation coefficient for a ray originating from the source
and terminating at a detector bin can be obtained from:
\[
d_i = -\ln\left[ \frac{I_i}{(I_0)_i} \right], \tag{1}
\]
where $i$ is a generic index for the rays specified by the combination
of all detector bin and source locations; $(I_0)_i$ is the X-ray intensity, in
number of photons, that would be measured for the $i$-th ray if there
were no subject; $I_i$ is the actual measured intensity; and $d_i$ represents
the line integral of the X-ray attenuation for the $i$-th ray:
\[
d_i = \int_0^{\infty} d\ell \, \mu(\vec{s}_i + \ell\, \hat{\theta}_i). \tag{2}
\]
The vector $\vec{s}_i$ is the X-ray source location and $\hat{\theta}_i$ is the unit vector for
the $i$-th ray; $\mu$ is the spatially varying X-ray attenuation coefficient,
ignoring energy dependence. The data model is idealized; X-ray
scatter, beam polychromaticity, partial volume averaging, etc. (Ref. 11) are
all neglected here. The set of $d_i$ is interpreted as the measurements
because they can be computed from the raw measurements through
Eq. (1). The aim of the reconstruction algorithm is to find $\mu(\vec{r}\,)$ given the
measurements $d_i$.
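As a simple illustration of Eq. (1), the sketch below converts raw intensity measurements into line-integral data; the function name and the use of NumPy are our own assumptions.

```python
import numpy as np

def intensities_to_line_integrals(I, I0):
    """Convert measured X-ray intensities I and blank-scan (no subject)
    intensities I0 into line-integral data d_i = -ln(I_i / (I0)_i),
    following Eq. (1)."""
    I = np.asarray(I, dtype=float)
    I0 = np.asarray(I0, dtype=float)
    return -np.log(I / I0)

# Example: three rays, attenuated from a blank scan of 1e5 photons each
print(intensities_to_line_integrals([80000, 20000, 5000], [1e5, 1e5, 1e5]))
```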
As described by Eq. (2) the measurement set is a large but finite
set of line integrals. The theory of image reconstruction in helical
cone beam CT is formulated in terms of a continuous data function;
thus we rewrite Eq. (2) to reflect this fact, and discuss discretiza-
tion after the reconstruction formula is written down in Eq. (3). To
develop the image reconstruction formula, we assume that we can
obtain the continuous data function
\[
g(\lambda, \hat{\theta}) = \int_0^{\infty} d\ell \, \mu[\vec{s}(\lambda) + \ell\, \hat{\theta}], \tag{3}
\]
where λ is the continuously varying helical parameter indicating
the source position. The source position is given in Cartesian coor-
dinates by:
\[
\vec{s}(\lambda) = \left( R\cos\lambda, \; R\sin\lambda, \; \frac{h}{2\pi}\,\lambda \right), \tag{4}
\]
where $R$ is the helical radius, and $h$ is the pitch length. The coordinate
system is set up so that the axis of the helix is aligned along $z$.
The detector bin locations are not specified. It is assumed that the
detector captures the attenuation measurements along the necessary
rays originating at $\vec{s}(\lambda)$ in the direction $\hat{\theta}$. The sufficient range of $\lambda$
and $\hat{\theta}$ is discussed below.
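A sketch of the source trajectory of Eq. (4), assuming (as is conventional) that the longitudinal term is $h\lambda/2\pi$, might look like the following; the function name is our own.

```python
import numpy as np

def helical_source_position(lam, R, h):
    """X-ray source position s(lambda) on a helix of radius R and
    pitch length h (longitudinal distance covered per gantry turn),
    following Eq. (4)."""
    return np.array([R * np.cos(lam),
                     R * np.sin(lam),
                     h * lam / (2.0 * np.pi)])
```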
15.1.3 The BPF Algorithm
The BPF algorithm for image reconstruction in helical cone-beam CT
involves decomposing the imaging volume into chords of the source
trajectory. Mathematically, a single chord is described by:
\[
\vec{r}_c(\lambda_1, \lambda_2, t) = \vec{s}(\lambda_1)(1 - t) + \vec{s}(\lambda_2)\, t; \quad t \in [0, 1]. \tag{5}
\]
The chord, specified by the helical parameters $\lambda_1$ and $\lambda_2$, is a line segment
that joins the source positions $\vec{s}(\lambda_1)$ and $\vec{s}(\lambda_2)$. The parameter
$t$ locates a point on the chord. It has been previously observed that
all points internal to the convex hull of the helix can be uniquely
assigned to a point on a helical chord with the restriction that
$|\lambda_1 - \lambda_2| < 2\pi$; chords that satisfy this restriction are called π-lines (Ref. 5).
The BPF algorithm obtains the volume image by reconstructing it
chord by chord.
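Building on the trajectory sketch above, the chord parameterization of Eq. (5) can be written as follows (again a sketch, with names of our choosing):

```python
def chord_point(lam1, lam2, t, R, h):
    """Point r_c(lambda1, lambda2, t) on the chord joining the source
    positions s(lambda1) and s(lambda2), following Eq. (5). The chord
    is a pi-line when |lambda1 - lambda2| < 2*pi."""
    s1 = helical_source_position(lam1, R, h)
    s2 = helical_source_position(lam2, R, h)
    return s1 * (1.0 - t) + s2 * t
```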
The main steps of the BPF algorithm involve taking a derivative
of the projection data, back projection of the data derivative onto the
chord to form an intermediate image function, and finally filtration
of the intermediate image to obtain the actual image function. The
first processing step for the data function follows this equation:
\[
g_D(\lambda, \hat{\theta}) = \frac{\partial}{\partial p}\, g(p, \hat{\theta}) \bigg|_{p = \lambda}. \tag{6}
\]
The next step involves back projecting the data onto the chord:
\[
f_I(\lambda_1, \lambda_2, t) = \int_{\lambda_1}^{\lambda_2} d\lambda \, \frac{1}{|\vec{s}(\lambda) - \vec{r}_c(\lambda_1, \lambda_2, t)|} \, g_D(\lambda, \hat{\theta}_c),
\quad \text{where} \quad
\hat{\theta}_c = \frac{\vec{r}_c(\lambda_1, \lambda_2, t) - \vec{s}(\lambda)}{|\vec{r}_c(\lambda_1, \lambda_2, t) - \vec{s}(\lambda)|}. \tag{7}
\]
Before continuing on to the last step of the chord image reconstruction,
we note here that the above formula says something about the
projection data sufficient for reconstructing the image on the chord.
The integration over $\lambda$ for the back projection goes from $\lambda_1$ to $\lambda_2$, so
projection views for $\lambda \in [\lambda_1, \lambda_2]$ are needed to form $f_I(\lambda_1, \lambda_2, t)$,
and for each view the rays that intersect the chord need to be
measured.
It turns out that the intermediate chord image $f_I(\lambda_1, \lambda_2, t)$ is simply
the Hilbert transform of the desired image function $\mu_c(\lambda_1, \lambda_2, t)$
along the chord:
\[
f_I(\lambda_1, \lambda_2, t) = 2 \int_{-\infty}^{\infty} dt' \, \frac{\mu_c(\lambda_1, \lambda_2, t')}{t - t'},
\quad \text{where} \quad \mu_c(\lambda_1, \lambda_2, t) = \mu[\vec{r}_c(\lambda_1, \lambda_2, t)]. \tag{8}
\]
The Hilbert transform involves an infinite range integration, but it
is known that the object function µ has compact support. It turns
out that the solution $\mu_c(\lambda_1, \lambda_2, t)$ to the integral equation, Eq. (8), can
be expressed with a finite range integration because of the compact
support property. Assuming that $\mu_c$ is compactly supported within
the interval $t \in [t_a, t_b]$, we have:
\[
\mu_c(\lambda_1, \lambda_2, t) = \sqrt{\frac{t_b - t}{t - t_a}} \times \int_{t_a}^{t_b} dt' \, \frac{1}{t - t'} \sqrt{\frac{t' - t_a}{t_b - t'}} \, f_I(\lambda_1, \lambda_2, t'). \tag{9}
\]
The fact that the $t'$ integration only runs from $t_a$ to $t_b$ has further
implications for the data sufficiency conditions. Only the projection
rays that intersect the chord for $t \in [t_a, t_b]$ need to be measured, not
the complete π-line. This inverse for the finite Hilbert transform is
actually only one of many possibilities (Refs. 8 and 12). This completes the chain
of operations needed to go from the projection data to the image
along a π-line. The only issues that remain are how to obtain volume
images, and what projection data are sufficient for reconstruction.
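To make the chain of operations concrete, the sketch below strings together the three BPF steps of Eqs. (6), (7), and (9) on a discretized chord, reusing helical_source_position() and chord_point() from the sketches above. It is a schematic illustration under our own discretization choices, not the authors' implementation, and it assumes a caller-supplied interpolating function g(lam, theta_hat) for the continuous data of Eq. (3).

```python
import numpy as np

def bpf_reconstruct_chord(g, lam1, lam2, R, h, n_lam=256, n_t=128):
    """Schematic BPF reconstruction of mu_c on a single pi-line.
    g(lam, theta_hat) must return the cone-beam data of Eq. (3); the
    object support on the chord is taken to be the whole interval
    [t_a, t_b] = [0, 1] for simplicity."""
    lams = np.linspace(lam1, lam2, n_lam)
    dlam = lams[1] - lams[0]
    ts = np.linspace(0.0, 1.0, n_t)
    dt = ts[1] - ts[0]
    t_a, t_b = ts[0], ts[-1]

    # Step 1 (Eq. 6): data derivative along the source parameter at a
    # fixed ray direction, via a central finite difference.
    def g_D(lam, theta_hat):
        return (g(lam + dlam, theta_hat) - g(lam - dlam, theta_hat)) / (2 * dlam)

    # Step 2 (Eq. 7): back project the derivative data onto the chord.
    f_I = np.zeros(n_t)
    for k, t in enumerate(ts):
        r_c = chord_point(lam1, lam2, t, R, h)
        for lam in lams:
            d = r_c - helical_source_position(lam, R, h)
            dist = np.linalg.norm(d)
            if dist < 1e-9:  # chord endpoint coincides with the source
                continue
            f_I[k] += g_D(lam, d / dist) / dist * dlam

    # Step 3 (Eq. 9): finite inverse Hilbert transform along the chord.
    # The integrable endpoint singularities are crudely regularized.
    w = np.sqrt((ts - t_a) / np.maximum(t_b - ts, dt))
    mu_c = np.zeros(n_t)
    for k, t in enumerate(ts):
        diff = t - ts
        kernel = np.zeros_like(diff)
        mask = np.abs(diff) > dt / 2
        kernel[mask] = 1.0 / diff[mask]
        mu_c[k] = np.sqrt((t_b - t) / max(t - t_a, dt)) * np.sum(kernel * w * f_I) * dt
    return ts, mu_c
```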
15.1.4 The Long Object Problem and ROI Reconstruction
The theory of π-line image reconstruction, above, tells how to obtain
the reconstructed image on the trajectory chords. The end goal, however,
is volume reconstruction. This section clarifies the connection
between the two cases, and along the way discusses scanning data
requirements for various scanning tasks.
For diagnostic helical cone-beam CT, the most important task
that the image reconstruction can fulfill is to provide numerically
exact images efficiently from projection data that are longitudinally
truncated. The BPF algorithm does this. As the BPF algorithm
itself provides the image on individual π-lines, the volume must be
parameterized first in the curvilinear system specified by the independent
variables $\lambda_1$, $\lambda_2$, and $t$. The variables $\lambda_1$ and $\lambda_2$ specify a
π-line, and $t$ yields a specific point on that chord. We illustrate now
how the volume coverage works in this coordinate system. First,
one can fix one end of the chord, say $\lambda_1 = \lambda_A$, then sweep $\lambda_2$ in
the range $[\lambda_A, \lambda_A + 2\pi]$. Such a set of π-lines defines a π-surface
whose geometry only depends on $\lambda_A$. To obtain the volume, $\lambda_A$ is
swept through an interval $[\lambda_{start}, \lambda_{end}]$.
The data sufficiency condition for such a volume scan is easy to derive from
geometric considerations of the individual π-lines: it is $\lambda \in [\lambda_{start}, \lambda_{end} + 2\pi]$. The
required projection data on the detector from each view, however, is
less obvious but not difficult to derive. From Sec. 15.1.3, rays passing
through the π-line defined by $\lambda_1$ and $\lambda_2$ must be detected. It turns
out that the area on the detector that should be measured is specified
by the so called Tam-Danielsson (TD) window (Refs. 5 and 13). This window
represents the shadow of all π-lines to which the current view angle
contributes. Geometrically, the boundaries of the TD window are
defined by the shadows of the helical scanning trajectory on the
detector within $2\pi$ of the current scanning angle $\lambda$. Note that this
geometrical definition can be applied to any detector geometry as
long as the TD window fits within the detector. In practice, even the
TD window is the upper limit on the detector area. If it is known that
the subject support is confined well within the convex hull of the helical
scan, then the required detector area can be reduced further. In
either case, the BPF reconstruction allows the utilization of projection
data that are longitudinally truncated, thus solving the long object
problem.
Because the BPF theory reconstructs a volume image chord-by-chord,
substantial reduction of scanning effort, even over long object
scanning, is possible when the image is desired only within a certain
ROI. Given the ROI, one need only identify the π-lines that intersect
the ROI and then reconstruct them. The scanning range is
found by examining the volume parameterization in terms of $\lambda_1$ and
$\lambda_2$. For example, a spherical volume can be reparameterized in terms
of $\lambda_1$, $\lambda_2$, and $t$, then subsequently projected down to the $(\lambda_1, \lambda_2)$-plane
(by integrating over $t$). Each point within the resulting area represents a
single π-line that should be reconstructed. The actual volume that is
reconstructed is the union of the support segments of all of these
π-lines, which in general will be larger than the desired ROI. True
ROI reconstruction (known as the interior problem) is theoretically
not possible. As with the long object scanning, the necessary projection
data are identified by the shadow of the support segments of
each π-line on the detector. While long object scanning serves the
bulk of diagnostic helical cone-beam CT, ROI scanning may prove
useful for specific protocols in image guided radiation therapy, CT
breast screening, or cardiac imaging.
15.2 IMAGE RECONSTRUCTION IN POSITRON
EMISSION TOMOGRAPHY
15.2.1 Introduction
Positron emission tomography (PET) is a unique, functional imaging
modality that is capable of producing quantitative in vivo assays
of a large variety of molecular pathways of biological systems. PET
has been routinely used in cancer diagnosis and evaluation (Ref. 14). It is
also widely used in neurology (Ref. 15) and cardiology (Ref. 16), and is promising
for providing effective treatment outcome evaluations (Ref. 17). Recently,
there has been substantial interest in developing dedicated PET systems
for imaging small animals (such systems are referred to as
microPET systems below) (Ref. 18). In combination with the use of animal
models of human biology and diseases, microPET systems are powerful
tools in preclinical research (Ref. 19). MicroPET imaging of gene transfer,
expression, and therapy has been successfully demonstrated (Ref. 20),
and there are high expectations that microPET systems will play
important roles in discovering new biology, as well as in drug and
treatment developments (Ref. 21). In comparison with human PET imaging,
microPET imaging demands much higher imaging performance
characteristics (Ref. 18), making microPET system development a useful
test bed for innovative PET designs and technologies. Because both
animal and human PET systems are available, PET imaging is also
a useful translational research tool. Finally, in recent years there has
also been greatly renewed enthusiasm for time-of-flight (TOF) PET
imaging due to its ability to produce improved image quality and
the availability of fast and dense scintillators adequate for implementing
TOF-PET systems (Ref. 22).
The imaging performance of a PET system depends critically
on both its instrumentation and reconstruction (Ref. 23). Many discoveries
and innovations in PET instrumentation have taken place
in recent years (Refs. 18 and 23). These include new scintillators and detector
designs that enable substantial improvement of the spatial resolution
and timing accuracy in cost-effective manners. Currently,
PET imaging can reach a spatial resolution of about, or better than,
1 mm. On the other hand, there are also efforts toward achieving
exceptionally high sensitivity for microPET systems (Ref. 24). Parallel to advances
in instrumentation, there are substantial advances in PET image
reconstruction as well (Ref. 25). In practice, as we will discuss below, PET
data are significantly degraded, making the imaging model considerably
different from the ideal. Major degradations in PET imaging
include data noise, effects of finite detector size, and the presence
of unwanted radiation (scatter and random coincidences) (Refs. 18 and 23).
These degradations need to be addressed in reconstruction in order
to produce high-quality PET images. As the application domain of
PET imaging enlarges, higher demands in all performance aspects
of PET imaging can be expected. These demands will require
many more advances in PET instrumentation and reconstruction
to be made.
Excellent review articles on PET instrumentation and reconstruction
can be found in Refs. 18, 23 and 25. Below, we will discuss
issues and challenges facing PET image reconstruction and describe
approaches for addressing them.
15.2.2 Imaging Model
PET imaging is based on the principle of annihilation coincidence
detection and tracer kinetic modeling (Ref. 23). PET tracers are molecules
radioactively labeled with positron-emitting isotopes, which include
F-18, C-11, N-13, and O-15. Positrons emitted by PET tracers will
annihilate with electrons in their surroundings and give rise to a
pair of 511 keV photons traveling in opposite directions. Typically,
rings of gamma ray detectors are placed around the subject being
imaged. A simultaneous detection of two 511 keV photons by the
detector rings, called a coincidence detection, registers an annihilation
event. Generally, one can define the response function $h_i(\vec{x}\,)$ to
represent the probability for a positron emission occurring at $\vec{x}$ to
be detected by the $i$-th detector pair of a PET scanner. The response
function $h_i(\vec{x}\,)$ includes factors such as the (geometric) detection efficiency
of the $i$-th detector pair at the position $\vec{x}$ and the attenuation
that the annihilation photons are subject to before exiting the subject.
Because the annihilation photons travel in opposite directions,
for small detectors we have, to a good approximation, $h_i(\vec{x}\,) = \varepsilon_i a_i$
for $\vec{x} \in L_i$ and $h_i(\vec{x}\,) = 0$ for $\vec{x} \notin L_i$, where $L_i$ denotes the line that
connects the centers of the front faces of the two detectors of the $i$-th
detector pair; and $\varepsilon_i$ and $a_i$ are the detection efficiency and subject
attenuation on $L_i$, respectively. In the literature, $L_i$ is called the line of
response (LOR). The number of coincidence events collected at
the $i$-th detector pair, denoted by $g_i$, is then related to the image function
$f(\vec{x}\,)$, i.e. the spatial density distribution of the positron decays
taking place during the imaging time, by
\[
g_i = \varepsilon_i a_i \times \int_{L_i} dl \, f(\vec{x}\,), \tag{10}
\]
where $\int_{L_i} dl$ denotes the line integral along the LOR $L_i$. Consequently,
under suitable conditions PET measurements are related to a collection
of line integrals of the image function, i.e. to certain samplings
of the Radon transform of the image function. Provided that the
resulting samplings are adequate, according to the theory of the Radon
transform, the image function can be recovered from the acquired
PET measurements, up to the spatial resolution limit supported by
the samplings.
The above description provides the basic principle underlying
PET imaging. This description is greatly simplified; it omits many
physical factors involved in the imaging process, including the
positron range, photon noncolinearity, the presence of scattered and
random coincidences, and the effects of finite detector size. The positron
range is the finite distance between the location where a positron
is emitted and where the annihilation takes place. Therefore, rigorously
speaking, the image function $f(\vec{x}\,)$ refers to the density function
of the positron annihilation, rather than that of the positron emitter
itself, unless the positron range is negligibly small. Depending
on the positron emitting isotope employed, the positron range can
vary from 0.83 mm to 8.54 mm (Ref. 18). Photon noncolinearity refers to the
fact that the directions of the two annihilation photons can slightly
deviate from the ideal 180°. The full-width-at-half-maximum of
this angular deviation is small but finite (about 0.5°). The departure
of the annihilation position from the detected LOR due to photon
noncolinearity increases as the size of the detector ring increases.
Scattered events are registered coincidence events in which at least
one annihilation photon undergoes scattering before detection. There
are also random events, which are chance coincidence detections of two
photons originating from two independent positron annihilations.
These event types can significantly contaminate PET measurements
in 3D whole-body PET imaging and in applications that employ high
tracer concentrations (Ref. 23). A pair of detectors is sensitive to all annihilation
events taking place inside the common volume seen by the
detectors. In the above simplified imaging model, we have assumed
sufficiently small detectors such that the sensitive volume reduces to
the LOR. In practice, this is often a poor assumption. Furthermore,
due to gamma-ray penetration the sensitive volume can be much
larger than that suggested by the dimension of the detector front
face, leading to the phenomenon of parallax errors (Ref. 23). The sensitivity
of a detector pair to points within its sensitive volume
can also vary considerably. In addition to the above described
physical factors, radioactive decay and photon detection are random
processes, giving rise to statistical variations in the number of
detected events given the same image function and imaging
conditions. To include these physical and statistical factors in PET
imaging, one can write:
\[
\bar{g}_i = E\{g_i\} = \int d^3x \, h_i(\vec{x}\,) f(\vec{x}\,) + s_i + r_i, \quad i = 1, \ldots, N, \tag{11}
\]
where $s_i$ and $r_i$ are the expectations of the number of scattered and
random events accumulated at the $i$-th detector pair during the
imaging time, and $E\{g_i\}$ denotes the ensemble mean of $g_i$. In PET, the $g_i$'s
are independent Poisson variates. Therefore, the conditional probability
distribution of the measurement $\vec{g} = [g_1, \ldots, g_N]^t$ given the
image function $f(\vec{x}\,)$, is equal to:
\[
p(\vec{g}\,|\,\vec{f}\,) = \prod_{i=1}^{N} e^{-\bar{g}_i}\, \bar{g}_i^{\,g_i} \big/ g_i! \,. \tag{12}
\]
It is well known that $\mathrm{var}\{g_i\} = \bar{g}_i$. Therefore, PET data
noise is generally not stationary (i.e. it varies with the measurements), with the
relative standard deviation of the noise with respect to its mean
decreasing with the number of detected events. From Eq. (12), the
log-likelihood function for the measurement is given by:
\[
l(\vec{f}\,|\,\vec{g}\,) = \log p(\vec{g}\,|\,\vec{f}\,) = -\sum_{i=1}^{N} \bar{g}_i + \sum_{i=1}^{N} g_i \log \bar{g}_i + \text{constant}. \tag{13}
\]
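A sketch of Eq. (13) in code, with names of our own choosing, might be:

```python
import numpy as np

def poisson_log_likelihood(g, g_bar):
    """Poisson log-likelihood of measured counts g given expected
    counts g_bar (assumed strictly positive), following Eq. (13);
    the constant term, which does not depend on the image, is dropped."""
    g = np.asarray(g, dtype=float)
    g_bar = np.asarray(g_bar, dtype=float)
    return np.sum(-g_bar + g * np.log(g_bar))
```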
15.2.3 Image Reconstruction
Under idealized conditions, PET measurements are related to certain
samplings of the Radon transform of the underlying image
function. After correcting for the detection sensitivity and subject
attenuation, analytic algorithms developed for inverting the Radon
transform, such as the celebrated filtered backprojection (FBP) algorithm,
can be employed for reconstructing the unknown image
function from PET measurements. Methods that can compensate for
certain deviations from the Radon transform, such as the positron
range and stationary detector response, have also been proposed (Ref. 26).
Analytic PET reconstruction methods, however, have two major
shortcomings. First, the tomographic reconstruction process is ill
conditioned, such that small data noise can give rise to large errors
in the solution image. Unfortunately, PET data generated in typical
studies are quite noisy; therefore, achieving effective control of
the negative effects of data noise is a concern of special importance
in PET reconstruction. Noise reduction in analytic reconstruction
methods is typically achieved by employing ad hoc low-pass filters.
By assuming stationary data noise (which is incorrect), Wiener filters
for reducing noise have also been developed and investigated (Ref. 27).
Generally speaking, analytic methods lack proper mechanisms for
implementing optimized handling of the nonstationary data noise
encountered in PET imaging. Second, analytic reconstruction algorithms
are based on simplified imaging models that do not take into
account most physical factors present in PET imaging. Therefore, it is
necessary to apply prereconstruction corrections so that the imaging
model for the corrected data can approximate the assumed models.
Physical factors that are uncorrected for, or are only partially corrected,
in the preprocessing can lead to image degradations, such
as image blur and spatially varying resolution, or can even cause image
artifacts. Accurate prereconstruction data corrections are often difficult
to achieve. Furthermore, such prereconstruction corrections will
deteriorate the statistical nature of the acquired data and aggravate
the aforementioned concern regarding the inferior noise-handling
capability of the analytic reconstruction methods.

Model-based approaches that can fully account for the physical
and statistical models of the PET imaging process are necessary for
achieving the best image reconstructions. Iterative reconstruction methods
are such model-based techniques. For the purpose of computation,
the image function needs to be discretized:
\[
f(\vec{x}\,) = \sum_{j=1}^{M} f_j \, b_j(\vec{x}\,), \tag{14}
\]
where $b_j(\vec{x}\,)$, $j = 1, \ldots, M$, is an expansion set. The continuous image
model given by Eq. (11) then becomes:
\[
E\{\vec{g}\,\} = H \vec{f} + \vec{s} + \vec{r}, \tag{15}
\]
where $\vec{f} = [f_1, \ldots, f_M]^t$, $\vec{s} = [s_1, \ldots, s_N]^t$, $\vec{r} = [r_1, \ldots, r_N]^t$, and $H$
is an $N \times M$ system response matrix having the elements $H_{ij} =
\int d^3x \, h_i(\vec{x}\,) b_j(\vec{x}\,)$. The probability model for $\vec{g}$ is still given by Eq. (12).
In the literature, the voxel representation, in which the image is
assumed to consist of a lattice of cubic elements containing uni-
form radioactivity within, is widely adopted for image discretiza-
tion. Other discrete image representations have also been proposed
for PET image reconstruction. It is also common for researchers to
consider the following simplified PET imaging model:
\[
E\{\vec{y}\,\} = H \vec{f}. \tag{16}
\]
In this case, one either removes scattered and random events from
the data by prereconstruction corrections, or simply ignores such
events. In addition, the Poisson model is often assumed for $\vec{y}$ even
though the model is no longer valid after data corrections. Many
iterative methods for solving the discrete PET imaging models given
by Eqs. (15) and (16) have been developed. These methods differ in
the cost functions they employ for finding solutions to these models.
They also substantially differ in their quantitative performance
characteristics (i.e. the trade-off behavior between reconstruction
accuracy and noise sensitivity), their convergence behavior, and
their computational complexity.
Iterative PET reconstruction methods include the algebraic
reconstruction techniques (ART) (Ref. 28), projection-onto-convex-sets (POCS)
techniques (Ref. 29), penalized weighted least-squares (PWLS) methods (Ref. 30),
maximum likelihood-based (ML) approaches (Ref. 31), and maximum
a posteriori (MAP) approaches (Refs. 32 and 33). In the ART methods, one observes
that the solution to Eq. (16) can be interpreted as:
\[
\vec{f} \in \bigcap_{i=1}^{N} A_i, \quad A_i = \{ \vec{x} : \vec{h}_i^{\,t}\, \vec{x} = y_i \}, \tag{17}
\]
where $\vec{h}_i = [H_{i1}, \ldots, H_{iM}]^t$. The projection operator $P_i$ that maps an
arbitrary vector $\vec{x}$ to the closest point on the hyperplane $A_i$ is equal to:
\[
P_i \vec{x} = \vec{x} + \left( y_i - \vec{h}_i^{\,t}\, \vec{x} \right) \vec{h}_i \,\big/\, |\vec{h}_i|^2. \tag{18}
\]
The ART algorithm seeks to sequentially enforce the hyperplane
constraints until convergence is reached, yielding the following algorithm:
given an initial estimate $\vec{f}^{(0)}$, the $n$-th estimate is equal to:
\[
\vec{f}^{(n)} = P_N \cdots P_1 \, \vec{f}^{(n-1)}. \tag{19}
\]
The order of the projections is arbitrary. The resulting algorithm is fast
in terms of both convergence and the computation time needed for each
iteration, but it lacks the ability to explicitly incorporate mechanisms
for handling data noise and often fails to converge when subject to
inconsistent data. The POCS techniques are generalizations of the
ART methods in which the solution image is given by:
\[
\vec{f} \in \bigcap_{i=1}^{N'} \tilde{A}_i, \quad N' > N, \tag{20}
\]
where, in addition to the $N$ hyperplane constraints given by the
measurements, the $\tilde{A}_i$ can include any convex set. Letting $\tilde{P}_i$ denote the
projection operator associated with the convex constraint set $\tilde{A}_i$,
the POCS update equation is then given by:
\[
\vec{f}^{(n)} = \tilde{P}_{N'} \cdots \tilde{P}_1 \, \vec{f}^{(n-1)}. \tag{21}
\]
Therefore, certain information regarding the data noise and solution
images, in the form of convex constraints, can be specified to alleviate
the negative effects of data noise. Convergence can still be an
issue in POCS methods.
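The ART update of Eqs. (18) and (19) amounts to the classical Kaczmarz sweep; a minimal sketch, using the simplified model of Eq. (16) and our own naming, is:

```python
import numpy as np

def art_iteration(f, H, y):
    """One full ART (Kaczmarz) sweep: sequentially project the current
    estimate f onto each measurement hyperplane h_i^t f = y_i,
    following Eqs. (18) and (19)."""
    f = f.copy()
    for h_i, y_i in zip(H, y):
        norm2 = np.dot(h_i, h_i)
        if norm2 > 0:
            f += (y_i - np.dot(h_i, f)) * h_i / norm2
    return f

# Example: a tiny 3-ray, 2-voxel consistent system
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, 3.0, 5.0])
f = np.zeros(2)
for _ in range(20):
    f = art_iteration(f, H, y)
print(f)  # converges toward [2, 3] for this consistent system
```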
In contrast to the ART and POCS methods, the ML, MAP, and PWLS
methods are statistical methods that explicitly employ the probability
distributions of the data in reconstruction. In the ML methods, the
solutions maximize the log-likelihood function given by Eq. (13),
and they are often generated by using the expectation maximization
(EM) algorithm given by:
\[
f_j^{(n)} = \frac{\displaystyle\sum_i H_{ij} \left( \frac{g_i}{\sum_{j'} H_{ij'}\, f_{j'}^{(n-1)}} \right)}{\displaystyle\sum_i H_{ij}} \; f_j^{(n-1)}, \quad j = 1, \ldots, M. \tag{22}
\]
When the measurements are strictly independent Poisson variates,
given a positive initial estimate $\vec{f}^{(0)}$, the EM algorithm is guaranteed
to converge to the ML solution (Ref. 31). The EM algorithm has a relatively
simple update equation, offers favorable quantitative performance
characteristics, and automatically enforces the voxel positivity
condition.
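A compact sketch of the EM update of Eq. (22), under the simplified model of Eq. (16) and with names and setup of our own:

```python
import numpy as np

def mlem_iteration(f, H, g, eps=1e-12):
    """One EM (MLEM) update, Eq. (22): forward project the current
    estimate, compute measured/predicted count ratios, back project
    them, and normalize by the sensitivity sum_i H_ij."""
    predicted = H @ f                       # sum_j' H_ij' f_j'
    ratio = g / np.maximum(predicted, eps)  # g_i / predicted_i
    sensitivity = H.sum(axis=0)             # sum_i H_ij
    return f * (H.T @ ratio) / np.maximum(sensitivity, eps)

# Example usage with the tiny system from the ART sketch:
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
g = np.array([2.0, 3.0, 5.0])
f = np.ones(2)
for _ in range(50):
    f = mlem_iteration(f, H, g)
print(f)  # approaches [2, 3]
```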
In the MAP approach (also called the Bayesian approach), one
seeks to maximize the a posteriori distribution $p(\vec{f}\,|\,\vec{g}\,)$ or, equivalently,
$\log p(\vec{f}\,|\,\vec{g}\,)$. According to the Bayes theorem, we have:
\[
\log p(\vec{f}\,|\,\vec{g}\,) = \log \left[ p(\vec{g}\,|\,\vec{f}\,)\, p(\vec{f}\,) / p(\vec{g}\,) \right] = l(\vec{f}\,|\,\vec{g}\,) + \log p(\vec{f}\,) + \text{constant}. \tag{23}
\]
The a priori information $p(\vec{f}\,)$ imposes smoothness conditions on
the solution (Ref. 32), or introduces structural information of the solution
derived from associated anatomical images (Ref. 33). In the literature,
MAP methods are also called penalized maximum-likelihood methods
because the prior term penalizes the log-likelihood function in
Eq. (23). Many iterative algorithms for generating the MAP estimates
have been proposed, including EM-like algorithms. Both
ML and MAP methods require exact knowledge of the probability
distributions of the data. In many practical situations (such as
with pre-corrected data), such exact probability distributions are
not available. Approximate distributions for random-corrected data
have been proposed, including the shifted Poisson model and its
variations, and the saddle-point approximation (Ref. 34). Although exact
distributions are difficult to obtain, the second-order statistics of the
corrected data can be readily derived. It is therefore attractive to
employ PWLS methods that seek to minimize the cost function (Ref. 30):
\[
\Phi(\vec{f}\,) = \frac{1}{2} (\vec{y} - H\vec{f}\,)^t \, W \, (\vec{y} - H\vec{f}\,) + \beta(\vec{f}\,)\, G(\vec{f}\,), \tag{24}
\]
where the weighting matrix $W$ is the inverse of an estimate of the
conditional variances of $\vec{y}$, $G(\vec{f}\,)$ imposes penalties for image roughness,
and $\beta(\vec{f}\,)$ provides a mechanism for preserving edge structures
in the image.
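A sketch of the PWLS cost of Eq. (24), with a simple quadratic first-difference roughness penalty standing in for G and a constant beta (both our own simplifying assumptions):

```python
import numpy as np

def pwls_cost(f, H, y, var_y, beta=0.1):
    """PWLS cost of Eq. (24) with W = diag(1/var(y)), a quadratic
    first-difference roughness penalty as G, and a constant beta
    (the original formulation allows beta to depend on f so that
    edge structures can be preserved)."""
    residual = y - H @ f
    data_term = 0.5 * np.sum(residual**2 / var_y)
    roughness = np.sum(np.diff(f)**2)
    return data_term + beta * roughness
```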
The EM algorithm is quite attractive for PET image reconstruction;
the main drawback that limits its practical usefulness is its
slow convergence rate. An important variation of the EM algorithm
is the ordered-subsets EM (OSEM) algorithm, which has been widely
adopted as the de facto standard for practical applications (Ref. 35). In this
algorithm, the data are divided into a number of disjoint subsets and
the EM algorithm is sequentially applied to these subsets to constitute
one iteration. This simple modification has been observed to
remarkably increase the algorithm’s convergence rate and, empirically,
a faster convergence rate is achieved with the use of more subsets.
Unfortunately, under certain situations the OSEM algorithm
may not converge (similar to the situations with ART and POCS).
Modifications to ensure the convergence of the OSEM algorithm
have been developed and investigated. Important examples are the
row-action maximum-likelihood (RAMLA) algorithm^36 and the convergent
OSEM (COSEM) algorithm.^37
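A hedged sketch of the OSEM subset structure, reusing the conventions of the ML-EM sketch above (forming subsets by striding over projection bins is one common choice, assumed here for simplicity):

```python
import numpy as np

def osem(H, g, n_subsets=8, n_iters=5, eps=1e-12):
    """Ordered-subsets EM: one pass over all subsets = one iteration."""
    n_bins, n_voxels = H.shape
    f = np.ones(n_voxels)
    subsets = [np.arange(s, n_bins, n_subsets) for s in range(n_subsets)]
    for _ in range(n_iters):
        for idx in subsets:
            Hs, gs = H[idx], g[idx]
            sens_s = np.maximum(Hs.sum(axis=0), eps)   # subset sensitivity
            ratio = gs / np.maximum(Hs @ f, eps)
            f *= (Hs.T @ ratio) / sens_s               # EM update on the subset
    return f
```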
It is noted that, in practice, the properties of a statistical reconstruction
method depend not only on the cost function it aims to
optimize but also on the specific updating equation it employs. This
is because the algorithm is often terminated before reaching convergence,
and in some situations multiple solutions can exist. We also
note that at convergence the EM algorithm is known to minimize
the Kullback-Leibler distance between the acquired data and the
predicted data based on the estimated solution. This interpretation
of the EM algorithm is valid irrespective of the specific data noise
model.
15.2.4 Three-Dimensional Imaging, Dynamic Imaging,
and List-Mode Reconstruction
Most modern PET systems are fully 3D systems, with the so-called
3DRP algorithm of Kinahan^38 being widely offered for performing
analytical 3D PET image reconstruction. Alternative rebinning
approaches that convert a fully 3D PET dataset to a collection of 2D
datasets associated with individual transaxial image slices have also
been developed. The conversion process can be either approximate^39
or mathematically exact.^40 Hybrid iterative reconstruction methods
that first analytically rebin 3D PET datasets to 2D datasets
and then employ 2D iterative reconstruction for achieving slice-by-slice
reconstruction have also been developed and investigated. In such
hybrid approaches, system response matrices for, and the probability
distributions of, the rebinned data need to be determined.
Generally, as expected, direct 3D iterative reconstruction produces
the best solutions. Hybrid approaches, nonetheless, greatly alleviate
the tremendous computation demands required by fully 3D itera-
tive reconstructions and provide attractive tradeoffs between image
accuracy and computation burden.
For derivations of certain local biochemical/physiological
parameters within individual voxels, dynamic PET imaging is often
performed. Conventionally, dynamic PET data are stored as a temporal
sequence of static PET data. The acquired data at each time
point, called a frame, are separately reconstructed by using the analytic or
iterative reconstruction methods described above to generate a temporal
sequence of PET images. Appropriate kinetic models are then
employed to account for the observed temporal variations of the PET
tracers within each voxel for deriving relevant parametric images.
In this conventional approach, the spatial and temporal information
available in the dynamic PET data are treated independently,
although they are not uncorrelated. In Ref. 41, Kao et al. made the
observation that the temporal information available in the dynamic
data can be exploited for greatly reducing the data noise associated
with each frame and hence significantly improving the resulting
image quality. With better frame images, more accurate
kinetic parameters regarding the PET tracer are also obtained.
Reconstruction approaches that generate parametric images directly
from the dynamic PET data have also been reported.^42
So far, we have discussed the histogram data format, in which
the accumulated event counts at individual detector pairs (i.e. at
individual LORs) of a PET scanner are presented. In contrast, the
list-mode data format presents a stream of individual event records
that are sequentially stored in the chronological order of event detection.
The list-mode data format is more versatile than the histogram
format. In principle, as much information as desired regarding the
detected events can be stored in the event records, therefore permitting
maximal utilization of the detected event information for
achieving optimized image reconstruction. Obviously, list-mode
datasets grow linearly in size with the imaging time, while the histogram
datasets have a fixed size as determined by the number of LORs
of a PET scanner. As the number of LORs in a modern high-resolution
PET system has drastically increased, the list-mode data format
has gained popularity because its advantages are starting to outweigh
the storage disadvantage. Iterative algorithms based on the
ML and MAP criteria for reconstructing list-mode PET data have
been developed.^{43,44} Methods for jointly estimating the image functions
and the temporal basis functions underlying the tracer kinetics
from list-mode data have also been investigated.^45 The combination
of list-mode data and physiological gating information also provides
excellent mechanisms for performing cardiac and respiratory
motion corrections. Readers are referred to Ref. 25 for more detailed
discussion on list-mode PET image reconstruction.
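To make the contrast with the histogram format concrete, here is a simplified sketch of a list-mode ML-EM iteration, in which the data loop runs over individual event records rather than over all LORs; this is a schematic reading of the list-mode likelihood approach, not the cited implementations, and all names are assumptions:

```python
import numpy as np

def list_mode_em(events_rows, sens, n_voxels, n_iters=20, eps=1e-12):
    """List-mode ML-EM sketch.

    events_rows : list of 1D arrays; events_rows[k][j] is the system-matrix
                  row (LOR sensitivity profile) for the k-th detected event.
    sens        : (n_voxels,) total sensitivity image, summed over ALL LORs,
                  assumed precomputed from the scanner geometry.
    """
    f = np.ones(n_voxels)
    for _ in range(n_iters):
        back = np.zeros(n_voxels)
        for h in events_rows:               # one term per detected event
            back += h / max(h @ f, eps)     # each event contributes count 1
        f *= back / np.maximum(sens, eps)
    return f
```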
15.3 IMAGE RECONSTRUCTION IN SINGLE PHOTON
EMISSION COMPUTED TOMOGRAPHY
15.3.1 Introduction
In single-photon emission computed tomography (SPECT), a radiopharmaceutical
is injected into a patient with the expectation that
it will track some functional or physiological process of interest. At
any given time, one seeks to know the 3D distribution of the tracer.
This can be achieved by employing one or more scintillation cameras
placed outside the patient, each of which records the 2D distribution
of emitted photons incident on it.
In order to form a projection image that represents a known
mapping from the 3D distribution of activity to the 2D projection,
the camera is generally equipped with a lead collimator that restricts
the angular range of photons that reach the face of the camera. In
this section, we focus on the case when a so-called parallel-hole collimator
is employed. Such collimators attempt to restrict attention to
photons that are travelling normal to the face of the camera, although
in practice they admit photons incident from a range of angles centered
around zero degrees. This acceptance cone leads to depth-dependent
resolution, and it is important to model and account for
this effect in order to obtain more accurate reconstructed images of
the activity distribution.
Another physical effect that must be accounted for is attenuation
of the photons as they travel through the patient from the point
at which they are emitted. SPECT is often performed with photons
around 140 keV, for which the attenuation coefficient in soft tissue
is approximately 0.15 cm⁻¹. In some areas of the body, such as the
abdomen, it is reasonable to assume that attenuation is uniform.
In other regions, such as the thorax, where the lungs, soft tissue,
and bone all present significantly different attenuation coefficients, a
more general model of nonuniform attenuation helps improve quantitative
accuracy.
To obtain sufficient data to invert the mapping from the 3D activity
distribution to 2D projections, the camera or cameras must be
rotated around the patient to a variety of angles. If we represent the
coordinates of the camera face by ξ and z and the angular position
of the camera by θ, then the mean of the set of measurements, which
we denote p(ξ, z, θ), can be related to the 3D activity distribution
by the following very general equation, adapted from Liang (PMB
1997), which includes the effects of nonuniform attenuation and
depth-dependent resolution:
$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\xi' \int_{-\infty}^{\infty} dz' \int_{-\infty}^{\infty} d\eta \; h(\xi - \xi',\, z - z',\, \eta)\, a_\theta(\xi', z', \eta)\, \exp\!\left[ -\int_{L_\theta(\xi', z', \eta;\, \xi, z)} \mu_\theta \, dl \right]. \quad (25) $$
Here, $a_\theta(\xi, \eta, z)$ represents the activity distribution $a(x, y, z)$ in a coordinate
system rotated by θ about the z axis:

$$ \xi = x\cos\theta + y\sin\theta, \qquad \eta = x\sin\theta - y\cos\theta, \quad (26) $$
and $\mu_\theta(\xi, \eta, z)$ represents the attenuation map $\mu(x, y, z)$ in the same
rotated coordinate system. The detector response kernel is represented
by $h(\xi, z, \eta)$, and it models blurring that is depth-dependent
but shift-invariant at a specified depth. The attenuation term is written
as a line integral through the attenuation map along the line
$L_\theta(\xi', z', \eta;\, \xi, z)$ that connects the point $(\xi', z', \eta)$ to the detector bin
$(\xi, z)$ at angle θ. This very general form accounts for the fact that
photons travelling from different portions of the field of view of a
given bin at a given depth could experience different amounts of
attenuation because of the different paths they travel along toward
the detector.

Fortunately, it is very reasonable to simplify Eq. (25) by assuming
that the attenuation experienced by the photons traveling along any
of the lines contributing to a given projection bin is the same and can
be represented by the attenuation that takes place along the central
ray of the bundle. We then obtain:
$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\xi' \int_{-\infty}^{\infty} dz' \int_{-\infty}^{\infty} d\eta \; h(\xi - \xi',\, z - z',\, \eta)\, a_\theta(\xi', z', \eta)\, \exp\!\left[ -\int_{-\infty}^{\eta} \mu_\theta(\xi', z', \eta')\, d\eta' \right]. \quad (27) $$
We will take Eq. (27) as our fundamental imaging equation and consider
the approaches that have been developed for inverting it under
a variety of special cases.
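Before turning to these special cases, it may help to see Eq. (27) in discrete form. The sketch below implements a 2D (single-slice) projector with depth-dependent Gaussian blur and central-ray attenuation, under several simplifying assumptions (unit voxel size, maps already rotated into the camera frame, camera on the η = 0 side); all names are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def project_slice(a_rot, mu_rot, sigma0=0.5, slope=0.05):
    """Discrete analogue of Eq. (27) for one view of one slice.

    a_rot, mu_rot : (n_eta, n_xi) activity and attenuation maps, already
                    rotated so the camera looks along the eta axis
                    (camera assumed on the eta = 0 side).
    sigma0, slope : depth-dependent blur, sigma(eta) = sigma0 + slope * eta.
    Returns p(xi), the blurred, attenuated projection for this view.
    """
    n_eta, n_xi = a_rot.shape
    # cumulative attenuation along the central ray toward the camera
    transmission = np.exp(-np.cumsum(mu_rot, axis=0))
    p = np.zeros(n_xi)
    for eta in range(n_eta):
        row = a_rot[eta] * transmission[eta]
        sigma = sigma0 + slope * eta      # collimator blur grows with depth
        p += gaussian_filter1d(row, sigma)
    return p
```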
15.3.2 No Attenuation, No Depth-dependent Resolution
The simplest possible case arises when one ignores the effects
of attenuation and depth-dependent resolution. Then we
obtain:

$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\eta \; a_\theta(\xi, z, \eta), \quad (28) $$
which is a stack of two-dimensional Radon transforms. This can be
inverted by use of a number of standard reconstruction algorithms,
including filtered backprojection and direct Fourier methods.
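For illustration, a compact, textbook-style filtered backprojection sketch for this case (ramp filtering in the Fourier domain followed by rotation-based backprojection; not the chapter's own code, and the rotation convention is an assumption):

```python
import numpy as np
from scipy.ndimage import rotate

def fbp_slice(sinogram, thetas_deg):
    """Filtered backprojection for a 2D Radon transform (Eq. 28 case).

    sinogram : (n_views, n_xi) array, one row per projection angle (degrees).
    """
    n_views, n_xi = sinogram.shape
    ramp = np.abs(np.fft.fftfreq(n_xi))            # |nu| ramp filter
    recon = np.zeros((n_xi, n_xi))
    for p, theta in zip(sinogram, thetas_deg):
        filtered = np.real(np.fft.ifft(np.fft.fft(p) * ramp))
        smear = np.tile(filtered, (n_xi, 1))       # backproject along eta
        recon += rotate(smear, theta, reshape=False, order=1)
    return recon * np.pi / n_views
```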
15.3.3 Uniform Attenuation Alone
The next simplest case arises when ignoring depth-dependent res-
olution and assuming that the attenuation can be represented by
a uniform attenuation coefficient within some closed boundary.
In this case, the imaging equation can be written:

$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\eta \; a_\theta(\xi, z, \eta)\, e^{-\mu[\eta + D(\xi, z, \theta)]}, \quad (29) $$

where $D(\xi, z, \theta)$ represents the distance from the point $(x = \xi\cos\theta,\; y = \xi\sin\theta,\; z)$ to the boundary in the direction of the projection. By defining
a set of modified projections:

$$ m(\xi, z, \theta) \equiv e^{\mu D(\xi, z, \theta)}\, p(\xi, z, \theta), \quad (30) $$
we obtain:

$$ m(\xi, z, \theta) = \int_{-\infty}^{\infty} d\eta \; a_\theta(\xi, z, \eta)\, e^{-\mu\eta}, \quad (31) $$

an equation generally known as the exponential Radon transform
(ERT).^46
Tretiak and Metz developed an FBP-style reconstruction formula
in which appropriately modified projections are subject to
exponentially weighted backprojection.^46 The reconstruction formula
for the activity $a(r, \phi, z)$, given in cylindrical coordinates, can be
expressed as:
$$ a(r, \phi, z) = \int_0^{2\pi} e^{\mu\eta} \left\{ \int_{|\nu_m| \ge \nu_\mu} \frac{|\nu_m|}{2}\, e^{j2\pi\nu_m \xi} \left[ \int_{-\infty}^{\infty} m(\xi', z, \theta)\, e^{-j2\pi\nu_m \xi'}\, d\xi' \right] d\nu_m \right\} d\theta, \quad (32) $$
where $\nu_\mu = \mu/2\pi$, $\xi = r\cos(\theta - \phi)$, and $\eta = r\sin(\theta - \phi)$. A number of
different analytic algorithms for inverting this imaging model were
proposed over the years. Bellini et al.^47 and Inouye et al.^48 developed
methods that worked in the spatial frequency domain to estimate the
2D Fourier transform of the unattenuated sinogram, from which the
exact image could be obtained by inverting the 2D Radon transform.
Hawkins proposed a method based on the use of circularly harmonic
Bessel transforms.^49 These algorithms are all exact in the face of perfect
data, but they propagate noise and inconsistencies differently.
In 1995, Metz and Pan^50 analyzed the 2D Fourier transform
of the 2D ERT and demonstrated that all these methods can be
interpreted as special cases of a broad class of methods. In particular,
they showed that these methods represented different choices of
weighting coefficients implicitly being used to combine redundant
data that arise due to certain Fourier-domain symmetries. Metz and
Pan also showed that a new member of the class, given by a different
choice of those weighting coefficients, had better noise properties
than those of the existing methods.^{50,51}
Specifically, the method provides a means of estimating the coefficients
of the angular Fourier series representation of the 2D Fourier
transform of $a(r, \phi, z)$ at a fixed z, which we denote $A_k(\nu_a)$, from the
2D Fourier transform (actually a combination of a 1D Fourier transform
and a 1D Fourier series expansion) of the modified data, which
we denote $M_k(\nu_m)$, by use of:

$$ A_k(\nu_a) = \omega\, \gamma^{k}\, M_k(\nu_m) + (1 - \omega)(-1)^{k}\, \gamma^{-k}\, M_k(-\nu_m), \quad (33) $$
where $\nu_m = \sqrt{\nu_a^2 + \nu_\mu^2}$, $\gamma = \sqrt{\nu_m^2 - \nu_\mu^2}\,/\,(\nu_m + \nu_\mu)$, and $0 \le \omega \le 1$
is a weight that allows the two independent estimates of $A_k(\nu_a)$
to be combined in a way that minimizes the variance of the final
image. Metz and Pan showed that the existing algorithms can be
derived by the selection of different ω, and that new algorithms can
be derived that may have noise properties superior to the existing
algorithms.^{50,51}
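A small sketch of the combination step in Eq. (33), treating the two redundant Fourier-domain estimates as already computed; the function and argument names are illustrative, and no code is given in the chapter:

```python
import numpy as np

def combine_estimates(M_k_pos, M_k_neg, k, nu_a, nu_mu, omega=0.5):
    """Combine redundant estimates of A_k(nu_a) per Eq. (33).

    M_k_pos, M_k_neg : M_k(+nu_m) and M_k(-nu_m) for angular order k.
    omega            : weight in [0, 1]; different choices reproduce the
                       existing algorithms or the lower-variance new member.
    """
    nu_m = np.sqrt(nu_a**2 + nu_mu**2)
    gamma = np.sqrt(nu_m**2 - nu_mu**2) / (nu_m + nu_mu)
    return (omega * gamma**k * M_k_pos
            + (1 - omega) * (-1)**k * gamma**(-k) * M_k_neg)
```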
15.3.4 Distance-dependent Resolution Alone
If attenuation is ignored but distance-dependent resolution effects
are modeled, then Eq. (27) becomes:
$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\xi' \int_{-\infty}^{\infty} dz' \int_{-\infty}^{\infty} d\eta \; h(\xi - \xi',\, z - z',\, \eta)\, a_\theta(\xi', z', \eta). \quad (34) $$
Appledorn^52 presented an analytic solution to this equation for the
case when h(ξ, z, η) is a Cauchy function whose width parameter
grows linearly with distance η.
15.3.5 Distance-dependent Resolution and
Uniform Attenuation
When both distance-dependent resolution and uniform attenuation
are modeled, Eq. (27) becomes:
$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\xi' \int_{-\infty}^{\infty} dz' \int_{-\infty}^{\infty} d\eta \; h(\xi - \xi',\, z - z',\, \eta)\, a_\theta(\xi', z', \eta)\, e^{-\mu[\eta + D(\xi, z, \theta)]}. \quad (35) $$
Soares derived the first analytic solution to this equation for the
case when h(ξ, z, η) is a Cauchy function.^53 Van Elmbt and Walrand^54
generalized Bellini’s method for inverting the ERT to invert Eq. (35)
for the more practical case when h(ξ, z, η) is modeled as a Gaussian
whose standard deviation grows linearly with distance η. Pan and
Metz extended the earlier work of Metz and Pan to this equation
for both the Cauchy form of h(ξ, z, η) considered by Soares and the
Gaussian form considered by van Elmbt and Walrand.^55
15.3.6 Nonuniform Attenuation Alone
When the attenuation is nonuniform and distance-dependent reso-
lution effects are ignored, Eq. (27) becomes:
$$ p(\xi, z, \theta) = \int_{-\infty}^{\infty} d\eta \; a_\theta(\xi, z, \eta)\, \exp\!\left[ -\int_{-\infty}^{\eta} \mu_\theta(\xi, z, \eta')\, d\eta' \right]. \quad (36) $$
This equation is often referred to as the attenuated Radon transform.
An approximate approach to inverting this equation was developed
by Chang.^56 The multiplicative Chang method entails calculating
the average fraction of photons that escape unattenuated from
each point in the reconstructed volume to the various detector locations.
The reconstructed image is then multiplied by the reciprocal of
this average transmission factor map to obtain the corrected image.
This correction is only approximate, but it can be refined through
an iterative process in which the corrected image is reprojected and
the resulting data compared to the measured data. The difference
between the reprojections and the measured data is used to generate
an error image by reconstruction using FBP. The error images are
corrected for attenuation by the same multiplicative correction and then added
to the original corrected image. This process can be continued for
a desired number of iterations.
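A hedged sketch of the zeroth-order multiplicative Chang correction for the uniform-attenuation case; the geometry helper path_length_to_boundary is a hypothetical placeholder, and the loop structure is illustrative only:

```python
import numpy as np

def chang_correct(recon, mu, thetas, path_length_to_boundary):
    """Zeroth-order Chang attenuation correction.

    recon : uncorrected FBP image (2D array).
    path_length_to_boundary(ix, iy, theta) -> distance (cm) from voxel
        (ix, iy) to the body boundary toward the detector at angle theta;
        a geometry helper assumed to exist.
    """
    ny, nx = recon.shape
    corr = np.zeros_like(recon)
    for ix in range(nx):
        for iy in range(ny):
            # average transmission factor over all projection angles
            t = np.mean([np.exp(-mu * path_length_to_boundary(ix, iy, th))
                         for th in thetas])
            corr[iy, ix] = recon[iy, ix] / t
    return corr
```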
The explicit solution for the attenuated Radon transform was
first presented by Novikov.^57 Natterer made significant contributions
to the theory, including a different inversion formula as
well as an alternative, simpler derivation of Novikov’s formula.^58
Kunyansky also provided a slightly modified version of Novikov’s
formula, which allowed for simpler implementation and from which
FBP and the Tretiak-Metz algorithm could easily be obtained in
the cases when attenuation is zero or uniform, respectively.^59
15.3.7 Short Scan and Region of Interest Imaging
While it is well known that inversion of the Radon transform only
requires data acquired over a 180° angular range, it was not known
until recently whether the exponential Radon transform and the
attenuated Radon transform could also be inverted from so-called
short-scan data.
Noo and Wagner, however, showed that the ERT can be inverted from
data on the angular interval θ ∈ [0, π].^60 Pan et al.^61 generalized this
result to develop a so-called π-scheme short-scan strategy, in which the
full angular range of 2π is divided into a number of nonoverlapping
angular intervals, and the data function is acquired only over disjoint
angular intervals whose summation, without conjugate views,
is equal to π. This approach does not yield an explicit inversion formula,
but it was demonstrated that an iterative algorithm is able to
generate high-quality reconstructions from π-scheme data.
As for the attenuated Radon transform, Sidky et al. were able to
show, both heuristically and rigorously, that a short scan is sufficient
here as well by adopting the so-called potato-peeler perspective to
establish that there is a two-fold redundancy in a full-scan dataset.^62
Recently, Noo et al. have developed approaches to reconstructing
ROI images from the ERT and the attenuated RT with truncated data.^63
15.4 ACKNOWLEDGMENTS
This work was supported in part by NIH grants EB00225 and
EB02765. Dr Emil Sidky was supported by NIH grant K01 EB003913. The
authors are thankful to Mr Xiao Han and Mr Dan Xia for working
on the LaTeX file of the manuscript.
References
1. Tuy HK, An inversion formula for cone-beam reconstruction, SIAM
J Appl Math 43: 546–552, 1983.
2. Grangeat P, Mathematical framework of cone-beam 3D reconstruction
via the first derivative of the Radon transform, in Herman GT, Louis
AK, Natterer F (eds.), Mathematical Methods in Tomography, Lecture Notes
in Mathematics, Springer-Verlag, Berlin, pp. 66–97, 1991.
3. Katsevich A, Analysis of an exact inversion algorithm for spiral cone-
beam CT, Phys Med Biol 47: 2583–2597, 2002.
4. Chen G, An alternative derivation of Katsevich’s cone-beam recon-
struction formula, Med Phys 30: 3217–3226, 2003.
5. Danielsson PE, Edholm P, Seger M, Towards exact 3D-reconstruction
for helical cone-beam scanning of long objects. A new detector arrangement
and a new completeness condition, in Townsend DW, Kinahan PE
(eds.), Proceedings of the 1997 International Meeting on Fully Three-
Dimensional Image Reconstruction in Radiology and Nuclear Medicine,
Pittsburgh, pp. 141–144, 1997.
6. Zou Y, Pan X, Exact image reconstruction on PI-line from minimum
data in helical cone-beam CT, Phys Med Biol 49: 941–959, 2004.
7. Zou Y, Pan X, Image reconstruction on PI-lines by use of filtered back-
projection in helical cone-beam CT, Phys Med Biol 49: 2717–2731, 2004.
8. Sidky EY, Zou Y, Pan X, Minimum data image reconstruction algo-
rithms with shift-invariant filtering for helical, cone-beam CT, Phys
Med Biol 50: 1643–1657, 2005.
9. Zou Y, Pan X, An extended data function and its backprojection onto
PI-lines in helical cone-beam CT, Phys Med Biol 49: N383–N387, 2004.
10. Zou Y, Pan X, Sidky EY, Theory and algorithms for image reconstruction
on chords and within regions of interest, J Opt Soc Am A 22: 2372–2384, 2005.
11. Hsieh J, Computed Tomography — Principles, Designs, Artifacts, and
Recent Advances, SPIE Press, Bellingham, WA, 2003.
12. Sidky EY, Pan X, Recovering a compactly supported function from
knowledge of its Hilbert transform on a finite interval, IEEE Signal
Processing Lett 12: 97–100, 2005.
13. Tam KC, Samarasekera S, Sauer F, Exact cone-beam CT with a spiral
scan, Phys Med Biol 43: 1015–1024, 1998.
14. Gambhir SS, Molecular imaging of cancer with positron emission
tomography, Nature Rev Cancer 2: 683–693, 2002.
15. Herholz K, Heiss WD, Positron emission tomography in clinical neu-
rology, Molecular Imaging and Biology 6: 239–269, 2004.
16. Gould KL, PET perfusion imaging and nuclear cardiology, J Nucl
Med 32: 579–606, 1991.
17. Alexander GE et al., Longitudinal PET evaluation of cerebral metabolic
decline in dementia: A potential outcome measure in Alzheimer’s disease
treatment studies, Am J Psychiatry 159: 738–745, 2002.
18. Tai YC, Laforest R, Instrumentation aspects of animal PET, Annu Rev
Biomed Eng 7: 255–285, 2005.
19. Weissleder R, Scaling down imaging: Molecular mapping of cancer in
mice, Nature Rev Cancer 2: 11–18, 2002.
20. Herschman HR et al., Seeing is believing: Non-invasive, quantita-
tive and repetitive imaging of reporter gene expression in living ani-
mals, using positron emission tomography, J Neurosci Res 59: 699–705,
2000.
21. Kelloff GJ, Progress and promise of FDG-PET imaging for cancer
patient management and oncologic drug development, Clinical Cancer
Research 11: 2785–2808, 2005.
22. Moses WW, Time-of-flight in PET revisited, IEEE Trans Nucl Sci 50:
1325–1330, 2003.
23. Wernick MN, Aarsvold JN (eds.), Emission Tomography: The Fundamentals
of PET and SPECT, Elsevier Academic Press, San Diego, CA, 2004.
24. Kao CM, Chen CT, Development and evaluation of a dual-head PET
system for high-throughput small-animal imaging, 2003 IEEE Nuclear
Science Symposium Conference Record, 2072–2076, 2003.
25. Qi J, Leahy RM, Iterative reconstruction techniques in emission com-
puted tomography, Phys Med Biol 51: R541–R578, 2006.
26. Huesman R, Salmeron E, Baker J, Compensation for crystal penetration
in high resolution positron tomography, IEEE Trans Nucl Sci 36: 1100–1107, 1989.
27. Shao L, Karp JS, Countryman P, Practical considerations of the Wiener
filtering technique on projection data for PET, IEEE Trans Nucl Sci 41:
1560–1565, 1994.
28. Herman GT, Meyer LB, Algebraic reconstruction techniques can be
made computationally efficient [positron emission tomography appli-
cation], IEEE Trans Med Imag 12: 600–609, 1993.
29. Wernick MN, Chen CT, Superresolved tomography by convex projec-
tions and detector motion, J Opt Soc Am A 9: 1547–1553, 1992.
30. Fessler JA, Penalized weighted least-squares image reconstruction for
PET, IEEE Trans Med Imag 13: 290–300, 1994.
31. Shepp LA, Vardi Y, Maximum likelihood reconstruction for emission
tomography, IEEE Trans Med Imag 1: 113–122, 1982.
32. Hebert T, Leahy R, A generalized EM algorithm for 3D Bayesian reconstruction
from Poisson data using Gibbs priors, IEEE Trans Med Imag
8: 194–202, 1989.
33. Ouyang X, Wong WH, Johnson VE, Hu X, Chen CT, Incorporation of
correlated structural image in PET image reconstruction, IEEE Trans
Med Imag 13: 627–640, 1994.
34. Ahn S, Fessler JA, Emission image reconstruction for randoms-precorrected
PET allowing negative sinogram values, IEEE Trans Med
Imag 23: 591–601, 2004.
35. Hudson HM, Larkin RS, Accelerated image reconstruction using
ordered subsets of projection data, IEEE Trans Med Imag 13: 601–609,
1994.
36. Browne J, De Pierro AR, A row-action alternative to the EM algorithm
for maximizing likelihoods in emission tomography, IEEE Trans Med
Imag 15: 687–699, 1996.
37. Hsiao IT, Rangarajan A, Khurd P, Gindi G, An accelerated convergent
ordered subsets algorithm for emission tomography, Phys Med Biol 49:
2145–2156, 2004.
38. Kinahan PE, Rogers WL, Analytic 3D image reconstruction using all
detected events, IEEE Trans Nucl Sci 36: 964–968, 1989.
39. Daube-Witherspoon ME, Muehllehner G, Treatment of axial data in
three-dimensional PET, J Nucl Med 28: 1717–1724, 1987.
40. Defrise M et al., Exact and approximate rebinning algorithms for 3D PET
data, IEEE Trans Med Imag 16: 167–186, 1997.
41. Kao CM, Yap JT, Mukherjee J, Wernick MN, Image reconstruction for
dynamic PET based on low-order approximation and restoration of the
sinograms, IEEE Trans Med Imag 16: 738–749, 1997.
42. Kamasak ME, Bouman CA, Morris ED, Sauer K, Direct reconstruction
of kinetic parameter images from dynamic PET data, IEEE Trans Med
Imag 25: 636–650, 2005.
43. Barrett HH, White T, Parra L, List-mode likelihood, J Opt Soc Am A 14:
2914–2923, 1997.
44. Rahmim A, Blinder S, Cheng JC, Sossi V, Statistical list-mode reconstruction
in quantitative dynamic imaging using the high resolution
research tomograph, 8th Fully 3D Meeting, 117–120, 2005.
45. Reader AJ, Sureau FC, Comtat C, Trebossen R et al., Joint estimation
of dynamic PET images and temporal basis functions using fully 4D
ML-EM, Phys Med Biol 51: 5455–5474, 2006.
46. Tretiak O, Metz CE, The exponential Radon transform, SIAM J Appl
Math 39: 341–354, 1980.
47. Bellini S, Piacentini M, Cafforio C, Rocca F, Compensation of tissue
absorption in emission tomography, IEEE Trans Acoust Speech Sig Processing
27: 213–218, 1979.
48. Inouye T, Kose K, Hasegawa A, Image reconstruction algorithm for
single-photon-emission computed tomography with uniform attenu-
ation, Phys Med Biol 34: 299–304, 1989.
49. Hawkins WG, Leichner PK, Yang NC, The circular harmonic transform
for SPECT reconstruction and boundary conditions on the Fourier
transform of the sinogram, IEEE Trans Med Imaging 7: 135–148, 1988.
50. Metz CE, Pan X, A unified analysis of exact methods of inverting the
2-D exponential Radon transform, with implications for noise control
in SPECT, IEEE Trans Med Imaging 14: 643–658, 1995.
51. Pan X, Metz CE, Analysis of noise properties of a class of exact methods
of inverting the 2-D exponential Radon transform, IEEE Trans Med
Imaging 14: 659–668, 1995.
52. Appledorn CR, An analytical solution to the nonstationary reconstruction
problem in single photon emission computed tomography, in
Ortendahl DA, Llacer J (eds.), Information Processing in Medical Imaging,
Wiley-Liss, New York, pp. 69–79, 1990.
53. Soares EJ, Byrne CL, Glick SJ, Appledorn CR et al., Implementation
and evaluation of an analytical solution to the photon attenuation and
non-stationary resolution reconstruction problem in SPECT, IEEE Trans
Nucl Sci 40: 1231–1237, 1993.
54. van Elmbt L, Walrand S, Simultaneous correction of attenuation and
distance-dependent resolution in SPECT: An analytical approach, Phys
Med Biol 38: 1207–1217, 1993.
55. Pan X, Metz CE, Analytical approaches for image reconstruction in 3D
SPECT, in Grangeat P, Amans J (eds.), 3D Image Reconstruction in Radiology
and Nuclear Medicine, Kluwer Academic Publishers, New York, pp. 103–116, 1996.
56. Chang LT, A method for attenuation correction in radionuclide com-
puted tomography, IEEE Trans Nucl Sci 25: 638–643, 1978.
57. Novikov RG, An inversion formula for the attenuated X-ray transformation,
Département de Mathématique, Université de Nantes, Nantes,
France (preprint), 2000.
58. Natterer F, Inversion of the attenuated Radon transform, Inv Probs 17:
113–119, 2001.
59. Kunyansky LA, A new SPECT reconstruction algorithm based upon
Novikov’s explicit inversion formula, Inv Probs 17: 293–306, 2001.
60. Noo F, Wagner JM, Image reconstruction in 2D SPECT with 180-degree
acquisition, Inv Probs 17: 1357–1372, 2001.
61. Pan X, Kao CM, Metz C, A family of π-scheme exponential Radon
transforms and the uniqueness of their inverses, Inv Probs 18: 825–836, 2002.
62. Sidky E, Pan X, Variable sinogram and redundant information in SPECT
with non-uniform attenuation and the uniqueness of their inverses, Inv
Probs 18: 1483–1497, 2002.
63. Noo F, Defrise M, Pack JD, Clackdoyle R, Image reconstruction from
truncated data in single-photon emission computed tomography with
uniform attenuation, Inv Probs 23: 645–667, 2007.
CHAPTER 16
Shape-Based Reconstruction from
Nevoscope Optical Images of
Skin Lesions
Song Wang and Atam P Dhawan
Optical imaging of skin lesions has been of significant interest for early
diagnosis and characterization of skin cancers. The work presented in
this chapter continues the development of an optical-imaging-based
portable system with computerized analysis to detect skin cancers,
particularly melanomas at an early, curable stage. The method developed
in this chapter can provide reconstructions of the melanin and blood contents
associated with a skin lesion from its multispectral transillumination-based
optical images. The results of simulating a skin lesion for reconstruction
of melanin and blood (hemoglobin) information from multispectral optical
images are presented. Changes in the melanin and hemoglobin contents of a
skin lesion detected over time using the proposed method would allow
early detection of malignant transformation and the development of a
cancerous lesion.
16.1 INTRODUCTION
In recent years, optical medical imaging modalities have drawn significant
attention from researchers. Visible and near-infrared light wavelengths
have been used in surface reflectance, transillumination, and
transmission based methods.^1 Optical modalities can also provide
a portable imaging system for routine screening and monitoring of
skin lesions. Optical modalities usually make use of light within the
lower-energy part of the electromagnetic spectrum; it is believed that this
kind of light does not damage the interrogated tissue, or that any
side effects are greatly reduced. In the so-called “therapeutic
window,” which includes part of the visible and infrared spectrum, physiologically
meaningful chromophores like melanin, oxyhemoglobin,
and deoxyhemoglobin have relatively low absorption coefficients.
Meanwhile, the scattering coefficients of human tissues are relatively
high in this range, resulting in a penetration depth favorable
for investigating the brain, breast, skin, etc. More importantly,
the optical coefficients of these chromophores are strongly wavelength
dependent. The spectral discrepancy among chromophores therefore gives
us a chance to reveal their distributions using multispectral light.^1
Though the work presented in this chapter is motivated by the
need to develop a computer-aided optical imaging system for
diagnosis and characterization of skin cancers, the methods presented
here are generally applicable to optical image
reconstruction.
Skin cancer is one of the fastest growing cancers.^2
The majority of skin cancers are nonmelanoma skin cancers, derived
from keratinocytes, the main cell type of the epidermis.
Malignant melanoma results from the uncontrolled growth of
melanocytes originally residing in the epidermis. Though nonmelanoma cancer
prevails among all kinds of skin cancers, malignant melanoma is
the most fatal form, accounting for 90% of skin cancer deaths.^2 It is fatal if
not detected in early stages, yet it can be cured with a nearly 100% survival
rate if removed at an early stage. Malignant melanoma is
currently diagnosed by dermatologists according to its color and
morphology.^2 However, such a diagnostic process is to a large extent
subjective, and diagnostic accuracy rests on the dermatologist’s
individual experience. There is an urgent need for a
noninvasive modality that reveals physiological features of malignant
melanoma quantitatively and objectively, so that even a less experienced
dermatologist is able to make the right decision.
As an effective utility for diagnosing malignant melanoma, a
light-based device should be able to provide both morphological
and physiologic information. Morphological information may be
utilized to determine the depth of invasion of malignant melanoma,
while physiologic characteristics, like the distribution of melanin
and blood vessels, are essential to differentiate it from a benign
lesion. Within the visible and infrared spectra, the major chromophores of
malignant melanoma include melanin, oxyhemoglobin, and deoxyhemoglobin.
The absorption of water is negligible compared to these
major ones. Among these chromophores, melanin presents a higher
absorption than oxyhemoglobin and deoxyhemoglobin. Also, their
absorption spectra are not linearly dependent. Hence, it is possible
to uncover the distributions of these chromophores by multispectral
optical measurement. As mentioned before, malignant melanomas
have a blood-free, melanin-rich core and a hemoglobin-rich peripheral
blood net. Once the distributions of chromophores are rendered, the
lesion structure is available simultaneously. Having investigated the physical
properties of malignant melanoma under visible and infrared
light, it is clear that an optical device would be a reasonable choice
for imaging malignant melanoma.
An optical transillumination imaging device, the Nevoscope, is
used for imaging skin lesions. It was introduced by Dhawan^3 for
noninvasive diagnosis of malignant melanoma and other skin cancers.
In its transillumination mode, light is directed by a channel at 45°
with respect to the normal of the skin and enters the skin through a ring light
source. The re-emerging light is captured by a CCD camera and
forms the transillumination image. This image contains information
about the underlying optical properties. Within the optical tomography
framework, it is possible to retrieve two key signatures of malignant
melanoma, the spatial distributions of melanin and blood, from optical
reflectance measurements.
16.2 OPTICAL IMAGING METHODS
During the last several decades, various optical imaging modalities
have been developed.^10 These methods can be divided into five categories:
surface imaging, fluorescence imaging, optical coherence
tomography (OCT), optical spectroscopy, and optical tomography (OT).
January 22, 2008 12:3 WSPC/SPI-B540:Principles and Recent Advances ch16 FA
396
Song Wang and Atam P Dhawan
16.2.1 Surface Imaging
Surface imaging methods provide a specific light source to illuminate
the surface of the skin and skin lesion. Surface-reflectance-based
measurements are then stored as a high-resolution image using a
high-resolution CCD digital camera via a magnification optical lens.
For example, the “Dermoscope” has been used for surface-reflectance-based
imaging of skin lesions.^{2,3} A better accuracy for detecting
melanoma can be obtained through the use of the Epiluminescence
Light Microscopy (ELM) imaging method, where the reflection of
the surface light is reduced by either an oil-glass interface on the
skin or cross-polarization of the surface and reflected light to cancel
the surface reflection.^2
2
Dermoscopyutilizes surface reflectance dom-
inant illumination methods that are to render the skin translucent
and thereby allowing for the visualization of subsurface structures
and colors. These subsurface structures and colors in combination
with their location and distribution (pattern) have been shown to
improve a clinician’s ability to detect early melanoma and basal
cell carcinoma. Dermoscopy can be performed utilizing polarized
or nonpolarized light. Cross-polarization method for epilumines-
cence uses linear polarizers in the incident light and a viewing lens
to cancel the light that is reflected from the skin. Since most of the
reflected light from the skin surface has, for the most part, the same
polarization angle as the incident light, cross-polarization blocks
most of the surface reflected light and only the light that is diffused
below the skin surface is visualized.
A novel optical imaging system, the Nevoscope, which uses
transillumination as well as a combination of surface illumination
and transillumination, has been developed by Dhawan to provide
images with significant information about subsurface skin-lesion
pigmentation architecture.^{3,4} The Nevoscope consists of a digital CCD
camera hooked up to a zoom lens and a customized optical assembly
to obtain surface and/or transillumination-based images of the skin
and skin lesion. In the Nevoscope transillumination method, light
is transmitted into the skin area surrounding the lesion at a 45° angle.
A virtual light source is thus created a few millimeters below the
skin surface for uniform transillumination of a small area of the skin
containing the skin lesion. In such a side-transillumination method,
no surface light is used. An annular transillumination ring provides
fiber-optics-directed light to illuminate the region of interest uniformly.
The skin lesion is positioned inside the transillumination ring
through the opening, providing a direct field of view to the digital
camera through a zoom lens assembly. The light from the illuminator
ring that is not reflected back due to a mismatch in refractive indices
enters the skin and goes through multiple internal reflections
and scattering. This light eventually gets diffused across the layers of
the skin, and back-scattered diffused light photons emerge from the
skin to form a transilluminated image of the skin and skin lesion. For
surface illumination, additional fiber-optics-directed point sources
distributed around the internal wall of the Nevoscope are provided
to reflect light through the surface of the skin lesion. The surface
light intensity can be adjusted and is polarized. Another polarizing
lens (cross-polarized by 90°) is used with the cross-polarization method
for the imaging of skin lesions. The Nevoscope, by virtue of its design,
provides three different ways of imaging a skin lesion.
Besides using the pigment and color information from surface
reflectance, optical models relating the reflectance measurements
to underlying optical properties have been developed.
For instance, Claridge et al.^15 use a Kubelka-Munk model to simulate
the formation of color images of melanoma. They eventually
are able to recover the blood and melanin distributions in various skin
layers. The Kubelka-Munk model is basically a one-dimensional theory.
Using this model and multispectral imaging, the usually ill-posed,
underdetermined inverse problem encountered in optical tomography
is made tractable for image reconstruction.
16.2.2 Fluorescence Imaging
Fluorescence imaging uses ultraviolet light to excite fluorophores
and collects the emitted light at a higher wavelength. Fluorophores
include endogenous and exogenous types. The former refers
to natural fluorophores intrinsic to the skin, such as amino acids
and structural proteins. These fluorophores are randomly distributed
in skin, so directly reconstructing their distributions is meaningless,
if not impossible. The latter usually refers to smart polymer
nanoparticles targeting specific molecules like hemoglobin.
Due to disparities in metabolic state, some kinds of
exogenous fluorophores have unique distributions in malignant
melanoma compared to those in normal tissue, which may suggest
the presence of a cancer. Fluorescence imaging has a mechanism
similar to that of single-photon emission computed tomography (SPECT):
the purpose is to recover a source distribution given a boundary measurement.
Promising results have been obtained by several groups.^{16,17}
16.2.3 Optical Coherence Tomography
The relatively new modality OCT makes use of the coherence properties
of light.^18 In an OCT system, light with a low coherence length
is divided into two parts. One serves as a reference while the other
is directed into the tissue. When light travels in the tissue, it encounters
interfaces with different refractive indices, and part of the light is
reflected. This reflected light is subsequently mixed with the reference.
When the difference in optical path length between the reference light
and the reflected light is less than the coherence length, interference
occurs. By observing the interference pattern and changing the optical
path length of the reference light with a mirror, a cross section of the skin
can be rendered.
With a sufficiently low coherence length, the resolution of OCT
may reach the micrometer scale and hence can disclose subtle
changes in cancerous tissue at a cellular level. OCT recovers the structure
of the interrogated tissue in a mechanism analogous to ultrasonic
imaging: the latter modality sends sound waves into the tissue, and the
sound waves reflect when encountering impedance-varying interfaces.
However, the resolution of OCT is much higher than that of ultrasonic
imaging.
16.2.4 Optical Spectroscope
Another optical imaging modality is the optical spectroscope. Its
application dates back several decades, to when the spectroscope was
first used to evaluate blood oxygenation. The spectroscope samples
the investigated tissue using a reasonably distanced detector and source.
It measures re-emerging light at multiple wavelengths, usually ranging
from the visible to the infrared spectrum. The measured absorption spectrum
is a direct reflection of what happens within the sampling
volume.

In skin, the absorption spectrum is the overlapping effect of several
chromophores. However, recognizing the fractions of these chromophores
is difficult. The common assumption is that the chromophores
are homogeneously distributed in the sampling volume.
In fact, this is not true for skin, a complex and heterogeneous layered
tissue. Even the further assumption that the chromophores in each skin
layer are homogeneous is against reality. Having said that, the absorption
spectrum is indeed related to the underlying composition of the tissue.
It may therefore provide significant signatures of some diseases
by itself.
Malignant melanoma contains more melanin and blood than
normal tissue, hence more light is absorbed, and the absorption varies
according to the characteristics of melanin and hemoglobin. Studies show
that the absorption spectrum of malignant melanoma differs significantly
from that of normal tissue. Features extracted from the spectrum
can be subsequently used to identify malignant melanoma.
Tomatis et al.^19 used an artificial neural network as the classifier. They
reported a sensitivity of 80.4% and a specificity of 75.6% in 1391 cases,
of which 184 were melanoma. A study based on multivariate discriminant
analysis^20 also shows promising results.
16.2.5 Optical Tomography
When discussing optical tomography, we refer to optical
imaging systems that aim to reconstruct spatially resolved internal optical
properties from multiple source-detector channels. Though some optical
tomography systems borrow ideas from other well-established
tomography systems like CT, the fundamental design concept
of optical devices may deviate greatly from these well-established
ones. The characteristics of the system are related to the configuration
of the source-detector channels. They are also related to the tissue, since the
optical properties of the tissue affect light transport and consequently
the spatial sensitivity of the system. The reason is
that, unlike the straight-line trajectories of CT, light becomes diffuse
in tissue. Both the system configuration and the tissue optical properties
change the trajectories of light photons.

The reconstruction of optical properties differs from most well-established
modalities in that it usually does not have access to
standard algorithms such as filtered backprojection.
Given the underdetermined and ill-posed nature of the inverse problem,
an optical reconstruction algorithm seeks an optimal solution
from numerous possible ones satisfying some priors. Optical
reconstruction algorithms are far from perfect, and they remain
a hot research topic in the optical imaging community. As for the Nevoscope,
because of its reflectance geometry, the measurements of the
source-detector channels are highly dependent; that is, the number of effective
measurements is greatly reduced. In other words, there are fewer
constraints on the parameter space under this scenario, which makes
the inverse problem even harder to solve. The shape-based multiconstraint
algorithm presented in this chapter has several advantages
over conventional voxel-by-voxel approaches: it has fewer
parameters and more constraints, and it uses a global method to
search the parameter space. The algorithm is illustrated and
discussed in the rest of the chapter.
16.3 METHODOLOGY: SHAPE-BASED OPTICAL
RECONSTRUCTION
Figure 1 shows a flow chart of the proposed reconstruction method
in terms of Nevoscope transillumination images. Firstly, the overall
goal is to minimize the difference between the real measurement and
the predicted measurement. This minimization problem is solved
by a genetic algorithm, which offers a global search. Secondly, a linearized
forward model is adopted and evaluated by Monte Carlo
simulation in terms of typical optical properties of normal skin.
Thirdly, the malignant melanoma is represented by the shapes of its
melanin part and blood part; these parameters are fed into the genetic
algorithm.
[Fig. 1. Flow chart of the proposed method (blocks: Normal Skin Image, Skin Lesion Image, ΔM_real, Normal Skin Optical Properties, Monte Carlo Simulation, Jacobian Matrix, ΔM_cal, Sampling Function, Predicted Shape, Genetic Algorithm).]
16.3.1 Forward Modeling
To develop an optical tomographic system, a forward model is
required to relate the measurement to the optical properties of the
investigated tissue. Regardless of the imaging geometry in use,
an optical system may be described as:

$$ M = F(x), \quad (1) $$

where M is the measurement, F is a forward model, and x is a distribution
of unknown optical properties.
Given a reasonable initial guess $x_0$ of the background optical
properties, we may expand Eq. (1) into:

$$ M = F(x_0) + F'(x_0)(x - x_0) + \tfrac{1}{2} F''(x_0)(x - x_0)^2 + \cdots, \quad (2) $$
where $F'$ and $F''$ are the first-order and second-order Fréchet derivatives,
respectively.
Let $\Delta M_{\mathrm{cal}} = M - F(x_0)$ and $\Delta x = x - x_0$; Eq. (2) may be rearranged
as:

$$ \Delta M_{\mathrm{cal}} = F'\, \Delta x + \tfrac{1}{2} F''\, \Delta x^2 + \cdots. \quad (3) $$

The discrete form of Eq. (3) becomes:

$$ \Delta \vec{M}_{\mathrm{cal}} = J\, \Delta x + \tfrac{1}{2} H\, \Delta x^2 + \cdots. \quad (4) $$
Here, J is the Jacobian matrix and H is the Hessian matrix; $\Delta \vec{M}_{\mathrm{cal}}$
is the measurement vector, and $\Delta x$ is the vector of variations
from the background $x_0$.
Neglecting higher-order terms in Eq. (4), we get a simplified
linear system:

$$ \Delta \vec{M}_{\mathrm{cal}} = J\, \Delta x. \quad (5) $$
The formulation in terms of Eq. (5) leads to linear optical tomography,
which is also known as “difference imaging.” That is, two
measurements are taken: one for the background tissue (that is, $x_0$)
and one for the abnormal tissue (that is, the unknown x). Their difference
is then fed to the reconstruction algorithm to obtain the optical properties.
In this study, the linear approach is adopted for the Nevoscope,
and the Jacobian matrix is extracted by Monte Carlo simulation in
terms of a seven-layer optical skin model.
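As an illustration of this difference-imaging step, the sketch below solves Eq. (5) by regularized least squares once the Jacobian has been tabulated; this generic solver stands in for the shape-based inversion developed in the following sections, and all names are assumptions:

```python
import numpy as np

def difference_image(J, M_lesion, M_normal, lam=1e-3):
    """Linear difference imaging, cf. Eq. (5): solve dM = J dx for dx.

    J        : (num_measurements, num_voxels) Jacobian from Monte Carlo.
    M_lesion, M_normal : measurement vectors for lesion and background.
    lam      : Tikhonov regularization weight (the problem is ill-posed).
    """
    dM = M_lesion - M_normal
    JtJ = J.T @ J + lam * np.eye(J.shape[1])
    return np.linalg.solve(JtJ, J.T @ dM)   # regularized least squares
```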
16.3.2 Shape Representation of Skin-Lesions
A variety of shape representation methods have been adopted by
authors working on optical tomography. In their 2D shape-based
reconstruction, Kilmer et al.^5 used a B-spline curve to describe
the 2D shape. Babaeizadeh^6 used tensor-product B-splines to create
a 3D heart shape when studying electrical impedance tomography.
A B-spline curve can describe a complex shape
with only a few control points. Kilmer^7 later used an ellipsoid in their
3D study. To fully determine the ellipsoid, they needed three parameters
to represent the centroid, three parameters to represent the
lengths of the semiaxes, and three parameters to represent the orientation
of the ellipsoid. The advantage of their approach is that only
nine parameters are required, which is a quite small number of
parameters in 3D geometry. However, this simplification makes it
impossible to describe more complex 3D shapes. Zacharopoulos^8
used a spherical harmonic representation in their 3D study and showed
that an eleventh-degree spherical harmonic representation can describe
the shape of a neonatal head fairly well.

A melanoma basically is a 3D object. Therefore, a 3D description
like spherical harmonics is appropriate. In addition, we can
observe malignant melanoma one step further. Malignant melanoma
is a result of the uncontrolled replication of melanocytes sitting in
the basal layer of the epidermis. The shape of the melanoma is hence
bounded by the epidermis layer. All we need to describe the lesion
is therefore reduced to 2D surfaces in the 3D domain.
In order to represent malignant melanoma with 2D surfaces,
we break it into two parts: the melanin part and the blood part.
The melanin part is a 3D region bounded by a single surface and
the epidermis layer. Within this region, the optical properties are
constant and the only absorber is melanin. Furthermore, a second
surface, which sits below the first surface, is used to represent the
blood part. The region bounded by the first surface and the second
surface is blood only. This model mimics the deteriorated lesion, and
its x-z cross section is shown in Fig. 2.
Let the first surface be represented as $f_1(x, y)$, which corresponds
to the depth of the lesion from the epidermis layer at the position (x, y).
The idea is to represent the continuous surface with a limited number of parameters.
Firstly, we lay an N×N rectangular grid over the epidermal
layer. Secondly, the function $f_1(x, y)$ is sampled to N×N discrete values
$f_{d1}(X, Y)$. Here, (x, y) is continuous and (X, Y) denotes the N×N
discrete sampling positions. Thirdly, the discrete values are interpolated
by the cubic tensor-product B-spline, which satisfies the following
condition:

$$ f_{d1}(X, Y) = \sum_{i=1}^{N} \sum_{j=1}^{N} c_1(i, j)\, \beta^3(X - i,\, Y - j). \quad (6) $$
[Fig. 2. Shape representation of malignant melanoma: x-z cross section showing the stratum corneum, epidermis, dermis, and fat layers, with the melanin region bounded by f₁(x, y) and the blood region between f₁(x, y) and f₂(x, y).]
The original function $f_1(x, y)$ can then be approximated by:

$$ f_{B1}(x, y) = \sum_{i=1}^{N} \sum_{j=1}^{N} c_1(i, j)\, \beta^3(x - i,\, y - j), \quad (7) $$

where

$$ \beta^3(x - i,\, y - j) = \beta^3(x - i)\, \beta^3(y - j) \quad (8) $$

is the tensor product of the one-dimensional cubic B-spline bases $\beta^3(x - i)$
and $\beta^3(y - j)$, and $c_1(i, j)$ are the B-spline coefficients.
Similarly, the second surface can be defined by N×N discrete
values $f_{d2}(X, Y)$ or, equivalently, by $z_d(X, Y) = f_{d2}(X, Y) - f_{d1}(X, Y)$, which
is the thickness of the blood region between the first surface and the
second surface.
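A brief sketch of evaluating such a tensor-product cubic B-spline surface from an N×N sample grid, using scipy's spline machinery as an assumed stand-in for Eqs. (6)-(7) written out explicitly:

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def surface_from_samples(f_d1, n_eval=64):
    """Fit a cubic tensor-product spline through N x N surface samples
    f_d1(X, Y) and evaluate it on a finer grid (cf. Eqs. 6-7). Assumes N >= 4."""
    N = f_d1.shape[0]
    knots = np.arange(N)
    spline = RectBivariateSpline(knots, knots, f_d1, kx=3, ky=3)
    xs = np.linspace(0, N - 1, n_eval)
    return spline(xs, xs)    # (n_eval, n_eval) continuous surface values
```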
16.3.3 Reconstruction Algorithm
To reconstruct the surfaces and the piecewise-constant optical properties,
the continuous surface representation must be incorporated
into the forward photon transport model. In our study, the linearized
forward model (Eq. 5) is kept intact, and the continuous representation
is sampled into the discrete vector of unknowns Δx.
That is:

$$ \Delta \vec{M} = J \cdot \Delta x = J \cdot S\big(f_{d1}(X, Y),\; mp1,\; z_d(X, Y),\; bp2\big), \quad (9) $$

where S(·) is the sampling function that converts the continuous
shape representation to the voxel-based optical properties of the forward
model, mp1 is the fraction of melanin, and bp2 is the fraction of
blood.
The inverse problem can therefore be formulated as minimizing
the objective function:

$$ F_{\mathrm{obj}} = \frac{1}{2} \left\| \Delta \vec{M}_{\mathrm{real}} - J \cdot \Delta x \right\|^2 = \frac{1}{2} \left\| \Delta \vec{M}_{\mathrm{real}} - J \cdot S\big(f_{d1}(X, Y),\; mp1,\; z_d(X, Y),\; bp2\big) \right\|^2. \quad (10) $$

Now the unknowns of this inverse problem are reduced to
$f_{d1}(X, Y)$, $z_d(X, Y)$, mp1, and bp2.
The multispectral shape reconstruction using N wavelengths
can be formulated as a multiobjective optimization problem whose
objective function is given as:

$$ F_{\mathrm{obj}} = \alpha_1 \cdot F_{\mathrm{obj}}^{\lambda_1} + \alpha_2 \cdot F_{\mathrm{obj}}^{\lambda_2} + \cdots + \alpha_N \cdot F_{\mathrm{obj}}^{\lambda_N}, \quad (11) $$

where $\{\alpha_1, \alpha_2, \ldots, \alpha_N\}$ is a set of coefficients that balance the contributions
from the different single-wavelength objective functions.
In our study, we use a genetic algorithm to solve the optimization
problem^8 for the following reasons. First, in a genetic algorithm
the gradient need not be evaluated, which simplifies the computation
and improves reliability. Second, the genetic algorithm is one of the most
popular methods used to seek a global minimum. Third, among
global optimization techniques, the genetic algorithm provides a reasonable
convergence rate due to its implicit parallel computation.
Its elements include the fitness function, coding of the chromosome,
reproduction, crossover (breeding), and mutation.

For the optimization problem occurring in the shape-based
reconstruction, the objective function (Eq. 11) is selected as the fitness
function, and its parameters are coded into a chromosome like:

$$ \big[\, f_{d1}(X_1, Y_1) - f_{d1}(X_2, Y_2) - \cdots - f_{d1}(X_{N\times N}, Y_{N\times N}) - mp1 - z_d(X_1, Y_1) - z_d(X_2, Y_2) - \cdots - z_d(X_{N\times N}, Y_{N\times N}) - bp2 \,\big]. \quad (12) $$
Reproduction is governed by the “roulette wheel” rule, and
crossover and mutation events occur according to predefined
probabilities.
Before running the optimization algorithm, we may add
some reasonable constraints to the parameters. Firstly, the region
of support of the lesion in the x-y plane is readily available from
its surface appearance. This further reduces the number of parameters
representing a surface from N×N to a smaller set; as a consequence,
the optimization algorithm converges faster. Secondly,
the blood region is typically a thin layer within a few hundred
micrometers, which puts a constraint on $z_d(X, Y)$. Thirdly, the fractions
of melanin and blood are not free parameters; they can also
be bounded according to the appearance of the melanoma and clinical
experience. Lastly, multispectral imaging provides implicit constraints:
given the distinct absorption spectra of blood and melanin,
a reasonable solution must satisfy the measurements at all involved
wavelengths. A sketch of the resulting genetic-algorithm loop is given
below.
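The following is a hedged sketch of such a real-coded genetic algorithm with roulette-wheel reproduction, arithmetic crossover, and bounded mutation; it is a generic skeleton, not the authors' implementation, and the objective passed in would be Eq. (11):

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_minimize(objective, lower, upper, pop_size=60, n_gens=200,
                p_cross=0.8, p_mut=0.05):
    """Minimize `objective` over box constraints [lower, upper] (arrays)."""
    dim = len(lower)
    pop = rng.uniform(lower, upper, size=(pop_size, dim))
    for _ in range(n_gens):
        costs = np.array([objective(c) for c in pop])
        fitness = costs.max() - costs + 1e-12        # lower cost -> fitter
        probs = fitness / fitness.sum()
        # roulette-wheel reproduction
        parents = pop[rng.choice(pop_size, size=pop_size, p=probs)]
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):          # arithmetic crossover
            if rng.random() < p_cross:
                w = rng.random()
                a, b = parents[i], parents[i + 1]
                children[i], children[i + 1] = w*a + (1-w)*b, w*b + (1-w)*a
        mut = rng.random(children.shape) < p_mut     # bounded mutation
        children[mut] = rng.uniform(np.broadcast_to(lower, children.shape)[mut],
                                    np.broadcast_to(upper, children.shape)[mut])
        pop = children
    return pop[np.argmin([objective(c) for c in pop])]
```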
16.3.4 Phantom and Error Evaluation
To validate the shape-based multispectral algorithm, a double-surface
phantom is created to represent malignant melanoma. The
first and second surfaces are described by mixed Gaussian functions
given as:
$$ f_1(x, y) = \mathrm{MAX}\big(peak1 \cdot G(x, y, \mu_{1a}, \mu_{2a}, \sigma_a),\; peak2 \cdot G(x, y, \mu_{1b}, \mu_{2b}, \sigma_b)\big), $$
$$ f_2(x, y) = \mathrm{MAX}\big(peak3 \cdot G(x, y, \mu_{1a}, \mu_{2a}, \sigma_a),\; peak4 \cdot G(x, y, \mu_{1b}, \mu_{2b}, \sigma_b)\big), \quad (13) $$

where the Gaussian function is:

$$ G(x, y, \mu_1, \mu_2, \sigma) = \frac{1}{2\pi\sigma^2} \exp\!\left[ -\frac{(x - \mu_1)^2 + (y - \mu_2)^2}{2\sigma^2} \right]. \quad (14) $$
And the parameters used in Eq. (13) are:

$$ \begin{aligned}
\mu_{1a} &= 0.0375 \times 2 \text{ cm}, & \mu_{2a} &= -0.0375 \times 3 \text{ cm},\\
\mu_{1b} &= -0.0375 \times 2 \text{ cm}, & \mu_{2b} &= 0.0375 \times 3 \text{ cm},\\
\sigma_a &= 0.0375 \times 2.5 \text{ cm}, & \sigma_b &= 0.0375 \times 2 \text{ cm},\\
peak1 &= 100 \times 6\ \mu\text{m}, & peak2 &= 100 \times 4\ \mu\text{m},\\
peak3 &= 100 \times 8\ \mu\text{m}, & peak4 &= 100 \times 6\ \mu\text{m}.
\end{aligned} \quad (15) $$
The fraction of melanin is set to 5% between the epidermal layer
and the first surface $f_1(x, y)$, and the fraction of blood is set to 20%
between the first surface $f_1(x, y)$ and the second surface $f_2(x, y)$. This
model has sufficient variation to verify the reliability of the
reconstruction algorithm. Figure 3(A) displays a 3D view of this
model.
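A short sketch of generating this double-surface phantom from Eqs. (13)-(15); the grid extent and resolution are assumptions made for illustration, and peak values are converted from micrometers to centimeters:

```python
import numpy as np

def gauss2d(x, y, mu1, mu2, sigma):
    """Eq. (14): normalized 2D Gaussian."""
    return np.exp(-((x - mu1)**2 + (y - mu2)**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

# x-y grid in cm (extent assumed for illustration)
x, y = np.meshgrid(np.linspace(-0.3, 0.3, 81), np.linspace(-0.3, 0.3, 81))

mu1a, mu2a = 0.0375 * 2, -0.0375 * 3
mu1b, mu2b = -0.0375 * 2, 0.0375 * 3
sa, sb = 0.0375 * 2.5, 0.0375 * 2
pk1, pk2, pk3, pk4 = 600e-4, 400e-4, 800e-4, 600e-4   # 100 x k um, in cm

f1 = np.maximum(pk1 * gauss2d(x, y, mu1a, mu2a, sa),
                pk2 * gauss2d(x, y, mu1b, mu2b, sb))   # Eq. (13), first surface
f2 = np.maximum(pk3 * gauss2d(x, y, mu1a, mu2a, sa),
                pk4 * gauss2d(x, y, mu1b, mu2b, sb))   # second surface
blood_thickness = f2 - f1                              # z_d(X, Y)
```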
To further evaluate the reconstruction result, we introduce the volume deviations Volerr1 and Volerr2. They are defined as:

Volerr1 = |vol1_c − vol1_m| / vol1_m.   (16)

Here, vol1_c is the calculated volume bounded by the first surface and vol1_m is the corresponding volume from the model:

Volerr2 = |vol2_c − vol2_m| / vol2_m.   (17)

Here, vol2_c is the calculated volume bounded by the first and second surfaces and vol2_m is the corresponding volume from the model.
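Computing these deviations for surfaces sampled on a regular grid is straightforward. The following sketch assumes the surfaces are stored as depth arrays (the names f1_rec and f1_model are hypothetical) with grid spacings dx and dy:

import numpy as np

def volume_under(depth, dx, dy):
    # Riemann-sum volume bounded by a sampled depth surface
    return depth.sum() * dx * dy

def volerr(vol_calc, vol_model):
    # Relative volume deviation, Eqs. (16) and (17)
    return abs(vol_calc - vol_model) / vol_model

# e.g. Volerr1 = volerr(volume_under(f1_rec, dx, dy),
#                       volume_under(f1_model, dx, dy));
# Volerr2 uses the volume between the surfaces, i.e. depth arrays f2 - f1.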
16.4 RESULTS AND DISCUSSION
We select 580 nm and 800 nm to validate the reconstruction algorithm since, at these two wavelengths, the absorption of oxy- and deoxyhemoglobin is equivalent. In addition, the absorption of melanin and blood differs considerably between the two wavelengths, which provides excellent constraints on the solution. First, the double-surface continuous model is sampled and two "real" measurements, M_580 and M_800, are calculated by Monte Carlo simulation at 580 nm and 800 nm, respectively. Next, a nine-by-nine rectangular grid is overlaid on the epidermal layer. The region of support of the lesion is counted as 10 discrete control points.

Fig. 3. Reconstruction results: (A) double-surface model; (B–E) reconstructed surfaces with different constraints.

As a result, the chromosome contains 22 genes and is coded as:
[f_{d1}(X_1, Y_1) − f_{d1}(X_2, Y_2) − ··· − f_{d1}(X_{10}, Y_{10}) − mpl
 −z(X_1, Y_1) − z(X_2, Y_2) − ··· − z(X_{10}, Y_{10}) − bp2].   (18)
The fitness function of the genetic algorithm is given as:

F_obj = α_1 F_obj^{580} + α_2 F_obj^{800},   (19)

where, in our simulation, α_1 is 0.3 and α_2 is 1.
Four simulations with different constraints were implemented in a real-number-coded genetic algorithm, and the results are summarized in Table 1. The recovered surfaces are displayed in Figs. 3(B–E); in each case, the left is the first surface and the right is the second surface. There is no constraint on the first surface, while the thickness between the first and second surfaces is set to 300 µm to represent a thin layer of the blood net. In addition, the deformation process of the surfaces during optimization is shown in Fig. 4.

As Table 1 shows, the constraints have a significant impact on the reconstructed surfaces. Except for the most loosely constrained case (E), all cases present reasonable reconstructions that are consistent with the model. Moreover, the reconstructed first surface has a smaller volume error than the second surface. There are several reasons for the larger volume error of the second surface. Firstly, the absorption coefficient of blood is smaller than that of melanin; as a result, changes in the blood region contribute less to the fitness function. Secondly, since a reflectance geometry is adopted in the Nevoscope, the sensitivity decreases at deeper layers, which also affects the accurate reconstruction of the second surface. Thirdly, because the two surfaces are attached together, error from the first surface inevitably propagates to the second surface. In the worst case (E), a large error is observed: the fractions of melanin and blood are underestimated, which is associated with overestimated volumes. It is therefore still a reasonable result for the optimization problem.

Table 1. Summary of Reconstruction Results

Case   Melanin      Blood        Recovered     Recovered   Volerr1   Volerr2
       Bounds (%)   Bounds (%)   Melanin (%)   Blood (%)   (%)       (%)
(b)    5–5          20–20        5             20          2.68      16.58
(c)    4.5–5.5      10–30        5.10          18.33       3.81      20.89
(d)    4–6          10–30        4.64          18.97       5.18      24.71
(e)    3–7          10–30        3.37          10.08       44.09     63.50

Fig. 4. Deformation process during optimization (from left to right and from top to bottom): (A) the first surface; (B) the second surface.
16.5 CONCLUSION
A shape-based reconstruction method using a genetic algorithm has been presented in this chapter. Though the reconstruction algorithm has been described for optical images of skin lesions for the detection of malignant melanoma, the framework of shape-based image reconstruction can be applied to other optical imaging applications.
16.6 ACKNOWLEDGMENTS
This research was partially funded by grants from the George A Ohl Jr Trust Foundation and the Gustavus and Louise Pfeiffer Research Foundation. The work presented in this chapter is also a part of the doctoral dissertation work of Song Wang, with Dr Atam Dhawan as his Ph.D. advisor.
References
1. Abramovits W, Stevenson LC, Changing paradigms in dermatology: New ways to examine the skin using noninvasive imaging methods, Clinics in Dermatol 21: 353–358, 2003.
2. Bashkatov AN, Genina EA, et al., Optical properties of human skin, subcutaneous and mucous tissues in the wavelength range from 400 to 2000 nm, Journal of Physics D: Applied Physics 38: 2543–2555, 2005.
3. Dhawan AP, Gordon R, Rangayyan RM, Nevoscopy: Three-dimensional computed tomography for nevi and melanoma by transillumination, IEEE Trans on Medical Imaging MI-3(2): 54–61, 1984.
4. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and classification of melanoma using fuzzy membership based partitions, Computerized Medical Imaging and Graphics 29: 287–296, 2005.
5. Kilmer ME, Miller EL, Boas D, et al., A shape-based reconstruction technique for DPDW data, Optics Express 7(13): 481–491, 2000.
6. Babaeizadeh S, Brooks DH, Isaacson D, A deformable-radius B-spline method for shape-based inverse problems, as applied to electrical impedance tomography, Proc IEEE Int Conf Acoustics, Speech, and Signal Processing (ICASSP '05), 2005.
7. Kilmer ME, Miller EL, et al., Three-dimensional shape-based imaging of absorption perturbation for diffuse optical tomography, Applied Optics 42(16): 3129–3144, 2003.
8. Zacharopoulos A, Arridge S, Dorn O, et al., 3D shape reconstruction in optical tomography using spherical harmonics and BEM, Progress in Electromagnetics Research Symposium 2006.
9. Houck C, Joines J, Kay M, A genetic algorithm for function optimization: A Matlab implementation, NCSU-IE Technical Report 95-09, 1995.
10. Balch CM, et al., Prognostic factors analysis of 17,600 melanoma patients: Validation of the American Joint Committee on Cancer melanoma staging system, Journal of Clinical Oncology 19(16): 3622–3634, 2001.
11. Marchesini R, et al., Optical imaging and automated melanoma detection: Questions and answers, Melanoma Research 12: 279–286, 2002.
12. Ganster H, Pinz A, Kittler H, et al., Computer aided recognition of pigmented skin lesions, Melanoma Research 7: 1997.
13. Seidenari S, et al., Digital video-microscopy and image analysis with automatic classification for detection of thin melanomas, Melanoma Research 9(2): 163–171, 1999.
14. Menzies S, Crook B, McCarthy W, et al., Automated instrumentation and diagnosis of invasive melanoma, Melanoma Research 7: 1997.
15. Claridge E, Cotton S, et al., From color to tissue histology: Physics-based interpretation of images of pigmented skin lesion, Medical Image Analysis: 489–502, 2003.
16. Claridge E, Preece SJ, An inverse method for recovery of tissue parameters from colour images, Information Processing in Medical Imaging, Springer, Berlin, LNCS 2732, pp. 306–317, 2003.
17. Churmakov DY, et al., Analysis of skin tissues spatial fluorescence distribution by the Monte Carlo simulation, J Phys D: Applied Phys 36: 1722–1728, 2003.
18. Chang J, Graber HL, Barbour RL, Imaging of fluorescence in highly scattering media, IEEE Trans on Biomedical Engineering 44(9): 810–822, 1997.
19. Fercher AF, et al., Optical coherence tomography – Principles and applications, Rep Prog Phys 66: 239–303, 2003.
20. Tomatis S, et al., Automated melanoma detection: Multispectral imaging and neural network approach for classification, Med Phys 30(2): 212–221, 2003.
21. Tomatis S, Bartoli C, et al., Spectrophotometric imaging of subcutaneous pigmented lesion: Discriminant analysis, optical properties and histological characteristics, J Photochem Photobiol 42: 32–39, 1998.
22. Young AR, Chromophores in human skin, Physics in Medicine and Biology 42: 789–802, 1997.
CHAPTER 17
Multimodality Image Registration
and Fusion
Pat Zanzonico
Imaging has long been a vital component of clinical medicine and, more recently, of biomedical research in small animals. In addition, image registration and fusion have become increasingly important components of both clinical and laboratory (i.e. small-animal) imaging and have led to the development of a variety of pertinent software and hardware tools, including multimodality (e.g. PET-CT) devices which "automatically" provide registered and fused three-dimensional (3D) image sets. This chapter is a brief, largely nonmathematical review of the basics of image registration and fusion and of software and hardware approaches to 3D image alignment, including mutual information algorithms and multimodality devices.
17.1 INTRODUCTION
Since the discovery of X-rays, imaging has been a vital component of clinical medicine. Increasingly, in vivo imaging of small laboratory animals, i.e. mice and rats, has emerged as an important component of basic biomedical research. Historically, clinical and laboratory imaging modalities have often been divided into two general categories: structural (or anatomical) and functional (or physiological). Anatomical modalities, i.e. depicting primarily morphology, include X-rays (plain radiography), CT (computed tomography), MRI (magnetic resonance imaging), and US (ultrasound). Functional modalities, i.e. depicting primarily information related
to underlying metabolism, include (planar) scintigraphy, SPECT
(single-photon emission computed tomography), PET (positron
emission tomography), MRS (magnetic resonance spectroscopy),
and fMRI (functional magnetic resonance imaging). The functional modalities form the basis of the rapidly advancing field of "molecular imaging," defined as the direct or indirect noninvasive monitoring and recording of the spatial and temporal distribution of in vivo molecular, genetic, and/or cellular processes for biochemical, biological, diagnostic, or therapeutic applications.^1
Since information derived from multiple images is often complementary, e.g. localizing the site of an apparently abnormal metabolic process to a pathologic structure such as a tumor, integration of this information may be helpful and even critical. In addition to anatomic localization of "signal" foci, image registration and fusion provide: intra- as well as intermodality corroboration of diverse images; more accurate and more certain diagnostic and treatment-monitoring information; image guidance of external-beam radiation therapy; and, potentially, more reliable internal radionuclide dosimetry, e.g. in the form of radionuclide image-derived "isodose" contours superimposed on images of the pertinent anatomy. The problem, however, is that differences in image size and dynamic range, voxel dimensions and depth, image orientation, subject position and posture, and information quality and quantity make it difficult to unambiguously co-locate areas of interest in multiple image sets. The objective of image registration and fusion, therefore, is (a) to appropriately modify the format, size, position, and even shape of one or both image sets to provide a point-to-point correspondence between images and (b) to provide a practical integrated display of the images thus aligned. This process entails spatial registration of the respective images in a common coordinate system based on optimization of some "goodness-of-alignment," or "similarity," criterion (or metric). This chapter is a brief, largely nonmathematical review of the basics of image registration and fusion and of software and hardware approaches to 3D image alignment, and presents illustrative examples of registered and fused multimodality images in both clinical and laboratory settings.
17.2 BACKGROUND
The image registration and fusion process^{2–5} is illustrated diagrammatically and in general terms in Fig. 1. The first step is reformatting of one image set (the "floating," or secondary, image) to match that of the other image set (the reference, or primary, image). Alternatively, both image sets may be transformed to a new, common image format. Three-dimensional (3D), or tomographic, image sets are characterized by: the dimensions (e.g. in mm), i.e. the length (X), width (Y), and depth (Z), of each voxel; the image matrix, i.e. the number of rows (X) × the number of columns (Y) × the number of tomographic images, or "slices" (Z); and the image depth (e.g. in bytes), which defines the dynamic range of signal displayable in each voxel (e.g. a word-mode, i.e. one-word- or two-byte-"deep," PET image can display up to 2^16 = 65 536 signal levels for 16-bit words). The foregoing image parameters are provided in the image "header," a block of data which may either be in a stand-alone text file associated with the image file or incorporated into the image file itself. Among the image sets to be registered, either the finer matrix is reformatted to the coarser matrix by combining voxels, or the coarser matrix is reformatted to the finer matrix by interpolation of voxels. One of the resulting 3D image sets is then magnified or minified to yield primary and secondary images with equal voxel dimensions. Finally, the "deeper" image is rescaled to match the depth of the "shallower" matrix. Usually, the higher-spatial-resolution, finer-matrix structural (e.g. CT) image is the primary image and the functional (e.g. PET) image the secondary image.

Fig. 1. The image registration and fusion process. See text for details.
The second step in image registration is the actual transformation [translation, rotation, and/or deformation (warping)] of the reformatted secondary image set to spatially align it, in three dimensions, with the primary image set.
The third and fourth steps are, respectively, the evaluation of the accuracy of the registration of the primary and transformed secondary images and the iterative adjustment of the secondary image transformation until the registration (i.e. the goodness-of-alignment metric) is optimized.
The fifth and final step is image fusion, the integrated display of the registered images.
17.3 PROCEDURES AND METHODS
17.3.1 “Software” versus “Hardware” Approaches
to Image Registration
In both clinical and laboratory settings, there are two practical approaches to image registration and fusion: "software" and "hardware" approaches. In the software approach, images are acquired on separate devices, imported into a common image-processing computer platform, and registered and fused using the appropriate software. In the hardware approach, images are acquired on a single, multimodality device and transparently registered and fused with the manufacturer's integrated software. Both approaches depend on software sufficiently robust to recognize and import diverse image formats. The availability of industry-wide standard formats, such as the ACR-NEMA DICOM standard, i.e. the American College of Radiology (ACR) and National Electrical Manufacturers Association (NEMA) Digital Imaging and Communications in Medicine (DICOM) standard,^{6–9} is therefore critical.
17.3.2 Software Approaches
17.3.2.1 Rigid versus non-rigid transformations
Software-based transformations of the secondary image set to spatially align it with the primary image set are commonly characterized as either "rigid" or "nonrigid."^{2–5} In a rigid transformation, the secondary image is only translated and/or rotated with respect to the primary image, so that the Euclidean distance between any two points (i.e. voxels) within an individual image set remains constant. In nonrigid, or deformable, transformations (commonly known as "warping"), selected subvolumes within the image set may be expanded or contracted and/or their shapes altered; translations and/or rotations may be performed as well. Such warping is therefore distinct from any magnification or minification performed in the reformatting step, where distances between points all change by the same relative amount. Unlike rigid transformations, which may be either manual or automated, nonrigid transformations are generally automated.
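As a simple illustration of the defining property of a rigid transformation (a sketch with arbitrary example values, not a registration algorithm), the following Python fragment rotates and translates a set of 2D points and verifies that interpoint distances are preserved:

import numpy as np

theta = np.deg2rad(10.0)                       # example rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([5.0, -2.0])                      # example translation

pts = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 20.0]])
moved = pts @ R.T + t                          # rigid transformation

# Euclidean distances between points are preserved:
d_before = np.linalg.norm(pts[0] - pts[1])
d_after = np.linalg.norm(moved[0] - moved[1])
assert np.isclose(d_before, d_after)

A warping transformation, by contrast, would move each point by a spatially varying amount, so no such distance check would hold.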
17.3.2.2 Feature- and intensity-based approaches
Registration transformations are often based on alignment of specific landmarks visible in the image sets; this is sometimes characterized as the "feature-based" approach.^{2–5} Such landmarks may be either intrinsic, i.e. one or more well-defined anatomic structures or the body contour (i.e. surface outline), or extrinsic, i.e. one or more fiducial markers placed in or around the subject. Feature-based registration generally requires some sort of preprocessing "segmentation" of the image sets being aligned, that is, identification of the corresponding features (e.g. fiduciary markers) of the image sets. Feature-based image registration algorithms may be automated by minimization of the difference(s) in position of the pertinent feature(s) between the image sets being aligned.
Other registration algorithms are based on analysis of voxel intensities (e.g. counts in a PET or SPECT image) and are characterized as "intensity-based" approaches.^{2–5} These include: alignment of the respective "centers of mass" (e.g. of counts) and orientations (i.e. principal axes) calculated for each image set; minimization of absolute or sum-of-square voxel intensity differences between the image sets; cross-correlation (i.e. maximizing the voxel intensity correlation between the image sets); minimization of variance (i.e. matching of identifiable homogeneous regions in the respective image sets); and matching of voxel intensity histograms (discussed further in the Results and Findings section).^2 Such intensity-based approaches implicitly assume that the voxel intensities in the images being aligned represent the same, positively correlated parameters (e.g. counts) and thus are directly applicable only to intramodality image registration.
17.3.2.3 Mutual information
A relatively new but already widely used automated registra-
tion algorithm is based on the statistical concept of mutual
information,
3,10
also known as transinformation or relative entropy.
The mutual information of two random variables A and B is a
quantity that measures the statistical dependence of the two vari-
ables, that is, the amount of information that one variable contains
about the other. Mutual informationmeasures the informationabout
Athat is shared by B. If Aand B are independent, then Acontains no
information about B and vice versa and their mutual information is
January 22, 2008 12:3 WSPC/SPI-B540:Principles and Recent Advances ch17 FA
Multimodality Image Registration and Fusion
419
therefore zero. Conversely, if Aand B are identical, then all informa-
tion conveyed by A is shared with B and their mutual information
is maximized. Accurate spatial registration of two such image sets
thus results in the maximization of their mutual information and
vice versa.
The concepts of entropy and mutual information are developed more formally in the following. Given "events" (e.g. gray-scale values) e_1, e_2, …, e_n with probabilities (i.e. frequencies of occurrence) p_1, p_2, …, p_n in an image set, the entropy (specifically, the so-called "Shannon entropy") H is defined as follows^3:

H ≡ Σ_{i=1}^{n} p_i log (1/p_i)   (1a)
  = − Σ_{i=1}^{n} p_i log p_i.   (1b)
The term log (1/p_i) indicates that the amount of information provided by an event is inversely related to the probability (i.e. frequency) of that event: the less frequent an event, the more significant is its occurrence. The information per event is thus weighted by the frequency of its occurrence. The uniform "background" (e_BG) occupies a large portion of a CT image (i.e. p_BG is large), for example, and therefore contributes relatively little information (i.e. log (1/p_BG) is small), and so it would not contribute substantially to accurate alignment with an MR image. The Shannon entropy is also a measure of the uncertainty of an event. When all events (e.g. all gray-scale values in an image) are equally likely to occur (as in a highly heterogeneous image), the entropy is maximal.^a When an event or a range of events is more likely to occur (as in a uniform image), the entropy is minimal. Additionally, the entropy is a measure of the dispersion of an image's probability distribution (i.e. the probability of a gray-scale value versus the gray-scale values): a highly heterogeneous image has a broad dispersion and a high entropy, while a uniform image has no dispersion and minimal entropy. Entropy thus has several interpretations: the information content per event (e.g. gray-scale value), the uncertainty per event, and the statistical dispersion of events in an image.

^a The analogy between signal entropy, used in the context of mutual information, and thermodynamic entropy thus becomes clear.
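As a minimal numerical illustration (not from the original chapter), the Shannon entropy of Eq. (1) can be estimated from an image's gray-level histogram; the bin count and the test images below are arbitrary assumptions.

import numpy as np

def shannon_entropy(image, bins=256):
    # Shannon entropy, Eq. (1): H = -sum p_i log p_i over gray-level frequencies
    hist, _ = np.histogram(image, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return -(p * np.log2(p)).sum()

uniform = np.zeros((64, 64))        # no dispersion -> minimal entropy
noisy = np.random.default_rng(1).integers(0, 256, (64, 64))  # broad dispersion
print(shannon_entropy(uniform), shannon_entropy(noisy))      # ~0 vs ~8 bits

Consistent with the discussion above, the uniform image yields zero entropy while the heterogeneous image approaches the 8-bit maximum.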
For two images A and B, the mutual information MI(A,B) may be defined as follows^{b,3}:

MI(A, B) ≡ H(B) − H(B|A).   (2)

H(B) is the Shannon entropy of image B (derived from the probability distribution of its gray-scale values) and H(B|A) is the conditional entropy of image B with respect to image A [derived from the conditional probabilities p(b|a), the probability of gray-scale value b occurring in image B given that gray-scale value a occurs in the corresponding voxel in image A]. When interpreting entropy in terms of uncertainty, MI(A,B) thus corresponds to the uncertainty in image B minus the uncertainty in image B when image A is known. Intuitively, therefore, MI(A,B), the image-B information in image A, is the amount by which the uncertainty in image B decreases when image A is given. Because images A and B can be interchanged, MI(A,B) is also the information image B contains about image A, and it is therefore mutual information. Registration thus corresponds to maximizing mutual information: the amount of information the images have about each other is maximized when, and only when, they are aligned. If a subject is imaged by two different modalities, there is presumably considerable mutual information between the spatial distributions of the respective signals in the two image sets, no matter how diverse (i.e. unrelated) they may appear to be. For example, the distribution of fluorine-18-labeled fluorodeoxyglucose (FDG) visualized in a PET scan is, at some level, dictated by (i.e. dependent on) the distribution of different tissue types imaged by CT.
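A minimal sketch of this computation is given below. It uses the algebraically equivalent form MI(A,B) = H(A) + H(B) − H(A,B), which follows from Eq. (2) because H(B|A) = H(A,B) − H(A); the joint probabilities are estimated from a joint intensity histogram, and the bin count and test arrays are arbitrary assumptions.

import numpy as np

def mutual_information(a, b, bins=64):
    # MI(A,B) = H(A) + H(B) - H(A,B), equivalent to Eq. (2)
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)
    nz = p_ab > 0
    h_a = -(p_a[p_a > 0] * np.log2(p_a[p_a > 0])).sum()
    h_b = -(p_b[p_b > 0] * np.log2(p_b[p_b > 0])).sum()
    h_ab = -(p_ab[nz] * np.log2(p_ab[nz])).sum()
    return h_a + h_b - h_ab

rng = np.random.default_rng(2)
img = rng.random((64, 64))
print(mutual_information(img, img))                    # maximal: identical images
print(mutual_information(img, rng.random((64, 64))))   # near zero: independent images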
17.3.2.4 Goodness-of-alignment metrics
Regardless of the algorithm employed, the evaluation and adjustment of the registration requires some metric of its accuracy. It may be as simple as visual (i.e. qualitative) inspection of the aligned images and a judgment by the operator that the registration is or is not "acceptable." A more objective, and ideally quantitative, evaluation of the accuracy of the registration is, of course, preferred. One goodness-of-alignment metric, for example, is the sum of the Euclidean distances between corresponding fiduciary markers (or anatomic landmarks) in the two image sets; the optimum alignment corresponds to the transformation yielding the minimum sum of distances. Another similarity metric, as discussed above, is the mutual information: when the mutual information between the two image sets is maximized, they are optimally aligned.

^b In information theory, there are actually a number of different definitions of mutual information.
17.3.3 Hardware Approaches
The major manufacturers of PET and CT scanners now also market multimodality scanners,^{11–13} combining high-performance state-of-the-art PET and CT scanners in a single device. These instruments provide near-perfect registration of images of in vivo function (PET) and anatomy (CT) using a measured, and presumably fixed, rigid transformation between the image sets. These devices have already had a major impact on clinical practice, particularly in oncology, and PET-CT devices are currently outselling "PET-only" systems by a two-to-one ratio.^{14} Although generally encased in a single seamless housing, the PET and CT gantries in such multimodality devices are separate; the respective fields of view are separated by a distance of the order of 1 m, and the PET and CT scans are performed sequentially (Figs. 2 and 3). In one such device (Gemini, Philips Medical), the PET and CT gantries are actually in separate housings with an adjustable separation (up to ∼1 m) between them; this not only provides access to patients but also may minimize anxiety among claustrophobic subjects (Fig. 4).
In addition to PET-CT scanners, SPECT-CT scanners are now commercially available. The design of SPECT-CT scanners is similar to that of PET-CT scanners in that the SPECT and CT gantries are separate and the SPECT and CT scans are acquired sequentially, not simultaneously. In such devices, the separation of the SPECT and CT scanners is more apparent (Fig. 5) because the rotational and other motions of the SPECT detectors effectively preclude encasing them in a housing with the CT scanner. Multimodality imaging devices for small animals (i.e. rodents), including PET-CT, SPECT-CT, and even SPECT-PET-CT devices, are now commercially available as well (Fig. 6).

Fig. 2. Schematic diagram (side view) of a commercially available clinical PET-CT scanner. From Ref. 11 by permission of the authors. Inset: Photo of the PET-CT scanner in the diagram, the Biograph™ (Siemens-CTI).
Multimodality devices simplify image registration and fusion, conceptually as well as logistically, by taking advantage of the fixed geometric arrangement between the PET and CT scanners or the SPECT and CT scanners in such devices. Further, because the time interval between the sequential scans is short (i.e. a matter of minutes) and the subject remains in place, it is unlikely that the subject geometry will change significantly between the PET or SPECT scan and the CT scan. Accordingly, a rigid transformation matrix (i.e. translations and rotations in three dimensions) can be used to align the PET or SPECT and the CT image sets. This matrix can be measured using a "phantom," i.e. an inanimate object with PET- or SPECT- and CT-visible landmarks arranged in a well-defined geometry. The transformation matrix required to align these landmarks can then be stored and used to automatically register all subsequent multimodality studies, since the device's geometry, and therefore this matrix, should be fixed.

Fig. 3. A typical imaging protocol for a combined PET-CT study: (A) the topogram, or scout CT scan, for positioning; (B) the CT scan; (C) generation of CT-based attenuation correction factors; (D) the PET scan over the same longitudinal range of the patient as the CT scan; (E) reconstruction of the attenuation-corrected PET emission data; (F) the attenuation-corrected PET images; and (G) display of the final fused PET-CT images. From Ref. 13 by permission of the authors.
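Schematically, such a stored rigid transformation can be represented as a 4 × 4 homogeneous matrix applied to every coordinate in the secondary image set. In the Python sketch below, the calibration values are invented for illustration and would in practice come from the phantom measurement.

import numpy as np

# Rigid transformation measured once with a phantom and stored:
# a 3D rotation (here, about the z axis) plus a translation.
theta = np.deg2rad(1.2)                 # example calibration values
T = np.eye(4)
T[:3, :3] = [[np.cos(theta), -np.sin(theta), 0],
             [np.sin(theta),  np.cos(theta), 0],
             [0, 0, 1]]
T[:3, 3] = [0.8, -0.5, 102.0]           # e.g. axial offset between gantries (mm)

def to_reference_frame(points_xyz):
    # Map PET/SPECT coordinates (mm) into the CT frame via homogeneous coordinates
    homog = np.c_[points_xyz, np.ones(len(points_xyz))]
    return (homog @ T.T)[:, :3]

print(to_reference_frame(np.array([[0.0, 0.0, 0.0]])))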
17.3.4 Image Fusion
Image fusion may be as simple as simultaneous display of images in a juxtaposed format. A more common, and more useful, format is an overlay of the registered images, where one image is displayed in one color table and the second image in a different color table. Typically, the intensities of the respective color tables as well as the "mixture" of the two overlaid images can be adjusted. Adjustment (e.g. with a slider) of the mixture allows the operator to interactively vary the overlay so that the designated screen area displays only the first image, only the second image, or some weighted combination of the two images, each in its respective color table.

Fig. 4. Photo of a commercially available clinical PET-CT scanner, the Gemini™ (Philips Medical), which allows variable separation of the PET and the CT subsystems.
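A minimal sketch of such an overlay display, assuming matplotlib is available and using random arrays to stand in for a registered image pair (the colormap names and the mixture value are illustrative choices), is:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
anatomy = rng.random((128, 128))    # stand-in for a registered CT slice
function = rng.random((128, 128))   # stand-in for the registered PET slice

alpha = 0.4   # the "mixture" slider: 0 shows only anatomy, 1 only function
plt.imshow(anatomy, cmap="gray")                # first color table
plt.imshow(function, cmap="hot", alpha=alpha)   # second color table, overlaid
plt.axis("off")
plt.show()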
17.4 RESULTS AND FINDINGS
17.4.1 Software Approaches to Image Registration
17.4.1.1 Feature-based approach: Extrinsic fiduciary markers
Comparative imaging of multiple radiotracers in the same subject can be invaluable in elucidating and validating their respective mechanisms of localization. Comparative imaging of PET tracers, particularly in small animals, is problematic, however: such tracers must be administered and imaged separately, because simultaneously imaged positron emitters cannot be separated based on energy discrimination. In one such study (Fig. 7),^{15} the intratumoral distributions of sequentially administered F18-FDG and the hypoxia tracer F18-fluoromisonidazole (FMiso) were compared in rats by registered R4 microPET™ imaging, with positioning of each animal in a custom-fabricated whole-body mold. Custom-manufactured germanium-68 rods were reproducibly positioned in the mold as external fiduciary markers. The registered microPET™ images unambiguously demonstrate grossly similar, though not identical, distributions of FDG and FMiso in the tumors: a high-activity rim surrounding a lower-activity core. However, there were subtle but possibly significant differences in the intratumoral distributions of FDG and FMiso, and these may not have been discerned without careful image registration.

Fig. 5. Photo of a commercially available clinical SPECT-CT scanner, the Precedence™ (Philips Medical).
Fig. 6. Photos of two commercially available laboratory (i.e. rodent) SPECT-CT scanners: (A) the X-SPECT™ (Gamma Medica); (B) the Inveon™ (Siemens Preclinical Solutions), which allows the detachment and separate use of the CT and the PET subsystems.
17.4.1.2 Intensity-based approach: Minimization of intensity differences
As illustrated in Fig. 8, showing sequential PET brain images of the same patient,^2 misalignment of the image sets produces visualizable structure in the difference images (the bottom row of Fig. 8(A)), i.e. the voxel-by-voxel intensity differences are not zero. In contrast, accurate registration yields difference images whose voxel-by-voxel intensity differences are equal to zero within statistical uncertainty (i.e. "noise") and therefore show an absence of visualizable structure (bottom row of Fig. 8(B)).
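The underlying metric is easy to state in code. The following sketch (with a random test image and wrap-around shifts as simplifying assumptions) finds the translation minimizing the sum-of-squared intensity differences:

import numpy as np

def ssd(a, b):
    # Sum-of-squared voxel intensity differences; zero (up to noise) at alignment
    return ((a - b) ** 2).sum()

rng = np.random.default_rng(4)
ref = rng.random((64, 64))
best = min(range(-5, 6), key=lambda s: ssd(ref, np.roll(ref, s, axis=1)))
print(best)   # 0: the image aligns with itself at zero shift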
17.4.1.3 Intensity-based approach: Matching of voxel
intensity histograms
For two image sets A and B, a 2D joint histogram (also known as the "feature space") (Fig. 9)^2 can be constructed by plotting, for each combination of intensity a in image A and intensity b in image B, the point (a, b) whose darkness or lightness reflects the number of occurrences of the combination of intensities a and b. Thus, a darker point in the joint histogram indicates a larger number, and a lighter point a smaller number, of occurrences of the combination (a, b). When two identical image sets are aligned (matched), all voxels coincide and the plot in the voxel intensity histogram is the line of identity (i.e. a = b for all voxels). As one of the image sets is rotated relative to the other (by 10° and then by 20°), for example, the joint histogram becomes increasingly blurred (i.e. dispersed) (Fig. 9). Alignment of the images can therefore be achieved by minimizing the dispersion in the joint intensity histogram. Like other intensity-based approaches, this approach is most readily adaptable to similar (i.e. intramodality) image sets, but in principle it can be applied to dissimilar (i.e. intermodality) images by appropriate mapping of one image intensity scale onto the other intensity scale (Fig. 10).^3

Fig. 7. (A) Left panel: An anesthetized tumor-bearing rat in its custom-fabricated mold (Rapid-Foam™, Soule Medical) for immobilization and reproducible positioning for repeat and/or intermodality imaging studies. Right panel: Three custom-manufactured ^68Ge fiduciary markers (10 µCi each, 1 × 10 mm) (Sanders Medical Products) reproducibly inserted into the mold and used as extrinsic fiduciary markers for software registration of serial microPET™ images. (B) The appearance of the ^68Ge markers on overlaid F18-FDG and -FMiso transverse-section microPET™ images before and after registration based on the rigid transform consisting of translations Δx, Δy, and Δz and rotations θ_x, θ_y, and θ_z. (C) Registered and fused ^18F-FDG (gray scale) and FMiso (hot iron) transverse, coronal, and sagittal microPET™ images; the sagittal views are through a R3327-AT rat prostate tumor xenograft in the animal's right hind limb. Discordant areas of FDG and FMiso uptake are indicated by the white arrows for the R3327-AT tumor and by the yellow arrows for a FaDu human squamous cell carcinoma tumor xenograft. Both tumors, 20 mm × 20 mm × 30 mm in size, were significantly hypoxic. From Ref. 15 by permission of the authors.

Fig. 8. Intramodality image registration based on minimization of voxel intensity differences. (A) Selected brain images of sequential misaligned (i.e. nonregistered) PET studies of the same patient, with the section-by-section difference images in the bottom row. (B) The same image sets as in (A), now aligned by minimization of the voxel-by-voxel intensity differences. From Ref. 2 by permission of the authors.

Fig. 9. Intramodality image registration based on matching of voxel intensity histograms. The joint intensity histograms of a transverse-section brain MR image with itself when the two image sets are originally matched (i.e. aligned) and when misaligned by counterclockwise rotations of 10° and 20°, respectively. See text for details. Adapted from Ref. 2 by permission of the authors.

Fig. 10. An intermodality (CT and MR) joint intensity histogram. The featureless (i.e. uniform) area corresponding to brain tissue in the transverse-section head CT image (left panel), in contrast to the anatomic detail in the corresponding area of the MR image (middle panel), yields a distinct vertical cluster (arrow) in the CT-MR joint histogram (right panel). Adapted from Ref. 3 by permission of the authors.
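To make the dispersion criterion concrete, the sketch below (an illustration, not the algorithm of Ref. 2) quantifies the spread of the joint histogram by its entropy and shows that it grows as one copy of a smoothed test image is rotated away from the other; the image, smoothing, and bin count are arbitrary assumptions.

import numpy as np
from scipy import ndimage

def joint_histogram_entropy(a, b, bins=64):
    # Dispersion of the joint intensity histogram, measured by its entropy
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint[joint > 0] / joint.sum()
    return -(p * np.log2(p)).sum()

rng = np.random.default_rng(5)
img = ndimage.gaussian_filter(rng.random((128, 128)), 4)   # smooth test image
for angle in (0, 10, 20):
    rotated = ndimage.rotate(img, angle, reshape=False)
    print(angle, joint_histogram_entropy(img, rotated))    # grows with misalignment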
17.4.1.4 Mutual information
As illustrated in Fig. 11^3 for registration of a brain MR image with itself, the joint histogram of two images changes as the alignment of the images changes. When the images are registered, corresponding signal foci overlap and the joint histogram shows certain clusters of gray-scale values. As the images become increasingly misaligned (illustrated in Fig. 11 with rotations of 2°, 5°, and then 10° of the brain MRI relative to the original image), signal foci will increasingly overlap foci that are not their respective counterparts in the original image. Consequently, the cluster intensities for corresponding signal foci (e.g. skull and skull, brain and brain, etc.) will decrease, and new noncorresponding combinations of gray-scale values (e.g. of skull and brain) will appear. The joint histogram will thus become more dispersed; as described above, minimization of this dispersion is the basis of certain intensity-based registration algorithms. At the same time, the mutual information (MI) (see Eqs. 1 and 2), which is maximized when the two images are aligned, will decrease. However, unlike other intensity-based approaches, no assumptions are made in the MI approach regarding the nature of the relationship between the image intensities (e.g. a positive or a negative correlation). MI is thus a completely general goodness-of-alignment metric and can be applied to inter- as well as intramodality registration, automatically and without prior segmentation.

Fig. 11. Effect of misregistration on joint intensity histograms and mutual information (MI) between a transverse-section brain MR image (top row) and itself. Shown are the joint intensity histograms and mutual information (MI) (middle row) when the two image sets are originally matched (i.e. aligned) and when misaligned by clockwise rotations of 2°, 5°, and 10°, respectively (bottom row). See text for details. Adapted from Ref. 3 by permission of the authors.
17.4.2 Hardware Approaches to Image Registration
As described in section 17.3.3, multimodality devices simplify image registration and fusion, conceptually as well as logistically, by taking advantage of the fixed geometric arrangement between the PET or SPECT scanner and the CT scanner: a rigid transformation matrix, measured once with a phantom, can be stored and used to automatically register all subsequent multimodality studies, since the device's mechanics, and therefore this matrix, are presumably fixed. To illustrate the utility of registered and fused multimodality imaging studies in both clinical and laboratory settings, examples are presented in Figs. 12^{16} and 13.
17.5 DISCUSSION AND CONCLUDING REMARKS
In practice, two basic approaches to image registration and fusion, "software" and "hardware" approaches, have been developed.
In the software approach, images are acquired on separate devices and registered and fused using the appropriate software. Rather robust and user-friendly software for image registration and fusion is now widely available. Software approaches to registration of images acquired on separate devices have been particularly successful in the brain because of the ability to reliably immobilize and position the head, the pronounced contrast between the bony skull (an intrinsic landmark) and the brain, and the lack of motion or deformation of internal structures. Outside the brain, however, software registration is more difficult because of the many degrees
Fig. 12. Registered and fused FDG PET and CT scans of a patient with lung cancer and an adrenal gland metastasis. (A) Coronal PET images show typically increased FDG uptake in the primary lung tumor (single arrow in left panel) and in the metastasis in the left adrenal gland (double arrow in left panel), but also in the left side of the neck (arrow in right panel). (B) Transaxial PET and CT images through this neck lesion. Reading these images separately or in the juxtaposed format shown, it is difficult to definitively identify the anatomic site (i.e. tumor versus normal structure) of the focus of activity in the neck. (C) The registered and fused PET-CT images, using the fused, or overlay, display, unambiguously demonstrate that the FDG activity is located within muscle, a physiological normal variant. Because it is best visualized using the original color display, an arrow is used to identify the location of this unusual, but nonpathologic, focus of FDG activity on the fused images. The FDG activity in the neck was therefore not previously undetected disease, a finding which would have significantly impacted the subsequent clinical management of the patient. Adapted from Ref. 16 with permission of the authors.
Fig. 13. Registered and fused SPECT-CT images (coronal views) of a mouse with a LAN1 neuroblastoma tumor xenograft in its hindlimb (arrow). The radiotracer was iodine-125-labeled 3F8, an antibody directed against the ganglioside 2 (GD2) antigen, which is overexpressed on neuroblastomas (including LAN1). The images were acquired at two days postinjection with the X-SPECT™ [Fig. 6(A)]. The CT image shows the tumor as a space-occupying structure along the contour of the animal (left panel). The specific targeting of the radiolabeled 3F8 to the GD2-expressing tumor xenograft is demonstrated by the high-contrast SPECT image (middle panel). The registered and fused SPECT-CT images, again using the fused, or overlay, display, unambiguously demonstrate that the 3F8 activity is located in the tumor, confirming that the focus of activity represents specific tumor targeting by this antibody and not, for example, excreted activity in the urinary bladder or radioactive contamination. The images are provided courtesy of Drs Shakeel Modak and Nai-Kong Cheung, Memorial Sloan-Kettering Cancer Center.
of freedom of the torso and its internal structures when imaged at different times by different devices and with the subject in different positions. For example, depending on the variable degree of filling of the bladder with urine or the intestines with gas, pelvic and abdominal structures may be significantly displaced from one imaging study to the next. The registration process may therefore be rather time-consuming and labor-intensive.
In the hardware approach, images are acquired on a single, multimodality device and transparently registered and fused. To date, such multimodality devices have been restricted almost exclusively to PET-CT and SPECT-CT scanners. While MRI-CT scanners might have little practical advantage, since MRI and CT are both anatomic imaging modalities, PET-MRI and SPECT-MRI devices would be highly attractive. Combining PET or SPECT and MRI remains problematic, however, because the magnetic fields proximal to an MRI scanner interfere with the scintillation detection process in all current-generation PET and SPECT scanners. Nonetheless, practical PET-MRI scanners are currently under development.^{17}
Both intra- and intermodality image registration and fusion will no doubt become even more widely used and increasingly important in both clinical and laboratory settings.
References
1. SNA News, RSNA, SNM Urge Interdisciplinary Cooperation to Advance Molecular Imaging, 2005.
2. Hutton BF, Braun M, Thurfjell L, et al., Image registration: An essential tool for nuclear medicine, Eur J Nucl Med 29: 559–577, 2002.
3. Maintz JBA, Viergever MA, A survey of medical image registration, Med Image Anal 2: 1–36, 1998.
4. Hajnal JV, Hill DLG, Hawkes DJ (eds.), Medical Image Registration, CRC Press, Boca Raton, FL, 2001.
5. Hill DLG, Batchelor PG, Holden M, et al., Medical image registration, Phys Med Biol 46: R1–R45, 2001.
6. American College of Radiology, National Electrical Manufacturers Association, "ACR-NEMA Digital Imaging and Communications Standard," NEMA Standards Publication No. 300–1985, Washington, DC, 1985.
7. American College of Radiology, National Electrical Manufacturers Association, "ACR-NEMA Digital Imaging and Communications Standard: Version 2.0," NEMA Standards Publication No. 300–1988, Washington, DC, 1988.
8. American College of Radiology, National Electrical Manufacturers Association, "Digital Imaging and Communications in Medicine (DICOM): Version 3.0," Draft Standard, ACR-NEMA Committee, Working Group VI, Washington, DC, 1993.
9. Mildenberger P, Eichelberg M, Martin E, Introduction to the DICOM standard, Eur Radiol 12: 920–927, 2002.
10. Viola P, Wells WM III, Alignment by maximization of mutual information, Int J Computer Vision 22: 137–154, 1997.
11. Beyer T, Townsend DW, Brun T, et al., A combined PET/CT scanner for clinical oncology, J Nucl Med 41: 1369–1379, 2000.
12. Townsend DW, Carney JPJ, Yap JT, et al., PET/CT today and tomorrow, J Nucl Med 45: 4S–14S, 2004.
13. Yap JT, Carney JPJ, Hall NC, et al., Image-guided cancer therapy using PET/CT, Cancer J 10: 221–223, 2004.
14. J Nucl Med (Newsline), PET on Display: Notes from the 59th SNM Annual Meeting, pp. 24N–26N, 2003.
15. Zanzonico P, Campa J, Polycarpe-Holman D, et al., Animal-specific positioning molds for registration of repeat imaging studies: Comparative microPET™ imaging of F18-labeled fluoro-deoxyglucose and fluoro-misonidazole in rodent tumors, Nucl Med Biol 33: 65–70, 2006.
16. Schoder H, Erdi Y, Larson S, et al., PET/CT: A new imaging technology in nuclear medicine, Eur J Nucl Med Mol Imaging 30: 1419–1437, 2003.
17. Catana C, Wu Y, Judenhofer MS, et al., Simultaneous acquisition of multislice PET and MR images: Initial results with a MR-compatible PET scanner, J Nucl Med 47: 1968–1976, 2006.
CHAPTER 18
Wavelet Transform and Its Applications in Medical Image Analysis
Atam P Dhawan
Recently, the wavelet transform has been found to be a very productive and efficient tool for image processing, analysis, and compression in medical applications. The wavelet transform provides complete spatio-frequency localization for medical images, which may be used to remove noise, undesired features, and artifacts, or to extract useful features for image characterization and classification. This chapter provides an introduction to the wavelet transform, with decomposition and reconstruction methods for medical image analysis.
18.1 INTRODUCTION
The wavelet transform has recently emerged as an efficient signal processing tool for the localization of frequency or spectral components in data. From a historical perspective of signal analysis, the Fourier transform has proved to be an extremely useful tool for decomposing a signal into constituent sinusoids of different frequency components. However, Fourier analysis suffers from the drawback of losing localization, or time information, when transforming information from the time domain to the frequency domain. When the frequency representation of a signal is examined, it is impossible to tell when a particular event took place. If the signal properties do not change much over time, this drawback may be ignored. However, many signals change in interesting ways over time or space.
An electrocardiogram signal changes over time with respect to heart-beat events. Similarly, in the context of two-dimensional and three-dimensional images, a signal or property represented by the image changes over the sampled data points in space. Fourier analysis, in general, does not provide event (frequency) localized information with respect to time (in time-series signals) or space (in images). This drawback of the Fourier transform can be somewhat addressed by using the short-time Fourier transform (STFT).^{1–4} This technique adapts the Fourier transform to analyze only a small section of the signal at a time; in effect, the STFT maps a signal into separate functions of time and frequency. The STFT provides some information about frequency localization with respect to a selected window. However, this information is obtained with limited precision determined by the size of the window. A major shortcoming of the STFT is that the window size is fixed for all frequencies once a particular size for the time window is chosen. In real applications, signals may require a variable window size in order to accurately determine event localization with respect to frequency and time or space.
The wavelet transform may use long sampling intervals where low-frequency information is needed, and shorter sampling intervals where high-frequency information is available. The major advantage of the wavelet transform is its ability to perform multiresolution analysis for event localization with respect to all frequency components in the data over time or space. Thus, wavelet analysis is capable of revealing aspects of data that other signal analysis techniques miss, such as breakdown points and discontinuities in higher derivatives.^{1–4}
Wavelet transform theory uses two major concepts: scaling and shifting. Scaling, through dilation or compression, provides the capability of analyzing a signal over different windows or sampling periods in the data, while shifting, through delay or advancement, provides translation of the wavelet kernel over the entire signal. Daubechies wavelets^1 are compactly supported orthonormal wavelets which make discrete wavelet analysis practicable. Wavelet analysis has seen numerous applications in statistics,^{1–4} time-series analysis,^{1–2}
and image processing.^{5–8} Generalized wavelet basis functions have been studied for image processing applications.^{6–8} Furthermore, the wavelet transform has also been used in the data mining field and other data-intensive applications because of its many favorable properties, such as vanishing moments, a hierarchical and multiresolution decomposition structure, linear time and space complexity of the transformations, decorrelated coefficients, and a wide variety of basis functions.
18.2 WAVELET TRANSFORM
The wavelet transform is the decomposition of a signal f(t) with a family of real orthonormal bases ψ_{j,k}(t), obtained through translation and scaling of a kernel function ψ(t), known as the mother wavelet, in the Hilbert space L^2(R) of square integrable functions, i.e.

ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k);  j, k ∈ Z,   (1)

where j and k are integers representing, respectively, the scaling and shifting indices. Using the orthonormality property, the wavelet coefficients of a signal f(t) can be computed as:

c_{j,k} = ∫_{−∞}^{+∞} f(t) ψ_{j,k}(t) dt.   (2)

The signal f(t) can be fully recovered, or reconstructed, from the wavelet coefficients as:

f(t) = Σ_{j,k} c_{j,k} ψ_{j,k}(t).   (3)

To obtain the wavelet coefficients in Eq. (2), the translated and scaled versions ψ_{j,k}(t) of the mother wavelet ψ(t) are obtained using a scaling function. Using a scale resolution of multiples of two, the scaling function φ(t) can be obtained as:

φ(t) = √2 Σ_n h_0(n) φ(2t − n).   (4)
Then, the wavelet kernel ψ(t) is related to the scaling function as:

ψ(t) = √2 Σ_n h_1(n) φ(2t − n),   (5)

where

h_1(n) = (−1)^n h_0(1 − n).   (6)

The coefficients h_0(n) in Eq. (4) must satisfy several conditions for the set of basis wavelet functions defined in Eq. (1) to be unique, orthonormal, and with a certain degree of regularity.^{1–4}
18.3 SERIES EXPANSION AND DISCRETE WAVELET TRANSFORM
Let x[n] be an arbitrary square summable sequence representing a signal in the time domain such that:

x[n] ∈ l^2(Z).   (7)

The series expansion of a discrete signal x[n] using a set of orthonormal basis functions φ_k[n] is given by:

x[n] = Σ_{k∈Z} ⟨φ_k[l], x[l]⟩ φ_k[n] = Σ_{k∈Z} X[k] φ_k[n],
where X[k] = ⟨φ_k[l], x[l]⟩ = Σ_l φ*_k[l] x[l],   (8)

and X[k] is the transform of x[n]. All basis functions must satisfy the orthonormality condition, i.e.

⟨φ_k[n], φ_l[n]⟩ = δ[k − l],
with ‖x‖^2 = ‖X‖^2,   (9)

where ⟨·, ·⟩ represents the inner product.
The series expansion is considered to be complete if every signal from l^2(Z) can be expressed as shown in Eq. (8). Similarly, using a set of biorthogonal basis functions, the series expansion of the signal x[n] can be expressed as:

x[n] = Σ_{k∈Z} ⟨φ_k[l], x[l]⟩ φ̃_k[n] = Σ_{k∈Z} X̃[k] φ̃_k[n]
     = Σ_{k∈Z} ⟨φ̃_k[l], x[l]⟩ φ_k[n] = Σ_{k∈Z} X[k] φ_k[n],

where X̃[k] = ⟨φ_k[l], x[l]⟩ and X[k] = ⟨φ̃_k[l], x[l]⟩,
and ⟨φ_k[n], φ̃_l[n]⟩ = δ[k − l].   (10)
Using quadrature mirror filter theory, the orthonormal bases φ_k[n] can be expressed as low pass and high pass filters for the decomposition and reconstruction of a signal. It can be shown that a discrete signal x[n] can be decomposed into X[k] as:

x[n] = Σ_{k∈Z} ⟨φ_k[l], x[l]⟩ φ_k[n] = Σ_{k∈Z} X[k] φ_k[n],

where

φ_{2k}[n] = h_0[2k − n] = g_0[n − 2k],
φ_{2k+1}[n] = h_1[2k − n] = g_1[n − 2k],

and

X[2k] = ⟨h_0[2k − l], x[l]⟩,
X[2k + 1] = ⟨h_1[2k − l], x[l]⟩.   (11)
In Eq. (11), h_0 and h_1 are, respectively, the low pass and high pass filters for signal decomposition, or analysis, and g_0 and g_1 are, respectively, the low pass and high pass filters for signal reconstruction, or synthesis. A perfect reconstruction of the signal can be obtained if the orthonormal bases are used in the decomposition and reconstruction stages as:

x[n] = Σ_{k∈Z} X[2k] φ_{2k}[n] + Σ_{k∈Z} X[2k + 1] φ_{2k+1}[n]
     = Σ_{k∈Z} X[2k] g_0[n − 2k] + Σ_{k∈Z} X[2k + 1] g_1[n − 2k].   (12)
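As a concrete instance of Eqs. (11) and (12), the following Python sketch performs one stage of analysis and synthesis with the Haar filters (h_0 = [1, 1]/√2, h_1 = [1, −1]/√2); the even-length test signal is an arbitrary example, and the perfect-reconstruction property is verified numerically.

import numpy as np

def haar_decompose(x):
    # One analysis stage of Eq. (11) with Haar filters,
    # followed by subsampling by two.
    x = np.asarray(x, dtype=float)        # length must be even
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # X[2k]:   scale information
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # X[2k+1]: detail information
    return approx, detail

def haar_reconstruct(approx, detail):
    # Synthesis stage of Eq. (12): upsample and filter with g_0, g_1
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
a, d = haar_decompose(x)
assert np.allclose(haar_reconstruct(a, d), x)   # perfect reconstruction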
As described above, the scaling function provides the low pass filter coefficients and the wavelet function provides the high pass filter coefficients. A multiresolution signal representation can be constructed based on the differences of information available at two successive resolutions 2^j and 2^{j−1}. Such a representation can be computed by decomposing a signal using the wavelet transform. First, the signal is filtered using the scaling function, a low pass filter. The filtered signal is then subsampled by keeping one out of every two samples. The result of low pass filtering and subsampling is called the scale information. If the signal has the resolution 2^j, the scale information provides the reduced resolution 2^{j−1}. The difference of information between resolutions 2^j and 2^{j−1} is called the "detail" signal at resolution 2^j. The detail signal is obtained by filtering the signal with the wavelet, a high pass filter, and subsampling by a factor of two.
In order to decompose an image, the above method for 1D sig-
nals is applied first along the rows of the image, and then along the
columns. The image, at resolution 2
j+1
, represented by A
j+1
, is first
low pass and high pass filtered along the rows. The result of each
filtering process is subsampled. Next, the subsampled results are
low pass and high pass filtered along each column. The results of
these filtering processes are again subsampled. The combination of
filtering and subsampling processes essentially provides the band
pass information. The frequency band denoted by A
j
in Fig. 1 is
Fig. 1. A three-level wavelet decomposition tree, where A means approximation
and D means detail.
[Figure 2 (diagram): the image is filtered with $H_0$ (lowpass) and $H_1$ (highpass) and subsampled by two, first along the rows (horizontal subsampling) and then along the columns (vertical subsampling), yielding the low-low band $A_j$ and the detail bands $D_j^1$ (low-high), $D_j^2$ (high-low), and $D_j^3$ (high-high).]

Fig. 2. Multiresolution decomposition of an image using the wavelet transform.
The low-low band $A_j$ contains the scaled low frequency information. The frequency bands labeled $D_j^1$, $D_j^2$, and $D_j^3$ denote the detail information; they are referred to as the low-high, high-low, and high-high frequency bands, respectively (Fig. 2). This scheme can be applied iteratively to further decompose the image into narrower frequency bands, i.e. each frequency band can be further decomposed into four narrower bands. Since each level of decomposition reduces the resolution by a factor of two, the length of the filter limits the number of levels of decomposition (Fig. 3).
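A sketch of this iterative row/column decomposition, again assuming PyWavelets: `wavedec2` reapplies the separable filtering and subsampling of Fig. 2 to the low-low band at each level, producing the subband pyramid of Fig. 3.

```python
# Sketch of the iterative decomposition of Figs. 2 and 3, assuming
# PyWavelets; random data stands in for an N x N medical image.
import numpy as np
import pywt

image = np.random.rand(256, 256)

# Each level splits the current low-low band into an approximation
# band and three detail bands (low-high, high-low, high-high).
coeffs = pywt.wavedec2(image, 'db2', level=3)

A3 = coeffs[0]             # low-low band A after three levels
LH, HL, HH = coeffs[1]     # detail bands D1, D2, D3 at the coarsest level
print(A3.shape, HH.shape)  # resolution is roughly halved at each level
```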
The signal decomposition at the j-th stage can thus be generalized as:

$$x[n] = \sum_{j=1}^{J} \left\{ \sum_{k\in Z} X^{(j)}[2k+1] \, g_1^{(j)}[n-2^j k] + \sum_{k\in Z} X^{(j)}[2k] \, g_0^{(j)}[n-2^j k] \right\}$$
$$X^{(j)}[2k] = \langle h_0^{(j)}[2^j k - l], x[l] \rangle$$
$$X^{(j)}[2k+1] = \langle h_1^{(j)}[2^j k - l], x[l] \rangle. \tag{13}$$
Wavelet based decomposition of a signal x[n] using a lowpass filter $h_0[k]$ (obtained from the scaling function) and a highpass filter $h_1[k]$ is shown in Fig. 4(A), while the reconstruction of the signal from wavelet coefficients is shown in Fig. 4(B).
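The analysis cascade of Fig. 4(A) can also be written out directly. The sketch below is a simplified version under stated assumptions: `h0` and `h1` stand for any pair of decomposition filters (for instance, a suitably normalized form of the Table 1 coefficients), and boundary handling is left to NumPy's default full convolution.

```python
# One analysis stage of the filter bank of Fig. 4(A), written out
# explicitly as a sketch; h0 and h1 are placeholder decomposition
# filters supplied by the caller.
import numpy as np

def analysis_stage(x, h0, h1):
    """Split x into approximation (lowpass) and detail (highpass)."""
    approx = np.convolve(x, h0)[::2]  # filter with h0, keep every 2nd sample
    detail = np.convolve(x, h1)[::2]  # filter with h1, keep every 2nd sample
    return approx, detail

def analysis_cascade(x, h0, h1, levels):
    """Iterate the stage on the approximation branch, as in Fig. 4(A)."""
    details = []
    for _ in range(levels):
        x, d = analysis_stage(x, h0, h1)
        details.append(d)  # detail coefficients X^(j)[2k+1] at stage j
    return x, details      # final approximation and all detail signals
```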
Fig. 3. Wavelet transform based image decomposition: the original resolution image ($N \times N$) is decomposed into four images: low-low $A_1$, low-high $D_1^1$, high-low $D_1^2$, and high-high $D_1^3$, each of which is subsampled to resolution $\left(\frac{N}{2} \times \frac{N}{2}\right)$. The low-low image is further decomposed into four images of $\left(\frac{N}{4} \times \frac{N}{4}\right)$ resolution each in the second level of decomposition. For a full decomposition, each of the "detail" components can also be decomposed into four subimages with $\left(\frac{N}{4} \times \frac{N}{4}\right)$ resolution each.
The "least asymmetric" wavelets were computed and reported by Daubechies.^1 Different least asymmetric wavelets were computed for different support widths, as larger support widths provide more regular wavelets, a desired property in signal and image processing. A least asymmetric wavelet is shown in Fig. 5, with the coefficients of the corresponding lowpass and highpass filters given in Table 1.
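In PyWavelets (assumed here), Daubechies' least asymmetric family is exposed as the symlets; `sym4` has eight coefficients like the wavelet of Fig. 5, although its normalization and sign conventions may differ from the values printed in Table 1.

```python
# Inspect the eight-coefficient least asymmetric (symlet) filters.
# PyWavelets is an assumption; its 'sym4' filters may differ from
# Table 1 in normalization and sign convention.
import pywt

w = pywt.Wavelet('sym4')
print(w.dec_lo)  # lowpass (scaling) decomposition coefficients
print(w.dec_hi)  # highpass (wavelet) decomposition coefficients
```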
18.4 IMAGE PROCESSING USING WAVELET TRANSFORM
The wavelet transform provides a set of coefficients representing the localized information in a number of frequency bands. A popular method for denoising and smoothing is to threshold the coefficients in those bands that have a high probability of noise and then reconstruct the image using the reconstruction filters.
Fig. 4. (A) A multiresolution signal decomposition using the wavelet transform and (B) the reconstruction of the signal from wavelet transform coefficients.
The reconstruction filters, as described in Eq. (12), can be derived from the decomposition filters using the quadrature mirror theory.^{1-4} The reconstruction process integrates information from specific bands with successive upscaling of resolution to provide the final reconstructed image at the same resolution as the input image. If certain coefficients related to noise or noise-like information are not included in the reconstruction process, the reconstructed image shows a reduction of noise and smoothing effects. As can be seen in Fig. 2, the coefficients available in the low-high, high-low and high-high frequency bands in the decomposition process provide edge related information that can be emphasized in the reconstruction process for image sharpening.^{5-8} Figure 6 shows an original X-ray mammogram image that is smoothed using the wavelet shown in Fig. 5. To obtain the smoothed image shown in Fig. 7, a hard thresholding method was used in which the high-high frequency wavelet coefficients were equated to zero and not used in the reconstruction process. The loss of high-frequency information can be seen in the smoothed image.
Fig. 5. The least asymmetric wavelet with eight coefficients.
Table 1. The Coefficients for the Corresponding Low Pass and High Pass Filters for the Least Asymmetric Wavelet

N       High Pass          Low Pass
0 −0.107148901418 0.045570345896
1 −0.041910965125 0.017824701442
2 0.703739068656 −0.140317624179
3 1.136658243408 −0.421234534204
4 0.421234534204 1.136658243408
5 −0.140317624179 −0.703739068656
6 −0.017824701442 −0.041910965125
7 0.045570345896 0.107148901418
Fig. 6. An original digital mammogram image.
Figure 8 shows the image reconstructed from the high-high wavelet coefficients only.
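A minimal sketch of the smoothing used for Fig. 7, assuming PyWavelets: decompose, set the high-high (diagonal detail) coefficients to zero, and reconstruct with the synthesis filters.

```python
# Sketch of wavelet smoothing by hard-zeroing the high-high band;
# PyWavelets is an assumption of this sketch.
import numpy as np
import pywt

def smooth_by_zeroing_hh(image, wavelet='sym4', levels=2):
    coeffs = pywt.wavedec2(image, wavelet, level=levels)
    kept = [coeffs[0]]  # keep the low-low band unchanged
    for (lh, hl, hh) in coeffs[1:]:
        kept.append((lh, hl, np.zeros_like(hh)))  # zero the HH band
    return pywt.waverec2(kept, wavelet)

smoothed = smooth_by_zeroing_hh(np.random.rand(128, 128))
```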
18.5 FEATURE EXTRACTION USING WAVELET
TRANSFORM FOR IMAGE ANALYSIS
Two-dimensional wavelet transform is widely used in image processing applications. Its ability to repeatedly decompose an image in the low frequency channels makes it ideal for image analysis, since the lower frequencies dominate real images. A smooth image has strong components only in the low frequencies, whereas a textured image has substantial components across a wide frequency/scale spectrum. Features related to the spatiofrequency representation of the image can be efficiently extracted and analyzed using the wavelet transform method.
Fig. 7. A smoothed version of the image shown in Fig. 6 obtained through the wavelet transform based smoothing method.
The wavelet transform provides one of the best representation methods for the analysis of texture information in an image. Texture has been widely used in image analysis for biomedical applications and satellite image analysis. It is an important characteristic of an image and is useful for image interpretation and recognition. The application of wavelet orthogonal representation to texture discrimination and fractal analysis has been discussed by Mallat.^2 Feature extraction for texture analysis and segmentation using wavelet transforms has been applied by Chang and Kuo,^9 Laine and Fan,^{10} Unser,^{11} and others.^{12-15}
Each level of decomposition provides bandpass filtered spatiofrequency information that can be used for feature extraction, representation and analysis. For example, energy ratios in specific subbands from the wavelet transform based multiresolution decomposition have been used in the characterization of skin lesion images for the detection of skin cancer, malignant melanoma.^{16-18} The epiluminescence images of skin lesions were obtained using a Nevoscope and used for classification using texture based features extracted through the wavelet transform based decomposition method.^{19-22} The method is briefly described here.^{21,22}
Fig. 9. Sample images (A) dysplastic nevus and (B) malignant melanoma.
Figure 9 shows sample images of a dysplastic nevus (nonmalignant lesion) and a malignant melanoma.
18.5.1 Feature Extraction Through Wavelet Transform
Three-level wavelet transform was applied to the epiluminescence images using the Daubechies 3 wavelet to obtain the 10 main wavelet subbands. Figure 10 shows the three-level wavelet decomposition coding of the image.

Fig. 10. Three-level wavelet decomposition of an image.
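A sketch of this decomposition step, assuming PyWavelets: a three-level `db3` transform yields one approximation band and nine detail bands, i.e. the 10 subbands (channels) referred to below. The channel ordering chosen here is only illustrative; the chapter does not spell out its channel numbering.

```python
# Three-level Daubechies-3 decomposition giving 10 subbands
# (1 approximation + 3 details per level); PyWavelets assumed,
# and the channel ordering is illustrative.
import numpy as np
import pywt

def wavelet_channels(image):
    coeffs = pywt.wavedec2(image, 'db3', level=3)
    channels = [coeffs[0]]        # coarsest approximation band
    for details in coeffs[1:]:    # detail tuples, coarsest to finest
        channels.extend(details)  # (LH, HL, HH) at each level
    return channels

print(len(wavelet_channels(np.random.rand(256, 256))))  # 10
```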
These channels (subbands) were further grouped into low- (channels 1-4), middle- (channels 5-7) and high-frequency (channels 8-10) groups. The ratio of the mean energy in the four low-frequency channels (1-4) to the mean energy in the three middle-frequency channels (5-7) was proposed as a criterion for optimal feature selection by Porter and Canagarajah.^{12} Similarly, a set of ratios of the wavelet coefficients was studied for the textural analysis, and the optimal set of features was obtained by statistical analysis.
The set of ratios studied is:

$$r_1 = \frac{m(c_1)}{m(c_{12})}; \quad r_2 = \frac{m(c_{12})}{m(c_{11})}; \quad r_3 = \frac{m(c_2)+m(c_3)+m(c_4)}{m(c_5)+m(c_6)+m(c_7)};$$
$$r_4 = \frac{m(c_5)+m(c_6)+m(c_7)}{m(c_8)+m(c_9)+m(c_{10})};$$
$$r_5 = \frac{m(c_1)}{m(c_{12})} \times \frac{m(c_2)+m(c_3)+m(c_4)}{m(c_5)+m(c_6)+m(c_7)};$$
$$r_6 = \frac{m(c_{12})}{m(c_{11})} \times \frac{m(c_5)+m(c_6)+m(c_7)}{m(c_8)+m(c_9)+m(c_{10})};$$
$$r_7 = \frac{m(c_1)}{m(c_2)+m(c_3)+m(c_4)} \div \frac{m(c_{12})}{m(c_5)+m(c_6)+m(c_7)};$$
$$r_8 = \frac{m(c_{11})}{m(c_5)+m(c_6)+m(c_7)} \div \frac{m(c_{12})}{m(c_8)+m(c_9)+m(c_{10})}, \tag{14}$$
where $c_i$ stands for the different wavelet channels, $i = 1, 2, \ldots, 10$, of the decomposition, and m stands for the mean value of the wavelet coefficients of a given channel:

$$m = \frac{\sum_i \sum_j x_{ij}}{length \times breadth}, \tag{15}$$

where $x_{ij}$ is the computed coefficient of the wavelet transform, and length and breadth are the dimensions of the respective decomposed channel.
The variance of the wavelet coefficients is given by:

$$\varepsilon = \frac{\sum_i \sum_j (x_{ij} - mean)^2}{length \times breadth}, \tag{16}$$
where mean represents the mean of the wavelet coefficients.
The entropy measure for texture analysis can be defined as:

$$H = \frac{\sum_i \sum_j x_{ij}^2 \log (x_{ij}^2)}{length \times breadth}. \tag{17}$$
The energy of the wavelet coefficients is defined as follows:

$$E = \frac{\sum_i \sum_j x_{ij}^2}{length \times breadth}. \tag{18}$$
The set of ratios mentioned earlier was calculated for the mean, variance, energy and entropy of the wavelet coefficients, giving 32 ratios in all, which henceforth are referred to as features. Gray level features, namely the mean and standard deviation of the image intensity, were also included in the feature set. Thus, 34 features were considered in this texture analysis.
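The channel statistics of Eqs. (15)-(18), and one of the Eq. (14) ratios as an example, can be sketched as follows. This is a minimal numpy version: `c` is the 2D coefficient array of one channel (length x breadth in the text), `ch` maps channel index to coefficient array, and the small epsilon in the entropy guards against log(0), a detail the printed formula leaves implicit.

```python
# Channel statistics of Eqs. (15)-(18) for one subband array c.
import numpy as np

def channel_mean(c):
    return c.sum() / c.size                          # Eq. (15)

def channel_variance(c):
    return ((c - c.mean()) ** 2).sum() / c.size      # Eq. (16)

def channel_entropy(c):
    c2 = c ** 2
    return (c2 * np.log(c2 + 1e-12)).sum() / c.size  # Eq. (17)

def channel_energy(c):
    return (c ** 2).sum() / c.size                   # Eq. (18)

# Ratio r3 of Eq. (14) as an example; the channel indexing of the
# dict `ch` is illustrative.
def r3(ch):
    m = channel_mean
    return (m(ch[2]) + m(ch[3]) + m(ch[4])) / (m(ch[5]) + m(ch[6]) + m(ch[7]))
```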
Statistical correlation analysis was performed on the extracted features to select statistically significant and correlated features. The analysis provided a reduced set of the following five features with the highest statistical significance:
$$f_1 = \frac{m(c_1)}{m(c_{12})}; \quad f_2 = \frac{m(c_{12})}{m(c_{11})}; \quad f_3 = \frac{e(c_{12})}{e(c_{11})};$$
$$f_4 = \frac{et(c_1)}{et(c_2)+et(c_3)+et(c_4)} \div \frac{et(c_{12})}{et(c_5)+et(c_6)+et(c_7)};$$
$$f_5 = \ln (std + 1), \tag{19}$$

where m, e, et and std stand for the mean, energy and entropy of the wavelet coefficients and the standard deviation of the image intensity, respectively.
The selected features were then used in training a nearest-neighborhood classifier (described in Chapter 10) using a training set of pathologically validated labeled images. The trained classifier was then used to classify those images that were not included in the training set. Results of the nearest-neighborhood classifier were compared to the pathology to obtain the true positive and false positive rates of melanoma detection. A true positive rate of 93% for melanoma detection was obtained with a false positive rate of 0% through this analysis.^{20}
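A sketch of the classification step with scikit-learn's 1-nearest-neighbor classifier; this library choice is an assumption, since the chapter describes the classifier in Chapter 10 without naming an implementation, and the data below are placeholders rather than the study's features.

```python
# Nearest-neighbor classification sketch; scikit-learn assumed.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data: rows of the five selected features f1..f5 per
# lesion image, with pathology-validated labels (1 = melanoma).
X_train = np.random.rand(40, 5)
y_train = np.random.randint(0, 2, 40)

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
y_pred = clf.predict(np.random.rand(10, 5))
```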
18.6 CONCLUDING REMARKS
Wavelet transform has been effectively used for one- and multi-dimensional data analysis in a number of applications, including medical image analysis. The wavelet transform provides simple series expansion based signal decomposition and reconstruction methods for the localization of characteristic events associated with frequency and time/space information. Utilizing the property of orthonormal basis functions with scaling and shifting operations, multiresolution wavelet packet analysis provides localized responses equivalent to multiband filters, but in a computationally efficient manner. The wavelet transform can be implemented through a simple modular algorithm suitable for fast or real-time applications in any kind of data analysis.
Wavelet transform has been used for image enhancement, restoration and reconstruction of medical images. The localized spatiofrequency information available through the wavelet transform can be effectively used for defining specific features for image representation, characterization and classification. Multidimensional expansion of the wavelet transform and adaptive design of wavelets for specific image processing tasks have become areas of significant research interest in recent years and will continue to be a productive research area in the near future.
References
1. Daubechies I, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1992.
2. Mallat S, A theory for multiresolution signal decomposition: The wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11: 674–693, 1989.
3. Mallat S, Wavelets for a vision, Proceedings of the IEEE 84: 604–614, 1996.
4. Cohen A, Kovacevic J, Wavelets: The mathematical background, Proceedings of the IEEE 84: 514–522, 1996.
5. Bovik A, Clark M, Geisler W, Multichannel texture analysis using localized spatial filters, IEEE Transactions on Pattern Analysis and Machine Intelligence 12: 55–73, 1990.
6. Weaver JB, Yansun X, Healy Jr DM, Cromwell LD, Filtering noise from images with wavelet transforms, Magnetic Resonance in Medicine 21: 288–295, 1991.
7. Pentland AP, Interpolation using wavelet bases, IEEE Transactions on Pattern Analysis and Machine Intelligence 16: 410–414, 1994.
8. Yaou MH, Chang WT, Fast surface interpolation using multiresolution wavelet transform, IEEE Transactions on Pattern Analysis and Machine Intelligence 16: 673–688, 1994.
9. Chang T, Kuo CCJ, Texture analysis and classification with tree-structure wavelet transform, IEEE Trans Image Process 2(4): 429–447, 1993.
10. Laine A, Fan J, Texture classification by wavelet packet signatures, IEEE Trans Pattern Anal Mach Intell 15(11): 1186–1191, 1993.
11. Unser M, Texture classification and segmentation using wavelet frames, IEEE Trans Image Process 4(11): 1549–1560, 1995.
12. Porter R, Canagarajah N, A robust automatic clustering scheme for image segmentation using wavelets, IEEE Trans Image Process 5(4): 662–665, 1996.
13. Wang JW, Chen CH, Chien WM, Tsai CM, Texture classification using non-separable two-dimensional wavelets, Pattern Recognition Letters 19: 1225–1234, 1998.
14. Chitre Y, Dhawan A, M-band wavelet discrimination of natural textures, Pattern Recognition Letters, 773–789, 1999.
15. van Erkel AR, Pattynama PMTh, Receiver operating characteristic (ROC) analysis: Basic principles and applications in radiology, European Journal of Radiology 27: 88–94, 1998.
16. Kopf A, Saloopek T, Slade J, Marghood A, et al., Techniques of cutaneous examination for the detection of skin cancer, Cancer Supplement 75(2): 684–690, 1994.
17. Koh H, Lew R, Prout M, Screening for melanoma/skin cancer: Theoretical and practical considerations, J Am Acad Dermatol 20: 159–172, 1989.
18. Stoecker W, Moss R, Skin Cancer Recognition by Computer Vision: Progress Report, National Science Foundation Grant ISI 8521284, August 29, 1988.
19. Dhawan AP, Early detection of cutaneous malignant melanoma by three-dimensional Nevoscopy, Computer Methods and Programs in Biomedicine 21: 59–68, 1985.
20. Nimukar A, Dhawan A, Relue P, Patwardhan S, Wavelet and statistical analysis for melanoma classification, SPIE International Conference on Medical Imaging, MI 4684, 1346–1353, Feb 24–28, 2002.
21. Patwardhan S, Dhawan AP, Relue P, Classification of melanoma using tree-structured wavelet transform, Computer Methods and Programs in Biomedicine 72(3): 223–239, 2003.
22. Patwardhan S, Dai S, Dhawan AP, Multispectral image analysis and classification of melanoma using fuzzy membership based partitions, Computerized Medical Imaging and Graphics 29: 287–296, 2005.
CHAPTER 19
Multiclass Classification for Tissue
Characterization
Atam P Dhawan
Computer aided diagnostic applications such as cancer detection may require a binary classification into benign and malignant classes. However, there are many medical imaging applications requiring multiclass classification to categorize image data into more than two classes for tissue or pathology characterization. This chapter provides an introduction to some of the approaches, such as Bayesian classification, support vector machines, and neuro-fuzzy systems, that can be applied in multiclass classification.
19.1 INTRODUCTION
Conventional methods for computer-aided medical image analysis for the detection of an outcome or pathology such as cancer usually require a binary classification of acquired image data. However, other medical image analysis applications, such as segmentation and tissue characterization from multiparameter images, may require multiclass classification. For example, brain images acquired through multiparameter multidimensional imaging protocols may be analyzed for multiclass segmentation for tissue characterization for the evaluation and detection of critical neurological functions and disorders. Several chapters in this book describe current and emerging trends in multiparameter brain imaging and radiation therapy that can benefit from multiclass classification approaches. Fusion of anatomical, metabolic and functional information usually leads to multidimensional data sets and to the analysis of local regions obtained from segmentation and detection approaches based on multiclass classification. In this chapter, we present some of the multiclass classification methods suitable for multiparameter medical image analysis.
19.2 MULTICLASS CLASSIFICATION USING MAXIMUM
LIKELIHOOD DISCRIMINANT FUNCTIONS
Medical image preprocessing and feature extraction analysis leads to a set of spatially distributed multidimensional data vectors of raw measurements and computed features. The total number of measurements and computed features allocated to each pixel in the image sets the dimension d of the feature space. Let us assume that we have an image of m rows and n columns, with mn pixels to be classified into k classes. Thus, we have mn data vectors $X = \{x_j;\ j = 1, 2, \ldots, mn\}$ distributed in a d-dimensional feature space, i.e. each element of the data vector (i.e. pixel in the image) is associated with a d-dimensional feature vector. The purpose of multiclass classification is to find a mapping f(X) that maps the input data vectors into k classes denoted by $C = \{c_i;\ i = 1, 2, \ldots, k\}$. In order to learn such a mapping, we can use a training set S of cardinality l with labeled input vectors such that:

$$S = \{(x_1, c_1), \ldots, (x_l, c_l)\}, \tag{1}$$

where the $x_i \in \chi$ are provided in the inner-product space $\chi \subseteq R^d$ and $c_i \in \gamma = \{1, \ldots, k\}$ is the corresponding class or category label.
As shown in Eq. (1), there is a pair relationship of the assignment of each input pixel X to a class C. Let us assume that each class $c_i$ model obtained from the training set has a mean vector $\hat{\mu}_i$ and a covariance matrix $\hat{\Sigma}_i$ such that:

$$\hat{\mu}_i = \frac{1}{n} \sum_j x_j, \tag{2}$$

where $i = 1, 2, \ldots, k$ and $j = 1, \ldots, n$; n is the number of pixel vectors in the i-th class, and $x_j$ is the j-th of the n multidimensional vectors that comprise the class. The dimension of $x_j$ corresponds to the number of image modalities used in the analysis. The covariance matrix of class i, $\hat{\Sigma}_i$, is:

$$\hat{\Sigma}_i = \frac{1}{n-1} \sum_j (x_j - \hat{\mu}_i)(x_j - \hat{\mu}_i)^T. \tag{3}$$
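A minimal numpy sketch of the parameter estimation of Eqs. (2) and (3) over labeled training pixels; the function name and data layout are illustrative.

```python
# Per-class mean vector and covariance matrix of Eqs. (2) and (3).
import numpy as np

def class_statistics(X, y, k):
    """X: (num_pixels, d) feature vectors; y: labels in 0..k-1."""
    stats = {}
    for i in range(k):
        Xi = X[y == i]
        mu = Xi.mean(axis=0)            # Eq. (2)
        cov = np.cov(Xi, rowvar=False)  # Eq. (3): 1/(n-1) normalization
        stats[i] = (mu, cov)
    return stats
```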
For developing an estimation model,^{1-3} let us assume that the image to classify is a realization of a pair of random variables $\{C_{mn}, X_{mn}\}$, where $C_{mn}$ is the class of the pixel mn. $C_{mn}$ represents the spatial variability of the class in the image and can take values in a discrete set $\{1, 2, \ldots, k\}$. $X_{mn}$ is a d-dimensional random variable of pixel mn describing the variability of measurements for that pixel, i.e. $X_{mn}$ describes the variability of the observed values x in a particular class. Given that $C_{mn} = i$ $(i = 1, 2, \ldots, k)$, the distribution of $X_{mn}$ is estimated to obey the general multivariate normal distribution described by the density function:

$$\hat{p}(x) = \frac{1}{(2\pi)^{d/2} \left|\hat{\Sigma}_i\right|^{1/2}} \exp \left( -\frac{1}{2} (x - \hat{\mu}_i)^T \hat{\Sigma}_i^{-1} (x - \hat{\mu}_i) \right), \tag{4}$$

where x is a d-element column vector, $\hat{\mu}_i$ is the d-element estimated mean vector for class i calculated from the training set, $\hat{\Sigma}_i$ is the estimated $d \times d$ covariance matrix for class i, also calculated from the training set, and d is the dimension of the multiparameter or feature vector.
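The per-pixel maximum likelihood assignment under the Gaussian model of Eq. (4) can be sketched with scipy's multivariate normal density (an assumed library choice); `stats` is the per-class (mean, covariance) dict estimated above.

```python
# Maximum likelihood class assignment under Eq. (4); scipy assumed.
import numpy as np
from scipy.stats import multivariate_normal

def ml_classify(x, stats):
    """Return the class i whose density (Eq. 4) is largest at x."""
    scores = {i: multivariate_normal.logpdf(x, mean=mu, cov=cov)
              for i, (mu, cov) in stats.items()}
    return max(scores, key=scores.get)
```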
Maximum likelihood based discriminant analysis can then be used to assign a class to a given pixel in the image.^{1-4} For each pixel, four transition matrices $P_r(m, n) = [p_{ijr}(m, n)]$ can be estimated, where r is a direction index (following the four spatial connectedness directions in the image) and $p_{ijr}(m, n)$ are the transition probabilities defined by:

$$p_{ij1}(m, n) = P\{C_{mn} = j \mid C_{m,n-1} = i\}, \tag{5}$$
$$p_{ij2}(m, n) = P\{C_{mn} = j \mid C_{m+1,n} = i\}, \tag{6}$$
$$p_{ij3}(m, n) = P\{C_{mn} = j \mid C_{m,n+1} = i\}, \tag{7}$$
$$p_{ij4}(m, n) = P\{C_{mn} = j \mid C_{m-1,n} = i\}. \tag{8}$$
A generalized estimation of the transition probabilities for the classes can be obtained using the b images in the training set, averaged over a small neighborhood of h pixels around the pixel mn, as:

$$p_{ij1}(m, n) = \frac{\sum_b \sum_n \{pix \mid C_{mn} = j,\ C_{m,n-1} = i\}}{\sum_b \sum_n \{pix \mid C_{m,n-1} = i\}}$$
$$p_{ij2}(m, n) = \frac{\sum_b \sum_n \{pix \mid C_{mn} = j,\ C_{m+1,n} = i\}}{\sum_b \sum_n \{pix \mid C_{m+1,n} = i\}}$$
$$p_{ij3}(m, n) = \frac{\sum_b \sum_n \{pix \mid C_{mn} = j,\ C_{m,n+1} = i\}}{\sum_b \sum_n \{pix \mid C_{m,n+1} = i\}}$$
$$p_{ij4}(m, n) = \frac{\sum_b \sum_n \{pix \mid C_{mn} = j,\ C_{m-1,n} = i\}}{\sum_b \sum_n \{pix \mid C_{m-1,n} = i\}}, \tag{9}$$
where $\sum_b \{pix \mid CP\}$ denotes the number of pixels with the property CP in the images of the training set used to generate the model, and $\sum_n$ represents the number of pixels with the given property in the predefined neighborhood.
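A counting sketch of the estimate in Eq. (9) for one direction and one labeled training image; extending it to the four directions and to the sums over b images and the local neighborhood is analogous. The function name and label encoding are illustrative.

```python
# Counting estimate of the direction-1 transition probabilities of
# Eq. (9) from one labeled image C with class labels 0..k-1.
import numpy as np

def transition_matrix_dir1(C, k):
    """counts[i, j] ~ P{C[m, n] = j | C[m, n-1] = i}."""
    counts = np.zeros((k, k))
    prev, curr = C[:, :-1].ravel(), C[:, 1:].ravel()
    for i, j in zip(prev, curr):
        counts[i, j] += 1
    totals = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(totals, 1)  # avoid division by zero
```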
The equilibrium transition probabilities can then be estimated using a similar procedure as:

$$\pi_i(mn) = \frac{\sum_b \sum_n \{pix \mid C_{mn} = i\}}{\sum_b \sum_n \{pix\}}. \tag{10}$$
19.2.1 Maximum Likelihood Discriminant Analysis
The class random variable $C_{mn}$ is assumed to constitute a k-state Markov random field. Rows and columns of $C_{mn}$ constitute segments of k-state Markov chains. The chains are specified by the $k \times k$ transition matrix $P = [p_{ij}]$ where:

$$p_{ij} = P\{C_{mn} = j \mid C_{m,n-1} = i\}, \tag{11}$$

which leads to the equilibrium probabilities $(\pi_1, \pi_2, \ldots, \pi_K)$.
Using the above model,^1 the probability of each pixel belonging to a specific class i is:

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\}, \tag{12}$$

where N(m, n) is a predefined neighborhood of the pixel (m, n). For example, a four-connected neighborhood around a pixel mn can be defined as:

$$N(m, n) = \{(m, n), (m-1, n), (m, n-1), (m+1, n), (m, n+1)\}. \tag{13}$$
It follows that:

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{C_{mn} = i,\ X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}} \tag{14}$$
and

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i,\ X_{m\pm1,n}, X_{m,n\pm1}\} \, P\{C_{mn} = i \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}, \tag{15}$$
where

$$P\{\circ \mid X_{m\pm1,n}, X_{m,n\pm1}\} \equiv P\{\circ \mid X_{m-1,n}\} \, P\{\circ \mid X_{m,n-1}\} \, P\{\circ \mid X_{m+1,n}\} \, P\{\circ \mid X_{m,n+1}\}. \tag{16}$$
Taking into account the class conditional independence, Eq. (15) can be stated as:

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i\} \, P\{C_{mn} = i \mid X_{m\pm1,n}, X_{m,n\pm1}\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\}}. \tag{17}$$
With the Bayes estimation method, the above expression leads to:

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{X_{mn} \mid C_{mn} = i\} \, P\{X_{m\pm1,n}, X_{m,n\pm1} \mid C_{mn} = i\} \, P\{C_{mn} = i\}}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\} \, P\{X_{m\pm1,n}, X_{m,n\pm1}\}}, \tag{18}$$
where

$$P\{X_{m\pm1,n}, X_{m,n\pm1} \mid \circ\} \equiv P\{X_{m-1,n} \mid \circ\} \, P\{X_{m,n-1} \mid \circ\} \, P\{X_{m+1,n} \mid \circ\} \, P\{X_{m,n+1} \mid \circ\}. \tag{19}$$
The terms in Eq. (19) can be further expressed as:

$$P\{X_{m-1,n} \mid C_{mn} = i\} = \sum_{j=1}^{N} P\{X_{m-1,n} \mid C_{m-1,n} = j\} \, P\{C_{m-1,n} = j \mid C_{mn} = i\} \equiv H_{m-1,n}(i). \tag{20}$$
Finally, substituting Eqs. (19) and (20) into Eq. (18), the probability of the current pixel mn belonging to class i, given the characteristics of the pixels in the neighborhood of mn, can now be defined as:

$$P\{C_{mn} = i \mid x_{kl},\ (k, l) \in N(m, n)\} = \frac{P\{C_{mn} = i \mid X_{mn}\} \, P\{X_{mn}\} \, H_{m-1,n}(i) \, H_{m,n-1}(i) \, H_{m+1,n}(i) \, H_{m,n+1}(i)}{P\{X_{mn} \mid X_{m\pm1,n}, X_{m,n\pm1}\} \, P\{X_{m\pm1,n}, X_{m,n\pm1}\}}. \tag{21}$$
Equation (21) shows that the conventional expression for the class probabilities, denoted by $P\{C_{mn} = i \mid X_{mn}\}P\{X_{mn}\}$, is modified by the factors $H_{ij}$ according to the evidence found in the immediate neighborhood. Each pixel is assigned the class that maximizes Eq. (21).
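The per-pixel decision implied by Eqs. (20) and (21) can be sketched as follows. This is a hedged, unnormalized version: the denominators of Eq. (21) are constant across classes and are dropped, since only the argmax is needed, and all inputs (likelihood vectors, transition matrix, equilibrium probabilities) are assumed to have been estimated as described above.

```python
# Class decision of Eq. (21) for one pixel. P is the k x k transition
# matrix, pi the equilibrium probabilities, px_like the vector of
# P{X_mn | C_mn = i}, and nb_likes the four analogous vectors for the
# four-connected neighbors.
import numpy as np

def classify_pixel(px_like, nb_likes, P, pi):
    score = px_like * pi  # pixel likelihood times class prior
    for nb in nb_likes:
        # Eq. (20): H(i) = sum_j P{X_nb | C_nb = j} P{C_nb = j | C_mn = i}
        score = score * (P @ nb)
    return int(np.argmax(score))
```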
19.3 NEURO-FUZZY CLASSIFIERS FOR MULTICLASS
CLASSIFICATION
Pattern recognition systems such as the backpropagation neural network, the radial basis function (RBF) network, or the k-nearest-neighbor (KNN) classifier can provide multiclass classification using crisp decision surfaces, which often suffer from low immunity to noise in the training patterns. Neural networks and clustering methods for classification are described in Chapter 10 of this book. To overcome the problems of crisp function based classifiers, fuzzy functions have been used for classification applications.
Several approaches using fuzzy set theory for pattern recognition can be found in a number of publications.^{5-19} A novel pattern recognition method using fuzzy functions with a winner-take-all strategy is presented here that can be used for multiclass classification. In this approach, the feature space is first partitioned into all categories using the training data. The data are thus transformed into convex sets in the feature space. This is achieved by dividing them into homogeneous (containing only points from one category), nonoverlapping, closed convex subsets, and then placing separating hyperplanes between neighboring subsets from different categories. The hyperplane separation of the obtained subsets with homogeneous convex regions provides the consecutive network layer, which determines what region a given input pattern belongs to. In our approach, a fuzzy membership function $M_f$ is devised for each created convex subset ($f = 1, 2, \ldots, k$). The classification decision is made by the output layer based on the "winner-take-all" principle: the resulting category C is the category of the convex set with the highest value of the membership function for the input pattern. A schematic diagram of such a neuro-fuzzy classification system is shown in Fig. 1.^5
19.3.1 Convex Set Creation
There are two requirements for the convex sets: they have to be homogeneous and nonoverlapping. To satisfy the first condition, one needs to devise a method for finding one category's points within another category's hull. Thus, two problems can be defined: (1) how to find whether a point P lies inside a convex hull (CH) of points; (2) how to find out whether two convex hulls of points are overlapping. The second problem is more difficult to examine because hulls can be overlapping over a common (empty) space that contains no points from either category. This problem can be defined as a generalization of the first one,^{20} and the first condition can be seen as a special case of