PART III

WAVEFRONT ANALYSIS

VIRENDRA N. MAHAJAN

THE AEROSPACE CORPORATION


AND
COLLEGE OF OPTICAL SCIENCES - THE UNIVERSITY OF ARIZONA

Bellingham, Washington USA


Library of Congress Cataloging-in-Publication Data

Mahajan, Virendra N.
Optical imaging and aberrations, part III: wavefront analysis / Virendra N. Mahajan
pages cm.
Includes bibliographical references and index.
ISBN 978-0-8194-9111-4
1. Optical measurements. 2. Aberration--Measurement. 3. Orthogonal decompositions.
4. Orthogonal polynomials. I. Title.
QC367.M24 2013
621.36--dc23
2013018827

Published by
SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA
Phone: +1 360.676.3290
Fax: +1 360.647.1445
Email: Books@spie.org
Web: http://spie.org

Copyright © 2013 Society of Photo-Optical Instrumentation Engineers

All rights reserved. No part of this publication may be reproduced or distributed in any
form or by any means without written permission of the publisher.

The content of this book reflects the work and thought of the author(s). Every effort has
been made to publish reliable and accurate information herein, but the publisher is not
responsible for the validity of the information or for any outcomes resulting from reliance
thereon.

Printed in the United States of America.


First printing

Front cover: Shown from left to right are the aberration-free PSFs of optical imaging
systems with circular, annular, hexagonal, elliptical, rectangular, and square pupils.

To my grandchildren

Maya, Leela, Rohan, and Krishan

FOREWORD

For years Vini Mahajan has been publishing a book series on optical imaging and
aberrations. Part I of the series on Ray Geometrical Optics was published in 1998, and
Part II on Wave Diffraction Optics followed in 2001. A second edition of Part II appeared
in 2011. Now Vini has written Part III on Wavefront Analysis, which should be of interest
to anyone working in the fields of optical design, fabrication, or testing.

Wavefront Analysis is focused on the use of orthonormal polynomials for wavefront
analysis of optical imaging systems with pupils of different shapes. The book starts with
an excellent introduction to optical imaging and aberrations. These first two chapters
should be of interest to anyone working in optics. Chapter 3 describes orthonormal
polynomials and the Gram–Schmidt orthonormalization process for obtaining
orthonormal polynomials over one domain from those that are orthonormal over another.

Chapter 4 is a long and complete chapter on imaging and aberrations for optical
systems with circular pupils. The chapter covers the PSF and OTF for aberration-free
imaging, Strehl ratio and aberration balancing and tolerancing, and a very complete
description of Zernike circle polynomials. Isometric, interferometric, and imaging
characteristics of the circle polynomial aberrations are very nicely explained and
illustrated. The important relationship between the circle polynomials and the classical
aberrations is discussed. Since optical systems generally have circular pupils, this chapter
will be of use to almost anyone working in optics.

The next several chapters are intended for readers interested in optical systems with
noncircular or apodized circular or annular pupils. Much of this material is difficult to
find in such detail elsewhere. The chapters start with a brief discussion of aberration-free
imaging that includes both the PSF and the OTF of the optical system, as this is
potentially the ultimate goal of any optical design or test. Then the polynomials
appropriate for systems with pupils of different shapes representing balanced classical
aberrations are described in detail. As in the case of the circle polynomial aberrations, the
isometric, interferometric, and PSF plots of the first forty-five polynomial aberrations for
systems with hexagonal, elliptical, annular, rectangular, and square pupils facilitate
understanding of their significance. Systems with circular and annular pupils with
Gaussian illumination, anamorphic systems with square and circular pupils, and those
with circular and annular sector pupils are also discussed thoroughly.

Anyone thinking of using the Zernike circle polynomials for wavefront analysis of
systems with noncircular pupils should read Chapter 12, where their pitfalls are
illustrated by applying them to systems with annular and hexagonal pupils. Numerical
examples on the calculation of the orthonormal aberration coefficients from the
wavefront or the wavefront slope data given in Chapter 14 add to the utility and
practicality of the book. A summary at the end of each chapter is quite useful, as it
describes the essence of the content.

Vini is an excellent writer with the gift of presenting complex topics in a simplified, yet
rigorous, manner. As in the first two volumes of this book series, the material presented
in Part III is thorough and detailed, and much of it is from his own publications.
Wavefront Analysis is primarily analytical in nature, but it is generally easy to read with a
lot of examples and numerical results. Both students and experienced optical engineers
and scientists who have a need for wavefront analysis of optical systems will find it to be
extremely useful.

Tucson, Arizona James C. Wyant


June 2013

TABLE OF CONTENTS

PART III. WAVEFRONT ANALYSIS


Preface ........................................................................................................................... xvii
Acknowledgments .......................................................................................................... xix
Symbols and Notation.................................................................................................... xxi

CHAPTER 1: OPTICAL IMAGING ............................................................. 1


1.1 Introduction ............................................................................................................................ 3
1.2 Diffraction Image ................................................................................................................... 3
1.2.1 Pupil Function .......................................................................................................... 4
1.2.2 PSF ........................................................................................................................... 5
1.2.3 OTF .......................................................................................................................... 6
1.3 Strehl Ratio ............................................................................................................................. 7
1.3.1 General Expression .................................................................................................. 7
1.3.2 Approximate Expression in Terms of Aberration Variance ..................................... 9
1.4 Aberration Balancing ........................................................................................................... 10
1.5 Summary ............................................................................................................................... 11
References ........................................................................................................................................ 12

CHAPTER 2: OPTICAL WAVEFRONTS AND THEIR ABERRATIONS .......... 13


2.1 Introduction .......................................................................................................................... 15
2.2 Optical Imaging .................................................................................................................... 15
2.3 Wave and Ray Aberrations ................................................................................................. 17
2.4 Defocus Aberration .............................................................................................................. 22
2.5 Wavefront Tilt ...................................................................................................................... 23
2.6 Aberration Function of a Rotationally Symmetric System .............................................. 25
2.7 Observation of Aberrations: Interferograms .................................................................... 29
2.8 Summary ............................................................................................................................... 31
References ........................................................................................................................................ 33

CHAPTER 3: ORTHONORMAL POLYNOMIALS AND GRAM–SCHMIDT


ORTHONORMALIZATION................................................... 35
3.1 Introduction .......................................................................................................................... 37
3.2 Orthonormal Polynomials ................................................................................................... 37
3.3 Equivalence of Orthogonality-Based Coefficients and Least-Squares Fitting ............... 39
3.4 Orthonormalization of Zernike Circle Polynomials over Noncircular Pupils ............... 40
3.5 Unit Pupil .............................................................................................................................. 43
3.6 Summary ............................................................................................................................... 43
References ........................................................................................................................................ 46

CHAPTER 4: SYSTEMS WITH CIRCULAR PUPILS...................................... 47


4.1 Introduction .......................................................................................................................... 49
4.2 Pupil Function....................................................................................................................... 49
4.3 Aberration-Free Imaging .................................................................................................... 50
4.3.1 PSF ......................................................................................................................... 51
4.3.2 OTF ........................................................................................................................ 53
4.4 Strehl Ratio and Aberration Tolerance.............................................................................. 54
4.4.1 Strehl Ratio............................................................................................................. 54
4.4.2 Defocus Strehl Ratio............................................................................................... 55
4.4.3 Approximate Expressions for Strehl Ratio............................................................. 56
4.5 Balanced Aberrations........................................................................................................... 57
4.6 Description of Zernike Circle Polynomials ........................................................................ 63
4.6.1 Analytical Form...................................................................................................... 63
4.6.2 Circle Polynomials in Polar Coordinates ............................................................... 65
4.6.3 Polynomial Ordering .............................................................................................. 65
4.6.4 Number of Circle Polynomials through a Certain Order n .................................... 65
4.6.5 Relationships among the Indices n, m, and j .......................................................... 69
4.6.6 Uniqueness of Circle Polynomials ......................................................................... 69
4.6.7 Circle Polynomials in Cartesian Coordinates......................................................... 70
4.7 Zernike Circle Coefficients of a Circular Aberration Function ...................................... 70
4.8 Symmetry Properties of Images Aberrated by a Circle Polynomial Aberration ........... 74
4.8.1 Symmetry of PSF ................................................................................................... 74
4.8.2 Symmetry of OTF................................................................................................... 76
4.9 Isometric, Interferometric, and Imaging Characteristics of
Circle Polynomial Aberrations ........................................................................................... 78
4.9.1 Isometric Characteristics ........................................................................................ 78
4.9.2 Interferometric Characteristics ............................................................................... 78
4.9.3 PSF Characteristics ................................................................................................ 83
4.9.4 OTF Characteristics ............................................................................................... 84
4.10 Circle Polynomials and Their Relationships with Classical Aberrations ....................... 88
4.10.1 Introduction ............................................................................................................ 88
4.10.2 Wavefront Tilt and Defocus ................................................................................... 88
4.10.3 Astigmatism ........................................................................................................... 89
4.10.4 Coma....................................................................................................................... 90
4.10.5 Spherical Aberration............................................................................................... 90
4.10.6 Seidel Coefficients from Zernike Coefficients ....................................................... 91
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing ........................... 92
4.11 Zernike Coefficients of a Scaled Pupil ............................................................................... 92
4.11.1 Theory .................................................................................................................... 92
4.11.2 Application to a Seidel Aberration Function.......................................................... 97
4.11.3 Numerical Example................................................................................................ 99
4.12 Summary ............................................................................................................................. 102
References ...................................................................................................................................... 103

CHAPTER 5: SYSTEMS WITH ANNULAR PUPILS .................................... 105


5.1 Introduction ........................................................................................................................ 107
5.2 Aberration-Free Imaging .................................................................................................. 107
5.2.1 PSF ....................................................................................................................... 107
5.2.2 OTF ...................................................................................................................... 109
5.3 Strehl Ratio and Aberration Balancing............................................................................ 111
5.4 Orthonormalization of Circle Polynomials over an Annulus ......................................... 114
5.5 Annular Polynomials ......................................................................................................... 116
5.6 Annular Coefficients of an Annular Aberration Function ............................................. 123
5.7 Strehl Ratio for Annular Polynomial Aberrations ......................................................... 129
5.8 Isometric, Interferometric, and Imaging Characteristics of
Annular Polynomial Aberrations ..................................................................................... 132
5.9 Summary ............................................................................................................................. 139
References ...................................................................................................................................... 140

CHAPTER 6: SYSTEMS WITH GAUSSIAN PUPILS ................................... 141


6.1 Introduction ........................................................................................................................ 143
6.2 Gaussian Pupil .................................................................................................................... 144
6.3 Aberration-Free Imaging .................................................................................................. 145
6.3.1 PSF ....................................................................................................................... 145
6.3.2 Optimum Gaussian Radius.................................................................................. 146
6.3.3 OTF ...................................................................................................................... 147
6.4 Strehl Ratio and Aberration Balancing............................................................................ 149
6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil . 153
6.6 Gaussian Circle Polynomials Representing Balanced Primary Aberrations for a
Gaussian Circular Pupil..................................................................................................... 155
6.7 Weakly Truncated Gaussian Pupils ................................................................................. 156
6.8 Aberration Coefficients of a Gaussian Circular Aberration Function......................... 157
6.9 Orthonormalization of Annular Polynomials over a Gaussian Annular Pupil ............ 157
6.10 Gaussian Annular Polynomials Representing Balanced Primary Aberrations for a
Gaussian Annular Pupil ..................................................................................................... 159
6.11 Aberration Coefficients of a Gaussian Annular Aberration Function ......................... 161
6.12 Summary ............................................................................................................................. 161
References ...................................................................................................................................... 163

CHAPTER 7: SYSTEMS WITH HEXAGONAL PUPILS ............................... 165


7.1 Introduction ........................................................................................................................ 167
7.2 Pupil Function..................................................................................................................... 168
7.3 Aberration-Free Imaging .................................................................................................. 169
7.3.1 PSF ..........................................................................................................169
7.3.2 OTF ..........................................................................................................174
7.4 Hexagonal Polynomials...................................................................................................... 177
7.5 Hexagonal Coefficients of a Hexagonal Aberration Function........................................ 185
7.6 Isometric, Interferometric, and Imaging Characteristics of
Hexagonal Polynomial Aberrations ................................................................................. 187
7.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio............................................. 194
7.7.1 Defocus ....................................................................................................194
7.7.2 Astigmatism............................................................................................. 194
7.7.3 Coma ........................................................................................................195
7.7.4 Spherical Aberration ................................................................................196
7.7.5 Strehl Ratio ..............................................................................................197
7.8 Summary ............................................................................................................................. 197
References ...................................................................................................................................... 200

CHAPTER 8: SYSTEMS WITH ELLIPTICAL PUPILS ................................... 201


8.1 Introduction ........................................................................................................................ 203
8.2 Pupil Function..................................................................................................................... 203
8.3 Aberration-Free Imaging .................................................................................................. 204
8.3.1 PSF ....................................................................................................................... 204
8.3.2 OTF ...................................................................................................................... 207
8.4 Elliptical Polynomials......................................................................................................... 209
8.5 Elliptical Coefficients of an Elliptical Aberration Function ......................................... 210
8.6 Isometric, Interferometric, and Imaging Characteristics of
Elliptical Polynomial Aberrations..................................................................................... 214
8.7 Seidel Aberrations and Their Standard Deviations ........................................................ 228
8.7.1 Defocus ................................................................................................................. 228
8.7.2 Astigmatism ......................................................................................................... 228
8.7.3 Coma..................................................................................................................... 229
8.7.4 Spherical Aberration............................................................................................. 230
8.8 Summary ............................................................................................................................. 232
References ...................................................................................................................................... 234

CHAPTER 9: SYSTEMS WITH RECTANGULAR PUPILS ............................ 235
9.1 Introduction ........................................................................................................................ 237
9.2 Pupil Function..................................................................................................................... 237
9.3 Aberration-Free Imaging .................................................................................................. 238
9.3.1 PSF ..........................................................................................................238
9.3.2 OTF ..........................................................................................................240
9.4 Rectangular Polynomials ................................................................................................... 242
9.5 Rectangular Coefficients of a Rectangular Aberration Function.................................. 243
9.6 Isometric, Interferometric, and Imaging Characteristics of
Rectangular Polynomial Aberrations ............................................................................... 247
9.7 Seidel Aberrations and Their Standard Deviations ........................................................ 260
9.7.1 Defocus ....................................................................................................260
9.7.2 Astigmatism............................................................................................. 260
9.7.3 Coma ........................................................................................................261
9.7.4 Spherical Aberration ................................................................................261
9.8 Summary ............................................................................................................................. 264
References ...................................................................................................................................... 265

CHAPTER 10: SYSTEMS WITH SQUARE PUPILS ..................................... 267


10.1 Introduction ........................................................................................................................ 269
10.2 Pupil Function..................................................................................................................... 269
10.3 Aberration-Free Imaging .................................................................................................. 270
10.3.1 PSF ..........................................................................................................272
10.3.2 OTF ..........................................................................................................274
10.4 Square Polynomials ............................................................................................................ 281
10.5 Square Coefficients of a Square Aberration Function.................................................... 282
10.6 Isometric, Interferometric, and Imaging Characteristics of
Square Polynomial Aberrations ........................................................................................ 289
10.7 Seidel Aberrations and Their Standard Deviations ........................................................ 289
10.7.1 Defocus ....................................................................................................289
10.7.2 Astigmatism............................................................................................. 289
10.7.3 Coma ........................................................................................................290
10.7.4 Spherical Aberration ................................................................................292
10.8 Summary ............................................................................................................................. 293
References ...................................................................................................................................... 294

CHAPTER 11: SYSTEMS WITH SLIT PUPILS ............................................. 295
11.1 Introduction ........................................................................................................................ 297
11.2 Aberration-Free Imaging .................................................................................................. 297
11.2.1 PSF ..........................................................................................................297
11.2.2 Image of an Incoherent Slit......................................................................298
11.3 Strehl Ratio and Aberration Balancing............................................................................ 299
11.3.1 Strehl Ratio ..............................................................................................299
11.3.2 Aberration Balancing............................................................................... 289
11.4 Slit Polynomials .................................................................................................................. 301
11.5 Standard Deviation of a Primary Aberration ................................................................. 302
11.6 Summary ............................................................................................................................. 305
References ...................................................................................................................................... 306

CHAPTER 12: USE OF ZERNIKE CIRCLE POLYNOMIALS FOR


NONCIRCULAR PUPILS ................................................. 307
12.1 Introduction ........................................................................................................................ 309
12.2 Relationship Between the Orthonormal and the Corresponding
Zernike Circle Coefficients ................................................................................................ 309
12.3 Use of Zernike Circle Polynomials for the Analysis of an Annular Wavefront ........... 314
12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients ...................... 314
12.3.2 Interferometer Setting Errors ................................................................................320
12.3.3 Wavefront Fitting ................................................................................................. 320
12.3.4 Application to an Annular Seidel Aberration Function........................................ 321
12.3.4.1 Annular Coefficients ............................................................................ 321
12.3.4.2 Circle Coefficients................................................................................ 323
12.3.4.3 Residual Aberration Function After Removing
Interferometer Setting Errors................................................................ 323
12.3.4.4 Error with Assuming Circle Polynomials to be
Orthogonal over an Annulus ................................................................ 325
12.3.4.5 Numerical Example ............................................................................. 326
12.4 Use of Zernike Circle Polynomials for the Analysis of a Hexagonal Wavefront ......... 332
12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients........................ 332
12.4.2 Interferometer Setting Errors................................................................................ 335
12.4.3 Numerical Example.............................................................................................. 336
12.5 Aberration Coefficients from Discrete Wavefront Data................................................. 345
12.6 Summary ............................................................................................................................. 345
References ...................................................................................................................................... 348

CHAPTER 13: ANAMORPHIC SYSTEMS................................................ 349
13.1 Introduction ........................................................................................................................ 351
13.2 Gaussian Imaging ............................................................................................................... 352
13.3 Classical Aberrations ......................................................................................................... 354
13.4 Strehl Ratio and Aberration Balancing for a Rectangular Pupil .................................. 355
13.5 Aberration Polynomials Orthonormal over a Rectangular Pupil ................................. 356
13.6 Expansion of a Rectangular Aberration Function in Terms of Orthonormal
Rectangular Polynomials ................................................................................................... 360
13.7 Anamorphic Imaging System with a Circular Pupil....................................................... 361
13.7.1 Balanced Aberrations ..............................................................................361
13.7.2 Orthonormal Polynomials Representing Balanced Aberrations ..............362
13.8 Comparison of Polynomials for Rotationally Symmetric and
Anamorphic Imaging Systems .......................................................................................... 362
13.9 Summary ............................................................................................................................. 365
References ...................................................................................................................................... 367

CHAPTER 14: NUMERICAL WAVEFRONT ANALYSIS............................ 369


14.1 Introduction ..........................................................................................................371
14.2 Zernike Coefficients from Wavefront Data....................................................... 372
14.2.1 Theory ......................................................................................................372
14.2.2 Numerical Example ................................................................................. 373
14.3 Zernike Coefficients from Wavefront Slope Data ............................................383
14.3.1 Theory ......................................................................................................383
14.3.2 Alternative Approach for Obtaining Zernike Coefficients from
Wavefront Slope Data..............................................................................388
14.3.3 Numerical Example ................................................................................. 393
14.4 Summary............................................................................................................... 398
References ......................................................................................................................399

APPENDIX: SYSTEMS WITH SECTOR PUPILS ......................................... 401

Index ............................................................................................................................. 415

PREFACE
This book is Part III of a series of books on Optical Imaging and Aberrations. Part I
on Ray Geometrical Optics and Part II on Wave Diffraction Optics were published
earlier. Part III is on Wavefront Analysis, which is an integral part of optical design,
fabrication, and testing. In optical design, rays are traced to determine the wavefront and
thereby the quality of a design. In optical testing, the fabrication errors and, therefore, the
associated aberrations are measured by way of interferometry. In both cases, the quality
of the wavefront is determined from the aberrations obtained at an array of points. The
aberrations thus obtained are used to calculate the mean, the peak-to-valley, and the
standard deviation values. While such statistical measures of the wavefront are part of
wavefront analysis, the purpose of this book is to determine the content of the wavefront
by decomposing the ray-traced or test-measured data in terms of polynomials that are
orthogonal over the expected domain of the data. These polynomials must include the
basic aberrations of wavefront defocus and tilt, and represent balanced classical
aberrations.

We start Part III with an outline of optical imaging in the presence of aberrations in
Chapter 1, i.e., on how to obtain the point-spread and optical transfer functions of an
imaging system with an arbitrarily shaped pupil. The Strehl ratio of a system as a measure
of image quality is introduced in this chapter, and shown to be dependent only on the
aberration variance when the aberration is small. It is followed in Chapter 2 with a brief
discussion of the wavefronts and aberrations. This chapter introduces the nomenclature of
aberrations. How to obtain the orthogonal polynomials over a certain domain from those
over another is discussed in Chapter 3. For systems with a circular pupil, the Zernike
circle polynomials are well known for wavefront analysis. They are discussed at length in
Chapter 4. These polynomials are orthogonalized over an annular pupil in Chapter 5, and
over a Gaussian pupil in Chapter 6. They are obtained similarly for systems with
hexagonal, elliptical, rectangular, square, and slit pupils in the succeeding chapters. For
each pupil, the polynomials are given in their orthonormal form so that an expansion
coefficient (with the exception of piston) represents the standard deviation of the
corresponding polynomial aberration term. The standard deviation of a Seidel aberration
with and without aberration balancing is also discussed in these chapters.

Since the Zernike circle polynomials form a complete set, a wavefront over any
domain can be expanded in terms of them. However, the pitfalls of their use over a
domain other than circular and resulting from the lack of their orthogonality over the
chosen domain are discussed in Chapter 12. Finally, the aberrations of anamorphic
systems are discussed, and polynomials suitable for their aberration analysis are given in
Chapter 13 for both rectangular and circular pupils. The use of the orthonormal
polynomials for determining the content of a wavefront is demonstrated in Chapter 14
by computer simulations of circular wavefronts. The determination of the aberration
coefficients from the wavefront slope data, as in a Shack–Hartmann sensor, is also
discussed in this chapter.

El Segundo, California                                        Virendra N. Mahajan
June 2013
ACKNOWLEDGMENTS

Once again, it is a great pleasure to acknowledge the generous support I have
received over the years from my employer, The Aerospace Corporation, in preparing Part
III on Wavefront Analysis in a series of books on Optical Imaging and Aberrations. My
special thanks go to my former classmate Dr. Bill Swantner for his constant advice on
and constructive critique of my work. I have benefitted greatly from his practical
expertise in both optical design and testing. The Sanskrit verse on p. xxiii was provided
by Professor Sally Sutherland of the University of California at Berkeley. Many thanks to
Professor James W. Wyant for writing the Foreword for this book.

I am grateful to Professor José Antonio Díaz Navas for carrying out many computer
calculations and preparing many of the figures. My thanks to Drs. Barry Johnson, James
Harvey, and Daniel Topa for reading an early version of the manuscript and suggesting that
I include examples of wavefront analysis. I am grateful to Professor Eva Acosta for her
help with writing Chapter 14 on Numerical Wavefront Analysis, as my response to their
suggestion. Of course, any shortcomings or errors anywhere in the book are totally my
responsibility.

As in the past, I cannot say enough about the constant support I have received from
my wife Shashi over the many years it has taken me to complete this three-part series. I
dedicate Part III to my grandchildren.

Finally, I would like to thank SPIE Press Editors Dara Burrows and Scott McNeill,
and Manager Tim Lamkins for their quality support in bringing this book to publication.
It has always been a pleasure to work with the SPIE staff, starting with the Publications
Director Eric Pepper.

SYMBOLS AND NOTATION

$a_i$            aberration coefficient
$A$              amplitude
$A_i$            peak aberration coefficient
$B_d$            defocus coefficient
$B_j$            wave aberration polynomial
$B_t$            tilt coefficient
$c$              aspect ratio
$E_j$            elliptical polynomial
$F$              focal ratio
$G_j$            Gaussian or vector polynomial
$H_j$            hexagonal polynomial
$I$              irradiance
Im               imaginary part
$j$              polynomial number
$J_n$            Bessel function
$L_j$            Legendre polynomial
$M$              magnification
MTF              modulation transfer function
OTF              optical transfer function
$P$              object point
$P'$             Gaussian image point
$P_{ex}$         power in the exit pupil
$P_i$            image power
$P_n$            polynomial
$P(\cdot)$       pupil function
PSF              point-spread function
PTF              phase transfer function
$r$              radial coordinate
$r_c$            radius of circle
$\vec{r}_i$      image point position vector
$\vec{r}_p$      pupil point position vector
$R$              radius of reference sphere
Re               real part
$R_j$            rectangular polynomial
$R_n^m(\rho)$    Zernike radial polynomial
$S$              Strehl ratio
$S_{ex}$         area of exit pupil
$S_j$            square, sector, or ray aberration polynomial
$\vec{V}$        vector polynomial
$x, y$           Cartesian coordinates of a point
$W$              wave aberration
$Z_n^m$          Zernike circle polynomial
$Z_j$            Zernike circle polynomial
$\vec{v}_i$      image spatial frequency vector
$v$              normalized spatial frequency
$\tau$           optical transfer function
$\rho = r/a$     normalized radial coordinate
$\theta$         polar angle of a position vector
$\phi$           polar angle of frequency vector
$\epsilon$       obscuration or aspect ratio
$\delta(\cdot)$  Dirac delta function
$\delta_{ij}$    Kronecker delta
$\Delta$         longitudinal defocus
$\Phi$           phase aberration
$r, \theta$      polar coordinates of a point
$\lambda$        optical wavelength
$\xi, \eta$      spatial frequency coordinates
$\sigma_W$       standard deviation (wave)
$\sigma_\Phi$    standard deviation (phase)

Anantaratnaprabhavasya yasya himaṃ na saubhagyavilopi jatam
Eko hi doṣo guṇasannipate nimajjatindoḥ kiraṇeṣvivaṅkaḥ

The snow does not diminish the beauty of the Himalayan mountains
which are the source of countless gems. Indeed, one flaw is lost
among a host of virtues, as the moon’s dark spot is lost among its rays.

Kalidasa Kumarasambhava 1.3

PART III

WAVEFRONT ANALYSIS
CHAPTER 1

OPTICAL IMAGING

1.1 Introduction ..............................................................................................................3

1.2 Diffraction Image ..................................................................................................... 3

1.2.1 Pupil Function..............................................................................................4

1.2.2 PSF ..............................................................................................................5

1.2.3 OTF ..............................................................................................................6

1.3 Strehl Ratio ............................................................................................................... 7

1.3.1 General Expression ......................................................................................7

1.3.2 Approximate Expressions in Terms of Aberration Variance ......................9

1.4 Aberration Balancing ............................................................................................10

1.5 Summary................................................................................................................. 11

References ........................................................................................................................12

Chapter 1
Optical Imaging
1.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. The aperture stop
of the system is the aperture that most limits the amount of light entering it. Its entrance
pupil determines the amount of light from an object that enters it, and the exit pupil determines how that
light is distributed in the image. The Gaussian image is an exact replica of the object,
except for its magnification. The diffraction image of an isoplanatic incoherent object is
given by the convolution of the Gaussian image and the diffraction image of a point
object, called the point-spread function (PSF). In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the optical transfer
function (OTF), which is the Fourier transform of the PSF, and the spectrum of the
Gaussian image. The image is obtained by inverse Fourier transforming its spectrum [1].
We define a pupil function, representing the complex amplitude at the exit pupil, and give
equations for obtaining the PSF and the OTF.

The aberrations of the system determine the quality of an image. An important
measure of the quality of an image is its Strehl ratio, which represents the ratio of the
central irradiances of the PSF with and without the aberration. This ratio is discussed and
simple but approximate expressions for it are derived for small aberrations in terms of the
variance of the aberration at the exit pupil. Since the Strehl ratio is higher for a smaller
variance, we discuss aberration balancing in which an aberration of a higher order is
balanced with one or more aberrations of lower order to minimize its variance and
thereby maximize the Strehl ratio. We discuss some general results on the effects of
nonuniform amplitude, called apodization, and nonuniform phase, called aberration, at
the exit pupil on the irradiance at the center of the reference sphere with respect to which
the aberration is defined. For a given total power in the pupil and, therefore, in the image
of a point object, maximum central irradiance is obtained for a system with an
unapodized and unaberrated pupil. Moreover, the peak value of an unaberrated image lies
at the center of curvature of the reference sphere regardless of the apodization of the
pupil. Generally, the effect of even large amplitude variations across the pupil is
relatively small compared to that of even small aberrations.

1.2 DIFFRACTION IMAGE


The Gaussian image of a point object formed by an imaging system is determined by
using Gaussian optics. In the Gaussian approximation, the aberrations are completely
neglected, and all of the rays originating at the point object and transmitted by the system
pass through the Gaussian image point. In reality, however, when the object rays are
traced through the system, they do not generally pass through the Gaussian image point
due to the aberrations. Instead, they are distributed in the vicinity of the image point, and
their distribution is referred to as the spot diagram. In practice, even if the aberrations are
absent or neglected, the light is distributed in a finite region around the Gaussian image
point due to its diffraction by the system. The diffraction image of a point object is called
the PSF of the system, and the aberration-free image is referred to as the
diffraction-limited image. The image of an extended object is determined by adding the amplitude or
the irradiance images of its small elements, depending on whether the object radiation is
coherent or incoherent.

A system is called isoplanatic for a small enough object if the distribution of light in
the image of any point on it is approximately the same, except for its location in the
image plane. Thus, over a small field of view, the image of a point object is shift
invariant. For an incoherent isoplanatic object, the diffraction image can be obtained by
convolving the Gaussian image (which is an exact replica of the object except for its size
and illumination scaling) with the diffraction PSF. In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the OTF, which is the
Fourier transform of the PSF, and the spectrum of the Gaussian image. The image is
obtained by inverse Fourier transforming its spectrum [1]. We define a pupil function,
representing the complex amplitude at the exit pupil, and give equations for obtaining the
PSF and the OTF.
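
The two equivalent descriptions of incoherent isoplanatic imaging can be illustrated numerically. The following is a minimal sketch, assuming a Gaussian image and a PSF sampled on the same periodic grid; the function name `diffraction_image`, the Gaussian-shaped PSF, and the test object are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def diffraction_image(gaussian_image, psf):
    """Convolve the Gaussian image with the PSF (incoherent, isoplanatic case).

    The convolution is evaluated as a product in the spatial frequency
    domain, so it is a circular convolution on the sampled grid.
    """
    psf = psf / psf.sum()                      # unit-volume PSF conserves total power
    otf = np.fft.fft2(np.fft.ifftshift(psf))   # OTF = Fourier transform of the PSF
    spectrum = np.fft.fft2(gaussian_image) * otf
    return np.real(np.fft.ifft2(spectrum))     # inverse transform gives the image

n = 64
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
psf = np.exp(-(x**2 + y**2) / (2 * 2.0**2))    # hypothetical smooth PSF centered on the grid

gaussian_image = np.zeros((n, n))
gaussian_image[20, 30] = 1.0                   # a single point object
gaussian_image[40:44, 10:14] = 0.5             # a small extended patch

image = diffraction_image(gaussian_image, psf)
```

Because the PSF is normalized to unit volume, the total power of the sampled image equals that of the Gaussian image, and a single point object is simply replaced by a copy of the PSF centered on it.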

1.2.1 Pupil Function


Consider a point object located at $\vec{r}_o$ in the object plane radiating at a wavelength $\lambda$.
The amount of light in its Gaussian image formed by an imaging system depends on the
object intensity and on the distance from and the size of the entrance
pupil. The wave at the exit pupil of the system is represented by the pupil function

$$
P(\vec{r}_p; \vec{r}_o) =
\begin{cases}
A(\vec{r}_p; \vec{r}_o) \exp\left[ i \Phi(\vec{r}_p; \vec{r}_o) \right] , & \text{inside the exit pupil} \\
0 , & \text{outside the exit pupil} ,
\end{cases}
\tag{1-1}
$$

where $\vec{r}_p$ is the 2D position vector of a point in the plane of the pupil and $A(\vec{r}_p; \vec{r}_o)$ and
$\Phi(\vec{r}_p; \vec{r}_o)$ are the amplitude and phase aberration functions of the system for the point
object under consideration. The phase aberration $\Phi(\vec{r}_p; \vec{r}_o)$ is related to the wave aberration
$W(\vec{r}_p; \vec{r}_o)$ according to

$$
\Phi(\vec{r}_p; \vec{r}_o) = \frac{2\pi}{\lambda} W(\vec{r}_p; \vec{r}_o) .
\tag{1-2}
$$

The shape of the pupil is arbitrary. It may, for example, be circular or annular. The total
power in the pupil and, therefore, in the image is given by

$$
P_{ex} = \int \left| P(\vec{r}_p; \vec{r}_o) \right|^2 d\vec{r}_p
       = \int A^2(\vec{r}_p; \vec{r}_o) \, d\vec{r}_p ,
\tag{1-3}
$$

where the integration is across the pupil.
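
A sampled version of Eqs. (1-1) to (1-3) may help fix the notation. The sketch below assumes a uniformly illuminated circular pupil on a square grid and a small defocus-like wave aberration; the helper `pupil_function`, the grid size, the wavelength, and the aberration coefficient are all hypothetical choices made only for illustration.

```python
import numpy as np

def pupil_function(amplitude, wave_aberration, wavelength, mask):
    """P = A exp(i Phi) inside the pupil and 0 outside, with Phi = (2 pi / lambda) W."""
    phase = 2.0 * np.pi * wave_aberration / wavelength     # Eq. (1-2)
    return np.where(mask, amplitude * np.exp(1j * phase), 0.0)

n = 256
a = 1.0                                    # pupil radius (arbitrary units)
x = np.linspace(-a, a, n)
X, Y = np.meshgrid(x, x)
rho = np.hypot(X, Y) / a                   # normalized radial coordinate
mask = rho <= 1.0                          # circular exit pupil

wavelength = 0.5e-6
W = 0.1 * wavelength * rho**2              # hypothetical defocus-like wave aberration
A = np.ones_like(rho)                      # uniform amplitude, so I0 = A**2 = 1

P = pupil_function(A, W, wavelength, mask)
dA = (x[1] - x[0])**2                      # area element of the grid
P_ex = np.sum(np.abs(P)**2) * dA           # total power, Eq. (1-3)
S_ex = np.sum(mask) * dA                   # sampled pupil area, close to pi * a**2
```

For unit amplitude the power equals the sampled pupil area, and the phase plays no role in the total power since only $|P|^2 = A^2$ enters Eq. (1-3).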



The image lies at a distance $R$ from the plane of the exit pupil, where $R$ is the radius
of curvature of the Gaussian reference sphere with respect to which the aberration
$W(\vec{r}_p; \vec{r}_o)$ is defined. The center of curvature of the reference sphere lies at the Gaussian
image point (unless defocus is introduced). Generally, the amplitude function $A(\vec{r}_p; \vec{r}_o)$
is uniform across the exit pupil. An exception is the Gaussian pupil considered in Chapter
6. We assume a small field of view so that the dependence of the aberration function
$W(\vec{r}_p; \vec{r}_o)$ on the location of the point object in the object plane can be neglected.

1.2.2 PSF
The PSF of the system imaging an incoherent object is given by [1]

$$
\mathrm{PSF}(\vec{r}_i) = \frac{1}{P_{ex} \lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 ,
\tag{1-4}
$$

where the position vector $\vec{r}_i$ of the observation point is written with respect to the
location $\vec{r}_g$ of the Gaussian image point, and $P_{ex}$ is the total power in the image. The
irradiance distribution of the image is obtained by multiplying the PSF by the total power
$P_{ex}$ in the image, i.e.,

$$
I(\vec{r}_i) = \frac{1}{\lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 .
\tag{1-5}
$$

For a uniformly illuminated pupil with irradiance $I_0$, the total power incident on and
transmitted by the pupil is given by

$$
P_{ex} = S_{ex} I_0 ,
\tag{1-6}
$$

where $S_{ex}$ is the area of the exit pupil. Letting $A^2(\vec{r}_p) = I_0$, we may write the irradiance
distribution

$$
I(\vec{r}_i) = \frac{I_0}{\lambda^2 R^2}
\left| \int \exp\left[ i \Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 .
\tag{1-7}
$$

The aberration-free irradiance at the center is given by

$$
I(0) = \frac{I_0}{\lambda^2 R^2} \left[ \int d\vec{r}_p \right]^2
     = \frac{P_{ex} S_{ex}}{\lambda^2 R^2} .
\tag{1-8}
$$

The irradiance distribution normalized by its central value may be written

$$
I(\vec{r}_i) = \frac{1}{S_{ex}^2}
\left| \int \exp\left[ i \Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 .
\tag{1-9}
$$

For convenience, we will refer to the irradiance distribution given by Eq. (1-9) as the
PSF. Letting $\Phi(\vec{r}_p) = 0$, we obtain the aberration-free PSF.
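
The normalized PSF of Eq. (1-9) is, up to sampling, the squared modulus of a Fourier transform of $\exp(i\Phi)$ over the pupil. The following sketch, which assumes a uniformly illuminated circular pupil and uses a zero-padded FFT in place of the integral, is one way to evaluate it; the function name `psf_from_pupil`, the padding factor, and the quarter wave of spherical aberration are illustrative assumptions.

```python
import numpy as np

def psf_from_pupil(phase, mask, pad=4):
    """Sampled version of Eq. (1-9) for a uniformly illuminated pupil.

    phase : phase aberration Phi (radians) on the pupil grid
    mask  : boolean array, True inside the exit pupil
    The result is normalized by the aberration-free central value.
    """
    n = mask.shape[0]
    field = np.zeros((pad * n, pad * n), dtype=complex)
    field[:n, :n] = mask * np.exp(1j * phase)     # zero padding refines the image sampling
    amp = np.fft.fftshift(np.fft.fft2(field))     # zero frequency moved to the array center
    return np.abs(amp)**2 / mask.sum()**2

n = 64
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
rho = np.hypot(X, Y)
mask = rho <= 1.0

psf_free = psf_from_pupil(np.zeros_like(rho), mask)          # aberration-free PSF
psf_aber = psf_from_pupil(2 * np.pi * 0.25 * rho**4, mask)   # hypothetical 1/4 wave of spherical aberration

center = psf_free.shape[0] // 2            # index of the Gaussian image point
```

With this normalization the aberration-free PSF has a central value of unity, and the central value of the aberrated PSF is less than unity.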

1.2.3 OTF
The imaging process can be described in the space domain by way of the PSF, or in
the spatial frequency domain by way of the OTF. The OTF is the Fourier transform of the
PSF, defined as

$$
\tau(\vec{v}_i) = \int \mathrm{PSF}(\vec{r}_i) \exp\left( 2\pi i \, \vec{v}_i \cdot \vec{r}_i \right) d\vec{r}_i ,
\tag{1-10}
$$

where $\vec{v}_i$ is a spatial frequency vector in the image plane and related to the corresponding
frequency $\vec{v}_o$ in the object plane by the image magnification $M$ according to $\vec{v}_i = \vec{v}_o / M$.
Since the image of an isoplanatic incoherent object is given by the convolution of the PSF
and the Gaussian image, the (spatial frequency) spectrum of the image is given by the
product of the OTF and the spectrum of the Gaussian image. The image is obtained by
inverse Fourier transforming its spectrum.

Because of the relationship of the PSF with the pupil function, as in Eq. (1-4), the
OTF can also be written as the autocorrelation of the pupil function in the form

$$
\tau(\vec{v}_i) = \frac{\int P(\vec{r}_p) \, P^*(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p}
                       {\int \left| P(\vec{r}_p) \right|^2 d\vec{r}_p}
               = P_{ex}^{-1} \int A(\vec{r}_p) \, A(\vec{r}_p - \lambda R \vec{v}_i)
                 \exp\left[ i Q(\vec{r}_p; \vec{v}_i) \right] d\vec{r}_p ,
\tag{1-11}
$$

where an asterisk denotes a complex conjugate and

$$
Q(\vec{r}_p; \vec{v}_i) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}_i)
\tag{1-12}
$$

is a phase aberration difference function defined over the region of overlap of two pupils:
one centered at $\vec{r}_p = 0$ and the other at $\vec{r}_p = \lambda R \vec{v}_i$.

From Eq. (1-11), the aberration-free OTF can be written

$$
\tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p) \, A(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p .
\tag{1-13}
$$

For a uniformly illuminated pupil, the OTF is simply the fractional area of overlap of two
pupils centered at $(0, 0)$ and $\lambda R (\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the
spatial frequency vector $\vec{v}_i$.

The region of overlap is maximum and equal to the area of the pupil for $\vec{v}_i = 0$,
giving a value of unity for $\tau(0)$. It represents the fact that the contrast of an image is zero
for an object of zero contrast. Because of the finite size of the pupil, the overlap region
reduces to zero at some frequency $\vec{v}_c$, called the cutoff frequency, and stays zero for
larger frequencies, i.e., $\tau(\vec{v}_i) = 0$ for $|\vec{v}_i| \geq |\vec{v}_c|$. Because of isoplanatism, the spatial
frequency spectrum of the image is obtained as the product of the spectrum of the
Gaussian image and the OTF. Inverse Fourier transforming the image spectrum yields the
space domain image.
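
The interpretation of the aberration-free OTF as a fractional area of overlap can be checked on a sampled pupil. The sketch below assumes a uniform circular pupil and evaluates the autocorrelation of Eq. (1-11) with FFTs; the function name `otf_from_pupil` and the grid size are illustrative, and the FFT route is a computational convenience rather than a method described in the text.

```python
import numpy as np

def otf_from_pupil(P):
    """Normalized autocorrelation of a sampled pupil function, as in Eq. (1-11).

    The array is zero padded to twice its size so that the FFT-based
    (circular) autocorrelation equals the linear one for every pupil shift.
    """
    n = P.shape[0]
    padded = np.zeros((2 * n, 2 * n), dtype=complex)
    padded[:n, :n] = P
    spectrum = np.fft.fft2(padded)
    acorr = np.fft.ifft2(np.abs(spectrum)**2)     # autocorrelation via the power spectrum
    acorr = np.fft.fftshift(acorr)                # zero shift now sits at index (n, n)
    return acorr / np.sum(np.abs(P)**2)

n = 128
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
P = (np.hypot(X, Y) <= 1.0).astype(complex)       # uniform, aberration-free circular pupil

tau = otf_from_pupil(P)
row = tau[n, n:].real      # OTF along one axis; index k is a pupil shift of k samples
```

The first sample of `row` is unity (complete overlap), the values decrease as the two pupils are displaced, and the last sample, a shift of about one pupil diameter, is zero, which corresponds to the cutoff frequency.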

From Eq. (1-10), we note that


$$
\tau(\vec{v}_i) = \tau^*(-\vec{v}_i) ,
\tag{1-14}
$$

i.e., the OTF is complex symmetric or Hermitian. Therefore, its real part is even and its
imaginary part is odd, i.e.,

$$
\mathrm{Re} \, \tau(\vec{v}_i) = \mathrm{Re} \, \tau(-\vec{v}_i)
\tag{1-15}
$$

and

$$
\mathrm{Im} \, \tau(\vec{v}_i) = - \mathrm{Im} \, \tau(-\vec{v}_i) .
\tag{1-16}
$$

The OTF can also be written in the form

$$
\tau(\vec{v}_i) = \left| \tau(\vec{v}_i) \right| \exp\left[ i \Psi(\vec{v}_i) \right] ,
\tag{1-17}
$$

where $\left| \tau(\vec{v}_i) \right|$ and $\Psi(\vec{v}_i)$ are its modulus and phase, called the modulation and phase
transfer functions (MTF and PTF), respectively. Depending on the shape of the pupil and
the type of the aberration, the OTF may be real. For a real OTF, a negative value
corresponds to a phase of $\pi$. It represents contrast reversal, i.e., bright and dark
regions in the object appear as dark and bright regions in the image.
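
The Hermitian property and the MTF and PTF can be examined numerically for an aberration that makes the OTF complex. The following is a minimal sketch, assuming a circular pupil with a coma-like phase of 0.3 wave; all names and values are illustrative. NumPy's FFT uses the opposite sign in the exponent from Eq. (1-10), which conjugates the sampled OTF but does not affect the symmetry being checked.

```python
import numpy as np

n = 64
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
rho2 = X**2 + Y**2
mask = rho2 <= 1.0
phase = 2 * np.pi * 0.3 * rho2 * X            # hypothetical coma-like phase (radians)

field = np.zeros((4 * n, 4 * n), dtype=complex)
field[:n, :n] = mask * np.exp(1j * phase)
psf = np.abs(np.fft.fft2(field))**2            # real, nonnegative PSF samples
psf /= psf.sum()                               # unit volume, so tau(0) = 1

tau = np.fft.fft2(psf)                         # OTF samples; index 0 is zero frequency
mtf = np.abs(tau)                              # modulus of the OTF
ptf = np.angle(tau)                            # phase of the OTF

# tau(-v): reverse both axes while keeping the zero-frequency sample at index 0
tau_neg = np.roll(tau[::-1, ::-1], shift=(1, 1), axis=(0, 1))
hermitian_error = np.max(np.abs(tau - np.conj(tau_neg)))
```

Since the PSF is real, `hermitian_error` is at the level of rounding error, while the imaginary part of `tau` is not negligible because the coma-like PSF is asymmetric.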

By inverse Fourier transforming Eq. (1-10), we can obtain the PSF according to

$$
\mathrm{PSF}(\vec{r}_i) = \int \tau(\vec{v}_i) \exp\left( -2\pi i \, \vec{v}_i \cdot \vec{r}_i \right) d\vec{v}_i .
\tag{1-18}
$$

For a radially symmetric pupil with a radially symmetric aberration, e.g., a circular
pupil aberrated by spherical aberration, the PSF and OTF Eqs. (1-18) and (1-10) yield

$$
\mathrm{PSF}(r_i) = 2\pi \int \tau(v_i) \, J_0(2\pi v_i r_i) \, v_i \, dv_i
\tag{1-19}
$$

and

$$
\tau(v_i) = 2\pi \int \mathrm{PSF}(r_i) \, J_0(2\pi v_i r_i) \, r_i \, dr_i ,
\tag{1-20}
$$

respectively, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. The OTF is
evidently real in this case.

1.3 STREHL RATIO


1.3.1 General Expression
The Strehl ratio of an image represents the ratio of its central irradiances with and
without aberration. From Eq. (1-5), the ratio of the central irradiance with aberration to
that at the Gaussian image point without aberration may be written [1]

$$
S = \frac{I_a(0)}{I_u(0)} ,
\tag{1-21}
$$

where the subscripts $a$ and $u$ refer to an aberrated and an unaberrated system,
respectively, and $S$ is the Strehl ratio given by

$$
S = \frac{\left| \int A(\vec{r}_p) \exp\left[ i \Phi(\vec{r}_p) \right] d\vec{r}_p \right|^2}
         {\left[ \int A(\vec{r}_p) \, d\vec{r}_p \right]^2} .
\tag{1-22}
$$

It can be shown that [1]

$$
0 \leq S \leq 1 .
\tag{1-23}
$$
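
Equation (1-22) can be evaluated directly on a sampled pupil, since the area element cancels in the ratio. The sketch below assumes a uniformly illuminated circular pupil; the helper `strehl_ratio` and the astigmatism-like phase with a coefficient of 0.1 wave are hypothetical choices for illustration.

```python
import numpy as np

def strehl_ratio(amplitude, phase, mask):
    """Eq. (1-22) on a sampled pupil: |sum A exp(i Phi)|^2 / (sum A)^2."""
    A = amplitude[mask]
    numerator = np.abs(np.sum(A * np.exp(1j * phase[mask])))**2
    return numerator / np.sum(A)**2

n = 201
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
rho = np.hypot(X, Y)
mask = rho <= 1.0
A = np.ones_like(rho)                               # uniform amplitude

S_free = strehl_ratio(A, np.zeros_like(rho), mask)  # no aberration
S_astig = strehl_ratio(A, 2 * np.pi * 0.1 * (X**2 - Y**2), mask)  # hypothetical astigmatism-like phase
```

The aberration-free value is unity, and the aberrated value lies between zero and unity, consistent with Eq. (1-23).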

The Strehl ratio may also be determined from the OTF of the system. By definition,

$$
S = \frac{\mathrm{PSF}_a(0)}{\mathrm{PSF}_u(0)} .
\tag{1-24}
$$

From Eq. (1-18), we may write

$$
\mathrm{PSF}(0) = \int \tau(\vec{v}_i) \, d\vec{v}_i .
\tag{1-25}
$$

Since the PSF at any point is a real quantity, only the real part of the aberrated OTF
contributes to the integral, and the integral of its imaginary part must be zero. Hence, the
Strehl ratio is given by

$$
S = \frac{\int \mathrm{Re} \, \tau_a(\vec{v}) \, d\vec{v}}{\int \tau_u(\vec{v}) \, d\vec{v}} .
\tag{1-26}
$$

Thus, the Strehl ratio may be obtained by integrating the real part of the measured
aberrated OTF over all spatial frequencies and dividing it by a similar integral of the
calculated unaberrated OTF.

The Strehl ratio gives a measure of the image quality in terms of the reduction in the
central irradiance due to the aberration in the system, including any defocus. Its value
being less than one is a consequence of the fact that the Huygens’ secondary spherical
wavelets on the reference sphere are not in phase due to the aberrations and, therefore,
they interfere nonconstructively at its center of curvature.

It can be shown that, for a given total power, the amplitude variations across the
pupil of an aberration-free system reduce the central irradiance, and any phase variations
(i.e., aberrations) further reduce it [2]. However, an irradiance reduced by phase
variations alone does not necessarily reduce any further if any amplitude variations are
also introduced. In fact, the amplitude variations can even increase this irradiance. For
example, the central value of a defocused PSF for a circular pupil decreases to zero as the
defocus aberration approaches one wave (see Section 4.4). The Huygens’ secondary
wavelets arriving at this point completely cancel each other. Hence, any amplitude
variations across the pupil will only help avoid complete cancellation and thereby
increase the central value. The maximum value of central irradiance is obtained when the
system is unapodized and unaberrated [1,2]. It is shown in Chapter 6 how a Gaussian
pupil, as in a Gaussian beam, yields a smaller central value.
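
The defocus example can be reproduced with a few lines of code. The following sketch assumes a circular pupil with a defocus phase of the form $2\pi w \rho^2$, where $w$ is the peak defocus in waves, and compares a uniform amplitude with a Gaussian amplitude $\exp(-\rho^2)$; the Gaussian width and the function name `central_strehl` are illustrative assumptions.

```python
import numpy as np

n = 401
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
rho = np.hypot(X, Y)
mask = rho <= 1.0

def central_strehl(amplitude, waves_of_defocus):
    """Central irradiance relative to its unaberrated value for the same amplitude, Eq. (1-22)."""
    phase = 2 * np.pi * waves_of_defocus * rho[mask]**2   # defocus phase, peak value in waves
    A = amplitude[mask]
    return np.abs(np.sum(A * np.exp(1j * phase)))**2 / np.sum(A)**2

uniform = np.ones_like(rho)
gaussian = np.exp(-rho**2)          # hypothetical Gaussian amplitude across the pupil

S_uniform = [central_strehl(uniform, w) for w in (0.0, 0.5, 1.0)]
S_apodized = central_strehl(gaussian, 1.0)
```

For the uniform pupil the central value falls from unity to essentially zero at one wave of defocus, whereas the Gaussian amplitude leaves a small but nonzero central value at the same defocus, which illustrates how amplitude variations can prevent complete cancellation.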

The peak value of the aberrated irradiance distribution of the image of a point object
does not necessarily occur at the center of the reference sphere. However, the peak value
of an unaberrated image does occur at the center regardless of the apodization. The
Huygens’ secondary wavelets emanating from the spherical wavefront being equidistant
from this point are in phase. Hence, they interfere constructively, producing a maximum
possible value at this point.

1.3.2 Approximate Expressions in Terms of Aberration Variance


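Equation (1-22) for the Strehl ratio can be written in an abbreviated form

$$
S = \left| \left\langle \exp(i \Phi) \right\rangle \right|^2 ,
\tag{1-27}
$$

where the angular brackets $\langle \cdots \rangle$ indicate a spatial average over the amplitude-weighted
pupil, e.g.,

$$
\langle \Phi \rangle = \frac{\int A(\vec{r}_p) \, \Phi(\vec{r}_p) \, d\vec{r}_p}{\int A(\vec{r}_p) \, d\vec{r}_p} .
\tag{1-28}
$$

Since $\langle \Phi \rangle$ is independent of $\vec{r}_p$, Eq. (1-27) can be written

$$
\begin{aligned}
S &= \left| \left\langle \exp\left[ i \left( \Phi - \langle \Phi \rangle \right) \right] \right\rangle \right|^2 \\
  &= \left\langle \cos\left( \Phi - \langle \Phi \rangle \right) \right\rangle^2
   + \left\langle \sin\left( \Phi - \langle \Phi \rangle \right) \right\rangle^2 \\
  &\geq \left\langle \cos\left( \Phi - \langle \Phi \rangle \right) \right\rangle^2 ,
\end{aligned}
\tag{1-29}
$$

equality holding when $\Phi$ is zero across the pupil, in which case $S = 1$. For small
aberrations, expanding the cosine function in a power series and retaining the first two
terms, we obtain the Maréchal result generalized for an apodized pupil:

$$
S \gtrsim \left( 1 - \sigma_\Phi^2 / 2 \right)^2 ,
\tag{1-30}
$$

where

$$
\sigma_\Phi^2 = \left\langle \left( \Phi - \langle \Phi \rangle \right)^2 \right\rangle
\tag{1-31}
$$

is the variance of the phase aberration across the amplitude-weighted pupil. The quantity
$\sigma_\Phi$ is the standard deviation of the aberration. We will refer to it as the “sigma value” or
simply the “sigma” of the aberration.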
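
The step from Eq. (1-29) to Eq. (1-30) can be written out explicitly. The lines below are a sketch of that step, keeping terms only through second order in $\Phi - \langle \Phi \rangle$ and using the definition of the variance in Eq. (1-31):

```latex
\begin{align*}
\left\langle \cos\left( \Phi - \langle \Phi \rangle \right) \right\rangle
  &\simeq \left\langle 1 - \tfrac{1}{2} \left( \Phi - \langle \Phi \rangle \right)^2 \right\rangle
   = 1 - \tfrac{1}{2} \sigma_\Phi^2 , \\
S &\geq \left\langle \cos\left( \Phi - \langle \Phi \rangle \right) \right\rangle^2
   \simeq \left( 1 - \tfrac{1}{2} \sigma_\Phi^2 \right)^2 .
\end{align*}
```

The average of the constant term is unity because the brackets denote a normalized, amplitude-weighted average, and the average of the quadratic term is the variance by definition.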

For small values of $\sigma_\Phi$, three approximate expressions have been used in the
literature:

$$
S_1 \simeq \left( 1 - \sigma_\Phi^2 / 2 \right)^2 ,
\tag{1-32}
$$

$$
S_2 \simeq 1 - \sigma_\Phi^2 ,
\tag{1-33}
$$

and

$$
S_3 \simeq \exp\left( - \sigma_\Phi^2 \right) .
\tag{1-34}
$$

The first is the Maréchal formula [3], the second is the commonly used expression
obtained when the term in $\sigma_\Phi^4$ in the first is neglected [4,5], and the third is an empirical
expression giving a better fit to the actual numerical results for various aberrations [6]. Just
as $S_1 > S_2$ by $\sigma_\Phi^4 / 4$, similarly, $S_3 > S_1$ by approximately the same amount. The simplest
expression to use is, of course, $S_2$, according to which $\sigma_\Phi^2$ gives the drop in the Strehl
ratio. We note that, for a pupil of any shape, the Strehl ratio for a small aberration does
not depend on its type but only on its variance across the apodized pupil. For a
high-quality imaging system, a typical value of the Strehl ratio desired is 0.8, corresponding to
a wave aberration with a sigma of $\sigma_W = \lambda / 14$, where $\sigma_W = (\lambda / 2\pi) \, \sigma_\Phi$.
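
The three expressions are easy to compare for a given sigma. The following minimal sketch evaluates Eqs. (1-32) to (1-34) for a wave aberration sigma of $\lambda/14$; the function name `strehl_approximations` and the choice of expressing sigma in waves are illustrative conventions, not part of the text.

```python
import math

def strehl_approximations(sigma_w_in_waves):
    """Return (S1, S2, S3) of Eqs. (1-32)-(1-34) for a wave-aberration sigma given in waves."""
    var = (2.0 * math.pi * sigma_w_in_waves) ** 2   # phase variance sigma_Phi**2
    s1 = (1.0 - var / 2.0) ** 2                     # Marechal formula
    s2 = 1.0 - var                                  # expression without the sigma**4 term
    s3 = math.exp(-var)                             # empirical exponential expression
    return s1, s2, s3

s1, s2, s3 = strehl_approximations(1.0 / 14.0)
print(f"S1 = {s1:.3f}, S2 = {s2:.3f}, S3 = {s3:.3f}")
```

All three values are close to 0.8 for this sigma, with $S_3 > S_1 > S_2$ and $S_1 - S_2 = \sigma_\Phi^4/4$.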
1.4 ABERRATION BALANCING
In geometrical optics, we mix one aberration with another in order to minimize the
variance of the ray distribution in an image plane. For example, when we minimize the
variance by combining the primary spherical aberration with defocus aberration by
considering the ray distribution in a defocused image plane, the smallest spot, called the
circle of least confusion, has a radius that is 1/4 of its value in the Gaussian image plane
[7]. Similarly, when astigmatism is combined with defocus, the circle of least confusion
has a diameter equal to half the length of the line image in the Gaussian image plane. In
the case of coma, the ray distribution is asymmetric about the Gaussian image point and,
therefore, its centroid does not lie at this point. The centroid shift is equivalent to
introducing a wavefront tilt, or balancing coma with tilt.

Based on diffraction, the best image for small aberrations is the one for which the
variance of the wave aberration is minimum so that its Strehl ratio is maximum. Since the
value of variance depends on the shape of and the amplitude across the pupil, the value of
the balancing aberration also depends on those factors. Thus, for example, the value of
defocus for balancing spherical aberration for an annular pupil is different than that for a
circular pupil. Similarly, its value for a Gaussian circular pupil, as in the case of a circular
Gaussian beam, is different than that for a uniform circular pupil. The process of
balancing a higher-order aberration with one or more aberrations of the same and/or
lower orders to minimize the variance is called aberration balancing. Thus, for example,
secondary spherical aberration is balanced with primary spherical aberration and defocus,
and secondary coma is balanced with primary coma and tilt.
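
The dependence of the balancing defocus on the pupil can be illustrated for primary spherical aberration. The sketch below assumes a uniformly illuminated pupil, writes the aberration as $\rho^4 + b \rho^2$ with a unit spherical aberration coefficient, and finds the value of $b$ that minimizes the variance for a circular pupil and for an annular pupil with an obscuration ratio of 0.5; the function name `balancing_defocus`, the sampling, and the obscuration ratio are illustrative assumptions.

```python
import numpy as np

def balancing_defocus(eps, n=200_000):
    """Defocus coefficient b minimizing the variance of rho**4 + b*rho**2
    over a uniform annular pupil with obscuration ratio eps (eps = 0: circular).

    Uses u = rho**2, which is uniformly distributed over [eps**2, 1] for a
    uniformly illuminated pupil, so plain averages over u are pupil averages.
    """
    u = np.linspace(eps**2, 1.0, n)
    cov = np.mean(u**2 * u) - np.mean(u**2) * np.mean(u)
    b = -cov / np.var(u)                       # least-squares slope with the sign reversed
    var_unbalanced = np.var(u**2)              # variance of rho**4 alone
    var_balanced = np.var(u**2 + b * u)        # variance after adding the defocus term
    return b, var_unbalanced, var_balanced

b_circ, v0_circ, v1_circ = balancing_defocus(0.0)
b_ann, v0_ann, v1_ann = balancing_defocus(0.5)
```

In this sketch the balancing coefficient comes out as about $-1$ for the circular pupil, where the variance is reduced by a factor of about 16, and as about $-1.25$ for the annular pupil, which shows that the amount of balancing defocus changes with the shape of the pupil.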

The balanced aberrations for a system with a certain shape of the pupil form the basis
of determining the orthogonal polynomial aberrations for the analysis of wavefronts
across the given pupil. The Zernike circle polynomials, for example, are the orthogonal
polynomial aberrations for a system with a circular pupil that represent the balanced
classical aberrations for such a system.

1.5 SUMMARY
The diffraction image of an isoplanatic incoherent object is given by the convolution
of its Gaussian image and the PSF. In the spatial frequency domain, the spectrum of the
image is given by the product of the OTF and the spectrum of the Gaussian image. The
image is obtained by inverse Fourier transforming its spectrum.

For a system with a uniformly illuminated pupil, the aberration-free central
irradiance is given by $P_{ex} S_{ex} / \lambda^2 R^2$, independent of the shape of the pupil [see Eq. (1-8)].
The aberrations of a system are neglected in Gaussian optics when determining the
location and the size of an image formed by the system. The aberration-free OTF of a
system with a uniformly illuminated pupil is simply equal to the fractional area of overlap
of two pupils whose separation depends on the spatial frequency vector.

The aberrations of a system determine the quality of an image actually observed in
practice. An important measure of this quality is the Strehl ratio [see Eq. (1-21)], which
represents the ratio of the central irradiances of the image of a point object with and
without aberration. The Strehl ratio can also be obtained by integrating the real part of the
OTF of a system [see Eq. (1-26)]. For small aberrations, the Strehl ratio is determined by
the variance of the aberration according to, for example, Eq. (1-34), and it is independent
of the type of an aberration. The peak value of a PSF does not necessarily lie at its center,
as, for example, in the case of coma. For an apodized pupil, the aberration variance is
calculated over the amplitude-weighted pupil. A Strehl ratio of 0.8 is obtained when the
standard deviation $\sigma_W$ of the wave aberration is approximately $\lambda / 14$.

The variance of an aberration of a certain order can be reduced by mixing it with one
or more aberrations of lower order, thereby improving the Strehl ratio. The process of
mixing one aberration with others in this manner is called aberration balancing. The
polynomial aberrations used for wavefront analysis are not only orthogonal across the
pupil of a system, but also represent balanced classical aberrations for it.

References

1. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction
Optics, 2nd ed. (SPIE Press, Bellingham, WA, 2011).

2. V. N. Mahajan, “Luneburg apodization problem I,” Opt. Lett. 5, 267–269 (1980).

3. A. Maréchal, “Etude des effets combines de la diffraction et des aberrations
geometriques sur l'image d'un point lumineux,” Revue d'Optique 26, 257–277
(1947).

4. B. R. A. Nijboer, Thesis: “The Diffraction Theory of Aberrations,” University of
Groningen, The Netherlands (1942).

5. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II:
Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620
(1947).

6. V. N. Mahajan, “Strehl ratio for primary aberrations in terms of their aberration
variance,” J. Opt. Soc. Am. 73, 860–861 (1983).

7. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics
(SPIE Press, Bellingham, WA, Second Printing 2001).
CHAPTER 2

OPTICAL WAVEFRONTS AND THEIR ABERRATIONS

2.1 Introduction
2.2 Optical Imaging
2.3 Wave and Ray Aberrations
2.4 Defocus Aberration
2.5 Wavefront Tilt
2.6 Aberration Function of a Rotationally Symmetric System
2.7 Observation of Aberrations: Interferograms
2.8 Summary
References

2.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. We have stated in
Chapter 1 that the quality of the diffraction image depends on the aberrations of the
system. A spherical wave originating at a point object is incident on the system. The
image formed by the system is aberration free and perfect if the wave exiting from the
system is also spherical. In this case, the rays originating at the point object and traced
through the system all pass through the Gaussian image point.

If the optical wavefront exiting from the exit pupil is not spherical, its optical
deviations from a spherical form represent its wave aberrations. These wave aberrations
play a fundamental role in determining the quality of the aberrated image. The rays traced
from the object point through the system, instead of passing through the Gaussian image
point, intersect the image plane in its vicinity. The distance of the point of intersection of
a ray in the image plane from the Gaussian image point is called the transverse ray
aberration, and the distribution of the rays is referred to as the spot diagram. In this
chapter, we define the wave and ray aberrations and give a relationship between them.
We relate the longitudinal defocus of an image to the defocus wave aberration, and its
wavefront tilt to the wavefront tilt aberration. Next, the possible aberrations of an
imaging system that is rotationally symmetric about its optical axis are described. The
aberration function of the system is expanded in a power series of the object and pupil
coordinates, and primary (or Seidel), secondary (or Schwarzschild), and tertiary
aberrations are introduced [1]. We also discuss briefly how the aberrations may be
observed using a Twyman–Green interferometer and what the fringe pattern of a primary
or Seidel aberration looks like. A short summary of the chapter is given at the end.

2.2 OPTICAL IMAGING


An optical imaging system consists of a series of refracting and/or reflecting
surfaces. The surfaces refract or reflect light rays from an object to form its image. The
image obtained according to geometrical optics in the Gaussian approximation, i.e.,
according to Snell's law in which the sines of the angles are replaced by the angles, is
called the Gaussian image. The Gaussian approximation and the Gaussian image are
often referred to as the paraxial approximation and the paraxial image, respectively. We
assume that the surfaces are rotationally symmetric about a common axis called the
optical axis (OA). Figure 2-1 illustrates the imaging of an on-axis point object P0 and an
off-axis point object P, respectively, by an optical system consisting of two thin lenses.
$P_0'$ and $P'$ are the corresponding Gaussian image points. An object and its image are
called conjugates of each other, i.e., if one of the two conjugates is an object, the other is
its image. The location and size of the image of an extended object are determined by
using the Gaussian imaging equations of the system.

Figure 2-1. (a) Imaging of an on-axis point object $P_0$ by an optical imaging system
consisting of two lenses $L_1$ and $L_2$. OA is the optical axis. The Gaussian image is at
$P_0'$. AS is the aperture stop; its image by $L_1$ is the entrance pupil EnP, and its image
by $L_2$ is the exit pupil ExP. $CR_0$ is the axial chief ray, and $MR_0$ is the axial marginal
ray. (b) Imaging of an off-axis point object P. The Gaussian image is at $P'$. CR is the
off-axis chief ray, and MR is the off-axis marginal ray.

An aperture in the system that physically limits the solid angle of the rays from a
point object the most is called the aperture stop (AS). For an extended (i.e., a nonpoint)
object, it is customary to consider the aperture stop as the limiting aperture for the axial
point object, and to determine vignetting, or blocking of some rays, by this stop for off-
axis object points. The object is assumed to be placed to the left of the system so that
light initially travels from left to right. The image of the stop by surfaces that precede it in
the sense of light propagation, i.e., by surfaces that lie between it and the object, is called
the entrance pupil (EnP). When observed from the object side, the entrance pupil appears
to limit the rays entering the system to form the image of the object. Similarly, the image
of the aperture stop by surfaces that follow it, i.e., by surfaces that lie between it and the
image, is called the exit pupil (ExP). The object rays reaching its image appear to be
limited by the exit pupil. Since the entrance and exit pupils are images of the stop by the
surfaces that precede and follow it, respectively, the two pupils are conjugates of each
other for the whole system, i.e., if one pupil is considered as the object, the other is its
image formed by the system.

An object ray passing through the center of the aperture stop and appearing to pass
through the centers of the entrance and exit pupils is called the chief (or the principal) ray
(CR). An object ray passing through the edge of the aperture stop is called a marginal ray
(MR). The rays lying between the center and the edge of the aperture, and, therefore,
appearing to lie between the center and edge of the entrance and exit pupils, are called
zonal rays.

It is possible that the stop of a system may also be its entrance and/or exit pupil. For
example, a stop placed to the left of a lens is also its entrance pupil. Similarly, a stop
placed to the right of a lens is also its exit pupil. Finally, a stop placed at a single thin lens
is both its entrance and exit pupils.
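The statement that a pupil is an image of the stop can be illustrated with a minimal paraxial sketch. It assumes a single thin lens in air and the Cartesian form of the thin-lens equation, $1/s' - 1/s = 1/f$, with distances measured from the lens and positive to the right; the equation, the sign convention, and the numbers are assumptions made for this example and are not taken from the text.

```python
def image_by_thin_lens(s, f):
    """Image distance s' and transverse magnification for an object at distance s (thin lens in air)."""
    s_prime = 1.0 / (1.0 / f + 1.0 / s)
    return s_prime, s_prime / s

# Hypothetical case: a stop 50 mm to the left of a thin lens of focal length 100 mm.
# The stop itself is the entrance pupil; its image by the lens is the exit pupil.
stop_position = -50.0
focal_length = 100.0
exit_pupil_position, pupil_magnification = image_by_thin_lens(stop_position, focal_length)
print(exit_pupil_position, pupil_magnification)
```

In this case the exit pupil is virtual, lying 100 mm to the left of the lens, and it is twice as large as the stop.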

2.3 WAVE AND RAY ABERRATIONS


Consider an optical system imaging a point object P, as illustrated in Figure 2-2. The
object radiates a spherical wave. For perfect imaging, the diverging spherical wave
incident on the system is converted by it into a spherical wave converging to the Gaussian
image point $P'$. Generally, the wave exiting from real systems is only approximately
spherical.

The optical path length of a ray in a medium of refractive index n is equal to n times
its geometrical path length. Consider rays from a point object traced through the system
up to the exit pupil such that each one travels exactly the same optical path length. The
ray passing through the center of the pupil is called the chief ray, and represents the
reference ray with respect to which the optical path lengths of the other rays are
compared. The surface passing through the end points of the rays is called the system
wavefront, and it represents a surface of constant phase for the point object under
consideration. If the wavefront is spherical, with its center of curvature at the Gaussian
image point, we say that the image is perfect. The rays transmitted by the system have
equal optical path lengths in propagating from P to $P'$, and they all pass through $P'$. If,
however, the actual wavefront deviates from this spherical wavefront, called the
Gaussian reference sphere, we say that the image is aberrated. The rays reaching the
Gaussian reference sphere do not travel the same optical path length, and they intersect
the Gaussian image plane in the vicinity of $P'$. The optical deviations (i.e., the
geometrical deviations times the refractive index $n_i$ of the image space) of the wavefront
from a Gaussian reference sphere are called wave aberrations. The wave aberration of a
ray at a point on the reference sphere where the ray meets it is equal to the optical
deviation of the wavefront along that ray from the Gaussian reference sphere. It
represents the difference between the optical path lengths of the ray under consideration
and the chief ray in traveling from the point object to the reference sphere. Accordingly,
the wave aberration associated with the chief ray is zero. Since the optical path lengths of
the rays from the reference sphere to the Gaussian image point are equal, the wave
aberration of a ray is also equal to the difference between its optical path length from the
point object P to the Gaussian image point $P'$ and that of the chief ray.

Figure 2-2. Perfect imaging of a point object P by an optical system at its Gaussian
image point $P'$.

The wave aberration of a ray is positive if it has to travel an extra optical path length,
compared to the chief ray, in order to reach the Gaussian reference sphere. Figures 2-3a
and 2-3b illustrate the reference sphere S and the aberrated wavefront W for on-axis and
off-axis point objects, respectively. The reference sphere, which is centered at the
Gaussian image point $P_0'$ in Figure 2-3a or $P'$ in Figure 2-3b, and the wavefront pass
through the center O of the exit pupil. The wave aberration $n_i\bar{Q}Q$ of a general ray $GR_0$
or GR, where $n_i$ is the refractive index of the image space, as shown in the figures, is
numerically positive. The coordinate system is also illustrated in these figures. We choose
a right-hand Cartesian coordinate system such that the optical axis lies along the z axis.
The object, entrance pupil, exit pupil, and Gaussian image lie in mutually parallel planes
that are perpendicular to this axis. Figure 2-4 illustrates the coordinate systems in the
object, exit pupil, and image planes. The origin of the coordinate system lies at O and the
Gaussian image plane lies at a distance $z_g$ from it along the z axis.

We assume that a point object such as P lies along the x axis. (There is no loss of
generality because of this since the system is rotationally symmetric about the optical
axis.) The zx plane containing the optical axis and the point object is called the
tangential or the meridional plane. The corresponding Gaussian image point $P'$ lying in
the Gaussian image plane along its x axis also lies in the tangential plane. This may be
seen by consideration of a tangential object ray and Snell's law, according to which the
incident and the refracted (or reflected) rays at a surface lie in the same plane. The chief
ray always lies in the tangential plane. The plane normal to the tangential plane but
containing the chief ray is called the sagittal plane. As the chief ray bends when it is
refracted or reflected at an optical surface, so does the sagittal plane. It should be evident
that only the chief ray lies in both the tangential and sagittal planes, because it lies along
the line of intersection of these two planes.

Figure 2-3a. Aberrated wavefront for an on-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point $P_0'$. The
wavefront W and reference sphere pass through the center O of the exit pupil ExP.
A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,
where the z axis is along the optical axis OA of the imaging system. Angular
rotations $\alpha$, $\beta$, and $\gamma$ about the three axes are also indicated. $CR_0$ is the chief ray,
and a general ray $GR_0$ is shown intersecting the Gaussian image plane at $P_0''$.

Figure 2-3b. Aberrated wavefront for an off-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point $P'$. The value of
R in this figure is slightly larger than its value in Figure 2-3a. GR is a general ray
intersecting the Gaussian image plane at the point $P''$. By definition, the chief ray
(not shown) passes through O, but it may or may not pass through $P'$.

Figure 2-4. Right-hand coordinate system in object, exit pupil, and image planes.
The optical axis of the system is along the z axis, and the off-axis point object P is
assumed to be along the x axis, thus making the zx plane the tangential plane.

Consider an image ray such as GR in Figure 2-3b passing through a point Q with
coordinates (x, y, z) on the reference sphere of radius of curvature R centered at the image
point. We let W(x, y) represent its wave aberration $n\bar{Q}Q$, because z is related to x and y
by virtue of Q being on the reference sphere. It can be shown that the ray intersects the
Gaussian image plane at a point $P''$ whose coordinates with respect to the Gaussian
image point $P'$ are approximately given by [1,2]

$$(x_i, y_i) = \frac{R}{n}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right) , \tag{2-1}$$

where $(x_i, y_i)$ represent the coordinates of $P''$ with respect to those of the Gaussian
image point $P'$. For systems with narrow fields of view, $P'$ lies close to $P_0'$, and we may
replace R with $z_g$. Note that in the case of an axial point object, $R = z_g$. [Equation (2-1)
has been derived by Mahajan [1], Born and Wolf [2], and Welford [3]. Note, however,
that Welford uses a sign convention for the wave aberration that is opposite to ours.]

The displacement $P_0'P_0''$ in Figure 2-3a (or $P'P''$ in Figure 2-3b) of a ray from the
Gaussian image point is called its geometrical or transverse ray aberration, and its
coordinates $(x_i, y_i)$ in the Gaussian image plane relative to the Gaussian image point are
called its ray aberration components. Since a ray is normal to a wavefront, the ray
aberration depends on the shape of the wavefront and, therefore, on its geometrical path
difference from the reference sphere. The division of W by n in Eq. (2-1) converts the
optical path length difference into geometrical path length difference. When an image is
formed in free space, as is often the case in practice, then n = 1. The angle $\delta \simeq P_0'P_0''/R$
between the ideal ray $QP_0'$ and the actual ray $QP_0''$ is called the angular ray aberration.
The distribution of rays in an image plane is called the ray spot diagram.
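Equation (2-1) can be tried numerically. The sketch below estimates the ray aberration components from a wave aberration function by central differences; the function name `ray_aberration`, the quadratic test aberration, and all numerical values are hypothetical and serve only to illustrate the relation.

```python
def ray_aberration(W, x, y, R, n=1.0, step=1e-4):
    """Ray aberration (x_i, y_i) from Eq. (2-1), using central differences for the derivatives of W."""
    dW_dx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dW_dy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return (R / n) * dW_dx, (R / n) * dW_dy

# Hypothetical aberration W = c (x^2 + y^2), with lengths in mm.
c = 1e-5

def W_quadratic(x, y):
    return c * (x * x + y * y)

# Pupil point (3, 4) mm, reference sphere of radius 100 mm, image formed in air (n = 1).
x_i, y_i = ray_aberration(W_quadratic, 3.0, 4.0, R=100.0)
print(x_i, y_i)
```

Evaluating `ray_aberration` over a grid of pupil points and plotting the resulting $(x_i, y_i)$ pairs gives a spot diagram for the chosen aberration.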

We will refer to the aberration $W(x, y)$ as the wave aberration at a projected point
$Q(x, y)$ in the plane of the exit pupil. If $(r, \theta)$ are the polar coordinates of this point, as
illustrated in Figure 2-5, they are related to its rectangular coordinates $(x, y)$ according to

$$(x, y) = r(\cos\theta, \sin\theta) . \tag{2-2}$$

Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the
exit pupil plane and thus correspond to $\theta = 0$ or $\pi$. Similarly, the sagittal rays, i.e., those
lying in a plane orthogonal to the tangential plane but containing the chief ray, lie along
the y axis of the exit pupil plane and thus correspond to $\theta = \pi/2$ or $3\pi/2$.

Figure 2-5. Circular exit pupil of radius a of an imaging system, and Cartesian and
polar coordinates $(x, y)$ and $(r, \theta)$, respectively, of a point Q on the pupil.

2.4 DEFOCUS ABERRATION


We now discuss the defocus wave aberration of a system and relate it to its longitudinal
defocus. Consider an imaging system for which the Gaussian image of a point object is
located at $P_1$. As indicated in Figure 2-6, let the wavefront for this point object be
spherical with a center of curvature at $P_2$ (due, for example, to field curvature discussed
in Section 2.6 for an off-axis point object) such that $P_2$ lies on the line $OP_1$ joining the
center O of the exit pupil and the Gaussian image point $P_1$. The aberration of the
wavefront representing its optical deviation along a ray from the Gaussian reference
sphere is given by $nQ_2Q_1$, where n is the refractive index of the image space, and $Q_2Q_1$,
as indicated in the figure, is approximately equal to the difference in the sags of the
reference sphere and the wavefront at a height r. (The sag of a surface at a certain point
on it represents its deviation at that point along its axis of symmetry from a plane surface
that is tangent to it at its vertex.) Thus, the defocus wave aberration at a point $Q_1$ at a
distance r from the optical axis, representing the second-order difference, is given by

$$W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^2 , \tag{2-3}$$

where z and R are the radii of curvature of the reference sphere S and the spherical
wavefront W centered at $P_1$ and $P_2$, respectively, passing through the center O of the exit
pupil, and r is the distance of $Q_1$ from the optical axis. We note that the defocus wave
aberration is proportional to $r^2$. If $z \simeq R$, then Eq. (2-3) may be written as follows:

$$W(r) \simeq -\frac{n\Delta}{2R^2}\, r^2 , \tag{2-4}$$

where $\Delta = z - R$ is called the longitudinal defocus. We note that the defocus wave
aberration and the longitudinal defocus have numerically opposite signs.

Figure 2-6. Wavefront defocus. Defocused wavefront W is spherical with a radius of
curvature R centered at $P_2$. The reference sphere S with a radius of curvature z is
centered at $P_1$. Both W and S pass through the center O of the exit pupil ExP. The
ray $Q_2P_2$ is normal to the wavefront at $Q_2$. OB represents the sag of $Q_1$.

A defocus aberration is also introduced if the image is observed in a plane other than
the Gaussian image plane. Consider, for example, an imaging system forming an
aberration-free image at the Gaussian image point $P_2$ (and not at $P_1$, as in Figure 2-6).
Thus, the wavefront at the exit pupil is spherical passing through its center O with its
center of curvature at $P_2$. Let the image be observed in a defocused plane passing through
a point $P_1$, which lies on the line joining O and $P_2$. For the observed image at $P_1$ to be
aberration free, the wavefront at the exit pupil must be spherical with its center of
curvature at $P_1$. Such a wavefront forms the reference sphere with respect to which the
aberration of the actual wavefront must be defined. The aberration of the wavefront at a
point $Q_1$ on the reference sphere is given by Eqs. (2-3) and (2-4).
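The step from Eq. (2-3) to Eq. (2-4) can be made explicit. The following short derivation is a sketch that uses only the definition of the longitudinal defocus as the difference between the radii of curvature z and R, together with the assumption that the two radii are nearly equal:

```latex
\frac{1}{z} - \frac{1}{R} = \frac{R - z}{zR} = -\frac{\Delta}{zR}
\simeq -\frac{\Delta}{R^{2}} \quad (z \simeq R),
\qquad\text{so that}\qquad
W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^{2}
\simeq -\frac{n\Delta}{2R^{2}}\, r^{2} .
```

The relative error of the approximation is of the order of the ratio of the longitudinal defocus to R.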

If the exit pupil is circular with a radius a, then Eq. (2-4) may be written

$$W(\rho) = B_d \rho^2 , \tag{2-5}$$

where $\rho = r/a$ is the normalized distance of a pupil point and

$$B_d \simeq -n\Delta/8F^2 \tag{2-6}$$

represents the peak value of the defocus aberration with $F = R/2a$ as the focal ratio or
the f-number of the image-forming light cone. Note that a positive value of $B_d$ implies a
negative value of $\Delta$. Thus, an imaging system having a positive value of defocus
aberration $B_d$ can be made defocus free if the image is observed in a plane lying farther
from the plane of the exit pupil, compared to the defocused image plane, by a distance
$8B_dF^2/n$. Similarly, a positive defocus aberration of $B_d \simeq -n\Delta/8F^2$ is introduced into
the system if the image is observed in a plane lying closer to the plane of the exit pupil,
compared to the defocus-free image plane, by a distance $|\Delta|$.
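A numerical sketch of Eqs. (2-3) to (2-6) follows. The pupil radius, radius of curvature, and longitudinal defocus are hypothetical values (in mm) chosen for illustration, with the image formed in air.

```python
n = 1.0        # refractive index of the image space (air assumed)
R = 100.0      # radius of curvature of the defocused wavefront, mm (hypothetical)
a = 5.0        # radius of the exit pupil, mm (hypothetical)
delta = -0.2   # longitudinal defocus z - R, mm (hypothetical)

F = R / (2.0 * a)                   # focal ratio of the image-forming light cone
B_d = -n * delta / (8.0 * F ** 2)   # peak defocus aberration, Eq. (2-6)

# Eq. (2-3) evaluated at the edge of the pupil (r = a), without the z ~ R approximation.
z = R + delta
W_edge = 0.5 * n * (1.0 / z - 1.0 / R) * a ** 2

# Shift of the observation plane that removes the defocus aberration.
refocus_distance = 8.0 * B_d * F ** 2 / n

print(F, B_d, W_edge, refocus_distance)
```

With these numbers the peak defocus aberration is 0.25 micrometer, the unapproximated value at the pupil edge differs from it by about 0.2 percent, and the refocus distance equals the magnitude of the longitudinal defocus.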

2.5 WAVEFRONT TILT


Now we describe the relationship between a wavefront tilt and the corresponding tilt
aberration. As indicated in Figure 2-7, consider a spherical wavefront centered at $P_2$ in
the Gaussian image plane passing through the Gaussian image point $P_1$. The wave
aberration of the wavefront at $Q_1$ is its optical deviation $nQ_2Q_1$ from a reference sphere
centered at $P_1$. It is evident that, for small values of the ray aberration $P_1P_2$, the wavefront
and the reference sphere are tilted with respect to each other by an angle $\beta$. The
wavefront tilt may be due to an inadvertently tilted element of the imaging system or
distortion (discussed in Section 2.6) for an off-axis point object. The ray and the wave
aberrations can be written

$$x_i = R\beta \tag{2-7}$$

and

$$W(r, \theta) = n\beta r \cos\theta , \tag{2-8}$$

respectively, where $P_1P_2 = x_i$ and $(r, \theta)$ are the polar coordinates of the point $Q_1$. Both
the wave and ray aberrations are numerically positive in Figure 2-7.

Figure 2-7. Wavefront tilt. The spherical wavefront W is centered at $P_2$ while the
reference sphere S is centered at $P_1$, such that the two spherical surfaces are tilted
with respect to each other by a small angle $\beta = P_1P_2/R$, where R is their radius of
curvature. The ray $Q_2P_2$ is normal to the wavefront at $Q_2$.

Once again, for a system with a circular exit pupil of radius a, Eq. (2-8) may be
written

$$W(\rho, \theta) = na\beta\rho\cos\theta = B_t\rho\cos\theta , \tag{2-9}$$

where

$$B_t = na\beta \tag{2-10}$$

is the peak value of the wavefront tilt aberration. Note that a positive value of $B_t$ implies
that the wavefront tilt angle $\beta$ is also positive. Thus, if an aberration-free wavefront is
centered at $P_2$, then an observation with respect to $P_1$ as the origin implies that we have
introduced a tilt aberration of $B_t\rho\cos\theta$.
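As a consistency check, the ray aberration of Eq. (2-7) follows from applying Eq. (2-1) to the tilt aberration of Eq. (2-8). A sketch of the steps, using the polar-coordinate relation of Eq. (2-2):

```latex
W(x, y) = n\beta r\cos\theta = n\beta x
\;\;\Longrightarrow\;\;
\frac{\partial W}{\partial x} = n\beta , \quad
\frac{\partial W}{\partial y} = 0 ,
\qquad
(x_i, y_i) = \frac{R}{n}\left(n\beta,\, 0\right) = (R\beta,\, 0) .
```

Every ray is therefore displaced by the same amount along the x axis, as expected for a wavefront that is simply centered at a different point.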

2.6 ABERRATION FUNCTION OF A ROTATIONALLY SYMMETRIC SYSTEM

Consider a point object with Cartesian coordinates (p, q) in the object plane. Its
image, formed by a rotationally symmetric system, is perfect if the spherical wavefront
diverging from the object point and incident on the imaging system is converted by the
system into a spherical wavefront converging to its Gaussian image point. Any deviation
of the imaging wavefront at the exit pupil of the system from a reference sphere passing
through the center of the pupil with center of curvature at the Gaussian image point
represents the aberration function. In optical design, the aberration function is determined
by tracing rays originating at the point object and propagating them through the system
and determining their optical path lengths in reaching the reference sphere relative to that
of the chief ray passing through the center of the pupil. Similarly, in optical testing the
wave aberration at a discrete array of points is determined interferometrically.

If (x, y) are the coordinates of a pupil point, the aberration function consists of terms
formed from three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$. If $\vec{h}$
and $\vec{r}$ are the position vectors of the object and pupil points, then the rotational invariants
are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, $\vec{h}\cdot\vec{r}$ or $h^2$, $r^2$, and $hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and $\theta$ is the polar
angle of $\vec{r}$ with respect to that of $\vec{h}$. It is convenient to consider the aberration function
in terms of the image height $h'$, for example, when the object is at infinity, and let $\theta$ be
the angle for the image point. The image height is, of course, related to the object height
by the Gaussian magnification. We now expand the aberration function $W(h'; r, \theta)$ in a
power series in terms of the three rotational invariants $h'^2$, $r^2$, and $h'r\cos\theta$ in the form

$$\begin{aligned}
W(h'; r, \theta) &= \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\,(h'^2)^l (r^2)^p (h'r\cos\theta)^m \\
&= \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\, h'^{\,2l+m}\, r^{2p+m} \cos^m\theta ,
\end{aligned} \tag{2-11}$$

where $C_{lpm}$ are the expansion coefficients, and l, p, and m are positive integers, including
zero. There is no term with $\sin\theta$ dependence. The aberration terms are called the
classical aberrations.

It is evident that the degree of each term of the series in the object or image and pupil
coordinates is even and given by $2(l + p + m)$. Any terms for which $p = 0 = m$ so that
$2p + m = 0$, i.e., those terms that do not depend on r and, therefore, vary only as $h'^{\,2l}$,
must add up to zero since the aberration associated with the chief ray (for which $r = 0$) is
zero. Thus, the zero-degree term $C_{000}$ and terms such as $C_{100}h'^2$, $C_{200}h'^4$, etc., do not
appear in Eq. (2-11). There is also no term of second degree. For example, the term
$C_{010}r^2$ represents defocus aberration that is independent of $h'$. It has the implication that
the image is being observed in a plane other than the Gaussian image plane. Similarly, the
term $C_{001}h'r\cos\theta$ represents a wavefront tilt aberration that depends on $h'$. It has the
implication that the image height is not $h'$. Hence, a power series expansion of the
aberration function consists of terms of degree 4, 6, 8, etc. The corresponding aberrations
are referred to as the primary, secondary, tertiary aberrations, etc. The primary
aberrations are also called the Seidel aberrations, and the secondary aberrations are also
called the Schwarzschild aberrations.

It is convenient to write Eq. (2-11) in the form

$$W(h'; r, \theta) = \sum_{l=0}^{\infty}\sum_{n=1}^{\infty}\sum_{m=0}^{n} {}_{2l+m}a_{nm}\, h'^{\,2l+m}\, r^n \cos^m\theta , \tag{2-12}$$

where

$$n = 2p + m \tag{2-13}$$

is a positive integer not including zero, and ${}_{2l+m}a_{nm}$ are the expansion coefficients. From
Eq. (2-13), we note that $n - m = 2p \geq 0$ and even. The order i of an aberration term,
which is equal to its degree in the object and pupil coordinates, is given by

$$i = 2l + m + n . \tag{2-14}$$

The number of terms $N_i$ of a certain order i, i.e., the number of integer sets satisfying Eq.
(2-14) with $n - m \geq 0$ and even, is given by

$$N_i = (i + 2)(i + 4)/8 . \tag{2-15}$$

This number includes a term with n = 0 = m , called piston aberration, although such a
term does not constitute an aberration (since it corresponds to the chief ray, which has a
zero aberration associated with it). It is included here for completeness, as interferometric
data based on the aberrations of a system may have a piston component.
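The count of Eq. (2-15) can be verified by brute-force enumeration. The sketch below lists every integer set (l, n, m) that satisfies Eq. (2-14) with $n - m \geq 0$ and even, including the piston term; the function name `terms_of_order` is illustrative only.

```python
def terms_of_order(i):
    """All (l, n, m) with 2l + m + n = i, n >= m >= 0, and n - m even (piston n = m = 0 included)."""
    terms = []
    for n in range(0, i + 1):
        for m in range(0, n + 1):
            if (n - m) % 2 != 0:
                continue
            remainder = i - n - m  # must equal 2l for some integer l >= 0
            if remainder >= 0 and remainder % 2 == 0:
                terms.append((remainder // 2, n, m))
    return terms

counts = {i: len(terms_of_order(i)) for i in (4, 6, 8)}
formula = {i: (i + 2) * (i + 4) // 8 for i in (4, 6, 8)}
print(counts, formula)
```

Excluding piston, the enumeration gives 5, 9, and 14 terms for the fourth, sixth, and eighth orders, respectively.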

The fourth order (i = 4), i.e., the primary or the Seidel aberration function consisting
of a sum of five fourth-order terms, can be written

$$\begin{aligned}
W_P(r, \theta; h') ={}& {}_0a_{40}\, r^4 + {}_1a_{31}\, h' r^3\cos\theta + {}_2a_{22}\, h'^2 r^2\cos^2\theta \\
&+ {}_2a_{20}\, h'^2 r^2 + {}_3a_{11}\, h'^3 r\cos\theta .
\end{aligned} \tag{2-16}$$

Since the wave aberration W has dimensions of length, the dimensions of the coefficients
${}_ia_{jk}$ are inverse length cubed. Since the ray aberrations are related to the wave
aberrations by a spatial derivative [see Eq. (2-1)], their degree is lower by one.
Accordingly, the primary aberrations are also referred to as the third-order ray
aberrations. The wave aberration coefficients ${}_0a_{40}$, ${}_1a_{31}$, ${}_2a_{22}$, ${}_2a_{20}$, and ${}_3a_{11}$ represent
the coefficients of spherical aberration, coma, astigmatism, field curvature, and
distortion, respectively.

From Eq. (2-16), we note that only spherical aberration is independent of the object
or image height. The field curvature, in its dependence on the pupil coordinates $(r, \theta)$, is
like the defocus aberration discussed in Section 2.4. However, the field curvature
represents a defocus aberration that depends on the field $h'$, thus requiring a curved
image surface for its elimination. On the other hand, pure defocus aberration, such as that
produced by observing the image in a plane other than the Gaussian image plane, is
independent of the field $h'$. Similarly, distortion depends on the pupil coordinates as a
wavefront tilt. However, distortion depends on the field as $h'^3$, but the wavefront tilt
produced by a tilted element in the system would be independent of $h'$.

The sixth order (i = 6), i.e., the secondary or the Schwarzschild aberration function,
can be written

$$\begin{aligned}
W_S(h'; r, \theta) ={}& {}_0a_{60}\, r^6 + {}_1a_{51}\, h' r^5\cos\theta + {}_2a_{42}\, h'^2 r^4\cos^2\theta + {}_3a_{33}\, h'^3 r^3\cos^3\theta + {}_2a_{40}\, h'^2 r^4 \\
&+ {}_3a_{31}\, h'^3 r^3\cos\theta + {}_4a_{22}\, h'^4 r^2\cos^2\theta + {}_4a_{20}\, h'^4 r^2 + {}_5a_{11}\, h'^5 r\cos\theta .
\end{aligned} \tag{2-17}$$

Four of the nine aberration terms (excluding piston) correspond to l = 0. They are the
secondary spherical aberration (${}_0a_{60}\, r^6$), secondary coma (${}_1a_{51}\, h' r^5\cos\theta$), secondary
astigmatism (${}_2a_{42}\, h'^2 r^4\cos^2\theta$) (wings or Flügelfehler), and arrows or Pfeilfehler
(${}_3a_{33}\, h'^3 r^3\cos^3\theta$). The remaining five corresponding to $l \neq 0$ and called lateral
aberrations are similar to the corresponding primary aberrations except for their
dependence on the image height $h'$. The lateral spherical aberration ${}_2a_{40}\, h'^2 r^4$ is also
called the oblique spherical aberration.

Aberration terms of the eighth (i = 8) order are called the tertiary aberrations. There
are fourteen aberration terms of this order, excluding piston. Only five of them have the
dependencies on pupil coordinates that are different from those of the secondary or
primary aberrations. Four have dependence on these coordinates as for the secondary
aberrations, and the remaining five have the same dependence as the primary aberrations.
Their difference lies in their dependence on the image height.

By combining the aberration terms having different dependencies on the object
coordinates but the same dependence on pupil coordinates so that there is only one term
for each pair of (n, m) values, Eq. (2-12) for the power-series expansion of the aberration
function may be written

$$W(\rho, \theta) = \sum_{n=1}^{\infty}\sum_{m=0}^{n} a_{nm}\, \rho^n \cos^m\theta , \tag{2-18}$$

where the expansion coefficients $a_{nm}$ are related to the coefficients ${}_ia_{jk}$ according to

$$a_{nm} = a^n \sum_{l=0}^{\infty} {}_{2l+m}a_{nm}\, h'^{\,2l+m} . \tag{2-19}$$

The radial coordinate r has been normalized to $\rho = r/a$. It has the advantage that, since
$0 \leq \rho \leq 1$ and $|\cos\theta| \leq 1$, the coefficient $a_{nm}$ of a classical aberration $\rho^n\cos^m\theta$
represents the peak value or half of the peak-to-valley (P-V) value of the corresponding
aberration term, depending on whether m is even or odd, respectively. The indices n and
m represent the powers of $\rho$ and $\cos\theta$, respectively. The index m also represents the
minimum power of $h'$ dependence of a coefficient (with the exception of the tilt and defocus
terms, corresponding to (n, m) = (1, 1) and (2, 0), respectively, for which the second-degree
terms are absent). The maximum power of $h'$ dependence is given by $i - n$. Moreover, the
powers of $h'$ dependence are even or odd according to whether n and m are even or odd,
respectively. The number of terms through
a certain order i in the reduced power-series expansion of the aberration function given
by Eq. (2-18) is also given by Eq. (2-15). This number includes a nonaberration piston
term corresponding to $n = 0 = m$. The terms of Eq. (2-12) through a certain order i
correspond to those terms of Eq. (2-18) for which $n + m \leq i$.

The primary aberrations correspond to terms with $n + m \le 4$. The primary or the
Seidel aberration function of Eq. (2-16) may be written in terms of the coefficients $a_{nm}$
in the form

$$W_P(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^{2} + a_{22}\rho^{2}\cos^{2}\theta + a_{31}\rho^{3}\cos\theta + a_{40}\rho^{4} , \tag{2-20}$$

where
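$$a_{11} = {}_{3}a_{11}\, h'^{3} a , \tag{2-21a}$$

$$a_{20} = {}_{2}a_{20}\, h'^{2} a^{2} , \tag{2-21b}$$

$$a_{22} = {}_{2}a_{22}\, h'^{2} a^{2} , \tag{2-21c}$$

$$a_{31} = {}_{1}a_{31}\, h' a^{3} , \tag{2-21d}$$

and

$$a_{40} = {}_{0}a_{40}\, a^{4} . \tag{2-21e}$$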

Comparing the distortion term $a_{11}\rho\cos\theta$ with the wavefront tilt aberration given
by Eq. (2-9), we note that while the two are similar in their dependence on the pupil
coordinates, their coefficients depend on the image height differently. The distortion
coefficient $a_{11}$ varies with $h'$ as $h'^{3}$, but the tilt coefficient $B_t$ is
independent of $h'$. Similarly, comparing the field curvature term $a_{20}\rho^{2}$ with the
defocus wave aberration given by Eq. (2-5), we note that their dependence on the pupil
coordinates is the same. However, whereas the field curvature coefficient $a_{20}$ varies
with $h'$ as $h'^{2}$, the defocus coefficient $B_d$ is independent of $h'$.

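The aberration function through the sixth order, i.e., for $i \le 6$ or $n + m \le 6$, may be
written

$$\begin{aligned} W_S(\rho, \theta) ={}& a_{11}\rho\cos\theta + a_{20}\rho^{2} + a_{22}\rho^{2}\cos^{2}\theta + a_{31}\rho^{3}\cos\theta + a_{33}\rho^{3}\cos^{3}\theta \\ &+ a_{40}\rho^{4} + a_{42}\rho^{4}\cos^{2}\theta + a_{51}\rho^{5}\cos\theta + a_{60}\rho^{6} , \end{aligned} \tag{2-22}$$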

where

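$$a_{11} = \left( {}_{3}a_{11}\, h'^{3} + {}_{5}a_{11}\, h'^{5} \right) a , \tag{2-23a}$$

$$a_{20} = \left( {}_{2}a_{20}\, h'^{2} + {}_{4}a_{20}\, h'^{4} \right) a^{2} , \tag{2-23b}$$

$$a_{22} = \left( {}_{2}a_{22}\, h'^{2} + {}_{4}a_{22}\, h'^{4} \right) a^{2} , \tag{2-23c}$$

$$a_{31} = \left( {}_{1}a_{31}\, h' + {}_{3}a_{31}\, h'^{3} \right) a^{3} , \tag{2-23d}$$

$$a_{33} = {}_{3}a_{33}\, h'^{3} a^{3} , \tag{2-23e}$$

$$a_{40} = \left( {}_{0}a_{40} + {}_{2}a_{40}\, h'^{2} \right) a^{4} , \tag{2-23f}$$

$$a_{42} = {}_{2}a_{42}\, h'^{2} a^{4} , \tag{2-23g}$$

$$a_{51} = {}_{1}a_{51}\, h' a^{5} , \tag{2-23h}$$

$$a_{60} = {}_{0}a_{60}\, a^{6} . \tag{2-23i}$$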

Written in this form, the aberration function has nine aberration terms through the sixth
order or through the secondary aberrations. Since the dependence of an aberration term
on the image height $h'$ is contained in the aberration coefficient $a_{nm}$, it should be
noted that the primary aberrations (including distortion and field curvature terms) in
Eqs. (2-23) are not the same as those in Eq. (2-20), because they contain aberration
components not only of the fourth degree, but of the sixth degree as well. For example,
$a_{40}\rho^{4}$ consists of spherical and lateral spherical aberrations
${}_{0}a_{40}\, a^{4}\rho^{4}$ and ${}_{2}a_{40}\, h'^{2} a^{4}\rho^{4}$.

Similarly, the aberration function through the eighth order can be written. Once
again, an aberration term of this expansion will not necessarily be the same as a
corresponding term of the expansions of Eq. (2-20) or (2-22). We add that it is convenient
to refer to the aberration terms of a power-series expansion as the classical aberrations;
e.g., a term in $\rho^{4}$ may be referred to as the classical primary spherical aberration.
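
As a rough numerical illustration of Eqs. (2-22) and (2-23), the following Python sketch combines image-height-dependent coefficients into the reduced coefficients $a_{nm}$ and evaluates the resulting power series. The dictionary layout, the function names, and the numerical values are assumptions made only for illustration; they do not come from the text.

```python
import math

def reduced_coefficients(c, h, a):
    """Sum c[(l, n, m)] * h**l * a**n over l for each (n, m) pair, as in Eqs. (2-23).

    c maps (l, n, m) to the coefficient written with presubscript l (the power
    of the image height h') and subscripts n and m.
    """
    a_nm = {}
    for (l, n, m), value in c.items():
        a_nm[(n, m)] = a_nm.get((n, m), 0.0) + value * h**l * a**n
    return a_nm

def aberration(a_nm, rho, theta):
    """Evaluate the sum of a_nm * rho**n * cos(theta)**m over the given terms."""
    return sum(v * rho**n * math.cos(theta)**m for (n, m), v in a_nm.items())

# Hypothetical values (in waves): spherical (l = 0) and lateral spherical (l = 2).
c = {(0, 4, 0): 1.0, (2, 4, 0): 0.5}
a_nm = reduced_coefficients(c, h=2.0, a=1.0)
a40 = a_nm[(4, 0)]                       # 1.0 + 0.5 * 2.0**2 = 3.0
w_edge = aberration(a_nm, rho=1.0, theta=0.0)
```

With these assumed numbers, the single reduced coefficient `a40` carries both the fourth-degree and the sixth-degree contributions, which is why a term of the reduced expansion differs from the corresponding Seidel term.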

2.7 OBSERVATION OF ABERRATIONS: INTERFEROGRAMS


There are a variety of interferometers that are used for detecting and measuring
aberrations of optical systems [4]. Figure 2-8 illustrates schematically a Twyman–Green
interferometer in which a collimated laser beam is divided into two parts by a beam
splitter BS. One part, called the test beam, is incident on the system under test, indicated
by the lens L, and the other, called the reference beam, is incident on a plane mirror $M_1$.
The focus F of the lens system lies at the center of curvature C of a spherical mirror $M_2$.
As the angle of the incident light is changed to study the off-axis aberrations of the
system, the mirror is tilted so that its center of curvature lies at the current focus of the
beam. In this arrangement the mirror does not introduce any aberration since it is forming
the image of an object lying at its center of curvature.

The two reflected beams interfere in the region of their overlap. Lens $L'$ is used to
observe the interference pattern on a screen S placed in a plane containing the image of L
formed by $L'$. A record of the interference pattern is called an interferogram. Note that
since the test beam goes through the lens system L twice, its aberration is twice that of the
system.

Figure 2-8. Twyman–Green interferometer for testing a lens system L. A laser beam
is split into two parts by a beam splitter BS. The reflected part is incident on a plane
mirror $M_1$ and the transmitted part is incident on L. F is the image-space focal
point of L, and C is the center of curvature of a spherical mirror $M_2$. The
interfering beams are focused by a lens $L'$, and the interference pattern is observed
on a screen S.

If the reference beam has a uniform phase and the test beam has a phase distribution
$\Phi(x, y)$, and if their amplitudes are equal to each other, the irradiance distribution of
their interference pattern is given by

$$\begin{aligned} I(x, y) &= I_0 \left| 1 + \exp\left[ i\Phi(x, y) \right] \right|^{2} \\ &= 2 I_0 \left\{ 1 + \cos\left[ \Phi(x, y) \right] \right\} , \end{aligned} \tag{2-24}$$

where $I_0$ is the irradiance when only one beam is present. Of course, the phase and the
wave aberration distributions are related to each other according to

$$\Phi(x, y) = \frac{2\pi}{\lambda} W(x, y) , \tag{2-25}$$

where $\lambda$ is the wavelength of the laser beam. The irradiance has a maximum value equal
to $4I_0$ at those points for which

$$\Phi(x, y) = 2\pi n \tag{2-26a}$$

and a minimum value equal to zero wherever

$$\Phi(x, y) = 2\pi (n + 1/2) , \tag{2-26b}$$

where n is a positive or a negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which in turn corresponds to the locus of (x, y)
points with phase aberration given by Eq. (2-26a) for a bright fringe and Eq. (2-26b) for a
dark fringe. If the test beam is aberration free [$\Phi(x, y) = 0$], then the interference
pattern has a uniform irradiance of $4I_0$. Figure 2-9 shows interferograms of six waves of a
primary aberration. In Figure 2-9b for spherical aberration and 2-9d for astigmatism, a
certain amount of defocus has also been added. In Figure 2-9c, a certain amount of tilt has
been added to the coma aberration.
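
The fringe formula of Eq. (2-24), with the phase of Eq. (2-25), can be sketched numerically as follows. This is a minimal illustration assuming NumPy; the function name, the grid size, and the choice of a defocus-like aberration of six waves are hypothetical and are not meant to reproduce Figure 2-9.

```python
import numpy as np

def interferogram(W, wavelength=1.0, I0=1.0):
    """Two-beam irradiance 2*I0*(1 + cos(Phi)) with Phi = 2*pi*W/wavelength."""
    phase = 2.0 * np.pi * np.asarray(W, dtype=float) / wavelength
    return 2.0 * I0 * (1.0 + np.cos(phase))

# Hypothetical example: W = 6*rho^2, i.e., six waves at the pupil edge.
n = 257
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
rho2 = x**2 + y**2
I = np.where(rho2 <= 1.0, interferogram(6.0 * rho2), 0.0)   # zero outside the pupil

# W = 0, 1/2, and 1 wave: bright, dark, and bright fringes, respectively.
samples = interferogram([0.0, 0.5, 1.0])
```

The array `I` can be displayed as an image to see the circular fringes; an integer number of waves gives the maximum irradiance $4I_0$ and a half-integer number gives zero, in line with Eqs. (2-26a) and (2-26b).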

2.8 SUMMARY
A perfect image of a point object is formed by an imaging system when a spherical
wave diverging from the object and incident on the system is converted by it into a
spherical wave converging to the Gaussian image point. If rays from the object point are
traced through the system, they all travel exactly the same optical path length from the
object point to the Gaussian image point, and they all pass through this image point.
When the wavefront exiting from the exit pupil of the system is not spherical, its optical
deviations from the spherical form represent the wave aberrations, and an aberrated
image is formed. The rays intersect the image plane in the vicinity of the Gaussian image
point, and their distribution is called the spot diagram. The wave and the ray aberrations
are related to each other by a spatial derivative, as in Eq. (2-1).

The aberrations of a rotationally symmetric system depend on the product of the
integral powers of three rotational invariants, namely, $h'^{2}$, $r^{2}$, and
$h' r \cos\theta$, where $h'$ is the height of the Gaussian image point from the optical axis
and $(r, \theta)$ are the polar coordinates of a point in the plane of the exit pupil. There
is no term with $\sin\theta$ dependence. The order of an aberration term, representing its
degree in the object and pupil coordinates, is even. The aberrations of the lowest order,
namely 4, are called primary or Seidel aberrations. Similarly, the aberrations of the next
order, namely 6, are called the secondary or the Schwarzschild aberrations. When an image is
observed in a defocused image plane, the defocus aberration thus introduced varies as
$r^{2}$. It is similar to the field curvature aberration in its pupil dependence, but whereas
the former is independent of the image height, the latter varies as $h'^{2}$.

The interference pattern formed by two beams, one of which has traveled through an
aberrated system, is shown in Figure 2-9 for primary aberrations, as an illustration of
interferograms.
Figure 2-9. Interferograms of primary aberrations: (a) defocus $B_d\rho^{2}$, (b) spherical
aberration combined with defocus $A_s\rho^{4} + B_d\rho^{2}$, (c) coma combined with tilt
$A_c\rho^{3}\cos\theta + B_t\rho\cos\theta$, and (d) astigmatism combined with defocus
$A_a\rho^{2}\cos^{2}\theta + B_d\rho^{2}$. The aberrations in the interferograms are twice
their corresponding values in the system under test, because the test beam goes through the
system twice.

References

1. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics,
2nd Printing (SPIE Press, Bellingham, Washington, 2001).

2. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999).

3. W. T. Welford, Aberrations of the Symmetrical Optical System (Academic Press,
New York, 1974).

4. D. Malacara, Ed., Optical Shop Testing, 3rd ed. (Wiley, New York, 2007).
CHAPTER 3

ORTHONORMAL POLYNOMIALS AND GRAM–SCHMIDT ORTHONORMALIZATION

3.1 Introduction
3.2 Orthonormal Polynomials
3.3 Equivalence of Orthogonality-Based Coefficients and Least-Squares Fitting
3.4 Orthonormalization of Zernike Circle Polynomials over Noncircular Pupils
3.5 Unit Pupil
3.6 Summary
References

3.1 INTRODUCTION
In optical design, we trace rays from a point object through a system to determine the
aberrations of the wavefront at its exit pupil. In optical testing, we determine the
aberrations of a system or an element interferometrically. In both cases, we obtain
aberration numbers at an array of points. We can calculate the PSF or other associated
image quality measures from these numbers. We can also calculate the aberration
variance, which, in turn, gives some idea of the image quality. However, such measures
do not shed light on the content of the aberration function. To understand the nature of
this function, we want to know the amount of certain familiar aberrations discussed in
Chapter 2 that are present, so that perhaps something can be done about them in
improving the design or the system under test.

A straightforward approach to determine the content of an aberration function is to
decompose it into a set of orthogonal polynomials that represent balanced classical
aberrations and include wavefront defocus and tilt. The Zernike circle polynomials are in
widespread use for this purpose for systems with circular pupils. These polynomials are
unique in the sense that they are not only orthogonal across a unit circle, but they also
represent balanced aberrations yielding minimum variance, as we shall see in Chapter 4.
In this chapter, we discuss the basic properties of the orthogonal polynomials. We also
describe the Gram–Schmidt orthogonalization process for obtaining orthogonal
polynomials over one domain from those that are orthogonal over another domain, e.g.,
obtaining polynomials that are orthogonal over an annular pupil from the circle
polynomials. We emphasize the use of orthonormal polynomials so that their coefficients
represent the standard deviations of the corresponding polynomial aberration terms.

3.2 ORTHONORMAL POLYNOMIALS


Consider a complete set of polynomials $F_j(x, y)$ in Cartesian coordinates (x, y) that
are orthonormal over a certain pupil according to

$$\frac{1}{A} \int_{\text{pupil}} F_j(x, y)\, F_{j'}(x, y)\, dx\, dy = \delta_{jj'} , \tag{3-1}$$

where A is the area of the pupil inscribed inside a unit circle, the integration is carried out
over the area of the pupil, and $\delta_{jj'}$ is a Kronecker delta. Let $F_1 = 1$. Since it is
independent of the coordinates x and y, it is referred to as the piston polynomial. As a
result, the mean value of each polynomial, except for j = 1, is zero, i.e.,

$$\langle F_j(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j(x, y)\, dx\, dy = 0 \quad \text{for } j \ne 1 , \tag{3-2}$$

as may be seen by letting $j' = 1$ in Eq. (3-1). The angular brackets on the left-hand side
of Eq. (3-2) indicate a mean value over the area of the pupil. Similarly, the mean square
value of a polynomial is unity, i.e.,

$$\langle F_j^{2}(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j^{2}(x, y)\, dx\, dy = 1 , \tag{3-3}$$

as may be seen by letting $j' = j$ in Eq. (3-1).

An aberration function $W(x, y)$ can be expanded in terms of the polynomials in the
form

$$W(x, y) = \sum_{j=1}^{\infty} a_j F_j(x, y) , \tag{3-4}$$

where $a_j$ is the expansion or aberration coefficient of the polynomial $F_j(x, y)$.
Multiplying both sides of Eq. (3-4) by $F_{j'}(x, y)$, integrating over the pupil, and utilizing
the orthonormality Eq. (3-1), the aberration coefficients are given by

$$\frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_{j'}(x, y)\, dx\, dy = \frac{1}{A} \sum_{j=1}^{\infty} a_j \int_{\text{pupil}} F_j(x, y)\, F_{j'}(x, y)\, dx\, dy = a_{j'} ,$$

or

$$a_j = \frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_j(x, y)\, dx\, dy . \tag{3-5}$$

It is evident that the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Accordingly, one or more terms can be added to or
subtracted from the aberration function without affecting the other coefficients. This is a
consequence of the orthogonality of the polynomials.

The mean value of the aberration function is given by

$$\langle W(x, y) \rangle = \sum_{j=1}^{\infty} a_j \langle F_j(x, y) \rangle = a_1 , \tag{3-6}$$

where we have utilized Eq. (3-2) for the mean value of a polynomial. The mean square
value of the aberration function is given by

$$\langle W^{2}(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} \sum_{j=1}^{\infty} a_j F_j(x, y) \sum_{j'=1}^{\infty} a_{j'} F_{j'}(x, y)\, dx\, dy = \sum_{j=1}^{\infty} a_j^{2} , \tag{3-7}$$

where we have utilized the orthonormality Eq. (3-1) and Eq. (3-3) for the mean square
value of a polynomial. The variance $\sigma_W^{2}$ of the aberration function is accordingly
given by

$$\sigma_W^{2} = \langle W^{2}(x, y) \rangle - \langle W(x, y) \rangle^{2} = \sum_{j=2}^{\infty} a_j^{2} , \tag{3-8}$$

where $\sigma_W$ is the standard deviation or the sigma value of the aberration function. Since
the mean value of a polynomial (except piston) is zero, each expansion coefficient $a_j$
represents the standard deviation of the corresponding polynomial term. The variance of
the aberration function is simply the sum of the variances of the polynomial terms.
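
The relations of Eqs. (3-5), (3-6), and (3-8) can be checked numerically. The sketch below assumes a circular unit pupil sampled on a square grid, uses piston, the two tilts, and defocus in their orthonormal circle-polynomial form as the polynomials $F_j$, and replaces the pupil integrals by averages over the sample points. The coefficient values are hypothetical.

```python
import numpy as np

# Sample a unit circular pupil on a square grid (a discrete stand-in for the integrals).
n = 401
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
inside = x**2 + y**2 <= 1.0
x, y = x[inside], y[inside]
rho2 = x**2 + y**2

# Piston, x tilt, y tilt, and defocus, orthonormal over the unit circle.
F = np.array([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3) * (2 * rho2 - 1)])

# Hypothetical aberration function with known coefficients (in waves).
a_true = np.array([0.3, 0.5, 0.0, -0.2])
W = a_true @ F

a = (F * W).mean(axis=1)    # Eq. (3-5): a_j = <W F_j>
mean_W = W.mean()           # Eq. (3-6): approximately a_1
var_W = W.var()             # Eq. (3-8): approximately the sum of a_j^2 for j >= 2
```

Within the sampling error of the grid, `a` reproduces `a_true`, `mean_W` equals the piston coefficient, and `var_W` is close to 0.5² + 0.2² = 0.29.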

In the orthonormality Eq. (3-1) and those that follow it, we have assumed a
uniformly illuminated pupil, i.e., the amplitude across it is constant. If that is not the case,
as for example in a Gaussian pupil where the amplitude across the pupil varies as a
Gaussian function, then the amplitude function must be included in all the integrations
over the pupil (see Chapter 6). The quantity A in such cases would also be an amplitude-
weighted area of the pupil. Thus, the integrations, indicated by the angular brackets
implying a mean value, would be over an amplitude-weighted area of the pupil.

In practice, the number of polynomials used in the expansion will be truncated such
that the resulting variance obtained from Eq. (3-8) equals the actual value obtained from
the function W ( x , y ) within some specified tolerance. The Strehl ratio of an image for
small aberrations can be estimated from the variance according to Eq. (1-34).

3.3 EQUIVALENCE OF ORTHOGONALITY-BASED COEFFICIENTS AND LEAST-SQUARES FITTING

It is easy to show that the expansion coefficients $a_j$ given by Eq. (3-5) and obtained
as a consequence of the orthogonality of the polynomials $F_j(x, y)$ represent a
least-squares fit of the aberration function $W(x, y)$. Suppose we estimate the function with
only J polynomials. Thus we write

$$\hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) , \tag{3-9}$$

where $\hat{W}(x, y)$ is the best-fit estimate of $W(x, y)$. The least-squares error resulting
from fitting the aberration function with J polynomials is given by

$$\begin{aligned} E &= \frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \hat{W}(x, y) \right]^{2} dx\, dy \\ &= \frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \right]^{2} dx\, dy . \end{aligned} \tag{3-10}$$

The error is minimum when the coefficients obey the condition

$$\frac{\partial E}{\partial a_{j'}} = 0 , \tag{3-11}$$

or

$$\frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \right] F_{j'}(x, y)\, dx\, dy = 0 . \tag{3-12}$$

Using the orthonormality Eq. (3-1), Eq. (3-12) yields Eq. (3-5). The variance of the
estimated aberration function is given by

$$\sigma_{\hat{W}}^{2} = \langle \hat{W}^{2}(x, y) \rangle - \langle \hat{W}(x, y) \rangle^{2} = \sum_{j=2}^{J} a_j^{2} . \tag{3-13}$$

It should be evident that each polynomial coefficient provides a best fit to the
aberration function. The fit, of course, improves as more and more polynomials are added
until there is no more improvement. We point out that, in practice, the aberration function
data are available only at a discrete set of points. Hence, there will be some error in the
coefficient values, because the orthonormality Eq. (3-1) will not be satisfied exactly. This
error decreases as the number of data points increases.
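
The effect of discrete sampling can be illustrated with a short sketch. It assumes randomly placed data points in a circular unit pupil, hypothetical coefficient values, and NumPy's `lstsq` routine for the discrete least-squares fit; the inner-product estimate is the discrete analog of Eq. (3-5).

```python
import numpy as np

def basis(x, y):
    """Piston, x tilt, y tilt, and defocus, orthonormal over the unit circle."""
    rho2 = x**2 + y**2
    return np.column_stack([np.ones_like(x), 2 * x, 2 * y,
                            np.sqrt(3) * (2 * rho2 - 1)])

rng = np.random.default_rng(0)
a_true = np.array([0.3, 0.5, 0.0, -0.2])    # hypothetical coefficients (in waves)

def estimates(num_points):
    # Uniform random points in the unit circle (the sqrt gives uniform area density).
    r = np.sqrt(rng.uniform(0, 1, num_points))
    t = rng.uniform(0, 2 * np.pi, num_points)
    B = basis(r * np.cos(t), r * np.sin(t))
    W = B @ a_true
    a_lsq = np.linalg.lstsq(B, W, rcond=None)[0]    # discrete least-squares fit
    a_inner = B.T @ W / num_points                  # discrete version of Eq. (3-5)
    return a_lsq, a_inner

a_lsq, a_inner = estimates(100_000)
err_inner = np.abs(a_inner - a_true).max()
```

In this noise-free example the least-squares fit recovers the coefficients to machine precision, whereas the inner-product estimate carries a small error because the sampled polynomials are only approximately orthonormal. Calling `estimates` with fewer points generally gives a larger inner-product error.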

3.4 ORTHONORMALIZATION OF ZERNIKE CIRCLE POLYNOMIALS OVER NONCIRCULAR PUPILS

The Zernike circle polynomials (discussed in Chapter 4) are orthogonal over a
circular pupil. They uniquely represent balanced classical aberrations and include
wavefront tilt and defocus aberrations. The corresponding polynomials $F_j(x, y)$ that are
orthogonal over a noncircular pupil can be obtained by orthogonalizing the circle
polynomials $Z_j(x, y)$ using the Gram–Schmidt orthonormalization process [1]. Omitting
the argument (x, y) of the polynomials for simplicity, we may write

$$G_1 = Z_1 = 1 , \tag{3-14}$$

$$G_{j+1} = Z_{j+1} + \sum_{k=1}^{j} c_{j+1,k} F_k , \tag{3-15}$$

$$F_{j+1} = \frac{G_{j+1}}{\left\| G_{j+1} \right\|} = \frac{G_{j+1}}{\left[ \dfrac{1}{A} \displaystyle\int_{\text{pupil}} G_{j+1}^{2}\, dx\, dy \right]^{1/2}} , \tag{3-16}$$

where

$$c_{j+1,k} = -\frac{1}{A} \int_{\text{pupil}} Z_{j+1} F_k\, dx\, dy \tag{3-17a}$$

$$\equiv -\langle Z_{j+1} F_k \rangle . \tag{3-17b}$$

It is evident from Eq. (3-14) that $F_1 = 1$. Substituting Eq. (3-17b) into Eq. (3-15) and
substituting the result thus obtained into Eq. (3-16), we may write

$$F_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} F_k \rangle F_k \right] , \tag{3-18}$$

where N j +1 is a normalization constant so that the polynomials are orthonormal over the
pupil under consideration, i.e., they satisfy the orthonormality condition of Eq. (3-1).
Thus, the F-polynomials are obtained recursively, starting with F1 = 1. It is clear from Eq.
(3-18) that each F-polynomial of a certain order is a linear combination of the circle
polynomials of no more than that order. It should be evident that the F-polynomials are
ordered in the same manner as the basis polynomials and that there is a one-to-one
correspondence between them.

Because of the biaxial symmetry of the pupils considered in this chapter and,
therefore, the symmetric limits of integration, the integral in Eq. (3-17a) is zero when the
integrand is an odd function of one or both integration variables. It should be evident that
a c-coefficient is zero unless the Z- and the G-polynomials have the same cosine or sine
dependence. If all of the c-coefficients in Eq. (3-15) are zero, then the F-polynomial has
the same form as the corresponding Zernike polynomial, except for its normalization.
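
The recursion of Eqs. (3-14) through (3-18) can be sketched numerically for an annular pupil. The example assumes NumPy, an obscuration ratio of 0.5 chosen only for illustration, the first four circle polynomials as the basis, and averages over grid points in place of the pupil integrals.

```python
import numpy as np

eps = 0.5                                    # hypothetical obscuration ratio
n = 401
x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
rho2 = x**2 + y**2
pupil = (rho2 >= eps**2) & (rho2 <= 1.0)     # annular unit pupil
x, y, rho2 = x[pupil], y[pupil], rho2[pupil]

# Circle polynomials Z1..Z4 (piston, tilts, defocus) sampled on the annulus.
Z = [np.ones_like(x), 2 * x, 2 * y, np.sqrt(3) * (2 * rho2 - 1)]

def inner(f, g):
    """Discrete stand-in for (1/A) times the integral of f*g over the pupil."""
    return np.mean(f * g)

F = []
for Zj in Z:
    G = Zj.copy()
    for Fk in F:
        G = G - inner(Zj, Fk) * Fk           # Eqs. (3-15) and (3-17): remove projections
    F.append(G / np.sqrt(inner(G, G)))       # Eq. (3-16): normalize

gram = np.array([[inner(Fi, Fj) for Fj in F] for Fi in F])

# Defocus-like polynomial obtained by carrying out the integrals analytically here.
F4_closed = np.sqrt(3) * (2 * rho2 - 1 - eps**2) / (1 - eps**2)
```

The matrix `gram` is the identity to rounding error, since the polynomials are orthonormalized with respect to the discrete inner product itself. The tilt polynomials change only in their normalization, while the defocus polynomial also acquires a piston term, and `F[3]` agrees with `F4_closed` to within the sampling error of the grid.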

The orthonormal F-polynomials represent the unit vectors of the space that span the
aberration function. They can be written in a matrix form according to

$$F_l(x, y) = \sum_{i=1}^{l} M_{li} Z_i(x, y) \quad \text{with} \quad M_{ll} = \frac{1}{\left\| G_l \right\|} . \tag{3-19}$$

While the diagonal elements of the M-matrix are simply equal to the normalization
constants of the G-polynomials [since there is no multiplier with the polynomial $Z_{j+1}$ in
Eq. (3-15)], there are no matrix elements above the diagonal because a polynomial $F_l$
consists of a linear combination of circle polynomials up to $Z_l$ only. The matrix is lower
triangular and the missing elements may be given a value of zero when multiplying a
Zernike column vector $(\ldots, Z_j, \ldots)$ to obtain the orthonormal column vector
$(\ldots, F_j, \ldots)$. It should be evident that the orthonormal polynomials for a
noncircular pupil written in terms of the circle polynomials immediately yield the elements
of the conversion matrix M.

The conversion matrix M can be obtained independently and nonrecursively using a
matrix approach [2], which is not only faster but also avoids the potential numerical
instability of the Gram–Schmidt approach as the number of polynomials increases.
Multiplying both sides of Eq. (3-19) by $F_k$, integrating over the pupil, and using the
orthonormality Eq. (3-1), we obtain

$$\langle F_k F_l \rangle = \delta_{kl} = \sum_{j=1}^{J} M_{lj} \langle Z_j F_k \rangle , \tag{3-20}$$

where, for example, $\langle Z_j F_k \rangle$ represents the inner product of the Zernike
polynomial $Z_j$ and the orthonormal polynomial $F_k$ over the pupil, i.e.,

$$\langle Z_j F_k \rangle = \frac{1}{A} \int_{\text{pupil}} Z_j(x, y)\, F_k(x, y)\, dx\, dy . \tag{3-21}$$

Equation (3-20) can be written in a matrix form as

$$M C^{ZF} = 1 , \tag{3-22}$$

where $C^{ZF}$ is a $J \times J$ matrix of the inner products between the Zernike polynomials
$Z_j$ and the orthonormal polynomials $F_k$. The elements of this matrix are given by

$$\langle Z_k F_i \rangle = \sum_{j=1}^{J} M_{ij} \langle Z_j Z_k \rangle = \sum_{j=1}^{J} \langle Z_k Z_j \rangle \left[ M_{ij} \right]^{T} , \tag{3-23}$$

where, for example, $\left[ M_{ij} \right]^{T}$ is the transpose of the matrix with elements
$M_{ij}$ (obtained by interchanging the rows and columns of the matrix M). Equation (3-23)
can be written in the matrix form as

$$C^{ZF} = C^{ZZ} M^{T} , \tag{3-24}$$

where $C^{ZZ}$ is a $J \times J$ symmetric matrix of inner products of the first J Zernike circle
polynomials between themselves. Substituting Eq. (3-24) into Eq. (3-22), we obtain

$$M C^{ZZ} M^{T} = 1 . \tag{3-25}$$

Letting

$$M = \left( Q^{T} \right)^{-1} , \tag{3-26}$$

where $Q^{T}$ is the transpose of the matrix Q, Eq. (3-25) reduces to

$$Q^{T} Q = C^{ZZ} . \tag{3-27}$$

Solving Eq. (3-27) for the matrix Q, the conversion matrix M can be obtained from Eq.
(3-26). While the matrix M is lower triangular, the matrix Q is upper triangular.
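
One way to solve Eq. (3-27) for an upper triangular Q is a Cholesky factorization of $C^{ZZ}$. The sketch below assumes this choice, NumPy's `cholesky` routine (which returns the lower triangular factor), a square unit pupil sampled on a grid, and the first six circle polynomials; it is an illustration of Eqs. (3-19) and (3-25) through (3-27), not a reproduction of the procedure of Ref. 2.

```python
import numpy as np

h = 1 / np.sqrt(2)                           # half width of the unit square pupil
n = 201
x, y = np.meshgrid(np.linspace(-h, h, n), np.linspace(-h, h, n))
x, y = x.ravel(), y.ravel()
rho2 = x**2 + y**2

# First six circle polynomials (assumed order: piston, tilts, defocus, astigmatisms).
Z = np.array([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3) * (2 * rho2 - 1),
              2 * np.sqrt(6) * x * y, np.sqrt(6) * (x**2 - y**2)])

C_ZZ = Z @ Z.T / x.size                      # inner products <Z_j Z_k> over the pupil
L = np.linalg.cholesky(C_ZZ)                 # C_ZZ = L L^T with L lower triangular
Q = L.T                                      # Q^T Q = C_ZZ, Eq. (3-27); Q is upper triangular
M = np.linalg.inv(Q.T)                       # Eq. (3-26): M = (Q^T)^(-1)

F = M @ Z                                    # Eq. (3-19): F_l = sum over i of M_li Z_i
gram = F @ F.T / x.size                      # M C_ZZ M^T, Eq. (3-25)
```

Here `M` comes out lower triangular and `gram` is the identity matrix to rounding error, so the rows of `F` are orthonormal over the sampled square pupil without any recursion.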

3.5 UNIT PUPIL


When considering the aberrations of a circular pupil of radius a, we normalize the
radial coordinate r by defining $\rho = r/a$. Thus, $0 \le r \le a$, but $0 \le \rho \le 1$. This
normalization has the advantage that the coefficient of a classical aberration
$\rho^{n}\cos^{m}\theta$ (see Section 2.6) represents its peak value. This value occurs at the
point where the x axis intersects the circle. At this point, $\rho$ has its maximum value of
unity and the value of $\theta$ is zero, giving a maximum value of unity for $\cos\theta$. For
example, the coefficient $A_s$ of the primary spherical aberration $A_s\rho^{4}$ represents the
peak value of the aberration. Indeed, when $A_s = 1\lambda$, we speak of one wave of spherical
aberration. The same is true of primary coma $A_c\rho^{3}\cos\theta$, where $A_c$ represents its
peak value. Similarly, we define a unit pupil such that the distance of the farthest point
from its center is unity. Figure 3-1 shows the noncircular pupils considered in this book.
The outer radius of an annular pupil is unity, as in Figure 3-1a. The corners of the hexagon
in Figure 3-1b lie at a distance of unity. Figure 3-1c illustrates an ellipse with an aspect
ratio of b, and its semimajor axis has a length of unity. For each of these pupils, the
coefficient of a classical aberration represents its peak value. Figure 3-1d shows a
rectangle with a half width a and its corners at a distance of unity from its center.
Similarly, Figure 3-1e shows a square of half width $1/\sqrt{2}$ so that its corners are also
at a distance of unity from its center. In these two cases, while $\rho$ has its maximum value
of unity at a corner, the value of $\cos\theta$ at that point is not unity. Hence, in these
cases, the coefficient of a classical aberration does not represent its peak value. In the
case of a rectangle, the value of $\cos\theta$ depends on the value of a, but in the case of a
square its value is $1/\sqrt{2}$. For example, coma has a peak value of $A_c/\sqrt{2}$ at a
corner. Finally, a unit slit pupil with a half width of unity is shown in Figure 3-1f. The
value of a coefficient of a classical aberration in this case does represent its peak value.
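
The peak values over a square unit pupil can be checked with a brief numerical sketch. It assumes NumPy, a grid that includes the corners of the square, and unit aberration coefficients; the variable names are illustrative only.

```python
import numpy as np

h = 1 / np.sqrt(2)                           # half width of the unit square pupil
t = np.linspace(-h, h, 201)
x, y = np.meshgrid(t, t)

coma = (x**2 + y**2) * x                     # rho^3 cos(theta) with unit coefficient
spherical = (x**2 + y**2) ** 2               # rho^4 with unit coefficient

coma_peak = coma.max()
i, j = np.unravel_index(np.argmax(coma), coma.shape)
peak_point = (x[i, j], y[i, j])              # location of the coma peak
spherical_peak = spherical.max()
```

For the square, `coma_peak` equals $1/\sqrt{2}$ and is found at a corner, whereas `spherical_peak` is unity because $\rho^{4}$ has no angular dependence.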

3.6 SUMMARY
The content of an aberration function can be determined by expanding it in terms of a
complete set of polynomials that are orthogonal over its domain and have the form of
familiar aberrations, such as those discussed in Chapter 2. The Zernike circle
polynomials, for example, are not only orthogonal over a circular pupil, but they also
represent balanced classical aberrations, as discussed in Chapter 4. It is advantageous to
use the polynomials in their orthonormal form so that the piston coefficient represents the
mean value of the aberration function and the other expansion coefficients represent the
standard deviations of the corresponding polynomial aberration terms. As illustrated by
Eq. (3-5), the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Moreover, each coefficient yields a least-squares fit to
the aberration function. The variance of the aberration function is given by the sum of the
squares of the coefficients (other than the piston), as in Eq. (3-8).

Figure 3-1. Unit pupils inscribed inside a unit circle: (a) annulus of obscuration ratio
$\epsilon$, (b) hexagon with a side of unity, (c) ellipse of aspect ratio b, (d) rectangle of
half width a, (e) square of half width $1/\sqrt{2}$, and (f) slit of half width of unity.


Given a set of polynomials that are orthonormal over a certain domain, those that are
orthonormal over another domain can be obtained from them by the recursive Gram–
Schmidt orthonormalization process. They can also be obtained by a nonrecursive matrix
approach. Each new polynomial obtained is a linear combination of the basis
polynomials, as indicated by Eq. (3-18). We use the Zernike circle polynomials as the
basis functions to obtain the polynomials that are orthonormal over an annular, Gaussian,
hexagonal, elliptical, rectangular, or a square pupil. The slit pupil is a limiting case of a
rectangular pupil whose one dimension is negligibly small compared to the other. The
concept of a unit pupil is emphasized so that the farthest point or points on a pupil are at a
distance of unity from its center. It has the advantage that the coefficient of a single
aberration term represents its peak value. Thus, in each case the pupil is inscribed inside a
unit circle.

References

1. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers
(McGraw-Hill, New York, 1968).

2. G.-m. Dai and V. N. Mahajan, “Nonrecursive orthonormal polynomials with
matrix formulation,” Opt. Lett. 32, 74–76 (2007).
CHAPTER 4

SYSTEMS WITH CIRCULAR PUPILS

4.1 Introduction
4.2 Pupil Function
4.3 Aberration-Free Imaging
    4.3.1 PSF
    4.3.2 OTF
4.4 Strehl Ratio and Aberration Tolerance
    4.4.1 Strehl Ratio
    4.4.2 Defocus Strehl Ratio
    4.4.3 Approximate Expressions for Strehl Ratio
4.5 Balanced Aberrations
4.6 Description of Zernike Circle Polynomials
    4.6.1 Analytical Form
    4.6.2 Circle Polynomials in Polar Coordinates
    4.6.3 Polynomial Ordering
    4.6.4 Number of Circle Polynomials through a Certain Order n
    4.6.5 Relationships among the Indices n, m, and j
    4.6.6 Uniqueness of Circle Polynomials
    4.6.7 Circle Polynomials in Cartesian Coordinates
4.7 Zernike Circle Coefficients of a Circular Aberration Function
4.8 Symmetry Properties of Images Aberrated by a Circle Polynomial Aberration
    4.8.1 Symmetry of PSF
    4.8.2 Symmetry of OTF
4.9 Isometric, Interferometric, and Imaging Characteristics of Circle Polynomial Aberrations
    4.9.1 Isometric Characteristics
    4.9.2 Interferometric Characteristics
    4.9.3 PSF Characteristics
    4.9.4 OTF Characteristics
4.10 Circle Polynomials and Their Relationships with Classical Aberrations
    4.10.1 Introduction
    4.10.2 Wavefront Tilt and Defocus
    4.10.3 Astigmatism
    4.10.4 Coma
    4.10.5 Spherical Aberration
    4.10.6 Seidel Coefficients from Zernike Coefficients
    4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
4.11 Zernike Coefficients of a Scaled Pupil
    4.11.1 Theory
    4.11.2 Application to a Seidel Aberration Function
    4.11.3 Numerical Example
4.12 Summary
References

Chapter 4
Systems with Circular Pupils
4.1 INTRODUCTION
Optical systems generally have a circular pupil. The imaging elements of such
systems also have a circular boundary. Therefore, they are also represented by circular
pupils in fabrication and testing. As a result, the Zernike circle polynomials have been in
widespread use since Zernike introduced them in his phase contrast method for testing
circular mirrors [1]. They are used in optical design and testing to understand the
aberration content of a wavefront. They have also been used for analyzing the wavefront
aberrations introduced by atmospheric turbulence on a wave propagating through it [2].

We start this chapter with a brief discussion of the point-spread function (PSF) and
the optical transfer function (OTF) of an aberration-free system with a circular pupil. We
then consider the effect of primary aberrations on the Strehl ratio of an image. Since the
Strehl ratio for small aberrations depends on the variance of an aberration, we balance a
classical aberration of a certain order with those of lower orders to reduce its variance.
The utility of the Zernike circle polynomials stems from the fact that they are not only
orthogonal over a circular pupil, but they also uniquely represent the balanced classical
aberrations yielding minimum variance over the pupil [3–6]. Because of their
orthogonality, when a circular wavefront is expanded in terms of them, the value of a
Zernike expansion coefficient is independent of the number of polynomials used in the
expansion. Hence, one or more polynomial terms can be added or subtracted without
affecting the other coefficients. The piston coefficient represents the mean value of the
aberration function, and the variance of the function is given simply by the sum of the
squares of the other expansion coefficients.

Given the m-fold symmetry of a Zernike polynomial aberration, we discuss the
symmetry of its interferogram, the corresponding aberrated PSF, the real and imaginary
parts of the OTF, and the modulation transfer function (MTF). It is shown that the
interferogram, the real part of the OTF, and the corresponding MTF are 2m-fold whether
m is an even or an odd integer, but the PSF and the imaginary part of the OTF are m-fold
when m is odd. Numerical examples are given to illustrate the Zernike aberrations
isometrically, interferometrically, and by the corresponding PSFs, OTFs, and MTFs.

Relationships between the coefficients of a power series expansion of an aberration
function and the corresponding Zernike expansion coefficients are considered. In
particular, we discuss how to obtain the Seidel coefficients from the Zernike coefficients
of an aberration function. We illustrate by an example how wrong Seidel coefficients are
obtained when using only the corresponding Zernike polynomials. Finally, we show how
the Zernike coefficients of an aberration function over a circular pupil change as its
diameter is reduced.


4.2 PUPIL FUNCTION


Consider an imaging system with a circular exit pupil of radius $a$, diameter $D = 2a$,
and area $S_{ex} = \pi a^2$ lying in the pupil plane $(x_p, y_p)$ with $z$ as its optical axis. The
Cartesian and polar coordinates $(x_p, y_p)$ and $(r_p, \theta)$ of a pupil point $Q$, as illustrated in
Figure 4-1, are related to each other according to

$$(x_p, y_p) = r_p (\cos\theta, \sin\theta), \quad 0 \le r_p \le a, \quad 0 \le \theta \le 2\pi . \tag{4-1}$$

Using a normalized radial variable $\rho = r_p / a$, we may write

$$(x_p, y_p) = a\rho (\cos\theta, \sin\theta), \quad 0 \le \rho \le 1 . \tag{4-2}$$

We refer to the pupil in the $(\rho, \theta)$ coordinates as a unit circular pupil in the sense of a unit
disc. For a uniformly illuminated pupil with an aberration function $\Phi(\vec{r}_p)$ and power $P_{ex}$
exiting from it, the pupil function of the system can be written

$$P(\vec{r}_p) = \begin{cases} A(\vec{r}_p) \exp[i\Phi(\vec{r}_p)] , & |\vec{r}_p| \le a \\ 0 , & \text{otherwise} , \end{cases} \tag{4-3}$$

where

$$A(\vec{r}_p) = (P_{ex}/S_{ex})^{1/2} \tag{4-4}$$

is the uniform amplitude across the circular pupil.

Figure 4-1. (a) Circular exit pupil of radius $a$ of an imaging system. (b) Circular
pupil as a unit disc. The polar coordinates of a point $Q$ are $(r_p, \theta)$ in (a) and $(\rho, \theta)$
in (b).

4.3 ABERRATION-FREE IMAGING


4.3.1 PSF
Using polar coordinates $(r_i, \theta_i)$ for an observation point in Eq. (2-9), the PSF
representing the irradiance distribution in the image plane for a circular pupil can be
written

$$I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i)] \, \rho \, d\rho \, d\theta \right|^2 , \tag{4-5}$$

where $r = r_i / \lambda F$, $F = R/D$ is the focal ratio of the image-forming light cone, $\Phi(\rho, \theta)$ is
the phase aberration at a point $(\rho, \theta)$ in the pupil plane, and the irradiance is normalized
by the aberration-free central value $P_{ex} S_{ex} / \lambda^2 R^2 = \pi P_{ex} / 4 \lambda^2 F^2$.

For an aberration-free system, i.e., for a spherical wavefront exiting from the pupil so
that $\Phi(\rho, \theta) = 0$, Eq. (4-5) reduces to

$$I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[-\pi i r \rho \cos(\theta_p - \theta_i)] \, \rho \, d\rho \, d\theta_p \right|^2 . \tag{4-6}$$

Noting that

$$\int_0^{2\pi} \exp(i x \cos\alpha) \, d\alpha = 2\pi J_0(x) , \tag{4-7}$$

where $J_0(\cdot)$ is the zero-order Bessel function of the first kind, Eq. (4-6) reduces to

$$I(r) = 4 \left[ \int_0^1 J_0(\pi r \rho) \, \rho \, d\rho \right]^2 . \tag{4-8}$$

Noting further that

$$\int_0^a x J_0(bx) \, dx = \frac{a}{b} J_1(ab) , \tag{4-9}$$

where $J_1(\cdot)$ is the first-order Bessel function of the first kind, Eq. (4-8) yields

$$I(r) = \left[ \frac{2 J_1(\pi r)}{\pi r} \right]^2 . \tag{4-10}$$

Integrating the PSF over a circle of radius $r_c$ (in units of $\lambda F$), it can be shown that the
circle contains a fractional power given by

$$P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c) . \tag{4-11}$$

Figure 4-2 shows a plot of Eq. (4-10), called the Airy pattern. It consists of a bright
spot at the center, called the Airy disc, surrounded by dark and bright diffraction rings.
The fractional power is also plotted in Figure 4-2a. The Airy disc has a radius of 1.22 (in
units of $\lambda F$) and contains 83.8% of the total power, as may be seen by letting $r_c = 1.22$
in Eq. (4-11). The center of the pattern lies at the Gaussian image point.
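
The Airy pattern and its encircled power can be evaluated directly. The following is a minimal Python sketch, not taken from the text, that uses only the standard library. The helper `bessel_j` is an assumption of this example: it evaluates $J_n(x)$ from Bessel's integral with a trapezoidal rule. The function names `airy_irradiance` and `encircled_power` are likewise illustrative.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin t) dt."""
    h = math.pi / steps
    # Trapezoidal rule: half-weight end points at t = 0 and t = pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def airy_irradiance(r):
    """Eq. (4-10): I(r) = [2 J1(pi r) / (pi r)]^2, with r in units of lambda*F."""
    if r == 0.0:
        return 1.0  # limiting value at the center of the pattern
    x = math.pi * r
    return (2.0 * bessel_j(1, x) / x) ** 2

def encircled_power(rc):
    """Eq. (4-11): fractional power inside a circle of radius rc."""
    x = math.pi * rc
    return 1.0 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2

if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 1.22, 2.0):
        print(f"r = {r:4.2f}  I = {airy_irradiance(r):.5f}  P = {encircled_power(r):.4f}")
```

Evaluating `encircled_power(1.22)` should reproduce the fractional power of the Airy disc quoted above to about three decimal places.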

Figure 4-2. (a) Irradiance $I(r)$ and encircled power $P(r_c)$ distributions for an
aberration-free system with a circular pupil. (b) 2D PSF, called the Airy pattern.

4.3.2 OTF
From Eq. (2-11), the aberration-free OTF can be written

$$\tau(\vec{\nu}_i) = P_{ex}^{-1} \int A(\vec{r}_p) \, A(\vec{r}_p - \lambda R \vec{\nu}_i) \, d\vec{r}_p . \tag{4-12}$$

It is evident that the OTF represents the fractional area of overlap of two circles, each of
radius $a$, separated by a distance $\lambda R \nu_i$, where $\nu_i = |\vec{\nu}_i|$. From Figure 4-3, we note that the
area of overlap is given by four times the difference between the area of a sector of radius
$a$ and cone angle $\beta$, and the area of the triangle OAB. Hence, the OTF can be written

$$\tau(\nu_i) = \frac{4}{S_{ex}} \left( \frac{\beta}{2\pi} \pi a^2 - \frac{1}{2} OA \cdot AB \right) . \tag{4-13}$$

Substituting $OA = a \cos\beta$, $AB = a \sin\beta$, and $\cos\beta = \lambda R \nu_i / 2a = \lambda F \nu_i = \nu$ into
Eq. (4-13), we obtain

$$\tau(\nu_i) = \frac{2}{\pi} (\beta - \sin\beta \cos\beta) \tag{4-14}$$

$$= \frac{2}{\pi} \left[ \cos^{-1}\nu - \nu (1 - \nu^2)^{1/2} \right] , \quad 0 \le \nu \le 1 . \tag{4-15}$$

Here, $\nu = \cos\beta$ is a spatial frequency normalized by the cutoff frequency $\nu_c = 1/\lambda F$ at
which the overlap area reduces to zero. The OTF is radially symmetric because the
overlap area depends only on the separation $\lambda R \nu_i$ of the two pupils and is independent of
the direction of $\vec{\nu}_i$.

Figure 4-3. Aberration-free OTF as the fractional area of overlap of two circles of
radius $a$ whose centers are separated by a distance $\lambda R \nu_i$.

Figure 4-4 shows how the OTF varies with $\nu$. The integral of the aberration-free
OTF that enters into the calculation of the Strehl ratio from the real part of the complex
aberrated OTF [see Eq. (2-25)] is given by

$$\int_0^1 \tau(\nu) \, \nu \, d\nu = 1/8 . \tag{4-16}$$

The slope of the OTF at the origin is given by

$$\tau'(0) = -4/\pi . \tag{4-17}$$

Although obtained from the aberration-free OTF, this slope is independent of any
aberration.
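
The closed-form OTF and the two results that follow from it can be checked numerically. The sketch below is an illustration only: the function names are chosen for this example, the integral of Eq. (4-16) is estimated with a midpoint rule, and the slope of Eq. (4-17) with a forward difference.

```python
import math

def otf(nu):
    """Eq. (4-15): aberration-free OTF of a circular pupil, nu normalized by the cutoff."""
    if nu >= 1.0:
        return 0.0  # no pupil overlap beyond the cutoff frequency
    return (2.0 / math.pi) * (math.acos(nu) - nu * math.sqrt(1.0 - nu * nu))

def otf_moment(n=20000):
    """Midpoint-rule estimate of the integral of tau(nu) * nu over 0 <= nu <= 1, Eq. (4-16)."""
    h = 1.0 / n
    return sum(otf((k + 0.5) * h) * (k + 0.5) * h for k in range(n)) * h

def otf_slope_at_origin(h=1e-6):
    """Forward-difference estimate of the slope at nu = 0, Eq. (4-17)."""
    return (otf(h) - otf(0.0)) / h

if __name__ == "__main__":
    print("integral of tau*nu :", otf_moment(), "(Eq. 4-16 gives 1/8)")
    print("slope at the origin:", otf_slope_at_origin(), "(Eq. 4-17 gives -4/pi)")
```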

4.4 STREHL RATIO AND ABERRATION TOLERANCE

4.4.1 Strehl Ratio


Letting $r = 0$ in Eq. (4-5) for the irradiance distribution normalized by its aberration-
free central value, we obtain the Strehl ratio of an aberrated image:

$$S = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \, \rho \, d\rho \, d\theta \right|^2 . \tag{4-18}$$

Figure 4-4. Aberration-free OTF $\tau$ as a function of normalized spatial frequency $\nu$.

4.4.2 Defocus Strehl Ratio


Consider an observation being made in an image plane passing through a point $P_1$ at
a distance $z$ from the exit pupil of a system, while a beam with a spherical wavefront $W$
is focused at a point $P_2$ at a distance $R$, as illustrated in Figure 1-6. The spherical
wavefront is aberrated with respect to the reference sphere $S$ of radius of curvature $z$ due
to the longitudinal defocus $z - R$. The defocus aberration may be written

$$\Phi(\rho) = B_d \rho^2 , \tag{4-19}$$

where the peak value $B_d$ of the phase aberration is related to the longitudinal defocus
according to

$$B_d \simeq -\frac{\pi}{4 \lambda F^2} (z - R) . \tag{4-20}$$

A positive value of the defocus aberration is introduced when an observation is made at a
distance $z < R$, as in Figure 1-6. Substituting Eq. (4-19) into Eq. (4-18), we obtain the
Strehl ratio of the defocused image:

$$S = \left[ \frac{\sin(B_d/2)}{B_d/2} \right]^2 . \tag{4-21}$$

The Strehl ratio decreases as the aberration increases until it reaches a value of zero
when the aberration becomes $2\pi$ radians or one wave. As shown in Figure 4-5, it
fluctuates for increasing value of defocus, becoming zero when the aberration is an
integral number of waves. It should be evident that the defocused Strehl ratio represents
the axial irradiance of a focused beam.
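
A short Python sketch can illustrate the defocus Strehl ratio. It is an illustration under stated assumptions rather than part of the text: the argument `waves` is the peak defocus in wavelengths, so that $B_d = 2\pi \times$ `waves` radians, and the numerical cross-check uses the substitution $u = \rho^2$, which turns the double integral of Eq. (4-18) into $\int_0^1 \exp(i B_d u)\, du$.

```python
import cmath
import math

def defocus_strehl(waves):
    """Eq. (4-21), with the peak defocus given in waves (B_d = 2*pi*waves radians)."""
    half = math.pi * waves  # B_d / 2
    if half == 0.0:
        return 1.0
    return (math.sin(half) / half) ** 2

def defocus_strehl_numeric(waves, n=2000):
    """Eq. (4-18) for Phi = B_d * rho^2, reduced to a 1D midpoint rule in u = rho^2."""
    b_d = 2.0 * math.pi * waves
    h = 1.0 / n
    integral = sum(cmath.exp(1j * b_d * (k + 0.5) * h) for k in range(n)) * h
    return abs(integral) ** 2

if __name__ == "__main__":
    for w in (0.0, 0.25, 0.5, 1.0, 1.5, 2.0):
        print(f"{w:4.2f} waves: S = {defocus_strehl(w):.4f}  (numeric {defocus_strehl_numeric(w):.4f})")
```

The printed values fall to zero at integral numbers of waves, in line with the behavior described above.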

Figure 4-5. Strehl ratio $S$ of a defocused beam, representing its axial irradiance,
where $B_d$ is the defocus aberration in units of wavelength.

4.4.3 Approximate Expressions for Strehl Ratio


The approximate expressions for the Strehl ratio when the aberration is small are
given by Eqs. (2-31)–(2-33), i.e.,

$$S_1 \sim (1 - \sigma_\Phi^2 / 2)^2 , \tag{4-22a}$$

$$S_2 \sim 1 - \sigma_\Phi^2 , \tag{4-22b}$$

and

$$S_3 \sim \exp(-\sigma_\Phi^2) , \tag{4-22c}$$

where

$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 \tag{4-23}$$

is the variance of the phase aberration across the pupil. The mean and the mean square
values of the aberration are obtained from the expression

$$\langle \Phi^n \rangle = \pi^{-1} \int_0^1 \int_0^{2\pi} \Phi^n(\rho, \theta) \, \rho \, d\rho \, d\theta \tag{4-24}$$

with $n = 1$ and 2, respectively.

Table 4-1 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or a
Seidel) aberration, where an aberration coefficient $A_i$ represents the peak value of the
aberration. It also lists the aberration tolerance, i.e., the value of the aberration coefficient
$A_i$, for a Strehl ratio of 0.8. This tolerance has been obtained by using the Strehl ratio
expression $S_2$, according to which the standard deviation for a Strehl ratio of 0.8 is given
by

$$\sigma_\Phi = \sqrt{0.2} \tag{4-25}$$

or

$$\sigma_w = (\lambda / 2\pi) \sqrt{0.2} = 0.07 \lambda = \lambda / 14.05 , \tag{4-26}$$

where $\sigma_w$ is the sigma value of the wave aberration. The aberration tolerance listed in
Table 4-1 is for the wave (as opposed to the phase) aberration coefficient, as is customary
in optics. It should be understood that the tolerance numbers given are not accurate to the
second decimal place. They are listed as such for consistency only. We have used the
symbol $A_d$ for the coefficient of field curvature aberration, which varies quadratically
with the angle that a point object makes with the optical axis of the system. However, to
avoid confusion, we have used the symbol $B_d$ for representing the defocus wave
aberration, which is independent of the field angle but has the same dependence on pupil
coordinates as field curvature. Similarly, we have used the symbol $A_t$ for distortion,
which varies as the cube of the field angle. But, we will use the symbol $B_t$ to represent
the wavefront tilt, which is independent of the field angle but has the same dependence on
pupil coordinates as distortion.

Table 4-1. Standard deviation and aberration tolerance for primary aberrations.

| Aberration | $\Phi(\rho, \theta)$ | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|
| Spherical | $A_s \rho^4$ | $2A_s / 3\sqrt{5} = A_s / 3.35$ | $\lambda / 4.19$ |
| Coma | $A_c \rho^3 \cos\theta$ | $A_c / 2\sqrt{2} = A_c / 2.83$ | $\lambda / 4.96$ |
| Astigmatism | $A_a \rho^2 \cos^2\theta$ | $A_a / 4$ | $\lambda / 3.51$ |
| Field curvature (defocus) | $A_d \rho^2$ | $A_d / 2\sqrt{3} = A_d / 3.46$ | $\lambda / 4.06$ |
| Distortion (tilt) | $A_t \rho \cos\theta$ | $A_t / 2$ | $\lambda / 7.03$ |
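
The standard deviations and tolerances of the primary aberrations can be reproduced by integrating Eqs. (4-23) and (4-24) numerically. The following Python sketch is illustrative only: the midpoint-rule grid sizes and the names `sigma`, `PRIMARY`, and `tolerance_in_waves` are assumptions of this example, and each aberration is taken with a unit peak coefficient.

```python
import math

def sigma(aberration, n_rho=400, n_theta=64):
    """Standard deviation over the unit disc from Eqs. (4-23) and (4-24), midpoint rule."""
    mean = mean_sq = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) / n_rho
        for k in range(n_theta):
            theta = 2.0 * math.pi * (k + 0.5) / n_theta
            w = aberration(rho, theta)
            mean += w * rho
            mean_sq += w * w * rho
    # Cell area d(rho)*d(theta) divided by pi, the area of the unit disc.
    scale = (1.0 / n_rho) * (2.0 * math.pi / n_theta) / math.pi
    mean, mean_sq = mean * scale, mean_sq * scale
    return math.sqrt(mean_sq - mean * mean)

# Primary aberrations with unit peak coefficient A_i = 1.
PRIMARY = {
    "spherical": lambda rho, theta: rho ** 4,
    "coma": lambda rho, theta: rho ** 3 * math.cos(theta),
    "astigmatism": lambda rho, theta: rho ** 2 * math.cos(theta) ** 2,
    "field curvature": lambda rho, theta: rho ** 2,
    "distortion": lambda rho, theta: rho * math.cos(theta),
}

SIGMA_W_TOLERANCE = math.sqrt(0.2) / (2.0 * math.pi)  # Eq. (4-26), in waves

def tolerance_in_waves(name):
    """Peak coefficient A_i (in waves) for which S2 of Eq. (4-22b) equals 0.8."""
    return SIGMA_W_TOLERANCE / sigma(PRIMARY[name])

if __name__ == "__main__":
    for name in PRIMARY:
        s = sigma(PRIMARY[name])
        print(f"{name:16s} sigma = A/{1.0 / s:.2f}   tolerance = lambda/{1.0 / tolerance_in_waves(name):.2f}")
```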

4.5 BALANCED ABERRATIONS


The variance of a primary aberration can be reduced by observing the image in a
defocused image plane, i.e., by mixing it with defocus aberration. Thus, for example, we
balance primary spherical aberration with defocus aberration and write it as

$$\Phi(\rho) = A_s \rho^4 + B_d \rho^2 . \tag{4-27}$$

The defocus aberration is introduced by making an observation in a plane at a distance $z$,
as discussed in Section 4.4.2. The mean and the mean square value of the aberration
function are given by

$$\langle \Phi \rangle = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} (A_s \rho^4 + B_d \rho^2) \, \rho \, d\rho \, d\theta = \frac{A_s}{3} + \frac{B_d}{2} \tag{4-28}$$

and

$$\langle \Phi^2 \rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_s B_d}{2} . \tag{4-29}$$

Accordingly, the aberration variance is given by

$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 = \frac{4 A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_s B_d}{6} . \tag{4-30}$$

The value of defocus $B_d$ yielding minimum variance is obtained by letting

$$\frac{\partial \sigma_\Phi^2}{\partial B_d} = 0 , \tag{4-31}$$

and checking that it yields a minimum and not a maximum. Thus, we find that the
optimum value is $B_d = -A_s$, and the balanced aberration is given by

$$\Phi_{bs}(\rho) = A_s (\rho^4 - \rho^2) . \tag{4-32}$$

Its standard deviation or sigma value is $A_s / (6\sqrt{5})$, which is a factor of 4 smaller than the
corresponding value $2A_s / (3\sqrt{5})$ for $B_d = 0$. Since the sigma value has been reduced by a
factor of 4, its tolerance has been increased by the same factor. For example, $S = 0.8$ is
obtained in the Gaussian image plane for $A_s = \lambda/4$. However, the same Strehl ratio is
obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that $B_d = -\lambda$.
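
The balancing step can be mimicked numerically. The sketch below is a hedged illustration, not the author's procedure: it evaluates the variance of Eq. (4-30) and locates its minimum by a brute-force scan over $B_d$ instead of the derivative condition of Eq. (4-31). The function names and the scan range are assumptions of this example.

```python
import math

def variance(a_s, b_d):
    """Eq. (4-30): variance of A_s*rho^4 + B_d*rho^2 over the unit disc."""
    return 4.0 * a_s ** 2 / 45.0 + b_d ** 2 / 12.0 + a_s * b_d / 6.0

def best_defocus(a_s, lo=-3.0, hi=3.0, steps=6001):
    """Scan B_d between lo*a_s and hi*a_s and return the value with the smallest variance."""
    candidates = (a_s * (lo + (hi - lo) * k / (steps - 1)) for k in range(steps))
    return min(candidates, key=lambda b_d: variance(a_s, b_d))

if __name__ == "__main__":
    a_s = 1.0
    b_opt = best_defocus(a_s)
    gain = math.sqrt(variance(a_s, 0.0) / variance(a_s, b_opt))
    print("optimum B_d / A_s        :", b_opt / a_s)
    print("reduction factor in sigma:", gain)
```

For any nonzero `a_s`, the scan should land on $B_d = -A_s$ and the sigma value should drop by a factor of 4, matching the analytical result.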

Similarly, we balance astigmatism with defocus and coma with tilt. Table 4-2 lists
the form of a balanced primary aberration, its standard deviation, and its tolerance for a
Strehl ratio of 0.8, according to Eq. (4-22b). Also listed in the table is the location of the
diffraction focus, i.e., the point with respect to which the aberration variance is minimum
so that the Strehl ratio is maximum at it. The amount of balancing defocus is minus half
the amount of astigmatism, or the diffraction focus lies at a distance $4F^2 A_a$ along the $z$
axis. The balancing tilt is minus two-thirds the amount of the coma. Thus, the maximum
Strehl ratio is obtained at a point that is displaced from the Gaussian image point by
$4FA_c/3$ but lies in the Gaussian image plane.

Table 4-2. Balanced primary aberrations and corresponding diffraction focus,
standard deviation, and aberration tolerance.

| Balanced Aberration | $\Phi(\rho, \theta)$ | Diffraction Focus* | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|---|
| Spherical | $A_s (\rho^4 - \rho^2)$ | $(0, 0, 8F^2 A_s)$ | $A_s / 6\sqrt{5}$ | $0.955\lambda$ |
| Coma | $A_c (\rho^3 - 2\rho/3) \cos\theta$ | $(4FA_c/3, 0, 0)$ | $A_c / 6\sqrt{2}$ | $0.604\lambda$ |
| Astigmatism | $A_a \rho^2 (\cos^2\theta - 1/2) = (A_a/2) \rho^2 \cos 2\theta$ | $(0, 0, 4F^2 A_a)$ | $A_a / 2\sqrt{6}$ | $0.349\lambda$ |

*The diffraction focus coordinates are relative to the Gaussian image point.

For primary aberrations, $S_1$ and $S_2$ underestimate the true Strehl ratio $S$. $S_3$ gives a
better approximation for the true Strehl ratio than $S_1$ and $S_2$. The reason is that, for small
values of $\sigma_w$, it is larger than $S_1$ by approximately $\sigma_\Phi^4/4$. Of course, $S_1$ is larger than $S_2$
by $\sigma_\Phi^4/4$. The expression $S_3$ underestimates the true Strehl ratio only for coma and
astigmatism; it overestimates for the other aberrations. Numerical analysis shows that the
error, defined as $100(1 - S_3/S)$, is < 10% for $S > 0.3$ [5,7].
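
The ordering of the three approximations and the quoted differences of $\sigma_\Phi^4/4$ follow from expanding Eqs. (4-22a)–(4-22c) in powers of $\sigma_\Phi^2$. A minimal Python sketch (the function name is chosen for this example) makes the comparison concrete:

```python
import math

def strehl_approximations(sigma_phi):
    """Return (S1, S2, S3) of Eqs. (4-22a)-(4-22c) for a phase sigma in radians."""
    v = sigma_phi ** 2  # phase variance
    return (1.0 - v / 2.0) ** 2, 1.0 - v, math.exp(-v)

if __name__ == "__main__":
    for s in (0.1, 0.2, 0.3, 0.447):
        s1, s2, s3 = strehl_approximations(s)
        print(f"sigma = {s:5.3f}  S1 = {s1:.5f}  S2 = {s2:.5f}  S3 = {s3:.5f}  "
              f"S1-S2 = {s1 - s2:.6f}  S3-S1 = {s3 - s1:.6f}  sigma^4/4 = {s ** 4 / 4.0:.6f}")
```

The difference $S_1 - S_2$ equals $\sigma_\Phi^4/4$ exactly, while $S_3 - S_1$ approaches it only as $\sigma_\Phi$ becomes small.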

Rayleigh [8] showed that a quarter-wave of primary spherical aberration reduces the
irradiance at the Gaussian image point by 20%, i.e., the Strehl ratio for this aberration is
0.8. This result gave rise to Rayleigh's $\lambda/4$ rule; namely, that a Strehl ratio of
approximately 0.8 is obtained if the maximum absolute value of the aberration at any
point in the pupil is equal to $\lambda/4$. A variant of this definition is that an aberrated
wavefront that lies between two concentric spheres spaced a quarter-wave apart will give
a Strehl ratio of approximately 0.8. Thus, instead of $W_p = \lambda/4$, we require
$W_{p-v} = \lambda/4$, where $W_p$ is the peak absolute value and $W_{p-v}$ is the peak-to-valley (P-V)
value of the aberration. However, a Strehl ratio of 0.8 is obtained for $W_p = \lambda/4 = W_{p-v}$
for spherical aberration only. For other primary aberrations, distinctly different values of
$W_p$ and $W_{p-v}$ give a Strehl ratio of 0.8 [5,9]. Thus, it is advantageous to use $\sigma_w$ for
estimating the Strehl ratio. A Strehl ratio of $S \gtrsim 0.8$ is obtained for $\sigma_w \lesssim \lambda/14$.

When a certain aberration is balanced with other aberrations to minimize its variance,
the balanced aberration does not necessarily yield a higher or the highest possible Strehl
ratio. For small aberrations, a maximum Strehl ratio is obtained when the variance is
minimum. For large aberrations, however, there is no simple relationship between the
Strehl ratio and the aberration variance. For example [9], when $A_s = 3\lambda$, the optimum
amount of defocus is $B_d = -3\lambda$, but the Strehl ratio is a minimum and equal to 0.12. The
Strehl ratio is maximum and equal to 0.26 for $B_d \simeq -4\lambda$ or $-2\lambda$. For $A_s \lesssim 2.3\lambda$, the
axial irradiance is maximum at a point with respect to which the aberration variance is
minimum. Similarly, in the case of coma, the maximum irradiance in the image plane
occurs at the point with respect to which the aberration variance is minimum only if
$A_c \lesssim 0.7\lambda$, which in turn corresponds to $S \gtrsim 0.76$. For larger values of $A_c$, the
distance of the point of maximum irradiance does not increase linearly with its value and
even fluctuates in some regions [10]. Moreover, it is found that for $A_c > 2.3\lambda$, the Seidel
coma gives a larger Strehl ratio than the balanced coma, i.e., the irradiance in the image
plane at the origin is larger than at the point with respect to which the aberration variance
is minimum. Thus, only for large Strehl ratios, the irradiance is maximum at the point
associated with the minimum aberration variance.

The defocused PSFs are shown in Figure 4-6 to illustrate the zero Strehl ratio for an
integral number of waves of defocus aberration. As an illustration of the improvement in
the Strehl ratio by aberration balancing, Table 4-3 lists the Strehl ratio of a primary
aberration with and without balancing for a quarter wave of aberration. The Strehl ratio
for a quarter wave of defocus is 0.811. As shown in Figure 4-7, the Strehl ratio for a quarter
wave of spherical aberration improves from a value of 0.800 to 0.986 when it is balanced
with an equal and opposite amount of defocus aberration. In the case of coma, a Strehl
ratio of 0.737 is obtained, but a peak of value 0.966 lies to the right of the origin, as
shown in Figure 4-8. When coma is balanced with a wavefront tilt of opposite sign equal
to 2/3 the amount of coma, the peak moves to the origin and the Strehl ratio increases from
0.737 to 0.966. In the case of astigmatism, as shown in Figure 4-9, the Strehl ratio increases
from a value of 0.857 to 0.902 when it is balanced with defocus.

The variance of the secondary spherical aberration ($\rho^6$), secondary coma ($\rho^5 \cos\theta$),
and secondary astigmatism ($\rho^4 \cos^2\theta$) can be reduced similarly by mixing them with
appropriate aberrations of lower order. The secondary spherical aberration is balanced
with primary spherical aberration and defocus to minimize its variance. The balanced
secondary spherical aberration thus obtained is given by

$$\Phi_{bss}(\rho, \theta) = \rho^6 - 1.5\rho^4 + 0.6\rho^2 . \tag{4-33}$$

Similarly, secondary coma is balanced with primary coma and wavefront tilt to minimize
its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsc}(\rho, \theta) = (\rho^5 - 1.2\rho^3 + 0.3\rho) \cos\theta . \tag{4-34}$$

Figure 4-6. PSFs for a quarter-wave and one wave of defocus as a function of $r$ in
units of $\lambda F$. For clarity, the curve for $B_d = 1$ has been multiplied by ten. The
aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is
shown by the solid curve.

Table 4-3. Strehl ratio $S$ for a quarter-wave of a primary aberration with and
without balancing for a circular pupil, i.e., for $B_d = A_a = A_c = A_s = \lambda/4$ and
$0 \le \rho \le 1$.

| Aberration | $S$ |
|---|---|
| Aberration free | 1 |
| Defocus, $B_d \rho^2$ | 0.811 |
| Astigmatism, $A_a \rho^2 \cos^2\theta$ | 0.857 |
| Balanced astigmatism, $A_a \rho^2 (\cos^2\theta - 1/2)$ | 0.902 |
| Coma, $A_c \rho^3 \cos\theta$ | 0.737 |
| Balanced coma, $A_c (\rho^3 - (2/3)\rho) \cos\theta$ | 0.966 |
| Spherical aberration, $A_s \rho^4$ | 0.800 |
| Balanced spherical aberration, $A_s (\rho^4 - \rho^2)$ | 0.986 |
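
The entries of Table 4-3 can be approximated by evaluating the Strehl ratio integral of Eq. (4-18) on a polar grid. The following Python sketch is an illustration under stated assumptions: it uses a simple midpoint rule with grid sizes picked for this example, and it expresses the quarter-wave peak aberration as a phase of $2\pi/4$ radians. The names `strehl` and `CASES` are hypothetical.

```python
import cmath
import math

def strehl(phase, n_rho=200, n_theta=64):
    """Strehl ratio from Eq. (4-18) by midpoint-rule integration over the unit disc."""
    total = 0j
    for i in range(n_rho):
        rho = (i + 0.5) / n_rho
        for k in range(n_theta):
            theta = 2.0 * math.pi * (k + 0.5) / n_theta
            total += cmath.exp(1j * phase(rho, theta)) * rho
    integral = total * (1.0 / n_rho) * (2.0 * math.pi / n_theta)
    return abs(integral / math.pi) ** 2

A = 2.0 * math.pi * 0.25  # quarter wave of peak aberration, as a phase in radians

CASES = {
    "aberration free": lambda r, t: 0.0,
    "defocus": lambda r, t: A * r ** 2,
    "astigmatism": lambda r, t: A * r ** 2 * math.cos(t) ** 2,
    "balanced astigmatism": lambda r, t: A * r ** 2 * (math.cos(t) ** 2 - 0.5),
    "coma": lambda r, t: A * r ** 3 * math.cos(t),
    "balanced coma": lambda r, t: A * (r ** 3 - 2.0 * r / 3.0) * math.cos(t),
    "spherical": lambda r, t: A * r ** 4,
    "balanced spherical": lambda r, t: A * (r ** 4 - r ** 2),
}

if __name__ == "__main__":
    for name, phase in CASES.items():
        print(f"{name:22s} S = {strehl(phase):.3f}")
```

The printed values can be compared row by row with Table 4-3; small differences in the last digit would reflect the coarse quadrature rather than the underlying formula.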

Figure 4-7. PSFs for a quarter-wave of spherical aberration with and without
balancing with equal and opposite amount of defocus. The aberration-free PSF,
representing the Airy pattern with its first zero at 1.22, is shown by the solid curve.

Figure 4-8. PSFs for a quarter-wave of coma along the $x$ axis (in units of $\lambda F$) with
and without the balancing tilt. The aberration-free PSF is shown by the solid curve.

Finally, secondary astigmatism is balanced with primary spherical aberration, primary
astigmatism, and defocus to minimize its variance, and the balanced aberration thus
obtained is given by:

$$\Phi_{bsa}(\rho, \theta) = \rho^4 \cos^2\theta - \frac{1}{2}\rho^4 - \frac{3}{4}\rho^2 \cos^2\theta + \frac{3}{8}\rho^2 = \frac{1}{2}\left(\rho^4 - \frac{3}{4}\rho^2\right) \cos 2\theta . \tag{4-35}$$

Figure 4-9. PSFs for a quarter-wave of astigmatism along the $x$ axis (in units of
$\lambda F$) with and without the balancing defocus. The aberration-free PSF is shown by
the solid curve.

When secondary spherical aberration or secondary coma is balanced with lower-order
aberrations to minimize its variance, it is found [11] that a maximum Strehl
ratio is obtained only if its value comes out to be greater than about 0.5. Otherwise, a
mixture of aberrations yielding a larger-than-minimum possible variance gives a higher
Strehl ratio than the one provided by a minimum-variance mixture.

4.6 DESCRIPTION OF ZERNIKE CIRCLE POLYNOMIALS


4.6.1 Analytical Form
In his phase contrast method for testing the figure of circular mirrors, which he
proposed as an improvement over the Foucault knife-edge test, Zernike introduced his
circle polynomials as eigenfunctions of a second-order differential equation in two
variables [1]. These polynomials, which form a complete orthogonal set for the interior of
a unit circle, are the well-known circle polynomials. Nijboer used these polynomials to
study the balancing of classical aberrations of a power-series expansion of the aberration
function and the effect of small aberrations on the diffraction images formed by
rotationally symmetric imaging systems with circular pupils [2].

The orthonormal form of the circle polynomials may be written

$$Z_n^m(\rho, \theta) = \left[ 2(n+1) / (1 + \delta_{m0}) \right]^{1/2} R_n^m(\rho) \cos m\theta , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-36}$$

where $n$ and $m$ are positive integers including zero, $n - m \ge 0$ and even, and $R_n^m(\rho)$ is a
radial polynomial given by

$$R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n-s)!}{s! \left( \frac{n+m}{2} - s \right)! \left( \frac{n-m}{2} - s \right)!} \, \rho^{n-2s} \tag{4-37}$$

with a degree $n$ in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, $\ldots$, and $\rho^m$. It is clear from Eq. (4-36)
that the circle polynomials are separable in the polar coordinates $\rho$ and $\theta$ of a pupil
point.
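
The radial polynomial sum translates directly into code. The sketch below is a minimal Python illustration of Eq. (4-37); the function names and the dictionary representation of the coefficients are choices made for this example.

```python
from math import factorial

def radial_coefficients(n, m):
    """Coefficients of R_n^m from Eq. (4-37), as {power of rho: integer coefficient}."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n with n - m even")
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)
        coeffs[n - 2 * s] = num // den  # the ratio is always an integer
    return coeffs

def radial(n, m, rho):
    """Evaluate R_n^m(rho) from its coefficients."""
    return sum(c * rho ** p for p, c in radial_coefficients(n, m).items())

if __name__ == "__main__":
    print(radial_coefficients(4, 0))   # coefficients of 6 rho^4 - 6 rho^2 + 1
    print(radial_coefficients(3, 1))   # coefficients of 3 rho^3 - 2 rho
    print(radial(8, 2, 1.0))           # value at the pupil edge
```

The coefficient dictionaries can be compared with the radial parts of the polynomials tabulated later in this section.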

A radial polynomial $R_n^m(\rho)$ is even or odd in $\rho$ depending on whether $n$ (or $m$) is
even or odd. It is normalized such that

$$R_n^m(1) = 1 . \tag{4-38}$$

We find from Eq. (4-37) that

$$R_n^n(\rho) = \rho^n , \tag{4-39}$$

and

$$R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ -\delta_{m0} & \text{for odd } n/2 . \end{cases} \tag{4-40}$$

For $m = 0$, a radial polynomial has the same form as a corresponding Legendre
polynomial $P_{n/2}(\cdot)$ according to

$$R_n^0(\rho) = P_{n/2}(2\rho^2 - 1) . \tag{4-41}$$

The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi} \cos m\theta \cos m'\theta \, d\theta = \pi (1 + \delta_{m0}) \, \delta_{mm'} . \tag{4-42}$$

The polynomials $R_n^m(\rho)$ obey the orthogonality relation

$$\int_0^1 R_n^m(\rho) R_{n'}^m(\rho) \, \rho \, d\rho = \frac{1}{2(n+1)} \, \delta_{nn'} . \tag{4-43}$$

In Eq. (4-43), the $m$ value is the same for both radial polynomials because of the
orthogonality Eq. (4-42) of the trigonometric functions. Accordingly, the polynomials
$Z_n^m(\rho, \theta)$ are orthonormal according to

$$\frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta) \, \rho \, d\rho \, d\theta = \delta_{nn'} \delta_{mm'} . \tag{4-44}$$
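
The radial orthogonality relation can be verified exactly for particular indices, since the integrand is a polynomial in $\rho$ and $\int_0^1 \rho^{p+q+1}\, d\rho = 1/(p+q+2)$. The Python sketch below is illustrative only; it rebuilds the radial coefficients from the factorial formula of Eq. (4-37) and uses exact rational arithmetic. The function names are assumptions of this example.

```python
from fractions import Fraction
from math import factorial

def radial_coefficients(n, m):
    """Coefficients of R_n^m from Eq. (4-37), as {power of rho: integer coefficient}."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)
        coeffs[n - 2 * s] = num // den
    return coeffs

def radial_inner_product(n1, n2, m):
    """Exact value of the integral on the left-hand side of Eq. (4-43)."""
    a, b = radial_coefficients(n1, m), radial_coefficients(n2, m)
    # Each term integrates rho^p * rho^q * rho over 0..1, giving 1 / (p + q + 2).
    return sum(Fraction(ca * cb, p + q + 2) for p, ca in a.items() for q, cb in b.items())

if __name__ == "__main__":
    print(radial_inner_product(4, 4, 0))  # expected 1 / [2 (n + 1)] with n = 4
    print(radial_inner_product(2, 4, 0))  # expected 0 for different n
    print(radial_inner_product(3, 5, 1))  # expected 0 for different n
```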

Since the aberrations introduced by fabrication errors or atmospheric turbulence are
random in nature, we need both the cosine and the sine Zernike circle polynomials to
express them. It is convenient in such cases to write their form and numbering as [5]:

$$Z_{\text{even } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \cos m\theta , \quad m \ne 0 , \tag{4-45a}$$

$$Z_{\text{odd } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \sin m\theta , \quad m \ne 0 , \tag{4-45b}$$

$$Z_j(\rho, \theta) = \sqrt{n+1} \, R_n^0(\rho) , \quad m = 0 . \tag{4-45c}$$

An even $j$ is associated with a cosine polynomial and an odd $j$ with a sine
polynomial. The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi} d\theta \begin{cases} \cos m\theta \cos m'\theta , & j \text{ and } j' \text{ are both even} \\ \cos m\theta \sin m'\theta , & j \text{ is even and } j' \text{ is odd} \\ \sin m\theta \cos m'\theta , & j \text{ is odd and } j' \text{ is even} \\ \sin m\theta \sin m'\theta , & j \text{ and } j' \text{ are both odd} \end{cases}$$

$$= \begin{cases} \pi (1 + \delta_{m0}) \, \delta_{mm'} , & j \text{ and } j' \text{ are both even} \\ \pi \, \delta_{mm'} , & j \text{ and } j' \text{ are both odd} \\ 0 , & \text{otherwise} . \end{cases} \tag{4-46}$$

Therefore, the Zernike circle polynomials are orthonormal over a unit disc according to

$$\int_0^1 \int_0^{2\pi} Z_j(\rho, \theta) Z_{j'}(\rho, \theta) \, \rho \, d\rho \, d\theta \Big/ \int_0^1 \int_0^{2\pi} \rho \, d\rho \, d\theta = \delta_{jj'} . \tag{4-47}$$

4.6.2 Circle Polynomials in Polar Coordinates


The orthonormal Zernike circle polynomials and the names associated with some of
them when identified with the classical aberrations are listed in Table 4-4 in polar
coordinates for $n \le 8$. The polynomials independent of $\theta$ are the spherical aberrations,
those varying as $\cos\theta$ are the coma aberrations, and those varying as $\cos 2\theta$ are the
astigmatism aberrations. The variation of several radial polynomials $R_n^m(\rho)$ with $\rho$ is
illustrated in Figure 4-10. A polynomial with an even value of $n$ has a value of zero at $n/2$
values of $\rho$, e.g., for defocus, astigmatism, and various orders of spherical aberration. A
polynomial with an odd value of $n$ has a value of zero at $(n+1)/2$ values of $\rho$, e.g., for
various orders of coma. The larger the value of $n$ of a polynomial, the more oscillatory
the polynomial.

4.6.3 Polynomial Ordering


The index $n$ of a Zernike polynomial represents its radial degree or the order, since it
represents the highest power of $\rho$ in the polynomial. This is different from the order of a
classical aberration, which represents the degree of the object (for which the aberration
function is considered) and pupil points in Cartesian coordinates (see Section 1.6). The
index $m$ of a polynomial is referred to as its azimuthal frequency. The index $j$ is a
polynomial-ordering number and is a function of both $n$ and $m$. The polynomials in Table
4-4 are ordered such that an even $j$ corresponds to a symmetric polynomial varying as
$\cos m\theta$, while an odd $j$ corresponds to an antisymmetric polynomial varying as $\sin m\theta$. A
polynomial with a lower value of $n$ is ordered first, and for a given value of $n$, a
polynomial with a lower value of $m$ is ordered first.

4.6.4 Number of Circle Polynomials through a Certain Order n


The number of circle polynomials of a given order $n$ is $n + 1$. Their number through
a certain order $n$ is given by

$$N_n = (n+1)(n+2)/2 . \tag{4-48}$$

For a rotationally symmetric imaging system, each of the $\sin m\theta$ terms is zero, as
discussed in Section 1.6. Accordingly, the number of polynomials of an even order is
$(n/2) + 1$ and $(n+1)/2$ for an odd order. Their number through an order $n$ is given by

$$N_n = \left[ (n/2) + 1 \right]^2 \quad \text{for even } n , \tag{4-49a}$$

$$\phantom{N_n} = (n+1)(n+3)/4 \quad \text{for odd } n . \tag{4-49b}$$


Table 4-4. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices $j$, $n$, and $m$
are called the polynomial number, radial degree, and azimuthal frequency,
respectively. The polynomials $Z_j$ are ordered such that an even $j$ corresponds to a
symmetric polynomial varying as $\cos m\theta$, while an odd $j$ corresponds to an
antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of $n$
is ordered first, and for a given value of $n$, a polynomial with a lower value of $m$ is
ordered first.

| $j$ | $n$ | $m$ | $Z_j(\rho, \theta)$ | Aberration Name* |
|---|---|---|---|---|
| 1 | 0 | 0 | $1$ | Piston |
| 2 | 1 | 1 | $2\rho \cos\theta$ | x-tilt |
| 3 | 1 | 1 | $2\rho \sin\theta$ | y-tilt |
| 4 | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Defocus |
| 5 | 2 | 2 | $\sqrt{6}\,\rho^2 \sin 2\theta$ | 45° Primary astigmatism |
| 6 | 2 | 2 | $\sqrt{6}\,\rho^2 \cos 2\theta$ | 0° Primary astigmatism |
| 7 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \sin\theta$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \cos\theta$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt{8}\,\rho^3 \sin 3\theta$ | |
| 10 | 3 | 3 | $\sqrt{8}\,\rho^3 \cos 3\theta$ | |
| 11 | 4 | 0 | $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$ | Primary spherical aberration |
| 12 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \cos 2\theta$ | 0° Secondary astigmatism |
| 13 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \sin 2\theta$ | 45° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt{10}\,\rho^4 \cos 4\theta$ | |
| 15 | 4 | 4 | $\sqrt{10}\,\rho^4 \sin 4\theta$ | |
| 16 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho) \cos\theta$ | Secondary x-coma |
| 17 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho) \sin\theta$ | Secondary y-coma |
| 18 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \cos 3\theta$ | |
| 19 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \sin 3\theta$ | |
| 20 | 5 | 5 | $\sqrt{12}\,\rho^5 \cos 5\theta$ | |
| 21 | 5 | 5 | $\sqrt{12}\,\rho^5 \sin 5\theta$ | |
| 22 | 6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2) \sin 2\theta$ | 45° Tertiary astigmatism |
| 24 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2) \cos 2\theta$ | 0° Tertiary astigmatism |
| 25 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \sin 4\theta$ | |
| 26 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \cos 4\theta$ | |
| 27 | 6 | 6 | $\sqrt{14}\,\rho^6 \sin 6\theta$ | |
| 28 | 6 | 6 | $\sqrt{14}\,\rho^6 \cos 6\theta$ | |
| 29 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho) \sin\theta$ | Tertiary y-coma |
| 30 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho) \cos\theta$ | Tertiary x-coma |
| 31 | 7 | 3 | $4\,(21\rho^7 - 30\rho^5 + 10\rho^3) \sin 3\theta$ | |
| 32 | 7 | 3 | $4\,(21\rho^7 - 30\rho^5 + 10\rho^3) \cos 3\theta$ | |
| 33 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \sin 5\theta$ | |
| 34 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \cos 5\theta$ | |
| 35 | 7 | 7 | $4\rho^7 \sin 7\theta$ | |
| 36 | 7 | 7 | $4\rho^7 \cos 7\theta$ | |
| 37 | 8 | 0 | $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$ | Tertiary spherical |
| 38 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2) \cos 2\theta$ | 0° Quaternary astigmatism |
| 39 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2) \sin 2\theta$ | 45° Quaternary astigmatism |
| 40 | 8 | 4 | $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4) \cos 4\theta$ | |
| 41 | 8 | 4 | $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4) \sin 4\theta$ | |
| 42 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \cos 6\theta$ | |
| 43 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \sin 6\theta$ | |
| 44 | 8 | 8 | $\sqrt{18}\,\rho^8 \cos 8\theta$ | |
| 45 | 8 | 8 | $\sqrt{18}\,\rho^8 \sin 8\theta$ | |

*The words “orthonormal Zernike circle” are to be associated with these names, e.g.,
orthonormal Zernike circle 0° primary astigmatism.

Figure 4-10. Variation of a Zernike circle radial polynomial $R_n^m(\rho)$ as a function of
$\rho$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.

4.6.5 Relationships among the Indices n, m, and j


The number of polynomials $N_n$ through a certain order n [$N_n = (n + 1)(n + 2)/2$]
represents the largest value of j for that order. Since the number of polynomials with the
same value of n but different values of m is equal to n + 1, the smallest value of j for a
given value of n is $N_n - n$. For a given value of n and m ≠ 0, there are two j values,
$N_n - n + m - 1$ and $N_n - n + m$. The even value of j represents the $\cos m\theta$
polynomial, and the odd value of j represents the $\sin m\theta$ polynomial. The value of j
with m = 0 is $N_n - n$. For example, for n = 5, $N_n = 21$ and j = 21 represents the
$\sin 5\theta$ polynomial. The number of the corresponding $\cos 5\theta$ polynomial is
j = 20. The two polynomials with m = 3, for example, have j values of 18 and 19,
representing the $\cos 3\theta$ and the $\sin 3\theta$ polynomials, respectively.

For a given value of j, n is given by

$$n = \left[(2j - 1)^{1/2} + 0.5\right]_{\text{integer}} - 1 , \tag{4-50}$$

where the subscript integer implies the integer value of the number in brackets. Once n is
known, the value of m is given by

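$$m = \begin{cases}
2\left\{\left[2j + 1 - n(n + 1)\right]/4\right\}_{\text{integer}} & \text{when } n \text{ is even} \quad \text{(4-51a)} \\[4pt]
2\left\{\left[2(j + 1) - n(n + 1)\right]/4\right\}_{\text{integer}} - 1 & \text{when } n \text{ is odd} . \quad \text{(4-51b)}
\end{cases}$$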

For example, suppose we want to know the values of n and m for the polynomial j = 10.
From Eq. (4-50), n = 3 and from Eq. (4-51b), m = 3. Hence, it is a $\cos 3\theta$ polynomial.
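
The index relationships of this section can be checked numerically. The following Python sketch implements Eqs. (4-50) and (4-51) as written; the function names are illustrative only (they do not come from the text), and the closed form used for $N_n$ is assumed from the count of n + 1 polynomials per order.

```python
import math


def polys_through_order(n: int) -> int:
    """N_n: number of polynomials through order n (n + 1 polynomials per order)."""
    return (n + 1) * (n + 2) // 2


def j_to_nm(j: int) -> tuple:
    """Return (n, m) for the single index j >= 1, following Eqs. (4-50) and (4-51)."""
    # Eq. (4-50): integer part of sqrt(2j - 1) + 0.5, minus one.
    n = int(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        # Eq. (4-51a); floor division gives the integer part (argument is positive).
        m = 2 * ((2 * j + 1 - n * (n + 1)) // 4)
    else:
        # Eq. (4-51b)
        m = 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1
    return n, m


def angular_part(j: int) -> str:
    """Even j is the cos(m*theta) polynomial, odd j the sin(m*theta) polynomial."""
    _, m = j_to_nm(j)
    if m == 0:
        return "none"
    return "cos" if j % 2 == 0 else "sin"


if __name__ == "__main__":
    print(j_to_nm(10), angular_part(10))   # (3, 3) cos, as in the worked example
    print(polys_through_order(5))          # 21
```

For n = 5 the sketch reproduces the values quoted above: j = 20 and 21 map to m = 5, and j = 18 and 19 map to m = 3.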

4.6.6 Uniqueness of Circle Polynomials


The Zernike circle polynomials have certain unique mathematical properties. They
are the only polynomials in two variables $\rho$ and $\theta$, which (a) are orthogonal over a circle,
(b) are invariant in form with respect to rotation of the coordinate axes about the origin,
and (c) include a polynomial for each permissible pair of n and m values [4,12].

From the standpoint of wavefront analysis, their uniqueness lies in the fact that they
are not only orthogonal over a circular pupil, but include wavefront tilt, defocus, and
balanced classical aberrations as members of the polynomial set for such a pupil. For
example, $Z_6$, $Z_8$, and $Z_{11}$ represent the balanced primary aberrations of astigmatism,
coma, and spherical aberration, as may be seen by comparing their forms with those
given in Table 4-2. Similarly, $Z_{12}$, $Z_{16}$, and $Z_{22}$ represent the balanced secondary
aberrations of astigmatism, coma, and spherical aberration, respectively, as may be seen
by comparing their forms with those given in Eqs. (4-33)–(4-35), respectively. Note that
the constant term in a radially symmetric aberration is needed to make its mean value
zero over the pupil. A balanced classical aberration in the form of a Zernike polynomial is
referred to as a Zernike or orthogonal aberration, e.g., $Z_6$ is Zernike primary
astigmatism or $Z_8$ is Zernike primary coma. In Section 4.5, aberrations with only $\cos m\theta$
type dependence are considered, as would be the case for a rotationally symmetric
imaging system. In general, an aberration function will also have $\sin m\theta$ type terms, for
example, due to fabrication errors or those due to atmospheric turbulence. The
corresponding polynomials with $\sin m\theta$ dependence are considered in Section 4.6.

4.6.7 Circle Polynomials in Cartesian Coordinates


The circle polynomials given in polar coordinates in Table 4-4 can be written in the
Cartesian coordinates $(x, y)$ of a pupil point, and $\cos m\theta$ and $\sin m\theta$ can be written in
terms of powers of $\cos\theta$ and $\sin\theta$, respectively. They are listed in Table 4-5 using the
polynomial ordering index j. It is quite common in the optics literature to consider a point
object lying along the y axis when imaged by a rotationally symmetric optical system,
thus making the yz plane the tangential plane [4]. To maintain symmetry of the aberration
function about this plane, the polar angle $\theta$ of a pupil point in Figure 4-1 is accordingly
defined as the angle made by its position vector OQ with the y axis, contrary to the
standard convention as the angle with the x axis. We choose a point object along the x
axis so that, for example, the coma aberration is expressed as $x(x^2 + y^2)$ and not as
$y(x^2 + y^2)$. A positive value of our coma aberration yields a diffraction point spread
function that is symmetric about the x axis (or symmetric in y) with its peak and centroid
shifted to a positive value of x with respect to the Gaussian image point.

In practice, the aberration data obtained by way of interferometry will generally be
available at a uniformly spaced array of points in Cartesian coordinates. Hence, it is
convenient to carry out numerical analysis in a Cartesian coordinate system using the
Zernike circle polynomials in Cartesian coordinates.
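
As a consistency check between the polar and Cartesian forms, the sketch below compares two polynomials, $Z_8$ and $Z_{14}$, on a grid of pupil points. The polar forms $\sqrt{8}(3\rho^3 - 2\rho)\cos\theta$ and $\sqrt{10}\rho^4\cos 4\theta$ are taken as the standard entries for these indices, with $\theta$ measured from the x axis; the helper names and the grid density are assumptions made for illustration.

```python
import math


def z8_polar(rho, theta):
    """Primary x-coma in polar form."""
    return math.sqrt(8) * (3 * rho**3 - 2 * rho) * math.cos(theta)


def z8_cartesian(x, y):
    """Primary x-coma in Cartesian form, with rho^2 = x^2 + y^2."""
    return math.sqrt(8) * x * (3 * (x * x + y * y) - 2)


def z14_polar(rho, theta):
    return math.sqrt(10) * rho**4 * math.cos(4 * theta)


def z14_cartesian(x, y):
    rho2 = x * x + y * y
    return math.sqrt(10) * (rho2**2 - 8 * x * x * y * y)


def max_mismatch(polar, cartesian, n_rho=20, n_theta=36):
    """Largest absolute difference between the two forms over a polar grid."""
    worst = 0.0
    for i in range(n_rho + 1):
        rho = i / n_rho
        for k in range(n_theta):
            theta = 2 * math.pi * k / n_theta
            x, y = rho * math.cos(theta), rho * math.sin(theta)
            worst = max(worst, abs(polar(rho, theta) - cartesian(x, y)))
    return worst


if __name__ == "__main__":
    print(max_mismatch(z8_polar, z8_cartesian))
    print(max_mismatch(z14_polar, z14_cartesian))
```

Both mismatches should be at the level of floating-point rounding, which confirms the identities $\rho\cos\theta = x$ and $\rho^4\cos 4\theta = \rho^4 - 8x^2y^2$ used in the table.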

4.7 ZERNIKE CIRCLE COEFFICIENTS OF A CIRCULAR ABERRATION FUNCTION

The aberration function $W(\rho, \theta)$ of a rotationally symmetric imaging system for a
certain point object can be expanded in terms of the orthonormal Zernike circle
polynomials $Z_n^m(\rho, \theta)$ that are orthonormal over a unit disc in the form

$$W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-52}$$

where $c_{nm}$ are the orthonormal expansion coefficients that depend on the object location.
The orthonormal Zernike expansion coefficients are given by

$$c_{nm} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_n^m(\rho, \theta)\, \rho \, d\rho \, d\theta , \tag{4-53}$$

as may be seen by substituting Eq. (4-52) and utilizing the orthonormality Eq. (4-44) of
the polynomials.

Table 4-5. Orthonormal Zernike circle polynomials $Z_j(x, y)$ in Cartesian
coordinates $(x, y)$, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $0 \le \rho = (x^2 + y^2)^{1/2} \le 1$.

Poly.   n   m   $Z_j(x, y)$                                                                       Name

Z1      0   0   $1$                                                                               Piston
Z2      1   1   $2x$                                                                              x tilt
Z3      1   1   $2y$                                                                              y tilt
Z4      2   0   $\sqrt{3}\,(2\rho^2 - 1)$                                                         Defocus
Z5      2   2   $2\sqrt{6}\,xy$                                                                   45° Primary astig.
Z6      2   2   $\sqrt{6}\,(x^2 - y^2)$                                                           0° Primary astig.
Z7      3   1   $\sqrt{8}\,y(3\rho^2 - 2)$                                                        Primary y-coma
Z8      3   1   $\sqrt{8}\,x(3\rho^2 - 2)$                                                        Primary x-coma
Z9      3   3   $\sqrt{8}\,y(3x^2 - y^2)$
Z10     3   3   $\sqrt{8}\,x(x^2 - 3y^2)$
Z11     4   0   $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$                                               Primary spherical
Z12     4   2   $\sqrt{10}\,(x^2 - y^2)(4\rho^2 - 3)$                                             0° Secondary astig.
Z13     4   2   $2\sqrt{10}\,xy(4\rho^2 - 3)$                                                     45° Secondary astig.
Z14     4   4   $\sqrt{10}\,(\rho^4 - 8x^2y^2)$
Z15     4   4   $4\sqrt{10}\,xy(x^2 - y^2)$
Z16     5   1   $\sqrt{12}\,x(10\rho^4 - 12\rho^2 + 3)$                                           Secondary x-coma
Z17     5   1   $\sqrt{12}\,y(10\rho^4 - 12\rho^2 + 3)$                                           Secondary y-coma
Z18     5   3   $\sqrt{12}\,x(x^2 - 3y^2)(5\rho^2 - 4)$
Z19     5   3   $\sqrt{12}\,y(3x^2 - y^2)(5\rho^2 - 4)$
Z20     5   5   $\sqrt{12}\,x(16x^4 - 20x^2\rho^2 + 5\rho^4)$
Z21     5   5   $\sqrt{12}\,y(16y^4 - 20y^2\rho^2 + 5\rho^4)$
Z22     6   0   $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$                                  Secondary spherical
Z23     6   2   $2\sqrt{14}\,xy(15\rho^4 - 20\rho^2 + 6)$                                         45° Tertiary astig.
Z24     6   2   $\sqrt{14}\,(x^2 - y^2)(15\rho^4 - 20\rho^2 + 6)$                                 0° Tertiary astig.
Z25     6   4   $4\sqrt{14}\,xy(x^2 - y^2)(6\rho^2 - 5)$
Z26     6   4   $\sqrt{14}\,(8x^4 - 8x^2\rho^2 + \rho^4)(6\rho^2 - 5)$
Z27     6   6   $\sqrt{14}\,xy(32x^4 - 32x^2\rho^2 + 6\rho^4)$
Z28     6   6   $\sqrt{14}\,(32x^6 - 48x^4\rho^2 + 18x^2\rho^4 - \rho^6)$
Z29     7   1   $4y(35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$                                          Tertiary y-coma
Z30     7   1   $4x(35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$                                          Tertiary x-coma
Z31     7   3   $4y(3x^2 - y^2)(21\rho^4 - 30\rho^2 + 10)$
Z32     7   3   $4x(x^2 - 3y^2)(21\rho^4 - 30\rho^2 + 10)$
Z33     7   5   $4(7\rho^2 - 6)\,[4x^2y(x^2 - y^2) + y(\rho^4 - 8x^2y^2)]$
Z34     7   5   $4(7\rho^2 - 6)\,[x(\rho^4 - 8x^2y^2) - 4xy^2(x^2 - y^2)]$
Z35     7   7   $8x^2y(3\rho^4 - 16x^2y^2) + 4y(x^2 - y^2)(\rho^4 - 16x^2y^2)$
Z36     7   7   $4x(x^2 - y^2)(\rho^4 - 16x^2y^2) - 8xy^2(3\rho^4 - 16x^2y^2)$
Z37     8   0   $3(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$                               Tertiary spherical
Z38     8   2   $\sqrt{18}\,(56\rho^6 - 105\rho^4 + 60\rho^2 - 10)(x^2 - y^2)$                    0° Quaternary astig.
Z39     8   2   $2\sqrt{18}\,xy(56\rho^6 - 105\rho^4 + 60\rho^2 - 10)$                            45° Quaternary astig.
Z40     8   4   $\sqrt{18}\,(28\rho^4 - 42\rho^2 + 15)(\rho^4 - 8x^2y^2)$
Z41     8   4   $4\sqrt{18}\,xy(28\rho^4 - 42\rho^2 + 15)(x^2 - y^2)$
Z42     8   6   $\sqrt{18}\,(x^2 - y^2)(\rho^4 - 16x^2y^2)(8\rho^2 - 7)$
Z43     8   6   $2\sqrt{18}\,xy(3\rho^4 - 16x^2y^2)(8\rho^2 - 7)$
Z44     8   8   $\sqrt{18}\,[2(\rho^4 - 8x^2y^2)^2 - \rho^8]$
Z45     8   8   $8\sqrt{18}\,xy(x^2 - y^2)(\rho^4 - 8x^2y^2)$

Because of the orthogonality of the Zernike polynomials, the mean value of a circle
polynomial, except when n = 0 = m (the piston polynomial), is zero, and its mean square
value is unity, as shown in Section 3.2. Therefore, the mean and the mean square values
of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = c_{00} , \tag{4-54}$$

$$\langle W^2(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 , \tag{4-55}$$

respectively. Accordingly, its variance is given by

$$\sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{4-56}$$

In practice, the expansion will be truncated at some value N of n such that the variance
obtained from Eq. (4-56) will be equal to its value obtained from the actual data within
some specified tolerance.

An aberration function $W(\rho, \theta)$ across a unit disc representing aberrations resulting
from fabrication errors or atmospheric turbulence can be expanded in terms of the
Zernike circle polynomials $Z_j(\rho, \theta)$ in the form [2,5]

$$W(\rho, \theta) = \sum_{j=1}^{J} a_j Z_j(\rho, \theta) , \tag{4-57}$$

where $a_j$ are the expansion coefficients, and we have truncated the polynomials at
maximum value J of j. Multiplying both sides of Eq. (4-57) by $Z_j(\rho, \theta)$, integrating over
the unit disc, and using the orthonormality Eq. (4-4), we obtain the circle expansion
coefficients:

$$a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_j(\rho, \theta)\, \rho \, d\rho \, d\theta . \tag{4-58}$$

As stated in Section 3.2, it is evident from Eq. (4-58) that the value of a circle coefficient
$a_j$ is independent of the number J of the polynomials used in Eq. (4-57) for the
expansion of the aberration function. Hence, one or more terms can be added to or
subtracted from the aberration function without affecting the value of the coefficients of
the other polynomials in the expansion.

The mean and the mean square values of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = a_1 , \tag{4-59}$$

$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{4-60}$$

respectively. Accordingly, the aberration variance is given by

$$\sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{4-61}$$
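
The coefficient integral of Eq. (4-58) and the variance relation of Eq. (4-61) can be illustrated with a simple quadrature. The sketch below is only a sketch under stated assumptions: it uses a midpoint rule on a polar grid (a choice made here, not a method from the text), a hand-picked subset of polynomials, and a synthetic aberration function `w_example` whose coefficients are known in advance.

```python
import math

# A few orthonormal Zernike circle polynomials Z_j(rho, theta), keyed by index j.
ZERNIKE = {
    1: lambda r, t: 1.0,
    2: lambda r, t: 2.0 * r * math.cos(t),
    3: lambda r, t: 2.0 * r * math.sin(t),
    4: lambda r, t: math.sqrt(3) * (2 * r**2 - 1),
    6: lambda r, t: math.sqrt(6) * r**2 * math.cos(2 * t),
    8: lambda r, t: math.sqrt(8) * (3 * r**3 - 2 * r) * math.cos(t),
}


def disc_average(f, n_rho=200, n_theta=64):
    """Approximate (1/pi) * integral of f(rho, theta) rho d(rho) d(theta) over the unit disc."""
    d_rho, d_theta = 1.0 / n_rho, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho                      # midpoint in rho
        for k in range(n_theta):
            theta = (k + 0.5) * d_theta              # midpoint in theta
            total += f(rho, theta) * rho
    return total * d_rho * d_theta / math.pi


def coefficient(w, j):
    """Eq. (4-58): a_j as the disc average of W * Z_j."""
    z = ZERNIKE[j]
    return disc_average(lambda r, t: w(r, t) * z(r, t))


def w_example(r, t):
    """Synthetic aberration (hypothetical data): piston 0.05, defocus 0.3, x-coma 0.1."""
    return 0.05 * ZERNIKE[1](r, t) + 0.3 * ZERNIKE[4](r, t) + 0.1 * ZERNIKE[8](r, t)


coeffs = {j: coefficient(w_example, j) for j in ZERNIKE}
mean = disc_average(w_example)                                  # Eq. (4-59): a_1
mean_square = disc_average(lambda r, t: w_example(r, t) ** 2)   # Eq. (4-60)
variance = mean_square - mean ** 2                              # Eq. (4-61)

if __name__ == "__main__":
    print({j: round(a, 4) for j, a in coeffs.items()})
    print(round(variance, 4), round(sum(a * a for j, a in coeffs.items() if j > 1), 4))
```

The recovered coefficients of the polynomials absent from `w_example` come out near zero, which illustrates the statement that each $a_j$ is independent of how many other terms are included in the expansion.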

4.8 SYMMETRY PROPERTIES OF IMAGES ABERRATED BY A CIRCLE POLYNOMIAL ABERRATION

It is evident that a Zernike circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$
is m-fold symmetric, unless m = 0, in which case it is radially symmetric. However, the
symmetry of the corresponding interferogram depends on $|\cos m\theta|$ or $|\sin m\theta|$, since it
does not depend on the sign of the aberration. Hence, it is 2m-fold symmetric. Based on
the symmetry of the aberration, we now determine the symmetry of the PSF, the real and
the imaginary parts of the OTF, and the MTF [13,14].
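
The difference between the m-fold symmetry of the aberration and the 2m-fold symmetry of its interferogram can be seen from the angular factor alone. The following sketch, with hypothetical helper names, searches for the largest p such that rotating by $2\pi/p$ leaves a given angular function unchanged; it is a brute-force check on sampled angles, not a proof.

```python
import math


def is_invariant(f, angle, samples=360, tol=1e-12):
    """True if f(theta + angle) equals f(theta) at every sampled theta."""
    return all(
        abs(f(2 * math.pi * k / samples + angle) - f(2 * math.pi * k / samples)) < tol
        for k in range(samples)
    )


def fold_symmetry(f, max_fold=32):
    """Largest p <= max_fold for which f is invariant under rotation by 2*pi/p."""
    return max(p for p in range(1, max_fold + 1) if is_invariant(f, 2 * math.pi / p))


if __name__ == "__main__":
    m = 3
    print(fold_symmetry(lambda t: math.cos(m * t)))        # aberration: m-fold
    print(fold_symmetry(lambda t: abs(math.cos(m * t))))   # sign-independent pattern: 2m-fold
```

Taking the absolute value stands in for the fact that a fringe does not distinguish the sign of the aberration.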

4.8.1 Symmetry of PSF


Consider an m-fold symmetric aberration of the form $\cos m\theta$. From Eq. (4-5), the
PSF at a distance r but an angle $\theta_i + 2\pi k/m$, where k = 1, 2, ..., m, can be written

$$I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)] \, \rho \, d\rho \, d\theta \right|^2 . \tag{4-62}$$

Now,

$$\Phi(\rho, \theta - 2\pi k/m) \sim \cos[m(\theta - 2\pi k/m)] = \cos(m\theta - 2\pi k) = \cos m\theta \sim \Phi(\rho, \theta) . \tag{4-63}$$

Hence, we can write Eq. (4-62) as

$$I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta - 2\pi k/m)] \exp[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)] \, \rho \, d\rho \, d\theta \right|^2 = I(r, \theta_i) . \tag{4-64}$$

Thus if we change the angle $\theta_i$ by $2\pi k/m$ but keep r unchanged, we obtain the same
value of the PSF as at $(r, \theta_i)$. This change can occur m times over a complete cycle of
$2\pi$. Therefore, Eq. (4-64) shows that the PSF is m-fold symmetric, as expected for the
m-fold aberration function. However, this is true for odd values of m only.

If m is even, the invariance of the PSF when $\theta_i$ changes by $\pi$, i.e., for k = m/2,
implies that the PSF is symmetric or even about the origin, i.e., $I(\vec{r}) = I(-\vec{r})$. It has the
consequence that the PSF is 2m-fold symmetric when m is even, as we show next. The
PSF at a distance r but angle $\theta_i \pm \pi j/m$, where j = 1, 2, ..., 2m, is given by

$$I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)] \, \rho \, d\rho \, d\theta \right|^2 . \tag{4-65}$$

Now

$$\Phi(\rho, \theta \pm \pi j/m) \sim \cos[m(\theta \pm \pi j/m)] = \cos(m\theta \pm \pi j)
= \begin{cases} \cos m\theta & \text{for even } j \\ -\cos m\theta & \text{for odd } j \end{cases}
\;\sim\; \begin{cases} \Phi(\rho, \theta) & \text{for even } j \\ -\Phi(\rho, \theta) & \text{for odd } j . \end{cases} \tag{4-66}$$

Therefore, Eq. (4-65) can be written

$$I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta - \pi j/m)] \exp[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)] \, \rho \, d\rho \, d\theta \right|^2 \tag{4-67}$$

$$= \begin{cases} I(r, \theta_i) & \text{for even } j \\ I(r, \theta_i + \pi) \equiv I(-\vec{r}) & \text{for odd } j , \end{cases} \tag{4-68}$$

where in Eq. (4-67) we have substituted $\Phi(\rho, \theta) = \Phi(\rho, \theta \pm \pi j/m)$ for even j and
$\Phi(\rho, \theta) = -\Phi(\rho, \theta \pm \pi j/m)$ for odd j to obtain Eq. (4-68). Since $I(\vec{r}) = I(-\vec{r})$ for even m,
the right-hand side of Eq. (4-68) is equal to $I(r, \theta_i)$ for odd values of j also. Hence the
PSF is 2m-fold symmetric when m is even. Of course, when m = 0, the PSF is radially
symmetric, like the aberration function.

The PSFs for two polynomial aberrations with the same n and m values, and the
same sigma value, but different angular dependence as $\cos m\theta$ and $\sin m\theta$ are the same
except that one is rotated by an angle $\pi/2m$ with respect to the other. If two such
polynomial aberrations are present simultaneously with sigma values $a_j$ and $b_j$, we can
write their sum in the form

$$\begin{aligned}
W(\rho, \theta) &= a_j Z_{\text{even } j}(\rho, \theta) + b_j Z_{\text{odd } j}(\rho, \theta) \\
&= \sqrt{2(n + 1)}\, R_n^m(\rho) \left( a_j \cos m\theta + b_j \sin m\theta \right) \\
&= \sqrt{2(n + 1)}\, R_n^m(\rho) \left( a_j^2 + b_j^2 \right)^{1/2} \cos\left\{ m \left[ \theta - (1/m) \tan^{-1}(b_j/a_j) \right] \right\} .
\end{aligned} \tag{4-69}$$

It represents an aberration of the form $\cos m\theta$ with a sigma value of $(a_j^2 + b_j^2)^{1/2}$, except
that its orientation is different by an angle $(1/m)\tan^{-1}(b_j/a_j)$. Hence, the orientation of
the PSF (and OTF) also changes by this angle.

It is easy to see that when both $a_j$ and $b_j$ are negative, $(a_j^2 + b_j^2)^{1/2}$ in Eq. (4-69)
must be replaced by $-(a_j^2 + b_j^2)^{1/2}$. However, when one of the coefficients is positive and
the other is negative, then $\tan^{-1}(b_j/a_j)$ of a negative argument has two solutions: a
negative acute angle or that angle increased by $\pi$. The choice is made depending on
which coefficient is negative. Written for the tilt coefficients $a_2$ and $a_3$ in place of $a_j$
and $b_j$, the choice is

$$\tan^{-1}(b_j/a_j) = \begin{cases}
-\tan^{-1}|b_j/a_j| & \text{for positive } a_2 \text{ and negative } a_3 \quad \text{(4-70a)} \\[4pt]
\pi - \tan^{-1}|b_j/a_j| & \text{for negative } a_2 \text{ and positive } a_3 . \quad \text{(4-70b)}
\end{cases}$$

An alternative when $a_2$ is negative is to let the angle be $-\tan^{-1}|b_j/a_j|$, as when $a_2$ is
positive, but also replace $(a_j^2 + b_j^2)^{1/2}$ with $-(a_j^2 + b_j^2)^{1/2}$.
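
In numerical work, the sign bookkeeping described above is often sidestepped with a two-argument arctangent. The sketch below is one such alternative, not the convention of the text: it always returns a non-negative amplitude and places the orientation in the correct quadrant, so it can differ from the forms above by a sign of the amplitude together with a shift of $\pi/m$ in the angle. The function name `combine` is hypothetical.

```python
import math


def combine(a, b, m):
    """Return (amplitude, orientation) such that
    a*cos(m*theta) + b*sin(m*theta) == amplitude*cos(m*(theta - orientation)),
    with amplitude >= 0 and m >= 1."""
    amplitude = math.hypot(a, b)            # (a^2 + b^2)^(1/2)
    orientation = math.atan2(b, a) / m      # quadrant-aware (1/m) * arctan(b/a)
    return amplitude, orientation


if __name__ == "__main__":
    a, b, m = -0.3, 0.4, 2
    amp, phi = combine(a, b, m)
    theta = 0.7
    lhs = a * math.cos(m * theta) + b * math.sin(m * theta)
    rhs = amp * math.cos(m * (theta - phi))
    print(amp, phi, abs(lhs - rhs))
```

Because `atan2` inspects the signs of both arguments, the four sign combinations of the two coefficients need no separate cases.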

4.8.2 Symmetry of OTF


The complex OTF given by Eq. (2-10) can be written in terms of its real and
imaginary parts:

$$\tau(\vec{v}) = \mathrm{Re}\, \tau(\vec{v}) + i\, \mathrm{Im}\, \tau(\vec{v}) , \tag{4-71}$$

where the real and the imaginary parts are given by

$$\mathrm{Re}\, \tau(\vec{v}) = \int I(\vec{r}) \cos(2\pi \vec{v} \cdot \vec{r}) \, d\vec{r} \tag{4-72a}$$

and

$$\mathrm{Im}\, \tau(\vec{v}) = \int I(\vec{r}) \sin(2\pi \vec{v} \cdot \vec{r}) \, d\vec{r} , \tag{4-72b}$$

respectively. In polar coordinates, we can write them

$$\mathrm{Re}\, \tau(v, \phi) = \iint I(r, \theta_i) \cos[2\pi v r \cos(\theta_i - \phi)] \, r \, dr \, d\theta_i \tag{4-73a}$$

and

$$\mathrm{Im}\, \tau(v, \phi) = \iint I(r, \theta_i) \sin[2\pi v r \cos(\theta_i - \phi)] \, r \, dr \, d\theta_i . \tag{4-73b}$$

When m is odd, the OTF is complex. To determine the symmetry of its real part, we
consider it for a spatial frequency $(v, \phi + \pi j/m)$, where, as before, j = 1, 2, ..., 2m:

$$\mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i . \tag{4-74}$$

From Eq. (4-68) for even j, we can replace $I(r, \theta_i)$ with $I(r, \theta_i - \pi j/m)$, and thus

$$\mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i = \mathrm{Re}\, \tau(v, \phi) . \tag{4-75}$$

For odd j,

$$I(r, \theta_i + \pi j/m) = I(r, \theta_i + \pi) . \tag{4-76}$$

Therefore, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$, we may write
Eq. (4-74) as

$$\begin{aligned}
\mathrm{Re}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i + \pi) \cos[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)] \, r \, dr \, d\theta_i \\
&= \iint I(r, \theta_i + \pi j/m) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i \\
&= \mathrm{Re}\, \tau(v, \phi) .
\end{aligned} \tag{4-77}$$

Hence, $\mathrm{Re}\, \tau(v, \phi)$ is 2m-fold symmetric.

Now consider the imaginary part given by Eq. (4-73b). Following the same
procedure as for the real part, we replace $I(r, \theta_i)$ by $I(r, \theta_i - \pi j/m)$ for even j and write

$$\mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i = \mathrm{Im}\, \tau(v, \phi) . \tag{4-78}$$

However, for odd j, we obtain

$$\mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i . \tag{4-79}$$

Again, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$ and utilizing Eq. (4-68) for
odd j, we may write Eq. (4-79) as

$$\begin{aligned}
\mathrm{Im}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i + \pi) \sin[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)] \, r \, dr \, d\theta_i \\
&= -\iint I(r, \theta_i + \pi j/m) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)] \, r \, dr \, d\theta_i \\
&= -\mathrm{Im}\, \tau(v, \phi) .
\end{aligned} \tag{4-80}$$

Thus, the imaginary part does not change for even j, but its sign changes for odd j without
changing its magnitude. Hence, the imaginary part is only m-fold symmetric.

However, when m is even, the PSF is even about the origin, and, therefore, the
imaginary part of the OTF given by Eq. (4-72b) is zero (since its integrand is an odd
function). Accordingly, the OTF is real. Moreover, since the PSF is 2m-fold symmetric in
this case, so is the OTF. Accordingly, the MTF, which is the modulus of the OTF, is 2m-
fold symmetric whether m is even or odd. Of course, when m = 0, i.e., for a radially
symmetric aberration, the OTF is real, radially symmetric, and equal to the MTF.

The symmetry properties of the various functions discussed above for a Zernike
polynomial aberration with m-fold symmetry varying as $\cos m\theta$ or $\sin m\theta$ are
summarized in Table 4-6, where NA stands for “not applicable.” Of course, for m = 0,
the interferogram, the PSF, and the OTF are all radially symmetric. In addition, the OTF
is real when m is zero or even.

Table 4-6. Symmetry of interferogram, PSF, real and imaginary parts of OTF, and
MTF for m-fold symmetric Zernike polynomial aberration varying as $\cos m\theta$ or
$\sin m\theta$.

m       Interferogram   PSF       Re OTF    Im OTF    MTF

Even    2m-fold         2m-fold   2m-fold   NA        2m-fold
Odd     2m-fold         m-fold    2m-fold   m-fold    2m-fold

4.9 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF CIRCLE POLYNOMIAL ABERRATIONS

The circle polynomial aberrations for $n \le 8$ are illustrated in three different but
equivalent ways in Figure 4-11 for a sigma value of one wave. For each polynomial
aberration, the isometric plot is shown at the top, the interferogram on the left, and the
PSF on the right. The peak-to-valley numbers of the aberrations are given, and the Strehl
ratio and examples of the OTF characteristics are illustrated for a sigma value of 0.1 wave
[14].

4.9.1 Isometric Characteristics


The isometric plot at the top illustrates the shape of an aberration polynomial, as
produced, for example, in a deformable mirror. The corresponding P-V aberration
numbers (in units of wavelength) are given in Table 4-7. From the form of the
polynomials given in Eqs. (4-45a) and (4-45b) for $m \ne 0$, these numbers are given by
$2\sqrt{2(n + 1)}$, since $R_n^m(1) = 1$ and $\cos m\theta$ or $\sin m\theta$ varies by 2 from –1 to 1. When m = 0
and n/2 is even, as for the primary and tertiary spherical aberrations $Z_{11}$ and $Z_{37}$, the P-V
numbers are given by $(1 - b)\sqrt{n + 1}$, where b is the extreme negative value of $R_n^m(\rho)$
as $\rho$ varies between 0 and 1. However, when m = 0 and n/2 is odd, as for defocus $Z_4$
and secondary spherical aberration $Z_{22}$, $R_n^m(\rho)$ varies from –1 at $\rho = 0$ to 1 at $\rho = 1$, as
may be seen from Figure 4-10. The P-V numbers in this case are given by $2\sqrt{n + 1}$. It
should be evident that the P-V numbers of two polynomials with the same values of n and
m are the same. The P-V numbers of a polynomial aberration representing the fabrication
errors give a measure of the depth of material to be removed in the fabrication process.
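
The P-V numbers can be reproduced by sampling the radial polynomial. The sketch below assumes the standard factorial-sum expression for $R_n^m(\rho)$ and a uniform sampling in $\rho$; the function names and the sample count are illustrative choices, and the result is accurate only to the sampling resolution.

```python
import math


def radial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho) from the standard factorial sum."""
    return sum(
        (-1) ** s * math.factorial(n - s)
        / (math.factorial(s)
           * math.factorial((n + m) // 2 - s)
           * math.factorial((n - m) // 2 - s))
        * rho ** (n - 2 * s)
        for s in range((n - m) // 2 + 1)
    )


def peak_to_valley(n, m, samples=2001):
    """P-V (in waves) of the orthonormal polynomial of order (n, m) for unit sigma."""
    values = [radial(n, m, k / (samples - 1)) for k in range(samples)]
    if m == 0:
        # Radially symmetric: sqrt(n + 1) times the range of R_n^0.
        return math.sqrt(n + 1) * (max(values) - min(values))
    # Angular factor swings between -1 and +1, so P-V is twice the peak magnitude.
    return 2 * math.sqrt(2 * (n + 1)) * max(abs(v) for v in values)


if __name__ == "__main__":
    for n, m in [(2, 0), (4, 0), (6, 0), (8, 0), (3, 1), (7, 7)]:
        print(n, m, round(peak_to_valley(n, m), 3))
```

For (n, m) = (4, 0) the sampled minimum of the radial polynomial is –0.5, which gives the factor $1 - b = 1.5$ quoted for primary spherical aberration.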

4.9.2 Interferometric Characteristics


Figure 4-11. Zernike circle polynomials $Z_1$ through $Z_{45}$, each shown as isometric plot on
the top, interferogram on the left, and PSF on the right for a sigma value of one wave.

Table 4-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
Zernike polynomial aberrations for a sigma value of one wave.

Poly.   P-V #                     Poly.   P-V #                  Poly.   P-V #

Z1      0                         Z16     $2\sqrt{12}$ = 6.928   Z31     8
Z2      4                         Z17     $2\sqrt{12}$ = 6.928   Z32     8
Z3      4                         Z18     $2\sqrt{12}$ = 6.928   Z33     8
Z4      $2\sqrt{3}$ = 3.464       Z19     $2\sqrt{12}$ = 6.928   Z34     8
Z5      $2\sqrt{6}$ = 4.899       Z20     $2\sqrt{12}$ = 6.928   Z35     8
Z6      $2\sqrt{6}$ = 4.899       Z21     $2\sqrt{12}$ = 6.928   Z36     8
Z7      $4\sqrt{2}$ = 5.657       Z22     $2\sqrt{7}$ = 5.292    Z37     4.286
Z8      $4\sqrt{2}$ = 5.657       Z23     $2\sqrt{14}$ = 7.483   Z38     $2\sqrt{18}$ = 8.485
Z9      $4\sqrt{2}$ = 5.657       Z24     $2\sqrt{14}$ = 7.483   Z39     $2\sqrt{18}$ = 8.485
Z10     $4\sqrt{2}$ = 5.657       Z25     $2\sqrt{14}$ = 7.483   Z40     $2\sqrt{18}$ = 8.485
Z11     $1.5\sqrt{5}$ = 3.354     Z26     $2\sqrt{14}$ = 7.483   Z41     $2\sqrt{18}$ = 8.485
Z12     $2\sqrt{10}$ = 6.325      Z27     $2\sqrt{14}$ = 7.483   Z42     $2\sqrt{18}$ = 8.485
Z13     $2\sqrt{10}$ = 6.325      Z28     $2\sqrt{14}$ = 7.483   Z43     $2\sqrt{18}$ = 8.485
Z14     $2\sqrt{10}$ = 6.325      Z29     8                      Z44     $2\sqrt{18}$ = 8.485
Z15     $2\sqrt{10}$ = 6.325      Z30     8                      Z45     $2\sqrt{18}$ = 8.485

The symmetry of an interferogram of a polynomial aberration, as in optical testing,
can be different from that of the aberration, because a fringe is formed independent of the
sign of the aberration. For example, astigmatism $Z_6$ varying as $\cos 2\theta$ is 2-fold symmetric. It has the
implication that the aberration function does not change when it is rotated by $\pi$. Rotating
by $\pi/2$ yields an aberration of the same magnitude but with an opposite sign.
Accordingly, its interferogram is 4-fold symmetric; thus, the fringes intersecting the x axis
are formed by a positive aberration, and those intersecting the y axis are formed by a
negative aberration. The number of fringes in an interferogram, which is equal to the
number of times the aberration changes by one wave as we move from the center to the
edges of the pupil, is different for the different polynomials. Each fringe represents a
contour of constant phase or aberration. The fringe is dark when the phase is an odd
multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In the case of tilts, for
example, the aberration changes by one wave four times, which is the same as the
peak-to-valley value of 4 waves. Hence, 4 straight line fringes symmetric about the center are
obtained. The x-tilt polynomial $Z_2$ yields vertical fringes, and the y-tilt polynomial $Z_3$
yields horizontal fringes. Similarly, defocus aberration $Z_4$ yields about 3.5 fringes. In the
case of spherical aberration $Z_{11}$, the aberration starts at a value of $\sqrt{5}$ waves, decreases
to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then increases to $\sqrt{5}$ waves.
Hence, the total number of times the aberration changes by unity is equal to 6.7, and
approximately seven circular fringes are obtained.
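
The fringe counts quoted for the radially symmetric polynomials can be reproduced by summing the absolute change of the aberration from the pupil center to its edge. The sketch below does this for defocus and primary spherical aberration at a sigma value of one wave; the sampling approach and the name `total_variation` are choices made for this illustration.

```python
import math


def total_variation(profile, samples=10001):
    """Sum of absolute changes of profile(rho) as rho runs from 0 to 1."""
    values = [profile(k / (samples - 1)) for k in range(samples)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def z4(rho):
    """Defocus, sqrt(3) * (2 rho^2 - 1), in waves for unit sigma."""
    return math.sqrt(3) * (2 * rho**2 - 1)


def z11(rho):
    """Primary spherical, sqrt(5) * (6 rho^4 - 6 rho^2 + 1), in waves for unit sigma."""
    return math.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1)


if __name__ == "__main__":
    print(round(total_variation(z4), 2))    # about 3.5 fringes
    print(round(total_variation(z11), 2))   # about 6.7, i.e. roughly seven fringes
```

For $Z_{11}$ the variation is $1.5\sqrt{5}$ on the way down and $1.5\sqrt{5}$ on the way back up, which is twice its P-V number.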

4.9.3 PSF Characteristics


The PSF plots represent the images of a point object in the presence of a polynomial
aberration. The piston aberration represented by the Zernike polynomial $Z_1$ has no effect
on the image. Thus the PSF it yields is the Airy pattern given by Eq. (4-10). The full
width of a square displaying the PSFs in Figure 4-11 is $24\lambda F$.

The polynomial aberrations $Z_2$ and $Z_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $4(\lambda/D)a_2$ about the y axis and displaces the PSF along the x axis
by $4\lambda F a_2$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4(\lambda/D)a_3$ about the x
axis and displaces the PSF by $4\lambda F a_3$ along the y axis. The aberrated PSFs can be
obtained from Eq. (4-5). For astigmatism $Z_5$ and $Z_6$, m = 2, and the PSF is 4-fold
symmetric. For coma $Z_7$ and $Z_8$, m = 1, the PSF is symmetric about the y and the x axis,
respectively. The polynomial $Z_{10}$ corresponds to m = 3, the aberration function is 3-fold
symmetric, but the interferogram is 6-fold symmetric. Since m is odd, the PSF is also
3-fold symmetric.

The Strehl ratio for the first 45 circle polynomial aberrations with a sigma value of
0.1 wave is listed in Table 4-8 and plotted in Figure 4-12 on a nominal and an expanded
scale to clearly show the variation of their values. For the tilt polynomials $Z_2$ and $Z_3$, the
Strehl ratio simply represents the PSF value at a displaced point along the x or the y axis,
respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.4\lambda F$.

A closed-form expression for the Strehl ratio for the defocus circle polynomial $Z_4$
can be obtained from Eq. (4-18) by letting

$$2\pi W(\rho, \theta) = a_4 Z_4(\rho) . \tag{4-81}$$

The result obtained is

$$S = \left[ \frac{\sin\left(\sqrt{3}\, a_4\right)}{\sqrt{3}\, a_4} \right]^2 . \tag{4-82}$$

For a defocus sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement with the
result given in Table 4-8. Note that $a_4$ is the sigma value, which in turn is equal to
$B_d / 2\sqrt{3}$, where $B_d$ is the peak value of the defocus aberration. Hence, Eq. (4-82) is the
same as Eq. (4-21). The amount of longitudinal defocus required to produce a certain
value of $a_4$, and therefore $B_d$, is given by Eq. (4-20).

Table 4-8. Strehl ratio S for Zernike circle polynomial aberrations with a sigma
value of 0.1 wave.

Poly.   S        Poly.   S        Poly.   S

Z1      1        Z16     0.673    Z31     0.674
Z2      0.665    Z17     0.673    Z32     0.674
Z3      0.665    Z18     0.674    Z33     0.680
Z4      0.663    Z19     0.674    Z34     0.680
Z5      0.671    Z20     0.692    Z35     0.705
Z6      0.671    Z21     0.692    Z36     0.705
Z7      0.669    Z22     0.668    Z37     0.670
Z8      0.669    Z23     0.673    Z38     0.674
Z9      0.678    Z24     0.673    Z39     0.674
Z10     0.678    Z25     0.677    Z40     0.676
Z11     0.666    Z26     0.677    Z41     0.676
Z12     0.672    Z27     0.698    Z42     0.684
Z13     0.672    Z28     0.698    Z43     0.684
Z14     0.685    Z29     0.675    Z44     0.711
Z15     0.685    Z30     0.675    Z45     0.711

The results of Table 4-8 and Figure 4-12 illustrate that the Strehl ratio for a small
aberration is nearly independent of the type of the aberration and that it depends primarily
on its sigma value. It is approximately given by Eq. (4-22c) as $\exp(-\sigma_\Phi^2)$, or 0.67,
where $\sigma_\Phi = 0.2\pi$.
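
The two Strehl-ratio values quoted in this section are easy to reproduce. The sketch below evaluates the closed form of Eq. (4-82) for defocus and the exponential estimate $\exp(-\sigma_\Phi^2)$, both for a sigma value given in waves; the function names are illustrative only.

```python
import math


def strehl_defocus(sigma_waves):
    """Eq. (4-82): Strehl ratio for Zernike defocus with the given sigma (in waves)."""
    a4 = 2 * math.pi * sigma_waves        # phase sigma in radians
    x = math.sqrt(3) * a4
    return (math.sin(x) / x) ** 2 if x != 0 else 1.0


def strehl_exponential(sigma_waves):
    """Approximate Strehl ratio exp(-sigma_phi^2) for a small aberration."""
    sigma_phi = 2 * math.pi * sigma_waves
    return math.exp(-sigma_phi ** 2)


if __name__ == "__main__":
    print(round(strehl_defocus(0.1), 5))       # 0.66255
    print(round(strehl_exponential(0.1), 2))   # 0.67
```

For a sigma of 0.1 wave the two values differ by about 0.01, in line with the spread seen among the low-order entries of Table 4-8.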

4.9.4 OTF Characteristics


An image displacement of $\vec{r}_t$ due to a wavefront tilt produces a linearly varying
phase factor of $2\pi \vec{v} \cdot \vec{r}_t$ in the OTF, as may be seen from Eq. (1-10) by replacing $\mathrm{PSF}(\vec{r})$
with the displaced PSF $\mathrm{PSF}(\vec{r} - \vec{r}_t)$ and the OTF $\tau(\vec{v})$ by the corresponding OTF $\tau_t(\vec{v})$.
Of course, the phase factor, representing the phase transfer function, has no effect on the
MTF of the system.

Figure 4-12. Strehl ratio S for Zernike circle polynomial aberrations $Z_j$ with a sigma
value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.

The 3D MTF plots are shown in Figure 4-13 for the primary aberration polynomials
with a sigma value of 0.1 wave. The MTF for the piston aberration represents the
aberration-free MTF. It is included among the aberrated MTF plots by a solid line as a
reference. The symmetry of the MTFs is made more explicit by the contour plots shown
below each 3D MTF figure. The MTF value at the center of the contours is unity and
decreases to zero from the center out starting with a value of 0.9 and ending with zero.
The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots
are also shown in this figure, i.e., for the spatial frequency vector along the x axis, y axis,
and at 45° from the x axis, respectively. Because of the 4-fold symmetry of the MTF in
the case of astigmatism, the tangential MTF is equal to the sagittal MTF. As expected
[3,8], the aberrated MTF is lower than the aberration-free MTF at all spatial frequencies
$0 < v < 1$, i.e., within the passband of the system.

Figure 4-13. 3D, tangential or along x axis (in long dashes), sagittal or along y axis
(in medium dashes), and at 45° from the x axis (in small dashes) MTF plots for
Zernike circle polynomial aberrations with a sigma value of 0.1 wave. Panels: $Z_1$
(piston), $Z_4$ (defocus), $Z_6$ (primary astigmatism), $Z_8$ (primary coma), $Z_{10}$, and $Z_{11}$
(primary spherical). The solid curve represents the aberration-free MTF. The spatial
frequency v is normalized by the cutoff frequency $1/\lambda F$. The contour plots below each
3D MTF plot are in steps of 0.1 from the center out, starting with 0.9 and ending with zero.

Figure 4-14a shows the symmetry of the real and the imaginary parts of the OTF for
coma Z 8 . The real part has even symmetry, but the imaginary part has odd symmetry.
The thick and thin contours of the imaginary part in both cases represent its positive and
negative values, respectively. The real and imaginary parts of the OTF for the aberration
Z10 are shown in Figure 4-14b. In addition to their even and odd symmetry, it shows that
the real part is 6-fold symmetric and the imaginary part is 3-fold symmetric, as expected
for a 3-fold symmetric aberration. Because of the odd symmetry of the imaginary part, its
integral over the spatial frequencies imaged by a system is zero, as expected from the
statement after Eq. (1-25).

Figure 4-14. Real and imaginary parts of the OTF for a Zernike polynomial
aberration with a sigma value of 0.1 wave. (a) $Z_8$ (primary coma) showing the even
and odd symmetry of the real and imaginary parts. (b) $Z_{10}$ showing the 6-fold
symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to
their even and odd symmetry, respectively. The thick and thin contours of the
imaginary part in both cases represent its positive and negative values, respectively.

4.10 CIRCLE POLYNOMIALS AND THEIR RELATIONSHIPS WITH CLASSICAL ABERRATIONS

4.10.1 Introduction

It is seen from Eq. (1-18) that a classical aberration depends on the polar angle $\theta$ as
$\cos^m\theta$. However, a Zernike polynomial depends on the angle as $\cos m\theta$ (or $\sin m\theta$). By
expressing $\cos^m\theta$ as a series of $\cos m\theta$ terms, or $\cos m\theta$ as a power series of $\cos\theta$
terms, the coefficients of classical aberrations can be obtained from the Zernike
coefficients and vice versa [15,16]. We illustrate this for primary aberrations. The names
of some of the aberrations associated with the Zernike polynomials are given in Table 4-4.
They are a carryover from the names associated with the classical aberrations.

The Seidel aberrations are well known in optical design, where the optical system
has an axis of rotational symmetry with the consequence that the angle-dependent terms
are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in
optical testing generally contain both the cosine and sine terms due to the assembly and
fabrication errors. We show how to define the effective Seidel coefficients in such cases.
We emphasize that the Seidel aberration coefficients determined from the primary
Zernike aberrations will be in error unless the higher-order terms that also contain Seidel
terms are negligible [16,17].

4.10.2 Wavefront Tilt and Defocus


The Zernike tilt aberration

$$a_2 Z_2(\rho, \theta) = 2 a_2 \rho \cos\theta \tag{4-83}$$

represents a tilt of the wavefront about the y axis by an angle $4(\lambda/D)a_2$, where the
aberration coefficient is in units of wavelength. It results in a displacement of the PSF
along the x axis by $4\lambda F a_2$. Similarly, the Zernike tilt aberration

$$a_3 Z_3(\rho, \theta) = 2 a_3 \rho \sin\theta \tag{4-84}$$

represents a tilt of the wavefront about the x axis by an angle $4(\lambda/D)a_3$ and results in a
displacement of the PSF along the y axis by $4\lambda F a_3$.
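
As a small worked example of these tilt relations, the sketch below converts a Zernike tilt coefficient (in waves) into a wavefront tilt angle and a PSF displacement. The function name and the numerical values of wavelength, pupil diameter D, and focal ratio F are hypothetical inputs chosen only for illustration.

```python
def tilt_effect(a_waves, wavelength, pupil_diameter, focal_ratio):
    """Return (tilt_angle, psf_shift) for a Zernike tilt coefficient given in waves.

    tilt_angle is in radians; psf_shift is in the same length unit as wavelength.
    """
    tilt_angle = 4 * (wavelength / pupil_diameter) * a_waves   # 4 (lambda / D) a
    psf_shift = 4 * wavelength * focal_ratio * a_waves         # 4 lambda F a
    return tilt_angle, psf_shift


if __name__ == "__main__":
    # Hypothetical system: 0.5 micrometer light, 0.1 m pupil, F = 10, a_2 = 0.1 wave.
    angle, shift = tilt_effect(0.1, 0.5e-6, 0.1, 10.0)
    print(angle, shift)   # the shift equals 0.4 * lambda * F
```

With a coefficient of 0.1 wave the displacement is $0.4\lambda F$, the value quoted in the discussion of the Strehl ratio for the tilt polynomials.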

It should be evident that when the cosine and sine terms of a certain aberration are
present simultaneously, as in optical testing, their combination represents the aberration
whose orientation depends on the value of the component terms. For example, if both x
and y Zernike tilts are present in the form

$$W(\rho, \theta) = a_2 Z_2(\rho, \theta) + a_3 Z_3(\rho, \theta) \tag{4-85a}$$

$$= 2 a_2 \rho \cos\theta + 2 a_3 \rho \sin\theta , \tag{4-85b}$$

it can be written

$$W(\rho, \theta) = 2 \left( a_2^2 + a_3^2 \right)^{1/2} \rho \cos\left[ \theta - \tan^{-1}(a_3/a_2) \right] . \tag{4-86}$$

Thus, it represents a Zernike wavefront tilt aberration of magnitude $2(a_2^2 + a_3^2)^{1/2}$ about
an axis that is orthogonal to a line making an angle of $\tan^{-1}(a_3/a_2)$ with the x axis. How
to decide the sign of the overall tilt and the value of its angle are discussed following
Eq. (4-69).

The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the Seidel distortion in its $(\rho, \theta)$
dependence. Similarly, the Zernike defocus aberration $Z_4(\rho)$ varies with $\rho$ as the Seidel
field curvature varies with it. The constant term in $Z_4(\rho)$ makes its mean value across the
circular pupil zero, without changing its standard deviation.

4.10.3 Astigmatism
The Zernike primary astigmatism

$$a_6 Z_6(\rho,\theta) = \sqrt{6}\,a_6\rho^2\cos 2\theta \tag{4-87}$$

is referred to as the 0° astigmatism. It consists of Seidel astigmatism $\rho^2\cos^2\theta$ balanced
with defocus aberration $\rho^2$ to yield minimum variance. It yields a uniform circular spot
diagram, but a line sagittal image along the x axis (i.e., in a plane that zeroes out the
defocus part). The Zernike primary astigmatism

$$a_5 Z_5(\rho,\theta) = \sqrt{6}\,a_5\rho^2\sin 2\theta \tag{4-88}$$

can be written

$$a_5 Z_5(\rho,\theta) = \sqrt{6}\,a_5\rho^2\cos\left[2(\theta - \pi/4)\right] . \tag{4-89}$$

Comparing with Eq. (4-87), it is equivalent to changing $\theta$ to $\theta - \pi/4$, since
$\sin 2\theta = \cos(2\theta - \pi/2)$. Accordingly, it is
called the 45° astigmatism. The secondary Zernike astigmatism given by

$$a_{12} Z_{12}(\rho,\theta) = \sqrt{10}\,a_{12}\left(4\rho^4 - 3\rho^2\right)\cos 2\theta \tag{4-90}$$

does not yield a line image in any plane. However, it is referred to as the 0° astigmatism
in conformance with the corresponding primary astigmatism because of its variation with
$\theta$ as $\cos 2\theta$. Similarly, the name tertiary astigmatism in Table 4-4 can be explained.

If both 0° and 45° astigmatisms are present so that

$$W(\rho,\theta) = a_6 Z_6(\rho,\theta) + a_5 Z_5(\rho,\theta) \tag{4-91a}$$

$$= \sqrt{6}\,a_6\rho^2\cos 2\theta + \sqrt{6}\,a_5\rho^2\sin 2\theta , \tag{4-91b}$$

we may write it in the form

$$W(\rho,\theta) = \left(a_5^2 + a_6^2\right)^{1/2}\sqrt{6}\,\rho^2\cos\left\{2\left[\theta - (1/2)\tan^{-1}(a_5/a_6)\right]\right\} , \tag{4-92}$$

showing that it is Zernike astigmatism of magnitude $(a_5^2 + a_6^2)^{1/2}$ at an angle of
$(1/2)\tan^{-1}(a_5/a_6)$.
It should be evident that there is ambiguity in determining astigmatism, because it
can be written in different but equivalent forms by separating defocus aberration from it.
For example, a 0° astigmatism can be written

$$a_6 Z_6(\rho,\theta) = a_6\left(\sqrt{6}\,\rho^2\cos 2\theta\right) \tag{4-93a}$$

$$= a_6\sqrt{6}\left(2\rho^2\cos^2\theta - \rho^2\right) \tag{4-93b}$$

$$= a_6\sqrt{6}\left(-2\rho^2\sin^2\theta + \rho^2\right) . \tag{4-93c}$$

It is clear that a 0° Zernike astigmatism given by Eq. (4-93a) can be written as a
combination of 0° positive Seidel astigmatism and a negative defocus, as in Eq. (4-93b),
or a 90° negative Seidel astigmatism and a positive defocus, as in Eq. (4-93c).
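
As a quick numerical illustration of this ambiguity, the sketch below evaluates the three forms of Eqs. (4-93a) to (4-93c) at a grid of pupil points and confirms that they coincide. It is only a sanity check under assumed sample values; the function names and the coefficient value are hypothetical.

```python
import math

SQRT6 = math.sqrt(6.0)

def form_a(a6, rho, theta):
    """Eq. (4-93a): Zernike 0-degree astigmatism."""
    return a6 * SQRT6 * rho**2 * math.cos(2.0 * theta)

def form_b(a6, rho, theta):
    """Eq. (4-93b): positive 0-degree Seidel astigmatism plus negative defocus."""
    return a6 * SQRT6 * (2.0 * rho**2 * math.cos(theta)**2 - rho**2)

def form_c(a6, rho, theta):
    """Eq. (4-93c): negative 90-degree Seidel astigmatism plus positive defocus."""
    return a6 * SQRT6 * (-2.0 * rho**2 * math.sin(theta)**2 + rho**2)

a6 = 0.25  # hypothetical coefficient, in waves
max_diff = 0.0
for i in range(11):            # rho from 0 to 1
    for k in range(24):        # theta from 0 to 2*pi
        rho, theta = i / 10.0, 2.0 * math.pi * k / 24.0
        fa, fb, fc = (f(a6, rho, theta) for f in (form_a, form_b, form_c))
        max_diff = max(max_diff, abs(fa - fb), abs(fa - fc))
print(max_diff)
```

The printed maximum difference is at the level of floating-point rounding, so the choice among the three forms is purely a matter of how the defocus is bookkept.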

4.10.4 Coma
The Zernike coma terms $a_8 Z_8(\rho,\theta)$ and $a_7 Z_7(\rho,\theta)$ are called the x and y Zernike
comas. They represent classical coma $\rho^3\cos\theta$ or $\rho^3\sin\theta$ balanced with tilt $\rho\cos\theta$ or
$\rho\sin\theta$, respectively, to yield minimum variance. They yield PSFs that are symmetric
about the x and y axes, respectively. Similarly, the names for the secondary and tertiary
coma can be explained.

When both x and y Zernike comas are present, the aberration may be written

$$W(\rho,\theta) = a_8 Z_8(\rho,\theta) + a_7 Z_7(\rho,\theta) \tag{4-94a}$$

$$= \sqrt{8}\,a_8\left(3\rho^3 - 2\rho\right)\cos\theta + \sqrt{8}\,a_7\left(3\rho^3 - 2\rho\right)\sin\theta \tag{4-94b}$$

$$= \left(a_7^2 + a_8^2\right)^{1/2}\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\left[\theta - \tan^{-1}(a_7/a_8)\right] , \tag{4-94c}$$

which is equivalent to a Zernike coma of magnitude $(a_7^2 + a_8^2)^{1/2}$ inclined at an angle of
$\tan^{-1}(a_7/a_8)$ with the x axis.

4.10.5 Spherical Aberration


The Zernike spherical aberrations represent balanced classical spherical aberrations.
For example, the primary or Seidel spherical aberration varying as $\rho^4$ is balanced with
defocus varying as $\rho^2$ to yield $Z_{11}(\rho)$ representing the balanced primary spherical
aberration. As in the case of the Zernike defocus term $Z_4(\rho)$, the constant term in $Z_{11}(\rho)$
makes its mean value across the circular pupil zero. Similarly, the Zernike
secondary and tertiary spherical aberrations $Z_{22}$ and $Z_{37}$ also contain a constant term so
that their mean value is zero.

4.10.6 Seidel Coefficients from Zernike Coefficients


It should be noted that the wavefront tilt aberration given by Eq. (4-86) represents the
tilt aberration obtained from Zernike tilt aberrations. However, there are other Zernike
aberrations that also contain tilt aberration built into them, e.g., Zernike primary,
secondary, or tertiary coma. Similarly, the Seidel coma $3\sqrt{8}\,(a_7^2 + a_8^2)^{1/2}$ in Eq. (4-94c) at
an angle of $\tan^{-1}(a_7/a_8)$ is only from the primary Zernike comas. But the secondary and
tertiary Zernike comas also contain Seidel coma. Hence, only if the higher-order Zernike
comas are zero or negligible will the PSF aberrated by primary Zernike coma be
symmetric about a line making an angle of $\tan^{-1}(a_7/a_8)$ with the x axis. Similarly, only
if the secondary and tertiary astigmatisms are zero or negligible is the Seidel astigmatism
$2\sqrt{6}\,(a_5^2 + a_6^2)^{1/2}$, as in Eq. (4-92). It yields an aberrated PSF that is symmetric about two
orthogonal axes, one of which is along a line that makes an angle of $(1/2)\tan^{-1}(a_5/a_6)$
with the x axis.

To illustrate how a wrong Seidel coefficient can be inferred unless it is obtained from
all of the significant Zernike terms that contain Seidel aberrations, we consider an axial
image aberrated by one wave of secondary spherical aberration $\rho^6$. In terms of Zernike
polynomials it will be written as

$$W(\rho) = a_{22} Z_{22}(\rho) + a_{11} Z_{11}(\rho) + a_4 Z_4(\rho) + a_1 Z_1(\rho) , \tag{4-95}$$

where

$$a_{22} = 1/\left(20\sqrt{7}\right) , \quad a_{11} = 1/\left(4\sqrt{5}\right) , \quad a_4 = 9/\left(20\sqrt{3}\right) , \quad a_1 = 1/4 . \tag{4-96}$$

If we infer the Seidel spherical aberration from only the primary Zernike aberration
$a_{11} Z_{11}(\rho)$, its amount would be 1.5 waves. Such a conclusion is obviously incorrect,
because in reality the amount of Seidel spherical aberration is zero. Needless to say, if we
expand the aberration function up to the first, say, as many as 21 terms, we will in fact
incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves.
However, the Seidel spherical aberration will correctly reduce to zero when at least the
first 22 terms are included in the expansion. For an off-axis image, there are angle-dependent
aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Hence, it is
important that the expansion be carried out up to a certain number of terms such that any
additional terms do not significantly change the mean square difference between the
function and its estimate. Otherwise, the inferred Seidel aberrations will be erroneous.
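
The numbers in this example can be reproduced with a few lines of Python. The sketch below assumes the standard orthonormal forms $Z_4 = \sqrt{3}(2\rho^2 - 1)$, $Z_{11} = \sqrt{5}(6\rho^4 - 6\rho^2 + 1)$, and $Z_{22} = \sqrt{7}(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$, which are taken to match Table 4-4; the variable names are illustrative.

```python
import math

S3, S5, S7 = math.sqrt(3.0), math.sqrt(5.0), math.sqrt(7.0)

# Orthonormal radially symmetric Zernike polynomials (assumed to match Table 4-4).
def Z4(rho):
    return S3 * (2 * rho**2 - 1)

def Z11(rho):
    return S5 * (6 * rho**4 - 6 * rho**2 + 1)

def Z22(rho):
    return S7 * (20 * rho**6 - 30 * rho**4 + 12 * rho**2 - 1)

# Coefficients of Eq. (4-96) for one wave of rho^6.
a22, a11, a4, a1 = 1 / (20 * S7), 1 / (4 * S5), 9 / (20 * S3), 1 / 4

def expansion(rho):
    """Right-hand side of Eq. (4-95)."""
    return a22 * Z22(rho) + a11 * Z11(rho) + a4 * Z4(rho) + a1

# The expansion reproduces rho^6 across the pupil.
max_err = max(abs(expansion(k / 50) - (k / 50) ** 6) for k in range(51))

# rho^4 coefficient inferred from a11 Z11 alone, and with a22 Z22 included.
seidel_from_a11_only = 6 * S5 * a11
seidel_with_a22 = 6 * S5 * a11 - 30 * S7 * a22
print(max_err, seidel_from_a11_only, seidel_with_a22)
```

The $\rho^4$ coefficient is 1.5 waves when only $a_{11}Z_{11}$ is used, and it cancels once the $-30\sqrt{7}\,a_{22}\rho^4$ part of $a_{22}Z_{22}$ is included.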

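If we approximate a certain aberration function by the primary Zernike aberrations
only, we may write [16,17]

$$W(\rho,\theta) = \sum_{j=1}^{8} a_j Z_j(\rho,\theta) + a_{11} Z_{11}(\rho) \tag{4-97a}$$

$$= A_p + A_t\rho\cos(\theta - \beta_t) + A_d\rho^2 + A_a\rho^2\cos^2(\theta - \beta_a) + A_c\rho^3\cos(\theta - \beta_c) + A_s\rho^4 , \tag{4-97b}$$

where $A_p$ is the piston aberration, other coefficients $A_i$ represent the peak value of the
corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel
aberration. They are given by

$$A_p = a_1 - \sqrt{3}\,a_4 + \sqrt{5}\,a_{11} , \tag{4-98a}$$

$$A_t = 2\left[\left(a_2 - \sqrt{8}\,a_8\right)^2 + \left(a_3 - \sqrt{8}\,a_7\right)^2\right]^{1/2} , \quad \beta_t = \tan^{-1}\left(\frac{a_3 - \sqrt{8}\,a_7}{a_2 - \sqrt{8}\,a_8}\right) , \tag{4-98b}$$

$$A_d = 2\left(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11}\right) - A_a/2 , \tag{4-98c}$$

$$A_a = 2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2} , \quad \beta_a = \frac{1}{2}\tan^{-1}(a_5/a_6) , \tag{4-98d}$$

$$A_c = 6\sqrt{2}\left(a_7^2 + a_8^2\right)^{1/2} , \quad \beta_c = \tan^{-1}(a_7/a_8) , \tag{4-98e}$$

and

$$A_s = 6\sqrt{5}\,a_{11} . \tag{4-98f}$$

The defocus coefficient in Eq. (4-98c) collects the $\rho^2$ terms of $a_4 Z_4$ and $a_{11} Z_{11}$ together
with the defocus part $-A_a/2$ that is separated from the Zernike astigmatism, as in Eq. (4-93b).
As a note of caution, we add that the approximation of Eq. (4-97a) is good only when the
higher-order Zernike aberrations that also contain Seidel aberration terms are negligible.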
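
A minimal Python sketch of this conversion follows. It is an illustration rather than the authors' code: the function name, the dictionary layout keyed by the polynomial number $j$, and the quadrant-aware `math.atan2` convention for the angles are all assumptions. The defocus coefficient is computed as $2\sqrt{3}\,a_4 - 6\sqrt{5}\,a_{11} - A_a/2$, which is what follows from collecting the $\rho^2$ terms of the polynomials.

```python
import math

def seidel_from_primary_zernike(a):
    """Effective Seidel coefficients from primary Zernike coefficients.

    a: dict of orthonormal Zernike coefficients keyed by j = 1..8 and 11.
    Returns peak values A_i and orientation angles beta_i (radians).
    Valid only when higher-order Zernike terms are negligible.
    """
    s3, s5, s6, s8 = (math.sqrt(v) for v in (3.0, 5.0, 6.0, 8.0))
    tilt_x = a[2] - s8 * a[8]          # net coefficient of 2 rho cos(theta)
    tilt_y = a[3] - s8 * a[7]          # net coefficient of 2 rho sin(theta)
    A_a = 2.0 * s6 * math.hypot(a[5], a[6])
    return {
        "A_p": a[1] - s3 * a[4] + s5 * a[11],
        "A_t": 2.0 * math.hypot(tilt_x, tilt_y),
        "beta_t": math.atan2(tilt_y, tilt_x),
        "A_d": 2.0 * (s3 * a[4] - 3.0 * s5 * a[11]) - A_a / 2.0,
        "A_a": A_a,
        "beta_a": 0.5 * math.atan2(a[5], a[6]),
        "A_c": 6.0 * math.sqrt(2.0) * math.hypot(a[7], a[8]),
        "beta_c": math.atan2(a[7], a[8]),
        "A_s": 6.0 * s5 * a[11],
    }

# Zernike coefficients of a Seidel function with unit coefficients and no sine terms.
a_unit = {1: 13 / 12, 2: 5 / 6, 3: 0.0, 4: 5 / (4 * math.sqrt(3)),
          5: 0.0, 6: 1 / (2 * math.sqrt(6)), 7: 0.0,
          8: 1 / (6 * math.sqrt(2)), 11: 1 / (6 * math.sqrt(5))}
seidel = seidel_from_primary_zernike(a_unit)
print({k: round(v, 6) for k, v in seidel.items()})
```

For the coefficients used here, every Seidel peak value comes back as unity with zero piston and zero orientation angles, which is the round trip one expects.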

4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
In Figure 4-12, we have shown the Strehl ratio for the circle polynomial aberrations
with a sigma value of one-tenth of a wave. In Figure 4-15, we show how it varies with the
sigma value of a Seidel aberration, with and without balancing (as in Tables 4-1 and 4-2),
for $0 \le \sigma_W \le 0.25$. Also plotted is the Strehl ratio obtained from the approximate
expression $\exp(-\sigma_\Phi^2)$ as the dashed curve. As expected, the exponential expression
yields a very good estimate of the Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true
Strehl ratio departs from its approximate value, except in the case of balanced
astigmatism, for which the difference is quite small. The exponential expression
overestimates the Strehl ratio in the case of defocus, balanced coma, and spherical
aberration, but underestimates it for astigmatism and coma. Moreover, for a given value
of sigma, the Strehl ratio for spherical aberration is exactly the
same as for the balanced spherical aberration. The aberration coefficient and the P-V
number for a certain value of $\sigma_W$ of these aberrations can be obtained from Table 4-9.
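
The behavior described for defocus can be illustrated with a short Python sketch. It assumes the well-known closed form for the Strehl ratio of a uniformly illuminated circular pupil with defocus, $S = [\sin(B_d/2)/(B_d/2)]^2$, where $B_d = 2\pi A_d$ is the peak defocus in radians; that formula is not quoted from this section, and the function names are illustrative.

```python
import math

def strehl_exponential(sigma_w):
    """Approximate Strehl ratio exp(-sigma_Phi^2), with sigma_w in waves."""
    sigma_phi = 2.0 * math.pi * sigma_w
    return math.exp(-sigma_phi**2)

def strehl_defocus_exact(sigma_w):
    """Exact Strehl ratio for defocus (assumed closed form, circular pupil)."""
    if sigma_w == 0.0:
        return 1.0
    peak_waves = 2.0 * math.sqrt(3.0) * sigma_w   # A_d from sigma_d = A_d / (2 sqrt 3)
    half_phase = math.pi * peak_waves             # B_d / 2 with B_d = 2 pi A_d
    return (math.sin(half_phase) / half_phase) ** 2

for sigma_w in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(sigma_w, round(strehl_defocus_exact(sigma_w), 4),
          round(strehl_exponential(sigma_w), 4))
```

The two columns stay close up to about a tenth of a wave and separate beyond it, with the exponential estimate lying above the exact defocus value.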

4.11 ZERNIKE COEFFICIENTS OF A SCALED PUPIL


Given an aberration function across a circular pupil, its orthonormal Zernike
coefficients can be obtained from Eq. (4-48). Now we discuss how these coefficients
change when the size of the pupil is reduced, as when the aperture of a camera lens or the
pupil of a human eye (assuming it to be circular) is reduced due to an illumination
increase. We give two approaches. In one, we express a scaled Zernike radial polynomial
as a linear combination of the unscaled radial polynomials and utilize the orthogonal
property of the radial polynomials [18]. In the other, we use some known integrals [19].
[Figure 4-15 plots: four panels, (a) Defocus, (b) Astigmatism, (c) Coma, and (d) Spherical, each showing the Strehl ratio $S$ (0 to 1.0) versus $\sigma_W$ (0 to 0.25).]

Figure 4-15. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.
Table 4-9. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.

Aberration                       Sigma                                              P-V # for $\sigma = 1$

Defocus                          $\sigma_d = A_d/2\sqrt{3} = A_d/3.46$              3.46
Astigmatism                      $\sigma_a = A_a/4$                                 4
Balanced astigmatism             $\sigma_{ba} = A_a/2\sqrt{6} = A_a/4.90$           4.90
Coma                             $\sigma_c = A_c/2\sqrt{2} = A_c/2.83$              2.83
Balanced coma                    $\sigma_{bc} = A_c/6\sqrt{2} = A_c/8.49$           9.212
Spherical aberration             $\sigma_s = 2A_s/3\sqrt{5} = A_s/3.35$             3.35
Balanced spherical aberration    $\sigma_{bs} = A_s/6\sqrt{5} = A_s/13.42$          3.35

An alternate approach may also be considered [20]. It is perhaps worth noting that, in
practice, one will determine the Zernike coefficients of an aberration function of a system
from its interferometric data by using Eq. (4-58). The corresponding coefficients of a
scaled pupil can also be determined in the same manner by utilizing its data, i.e., by
excluding that data of the unscaled pupil that is not part of the scaled pupil. The result
obtained can be illustrated by considering a Seidel aberration function and writing it in
terms of the Zernike polynomials for both the unscaled and the scaled pupils.

4.11.1 Theory
Consider a circular pupil with its wave aberration function $W(\rho,\theta)$ expanded in
terms of the orthonormal Zernike circle polynomials $Z_j(\rho,\theta)$, as in Eq. (4-57). For a
corresponding scaled pupil with a normalized radius of $\epsilon \le 1$, as in Figure 4-16, the
aberration function can be written from Eq. (4-57) in the form

$$W(\epsilon\rho,\theta) = \sum_j a_j Z_j(\epsilon\rho,\theta) . \tag{4-99}$$

Normalizing the smaller pupil to a unit circle, the aberration function across it can also be
written in terms of the Zernike polynomials that are orthonormal over it in the form

$$W_\epsilon(\rho,\theta) = \sum_{j'} b_{j'} Z_{j'}(\rho,\theta) , \tag{4-100}$$

where $W_\epsilon(\rho,\theta) = W(\epsilon\rho,\theta)$ and the orthonormal coefficients $b_{j'}$ are given by

$$b_{j'} = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} W_\epsilon(\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta , \tag{4-101}$$

or

$$b_{j'} = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} W(\epsilon\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta . \tag{4-102}$$

Figure 4-16. Scaled circular pupil, where the pupil radius is reduced from unity to $\epsilon$
by blocking the outer portion.

To obtain a coefficient $b_{j'}$ in terms of the coefficients $a_j$, we substitute Eq. (4-99) into
Eq. (4-102) and obtain

$$b_{j'} = \frac{1}{\pi}\sum_j \int_0^1\!\!\int_0^{2\pi} a_j Z_j(\epsilon\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta . \tag{4-103}$$

From Eq. (4-46), the angular integration in Eq. (4-103) yields $\pi(1 + \delta_{m0})\,\delta_{mm'}$. Hence, we
may write

$$b_{n',m} = \sqrt{2(n'+1)}\,\sum_n \sqrt{2(n+1)}\;a_{n,m}\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho , \tag{4-104}$$

where we have replaced the single index $j$ by the corresponding double indices $n$ and $m$,
and similarly replaced $j'$ by $n'$ and $m$ according to Eqs. (4-50) and (4-51).

The integral in Eq. (4-104) can be solved very simply by writing the radial
polynomial $R_n^m(\epsilon\rho)$ in terms of the corresponding polynomials $R_{n'}^m(\rho)$ in the form [18]

$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} h_{n'}(n;\epsilon)\,R_{n'}^m(\rho) , \tag{4-105}$$

where

$$h_{n'}(n;\epsilon) = (n'+1)\sum_s\sum_{s'} \frac{(-1)^s\,(n-s)!\;\epsilon^{\,n-2s}}{s!\;s'!\;(n'+s'+1)!} , \tag{4-106}$$

$s$ and $s'$ are nonnegative integers, and $n - n' = 2(s + s')$. Substituting Eq. (4-105)
into Eq. (4-104) and utilizing Eq. (4-43) for the orthogonality of the radial
polynomials, we obtain the intended result:

$$b_{n',m} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;h_{n'}(n;\epsilon)\,a_{n,m} . \tag{4-107}$$

Since $n - n' \ge 0$ and even, $n = n', n'+2, \ldots$ . If $N$ is the highest order among
the terms of the aberration function in Eq. (4-52), then the largest value of $n$ in Eq. (4-107)
is $N$ or $N - 1$, depending on whether $N - m$ is even or odd, respectively. From Eq.
(4-105), it is easy to show that

$$h_n(n;\epsilon) = \epsilon^n , \tag{4-108a}$$

$$h_{n-2}(n;\epsilon) = -(n-1)\left(1 - \epsilon^2\right)\epsilon^{n-2} , \tag{4-108b}$$

$$h_{n-4}(n;\epsilon) = \frac{n-3}{2}\left(1 - \epsilon^2\right)\left(n - 2 - n\epsilon^2\right)\epsilon^{n-4} , \tag{4-108c}$$

$$h_{n-6}(n;\epsilon) = -\frac{n-5}{6}\left(1 - \epsilon^2\right)\left[(n-3)(n-4) - 2(n-1)(n-3)\epsilon^2 + n(n-1)\epsilon^4\right]\epsilon^{n-6} , \tag{4-108d}$$

$$h_{n-8}(n;\epsilon) = \frac{n-7}{2}\,\epsilon^{n-8}\left(1 - \epsilon^2\right)\left[\frac{(n-4)(n-5)(n-6)}{12} - \frac{(n-2)(n-4)(n-5)}{4}\epsilon^2 + \frac{(n-1)(n-2)(n-4)}{4}\epsilon^4 - \frac{n(n-1)(n-2)}{12}\epsilon^6\right] , \text{ etc.} \tag{4-108e}$$

Equations (4-108a)–(4-108e) are sufficient to obtain the Zernike coefficients of the scaled
pupil up to and including the eighth order. The expressions for $h_{n'}(n;\epsilon)$ for $n \le 8$ are
listed in Table 4-9.
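
The expansion coefficients can also be generated directly from the double sum. The sketch below reads Eq. (4-106) as $h_{n'}(n;\epsilon) = (n'+1)\sum_{s,s'} (-1)^s (n-s)!\,\epsilon^{n-2s} / [s!\,s'!\,(n'+s'+1)!]$ with $s + s' = (n - n')/2$, and compares a few values with closed forms listed in the table. The function name `h_coefficient` is hypothetical.

```python
from math import factorial

def h_coefficient(n_prime, n, eps):
    """Expansion coefficient h_{n'}(n; eps) from the double sum of Eq. (4-106).

    Returns 0 when n < n' or when n - n' is odd, since no term exists then.
    """
    if n < n_prime or (n - n_prime) % 2:
        return 0.0
    pairs = (n - n_prime) // 2            # s + s' is fixed by n - n' = 2(s + s')
    total = 0.0
    for s in range(pairs + 1):
        s_prime = pairs - s
        total += ((-1) ** s * factorial(n - s) * eps ** (n - 2 * s)
                  / (factorial(s) * factorial(s_prime)
                     * factorial(n_prime + s_prime + 1)))
    return (n_prime + 1) * total

eps = 0.8  # sample scaling factor
print(h_coefficient(4, 4, eps), eps**4)
print(h_coefficient(2, 4, eps), -3 * eps**2 * (1 - eps**2))
print(h_coefficient(0, 4, eps), (1 - eps**2) * (1 - 2 * eps**2))
print(h_coefficient(0, 6, eps), -(1 - eps**2) * (1 - 5 * eps**2 + 5 * eps**4))
```

Each printed pair agrees, and the coefficient does not depend on the azimuthal frequency $m$, as the formula shows.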

Since $h_{n'}(n';\epsilon) = \epsilon^{n'}$ from Eq. (4-108a), the first term in the summation is $\epsilon^{n'} a_{n'm}$.
Moreover, for a given value of $n'$, the multiplier of a coefficient $a_{nm}$ is independent of
$m$, regardless of whether it is a cosine or a sine polynomial. For example, when $n' = 4$,
the b-coefficients are given by

$$b_{4,0} = h_4(4;\epsilon)\,a_{4,0} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,0} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,0} + \ldots , \tag{4-109a}$$

$$b_{4,2} = h_4(4;\epsilon)\,a_{4,2} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,2} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,2} + \ldots , \tag{4-109b}$$

and

$$b_{4,4} = h_4(4;\epsilon)\,a_{4,4} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,4} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,4} + \ldots . \tag{4-109c}$$

As $\epsilon \to 1$, all the multipliers vanish except that of $a_{n'm}$, which approaches unity and yields the
expected result $b_{n',m} = a_{n',m}$.

The integral in Eq. (4-104) can also be evaluated by using the relationship [21]

$$R_n^m(\rho) = (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\,J_m(\rho r)\,dr \tag{4-110}$$

to rewrite $R_n^m(\epsilon\rho)$, where $J_n(\cdot)$ is the nth-order Bessel function of the first kind. Thus,
we obtain after interchanging the integrals,

$$\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho = (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\left[\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho\right]dr$$

$$= (-1)^{(n+n'-2m)/2}\int_0^\infty J_{n+1}(r)\,\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\,dr$$

$$= \frac{1}{2(n'+1)}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right] , \tag{4-111}$$

where we have sequentially used the relationships

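Table 4-9. Expansion coefficients $h_{n'}(n;\epsilon)$ given by Eq. (4-106) for $n \le 8$.

n    n'    $h_{n'}(n;\epsilon)$

0    0     $1$

1    1     $\epsilon$

2    0     $-(1-\epsilon^2)$
2    2     $\epsilon^2$

3    1     $-2\epsilon(1-\epsilon^2)$
3    3     $\epsilon^3$

4    0     $(1-\epsilon^2)(1-2\epsilon^2)$
4    2     $-3\epsilon^2(1-\epsilon^2)$
4    4     $\epsilon^4$

5    1     $\epsilon(1-\epsilon^2)(3-5\epsilon^2)$
5    3     $-4\epsilon^3(1-\epsilon^2)$
5    5     $\epsilon^5$

6    0     $-(1-\epsilon^2)(1-5\epsilon^2+5\epsilon^4)$
6    2     $3\epsilon^2(1-\epsilon^2)(2-3\epsilon^2)$
6    4     $-5\epsilon^4(1-\epsilon^2)$
6    6     $\epsilon^6$

7    1     $-2\epsilon(1-\epsilon^2)(2-8\epsilon^2+7\epsilon^4)$
7    3     $2\epsilon^3(1-\epsilon^2)(5-7\epsilon^2)$
7    5     $-6\epsilon^5(1-\epsilon^2)$
7    7     $\epsilon^7$

8    0     $(1-\epsilon^2)(1-2\epsilon^2)(1-7\epsilon^2+7\epsilon^4)$
8    2     $-\epsilon^2(1-\epsilon^2)(10-35\epsilon^2+28\epsilon^4)$
8    4     $5\epsilon^4(1-\epsilon^2)(3-4\epsilon^2)$
8    6     $-7\epsilon^6(1-\epsilon^2)$
8    8     $\epsilon^8$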

$$\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho = (-1)^{(n'-m)/2}\left[\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\right] , \tag{4-112a}$$

$$\frac{J_{n+1}(\epsilon r)}{\epsilon r} = \frac{J_n(\epsilon r) + J_{n+2}(\epsilon r)}{2(n+1)} , \tag{4-112b}$$

and Eq. (4-110). Substituting Eq. (4-111) into Eq. (4-104), we obtain

$$b_{n'm} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;a_{nm}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right] . \tag{4-113}$$

The equivalence of Eqs. (4-107) and (4-113) can be established by expanding the scaled
radial polynomial in terms of the orthogonal radial polynomials in the form

$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} a_{n'}(n;\epsilon)\,R_{n'}^m(\rho) , \tag{4-114}$$

where, using the orthogonality of the radial polynomials, an expansion coefficient given
by

$$a_{n'}(n;\epsilon) = 2(n'+1)\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho \tag{4-115}$$

is the same as $h_{n'}(n;\epsilon)$, as may be seen by comparing Eqs. (4-105) and (4-114).
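
The equivalence of the two routes can be checked numerically. The sketch below uses the standard explicit sum for the Zernike radial polynomial, $R_n^m(\rho) = \sum_s (-1)^s (n-s)!\,\rho^{n-2s} / \{s!\,[(n+m)/2 - s]!\,[(n-m)/2 - s]!\}$, which is assumed to agree with the definition used earlier in the chapter. It compares $R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)$ with a midpoint-rule evaluation of the integral in Eq. (4-115); the function names and the sample values are illustrative.

```python
from math import factorial

def radial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho); zero if m > n or n - m is odd."""
    if m > n or (n - m) % 2:
        return 0.0
    return sum((-1) ** s * factorial(n - s)
               / (factorial(s) * factorial((n + m) // 2 - s)
                  * factorial((n - m) // 2 - s))
               * rho ** (n - 2 * s)
               for s in range((n - m) // 2 + 1))

def h_from_radial(n_prime, n, eps):
    """Bracketed term of Eq. (4-113): R_n^{n'}(eps) - R_n^{n'+2}(eps)."""
    return radial(n, n_prime, eps) - radial(n, n_prime + 2, eps)

def h_from_integral(n_prime, n, m, eps, steps=20000):
    """Eq. (4-115) by the midpoint rule on 0 <= rho <= 1."""
    d = 1.0 / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * d
        total += radial(n, m, eps * rho) * radial(n_prime, m, rho) * rho
    return 2 * (n_prime + 1) * total * d

eps = 0.8
closed_form = 3 * eps**2 * (1 - eps**2) * (2 - 3 * eps**2)   # table entry for n = 6, n' = 2
print(closed_form, h_from_radial(2, 6, eps),
      h_from_integral(2, 6, 0, eps), h_from_integral(2, 6, 2, eps))
```

The integral gives the same value for $m = 0$ and $m = 2$, consistent with the multiplier being independent of $m$.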

4.11.2 Application to a Seidel Aberration Function


As an example of the use of Eq. (4-107), we consider a Seidel aberration function
[16]

$$W(\rho,\theta) = A_t\rho\cos\theta + A_d\rho^2 + A_a\rho^2\cos^2\theta + A_c\rho^3\cos\theta + A_s\rho^4 , \tag{4-116}$$

where a Seidel coefficient $A_i$ represents the peak value of a Seidel aberration. It can be
written in terms of the Zernike polynomials in the form

$$W(\rho,\theta) = a_{0,0}Z_0^0 + a_{1,1}Z_1^1 + a_{2,0}Z_2^0 + a_{2,2}Z_2^2 + a_{3,1}Z_3^1 + a_{4,0}Z_4^0$$

$$= a_1Z_1 + a_2Z_2 + a_4Z_4 + a_6Z_6 + a_8Z_8 + a_{11}Z_{11} , \tag{4-117}$$

where the argument $(\rho,\theta)$ of the orthonormal Zernike polynomials $Z_n^m$ is omitted for
brevity, and the Zernike coefficients are given by

$$a_{0,0} \equiv a_1 = \frac{A_d}{2} + \frac{A_a}{4} + \frac{A_s}{3} , \tag{4-118a}$$

$$a_{1,1} \equiv a_2 = \frac{A_t}{2} + \frac{A_c}{3} , \tag{4-118b}$$

$$a_{2,0} \equiv a_4 = \frac{A_d}{2\sqrt{3}} + \frac{A_a}{4\sqrt{3}} + \frac{A_s}{2\sqrt{3}} , \tag{4-118c}$$

$$a_{2,2} \equiv a_6 = \frac{A_a}{2\sqrt{6}} , \tag{4-118d}$$

$$a_{3,1} \equiv a_8 = \frac{A_c}{6\sqrt{2}} , \tag{4-118e}$$

and

$$a_{4,0} \equiv a_{11} = \frac{A_s}{6\sqrt{5}} . \tag{4-118f}$$

Moreover, it is evident that the highest order among the aberrations is $N = 4$. The
aberration variance in terms of the Zernike coefficients is given by

$$\sigma^2 = a_{1,1}^2 + a_{2,0}^2 + a_{2,2}^2 + a_{3,1}^2 + a_{4,0}^2 \tag{4-119a}$$

$$= a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{4-119b}$$

For a scaled pupil, the aberration function can be written in the form

$$W_\epsilon(\rho,\theta) = b_{0,0}Z_0^0 + b_{1,1}Z_1^1 + b_{2,0}Z_2^0 + b_{2,2}Z_2^2 + b_{3,1}Z_3^1 + b_{4,0}Z_4^0 \tag{4-120a}$$

$$= b_1Z_1 + b_2Z_2 + b_4Z_4 + b_6Z_6 + b_8Z_8 + b_{11}Z_{11} , \tag{4-120b}$$

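where, from Eq. (4-107) and utilizing the h-coefficients given in Table 4-9, the Zernike
coefficients are given by

$$b_{0,0} = h_0(0;\epsilon)\,a_{0,0} + \sqrt{3}\,h_0(2;\epsilon)\,a_{2,0} + \sqrt{5}\,h_0(4;\epsilon)\,a_{4,0}$$

$$= a_{0,0} - \sqrt{3}\left(1-\epsilon^2\right)a_{2,0} + \sqrt{5}\left(1-\epsilon^2\right)\left(1-2\epsilon^2\right)a_{4,0} ,$$

or

$$b_1 = a_1 - \sqrt{3}\left(1-\epsilon^2\right)a_4 + \sqrt{5}\left(1-\epsilon^2\right)\left(1-2\epsilon^2\right)a_{11} , \tag{4-121a}$$

$$b_{1,1} = h_1(1;\epsilon)\,a_{1,1} + \sqrt{2}\,h_1(3;\epsilon)\,a_{3,1} = \epsilon\left[a_{1,1} - 2\sqrt{2}\left(1-\epsilon^2\right)a_{3,1}\right] ,$$

or

$$b_2 = \epsilon\left[a_2 - 2\sqrt{2}\left(1-\epsilon^2\right)a_8\right] , \tag{4-121b}$$

$$b_{2,0} = h_2(2;\epsilon)\,a_{2,0} + \sqrt{5/3}\;h_2(4;\epsilon)\,a_{4,0} = \epsilon^2\left[a_{2,0} - \sqrt{15}\left(1-\epsilon^2\right)a_{4,0}\right] ,$$

or

$$b_4 = \epsilon^2\left[a_4 - \sqrt{15}\left(1-\epsilon^2\right)a_{11}\right] , \tag{4-121c}$$

$$b_{2,2} = h_2(2;\epsilon)\,a_{2,2} = \epsilon^2 a_{2,2} ,$$

or

$$b_6 = \epsilon^2 a_6 , \tag{4-121d}$$

$$b_{3,1} = h_3(3;\epsilon)\,a_{3,1} = \epsilon^3 a_{3,1} ,$$

or

$$b_8 = \epsilon^3 a_8 , \tag{4-121e}$$

and

$$b_{4,0} = h_4(4;\epsilon)\,a_{4,0} = \epsilon^4 a_{4,0} ,$$

or

$$b_{11} = \epsilon^4 a_{11} . \tag{4-121f}$$

The aberration variance for the scaled pupil is given by

$$\sigma_\epsilon^2 = b_2^2 + b_4^2 + b_6^2 + b_8^2 + b_{11}^2 . \tag{4-122}$$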

It is easy to verify that the Zernike coefficients obtained in Eqs. (4-121a)–(4-121f)
are indeed correct by writing the Seidel aberration function for the scaled pupil and
determining its Zernike coefficients. From Eq. (4-116), the aberration function of the
scaled pupil can be written

$$W(\epsilon\rho,\theta) = A_t\epsilon\rho\cos\theta + A_d\epsilon^2\rho^2 + A_a\epsilon^2\rho^2\cos^2\theta + A_c\epsilon^3\rho^3\cos\theta + A_s\epsilon^4\rho^4 . \tag{4-123}$$

It can also be written

$$W_\epsilon(\rho,\theta) = A_t'\rho\cos\theta + A_d'\rho^2 + A_a'\rho^2\cos^2\theta + A_c'\rho^3\cos\theta + A_s'\rho^4 , \tag{4-124}$$

where

$$A_t' = \epsilon A_t , \quad A_d' = \epsilon^2 A_d , \quad A_a' = \epsilon^2 A_a , \quad A_c' = \epsilon^3 A_c , \quad \text{and} \quad A_s' = \epsilon^4 A_s . \tag{4-125}$$

Writing Eq. (4-124) in terms of Zernike polynomials, as was done in obtaining Eq. (4-117)
from Eq. (4-116), it is easy to see that the Zernike coefficients thus obtained are the
same as the corresponding coefficients given by Eqs. (4-121a)–(4-121f).

4.11.3 Numerical Example


If each Seidel aberration coefficient in Eq. (4-116) is unity (e.g., one wave), then the
corresponding Zernike coefficients in Eq. (4-117) for the full pupil are given by

$$a_1 = 13/12 , \quad a_2 = 5/6 , \quad a_4 = 5/4\sqrt{3} , \quad a_6 = 1/2\sqrt{6} , \quad a_8 = 1/6\sqrt{2} , \quad a_{11} = 1/6\sqrt{5} . \tag{4-126}$$

Substituting Eq. (4-126) into Eq. (4-119b), the variance of the aberration function is
given by $\sigma^2 = 919/720$, or its standard deviation is given by $\sigma = 1.1298$. For a pupil scaled
with $\epsilon = 0.8$, the Zernike coefficients in Eq. (4-120b) are given by

$$b_1 = 0.6165 , \quad b_2 = 0.5707 , \quad b_4 = 0.3954 , \quad b_6 = 0.1306 , \quad b_8 = 0.0603 , \quad b_{11} = 0.0305 . \tag{4-127}$$

Substituting Eq. (4-127) into Eq. (4-122), the aberration variance and standard deviation
for the scaled pupil are given by

$$\sigma_\epsilon^2 = 0.5036 \tag{4-128}$$

and

$$\sigma_\epsilon = 0.7097 , \tag{4-129}$$

respectively.
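
These numbers can be reproduced with a short Python sketch of Eqs. (4-121a) to (4-121f) and (4-122). The function names and the dictionary layout keyed by the polynomial number $j$ are illustrative choices, not part of the original text.

```python
import math

def scale_seidel_zernike(a, eps):
    """Zernike coefficients of the scaled pupil, Eqs. (4-121a)-(4-121f).

    a: dict of full-pupil coefficients keyed by j = 1, 2, 4, 6, 8, 11.
    eps: normalized radius of the scaled pupil, 0 < eps <= 1.
    """
    e2 = eps**2
    return {
        1: (a[1] - math.sqrt(3) * (1 - e2) * a[4]
            + math.sqrt(5) * (1 - e2) * (1 - 2 * e2) * a[11]),
        2: eps * (a[2] - 2 * math.sqrt(2) * (1 - e2) * a[8]),
        4: e2 * (a[4] - math.sqrt(15) * (1 - e2) * a[11]),
        6: e2 * a[6],
        8: eps**3 * a[8],
        11: eps**4 * a[11],
    }

def variance(coeffs):
    """Sum of squared coefficients, excluding piston (j = 1)."""
    return sum(c**2 for j, c in coeffs.items() if j != 1)

# Full-pupil coefficients of Eq. (4-126): all Seidel coefficients equal to one wave.
a_full = {1: 13 / 12, 2: 5 / 6, 4: 5 / (4 * math.sqrt(3)),
          6: 1 / (2 * math.sqrt(6)), 8: 1 / (6 * math.sqrt(2)),
          11: 1 / (6 * math.sqrt(5))}
b_scaled = scale_seidel_zernike(a_full, 0.8)

print({j: round(b, 4) for j, b in b_scaled.items()})
print(round(variance(a_full), 4), round(math.sqrt(variance(a_full)), 4))
print(round(variance(b_scaled), 4), round(math.sqrt(variance(b_scaled)), 4))
```

Setting `eps` to 1 returns the full-pupil coefficients unchanged, which is a convenient check on any implementation of the scaling.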

We have thus demonstrated how to analytically obtain the Zernike coefficients of an
aberration function of a scaled pupil in terms of their values for a corresponding unscaled
pupil. It is perhaps worth noting that, in practice, one will determine the Zernike
coefficients of an aberration function of a system from its interferometric data by using
Eq. (4-58). The corresponding coefficients of a scaled pupil can also be determined in the
same manner by utilizing its data, i.e., by excluding that data of the unscaled pupil that is
not part of the scaled pupil.

4.12 SUMMARY
The aberration-free PSF, called the Airy pattern, is shown in Figure 4-2. It consists of
a bright central spot of radius $1.22\lambda F$, called the Airy disc, containing 83.8% of the total
light, surrounded by the diffraction rings. The corresponding OTF shown in Figure 4-4
starts at a value of unity and decreases monotonically to zero at the cutoff frequency
$1/\lambda F$. Since the Strehl ratio for a small aberration increases with a decrease in the
aberration variance, we explicitly consider the balancing of primary aberrations with
lower-order aberrations. As seen from Tables 4-1 and 4-2, the sigma value of primary
spherical aberration when balanced with defocus, primary coma balanced with tilt, and
primary astigmatism balanced with defocus, is reduced by a factor of 4, 3, and $\sqrt{6}/2$,
respectively. Accordingly, the aberration tolerance for a given Strehl ratio increases by
the same factor.

The Zernike circle polynomials are in widespread use for the analysis of circular
wavefronts because of their orthogonality over a unit circle and their representation of the
balanced classical aberrations for systems with circular pupils. The polynomials are
described by three indices: $j$ is a polynomial ordering number, $n$ represents the radial
degree or the order of a polynomial, and $m$ represents its azimuthal frequency. The
polynomials are ordered such that an even $j$ corresponds to a cosine polynomial and an
odd $j$ corresponds to a sine polynomial. A polynomial with a lower value of $n$ is ordered
first, and, for a given value of $n$, a polynomial with a lower value of $m$ is ordered first.
The expressions for the polynomials through the eighth order are given in polar
coordinates in Table 4-4 and in Cartesian coordinates in Table 4-5 in the orthonormal
form so that each expansion coefficient (except piston) of an aberration function
represents the sigma value of the corresponding polynomial term.

Only the cosine circle polynomials are needed to represent the aberration function of
a rotationally symmetric system. However, both cosine and sine polynomials are needed
to represent fabrication errors, or the aberrations introduced by atmospheric turbulence. A
circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$ is m-fold symmetric. However,
its interferogram is 2m-fold symmetric. The PSF is m-fold symmetric when $m$ is odd, and
2m-fold symmetric when $m$ is even, unless $m = 0$, in which case it is radially symmetric,
like the aberration itself. These symmetry properties (along with those of the OTF) are
summarized in Table 4-6. The PSFs for two polynomial aberrations with the same $n$ and
$m$ values and the same sigma value but different angular dependence as $\cos m\theta$ and
$\sin m\theta$ are the same except that one is rotated by an angle $\pi/2m$ with respect to the
other. If two such polynomial aberrations are present simultaneously with sigma values
$a_j$ and $b_j$, then the orientation of the interferogram, PSF, and OTF changes by an angle
$(1/m)\tan^{-1}(b_j/a_j)$.
The circle polynomials for $n \le 8$ are illustrated in Figure 4-11 by an isometric plot,
an interferogram, and a PSF for a sigma value of one wave. The corresponding P-V
numbers are given in Table 4-7. The Strehl ratio for a sigma value of $0.1\lambda$ for each
polynomial aberration is given in Table 4-8 and plotted in Figure 4-12, illustrating that,
for a small aberration, its value can be estimated from the aberration variance regardless
of the aberration type.

The OTF is complex with real and imaginary parts (or MTF and PTF) for odd $m$, but
it is real for even $m$. For $m = 0$, the OTF is real and radially symmetric. The real part of
the OTF is 2m-fold symmetric whether $m$ is odd or even. However, its imaginary part is
m-fold symmetric for odd $m$, though its magnitude (i.e., if we ignore its sign) is 2m-fold
symmetric. Accordingly, the MTF is 2m-fold symmetric whether $m$ is even or odd. The
MTF for the primary aberrations and $Z_{10}$, and the real and imaginary parts of the OTF for
coma and $Z_{10}$, are given for a sigma value of 0.1 wave in Figures 4-13 and 4-14,
respectively.

The determination of the effective Seidel or primary aberration coefficients from the
corresponding coefficients of the cosine and sine polynomials is demonstrated in Section
4.10. It is emphasized that these coefficients cannot be obtained from only the primary
Zernike aberrations, but must also include the primary aberrations in the higher-order
Zernike terms. How to obtain the Zernike coefficients of a certain aberration function
when the diameter of the pupil is reduced from its nominal value is discussed in Section
4.11.

References

1. F. Zernike, “Diffraction theory of knife-edge test and its improved form, the phase
contrast method,” Mon. Not. R. Astron. Soc. 94, 377–384 (1934).

2. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am.
66, 207–211 (1976).

3. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II:
Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620
(1947).

4. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999).

5. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction
Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

6. V. N. Mahajan, “Zernike polynomials and aberration balancing,” Proc. SPIE
5173, 1–17 (2003).

7. V. N. Mahajan, “Strehl ratio for primary aberrations in terms of their aberration
variance,” J. Opt. Soc. Am. 73, 860–861 (1983).

8. Lord Rayleigh, Phil. Mag. (5) 8, 403 (1879); also in his Scientific Papers (Dover,
New York, 1964) Vol. 1, p. 432.

9. V. N. Mahajan, “Strehl ratio for primary aberrations: some analytical results for
circular and annular pupils,” J. Opt. Soc. Am. 72, 1258–1266 (1982); Errata, 10,
2092 (1993).

10. V. N. Mahajan, “Line of sight of an aberrated optical system,” J. Opt. Soc. Am. A
2, 833–846 (1985).

11. W. B. King, “Dependence of the Strehl ratio on the magnitude of the variance of
the wave aberration,” J. Opt. Soc. Am. 58, 655–661 (1968).

12. A. B. Bhatia and E. Wolf, “On the circle polynomials of Zernike and related
orthogonal sets,” Proc. Cambridge Philos. Soc. 50, 40–48 (1954).

13. V. N. Mahajan, “Symmetry properties of aberrated point-spread functions,” J.
Opt. Soc. Am. A 11, 1993–2003 (1994).

14. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular
polynomial aberrations,” Appl. Opt. 52, 2062–2074 (2013).

15. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical
Optics (SPIE Press, Bellingham, Washington, Second Printing 2001).

16. J. C. Wyant and K. Creath, “Basic wavefront aberration theory for optical
metrology,” Applied Optics and Optical Engineering, XI, 1–53 (1992). Note that
the polynomials used in this work are not in their orthonormal form, and are
ordered differently as well.

17. V. N. Mahajan and W. H. Swantner, “Seidel coefficients in optical testing,” Asian
J. Phys. 15, 203–209 (2006).

18. V. N. Mahajan, “Zernike coefficients of a scaled pupil,” Appl. Opt. 49, 5374–5377
(2010).

19. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients
of scaled pupils,” J. Microlith., Microfab., Microsyst. 5, 030501 (2006).

20. J. A. Díaz, J. Fernández-Dorado, C. Pizarro, and J. Arasa, “Zernike coefficients
for concentric circular scaled pupils: an equivalent expression,” J. Mod. Opt. 56,
149–155 (2009).

21. B. R. A. Nijboer, “The Diffraction Theory of Aberrations,” Thesis, University of
Groningen, The Netherlands (1942).
CHAPTER 5

SYSTEMS WITH ANNULAR PUPILS

5.1 Introduction
5.2 Aberration-Free Imaging
    5.2.1 PSF
    5.2.2 OTF
5.3 Strehl Ratio and Aberration Balancing
5.4 Orthonormalization of Circle Polynomials over an Annulus
5.5 Annular Polynomials
5.6 Annular Coefficients of an Annular Aberration Function
5.7 Strehl Ratio for Annular Polynomial Aberrations
5.8 Isometric, Interferometric, and Imaging Characteristics of Annular Polynomial Aberrations
5.9 Summary
References
Chapter 5
Systems with Annular Pupils
5.1 INTRODUCTION
An important example of an imaging system with a noncircular pupil is that of a
system with an annular pupil. The two-mirror astronomical telescopes represent systems
with annular pupils. Examples of such telescopes, with their linear obscuration ratios
given in parentheses, are the 200-inch telescope at Mount Palomar (0.36), the 84-inch
telescope at the Kitt Peak observatory (0.37), the telescope at the McDonald Observatory
(0.5), and the Hubble Space Telescope (0.33 when using the Wide-Field Planetary
Camera).

We start this chapter with a brief discussion of how the obscuration affects the
aberration-free PSF and OTF of a circular pupil. We then consider its effect on the Strehl
ratio of primary aberrations, their balancing, and tolerances with and without balancing.
Next we obtain the polynomials that are orthonormal over an annular pupil by
orthogonalizing the Zernike circle polynomials by the procedure outlined in Chapter 3.
The annular polynomials are given in terms of the Zernike circle polynomials, and in both
polar and Cartesian coordinates. They are also related to the balanced aberrations. The
aberrated PSFs and OTFs are illustrated for the annular polynomial aberrations.

5.2 ABERRATION-FREE IMAGING


5.2.1 PSF
Figure 5-1 illustrates a unit annular pupil with outer and inner radii of 1 and , i.e., a
pupil with a linear obscuration ratio of . Thus, if (r, q) are the coordinates of a point on
the pupil, then  £ r £ 1 and 0 £ q £ 2 p . The PSF, Strehl ratio, and the OTF of a system
with an annular pupil can be obtained from the equations given in Section 2.2 in the same
manner as for a system with a circular pupil. The significant difference lies in replacing
the lower limit 0 of the radial integration by the obscuration ratio  of the annular pupil.
Thus, Eq. (4-3) for the aberrated PSF for an aberration F(r, q; ) is replaced by

1
'

Figure 5-1. Unit annulus of obscuration ratio , representing the ratio of its inner
and outer radii.

107
108 SYSTEMS WITH ANNULAR PUPILS

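$$
I(r, \theta_i) = \frac{1}{\pi^2 (1 - \epsilon^2)^2} \left| \int_{\epsilon}^{1} \int_{0}^{2\pi} \exp[i \Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta_i - \theta)] \, \rho \, d\rho \, d\theta \right|^2 , \tag{5-1}
$$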

where $(r, \theta_i)$ are the polar coordinates of a point in the image plane, $r$ is in units of $\lambda F$,
and $F = R/D$ is the focal ratio of the image-forming light cone. The PSF is normalized to
unity at the center by the aberration-free central irradiance $\pi P_{ex} (1 - \epsilon^2) / 4 \lambda^2 F^2$. It is
smaller than the corresponding central value for a circular pupil by a factor of $(1 - \epsilon^2)^2$,
since the pupil area and the power $P_{ex}$ are each smaller by a factor of $(1 - \epsilon^2)$.
The aberration-free PSF is given by [1,2]

$$
I(r; \epsilon) = \frac{1}{(1 - \epsilon^2)^2} \left[ \frac{2 J_1(\pi r)}{\pi r} - \epsilon^2 \, \frac{2 J_1(\pi \epsilon r)}{\pi \epsilon r} \right]^2 . \tag{5-2}
$$

The effect of the obscuration $\epsilon$ is twofold. First, there is a loss of light in the image that
increases with increasing $\epsilon$. Second, the radius of the central bright spot decreases and
contains less and less light, while more and more light appears in the diffraction rings. As
$\epsilon \to 1$, the PSF approaches $J_0^2(\pi r)$, and the central bright spot radius decreases to 0.76
compared to a value of 1.22 for a circular pupil. The irradiance distribution $I$ of the PSF
and its encircled power $P$ are shown in Figure 5-2 for several typical values of the
obscuration ratio. The 2D PSF is shown in Figure 5-3 for obscuration ratios of 0.5 and
0.8. For large obscuration ratios, such as 0.8, the PSF consists of groups of diffraction
rings.
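
The shrinking of the central bright spot described above can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the function names (`annular_psf`, `first_dark_ring`) and the root bracket [0.3, 1.3] are illustrative choices, not part of the original text. It evaluates Eq. (5-2) and locates the first zero of the PSF for several obscuration ratios.

```python
import numpy as np
from scipy.special import j1
from scipy.optimize import brentq


def _airy_amplitude(x):
    """2*J1(x)/x, with its limiting value 1 at x = 0."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x == 0.0, 1.0, x)  # avoid 0/0 at the origin
    return np.where(x == 0.0, 1.0, 2.0 * j1(safe) / safe)


def annular_amplitude(r, eps):
    """Normalized aberration-free amplitude whose square is Eq. (5-2)."""
    r = np.asarray(r, dtype=float)
    num = _airy_amplitude(np.pi * r) - eps**2 * _airy_amplitude(np.pi * eps * r)
    return num / (1.0 - eps**2)


def annular_psf(r, eps):
    """Aberration-free PSF I(r; eps) of Eq. (5-2); r is in units of lambda*F."""
    return annular_amplitude(r, eps) ** 2


def first_dark_ring(eps):
    """First zero of the PSF, bracketed (by assumption) in 0.3 <= r <= 1.3."""
    return brentq(lambda r: float(annular_amplitude(r, eps)), 0.3, 1.3)


if __name__ == "__main__":
    for eps in (0.0, 0.25, 0.5, 0.75):
        print(f"eps = {eps:4.2f}: first dark ring at r = {first_dark_ring(eps):.3f}")
```

The radius returned for `eps = 0` is the familiar 1.22 of the Airy pattern, and it moves inward toward the limiting value 0.76 as the obscuration grows.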

[Figure 5-2: plot of the irradiance $I(r)$ and the encircled power $P(r_c)$ versus $r$; $r_c$ for $\epsilon$ = 0, 0.25, 0.50, and 0.75.]

Figure 5-2. The irradiance and encircled power distributions for various values of
the obscuration ratio $\epsilon$.

Figure 5-3. 2D aberration-free PSF of a system with an annular pupil having an
obscuration ratio of (a) 0.5 and (b) 0.8.

5.2.2 OTF
The aberration-free OTF, representing the Fourier transform of the corresponding
PSF given by Eq. (5-2) [3], or the fractional overlap area of two unit annular circles
separated by a distance $\lambda R \nu_i$, is given by [1,4]

$$
\tau(\nu; \epsilon) = \frac{1}{1 - \epsilon^2} \left[ \tau(\nu) + \epsilon^2 \tau(\nu / \epsilon) - \tau_{12}(\nu; \epsilon) \right] , \quad 0 \le \nu \le 1 , \tag{5-3}
$$

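where $\tau(\nu)$ is given by Eq. (4-15) and represents the OTF of the system if there were no
obscuration, $\nu = \lambda F \nu_i$ is a normalized radial spatial frequency as in the case of a circular
pupil (since the obscuration has no effect on the cutoff frequency $1/\lambda F$), and

$$
\tau_{12}(\nu; \epsilon) = 2 \epsilon^2 , \quad 0 \le \nu \le (1 - \epsilon)/2 \tag{5-4a}
$$

$$
\tau_{12}(\nu; \epsilon) = (2/\pi) \left( \theta_1 + \epsilon^2 \theta_2 - 2 \nu \sin\theta_1 \right) , \quad (1 - \epsilon)/2 \le \nu \le (1 + \epsilon)/2 \tag{5-4b}
$$

$$
\tau_{12}(\nu; \epsilon) = 0 , \quad \text{otherwise} . \tag{5-4c}
$$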

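In Eq. (5-4b), the angles $\theta_1$ and $\theta_2$ are given by

$$
\cos\theta_1 = \frac{4 \nu^2 + 1 - \epsilon^2}{4 \nu} \tag{5-5a}
$$

and

$$
\cos\theta_2 = \frac{4 \nu^2 - 1 + \epsilon^2}{4 \epsilon \nu} , \tag{5-5b}
$$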

respectively. It is evident from Eq. (5-3) that $\tau(\nu; \epsilon) > \tau(\nu)$ at least for spatial frequencies
$(1 + \epsilon)/2 < \nu < 1$ by a factor of $(1 - \epsilon^2)^{-1}$, since both $\tau(\nu/\epsilon)$ and $\tau_{12}(\nu; \epsilon)$ vanish in
this range. This is illustrated in Figure 5-4 for the same
values of $\epsilon$ as the PSFs in Figure 5-2. The OTF decreases at the low and mid spatial
frequencies and increases at the high spatial frequencies. This is the spatial frequency analog of the increased
light in the diffraction rings and a smaller central bright spot.
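
Equations (5-3) through (5-5) translate directly into a few lines of code. The sketch below is illustrative only: it assumes NumPy, it assumes that Eq. (4-15) is the standard circular-pupil OTF $(2/\pi)[\cos^{-1}\nu - \nu(1 - \nu^2)^{1/2}]$, and the function names are hypothetical. It also prints a numerical value of the radial integral $\int_0^1 \tau \, \nu \, d\nu$ next to $(1 - \epsilon^2)/8$ for comparison.

```python
import numpy as np


def otf_circular(v):
    """Circular-pupil OTF (standard result, assumed here to be Eq. (4-15))."""
    v = np.clip(np.asarray(v, dtype=float), 0.0, 1.0)  # zero beyond the cutoff
    return (2.0 / np.pi) * (np.arccos(v) - v * np.sqrt(1.0 - v**2))


def tau12(v, eps):
    """Cross term of Eqs. (5-4a)-(5-4c) with the angles of Eqs. (5-5a)-(5-5b)."""
    v = np.atleast_1d(np.asarray(v, dtype=float))
    out = np.zeros_like(v)
    lo, hi = (1.0 - eps) / 2.0, (1.0 + eps) / 2.0
    out[v <= lo] = 2.0 * eps**2
    mid = (v > lo) & (v < hi)
    vm = v[mid]
    th1 = np.arccos(np.clip((4 * vm**2 + 1 - eps**2) / (4 * vm), -1.0, 1.0))
    th2 = np.arccos(np.clip((4 * vm**2 - 1 + eps**2) / (4 * eps * vm), -1.0, 1.0))
    out[mid] = (2.0 / np.pi) * (th1 + eps**2 * th2 - 2.0 * vm * np.sin(th1))
    return out


def otf_annular(v, eps):
    """Aberration-free annular OTF of Eq. (5-3) for 0 <= eps < 1."""
    v = np.atleast_1d(np.asarray(v, dtype=float))
    if eps == 0.0:
        return otf_circular(v)
    total = otf_circular(v) + eps**2 * otf_circular(v / eps) - tau12(v, eps)
    return total / (1.0 - eps**2)


def radial_integral(eps, n=20001):
    """Trapezoidal estimate of the integral of tau(v; eps) * v over 0 <= v <= 1."""
    v = np.linspace(0.0, 1.0, n)
    y = otf_annular(v, eps) * v
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(v)))


if __name__ == "__main__":
    eps = 0.5
    print("tau(0.9; eps) / tau(0.9):", otf_annular(0.9, eps)[0] / otf_circular(0.9))
    print("radial integral:", radial_integral(eps), "vs", (1 - eps**2) / 8)
```

The ratio printed for $\nu = 0.9$ is $1/(1 - \epsilon^2)$, the high-frequency gain noted above.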

[Figure 5-4: plot of $\tau(\nu; \epsilon)$ versus $\nu$ for $\epsilon$ = 0, 0.25, 0.50, and 0.75.]

Figure 5-4. OTF of an aberration-free system with an annular pupil of obscuration
ratio $\epsilon$.

The radial integral of the aberration-free OTF is given by

$$
\int_{0}^{1} \tau(\nu; \epsilon) \, \nu \, d\nu = (1 - \epsilon^2) / 8 . \tag{5-6}
$$

Its slope at the origin is given by

$$
\tau'(0; \epsilon) = - \frac{4}{\pi (1 - \epsilon)} . \tag{5-7}
$$

5.3 STREHL RATIO AND ABERRATION BALANCING


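Letting $r = 0$ in Eq. (5-1), we obtain the Strehl ratio of an image:

$$
S \equiv I(0; \epsilon) = \frac{1}{\pi^2 (1 - \epsilon^2)^2} \left| \int_{\epsilon}^{1} \int_{0}^{2\pi} \exp[i \Phi(\rho, \theta; \epsilon)] \, \rho \, d\rho \, d\theta \right|^2 . \tag{5-8}
$$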

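The approximate value of the Strehl ratio can be obtained from the aberration variance

$$
\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 \tag{5-9}
$$

according to Eq. (1-34), where

$$
\langle \Phi^n \rangle = \left[ \pi (1 - \epsilon^2) \right]^{-1} \int_{\epsilon}^{1} \int_{0}^{2\pi} \Phi^n(\rho, \theta; \epsilon) \, \rho \, d\rho \, d\theta , \tag{5-10}
$$

with $n = 1$ and 2, respectively. Table 5-1 gives the form as well as the standard deviation
$\sigma_\Phi$ of a primary aberration.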

Table 5-1. Primary aberrations and their standard deviations for a system with a
uniformly illuminated annular pupil of obscuration ratio $\epsilon$.

Aberration                   $\Phi(\rho, \theta)$             $\sigma_\Phi$

Spherical                    $A_s \rho^4$                     $(4 - \epsilon^2 - 6\epsilon^4 - \epsilon^6 + 4\epsilon^8)^{1/2} A_s / 3\sqrt{5}$

Coma                         $A_c \rho^3 \cos\theta$          $(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2} A_c / 2\sqrt{2}$

Astigmatism                  $A_a \rho^2 \cos^2\theta$        $(1 + \epsilon^4)^{1/2} A_a / 4$

Field curvature (defocus)    $A_d \rho^2$                     $(1 - \epsilon^2) A_d / 2\sqrt{3}$

Distortion (tilt)            $A_t \rho \cos\theta$            $(1 + \epsilon^2)^{1/2} A_t / 2$
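
The standard deviations of the primary aberrations follow from Eqs. (5-9) and (5-10), and they can be cross-checked by direct quadrature over the annulus. The sketch below assumes NumPy; the helper names `annulus_mean` and `sigma` and the quadrature sizes are illustrative choices. The closed forms it compares against are those of Table 5-1 with unit aberration coefficients, with the astigmatism entry taken as $(1 + \epsilon^4)^{1/2}/4$.

```python
import numpy as np


def annulus_mean(f, eps, n_rho=32, n_theta=64):
    """Mean of f(rho, theta) over eps <= rho <= 1, as in Eq. (5-10)."""
    x, w = np.polynomial.legendre.leggauss(n_rho)       # nodes on [-1, 1]
    rho = 0.5 * (1.0 - eps) * x + 0.5 * (1.0 + eps)     # mapped to [eps, 1]
    w_rho = 0.5 * (1.0 - eps) * w
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta  # uniform rule in angle
    R, T = np.meshgrid(rho, theta, indexing="ij")
    values = f(R, T) * R                                # area element rho
    integral = np.sum(w_rho[:, None] * values) * (2.0 * np.pi / n_theta)
    return integral / (np.pi * (1.0 - eps**2))


def sigma(f, eps):
    """Standard deviation of f over the annulus, Eq. (5-9)."""
    mean = annulus_mean(f, eps)
    mean_sq = annulus_mean(lambda r, t: f(r, t) ** 2, eps)
    return np.sqrt(mean_sq - mean**2)


# (aberration with unit coefficient, closed-form sigma from Table 5-1)
PRIMARY = {
    "spherical": (lambda r, t: r**4,
                  lambda e: np.sqrt(4 - e**2 - 6*e**4 - e**6 + 4*e**8) / (3*np.sqrt(5))),
    "coma": (lambda r, t: r**3 * np.cos(t),
             lambda e: np.sqrt(1 + e**2 + e**4 + e**6) / (2*np.sqrt(2))),
    "astigmatism": (lambda r, t: r**2 * np.cos(t)**2,
                    lambda e: np.sqrt(1 + e**4) / 4),
    "defocus": (lambda r, t: r**2,
                lambda e: (1 - e**2) / (2*np.sqrt(3))),
    "tilt": (lambda r, t: r * np.cos(t),
             lambda e: np.sqrt(1 + e**2) / 2),
}

if __name__ == "__main__":
    eps = 0.5
    for name, (aberration, closed_form) in PRIMARY.items():
        print(f"{name:12s} numeric {sigma(aberration, eps):.6f}  table {closed_form(eps):.6f}")
```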

For a small aberration, we balance a classical aberration with one or more aberrations
of lower order to minimize its variance and thereby maximize the corresponding Strehl
ratio. Thus, for example, we balance spherical aberration with defocus, as in Chapter 4,
and write it as

$$
\Phi(\rho; \epsilon) = A_s \rho^4 + B_d \rho^2 . \tag{5-11}
$$

We determine the amount of defocus $B_d$ such that the variance $\sigma_\Phi^2$ is minimized; i.e., we
calculate $\sigma_\Phi^2$ and let

$$
\frac{\partial \sigma_\Phi^2}{\partial B_d} = 0 \tag{5-12}
$$

to determine $B_d$. Proceeding in this manner, we find that the optimum value is
$B_d = -(1 + \epsilon^2) A_s$. The corresponding standard deviation is $(1 - \epsilon^2)^2 A_s / 6\sqrt{5}$.
Astigmatism and coma aberrations can be treated similarly. Table 5-2 lists the form
of a balanced primary aberration and its standard deviation. Also listed in the table is the
location of the diffraction focus, i.e., the point with respect to which the aberration
variance is minimum so that the Strehl ratio at it is maximum. We note that in the case of
coma, the balancing aberration is a wavefront tilt whose amount depends on $\epsilon$. Thus,
maximum Strehl ratio is obtained at a point that is displaced from the Gaussian image
point but lies in the Gaussian image plane. In the case of astigmatism, the amount of
balancing defocus is independent of $\epsilon$. The higher-order classical aberrations can be
balanced in a similar manner.
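
The minimization of Eq. (5-12) for balanced spherical aberration can be carried out in a few lines. The following worked derivation is a sketch that uses the substitution $t = \rho^2$ and the shorthand $a = \epsilon^2$, both introduced here for convenience; with this substitution the area element $\rho \, d\rho$ becomes uniform in $t$ over $a \le t \le 1$.

```latex
% Moments of t = \rho^2, uniformly distributed over a \le t \le 1 with a = \epsilon^2:
\langle t \rangle = \frac{1+a}{2}, \quad
\langle t^2 \rangle = \frac{1+a+a^2}{3}, \quad
\langle t^3 \rangle = \frac{(1+a)(1+a^2)}{4}, \quad
\langle t^4 \rangle = \frac{1+a+a^2+a^3+a^4}{5}.

% Variance of \Phi = A_s t^2 + B_d t, Eq. (5-11):
\sigma_\Phi^2 = A_s^2 \,\mathrm{var}(t^2) + 2 A_s B_d \,\mathrm{cov}(t^2, t) + B_d^2 \,\mathrm{var}(t),
\qquad
\mathrm{var}(t) = \frac{(1-a)^2}{12}, \quad
\mathrm{cov}(t^2, t) = \frac{(1+a)(1-a)^2}{12}, \quad
\mathrm{var}(t^2) = \frac{4 - a - 6a^2 - a^3 + 4a^4}{45}.

% Setting \partial \sigma_\Phi^2 / \partial B_d = 0, Eq. (5-12):
B_d = -A_s \, \frac{\mathrm{cov}(t^2, t)}{\mathrm{var}(t)} = -(1 + \epsilon^2) A_s .

% Minimum variance:
\sigma_\Phi^2 = A_s^2 \left[ \frac{4 - a - 6a^2 - a^3 + 4a^4}{45} - \frac{(1-a^2)^2}{12} \right]
             = \frac{(1-a)^4}{180} A_s^2
\quad \Longrightarrow \quad
\sigma_\Phi = \frac{(1 - \epsilon^2)^2 A_s}{6\sqrt{5}} .
```

Setting $B_d = 0$ instead recovers the unbalanced spherical aberration entry, $\sigma_\Phi^2 = A_s^2 \, \mathrm{var}(t^2)$.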

Figure 5-5 shows how the standard deviation of an aberration, for a given value of
the aberration coefficient $A_i$, varies with the obscuration ratio of the pupil. In Figures 5-5a
and 5-5b, the amounts of defocus and tilt required to minimize the variance of
spherical aberration and coma, respectively, are also shown. We observe from these
figures that the standard deviation of spherical and balanced spherical aberrations and
defocus decreases as $\epsilon$ increases. Correspondingly, the tolerance in terms of their
aberration coefficients $A_s$ and $B_d$, for a given Strehl ratio, increases. Thus, for example,
the depth of focus for a certain value of the Strehl ratio increases as $\epsilon$ increases. The
standard deviation of coma, astigmatism, balanced astigmatism, and tilt increases as $\epsilon$
increases. The standard deviation of balanced coma first slightly increases, achieves its
maximum value at $\epsilon = 0.29$, and then decreases rapidly as $\epsilon$ increases. The factor by
which the standard deviation of an aberration is reduced by balancing it with another
aberration decreases in the case of spherical aberration, but increases in the case of coma
and astigmatism, as $\epsilon$ increases.

Table 5-2. Balanced primary aberrations, their standard deviation, and diffraction
focus.

Aberration              $\Phi(\rho, \theta; \epsilon)$   $\sigma_\Phi$   Diffraction Focus

Balanced spherical      $A_s [\rho^4 - (1 + \epsilon^2) \rho^2]$
                        $\sigma_\Phi = (1 - \epsilon^2)^2 A_s / 6\sqrt{5}$
                        Diffraction focus: $[0, 0, 8 (1 + \epsilon^2) F^2 A_s]$

Balanced coma           $A_c \left[ \rho^3 - \frac{2}{3} \frac{1 + \epsilon^2 + \epsilon^4}{1 + \epsilon^2} \rho \right] \cos\theta$
                        $\sigma_\Phi = (1 - \epsilon^2) (1 + 4\epsilon^2 + \epsilon^4)^{1/2} A_c / [6\sqrt{2} \, (1 + \epsilon^2)^{1/2}]$
                        Diffraction focus: $\left[ \frac{4 (1 + \epsilon^2 + \epsilon^4)}{3 (1 + \epsilon^2)} F A_c , 0, 0 \right]$

Balanced astigmatism    $A_a \rho^2 (\cos^2\theta - 1/2)$
                        $\sigma_\Phi = (1 + \epsilon^2 + \epsilon^4)^{1/2} A_a / 2\sqrt{6}$
                        Diffraction focus: $(0, 0, 4 F^2 A_a)$

[Figure 5-5: plots of $\sigma_\Phi / A_i$ versus $\epsilon$ for $0 \le \epsilon \le 1$. Panel (a): spherical and balanced spherical aberration, with the balancing defocus $(1 + \epsilon^2)$. Panel (b): coma and balanced coma, with the balancing tilt $2 (1 + \epsilon^2 + \epsilon^4) / 3 (1 + \epsilon^2)$. Panel (c): astigmatism and balanced astigmatism. Panel (d): defocus. Panel (e): tilt.]

Figure 5-5. Variation of standard deviation of a primary and a balanced primary
aberration with obscuration ratio $\epsilon$. Variation of balancing defocus in the case of
spherical aberration and tilt in the case of coma are also shown. (a) Spherical
aberration, (b) coma, (c) astigmatism, (d) defocus, and (e) tilt.

5.4 ORTHONORMALIZATION OF CIRCLE POLYNOMIALS OVER AN ANNULUS
The polynomials $A_j(\rho, \theta; \epsilon)$ orthonormal over a unit annulus of obscuration ratio $\epsilon$
can be obtained recursively from the Zernike circle polynomials $Z_j(\rho, \theta)$, starting with
$A_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to [5–7]

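$$
A_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} A_k \rangle A_k \right] , \tag{5-13}
$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal. The
angular brackets indicate a mean value over the annulus. Thus,

$$
\langle Z_{j+1} A_k \rangle = \frac{1}{\pi (1 - \epsilon^2)} \int_{\epsilon}^{1} \int_{0}^{2\pi} Z_{j+1} A_k \, \rho \, d\rho \, d\theta . \tag{5-14}
$$

The orthonormality of the polynomials implies that

$$
\langle A_j A_{j'} \rangle = \frac{1}{\pi (1 - \epsilon^2)} \int_{\epsilon}^{1} \int_{0}^{2\pi} A_j A_{j'} \, \rho \, d\rho \, d\theta = \delta_{jj'} . \tag{5-15}
$$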
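
The recursion of Eq. (5-13) is a Gram-Schmidt process with the annulus mean of Eq. (5-14) as the inner product. The sketch below carries it out numerically for the first eight polynomials. It assumes NumPy, it assumes the usual orthonormal forms of $Z_1$ through $Z_8$ (Chapter 4 is not reproduced here), and the function names and quadrature sizes are illustrative. The results can be compared with the closed forms $A_4 = \sqrt{3} (2\rho^2 - 1 - \epsilon^2)/(1 - \epsilon^2)$ and $A_8$ from the tables of Section 5.5.

```python
import numpy as np


def annulus_grid(eps, n_rho=24, n_theta=48):
    """Quadrature nodes (R, T) and weights W such that sum(W * f) is the annulus mean of f."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1.0 - eps) * x + 0.5 * (1.0 + eps)
    w_rho = 0.5 * (1.0 - eps) * w
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    W = w_rho[:, None] * R * (2.0 * np.pi / n_theta) / (np.pi * (1.0 - eps**2))
    return R, T, W


def zernike_circle(R, T):
    """Z1..Z8 sampled on the grid (standard orthonormal forms, assumed)."""
    return [
        np.ones_like(R),
        2 * R * np.cos(T),
        2 * R * np.sin(T),
        np.sqrt(3) * (2 * R**2 - 1),
        np.sqrt(6) * R**2 * np.sin(2 * T),
        np.sqrt(6) * R**2 * np.cos(2 * T),
        np.sqrt(8) * (3 * R**3 - 2 * R) * np.sin(T),
        np.sqrt(8) * (3 * R**3 - 2 * R) * np.cos(T),
    ]


def orthonormalize(Z, W):
    """Gram-Schmidt recursion of Eq. (5-13) using the mean of Eq. (5-14)."""
    A = []
    for z in Z:
        residual = z - sum(np.sum(W * z * a) * a for a in A)
        A.append(residual / np.sqrt(np.sum(W * residual**2)))
    return A


if __name__ == "__main__":
    eps = 0.5
    R, T, W = annulus_grid(eps)
    A = orthonormalize(zernike_circle(R, T), W)
    A4_closed = np.sqrt(3) * (2 * R**2 - 1 - eps**2) / (1 - eps**2)
    print("max |A4 - closed form| =", np.max(np.abs(A[3] - A4_closed)))
```

Because the sine and cosine terms never mix, each $A_j$ produced this way keeps the angular dependence of the corresponding $Z_j$.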

Now a circle polynomial $Z_j$ varies with angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending on
whether $j$ is even or odd. It is radially symmetric when $m = 0$. Because of the orthogonal
properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)], the
polynomials $A_k$ that contribute to the sum in Eq. (5-13) must also have the same angular
dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $A_{j+1}$ will also have
the same angular dependence. Thus, an annular polynomial $A_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its
radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the
annular polynomials can accordingly be written [1]

$$
A_{\text{even } j}(\rho, \theta; \epsilon) = \sqrt{2 (n + 1)} \, R_n^m(\rho; \epsilon) \cos m\theta , \quad m \ne 0 , \tag{5-16a}
$$

$$
A_{\text{odd } j}(\rho, \theta; \epsilon) = \sqrt{2 (n + 1)} \, R_n^m(\rho; \epsilon) \sin m\theta , \quad m \ne 0 , \tag{5-16b}
$$

$$
A_j(\rho, \theta; \epsilon) = \sqrt{n + 1} \, R_n^0(\rho; \epsilon) , \quad m = 0 , \tag{5-16c}
$$

where $n$ and $m$ are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \epsilon)$ is
an annular radial polynomial.

Substituting Eqs. (5-16a)–(5-16c) into Eq. (5-15), we find that the annular radial
polynomials obey the orthogonality condition

$$
\int_{\epsilon}^{1} R_n^m(\rho; \epsilon) \, R_{n'}^m(\rho; \epsilon) \, \rho \, d\rho = \frac{1 - \epsilon^2}{2 (n + 1)} \, \delta_{nn'} . \tag{5-17}
$$


In the two-index $n$ and $m$ representation $A_n^m(\rho, \theta; \epsilon)$ of an annular polynomial, Eq. (5-13)
can be written

$$
A_n^m = N_n^m \left[ Z_n^m - \sum_{i=1}^{(n-m)/2} \langle Z_n^m A_{n-2i}^m \rangle A_{n-2i}^m \right] , \tag{5-18}
$$

where $N_n^m$ replaces the normalization constant $N_j$ and, as in Eq. (5-13), the angular
brackets indicate a mean value over the unit annulus. Substituting Eqs. (5-16a)–(5-16c)
into Eq. (5-18), we find that the annular radial polynomials are given by

$$
R_n^m(\rho; \epsilon) = N_n^m \left[ R_n^m(\rho) - \sum_{i \ge 1}^{(n-m)/2} (n - 2i + 1) \, \langle R_n^m(\rho) R_{n-2i}^m(\rho; \epsilon) \rangle \, R_{n-2i}^m(\rho; \epsilon) \right] , \tag{5-19}
$$

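where

$$
\langle R_n^m(\rho) R_{n'}^m(\rho; \epsilon) \rangle = \frac{2}{1 - \epsilon^2} \int_{\epsilon}^{1} R_n^m(\rho) \, R_{n'}^m(\rho; \epsilon) \, \rho \, d\rho . \tag{5-20}
$$

Thus, $R_n^m(\rho; \epsilon)$ is a radial polynomial of degree $n$ in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, $\ldots$,
and $\rho^m$ with coefficients that depend on $\epsilon$. The radial polynomials are even or odd in $\rho$
depending on whether $n$ (or $m$) is even or odd.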

For $m = 0$, the annular radial polynomials are equal to the Legendre polynomials
$P_n(\cdot)$ according to

$$
R_{2n}^0(\rho; \epsilon) = P_n \left[ \frac{2 (\rho^2 - \epsilon^2)}{1 - \epsilon^2} - 1 \right] . \tag{5-21}
$$

Thus, they can be obtained from the circle radial polynomials $R_{2n}^0(\rho)$ by replacing $\rho$ with
$[(\rho^2 - \epsilon^2) / (1 - \epsilon^2)]^{1/2}$, i.e.,

$$
R_{2n}^0(\rho; \epsilon) = R_{2n}^0 \left[ \left( \frac{\rho^2 - \epsilon^2}{1 - \epsilon^2} \right)^{1/2} \right] . \tag{5-22}
$$
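Given that $R_n^n(\rho) = \rho^n$ [see Eq. (4-39)], it can be seen from Eqs. (5-17) and (5-19) that

$$
R_n^n(\rho; \epsilon) = \rho^n \Big/ \left\{ (1 - \epsilon^2)^{-1} \left[ 1 - \epsilon^{2(n+1)} \right] \right\}^{1/2} \tag{5-23a}
$$

$$
\phantom{R_n^n(\rho; \epsilon)} = \rho^n \Big/ \left( \sum_{i=0}^{n} \epsilon^{2i} \right)^{1/2} . \tag{5-23b}
$$

Moreover,

$$
R_n^{n-2}(\rho; \epsilon) = \frac{ n \rho^n - (n - 1) \left[ (1 - \epsilon^{2n}) / (1 - \epsilon^{2(n-1)}) \right] \rho^{n-2} }{ \left\{ (1 - \epsilon^2)^{-1} \left[ n^2 \left( 1 - \epsilon^{2(n+1)} \right) - (n^2 - 1) (1 - \epsilon^{2n})^2 / (1 - \epsilon^{2(n-1)}) \right] \right\}^{1/2} } . \tag{5-24}
$$

It is evident that an annular radial polynomial $R_n^n(\rho; \epsilon)$ differs from the corresponding
circle polynomial $R_n^n(\rho)$ only in its normalization. We also note that

$$
R_n^m(1; \epsilon) = 1 , \quad m = 0 \tag{5-25a}
$$

$$
R_n^m(1; \epsilon) \ne 1 , \quad m \ne 0 . \tag{5-25b}
$$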

5.5 ANNULAR POLYNOMIALS


The annular polynomials obtained from Eq. (5-13) in terms of the Zernike circle
polynomials are given in Table 5-3 [1,7]. The elements of the matrix $M$ to convert the
circle polynomials into the annular polynomials can be obtained easily from this table
according to $\{A_j\} = M \{Z_j\}$ [see Eq. (3-19)]. The nonzero elements of the matrix for
the first 15 polynomials are given in Table 5-4. The polynomial ordering, the number of
polynomials of a certain order or through a certain order $n$, and the relationships among
the indices $n$, $m$, and $j$ are the same as discussed for circle polynomials in Chapter 4. It
should be evident that an annular polynomial $A_j(\rho, \theta; \epsilon)$ reduces to the corresponding
circle polynomial $Z_j(\rho, \theta)$ as $\epsilon \to 0$. In Table 5-5, the annular polynomials are given in
the Cartesian coordinates. The variation of several annular radial polynomials with $\rho$ is
shown in Figure 5-6 for $\epsilon = 0.5$.

The annular polynomials are also unique like the circle polynomials. They not only
are orthogonal over an annular pupil but also include wavefront tilt and defocus and
balanced classical aberrations as members of the polynomial set. For example, A6 , A8 ,
and A11 represent the balanced primary aberrations of astigmatism, coma, and spherical
aberration, as may be seen by comparing their forms with those given in Table 5-2. The
annular polynomials may be referred to as the orthogonal aberrations because of their
orthogonality over the annular pupil.
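Table 5-3. Orthonormal annular polynomials $A_j(\rho, \theta; \epsilon)$ in terms of the orthonormal
Zernike circle polynomials $Z_j(\rho, \theta)$, where $\epsilon$ is the obscuration ratio of the annular
pupil.

$A_1 = Z_1$

$A_2 = (1 + \epsilon^2)^{-1/2} Z_2$

$A_3 = (1 + \epsilon^2)^{-1/2} Z_3$

$A_4 = (1 - \epsilon^2)^{-1} ( -\sqrt{3} \, \epsilon^2 Z_1 + Z_4 )$

$A_5 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} Z_5$

$A_6 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} Z_6$

$A_7 = B^{-1} [ -2\sqrt{2} \, \epsilon^4 Z_3 + (1 + \epsilon^2) Z_7 ]$

$A_8 = B^{-1} [ -2\sqrt{2} \, \epsilon^4 Z_2 + (1 + \epsilon^2) Z_8 ]$

$B = (1 - \epsilon^2) [ (1 + \epsilon^2) (1 + 4\epsilon^2 + \epsilon^4) ]^{1/2}$

$A_9 = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} Z_9$

$A_{10} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} Z_{10}$

$A_{11} = (1 - \epsilon^2)^{-2} [ \sqrt{5} \, \epsilon^2 (1 + \epsilon^2) Z_1 - \sqrt{15} \, \epsilon^2 Z_4 + Z_{11} ]$

$A_{12} = \left( \frac{1 + \epsilon^2 + \epsilon^4}{1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8} \right)^{1/2} \left( -\sqrt{15} \, \frac{\epsilon^6}{1 - \epsilon^6} Z_6 + \frac{1}{1 - \epsilon^2} Z_{12} \right)$

$A_{13} = \left( \frac{1 + \epsilon^2 + \epsilon^4}{1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8} \right)^{1/2} \left( -\sqrt{15} \, \frac{\epsilon^6}{1 - \epsilon^6} Z_5 + \frac{1}{1 - \epsilon^2} Z_{13} \right)$

$A_{14} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{-1/2} Z_{14}$

$A_{15} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{-1/2} Z_{15}$

$A_{16} = \frac{1}{(1 - \epsilon^2)^2} \left\{ \frac{\epsilon^4}{a} \left[ \sqrt{3} (3 + 4\epsilon^2 + 3\epsilon^4) Z_2 - 2\sqrt{6} (3 + \epsilon^2) Z_8 \right] + b Z_{16} \right\}$

$A_{17} = \frac{1}{(1 - \epsilon^2)^2} \left\{ \frac{\epsilon^4}{a} \left[ \sqrt{3} (3 + 4\epsilon^2 + 3\epsilon^4) Z_3 - 2\sqrt{6} (3 + \epsilon^2) Z_7 \right] + b Z_{17} \right\}$

$a = (1 + 13\epsilon^2 + 46\epsilon^4 + 46\epsilon^6 + 13\epsilon^8 + \epsilon^{10})^{1/2}$ , $b = \left( \frac{1 + 4\epsilon^2 + \epsilon^4}{1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6} \right)^{1/2}$

$A_{18} = \left( \frac{1 + \epsilon^2 + \epsilon^4 + \epsilon^6}{1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12}} \right)^{1/2} \left( \frac{-2\sqrt{6} \, \epsilon^8}{1 - \epsilon^8} Z_{10} + \frac{1}{1 - \epsilon^2} Z_{18} \right)$

$A_{19} = \left( \frac{1 + \epsilon^2 + \epsilon^4 + \epsilon^6}{1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12}} \right)^{1/2} \left( \frac{-2\sqrt{6} \, \epsilon^8}{1 - \epsilon^8} Z_9 + \frac{1}{1 - \epsilon^2} Z_{19} \right)$

$A_{20} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10})^{-1/2} Z_{20}$

$A_{21} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10})^{-1/2} Z_{21}$

$A_{22} = (1 - \epsilon^2)^{-3} [ -\sqrt{7} \, \epsilon^2 (1 + 3\epsilon^2 + \epsilon^4) Z_1 + \sqrt{21} \, \epsilon^2 (1 + 2\epsilon^2) Z_4 - \sqrt{35} \, \epsilon^2 Z_{11} + Z_{22} ]$

$A_{23} = \frac{1}{(1 - \epsilon^2)^2} \left\{ \frac{\epsilon^6}{\gamma} \left[ \sqrt{21} (2 + 3\epsilon^2 + 3\epsilon^4 + 2\epsilon^6) Z_5 - \sqrt{35} (6 + 3\epsilon^2 + \epsilon^4) Z_{13} \right] + \delta Z_{23} \right\}$

$A_{24} = \frac{1}{(1 - \epsilon^2)^2} \left\{ \frac{\epsilon^6}{\gamma} \left[ \sqrt{21} (2 + 3\epsilon^2 + 3\epsilon^4 + 2\epsilon^6) Z_6 - \sqrt{35} (6 + 3\epsilon^2 + \epsilon^4) Z_{12} \right] + \delta Z_{24} \right\}$

$\gamma = (1 + 13\epsilon^2 + 91\epsilon^4 + 339\epsilon^6 + 792\epsilon^8 + 1028\epsilon^{10} + 792\epsilon^{12} + 339\epsilon^{14} + 91\epsilon^{16} + 13\epsilon^{18} + \epsilon^{20})^{1/2}$

$\delta = \left( \frac{1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8}{1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12}} \right)^{1/2}$

$A_{25} = c \left( \frac{-\sqrt{35} \, \epsilon^{10}}{1 - \epsilon^{10}} Z_{15} + \frac{1}{1 - \epsilon^2} Z_{25} \right)$

$A_{26} = c \left( \frac{-\sqrt{35} \, \epsilon^{10}}{1 - \epsilon^{10}} Z_{14} + \frac{1}{1 - \epsilon^2} Z_{26} \right)$

$c = \left( \frac{1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8}{1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 20\epsilon^{10} + 10\epsilon^{12} + 4\epsilon^{14} + \epsilon^{16}} \right)^{1/2}$

$A_{27} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{-1/2} Z_{27}$

$A_{28} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{-1/2} Z_{28}$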
It is evident from Eq. (5-13) that each annular polynomial is a linear combination of
the circle polynomials, without any mixing of the cosine and the sine terms. Similarly,
because of the same angular dependence of an annular polynomial $A_j(\rho, \theta; \epsilon)$ as the
corresponding circle polynomial $Z_j(\rho, \theta)$, each radial polynomial $R_n^m(\rho; \epsilon)$ can be written
as a linear combination of the polynomials $R_n^m(\rho)$, $R_{n-2}^m(\rho)$, etc. This, of course, is also
evident from Eq. (5-19). For example,

$$
R_3^1(\rho; \epsilon) = \frac{1}{B} \left[ (1 + \epsilon^2) R_3^1(\rho) - 2 \epsilon^4 R_1^1(\rho) \right] , \tag{5-26}
$$

where

$$
B = (1 - \epsilon^2) \left[ (1 + \epsilon^2) (1 + 4\epsilon^2 + \epsilon^4) \right]^{1/2} , \tag{5-27}
$$

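Table 5-4. Nonzero elements of a 15 × 15 conversion matrix $M$ for obtaining the
annular polynomials $A_j(\rho, \theta; \epsilon)$ from the Zernike circle polynomials $Z_j(\rho, \theta)$.

$M_{11} = 1$

$M_{22} = (1 + \epsilon^2)^{-1/2} = M_{33}$

$M_{41} = -\sqrt{3} \, \epsilon^2 (1 - \epsilon^2)^{-1}$

$M_{44} = (1 - \epsilon^2)^{-1}$

$M_{55} = (1 + \epsilon^2 + \epsilon^4)^{-1/2} = M_{66}$

$M_{73} = -2\sqrt{2} \, \epsilon^4 / B = M_{82}$

$M_{77} = (1 + \epsilon^2) / B = M_{88}$

$B = (1 - \epsilon^2) [ (1 + \epsilon^2) (1 + 4\epsilon^2 + \epsilon^4) ]^{1/2}$

$M_{99} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} = M_{10,10}$

$M_{11,1} = \sqrt{5} \, \epsilon^2 (1 + \epsilon^2) (1 - \epsilon^2)^{-2}$

$M_{11,4} = -\sqrt{15} \, \epsilon^2 (1 - \epsilon^2)^{-2}$

$M_{11,11} = (1 - \epsilon^2)^{-2}$

$M_{12,6} = -\sqrt{15} \, \frac{\epsilon^6}{1 - \epsilon^6} \left( \frac{1 + \epsilon^2 + \epsilon^4}{1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8} \right)^{1/2} = M_{13,5}$

$M_{12,12} = \frac{1}{1 - \epsilon^2} \left( \frac{1 + \epsilon^2 + \epsilon^4}{1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8} \right)^{1/2} = M_{13,13}$

$M_{14,14} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{-1/2} = M_{15,15}$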

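Table 5-5. Orthonormal annular polynomials $A_j(x, y; \epsilon)$ in Cartesian coordinates
$(x, y)$, where $x = \rho \cos\theta$, $y = \rho \sin\theta$, and $\epsilon \le \rho = (x^2 + y^2)^{1/2} \le 1$.

Poly.    $A_j(x, y; \epsilon)$

$A_1$    $1$

$A_2$    $2x / (1 + \epsilon^2)^{1/2}$

$A_3$    $2y / (1 + \epsilon^2)^{1/2}$

$A_4$    $\sqrt{3} \, (2\rho^2 - 1 - \epsilon^2) / (1 - \epsilon^2)$

$A_5$    $2\sqrt{6} \, xy / (1 + \epsilon^2 + \epsilon^4)^{1/2}$

$A_6$    $\sqrt{6} \, (x^2 - y^2) / (1 + \epsilon^2 + \epsilon^4)^{1/2}$

$A_7$    $\sqrt{8} \, y [3 (1 + \epsilon^2) \rho^2 - 2 (1 + \epsilon^2 + \epsilon^4)] / \{ (1 - \epsilon^2) [(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2} \}$

$A_8$    $\sqrt{8} \, x [3 (1 + \epsilon^2) \rho^2 - 2 (1 + \epsilon^2 + \epsilon^4)] / \{ (1 - \epsilon^2) [(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2} \}$

$A_9$    $\sqrt{8} \, y (3x^2 - y^2) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}$

$A_{10}$   $\sqrt{8} \, x (x^2 - 3y^2) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}$

$A_{11}$   $\sqrt{5} \, [6\rho^4 - 6 (1 + \epsilon^2) \rho^2 + (1 + 4\epsilon^2 + \epsilon^4)] / (1 - \epsilon^2)^2$

$A_{12}$   $\sqrt{10} \, (x^2 - y^2) [4\rho^2 - 3 (1 - \epsilon^8)/(1 - \epsilon^6)] / \{ (1 - \epsilon^2)^{-1} [16 (1 - \epsilon^{10}) - 15 (1 - \epsilon^8)^2 / (1 - \epsilon^6)] \}^{1/2}$

$A_{13}$   $2\sqrt{10} \, xy [4\rho^2 - 3 (1 - \epsilon^8)/(1 - \epsilon^6)] / \{ (1 - \epsilon^2)^{-1} [16 (1 - \epsilon^{10}) - 15 (1 - \epsilon^8)^2 / (1 - \epsilon^6)] \}^{1/2}$

$A_{14}$   $\sqrt{10} \, (\rho^4 - 8x^2 y^2) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{1/2}$

$A_{15}$   $4\sqrt{10} \, xy (x^2 - y^2) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{1/2}$

$A_{16}$   $\sqrt{12} \, x [10 (1 + 4\epsilon^2 + \epsilon^4) \rho^4 - 12 (1 + 4\epsilon^2 + 4\epsilon^4 + \epsilon^6) \rho^2 + 3 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8)] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + \epsilon^4)(1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)]^{1/2} \}$

$A_{17}$   $\sqrt{12} \, y [10 (1 + 4\epsilon^2 + \epsilon^4) \rho^4 - 12 (1 + 4\epsilon^2 + 4\epsilon^4 + \epsilon^6) \rho^2 + 3 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8)] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + \epsilon^4)(1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)]^{1/2} \}$

$A_{18}$   $\sqrt{12} \, x (x^2 - 3y^2) [5\rho^2 - 4 (1 - \epsilon^{10})/(1 - \epsilon^8)] / \{ (1 - \epsilon^2)^{-1} [25 (1 - \epsilon^{12}) - 24 (1 - \epsilon^{10})^2 / (1 - \epsilon^8)] \}^{1/2}$

$A_{19}$   $\sqrt{12} \, y (3x^2 - y^2) [5\rho^2 - 4 (1 - \epsilon^{10})/(1 - \epsilon^8)] / \{ (1 - \epsilon^2)^{-1} [25 (1 - \epsilon^{12}) - 24 (1 - \epsilon^{10})^2 / (1 - \epsilon^8)] \}^{1/2}$

$A_{20}$   $\sqrt{12} \, x (16x^4 - 20x^2 \rho^2 + 5\rho^4) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10})^{1/2}$

$A_{21}$   $\sqrt{12} \, y (16y^4 - 20y^2 \rho^2 + 5\rho^4) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10})^{1/2}$

$A_{22}$   $\sqrt{7} \, [20\rho^6 - 30 (1 + \epsilon^2) \rho^4 + 12 (1 + 3\epsilon^2 + \epsilon^4) \rho^2 - (1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)] / (1 - \epsilon^2)^3$

$A_{23}$   $2\sqrt{14} \, xy [15 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8) \rho^4 - 20 (1 + 4\epsilon^2 + 10\epsilon^4 + 10\epsilon^6 + 4\epsilon^8 + \epsilon^{10}) \rho^2 + 6 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12})] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8)(1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12})]^{1/2} \}$

$A_{24}$   $\sqrt{14} \, (x^2 - y^2) [15 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8) \rho^4 - 20 (1 + 4\epsilon^2 + 10\epsilon^4 + 10\epsilon^6 + 4\epsilon^8 + \epsilon^{10}) \rho^2 + 6 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12})] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8)(1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12})]^{1/2} \}$

$A_{25}$   $4\sqrt{14} \, xy (x^2 - y^2) [6\rho^2 - 5 (1 - \epsilon^{12})/(1 - \epsilon^{10})] / \{ (1 - \epsilon^2)^{-1} [36 (1 - \epsilon^{14}) - 35 (1 - \epsilon^{12})^2 / (1 - \epsilon^{10})] \}^{1/2}$

$A_{26}$   $\sqrt{14} \, (8x^4 - 8x^2 \rho^2 + \rho^4) [6\rho^2 - 5 (1 - \epsilon^{12})/(1 - \epsilon^{10})] / \{ (1 - \epsilon^2)^{-1} [36 (1 - \epsilon^{14}) - 35 (1 - \epsilon^{12})^2 / (1 - \epsilon^{10})] \}^{1/2}$

$A_{27}$   $\sqrt{14} \, xy (32x^4 - 32x^2 \rho^2 + 6\rho^4) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{1/2}$

$A_{28}$   $\sqrt{14} \, (32x^6 - 48x^4 \rho^2 + 18x^2 \rho^4 - \rho^6) / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{1/2}$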
[Figure 5-6: plots of $R_n^m(\rho; \epsilon)$ versus $\rho$ for $0.5 \le \rho \le 1$. Panel (a): $R_n^0$ for $n$ = 2, 4, 6, 8. Panel (b): $R_n^1$ for $n$ = 1, 3, 5, 7. Panel (c): $R_n^2$ for $n$ = 2, 4, 6.]

Figure 5-6. Variation of an annular radial polynomial $R_n^m(\rho; \epsilon)$ with $\rho$ for $\epsilon = 0.5$.
(a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.

and

$$
R_4^0(\rho; \epsilon) = (1 - \epsilon^2)^{-2} \left[ R_4^0(\rho) - 3 \epsilon^2 R_2^0(\rho) + \epsilon^2 (1 + \epsilon^2) R_0^0(\rho) \right] . \tag{5-28}
$$

The radial annular polynomials $R_n^m(\rho; \epsilon)$ for $n \le 8$ are listed in Table 5-6. Table 5-7 lists
the full annular polynomials, illustrating their ordering.

5.6 ANNULAR COEFFICIENTS OF AN ANNULAR ABERRATION FUNCTION


The aberration function $W(\rho, \theta; \epsilon)$ across a unit annulus with an obscuration ratio $\epsilon$
can be expanded in terms of $J$ annular polynomials $A_j(\rho, \theta; \epsilon)$ in the form

$$
W(\rho, \theta; \epsilon) = \sum_{j=1}^{J} a_j A_j(\rho, \theta; \epsilon) , \quad 0 \le \epsilon < 1 , \quad \epsilon \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{5-29}
$$

where $a_j$ is an annular expansion coefficient of the polynomial $A_j$. Multiplying both
sides of Eq. (5-29) by $A_j(\rho, \theta; \epsilon)$, integrating over the unit annulus, and using the
orthonormality Eq. (5-15), we obtain the annular expansion coefficients:

$$
a_j = \frac{1}{\pi (1 - \epsilon^2)} \int_{\epsilon}^{1} \int_{0}^{2\pi} W(\rho, \theta; \epsilon) \, A_j(\rho, \theta; \epsilon) \, \rho \, d\rho \, d\theta . \tag{5-30}
$$

The mean and the mean square values of the aberration function are given by

$$
\langle W(\rho, \theta; \epsilon) \rangle = a_1 \tag{5-31}
$$

and

$$
\langle W^2(\rho, \theta; \epsilon) \rangle = \sum_{j=1}^{J} a_j^2 . \tag{5-32}
$$

The variance of the aberration function is accordingly given by

$$
\sigma_W^2 = \langle W^2(\rho, \theta; \epsilon) \rangle - \langle W(\rho, \theta; \epsilon) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{5-33}
$$

As explained in Section 3.3, the annular expansion coefficients yield a least-squares fit of
the aberration function with $J$ polynomials.
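
As an illustration of Eqs. (5-30) through (5-33), the sketch below expands one unit of classical spherical aberration, $W = \rho^4$, in the annular polynomials $A_1$, $A_4$, and $A_{11}$ and compares $a_4^2 + a_{11}^2$ with the variance obtained from Table 5-1. It assumes NumPy; the helper names and the test function $W$ are illustrative choices, and the closed forms of $A_4$ and $A_{11}$ are those of Table 5-5 written in terms of $\rho$.

```python
import numpy as np


def annulus_grid(eps, n_rho=24, n_theta=48):
    """Quadrature nodes (R, T) and weights W such that sum(W * f) is the annulus mean of f."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1.0 - eps) * x + 0.5 * (1.0 + eps)
    w_rho = 0.5 * (1.0 - eps) * w
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    W = w_rho[:, None] * R * (2.0 * np.pi / n_theta) / (np.pi * (1.0 - eps**2))
    return R, T, W


def annular_basis(R, eps):
    """A1, A4, and A11 (piston, defocus, primary spherical) sampled on the grid."""
    A1 = np.ones_like(R)
    A4 = np.sqrt(3) * (2 * R**2 - 1 - eps**2) / (1 - eps**2)
    A11 = np.sqrt(5) * (6 * R**4 - 6 * (1 + eps**2) * R**2
                        + 1 + 4 * eps**2 + eps**4) / (1 - eps**2) ** 2
    return {1: A1, 4: A4, 11: A11}


def expand(W_vals, basis, weights):
    """Annular expansion coefficients a_j of Eq. (5-30)."""
    return {j: float(np.sum(weights * W_vals * A)) for j, A in basis.items()}


if __name__ == "__main__":
    eps = 0.5
    R, T, weights = annulus_grid(eps)
    W_vals = R**4                      # one unit of classical spherical aberration
    basis = annular_basis(R, eps)
    a = expand(W_vals, basis, weights)
    variance = a[4] ** 2 + a[11] ** 2  # Eq. (5-33), omitting the piston term
    table = (4 - eps**2 - 6 * eps**4 - eps**6 + 4 * eps**8) / 45
    print("coefficients:", a)
    print("variance from coefficients:", variance, " from Table 5-1:", table)
```

The coefficient $a_{11}$ equals $(1 - \epsilon^2)^2 / 6\sqrt{5}$, the standard deviation of balanced spherical aberration, which is one way of seeing that $A_{11}$ represents spherical aberration balanced with defocus.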
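Table 5-6. Annular radial polynomials $R_n^m(\rho; \epsilon)$, where $\epsilon$ is the obscuration ratio
and $\epsilon \le \rho \le 1$.

n  m  $R_n^m(\rho; \epsilon)$

0  0  $1$

1  1  $\rho / (1 + \epsilon^2)^{1/2}$

2  0  $(2\rho^2 - 1 - \epsilon^2) / (1 - \epsilon^2)$

2  2  $\rho^2 / (1 + \epsilon^2 + \epsilon^4)^{1/2}$

3  1  $[3 (1 + \epsilon^2) \rho^3 - 2 (1 + \epsilon^2 + \epsilon^4) \rho] / \{ (1 - \epsilon^2) [(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2} \}$

3  3  $\rho^3 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}$

4  0  $[6\rho^4 - 6 (1 + \epsilon^2) \rho^2 + 1 + 4\epsilon^2 + \epsilon^4] / (1 - \epsilon^2)^2$

4  2  $\{ 4\rho^4 - 3 [(1 - \epsilon^8)/(1 - \epsilon^6)] \rho^2 \} / \{ (1 - \epsilon^2)^{-1} [16 (1 - \epsilon^{10}) - 15 (1 - \epsilon^8)^2 / (1 - \epsilon^6)] \}^{1/2}$

4  4  $\rho^4 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8)^{1/2}$

5  1  $[10 (1 + 4\epsilon^2 + \epsilon^4) \rho^5 - 12 (1 + 4\epsilon^2 + 4\epsilon^4 + \epsilon^6) \rho^3 + 3 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8) \rho] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + \epsilon^4)(1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)]^{1/2} \}$

5  3  $\{ 5\rho^5 - 4 [(1 - \epsilon^{10})/(1 - \epsilon^8)] \rho^3 \} / \{ (1 - \epsilon^2)^{-1} [25 (1 - \epsilon^{12}) - 24 (1 - \epsilon^{10})^2 / (1 - \epsilon^8)] \}^{1/2}$

5  5  $\rho^5 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10})^{1/2}$

6  0  $[20\rho^6 - 30 (1 + \epsilon^2) \rho^4 + 12 (1 + 3\epsilon^2 + \epsilon^4) \rho^2 - (1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)] / (1 - \epsilon^2)^3$

6  2  $[15 (1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8) \rho^6 - 20 (1 + 4\epsilon^2 + 10\epsilon^4 + 10\epsilon^6 + 4\epsilon^8 + \epsilon^{10}) \rho^4 + 6 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12}) \rho^2] / \{ (1 - \epsilon^2)^2 [(1 + 4\epsilon^2 + 10\epsilon^4 + 4\epsilon^6 + \epsilon^8)(1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12})]^{1/2} \}$

6  4  $\{ 6\rho^6 - 5 [(1 - \epsilon^{12})/(1 - \epsilon^{10})] \rho^4 \} / \{ (1 - \epsilon^2)^{-1} [36 (1 - \epsilon^{14}) - 35 (1 - \epsilon^{12})^2 / (1 - \epsilon^{10})] \}^{1/2}$

6  6  $\rho^6 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{1/2}$

7  1  $a_7^1 \rho^7 + b_7^1 \rho^5 + c_7^1 \rho^3 + d_7^1 \rho$

7  3  $a_7^3 \rho^7 + b_7^3 \rho^5 + c_7^3 \rho^3$

7  5  $\{ 7\rho^7 - 6 [(1 - \epsilon^{14})/(1 - \epsilon^{12})] \rho^5 \} / \{ (1 - \epsilon^2)^{-1} [49 (1 - \epsilon^{16}) - 48 (1 - \epsilon^{14})^2 / (1 - \epsilon^{12})] \}^{1/2}$

7  7  $\rho^7 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12} + \epsilon^{14})^{1/2}$

8  0  $[70\rho^8 - 140 (1 + \epsilon^2) \rho^6 + 30 (3 + 8\epsilon^2 + 3\epsilon^4) \rho^4 - 20 (1 + 6\epsilon^2 + 6\epsilon^4 + \epsilon^6) \rho^2 + e_8^0] / (1 - \epsilon^2)^4$

8  2  $a_8^2 \rho^8 + b_8^2 \rho^6 + c_8^2 \rho^4 + d_8^2 \rho^2$

8  4  $a_8^4 \rho^8 + b_8^4 \rho^6 + c_8^4 \rho^4$

8  6  $a_8^6 \rho^8 + b_8^6 \rho^6$

8  8  $\rho^8 / (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12} + \epsilon^{14} + \epsilon^{16})^{1/2}$

The coefficients appearing in the entries for $n$ = 7 and 8 are as follows.

$a_7^1 = 35 (1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6) / A_7^1$

$b_7^1 = -60 (1 + 9\epsilon^2 + 15\epsilon^4 + 9\epsilon^6 + \epsilon^8) / A_7^1$

$c_7^1 = 30 (1 + 9\epsilon^2 + 25\epsilon^4 + 25\epsilon^6 + 9\epsilon^8 + \epsilon^{10}) / A_7^1$

$d_7^1 = -4 (1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12}) / A_7^1$

$A_7^1 = (1 - \epsilon^2)^3 (1 + 9\epsilon^2 + 9\epsilon^4 + \epsilon^6)^{1/2} (1 + 16\epsilon^2 + 36\epsilon^4 + 16\epsilon^6 + \epsilon^8)^{1/2}$

$a_7^3 = 21 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12}) / A_7^3$

$b_7^3 = -30 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 20\epsilon^8 + 10\epsilon^{10} + 4\epsilon^{12} + \epsilon^{14}) / A_7^3$

$c_7^3 = 10 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 20\epsilon^{10} + 10\epsilon^{12} + 4\epsilon^{14} + \epsilon^{16}) / A_7^3$

$A_7^3 = (1 - \epsilon^2)^2 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 10\epsilon^8 + 4\epsilon^{10} + \epsilon^{12})^{1/2} \times (1 + 9\epsilon^2 + 45\epsilon^4 + 165\epsilon^6 + 270\epsilon^8 + 270\epsilon^{10} + 165\epsilon^{12} + 45\epsilon^{14} + 9\epsilon^{16} + \epsilon^{18})^{1/2}$

$e_8^0 = 1 + 16\epsilon^2 + 36\epsilon^4 + 16\epsilon^6 + \epsilon^8$

$a_8^2 = 56 (1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12}) / A_8^2$

$b_8^2 = -105 (1 + 9\epsilon^2 + 45\epsilon^4 + 85\epsilon^6 + 85\epsilon^8 + 45\epsilon^{10} + 9\epsilon^{12} + \epsilon^{14}) / A_8^2$

$c_8^2 = 60 (1 + 9\epsilon^2 + 45\epsilon^4 + 115\epsilon^6 + 150\epsilon^8 + 115\epsilon^{10} + 45\epsilon^{12} + 9\epsilon^{14} + \epsilon^{16}) / A_8^2$

$d_8^2 = -10 (1 + 9\epsilon^2 + 45\epsilon^4 + 165\epsilon^6 + 270\epsilon^8 + 270\epsilon^{10} + 165\epsilon^{12} + 45\epsilon^{14} + 9\epsilon^{16} + \epsilon^{18}) / A_8^2$

$A_8^2 = (1 - \epsilon^2)^3 (1 + 9\epsilon^2 + 45\epsilon^4 + 65\epsilon^6 + 45\epsilon^8 + 9\epsilon^{10} + \epsilon^{12})^{1/2} \times (1 + 16\epsilon^2 + 136\epsilon^4 + 416\epsilon^6 + 626\epsilon^8 + 416\epsilon^{10} + 136\epsilon^{12} + 16\epsilon^{14} + \epsilon^{16})^{1/2}$

$a_8^4 = 28 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 20\epsilon^{10} + 10\epsilon^{12} + 4\epsilon^{14} + \epsilon^{16}) / A_8^4$

$b_8^4 = -42 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 35\epsilon^{10} + 20\epsilon^{12} + 10\epsilon^{14} + 4\epsilon^{16} + \epsilon^{18}) / A_8^4$

$c_8^4 = 15 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 56\epsilon^{10} + 35\epsilon^{12} + 20\epsilon^{14} + 10\epsilon^{16} + 4\epsilon^{18} + \epsilon^{20}) / A_8^4$

$A_8^4 = (1 - \epsilon^2)^2 (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 20\epsilon^{10} + 10\epsilon^{12} + 4\epsilon^{14} + \epsilon^{16})^{1/2} \times (1 + 9\epsilon^2 + 45\epsilon^4 + 165\epsilon^6 + 495\epsilon^8 + 846\epsilon^{10} + 994\epsilon^{12} + 846\epsilon^{14} + 495\epsilon^{16} + 165\epsilon^{18} + 45\epsilon^{20} + 9\epsilon^{22} + \epsilon^{24})^{1/2}$

$a_8^6 = 8 (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12}) / A_8^6$

$b_8^6 = -7 (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12} + \epsilon^{14}) / A_8^6$

$A_8^6 = (1 - \epsilon^2) (1 + \epsilon^2 + \epsilon^4 + \epsilon^6 + \epsilon^8 + \epsilon^{10} + \epsilon^{12})^{1/2} \times (1 + 4\epsilon^2 + 10\epsilon^4 + 20\epsilon^6 + 35\epsilon^8 + 56\epsilon^{10} + 84\epsilon^{12} + 56\epsilon^{14} + 35\epsilon^{16} + 20\epsilon^{18} + 10\epsilon^{20} + 4\epsilon^{22} + \epsilon^{24})^{1/2}$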

Table 5-7. Orthonormal annular polynomials A j (r, q; ) , ordered in the same


manner as the circle polynomials in Table 4-3.

j n m A j (r, q; ) Aberration Name*

1 0 0 R00 (r; ) = 1 Piston

2 1 1 2 R11 (r; ) cos q x-tilt

3 1 1 2 R11 (r; )sin q y-tilt

4 2 0 3 R20 (r; ) Defocus

5 2 2 6 R22 (r; )sin 2q Primary astigmatism at 45∞

6 2 2 6 R22 (r; ) cos 2q Primary astigmatism at 0∞

7 3 1 8R31 (r; )sin q Primary y-coma

8 3 1 8R31 (r; ) cos q Primary x-coma

9 3 3 8 R33 (r; )sin 3q

10 3 3 8 R33 (r; ) cos 3q

11 4 0 5 R40 (r; ) Primary spherical

12 4 2 10 R42 (r; ) cos 2q Secondary astigmatism at 0∞

13 4 2 10 R42 (r; )sin 2q Secondary astigmatism at 45∞

14 4 4 10 R44 (r; ) cos 4q

15 4 4 10 R44 (r; )sin 4q

16 5 1 12 R51 (r; ) cos q Secondary x-coma

17 5 1 12 R51 (r; )sin q Secondary y-coma

18 5 3 12 R53 (r; ) cos 3q

19 5 3 12 R53 (r; )sin 3q

20 5 5 12 R55 (r; ) cos 5q

21 5 5 12 R55 (r; )sin 5q

* The words “orthonormal annular” should be added to the name, e.g., orthonormal
annular primary spherical aberration.
128 SYSTEMS WITH ANNULAR PUPILS

Table 5-7. Orthonormal annular polynomials A j (r, q; ) , ordered in the same


manner as the circle polynomials in Table 4-3. (Cont.)

j n m A j (r, q; ) Aberration Name*

22 6 0 7 R60 (r; ) Secondary spherical

23 6 2 14 R62 (r; )sin 2q Tertiary astigmatism at 45∞

24 6 2 14 R62 (r; ) cos 2q Tertiary astigmatism at 0∞

25 6 4 14 R64 (r; ) cos 4q

26 6 4 14 R64 (r; )sin 4q

27 6 6 14 R66 (r; )sin 6q

28 6 6 14 R66 (r; ) cos 6q

29 7 1 4R17 (r; ) sin q

30 7 1 4R17 (r; ) cos q

31 7 3 4 R73 (r; ) cos 3q

32 7 3 4 R73 (r; ) cos 3q

33 7 5 4 R75 (r; ) sin 5q

34 7 5 4 R75 (r; ) cos 5q

35 7 7 4 R77 (r; ) sin 7q

36 7 7 4 R77 (r; ) cos 7q

37 8 0 3R80 (r; ) Tertiary spherical aberration

38 8 2 18 R82 (r; ) cos 2q

39 8 2 18 R82 (r; ) sin 2q

40 8 4 18 R84 (r; ) cos 4q

41 8 4 18 R84 (r; ) sin 4q

42 8 6 18 R86 (r; ) cos 6q

43 8 6 18 R86 (r; ) sin 6q

44 8 8 18 R88 (r; ) cos 8q

45 8 8 18 R88 (r; ) sin 8q

* The words “orthonormal annular” should be added to the name, e.g., orthonormal
annular primary spherical aberration.
5.7 Strehl Ratio for Annular Polynomial Aberrations 129

5.7 STREHL RATIO FOR ANNULAR POLYNOMIAL ABERRATIONS


The Strehl ratio for an annular polynomial aberration with a sigma value of 0.1 wave
is listed in Table 5-8 and plotted in Figure 5-7. For the wavefront tilt polynomials $A_2$ and $A_3$,
the Strehl ratio simply represents the PSF value at a displaced point along the x or the y
axis, respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.358 \lambda F$.
A closed-form expression for the Strehl ratio for the annular defocus polynomial can be
obtained from Eq. (5-8) by letting

$$
\Phi(\rho, \theta) = a_4 A_4(\rho) . \tag{5-34}
$$

The result obtained is

$$
S = \left[ \frac{\sin \left( \sqrt{3} \, a_4 \right)}{\sqrt{3} \, a_4} \right]^2 . \tag{5-35}
$$

For a defocus aberration sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement
with the result given in Table 5-8. Although Eq. (5-35) reads exactly the same as Eq. (4-82)
for a circular pupil, the longitudinal defocus for a given value of $a_4$ is different for
the annular pupil [see Eq. (5-37)].
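
The closed form of Eq. (5-35) can be compared with a direct numerical evaluation of Eq. (5-8). The sketch below assumes NumPy; the function names and the number of quadrature nodes are illustrative. Because the defocus polynomial is radially symmetric, the angular integral reduces to a factor of $2\pi$.

```python
import numpy as np


def strehl_defocus_closed(a4):
    """Eq. (5-35): S = [sin(sqrt(3) a4) / (sqrt(3) a4)]^2, with a4 in radians."""
    x = np.sqrt(3.0) * a4
    return 1.0 if x == 0.0 else float((np.sin(x) / x) ** 2)


def strehl_defocus_numeric(a4, eps, n_rho=64):
    """Eq. (5-8) evaluated by Gauss-Legendre quadrature for Phi = a4 * A4(rho)."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1.0 - eps) * x + 0.5 * (1.0 + eps)
    w_rho = 0.5 * (1.0 - eps) * w
    A4 = np.sqrt(3.0) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2)
    integral = 2.0 * np.pi * np.sum(w_rho * np.exp(1j * a4 * A4) * rho)
    return float(abs(integral) ** 2 / (np.pi**2 * (1 - eps**2) ** 2))


if __name__ == "__main__":
    a4 = 0.2 * np.pi  # sigma of 0.1 wave expressed in radians
    print("closed form      :", strehl_defocus_closed(a4))
    for eps in (0.0, 0.5, 0.8):
        print(f"numeric, eps={eps}:", strehl_defocus_numeric(a4, eps))
    print("exp(-sigma^2)    :", np.exp(-a4**2))
```

The numerical value does not depend on $\epsilon$, which reflects the statement that Eq. (5-35) has the same form as for a circular pupil; only the longitudinal defocus corresponding to a given $a_4$ changes.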

If the defocus aberration is introduced by making an observation in a plane at a
distance $z$ instead of the Gaussian image plane at a distance $R$, the longitudinal defocus is
$z - R$, and the aberration may be written in the form

$$
W(\rho) = B_d \rho^2 , \tag{5-36}
$$

where $B_d$ represents its peak value given by Eq. (4-19). The annular coefficient $a_4$ is
related to the longitudinal defocus $z - R$ according to

$$
a_4 = \frac{\pi}{8 \sqrt{3} \, \lambda F^2} \left( 1 - \epsilon^2 \right) (z - R) . \tag{5-37}
$$

A positive value of defocus aberration is introduced when an observation is made at a
distance $z < R$.

The results in Table 5-8 and Figure 5-7 illustrate that the Strehl ratio for a small
aberration is nearly independent of the type of the aberration, and depends primarily on
its sigma value. It is approximately given by Eq. (1-34) as $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.

Table 5-8. Strehl ratio $S$ for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma
value of 0.1 wave.

Poly. S Poly. S Poly. S

A1 1 A16 0.675 A31 0.673

A2 0.661 A17 0.675 A32 0.673

A3 0.661 A18 0.669 A33 0.672

A4 0.663 A19 0.669 A34 0.672

A5 0.665 A20 0.681 A35 0.691

A6 0.665 A21 0.681 A36 0.691

A7 0.670 A22 0.668 A37 0.670

A8 0.670 A23 0.674 A38 0.678

A9 0.670 A24 0.674 A39 0.678

A10 0.670 A25 0.670 A40 0.672

A11 0.666 A26 0.670 A41 0.672

A12 0.669 A27 0.686 A42 0.675

A13 0.669 A28 0.686 A43 0.675

A14 0.675 A29 0.678 A44 0.696

A15 0.675 A30 0.678 A45 0.696


Figure 5-7. Strehl ratio for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma
value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.

5.8 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ANNULAR POLYNOMIAL ABERRATIONS

As in the case of circle polynomials (see Section 4.8), we illustrate the annular
polynomials for $n \le 8$ in three different but equivalent ways in Figure 5-8 for $\epsilon = 0.5$ and
a sigma value of one wave [8]. For each polynomial, the isometric plot at the top
illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is
shown on the right. The peak-to-valley aberration numbers
(in units of wavelength) are given in Table 5-8. From Eqs. (5-16) for the form of the
polynomials, it is evident that the P-V numbers of two polynomials with the same values
of $n$ and $m$ are the same. This may also be seen from Table 5-7.

The PSF plots represent the images of a point object in the presence of an annular
polynomial aberration. Thus, for example, piston yields the aberration-free PSF (since it
has no effect on the PSF) given by Eq. (5-2). The full width of a square displaying the
PSFs in Figure 5-8 is $24 \lambda F$.

The polynomial aberrations $A_2$ and $A_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $4a_2\lambda/[D(1+\epsilon^2)^{1/2}]$ about the y axis and displaces the PSF along the
x axis by $4a_2\lambda F/(1+\epsilon^2)^{1/2}$. Similarly, $a_3$ corresponds to a wavefront tilt angle of
$4a_3\lambda/[D(1+\epsilon^2)^{1/2}]$ about the x axis and displaces the PSF by $4a_3\lambda F/(1+\epsilon^2)^{1/2}$ along the y
axis. As the order of a polynomial aberration increases, the interferograms and the PSFs
become more and more complex.
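The tilt relations above can be checked with a few lines of Python. This is only a sketch: the function name and its arguments are hypothetical, and it assumes that D is the pupil diameter and F the focal ratio, as the text implies.

```python
import math

def tilt_from_coefficient(a_waves, wavelength, diameter, f_number, eps):
    """Return (tilt angle in radians, PSF displacement in the units of wavelength).

    a_waves is the tilt coefficient a_2 or a_3 in units of wavelength and
    eps is the obscuration ratio of the annular pupil.
    """
    scale = 4.0 * a_waves / math.sqrt(1.0 + eps ** 2)
    angle = scale * wavelength / diameter      # wavefront tilt angle
    shift = scale * wavelength * f_number      # PSF displacement in the image plane
    return angle, shift

# Illustrative numbers only: one wave of x tilt, 0.5 um light, 0.1 m pupil, F = 10.
angle, shift = tilt_from_coefficient(1.0, 0.5e-6, 0.1, 10.0, 0.5)
```

For one wave of tilt and an obscuration ratio of 0.5, the displacement in units of $\lambda F$ is $4/\sqrt{1.25}$, which is the same number as the peak-to-valley value listed for $A_2$ and $A_3$ in Table 5-9.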

The 3D MTF plots for the primary polynomial aberrations and $A_{10}$ are shown
in Figure 5-9 for a sigma value of 0.1 wave. The contour plots shown below each 3D
MTF figure are in steps of 0.1 from the center out, starting with a value of 0.9 and ending
with zero. The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes)
MTF plots are also shown in this figure, i.e., for the spatial frequency vector along the x
axis, y axis, and at 45° from the x axis, respectively. Figure 5-10a shows the symmetry of
the real and the imaginary parts of the OTF for the orthogonal primary coma $A_8$. The real
part has even symmetry, but the imaginary part has odd symmetry. The real and
imaginary parts of the OTF for the polynomial aberration $A_{10}$ are shown in Figure 5-10b.
Since the aberration is 3-fold symmetric, the imaginary part of the OTF is 3-fold
symmetric, but the real part is 6-fold symmetric, as expected.

Comparing the form of the annular polynomials with those of the circle polynomials
given in Chapter 4, it is easy to see that the symmetry properties of the interferograms,
PSFs, real and imaginary parts of the OTF and the MTFs aberrated by an annular
polynomial aberration are the same as those for a corresponding circle polynomial
aberration in a circular pupil. These properties are summarized in Table 4-6.
Figure 5-8. Annular polynomials $A_1$ through $A_{45}$ shown as isometric plot on the top, interferogram
on the left, and PSF on the right for $\epsilon = 0.5$ and a sigma value of one wave.
Table 5-9. Peak-to-valley (P-V) numbers in units of wavelength of orthonormal
annular polynomials for $\epsilon = 0.5$ and a sigma value of one wave.

Poly. P-V # Poly. P-V # Poly. P-V #

A1 0 A16 6.626 A31 7.206

A2 3.578 A17 6.626 A32 7.206

A3 3.578 A18 6.094 A33 6.944

A4 3.464 A19 6.094 A34 6.944

A5 4.276 A20 6.001 A35 6.928

A6 4.276 A21 6.001 A36 6.928

A7 5.285 A22 5.292 A37 4.286

A8 5.285 A23 6.916 A38 7.138

A9 4.909 A24 6.916 A39 7.138

A10 4.909 A25 6.520 A40 7.510

A11 3.354 A26 6.520 A41 7.510

A12 5.679 A27 6.481 A42 7.354

A13 5.679 A28 6.481 A43 7.354

A14 5.480 A29 7.329 A44 7.348

A15 5.480 A30 7.329 A45 7.348


Figure 5-9. 3D, tangential or along x axis (in long dashes), sagittal or along y axis (in
medium dashes), and at 45° from the x axis (in small dashes) MTF plots for annular
polynomial aberrations with a sigma value of 0.1 wave for $\epsilon = 0.5$. The panels show
$A_1$ (piston), $A_4$ (defocus), $A_6$ (primary astigmatism), $A_8$ (primary coma), $A_{10}$, and
$A_{11}$ (primary spherical). The solid curve represents the aberration-free MTF. The spatial
frequency $v$ is normalized by the cutoff frequency $1/\lambda F$. The contour plots below each
3D MTF plot are in steps of 0.1 from the center out, starting with 0.9 and ending with zero.
Figure 5-10. Real (Re) and imaginary (Im) parts of the OTF for an annular polynomial
aberration with a sigma value of 0.1 wave for $\epsilon = 0.5$. (a) $A_8$ (primary coma) shows
the even and odd symmetry of the real and imaginary parts. (b) $A_{10}$ shows the 6-fold
symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to
their even and odd symmetry, respectively. The thick and thin contours of the
imaginary part represent its positive and negative values, respectively.

5.9 SUMMARY
A brief description of the aberration-free PSF and OTF of a system with an annular
pupil is given in Section 5.2, followed by a discussion of the Strehl ratio and
aberration balancing for such a system in Section 5.3. The variation of the standard
deviation of a primary aberration with the obscuration ratio is shown in Figure 5-5. It is
evident, for example, from Figure 5-5d that the standard deviation of the defocus
aberration decreases, and the depth of focus accordingly increases, as the obscuration
increases.

The annular polynomials orthonormal over an annular pupil, obtained by
orthonormalizing the Zernike circle polynomials, are given in Table 5-3 in terms of the
circle polynomials. This form is useful for comparing the expansions of an annular
wavefront in terms of the annular and circle polynomials, as discussed in Chapter 12. The
nonzero elements of a $15 \times 15$ conversion matrix for obtaining the annular polynomials
from the circle polynomials are given in Table 5-4. The annular polynomials are given in
Cartesian coordinates in Table 5-5 for numerical analyses of annular wavefronts. The
radial annular polynomials for $n \le 8$ are given in Table 5-6. The ordering of the annular
polynomials in Table 5-7 is the same as that for the circle polynomials in Table 4-3.

The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration polynomial is given in
Table 5-8 and illustrated in Figure 5-7. It shows that, for a small aberration, the Strehl
ratio can be estimated from the aberration variance. The annular polynomials for $n \le 8$
are illustrated by an isometric plot, an interferogram, and a PSF in Figure 5-8 for $\epsilon = 0.5$
and a sigma value of one wave. Their peak-to-valley numbers are given in Table 5-9 in
units of wavelength. The 3D MTFs are shown in Figure 5-9 for the primary and $A_{10}$
polynomial aberrations. The tangential, sagittal, and 45° MTF plots are also shown in
Figure 5-9 for the orthogonal primary coma, i.e., for the spatial frequency vector along
the x axis, y axis, and at 45° from the x axis, respectively. The real and imaginary parts of
the OTFs are shown in Figure 5-10 for the $A_8$ and $A_{10}$ polynomial aberrations that have
odd values of m.

The symmetry properties of an interferogram, PSF, and real and imaginary parts of
the OTF and MTF aberrated by an annular polynomial aberration are the same as those
for a corresponding circle polynomial aberration in a circular pupil. These properties are
summarized in Table 4-6.
References

1. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

2. H. F. Tschunko, “Imaging performance of annular apertures,” Appl. Opt. 18, 1820–1823 (1974).

3. E. L. O’Neill, “Transfer function for an annular aperture,” J. Opt. Soc. Am. 46, 285–288 (1956). Note that a term of $-2\eta^2$ is missing in the second of O’Neill’s Eq. (26), as was pointed out by the author in an Errata on p. 1096 in the Dec 1956 issue. Unfortunately, the obscuration ratio $\eta$ in the original paper was typed incorrectly as n in the Errata.

4. W. H. Steel, “Étude des effets combinés des aberrations et d’une obturation centrale de la pupille sur le contraste des images optiques,” Rev. Opt. (Paris) 32, 143–178 (1953).

5. V. N. Mahajan, “Zernike annular polynomials and optical aberrations of systems with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).

6. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular pupils,” J. Opt. Soc. Am. 71, 75–85 (1981); 71, 1408 (1981); 1, 685 (1984).

7. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, (McGraw Hill, 2009), pp. 11.3–11.41.

8. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular polynomial aberrations,” Appl. Opt. 52, 1–13 (2013).
CHAPTER 6

SYSTEMS WITH GAUSSIAN PUPILS

6.1 Introduction
6.2 Gaussian Pupil
6.3 Aberration-Free Imaging
    6.3.1 PSF
    6.3.2 Optimum Gaussian Radius
    6.3.3 OTF
6.4 Strehl Ratio and Aberration Balancing
6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil
6.6 Gaussian Circle Polynomials Representing Balanced Primary Aberrations for a Gaussian Circular Pupil
6.7 Weakly Truncated Gaussian Pupils
6.8 Aberration Coefficients of a Gaussian Circular Aberration Function
6.9 Orthonormalization of Annular Polynomials over a Gaussian Annular Pupil
6.10 Gaussian Annular Polynomials Representing Balanced Primary Aberrations for a Gaussian Annular Pupil
6.11 Aberration Coefficients of a Gaussian Annular Aberration Function
6.12 Summary
References

6.1 INTRODUCTION
In this chapter, we consider optical systems with Gaussian apodization or Gaussian
pupils, i.e., those with a Gaussian amplitude across the wavefront at their exit pupils,
which may be circular or annular [1,2]. The discussion in this chapter is equally
applicable to imaging systems with a Gaussian transmission (obtained, for example, by
placing a Gaussian filter at its exit pupil) as well as laser transmitters in which the laser
beam has a Gaussian distribution at its exit pupil. It is evident that whereas a Gaussian
function extends to infinity, the pupil of an optical system can only have a finite diameter.
The net effect is that the finite size of the pupil truncates the infinite-extent Gaussian
function. If the Gaussian function is very narrow (i.e., its standard deviation is very small)
compared to the radius of the pupil, it is said to be weakly truncated. In such cases, the
truncation can be neglected, and the pupil can be assumed to be infinitely wide.

The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF for a Gaussian pupil is higher for low spatial frequencies, and lower for the high.
In these respects, the effect of a Gaussian illumination is opposite to that of a central
obscuration in an annular pupil. The diffraction rings practically disappear when the pupil
radius is twice the Gaussian radius, and the beam propagates as a Gaussian everywhere.
The OTF in this case is also described by a Gaussian function.

The standard deviation of a primary aberration over a Gaussian pupil is calculated
and shown to be smaller than its corresponding value for a uniform pupil. This is due to
the fact that the wave amplitude decreases as a function of the radial distance from the
center of the pupil while the aberration increases, i.e., the amplitude is smaller where the
aberration is larger. Accordingly, the Strehl ratio for a Gaussian pupil for a given amount
of a primary aberration is higher than that for a uniform pupil, or the aberration tolerance
for a given Strehl ratio is higher for a Gaussian pupil. The balanced primary aberrations
with minimum variance are also obtained, and the diffraction focus for various values of
the truncation ratio are given. The Gaussian polynomials orthonormal over a Gaussian
pupil are obtained by orthogonalizing the circle polynomials over such a pupil. As
expected, the Gaussian polynomials for primary aberrations represent balanced
aberrations. Similarly, the orthonormal Gaussian annular polynomials are obtained by
orthogonalizing the annular polynomials over a Gaussian pupil. Again, the primary
Gaussian annular polynomials represent the balanced aberrations for a Gaussian annular
pupil. The isometric, interferometric, and imaging characteristics of the Gaussian circular
and annular polynomial aberrations are not discussed because of their similarity with
those of the corresponding circle or annular polynomial aberrations for uniform pupils.


6.2 GAUSSIAN PUPIL


The pupil function for a system with a Gaussian pupil of radius a may be written [1]

$$P(\rho, \theta) = A(\rho)\exp[i\Phi(\rho, \theta)] , \tag{6-1}$$

where

$$A(\rho) = A_0\exp(-\gamma\rho^2) . \tag{6-2}$$

Here $A_0$ is a constant that is determined from the total power in the pupil and

$$\gamma = (a/\omega)^2 , \tag{6-3}$$

where the quantity $\omega$, called the Gaussian radius, represents the radial distance from the
center of the pupil at which the amplitude drops to $e^{-1}$ of the amplitude at the center. The
pupil radius a normalized by the Gaussian radius $\omega$, i.e., $\sqrt{\gamma} = a/\omega$, is called the
truncation ratio. The larger the value of $\sqrt{\gamma}$ is, the narrower the Gaussian beam is. A
uniform beam is represented by the limiting case of $\gamma \to 0$. The aberration function
$\Phi(\rho, \theta)$ represents the phase aberration at a point $(\rho, \theta)$ in the plane of the exit pupil,
where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The amplitude $A_0$ at its center is determined from
the total power in the pupil.

A Gaussian pupil is obtained when a Gaussian laser beam illuminates a pupil or
when a uniform beam illuminates the pupil with a Gaussian transmission. In the former
case, the total power incident on the pupil and that exiting from it are given by

$$P_{inc} = 2A_0^2 S_{ex}\int_0^{\infty}\exp(-2\gamma\rho^2)\,\rho\,d\rho = \frac{A_0^2 S_{ex}}{2\gamma} \tag{6-4}$$

and

$$P_{ex} = 2A_0^2 S_{ex}\int_0^{1}\exp(-2\gamma\rho^2)\,\rho\,d\rho = A_0^2\,(S_{ex}/2\gamma)\,[1 - \exp(-2\gamma)] , \tag{6-5}$$

respectively. The fractional transmitted power that goes on to the image is given by

$$P_{trans} = P_{ex}/P_{inc} = 1 - \exp(-2\gamma) . \tag{6-6}$$


More and more power is transmitted as the beam becomes narrower and narrower, i.e., as
$\omega$ decreases or $\gamma$ increases. The pupil irradiance $A^2(\rho)$ in units of $P_{ex}/S_{ex}$ may be
written

$$I(\rho) = 2\gamma\exp(-2\gamma\rho^2)\,/\,[1 - \exp(-2\gamma)] . \tag{6-7}$$

The pupil in the latter case, where an amplitude filter is placed in the pupil plane, is
said to be apodized. The power incident in this case is $P_{inc} = A_0^2 S_{ex}$. The power exiting
from the pupil is again given by Eq. (6-5), but the fractional transmitted power is given
by

$$P_{trans} = P_{ex}/P_{inc} = \frac{1 - \exp(-2\gamma)}{2\gamma} . \tag{6-8}$$

In this case, the transmitted power decreases as $\gamma$ increases.
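The two transmission cases of Eqs. (6-6) and (6-8), and the pupil irradiance of Eq. (6-7), can be sketched as follows. The function names and the "laser"/"filter" labels are hypothetical choices made for this illustration, not notation from the text.

```python
import math

def transmitted_fraction(gamma, source="laser"):
    """Fractional transmitted power P_ex / P_inc for a Gaussian pupil.

    source="laser":  Gaussian beam incident on a clear pupil, Eq. (6-6).
    source="filter": uniform beam incident on a Gaussian filter, Eq. (6-8).
    """
    if gamma < 0.0:
        raise ValueError("gamma must be non-negative")
    one_minus_exp = -math.expm1(-2.0 * gamma)   # 1 - exp(-2*gamma), accurate for small gamma
    if source == "laser":
        return one_minus_exp
    if source == "filter":
        return 1.0 if gamma == 0.0 else one_minus_exp / (2.0 * gamma)
    raise ValueError("source must be 'laser' or 'filter'")

def pupil_irradiance(rho, gamma):
    """Pupil irradiance of Eq. (6-7) in units of P_ex / S_ex."""
    if gamma == 0.0:
        return 1.0                               # uniform-pupil limit
    return 2.0 * gamma * math.exp(-2.0 * gamma * rho ** 2) / -math.expm1(-2.0 * gamma)
```

The two branches move in opposite directions as $\gamma$ grows, which is the contrast drawn in the text: a narrower laser beam passes more of its power through the pupil, whereas a narrower Gaussian filter absorbs more of a uniform beam.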

6.3 ABERRATION-FREE IMAGING

6.3.1 PSF
Substituting Eq. (6-2) into Eq. (2-4), the irradiance distribution in the image plane in
units of $P_{ex}S_{ex}/\lambda^2R^2$ may be written

$$I(r, \theta_i; \gamma) = \pi^{-2}\left|\int_0^1\!\!\int_0^{2\pi}\sqrt{I(\rho)}\,\exp[-\pi i r\rho\cos(\theta_i - \theta)]\,\rho\,d\rho\,d\theta\right|^2 , \tag{6-9}$$

or, carrying out the angular integration,

$$I(r; \gamma) = 4\left[\int_0^1\sqrt{I(\rho)}\,J_0(\pi r\rho)\,\rho\,d\rho\right]^2 , \tag{6-10}$$

where $I(\rho)$ is the pupil irradiance given by Eq. (6-7). Letting $r = 0$ in Eq. (6-10), we obtain the central value

$$I(0; \gamma) = \tanh(\gamma/2)\,/\,(\gamma/2) . \tag{6-11}$$

For large values of $\gamma$, a pupil is said to be weakly truncated. For such a pupil,

$$I(0; \gamma) \to 2/\gamma . \tag{6-12}$$

The fractional power in the image plane contained in a circle of radius $r_c$ is given by

$$P(r_c; \gamma) = (\pi^2/2)\int_0^{r_c} I(r; \gamma)\,r\,dr , \tag{6-13}$$

where $r_c$ is in units of $\lambda F$.
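A minimal numerical sketch of Eq. (6-10) follows, using only the Python standard library. The Bessel function is evaluated from its integral representation, and the radial integral uses Simpson's rule; both quadrature choices and all function names are assumptions made for this illustration.

```python
import math

def bessel_j0(x, n=64):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, evaluated by the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) / n

def pupil_amplitude(rho, gamma):
    """Square root of the pupil irradiance of Eq. (6-7)."""
    if gamma == 0.0:
        return 1.0
    return math.sqrt(2.0 * gamma / -math.expm1(-2.0 * gamma)) * math.exp(-gamma * rho ** 2)

def psf(r, gamma, steps=200):
    """Aberration-free PSF of Eq. (6-10); r is in units of lambda*F, steps must be even."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps + 1):
        rho = k * h
        weight = 1.0 if k in (0, steps) else (4.0 if k % 2 else 2.0)   # Simpson weights
        total += weight * pupil_amplitude(rho, gamma) * bessel_j0(math.pi * r * rho) * rho
    return 4.0 * (total * h / 3.0) ** 2

def central_value(gamma):
    """Closed-form central value of Eq. (6-11)."""
    return 1.0 if gamma == 0.0 else math.tanh(gamma / 2.0) / (gamma / 2.0)
```

Evaluating `psf(0.0, gamma)` and comparing it with `central_value(gamma)` gives a quick consistency check between Eqs. (6-10) and (6-11).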

Figure 6-1 shows the image-plane irradiance and encircled-power distributions for
$\sqrt{\gamma} = 0$, 1, 2, and 3. It is evident that the Gaussian illumination reduces the central value
and broadens the central bright spot, but reduces the power in the diffraction rings. For
example, when $\gamma = 1$, the central value is 0.924 compared to a value of 1 for a uniform
beam. Moreover, the central bright spot has a radius of 1.43 and contains 95.5% of the
total power compared to a radius of 1.22 containing 83.8% of the power for a uniform
beam. The diffraction rings practically disappear for $\gamma \ge 4$, and the beam propagates as a
Gaussian everywhere.

6.3.2 Optimum Gaussian Radius

For a given total beam power $P_{inc}$ incident on a pupil of fixed radius a, the
transmitted power $P_{ex}$ increases as $\omega$ decreases, but the corresponding central irradiance
in the image plane decreases. Hence, there is an optimum value of $\omega$ that yields the
maximum central value. To determine this value, we write the central irradiance given by
Eq. (6-11) in units of $P_{inc}S_{ex}/\lambda^2R^2$:

$$I(0; \gamma) = [1 - \exp(-2\gamma)]\,\frac{\tanh(\gamma/2)}{\gamma/2} = \frac{2}{\gamma}\,[1 - \exp(-\gamma)]^2 . \tag{6-14}$$

Figure 6-1. PSF $I(r)$ and encircled power $P(r_c)$ for a Gaussian pupil with $\sqrt{\gamma} = 0$, 1, 2, and 3.
The irradiance is in units of $P_{ex}S_{ex}/\lambda^2R^2$, and the encircled power is in units of $P_{ex}$.
$r$ and $r_c$ are in units of $\lambda F$.


Letting

$$\frac{\partial I(0; \gamma)}{\partial\gamma} = 0 , \tag{6-15}$$

we find that $I(0; \gamma)$ is maximum and equal to 0.8145 when $\gamma = 1.255$ or $\omega = 0.893a$.
The corresponding irradiance at the edge of the pupil is 8.1%, and the transmitted power
$P_{trans}$ is 91.87%. Figure 6-2 shows how $I(0; \gamma)$ varies with $\sqrt{\gamma}$.
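The optimum can be located numerically from Eq. (6-14). The sketch below uses a golden-section search, which is a choice made here for illustration rather than the method of the text; the function names and the search bracket are likewise assumptions.

```python
import math

def central_irradiance_fixed_input(gamma):
    """Central irradiance of Eq. (6-14) in units of P_inc * S_ex / (lambda^2 R^2)."""
    return 2.0 * (-math.expm1(-gamma)) ** 2 / gamma

def optimum_gamma(lo=0.1, hi=5.0, tol=1e-10):
    """Golden-section search for the gamma that maximizes Eq. (6-14) on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if central_irradiance_fixed_input(c) > central_irradiance_fixed_input(d):
            hi = d            # the maximum lies in [lo, d]
        else:
            lo = c            # the maximum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

gamma_opt = optimum_gamma()
omega_over_a = 1.0 / math.sqrt(gamma_opt)            # Gaussian radius in units of the pupil radius
edge_irradiance = math.exp(-2.0 * gamma_opt)         # relative irradiance at the pupil edge
transmitted = 1.0 - math.exp(-2.0 * gamma_opt)       # Eq. (6-6)
```

The search should settle close to the values quoted above; small differences in the last digit of $\gamma$ and $\omega/a$ can arise from rounding.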

6.3.3 OTF
From Eq. (2-13), the OTF for an aberration-free Gaussian pupil is given by

$$\tau(\vec{v}_i; \gamma) = P_{ex}^{-1}\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\,\vec{v}_i)\,d\vec{r}_p \tag{6-16}$$

in the pupil coordinate system $(x_p, y_p)$. Let the spatial frequency vector $\vec{v}_i$ with its
Cartesian components $(\xi, \eta)$ make an angle $\phi$ with the $x_p$ axis, as illustrated in Figure 6-3.
It is convenient to write the autocorrelation integral in a $(p, q)$ coordinate system
whose axes are rotated by an angle $\phi$ with respect to the $(x_p, y_p)$ system (so that the p
axis lies along the direction of the spatial frequency vector $\vec{v}_i$) and whose origin lies at a
distance $\lambda R v_i/2$ from that of the $(x_p, y_p)$ system along the p axis. If we further let the
$(p, q)$ coordinates be normalized by the pupil radius a and the spatial frequency $v_i$ be
normalized by the cutoff spatial frequency $1/\lambda F$, the OTF can be written

Figure 6-2. Variation of $I(0; \gamma)$ normalized by $P_{inc}S_{ex}/\lambda^2R^2$ as a function of $\sqrt{\gamma}$,
showing that its value is maximum when $\sqrt{\gamma} = 1.120$ or $\omega = 0.893a$.

Figure 6-3. Geometry for evaluating the OTF. The centers of the two pupils are
located at $(0, 0)$ and $\lambda R(\xi, \eta)$ in the $(x_p, y_p)$ coordinate system and $\mp(\lambda R/2)(v_i, 0)$
in the $(p, q)$ coordinate system, where $v_i = (\xi^2 + \eta^2)^{1/2}$ and $\phi = \tan^{-1}(\eta/\xi)$. The
shaded area is the overlap area of the two pupils. When normalized by the pupil
radius a, the centers of the two pupils of unity radius lie at $\mp v$ along the p axis.

$$\tau(v; \gamma) = (a^2/P_{ex})\iint A(p + v, q)\,A(p - v, q)\,dp\,dq , \quad 0 \le v \le 1 . \tag{6-17}$$

Substituting for the amplitude $A(\rho)$ from Eq. (6-2) and for the power $P_{ex}$ from Eq. (6-5)
into Eq. (6-17), we obtain

$$\tau(v; \gamma) = \frac{8\gamma\exp(-2\gamma v^2)}{\pi[1 - \exp(-2\gamma)]}\int_0^{\sqrt{1 - v^2}} dq\int_0^{\sqrt{1 - q^2} - v}\exp[-2\gamma(p^2 + q^2)]\,dp , \tag{6-18}$$

where the integration is over a quadrant of the overlap region of two pupils whose centers
are separated by a distance v along the p axis. For large values of $\gamma$ (e.g., $\gamma \ge 4$), the
contribution to the integral in Eq. (6-18) is negligible unless $v = 0$, in which case it
represents the Gaussian-weighted area of a quadrant of the pupil, and the equation
reduces to

$$\tau(v; \gamma) = \exp(-2\gamma v^2) , \quad 0 \le v \le 1 . \tag{6-19}$$
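Equation (6-18) can be evaluated numerically as sketched below. The inner integral over p is done in closed form with the error function, and the outer integral over q uses the midpoint rule; these choices, the function name, and the step count are assumptions of this illustration.

```python
import math

def gaussian_otf(v, gamma, steps=2000):
    """Aberration-free OTF of a Gaussian pupil from Eq. (6-18).

    v is the spatial frequency normalized by the cutoff 1/(lambda*F).
    """
    if not 0.0 <= v <= 1.0:
        return 0.0
    q_max = math.sqrt(1.0 - v * v)
    h = q_max / steps
    total = 0.0
    for k in range(steps):
        q = (k + 0.5) * h
        p_max = math.sqrt(1.0 - q * q) - v          # upper limit of the inner integral
        if gamma == 0.0:
            inner = p_max
        else:
            s = math.sqrt(2.0 * gamma)
            # integral_0^p_max exp(-2*gamma*p^2) dp in closed form
            inner = 0.5 * math.sqrt(math.pi) / s * math.erf(s * p_max)
        total += math.exp(-2.0 * gamma * q * q) * inner
    total *= h
    if gamma == 0.0:
        prefactor = 4.0 / math.pi                   # limit of the prefactor as gamma -> 0
    else:
        prefactor = 8.0 * gamma / (math.pi * -math.expm1(-2.0 * gamma))
    return prefactor * math.exp(-2.0 * gamma * v * v) * total
```

Two limits serve as checks. For $\gamma = 0$ the result should reproduce the uniform circular-pupil OTF, $(2/\pi)[\cos^{-1}v - v(1 - v^2)^{1/2}]$, and for large $\gamma$ it should approach the Gaussian of Eq. (6-19).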

Figure 6-4 shows how the OTF varies with v for several values of $\gamma$. We note that
compared to a uniform pupil (i.e., for $\gamma = 0$), the OTF of a Gaussian pupil is higher for
low spatial frequencies, and lower for the high. Moreover, as $\gamma$ increases, the bandwidth
of low frequencies for which the OTF is higher decreases and the OTF at high
frequencies becomes increasingly smaller. This is due to the fact that the Gaussian
weighting across the overlap region of two pupils whose centers are separated by small
values of v is higher than that for large values of v. If we consider an apodization such
that the amplitude increases from the center toward the edge of the pupil, then the OTF is
lower for low frequencies and higher for the high. Thus, unlike aberrations, which reduce
the MTF of a system at all frequencies within its passband, the amplitude variations can
increase or decrease the MTF at any of those frequencies.

Figure 6-4. The OTF $\tau(v; \gamma)$ of a Gaussian pupil. A uniform pupil corresponds to $\gamma = 0$,
and a large value of $\gamma$ represents a weakly truncated pupil.

6.4 STREHL RATIO AND ABERRATION BALANCING

From Eq. (2-22), the Strehl ratio (representing the ratio of the central irradiances with
and without aberration) for a Gaussian pupil is given by [1–3]

$$S = \left|\int_0^1\!\!\int_0^{2\pi} A(\rho)\exp[i\Phi(\rho, \theta)]\,\rho\,d\rho\,d\theta\right|^2 \Big/ \left[\int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta\right]^2$$

$$\;\; = \left\{\frac{\gamma}{\pi[1 - \exp(-\gamma)]}\right\}^2\left|\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\exp[i\Phi(\rho, \theta)]\,\rho\,d\rho\,d\theta\right|^2 . \tag{6-20}$$

For small aberrations, the Strehl ratio is approximately given by

$$S \simeq \exp(-\sigma_\Phi^2) , \tag{6-21}$$

where

$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 \tag{6-22}$$

is the variance of the phase aberration across the Gaussian-amplitude weighted pupil. The
mean and the mean square values of the aberration are obtained from the expression

$$\langle\Phi^n\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)[\Phi(\rho, \theta)]^n\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$$

$$\;\; = \frac{\gamma}{\pi[1 - \exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)[\Phi(\rho, \theta)]^n\,\rho\,d\rho\,d\theta , \tag{6-23}$$

with $n = 1$ and 2, respectively. The angular brackets indicate a mean value over the
Gaussian pupil.
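As an illustration of how Eqs. (6-20) and (6-21) compare, the sketch below treats a pure defocus aberration $\Phi = B_d\rho^2$ (in radians), for which the integrals reduce to elementary closed forms after the substitution $t = \rho^2$. The choice of aberration and the function names are assumptions made for this example.

```python
import cmath
import math

def strehl_defocus(b_d, gamma):
    """Exact Strehl ratio of Eq. (6-20) for Phi = b_d * rho**2 (radians)."""
    z = complex(gamma, -b_d)
    # integral_0^1 exp(-z t) dt with t = rho^2; the factor pi cancels in the ratio
    num = 1.0 if z == 0 else (1.0 - cmath.exp(-z)) / z
    den = 1.0 if gamma == 0.0 else -math.expm1(-gamma) / gamma
    return abs(num / den) ** 2

def defocus_sigma(b_d, gamma):
    """Standard deviation of b_d * rho**2 over the Gaussian-weighted pupil, Eqs. (6-22)-(6-23)."""
    if gamma == 0.0:
        return abs(b_d) / math.sqrt(12.0)
    e = math.exp(-gamma)
    i0 = (1.0 - e) / gamma                                        # integral of exp(-gamma t)
    i1 = (1.0 - (1.0 + gamma) * e) / gamma ** 2                   # integral of t exp(-gamma t)
    i2 = (2.0 - (gamma ** 2 + 2.0 * gamma + 2.0) * e) / gamma ** 3  # integral of t^2 exp(-gamma t)
    mean, mean_sq = i1 / i0, i2 / i0
    return abs(b_d) * math.sqrt(mean_sq - mean ** 2)

exact = strehl_defocus(0.5, 1.0)
approx = math.exp(-defocus_sigma(0.5, 1.0) ** 2)
```

For a small coefficient such as the 0.5 rad used here, `exact` and `approx` should agree closely, which is the content of Eq. (6-21).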

Table 6-1 lists the primary aberrations and their standard deviations for increasing
values of $\gamma$. It is evident that the standard deviation of an aberration decreases as $\gamma$
increases. This is due to the fact that while an aberration increases as $\rho$ increases, the
amplitude decreases more and more rapidly as $\gamma$ increases, thus reducing its effect more
and more compared to that for a uniform pupil. Accordingly, for a given small amount of
aberration $A_i$, the Strehl ratio for a Gaussian pupil is higher than that for a uniform pupil.
Similarly, the aberration tolerance for a given Strehl ratio is higher for a Gaussian pupil.
Its approximate value can be obtained from Eq. (6-21).

Table 6-1. Primary aberrations and their standard deviations for optical systems
with Gaussian pupils. For comparison, the results for a uniform pupil ($\gamma = 0$) are
also given.

| Primary Aberration | $\sigma_\Phi$ ($\sqrt{\gamma} = 0$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical, $A_s\rho^4$ | $2A_s/(3\sqrt{5}) = A_s/3.35$ | $A_s/3.67$ | $A_s/6.20$ | $2\sqrt{5}A_s/\gamma^2$ |
| Coma, $A_c\rho^3\cos\theta$ | $A_c/(2\sqrt{2}) = A_c/2.83$ | $A_c/3.33$ | $A_c/6.08$ | $\sqrt{3}A_c/\gamma^{3/2}$ |
| Astigmatism, $A_a\rho^2\cos^2\theta$ | $A_a/4$ | $A_a/4.40$ | $A_a/6.59$ | $A_a/(\sqrt{2}\gamma)$ |
| Defocus, $B_d\rho^2$ | $B_d/(2\sqrt{3}) = B_d/3.46$ | $B_d/3.55$ | $B_d/4.79$ | $B_d/\gamma$ |
| Tilt, $B_t\rho\cos\theta$ | $B_t/2$ | $B_t/2.19$ | $B_t/2.94$ | $B_t/\sqrt{2\gamma}$ |
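The entries of Table 6-1 follow from Eq. (6-23) once the radial moments $\langle\rho^{2k}\rangle$ are known. The sketch below computes those moments by Simpson's rule in the variable $t = \rho^2$ and assembles the standard deviations for unit aberration coefficients; the quadrature and the function names are assumptions of this illustration.

```python
import math

def radial_moment(k, gamma, steps=2000):
    """<rho^(2k)> over the unit pupil with weight exp(-gamma*rho^2), via Simpson's rule in t = rho^2."""
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        e = w * math.exp(-gamma * t)
        num += e * t ** k
        den += e
    return num / den

def primary_sigmas(gamma):
    """Standard deviations of the primary aberrations for unit coefficients."""
    m1, m2, m3, m4 = (radial_moment(k, gamma) for k in (1, 2, 3, 4))
    return {
        "spherical": math.sqrt(m4 - m2 ** 2),                         # rho^4
        "coma": math.sqrt(m3 / 2.0),                                  # rho^3 cos(theta), zero mean
        "astigmatism": math.sqrt(3.0 * m2 / 8.0 - m1 ** 2 / 4.0),     # rho^2 cos^2(theta)
        "defocus": math.sqrt(m2 - m1 ** 2),                           # rho^2
        "tilt": math.sqrt(m1 / 2.0),                                  # rho cos(theta), zero mean
    }

# Reciprocals give the divisors tabulated in Table 6-1, e.g. for sqrt(gamma) = 1:
divisors = {name: 1.0 / s for name, s in primary_sigmas(1.0).items()}
```

Note that the columns of the table are labeled by $\sqrt{\gamma}$, so the third column corresponds to calling `primary_sigmas(4.0)`.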

Since the Strehl ratio depends on the aberration variance, we balance a given
aberration with lower-order aberrations to minimize its variance. Thus, we balance
spherical aberration and astigmatism with defocus aberration, and coma with tilt
aberration to minimize their variance. The balanced primary aberrations thus obtained are
listed in Table 6-2. For example, the defocus aberration that balances spherical aberration
is given by $B_d/A_s = -1$, $-0.933$, and $-4/\gamma$ when $\sqrt{\gamma} = 0$, 1, and $\ge 3$, respectively.
Similarly, the tilt aberration that balances coma for these values of $\sqrt{\gamma}$ is given by
$B_t/A_c = -(2/3)$, $-0.608$, and $-2/\gamma$, respectively. The defocus coefficient given by
$B_d = -A_a/2$ to balance astigmatism is independent of the value of $\gamma$.

The standard deviations of the balanced primary aberrations are given in Table 6-3.
The factor by which the standard deviation of a primary aberration is reduced by
balancing it with another is listed in Table 6-4. The diffraction focus representing the
point of maximum irradiance for a small aberration is listed in Table 6-5. We note that,
although aberration balancing in the case of a uniform pupil reduces the standard
deviation of spherical aberration and coma by factors of 4 and 3, respectively, the
reduction in the case of astigmatism is only a factor of 1.22. For a Gaussian pupil, the
trend is similar but the reduction factors are smaller for spherical aberration and coma,
and are larger for astigmatism. For a Gaussian beam with $\gamma = 1$, they are 3.74, 2.64, and
1.27, corresponding to spherical aberration, coma, and astigmatism, respectively. In
Section 6.6, the balanced aberrations are identified with the Gaussian polynomials
discussed in Section 6.5.
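The balancing coefficients quoted above can be reproduced by minimizing the variance directly. For spherical aberration balanced with defocus, the minimizing ratio is $B_d/A_s = -\mathrm{cov}(\rho^4, \rho^2)/\mathrm{var}(\rho^2)$; for coma balanced with tilt, it is $B_t/A_c = -\langle\rho^4\rangle/\langle\rho^2\rangle$. The sketch below evaluates both from numerically computed moments; the quadrature and the function names are assumptions of this illustration.

```python
import math

def radial_moment(k, gamma, steps=2000):
    """<rho^(2k)> over the unit pupil with weight exp(-gamma*rho^2), via Simpson's rule in t = rho^2."""
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        e = w * math.exp(-gamma * t)
        num += e * t ** k
        den += e
    return num / den

def balancing_coefficients(gamma):
    """Return (B_d / A_s, B_t / A_c) that minimize the variance over a Gaussian pupil."""
    m1, m2, m3 = (radial_moment(k, gamma) for k in (1, 2, 3))
    defocus_for_spherical = -(m3 - m2 * m1) / (m2 - m1 ** 2)   # -cov(rho^4, rho^2) / var(rho^2)
    tilt_for_coma = -m2 / m1                                   # -<rho^4> / <rho^2>
    return defocus_for_spherical, tilt_for_coma
```

For large $\gamma$ the two ratios should approach $-4/\gamma$ and $-2/\gamma$, the weakly truncated limits.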

Table 6-2. Balanced primary aberrations.

| Balanced Aberration | $\Phi(\rho, \theta; \sqrt{\gamma} = 0)$ | $\Phi(\rho, \theta; \sqrt{\gamma} = 1)$ | $\Phi(\rho, \theta; \sqrt{\gamma} = 2)$ | $\Phi(\rho, \theta; \sqrt{\gamma} \ge 3)$ |
|---|---|---|---|---|
| Spherical | $A_s(\rho^4 - \rho^2)$ | $A_s(\rho^4 - 0.933\rho^2)$ | $A_s(\rho^4 - 0.728\rho^2)$ | $A_s[\rho^4 - (4/\gamma)\rho^2]$ |
| Coma | $A_c[\rho^3 - (2/3)\rho]\cos\theta$ | $A_c(\rho^3 - 0.608\rho)\cos\theta$ | $A_c(\rho^3 - 0.419\rho)\cos\theta$ | $A_c[\rho^3 - (2/\gamma)\rho]\cos\theta$ |
| Astigmatism | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ |

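Table 6-3. Standard deviation of balanced primary aberrations.

| Balanced Aberration | $\sigma_\Phi$ ($\sqrt{\gamma} = 0$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | $A_s/(6\sqrt{5}) = A_s/13.42$ | $A_s/13.71$ | $A_s/18.29$ | $2A_s/\gamma^2$ |
| Coma | $A_c/(6\sqrt{2}) = A_c/8.49$ | $A_c/8.80$ | $A_c/12.21$ | $A_c/\gamma^{3/2}$ |
| Astigmatism | $A_a/(2\sqrt{6}) = A_a/4.90$ | $A_a/5.61$ | $A_a/9.08$ | $A_a/(2\gamma)$ |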

Table 6-4. Factor by which the standard deviation of a Seidel aberration across an
aperture is reduced when it is optimally balanced with other aberrations.

| Balanced Aberration | Uniform ($\sqrt{\gamma} = 0$) | Gaussian ($\sqrt{\gamma} = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | 4 | 3.74 | 2.95 | $\sqrt{5} = 2.24$ |
| Coma | 3 | 2.64 | 2.01 | $\sqrt{3} = 1.73$ |
| Astigmatism | 1.22 | 1.27 | 1.38 | $\sqrt{2} = 1.41$ |

Table 6-5. Diffraction focus.

| Balanced Aberration | Uniform ($\sqrt{\gamma} = 0$) | Gaussian ($\sqrt{\gamma} = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | $(0, 0, 8F^2A_s)$ | $(0, 0, 7.46F^2A_s)$ | $(0, 0, 5.82F^2A_s)$ | $(0, 0, (32/\gamma)F^2A_s)$ |
| Coma | $(4FA_c/3, 0, 0)$ | $(1.22FA_c, 0, 0)$ | $(0.84FA_c, 0, 0)$ | $(4FA_c/\gamma, 0, 0)$ |
| Astigmatism | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ |


6.5 ORTHONORMALIZATION OF ZERNIKE CIRCLE POLYNOMIALS OVER A GAUSSIAN CIRCULAR PUPIL
The Gaussian circle polynomials $G_j(\rho, \theta; \gamma)$ orthonormal over a Gaussian pupil can
be obtained recursively from the Zernike circle polynomials $Z_j(\rho, \theta)$ discussed in
Chapter 4, starting with $G_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18)
according to

$$G_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}G_k\rangle\,G_k\right] , \tag{6-24}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal. The
angular brackets indicate a mean value over the Gaussian pupil. Thus

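$$\langle Z_{j+1}G_k\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)\,Z_{j+1}G_k\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$$

$$\;\; = \frac{\gamma}{\pi[1 - \exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,Z_{j+1}G_k\,\rho\,d\rho\,d\theta . \tag{6-25}$$

The orthonormality of the polynomials implies that

$$\langle G_jG_{j'}\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)\,G_jG_{j'}\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$$

$$\;\; = \frac{\gamma}{\pi[1 - \exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,G_jG_{j'}\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{6-26}$$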

Now a circle polynomial $Z_j$ varies with the angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending
on whether j is even or odd. It is radially symmetric when $m = 0$. Because of the
orthogonal properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)],
the polynomials $G_k$ that contribute to the sum in Eq. (6-24) must also have the same
angular dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $G_{j+1}$ will also
have the same angular dependence. Thus, a Gaussian polynomial $G_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its
radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the
Gaussian polynomials can accordingly be written

$$G_{\text{even } j}(\rho, \theta; \gamma) = \sqrt{2(n+1)}\,R_n^m(\rho; \gamma)\cos m\theta , \quad m \ne 0 , \tag{6-27a}$$

$$G_{\text{odd } j}(\rho, \theta; \gamma) = \sqrt{2(n+1)}\,R_n^m(\rho; \gamma)\sin m\theta , \quad m \ne 0 , \tag{6-27b}$$

$$G_j(\rho, \theta; \gamma) = \sqrt{n+1}\,R_n^0(\rho; \gamma) , \quad m = 0 , \tag{6-27c}$$

where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \gamma)$
is a Gaussian radial polynomial.

Substituting Eqs. (6-27a)–(6-27c) into the orthonormality Eq. (6-26), we find that the
Gaussian radial polynomials obey the orthogonality condition [1]

$$\int_0^1 R_n^m(\rho; \gamma)\,R_{n'}^m(\rho; \gamma)\,A(\rho)\,\rho\,d\rho \Big/ \int_0^1 A(\rho)\,\rho\,d\rho = \frac{1}{n+1}\,\delta_{nn'} . \tag{6-28}$$

Writing Eq. (6-24) in terms of two-index polynomials given by Eqs. (6-27a)–(6-27c) and
substituting these equations into it, as was done in Chapter 5 for the annular polynomials,
we find that the Gaussian radial polynomials are given by

$$R_n^m(\rho; \gamma) = M_n^m\left[R_n^m(\rho) - \sum_{i \ge 1}^{(n-m)/2}(n - 2i + 1)\,\langle R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\rangle\,R_{n-2i}^m(\rho; \gamma)\right] , \tag{6-29}$$

where

$$\langle R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\rangle = \int_0^1 R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\,A(\rho)\,\rho\,d\rho \Big/ \int_0^1 A(\rho)\,\rho\,d\rho . \tag{6-30}$$

The normalization constant $M_n^m$ that replaces the normalization constant $N_j$ is
determined from the orthogonality Eq. (6-28) of the radial polynomials. Note that except
for the normalization constant, the radial polynomial $R_n^n(\rho; \gamma)$ is identical to the
corresponding polynomial for a uniformly illuminated circular pupil $R_n^n(\rho)$, i.e.,

$$R_n^n(\rho; \gamma) = M_n^n\,R_n^n(\rho) . \tag{6-31}$$

The radial polynomial $R_n^m(\rho; \gamma)$ is a polynomial of degree n in $\rho$ containing terms in $\rho^n$,
$\rho^{n-2}$, ..., and $\rho^m$, whose coefficients depend on the Gaussian amplitude through $\gamma$, i.e., it
has the form

$$R_n^m(\rho; \gamma) = a_n^m\rho^n + b_n^m\rho^{n-2} + \cdots + d_n^m\rho^m , \tag{6-32}$$

where the coefficients $a_n^m$, etc., depend on $\gamma$. The radial polynomials are even or odd in $\rho$
depending on whether n (or m) is even or odd.

The polynomial ordering, the number of polynomials of a certain order or through a
certain order n, and the relationships among the indices n, m, and j are the same as
discussed for circle polynomials in Chapter 4. Moreover, a Gaussian circle polynomial
$G_j(\rho, \theta; \gamma)$ reduces to the corresponding circle polynomial $Z_j(\rho, \theta)$ as $\gamma \to 0$. The
Gaussian circle polynomials are also unique like the circle polynomials. They are not
only orthogonal over a Gaussian circular pupil, but they also include wavefront tilt and
defocus and balanced classical aberrations as members of the polynomial set.
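The orthonormalization of this section can be sketched numerically for the radial polynomials. The code below is not the recursion of Eq. (6-29) verbatim: as a simplifying assumption it applies Gram-Schmidt to the monomials $\rho^m, \rho^{m+2}, \ldots, \rho^n$ rather than to the Zernike radial polynomials. Because both sets span the same nested spaces, the resulting polynomial is the same once the sign is fixed by a positive leading coefficient and the norm by Eq. (6-28). The function names and the Simpson quadrature are likewise assumptions.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def moment(k, gamma, steps=2000):
    """<rho^(2k)> over the unit pupil with weight exp(-gamma*rho^2), via Simpson's rule in t = rho^2."""
    h = 1.0 / steps
    num = den = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        e = w * math.exp(-gamma * t)
        num += e * t ** k
        den += e
    return num / den

def inner(c, d, m, gamma):
    """Weighted mean of the product of rho^m * sum(c[i] t^i) and rho^m * sum(d[j] t^j)."""
    return sum(ci * dj * moment(m + i + j, gamma)
               for i, ci in enumerate(c) for j, dj in enumerate(d))

def gaussian_radial(n, m, gamma):
    """Coefficients of R_n^m(rho; gamma); entry k multiplies rho^(m + 2k)."""
    if m < 0 or n < m or (n - m) % 2:
        raise ValueError("need n >= m >= 0 with n - m even")
    size = (n - m) // 2 + 1
    basis = []
    for k in range(size):
        v = [1.0 if i == k else 0.0 for i in range(size)]     # monomial rho^(m + 2k)
        for b in basis:                                        # remove lower-order components
            proj = inner(v, b, m, gamma) / inner(b, b, m, gamma)
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        basis.append(v)
    top = basis[-1]
    scale = math.sqrt((n + 1) * inner(top, top, m, gamma))    # enforce Eq. (6-28)
    return [c / scale for c in top]
```

For example, `gaussian_radial(4, 0, 1.0)` returns the constant, $\rho^2$, and $\rho^4$ coefficients of $R_4^0(\rho; \gamma)$ for $\gamma = 1$, which can be compared with the balanced spherical aberration entry of Table 6-6, and `gaussian_radial(4, 0, 0.0)` should recover the Zernike radial polynomial $6\rho^4 - 6\rho^2 + 1$.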
6.6 GAUSSIAN CIRCLE POLYNOMIALS REPRESENTING BALANCED PRIMARY ABERRATIONS FOR A GAUSSIAN CIRCULAR PUPIL

The radial polynomials corresponding to balanced primary aberrations are listed in
Table 6-6. The column “Gaussian” is for any value of $\gamma$, and the column “Weakly
Truncated Gaussian” is for its large values. It can be seen that the balancing defocus for
spherical aberration given by $B_d = (b_4^0/a_4^0)A_s$ and the balancing tilt for coma given by
$B_t = (b_3^1/a_3^1)A_c$ are in agreement with the corresponding values given in Table 6-2. For
example, the relative balancing defocus in the case of spherical aberration from Table 6-6
for $\gamma = 1$ is $-5.71948/6.12902$, which is the same as $-0.933$ in Table 6-2. From the
form of the Gaussian circle polynomial $R_2^2(\rho; \gamma)\cos 2\theta$ representing balanced
astigmatism and varying as $\rho^2\cos 2\theta$, it is evident that the balancing defocus of
$-(1/2)\rho^2$ for astigmatism $\rho^2\cos^2\theta$ is independent of the value of $\gamma$. Similarly,
comparing the form of a balanced primary aberration with the corresponding Gaussian
polynomial, we can immediately write its standard deviation. Thus, we can see that the
sigma values $A_s/(\sqrt{5}a_4^0)$, $A_c/(2\sqrt{2}a_3^1)$, and $A_a/(2\sqrt{6}a_2^2)$ of balanced spherical aberration,
coma, and astigmatism, respectively, are in agreement with their values given in Table 6-3.
For example, the balanced aberration for spherical aberration $A_s\rho^4$ can be written

$$W(\rho, \theta; \gamma) = \frac{A_s}{a_4^0}\left(a_4^0\rho^4 + b_4^0\rho^2 + c_4^0\right) = \frac{A_s}{\sqrt{5}\,a_4^0}\,G_{11}(\rho, \theta; \gamma) . \tag{6-33}$$

Table 6-6. Gaussian radial polynomials representing balanced primary aberrations for Gaussian beams. Polynomials for special cases of $\gamma = 0$ (corresponding to a uniform beam), $\gamma = 1$, and weakly truncated Gaussian beams are also given.

Aberration                  Radial polynomial   Gaussian*                                Gaussian, $\gamma = 1$                        Uniform, $\gamma = 0$      Weakly truncated Gaussian
Piston                      $R_0^0$             $1$                                      $1$                                           $1$                        $1$
Distortion (tilt)           $R_1^1$             $a_1^1 \rho$                             $1.09367\rho$                                 $\rho$                     $\sqrt{\gamma/2}\,\rho$
Field curvature (defocus)   $R_2^0$             $a_2^0 \rho^2 + b_2^0$                   $2.04989\rho^2 - 0.85690$                     $2\rho^2 - 1$              $(\gamma\rho^2 - 1)/\sqrt{3}$
Astigmatism                 $R_2^2$             $a_2^2 \rho^2$                           $1.14541\rho^2$                               $\rho^2$                   $(\gamma/\sqrt{6})\,\rho^2$
Coma                        $R_3^1$             $a_3^1 \rho^3 + b_3^1 \rho$              $3.11213\rho^3 - 1.89152\rho$                 $3\rho^3 - 2\rho$          $\sqrt{\gamma/2}\,[(\gamma/2)\rho^3 - \rho]$
Spherical aberration        $R_4^0$             $a_4^0 \rho^4 + b_4^0 \rho^2 + c_4^0$    $6.12902\rho^4 - 5.71948\rho^2 + 0.83368$     $6\rho^4 - 6\rho^2 + 1$    $(\gamma^2\rho^4 - 4\gamma\rho^2 + 2)/(2\sqrt{5})$

*$a_1^1 = (2p_2)^{-1/2}$, $a_2^0 = [3(p_4 - p_2^2)]^{-1/2}$, $b_2^0 = -p_2 a_2^0$, $a_2^2 = (3p_4)^{-1/2}$, $a_3^1 = \frac{1}{2}(p_6 - p_4^2/p_2)^{-1/2}$, $b_3^1 = -(p_4/p_2)\, a_3^1$,

$a_4^0 = \{ 5 [ p_8 - 2K_1 p_6 + (K_1^2 + 2K_2) p_4 - 2K_1 K_2 p_2 + K_2^2 ] \}^{-1/2}$, $b_4^0 = -K_1 a_4^0$, $c_4^0 = K_2 a_4^0$,

$p_s = \langle \rho^s \rangle = (1 - e^{\gamma})^{-1} + (s/2\gamma)\, p_{s-2}$, where s is an even integer,

$p_0 = 1$, $K_1 = (p_6 - p_2 p_4)/(p_4 - p_2^2)$, $K_2 = (p_2 p_6 - p_4^2)/(p_4 - p_2^2)$.
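The footnote expressions of Table 6-6 can be evaluated directly. The following is a minimal sketch, not the author's code: the function names `gaussian_moments` and `radial_coefficients` and the dictionary keys (for example `a40` for $a_4^0$) are illustrative assumptions. It builds the moments $p_s$ from the recursion and then the coefficients of the radial polynomials; for $\gamma = 1$ the results can be compared with the corresponding column of the table.

```python
import math

def gaussian_moments(gamma, s_max=8):
    """Even moments p_s = <rho^s> over a Gaussian circular pupil (gamma > 0).

    Uses the recursion p_s = 1/(1 - exp(gamma)) + (s / (2*gamma)) * p_(s-2),
    starting from p_0 = 1. Returns a dict {s: p_s}.
    """
    p = {0: 1.0}
    for s in range(2, s_max + 1, 2):
        p[s] = 1.0 / (1.0 - math.exp(gamma)) + (s / (2.0 * gamma)) * p[s - 2]
    return p

def radial_coefficients(gamma):
    """Coefficients of the balanced primary aberration radial polynomials."""
    p = gaussian_moments(gamma)
    p2, p4, p6, p8 = p[2], p[4], p[6], p[8]
    a11 = (2.0 * p2) ** -0.5
    a20 = (3.0 * (p4 - p2 ** 2)) ** -0.5
    b20 = -p2 * a20
    a22 = (3.0 * p4) ** -0.5
    a31 = 0.5 * (p6 - p4 ** 2 / p2) ** -0.5
    b31 = -(p4 / p2) * a31
    k1 = (p6 - p2 * p4) / (p4 - p2 ** 2)
    k2 = (p2 * p6 - p4 ** 2) / (p4 - p2 ** 2)
    a40 = (5.0 * (p8 - 2.0 * k1 * p6 + (k1 ** 2 + 2.0 * k2) * p4
                  - 2.0 * k1 * k2 * p2 + k2 ** 2)) ** -0.5
    b40 = -k1 * a40
    c40 = k2 * a40
    return {"a11": a11, "a20": a20, "b20": b20, "a22": a22,
            "a31": a31, "b31": b31, "a40": a40, "b40": b40, "c40": c40}

if __name__ == "__main__":
    for name, value in radial_coefficients(1.0).items():
        print(f"{name} = {value:.5f}")
```

The recursion divides by $\gamma$, so the uniform case $\gamma = 0$ has to be treated as a limit rather than evaluated with this sketch.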

Since $G_4$ is an orthonormal polynomial, its multiplier $A_s / (\sqrt{5}\, a_4^0)$ yields the sigma value of the balanced aberration. The balancing defocus is, of course, $A_s b_4^0 / a_4^0$. As a numerical example, it yields a sigma value of $A_s / 13.71$ for $\gamma = 1$, the same as in Table 6-3. The corresponding balancing defocus is $-0.933 A_s$, as expected.

6.7 WEAKLY TRUNCATED GAUSSIAN PUPILS


For a weakly truncated Gaussian pupil, we can let the upper limit of the radial integration approach infinity with negligible error. Thus, Eq. (6-20) for the Strehl ratio and Eq. (6-23) for the mean and mean square values of the aberration may be written [1]

\[ S = \left( \frac{\gamma}{\pi} \right)^2 \left| \int_0^{\infty} \int_0^{2\pi} \exp(-\gamma\rho^2) \exp[i\Phi(\rho,\theta)]\, \rho\, d\rho\, d\theta \right|^2 \tag{6-34} \]

and

\[ \langle \Phi^n \rangle = \frac{\gamma}{\pi} \int_0^{\infty} \int_0^{2\pi} \exp(-\gamma\rho^2) [\Phi(\rho,\theta)]^n\, \rho\, d\rho\, d\theta , \tag{6-35} \]

respectively.

The standard deviation of a primary aberration for a large value of $\gamma$ can be obtained by calculating its mean and mean square values according to Eq. (6-35). The results thus obtained are given in the last column of Table 6-1. The corresponding balanced aberrations and their standard deviations are similarly given in Tables 6-2 and 6-3, respectively. The balancing of an aberration reduces the standard deviation by a factor of $\sqrt{5}$, $\sqrt{3}$, and $\sqrt{2}$ in the case of spherical aberration, coma, and astigmatism, respectively, as noted in Table 6-4. The diffraction focus for these aberrations is listed in Table 6-5. The amount of balancing aberration decreases as $\gamma$ increases in the case of spherical aberration and coma, but does not change in the case of astigmatism. For example, in the case of spherical aberration, the amount of balancing defocus for a weakly truncated Gaussian beam is $(4/\gamma)$ times the corresponding amount for a uniform beam. Similarly, in the case of coma, the balancing tilt for a weakly truncated Gaussian beam is $(3/\gamma)$ times the corresponding amount for a uniform beam. The location of the diffraction focus is independent of the value of $\gamma$ in the case of astigmatism, since the balancing defocus is the same regardless of the value of $\gamma$. Compared to the peak value of an aberration, its standard deviation when balanced is smaller by a factor of $\gamma^2/2$, $\gamma^{3/2}$, and $2\gamma$ in the case of spherical aberration, coma, and astigmatism, respectively.

When a Gaussian beam is weakly truncated, i.e., when $\gamma$ is large, the quantity $p_s$ in Table 6-6 reduces to

\[ p_s = \langle \rho^s \rangle = (s/2\gamma)\, p_{s-2} = (s/2)! \big/ \gamma^{s/2} . \tag{6-36} \]

As a result, we obtain simple expressions for the radial polynomials, which are listed in the last column in Table 6-6. They are similar to Laguerre polynomials [4]. If we normalize the radial coordinate r of a point on the pupil by $\omega$ (instead of by a), then $\gamma$ disappears from these expressions. Since the power in a weakly truncated Gaussian beam is concentrated in a small region near the center of the pupil, the effect of the aberration in its outer region is negligible. Accordingly, the aberration tolerances in terms of the peak value of the aberration at the edge of the pupil ($\rho = 1$) may not be very meaningful. They may instead be defined in terms of their value at the Gaussian radius [1].

6.8 ABERRATION COEFFICIENTS OF A GAUSSIAN CIRCULAR ABERRATION FUNCTION

The aberration function $W(\rho,\theta;\gamma)$ across a Gaussian circular pupil can be expanded in terms of a complete set of orthonormal Gaussian circle polynomials $G_j(\rho,\theta;\gamma)$ in the form

\[ W(\rho,\theta;\gamma) = \sum_{j=1}^{J} a_j G_j(\rho,\theta;\gamma) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-37} \]

where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-37) by $G_{j'}(\rho,\theta;\gamma)$, integrating over the Gaussian pupil, and using the orthonormality Eq. (6-26), we obtain the Gaussian circle expansion coefficients:

\[ a_j = \int_0^1 \int_0^{2\pi} W(\rho,\theta;\gamma)\, G_j(\rho,\theta;\gamma)\, A(\rho)\, \rho\, d\rho\, d\theta \Big/ 2\pi \int_0^1 A(\rho)\, \rho\, d\rho . \tag{6-38} \]

The mean and mean square values of the aberration function are given by

\[ \langle W(\rho,\theta;\gamma) \rangle = a_1 \tag{6-39} \]

and

\[ \langle W^2(\rho,\theta;\gamma) \rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-40} \]

The variance of the aberration function is accordingly given by

\[ \sigma_W^2 = \langle W^2(\rho,\theta;\gamma) \rangle - \langle W(\rho,\theta;\gamma) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-41} \]
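As a numerical illustration of Eqs. (6-38) through (6-41), the sketch below expands the radially symmetric aberration $W = \rho^4$ for $\gamma = 1$ in the three radially symmetric polynomials it can contain (piston, defocus, and spherical aberration), using the $\gamma = 1$ radial polynomials of Table 6-6 scaled by $\sqrt{n+1}$. The helper `pupil_mean`, the midpoint-rule quadrature, and the assumption that the weighting is $A(\rho) = \exp(-\gamma\rho^2)$ (the weighting that reproduces the Table 6-6 moments) are choices made for this example only.

```python
import math

GAMMA = 1.0

# Orthonormal, radially symmetric polynomials for gamma = 1:
# sqrt(n + 1) times the radial polynomials listed in Table 6-6.
def g_piston(rho):
    return 1.0

def g_defocus(rho):
    return math.sqrt(3.0) * (2.04989 * rho ** 2 - 0.85690)

def g_spherical(rho):
    return math.sqrt(5.0) * (6.12902 * rho ** 4 - 5.71948 * rho ** 2 + 0.83368)

def pupil_mean(f, n=20000):
    """Amplitude-weighted mean of a radially symmetric f(rho) over 0 <= rho <= 1.

    Midpoint rule; the angular integration cancels for radially symmetric f.
    """
    num = 0.0
    den = 0.0
    for k in range(n):
        rho = (k + 0.5) / n
        w = math.exp(-GAMMA * rho ** 2) * rho
        num += f(rho) * w
        den += w
    return num / den

def aberration(rho):
    return rho ** 4  # spherical aberration with A_s = 1

basis = [g_piston, g_defocus, g_spherical]
# Eq. (6-38): a_j is the weighted mean of W * G_j.
coeffs = [pupil_mean(lambda r, g=g: aberration(r) * g(r)) for g in basis]

mean_w = coeffs[0]                                  # Eq. (6-39)
variance_series = sum(a * a for a in coeffs[1:])    # Eq. (6-41)
variance_direct = (pupil_mean(lambda r: aberration(r) ** 2)
                   - pupil_mean(aberration) ** 2)

if __name__ == "__main__":
    print("coefficients:", [round(a, 5) for a in coeffs])
    print("variance from coefficients:", variance_series)
    print("variance by direct integration:", variance_direct)
```

The two variances agree to within the five-decimal rounding of the tabulated coefficients, and the spherical aberration coefficient equals $1/(\sqrt{5}\, a_4^0)$, the sigma value of the balanced aberration for $A_s = 1$.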

6.9 ORTHONORMALIZATION OF ANNULAR POLYNOMIALS OVER A GAUSSIAN ANNULAR PUPIL

The balanced aberrations for an annular Gaussian pupil with an obscuration ratio $\epsilon$ can be obtained in a manner similar to those for a circular pupil, except that the lower limit of zero in the radial integration is replaced by $\epsilon$. The Gaussian annular polynomials $G_j(\rho,\theta;\gamma;\epsilon)$ orthonormal over a Gaussian annular pupil can be obtained recursively from the annular polynomials $A_j(\rho,\theta;\epsilon)$, starting with $G_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to

\[ G_{j+1} = N_{j+1} \left[ A_{j+1} - \sum_{k=1}^{j} \langle A_{j+1} G_k \rangle G_k \right] , \tag{6-42} \]

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal. The angular brackets indicate a mean value over the Gaussian annular pupil. Thus

\[ \langle A_{j+1} G_k \rangle = \int_{\epsilon}^{1} \int_0^{2\pi} A(\rho)\, A_{j+1} G_k\, \rho\, d\rho\, d\theta \Big/ \int_{\epsilon}^{1} \int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta . \tag{6-43} \]

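The orthonormality of the polynomials implies that

\[ \langle G_j G_{j'} \rangle = \int_{\epsilon}^{1} \int_0^{2\pi} A(\rho)\, G_j G_{j'}\, \rho\, d\rho\, d\theta \Big/ \int_{\epsilon}^{1} \int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{6-44} \]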

Applying the same reasoning as in the case of Gaussian circle polynomials, we find that the polynomial $G_j(\rho,\theta;\gamma;\epsilon)$ also has the same angular dependence as an annular polynomial $A_j(\rho,\theta;\epsilon)$. Thus, a Gaussian annular polynomial $G_j$ is separable in polar coordinates $\rho$ and $\theta$, and differs from the corresponding annular polynomial only in its radial dependence. Given the form of the annular polynomials by Eqs. (5-17a)–(5-17c), the Gaussian annular polynomials can accordingly be written

\[ G_{\mathrm{even}\, j}(\rho,\theta;\gamma;\epsilon) = \sqrt{2(n+1)}\, R_n^m(\rho;\gamma;\epsilon) \cos m\theta , \quad m \ne 0 , \tag{6-45a} \]

\[ G_{\mathrm{odd}\, j}(\rho,\theta;\gamma;\epsilon) = \sqrt{2(n+1)}\, R_n^m(\rho;\gamma;\epsilon) \sin m\theta , \quad m \ne 0 , \tag{6-45b} \]

\[ G_j(\rho,\theta;\gamma;\epsilon) = \sqrt{n+1}\, R_n^0(\rho;\gamma;\epsilon) , \quad m = 0 , \tag{6-45c} \]

where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho;\gamma;\epsilon)$ is a Gaussian annular radial polynomial.

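Substituting Eqs. (6-45a)–(6-45c) into the orthonormality Eq. (6-44), we find that the Gaussian annular radial polynomials obey the orthogonality condition [1,3]

\[ \int_{\epsilon}^{1} R_n^m(\rho;\gamma;\epsilon)\, R_{n'}^m(\rho;\gamma;\epsilon)\, A(\rho)\, \rho\, d\rho \Big/ \int_{\epsilon}^{1} A(\rho)\, \rho\, d\rho = \frac{1}{n+1}\, \delta_{nn'} . \tag{6-46} \]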

Writing Eq. (6-42) in terms of two-index polynomials given by Eqs. (6-45a)–(6-45c) and substituting these equations into it, as was done in Chapter 5 for the annular polynomials, we find that the Gaussian annular radial polynomials are given by

\[ R_n^m(\rho;\gamma;\epsilon) = M_n^m \left[ R_n^m(\rho;\epsilon) - \sum_{i \ge 1}^{(n-m)/2} (n - 2i + 1) \left\langle R_n^m(\rho;\epsilon) R_{n-2i}^m(\rho;\gamma;\epsilon) \right\rangle R_{n-2i}^m(\rho;\gamma;\epsilon) \right] , \tag{6-47} \]

where the angular brackets indicate an average over the annular Gaussian pupil; i.e.,

\[ \left\langle R_n^m(\rho;\epsilon) R_{n-2i}^m(\rho;\gamma;\epsilon) \right\rangle = \int_{\epsilon}^{1} R_n^m(\rho;\epsilon)\, R_{n-2i}^m(\rho;\gamma;\epsilon)\, A(\rho)\, \rho\, d\rho \Big/ \int_{\epsilon}^{1} A(\rho)\, \rho\, d\rho . \tag{6-48} \]

The normalization constant $M_n^m$ that replaces the normalization constant $N_j$ is determined from the orthogonality Eq. (6-46) of the radial polynomials. Note that the radial polynomial $R_n^n(\rho;\gamma;\epsilon)$ is identical to the corresponding polynomial for a uniformly illuminated annular pupil $R_n^n(\rho;\epsilon)$, except for the normalization constant, i.e.,

\[ R_n^n(\rho;\gamma;\epsilon) = M_n^n R_n^n(\rho;\epsilon) . \tag{6-49} \]

The radial polynomial $R_n^m(\rho;\gamma;\epsilon)$ is a polynomial of degree n in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, ..., and $\rho^m$ whose coefficients depend on the Gaussian amplitude through $\gamma$, i.e., it has the form

\[ R_n^m(\rho;\gamma;\epsilon) = a_n^m \rho^n + b_n^m \rho^{n-2} + \cdots + d_n^m \rho^m , \tag{6-50} \]

where the coefficients $a_n^m$, etc., depend on $\gamma$ and $\epsilon$.

The polynomial ordering, the number of polynomials of a certain order or through a certain order n, and the relationships among the indices n, m, and j are the same as those discussed for the Zernike circle polynomials in Chapter 4, or the annular polynomials in Chapter 5. Moreover, a Gaussian annular polynomial $G_j(\rho,\theta;\gamma;\epsilon)$ reduces to the corresponding annular polynomial $A_j(\rho,\theta;\epsilon)$ as $\gamma \to 0$. The Gaussian annular polynomials are also unique like the Gaussian circle polynomials. They are not only orthogonal over a Gaussian annular pupil, but also include wavefront tilt and defocus and balanced classical aberrations as members of the polynomial set.

6.10 GAUSSIAN ANNULAR POLYNOMIALS REPRESENTING BALANCED PRIMARY ABERRATIONS FOR A GAUSSIAN ANNULAR PUPIL

The radial annular polynomials $R_n^m(\rho;\gamma;\epsilon)$ for the balanced primary aberrations are given by the same expressions as for the circle radial polynomials in Table 6-6 except that now

\[ p_s = \langle \rho^s \rangle = \frac{\epsilon^s \exp[\gamma(1 - \epsilon^2)] - 1}{\exp[\gamma(1 - \epsilon^2)] - 1} + (s/2\gamma)\, p_{s-2} . \tag{6-51} \]

Using these expressions, numerical results for the coefficients of the terms of a radial polynomial for any values of $\gamma$ and $\epsilon$ can be obtained.
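The recursion of Eq. (6-51) is easy to evaluate numerically. The sketch below is an illustration under assumed names (`annular_moments` and `low_order_coefficients` are not from the text); it computes the moments for a Gaussian annular pupil and, from the Table 6-6 footnote expressions, the coefficients of the tilt, defocus, and astigmatism radial polynomials. For $\gamma = 1$ and $\epsilon = 0.5$ the values can be compared with the corresponding row of Table 6-7.

```python
import math

def annular_moments(gamma, eps, s_max=4):
    """Even moments p_s = <rho^s> over a Gaussian annular pupil, Eq. (6-51).

    gamma > 0 is the Gaussian parameter and 0 <= eps < 1 the obscuration ratio.
    """
    e = math.exp(gamma * (1.0 - eps ** 2))
    p = {0: 1.0}
    for s in range(2, s_max + 1, 2):
        p[s] = (eps ** s * e - 1.0) / (e - 1.0) + (s / (2.0 * gamma)) * p[s - 2]
    return p

def low_order_coefficients(gamma, eps):
    """Return (a11, a20, b20, a22) from the Table 6-6 footnote expressions."""
    p = annular_moments(gamma, eps)
    a11 = (2.0 * p[2]) ** -0.5
    a20 = (3.0 * (p[4] - p[2] ** 2)) ** -0.5
    b20 = -p[2] * a20
    a22 = (3.0 * p[4]) ** -0.5
    return a11, a20, b20, a22

if __name__ == "__main__":
    for eps in (0.0, 0.25, 0.50, 0.75, 0.90):
        values = low_order_coefficients(1.0, eps)
        print(eps, " ".join(f"{v:.5f}" for v in values))
```

Setting `eps = 0` recovers the circular-pupil recursion of Table 6-6.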

The coefficients for $\gamma = 1$ and $\epsilon$ = 0, 0.25, 0.50, 0.75, and 0.90 are given in Table 6-7. For comparison, the coefficients for a uniformly illuminated pupil, i.e., for $\gamma = 0$, are given in parentheses in this table. An increase (decrease) in the value of a coefficient $a_n^m$ of an orthogonal aberration $R_n^m(\rho;\gamma;\epsilon) \cos m\theta$ implies a decrease (increase) in the value of $\sigma_\Phi$ for a given amount of the corresponding classical aberration. This, in turn, implies that for small aberrations, the system performance as measured by the Strehl ratio is less (more) sensitive to that classical aberration when balanced with other classical aberrations to form an orthogonal aberration. Thus, as $\epsilon$ increases, irrespective of the value of $\gamma$, the system becomes less sensitive to field curvature (defocus) and spherical aberration but more sensitive to distortion (tilt) and astigmatism. In the case of coma, it first becomes slightly more sensitive but is much less sensitive for larger values of $\epsilon$. As $\gamma$ increases, i.e., as the width of the Gaussian illumination becomes narrower, the system becomes less sensitive to all classical primary aberrations. Although the results for $\gamma = 0$ and $\gamma = 1$ only are given in Table 6-7, the coefficients for $0 \le \gamma \le 3$ show that the differences between the coefficients for uniform and Gaussian illumination are small, and they decrease as $\epsilon$ increases and increase as $\gamma$ increases. This is understandable because as $\epsilon$ increases or $\gamma$ decreases, the difference between the two illuminations decreases.

Table 6-7. Coefficients of terms in Gaussian radial polynomials $R_n^m(\rho;\gamma;\epsilon)$ for $\gamma = 1$. The numbers given in parentheses are the corresponding coefficients for uniform illumination.

$\epsilon$   $a_1^1$     $a_2^0$      $b_2^0$       $a_2^2$     $a_3^1$     $b_3^1$       $a_4^0$       $b_4^0$         $c_4^0$
0.00   1.09367     2.04989      –0.85690      1.14541     3.11213     –1.89152      6.12902       –5.71948        0.83368
       (1.00000)   (2.00000)    (–1.00000)    (1.00000)   (3.00000)   (–2.00000)    (6.00000)     (–6.00000)      (1.00000)
0.25   1.04364     2.18012      –1.00080      1.08940     3.01573     –1.84513      6.95563       –6.98197        1.25153
       (0.97014)   (2.13333)    (–1.13333)    (0.96836)   (2.94566)   (–1.97099)    (6.82667)     (–7.25333)      (1.42667)
0.50   0.92963     2.70412      –1.56449      0.93620     3.14319     –2.06618      10.79549      –13.08900       3.46706
       (0.89443)   (2.66667)    (–1.66667)    (0.87287)   (3.11400)   (–2.17980)    (10.66667)    (–13.33333)     (3.66667)
0.75   0.80827     4.59329      –3.51548      0.74439     4.55179     –3.57767      31.47560      –48.77879       18.39840
       (0.80000)   (4.57143)    (–3.57143)    (0.72954)   (4.53877)   (–3.63858)    (31.34694)    (–48.97959)     (18.63265)
0.90   0.74453     10.53581     –9.50324      0.63890     9.60573     –8.69629      166.33359     –300.66342      135.36926
       (0.74329)   (10.52632)   (–9.52632)    (0.63679)   (9.60023)   (–8.72012)    (166.20500)   (–300.83102)    (135.62604)


6.11 ABERRATION COEFFICIENTS OF A GAUSSIAN ANNULAR ABERRATION FUNCTION

The aberration function $W(\rho,\theta;\gamma;\epsilon)$ across a Gaussian annular pupil can be expanded in terms of a complete set of orthonormal Gaussian annular polynomials $G_j(\rho,\theta;\gamma;\epsilon)$ in the form

\[ W(\rho,\theta;\gamma;\epsilon) = \sum_{j=1}^{J} a_j G_j(\rho,\theta;\gamma;\epsilon) , \quad \epsilon \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-52} \]

where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-52) by $G_{j'}(\rho,\theta;\gamma;\epsilon)$, integrating over the Gaussian pupil, and using the orthonormality Eq. (6-44), we obtain the Gaussian annular expansion coefficients:

\[ a_j = \int_{\epsilon}^{1} \int_0^{2\pi} W(\rho,\theta;\gamma;\epsilon)\, G_j(\rho,\theta;\gamma;\epsilon)\, A(\rho)\, \rho\, d\rho\, d\theta \Big/ 2\pi \int_{\epsilon}^{1} A(\rho)\, \rho\, d\rho . \tag{6-53} \]

The mean and mean square values of the aberration function are given by

\[ \langle W(\rho,\theta;\gamma;\epsilon) \rangle = a_1 \tag{6-54} \]

and

\[ \langle W^2(\rho,\theta;\gamma;\epsilon) \rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-55} \]

The variance of the aberration function is accordingly given by

\[ \sigma_W^2 = \langle W^2(\rho,\theta;\gamma;\epsilon) \rangle - \langle W(\rho,\theta;\gamma;\epsilon) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-56} \]

6.12 SUMMARY
A pupil with Gaussian illumination is called a Gaussian pupil. The Gaussian illumination may be due to a filter with Gaussian transmission placed at the pupil or due to a laser beam with Gaussian amplitude distribution. The illumination is characterized by a truncation ratio $\sqrt{\gamma} = a/\omega$, where a is the pupil radius and $\omega$ is the radial distance, called the Gaussian radius, where the amplitude is $1/e$ times its central value.

The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF is higher for low spatial frequencies, and lower for the high. The diffraction
rings practically disappear when the pupil radius is twice the Gaussian radius, and the
beam propagates as a Gaussian everywhere. The OTF in this case is also described by a
Gaussian function.

The Strehl ratio for a small aberration can be estimated from its variance calculated
over the Gaussian amplitude-weighted pupil. The aberration variance decreases, and,
therefore, its tolerance increases as the truncation ratio increases (see Tables 6-1 and 6-3),
because the amplitude decreases as the aberration increases with the radial distance from
the center.

The Gaussian polynomials orthonormal over a Gaussian circular pupil are obtained
by orthonormalizing the Zernike circle polynomials over a corresponding Gaussian
amplitude-weighted pupil. They are given in Table 6-6 for the primary aberrations for
$\gamma = 1$. For a weakly truncated pupil, i.e., for large values of $\gamma$, the polynomials have a
simple analytical form similar to Laguerre polynomials, as shown in the last column in
Table 6-6.

The orthonormal Gaussian annular polynomials for Gaussian annular pupils can be
obtained by orthonormalizing the annular polynomials. The polynomial ordering is
exactly the same as that for the circle or the annular polynomials.

References

1. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

2. V. N. Mahajan, “Uniform versus Gaussian beams: a comparison of the effects of diffraction, obscuration, and aberrations,” J. Opt. Soc. Am. A 3, 470–485 (1986).

3. V. N. Mahajan, “Strehl ratio of a Gaussian beam,” J. Opt. Soc. Am. A 22, 1824–1833 (2005).

4. G. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers (McGraw-Hill, New York, 1968).

5. V. N. Mahajan, “Gaussian apodization and beam propagation,” Progress in Optics 49, 1–96 (2006).
CHAPTER 7

SYSTEMS WITH HEXAGONAL PUPILS

7.1 Introduction
7.2 Pupil Function
7.3 Aberration-Free Imaging
    7.3.1 PSF
    7.3.2 OTF
7.4 Hexagonal Polynomials
7.5 Hexagonal Coefficients of a Hexagonal Aberration Function
7.6 Isometric, Interferometric, and Imaging Characteristics of Hexagonal Polynomial Aberrations
7.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio
    7.7.1 Defocus
    7.7.2 Astigmatism
    7.7.3 Coma
    7.7.4 Spherical Aberration
    7.7.5 Strehl Ratio
7.8 Summary
References

Chapter 7
Systems with Hexagonal Pupils
7.1 INTRODUCTION
Although most optical imaging systems have a circular or an annular pupil, with or
without Gaussian illumination, there are times when the wavefront or the interferogram is
hexagonal. This is most notable for the primary mirrors of large telescopes, such as the
Keck [1], the James Webb [2], or the CELT [3]. Although these mirrors are circular, they
are large enough that they are segmented into small hexagonal segments. Optical testing
of a hexagonal segment yields a hexagonal wavefront or interferogram, thus requiring
polynomials that are orthogonal over a hexagon. Even a large hexagonal primary mirror
consisting of hexagonal segments has been proposed [4].

Smith and Marsh [5] have discussed the PSF of a hexagonal pupil, but their equation
for it is incorrect. Sabatke et al. [4] describe the complex amplitude for a trapezoid
forming the upper half of a regular hexagon, but do not carry out the summation of the
diffracted amplitudes of the two trapezoids of the hexagonal pupil. We give closed-form
expressions for the six-fold symmetric aberration-free PSF and OTF [6]. Similar
expressions for the PSF have been given by others [7,8]. The PSF and OTF are plotted
along with the ensquared power, and compared with the corresponding quantities for a
system with a circular pupil. The ensquared power and the OTF are shown to be lower
than the corresponding values for a circular pupil.

The hexagonal polynomials representing balanced aberrations are obtained in this chapter by orthogonalizing the Zernike circle polynomials over a unit hexagon by using the procedure described in Chapter 3. Each of these polynomials consists of either the cosine or the sine terms, but not both. This is a consequence of the biaxial symmetry of a hexagonal pupil. Whereas the circle, annular, and Gaussian polynomials, described in Chapters 4, 5, and 6, respectively, are separable in their dependence on the polar coordinates $\rho$ and $\theta$ of a pupil point, only some of the hexagonal polynomials are separable. For example, the polynomial $H_{14}$ contains $\cos 2\theta$ and $\cos 4\theta$ terms. Hence, numbering the polynomials with two indices n and m loses significance, and they must be numbered with a single index j. A hexagonal pupil has two distinct configurations where the hexagon in one is rotated by 30 degrees with respect to that in the other. Only some of the polynomials are common between the two configurations.

In Chapters 4–6, we considered the balancing of classical aberrations for systems with circular, annular, and Gaussian pupils, respectively, and showed that the corresponding orthonormal polynomials also represented balanced aberrations. Although not shown explicitly, as was done in Chapters 4 through 6, the hexagonal polynomials also represent balanced classical aberrations. However, some interesting results are obtained in this respect due to lack of the radial symmetry of the hexagonal pupil. For example, while the polynomials $H_{11}$ and $H_{22}$ representing the balanced primary and secondary spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing the balanced tertiary spherical aberration is not, because it also consists of an angle-dependent term in $Z_{28}$ or $\cos 6\theta$. The balancing defocus, however, to optimally balance Seidel astigmatism for a hexagonal pupil is the same as that for a circular or an annular pupil.

The isometric, interferometric, and PSF plots for the hexagonal polynomial
aberrations are shown. The P-V numbers for the polynomials with a sigma value of one
wave are given, and the Strehl ratios are calculated for a sigma value of one-tenth of a
wave to illustrate that the exponential expression for it, in terms of the aberration
variance, gives a good estimate for small aberrations.

The balancing of Seidel aberrations is considered, and their standard deviations are
obtained by expressing them in terms of the orthonormal polynomials. The diffraction
focus is shown to lie closer to the Gaussian image point in the case of coma, and closer to
the Gaussian image plane in the case of spherical aberration, compared to their
corresponding locations for a circular pupil. Plots of Strehl ratio as a function of the
sigma value of a Seidel aberration are given. They demonstrate that the exponential expression underestimates the Strehl ratio in the case of defocus, but overestimates it in the case of astigmatism, coma, and spherical aberration. The Strehl ratio is estimated very well for balanced astigmatism and coma, but it is underestimated in the case of balanced spherical aberration for $\sigma_W > 0.2$.

7.2 PUPIL FUNCTION

Consider an imaging system with a uniformly illuminated hexagonal exit pupil with each side of length a and area $S_{ex} = (3\sqrt{3}/2)\, a^2$ lying in the $(x_p, y_p)$ plane with z axis as its optical axis, as illustrated in Figure 7-1. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

\[ P(x_p, y_p) = A(x_p, y_p) \exp[i\Phi(x_p, y_p)] , \tag{7-1} \]

where

\[ A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} \tag{7-2} \]

across the hexagonal pupil.

Figure 7-1. (a) Hexagonal pupil with dimension a. (b) Unit hexagonal pupil inscribed inside a unit circle showing the coordinates of its corners. Each side of the hexagon has a length of unity. The x axis passes through the corners D and A, and y axis bisects its parallel sides EF and CB.

7.3 ABERRATION-FREE IMAGING

7.3.1 PSF

From Eq. (1-9), the aberrated irradiance distribution in the image plane normalized by its aberration-free central value $P_{ex} S_{ex} / \lambda^2 R^2$ can be written

\[ I(\vec{r}_i) = \frac{1}{S_{ex}^2} \left| \int \exp[i\Phi(\vec{r}_p)] \exp\left( -\frac{2\pi i}{\lambda R}\, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 , \tag{7-3} \]

or, using Cartesian coordinates,

\[ I(x_i, y_i) = \frac{1}{S_{ex}^2} \left| \iint \exp[i\Phi(x_p, y_p)] \exp\left[ -\frac{2\pi i}{\lambda R} (x_i x_p + y_i y_p) \right] dx_p\, dy_p \right|^2 , \tag{7-4} \]

where the integration is carried over the hexagonal pupil. Letting

\[ (x_p, y_p) = a (x', y') \tag{7-5} \]

and

\[ (x_i, y_i) = \lambda F_x (x, y) , \tag{7-6} \]

where

\[ F_x = R/2a \tag{7-7} \]

is the focal ratio of the image-forming light cone along the x axis, Eq. (7-4) can be written

\[ I(x, y) = \frac{4}{27} \left| \iint \exp[i\Phi(x', y')] \exp[-\pi i (x x' + y y')]\, dx'\, dy' \right|^2 . \tag{7-8} \]

For the aberration-free case, Eq. (7-8) reduces to

\[ I(x, y) = \frac{4}{27} \left| \iint \exp[-\pi i (x x' + y y')]\, dx'\, dy' \right|^2 . \tag{7-9} \]

The hexagonal region of integration consists of a rectangle CBEF and two congruent triangles BFA and CDE with the limits of integration $(-1/2, 1/2;\ -\sqrt{3}/2, \sqrt{3}/2)$, $[1/2, 1;\ -\sqrt{3}(1 - x'), \sqrt{3}(1 - x')]$, and $[-1, -1/2;\ -\sqrt{3}(1 + x'), \sqrt{3}(1 + x')]$, respectively. In each case, the first pair of limits is on $x'$, and the second on $y'$. Hence, the irradiance distribution is given by

\[ I(x, y) = \frac{4}{27} \left| \left[ \int_{-1/2}^{1/2} dx' \int_{-\sqrt{3}/2}^{\sqrt{3}/2} + \int_{1/2}^{1} dx' \int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')} + \int_{-1}^{-1/2} dx' \int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')} \right] \exp[-\pi i (x x' + y y')]\, dy' \right|^2 . \tag{7-10} \]

The integrand in Eq. (7-10) is separable in the integration coordinates. We carry out the integration of each of its three parts:

\[ A_1(x, y) = \int_{-1/2}^{1/2} dx' \int_{-\sqrt{3}/2}^{\sqrt{3}/2} \exp[-\pi i (x x' + y y')]\, dy' = \frac{4}{\pi^2 x y} \sin(\pi x/2) \sin(\sqrt{3}\pi y/2) , \tag{7-11} \]

\[ A_2(x, y) = \int_{1/2}^{1} dx' \int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')} \exp[-\pi i (x x' + y y')]\, dy' = \frac{-2}{\pi^2 y (x^2 - 3y^2)} \left\{ e^{-i\pi x/2} \left[ -\sqrt{3}\, y \cos(\sqrt{3}\pi y/2) + i x \sin(\sqrt{3}\pi y/2) \right] + \sqrt{3}\, y\, e^{-i\pi x} \right\} , \tag{7-12} \]

\[ A_3(x, y) = \int_{-1}^{-1/2} dx' \int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')} \exp[-\pi i (x x' + y y')]\, dy' = \frac{2}{\pi^2 y (x^2 - 3y^2)} \left\{ e^{i\pi x/2} \left[ \sqrt{3}\, y \cos(\sqrt{3}\pi y/2) + i x \sin(\sqrt{3}\pi y/2) \right] - \sqrt{3}\, y\, e^{i\pi x} \right\} . \tag{7-13} \]

Combining $A_2$ and $A_3$, we find that their sum is real:

\[ A_2 + A_3 = \frac{4}{\pi^2 y (x^2 - 3y^2)} \left[ \sqrt{3}\, y \cos(\pi x/2) \cos(\sqrt{3}\pi y/2) - x \sin(\pi x/2) \sin(\sqrt{3}\pi y/2) - \sqrt{3}\, y \cos(\pi x) \right] . \tag{7-14} \]

From Eqs. (7-11) and (7-14), we obtain

\[ A_1 + A_2 + A_3 = \frac{4}{\pi^2 x (x^2 - 3y^2)} \left\{ \sqrt{3}\, x \left[ \cos(\pi x/2) \cos(\sqrt{3}\pi y/2) - \cos(\pi x) \right] - 3 y \sin(\pi x/2) \sin(\sqrt{3}\pi y/2) \right\} . \tag{7-15} \]

The sum of the three parts of diffracted amplitude is real. The irradiance distribution is given by

\[ I(x, y) = \frac{4}{27} \left| A_1 + A_2 + A_3 \right|^2 = \frac{4}{27} (A_1 + A_2 + A_3)^2 . \tag{7-16} \]
Using L’Hôpital’s rule, it can be shown that the PSF $I(0, 0)$ at the origin is unity, as expected from the normalization in Eq. (7-3). Rotating the $(x, y)$ coordinate system by $60^\circ$, i.e., by changing $(x, y)$ to $(1/2)\left[ x + \sqrt{3}\, y,\ y - \sqrt{3}\, x \right]$, it can be shown that the PSF remains invariant, thus showing that the PSF is 6-fold symmetric, as expected for the 6-fold symmetric pupil. The PSF along the x and y axes can be written from Eqs. (7-15) and (7-16) as

\[ I(x, 0) = \frac{64}{9\pi^4 x^4} \left[ \cos(\pi x/2) - \cos(\pi x) \right]^2 \tag{7-17a} \]

and

\[ I(0, y) = \frac{16}{243\pi^4 y^4} \left\{ 2\sqrt{3} \left[ 1 - \cos(\sqrt{3}\pi y/2) \right] + 3\pi y \sin(\sqrt{3}\pi y/2) \right\}^2 . \tag{7-17b} \]

A 2D PSF is shown in Figure 7-2. The PSF in Figure 7-2a emphasizes the low-value details, but that in Figure 7-2b is truncated to a value of $10^{-3}$ relative to a value of unity at the center. It shows a nearly circular bright spot at the center surrounded by nearly hexagonal alternating dark and bright rings, three dark and two bright. Beyond the rings, the PSF breaks into six diffracted arms each of alternating bright and dark strips with some dim structure between two consecutive arms. Plots of the PSF along the x and y axes and at $15^\circ$ from the x axis are shown in Figure 7-3 as $I(x, 0)$, $I(0, y)$, and $I(15^\circ) \equiv I(r)$, respectively. The solid curve $I_c$ represents the Airy pattern for a circular pupil (of the same radius a as the side of the hexagonal pupil imaging an object at the same wavelength $\lambda$ with the same focal ratio as $F_x$) with its first zero at 1.22, as in Figure 4-2. The central bright spot has its zero value along the x axis at 1.33, and at 1.35 along the y axis.
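The closed-form result of Eqs. (7-15) and (7-16) can be checked numerically. The following sketch is an illustration only: the function names `hex_amplitude` and `hex_psf` and the `nudge` parameter are assumptions of this example. Eq. (7-15) has removable singularities on the lines $x = 0$ and $x^2 = 3y^2$; instead of taking the limits analytically, the sketch steps slightly off those lines, which is adequate for plotting but not for high-precision work.

```python
import math

SQRT3 = math.sqrt(3.0)

def hex_amplitude(x, y, nudge=1e-4):
    """Sum A1 + A2 + A3 of Eq. (7-15); x and y are in units of lambda * F_x."""
    # Step off the removable singularities at x = 0 and x**2 = 3*y**2.
    if abs(x) < nudge or abs(x * x - 3.0 * y * y) < nudge:
        x += 2.0 * nudge
    c = math.cos(SQRT3 * math.pi * y / 2.0)
    s = math.sin(SQRT3 * math.pi * y / 2.0)
    numerator = (SQRT3 * x * (math.cos(math.pi * x / 2.0) * c - math.cos(math.pi * x))
                 - 3.0 * y * math.sin(math.pi * x / 2.0) * s)
    return 4.0 * numerator / (math.pi ** 2 * x * (x * x - 3.0 * y * y))

def hex_psf(x, y):
    """Aberration-free PSF of Eq. (7-16), normalized to unity at the origin."""
    return (4.0 / 27.0) * hex_amplitude(x, y) ** 2

if __name__ == "__main__":
    print("I(0, 0)    ~", hex_psf(0.0, 0.0))
    print("I(4/3, 0)  ~", hex_psf(4.0 / 3.0, 0.0))   # first zero along x
    print("I(0, 1.35) ~", hex_psf(0.0, 1.35))        # near the first zero along y
    # A 60-degree rotation of the coordinates should leave the PSF unchanged.
    x, y = 0.7, 0.2
    xr, yr = 0.5 * (x + SQRT3 * y), 0.5 * (y - SQRT3 * x)
    print("I(x, y), I(rotated):", hex_psf(x, y), hex_psf(xr, yr))
```

From Eq. (7-17a), the zeros along the x axis occur where $\cos(\pi x/2) = \cos(\pi x)$, the first of which is $x = 4/3$, consistent with the value 1.33 quoted in the text.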

The ensquared power, i.e., the fractional power in a square region centered at the Gaussian image point, is given by

\[ P(s) = \int_{-s}^{s} dx \int_{-s}^{s} I(x, y)\, dy , \tag{7-18} \]

where s is the half-width of the square. It is tabulated in Table 7-1 along with the corresponding value for a circular pupil. The two ensquared powers are plotted in Figure 7-4 as $P_h$ and $P_c$. The ensquared power for a hexagonal pupil, plotted as a dotted curve $P_h$, starts at zero and rises to 83.8% as s increases to the first zero along the x axis at 1.33, like the Airy disc of radius 1.22 for a circular pupil (as in Figure 4-2a), and approaches 100% asymptotically. It is evident that the ensquared power for a hexagonal pupil is lower than the corresponding value for a circular pupil.
Figure 7-2. 2D aberration-free PSF of a system with a hexagonal pupil, shown in panels (a) and (b).

Figure 7-3. PSF along the x and y axes and at $15^\circ$ from the x axis, where x, y, and r are in units of $\lambda F_x$.

Table 7-1. Ensquared power $P_h$ of a system with a hexagonal pupil, where s is the half-width of a square in units of $\lambda F_x$, compared with the ensquared power $P_c$ for a circular pupil.

s Ph Pc

0 0 0
0.1 0.0256 0.0310
0.2 0.0984 0.1180
0.3 0.2070 0.2449
0.4 0.3354 0.3897
0.5 0.4663 0.5302
0.6 0.5848 0.6491
0.7 0.6809 0.7369
0.8 0.7504 0.7930
0.9 0.7945 0.8229
1 0.8186 0.8360
1.2 0.8344 0.8455
1.4 0.8434 0.8624
1.6 0.8613 0.8862
1.8 0.8819 0.9043
2 0.8972 0.9135
2.2 0.9060 0.9184
2.4 0.9116 0.9241
2.6 0.9175 0.9315
2.8 0.9244 0.9384
3 0.9311 0.9426
3.5 0.9397 0.9495
4 0.9469 0.9573
4.5 0.9536 0.9615
5 0.9575 0.9662
6 0.9645 0.9722
7 0.9699 0.9765
8 0.9738 0.9798
9 0.9768 0.9823
10 0.9791 0.9843

Figure 7-4. Ensquared power as a function of the half-width s of a square, where s is in units of $\lambda F_x$.

7.3.2 OTF

From Eq. (1-11), the OTF for a uniformly illuminated hexagonal pupil can be obtained as the autocorrelation of the pupil function:

\[ \tau(\vec{v}) = S_{ex}^{-1} \int \exp[i Q(\vec{r}_p; \vec{v})]\, d\vec{r}_p , \tag{7-19} \]

where

\[ Q(\vec{r}_p; \vec{v}) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}) \tag{7-20} \]

is the phase aberration difference function, and $\vec{v}$ is a spatial frequency vector in the image plane. The integration in Eq. (7-19) is carried out over the overlap area of two hexagonal pupils whose centers are displaced from each other by $\lambda R \vec{v}$. In the aberration-free case, the OTF is real and simply equal to the relative area of overlap of two pupils where the center of one is displaced from that of the other by $\lambda R \vec{v}$.

For a displacement x along the x axis, as in Figure 7-5a, the overlap area consists of
two isosceles triangles and a rectangle when $x < a$. The area of each triangle is $\sqrt{3}a^2/4$,
and that of the rectangle is $\sqrt{3}a(a - x)$. The total fractional overlap area is $1 - 2x/3a$.
For $x = a$, as in Figure 7-5b, the rectangle vanishes and the two triangles meet, forming a
rhombus. For $x > a$, the two triangles intersect each other, thus reducing the size and
therefore the area of the rhombus. The fractional area of the rhombus is given by
$(1/3)(2 - x/a)^2$. The rhombus vanishes as $x \to 2a$, and the two hexagons meet at a
vertex only, namely, the extreme right-hand vertex of one hexagon and the extreme left-hand
vertex of the other. Replacing the displacement x by $\lambda R v_x$, where $v_x$ is a spatial
frequency along the x axis, and normalizing it by the cutoff frequency $1/\lambda F_x$ along this
axis, we can write the tangential or the x-OTF as

Figure 7-5. Overlap area of two hexagonal pupils displaced from each other along
the x axis in (a) and with x = a in (b), and along the y axis in (c).

$$\tau_x(v_x) = \begin{cases} 1 - (4/3)\,v_x , & 0 \le v_x \le 1/2 \\ (4/3)(1 - v_x)^2 , & 1/2 \le v_x \le 1 . \end{cases} \tag{7-21}$$

Now consider a displacement y along the y axis, as illustrated in Figure 7-5c. Here
again, the overlap area consists of two congruent isosceles triangles and a rectangle. The
area of each triangle is $\left(1/4\sqrt{3}\right)\left(\sqrt{3}a - y\right)^2$ and that of the rectangle is $a\left(\sqrt{3}a - y\right)$ for
$0 \le y \le \sqrt{3}a$. The fractional overlap area is given by $(2/3)\left[\left(1 - y/\sqrt{3}a\right) + (1/2)\left(1 - y/\sqrt{3}a\right)^2\right]$.
Again, replacing y by $\lambda R v_y$, where $v_y$ is the spatial frequency along the y axis, and
normalizing by the cutoff frequency $1/\lambda F_x$, the sagittal or the y-OTF can be written

$$\tau_y(v_y) = \frac{2}{3}\left[\left(1 - \frac{2v_y}{\sqrt{3}}\right) + \frac{1}{2}\left(1 - \frac{2v_y}{\sqrt{3}}\right)^2\right] , \quad 0 \le v_y \le \sqrt{3}/2 . \tag{7-22}$$

Note that the cutoff frequency in the y direction is $\sqrt{3}/2$ compared to a value of unity in
the x direction.

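It can be shown that the OTF for an angle $\theta$ from the x axis in the range $0 \le \theta \le \pi/6$
is given by [6]

$$\tau(v_\theta) = \begin{cases} 1 - \dfrac{4}{3\sqrt{3}}\, v_\theta \left[ \sin\theta + \sqrt{3}\cos\theta + \left( \dfrac{2}{\sqrt{3}}\sin^2\theta - \sin 2\theta \right) v_\theta \right] , & 0 \le v_\theta \le v_1 \\[2ex] \dfrac{4}{3} - \dfrac{8}{3}\cos\theta \; v_\theta + \dfrac{4}{9}\left( 1 + 2\cos 2\theta \right) v_\theta^2 , & v_1 \le v_\theta \le v_2 , \end{cases} \tag{7-23}$$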

where $v_\theta$ is the normalized spatial frequency for the angle $\theta$ and

$$v_1 = \left[ 2\left( \cos\theta - \frac{\sin\theta}{\sqrt{3}} \right) \right]^{-1} \tag{7-24}$$

and

$$v_2 = \left( \cos\theta + \frac{\sin\theta}{\sqrt{3}} \right)^{-1} \tag{7-25}$$

are normalized spatial frequencies corresponding to the displacements $r_1$ and $r_2$. The
spatial frequency $v_2$ represents the cutoff frequency as a function of angle $\theta$. It
decreases monotonically from a value of unity to $\sqrt{3}/2$ as the angle $\theta$ increases from
zero to $\pi/6$. By letting $\theta = 0$, we obtain the OTF along the x axis as given by Eq. (7-21).
Similarly, $\theta = \pi/6$ yields the OTF along the y axis given by Eq. (7-22), since the OTFs
for angles $\pi/6$ and $\pi/2$ are identical owing to the six-fold symmetry of the hexagonal
pupil. The OTF for the range $\pi/6 \le \theta \le \pi/3$ is the same as that for the range $0 \le \theta \le \pi/6$,
because of the symmetry of the pupil about the direction making an angle of $\pi/6$. For
larger angles, we make use of the six-fold symmetry of the OTF.
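
The angular dependence described above can be explored numerically. The following is a minimal sketch, not the author's implementation: the function names `otf_hex` and `otf_overlap` are hypothetical, the pupil is taken as a unit hexagon (a = 1), and the branch for $v_1 \le v_\theta \le v_2$ is written in the factored form $(4/3)(1 - v_\theta/v_2)(1 - v_\theta/2v_1)$, which is an assumption of this sketch obtained from the parallelogram-shaped overlap near cutoff. A brute-force grid estimate of the overlap area is included as an independent cross-check.

```python
import numpy as np

SQ3 = np.sqrt(3.0)

def otf_hex(v, theta):
    """Aberration-free OTF of a hexagonal pupil.

    v is the spatial frequency in units of 1/(lambda*Fx); theta is in radians.
    """
    v = abs(v)
    t = np.mod(theta, np.pi / 3)       # six-fold symmetry of the OTF
    if t > np.pi / 6:                  # mirror symmetry about pi/6
        t = np.pi / 3 - t
    c, s = np.cos(t), np.sin(t)
    v1 = 1.0 / (2.0 * (c - s / SQ3))   # breakpoint between the two branches
    v2 = 1.0 / (c + s / SQ3)           # cutoff frequency for this angle
    if v >= v2:
        return 0.0
    if v <= v1:
        quad = (2.0 / SQ3) * s * s - np.sin(2.0 * t)
        return 1.0 - 4.0 / (3.0 * SQ3) * v * (s + SQ3 * c + quad * v)
    # Assumed factored form of the high-frequency branch (parallelogram overlap).
    return 4.0 / 3.0 * (1.0 - v / v2) * (1.0 - v / (2.0 * v1))

def inside(x, y):
    """True for points inside the unit hexagon with corners at (+-1, 0)."""
    return (np.abs(y) <= SQ3 / 2) & (np.abs(x) + np.abs(y) / SQ3 <= 1.0)

def otf_overlap(v, theta, n=1201):
    """Grid estimate of the fractional overlap area of two displaced hexagons."""
    g = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(g, g)
    dx, dy = 2.0 * v * np.cos(theta), 2.0 * v * np.sin(theta)  # shift in units of a
    first = inside(X, Y)
    both = first & inside(X - dx, Y - dy)
    return np.count_nonzero(both) / np.count_nonzero(first)

if __name__ == "__main__":
    for deg in (0.0, 15.0, 30.0, 90.0):
        th = np.radians(deg)
        print(deg, otf_hex(0.3, th), otf_overlap(0.3, th))
```

The grid estimate converges only linearly with the grid spacing, so agreement with the closed form to a few parts in a thousand is all that should be expected from it.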

Figure 7-6 shows how the OTF varies with the spatial frequency (in units of the
cutoff frequency $1/\lambda F_x$) along the x and y axes, and at 15° from the x axis as $\tau(v_x)$,
$\tau(v_y)$ (in long dashes), and $\tau(15^\circ) \equiv \tau(v)$. The OTF of a system with a corresponding
circular pupil of radius a is also included for comparison as $\tau_c$. Note that the cutoff
frequency of the hexagonal pupil is the same as that for the circular pupil only along the x
axis and every 60° from it. Otherwise, it is smaller. We note that the OTF of a
hexagonal pupil is lower than that for a circular pupil at all spatial frequencies. The OTF
along the x axis is slightly higher than that along the y axis, and the OTF at 15° is slightly
higher in the low-frequency region but lower in the high-frequency region. The 15° OTF is
lower than that along the x axis. The differences among the three curves are relatively small.

Figure 7-6. OTF along the x and y axes, and at 15° from the x axis, where the spatial
frequencies $v_x$, $v_y$, and $v$ are in units of $1/\lambda F_x$.

7.4 HEXAGONAL POLYNOMIALS


Figure 7-7 shows a unit hexagon inscribed inside a unit circle. The x axis passes
through the corners D and A, and the y axis bisects its parallel sides EF and CB. The
coordinates of the corners of the hexagon are labeled in the figure. Each side of the
hexagon has a length of unity. The area of the unit hexagon is $A = 3\sqrt{3}/2$.

The orthonormal hexagonal polynomials $H_j$ obtained by orthogonalizing the
Zernike circle polynomials over a hexagon [5,6] are given by [see Eq. (3-18)]

$$H_{j+1} = N_{j+1}\left[ Z_{j+1} - \sum_{k=1}^{j} \left\langle Z_{j+1} H_k \right\rangle H_k \right] , \tag{7-26}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit hexagon, i.e., they satisfy the orthonormality condition

$$\frac{2}{3\sqrt{3}} \int_{\rm hexagon} H_j H_{j'} \, dx \, dy = \delta_{jj'} . \tag{7-27}$$

The hexagonal region of integration consists of a rectangle EFCB and two congruent
triangles FAB and CDE with limits of integration $\left[-1/2, 1/2;\ -\sqrt{3}/2, \sqrt{3}/2\right]$,
$\left[1/2, 1;\ -\sqrt{3}(1 - x), \sqrt{3}(1 - x)\right]$, and $\left[-1, -1/2;\ -\sqrt{3}(1 + x), \sqrt{3}(1 + x)\right]$, respectively. The
angular brackets indicate a mean value over the hexagonal pupil. Thus,

$$\left\langle Z_{j+1} H_k \right\rangle = \frac{2}{3\sqrt{3}} \int_{\rm hexagon} Z_{j+1} H_k \, dx \, dy . \tag{7-28}$$
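
The recursion of Eq. (7-26) can be carried out numerically. The sketch below is illustrative only: it assumes Noll-ordered Zernike circle polynomials $Z_1$ through $Z_6$ written in Cartesian form, uses Gauss-Legendre quadrature over the rectangle and the two triangles described above (the helper names `hex_nodes`, `zernikes`, and `gram_schmidt_hex` are hypothetical), and returns the lower-triangular matrix of coefficients that expresses each $H_j$ in terms of the $Z_k$.

```python
import numpy as np

def hex_nodes(n=16):
    """Gauss-Legendre nodes and weights over the unit hexagon with corners at (+-1, 0)."""
    t, w = np.polynomial.legendre.leggauss(n)
    xs, ys, ws = [], [], []
    # Left triangle, central rectangle, right triangle, split along x.
    for x0, x1 in [(-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)]:
        x = 0.5 * (x1 - x0) * t + 0.5 * (x1 + x0)
        wx = 0.5 * (x1 - x0) * w
        # Half-height of the hexagon at abscissa x.
        h = np.where(np.abs(x) <= 0.5, np.sqrt(3) / 2, np.sqrt(3) * (1 - np.abs(x)))
        xs.append(np.repeat(x, n))
        ys.append((h[:, None] * t[None, :]).ravel())
        ws.append((wx[:, None] * h[:, None] * w[None, :]).ravel())
    return np.concatenate(xs), np.concatenate(ys), np.concatenate(ws)

def zernikes(x, y):
    """Zernike circle polynomials Z1..Z6 (Noll ordering assumed) in Cartesian form."""
    r2 = x * x + y * y
    return np.array([
        np.ones_like(x),               # Z1, piston
        2 * x,                         # Z2, x tilt
        2 * y,                         # Z3, y tilt
        np.sqrt(3) * (2 * r2 - 1),     # Z4, defocus
        2 * np.sqrt(6) * x * y,        # Z5, astigmatism (sin 2 theta)
        np.sqrt(6) * (x * x - y * y),  # Z6, astigmatism (cos 2 theta)
    ])

def gram_schmidt_hex():
    """Return M with H_j = sum_k M[j, k] Z_k, orthonormal over the unit hexagon."""
    x, y, w = hex_nodes()
    mean = lambda f: np.sum(w * f) / np.sum(w)   # mean value over the hexagon
    Z = zernikes(x, y)
    n_poly = Z.shape[0]
    M = np.zeros((n_poly, n_poly))
    H = np.zeros_like(Z)
    for j in range(n_poly):
        g = Z[j].copy()
        c = np.zeros(n_poly)
        c[j] = 1.0
        for k in range(j):
            p = mean(Z[j] * H[k])     # <Z_{j+1} H_k> of Eq. (7-28)
            g -= p * H[k]
            c -= p * M[k]
        norm = 1.0 / np.sqrt(mean(g * g))   # N_{j+1} of Eq. (7-26)
        H[j] = norm * g
        M[j] = norm * c
    return M

if __name__ == "__main__":
    np.set_printoptions(precision=6, suppress=True)
    print(gram_schmidt_hex())
```

Because the integrands are low-order polynomials and the hexagon boundary is piecewise linear in x, the quadrature is exact to rounding error, so the printed coefficients can be compared directly with the first rows of Table 7-2.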

The orthonormal hexagonal polynomials are given in Tables 7-2–7-4 up to the eighth
order in three different but equivalent forms [9,10]. In Table 7-2, each hexagonal
polynomial is written in terms of the circle polynomials, thus illustrating the relationship

Figure 7-7. Unit hexagon inscribed inside a unit circle showing the coordinates of its
corners: $A(1, 0)$, $B(1/2, -\sqrt{3}/2)$, $C(-1/2, -\sqrt{3}/2)$, $D(-1, 0)$, $E(-1/2, \sqrt{3}/2)$, and
$F(1/2, \sqrt{3}/2)$. Each side of the hexagon has a length of unity. The x axis passes through
the corners D and A, and the y axis bisects its parallel sides EF and CB.

Table 7-2. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in terms of the Zernike
circle polynomials $Z_j(\rho, \theta)$.

H1 $Z_1$
H2 $\sqrt{6/5}\,Z_2$
H3 $\sqrt{6/5}\,Z_3$
H4 $\sqrt{5/43}\,Z_1 + 2\sqrt{15/43}\,Z_4$
H5 $\sqrt{10/7}\,Z_5$
H6 $\sqrt{10/7}\,Z_6$
H7 $16\sqrt{14/11055}\,Z_3 + 10\sqrt{35/2211}\,Z_7$
H8 $16\sqrt{14/11055}\,Z_2 + 10\sqrt{35/2211}\,Z_8$
H9 $(2\sqrt{5}/3)\,Z_9$
H10 $2\sqrt{35/103}\,Z_{10}$
H11 $(521/\sqrt{1072205})\,Z_1 + 88\sqrt{15/214441}\,Z_4 + 14\sqrt{43/4987}\,Z_{11}$
H12 $225\sqrt{6/492583}\,Z_6 + 42\sqrt{70/70369}\,Z_{12}$
H13 $225\sqrt{6/492583}\,Z_5 + 42\sqrt{70/70369}\,Z_{13}$
H14 $-2525\sqrt{14/297774543}\,Z_6 - (1495\sqrt{70/99258181}/3)\,Z_{12} + (\sqrt{378910/18337}/3)\,Z_{14}$
H15 $2525\sqrt{14/297774543}\,Z_5 + (1495\sqrt{70/99258181}/3)\,Z_{13} + (\sqrt{378910/18337}/3)\,Z_{15}$
H16 $30857\sqrt{2/3268147641}\,Z_2 + (49168/\sqrt{3268147641})\,Z_8 + 42\sqrt{1474/1478131}\,Z_{16}$
H17 $30857\sqrt{2/3268147641}\,Z_3 + (49168/\sqrt{3268147641})\,Z_7 + 42\sqrt{1474/1478131}\,Z_{17}$
H18 $386\sqrt{770/295894589}\,Z_{10} + 6\sqrt{118965/2872763}\,Z_{18}$
H19 $6\sqrt{10/97}\,Z_9 + 14\sqrt{5/291}\,Z_{19}$

H20 0.71499593Z2 0.72488884Z8 0.46636441Z16 +1.72029850Z20


H21 0.71499594Z3 + 0.72488884Z7 + 0.46636441Z17 + 1.72029850Z21
H22 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22
H23 1.15667686Z5 + 1.10775599Z13 + 0.43375081Z15 + 1.39889072Z23
H24 1.15667686Z6 + 1.10775599Z12 0.43375081Z14 + 1.39889072Z24
H25 1.31832566Z5 + 1.14465174Z13 + 1.94724032Z15 + 0.67629133Z23 + 1.75496998Z25
H26 1.31832566Z6 1.14465174Z12 + 1.94724032Z14 0.67629133Z24 + 1.75496998Z26

H27 $2\sqrt{77/93}\,Z_{27}$

H28 1.07362889Z1 1.52546162Z4 1.28216588Z11 0.70446308Z22 + 2.09532473Z28


H29 0.97998834Z3 + 1.16162002Z7 +1.04573775Z17 +0.40808953Z21 +1.36410394Z29
H30 0.97998834Z2 + 1.16162002Z8 + 1.04573775Z16 0.40808953Z20 + 1.36410394Z30
H31 3.63513758Z9 + 2.92084414Z19 + 2.11189625Z31
H32 0.69734874Z10 + 0.67589740Z18 + 1.22484055Z32
H33 1.56189763Z3 + 1.69985309Z7 + 1.29338869Z17 + 2.57680871Z21
+ 0.67653220Z29 + 1.95719339Z33
H34 1.56189763Z2 1.69985309Z8 1.29338869Z16 + 2.57680871Z20
0.67653220Z30 + 1.95719339Z34
H35 1.63832594Z3 1.74759886Z7 1.27572528Z17 0.77446421Z21
0.60947360Z29 0.36228537Z33 + 2.24453237Z35
H36 1.63832594Z2 1.74759886Z8 1.27572528Z16 + 0.77446421Z20
0.60947360Z30 + 0.36228537Z34 + 2.24453237Z36
H37 0.82154671Z1 + 1.27988084Z4 + 1.32912377Z11 + 1.11636637Z22
0.54097038Z28 + 1.37406534Z37
H38 1.54526522Z6 + 1.57785242Z12 0.89280081Z14 + 1.28876176Z24
0.60514082Z26 + 1.43097780Z38
H39 1.54526522Z5 + 1.57785242Z13 + 0.89280081Z15 + 1.28876176Z23
+ 0.60514082Z25 + 1.43097780Z39
H40 2.51783502Z6 2.38279377Z12 + 3.42458933Z14 1.69296616Z24
+ 2.56612920Z26 0.85703819Z38 + 1.89468756Z40
H41 2.51783502Z5 + 2.38279377Z13 + 3.42458933Z15 + 1.69296616Z23
+ 2.56612920Z25 + 0.85703819Z39 + 1.89468756Z41
H42 2.72919646Z1 4.02313214Z4 3.69899239Z11 2.49229315Z22
+ 4.36717121Z28 1.13485132Z37 + 2.52330106Z42

H43 $1362\sqrt{77/20334667}\,Z_{27} + (260/3)\sqrt{341/655957}\,Z_{43}$

H44 2.76678413Z6 2.50005278Z12 + 1.48041348Z14 1.62947374Z24


+ 0.95864121Z26 0.69034812Z38 + 0.40743941Z40 + 2.56965299Z44

H45 2.76678413Z5 2.50005278Z13 1.48041348Z15 1.62947374Z23


0.95864121Z25 0.69034812Z39 0.40743941Z41 + 2.56965299Z45

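Table 7-3. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in polar coordinates
$(\rho, \theta)$.

H1 $1$
H2 $2\sqrt{6/5}\,\rho\cos\theta$
H3 $2\sqrt{6/5}\,\rho\sin\theta$
H4 $\sqrt{5/43}\,(-5 + 12\rho^2)$
H5 $2\sqrt{15/7}\,\rho^2\sin 2\theta$
H6 $2\sqrt{15/7}\,\rho^2\cos 2\theta$
H7 $4\sqrt{42/3685}\,(-14\rho + 25\rho^3)\sin\theta$
H8 $4\sqrt{42/3685}\,(-14\rho + 25\rho^3)\cos\theta$
H9 $(4\sqrt{10}/3)\,\rho^3\sin 3\theta$
H10 $4\sqrt{70/103}\,\rho^3\cos 3\theta$
H11 $(3/\sqrt{1072205})\,(737 - 5140\rho^2 + 6020\rho^4)$
H12 $(30/\sqrt{492583})\,(-249\rho^2 + 392\rho^4)\cos 2\theta$
H13 $(30/\sqrt{492583})\,(-249\rho^2 + 392\rho^4)\sin 2\theta$
H14 $(10/3)\sqrt{7/99258181}\,[10(297 - 598\rho^2)\rho^2\cos 2\theta + 5413\rho^4\cos 4\theta]$
H15 $(10/3)\sqrt{7/99258181}\,[-10(297 - 598\rho^2)\rho^2\sin 2\theta + 5413\rho^4\sin 4\theta]$
H16 $2\sqrt{6/1089382547}\,(70369\rho - 322280\rho^3 + 309540\rho^5)\cos\theta$
H17 $2\sqrt{6/1089382547}\,(70369\rho - 322280\rho^3 + 309540\rho^5)\sin\theta$
H18 $4\sqrt{385/295894589}\,(-3322\rho^3 + 4635\rho^5)\cos 3\theta$
H19 $4\sqrt{5/97}\,(-22\rho^3 + 35\rho^5)\sin 3\theta$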


H20 ( 2.17600248ȡ + 13.23551876ȡ3 + 16.15533716ȡ5)cosș + 5.95928883ȡ5 cos5ș
H21 (2.17600248ȡ 13.23551876ȡ3 + 16.15533716ȡ5) sinș + 5.95928883ȡ5 sin5ș
H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6
H23 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)sin2ș + 1.37164051ȡ4sin4ș
H24 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)cos2ș 1.37164051ȡ4cos4ș
H25 (7.55280798ȡ2 36.13018255ȡ4 + 37.95675688ȡ6)sin2ș + ( 26.67476754ȡ4
+ 39.39897852ȡ6)sin4ș
H26 ( 7.55280798ȡ2 + 36.13018255ȡ4 37.95675688ȡ6)cos2ș + ( 26.67476754ȡ4
+ 39.39897852ȡ6)cos4ș
H27 $14\sqrt{22/93}\,\rho^6\sin 6\theta$
H28 0.56537219 10.44830313ȡ2 + 38.71296332ȡ4 37.27668254ȡ6 + 7.83998727ȡ6cos6ș
H29 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5
+ 190.97455178ȡ7)sinș + 1.41366362ȡ5sin5ș
H30 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5
+ 190.97455178ȡ7)cosș 1.41366362ȡ5cos5ș
H31 (54.28516840 202.83704634ȡ2 + 177.39928561ȡ4)ȡ3sin3ș
H32 (41.60051295 135.27397959ȡ2 + 102.88660624ȡ4)ȡ3cos3ș
H33 ( 3.87525156 + 41.84243767ȡ2 117.56342978ȡ4 + 94.71450820ȡ6)ȡsin ș
+ 76.09262860 + ( 38.04631430 + 54.80141514ȡ2)ȡ5sin5ș
H34 (3.87525156 + 41.84243767ȡ2 117.56342978ȡ4+ 94.71450820ȡ6)ȡcos ș
+ ( 38.04631430 + 54.80141514ȡ2)ȡ5cos5ș
H35 (3.10311187 34.93479698ȡ2 + 102.08124605ȡ4 85.32630533ȡ6)ȡsinș
+ (6.01202622 10.14399046ȡ2)ȡ5 sin 5ș + 8.978129552ȡ7sin7ș
H36 (3.10311187ȡ 34.93479698ȡ2 + 114.10529848ȡ4 87.65802721ȡ6)ȡcosș
+ (12.02405243 2.33172188ȡ2) ȡ5cos3ș + (12.02405243 + 3.68030434ȡ2)ȡ5cos5ș
+ 6.01202622ȡ7cos7ș
H37 2.74530738 60.39881618ȡ2 + 300.22087475ȡ4 518.03488742ȡ6
+ 288.55372176ȡ8 2.02412582ȡ6cos6ș
H38 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4
+ 339.98298180ȡ6)ȡ2cos2ș + (8.49786414 13.58537785ȡ2)ȡ4cos4ș
H39 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4
+ 339.98298180ȡ6)ȡ2sin2ș + (8.49786414 13.58537785ȡ2)ȡ4sin4ș
H40 (14.79181046 121.61654135ȡ2 + 286.77354559ȡ4
203.62188574ȡ )ȡ2cos2ș
6

+ (83.39879886 280.00664075ȡ2 + 225.07739907ȡ4)ȡ4cos4ș


H41 ( 14.79181046 + 121.61654135ȡ2 286.77354559ȡ4 + 203.62188574ȡ6)ȡ2sin2ș
+ (83.39879886 280.00664075ȡ2 + 225.07739907ȡ4)ȡ4sin4ș
H42 0.84269170 + 24.65387703ȡ2 158.21741244ȡ4 + 344.75780000ȡ6
238.31877895ȡ8 + ( 58.59775991 + 85.64367812ȡ2)ȡ6cos6ș

H43 $2\sqrt{22/20334667}\,(-23443 + 32240\rho^2)\rho^6\sin 6\theta$


H44 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4
164.01834750ȡ6)ȡ2cos2ș + (12.67622930 51.08055822ȡ2
+ 48.40133344ȡ4)ȡ4cos4ș + 10.90211434ȡ8cos8ș
H45 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4 164.01834750ȡ6)ȡ2sin2ș
(12.67622930 51.08055822ȡ2 + 48.40133344ȡ4)ȡ4sin4ș + 10.90211434ȡ8sin8ș

Table 7-4. Orthonormal hexagonal polynomials $H_j(x, y)$ in Cartesian coordinates
$(x, y)$, where $\rho^2 = x^2 + y^2$.

H1 $1$
H2 $2\sqrt{6/5}\,x$
H3 $2\sqrt{6/5}\,y$
H4 $\sqrt{5/43}\,(-5 + 12\rho^2)$
H5 $4\sqrt{15/7}\,xy$
H6 $2\sqrt{15/7}\,(x^2 - y^2)$
H7 $4\sqrt{42/3685}\,(-14 + 25\rho^2)\,y$
H8 $4\sqrt{42/3685}\,(-14 + 25\rho^2)\,x$
H9 $(4/3)\sqrt{10}\,(3x^2y - y^3)$
H10 $4\sqrt{70/103}\,(x^3 - 3xy^2)$
H11 $(3/\sqrt{1072205})\,(737 - 5140\rho^2 + 6020\rho^4)$
H12 $(30/\sqrt{492583})\,(392\rho^2 - 249)(x^2 - y^2)$
H13 $(60/\sqrt{492583})\,(392\rho^2 - 249)\,xy$
H14 $-(10/3)\sqrt{7/99258181}\,[567x^4 + 32478x^2y^2 - 11393y^4 - 2970(x^2 - y^2)]$
H15 $(40/3)\sqrt{7/99258181}\,(-1485 + 8403x^2 - 2423y^2)\,xy$
H16 $2\sqrt{2/3268147641}\,(211107 - 966840\rho^2 + 928620\rho^4)\,x$
H17 $2\sqrt{2/3268147641}\,(211107 - 966840\rho^2 + 928620\rho^4)\,y$
H18 $4\sqrt{385/295894589}\,(-3322 + 4635\rho^2)(x^3 - 3xy^2)$
H19 $4\sqrt{5/97}\,(-22 + 35\rho^2)(3x^2y - y^3)$


H20 ( 2.17600247 + 13.23551876ȡ2 + 13.64110699 ȡ4)x 119.18577680 ȡ2 x3
+ 95.3486212x5
H21 (2.17600247 13.23551876ȡ2 + 45.95178131ȡ 4)y 119.18577680 ȡ2y3
+ 95.34862128y5
H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6
H23 (47.45838189 175.85597460x2 186.82909872y2 + 157.02509476x4

+ 314.05018953x2y2 + 157.02509476y4)xy
H24 (23.72919094 92.04290884x2 + 78.51254738x4)x2 + ( 23.72919094
+ 8.22984309x2 + 89.29962781y2 + 78.51254738x4 78.51254738x2y2
78.51254738y4)y2
H25 (15.10561596 – 178.95943525x2 + 34.43870505y2 + 233.50942786x4
+ 151.82702751x2y2 – 81.68240034y4)xy
H26 (– 7.55280798 + 9.45541501x2 + 1.44222164x4)x2 + (7.55280798 + 160.04860523x2– 62.80495008y2
–234.95164950x4 – 159.03813574x2y2 + 77.35573540y4)y2
H27 (40.85537039x4 136.18456799 x2y2 + 40.85537039y4)xy
H28 0.56537219 – 10.44830312ȡ2 + 38.71296332x4 + 77.42592664 x2y2 + 38.71296332y4 29.43669525x6
229.42985678 x4y2 +5.76976155 x2y4 45.11666981y6

H29 ( 15.56917599 + 130.07864353ȡ2 – 284.09120931ȡ4 + 190.97455178ȡ6) y


– 28.2732724ȡ2y3 + 22.61861792y5
2 3 5
H30 ( 15.56917599 + 130.07864353ȡ2 – 298.22784553ȡ4 + 190.97455178ȡ6)x + 28.27327243ȡ x – 22.61861792x

H31 (162.85550520x2 54.28516840y2 608.51113904x2ȡ2 + 202.83704634y2ȡ2 +532.19785685x2ȡ4


177.39928561y2ȡ4)y

H32 [(41.60051295 135.27397959x2 + 102.88660624x4)x2 +( 124.80153887 + 270.54795919x2+ 405.82193879y2


102.88660624x4 – 514.43303123 x2y2 308.65981874y4)y2]x
H33 [ 3.87525156 + (41.84243767 307.79500129x2 + 368.72158389x4)x2 + (41.84243767 + 145.33628349x2
155.60974407y + 10.13644892x4
2
209.06921162 x2y2 + 149.51592334y4)y2]y
H34 [3.87525156 + ( 41.84243767 + 79.51711547x2 39.91309306x4)x2 + ( 41.84243767 + 615.59000259x2
72.66814174y2 777.35626084x4 558.15060029 x2y2 + 179.29256748y4)y2]x
H35 [3.10311187 + ( 34.93479698 + 132.14137712x2 73.19935100x4)x2 + ( 34.93479698 + 144.04222993x2
2 2 4 2
+ 108.09327226y2 519.49349681x4 + 23.85771799 x y 104.44842531y )y ]y
H36 [3.10311187 + ( 34.93479698 + 96.06921983x2 66.20418535x4)x2 + ( 34.93479698 + 264.28275425x2
2 2 4 2
+ 72.02111496y2 535.81555000x4 + 7.53566481 x y 97.45325965y )y ]x

H37 2.74530738 60.39881618ȡ2 + 300.22087475ȡ4 + 288.55372176ȡ8


520.05901324x6 1523.74277487 x4y2 1584.46654966 x2y4 516.01076159y6
H38 ( 42.96232789 + 296.28167478x2 578.72189394x4 + 339.98298180x6)x2 + (42.96232789 50.98718488x2
279.28594648y2 497.20962679x4 + 633.06340537 x2y2 + 551.55113822y4 + 679.96596360x6
679.96596360 x2y4 339.98298180y6)y2
H39 [ 85.92465579 + (541.57616468 1075.93152073x2 + 679.96596360x4)x2 + (609.55907786
2 2 2 4 2
2260.54606433x 1184.61454360y2 + 2039.89789081x4 + 2039.89789081x y + 679.96596360y )y ]xy
H40 (14.79181046 38.21774249x2 + 6.76690483x4 + 21.45551332x6)x2 + ( 14.79181046 500.39279319x2
2 4 2 2 4
+ 205.01534022y + 1686.80674937x + 1113.25965819 x y 566.78018634y 1307.55336779x6
4 2 2 4 6 2
2250.77399075 x y 493.06582480 x y + 428.69928482y )y
H41 [ 29.58362093 + (576.82827818 1693.57365421x2 + 1307.55336779x4)x2 +( 90.36211274
1147.09418236x2 + 546.47947184y2 + 2122.04091078x4 + 321.42171817x2y2 493.06582480y4)y2]xy
H42 0.84269170 + (24.65387703 158.21741244x2 + 286.16004008x4 152.67510082x6) x2+ (24.65387703
316.43482489x2 158.21741244y2 + 1913.23979875x4 + 155.30700127x2 y2 + 403.35555992y4
– 2152.28660953x6 – 1429.91267370x4y2 + 245.73637792x2y4 – 323.96245707y6)y2 + 403
3 3 5 2
H43 2 22 / 20334667 (6x5y 20x y +6xy )( 23443 + 32240ȡ )
2 4 6
H44 (9.64776957 72.74250912x + 164.99985615x 104.71489971x )x2
+ ( 9.64776957 –76.05737585x2 + 98.09496774y2 + 471.48320551x4
+ 39.32237674 x2y2 267.16097261y4 826.90123032x6
+ 279.13466933 x4 y2 170.82784030 x2 y4 + 223.32179529y6) y2
H45 [19.29553915 + ( 221.54239411 + 636.48306167x2 434.42511407x4)x2
+ ( 120.13255963 + 864.32165754x2 + 227.83859586y2 1788.23382186x4
179.98634818 x2y2 221.64827593y4)y2]xy

between the two. In particular, it helps determine the potential error made when a
hexagonal aberration function is expanded in terms of the circle polynomials (see Chapter
12). The coefficients of the circle polynomials are the elements of the conversion matrix
M (discussed in Chapter 3). The polynomials up to H19 are given in their analytical form,
but those with j > 19 are written in a numerical form because of the increasing
complexity of the coefficients of the circle polynomials. In Table 7-3, the hexagonal
polynomials are given in polar coordinates, showing one-to-one correspondence with the
circle polynomials but illustrating the difference between them. This form is convenient
for analytical calculations because of integration of trigonometric functions over
symmetric limits. Finally, the polynomials are given in Cartesian coordinates in
Table 7-4, for a quantitative numerical analysis of, say, an interferogram.

Several observations can be made from the polynomial tables. It is evident from
Table 7-2 that the corresponding coefficients of the Zernike polynomials that make up the
hexagonal polynomial (n, m) pairs are the same except for signs in some cases, unless m
is a multiple of 3. For example, H14 and H15 have some coefficients with different signs,
but H16 and H17 have the same signs. H9 and H10 , which correspond to n = 3 and m =
3, and H18 and H19 , which correspond to n = 5 and m = 3, have different coefficients.
From Table 7-3, we note that each hexagonal polynomial consists of cosine or sine terms,
but not both.

Unlike the circle and annular polynomials, the hexagonal polynomials are generally
not separable in $\rho$ and $\theta$ due to lack of radial symmetry of the hexagonal pupil. The first
13 polynomials, i.e., up to $H_{13}$, are separable, but $H_{14}$ and $H_{15}$ are not; $H_{16}$ through $H_{19}$
are separable, but $H_{20}$ and $H_{21}$ are not. Accordingly, the notion of two indices n and m
with dependence on m in the form of $\cos m\theta$ loses significance. For example, the Zernike
polynomial $Z_{14}$ for n = 4 and m = 4 varies as $\cos 4\theta$ but $H_{14}$ has a term in $\cos 2\theta$ also.
Hence, the hexagonal polynomials can be ordered by a single index only. While the
polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary spherical
aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced tertiary
spherical aberration is not, since it consists of an angle-dependent term in $Z_{28}$ or $\cos 6\theta$
also. If this term is not included in the polynomial $H_{37}$, the standard deviation of the
aberration increases from a value of unity to 1.13339.

A different configuration of a hexagonal pupil is illustrated in Figure 7-8 where the
hexagon is rotated by 30° compared to that in Figure 7-7 so that the point A, for example,
moves to a point A′. Whereas in Figure 7-7 the x axis passes through the corners D and A
of the hexagon and the y axis bisects its parallel sides EF and CB, in Figure 7-8 the x axis
bisects the parallel sides F′A′ and D′C′ of the hexagon and the y axis passes through its
corners E′ and B′. As a result, some polynomials change, as may be seen by comparing
the polynomials given in Table 7-5 for the 30-degree rotation with those in Table 7-2.
The first eight polynomials, $H_{11}$ through $H_{13}$, $H_{16}$, $H_{17}$, $H_{22}$, $H_{27}$, etc., do not change.
Polynomials $H_9$ and $H_{10}$, $H_{14}$ and $H_{15}$, and $H_{18}$ and $H_{19}$, etc., exchange the
coefficients of the circle polynomial components.


Figure 7-8. Unit hexagon rotated clockwise 30 degrees with respect to that in Figure 7-7,
showing the coordinates of its corners: $A'(\sqrt{3}/2, -1/2)$, $B'(0, -1)$, $C'(-\sqrt{3}/2, -1/2)$,
$D'(-\sqrt{3}/2, 1/2)$, $E'(0, 1)$, and $F'(\sqrt{3}/2, 1/2)$. The x axis bisects the parallel sides F′A′
and D′C′ of the hexagon, and the y axis passes through its corners E′ and B′.

7.5 HEXAGONAL COEFFICIENTS OF A HEXAGONAL ABERRATION FUNCTION

A hexagonal aberration function $W(x, y)$ across a unit hexagon can be expanded in
terms of J hexagonal polynomials $H_j(\rho, \theta)$ in the form

$$W(x, y) = \sum_{j=1}^{J} a_j H_j(x, y) , \tag{7-29}$$

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (7-29) by
$H_j(x, y)$, integrating over the unit hexagon, and using the orthonormality Eq. (7-27), we
obtain the hexagonal expansion coefficients:

$$a_j = \frac{2}{3\sqrt{3}} \int_{\rm hexagon} W(x, y) H_j(x, y) \, dx \, dy . \tag{7-30}$$

It is evident from Eq. (7-30) that the value of a hexagonal coefficient is independent of
the number J of polynomials used in the expansion of the aberration function. Hence, one
or more polynomial terms can be added to or subtracted from the aberration function
without affecting the value of the coefficients of the other polynomials in the expansion.

The mean and mean square values of the aberration function are given by

$$\left\langle W(\rho, \theta) \right\rangle = a_1 , \tag{7-31}$$

and

$$\left\langle W^2(\rho, \theta) \right\rangle = \sum_{j=1}^{J} a_j^2 , \tag{7-32}$$

Table 7-5. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in terms of Zernike circle
polynomials $Z_j(\rho, \theta)$ for a hexagon rotated by 30°, as in Figure 7-8.

H1 $Z_1$
H2 $\sqrt{6/5}\,Z_2$
H3 $\sqrt{6/5}\,Z_3$
H4 $\sqrt{5/43}\,Z_1 + 2\sqrt{15/43}\,Z_4$
H5 $\sqrt{10/7}\,Z_5$
H6 $\sqrt{10/7}\,Z_6$
H7 $16\sqrt{14/11055}\,Z_3 + 10\sqrt{35/2211}\,Z_7$
H8 $16\sqrt{14/11055}\,Z_2 + 10\sqrt{35/2211}\,Z_8$
H9 $2\sqrt{35/103}\,Z_9$
H10 $(2\sqrt{5}/3)\,Z_{10}$
H11 $(521/\sqrt{1072205})\,Z_1 + 88\sqrt{15/214441}\,Z_4 + 14\sqrt{43/4987}\,Z_{11}$
H12 $225\sqrt{6/492583}\,Z_6 + 42\sqrt{70/70369}\,Z_{12}$
H13 $225\sqrt{6/492583}\,Z_5 + 42\sqrt{70/70369}\,Z_{13}$
H14 $2525\sqrt{14/297774543}\,Z_6 + (1495\sqrt{70/99258181}/3)\,Z_{12} + (\sqrt{378910/18337}/3)\,Z_{14}$
H15 $-2525\sqrt{14/297774543}\,Z_5 - (1495\sqrt{70/99258181}/3)\,Z_{13} + (\sqrt{378910/18337}/3)\,Z_{15}$
H16 $30857\sqrt{2/3268147641}\,Z_2 + (49168/\sqrt{3268147641})\,Z_8 + 42\sqrt{1474/1478131}\,Z_{16}$
H17 $30857\sqrt{2/3268147641}\,Z_3 + (49168/\sqrt{3268147641})\,Z_7 + 42\sqrt{1474/1478131}\,Z_{17}$
H18 $6\sqrt{10/97}\,Z_{10} + 14\sqrt{5/291}\,Z_{18}$
H19 $386\sqrt{770/295894589}\,Z_9 + 6\sqrt{118965/2872763}\,Z_{19}$


H20 = 0.71499593Z2 + 0.72488884Z8 + 0.46636441Z16 + 1.72029850Z20
H21 = 0.71499593Z3 0.72488884Z7 0.46636441Z17 + 1.72029850Z21
H22 = 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22
H23 = 1.15667686Z5 + 1.10775599Z13 0.43375081Z15 + 1.39889072Z23
H24 = 1.15667686Z6 + 1.10775599Z12 + 0.43375081Z14 + 1.39889072Z24
H25 = 1.31832566Z5 1.14465174Z13 + 1.94724032Z15 0.67629133Z23 + 1.75496998Z25
H26 = 1.31832566Z6 + 1.14465174Z12 + 1.94724032Z14 + 0.67629133Z24 + 1.75496998Z26

H27 = 2 77 / 93 Z27
H28 = 1.07362889Z1 + 1.52546162Z4 + 1.28216588Z11 + 0.70446308Z22 + 2.09532473Z28

respectively. Accordingly, the aberration variance is given by

$$\sigma_W^2 = \left\langle W^2(\rho, \theta) \right\rangle - \left\langle W(\rho, \theta) \right\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{7-33}$$
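
As an illustration of Eqs. (7-30) through (7-33), the sketch below computes a few hexagonal coefficients of a sample aberration by numerical integration. It is a minimal example under stated assumptions: the test function $W = A x^2$ with $A = 1$ is chosen only for illustration, the polynomials $H_1$, $H_2$, $H_4$, and $H_6$ are taken in their Cartesian forms, and the helper name `hex_nodes` is hypothetical.

```python
import numpy as np

def hex_nodes(n=16):
    """Gauss-Legendre nodes and weights over the unit hexagon with corners at (+-1, 0)."""
    t, w = np.polynomial.legendre.leggauss(n)
    xs, ys, ws = [], [], []
    for x0, x1 in [(-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)]:
        x = 0.5 * (x1 - x0) * t + 0.5 * (x1 + x0)
        wx = 0.5 * (x1 - x0) * w
        h = np.where(np.abs(x) <= 0.5, np.sqrt(3) / 2, np.sqrt(3) * (1 - np.abs(x)))
        xs.append(np.repeat(x, n))
        ys.append((h[:, None] * t[None, :]).ravel())
        ws.append((wx[:, None] * h[:, None] * w[None, :]).ravel())
    return np.concatenate(xs), np.concatenate(ys), np.concatenate(ws)

x, y, w = hex_nodes()
rho2 = x * x + y * y

def mean(f):
    """Mean value over the hexagonal pupil, i.e. (2 / 3 sqrt 3) times the integral."""
    return np.sum(w * f) / np.sum(w)

# A few orthonormal hexagonal polynomials in Cartesian form.
H = {
    1: np.ones_like(x),
    2: 2 * np.sqrt(6 / 5) * x,
    4: np.sqrt(5 / 43) * (12 * rho2 - 5),
    6: 2 * np.sqrt(15 / 7) * (x * x - y * y),
}

A = 1.0
W = A * x * x                      # illustrative aberration (assumption of this sketch)

# Eq. (7-30): each coefficient is an independent inner product.
a = {j: mean(W * Hj) for j, Hj in H.items()}

# Eq. (7-33): variance from the coefficients with j >= 2, and directly from W.
sigma_from_coeffs = np.sqrt(sum(a[j] ** 2 for j in a if j >= 2))
sigma_direct = np.sqrt(mean(W * W) - mean(W) ** 2)

if __name__ == "__main__":
    print(a)
    print(sigma_from_coeffs, sigma_direct)
```

Since $x^2$ lies in the span of $H_1$, $H_4$, and $H_6$, the two standard deviations agree; for a general aberration the sum in Eq. (7-33) must include every polynomial present in the expansion.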

7.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF HEXAGONAL POLYNOMIAL ABERRATIONS

As in the case of circle and annular polynomials (see Sections 4.9 and 5.7,
respectively), we illustrate the hexagonal polynomials for $n \le 8$ in three different but
equivalent ways in Figure 7-9. For each polynomial, the isometric plot at the top
illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is
shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers
(in units of wavelength) are given in Table 7-6.

The PSF plots represent the images of a point object in the presence of a polynomial
aberration. They can be obtained by applying Eq. (7-6) to a hexagonal pupil. Piston yields
the aberration-free PSF since it does not affect the PSF. The full width of a square
displaying the PSFs is $24\lambda F_x$.

The polynomial aberrations $H_2$ and $H_3$, representing the x and y wavefront tilts
with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x
and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $2\sqrt{6/5}\,\lambda a_2/a$ about the y axis and displaces the PSF along the x
axis by $4\sqrt{6/5}\,\lambda F_x a_2$, where $F_x = R/2a$ is the focal ratio of the image-forming beam
along the x axis. Similarly, the coefficient $a_3$ corresponds to a tilt angle of $4\sqrt{2/5}\,\lambda a_3/a$
about the x axis, and yields a displacement of the PSF along the y axis by $4\sqrt{6/5}\,\lambda F_y a_3$,
where $F_y = R/\sqrt{3}a$ is the focal ratio of the image-forming beam along the y axis.

The symmetry properties of the aberrated PSFs (and OTFs) discussed for the circular
pupils in Section 4.7 are generally not applicable to hexagonal pupils. For example,
although the forms of the polynomials $H_5$ and $H_6$, representing balanced astigmatisms,
are the same as those of the corresponding Zernike circle polynomials, the interferogram and the
PSF for one cannot be obtained by a 45° rotation of the other. This is due to the lack of
radial symmetry of the hexagonal pupil. However, the interferograms and PSFs for the
polynomials $H_7$ and $H_8$, representing balanced comas, are different from each other
only by a 90° rotation. Similarly, the polynomials $H_9$ and $H_{10}$ have the same form as
the Zernike circle polynomials $Z_9$ and $Z_{10}$, respectively, and they yield 6-fold symmetric
interferograms and 3-fold symmetric PSFs. The PSF for one can be obtained by a 120°
rotation of the other. The interferograms and the PSFs for $H_{11}$ and $H_{22}$, representing the
balanced primary and secondary spherical aberrations, respectively, are radially symmetric, but
those for $H_{37}$, representing the balanced tertiary spherical aberration, are not because it contains a

Figure 7-9. Hexagonal polynomials $H_1$ through $H_{45}$, each shown as an isometric plot on the top,
an interferogram on the left, and a PSF on the right for a sigma value of one wave.


Table 7-6. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
hexagonal polynomials for a sigma value of one wave.

Poly. P-V # Poly. P-V # Poly. P-V #

H1 0 H16 17.108 H31 8.210

H2 4.382 H17 14.816 H32 18.426

H3 3.795 H18 11.982 H 33 10.495

H4 4.092 H19 5.696 H 34 9.657

H5 5.071 H 20 8.081 H 35 10.094

H6 5.123 H 21 7.855 H 36 10.537

H7 5.790 H 22 10.086 H 37 12.843

H8 9.395 H 23 17.665 H 38 16.723

H9 5.477 H 24 15.298 H 39 25.254

H10 6.595 H 25 8.764 H 40 11.499

H11 5.728 H 26 7.919 H 41 12.891

H12 9.169 H 27 7.384 H 42 6.278

H13 10.587 H 28 6.655 H 43 9.859

H14 6.803 H 29 22.362 H 44 11.139

H15 7.116 H 30 25.822 H 45 9.983

term in $\cos 6\theta$. Of course, as the order of a polynomial aberration increases, the
interferograms and the PSFs become more and more complex.

From Eq. (7-6), the Strehl ratio, representing the central value of an aberrated PSF
relative to its aberration-free value, is given by

$$S \equiv I(0, 0) = \frac{4}{27} \left| \iint \exp\left[ i\Phi(x, y) \right] dx \, dy \right|^2 , \tag{7-34}$$

where the integration is carried out over the unit hexagon, as in Eq. (7-8). We have
removed the primes on the x and y coordinates in Eq. (7-34), because the hexagonal
polynomial aberrations are already written in the normalized coordinates. The Strehl ratio
for these aberrations with a sigma value of 0.1 wave is listed in Table 7-7 and plotted in
Figure 7-10. Because of the small value of the aberration, the Strehl ratio is
approximately the same for each polynomial, thus illustrating its independence of the
type of the aberration. It is approximately given by $\exp\left(-\sigma_\Phi^2\right)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
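
Equation (7-34) lends itself to direct numerical evaluation. The following sketch is illustrative rather than definitive: it assumes the Cartesian forms of $H_2$ and $H_4$, a sigma value of 0.1 wave, and Gauss-Legendre quadrature over the unit hexagon (the names `hex_nodes` and `strehl` are hypothetical). It evaluates the Strehl ratio as the squared modulus of the pupil mean of $\exp(i\Phi)$ and compares it with the exponential approximation.

```python
import numpy as np

def hex_nodes(n=32):
    """Gauss-Legendre nodes and weights over the unit hexagon with corners at (+-1, 0)."""
    t, w = np.polynomial.legendre.leggauss(n)
    xs, ys, ws = [], [], []
    for x0, x1 in [(-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)]:
        x = 0.5 * (x1 - x0) * t + 0.5 * (x1 + x0)
        wx = 0.5 * (x1 - x0) * w
        h = np.where(np.abs(x) <= 0.5, np.sqrt(3) / 2, np.sqrt(3) * (1 - np.abs(x)))
        xs.append(np.repeat(x, n))
        ys.append((h[:, None] * t[None, :]).ravel())
        ws.append((wx[:, None] * h[:, None] * w[None, :]).ravel())
    return np.concatenate(xs), np.concatenate(ys), np.concatenate(ws)

def strehl(h_values, weights, sigma_waves=0.1):
    """Strehl ratio for the aberration sigma_waves * H (in waves), per Eq. (7-34)."""
    phase = 2.0 * np.pi * sigma_waves * h_values          # phase aberration in radians
    pupil_mean = np.sum(weights * np.exp(1j * phase)) / np.sum(weights)
    return abs(pupil_mean) ** 2

x, y, w = hex_nodes()
rho2 = x * x + y * y
H1 = np.ones_like(x)                         # piston
H2 = 2 * np.sqrt(6 / 5) * x                  # x tilt
H4 = np.sqrt(5 / 43) * (12 * rho2 - 5)       # defocus

S_piston = strehl(H1, w)
S_tilt = strehl(H2, w)
S_defocus = strehl(H4, w)
S_exponential = np.exp(-(0.2 * np.pi) ** 2)  # exp(-sigma_Phi^2)

if __name__ == "__main__":
    print(S_piston, S_tilt, S_defocus, S_exponential)
```

The computed values for tilt and defocus can be compared with the entries for $H_2$ and $H_4$ in Table 7-7; both fall slightly below the exponential estimate.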

Table 7-7. Strehl ratio S for hexagonal polynomial aberrations for a sigma value of
0.1 wave.

Poly. S Poly. S Poly. S

H1 1 H16 0.700 H 31 0.678

H2 0.665 H17 0.703 H 32 0.709

H3 0.665 H18 0.694 H 33 0.686

H4 0.664 H19 0.671 H 34 0.687

H5 0.672 H 20 0.692 H 35 0.704

H6 0.672 H 21 0.692 H 36 0.704

H7 0.676 H 22 0.700 H 37 0.713

H8 0.676 H 23 0.706 H 38 0.710

H9 0.677 H 24 0.703 H 39 0.714

H10 0.682 H 25 0.680 H 40 0.693

H11 0.680 H 26 0.680 H 41 0.693

H12 0.686 H 27 0.697 H 42 0.680

H13 0.686 H 28 0.700 H 43 0.693

H14 0.685 H 29 0.717 H 44 0.710

H15 0.685 H 30 0.712 H 45 0.710



Figure 7-10. Strehl ratio for a hexagonal polynomial aberration with a sigma value
of 0.1 wave.

7.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO


As discussed in the previous chapters, the Strehl ratio of an aberrated image for small
aberrations is determined by the variance of the aberration across the pupil under
consideration. Just as the Zernike circle polynomials represent balanced aberrations in the
sense of minimum variance and, in turn, maximum Strehl ratio for a small aberration,
similarly, the hexagonal polynomials also represent balanced aberrations for the
hexagonal pupils. In Chapters 4 through 6, we have given the value of sigma for a Seidel
aberration, using Ai as its coefficient, with and without balancing for circular, annular,
and Gaussian pupils. As shown below, similar results for a hexagonal pupil can be
obtained from the corresponding orthonormal polynomials. We also determine the Strehl
ratio for Seidel aberrations with and without balancing, and compare with the result
obtained by the exponential approximation.

7.7.1 Defocus
Consider the defocus aberration

$$W_d(\rho) = A_d \rho^2 . \tag{7-35}$$

From the form of the orthonormal defocus polynomial $H_4$ given in Table 7-2, it is
evident that its sigma value across a hexagonal pupil is given by

$$\sigma_d = \frac{A_d}{12}\sqrt{\frac{43}{5}} = \frac{A_d}{4.092} . \tag{7-36}$$

7.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by

\[ W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{7-37} \]

The orthonormal polynomial representing balanced astigmatism is given by

\[ H_6 = 2\sqrt{15/7}\, \rho^2 \cos 2\theta \tag{7-38a} \]

\[ \phantom{H_6} = 2\sqrt{15/7}\, \rho^2 \left( 2\cos^2\theta - 1 \right) . \tag{7-38b} \]

It shows that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2 \cos^2\theta$ is the same for a hexagonal pupil as for a circular, annular, or Gaussian pupil.
Hence, for a small amount of astigmatism, the diffraction focus for a hexagonal pupil is
the same as for a circular, annular, or Gaussian pupil. For an image with a focal ratio of
F, it lies along the z axis at a distance of $-4 A_a F^2$ from the Gaussian image point. The
balanced astigmatism is given by

\[ W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) . \tag{7-39} \]

Its sigma value is given by

\[ \sigma_{ba} = \frac{A_a}{4}\sqrt{\frac{7}{15}} = \frac{A_a}{5.855} . \tag{7-40} \]

To obtain the sigma value of astigmatism, we write Eq. (7-37) in the form

\[ W_a(\rho, \theta) = \frac{1}{2} A_a \left( \rho^2 \cos 2\theta + \rho^2 \right)
 = \frac{1}{4} A_a \left[ \sqrt{\frac{7}{15}}\, H_6 + \frac{1}{6}\sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} . \tag{7-41} \]

Utilizing Eq. (7-33), the sigma value is given by

\[ \sigma_a = \frac{A_a}{24}\sqrt{\frac{127}{5}} = \frac{A_a}{4.762} . \tag{7-42} \]

Comparing Eqs. (7-40) and (7-42), we find that balancing astigmatism with defocus
reduces its sigma value by a factor of 1.23.

7.7.3 Coma
Now we consider Seidel coma:

\[ W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{7-43} \]

The orthonormal polynomial representing balanced coma is given by

\[ H_8 = 4\sqrt{42/3685}\, \left( 25\rho^3 - 14\rho \right) \cos\theta . \tag{7-44} \]

It shows that the relative amount of tilt $\rho \cos\theta$ that optimally balances Seidel coma
$\rho^3 \cos\theta$ is $-14/25 \approx -0.56$ compared to $-2/3$ for a circular pupil. The diffraction focus
in this case lies along the x axis at a distance of $-(4/3)F$ times the amount of tilt from
the Gaussian image point. The balanced coma is given by

\[ W_{bc}(\rho, \theta) = A_c \left( \rho^3 - \frac{14}{25}\rho \right) \cos\theta . \tag{7-45} \]

Its sigma value is given by

\[ \sigma_{bc} = \frac{A_c}{20}\sqrt{\frac{737}{210}} = \frac{A_c}{10.676} . \tag{7-46} \]

To obtain the sigma value of Seidel coma, we write Eq. (7-43) in the form

\[ W_c(\rho, \theta) = A_c \left[ \frac{1}{100}\sqrt{\frac{3685}{42}}\, H_8 + \frac{7}{25}\sqrt{\frac{5}{6}}\, H_2 \right] . \tag{7-47} \]

Utilizing Eq. (7-33), the sigma value is given by

\[ \sigma_c = \frac{A_c}{4}\sqrt{\frac{83}{70}} = \frac{A_c}{3.673} . \tag{7-48} \]
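
As a check on Eq. (7-48), the coefficients of $H_8$ and $H_2$ in Eq. (7-47) can be squared and added. This is a sketch that assumes Eq. (7-33) expresses the variance as the sum of the squared orthonormal coefficients, piston excluded:

\[ \sigma_c^2 = A_c^2 \left[ \frac{1}{100^2}\cdot\frac{3685}{42} + \left( \frac{7}{25} \right)^2 \frac{5}{6} \right]
 = A_c^2 \left[ \frac{737}{84000} + \frac{5488}{84000} \right]
 = \frac{83}{1120}\, A_c^2
 = \left( \frac{A_c}{4} \right)^2 \frac{83}{70} . \]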

Comparing Eqs. (7-46) and (7-48), we find that balancing coma with tilt reduces its sigma
value by a factor of 2.91.

7.7.4 Spherical Aberration


Finally, we consider Seidel spherical aberration:

\[ W_s(\rho) = A_s \rho^4 . \tag{7-49} \]

The orthonormal polynomial representing balanced spherical aberration is given by

\[ H_{11} = \frac{60}{\sqrt{1072205}} \left( 301\rho^4 - 257\rho^2 \right) + \text{constant} . \tag{7-50} \]

It shows that the relative amount of defocus that optimally balances Seidel spherical
aberration $\rho^4$ is $-257/301 \approx -0.85$ compared to a value of $-1$ for a circular pupil. The
diffraction focus lies closer to the Gaussian image point in the case of coma, and closer to
the Gaussian image plane in the case of spherical aberration, compared to their
corresponding locations for a circular pupil. The balanced spherical aberration is given by

\[ W_{bs}(\rho) = A_s \left( \rho^4 - \frac{257}{301}\rho^2 \right) . \tag{7-51} \]

Its sigma value is given by

\[ \sigma_{bs} = \frac{A_s}{60 \times 301}\sqrt{1072205} = \frac{A_s}{84}\sqrt{\frac{4987}{215}} = \frac{A_s}{17.441} . \tag{7-52} \]

To obtain the sigma value of Seidel spherical aberration, we write Eq. (7-49) in the form

\[ W_s(\rho) = A_s \left[ \frac{\sqrt{1072205}}{60 \times 301}\, H_{11} + \frac{257}{12 \times 301}\sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} . \tag{7-53} \]

Utilizing Eq. (7-33), the sigma value is given by

\[ \sigma_s = \frac{A_s}{6}\sqrt{\frac{59}{35}} = \frac{A_s}{4.621} . \tag{7-54} \]
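
The arithmetic behind Eq. (7-54) can be spelled out in the same way. Assuming again that Eq. (7-33) gives the variance as the sum of the squared coefficients of the non-piston orthonormal polynomials, the coefficients of $H_{11}$ and $H_4$ in Eq. (7-53) yield

\[ \sigma_s^2 = A_s^2 \left[ \frac{1072205}{(60 \times 301)^2} + \left( \frac{257}{12 \times 301} \right)^2 \frac{43}{5} \right]
 = A_s^2\, \frac{1072205 + 14200535}{18060^2}
 = \frac{59}{1260}\, A_s^2
 = \left( \frac{A_s}{6} \right)^2 \frac{59}{35} . \]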

Comparing Eqs. (7-52) and (7-54), we find that balancing spherical aberration with defocus
reduces its sigma value by a factor of 3.77.

The sigma values of the Seidel aberrations with and without balancing are given in
Table 7-8. The corresponding peak-to-valley (P-V) numbers for a sigma value of unity
are also given in the table.
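
The sigma values quoted in Eqs. (7-36) through (7-54) can be checked numerically by integrating each aberration over the pupil. The following is a minimal sketch, not the author's method: it assumes a unit hexagon with its vertices at unit distance from the center and two of them on the x axis, and it uses a simple midpoint rule on six triangles. The function names `hexagon_points` and `sigma` are illustrative only.

```python
import math

def hexagon_points(n=80):
    """Quadrature points (x, y, weight) for a unit hexagon with two vertices on the x axis.

    The hexagon is split into six triangles sharing the origin. Each triangle is
    sampled with a midpoint rule in mapped coordinates (u, v), for which the
    Jacobian is proportional to u.
    """
    points = []
    for k in range(6):
        ax, ay = math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)
        bx, by = math.cos((k + 1) * math.pi / 3), math.sin((k + 1) * math.pi / 3)
        for i in range(n):
            u = (i + 0.5) / n
            for j in range(n):
                v = (j + 0.5) / n
                points.append((u * ((1 - v) * ax + v * bx),
                               u * ((1 - v) * ay + v * by),
                               u))
    return points

def sigma(w, points):
    """Standard deviation of the aberration w(x, y) over the sampled pupil."""
    total = sum(wt for _, _, wt in points)
    mean = sum(wt * w(x, y) for x, y, wt in points) / total
    mean_sq = sum(wt * w(x, y) ** 2 for x, y, wt in points) / total
    return math.sqrt(mean_sq - mean ** 2)

# Seidel aberrations with unit coefficient, written in Cartesian form
# (rho^2 = x^2 + y^2, rho cos(theta) = x).
SEIDEL = {
    "defocus": lambda x, y: x * x + y * y,
    "astigmatism": lambda x, y: x * x,
    "balanced astigmatism": lambda x, y: x * x - 0.5 * (x * x + y * y),
    "coma": lambda x, y: (x * x + y * y) * x,
    "balanced coma": lambda x, y: (x * x + y * y - 14 / 25) * x,
    "spherical": lambda x, y: (x * x + y * y) ** 2,
    "balanced spherical": lambda x, y: (x * x + y * y) ** 2 - (257 / 301) * (x * x + y * y),
}

points = hexagon_points()
inverse_sigma = {name: 1.0 / sigma(w, points) for name, w in SEIDEL.items()}
for name, value in inverse_sigma.items():
    print(f"{name:22s} sigma = A / {value:.3f}")
```

The printed denominators can be compared with the closed-form values above; small differences in the last digit reflect the quadrature error of the midpoint rule.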

7.7.5 Strehl Ratio


In Figure 7-10, we showed the Strehl ratio for the hexagonal polynomial aberrations
with a sigma value of one-tenth of a wave. In Figure 7-11, we show how it varies with the
sigma value of a Seidel aberration, with and without balancing, for $0 \le \sigma_W \le 0.25$. Also
plotted, as the dashed curve, is the Strehl ratio obtained from the approximate expression
$\exp(-\sigma_\Phi^2)$. As expected, the exponential expression yields a very good estimate of the
Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true Strehl ratio departs from its
approximate value, except in the case of balanced astigmatism and balanced coma. The
approximate expression overestimates the Strehl ratio in the case of defocus, but
underestimates it for the other aberrations. Moreover, the Strehl ratio for the balanced
spherical aberration for large values of $\sigma_W$ is larger than that for the corresponding
Seidel aberration, but the opposite is true in the case of astigmatism and coma. The
aberration coefficient and the P-V number for a certain value of $\sigma_W$ of these
aberrations can be obtained from Table 7-8.
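
The comparison between the true Strehl ratio and the exponential approximation can be illustrated for defocus. The sketch below is hedged: it assumes the Strehl ratio is the squared modulus of the pupil average of $\exp(i\Phi)$, that $\sigma_\Phi = 2\pi\sigma_W$ with $\sigma_W$ in waves, and that the unit hexagon has two vertices on the x axis. The helper names are hypothetical, and the quadrature is a plain midpoint rule.

```python
import cmath
import math

def hexagon_points(n=60):
    """Midpoint-rule quadrature points (x, y, weight) over a unit hexagon."""
    points = []
    for k in range(6):
        ax, ay = math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)
        bx, by = math.cos((k + 1) * math.pi / 3), math.sin((k + 1) * math.pi / 3)
        for i in range(n):
            u = (i + 0.5) / n
            for j in range(n):
                v = (j + 0.5) / n
                points.append((u * ((1 - v) * ax + v * bx),
                               u * ((1 - v) * ay + v * by),
                               u))  # weight proportional to the Jacobian
    return points

def strehl_defocus(sigma_w, points):
    """Strehl ratio |<exp(i Phi)>|^2 for defocus with sigma value sigma_w (in waves)."""
    a_d = 4.092 * sigma_w  # defocus coefficient in waves, from Eq. (7-36)
    total = sum(wt for _, _, wt in points)
    amplitude = sum(wt * cmath.exp(2j * math.pi * a_d * (x * x + y * y))
                    for x, y, wt in points) / total
    return abs(amplitude) ** 2

def strehl_exponential(sigma_w):
    """Approximate Strehl ratio exp(-sigma_Phi^2) with sigma_Phi = 2 pi sigma_w."""
    return math.exp(-(2.0 * math.pi * sigma_w) ** 2)

points = hexagon_points()
for sigma_w in (0.05, 0.10, 0.25):
    print(sigma_w, strehl_defocus(sigma_w, points), strehl_exponential(sigma_w))
```

For defocus, the second printed column should stay below the third, in line with the statement that the exponential expression overestimates the Strehl ratio in this case.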

7.8 SUMMARY
Closed-form expressions for the aberration-free PSF and OTF are given for a system
with a hexagonal pupil. They are plotted along with the ensquared power, and compared
with the corresponding quantities for a system with a corresponding circular pupil. The
ensquared power and the OTF for a hexagonal pupil are shown to be lower than the
corresponding values for a circular pupil. Generally, the quantitative differences between
the corresponding functions for the two pupils are small, perhaps because the difference
in the pupil area is only about 16%.

Table 7-8. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.

| Aberration | Sigma | P-V # for $\sigma = 1$ |
|---|---|---|
| Defocus | $\sigma_d = (A_d/12)\sqrt{43/5} = A_d/4.09$ | 4.092 |
| Astigmatism | $\sigma_a = (A_a/24)\sqrt{127/5} = A_a/4.76$ | 4.762 |
| Balanced astigmatism | $\sigma_{ba} = (A_a/4)\sqrt{7/15} = A_a/5.86$ | 5.123 |
| Coma | $\sigma_c = (A_c/4)\sqrt{83/70} = A_c/3.67$ | 7.347 |
| Balanced coma | $\sigma_{bc} = (A_c/20)\sqrt{737/210} = A_c/10.68$ | 9.395 |
| Spherical aberration | $\sigma_s = (A_s/6)\sqrt{59/35} = A_s/4.62$ | 4.621 |
| Balanced spherical aberration | $\sigma_{bs} = (A_s/84)\sqrt{4987/215} = A_s/17.44$ | 5.728 |


[Figure 7-11: four plots of the Strehl ratio S (vertical axis, 0 to 1) versus $\sigma_W$ (horizontal axis, 0 to 0.25): (a) Defocus, (b) Astigmatism, (c) Coma, (d) Spherical.]

Figure 7-11. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.

The polynomials orthonormal over a hexagonal pupil, representing the balanced
classical aberrations over such a pupil, are given through the eighth order in Tables 7-2
through 7-4 in terms of the circle polynomials, in polar coordinates, and in Cartesian
coordinates, respectively. The polynomials are ordered in the same manner as the circle,
annular, and Gaussian polynomials discussed in Chapters 4, 5, and 6, respectively.
However, unlike these polynomials, the hexagonal polynomials are generally not
separable in the coordinates $\rho$ and $\theta$ of a pupil point due to the lack of radial symmetry
of the hexagonal pupil. The first 13 polynomials, i.e., up to $H_{13}$, are separable, but $H_{14}$
and $H_{15}$ are not; $H_{16}$ through $H_{19}$ are separable, but $H_{20}$ and $H_{21}$ are not. Accordingly,
the concept of two indices n and m with dependence on m in the form of $\cos m\theta$ or
$\sin m\theta$ loses significance. For example, the Zernike circle polynomial $Z_{14}$ for n = 4 and
m = 4 varies as $\cos 4\theta$, but $H_{14}$ has a term in $\cos 2\theta$ also. Hence, the hexagonal
polynomials can be ordered by a single index only. Even so, each polynomial contains
only the cosine or the sine terms. Thus an even j polynomial, for example, consists of
only the cosine terms, as may be seen from Table 7-2.

While the polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary
spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced
tertiary spherical aberration is not, since it consists of an angle-dependent term in $Z_{28}$ or
$\cos 6\theta$ also. If this term is not included in the polynomial $H_{37}$, the standard deviation of
the aberration increases from a value of unity to 1.13339.

In practice, the polynomials in Cartesian coordinates given in Table 7-4 will be used
for the analysis of aberration data of a hexagonal wavefront. A somewhat different set of
hexagonal polynomials is obtained when the hexagon is rotated by 30 degrees. These
polynomials are given in Table 7-5.

The first 45 hexagonal polynomials, i.e., up to and including the 8th order, are
illustrated by an isometric plot, an interferogram, and a PSF in Figure 7-9. The coefficient
of each orthonormal polynomial, or the sigma value of the corresponding aberration, is
one wave. Their corresponding P-V numbers for a sigma value of one wave are given in
Table 7-6 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$ for each
aberration is given in Table 7-7 and illustrated in Figure 7-10. It shows that, for a small
aberration, the Strehl ratio can be estimated from the aberration variance. The sigma
values of the Seidel aberrations and their balanced forms are given, along with their P-V
numbers in Table 7-8.

The diffraction focus for a system with a hexagonal pupil is shown to lie closer to the
Gaussian image point in the case of coma, and closer to the Gaussian image plane in the
case of spherical aberration, compared to their corresponding locations for a circular
pupil. Figure 7-11 shows how the Strehl ratio varies with the sigma value of a Seidel
aberration, with and without balancing. The approximate expression $\exp(-\sigma_\Phi^2)$
overestimates its value in the case of defocus, but underestimates it for the other
aberrations.

References

1. keckobservatory.org/

2. L. D. Feinberg, M. Clampin, R. K. Keski-Kuha, C. Atkinson, S. Texter, M.
Bergeland, and B. B. Gallagher, “James Webb telescope optical telescope element
mirror development history and results,” in Space Telescopes and Instrumentation,
Proc. SPIE , 84422 (2012).

3. M. Troy and G. Chanan, “Diffraction effects from giant segmented-mirror
telescopes,” Appl. Opt. 42, 3745–3753 (2003).

4. E. Sabatke, J. Burge, and D. Sabatke, “Analytic diffraction analysis of a 32-m
telescope with hexagonal segments for high-contrast imaging,” Appl. Opt. 44,
1360–1365 (2005).

5. R. C. Smith and J. S. Marsh, “Diffraction patterns of simple apertures,” J. Opt.
Soc. Am. 64, 798–803 (1974).

6. J. A. Díaz and V. N. Mahajan, “Imaging by a system with a hexagonal pupil,”
Appl. Opt. 52, 5112–5122 (2013).

7. G. Chanan and M. Troy, “Strehl ratio and modulation transfer function for
segmented mirror telescopes as functions of segment phase error,” Appl. Opt. 38,
6642–6647 (1999).

8. N. Yaitskova and K. Dohlen, “Tip-tilt error for extremely large segmented
telescopes: detailed theoretical point-spread function analysis and numerical
simulation results,” J. Opt. Soc. Am. A 19, 1274–1285 (2003).

9. V. N. Mahajan and G.-m Dai, “Orthonormal polynomials in wavefront analysis:
analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc.
Am. A 29, 1673–1674 (2012).

10. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of
Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–
11.41 (McGraw Hill, 2009).

CHAPTER 8

SYSTEMS WITH ELLIPTICAL PUPILS

8.1 Introduction

8.2 Pupil Function

8.3 Aberration-Free Imaging

8.3.1 PSF

8.3.2 OTF

8.4 Elliptical Polynomials

8.5 Elliptical Coefficients of an Elliptical Aberration Function

8.6 Isometric, Interferometric, and Imaging Characteristics of Elliptical Polynomial Aberrations

8.7 Seidel Aberrations and Their Standard Deviations

8.7.1 Defocus

8.7.2 Astigmatism

8.7.3 Coma

8.7.4 Spherical Aberration

8.8 Summary

References

Chapter 8
Systems with Elliptical Pupils
8.1 INTRODUCTION
The pupil of a human eye is slightly elliptical [1]. The pupil for off-axis imaging by a
system with an axial circular pupil may be vignetted, but can be approximated by an
ellipse [2]. When a flat mirror is tested by shining a circular beam on it at some angle
(other than normal incidence), the illuminated spot is elliptical. Similarly, the overlap
region of two circular wavefronts that are displaced from each other, as in lateral shearing
interferometry [3] or in the calculation of the optical transfer function of a system [4], can
also be approximated by an ellipse.

Starting with the pupil function of a system with an elliptical pupil, we scale the
coordinates of a point on the pupil and transform it to a circular pupil. The aberration-free
PSF and OTF are then obtained as for a system with a circular pupil. The corresponding
PSF and OTF obtained by unscaling the coordinates represent the results for the elliptical
pupil. Then we discuss the polynomials that are orthonormal over and represent balanced
classical aberrations for a unit elliptical pupil [5]. These polynomials cannot be obtained
by scaling the coordinates of the Zernike circle polynomials. The balancing of a Seidel
aberration over an elliptical pupil is discussed, and its standard deviation with and
without balancing is determined.

8.2 PUPIL FUNCTION


As illustrated in Figure 8-1a, consider an imaging system with an elliptical exit pupil
with semimajor and semiminor axes a and b and area $S_{ex} = \pi a b$ lying in the $(x_p, y_p)$
plane with the z axis as its optical axis. The pupil is described by

\[ \frac{x_p^2}{a^2} + \frac{y_p^2}{b^2} \le 1 . \tag{8-1} \]

The aspect ratio c of the pupil is given by

\[ c = b/a \le 1 . \tag{8-2} \]

For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$
exiting from it, the pupil function of the system can be written

\[ P(x_p, y_p) = A(x_p, y_p) \exp\left[ i\Phi(x_p, y_p) \right] , \tag{8-3} \]

where

\[ A(x_p, y_p) = \left( P_{ex}/S_{ex} \right)^{1/2} \tag{8-4} \]

is the uniform amplitude across the pupil.



Figure 8-1. (a) Elliptical pupil with semimajor and semiminor axes a and b. (b)
Elliptical pupil transformed into a circular pupil by scaling its $y_p$ coordinate.

8.3 ABERRATION-FREE IMAGING


An elliptical pupil can be transformed to a circular pupil by scaling its coordinates.
Using the results for a circular pupil, the PSF [6] and OTF [7] of an elliptical pupil can be
written in this scaled coordinate system. Unscaling the coordinates finally yields the PSF
and OTF for a system with an elliptical pupil.

8.3.1 PSF
From Eq. (1-9), the aberrated irradiance distribution in the image plane of a system
with a uniformly illuminated elliptical exit pupil, normalized by its aberration-free central
value $P_{ex} S_{ex}/\lambda^2 R^2$, can be written

\[ I(x_i, y_i) = \frac{1}{S_{ex}^2} \left| \iint \exp\left[ i\Phi(x_p, y_p) \right] \exp\left[ -\frac{2\pi i}{\lambda R}\left( x_i x_p + y_i y_p \right) \right] dx_p\, dy_p \right|^2 , \tag{8-5} \]

where the integration is carried over the elliptical pupil. Using the scaled pupil
coordinates $(x'_p, y'_p)$, where

\[ (x'_p, y'_p) = (x_p, y_p/c) , \tag{8-6} \]

the elliptical pupil is transformed into a circular pupil of radius a defined by

\[ x'^2_p + y'^2_p \le a^2 . \tag{8-7} \]

Similarly, we scale the image plane coordinates $(x_i, y_i)$ into $(x'_i, y'_i)$ according to

\[ (x'_i, y'_i) = (x_i, c y_i) , \tag{8-8} \]

because of the Fourier transform relationship between the pupil function and the
diffracted amplitude. In the scaled coordinates, Eq. (8-5) for the aberration-free case
becomes

\[ I(x'_i, y'_i; c) = \frac{c^2}{\pi^2} \left| \iint_{\mathrm{circle}} \exp\left[ -\frac{2\pi i}{\lambda R}\left( x'_i x'_p + y'_i y'_p \right) \right] dx'_p\, dy'_p \right|^2 . \tag{8-9} \]

In polar coordinates $(\rho'_p, \theta'_p)$ and $(r'_i, \theta'_i)$ for the pupil and image points, we can write

\[ (x'_p, y'_p) = \rho'_p \left( \cos\theta'_p, \sin\theta'_p \right) = a\rho \left( \cos\theta'_p, \sin\theta'_p \right) \tag{8-10} \]

and

\[ (x'_i, y'_i) = r'_i \left( \cos\theta'_i, \sin\theta'_i \right) , \tag{8-11} \]

where $0 \le \rho \le 1$ and $0 \le \theta'_p, \theta'_i \le 2\pi$. In these polar coordinates, we can write Eq. (8-9) in
the form

\[ I(r, \theta'_i; c) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[ -\pi i r \rho \cos\left( \theta'_i - \theta'_p \right) \right] \rho\, d\rho\, d\theta'_p \right|^2 , \tag{8-12} \]

where

\[ r = \frac{r'_i}{\lambda R/2a} = \frac{r'_i}{\lambda F_x} , \tag{8-13} \]

and

\[ F_x = R/2a \tag{8-14} \]

is the focal ratio of the image-forming light cone along the $x_p$ axis.

For the aberration-free case, we let $\Phi(\rho, \theta'_p) = 0$ and perform the integration as for a
circular pupil. Thus, we obtain

\[ I(r) = \left[ \frac{2 J_1(\pi r)}{\pi r} \right]^2 . \tag{8-15} \]

Substituting for r from Eqs. (8-8), (8-11), and (8-13), we obtain

\[ I(x, y; c) = \left\{ \frac{2 J_1\left[ \pi \left( x^2 + c^2 y^2 \right)^{1/2} \right]}{\pi \left( x^2 + c^2 y^2 \right)^{1/2}} \right\}^2 , \tag{8-16} \]

where $(x, y)$ are image plane coordinates in units of $\lambda F_x$. The fractional power contained
in an elliptical ring can be obtained in a similar manner from the corresponding equation
for a circular pupil, namely, Eq. (4-11). Thus, the fractional power in an elliptical ring
with semimajor and semiminor axes $x_c$ and $y_c$ with $y_c = c x_c$ is given by

\[ P(x_c, y_c; c) = 1 - J_0^2\left( \pi \sqrt{x_c^2 + c^2 y_c^2} \right) - J_1^2\left( \pi \sqrt{x_c^2 + c^2 y_c^2} \right) . \tag{8-17} \]

The distribution given by Eq. (8-16) approaches the Airy pattern for a circular pupil
as we let the aspect ratio $c \to 1$. We also note that the relative irradiance at a point
$(x, y/c)$ is equal to the relative irradiance of the Airy pattern at a point $(x, y)$. However,
the central irradiance for the elliptical pupil is equal to $c^2$ times the central value of the
Airy pattern. This is due to the area of the elliptical pupil being equal to c times that of
the circular pupil, and the power incident on and exiting from the elliptical pupil also
being equal to c times that for the circular pupil.

Figure 8-2a shows the 2D PSF for c = 0.85. It is evident that the circular diffraction
rings of a circular pupil have been replaced by the elliptical diffraction rings of an
elliptical pupil. The dimension of a ring is larger in the direction of the smaller dimension
of the pupil with an aspect ratio of $1/c$. Figure 8-2b shows the irradiance distribution
along the x and y axes, and at 45° from the x axis. The first zero along the x axis occurs at
1.22 (in units of $\lambda F_x$), as in the Airy pattern, at 1.22/0.85 or about 1.44 along the y axis,
and at about 1.32 at 45° from the x axis [see the curve $I(r) \equiv I(x = y)$].
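
The locations of the first dark ring quoted above follow directly from Eq. (8-16). The sketch below evaluates the PSF and locates the first zero along a given direction; it is illustrative only. The Bessel function is computed from Bessel's integral with a midpoint rule (an implementation choice made here to avoid external libraries), and the names `bessel_j1`, `psf`, and `first_zero` are hypothetical.

```python
import math

def bessel_j1(x, n=200):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt, midpoint rule."""
    h = math.pi / n
    return sum(math.cos((k + 0.5) * h - x * math.sin((k + 0.5) * h)) for k in range(n)) / n

def psf(x, y, c):
    """Aberration-free PSF of Eq. (8-16); x and y are in units of lambda * F_x."""
    s = math.pi * math.hypot(x, c * y)
    return 1.0 if s == 0.0 else (2.0 * bessel_j1(s) / s) ** 2

def first_zero(angle_deg, c, lo=0.5, hi=2.0):
    """Distance from the center to the first dark ring along a direction.

    Bisection on J1 along the ray; the bracket [lo, hi] is assumed to contain
    only the first zero, which holds for c = 0.85.
    """
    phi = math.radians(angle_deg)
    scale = math.pi * math.hypot(math.cos(phi), c * math.sin(phi))
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_j1(scale * lo) * bessel_j1(scale * mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

c = 0.85
for angle in (0.0, 45.0, 90.0):
    print(angle, round(first_zero(angle, c), 3))
```

The three printed distances should land close to the values read from Figure 8-2b for the x axis, the 45° direction, and the y axis.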

[Figure 8-2 plots: (a) 2D PSF; (b) irradiance I versus x, y, or r from 0 to 3, with curves I(x, 0), I(0, y), and I(r), and an inset magnifying the range from 1.0 to 3.0.]

Figure 8-2. (a) 2D aberration-free PSF for c = 0.85. (b) Irradiance distribution along
the x and y axes, and at 45° from the x axis, where x, y, and r are in units of $\lambda F_x$.


8.3.2 OTF
The OTF of an aberration-free system at a spatial frequency $\vec{v}_i$ is given by [see Eq.
(2-13)]

\[ \tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p)\, A(\vec{r}_p - \lambda R \vec{v}_i)\, d\vec{r}_p . \tag{8-18} \]

It represents the fractional area of overlap of two elliptical pupils centered at (0, 0) and
$\lambda R(\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the spatial frequency vector $\vec{v}_i$. In
the scaled coordinates $(x'_p, y'_p)$, as in Eq. (8-6), the elliptical pupil reduces to a circular
pupil of radius a. The overlap area of two circular pupils, each of radius a, with their
origins at (0, 0) and $(x'_0, y'_0)$ is given by

\[ S(x'_0, y'_0; a) = 2a^2 \left\{ \cos^{-1}\left( \frac{r'_0}{2a} \right) - \left( \frac{r'_0}{2a} \right) \left[ 1 - \left( \frac{r'_0}{2a} \right)^2 \right]^{1/2} \right\} , \tag{8-19} \]

where

\[ r'_0 = \left( x'^2_0 + y'^2_0 \right)^{1/2} \tag{8-20} \]

is the distance between the centers of the two pupils.

Letting

\[ (x'_0, y'_0) = \lambda R(\xi', \eta') = \lambda R(\xi, \eta/c) \tag{8-21} \]

and noting that the overlap area is to be multiplied by c when writing it in the unscaled
coordinates, the OTF of a system with an elliptical pupil can be written from Eq. (8-19) in
the form

\[ \tau(v_x, v_y) = \frac{2}{\pi} \left[ \cos^{-1} v_e - v_e \left( 1 - v_e^2 \right)^{1/2} \right] , \tag{8-22} \]

where

\[ v_e = \left( v_x^2 + \frac{v_y^2}{c^2} \right)^{1/2} \tag{8-23} \]

and

\[ (v_x, v_y) = \left( \frac{\xi}{1/\lambda F_x}, \frac{\eta}{1/\lambda F_x} \right) \tag{8-24} \]

are the spatial frequency components normalized by the cutoff frequency $1/\lambda F_x$ along the
x axis.

It should be evident that, since $-a \le x_p \le a$ and $-b \le y_p \le b$, therefore, $0 \le v_x \le 1$
and $0 \le v_y \le c$. Hence, the cutoff spatial frequency varies with its orientation. Thus, for
example, the cutoff frequencies along the x and y axes are 1 and c, respectively. A smaller
cutoff frequency along the y axis is the spatial frequency analog of the larger diffraction
spread due to the smaller dimension of the pupil along this axis. For an arbitrary direction
making an angle $\theta$ with the x axis, the cutoff frequency is given by
$c \left[ 1 - \left( 1 - c^2 \right) \cos^2\theta \right]^{-1/2}$. It represents the distance from the center of a unit
ellipse to the point where a line passing through the center and making an angle $\theta$ with
the x axis meets it. For example, the cutoff frequency for 45° is equal to 0.916 when c = 0.85.
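
Equations (8-22) and (8-23) and the direction-dependent cutoff frequency are simple to evaluate. The following is a minimal sketch with hypothetical function names `otf` and `cutoff`; frequencies are assumed to be normalized by $1/\lambda F_x$, as in Eq. (8-24).

```python
import math

def otf(vx, vy, c):
    """Aberration-free OTF of an elliptical pupil, Eqs. (8-22) and (8-23)."""
    ve = math.hypot(vx, vy / c)
    if ve >= 1.0:
        return 0.0  # beyond the cutoff the two displaced pupils no longer overlap
    return (2.0 / math.pi) * (math.acos(ve) - ve * math.sqrt(1.0 - ve * ve))

def cutoff(theta_deg, c):
    """Cutoff frequency along a direction making angle theta with the x axis."""
    theta = math.radians(theta_deg)
    return c / math.sqrt(1.0 - (1.0 - c * c) * math.cos(theta) ** 2)

c = 0.85
for theta_deg in (0.0, 45.0, 90.0):
    print(theta_deg, round(cutoff(theta_deg, c), 3))
```

Setting $v_e = 1$ with $(v_x, v_y) = v(\cos\theta, \sin\theta)$ in Eq. (8-23) reproduces the cutoff expression used in `cutoff`, so the OTF evaluated at that frequency vanishes.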

Figure 8-3 shows the OTF for c = 0.85 along the x and y axes, and at 45° from the x
axis as $\tau(v_x)$, $\tau(v_y)$, and $\tau(v_x = v_y) \equiv \tau(v_e)$ with the corresponding cutoff spatial
frequencies of 1, 0.85, and 0.916, respectively, each in units of $1/\lambda F_x$. It should be
evident that $\tau(v_x)$ is obtained from Eq. (8-22) by letting $v_y = 0$. Similarly, $\tau(v_y)$ is
obtained by letting $v_x = 0$. Moreover, the OTF along the x axis is the same as for a
corresponding circular pupil.

[Figure 8-3 plot: OTF $\tau$ (vertical axis, 0 to 1) versus $v_x$, $v_y$, or $v_e$ (horizontal axis, 0 to 1), with curves $\tau(v_x)$, $\tau(v_x = v_y)$, and $\tau(v_y)$.]

Figure 8-3. OTF of a system with an elliptical pupil with aspect ratio c = 0.85, along
the x and y axes, and at 45° from the x axis, where $v_x$, $v_y$, and $v_e$ are all in units of
$1/\lambda F_x$.


8.4 ELLIPTICAL POLYNOMIALS


In Section 8.3, we obtained the aberration-free PSF and OTF by scaling the
coordinates of the elliptical pupil and thereby transforming it into a circular pupil, and
then using the PSF and OTF of a circular pupil. Similarly, by scaling the coordinates of
the Zernike circle polynomials we can obtain polynomials that are orthogonal over an
elliptical pupil. However, these elliptical polynomials do not represent the balanced
classical aberrations for a system with an elliptical pupil. To obtain the polynomials that
are orthogonal over and represent balanced aberrations for an elliptical pupil, we
orthogonalize the Zernike circle polynomials over the elliptical pupil [7,8].

Figure 8-4 shows a unit ellipse of aspect ratio c inscribed inside a unit circle. Thus
the semimajor and semiminor axes a and b of the ellipse have been normalized by a so
that the farthest point(s) on the ellipse lie at a distance of unity. The unit ellipse is
represented by the equation

\[ x^2 + y^2/c^2 = 1 , \tag{8-25} \]

or

\[ y = \pm c \sqrt{1 - x^2} . \tag{8-26} \]

The area of the unit ellipse is given by $\pi c$.

The orthonormal elliptical polynomials $E_j$ obtained by orthonormalizing the Zernike
circle polynomials $Z_j$ over a unit ellipse are given by [see Eq. (3-18)]

\[ E_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \left\langle Z_{j+1} E_k \right\rangle E_k \right] , \tag{8-27} \]

[Figure 8-4 drawing: unit ellipse inside a unit circle, with center O and labeled points A(1, 0), C(-1, 0), D(0, c), and B(0, -c).]

Figure 8-4. Unit ellipse of aspect ratio c inscribed inside a unit circle with its
semimajor axis of unity along the x axis.


where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit ellipse, i.e., they satisfy the orthonormality condition

\[ \frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} E_j E_{j'}\, dy = \delta_{jj'} . \tag{8-28} \]

The angular brackets indicate a mean value over the elliptical pupil. Thus, for example,

\[ \left\langle Z_j E_k \right\rangle = \frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} Z_j E_k\, dy . \tag{8-29} \]

It should be evident that, because of the symmetric limits of integration, a mean value is
zero if the integrand is an odd function of x and/or y. If the integrand is an even function
of both x and y, then we may replace the lower limits of integration by zero and multiply
the double integral by 4.
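
The orthonormalization of Eq. (8-27) can be carried out numerically for the first few polynomials. The sketch below is one possible implementation, not the analytical procedure of the cited references: it assumes the Zernike circle polynomials $Z_1$ through $Z_6$ in their usual Cartesian forms, evaluates the mean values of Eq. (8-29) with a midpoint rule on the unit disk mapped onto the ellipse, and performs Gram-Schmidt in coefficient space. All function names are hypothetical.

```python
import math

# Zernike circle polynomials Z1..Z6 in Cartesian form (rho^2 = x^2 + y^2).
ZERNIKE = [
    lambda x, y: 1.0,
    lambda x, y: 2.0 * x,
    lambda x, y: 2.0 * y,
    lambda x, y: math.sqrt(3.0) * (2.0 * (x * x + y * y) - 1.0),
    lambda x, y: 2.0 * math.sqrt(6.0) * x * y,
    lambda x, y: math.sqrt(6.0) * (x * x - y * y),
]

def ellipse_points(c, n_r=200, n_phi=32):
    """Quadrature points (x, y, weight) on the unit ellipse; weights sum to 1."""
    pts = []
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for k in range(n_phi):
            phi = 2.0 * math.pi * (k + 0.5) / n_phi
            pts.append((r * math.cos(phi), c * r * math.sin(phi), r))
    total = sum(w for _, _, w in pts)
    return [(x, y, w / total) for x, y, w in pts]

def gram_matrix(funcs, pts):
    """Mean values <Z_i Z_k> over the ellipse."""
    vals = [[f(x, y) for f in funcs] for x, y, _ in pts]
    n = len(funcs)
    return [[sum(p[2] * v[i] * v[k] for v, p in zip(vals, pts)) for k in range(n)]
            for i in range(n)]

def orthonormalize(gram):
    """Gram-Schmidt in coefficient space: row j holds E_{j+1} as weights on Z_1..Z_n."""
    n = len(gram)

    def inner(a, b):
        return sum(a[i] * gram[i][k] * b[k] for i in range(n) for k in range(n))

    basis = []
    for j in range(n):
        z = [1.0 if i == j else 0.0 for i in range(n)]  # Z_{j+1} itself
        e = list(z)
        for prev in basis:
            overlap = inner(z, prev)                     # <Z_{j+1} E_k>
            e = [ei - overlap * pk for ei, pk in zip(e, prev)]
        norm = math.sqrt(inner(e, e))
        basis.append([ei / norm for ei in e])
    return basis

E = orthonormalize(gram_matrix(ZERNIKE, ellipse_points(0.85)))
for j, row in enumerate(E, start=1):
    print(f"E{j}:", [round(v, 4) for v in row])
```

For c = 0.85, the rows can be compared with the numerical coefficients listed for $E_1$ through $E_6$ in Table 8-4.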

The orthonormal elliptical polynomials up to the fourth order are given in Tables 8-1
through 8-3 in three different but equivalent forms, as in the case of hexagonal
polynomials. The expressions for higher-order elliptical polynomials are very long unless
the aspect ratio c is specified. As in the case of a hexagonal pupil, each elliptical
polynomial consists of either cosine or sine terms, but not both. For example, $E_6$ is a
linear combination of $Z_6$, $Z_4$, and $Z_1$. It also shows that the balancing defocus for
(zero-degree) Seidel astigmatism is different for an elliptical pupil compared to that for a
circular, annular, or Gaussian pupil, as may be seen from Table 4-2, 5-2, or 6-2,
respectively. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus,
spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2 \cos^2\theta$
as well. The elliptical polynomials are generally more complex in that they are made up
of a larger number of circle polynomials. These results are a consequence of the fact that
the x and y dimensions of the elliptical pupil are not equal. As expected, the elliptical
polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse approaches
a unit circle.

8.5 ELLIPTICAL COEFFICIENTS OF AN ELLIPTICAL ABERRATION FUNCTION

An elliptical aberration function $W(x, y)$ across a unit ellipse can be expanded in
terms of J elliptical polynomials $E_j(x, y)$ in the form

\[ W(x, y) = \sum_{j=1}^{J} a_j E_j(x, y) , \tag{8-30} \]

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (8-30) by

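Table 8-1. Orthonormal elliptical polynomials $E_j(\rho, \theta)$ in terms of the Zernike
circle polynomials $Z_j(\rho, \theta)$.

$E_1 = Z_1$

$E_2 = Z_2$

$E_3 = Z_3/c$

$E_4 = \left( 1/\sqrt{3 - 2c^2 + 3c^4} \right) \left[ \sqrt{3}(1 - c^2) Z_1 + 2 Z_4 \right]$

$E_5 = Z_5/c$

$E_6 = \left[ 1/\left( 2\sqrt{2}\, c^2 \sqrt{3 - 2c^2 + 3c^4} \right) \right] \left[ -\sqrt{3}(3 - 4c^2 + c^4) Z_1 - 3(1 - c^4) Z_4 + \sqrt{2}(3 - 2c^2 + 3c^4) Z_6 \right]$

$E_7 = \left[ 1/\left( c\sqrt{5 - 6c^2 + 9c^4} \right) \right] \left[ 6(1 - c^2) Z_3 + 2\sqrt{2}\, Z_7 \right]$

$E_8 = \left( 2/\sqrt{9 - 6c^2 + 5c^4} \right) \left[ (1 - c^2) Z_2 + \sqrt{2}\, Z_8 \right]$

$E_9 = \left[ 1/\left( 2\sqrt{2}\, c^3 \sqrt{5 - 6c^2 + 9c^4} \right) \right] \left[ -2\sqrt{2}(5 - 8c^2 + 3c^4) Z_3 - (5 - 2c^2 - 3c^4) Z_7 + (5 - 6c^2 + 9c^4) Z_9 \right]$

$E_{10} = \left[ 1/\left( 2\sqrt{2}\, c^2 \sqrt{9 - 6c^2 + 5c^4} \right) \right] \left[ -2\sqrt{2}(3 - 4c^2 + c^4) Z_2 - (3 + 2c^2 - 5c^4) Z_8 + (9 - 6c^2 + 5c^4) Z_{10} \right]$

$E_{11} = (1/\alpha) \left[ \sqrt{5}(7 - 10c^2 + 3c^4) Z_1 + 4\sqrt{15}(1 - c^2) Z_4 - 2\sqrt{30}(1 - c^2) Z_6 + 8 Z_{11} \right]$

$E_{12} = -\sqrt{5/8}\, c^{-2}(195 - 475c^2 + 558c^4 - 422c^6 + 159c^8 - 15c^{10}) \beta^{-1} Z_1 - \sqrt{15/8}\, c^{-2}(105 - 205c^2 + 194c^4 - 114c^6 + 5c^8 + 15c^{10}) \beta^{-1} Z_4 + (1/2)\sqrt{15}\, c^{-2}(75 - 155c^2 + 174c^4 - 134c^6 + 55c^8 - 15c^{10}) \beta^{-1} Z_6 - 10\sqrt{2}\, c^{-2}(3 - 2c^2 + 2c^6 - 3c^8) \beta^{-1} Z_{11} + c^{-2} \alpha \gamma^{-1} Z_{12}$

$E_{13} = \left[ 1/\left( c\sqrt{5 - 6c^2 + 5c^4} \right) \right] \left[ \sqrt{15}(1 - c^2) Z_5 + 2 Z_{13} \right]$

$E_{14} = \left( \sqrt{5/2}/4 \right)(1 - c^2)^2 c^{-4}(35 - 10c^2 - c^4) \gamma^{-1} Z_1 + \left( 5\sqrt{15/2}/8 \right)(1 - c^2)^2 c^{-4}(7 + 2c^2 - c^4) \gamma^{-1} Z_4 - \left( \sqrt{15}/8 \right) c^{-4}(35 - 70c^2 + 56c^4 - 26c^6 + 5c^8) \gamma^{-1} Z_6 + \left( 5/8\sqrt{2} \right)(1 - c^2)^2 c^{-4}(7 + 10c^2 + 7c^4) \gamma^{-1} Z_{11} - (5/8) c^{-4}(7 - 6c^2 + 6c^6 - 7c^8) \gamma^{-1} Z_{12} + (\gamma/8c^4) Z_{14}$

$E_{15} = -\left( \sqrt{15}/4 \right) c^{-3}(5 - 8c^2 + 3c^4) \delta^{-1} Z_5 - (5/4)(1 - c^4) c^{-3} \delta^{-1} Z_{13} + (\delta/2c^3) Z_{15}$
_______________________________________________________________________
$\alpha = (45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}$
$\beta = (1575 - 4800c^2 + 12020c^4 - 17280c^6 + 21066c^8 - 17280c^{10} + 12020c^{12} - 4800c^{14} + 1575c^{16})^{1/2}$
$\gamma = (35 - 60c^2 + 114c^4 - 60c^6 + 35c^8)^{1/2}$
$\delta = (5 - 6c^2 + 5c^4)^{1/2}$
$\alpha\gamma = \beta$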
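Table 8-2. Orthonormal elliptical polynomials $E_j(\rho, \theta)$ in polar coordinates $(\rho, \theta)$.

$E_1 = 1$

$E_2 = 2\rho \cos\theta$

$E_3 = (2\rho \sin\theta)/c$

$E_4 = \left( \sqrt{3}/\sqrt{3 - 2c^2 + 3c^4} \right) \left( -1 - c^2 + 4\rho^2 \right)$

$E_5 = \left( \sqrt{6}/c \right) \rho^2 \sin 2\theta$

$E_6 = \left[ \sqrt{6}/\left( 2c^2 \sqrt{3 - 2c^2 + 3c^4} \right) \right] \left[ 2c^2(1 - c^2) - 3(1 - c^4)\rho^2 + (3 - 2c^2 + 3c^4)\rho^2 \cos 2\theta \right]$

$E_7 = \left[ 4/\left( c\sqrt{5 - 6c^2 + 9c^4} \right) \right] \left[ -(1 + 3c^2)\rho + 6\rho^3 \right] \sin\theta$

$E_8 = \left( 4/\sqrt{9 - 6c^2 + 5c^4} \right) \left[ -(3 + c^2)\rho + 6\rho^3 \right] \cos\theta$

$E_9 = \left[ 1/\left( c^3 \sqrt{5 - 6c^2 + 9c^4} \right) \right] \left\{ 3\left[ 4c^2(1 - c^2)\rho - (5 - 2c^2 - 3c^4)\rho^3 \right] \sin\theta + (5 - 6c^2 + 9c^4)\rho^3 \sin 3\theta \right\}$

$E_{10} = \left[ 1/\left( c^2 \sqrt{9 - 6c^2 + 5c^4} \right) \right] \left\{ 3\left[ 4c^2(1 - c^2)\rho - (3 + 2c^2 - 5c^4)\rho^3 \right] \cos\theta + (9 - 6c^2 + 5c^4)\rho^3 \cos 3\theta \right\}$

$E_{11} = \left( \sqrt{5}/\alpha \right) \left[ 3 + 2c^2 + 3c^4 - 24(1 + c^2)\rho^2 + 48\rho^4 - 12(1 - c^2)\rho^2 \cos 2\theta \right]$

$E_{12} = \left[ \sqrt{10}\,\alpha/(\gamma c^2) \right] \left( -3\rho^2 + 4\rho^4 \right) \cos 2\theta + \left[ \sqrt{5/2}/(2c^2\beta) \right] \left[ -12c^2(5 - 2c^2 + 2c^6 - 5c^8) + 6(15 + 125c^2 - 194c^4 + 194c^6 - 125c^8 - 15c^{10})\rho^2 + 240(-3 + 2c^2 - 2c^6 + 3c^8)\rho^4 + 6(75 - 155c^2 + 174c^4 - 134c^6 + 55c^8 - 15c^{10})\rho^2 \cos 2\theta \right]$

$E_{13} = \left( \sqrt{10}/c\delta \right) \left[ -3(1 + c^2)\rho^2 + 8\rho^4 \right] \sin 2\theta$

$E_{14} = \left[ \sqrt{10}/(8c^4\gamma) \right] \left\{ 3(1 - c^2)^2 \left[ 8c^4 - 40c^2(1 + c^2)\rho^2 + 5(7 + 10c^2 + 7c^4)\rho^4 \right] + 4\left[ 6c^2(5 - 7c^2 + 7c^4 - 5c^6) - 5(7 - 6c^2 + 6c^6 - 7c^8)\rho^2 \right] \rho^2 \cos 2\theta + (35 - 60c^2 + 114c^4 - 60c^6 + 35c^8)\rho^4 \cos 4\theta \right\}$

$E_{15} = \left( \sqrt{10}/c^3 \right) \delta^{-1} \left\{ \left[ 6c^2(1 - c^2) - 5(1 - c^4)\rho^2 \right] \rho^2 \sin 2\theta + \left[ (5 - 6c^2 + 5c^4)/2 \right] \rho^4 \sin 4\theta \right\}$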
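Table 8-3. Orthonormal elliptical polynomials $E_j(x, y)$ in Cartesian coordinates
$(x, y)$, where $\rho^2 = x^2 + y^2$.

$E_1 = 1$

$E_2 = 2x$

$E_3 = 2y/c$

$E_4 = \left( \sqrt{3}/\sqrt{3 - 2c^2 + 3c^4} \right) \left( -1 - c^2 + 4\rho^2 \right)$

$E_5 = \left( 2\sqrt{6}/c \right) xy$

$E_6 = \left[ \sqrt{6}/\left( c^2 \sqrt{3 - 2c^2 + 3c^4} \right) \right] \left[ c^2(1 - c^2) + c^2(3c^2 - 1)x^2 - (3 - c^2)y^2 \right]$

$E_7 = \left[ 4/\left( c\sqrt{5 - 6c^2 + 9c^4} \right) \right] \left[ -(1 + 3c^2) + 6\rho^2 \right] y$

$E_8 = \left( 4/\sqrt{9 - 6c^2 + 5c^4} \right) \left[ -(3 + c^2) + 6\rho^2 \right] x$

$E_9 = \left[ 4/\left( c^3 \sqrt{5 - 6c^2 + 9c^4} \right) \right] \left[ 3c^2(3c^2 - 1)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2) \right] y$

$E_{10} = \left[ 4/\left( c^2 \sqrt{9 - 6c^2 + 5c^4} \right) \right] \left[ c^2(5c^2 - 3)x^2 - 3(3 - c^2)y^2 + 3c^2(1 - c^2) \right] x$

$E_{11} = \left( \sqrt{5}/\alpha \right) \left[ 48\rho^4 - 12(3 + c^2)x^2 - 12(1 + 3c^2)y^2 + 3 + 2c^2 + 3c^4 \right]$

$E_{12} = \left[ \sqrt{10}\,\alpha/(c^2\gamma) \right] (x^2 - y^2)(4\rho^2 - 3) + \left[ \sqrt{5}/\left( 2\sqrt{2}\, c^2 \beta \right) \right] \left[ 240(-3 + 2c^2 - 2c^6 + 3c^8)\rho^4 - 60(-9 + 3c^2 + 2c^4 - 6c^6 + 7c^8 + 3c^{10})x^2 - 24(15 - 70c^2 + 92c^4 - 82c^6 + 45c^8)y^2 + 12c^2(-5 + 2c^2 - 2c^6 + 5c^8) \right]$

$E_{13} = \left( 2\sqrt{10}/c\delta \right) \left( 8\rho^2 - 3 - 3c^2 \right) xy$

$E_{14} = \left( \sqrt{10}/c^4\gamma \right) \left[ c^4(3 - 30c^2 + 35c^4)x^4 + 6c^2(5 - 18c^2 + 5c^4)x^2 y^2 + (35 - 30c^2 + 3c^4)y^4 - 6c^4(1 - 6c^2 + 5c^4)x^2 - 6c^2(5 - 6c^2 + c^4)y^2 + 3c^4(1 - c^2)^2 \right]$

$E_{15} = \left( 4\sqrt{10}/c^3\delta \right) \left[ c^2(5c^2 - 3)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2) \right] xy$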



$E_j(x, y)$, integrating over the unit ellipse, and using the orthonormality Eq. (8-28), we
obtain the elliptical expansion coefficients:

\[ a_j = \frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} W(x, y) E_j(x, y)\, dy . \tag{8-31} \]

As stated in Section 3.2, it is evident from Eq. (8-31) that the value of an elliptical
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.

The mean and mean square values of the aberration function are given by

\[ \left\langle W(\rho, \theta) \right\rangle = a_1 , \tag{8-32} \]

and

\[ \left\langle W^2(\rho, \theta) \right\rangle = \sum_{j=1}^{J} a_j^2 , \tag{8-33} \]

respectively. Accordingly, the aberration variance is given by

\[ \sigma_W^2 = \left\langle W^2(\rho, \theta) \right\rangle - \left\langle W(\rho, \theta) \right\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-34} \]
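
Equations (8-31) through (8-34) can be exercised on a simple aberration. The sketch below expands Seidel astigmatism with unit coefficient, $W = \rho^2\cos^2\theta = x^2$, in the first six elliptical polynomials, using their Cartesian forms from Table 8-3, and compares the variance from Eq. (8-34) with a direct evaluation. The quadrature scheme and the function names are assumptions made for illustration.

```python
import math

def ellipse_points(c, n_r=200, n_phi=32):
    """Quadrature points (x, y, weight) on the unit ellipse; weights sum to 1."""
    pts = []
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for k in range(n_phi):
            phi = 2.0 * math.pi * (k + 0.5) / n_phi
            pts.append((r * math.cos(phi), c * r * math.sin(phi), r))
    total = sum(w for _, _, w in pts)
    return [(x, y, w / total) for x, y, w in pts]

def elliptical_polynomials(c):
    """E1..E6 in Cartesian form, as listed in Table 8-3."""
    s = math.sqrt(3 - 2 * c**2 + 3 * c**4)
    return {
        1: lambda x, y: 1.0,
        2: lambda x, y: 2 * x,
        3: lambda x, y: 2 * y / c,
        4: lambda x, y: math.sqrt(3) / s * (4 * (x * x + y * y) - 1 - c**2),
        5: lambda x, y: 2 * math.sqrt(6) / c * x * y,
        6: lambda x, y: math.sqrt(6) / (c**2 * s) * (
            c**2 * (1 - c**2) + c**2 * (3 * c**2 - 1) * x * x - (3 - c**2) * y * y),
    }

def mean(f, pts):
    """Mean value of f(x, y) over the ellipse."""
    return sum(w * f(x, y) for x, y, w in pts)

def seidel_astigmatism(x, y):
    return x * x  # rho^2 cos^2(theta) with unit coefficient

c = 0.85
pts = ellipse_points(c)
polys = elliptical_polynomials(c)

# Expansion coefficients, Eq. (8-31)
a = {j: mean(lambda x, y, Ej=Ej: seidel_astigmatism(x, y) * Ej(x, y), pts)
     for j, Ej in polys.items()}

# Variance from Eq. (8-34) and from the definition
variance_from_coefficients = sum(a[j] ** 2 for j in a if j >= 2)
variance_direct = mean(lambda x, y: seidel_astigmatism(x, y) ** 2, pts) \
    - mean(seidel_astigmatism, pts) ** 2
print(a, variance_from_coefficients, variance_direct)
```

Only $a_1$, $a_4$, and $a_6$ should be nonzero here, since $x^2$ lies in the span of $E_1$, $E_4$, and $E_6$, and the two variance estimates should agree to within the quadrature error.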

8.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ELLIPTICAL POLYNOMIAL ABERRATIONS
The first 45 elliptical polynomials for an elliptical pupil with an aspect ratio of c =
0.85 are given in Tables 8-4 through 8-6. They are illustrated in three different but equivalent
ways in Figure 8-5. For each polynomial, the isometric plot at the top illustrates its shape.
An interferogram is shown on the left, and a corresponding PSF is shown on the right for
a sigma value of one wave. The peak-to-valley aberration numbers (in units of
wavelength) are given in Table 8-7.

The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (8-5), are shown in Figure 8-5. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $E_1$ has
no effect on the PSF, it yields an aberration-free PSF.
Table 8-4. Elliptical polynomials in terms of Zernike polynomials for an elliptical
pupil with an aspect ratio c = 0.85.

E1 Z1

E2 Z2

E3 1.1765Z3

E4 0.2721Z1 + 1.1321Z4

E5 1.1765Z5

E6 -0.3032Z1 - 0.3972Z4 + 1.2226Z6

E7 0.8458Z3 + 1.4369Z7

E8 0.2058Z2 + 1.0486Z8

E9 -0.5527Z3 - 0.4945Z7 + 1.3332Z9

E10 -0.3243Z2 - 0.3329Z8 + 1.3199Z10

E11 0.4721Z1 + 0.6768Z4 - 0.4785Z6 + 1.2594Z11

E12 -0.6786Z1 - 0.9419Z4 + 1.0489Z6 - 0.7451Z11 + 1.4250Z12

E13 0.6987Z5 + 1.3002Z13

E14 0.2576Z1 + 0.3242Z4 - 0.7837Z6 + 0.1889Z11 - 0.5861Z12 + 1.4774Z14

E15 -0.6848Z5 - 0.5376Z13 + 1.4734Z15

E16 0.3201Z2 + 0.3747Z8 0.3747Z10 + 1.1026Z16

E17 1.6951Z3 + 1.7799Z7 0.5933Z9 + 1.7457Z17

E18 0.6114Z2 0.6730Z8 + 1.1686Z10 0.5222Z16 + 1.4580Z18

E19 1.4290Z3 1.4348Z7 + 1.3271Z9 0.9078Z17 + 1.4985Z19

E20 0.3159Z2 + 0.3003Z8 1.1073Z10 + 0.1586Z16 0.7251Z18 + 1.6493Z20

E21 0.5487Z3 + 0.5004Z7 1.1469Z9 + 0.2441Z17 0.7400Z19 + 1.6506Z21

E22 0.8435Z1 + 1.2371Z4 0.9604Z6 + 1.1277Z11 0.7974Z12 + 1.3738Z22

E23 1.2479Z5 + 1.1962Z13 0.5981Z15 + 1.4572Z23

E24 1.5657Z1 2.2518Z4 + 2.4365Z6 1.95526Z11 + 2.0855Z12 0.7030Z14


1.1709Z22 + 1.7128Z24

E25 1.5089Z5 1.3450Z13 + 1.6563Z15 0.8395Z23 + 1.5980Z25

E26 0.8344Z1 + 1.1536Z4 2.0055Z6 + 0.9046Z11 1.7006Z12 + 1.7223Z14 + 0.4133Z22


0.9739Z24 + 1.6111Z26

E27 0.7754Z5 + 0.6060Z13 1.6348Z15 + 0.2747Z23 0.9271Z25 + 1.8541Z27



E28 0.2686Z1 0.3567Z4 + 0.8956Z6 0.2550Z11 + 0.6867Z12 1.6500Z14 0.0970Z22


+ 0.3021Z24 0.9317Z26 + 1.8545Z28

E29 3.5331Z3 + 3.8704Z7 1.5265Z9 + 3.0038Z17 1.0013Z19 + 2.0832Z29

E30 0.5126Z2 + 0.6090Z8 0.6874Z10 + 0.5538Z16 0.5538Z18 + 1.1521Z30

E31 3.7743Z3 4.0384Z7 + 3.1911Z9 2.9785Z17 + 2.4088Z19 0.8496Z21 1.4764Z29


+ 1.7676Z31

E32 1.2170Z2 1.3856Z8 + 2.4334Z10 1.1564Z16 + 1.9603Z18 0.8039Z20


0.7334Z30 + 1.6725Z32

E33 1.8697Z3 + 1.9215Z7 2.9091Z9 + 1.2980Z17 2.1909Z19 + 2.2306Z21 + 0.5180Z29


1.1466Z31 + 1.7471Z33

E34 0.8765Z2 + 0.9332Z8 2.6500Z10 + 0.6726Z16 2.0393Z18 + 2.2043Z20 + 0.2987Z30


1.1006Z32 + 1.7428Z34

E35 0.6216Z3 0.6104Z7 + 1.4855Z9 0.3771Z17 + 1.0045Z19 2.3123Z21


0.1273Z29 + 0.4026Z31 1.1573Z33 + 2.0935Z35

E36 0.3561Z2 0.3568Z8 + 1.4294Z10 0.2285Z16 + 0.9733Z18 2.3066Z20


0.0816Z30 + 0.3938Z32 1.1559Z34 + 2.0934Z36

E37 1.5647Z1 + 2.3399Z4 1.9757Z6 + 2.2294Z11 1.7784Z12 + 0.2020Z14 + 1.6239Z22


1.1483Z24 + 1.4746Z37

E38 3.6706Z1 5.4231Z4 + 5.8485Z6 5.0215Z11 + 5.3399Z12 1.9891Z14 3.4518Z22


+ 3.5769Z24 1.1361Z26 1.6754Z37 + 2.0633Z38

E39 2.3101Z5 + 2.2906Z13 1.3805Z15 + 1.7827Z23 0.8913Z25 + 1.6187Z39

E40 2.5295Z1 + 3.6636Z4 5.4516Z6 + 3.2375Z11 4.9525Z12 + 4.2289Z14 + 2.0206Z22


3.2919Z24 + 2.8973Z26 1.0342Z28 + 0.7704Z37 1.5053Z38 + 1.8782Z40

E41 3.4596Z5 3.2813Z13 + 3.8178Z15 2.3452Z23 + 2.6958Z25 1.0155Z27


1.2074Z39 + 1.8441Z41

E42 1.0497Z1 1.4880Z4 + 2.9632Z6 1.2532Z11 + 2.5925Z12 4.2448Z14 0.7161Z22


+ 1.5920Z24 2.8929Z26 + 2.8550Z28 0.2315Z37 + 0.5922Z38 1.3793Z40 + 1.9027Z42
E43 2.3202Z5 + 2.0734Z13 4.1210Z15 + 1.3161Z23 2.8314Z25 + 2.8448Z27 +
0.5132Z39 1.3637Z41 + 1.9013Z43

E44 0.3097Z1 + 0.4294Z4 1.1006Z6 + 0.3448Z11 0.9148Z12 + 2.3943Z14 + 0.1816Z22


0.5119Z24 + 1.4601Z26 3.1647Z28 + 0.0514Z37 0.1605Z38 + 0.5359Z40
1.4192Z42 + 2.3730Z44

E45 0.9499Z5 0.79791Z13 + 2.3698Z15 0.4537Z23 + 1.4484Z25 3.1626Z27


0.1454Z39 + 0.5331Z41 1.4187Z43 + 2.3730Z45

Table 8-5. Elliptical polynomials in polar coordinates for an elliptical pupil with an
aspect ratio c = 0.85.
E1 1

E2 2Ucosș

E3 2.3529Usinș

E4 1.6888 + 3.9217U 2

E5 2.8818U2sin2ș
2
E6 0.3848 1.3760 + 2.9947U 2cos2ș

E7 ( 6.4365 U + 12.1923U 3)sinș

E8 ( 5.5205 U + 8.8980U3)cosș

E9 (1.6917U 4.1956U 3)sinU + 3.7710U3sin3ș

E10 (1.2346U 2.8248U3)cos U + 3.733U 3 cos3ș

E11 2.1159 14.5521U 2 + 16.8965U4 1.1722U2 cos2ș

E12 0.7133 + 6.7333 U2 9.9960U 4 + ( 10.9496U2 + 18.0251U4) cos2ș

E13 ( 10.6232U 2 + 16.4461U 4)sin2ș

E14 0.1184 1.4114 U2 + 2.5347U 4 + (3.6409U2 7.4142U 4) cos2ș + 4.6720U 4 cos4ș

E15 (3.4228U2 6.8003U4)sin2ș + 4.6593U 4sin4ș

E16 (9.9790U 42.6545U 3 + 38.1952U5)cosș 1.0599U3cos3ș

E17 (11.4631U 57.4626U3 + 60.4711U 5)sinș 1.6781U 3sin3ș

E18 ( 2.8430U + 15.9987U 3 18.0913 U5)cosș + ( 16.8978U3 + 25.2538 U5)cos3ș

E19 ( 4.1751U + 25.5604 U3 31.4461U 5)sinș + ( 17.0096U 3 + 25.9539U 5)sin3ș

E20 (0.5810 U 4.0436 U3 + 5.4933 U5)cosș + (6.9151U 3 12.5589 U5)cos3ș + 5.7134 U5cos5ș

E21 (0.8035U 5.9010U3 + 8.4557U5)sinș + (7.0098U3 12.8173 U5)sin3ș + 5.7177U 5sin5ș

E22 2.41236 + 32.7723U2 93.9111U 4 + 72.6936 U6 + (5.21205U2 10.0862U 4)cos2ș

E23 (24.4229U2 93.9159 U4 + 81.7846 U6)sin2ș 1.89127U 4sin4ș

E24 1.0603 18.7425U 2 + 66.7039U4 61.9577U6 + (24.6341U 2 101.7900U4 +


6.1279U6)cos2ș 2.2230 U4cos4ș

E25 ( 9.7837U2 + 45.8111U 4 47.1181U6)sin2ș + ( 24.658 U4 + 35.8748U6)sin4ș

E26 (2.3177U2 12.8931U 4 + 15.4190 U6)sin2ș + (12.1747U 4 20.8131U6)sin4ș + 6.9375U6sin6ș

E27 0.0357 0.8943U2 + 4.2779U4 5.1324 U6 + (2.4613U2 13.9209U4 +


16.9552U6)cos2 + (12.2137U4 20.9176U6)cos4ș + 6.9389 U6cos6ș
E28 ( 16.9428 U + 157.9560 U3 395.9030 U5 + 291.6410 U7)sin ș + (9.5563U3 17.3422 U5)sin3ș
E29 ( 15.0992 U + 120.4040 U3 257.3300 U5 + 161.3000U7)cosș + (5.7290 U3 9.5919 U5)cos3ș

E30 (7.9651U 87.6220U3 + 251.1590 U5 206.6960 U7)sinș + (46.3528 U3 170.3910U5


+ 148.4790 U7)sin3ș 2.9431U 5sin5ș

E31 (5.1212U 51.6989U3 + 135.9670U5 102.6820U 7)cosș + (46.6210U3 166.7500U5


+ 140.4930U7)cos3 ș 2.7848 U5cos5ș

E32 ( 1.9287 U + 24.5047 U3 79.3495 U5 + 72.5158U7)sinș + ( 23.7341 U3 +


99.6445 U5 96.3144U7)sin3ș + ( 34.2032U5 + 48.9185 U7)sin5ș

E33 ( 1.3158 U + 15.8070 U3 48.3961U 5 + 41.8223U7)cosș + ( 23.2637 U3 +


96.7544 U5 92.4529 U7)cos3ș + ( 34.1909U5 + 48.7981U 7)cos5ș

E34 (0.3278U 4.7830 U3 + 17.4967U5 17.8268U7)sinș + (6.3868U3 30.9130 U5 +


33.8176 U7)sin3ș + (19.7658U 5 32.4050U 7)sin5ș + 8.3740U7sin7ș

E35 (0.2368 U 3.3195U 3 + 11.6663U 5 11.4227U 7)cosș + (6.3087U3 30.3995U5 +


33.0808 U7)cos3ș + (19.7502U 5 32.3638U7)cos5ș + 8.3736U7cos7ș

E36 2.6240 58.7201U2 + 299.1450 U4 533.3820U6 + 309.6560U 8 + ( 13.7469U2 +


63.4344 U4 64.4471U 6)cos2ș + 0.6387U 4cos4ș

E37 1.3994 + 39.5156U 2 245.7470U4 + 521.0140U 6 351.8340U 8 + ( 43.5675U 2 +


325.0900 U4 718.3760U 6 + 490.2030 U8)cos2ș + (14.9649U 4 5.5058U6)cos4ș

E38 ( 44.7266U 2 + 307.6270U4 621.0460U6 + 384.5860 U8)sin2ș + (12.3100U 4 20.0104 U6)sin4ș

E39 0.3882 12.8141U2 + 91.0564U 4 216.6400U6 + 161.7810U8 + (23.5910U 2


199.4850 U4 + 485.8140U6 357.6390U 8)cos2ș + (78.6963U4 269.6320U 6 +
223.11800U8)cos4ș 3.8697U 6cos6ș

E40 (21.2315U 2 173.3660 U4 + 406.2550U6 286.8680U8)sin2ș + (78.9987U4


268.0850 U6 + 219.0700 U8)sin4ș 3.7995U6sin6ș

E41 0.0744 + 2.8113U2 22.4722 U4 + 59.3275U 6 48.6104U8 + ( 6.7226U2 +


64.4168 U4 174.4750U6 + 140.7070U8)cos2ș + ( 47.0816U4 + 180.8360U 6
163.8540U8)cos4ș + ( 45.8256 U6 + 64.5805U8)cos6ș

E42 ( 6.2129U 2 + 58.3775 U4 154.7540U6 + 121.9310U8)sin2ș + ( 46.8482U 4 +


179.4400U 6 162.0030U 8)sin4ș + ( 45.8214 U6 + 64.5323U8)sin6ș

E43 0.0106 0.4565U 2 + 4.0923U4 11.9863U6 + 10.7991U 8 + (1.3004U 2 14.1264U4


+ 42.7822U 6 38.1414U8)cos2ș + (14.3621U 4 62.7180U 6 + 63.6642U8)cos4 ș
+ (30.3057U 6 48.1679U8)cos6ș + 10.0677 U8cos8ș

E44 (1.2269U2 13.1623 U4 + 39.3265U6 34.5557U 8)sin2ș + (14.3234U4 62.4781 U6


+ 63.3298 U8)sin4ș + (30.2998 U6 48.1523U8)sin6ș + 10.0676U8sin8ș

E45 (1.2269U2 13.1623U4 + 39.3265U6 34.5557U 8)sin2ș + (14.3234U4 62.4781U 6 +


63.3298 U8)sin4ș + (30.2998 U6 48.1523U 8)sin6ș + 10.0676 U8sin8ș

Table 8-6. Elliptical polynomials in Cartesian coordinates for an elliptical pupil with
an aspect ratio c = 0.85.
E1 1

E2 2x

E3 2.3529y

E4 1.6888 + 3.9217x2 + 3.9217y2

E5 5.7635xy

E6 0.3848 + 1.6188x2 4.3707y2

E7 6.4365y + 12.1923x2y + 12.1923y3

E8 5.5205x + 8.8980x3 + 8.8980xy2

E9 1.6917y + 7.1173x2y 7.9665y3

E10 1.2346x + 0.9083x3 14.0244xy2

E11 2.1159 15.7243x2 + 16.8965x4 13.3799y2 + 33.7930x2y2 + 16.8965y4

E12 0.7133 4.2163x2 + 8.0291x4 + 17.6829y2 19.9921x2y2 28.0211y4

E13 21.2463xy + 32.8922x3y + 32.8922xy3

E14 0.1184 + 2.229x2 0.2075x4 5.0523y2 22.9629x2y2 + 14.6209y4

E15 6.8457xy + 5.0366x3y 32.2378xy3

E16 9.9790x 43.7144x3 + 38.1952x5 39.4747xy2 + 76.3904x3y2 + 38.1952xy4

E17 11.4631y 62.4968x2y + 60.4711x4y 55.7845y3 + 120.9420x2y3 + 60.4711y5

E18 2.8430x 0.8991x3 + 7.1625x5 + 66.6923xy2 86.6903x3y2 93.8528xy4

E19 4.1751y 25.4686x2y + 46.4157x4y + 42.5700y3 10.9843x2y3 57.4000y5

E20 0.5810x + 2.8716x3 1.3522x5 24.7890xy2 21.0295x3y2 + 71.7367xy4

E21 0.8035y + 15.1286x2y 1.4078x4y 12.9108y3 65.9003x2y3 + 26.9908y5

E22 2.4124 + 37.9843x2 103.9970x4 + 72.6936x6 + 27.5602y2 187.8220x2y2 +


218.0810x4y2 83.8249y4 + 218.0810x2y4 + 72.6936y6

E23 48.8459xy 195.3970x3y + 163.5690x5y 180.2670xy3 + 327.1380x3y3 +


163.5690xy5

E24 1.0603 + 5.89157x2 37.3093x4 + 34.1702x6 43.3766y2 + 146.7460x2y2


89.7452x4y2 + 166.2710y4 282.0010x2y4 158.0860y6

E25 19.5673xy 7.0099x3y + 49.2630x5y + 190.2540xy3 188.4720x3y3 237.7350xy5

E26 0.2346 5.6603x2 + 6.0036x4 + 3.3804x6 + 15.6266y2 + 106.8290x2y2


169.8950x4y2 96.7384y4 60.5711x2y4 + 112.7040y6

E27 4.6354xy + 22.9127x3y 10.7896x5y 74.4852xy3 77.0736x3y3 + 155.7150xy5

E28 0.0357 + 1.5670x2 + 2.5707x4 2.1559x6 3.3556y2 64.7263x2y2 + 2.0618x4y2 +


30.4124y4 + 176.3200x2y4 49.9441y6

E29 16.9428y + 186.6250x2y 447.9300x4y + 291.6410x6y + 148.400y3


826.4900x2y3 + 874.9230x4y3 378.5610y5 + 874.9230x2y5 + 291.6410y7

E30 15.0992x + 126.1330x3 266.9220x5 + 161.300x7 + 103.2180xy2 495.4780x3y2


+ 483.8990x5y2 228.5560xy4 + 483.8990x3y4 + 161.300xy6

E31 7.9651y + 51.4363x2y 274.7310x4y + 238.7420x6y 133.9750y3 + 190.9650x2y3


+ 122.3080x4y3 + 418.6070y5 471.6090x2y5 355.1750y7

E32 5.1212x 5.0779x3 33.5676x5 + 37.8102x7 191.5620xy2 + 633.2820x3y2


448.5390x5y2 + 622.2940xy4 1010.5100x3y4 524.1600xy6

E33 1.9287y 46.6975x2y + 48.5678x4y + 28.1652x6y + 48.2387y3 + 382.6220x2y3


508.6170x4y3 213.1970y5 319.0340x2y5 + 217.7490y7

E34 1.3158x 7.4567x3 + 14.1674x5 1.8325x7 + 85.5979xy2 + 51.6084x3y2


221.2630x5y2 509.6140xy4 + 343.7410x3y4 + 563.1720xy6

E35 0.3278y + 14.3775x2y + 23.5868x4y 19.7803x6y 11.1698y3 224.4910x2y3


15.4587x4y3 + 68.1755y5 + 447.8370x2y5 92.4234y7

E36 0.2368x + 2.9892x3 + 1.0170x5 2.3322x7 22.2454xy2 113.3710x3y2 +


48.0804x5y2 + 201.6160xy4 + 255.2210x3y4 331.0990xy6

E37 2.6240 72.4669x2 + 363.2180x4 597.8290x6 + 309.6560x8 44.9732y2 +


594.4580x2y2 1664.5900x4y2 + 1238.6200x6y2 + 236.3490y4 1535.7000x2y4 +
1857.9300x4y4 468.9350y6 + 1238.6200x2y6 + 309.6560y8

E38 1.3994 4.0520x2 + 94.3076x4 222.8680x6 + 138.3690x8 + 83.0831y2


581.2840x2y2 + 972.1960x4y2 426.9300x6y2 555.8720y4 + 2408.9500x2y4
2111x4y4 + 1213.8800y6 2387.7400x2y6 842.0370y8

E39 89.4533xy + 664.4940x3y 1322.1300x5y + 769.1720x7y + 566.0140xy3


2484.1900x3y3 + 2307.5200x5y3 1162.0500xy5 + 2307.5200x3y5 + 769.1720xy7

E40 0.3882 + 10.7769x2 29.7327x4 4.3268x6 + 27.2600x8 36.4051y2


290.0650x2y2 + 1242.1000x4y2 960.6230x6y2 + 369.2380y4 + 154.3790x2y4
1260.4900x4y4 968.2160y6 + 469.9310x2y6 + 742.5370y8

E41 42.4629xy 30.7369x3y 282.6290x5y + 302.5440x7y 662.7270xy3 +


1701.0100x3y3 844.9290x5y3 + 1862.0500xy5 2597.4900x3y5 1450.0200xy7

E42 0.0744 3.9113x2 5.1370x4 + 19.8626x6 7.1767x8 + 9.5338y2 + 237.5450x2y2


213.2880x4y2 161.7390x6y2 133.9710y4 1239.1100x2y4 + 1346.8700x4y4 +
460.4640y6 + 1083.6900x2y6 417.7510y8

E43 12.4259xy 70.6377x3y + 133.3220x5y 16.9553x7y + 304.1480xy3 +


297.4110x3y3 819.8750x5y3 1302.1900xy5 + 476.1470x3y5 + 1279.0700xy7

E44 0.0106 + 0.8439x2 + 4.3280x4 1.6163x6 1.7783x8 1.7569y2 77.9881x2y2


134.1720x4y2 + 104.7100x6y2 + 32.5808y4 + 689.4340x2y4 + 132.8950x4y4
147.7920y6 1091.4200x2y6 + 170.84y8

E45 2.4538xy + 30.9691x3y + 10.5393x5y 24.1655x7y 83.6182xy3 448.6900x3y3 +


156.3320x5y3 + 510.3640xy5 + 777.2630x3y5 691.8850xy7

The polynomial aberrations $E_2$ and $E_3$, representing the x- and y-tilts with
aberration coefficients $a_2$ and $a_3$, displace the aberration-free PSF along the x and y
axes, respectively. The coefficient $a_2$ corresponds to a tilt angle of $2a_2/a$ about the y
axis, and yields a displacement of the PSF along the x axis by $4a_2F_x$, where $F_x = R/2a$
is the focal ratio of the image-forming beam along the x axis. Similarly, the coefficient
$a_3$ corresponds to a tilt angle of $2a_3/b$ about the x axis, and yields a displacement of the
PSF along the y axis by $4a_3F_y$, where $F_y = R/2b$ is the focal ratio of the image-forming
beam along the y axis.
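
The tilt relations above can be sketched numerically as follows. The function name and the sample values are hypothetical, and R is assumed here to be the pupil-to-image distance that enters the focal ratios $F_x = R/2a$ and $F_y = R/2b$.

```python
def tilt_displacements(a2, a3, R, a, b):
    """PSF displacements and tilt angles produced by tilt coefficients a2, a3.

    a2, a3 : coefficients of E2 and E3, in the same length unit as the result
    R      : pupil-to-image distance entering Fx = R/2a (assumed meaning)
    a, b   : half-widths of the elliptical pupil along x and y
    """
    Fx = R / (2.0 * a)          # focal ratio along x
    Fy = R / (2.0 * b)          # focal ratio along y
    tilt_y_axis = 2.0 * a2 / a  # tilt angle about the y axis
    tilt_x_axis = 2.0 * a3 / b  # tilt angle about the x axis
    dx = 4.0 * a2 * Fx          # PSF displacement along x
    dy = 4.0 * a3 * Fy          # PSF displacement along y
    return dx, dy, tilt_y_axis, tilt_x_axis

# Illustrative values only: one wave of each tilt at 0.5 micrometer wavelength,
# with all lengths in millimeters and an aspect ratio b/a = 0.85.
wavelength = 0.5e-3
dx, dy, beta_y, beta_x = tilt_displacements(
    a2=wavelength, a3=wavelength, R=100.0, a=10.0, b=8.5)
print(dx, dy, beta_y, beta_x)
```

Each displacement equals R times the corresponding tilt angle, which is a quick consistency check on the two forms quoted in the text.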

The defocus aberration represented by the polynomial $E_4$ is radially symmetric and
yields a radially symmetric interferogram bounded, of course, by an ellipse. However, the
PSF is biaxially and not radially symmetric because of the larger diffraction spread along
the smaller dimension of the pupil. The interferograms and PSFs for the polynomial
aberrations $E_5$ and $E_6$, representing balanced astigmatisms, are biaxially symmetric but
distinctly different from each other for the two aberrations. The polynomial aberrations
$E_7$ and $E_8$, representing balanced comas, produce biaxially symmetric interferograms,
but the PSFs are symmetric about the y and x axes, respectively. The polynomial
aberrations $E_{11}$, $E_{22}$, and $E_{37}$, representing balanced primary, secondary, and tertiary
spherical aberrations, respectively, are not radially symmetric because of the different
diffraction spreads along the x and the y axes, and because of the presence of the
$\cos 2\theta$ term in $E_{11}$ and $E_{22}$, and the $\cos 2\theta$ and $\cos 4\theta$ terms in $E_{37}$.

From Eq. (8-5), the Strehl ratio, i.e., the central value of a PSF relative to its
aberration-free value, can be written

$$ S(c) \equiv I(0, 0; c) = \left| \frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} \exp[i\Phi(x, y)]\, dy \right|^2 , \tag{8-35} $$

where $(x, y)$ are the pupil coordinates normalized by the pupil dimension $a$ along the $x_p$
axis, as used in the polynomials given in Table 8-3.

The Strehl ratio for elliptical polynomial aberrations with a sigma value of 0.1 wave
is listed in Table 8-8 and plotted in Figure 8-6. Because of the small value of the
aberration, the Strehl ratio is approximately the same for each polynomial. Both the table
and the figure illustrate that the Strehl ratio for a small aberration is nearly independent
of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
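
The comparison between the Strehl integral of Eq. (8-35) and the estimate $\exp(-\sigma_\Phi^2)$ can be sketched numerically. The example below is only an illustration under stated assumptions: it uses the defocus polynomial for c = 0.85 in the form $E_4 = -1.6888 + 3.9217\rho^2$, where the leading minus sign is inferred from the Table 8-4 entry $0.2721Z_1 + 1.1321Z_4$ with $Z_4 = \sqrt{3}(2\rho^2 - 1)$, and it samples the pupil on a simple midpoint grid. The function names are hypothetical.

```python
import cmath
import math

def strehl_ratio(phase, c, n_r=200, n_t=256):
    """Strehl ratio |<exp(i*Phi)>|^2 over a unit ellipse of aspect ratio c.

    phase(x, y) returns the phase aberration in radians at the pupil point
    (x, y). The pupil is sampled on a midpoint grid in the stretched polar
    coordinates x = r cos t, y = c r sin t, whose area element is c*r dr dt.
    """
    total = 0.0 + 0.0j
    weight = 0.0
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for k in range(n_t):
            t = 2.0 * math.pi * (k + 0.5) / n_t
            x, y = r * math.cos(t), c * r * math.sin(t)
            total += r * cmath.exp(1j * phase(x, y))
            weight += r
    return abs(total / weight) ** 2

c = 0.85
sigma_waves = 0.1
sigma_phi = 2.0 * math.pi * sigma_waves   # sigma in radians

def e4(x, y):
    # Defocus polynomial for c = 0.85 (sign convention assumed, see lead-in)
    return -1.6888 + 3.9217 * (x * x + y * y)

S = strehl_ratio(lambda x, y: sigma_phi * e4(x, y), c)
S_approx = math.exp(-sigma_phi ** 2)
print(S, S_approx)
```

The two printed numbers can be compared with the $E_4$ entry of Table 8-8 and with the value 0.67 quoted above.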
Figure 8-5. Elliptical polynomials $E_1$ through $E_{45}$ for an elliptical pupil with an aspect
ratio c = 0.85, each shown as an isometric plot on the top, an interferogram on the left, and
a PSF on the right for a sigma value of one wave.

Table 8-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
elliptical polynomial aberrations with an aspect ratio c = 0.85 and a sigma value of
one wave.

Poly.   P-V #      Poly.   P-V #      Poly.   P-V #
E1      0          E16     8.920      E31     7.805
E2      4          E17     6.068      E32     8.415
E3      4          E18     7.554      E33     7.667
E4      3.922      E19     6.379      E34     8.768
E5      4.899      E20     8.700      E35     10.673
E6      4.777      E21     8.239      E36     11.196
E7      4.256      E22     6.681      E37     7.395
E8      6.7755     E23     8.444      E38     7.795
E9      5.839      E24     6.920      E39     9.824
E10     6.149      E25     8.181      E40     8.506
E11     4.831      E26     7.051      E41     8.692
E12     5.816      E27     9.958      E42     8.233
E13     6.942      E28     9.459      E43     9.313
E14     7.024      E29     7.351      E44     8.606
E15     7.428      E30     10.824     E45     12.414


Table 8-8. Strehl ratio S for elliptical polynomial aberrations with an aspect ratio
c = 0.85 and a sigma value of 0.1 wave.

Poly.   S          Poly.   S          Poly.   S
E1      1          E16     0.680      E31     0.675
E2      0.665      E17     0.669      E32     0.677
E3      0.665      E18     0.678      E33     0.684
E4      0.664      E19     0.675      E34     0.685
E5      0.671      E20     0.692      E35     0.703
E6      0.672      E21     0.692      E36     0.703
E7      0.667      E22     0.675      E37     0.680
E8      0.674      E23     0.677      E38     0.673
E9      0.679      E24     0.672      E39     0.679
E10     0.679      E25     0.6811     E40     0.678
E11     0.671      E26     0.680      E41     0.678
E12     0.671      E27     0.698      E42     0.688
E13     0.675      E28     0.698      E43     0.689
E14     0.686      E29     0.671      E44     0.708
E15     0.685      E30     0.684      E45     0.708

Figure 8-6. Strehl ratio for an elliptical polynomial aberration with an aspect ratio c
= 0.85 and a sigma value of 0.1 wave.

8.7 SEIDEL ABERRATIONS AND THEIR STANDARD DEVIATIONS


We now consider balancing of a Seidel aberration and obtain its standard deviation
with and without balancing.

8.7.1 Defocus
We start with the defocus aberration

$$ W_d(\rho) = A_d \rho^2 . \tag{8-36} $$

From the form of the orthonormal defocus polynomial $E_4$ given in Table 8-2, it is
evident that its sigma value across an elliptical pupil is given by

$$ \sigma_d = \frac{A_d\, \eta}{4\sqrt{3}} , \tag{8-37} $$

where

$$ \eta = \left(3 - 2c^2 + 3c^4\right)^{1/2} . \tag{8-38} $$

8.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by

$$ W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{8-39} $$

The orthonormal polynomial representing balanced astigmatism is given by

$$ E_6 = \frac{\sqrt{6}}{2c^2\eta} \left[ \eta^2 \rho^2 \cos 2\theta - 3\left(1 - c^4\right)\rho^2 + 2c^2\left(1 - c^2\right) \right] \tag{8-40a} $$

$$ = \frac{\eta\sqrt{6}}{c^2} \left( \rho^2 \cos^2\theta - \frac{3 - c^2}{\eta^2}\, \rho^2 \right) + \text{constant} . \tag{8-40b} $$

It shows that Seidel astigmatism $\rho^2 \cos^2\theta$ is balanced with defocus aberration
$-\left[(3 - c^2)/\eta^2\right]\rho^2$, or that balanced astigmatism is given by

$$ W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{3 - c^2}{\eta^2}\, \rho^2 \right) . \tag{8-41} $$

Its sigma value is given by

$$ \sigma_{ba} = \frac{c^2}{\eta\sqrt{6}}\, A_a . \tag{8-42} $$

To determine the sigma of Seidel astigmatism, we write the aberration in terms of the
elliptical polynomials. Thus,

$$ W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta = A_a \left( \frac{c^2}{\eta\sqrt{6}}\, E_6 + \frac{3 - c^2}{4\eta\sqrt{3}}\, E_4 \right) + \text{constant} . \tag{8-43} $$

Utilizing Eq. (8-34), we find the sigma to be

$$ \sigma_a = A_a / 4 . \tag{8-44} $$

Its value is independent of the aspect ratio c of the elliptical pupil, and thus equal to that
for a circular pupil. This is because Seidel astigmatism, which varies as $x^2$, changes only
along the x axis, for which the unit ellipse has the same length as a unit circle.
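
The independence of $\sigma_a$ from c, and the balanced value of Eq. (8-42), can be checked with a short numerical sketch. It assumes the balancing defocus coefficient $(3 - c^2)/\eta^2$, which reduces to the circular-pupil value 1/2 as $c \to 1$, and it integrates over the unit ellipse with a simple midpoint rule. The function names are hypothetical.

```python
import math

def sigma_over_ellipse(W, c, n_r=200, n_t=180):
    """Standard deviation of W(x, y) over the unit ellipse x^2 + (y/c)^2 <= 1.

    Uses a midpoint rule in the stretched polar coordinates
    x = r cos t, y = c r sin t, whose area element is c * r dr dt.
    """
    s0 = s1 = s2 = 0.0
    for i in range(n_r):
        r = (i + 0.5) / n_r
        for k in range(n_t):
            t = 2.0 * math.pi * (k + 0.5) / n_t
            w = W(r * math.cos(t), c * r * math.sin(t))
            s0 += r
            s1 += r * w
            s2 += r * w * w
    mean = s1 / s0
    return math.sqrt(s2 / s0 - mean * mean)

def astigmatism(x, y):
    return x * x                      # rho^2 cos^2(theta) with A_a = 1

def balanced_astigmatism(c):
    eta2 = 3.0 - 2.0 * c**2 + 3.0 * c**4
    return lambda x, y: x * x - (3.0 - c**2) / eta2 * (x * x + y * y)

results = {}
for c in (0.5, 0.85, 1.0):
    eta = math.sqrt(3.0 - 2.0 * c**2 + 3.0 * c**4)
    results[c] = (
        sigma_over_ellipse(astigmatism, c),              # compare with 1/4
        sigma_over_ellipse(balanced_astigmatism(c), c),  # numerical sigma_ba
        c**2 / (eta * math.sqrt(6.0)),                   # Eq. (8-42), A_a = 1
    )
    print(c, results[c])
```

For each aspect ratio the first number should stay close to 0.25, while the second and third should agree with each other.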

8.7.3 Coma
Now we consider Seidel coma:

$$ W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{8-45} $$

The orthonormal polynomial representing balanced coma is given by

$$ E_8 = \frac{4}{\left(9 - 6c^2 + 5c^4\right)^{1/2}} \left[ 6\rho^3 \cos\theta - \left(3 + c^2\right)\rho \cos\theta \right] . \tag{8-46} $$

It shows that the relative amount of tilt $\rho \cos\theta$ that optimally balances Seidel coma
$\rho^3 \cos\theta$ is $-(3 + c^2)/6$ compared to $-2/3$ for a circular pupil. The balanced coma is
given by

$$ W_{bc}(\rho, \theta) = A_c \left( \rho^3 \cos\theta - \frac{3 + c^2}{6}\, \rho \cos\theta \right) . \tag{8-47} $$

Its sigma value is given by

$$ \sigma_{bc} = \frac{\left(9 - 6c^2 + 5c^4\right)^{1/2}}{24}\, A_c . \tag{8-48} $$

To obtain the sigma value of Seidel coma, we write Eq. (8-45) in the form

$$ W_c(\rho, \theta) = A_c \left[ \frac{\left(9 - 6c^2 + 5c^4\right)^{1/2}}{24}\, E_8 + \frac{3 + c^2}{12}\, E_2 \right] . \tag{8-49} $$

Utilizing Eq. (8-34), we obtain the sigma value:

$$ \sigma_c = \frac{1}{8} \left(5 + 2c^2 + c^4\right)^{1/2} A_c . \tag{8-50} $$
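
The step from Eq. (8-49) to Eq. (8-50) can be written out explicitly. Since $E_2$ and $E_8$ are orthonormal, Eq. (8-34) gives the variance as the sum of the squares of their coefficients:

```latex
\begin{align*}
\sigma_c^2 &= A_c^2 \left[ \frac{9 - 6c^2 + 5c^4}{24^2} + \frac{\left(3 + c^2\right)^2}{12^2} \right] \\
           &= \frac{A_c^2}{576} \left[ 9 - 6c^2 + 5c^4 + 4\left(9 + 6c^2 + c^4\right) \right] \\
           &= \frac{A_c^2}{576} \left( 45 + 18c^2 + 9c^4 \right)
            = \frac{A_c^2}{64} \left( 5 + 2c^2 + c^4 \right) .
\end{align*}
```

For c = 1 this gives $\sigma_c = A_c / 2\sqrt{2}$, the value for a circular pupil.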

8.7.4 Spherical Aberration


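Finally, we consider Seidel spherical aberration

$$ W_s(\rho) = A_s \rho^4 . \tag{8-51} $$

The orthonormal polynomial representing balanced spherical aberration is given by

$$ E_{11} = \frac{\sqrt{5}}{\alpha} \left[ 48\rho^4 - 12\left(1 - c^2\right)\rho^2 \cos 2\theta - 24\left(1 + c^2\right)\rho^2 \right] + \text{constant} \tag{8-52a} $$

$$ = \frac{\sqrt{5}}{\alpha} \left[ 48\rho^4 - 24\left(1 - c^2\right)\rho^2 \cos^2\theta - 12\left(1 + 3c^2\right)\rho^2 \right] + \text{constant} . \tag{8-52b} $$

The balanced spherical aberration is given by

$$ W_{bs}(\rho, \theta) = A_s \left[ \rho^4 - \frac{1}{4}\left(1 - c^2\right)\rho^2 \cos 2\theta - \frac{1}{2}\left(1 + c^2\right)\rho^2 \right] \tag{8-53a} $$

$$ = A_s \left[ \rho^4 - \frac{1}{2}\left(1 - c^2\right)\rho^2 \cos^2\theta - \frac{1}{4}\left(1 + 3c^2\right)\rho^2 \right] . \tag{8-53b} $$

It shows that spherical aberration is balanced not only by defocus but astigmatism as
well. Its sigma value is given by

$$ \sigma_{bs} = \frac{\alpha}{48\sqrt{5}}\, A_s . \tag{8-54} $$

Comparison with Table 8-9 shows that $\alpha = \left(45 - 60c^2 + 94c^4 - 60c^6 + 45c^8\right)^{1/2}$.

To obtain the sigma value of Seidel spherical aberration, we write Eq. (8-51) in the form

$$ W_s(\rho) = A_s \left\{ \frac{\alpha}{48\sqrt{5}}\, E_{11} + \frac{c^2\left(1 - c^2\right)}{2\eta\sqrt{6}}\, E_6 + \frac{1}{8\sqrt{3}} \left[ \frac{3\left(1 - c^2\right)\left(1 - c^4\right)}{2\eta} + \eta\left(1 + c^2\right) \right] E_4 \right\} + \text{constant} . \tag{8-55} $$

Utilizing Eq. (8-34), we obtain the sigma value:

$$ \sigma_s = \frac{\left(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8\right)^{1/2}}{24\sqrt{10}}\, A_s . \tag{8-56} $$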

The sigma values of Seidel aberrations with and without balancing are given in Table
8-9. They reduce to the corresponding values for a circular pupil given in Table 4-3 as
$c \to 1$. The variation of sigma for a primary aberration with the aspect ratio c is shown in
Figure 8-7. While $\sigma_a$ for astigmatism is constant, sigma increases monotonically with c in
the case of coma ($\sigma_c$) and spherical aberration ($\sigma_s$). For defocus, $\sigma_d$ has a minimum at
$c = 1/\sqrt{3}$. The variation of sigma of a balanced primary aberration as a function of c is
shown in Figure 8-8. While its variation for balanced coma $\sigma_{bc}$ and balanced spherical
aberration $\sigma_{bs}$ is small, sigma of balanced astigmatism $\sigma_{ba}$ increases monotonically.
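
The location of the defocus minimum follows from Eqs. (8-37) and (8-38) by setting the derivative of $\eta^2$ with respect to c to zero. A short worked version, assuming $0 < c \le 1$:

```latex
\begin{align*}
\frac{d\eta^2}{dc} &= \frac{d}{dc}\left(3 - 2c^2 + 3c^4\right) = -4c + 12c^3 = 4c\left(3c^2 - 1\right) = 0
  \quad\Longrightarrow\quad c = \frac{1}{\sqrt{3}} , \\
\sigma_d \Big|_{c = 1/\sqrt{3}} &= \frac{A_d}{4\sqrt{3}} \left(3 - \frac{2}{3} + \frac{1}{3}\right)^{1/2}
  = \frac{A_d}{4\sqrt{3}} \sqrt{\frac{8}{3}} = \frac{A_d}{3\sqrt{2}} \approx 0.236\, A_d .
\end{align*}
```

For comparison, the same formula gives $\sigma_d = A_d / 2\sqrt{3} \approx 0.289\, A_d$ at c = 1.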

Table 8-9. Standard deviation $\sigma_i$ of a primary and a balanced primary aberration
for an elliptical pupil of aspect ratio c.

Aberration                       Sigma
Defocus                          $\sigma_d = (A_d/4)\left[\left(3 - 2c^2 + 3c^4\right)/3\right]^{1/2}$
Astigmatism                      $\sigma_a = A_a/4$
Balanced astigmatism             $\sigma_{ba} = A_a c^2 / \left[6\left(3 - 2c^2 + 3c^4\right)\right]^{1/2}$
Coma                             $\sigma_c = A_c \left(5 + 2c^2 + c^4\right)^{1/2} / 8$
Balanced coma                    $\sigma_{bc} = A_c \left(9 - 6c^2 + 5c^4\right)^{1/2} / 24$
Spherical aberration             $\sigma_s = A_s \left(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8\right)^{1/2} / \left(24\sqrt{10}\right)$
Balanced spherical aberration    $\sigma_{bs} = A_s \left(45 - 60c^2 + 94c^4 - 60c^6 + 45c^8\right)^{1/2} / \left(48\sqrt{5}\right)$
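
The entries of Table 8-9 are easy to evaluate for any aspect ratio. The sketch below returns each sigma per unit aberration coefficient; the function name and dictionary keys are hypothetical. Evaluating it at c = 1 gives a quick check against the familiar circular-pupil values.

```python
import math

def seidel_sigmas(c):
    """Sigma of each aberration in Table 8-9 for a unit aberration coefficient."""
    c2, c4, c6, c8 = c**2, c**4, c**6, c**8
    eta2 = 3 - 2 * c2 + 3 * c4
    return {
        "defocus": math.sqrt(eta2 / 3) / 4,
        "astigmatism": 0.25,
        "balanced astigmatism": c2 / math.sqrt(6 * eta2),
        "coma": math.sqrt(5 + 2 * c2 + c4) / 8,
        "balanced coma": math.sqrt(9 - 6 * c2 + 5 * c4) / 24,
        "spherical": math.sqrt(225 + 60 * c2 - 58 * c4 + 60 * c6 + 225 * c8)
                     / (24 * math.sqrt(10)),
        "balanced spherical": math.sqrt(45 - 60 * c2 + 94 * c4 - 60 * c6 + 45 * c8)
                              / (48 * math.sqrt(5)),
    }

circular = seidel_sigmas(1.0)     # limit of a circular pupil
elliptical = seidel_sigmas(0.85)  # aspect ratio used in Tables 8-4 to 8-8
for name in circular:
    print(f"{name:22s} c=1: {circular[name]:.6f}   c=0.85: {elliptical[name]:.6f}")
```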

Figure 8-7. Variation of sigma of a Seidel aberration as a function of aspect ratio c
of a unit elliptical pupil, where the subscript d is for defocus, a for astigmatism, c for
coma, and s for spherical aberration.

Figure 8-8. Variation of sigma of a balanced Seidel aberration as a function of
aspect ratio c of a unit elliptical pupil, where the subscript ba is for balanced
astigmatism, bc for balanced coma, and bs for balanced spherical aberration.

8.8 SUMMARY
The PSF and OTF of a system with an elliptical pupil are obtained from the
corresponding PSF and OTF of a system with a circular pupil discussed in Chapter 4 by
scaling the coordinates of the elliptical pupil and transforming it into a circular pupil. It is
explained that the orthogonal aberration polynomials for an elliptical pupil representing
balanced classical aberrations for such a pupil cannot be obtained in the same manner.
These polynomials, orthonormal over a unit elliptical pupil, are obtained by
orthonormalizing the circle polynomials by the Gram–Schmidt orthonormalization
process. They are given through the fourth order in Tables 8-1 through 8-3 in terms of the
circle polynomials, in polar coordinates, and in Cartesian coordinates,
respectively. Table 8-2 shows that each polynomial consists of either the cosine or the
sine terms, but not both. Thus, an even j polynomial, for example, consists of only the
cosine terms. This is a consequence of the biaxial symmetry of the pupil. Since the
polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point,
polynomial numbering with two indices n and m loses significance. Hence, they must be
numbered with a single index j. Their ordering is the same as for the polynomials
discussed in previous chapters.

Only the first 15 elliptical polynomials are given for an arbitrary aspect ratio c of the
pupil in Tables 8-1 through 8-3. The expressions for the higher-order elliptical
polynomials are very long unless c is specified. The polynomial $E_6$ for astigmatism is a

degree) Seidel astigmatism is different for an elliptical pupil compared to that for a
circular, annular, or a Gaussian pupil. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$,
$Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but
astigmatism $\rho^2 \cos^2\theta$ as well. It is evidently not radially symmetric. As expected, the
elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse
approaches a unit circle.

The elliptical polynomials up to the eighth order for an elliptical pupil with an aspect
ratio of c = 0.85 are given in Tables 8-4 to 8-6 in terms of the Zernike circle polynomials,
in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in
three different but equivalent ways in Figure 8-5 with the isometric plot, interferogram,
and the PSF for a sigma value of one wave. The peak-to-valley aberration numbers (in
units of wavelength) are given in Table 8-7. The Strehl ratio for a sigma value of 0.1
wave is given in Table 8-8 and plotted in Figure 8-6. The Seidel aberrations are discussed
in Section 8.7 and their sigma values with and without balancing are given in Table 8-9.

References

1. H. J. Wyatt, “The form of the human pupil,” Vision Res. 35, 2021–2036 (1995).

2. W. B. King, “The approximation of a vignetted pupil shape by an ellipse,” Appl. Opt. 7, 197–201 (1968).

3. G. Harbers, P. J. Kunst, and G. W. R. Leibbrandt, “Analysis of lateral shearing interferogram by use of Zernike polynomials,” Appl. Opt. 35, 6162–6172 (1996).

4. H. Sumita, “Orthogonal expansion of the aberration difference function and its application to image evaluation,” Japanese J. Appl. Phys. 8, 1027–1036 (1969).

5. Y. P. Kathuria, “Far-field radiation patterns of elliptical pupil apertures and its annuli,” IEEE Trans. Anten. Propa. AP-31, 360–363 (1983).

6. J. V. Cornacchio and R. P. Soni, “Autoconvolution of an ellipse,” J. Opt. Soc. Am. 55, 107–108 (1965).

7. V. N. Mahajan and G.-m. Dai, “Orthonormal polynomials in wavefront analysis: analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc. Am. A 29, 1673–1674 (2012).

8. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–11.41 (McGraw Hill, 2009).
CHAPTER 9

SYSTEMS WITH RECTANGULAR PUPILS

9.1 Introduction

9.2 Pupil Function

9.3 Aberration-Free Imaging

9.3.1 PSF

9.3.2 OTF

9.4 Rectangular Polynomials

9.5 Rectangular Coefficients of a Rectangular Aberration Function

9.6 Isometric, Interferometric, and Imaging Characteristics of Rectangular Polynomial Aberrations

9.7 Seidel Aberrations and Their Standard Deviations

9.7.1 Defocus

9.7.2 Astigmatism

9.7.3 Coma

9.7.4 Spherical Aberration

9.8 Summary

References

Chapter 9
Systems with Rectangular Pupils
9.1 INTRODUCTION
High-power laser beams have a rectangular cross-section; hence there is a need to
discuss the diffraction characteristics of a rectangular pupil. We start this chapter with a
brief discussion of the PSF and OTF of a system with such a pupil.

Although high-power rectangular laser beams have been around for a long time [1],
there is little in the literature on rectangular polynomials representing balanced
aberrations for such beams. In this chapter we discuss such polynomials that are
orthonormal over a unit rectangular pupil [2,3]. These polynomials are not separable in
the x and y coordinates of a point on the pupil. The expressions for only the first 15
orthonormal polynomials, i.e., up to and including the fourth order, are given for an
arbitrary aspect ratio of the pupil because they become quite cumbersome as their order
increases. However, expressions for the first 45 polynomials, i.e., up to and including the
eighth order, are given for an aspect ratio of 0.75. The isometric, interferometric, and PSF
plots of these polynomial aberrations with a sigma value of one wave are given along
with their P-V numbers. The Strehl ratios for these polynomial aberrations for a sigma
value of one-tenth of a wave are also given. Finally, we discuss how to obtain the
standard deviation of a Seidel aberration with and without balancing.

Products of Legendre polynomials (one for the x and the other for the y axis), which
are also orthogonal over a rectangular pupil [4], are not suitable for the analysis of
rectangular wavefronts of rotationally symmetric systems, since they do not represent
classical or balanced aberrations for such systems. For example, the defocus aberration
for such a system is represented by $x^2 + y^2$. While it can be expanded in terms of a
complete set of 2D Legendre polynomials, it cannot be represented by a single product of
the x- and y-Legendre polynomials. The same difficulty holds for spherical aberration,
coma, etc. However, products of such Legendre polynomials are suitable for anamorphic
systems, as discussed in Chapter 13. Products of Chebyshev polynomials, one for the x-
and the other for the y-axis, are also orthogonal over a rectangular pupil, but they are not
suitable either for the rectangular pupils considered in this chapter for the same reasons as
for the products of Legendre polynomials.

9.2 PUPIL FUNCTION


As illustrated in Figure 9-1, consider an optical system with a rectangular exit pupil
with half-widths a and b and area $S_{ex} = 4ab$ lying in the $(x_p, y_p)$ plane with the z axis as its
optical axis. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$
and power $P_{ex}$ exiting from it, the pupil function of the system can be written

$$ P(x_p, y_p) = A(x_p, y_p) \exp[i\Phi(x_p, y_p)] , \tag{9-1} $$

where

$$ A(x_p, y_p) = \left(P_{ex}/S_{ex}\right)^{1/2} , \quad -a \le x_p \le a , \quad -b \le y_p \le b . \tag{9-2} $$

Figure 9-1. Rectangular pupil with half-widths a and b.

9.3 ABERRATION-FREE IMAGING


9.3.1 PSF
From Eq. (1-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system
with a uniformly illuminated rectangular exit pupil, normalized by its aberration-free
central value $P_{ex} S_{ex} / \lambda^2 R^2$, can be written

$$ I(x_i, y_i) = \frac{1}{S_{ex}^2} \left| \int_{-a}^{a} \int_{-b}^{b} \exp[i\Phi(x_p, y_p)] \exp\left[ -\frac{2\pi i}{\lambda R} \left(x_i x_p + y_i y_p\right) \right] dx_p\, dy_p \right|^2 . \tag{9-3} $$

Substituting

$$ (x', y') = (x_p/a,\; y_p/b) \tag{9-4} $$

and

$$ (x, y) = \frac{1}{\lambda F_x} (x_i, y_i) \tag{9-5} $$

into Eq. (9-3), where

$$ F_x = R/2a \tag{9-6} $$

is the focal ratio of the image-forming light cone along the x axis, and

$$ \epsilon = b/a \tag{9-7} $$

is the aspect ratio of the pupil, the irradiance distribution can be written

$$ I(x, y) = \frac{1}{16} \left| \int_{-1}^{1} \int_{-1}^{1} \exp[i\Phi(x', y')] \exp[-\pi i (x x' + \epsilon y y')]\, dx'\, dy' \right|^2 . \tag{9-8} $$

Accordingly, the aberration-free distribution is given by

$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp[-\pi i(xx' + yy')]\,dx'\,dy'\right|^2 = \left(\frac{\sin\pi x}{\pi x}\right)^2\left(\frac{\sin\pi y}{\pi y}\right)^2 . \tag{9-9}$$

Figure 9-2a shows the 2D PSF for an aspect ratio $\epsilon = 0.75$. In particular, it shows the
central bright rectangular spot of size $2 \times 2$ in the normalized coordinates $(x, y)$ of
Eq. (9-5), i.e., $2\lambda F_x$ along $x_i$ and $2\lambda F_x/\epsilon$ along $y_i$. The
PSF is zero wherever x and/or y is a positive or a negative integer. Figure 9-2b shows
the irradiance distribution along the x and y axes, and along the diagonal of the central
bright spot as $I(x, 0)$, $I(0, y)$, and $I(x, y) \equiv I(r)$, where $r = (x^2 + y^2)^{1/2}$ and
4
È Ê 2ˆ ˘
Í sinË pr 1 +  ¯ ˙
I (r) = Í ˙ . (9-10)
Í pr 1 + 
2
˙
Î ˚

Figure 9-2. (a) 2D aberration-free PSF for $\epsilon = 0.75$. (b) Irradiance distributions $I(x, 0)$,
$I(0, y)$, and $I(r)$ along the x and y axes, and along the diagonal of the central bright spot
of the PSF.
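The closed form of Eq. (9-9) can be checked against a direct numerical evaluation of the integral in Eq. (9-8). The following is a minimal sketch, assuming NumPy is available; the function names `psf_aberration_free` and `psf_numeric` and the use of Gauss-Legendre quadrature are illustrative choices, not part of the text.

```python
import numpy as np

def psf_aberration_free(x, y):
    """Closed form of Eq. (9-9). Note np.sinc(t) = sin(pi t)/(pi t)."""
    return np.sinc(x) ** 2 * np.sinc(y) ** 2

def psf_numeric(x, y, phase=None, n=64):
    """Eq. (9-8) evaluated by Gauss-Legendre quadrature over [-1, 1]^2.

    `phase` is an optional callable returning the aberration function
    Phi(x', y') in radians; None means an aberration-free pupil.
    """
    t, w = np.polynomial.legendre.leggauss(n)    # nodes and weights on [-1, 1]
    xp, yp = np.meshgrid(t, t, indexing="ij")    # normalized pupil coordinates
    wt = np.outer(w, w)                          # 2D quadrature weights
    phi = 0.0 if phase is None else phase(xp, yp)
    integrand = np.exp(1j * phi) * np.exp(-1j * np.pi * (x * xp + y * yp))
    return abs(np.sum(wt * integrand)) ** 2 / 16.0

# Compare the two evaluations at an arbitrary image point
print(psf_numeric(0.3, 0.7), psf_aberration_free(0.3, 0.7))
```

Passing a callable as `phase` gives the aberrated PSF of Eq. (9-8) at the same normalized image point.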

9.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a rectangular pupil at a
spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two rectangles
centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 9-3. The overlap area is given by

$$S(\xi, \eta) = (2a - \lambda R\xi)(2b - \lambda R\eta) = 4ab\left(1 - \frac{\xi}{1/\lambda F_x}\right)\left(1 - \frac{1}{\epsilon}\frac{\eta}{1/\lambda F_x}\right) . \tag{9-11}$$

Hence, the fractional area of overlap, or the OTF of the system, may be written

$$\tau(v_x, v_y) = (1 - v_x)\left(1 - \frac{v_y}{\epsilon}\right) , \tag{9-12}$$

where

$$(v_x, v_y) = \left(\frac{\xi}{1/\lambda F_x},\; \frac{\eta}{1/\lambda F_x}\right) \tag{9-13}$$

are the spatial frequency components in units of the cutoff frequency $1/\lambda F_x$ along the x
axis. The OTF $\tau(v)$, where $v = (v_x^2 + v_y^2)^{1/2}$, along the diagonal of the pupil can be
obtained from Eq. (9-12) by letting $v_y/v_x = \epsilon$. Thus

$$\tau(v) = \left(1 - \frac{v}{\sqrt{1 + \epsilon^2}}\right)^2 . \tag{9-14}$$

Its cutoff frequency is $\sqrt{1 + \epsilon^2}$.

Figure 9-3. Overlap area of two rectangular pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$
for an aspect ratio $\epsilon = 0.75$.

Figure 9-4 shows the OTF for $\epsilon = 0.75$ along the x and y axes, and along the
diagonal of the pupil, as $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$, with the corresponding cutoff
frequencies 1, 0.75, and 1.25, respectively, each in units of $1/\lambda F_x$. We note that
$\tau(0, v_y) < \tau(v_x, 0)$ for any value of $v_x = v_y$ due to the smaller dimension of the pupil
along the y axis. Moreover, $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range
$0 < v = v_x < 2\sqrt{1 + \epsilon^2} - (1 + \epsilon^2)$, or $0 < v = v_x < 0.9375$ in our example of $\epsilon = 0.75$. The
two OTFs are equal to each other at the frequency $2\sqrt{1 + \epsilon^2} - (1 + \epsilon^2)$, or 0.9375. At
larger frequencies, $\tau(v) > \tau(v_x, 0)$ until $v = \sqrt{1 + \epsilon^2}$. Of course, the values of both OTFs
in the vicinity of the unity cutoff frequency for $\tau(v_x, 0)$ are quite small in our example.
Finally, $\tau(0, v_y)$ is only slightly greater than $\tau(v)$ in the frequency range
$0 < v = v_y < 2\sqrt{1 + \epsilon^2} - \epsilon^{-1}(1 + \epsilon^2)$. The two OTFs are equal to each other at the
frequency $2\sqrt{1 + \epsilon^2} - \epsilon^{-1}(1 + \epsilon^2)$, or 1/2.4 in our example. For larger frequencies, $\tau(v)$ is
significantly greater. We point out that they are equal to each other at a nonzero frequency
only if $\epsilon \ge 1/\sqrt{3}$. As $\epsilon \to 1$ and the rectangular pupil becomes square, $\tau(0, v_y) \to \tau(v_x, 0)$ for any value of
$v_x = v_y$, and the cutoff frequency for $\tau(v)$ approaches $\sqrt{2}$, as discussed in the next
chapter.
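The crossover frequencies quoted above follow from equating Eq. (9-14) with Eq. (9-12) along each axis. A minimal sketch in Python, assuming scalar inputs; the function names are hypothetical, and the clipping of the OTF to zero beyond cutoff and the use of absolute frequencies are assumptions added for robustness, not statements from the text.

```python
import numpy as np

def otf(vx, vy, eps):
    """Eq. (9-12), with frequencies in units of 1/(lambda F_x).

    Assumption: the OTF is taken as zero beyond the cutoff and is even
    in each frequency component.
    """
    vx, vy = abs(vx), abs(vy)
    return max(0.0, 1.0 - vx) * max(0.0, 1.0 - vy / eps)

def otf_diagonal(v, eps):
    """Eq. (9-14): OTF along the pupil diagonal, where vy/vx = eps."""
    s = np.hypot(1.0, eps)               # cutoff frequency sqrt(1 + eps^2)
    return max(0.0, 1.0 - v / s) ** 2

def crossing_with_x(eps):
    """Frequency at which tau(v) equals tau(vx, 0)."""
    s2 = 1.0 + eps ** 2
    return 2.0 * np.sqrt(s2) - s2

def crossing_with_y(eps):
    """Frequency at which tau(v) equals tau(0, vy); positive only if eps > 1/sqrt(3)."""
    s2 = 1.0 + eps ** 2
    return 2.0 * np.sqrt(s2) - s2 / eps

eps = 0.75
print(crossing_with_x(eps), crossing_with_y(eps))
```

For $\epsilon = 0.75$ the two crossings evaluate to 0.9375 and 1/2.4, the values given in the text.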

Figure 9-4. Aberration-free OTF for $\epsilon = 0.75$, shown as $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$,
where $v_x$, $v_y$, and $v$ are in units of the cutoff frequency $1/\lambda F_x$ along the x axis.

9.4 RECTANGULAR POLYNOMIALS


Figure 9-5 shows a unit rectangle inscribed inside a unit circle. The half-widths $a$
and $b$ of the rectangular pupil are normalized by its semidiagonal $(a^2 + b^2)^{1/2}$ so that the
farthest points (such as A) on the pupil lie at a distance of unity. The half-widths of the
unit rectangle along the x and y axes are $c = a/(a^2 + b^2)^{1/2}$ and $(1 - c^2)^{1/2}$, respectively,
where $0 < c < 1$. Accordingly, the aspect ratio of the rectangle is $(1 - c^2)^{1/2}/c$, and its
area is given by $A = 4c(1 - c^2)^{1/2}$. As in the case of a unit ellipse, a unit rectangle is also
not unique, since c can have any value between 0 and 1. For example, when $c = 0.8$, the
aspect ratio of the pupil is 0.75 and the area is 1.92. As $c \to 1/\sqrt{2}$, the rectangle becomes
a square, and as $c \to 1$ or 0, it becomes a slit parallel to the x or the y axis, respectively.
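The normalization described above can be sketched as follows. The helper name `unit_rectangle` is hypothetical; it simply evaluates the expressions for $c$, the aspect ratio, and the area given in the preceding paragraph.

```python
import math

def unit_rectangle(a, b):
    """Normalize pupil half-widths a and b by the semidiagonal.

    Returns (c, h, aspect_ratio, area), where c and h = sqrt(1 - c^2) are
    the half-widths of the unit rectangle along the x and y axes.
    """
    d = math.hypot(a, b)          # semidiagonal (a^2 + b^2)^(1/2)
    c = a / d
    h = math.sqrt(1.0 - c * c)    # equals b / d
    return c, h, h / c, 4.0 * c * h

# Half-widths in the ratio 4:3 give the c = 0.8 example of the text
print(unit_rectangle(4.0, 3.0))
```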

The orthonormal rectangular polynomials $R_j(x, y)$ obtained by orthogonalizing the
Zernike circle polynomials $Z_j$ over a unit rectangle are given by [see Eq. (3-18)]

$$R_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}R_k\rangle R_k\right] , \tag{9-15}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit rectangle, i.e., they satisfy the orthonormality condition

$$\frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1 - c^2}}^{\sqrt{1 - c^2}}R_jR_{j'}\,dy = \delta_{jj'} . \tag{9-16}$$

The angular brackets indicate a mean value over the rectangular pupil. Thus

$$\langle Z_jR_k\rangle = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1 - c^2}}^{\sqrt{1 - c^2}}Z_jR_k\,dy . \tag{9-17}$$

Figure 9-5. Unit rectangle of half-width c inscribed inside a unit circle. Its corner
points $A(c, \sqrt{1 - c^2})$, $B(c, -\sqrt{1 - c^2})$, $C(-c, -\sqrt{1 - c^2})$, and $D(-c, \sqrt{1 - c^2})$
lie at a distance of unity from its center O.

It should be evident that because of the symmetric limits of integration, a mean value is
zero if the integrand is an odd function of x and/or y. If the integrand is an even function,
then we may replace the lower limits of integration by zero and multiply the double
integral by 4.
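The orthogonalization of Eq. (9-15), with the mean values of Eq. (9-17), can be carried out numerically. The sketch below is an illustration only, assuming NumPy: it uses the standard Cartesian forms of the first six Zernike circle polynomials, evaluates the pupil means by Gauss-Legendre quadrature (exact for polynomial integrands of this degree), and applies a classical Gram-Schmidt loop. The function names are hypothetical.

```python
import numpy as np

def zernike_1_to_6(x, y):
    """First six Zernike circle polynomials in Cartesian form."""
    rho2 = x ** 2 + y ** 2
    return [
        np.ones_like(x),                  # Z1: piston
        2.0 * x,                          # Z2: 2 rho cos(theta)
        2.0 * y,                          # Z3: 2 rho sin(theta)
        np.sqrt(3.0) * (2.0 * rho2 - 1),  # Z4: defocus
        np.sqrt(6.0) * 2.0 * x * y,       # Z5: sqrt(6) rho^2 sin(2 theta)
        np.sqrt(6.0) * (x ** 2 - y ** 2), # Z6: sqrt(6) rho^2 cos(2 theta)
    ]

def rectangular_polynomials(c, n=20):
    """Gram-Schmidt orthonormalization of Z1..Z6 over the unit rectangle."""
    h = np.sqrt(1.0 - c ** 2)
    t, w = np.polynomial.legendre.leggauss(n)
    x, y = np.meshgrid(c * t, h * t, indexing="ij")
    wt = np.outer(w, w) / 4.0             # weights sum to 1: a pupil mean

    def mean(f):
        return np.sum(wt * f)

    R = []
    for Z in zernike_1_to_6(x, y):
        # Subtract projections on the polynomials already found, Eq. (9-15)
        G = Z - sum(mean(Z * Rk) * Rk for Rk in R)
        R.append(G / np.sqrt(mean(G * G)))  # normalize, Eq. (9-16)
    return x, y, R

x, y, R = rectangular_polynomials(0.8)
gram = np.array([[np.sum(np.outer(*[np.polynomial.legendre.leggauss(20)[1]] * 2) / 4.0 * Ri * Rj)
                  for Rj in R] for Ri in R])
print(np.round(gram, 10))                 # should be the 6 x 6 identity matrix
```

For $c = 0.8$ the sampled values of `R[3]` and `R[4]` can be compared with the closed forms for $R_4$ and $R_5$ listed in Table 9-3.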

The rectangular polynomials thus obtained up to the fourth order are given in Tables
9-1 through 9-3 in the same manner as the elliptical polynomials. Only the first 15
polynomials are given in these tables, because their expressions become too long unless
the aspect ratio is specified. Each polynomial consists of a number of circle polynomials,
but contains only the cosine or the sine terms, not both. The polynomial $R_6$ representing
balanced astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the
balancing defocus for 0° Seidel astigmatism is different for a rectangular pupil compared
to that, for example, for a circular pupil. Similarly, the polynomial $R_{11}$, representing
balanced primary spherical aberration, is not radially symmetric, since it contains an
astigmatism term $Z_6$ varying as $\cos 2\theta$. As expected, the rectangular polynomials reduce to
the square polynomials as $c \to 1/\sqrt{2}$, and to the slit polynomials for a slit pupil parallel to
the x axis as $c \to 1$, discussed in Chapters 10 and 11, respectively.

9.5 RECTANGULAR COEFFICIENTS OF A RECTANGULAR ABERRATION FUNCTION

A rectangular aberration function $W(x, y)$ across a unit rectangle can be expanded in
terms of J rectangular polynomials $R_j(x, y)$ in the form

$$W(x, y) = \sum_{j=1}^{J}a_jR_j(x, y) , \tag{9-18}$$

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (9-18) by
$R_j(x, y)$, integrating over the unit rectangle, and using the orthonormality Eq. (9-16), we
obtain the rectangular expansion coefficients:

$$a_j = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1 - c^2}}^{\sqrt{1 - c^2}}W(x, y)R_j(x, y)\,dy . \tag{9-19}$$

As stated in Section 3.2, it is evident from Eq. (9-19) that the value of a rectangular
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.

The mean and mean square values of the aberration function are given by

$$\langle W(\rho, \theta)\rangle = a_1 \tag{9-20}$$

and

Table 9-1. Orthonormal rectangular polynomials $R_j(\rho, \theta)$ in terms of the Zernike
circle polynomials $Z_j(\rho, \theta)$.

$R_1 = Z_1$

$R_2 = [\sqrt{3}/(2c)]\,Z_2$

$R_3 = [\sqrt{3}/(2\sqrt{1 - c^2})]\,Z_3$

$R_4 = [\sqrt{5}/(4\sqrt{1 - 2c^2 + 2c^4})]\,(Z_1 + \sqrt{3}Z_4)$

$R_5 = [\sqrt{3/2}/(2c\sqrt{1 - c^2})]\,Z_5$

$R_6 = \{\sqrt{5}/[8c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[(3 - 10c^2 + 12c^4 - 8c^6)Z_1 + \sqrt{3}(1 - 2c^2)Z_4 + \sqrt{6}(1 - 2c^2 + 2c^4)Z_6]$

$R_7 = [\sqrt{21}/(4\sqrt{2}\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,[\sqrt{2}(1 + 4c^2)Z_3 + 5Z_7]$

$R_8 = [\sqrt{21}/(4\sqrt{2}\,c\sqrt{35 - 70c^2 + 62c^4})]\,[\sqrt{2}(5 - 4c^2)Z_2 + 5Z_8]$

$R_9 = \{\sqrt{5/2}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[16c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,[2\sqrt{2}(9 - 36c^2 + 52c^4 - 60c^6)Z_3 + (9 - 18c^2 - 26c^4)Z_7 + (27 - 54c^2 + 62c^4)Z_9]$

$R_{10} = \{\sqrt{5/2}/[16c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,[2\sqrt{2}(35 - 112c^2 + 128c^4 - 60c^6)Z_2 + (35 - 70c^2 + 26c^4)Z_8 + (35 - 70c^2 + 62c^4)Z_{10}]$

$R_{11} = [1/(16\mu)]\,[8(3 + 4c^2 - 4c^4)Z_1 + 25\sqrt{3}Z_4 + 10\sqrt{6}(1 - 2c^2)Z_6 + 21\sqrt{5}Z_{11}]$

$R_{12} = [3\mu/(16c^2\nu\eta)]\,\{(105 - 550c^2 + 1559c^4 - 2836c^6 + 2695c^8 - 1078c^{10})Z_1 + 5\sqrt{3}(14 - 74c^2 + 205c^4 - 360c^6 + 335c^8 - 134c^{10})Z_4 + (5/2)\sqrt{3/2}(35 - 156c^2 + 421c^4 - 530c^6 + 265c^8)Z_6 + 21\sqrt{5}(1 - 4c^2 + 6c^4 - 4c^6)Z_{11} + [(7/2)\sqrt{5/2}\,\eta/(1 - c^2)]Z_{12}\}$

$R_{13} = [\sqrt{21}/(16\sqrt{2}\,c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(\sqrt{3}Z_5 + \sqrt{5}Z_{13})$

$R_{14} = \tau\,[6(245 - 1400c^2 + 3378c^4 - 4452c^6 + 3466c^8 - 1488c^{10} + 496c^{12})Z_1 + 15\sqrt{3}(49 - 252c^2 + 522c^4 - 540c^6 + 270c^8)Z_4 + 15\sqrt{6}(49 - 252c^2 + 534c^4 - 596c^6 + 360c^8 - 144c^{10})Z_6 + 3\sqrt{5}(49 - 196c^2 + 282c^4 - 172c^6 + 86c^8)Z_{11} + 147\sqrt{10}(1 - 4c^2 + 6c^4 - 4c^6)Z_{12} + 3\sqrt{10}\,\nu^2Z_{14}]$

$R_{15} = \{1/[32c^3(1 - c^2)(1 - 3c^2 + 4c^4 - 2c^6)^{1/2}]\}\,[3\sqrt{7/2}(5 - 18c^2 + 24c^4 - 16c^6)Z_5 + \sqrt{105/2}(1 - 2c^2)Z_{13} + \sqrt{210}(1 - 2c^2 + 2c^4)Z_{15}]$

where

$\mu = (9 - 36c^2 + 103c^4 - 134c^6 + 67c^8)^{1/2}$
$\nu = (49 - 196c^2 + 330c^4 - 268c^6 + 134c^8)^{1/2}$
$\tau = 1/[128\nu c^4(1 - c^2)^2]$
$\eta = 9 - 45c^2 + 139c^4 - 237c^6 + 201c^8 - 67c^{10} = \mu^2(1 - c^2)$

Table 9-2. Orthonormal rectangular polynomials $R_j(\rho, \theta)$ in polar coordinates
$(\rho, \theta)$.

$R_1 = 1$

$R_2 = (\sqrt{3}/c)\,\rho\cos\theta$

$R_3 = [\sqrt{3}/\sqrt{1 - c^2}]\,\rho\sin\theta$

$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$

$R_5 = [3/(2c\sqrt{1 - c^2})]\,\rho^2\sin 2\theta$

$R_6 = \{\sqrt{5}/[4c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - 2c^2 + 2c^4)\rho^2\cos 2\theta + 3(1 - 2c^2)\rho^2 - 2c^2(1 - c^2)(1 - 2c^2)]$

$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\rho\sin\theta$

$R_8 = [\sqrt{21}/(2c\sqrt{35 - 70c^2 + 62c^4})]\,(15\rho^2 - 5 - 4c^2)\rho\cos\theta$

$R_9 = \{\sqrt{5}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[8c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,\{(27 - 54c^2 + 62c^4)\rho^3\sin 3\theta - 3[4c^2(3 - 13c^2 + 10c^4) - (9 - 18c^2 - 26c^4)\rho^2]\rho\sin\theta\}$

$R_{10} = \{\sqrt{5}/[8c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,\{(35 - 70c^2 + 62c^4)\rho^3\cos 3\theta - 3[4c^2(7 - 17c^2 + 10c^4) - (35 - 70c^2 + 26c^4)\rho^2]\rho\cos\theta\}$

$R_{11} = [1/(8\mu)]\,[315\rho^4 + 30(1 - 2c^2)\rho^2\cos 2\theta - 240\rho^2 + 27 + 16c^2 - 16c^4]$

$R_{12} = [3\mu/(8c^2\nu\eta)]\,[315(1 - 2c^2)(1 - 2c^2 + 2c^4)\rho^4 + 5(7\mu^2\rho^2 - 21 + 72c^2 - 225c^4 + 306c^6 - 153c^8)\rho^2\cos 2\theta - 15(1 - 2c^2)(7 + 4c^2 - 71c^4 + 134c^6 - 67c^8)\rho^2 + c^2(1 - c^2)(1 - 2c^2)(70 - 233c^2 + 233c^4)]$

$R_{13} = [\sqrt{21}/(4c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(5\rho^2 - 3)\rho^2\sin 2\theta$

$R_{14} = 6\tau\,\{5\nu^2\rho^4\cos 4\theta - 20(1 - 2c^2)[6c^2(7 - 16c^2 + 18c^4 - 9c^6) - 49(1 - 2c^2 + 2c^4)\rho^2]\rho^2\cos 2\theta + 8c^4(1 - c^2)^2(21 - 62c^2 + 62c^4) - 120c^2(7 - 30c^2 + 46c^4 - 23c^6)\rho^2 + 15(49 - 196c^2 + 282c^4 - 172c^6 + 86c^8)\rho^4\}$

$R_{15} = \{\sqrt{21}/[8c^3(1 - c^2)^{3/2}(1 - 2c^2 + 2c^4)^{1/2}]\}\,[-(1 - 2c^2)(6c^2 - 6c^4 - 5\rho^2)\rho^2\sin 2\theta + (5/2)(1 - 2c^2 + 2c^4)\rho^4\sin 4\theta]$

Table 9-3. Orthonormal rectangular polynomials $R_j(x, y)$ in Cartesian coordinates
$(x, y)$, where $\rho^2 = x^2 + y^2$.

$R_1 = 1$

$R_2 = (\sqrt{3}/c)\,x$

$R_3 = [\sqrt{3}/\sqrt{1 - c^2}]\,y$

$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$

$R_5 = [3/(c\sqrt{1 - c^2})]\,xy$

$R_6 = \{\sqrt{5}/[2c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - c^2)^2x^2 - 3c^4y^2 - c^2(1 - 3c^2 + 2c^4)]$

$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\,y$

$R_8 = [\sqrt{21}/(2c\sqrt{35 - 70c^2 + 62c^4})]\,(15\rho^2 - 5 - 4c^2)\,x$

$R_9 = \{\sqrt{5}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[2c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,[27(1 - c^2)^2x^2 - 35c^4y^2 - c^2(9 - 39c^2 + 30c^4)]\,y$

$R_{10} = \{\sqrt{5}/[2c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,[35(1 - c^2)^2x^2 - 27c^4y^2 - c^2(21 - 51c^2 + 30c^4)]\,x$

$R_{11} = [1/(8\mu)]\,[315\rho^4 - 30(7 + 2c^2)x^2 - 30(9 - 2c^2)y^2 + 27 + 16c^2 - 16c^4]$

$R_{12} = [3\mu/(8c^2\nu\eta)]\,[35(1 - c^2)^2(18 - 36c^2 + 67c^4)x^4 + 630(1 - 2c^2)(1 - 2c^2 + 2c^4)x^2y^2 - 35c^4(49 - 98c^2 + 67c^4)y^4 - 30(1 - c^2)(7 - 10c^2 - 12c^4 + 75c^6 - 67c^8)x^2 - 30c^2(7 - 77c^2 + 189c^4 - 193c^6 + 67c^8)y^2 + c^2(1 - c^2)(1 - 2c^2)(70 - 233c^2 + 233c^4)]$

$R_{13} = [\sqrt{21}/(2c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(5\rho^2 - 3)\,xy$

$R_{14} = 16\tau\,[735(1 - c^2)^4x^4 - 540c^4(1 - c^2)^2x^2y^2 + 735c^8y^4 - 90c^2(1 - c^2)^3(7 - 9c^2)x^2 + 90c^6(1 - c^2)(2 - 9c^2)y^2 + 3c^4(1 - c^2)^2(21 - 62c^2 + 62c^4)]$

$R_{15} = \{\sqrt{21}/[2c^3(1 - c^2)\sqrt{1 - 3c^2 + 4c^4 - 2c^6}]\}\,[5(1 - c^2)^2x^2 - 5c^4y^2 - c^2(3 - 9c^2 + 6c^4)]\,xy$
$$\langle W^2(\rho, \theta)\rangle = \sum_{j=1}^{J}a_j^2 , \tag{9-21}$$

respectively. Accordingly, the aberration variance is given by

$$\sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \sum_{j=2}^{J}a_j^2 . \tag{9-22}$$
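Equations (9-19) through (9-22) can be illustrated numerically. The sketch below is an assumption-laden example, not the author's procedure: it uses the closed forms of $R_1$ through $R_5$ from Table 9-3, a made-up test aberration `W` that lies in their span, and Gauss-Legendre quadrature for the pupil means. All function and variable names are hypothetical.

```python
import numpy as np

def rect_polys(x, y, c):
    """R1..R5 in Cartesian form, as listed in Table 9-3."""
    g = np.sqrt(1 - 2 * c**2 + 2 * c**4)
    h = np.sqrt(1 - c**2)
    return [
        np.ones_like(x),                                   # R1
        np.sqrt(3) / c * x,                                # R2
        np.sqrt(3) / h * y,                                # R3
        np.sqrt(5) / (2 * g) * (3 * (x**2 + y**2) - 1),    # R4
        3 / (c * h) * x * y,                               # R5
    ]

def rect_coefficients(W, c, n=20):
    """Expansion coefficients of Eq. (9-19) and the mean square value of W."""
    h = np.sqrt(1 - c**2)
    t, w = np.polynomial.legendre.leggauss(n)
    x, y = np.meshgrid(c * t, h * t, indexing="ij")
    wt = np.outer(w, w) / 4.0          # weights for a mean over the rectangle
    Wv = W(x, y)
    a = np.array([np.sum(wt * Wv * R) for R in rect_polys(x, y, c)])
    return a, np.sum(wt * Wv**2)

c = 0.8
# Hypothetical test aberration (in waves) built from 1, x, xy, and rho^2
W = lambda x, y: 0.2 + 0.5 * x - 0.3 * x * y + 0.1 * (x**2 + y**2)
a, mean_sq = rect_coefficients(W, c)
variance = np.sum(a[1:] ** 2)          # Eq. (9-22)
print(a, variance, mean_sq - a[0] ** 2)
```

Because the test function lies entirely in the span of $R_1$ through $R_5$, the sum of squared coefficients reproduces the mean square value of Eq. (9-21), and the two variance estimates printed at the end agree.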

9.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF RECTANGULAR POLYNOMIAL ABERRATIONS

The rectangular polynomials up to the eighth order for a rectangular pupil with
$c = 0.8$, corresponding to an aspect ratio of $\epsilon = 0.75$, are given in Tables 9-4 to 9-6. They
are illustrated in three different but equivalent ways in Figure 9-6. For each polynomial,
the isometric plot at the top illustrates its shape. An interferogram is shown on the left,
and a corresponding PSF is shown on the right for a sigma value of one wave. The
peak-to-valley aberration numbers (in units of wavelength) are given in Table 9-7.

The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (9-3), are shown in Figure 9-6. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $R_1$ has
no effect on the PSF, it yields an aberration-free PSF. The polynomial aberrations $R_2$ and
$R_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$,
displace the PSF in the image plane along the x and y axes, respectively. If the coefficient
$a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $\sqrt{3}\lambda a_2/ca$ about
the y axis and displaces the PSF along the x axis by $2\sqrt{3}\lambda F_xa_2/c$, where $F_x = R/2a$ and
$c = a/(a^2 + b^2)^{1/2}$ is the half-width of the rectangle along the x axis normalized by its
semidiagonal. Similarly, $a_3$ corresponds to a wavefront tilt angle of $\sqrt{3/(1 - c^2)}\,\lambda a_3/b$
about the x axis and displaces the PSF by $2\sqrt{3/(1 - c^2)}\,\lambda F_ya_3$, where $F_y = R/2b$ is the
focal ratio of the image-forming beam along the y axis.

The defocus aberration represented by the polynomial $R_4$ is radially symmetric and
yields a radially symmetric interferogram bounded, of course, by a rectangle. However,
the PSF is biaxially and not radially symmetric because of the larger diffraction spread
along the smaller dimension of the pupil. The polynomial aberrations $R_5$ and $R_6$,
representing balanced astigmatism, both yield biaxially symmetric interferograms and
PSFs, but they are distinctly different from each other. The polynomial aberrations $R_7$
and $R_8$, representing balanced comas, produce biaxially symmetric interferograms, but
the PSFs are symmetric only about the y and x axes, respectively. The polynomial
aberrations $R_{11}$, $R_{22}$, and $R_{37}$, representing balanced primary, secondary, and tertiary
spherical aberrations, are not radially symmetric because of the presence of $\cos 2\theta$;
$\cos 2\theta$ and $\cos 4\theta$; and $\cos 2\theta$, $\cos 4\theta$, and $\cos 6\theta$ terms, respectively.

Table 9-4. Rectangular polynomials in terms of Zernike circle polynomials for a
rectangular pupil with $c = 0.8$ corresponding to an aspect ratio $\epsilon = 0.75$.
R1 1.Z1

R2 1.0825Z2

R3 1.4434Z3

R4 0.7613Z1 + 1.3186Z4

R5 1.2758Z5

R6 0.9614Z1 0.8012Z4 + 2.1820Z6

R7 1.6096Z3 + 1.5985Z7

R8 0.8848Z2 + 1.2821Z8

R9 4.0549Z3 2.2292Z7 + 3.0190Z9

R10 0.0077Z2 + 0.1153Z8 + 2.1173Z10

R11 0.9498Z1 + 1.3109Z4 0.2076Z6 + 1.4216Z11

R12 1.8433Z1 2.0095Z4 + 4.7861Z6 0.8443Z11 + 2.8091Z12

R13 0.9952Z5 + 1.2848Z13

R14 5.7024Z1 + 6.0904Z4 7.9324Z6 + 2.5076Z11 3.1207Z12 + 4.6212Z14

R15 1.9090Z5 0.7807Z13 + 3.0068Z15

R16 1.0746Z2 + 1.2027Z8 + 0.5203Z10 + 1.3544Z16

R17 2.4267Z3 + 2.3540Z7 0.8114Z9 + 1.7220Z17

R18 0.7905Z2 + 0.7891Z8 + 4.0955Z10 + 0.4914Z16 + 2.4652Z18

R19 9.1771Z3 7.2660Z7 + 6.8435Z9 2.7816Z17 + 3.1455Z19

R20 3.1155Z2 + 2.4245Z8 4.4115Z10 + 0.8983Z16 1.3467Z18 + 4.7364Z20

R21 22.2957Z3 + 16.1385Z7 16.6680Z9 + 5.1449Z17 5.3074Z19 + 6.9206Z21

R22 1.2407Z1 + 1.8668Z4 0.3413Z6 + 1.7268Z11 0.3191Z12 + 0.6512Z14 + 1.4983Z22

R23 0.82769Z5 + 1.0323Z13 + 0.1445Z15 + 1.3087Z23

R24 3.7592Z1 4.8556Z4 + 10.4311Z6 3.2528Z11 + 7.6493Z12 0.8460Z14


1.0933Z22 + 3.4474Z24

R25 3.2181Z5 2.2882Z13 + 5.6636Z15 0.8568Z23 + 2.9200Z25

R26 14.8185Z1 + 18.3776Z4 19.4312Z6 + 11.4773Z11 11.6213Z12 + 13.7289Z14 +


3.6298Z22 3.3094Z24 + 4.9523Z26

R27 9.9177Z5 + 5.7801Z13 11.4544Z15 + 1.5808Z23 3.0839Z25 + 7.1762Z27



R28 = 30.6444Z1 36.3206Z4 + 53.9421Z6 20.5096Z11 + 30.7165Z12 31.3914Z14


5.3566Z22 + 8.1769Z24 8.2421Z26 + 10.7448Z28

R29 = 3.4865Z3 + 3.9022Z7 1.8556Z9 + 2.9825Z17 1.1968Z19 + 0.4761Z21 + 1.8221Z29

R30 = 1.2903Z2 + 1.5913Z8 + 1.5103Z10 + 1.4507Z16 + 0.7232Z18 + 0.0791Z20 + 1.4055Z30

R31 = 20.4078Z3 19.3401Z7 + 16.0374Z9 11.2671Z17 + 9.6922Z19 1.9475Z21


3.5735Z29 + 3.5748Z31

R32 = 2.7256Z2 + 2.6116Z8 + 6.9151Z10 + 1.6480Z16 + 5.3888Z18 + 1.0051Z20 + 0.7002Z30


+ 2.8331Z32

R33 = 58.4744Z3 + 51.4017Z7 45.1816Z9 + 26.2581Z17 23.1848Z19 + 20.6702Z21 +


6.7194Z29 5.9855Z31 + 6.2807Z33

R34 = 8.9453Z2 + 8.6027Z8 7.0055Z10 + 5.1606Z16 3.7909Z18 + 12.5633Z20 +


1.7207Z30 0.9675Z32 + 4.7946Z34

R35 = 137.4560Z3 115.4710Z7 + 119.5700Z9 54.4067Z17 + 56.4789Z19 59.716Z21


12.2438Z29 + 12.7553Z31 13.4933Z33 + 16.6422Z35

R36 = 9.1288Z2 7.5791Z8 + 29.6113Z10 3.4590Z16 + 14.4106Z18 23.3638Z20


0.7039Z30 + 3.4183Z32 5.3160Z34 + 11.2833Z36

R37 = 1.4443Z1 + 2.3880Z4 0.2229Z6 + 2.6066Z11 0.4738Z12 + 1.5018Z14 + 2.1013Z22


0.4267Z24 + 0.9143Z26 0.1707Z28 + 1.5680Z37

R38 = 6.7920Z1 9.6812Z4 + 20.4832Z6 8.1391Z11 + 17.6424Z12 3.3244Z14


4.5139Z22 + 10.4761Z24 1.4218Z26 + 1.6720Z28 1.3388Z37 + 3.9661Z38

R39 = 0.1065Z5 + 0.5880Z13 + 1.2183Z15 + 1.0307Z23 + 0.1823Z25 0.4340Z27 +


1.3327Z39

R40 = 39.3796Z1 + 53.2283Z4 53.0596Z6 + 40.6751Z11 39.4938Z12 + 41.0417Z14 +


20.0217Z22 18.4232Z24 + 20.9849Z26 2.4001Z28 + 5.33986Z37 4.3544Z38 +
6.1988Z40

R41 = 3.8438Z5 3.9634Z13 + 8.2513Z15 2.7281Z23 + 6.3196Z25 + 0.6634Z27


1.0209Z39 + 3.1115Z41

R42 = 78.9935Z1 102.5530Z4 + 153.1260Z6 72.0204Z11 + 109.2280Z12 87.8651Z14


30.8082Z22 + 48.1556Z24 38.2527Z26 + 36.5458Z28 6.4972Z37 + 10.8145Z38
8.3071Z40 + 9.4857Z42

R43 = 22.1387Z5 + 17.0827Z13 24.7366Z15 + 8.4351Z23 12.3584Z25 + 18.7263Z27 +


2.1738Z39 3.2116Z41 + 6.2555Z43

R44 = 197.7770Z1 + 252.3210Z4 358.1940Z6 + 171.0860Z11 242.6080Z12 +


254.2440Z14 + 69.4217Z22 98.2143Z24 + 103.3860Z26 109.2310Z28 + 13.6842Z37
19.2514Z38 + 20.4330Z40 21.5294Z42 + 26.1698Z44

R45 = 49.1651Z5 33.5817Z13 + 72.6480Z15 13.7675Z23 + 30.0565Z25 47.9434Z27


2.7431Z39 + 6.0701Z41 9.6463Z43 + 17.5983Z45
Table 9-5. Rectangular polynomials in polar coordinates for a rectangular pupil
with $c = 0.8$ corresponding to an aspect ratio $\epsilon = 0.75$.
R1 = 1

R2 = 2.1651U cosT

R3 = 2.8868U cosT

R4 = 0.7613 + 2.2839( 1 + 2U2)

R5 = 3.1250U 2cos2T

R6 = 0.9614 1.3878( 1 + 2U2) + 5.3449U 2cos2T

R7 = ( 5.8234 U + 13.5638U3)cosT

R8 = ( 5.4830 U + 10.8789U3)cosT

R9 = (4.5005 U 18.9154U3)cosT + 8.5389U3cos3T

R10 = ( 0.6370 U + 0.9787U3)cosT + 5.9885U3cos3T

R11 = 0.9498 + 2.2705( 1 + 2U2) + 3.1787(1 6U2 + 6U4) 0.5086U2cos2T

R12 = 1.8433 3.4805( 1 + 2U2) 1.8880(1 6U2 + 6U4) + ( 14.9264U2 + 35.5330U4)cos2T

R13 = ( 9.7511U 2 + 16.2519U4)cos2T

R14 = 5.7024 + 10.5488( 1 + 2U 2) + 5.6072(1 6U2 + 6 U4) + (10.1748U2 39.4736U4)cos2T


+ 14.6134U4cos4T
R15 = (2.7303U2 9.8753U4)cos2T + 9.5085 U4cos4T

R16 = (9.4205U 46.0944U3 + 46.9165U5)cosT + 1.4715 U3cos3T

R17 = (9.4323U 51.6062U3 + 59.6505U5)cosT 2.2951U 3cos3T

R18 = (2.2238U 13.7300U3 + 17.0212U5)cosT + ( 22.5745U3 + 42.6979U5)cos3T

R19 = ( 6.1582 U + 53.9729U3 96.3558 U5)cosT + ( 24.2284U3 + 54.4811U 5)cos3T

R20 = (1.8516U 16.7701U 3 + 31.1191U5)cosT + (6.1828U3 23.3257U5)cos3T + 16.4075U5cos5T

R21 = (6.7650U 76.9274U3 + 178.2230U5)cosT + (26.3979U3 91.9276U 5)cos3T + 23.9735U5cos5T

R22 = 1.2407 + 3.2334( 1 + 2U2) + 3.8612(1 6U2 + 6U4) + 3.9642( 1 + 12U 2 30U4 + 20U6)
+ (2.1911U 2 4.0362 U4)cos2T + 2.0593U4cos4T
R23 = (21.6144 U2 84.877 U4 + 73.4513U6)cos2T + 0.4570U4cos4T

R24 = 3.7592 8.4102( 1 + 2U2) 7.2735(1 6U2 + 6U4) 2.8925( 1 + 12U2 30U4 + 20U6)
+ (30.3780U2 161.2260U4 + 193.4870 U6)cos2T 2.6753U 4cos4T

R25 = ( 5.4111U 2 + 35.1766U4 48.0902 U6)cos2T + ( 36.7175 U4 + 65.5530 U6)cos4T

R26 = 14.8185 + 31.8310( 1 + 2U2) + 25.6640(1 6U2 + 6U4) + 9.60361( 1 + 12U2 30U4 + 20 U6)
+ ( 11.6421U2 + 100.6510U4 185.7370U6)cos2T + ( 49.2338 U4 + 111.1780 U6)cos4T

R27 = (4.9469U2 45.1814 U4 + 88.7207U6)cos2T + (21.4719U4 69.2325U6)cos4T + 26.8510U6cos6T

R28 = 30.6444 62.9091( 1 + 2U2) 45.8608(1 6U 2 + 6 U4) 14.1723( 1 + 12U2 30U4 + 20 U6)
+ (24.2988U 2 223.3660U4 + 458.9270U6)cos2 T + (54.9277U4 185.0350 U6)cos4T + 40.2033 U6cos6 T

R29 ( 13.2595U + 127.7800U3 333.9810U5 + 255.0900U7)cosT + (11.3350U 3 20.7293U 5)cos3T + 1.6493U5cos5T

R30 ( 13.8336U + 121.8610 U3 287.0700U5 + 196.7720U7)cosT + ( 5.7494U 3 + 12.5263U 5)cos3T + 0.2742U 5cos5T

R31 (8.6741U 124.5650 U3 + 467.3430U 5 500.2940U7)cos T + (54.0511U 3 261.0960U 5 + 300.2790U7)cos3T


6.7464 U5cos5T

R32 ( 3.3996U + 37.6803U3 110.9620U 5 + 98.0286U7)cos T + (58.2124U3 246.6330U5 + 237.9790U7)cos3T


+ 3.4819U5cos5T

R33 ( 8.4508U + 150.9530U3 703.0400 U5 + 940.7110U7)cosT + ( 45.9536U 3 + 316.6830U 5 502.7790U7)cos3T


+ ( 79.1341U 5 + 175.8610U7)cos5T

R34 ( 4.6745U + 64.9609U3 234.2050U 5 + 240.9010U 7)cosT + ( 5.9855U 3 + 50.4373U 5 81.2682U7)cos3T


+ ( 71.5489U5 + 134.2480U7)cos5T

R35 (8.7830 U 187.4210 U3 + 1053.8100 U5 1714.1300U7)cos T + (65.8151U3 552.3990 U5 + 1071.4500U7)cos3T


+ (116.9780U5 377.8130 U7)cos5T + 66.5688 U7cos7T
R36 ( 0.0678U 4.9868U 3 + 49.1032U 5 98.5394U 7)cos T + (20.8068U 3 160.5990U5 + 287.1390 U7)cos3T
+ (46.6489U 5 148.8470 U7)cos5T + 45.1331U 7cos7T

R37 1.4443 + 4.1359( 1 + 2U 2) + 5.8286(1 6U2 + 6U4) + 5.5594( 1 + 12U2 30U4 + 20U6)
+ 4.7041(1 20U 2 + 90 U4 140U 6 + 70U 8) + ( 5.6303U 2 + 25.9377U 4 23.9482U 6)cos2T
+ ( 12.3568U4 + 20.5270 U6)cos4T 0.6386 U6cos6T

R38 6.7920 16.7684( 1 + 2U 2) 18.1996(1 6U 2 + 6U4) 11.9426( 1 + 12U2 30 U4 + 20U 6)


4.0165(1 20U 2 + 90U4 140 U6 + 70U8) + ( 50.2770U 2 + 448.8090U 4 1178.8400 U6 + 942.3010U8)cos2T
+ (16.0858U4 31.9182U 6)cos4T + 6.2562U6cos6T
R39 ( 39.2423U2 + 269.5590U4 535.8400 U6 + 316.6330U8)cos2T + (0.4428U 4 + 4.0919U6)cos4T 1.6238U6cos6T

R40 39.3796 + 92.1941( 1 + 2U2) + 90.9522(1 6U 2 + 6U4) + 52.9725( 1 + 12U2 30U4 + 20U 6)
+ 16.0196(1 20U 2 + 90U4 140U6 + 70U8) + (15.8434U 2 229.3420U 4 + 905.7830U 6 1034.5500U8)cos2T
+ (131.6850U 4 633.4660U6 + 736.3840U 8)cos4T 8.9803U6cos6 T

R41 (10.2509U2 105.8530 U4 + 301.6630 U6 242.5490 U8)cos2 T + (105.8810 U4 412.5710 U6 + 369.6300U8)cos4T


+ 2.4823U 6cos6T
R42 78.9935 177.6270( 1 + 2U2) 161.0420(1 6U2 + 6U4) 81.5109( 1 + 12U2 30U4 + 20 U6)
19.4915(1 20 U2 + 90U4 140U6 + 70 U8) + ( 38.8745U2 + 530.9200 U4 2114.8800U6 + 2569.3900U8)cos2T
+ ( 90.8696U4 + 621.471 U6 986.8280 U8)cos4T + ( 144.9680U6 + 321.9540U8)cos6T

R43 ( 10.6920U2 + 138.2210 U4 494.9710 U6 + 516.4750U8)cos2T + ( 51.4043U4 + 294.8320 U6 381.5180U8)cos4T


+ ( 115.7120 U6 + 212.3190 U8)cos6T

R44 197.7770 + 437.0330( 1 + 2U2) + 382.5600(1 6U2 + 6 U4) + 183.6730( 1 + 12 U2 30U4 + 20 U6)
+ 41.0527(1 20U2 + 90U4 140 U6 + 70U8) + (36.0550U2 619.6960 U4 + 3063.7900U6 4573.8900U8)cos2T
+ (170.1620U4 1319.9600U 6 + 2427.3200 U8)cos4T + (230.6850U6 730.7330U 8)cos6 T + 111.0290U8cos8T

R45 (5.4529U 2 92.7804 U4 + 449.2680 U6 651.7150U8)cos2 T + (53.7265U 4 406.8700 U6 + 721.0900U8)cos4 T


+ (107.0920U6 327.4060U8)cos6 T + 74.6631U 8cos8T
Table 9-6. Rectangular polynomials in Cartesian coordinates for a rectangular pupil
with $c = 0.8$ corresponding to an aspect ratio $\epsilon = 0.75$.
R1 1

R2 2.1651x

R3 2.8866y

R4 1.5226 + 4.5677x2 + 4.5677y2

R5 6.2500xy

R6 0.4263 + 2.5694x2 8.1204y2

R7 5.8234y + 13.5638x2y + 13.5638y3

R8 5.4830x + 10.8789x3 + 10.8789xy2

R9 4.5005y + 6.7012x2y 27.4543y3

R10 0.6370x + 6.9672x3 16.9868xy2

R11 1.8580 15.0398x2 + 19.0722x4 14.0226y2 + 38.1445x2y2 + 19.0722y4

R12 0.2507 10.5596x2 + 24.2052x4 + 19.2931y2 22.6556x2y2 46.8608y4

R13 19.5023xy + 32.5038x3y + 32.5038xy3

R14 0.7608 2.3708x2 + 8.7829x4 22.7203y2 20.3939x2y2 + 87.7301y4

R15 5.4606xy + 18.2834x3y 57.7844xy3

R16 9.4205x 44.6228x3 + 46.9165x5 50.5090xy2 + 93.8330x3y2 + 46.9165xy4

R17 9.4323y 58.4915x2y + 59.6505x4y 49.3111y3 + 119.3010x2y3 + 59.6505y5

R18 2.2238x 36.3045x3 + 59.7191x5 + 53.9936xy2 51.3535x3y2 111.0730xy4

R19 6.1582y 18.7124x2y + 67.0875x4y + 78.2013y3 83.7494x2y3 150.8370y5

R20 1.8516x 10.5873x3 + 24.2009x5 35.3186xy2 55.1853x3y2 + 183.1340xy4

R21 6.7650y + 2.2661x2y + 22.3073x4y 103.3250y3 67.1447x2y3 + 294.1240y5

R22 2.0957 + 33.0605x2 97.7345x4 + 79.2831x6 + 28.6783y2 203.8710x2y2 +


237.8490x4y2 89.6620y4 + 237.8490x2y4 + 79.2831y6

R23 43.2289xy 167.9260x3y + 146.9030x5y 171.5820xy3 + 293.8050x3y3 + 146.9030xy5


R24 0.2700 + 22.4881x2 120.7660x4 + 135.6370x6 38.2678y2 + 102.3210x2y2 +
19.9357x4y2 + 201.6850y4 367.0390x2y4 251.3380y6

R25 10.8221xy 76.5169x3y + 166.0320x5y + 217.2230xy3 192.3610x3y3 358.3920xy5


R26 0.9521 + 13.2791x2 82.7075x4 + 117.5130x6 + 36.5633y2 + 27.1545x2y2
165.4110x4y2 284.0090y4 + 206.0630x2y4 + 488.9880y6

R27 9.8939xy 4.4751x3y + 61.6173x5y 176.2500xy3 182.1370x3y3 + 615.4780xy5

R28 0.5762 + 3.5782x2 18.4357x4 + 30.6504x6 45.0195y2 29.5599x2y2


69.2811x4y2 + 428.2970y4 + 218.9620x2y4 967.6110y6
R29 = 13.2595y + 161.7850x2y 387.9230x4y + 255.0900x6y + 116.4450y3
725.9140x2y3 + 765.2710x4y3 311.6030y5 + 765.2710x2y5 + 255.0900y7

R30 = 13.8336x + 116.1110x3 274.2700x5 + 196.7720x7 + 139.1090xy2 601.9340x3y2


+ 590.3150x5y2 323.2780xy4 + 590.3150x3y4 + 196.7720xy6

R31 = 8.6741y + 37.5880x2y 349.6770x4y + 400.5440x6y 178.6160y3 + 479.9580x2y3 +


0.5157x4y3 + 721.6920y5 1200.6000x2y5 800.5730y7

R32 = 3.3996x + 95.8927x3 354.1130x5 + 336.0080x7 136.9570xy2 + 236.5240x3y2 +


56.1063x5y2 + 646.3470xy4 895.8120x3y4 615.9100xy6

R33 = 8.4508y + 13.0920x2y 148.6600x4y + 311.6780x6y + 196.9060y3 + 18.6281x2y3


571.0660x4y3 1098.8600y5 + 736.6060x2y5 + 1619.3500y7

R34 = 4.6745x + 58.9754x3 255.3170x5 + 293.8810x7 + 82.9174xy2 + 146.2040x3y2


404.2550x5y2 743.2620xy4 + 457.8080x3y4 + 1155.9400xy6

R35 = 8.7830y + 10.0244x2y 18.4974x4y + 77.1286x6y 253.2360y3 166.9520x2y3


225.9950x4y3 + 1723.1900y5 + 727.3160x2y5 3229.9600y7

R36 = 0.0688x + 15.8200x3 64.8468x5 + 84.8849x7 67.4072xy2 47.0852x3y2


190.9260x5y2 + 764.1440xy4 + 592.5860x3y4 2020.1200xy6

R37 = 2.2817 59.6994x2 + 305.1400x4 551.4460x6 + 329.2870x8 48.4388y2 +


657.2590x2y2 1759.1600x4y2 + 1317.1500x6y2 + 253.2650y4 1730.4300x2y4 +
1975.7200x4y4 502.2730y6 + 1317.1500x2y6 + 329.2870y8

R38 = 0.2972 37.5960x2 + 352.4860x4 881.0430x6 + 661.1430x8 + 62.9580y2


321.3330x2y2 142.7060x4y2 + 759.9720x6y2 545.1320y4 + 2402.6700x2y4
1686.9400x4y4 + 1464.1300y6 3009.2300x2y6 1223.4600y8

R39 = 78.4846xy + 540.8890x3y 1065.0600x5y + 633.2650x7y + 537.3460xy3


2110.8800x3y3 + 1899.8000x5y3 1097.7900xy5 + 1899.8000x3y5 + 633.2650xy7

R40 = 1.1848 30.2033x2 + 300.6440x4 919.9570x6 + 823.2050x8 61.8900y2 +


6.4945x2y2 + 657.9390x4y2 529.1540x6y2 + 759.3280y4 1423.0400x2y4
635.6140x4y4 2713.5600y6 + 3609.0500x2y6 + 2892.3100y8

R41 = 20.5019xy + 211.8160x3y 1032.0600x5y + 993.4240x7y 635.2280xy3 +


1157.0100x3y3 + 23.2280x5y3 + 2268.5100xy5 2933.8200x3y5 1963.6200xy7

R42 = 0.3900 16.1732x2 + 164.8860x4 539.7850x6 + 540.1110x8 + 61.5757y2


5.1104x2y2 + 248.0660x4y2 878.8930x6y2 896.9530y4 + 128.7860x2y4 +
1681.8400x4y4 + 3979.9200y6 2141.7300x2y6 5242.5800y8

R43 = 21.3841xy + 70.8246x3y 504.8870x5y + 780.7940x7y + 482.0590xy3 +


334.3580x3y3 1399.6900x5y3 2863.5400xy5 + 1652.4500x3y5 + 3832.9400xy7

R44 = 0.6837 2.2182x2 + 30.3867x4 99.4155x6 + 107.4090x8 74.3282y2


61.1328x2y2 18.4555x4y2 240.8510x6y2 + 1269.7800y4 + 774.5320x2y4 +
741.0040x4y4 6688.3600y6 2405.7900x2y6 + 10716.7000y8

R45 = 10.9057xy + 29.3452x3y 86.3882x5y + 213.8010x7y 400.4670xy3 344.7800x3y3


623.3740x5y3 + 3168.5700xy5 + 1970.1700x3y5 6749.5300xy7
Figure 9-6. Rectangular polynomials $R_1$ through $R_{45}$ for $c = 0.8$ corresponding to an aspect
ratio of $\epsilon = 0.75$, each shown as an isometric plot on the top, interferogram on the left,
and PSF on the right for a sigma value of one wave.

Table 9-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
rectangular polynomial aberrations for $c = 0.8$ corresponding to an aspect ratio of
$\epsilon = 0.75$ for a sigma value of one wave.

Poly.   P-V       Poly.   P-V       Poly.   P-V
R1      0         R16     15.352    R31     11.357
R2      3.464     R17     16.675    R32     10.471
R3      3.464     R18     7.354     R33     8.574
R4      4.568     R19     7.741     R34     8.959
R5      6.000     R20     7.981     R35     11.357
R6      4.568     R21     9.224     R36     9.195
R7      9.289     R22     12.142    R37     16.914
R8      8.6345    R23     20.054    R38     12.861
R9      6.460     R24     9.195     R39     28.345
R10     6.115     R25     8.181     R40     7.783
R11     7.364     R26     6.821     R41     12.659
R12     6.024     R27     7.960     R42     10.108
R13     12.481    R28     12.142    R43     10.351
R14     5.488     R29     24.920    R44     8.480
R15     6.491     R30     23.048    R45     9.297

The Strehl ratio, namely the central value of a PSF relative to its aberration-free
value, can be obtained from Eq. (9-8) by letting $x = 0 = y$, i.e., from

$$I(0, 0) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp[i\Phi(x', y')]\,dx'\,dy'\right|^2 . \tag{9-23}$$

Its value for a rectangular polynomial aberration with a sigma value of 0.1 wave is listed
in Table 9-8 and plotted in Figure 9-7. Because of the small value of the aberration, the
Strehl ratio is approximately the same for each polynomial. Both the table and the figure
illustrate that the Strehl ratio for a small aberration is independent of the type of
aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
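Equation (9-23) is straightforward to evaluate numerically. The sketch below is a hedged illustration, assuming NumPy and assuming that the phase aberration in radians is $2\pi$ times the wave aberration in waves. It computes the Strehl ratio for the defocus polynomial $R_4$ (Cartesian form from Table 9-3) with $c = 0.8$ and a sigma value of 0.1 wave, and compares it with $\exp(-\sigma_\Phi^2)$. The names `strehl` and `defocus_phase` are hypothetical.

```python
import numpy as np

def strehl(phase, n=64):
    """Eq. (9-23) by Gauss-Legendre quadrature over the normalized pupil."""
    t, w = np.polynomial.legendre.leggauss(n)
    xp, yp = np.meshgrid(t, t, indexing="ij")    # (x', y') in [-1, 1]^2
    wt = np.outer(w, w)
    return abs(np.sum(wt * np.exp(1j * phase(xp, yp)))) ** 2 / 16.0

c = 0.8
h = np.sqrt(1 - c**2)
g = np.sqrt(1 - 2 * c**2 + 2 * c**4)

def defocus_phase(xp, yp, sigma_waves=0.1):
    """Phase (radians) of the R4 aberration with the given sigma in waves."""
    x, y = c * xp, h * yp                        # unit-rectangle coordinates
    R4 = np.sqrt(5) / (2 * g) * (3 * (x**2 + y**2) - 1)
    return 2 * np.pi * sigma_waves * R4

S = strehl(defocus_phase)
S_approx = np.exp(-(0.2 * np.pi) ** 2)           # exp(-sigma_Phi^2)
print(S, S_approx)
```

The computed value can be compared with the entry for $R_4$ in Table 9-8; replacing `defocus_phase` with another polynomial gives the other entries under the same assumptions.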

Table 9-8. Strehl ratio S for rectangular polynomial aberrations for $c = 0.8$
corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of 0.1 wave.

Poly.   S         Poly.   S         Poly.   S
R1      1         R16     0.704     R31     0.702
R2      0.663     R17     0.715     R32     0.691
R3      0.663     R18     0.678     R33     0.688
R4      0.669     R19     0.685     R34     0.683
R5      0.676     R20     0.687     R35     0.685
R6      0.669     R21     0.681     R36     0.691
R7      0.688     R22     0.718     R37     0.723
R8      0.679     R23     0.719     R38     0.703
R9      0.673     R24     0.688     R39     0.722
R10     0.678     R25     0.691     R40     0.679
R11     0.700     R26     0.682     R41     0.705
R12     0.674     R27     0.688     R42     0.690
R13     0.701     R28     0.684     R43     0.691
R14     0.680     R29     0.724     R44     0.687
R15     0.683     R30     0.718     R45     0.691

Figure 9-7. Strehl ratio S for rectangular polynomial aberrations for c = 0.8, corresponding to an aspect ratio of 0.75, for a sigma value of 0.1 wave.

9.7 SEIDEL ABERRATIONS AND THEIR STANDARD DEVIATIONS


We now consider balancing of a Seidel aberration and obtain its standard deviation
with and without balancing.

9.7.1 Defocus
We start with the defocus aberration

\[ W_d(\rho) = A_d\,\rho^2 . \tag{9-24} \]

From the form of the orthonormal defocus polynomial $R_4$ given in Table 9-2, it is evident that its sigma value across a rectangular pupil is given by

\[ \sigma_d = \frac{2g}{3\sqrt{5}}\,A_d , \tag{9-25} \]

where

\[ g = \left(1 - 2c^2 + 2c^4\right)^{1/2} . \tag{9-26} \]

9.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by

\[ W_a(\rho,\theta) = A_a\,\rho^2\cos^2\theta . \tag{9-27} \]

The orthonormal polynomial representing balanced astigmatism is given by

\[ R_6 = \frac{3\sqrt{5}}{4c^2\left(1-c^2\right)g}\left[g^2\rho^2\cos 2\theta + \left(1-2c^2\right)\rho^2\right] + \text{constant} \tag{9-28a} \]

\[ \phantom{R_6} = \frac{3\sqrt{5}\,g}{2c^2\left(1-c^2\right)}\left(\rho^2\cos^2\theta - \frac{c^4}{g^2}\rho^2\right) + \text{constant} , \tag{9-28b} \]

showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism $\rho^2\cos^2\theta$ is $-c^4/g^2$. It is evident that the balanced astigmatism is given by

\[ W_{ba}(\rho,\theta) = A_a\left(\rho^2\cos^2\theta - \frac{c^4}{g^2}\rho^2\right) . \tag{9-29} \]

Its sigma value is given by

\[ \sigma_{ba} = \frac{2c^2\left(1-c^2\right)}{3\sqrt{5}\,g}\,A_a . \tag{9-30} \]

To obtain the sigma value of astigmatism, we write Eq. (9-27) in the form

\[ W_a(\rho,\theta) = \frac{2A_a}{3\sqrt{5}\,g}\left[c^2\left(1-c^2\right)R_6 + c^4 R_4\right] + \text{constant} . \tag{9-31} \]

Utilizing Eq. (9-22), the sigma value is given by

\[ \sigma_a = \frac{2c^2}{3\sqrt{5}}\,A_a . \tag{9-32} \]
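The sigma values of Eqs. (9-25), (9-30), and (9-32) can be checked numerically. The following sketch, which is only an illustration, estimates the standard deviation of defocus, astigmatism, and balanced astigmatism over the rectangle $|x| \le c$, $|y| \le (1-c^2)^{1/2}$ with a midpoint rule and compares it with the closed forms for unit aberration coefficients. The helper `pupil_sigma`, the grid size, and the choice c = 0.8 are assumptions made for this example.

```python
import math

def pupil_sigma(w, c, n=200):
    """Standard deviation of w(x, y) over |x| <= c, |y| <= sqrt(1 - c**2).

    Uses a midpoint rule on an n-by-n grid (an illustrative choice).
    """
    b = math.sqrt(1.0 - c * c)  # half-width of the unit rectangle along y
    xs = [c * (-1.0 + (2 * i + 1) / n) for i in range(n)]
    ys = [b * (-1.0 + (2 * j + 1) / n) for j in range(n)]
    vals = [w(x, y) for x in xs for y in ys]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))

c = 0.8
g = math.sqrt(1 - 2 * c**2 + 2 * c**4)  # Eq. (9-26)
root5 = math.sqrt(5.0)

# Each entry: (aberration with unit coefficient, closed-form sigma)
cases = {
    "defocus": (lambda x, y: x * x + y * y, 2 * g / (3 * root5)),
    "astigmatism": (lambda x, y: x * x, 2 * c**2 / (3 * root5)),
    "balanced astigmatism": (
        lambda x, y: x * x - (c**4 / g**2) * (x * x + y * y),
        2 * c**2 * (1 - c**2) / (3 * root5 * g),
    ),
}

results = {name: (pupil_sigma(w, c), closed) for name, (w, closed) in cases.items()}
for name, (numeric, closed) in results.items():
    print(f"{name}: numeric {numeric:.5f}, closed form {closed:.5f}")
```

Here $\rho^2\cos^2\theta = x^2$ and $\rho^2 = x^2 + y^2$, so the three aberrations are simple polynomials in the Cartesian pupil coordinates.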

9.7.3 Coma
Now, we consider Seidel coma

\[ W_c(\rho,\theta) = A_c\,\rho^3\cos\theta . \tag{9-33} \]

The orthonormal polynomial representing balanced coma is given by

\[ R_8 = \frac{1}{2c}\left(\frac{21}{35 - 70c^2 + 62c^4}\right)^{1/2}\left[15\rho^3\cos\theta - \left(5 + 4c^2\right)\rho\cos\theta\right] . \tag{9-34} \]

It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma $\rho^3\cos\theta$ is $-\left(5 + 4c^2\right)/15$ compared to $-2/3$ for a circular pupil. Its sigma value is given by

\[ \sigma_{bc} = \frac{2c}{15}\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} A_c . \tag{9-35} \]

To obtain the sigma value of Seidel coma, we write Eq. (9-33) in the form

\[ W_c(\rho,\theta) = \frac{A_c}{15}\left[2c\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} R_8 + \frac{c\left(5 + 4c^2\right)}{\sqrt{3}}\,R_2\right] . \tag{9-36} \]

Utilizing Eq. (9-22), we obtain the sigma value

\[ \sigma_c = c\left(\frac{7 + 8c^4}{105}\right)^{1/2} A_c . \tag{9-37} \]
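The balancing tilt and the two sigma values can be reproduced from the moments of a uniform distribution over the rectangular pupil. The sketch below is a minimal illustration, assuming that the balancing tilt is the least-squares fit of $x = \rho\cos\theta$ to $\rho^3\cos\theta = (x^2 + y^2)\,x$; the function names and the choice c = 0.8 are hypothetical.

```python
import math

def even_moment(half_width, power):
    """Mean of t**power for t uniform on [-half_width, half_width], power even."""
    return half_width ** power / (power + 1)

def coma_balance(c):
    """Return (balancing tilt, sigma_c, sigma_bc) per unit A_c for half-width c."""
    b = math.sqrt(1.0 - c * c)                  # half-width along y
    x2, x4, x6 = (even_moment(c, p) for p in (2, 4, 6))
    y2, y4 = (even_moment(b, p) for p in (2, 4))
    cross = x4 + x2 * y2                        # <x * (rho^2 x)>
    mean_sq = x6 + 2 * x4 * y2 + x2 * y4        # <(rho^2 x)^2>; the mean of rho^2 x is zero
    tilt = -cross / x2                          # least-squares tilt coefficient
    sigma_c = math.sqrt(mean_sq)                # coma without balancing
    sigma_bc = math.sqrt(mean_sq - cross ** 2 / x2)  # residual after removing the tilt
    return tilt, sigma_c, sigma_bc

c = 0.8
tilt, sigma_c, sigma_bc = coma_balance(c)
print(tilt, -(5 + 4 * c**2) / 15)
print(sigma_c, c * math.sqrt((7 + 8 * c**4) / 105))                          # Eq. (9-37)
print(sigma_bc, (2 * c / 15) * math.sqrt((35 - 70 * c**2 + 62 * c**4) / 21))  # Eq. (9-35)
```

Each printed pair should agree to rounding error, since the moments are evaluated in closed form rather than by numerical integration.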

9.7.4 Spherical Aberration


Finally, we consider Seidel spherical aberration

\[ W_s(\rho) = A_s\,\rho^4 . \tag{9-38} \]

The orthonormal polynomial representing balanced spherical aberration is given by

\[ R_{11} = \frac{1}{8\mu}\left[315\rho^4 + 30\left(1-2c^2\right)\rho^2\cos 2\theta - 240\rho^2 + \text{constant}\right] \tag{9-39a} \]

\[ \phantom{R_{11}} = \frac{1}{8\mu}\left[315\rho^4 + 60\left(1-2c^2\right)\rho^2\cos^2\theta - 30\left(9 - 2c^2\right)\rho^2 + \text{constant}\right] . \tag{9-39b} \]

Hence, apart from a constant, the balanced spherical aberration is given by

\[ W_{bs}(\rho,\theta) = A_s\left[\rho^4 + \frac{6}{63}\left(1-2c^2\right)\rho^2\cos 2\theta - \frac{16}{21}\rho^2\right] \tag{9-40a} \]

\[ \phantom{W_{bs}(\rho,\theta)} = A_s\left[\rho^4 + \frac{12}{63}\left(1-2c^2\right)\rho^2\cos^2\theta - \frac{6}{63}\left(9 - 2c^2\right)\rho^2\right] . \tag{9-40b} \]

It shows, as in the case of an elliptical pupil, that spherical aberration is balanced not only by defocus but by astigmatism as well. Its sigma value is given by

\[ \sigma_{bs} = \frac{8\mu}{315}\,A_s . \tag{9-41} \]

To obtain the sigma value of Seidel spherical aberration, we write Eq. (9-38) in the form

\[ W_s(\rho) = \frac{A_s}{315}\left[8\mu R_{11} - \frac{40c^2\left(1-c^2\right)\left(1-2c^2\right)}{\sqrt{5}\,g}\,R_6 + \frac{20\left(9 - 20c^2 + 20c^4\right)}{\sqrt{5}\,g}\,R_4\right] + \text{constant} . \tag{9-42} \]

Utilizing Eq. (9-22), we obtain the sigma value:

\[ \sigma_s = \frac{4A_s}{45\sqrt{7}}\left(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8\right)^{1/2} . \tag{9-43} \]
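As a cross-check on the balancing coefficients and on Eq. (9-43), the sketch below fits $x^2$ and $y^2$ to $\rho^4$ in the least-squares sense over the unit rectangle, using closed-form moments of the uniform pupil. It is an illustration under the assumption that balancing means minimizing the variance, and all names in it are hypothetical. The fit is then re-expressed in terms of $\rho^2$ and $\rho^2\cos 2\theta$, for comparison with the ratios 240/315 and 30/315 in Eq. (9-39a).

```python
import math

def even_moment(half_width, power):
    """Mean of t**power for t uniform on [-half_width, half_width], power even."""
    return half_width ** power / (power + 1)

def spherical_balance(c):
    """Least-squares balance of rho^4 with x^2 and y^2 over the unit rectangle, 0 < c < 1.

    Returns (defocus, astig, sigma_s, sigma_bs) per unit A_s, where the balanced
    aberration is rho^4 + astig * rho^2 cos(2 theta) + defocus * rho^2 + constant.
    """
    b = math.sqrt(1.0 - c * c)
    u1, u2, u3, u4 = (even_moment(c, p) for p in (2, 4, 6, 8))  # moments of u = x^2
    v1, v2, v3, v4 = (even_moment(b, p) for p in (2, 4, 6, 8))  # moments of v = y^2
    mean_r4 = u2 + 2 * u1 * v1 + v2
    mean_r8 = u4 + 4 * u3 * v1 + 6 * u2 * v2 + 4 * u1 * v3 + v4
    var_s = mean_r8 - mean_r4 ** 2
    var_u, var_v = u2 - u1 ** 2, v2 - v1 ** 2
    cov_u = (u3 - u2 * u1) + 2 * var_u * v1   # cov(rho^4, x^2); x and y are independent
    cov_v = (v3 - v2 * v1) + 2 * var_v * u1   # cov(rho^4, y^2)
    alpha, beta = cov_u / var_u, cov_v / var_v  # best-fit coefficients of x^2 and y^2
    var_bs = var_s - alpha * cov_u - beta * cov_v
    defocus = -(alpha + beta) / 2   # coefficient of rho^2
    astig = -(alpha - beta) / 2     # coefficient of rho^2 cos(2 theta)
    return defocus, astig, math.sqrt(var_s), math.sqrt(var_bs)

c = 0.8
defocus, astig, sigma_s, sigma_bs = spherical_balance(c)
poly = 63 - 162 * c**2 + 206 * c**4 - 88 * c**6 + 44 * c**8
print(defocus, -16 / 21)
print(astig, (2 / 21) * (1 - 2 * c**2))
print(sigma_s, 4 / (45 * math.sqrt(7)) * math.sqrt(poly))  # Eq. (9-43)
print(sigma_bs)
```

For a square pupil ($c = 1/\sqrt{2}$) the astigmatism coefficient vanishes, and the balanced aberration depends on $\rho$ only.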

The sigma values of Seidel aberrations with and without balancing are given in Table 9-9.

Table 9-9. Sigma of a Seidel aberration with and without balancing, where $A_i$ is the coefficient of an aberration.

Aberration                       Sigma

Defocus                          $\sigma_d = \left(2g/3\sqrt{5}\right) A_d$
Astigmatism                      $\sigma_a = \left(2c^2/3\sqrt{5}\right) A_a$
Balanced astigmatism             $\sigma_{ba} = \left[2c^2\left(1-c^2\right)/3\sqrt{5}\,g\right] A_a$
Coma                             $\sigma_c = c\left[\left(7 + 8c^4\right)/105\right]^{1/2} A_c$
Balanced coma                    $\sigma_{bc} = \left(2c/15\sqrt{21}\right)\left(35 - 70c^2 + 62c^4\right)^{1/2} A_c$
Spherical aberration             $\sigma_s = \left(4A_s/45\sqrt{7}\right)\left(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8\right)^{1/2}$
Balanced spherical aberration    $\sigma_{bs} = \left(8\mu/315\right) A_s$

Figures 9-8 and 9-9 show the variation of sigma for a rectangular pupil as a function of its half-width c along the x axis. It is evident from Figure 9-8 that defocus and spherical sigmas have a minimum for a square pupil (i.e., for $c = 1/\sqrt{2}$), but coma and astigmatism sigmas increase monotonically as c increases from a value of zero, representing a slit pupil along the y axis, to a value of 1, representing a slit pupil along the x axis. The balanced spherical sigma in Figure 9-9 has a minimum for a square pupil, though its variation is relatively small. The sigma for balanced astigmatism has a distinct maximum for a square pupil, while the monotonically increasing sigma for balanced coma has a point of inflection.

Figure 9-8. Variation of sigma of a primary or Seidel aberration as a function of half-width c of a unit rectangular pupil.

Figure 9-9. Variation of sigma of a balanced primary aberration as a function of half-width c of a unit rectangular pupil.

9.8 SUMMARY
The aberration-free PSF and OTF are discussed in Section 9.3. The polynomials orthonormal over a unit rectangular pupil, representing balanced aberrations over such a pupil, are given through the fourth order in Tables 9-1 through 9-3 in terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates, respectively. Each orthonormal polynomial consists of either the cosine or the sine terms, but not both. Thus an even j polynomial, for example, consists of only the cosine terms, as may be seen from Table 9-2. This is a consequence of the biaxial symmetry of the pupil. Since the polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point, polynomial numbering with two indices n and m loses significance, and the polynomials must be numbered with a single index j. They are ordered in the same manner as the polynomials discussed in previous chapters.

As in the case of elliptical polynomials, only the first 15 rectangular polynomials are given in the tables. The expressions for the higher-order polynomials are very long unless the aspect ratio of the pupil is specified. The polynomial $R_6$ for astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the balancing defocus for (zero-degree) Seidel astigmatism is different for a rectangular pupil compared to that, for example, for a circular pupil. Moreover, $R_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$ as well. The balanced aberration is evidently not radially symmetric. As expected, the rectangular polynomials reduce to the square polynomials (discussed in the next chapter) as $c \to 1/\sqrt{2}$, i.e., as the unit rectangle approaches a unit square.

The first 45 rectangular polynomials, i.e., up to and including the eighth order, for a rectangular pupil with an aspect ratio of 0.75 are given in Tables 9-4 through 9-6 in terms of Zernike circle polynomials, in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in three different but equivalent ways in Figure 9-6 with the isometric plot, interferogram, and the PSF for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 9-7. The Strehl ratio for a sigma value of 0.1 wave is given in Table 9-8 and plotted in Figure 9-7. The Seidel aberrations are discussed in Section 9.7, and their sigma values with and without balancing are given in Table 9-9.
References

1. K. N. LaFortune, R. L. Hurd, S. N. Fochs, M. D. Rotter, P. H. Pax, R. L. Combs, S. S. Olivier, J. M. Brase, and R. M. Yamamoto, “Technical challenges for the future of high energy lasers,” Proc. SPIE 6454, 1–11 (2007).

2. V. N. Mahajan and G.-m Dai, “Orthonormal polynomials in wavefront analysis: analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc. Am. A 29, 1673–1674 (2012).

3. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–11.41 (McGraw–Hill, 2009).

4. J. Rayces, “Least-squares fitting of orthogonal polynomials to the wave-aberration function,” Appl. Opt. 31, 2223–2228 (1992).
CHAPTER 10

SYSTEMS WITH SQUARE PUPILS

10.1 Introduction
10.2 Pupil Function
10.3 Aberration-Free Imaging
    10.3.1 PSF
    10.3.2 OTF
10.4 Square Polynomials
10.5 Square Coefficients of a Square Aberration Function
10.6 Isometric, Interferometric, and Imaging Characteristics of Square Polynomial Aberrations
10.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio
    10.7.1 Defocus
    10.7.2 Astigmatism
    10.7.3 Coma
    10.7.4 Spherical Aberration
    10.7.5 Strehl Ratio
10.8 Summary
References
Chapter 10
Systems with Square Pupils
10.1 INTRODUCTION
We start this chapter with a brief discussion of the aberration-free PSF and OTF for a system with a square pupil, as, for example, a high-power laser beam with a square cross-section. We can obtain these results as a special case of the rectangular pupils discussed in the last chapter. Similarly, the square polynomials $S_k$ can be obtained as a special case of the rectangular polynomials $R_k$ discussed there, i.e., by letting $c = 1/\sqrt{2}$. However, we describe the procedure for obtaining them independently [1,2], and give expressions for the first 45 polynomials, i.e., up to and including the eighth order. The isometric, interferometric, and PSF plots of these polynomial aberrations with a sigma value of one wave are given along with their P-V numbers. The Strehl ratios for these polynomial aberrations for a sigma value of one-tenth of a wave are also given. Finally, we discuss how to obtain the standard deviation of a Seidel aberration with and without balancing and then discuss the Strehl ratio as a function of it.

Orthogonal square polynomials were also obtained by Bray by orthogonalizing the circle polynomials, but he chose a circle inscribed inside a square instead of the other way around [3]. Thus, his square with a full width of unity has regions that fall outside the unit circle. Defining a unit square as we have, where its semidiagonal is unity, has the advantage that the coefficient of a term in a certain polynomial represents its peak value. For example, since $\rho$ has a maximum value of unity, the coefficients of astigmatism $\rho^2\cos^2\theta$ in $S_6$, or coma $\rho^3\cos\theta$ in $S_8$, or spherical aberration $\rho^4$ in $S_{11}$ represent their peak values.

As in the case of rectangular polynomials, products of the x- and y-Legendre polynomials, which are orthogonal over a square pupil, are not suitable for the analysis of square wavefronts [4], because they do not represent classical or balanced aberrations. For example, defocus is represented by a term in $x^2 + y^2$. While it can be expanded in terms of a complete set of Legendre polynomials, it cannot be represented by a single 2D Legendre polynomial (i.e., as a product of x- and y-Legendre polynomials). The same difficulty holds for spherical aberration, coma, etc. However, products of Legendre polynomials are the correct polynomials for an anamorphic system, as discussed in Chapter 13.

10.2 PUPIL FUNCTION


As illustrated in Figure 10-1, consider an optical system with a square exit pupil of half-width $a$ and area $S_{ex} = 4a^2$ lying in the $(x_p, y_p)$ plane with the z axis as its optical axis. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

\[ P(x_p, y_p) = A(x_p, y_p)\exp\left[i\Phi(x_p, y_p)\right] , \tag{10-1} \]

where

\[ A(x_p, y_p) = \left(P_{ex}/S_{ex}\right)^{1/2} , \quad -a \le x_p \le a , \quad -a \le y_p \le a . \tag{10-2} \]

Figure 10-1. Square pupil of half-width a.

10.3 ABERRATION-FREE IMAGING


10.3.1 PSF
From Eq. (2-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system with a uniformly illuminated square exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written

\[ I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\int_{-a}^{a}\int_{-a}^{a}\exp\left[i\Phi(x_p, y_p)\right]\exp\left[-\frac{2\pi i}{\lambda R}\left(x_i x_p + y_i y_p\right)\right]dx_p\,dy_p\right|^2 . \tag{10-3} \]

Substituting

\[ (x', y') = a^{-1}(x_p, y_p) \tag{10-4} \]

and

\[ (x, y) = \frac{1}{\lambda F}(x_i, y_i) \tag{10-5} \]

into Eq. (10-3), where

\[ F = R/2a \tag{10-6} \]

is the focal ratio of the image-forming beam along the x and the y axes, we obtain the irradiance distribution

\[ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi(x', y')\right]\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 . \tag{10-7} \]

Accordingly, the aberration-free distribution is given by

\[ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 = \left(\frac{\sin\pi x}{\pi x}\right)^2\left(\frac{\sin\pi y}{\pi y}\right)^2 . \tag{10-8} \]

Figure 10-2a shows the 2D PSF, in particular, the central bright square spot of size $2 \times 2$, with each dimension in units of $\lambda F$. The PSF is zero wherever x and/or y is a positive or a negative integer. Moreover, there are rectangular spots along the x and y axes, but square spots elsewhere in the PSF. Figure 10-2b shows the irradiance distribution along the x and y axes, and along the diagonal of the central bright spot as $I(x, 0)$, $I(0, y)$, and $I(x, x) \equiv I(r)$, where $r = \left(x^2 + y^2\right)^{1/2} = \sqrt{2}\,x$ and

\[ I(r) = \left[\frac{\sin\left(\pi r/\sqrt{2}\right)}{\pi r/\sqrt{2}}\right]^4 . \tag{10-9} \]

The irradiance along the diagonal is zero at integral multiples of $\sqrt{2}$.
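The aberration-free PSF of Eqs. (10-8) and (10-9) is simple enough to evaluate directly. The short sketch below, whose function names are chosen here for illustration, evaluates it at a few points to show the zeros at integer values of x or y and at $r = \sqrt{2}$ along the diagonal.

```python
import math

def sinc(t):
    """sin(pi t) / (pi t), with the limiting value 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)

def psf(x, y):
    """Aberration-free PSF of Eq. (10-8); x and y are in units of lambda * F."""
    return (sinc(x) * sinc(y)) ** 2

def psf_diagonal(r):
    """PSF along the diagonal x = y, Eq. (10-9), with r = sqrt(2) * x."""
    s = r / math.sqrt(2.0)
    return psf(s, s)

print(psf(0.0, 0.0))                 # central value, normalized to unity
print(psf(1.0, 0.3))                 # essentially zero because x is an integer
print(psf_diagonal(math.sqrt(2.0)))  # essentially zero at the first diagonal zero
print(psf(0.5, 0.0), psf_diagonal(0.5))
```

At the same distance from the center, the diagonal value differs from the value along an axis, which reflects the lack of radial symmetry of the PSF.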

Figure 10-2. (a) 2D aberration-free PSF. (b) Irradiance distributions $I(x, 0)$, $I(0, y)$, and $I(r)$ along the x and y axes, and along the diagonal of the central bright spot of the PSF.

10.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a square pupil at a spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two squares centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 10-3. The overlap area is given by

\[ S(\xi, \eta) = \left(2a - \lambda R\xi\right)\left(2a - \lambda R\eta\right) = 4a^2\left(1 - \frac{\xi}{1/\lambda F}\right)\left(1 - \frac{\eta}{1/\lambda F}\right) . \tag{10-10} \]

Hence, the fractional area of overlap, or the OTF of the system, may be written

\[ \tau(v_x, v_y) = \left(1 - v_x\right)\left(1 - v_y\right) , \tag{10-11} \]

where

\[ (v_x, v_y) = \left(\frac{\xi}{1/\lambda F}, \frac{\eta}{1/\lambda F}\right) \tag{10-12} \]

are the spatial frequency components in units of the cutoff frequency $1/\lambda F$ along the x or the y axis. The OTF $\tau(v_x, 0)$ along the x axis is the same as the OTF $\tau(0, v_y)$ along the y axis, with the same normalized cutoff frequency of unity.
Figure 10-3. Overlap area of two square pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$.

The OTF $\tau(v)$, where $v = \left(v_x^2 + v_y^2\right)^{1/2}$, along the diagonal of the pupil can be obtained from Eq. (10-11) by letting $v_x = v_y = v/\sqrt{2}$. Thus

\[ \tau(v) = \left(1 - \frac{v}{\sqrt{2}}\right)^2 . \tag{10-13} \]

Its cutoff frequency is $\sqrt{2}$.

Figure 10-4 shows the OTF $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ along the x and y axes, and along the diagonal of the pupil with cutoff frequencies 1, 1, and $\sqrt{2}$, respectively, each in units of $1/\lambda F$. Of course, $\tau(v_x, 0) = \tau(0, v_y)$ for any $v_x = v_y$. The OTF $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range $0 < v = v_x < 2\left(\sqrt{2} - 1\right)$. They are equal to each other at the frequency $2\left(\sqrt{2} - 1\right)$ (or about 0.83), and $\tau(v) > \tau(v_x, 0)$ for frequencies in the range $2\left(\sqrt{2} - 1\right) < v = v_x < \sqrt{2}$. Of course, $\tau(v_x, 0)$ is zero for $v_x \ge 1$, but $\tau(v)$ is not until $v = \sqrt{2}$.
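The comparison between the axial and diagonal OTFs can be verified with a few lines of code. In the sketch below, the use of absolute values and the clipping of the OTF to zero beyond the cutoff are assumptions added here so that the function is defined for all frequencies; the function names are illustrative.

```python
import math

def otf(vx, vy):
    """Aberration-free OTF of Eq. (10-11), set to zero beyond the cutoff.

    The absolute values and the clipping are assumptions added for this sketch.
    """
    return max(0.0, 1.0 - abs(vx)) * max(0.0, 1.0 - abs(vy))

def otf_diagonal(v):
    """OTF along the pupil diagonal, Eq. (10-13), with vx = vy = v / sqrt(2)."""
    s = v / math.sqrt(2.0)
    return otf(s, s)

v_cross = 2.0 * (math.sqrt(2.0) - 1.0)  # frequency at which the two curves cross
print(v_cross)
print(otf(v_cross, 0.0), otf_diagonal(v_cross))  # equal at the crossing
print(otf(0.5, 0.0), otf_diagonal(0.5))          # axial value is larger below the crossing
print(otf(1.2, 0.0), otf_diagonal(1.2))          # only the diagonal value is nonzero
```

The crossing frequency follows from setting $(1 - v/\sqrt{2})^2 = 1 - v$, which gives $v = 2(\sqrt{2} - 1)$.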

Figure 10-4. Aberration-free OTF $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ of a system with a square pupil, where $v_x$, $v_y$, and $v$ are in units of the cutoff frequency $1/\lambda F$ along the x axis.

10.4 SQUARE POLYNOMIALS


Figure 10-5 shows a unit square inscribed inside a unit circle. The distance of a corner point of the square, such as A, from its center O is unity, but each of its sides has a length of $\sqrt{2}$, and its area is 2.

The orthonormal square polynomials $S_j(x, y)$ obtained by orthogonalizing the Zernike circle polynomials $Z_j(x, y)$ over a unit square are given by [see Eq. (3-18)]

\[ S_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\left\langle Z_{j+1}S_k\right\rangle S_k\right] , \tag{10-14} \]

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit square, i.e., they satisfy the orthonormality condition

\[ \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}S_j S_{j'}\,dx = \delta_{jj'} . \tag{10-15} \]

The angular brackets indicate a mean value over the square pupil. Thus, for example,

\[ \left\langle Z_j S_k\right\rangle = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}Z_j S_k\,dx . \tag{10-16} \]

If the integrand is an odd function of x and/or y, the mean value is zero because of the symmetric limits of integration. If the integrand is an even function, then we may replace the lower limits of integration by zero and multiply the double integral by 4.
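The orthogonalization of Eq. (10-14) can be sketched with polynomials stored as dictionaries that map monomial exponents $(i, j)$ of $x^i y^j$ to coefficients, so that the mean values of Eq. (10-16) follow from closed-form moments over the unit square. The example below is a minimal illustration for the first six polynomials only. The Cartesian forms of $Z_1$ through $Z_6$ used here are the standard orthonormal Zernike circle polynomials and are stated as an assumption, since they are not listed in this section; all function names are hypothetical.

```python
import math

H = 1.0 / math.sqrt(2.0)  # half-width of the unit square

def moment(i, j):
    """Mean value of x**i * y**j over the unit square (zero if i or j is odd)."""
    if i % 2 or j % 2:
        return 0.0
    return H ** (i + j) / ((i + 1) * (j + 1))

def inner(p, q):
    """Mean value <p q> of the product of two polynomials, as in Eq. (10-16)."""
    return sum(a * b * moment(i + k, j + l)
               for (i, j), a in p.items() for (k, l), b in q.items())

def axpy(alpha, p, q):
    """Return the polynomial alpha * p + q."""
    out = dict(q)
    for key, a in p.items():
        out[key] = out.get(key, 0.0) + alpha * a
    return out

def orthonormalize(zernikes):
    """Gram-Schmidt orthogonalization of Eq. (10-14) over the unit square."""
    basis = []
    for z in zernikes:
        s = dict(z)
        for sk in basis:
            s = axpy(-inner(z, sk), sk, s)   # subtract <Z S_k> S_k
        norm = math.sqrt(inner(s, s))        # equals 1 / N_{j+1}
        basis.append({key: a / norm for key, a in s.items()})
    return basis

r3, r6 = math.sqrt(3.0), math.sqrt(6.0)
Z = [
    {(0, 0): 1.0},                                 # Z1 = 1
    {(1, 0): 2.0},                                 # Z2 = 2x
    {(0, 1): 2.0},                                 # Z3 = 2y
    {(2, 0): 2 * r3, (0, 2): 2 * r3, (0, 0): -r3}, # Z4 = sqrt(3)(2 rho^2 - 1)
    {(1, 1): 2 * r6},                              # Z5 = 2 sqrt(6) xy
    {(2, 0): r6, (0, 2): -r6},                     # Z6 = sqrt(6)(x^2 - y^2)
]
S = orthonormalize(Z)
for j, s in enumerate(S, start=1):
    print(f"S{j}:", {key: round(a, 6) for key, a in s.items()})
```

The coefficients printed for S2, S4, S5, and S6 can be compared with the corresponding entries of Table 10-3, for example $\sqrt{6}$ for x in $S_2$ and 6 for xy in $S_5$.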

The orthonormal square polynomials up to and including the eighth order, i.e., the
first 45 polynomials, in terms of the Zernike circle polynomials are given in Table 10-1.

Figure 10-5. Unit square ABCD of half-width $1/\sqrt{2}$ inscribed inside a unit circle, with corners at $\left(\pm 1/\sqrt{2}, \pm 1/\sqrt{2}\right)$. Its corner points, such as A, lie at a distance of unity from its center O.
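Table 10-1. Orthonormal square polynomials $S_j(\rho, \theta)$ in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$.

$S_1 = Z_1$
$S_2 = \sqrt{3/2}\,Z_2$
$S_3 = \sqrt{3/2}\,Z_3$
$S_4 = \left(\sqrt{5/2}/2\right)Z_1 + \left(\sqrt{15/2}/2\right)Z_4$
$S_5 = \sqrt{3/2}\,Z_5$
$S_6 = \left(\sqrt{15}/2\right)Z_6$
$S_7 = \left(3\sqrt{21/31}/2\right)Z_3 + \left(5\sqrt{21/62}/2\right)Z_7$
$S_8 = \left(3\sqrt{21/31}/2\right)Z_2 + \left(5\sqrt{21/62}/2\right)Z_8$
$S_9 = -\left(7\sqrt{5/31}/2\right)Z_3 - \left(13\sqrt{5/62}/4\right)Z_7 + \left(\sqrt{155/2}/4\right)Z_9$
$S_{10} = \left(7\sqrt{5/31}/2\right)Z_2 + \left(13\sqrt{5/62}/4\right)Z_8 + \left(\sqrt{155/2}/4\right)Z_{10}$
$S_{11} = \left(8/\sqrt{67}\right)Z_1 + \left(25\sqrt{3/67}/4\right)Z_4 + \left(21\sqrt{5/67}/4\right)Z_{11}$
$S_{12} = \left(45\sqrt{3}/16\right)Z_6 + \left(21\sqrt{5}/16\right)Z_{12}$
$S_{13} = \left(3\sqrt{7}/8\right)Z_5 + \left(\sqrt{105}/8\right)Z_{13}$
$S_{14} = \left[261/\left(8\sqrt{134}\right)\right]Z_1 + \left(345\sqrt{3/134}/16\right)Z_4 + \left(129\sqrt{5/134}/16\right)Z_{11} + \left(3\sqrt{335}/16\right)Z_{14}$
$S_{15} = \left(\sqrt{105}/4\right)Z_{15}$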


S16 = 1.71440511Z2 +1.71491497Z8 + 0.65048499Z10 + 1.52093102Z16
S17 = 1.71440511Z3 + 1.71491497Z7 0.65048449Z9 + 1.52093102Z17
S18 = 4.10471345Z2 + 3.45884077Z8 + 5.34411808Z10 + 1.51830574Z16 + 2.80808005Z18
S19 = 4.10471345Z3 3.45884078Z7 + 5.34411808Z9 1.51830575Z17 + 2.80808005Z19
S20 = 5.57146696Z2 + 4.44429264Z8 + 3.00807599Z10 + 1.70525179Z16 +1.16777987Z18 + 4.19716701Z20
S21 = 5.57146696Z3 + 4.44429264Z7 3.00807599Z9 + 1.70525179Z17 1.16777988Z19 + 4.19716701Z21
S22 = 1.33159935Z1 + 1.94695912Z4 + 1.74012467Z11 + 0.65624211Z14 + 1.50989174Z22
S23 = 0.95479991Z5 + 1.01511643Z13 + 1.28689496Z23
S24 = 9.87992565Z6 + 7.28853095Z12 + 3.38796312Z24
S25 = 5.61978925Z15 + 2.84975327Z25
S26 = 11.00650275Z1 + 14.00366597Z4 + 9.22698484Z11 + 13.55765720Z14
+ 3.18799971Z22 + 5.11045000Z26
S27 = 4.24396143Z5 + 2.70990074Z13 + 0.84615108Z23 + 5.17855026Z27

S28 = 17.58672314Z6 + 11.15913268Z12 + 3.57668869Z24 + 6.44185987Z28


S29 = 2.42764289Z3 + 2.69721906Z7 1.56598064Z9 + 2.12208902Z17
0.93135653Z19 + 0.25252773Z21 + 1.59017528Z29

S30 = 2.42764289Z2 + 2.69721906Z8 + 1.56598064Z10 + 2.12208902Z16


+ 0.93135653Z18 + 0.25252773Z20 + 1.59017528Z30

S31 9.10300982Z3 8.79978208Z7 + 10.69381427Z9 5.37383385Z17


+ 7.01044701Z19 1.26347272Z21 1.90131756Z29 + 3.07960207Z31
S32 9.10300982Z2 + 8.79978208Z8 + 10.69381427Z10 +5.37383385Z16
+ 7.01044701Z18 + 1.26347272Z20 + 1.90131756Z30 + 3.07960207Z32
S33 21.39630883Z3 + 19.76696884Z7 12.70550260Z9 + 11.05819453Z17
7.02178756Z19 +15.80286172Z21 + 3.29259996Z29 2.07602718Z31
+ 5.40902889Z33
S34 21.39630883Z2 + 19.76696884Z8 + 12.70550260Z10 + 11.05819453Z16
+ 7.02178756Z18 +15.80286172Z20 + 3.29259996Z30 + 2.07602718Z32
+ 5.40902889Z34
S35 16.54454462Z3 14.89205549Z7 + 22.18054997Z9 7.94524849Z17
+ 11.85458952Z19 6.18963457Z21 2.19431441Z29 +3.24324400Z31
1.72001172Z33 + 8.16384008Z35
S36 16.54454462Z2 + 14.89205549Z8 + 22.18054997Z10 + 7.94524849Z16
+ 11.85458952Z18 + 6.18963457Z20 + 2.19431441Z30 +3.24324400Z32
+ 1.72001172Z34 + 8.16384008Z36
S37 1.75238960Z1 + 2.72870567Z4 + 2.76530671Z11 + 1.43647360Z14
+ 2.12459170Z22 + 0.92450043Z26 + 1.58545010Z37
S38 19.24848143Z6 + 16.41468913Z12 + 9.76776798Z24 + 1.47438007Z28
+ 3.83118509Z38
S39 0.46604820Z5 + 0.84124290Z13 + 1.00986774Z23 0.42520747Z27 + 1.30579570Z39
S40 28.18104531Z1 + 38.52219208Z4 + 30.18363661Z11 + 36.44278147Z14 +
15.52577202Z22 + 19.21524879Z26 + 4.44731721Z37 + 6.00189814Z40
$S_{41} = (369/4)\sqrt{35/3574}\,Z_{15} + \left[11781/\left(32\sqrt{3574}\right)\right]Z_{25} + (2145/32)\sqrt{7/3574}\,Z_{41}$

S42 85.33469748Z6 + 64.01249391Z12 + 30.59874671Z24 + 34.09158819Z28


+7.75796322Z38 + 9.37150432Z42
S43 14.30642479Z5 + 11.17404702Z13 + 5.68231935Z23 + 18.15306055Z27
+ 1.54919583Z39 + 5.90178984Z43
S44 36.12567424Z1 + 47.95305224Z4 + 35.30691679Z11 + 56.72014548Z14
+ 16.36470429Z22 + 26.32636277Z26 +3.95466397Z37 +6.33853092Z40
+ 12.38056785Z44
S45 21.45429746Z15 + 9.94633083Z25 + 2.34632890Z41 + 10.39130049Z45
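Table 10-2. Orthonormal square polynomials $S_j(\rho, \theta)$ in polar coordinates $(\rho, \theta)$.

$S_1 = 1$
$S_2 = \sqrt{6}\,\rho\cos\theta$
$S_3 = \sqrt{6}\,\rho\sin\theta$
$S_4 = \sqrt{5/2}\left(3\rho^2 - 1\right)$
$S_5 = 3\rho^2\sin 2\theta$
$S_6 = 3\sqrt{5/2}\,\rho^2\cos 2\theta$
$S_7 = \sqrt{21/31}\left(15\rho^2 - 7\right)\rho\sin\theta$
$S_8 = \sqrt{21/31}\left(15\rho^2 - 7\right)\rho\cos\theta$
$S_9 = \left(\sqrt{5/31}/2\right)\left[31\rho^3\sin 3\theta - 3\left(13\rho^2 - 4\right)\rho\sin\theta\right]$
$S_{10} = \left(\sqrt{5/31}/2\right)\left[31\rho^3\cos 3\theta + 3\left(13\rho^2 - 4\right)\rho\cos\theta\right]$
$S_{11} = \left(1/2\sqrt{67}\right)\left(315\rho^4 - 240\rho^2 + 31\right)$
$S_{12} = \left(15/2\sqrt{2}\right)\left(7\rho^2 - 3\right)\rho^2\cos 2\theta$
$S_{13} = \sqrt{21/2}\left(5\rho^2 - 3\right)\rho^2\sin 2\theta$
$S_{14} = \left[3/\left(8\sqrt{134}\right)\right]\left(335\rho^4\cos 4\theta + 645\rho^4 - 300\rho^2 + 22\right)$
$S_{15} = (5/2)\sqrt{21/2}\,\rho^4\sin 4\theta$
$S_{16} = \sqrt{55/1966}\left[11\rho^3\cos 3\theta + 3\left(19 - 97\rho^2 + 105\rho^4\right)\rho\cos\theta\right]$
$S_{17} = \sqrt{55/1966}\left[-11\rho^3\sin 3\theta + 3\left(19 - 97\rho^2 + 105\rho^4\right)\rho\sin\theta\right]$
$S_{18} = (1/4)\sqrt{3/844397}\left[5\left(-10099 + 20643\rho^2\right)\rho^3\cos 3\theta + 3\left(3128 - 23885\rho^2 + 37205\rho^4\right)\rho\cos\theta\right]$
$S_{19} = (1/4)\sqrt{3/844397}\left[5\left(-10099 + 20643\rho^2\right)\rho^3\sin 3\theta - 3\left(3128 - 23885\rho^2 + 37205\rho^4\right)\rho\sin\theta\right]$
$S_{20} = (1/16)\sqrt{7/859}\left[2577\rho^5\cos 5\theta - 5\left(272 - 717\rho^2\right)\rho^3\cos 3\theta + 30\left(22 - 196\rho^2 + 349\rho^4\right)\rho\cos\theta\right]$
$S_{21} = (1/16)\sqrt{7/859}\left[2577\rho^5\sin 5\theta + 5\left(272 - 717\rho^2\right)\rho^3\sin 3\theta + 30\left(22 - 196\rho^2 + 349\rho^4\right)\rho\sin\theta\right]$
$S_{22} = (1/4)\sqrt{65/849}\left(1155\rho^6 + 30\rho^4\cos 4\theta - 1395\rho^4 + 453\rho^2 - 31\right)$
$S_{23} = (1/2)\sqrt{33/3923}\left(471 - 1820\rho^2 + 1575\rho^4\right)\rho^2\sin 2\theta$
$S_{24} = (21/4)\sqrt{65/1349}\left(27 - 140\rho^2 + 165\rho^4\right)\rho^2\cos 2\theta$
$S_{25} = (7/4)\sqrt{33/2}\left(9\rho^2 - 5\right)\rho^4\sin 4\theta$
$S_{26} = \left(1/16\sqrt{849}\right)\left[5\left(-98 + 2418\rho^2 - 12051\rho^4 + 15729\rho^6\right) + 3\left(-8195 + 17829\rho^2\right)\rho^4\cos 4\theta\right]$
$S_{27} = \left(1/16\sqrt{7846}\right)\left[27461\rho^6\sin 6\theta + 15\left(348 - 2744\rho^2 + 4487\rho^4\right)\rho^2\sin 2\theta\right]$
$S_{28} = \left(21/32\sqrt{1349}\right)\left[1349\rho^6\cos 6\theta + 5\left(196 - 1416\rho^2 + 2247\rho^4\right)\rho^2\cos 2\theta\right]$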

S29 = ( 13.79189793ȡ + 125.49411319ȡ3 308.13074909ȡ5 + 222.62454035ȡ7) sinș


+ (8.47599260ȡ3 16.13156842ȡ5) sin3ș + 0.87478174ȡ5 sin5ș
S30 = ( 13.79189793ȡ + 125.49411319ȡ3 308.13074909ȡ5 + 222.62454035ȡ7) cosș
+ ( 8.47599260ȡ3 + 16.13156842ȡ5) cos3ș + 0.87478174ȡ5 cos5ș
S31 = (6.14762642ȡ 79.44065626ȡ3 + 270.16115026ȡ5 266.18445920ȡ7) sinș
+ (56.29115383ȡ3 248.12774426ȡ5 + 258.68657393ȡ7) sin3ș 4.37679791ȡ5 sin5ș
3
S32 = ( 6.14762642ȡ + 79.44065626ȡ 270.16115026ȡ + 266.18445920ȡ7) cosș
5

3
+ (56.29115383ȡ 248.12774426ȡ5 + 258.68657393ȡ7) cos3ș +4.37679791ȡ5 cos5ș
S33 = ( 6.78771487ȡ + 103.15977419ȡ3 407.15689696ȡ5 + 460.96399558ȡ7)sinș
+ ( 21.68093294ȡ3 + 127.50233381ȡ5 174.38628345ȡ7) sin3ș
+ ( 75.07397471ȡ5 + 151.45280913ȡ7) sin5ș
S34 = ( 6.78771487ȡ + 103.15977419ȡ3 407.15689696ȡ5 + 460.96399558ȡ7)cosș
+ (21.68093294ȡ3 127.50233381ȡ5 + 174.38628345ȡ7) cos3ș
+ ȡ5( 75.07397471 + 151.45280913ȡ2) cos5ș
S35 = (3.69268433ȡ 59.40323317ȡ3 + 251.40397826ȡ5 307.20401818ȡ7)sinș
+ (28.20381860ȡ3 183.86176738ȡ5 + 272.43249673ȡ7)sin3ș
+ (19.83875817ȡ5 48.16032819ȡ7) sin 5ș + 32.65536033ȡ7 sin7ș
S36 = ( 3.69268433ȡ + 59.40323317ȡ3 251.40397826ȡ5 + 307.20401818ȡ7)cosș
+ (28.20381860ȡ3 183.86176738ȡ5 + 272.43249673ȡ7)cos3ș
+ ( 19.83875817ȡ5 + 48.16032819ȡ7) cos5ș + 32.65536033ȡ7 cos7ș
S37 = 2.34475558 55.32128002ȡ2 + 296.53777290ȡ4 553.46621887ȡ6
+ 332.94452229ȡ8 + ( 12.75329096ȡ4 + 20.75498320ȡ6)cos4ș
S38 = ( 51.83202694ȡ2 + 451.93890159ȡ4 1158.49126888ȡ6 + 910.24313983ȡ8)cos2ș
+ 5.51662508ȡ6 cos6ș
S39 = ( 39.56789598ȡ2 + 267.47071204ȡ4 525.02362247ȡ6 + 310.24123146ȡ8)sin2ș
1.59098067ȡ6 sin6ș
S40 = 1.21593465 45.42224477ȡ2 + 373.41167834ȡ4 1046.32659847ȡ6
+ 933.93661610ȡ8 + (137.71626496ȡ4 638.10242034ȡ6 + 712.98912399ȡ8)cos4ș

$S_{41} = (9/8)\sqrt{7/1787}\left(1455 - 5544\rho^2 + 5005\rho^4\right)\rho^4\sin 4\theta$

S42 = ( 40.45171657ȡ2 + 494.75561036ȡ4 1738.64589491ȡ6 + 1843.19802390ȡ8)cos2ș


+ ( 150.76043598ȡ6 + 318.07940431ȡ8)cos6ș
S43 = ( 9.12193686ȡ2 + 110.47679089ȡ4 371.21215287ȡ6 + 368.07015240ȡ8)sin2ș
+ ( 107.35168289ȡ6 + 200.31338972ȡ8) sin6ș
S44 = 0.58427150 25.29433513ȡ2 + 242.54313549ȡ4 795.02011474ȡ6
+ 830.47943579ȡ8 + (90.22533813ȡ4 538.44320774ȡ6 + 752.97905752ȡ8) cos4ș
+ 52.52630092ȡ8 cos8ș
S45 = (31.08509142ȡ4 194.79990628ȡ6 + 278.72965314ȡ8) sin4ș + 44.08655427ȡ8 sin8ș
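Table 10-3. Orthonormal square polynomials $S_j(x, y)$ in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.

$S_1 = 1$
$S_2 = \sqrt{6}\,x$
$S_3 = \sqrt{6}\,y$
$S_4 = \sqrt{5/2}\left(3\rho^2 - 1\right)$
$S_5 = 6xy$
$S_6 = 3\sqrt{5/2}\left(x^2 - y^2\right)$
$S_7 = \sqrt{21/31}\left(15\rho^2 - 7\right)y$
$S_8 = \sqrt{21/31}\left(15\rho^2 - 7\right)x$
$S_9 = \sqrt{5/31}\left(27x^2 - 35y^2 + 6\right)y$
$S_{10} = \sqrt{5/31}\left(35x^2 - 27y^2 - 6\right)x$
$S_{11} = \left(1/2\sqrt{67}\right)\left(315\rho^4 - 240\rho^2 + 31\right)$
$S_{12} = \left(15/2\sqrt{2}\right)\left(x^2 - y^2\right)\left(7\rho^2 - 3\right)$
$S_{13} = \sqrt{42}\left(5\rho^2 - 3\right)xy$
$S_{14} = \left(3/4\sqrt{134}\right)\left[10\left(49x^4 - 36x^2y^2 + 49y^4\right) - 150\rho^2 + 11\right]$
$S_{15} = 5\sqrt{42}\left(x^2 - y^2\right)xy$
$S_{16} = \sqrt{55/1966}\left(315\rho^4 - 280x^2 - 324y^2 + 57\right)x$
$S_{17} = \sqrt{55/1966}\left(315\rho^4 - 324x^2 - 280y^2 + 57\right)y$
$S_{18} = (1/2)\sqrt{3/844397}\left[105\left(1023x^4 + 80x^2y^2 - 943y^4\right) - 61075x^2 + 39915y^2 + 4692\right]x$
$S_{19} = (1/2)\sqrt{3/844397}\left[105\left(943x^4 - 80x^2y^2 - 1023y^4\right) - 39915x^2 + 61075y^2 - 4692\right]y$
$S_{20} = (1/4)\sqrt{7/859}\left[6\left(693x^4 - 500x^2y^2 + 525y^4\right) - 1810x^2 - 450y^2 + 165\right]x$
$S_{21} = (1/4)\sqrt{7/859}\left[6\left(525x^4 - 500x^2y^2 + 693y^4\right) - 450x^2 - 1810y^2 + 165\right]y$
$S_{22} = (1/4)\sqrt{65/849}\left[1155\rho^6 - 15\left(91x^4 + 198x^2y^2 + 91y^4\right) + 453\rho^2 - 31\right]$
$S_{23} = \sqrt{33/3923}\left(1575\rho^4 - 1820\rho^2 + 471\right)xy$
$S_{24} = (21/4)\sqrt{65/1349}\left(165\rho^4 - 140\rho^2 + 27\right)\left(x^2 - y^2\right)$
$S_{25} = 7\sqrt{33/2}\left(9\rho^2 - 5\right)xy\left(x^2 - y^2\right)$
$S_{26} = \left(1/8\sqrt{849}\right)\left[42\left(1573x^6 - 375x^4y^2 - 375x^2y^4 + 1573y^6\right) - 60\left(707x^4 - 225x^2y^2 + 707y^4\right) + 6045\rho^2 - 245\right]$
$S_{27} = \left(1/2\sqrt{7846}\right)\left[14\left(2673x^4 - 2500x^2y^2 + 2673y^4\right) - 10290\rho^2 + 1305\right]xy$
$S_{28} = \left(21/8\sqrt{1349}\right)\left[3146x^6 - 2250x^4y^2 + 2250x^2y^4 - 3146y^6 - 1770\left(x^4 - y^4\right) + 245\left(x^2 - y^2\right)\right]$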
S29 = ( 13.79189793 + 150.92209099x2 + 117.01812058y2 352.15154565x4 657.27245247x2y2
291.12439892y4 + 222.62454035x6 + 667.87362106x4y2 + 667.87362106x2y4 + 222.62454035y6)y
S30 = ( 13.79189793 + 117.01812058x2 + 150.92209099y2 291.12439892x4 657.27245247x2y2
352.15154565y + 222.62454035x + 667.87362106x y + 667.87362106x2y4 + 222.62454035y6)x
4 6 4 2

S31 = (6.14762642 + 89.43280522x2 135.73181009y2 496.10607212x4 + 87.83479115x2y2


+ 513.91209661y4 + 509.87526260x6 + 494.87949207x4y2 539.86680367x2y4 524.87103314y6)y
S32 = ( 6.14762642 + 135.73181009x2 89.43280522y2 513.91209661x4 87.83479115x2y2
+ 496.10607212y4 + 524.87103314x6 + 539.86680367x4y2 494.87949207x2y4 509.87526260y6)x
2 2
S33 = ( 6.78771487 + 38.11697536x + 124.84070714y 400.01976911x4 + 191.43062089x2y2
609.73320550y4 + 695.06919087x6 246.30347616x4y2 154.56957886x2y4 + 786.80308817y6)y
2 2
S34 = ( 6.78771487 + 124.84070714x + 38.11697536y 609.73320550x4 + 191.43062089x2y2
400.01976911y4 + 786.80308817x6 154.56957886x4y2 246.30347616x2y4 + 695.06919087y6)x
S35 = (3.69268433 + 25.20822264x2 87.60705178y2 200.98753298x4 63.30315999x2y2
+ 455.10450382y4 + 497.87935336x6 461.58554163x4y2 + 470.02596297x2y4 660.45220344y6)y
S36 = ( 3.69268433 + 87.60705178x2 25.20822264y2 455.10450382x4 + 63.30315999x2y2
+ 200.98753298y4 + 660.45220344x6 470.02596297x4y2 + 461.58554163x2y4 497.87935336y6)x
S37 = 2.34475558 55.32128002ȡ2 + 283.78448194ȡ4 532.71123567ȡ6 + 332.94452229ȡ8
+ 8(12.75329096ȡ2 20.75498320ȡ4) x2 + 8( 12.75329096 + 20.75498320ȡ2)x4
S38 = ( 51.83202694 + 451.93890159x2 1152.97464379x4 + 910.24313983x6)x2
+ (51.83202694 451.93890159y2 1241.24064523x4 + 1241.24064523x2y2
+ 1152.97464379y4 + 1820.48627967x6 1820.48627967x2y4 910.24313983y6)y2
S39 = ( 79.13579197 + 534.94142408x2 + 534.94142408y2 1059.59312899x4 2068.27487642x2y2
4 6 4 2
1059.59312899y + 620.48246292x + 1861.44738877x y + 1861.44738877x2y4 620.48246292y6)xy
S40 = 1.21593465 + ( 45.42224477 + 511.12794331x2 1684.42901882x4
+ 1646.92574009x6)x2 + ( 45.42224477 79.47423312x2 + 511.12794331y2
+ 51.53230630x4 + 51.53230630x2y2 1684.42901882y4 + 883.78996844x6
1526.27154329x4y2 + 883.78996844x2y4 + 1646.92574009y6)y2
S41 = (409.79084415x2 409.79084415y2 1561.42985567x4 + 1561.42985567y4
+ 1409.62417525x6 + 1409.62417525xy2 1409.62417525x2y4 1409.62417525y6)xy
S42 = ( 40.45171657 + 494.75561036x2 1889.40633090x4 + 2161.27742821x6)x2
+ (40.45171657 494.75561036y2 + 522.76064491x4 522.76064491x2y2
+ 1889.40633090y4 766.71561254x6 + 766.71561254x2y4 2161.27742821y6)y2
S43 = ( 18.24387372 + 220.95358178x2 + 220.95358178y2 1386.53440310x4
+ 662.18504631x2y2 1386.53440310y4 + 1938.02064313x6 595.96654168x4y2
595.96654168x2y4 + 1938.02064313y6)xy
S44 = 0.58427150 + ( 25.29433513 + 332.76847363x2 1333.46332249x4
+ 1635.98479424x6)x2 + ( 25.29433513 56.26575785x2 + 332.76847363y2
+ 307.15569451x4 + 307.15569451x2y2 1333.46332249y4 1160.73491284x6
+ 1129.92710444x4y2 1160.73491284x2y4 + 1635.98479424y6)y2
S45 = (124.34036571x2 124.34036571y2 779.19962514x4 + 779.19962514y4
+ 1467.61104674x6 1353.92842666x4y2 + 1353.92842666x2y4 1467.61104674y6)xy

The corresponding polynomials in polar and Cartesian coordinates are given in Tables 10-2 and 10-3, respectively. Of course, up to the fourth order, they can be obtained simply from the rectangular polynomials $R_k$ given in Tables 9-1 through 9-3 by letting $c = 1/\sqrt{2}$. The square polynomial $S_{11}$ representing the balanced primary spherical aberration is radially symmetric, but the polynomial $S_{22}$ representing balanced secondary spherical aberration is not, because it also contains a term in $Z_{14}$, which varies as $\cos 4\theta$. Similarly, the polynomial $S_{37}$ representing balanced tertiary spherical aberration is also not radially symmetric, since it contains terms in $Z_{14}$ and $Z_{26}$, both varying as $\cos 4\theta$.

10.5 SQUARE COEFFICIENTS OF A SQUARE ABERRATION FUNCTION


A square aberration function $W(x, y)$ across a unit square can be expanded in terms
of $J$ square polynomials $S_j(x, y)$ in the form

$$W(x, y) = \sum_{j=1}^{J} a_j S_j(x, y) , \tag{10-17}$$

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (10-17) by
$S_j(x, y)$, integrating over the unit square, and using the orthonormality Eq. (10-15), we
obtain the square expansion coefficients:

$$a_j = \frac{1}{2} \int_{-1/\sqrt{2}}^{1/\sqrt{2}} dx \int_{-1/\sqrt{2}}^{1/\sqrt{2}} W(x, y)\, S_j(x, y)\, dy . \tag{10-18}$$

As stated in Section 3.2, it is evident from Eq. (10-18) that the value of a square
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.

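The mean and mean square values of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = a_1 \tag{10-19}$$

and

$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{10-20}$$

respectively. Accordingly, the aberration variance is given by

$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{10-21}$$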
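The relations in Eqs. (10-18) through (10-21) can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes the low-order square polynomials $S_1 = 1$, $S_2 = \sqrt{6}\,x$, $S_3 = \sqrt{6}\,y$, $S_4 = \sqrt{5/2}\,(3\rho^2 - 1)$, and $S_6 = 3\sqrt{5/2}\,\rho^2\cos 2\theta$ (consistent with the forms used in Section 10.7), a hypothetical test aberration `W`, and Gauss-Legendre quadrature for the pupil averages.

```python
import numpy as np

# Gauss-Legendre nodes mapped to [-h, h], h = 1/sqrt(2): the unit square
# inscribed in the unit circle. Exact for the polynomial integrands used here.
h = 1 / np.sqrt(2)
t, w = np.polynomial.legendre.leggauss(12)
x, y = np.meshgrid(h * t, h * t, indexing="ij")
wt = np.outer(w, w) / 4.0            # weights sum to 1, so sum(wt*f) is the pupil mean

def mean(f):
    return float(np.sum(wt * f))

rho2 = x**2 + y**2
S = {                                 # assumed low-order square polynomials
    1: np.ones_like(x),
    2: np.sqrt(6) * x,
    3: np.sqrt(6) * y,
    4: np.sqrt(5 / 2) * (3 * rho2 - 1),
    6: 3 * np.sqrt(5 / 2) * (x**2 - y**2),
}

# Hypothetical test aberration (in waves): x tilt + defocus + 0-degree astigmatism
W = 0.3 * x + 0.5 * rho2 + 0.2 * x**2

a = {j: mean(W * Sj) for j, Sj in S.items()}             # Eq. (10-18)
var_direct = mean(W**2) - mean(W) ** 2                   # definition of the variance
var_coeffs = sum(v**2 for j, v in a.items() if j != 1)   # Eq. (10-21)

print(a)
print(var_direct, var_coeffs)
```

Because the test aberration lies entirely in the span of the polynomials listed, the two variance values agree, and `a[1]` equals the mean of `W`.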
10.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF SQUARE POLYNOMIAL ABERRATIONS

The square polynomials are illustrated in three different but equivalent ways in
Figure 10-6. For each polynomial, the isometric plot at the top illustrates its shape. An
interferogram is shown on the left, and a corresponding PSF is shown on the right for a
sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength)
are given in Table 10-4.

The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (10-7), are shown in Figure 10-6. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration
$S_1$ has no effect on the PSF, it yields an aberration-free PSF.

The polynomial aberrations $S_2$ and $S_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $\sqrt{3/2}\,\lambda a_2/a$ about the y axis and displaces the PSF
along the x axis by $\sqrt{6}\,a_2 \lambda F$. Similarly, $a_3$ corresponds to a wavefront tilt
angle of $\sqrt{3/2}\,\lambda a_3/a$ about the x axis and displaces the PSF by
$\sqrt{6}\,a_3 \lambda F$.

The defocus aberration represented by the polynomial $S_4$ is radially symmetric and
yields a radially symmetric interferogram bounded, of course, by a square. However, the
PSF is biaxially symmetric. The polynomial aberrations $S_5$ and $S_6$, representing
balanced astigmatism, yield biaxially symmetric interferograms and PSFs that are distinctly
different from each other. The polynomial aberrations $S_7$ and $S_8$, representing balanced
comas, produce biaxially symmetric interferograms, but the PSFs are symmetric only
about the y and x axes, respectively. The polynomial aberration $S_{11}$, representing the
balanced primary spherical aberration, yields a radially symmetric interferogram. However,
the polynomial aberrations $S_{22}$ and $S_{37}$, representing the balanced secondary and
tertiary spherical aberrations, are not radially symmetric because of the presence of a
$\cos 4\theta$ term. Accordingly, neither the interferograms nor the PSFs for these
aberrations are radially symmetric.

The Strehl ratio, namely the central value of a PSF relative to its aberration-free
value, can be obtained from Eq. (10-7) by letting $x = 0 = y$, i.e., from

$$I(0, 0) = \frac{1}{16} \left| \int_{-1}^{1} \int_{-1}^{1} \exp[i\Phi(x', y')]\, dx'\, dy' \right|^2 . \tag{10-22}$$

Its value for a square polynomial aberration with a sigma value of 0.1 wave is listed in
Table 10-5 and plotted in Figure 10-7. Because of the small value of the aberration, the
Strehl ratio is approximately the same for each polynomial. Both the table and the figure
illustrate that the Strehl ratio for a small aberration is nearly independent of the type of
aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
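The Strehl ratio of Eq. (10-22) can be evaluated as the squared modulus of the pupil average of $\exp(i\Phi)$. The sketch below is illustrative only: it assumes the forms $S_4 = \sqrt{5/2}\,(3\rho^2 - 1)$ and $S_6 = 3\sqrt{5/2}\,\rho^2\cos 2\theta$ used in Section 10.7, works in coordinates normalized by the unit circle (so the square has half-width $1/\sqrt{2}$), and uses Gauss-Legendre quadrature. The helper name `strehl` is hypothetical.

```python
import numpy as np

h = 1 / np.sqrt(2)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(40)
x, y = np.meshgrid(h * t, h * t, indexing="ij")
wt = np.outer(w, w) / 4.0                # normalized weights: sum(wt*f) is the pupil mean

def strehl(S_j, sigma_waves):
    """|<exp(i*Phi)>|^2 for the aberration sigma_waves * S_j (in waves)."""
    phi = 2 * np.pi * sigma_waves * S_j  # phase aberration in radians
    return float(abs(np.sum(wt * np.exp(1j * phi))) ** 2)

S4 = np.sqrt(5 / 2) * (3 * (x**2 + y**2) - 1)   # defocus
S6 = 3 * np.sqrt(5 / 2) * (x**2 - y**2)         # balanced astigmatism

s4 = strehl(S4, 0.1)
s6 = strehl(S6, 0.1)
approx = float(np.exp(-(0.2 * np.pi) ** 2))     # exp(-sigma_Phi^2)

print(s4, s6, approx)
```

The values computed for $S_4$ and $S_6$ can be compared with the corresponding entries of Table 10-5 and with the exponential estimate.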
Figure 10-6. Square polynomials $S_1$ through $S_{15}$, each shown as an isometric plot on
the top, an interferogram on the left, and a PSF on the right for a sigma value of one wave.
Figure 10-6 (cont.). Square polynomials $S_{16}$ through $S_{30}$, each shown as an isometric
plot on the top, an interferogram on the left, and a PSF on the right for a sigma value of
one wave.
Figure 10-6 (cont.). Square polynomials $S_{31}$ through $S_{45}$, each shown as an isometric
plot on the top, an interferogram on the left, and a PSF on the right for a sigma value of
one wave.
Table 10-4. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal square
polynomials for a sigma value of unity.

Poly.   P-V #      Poly.   P-V #      Poly.   P-V #
S1      0          S16     16.558     S31     11.511
S2      3.464      S17     16.558     S32     11.511
S3      3.464      S18     7.893      S33     9.390
S4      4.743      S19     7.893      S34     9.390
S5      6.000      S20     9.559      S35     12.574
S6      4.743      S21     9.559      S36     10.359
S7      9.312      S22     12.659     S37     17.116
S8      9.312      S23     20.728     S38     13.581
S9      6.532      S24     9.603      S39     29.423
S10     6.532      S25     9.749      S40     8.021
S11     7.374      S26     5.927      S41     13.325
S12     6.061      S27     7.975      S42     9.322
S13     12.962     S28     10.470     S43     10.502
S14     5.429      S29     24.983     S44     9.082
S15     6.236      S30     24.983     S45     9.853


Table 10-5. Strehl ratio S for square polynomial aberrations for a sigma value of 0.1
wave.

Poly.   S          Poly.   S          Poly.   S
S1      1          S16     0.712      S31     0.798
S2      0.662      S17     0.712      S32     0.798
S3      0.662      S18     0.681      S33     0.683
S4      0.669      S19     0.681      S34     0.683
S5      0.675      S20     0.688      S35     0.700
S6      0.669      S21     0.688      S36     0.700
S7      0.685      S22     0.722      S37     0.725
S8      0.685      S23     0.721      S38     0.708
S9      0.675      S24     0.690      S39     0.722
S10     0.675      S25     0.694      S40     0.688
S11     0.703      S26     0.673      S41     0.707
S12     0.669      S27     0.691      S42     0.679
S13     0.704      S28     0.698      S43     0.693
S14     0.6875     S29     0.723      S44     0.711
S15     0.682      S30     0.723      S45     0.700


Figure 10-7. Strehl ratio S for square polynomial aberrations with a sigma value of
0.1 wave.

10.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO
We now consider balancing of a Seidel aberration and obtain its standard deviation
with and without balancing. We also show how the Strehl ratio varies as a function of the
standard deviation and compare it with the approximate exponential expression for it.

10.7.1 Defocus
We start with the defocus aberration

$$W_d(\rho) = A_d \rho^2 . \tag{10-23}$$

From the form of the defocus orthonormal polynomial $S_4$ given in Table 10-2, it is
evident that its sigma value across a square pupil is given by

$$\sigma_d = \frac{1}{3}\sqrt{\frac{2}{5}}\, A_d = \frac{A_d}{4.743} . \tag{10-24}$$

10.7.2 Astigmatism
Next, consider 0° Seidel astigmatism given by

$$W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{10-25}$$

The orthonormal polynomial representing balanced astigmatism is given by

$$S_6 = 3\sqrt{\frac{5}{2}}\, \rho^2 \cos 2\theta \tag{10-26a}$$

$$\phantom{S_6} = 3\sqrt{10} \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) , \tag{10-26b}$$

showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2 \cos^2\theta$ is $-1/2$, as in the case of a circular, annular, or a Gaussian pupil.
Thus, the balanced astigmatism is given by

$$W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) . \tag{10-27}$$

Its sigma value is given by

$$\sigma_{ba} = \frac{A_a}{3\sqrt{10}} = \frac{A_a}{9.487} . \tag{10-28}$$

To obtain the sigma value of astigmatism, we write Eq. (10-25) in the form

$$W_a(\rho, \theta) = \frac{A_a}{3\sqrt{10}} \left( S_6 + S_4 \right) + \text{constant} . \tag{10-29}$$

Utilizing Eq. (10-21), the sigma value is given by

$$\sigma_a = \frac{A_a}{3\sqrt{5}} = \frac{A_a}{6.708} . \tag{10-30}$$

10.7.3 Coma
Now, we consider Seidel coma:

$$W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{10-31}$$

The orthonormal polynomial representing balanced coma is given by

$$S_8 = \sqrt{\frac{21}{31}} \left( 15\rho^3 \cos\theta - 7\rho \cos\theta \right) . \tag{10-32}$$

It shows that the relative amount of tilt $\rho \cos\theta$ that optimally balances Seidel coma
$\rho^3 \cos\theta$ is $-7/15$ compared to $-2/3$ for a circular pupil. The balanced coma is
given by

$$W_{bc}(\rho, \theta) = A_c \left( \rho^3 \cos\theta - \frac{7}{15}\rho \cos\theta \right) . \tag{10-33}$$

Its sigma value is given by

$$\sigma_{bc} = \frac{1}{15}\sqrt{\frac{31}{21}}\, A_c = \frac{A_c}{12.346} . \tag{10-34}$$

To obtain the sigma value of Seidel coma, we write Eq. (10-31) in the form

$$W_c(\rho, \theta) = \frac{A_c}{15} \left( \sqrt{\frac{31}{21}}\, S_8 + \frac{7}{\sqrt{6}}\, S_2 \right) . \tag{10-35}$$

Utilizing Eq. (10-21), we obtain the sigma value:

$$\sigma_c = \sqrt{\frac{3}{70}}\, A_c = \frac{A_c}{4.831} . \tag{10-36}$$

10.7.4 Spherical Aberration


Finally, we consider Seidel spherical aberration:

$$W_s(\rho) = A_s \rho^4 . \tag{10-37}$$

The orthonormal polynomial representing balanced spherical aberration is given by

$$S_{11} = \frac{1}{2\sqrt{67}} \left( 315\rho^4 - 240\rho^2 + 31 \right) . \tag{10-38}$$

Hence, the balanced spherical aberration is given by

$$W_{bs}(\rho) = A_s \left( \rho^4 - \frac{16}{21}\rho^2 \right) . \tag{10-39}$$

It shows that spherical aberration is balanced by a relative defocus of $-16/21$. Its sigma
value is given by

$$\sigma_{bs} = \frac{2\sqrt{67}}{315}\, A_s = \frac{A_s}{19.242} . \tag{10-40}$$

To obtain the sigma value of Seidel spherical aberration, we write Eq. (10-37) in the form

$$W_s(\rho) = \frac{2 A_s}{315} \left( \sqrt{67}\, S_{11} + 8\sqrt{10}\, S_4 \right) + \text{constant} . \tag{10-41}$$

Utilizing Eq. (10-21), we obtain the sigma value:

$$\sigma_s = \frac{2}{45}\sqrt{\frac{101}{7}}\, A_s = \frac{A_s}{5.923} . \tag{10-42}$$

The sigma values of Seidel aberrations with and without balancing are given in Table 10-6.

Table 10-6. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.

Aberration                      Sigma                                                    P-V # for $\sigma = 1$
Defocus                         $\sigma_d = \sqrt{2/5}\, A_d / 3 = A_d / 4.74$           4.74
Astigmatism                     $\sigma_a = A_a / 3\sqrt{5} = A_a / 6.71$                6.71
Balanced astigmatism            $\sigma_{ba} = A_a / 3\sqrt{10} = A_a / 9.49$            4.74
Coma                            $\sigma_c = \sqrt{3/70}\, A_c = A_c / 4.83$              9.66
Balanced coma                   $\sigma_{bc} = \sqrt{31/21}\, A_c / 15 = A_c / 12.35$    9.31
Spherical aberration            $\sigma_s = 2\sqrt{101/7}\, A_s / 45 = A_s / 5.92$       5.92
Balanced spherical aberration   $\sigma_{bs} = 2\sqrt{67}\, A_s / 315 = A_s / 19.24$     7.37
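The sigma values in Table 10-6 can be reproduced by direct integration over the square pupil. The sketch below is a numerical cross-check under stated assumptions, not the derivation used in the text: each Seidel aberration is written in Cartesian form with a unit coefficient, and the pupil averages are taken with Gauss-Legendre quadrature over the square of half-width $1/\sqrt{2}$. The names `sigma` and `inv_sigma` are illustrative.

```python
import numpy as np

h = 1 / np.sqrt(2)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(12)
x, y = np.meshgrid(h * t, h * t, indexing="ij")
wt = np.outer(w, w) / 4.0                # normalized quadrature weights

def sigma(W):
    """Standard deviation of W across the square pupil."""
    m = np.sum(wt * W)
    return float(np.sqrt(np.sum(wt * W**2) - m**2))

rho2 = x**2 + y**2                       # rho^2; note rho*cos(theta) = x
seidel = {
    "defocus": rho2,
    "astigmatism": x**2,
    "balanced astigmatism": x**2 - rho2 / 2,
    "coma": rho2 * x,
    "balanced coma": rho2 * x - (7 / 15) * x,
    "spherical": rho2**2,
    "balanced spherical": rho2**2 - (16 / 21) * rho2,
}

# For a unit aberration coefficient, sigma = 1/inv_sigma (the denominators in Table 10-6)
inv_sigma = {name: 1 / sigma(W) for name, W in seidel.items()}
for name, value in inv_sigma.items():
    print(f"{name:22s} A / {value:.3f}")
```

The printed denominators can be compared with those in Eqs. (10-24) through (10-42).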


10.7.5 Strehl Ratio
In Figure 10-7, we have shown the Strehl ratio for the square polynomial aberrations
with a sigma value of 0.1 wave. In Figure 10-8, we show how it varies with the sigma
value of a Seidel aberration, with and without balancing, for $0 \le \sigma_W \le 0.25$. Also
plotted, as the dashed curve, is the Strehl ratio obtained from the approximate expression
$\exp(-\sigma_\Phi^2)$. We note that this expression underestimates the Strehl ratio for defocus
and Seidel astigmatism, but overestimates it for Seidel coma and Seidel spherical
aberration. The agreement between the actual and the approximate values is quite good for
the balanced aberrations, except that the approximate expression overestimates in the case
of spherical aberration for $\sigma_W > 0.15$. The aberration coefficient or the P-V aberration
for a certain value of $\sigma_W$ can be obtained from Tables 10-4 and 10-6 for the aberrations
considered here.

Figure 10-8. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.

10.8 SUMMARY
The aberration-free PSF and OTF of a square pupil are discussed in Section 10.3.
The polynomials orthonormal over a unit square pupil, representing balanced aberrations
over such a pupil, are given through the eighth order in Tables 10-1 through 10-3 in
terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates,
respectively. Each orthonormal polynomial consists of either the cosine or the sine terms,
but not both. Thus, an even j polynomial, for example, consists of only the cosine terms,
as may be seen from Table 10-1 or 10-2. This is a consequence of the four-fold symmetry
of the pupil. Since the polynomials are not separable in the polar coordinates $\rho$ and
$\theta$ of a pupil point, the polynomial numbering with two indices n and m loses
significance, and the polynomials must be numbered with a single index j. They are ordered
in the same manner as the polynomials discussed in previous chapters.

Because of the higher symmetry of a square pupil compared to a rectangular pupil,
the form of the polynomial $S_6$ representing balanced astigmatism is the same as that for a
circular pupil. Similarly, as indicated by the polynomial $S_{11}$, spherical aberration
$\rho^4$ is balanced only by defocus $\rho^2$, compared to $R_{11}$ for a rectangular pupil,
which consists of a term in astigmatism $\rho^2 \cos^2\theta$ as well.

The first 45 square polynomials, i.e., up to and including the eighth order, are
illustrated by an isometric plot, an interferogram, and a PSF in Figure 10-6. The
coefficient of each orthonormal polynomial, or the sigma value of the corresponding
aberration, is one wave. Their peak-to-valley numbers for a sigma value of one wave are
given in Table 10-4 in units of wavelength. The Strehl ratio for a sigma value of
$0.1\lambda$ for each aberration is given in Table 10-5 and illustrated in Figure 10-7. It
shows that, for a small aberration, the Strehl ratio can be estimated from the aberration
variance. The sigma values of the Seidel aberrations and their balanced forms are given in
Table 10-6.

References

1. V. N. Mahajan and G.-m Dai, “Orthonormal polynomials in wavefront analysis:
analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc.
Am. A 29, 1673–1674 (2012).

2. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of
Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–
11.41 (McGraw Hill, 2009).

3. M. Bray, “Orthogonal polynomials: A set for square areas,” Proc. SPIE 5252,
314–320 (2004).

4. J. L. Rayces, “Least-squares fitting of orthogonal polynomials to the wave-aberration
function,” Appl. Opt. 31, 2223–2228 (1992).
CHAPTER 11

SYSTEMS WITH SLIT PUPILS

11.1 Introduction
11.2 Aberration-Free Imaging
    11.2.1 PSF
    11.2.2 Image of an Incoherent Slit
11.3 Strehl Ratio and Aberration Balancing
    11.3.1 Strehl Ratio
    11.3.2 Aberration Balancing
11.4 Slit Polynomials
11.5 Standard Deviation of a Primary Aberration
11.6 Summary
References
Chapter 11
Systems with Slit Pupils
11.1 INTRODUCTION
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small. It is used in spectrographs. The power series aberrations of a
rotationally symmetric imaging system with a slit pupil are the 1D analog of the
corresponding aberration terms discussed in Chapter 1. In this chapter, we discuss the
PSF of a slit pupil and the incoherent image of a slit parallel to the slit pupil. The Strehl
ratio and the balanced aberrations for a slit pupil are then discussed. It is shown that the
balanced aberrations are represented by the Legendre polynomials [1,2]. We show further
that, compared to a circular pupil, the slit pupil is more sensitive to a primary aberration
with or without balancing, except for spherical aberration, for which it is slightly less
sensitive.

11.2 ABERRATION-FREE IMAGING


11.2.1 PSF
As illustrated in Figure 11-1, consider a slit pupil, i.e., a rectangular pupil of
half-widths a and b, where $b \ll a$. Thus, the aspect ratio $b/a$ of the pupil is negligibly
small. Its PSF can be obtained from that of a rectangular pupil by letting the aspect ratio
be practically zero in Eq. (9-8). The PSF of a slit pupil may accordingly be written

$$I(x) = \frac{1}{4} \left| \int_{-1}^{1} \exp[i\Phi(x')] \exp(-\pi i x' x)\, dx' \right|^2 , \tag{11-1}$$

where x is in units of $\lambda F$, and $F = R/2a$ is the focal ratio of the beam focusing at a
distance R from the focusing lens. The irradiance distribution is normalized by its central
value $P_{ex} S_{ex} / \lambda^2 R^2$, where $P_{ex}$ is the total power in the pattern, and
$S_{ex} = 4ab$ is the pupil area. For the aberration-free case, we obtain

$$I(x) = \left( \frac{\sin \pi x}{\pi x} \right)^2 . \tag{11-2}$$

Figure 11-1. A slit pupil of half-width a along the x axis, where $b \ll a$.

Figure 11-2. PSF of a slit pupil. (a) Irradiance distribution. (b) 1D PSF $I(x)$ for
$-3 \le x \le 3$.


The PSF is shown in Figure 11-2. Its value is zero wherever x is a positive or a negative
integer.
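As a quick numerical check of Eqs. (11-1) and (11-2), the diffraction integral can be evaluated by quadrature and compared with the closed form. This is a minimal sketch; the function name `slit_psf`, its `phase` argument, and the choice of Gauss-Legendre quadrature are assumptions made for illustration.

```python
import numpy as np

def slit_psf(x, phase=lambda xp: np.zeros_like(xp), n=400):
    """Eq. (11-1): I(x) = (1/4)|int_{-1}^{1} exp[i Phi(x')] exp(-pi i x' x) dx'|^2.

    x is in units of lambda*F; phase(x') is the phase aberration in radians.
    """
    xp, w = np.polynomial.legendre.leggauss(n)       # nodes and weights on [-1, 1]
    kernel = np.exp(-1j * np.pi * np.outer(np.atleast_1d(x), xp))
    field = (np.exp(1j * phase(xp)) * kernel) @ w    # one integral per value of x
    return np.abs(field) ** 2 / 4.0

x = np.linspace(-3, 3, 13)          # includes the integers, where the PSF is zero
numeric = slit_psf(x)
analytic = np.sinc(x) ** 2          # np.sinc(x) = sin(pi x)/(pi x), i.e., Eq. (11-2)

print(np.max(np.abs(numeric - analytic)))
```

Passing a nonzero `phase` function gives the aberrated PSF of Eq. (11-1) with the same normalization.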

11.2.2 Image of an Incoherent Slit


If the point source is replaced by an incoherently illuminated slit object parallel to
the slit pupil, then each point on the source forms a PSF, and the net result for an
incoherent illumination is the sum of their irradiance images. The incoherent image of the
slit object thus obtained is shown in Figure 11-3.

Figure 11-3. Image of an incoherent slit object formed by a system with a slit pupil.

11.3 STREHL RATIO AND ABERRATION BALANCING


11.3.1 Strehl Ratio
From Eq. (11-1), the Strehl ratio, representing the ratio of the central values of the PSF
with and without an aberration, can be written

$$S \equiv I(0) = \frac{1}{4} \left| \int_{-1}^{1} \exp[i\Phi(x')]\, dx' \right|^2 . \tag{11-3}$$

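It can also be written as

$$S = \frac{1}{4} \left| \int_{-1}^{1} \exp\{ i[\Phi(x) - \langle\Phi\rangle] \}\, dx \right|^2$$

$$\phantom{S} = \left| \left\langle \exp\{ i[\Phi(x) - \langle\Phi\rangle] \} \right\rangle \right|^2$$

$$\phantom{S} = \left| \left\langle 1 + i[\Phi(x) - \langle\Phi\rangle] - \frac{1}{2}[\Phi(x) - \langle\Phi\rangle]^2 + \ldots \right\rangle \right|^2$$

$$\phantom{S} \sim 1 - \left( \langle\Phi^2\rangle - \langle\Phi\rangle^2 \right)$$

$$\phantom{S} \equiv 1 - \sigma_\Phi^2 , \tag{11-4}$$

where the angular brackets indicate a mean value across the pupil, $\langle\Phi\rangle$ is the
mean value of the aberration function, $\langle\Phi^2\rangle$ is its mean square value,
$\sigma_\Phi^2$ is its variance, and we have neglected the higher-order terms in the
power-series expansion of the exponential. The mean value of a function $g(x)$ is given by

$$\langle g(x) \rangle = \frac{\int_{-1}^{1} g(x)\, dx}{\int_{-1}^{1} dx} = \frac{1}{2} \int_{-1}^{1} g(x)\, dx . \tag{11-5}$$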
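The approximation in Eq. (11-4) can be compared with the exact Strehl ratio of Eq. (11-3) for a small aberration. The sketch below is illustrative: it assumes a 1D defocus aberration $A_d x^2$ with a hypothetical sigma of 0.05 wave, uses the relation $\sigma_d = 2A_d/3\sqrt{5}$ for a slit pupil (Table 11-2), and evaluates the pupil averages by Gauss-Legendre quadrature.

```python
import numpy as np

xp, w = np.polynomial.legendre.leggauss(60)
w = w / 2.0                                  # weights now give the mean over -1 <= x <= 1

def slit_strehl(phi):
    """Eq. (11-3): S = |<exp(i Phi)>|^2, with the mean taken across the slit."""
    return float(abs(np.sum(w * np.exp(1j * phi))) ** 2)

sigma_w = 0.05                               # hypothetical sigma value, in waves
A_d = sigma_w * 3 * np.sqrt(5) / 2           # defocus coefficient giving that sigma
phi = 2 * np.pi * A_d * xp**2                # phase aberration in radians

sigma_phi2 = float(np.sum(w * phi**2) - np.sum(w * phi) ** 2)   # phase variance
S_exact = slit_strehl(phi)
S_approx = 1 - sigma_phi2                    # Eq. (11-4)

print(S_exact, S_approx)
```

For this small aberration the two values differ only in the third decimal place; the difference grows as the sigma value increases.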

11.3.2 Aberration Balancing


A unit slit pupil along the x axis is illustrated in Figure 11-4. Consider an aberration
such as primary x-coma:

$$W_{cx}(x) = x^3 . \tag{11-6}$$

Its variance across the slit pupil is given by

$$\sigma_{cx}^2 = \langle [W_{cx}(x)]^2 \rangle - \langle W_{cx}(x) \rangle^2 . \tag{11-7}$$

Thus, the standard deviation of the x-coma aberration is given by $\sigma_{cx} = 1/\sqrt{7}$.

Figure 11-4. Unit slit pupil along the x axis, extending from $x = -1$ to $x = 1$, inscribed
inside a unit circle.

The variance can be reduced by mixing it with a certain amount b of x-tilt. Thus, the
balanced aberration may be written in the form

$$W_{bcx}(x) = x^3 + bx . \tag{11-8}$$

Its variance is given by

$$\sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3} . \tag{11-9}$$

The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of
1/7 without any tilt. Thus, the variance is reduced by a factor of 25/4, or the standard
deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding
balanced aberration is given by

$$W_{bcx}(x) = x^3 - (3/5)\,x . \tag{11-10}$$

A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for
a given Strehl ratio.

Similarly, the variance of the x-spherical aberration $x^4$ can be minimized by
combining it with x-defocus. Thus, consider the balanced aberration

$$W_{bsx}(x) = x^4 + bx^2 . \tag{11-11}$$

Its variance is given by

$$\sigma_{bsx}^2 = \frac{16}{225} + \frac{16b}{105} + \frac{4b^2}{45} . \tag{11-12}$$

Its sigma value is minimum and equal to 8/105 for $b = -6/7$ compared to a value of
4/15 with no defocus. The balanced aberration is given by

$$W_{bsx}(x) = x^4 - (6/7)\,x^2 . \tag{11-13}$$

It should be evident that there is no distinction between defocus and astigmatism, since
they both vary as $x^2$.
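The balancing of coma with tilt and of spherical aberration with defocus can be reproduced with exact rational arithmetic. The following sketch is only an illustration of the minimization described above: it uses the moments $\langle x^k \rangle = 1/(k+1)$ for even k (zero for odd k) from Eq. (11-5), and the helper names `moment`, `cov`, and `balance` are hypothetical.

```python
from fractions import Fraction

def moment(k):
    """Mean of x**k over the unit slit -1 <= x <= 1, from Eq. (11-5)."""
    return Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)

def cov(p, q):
    """<x^p x^q> - <x^p><x^q> across the slit."""
    return moment(p + q) - moment(p) * moment(q)

def balance(n, m):
    """Minimize the variance of x**n + b*x**m over b; return (b, minimum variance)."""
    b = -cov(n, m) / cov(m, m)           # zero of the derivative of the quadratic in b
    var = cov(n, n) + 2 * b * cov(n, m) + b * b * cov(m, m)
    return b, var

b_coma, var_coma = balance(3, 1)         # x-coma balanced with x-tilt
b_sph, var_sph = balance(4, 2)           # x-spherical aberration balanced with x-defocus

print(b_coma, var_coma, cov(3, 3))
print(b_sph, var_sph, cov(4, 4))
```

The same `balance` call can be applied to the higher-order terms, although optimal balancing of, say, $x^5$ requires more than one lower-degree term, which this two-term sketch does not handle.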

The process of minimizing the variance in this manner is called aberration balancing.
The variance of the higher-order classical aberrations, e.g., secondary coma x 5 ,
secondary spherical aberration x 6 , tertiary coma x 7 , and tertiary spherical aberration x 8 ,
can also be minimized by combining them with lower-degree aberrations.

11.4 SLIT POLYNOMIALS


By letting $c \to 1$ in the rectangular pupil discussed in Chapter 9, we obtain a unit slit
pupil inscribed inside a unit circle that is parallel to the x axis, as illustrated in
Figure 11-4. The corresponding orthonormal polynomials representing balanced aberrations for
such pupils can be obtained from the rectangular polynomials $R_j(x, y)$ given in Table 9-3
by letting $y \to 0$ and $c \to 1$. Half of the rectangular polynomials thus reduce to zero.
Some of the other polynomials are redundant. For example, the 1D defocus and
astigmatism cannot be distinguished from each other. The slit polynomials are the
Legendre polynomials. Since the pupil is 1D along the x axis, the aberrations vary with x
only.

The Legendre polynomials $P_n(x)$ are orthogonal over the interval $[-1, 1]$, according
to [3]

$$\frac{1}{2} \int_{-1}^{1} P_n(x) P_{n'}(x)\, dx = \frac{1}{2n + 1}\, \delta_{nn'} , \tag{11-14}$$

where n is a nonnegative integer. A polynomial with an even (odd) value of n
consists of terms with even (odd) powers of x. Thus, a polynomial is symmetric for an
even n and antisymmetric for an odd n, according to

$$P_n(-x) = (-1)^n P_n(x) . \tag{11-15}$$

Moreover,

$$P_n(1) = 1 , \tag{11-16}$$

$$P_n(-1) = \begin{cases} 1 & \text{for even } n \\ -1 & \text{for odd } n , \end{cases} \tag{11-17}$$

$$P_n(0) = 0 \quad \text{for odd } n , \tag{11-18}$$

and, for even n, $P_n(0)$ is positive or negative depending on whether n/2 is even or odd.


Starting with $P_0(x) = 1$ and $P_1(x) = x$, the polynomials can be obtained recursively from
the relation

$$(n + 1) P_{n+1}(x) = (2n + 1)\, x P_n(x) - n P_{n-1}(x) . \tag{11-19}$$

It is evident from Eq. (11-19) that $P_n(x)$ is a polynomial of degree n in x, i.e., the highest
power of x in a polynomial $P_n(x)$ is n. It is perhaps worth noting that a Zernike radial
polynomial $R_{2n}^0(\rho)$ is the same as a shifted Legendre polynomial
$\tilde{P}_n(\rho^2) = P_n(2\rho^2 - 1)$, both of which are orthogonal over the interval
[0, 1] [see Eq. (4-41)].
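The recursion of Eq. (11-19) is straightforward to implement. The sketch below, given only as an illustration, builds the coefficients of $P_0$ through $P_8$ in exact rational arithmetic and evaluates the inner product of Eq. (11-14) term by term; the function names `legendre` and `inner` are assumptions of this example.

```python
from fractions import Fraction

def legendre(n_max):
    """Coefficient lists (lowest power first) of P_0 ... P_n_max from Eq. (11-19)."""
    P = [[Fraction(1)], [Fraction(0), Fraction(1)]]      # P_0 = 1, P_1 = x
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + P[n]                      # coefficients of x * P_n
        prev = P[n - 1] + [Fraction(0)] * 2              # P_{n-1}, padded to the same length
        P.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)])
    return P[: n_max + 1]

def inner(p, q):
    """(1/2) * integral of p(x) q(x) over [-1, 1], computed exactly."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                         # odd powers integrate to zero
                total += a * b / (i + j + 1)
    return total

P = legendre(8)
print(P[4])                      # coefficients of (35x^4 - 30x^2 + 3)/8
print(inner(P[3], P[3]))         # 1/(2n + 1) with n = 3
print(inner(P[2], P[4]))         # orthogonality of P_2 and P_4
```

Multiplying $P_n$ by $\sqrt{2n+1}$ makes the diagonal inner products equal to unity.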

The variation of a polynomial $P_n(x)$ for $-1 \le x \le 1$ is shown in Figure 11-5. For
clarity, the even polynomials are plotted in Figure 11-5a and the odd in Figure 11-5b. It is
evident, as expressed by Eqs. (11-15)–(11-18), that an odd polynomial starts at –1 for
$x = -1$ and ends with 1 for $x = 1$. However, the even polynomials start and end at unity.
The number of peaks and valleys in a polynomial $P_n(x)$ is $n - 1$.

We use the Legendre polynomials in their orthonormal form $L_n(x)$ given by

$$L_n(x) = \sqrt{2n + 1}\, P_n(x) . \tag{11-20}$$

Their orthonormality is expressed by

$$\frac{1}{2} \int_{-1}^{1} L_n(x) L_{n'}(x)\, dx = \delta_{nn'} . \tag{11-21}$$

The first few $L_n(x)$ polynomials are listed in Table 11-1. The standard deviation of
each polynomial [other than $L_0(x)$] is unity. The mean value of each polynomial [other
than $L_0(x)$] is zero, as may be seen by letting $n' = 0$ in Eq. (11-21). It is easy to see
this explicitly for a polynomial with an odd value of n, since the integral of an odd
function over symmetric limits is zero. For an even value of n, the piston term in the
polynomial makes its mean value zero. For example, the balanced x-spherical aberration is
$x^4 - (6/7)x^2$ with a mean value of $-3/35$. The piston term of 3(3/8) in $L_4(x)$ makes its
mean value zero. The slit pupil is more sensitive to a Seidel aberration with or without
balancing compared to a circular pupil, except for spherical aberration for which it is
slightly less sensitive.

11.5 STANDARD DEVIATION OF A PRIMARY ABERRATION


The standard deviation of a 1D primary aberration for a slit pupil can be obtained
from the orthonormal polynomials by writing it as a sum of these polynomials. Of course,
those for coma and spherical aberration were obtained directly in Section 11.3.2; all of
them are listed in Table 11-2. Comparing them with the sigma value of a corresponding 2D
aberration for a circular pupil (see Tables 4-1 and 4-2), we find that a slit pupil is more
sensitive to a primary aberration with or without balancing, except for spherical
aberration, for which it is slightly less sensitive.
Figure 11-5. Legendre polynomials $P_n(x)$ as a function of x. (a) Even n and (b) odd n.

Table 11-1. Legendre polynomials $L_n(x) = \sqrt{2n + 1}\, P_n(x)$ for a unit slit pupil
orthonormal over the interval $-1 \le x \le 1$.

n   Aberration                       $L_n(x)$
0   Piston                           $1$
1   Tilt                             $\sqrt{3}\, x$
2   Defocus                          $(\sqrt{5}/2)(3x^2 - 1)$
3   Primary coma                     $(\sqrt{7}/2)(5x^3 - 3x)$
4   Primary spherical aberration     $(3/8)(35x^4 - 30x^2 + 3)$
5   Secondary coma                   $(\sqrt{11}/8)(63x^5 - 70x^3 + 15x)$
6   Secondary spherical aberration   $(\sqrt{13}/16)(231x^6 - 315x^4 + 105x^2 - 5)$
7   Tertiary coma                    $(\sqrt{15}/16)(429x^7 - 693x^5 + 315x^3 - 35x)$
8   Tertiary spherical aberration    $(\sqrt{17}/128)(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$

Table 11-2. Standard deviation $\sigma$ of a primary aberration for a slit pupil, where $A_i$
is its aberration coefficient.

Aberration                      $\sigma$
Tilt                            $A_t / \sqrt{3} = A_t / 1.732$
Defocus (or astigmatism)        $2A_d / 3\sqrt{5} = A_d / 3.354$
Coma                            $A_c / \sqrt{7} = A_c / 2.646$
Balanced coma                   $2A_c / 5\sqrt{7} = A_c / 6.614$
Spherical aberration            $4A_s / 15 = A_s / 3.750$
Balanced spherical aberration   $8A_s / 105 = A_s / 13.125$


11.6 SUMMARY
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small, as illustrated in Figure 11-1. Its PSF is shown in Figure 11-2. The image
of an incoherent slit object parallel to the slit pupil is shown in Figure 11-3. The balanced
aberrations for a slit pupil are the Legendre polynomials. We have written them in an
orthonormal form, as in Eq. (11-20). They are listed in Table 11-1 up to the eighth order
and plotted in Figure 11-5. The sigma value of a 1D primary aberration with and without
balancing is listed in Table 11-2. It is shown that a slit pupil is more sensitive to a
primary aberration with or without balancing, except for spherical aberration for which it
is slightly less sensitive.

References

1. V. N. Mahajan and G.-m Dai, “Orthonormal polynomials in wavefront analysis:
analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007).

2. R. Barakat and L. Riseberg, “Diffraction theory of the aberrations of a slit
aperture,” J. Opt. Soc. Am. 55, 878–881 (1965). There is an error in their
polynomial $S_2$, which should read as $x^2 - 1/3$.

3. G. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers
(McGraw-Hill, New York, 1968).
CHAPTER 12

USE OF ZERNIKE CIRCLE POLYNOMIALS FOR


NONCIRCULAR PUPILS

12.1 Introduction
12.2 Relationship Between the Orthonormal and the Corresponding Zernike Circle Coefficients
12.3 Use of Zernike Circle Polynomials for the Analysis of an Annular Wavefront
    12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients
    12.3.2 Interferometer Setting Errors
    12.3.3 Wavefront Fitting
    12.3.4 Application to an Annular Seidel Aberration Function
        12.3.4.1 Annular Coefficients
        12.3.4.2 Circle Coefficients
        12.3.4.3 Residual Aberration Function after Removing Interferometer Setting Errors
        12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus
        12.3.4.5 Numerical Example
12.4 Use of Zernike Circle Polynomials for the Analysis of a Hexagonal Wavefront
    12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients
    12.4.2 Interferometer Setting Errors
    12.4.3 Numerical Example
12.5 Aberration Coefficients from Discrete Wavefront Data
12.6 Summary
References
Chapter 12
Use of Zernike Circle Polynomials for
Noncircular Pupils
12.1 INTRODUCTION
The orthonormal polynomials for various pupils discussed in the preceding chapters
represent balanced aberrations for those pupils, just as the Zernike circle polynomials
(discussed in Chapter 4) do for a circular pupil. In this chapter, we consider the use of
circle polynomials for the analysis of a noncircular wavefront. Since the circle
polynomials form a complete set, any wavefront, regardless of the shape of the pupil
(which defines the perimeter of the wavefront), can be expanded in terms of them.
Moreover, since each orthonormal polynomial is a linear combination of the circle
polynomials [see Eq. (3-18)], the wavefront fitting with the former set of polynomials is
as good as that with the latter. However, we illustrate the pitfalls of using circle
polynomials for a noncircular pupil by considering an annular and a hexagonal pupil
[1,2].

It is shown that, unlike the orthonormal coefficients, the circle coefficients generally
change as the number of polynomials used in the expansion changes. Although the
wavefront fit with a certain number of circle polynomials is the same as that with the
corresponding orthonormal polynomials, the piston circle coefficient does not represent
the mean value of the aberration function, and the sum of the squares of the other
coefficients does not yield its variance. The interferometer setting errors of tip, tilt,
and defocus obtained from a 4-circle-polynomial expansion are the same as those from the
orthonormal polynomial expansion. However, if these errors are obtained from, say, an
11-circle-polynomial expansion and removed from the aberration function, then polishing to
zero out the residual aberration function yields the wrong result. If the common practice
is followed of defining the center of an interferogram, drawing a circle around it, and
determining the circle coefficients in the same manner as for a circular interferogram,
then the circle coefficients of a noncircular interferogram do not yield a correct
representation of the aberration function. Moreover, in this case, some of the
higher-order coefficients of aberrations that are nonexistent in the aberration function
are also nonzero. Finally, the circle coefficients, however obtained, do not represent
coefficients of the balanced aberrations for a noncircular pupil. Such results are
illustrated analytically and numerically by considering annular and hexagonal Seidel
aberration functions as examples.

12.2 RELATIONSHIP BETWEEN THE ORTHONORMAL AND THE


CORRESPONDING ZERNIKE CIRCLE COEFFICIENTS

Consider an aberration function $W(x,y)$ across a noncircular pupil fit with $J$
orthonormal polynomials $F_j(x,y)$ in the form

$$\hat{W}(x,y) = \sum_{j=1}^{J} a_j F_j(x,y) , \tag{12-1}$$

where $\hat{W}(x,y)$ is the best-fit estimate of the function with $J$ polynomials, and $a_j$ is the
coefficient of the polynomial $F_j(x,y)$. The orthonormality of the polynomials across the
noncircular pupil is described by

$$\frac{1}{A}\int_{\rm pupil} F_j(x,y)\, F_{j'}(x,y)\, dx\, dy = \delta_{jj'} , \tag{12-2}$$

where $\delta_{jj'}$ is a Kronecker delta. The orthonormal coefficients are given by

$$a_j = \frac{1}{A}\int_{\rm pupil} W(x,y)\, F_j(x,y)\, dx\, dy . \tag{12-3}$$

It is evident that their value does not depend on the number of polynomials J used in the
expansion.

Letting $F_1(x,y) = 1$, it is easy to see from Eq. (12-2) that the mean value of a
polynomial $F_{j \neq 1}(x,y)$ across the pupil is zero. Hence, the mean and the mean square
values of the estimated aberration function are given by

$$\langle \hat{W}(x,y) \rangle = a_1 \tag{12-4}$$

and

$$\langle \hat{W}^2(x,y) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{12-5}$$

respectively. Its variance is accordingly given by

$$\sigma_{\hat{W}}^2 = \langle \hat{W}^2(x,y) \rangle - \langle \hat{W}(x,y) \rangle^2 = \sum_{j=2}^{J} a_j^2 , \tag{12-6}$$

where $\sigma_{\hat{W}}$ is its standard deviation. The number of polynomials $J$ used in the expansion
to estimate the aberration function is increased until $\sigma_{\hat{W}}$ approaches the true value as
determined from the ray-trace or interferometric data within a certain prespecified
tolerance.

Since the circle polynomials $Z_j(x,y)$ form a complete set, each orthonormal
polynomial can be written in terms of them as a linear sum in the form [see Eq. (3-18)]

$$F_j(x,y) = \sum_{i=1}^{J} M_{ji} Z_i(x,y) , \tag{12-7}$$

or

$$\{F_j\} = M \{Z_j\} , \tag{12-8}$$

where $M_{ji}$ are the elements of the lower triangular conversion matrix $M$. The estimated
aberration function can accordingly be expanded in terms of the circle polynomials in the
form

$$\hat{W}(x,y) = \sum_{j=1}^{J} \hat{b}_j Z_j(x,y) , \tag{12-9}$$

where $\hat{b}_j$ is the Zernike coefficient of a polynomial $Z_j(x,y)$. The circle polynomials are
orthonormal over a unit circle in Cartesian coordinates according to

$$\frac{1}{\pi}\int_{x^2+y^2 \le 1} Z_j(x,y)\, Z_{j'}(x,y)\, dx\, dy = \delta_{jj'} , \tag{12-10a}$$

or in polar coordinates (with $x = \rho\cos\theta$ and $y = \rho\sin\theta$)

$$\frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} Z_j(\rho,\theta)\, Z_{j'}(\rho,\theta)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{12-10b}$$

Substituting Eq. (12-7) into Eq. (12-1), we obtain

$$\hat{W}(x,y) = \sum_{j=1}^{J} a_j \sum_{i=1}^{j} M_{ji} Z_i(x,y) = \sum_{j=1}^{J} \sum_{i=j}^{J} a_i M_{ij} Z_j(x,y) . \tag{12-11}$$

Comparing Eqs. (12-9) and (12-11), we obtain

$$\hat{b}_j = \sum_{i=j}^{J} a_i M_{ij} . \tag{12-12}$$

It is clear that the value of a circle coefficient $\hat{b}_j$ depends on the number of polynomials $J$
used in the expansion. Moreover, it is a linear combination of the orthonormal
coefficients, just as an orthonormal polynomial is a linear combination of the circle
polynomials. Equation (12-12) can be written in a matrix form as

$$\hat{b} = M^T a , \tag{12-13}$$

where $a$ and $\hat{b}$ are the column vectors representing the orthonormal and the Zernike
coefficients, respectively, and $M^T$ is the transpose of the conversion matrix $M$. Thus, the
matrix that is used to obtain the orthonormal polynomials from the circle polynomials is
also used to obtain the circle coefficients from the orthonormal coefficients. The
transpose of a matrix is obtained by interchanging its rows and columns. Since $M$ is a
lower triangular matrix, $M^T$ is an upper triangular matrix. Multiplying both sides of Eq.
(12-13) by the inverse $(M^T)^{-1}$ of $M^T$, we obtain

$$a = (M^T)^{-1} \hat{b} . \tag{12-14}$$

Accordingly, if the circle coefficients are known, the orthonormal coefficients can be
obtained from them.

If the orthonormal coefficients are not known, the circle coefficients $\hat{b}_j$ can be
obtained by a least squares fit. Suppose the aberration values are known over a certain
domain by way of interferometry at $N$ data points. Equation (12-9) can be written in
matrix form

$$\hat{S} = Z \hat{b} , \tag{12-15}$$

where $\hat{S}$ is an array of $N$ elements representing the values of the aberration function
$\hat{W}(x,y)$, and $Z$ is an $N \times J$ matrix representing each of the $J$ polynomials over the $N$
data points. Solving Eq. (12-15), for example, with a standard singular-value
decomposition algorithm yields

$$\hat{b} = Z^{-1} \hat{S} , \tag{12-16}$$

where $Z^{-1}$ is a generalized inverse of the $Z$ matrix. Of course, this procedure can also be
used to determine the orthonormal coefficients by replacing the circle polynomials with
the orthonormal polynomials. Except for any numerical error because of the finite
number $N$ of the data points, the $\hat{b}$-coefficients given by Eq. (12-16) are the same as
those given by Eq. (12-13).
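As a minimal sketch of the least-squares step in Eqs. (12-15) and (12-16), the following NumPy example fits the first four circle polynomials to synthetic data sampled over an annulus. The obscuration ratio, the grid size, the coefficient values `b_true`, and the helper name `circle_polys_4` are assumptions made purely for illustration; `numpy.linalg.lstsq`, an SVD-based solver, stands in here for the generalized inverse $Z^{-1}$.

```python
import numpy as np

def circle_polys_4(x, y):
    """Columns: Z1 (piston), Z2 (x tilt), Z3 (y tilt), Z4 (defocus)."""
    r2 = x**2 + y**2
    return np.column_stack([np.ones_like(x), 2 * x, 2 * y,
                            np.sqrt(3) * (2 * r2 - 1)])

# Hypothetical data points: a square grid masked to an annulus.
eps = 0.5                                   # assumed obscuration ratio
g = np.linspace(-1.0, 1.0, 101)
X, Y = np.meshgrid(g, g)
R2 = X**2 + Y**2
mask = (R2 <= 1.0) & (R2 >= eps**2)
x, y = X[mask], Y[mask]

# Synthetic aberration values S at the N data points (assumed coefficients).
b_true = np.array([0.3, -0.2, 0.1, 0.5])
Z = circle_polys_4(x, y)                    # N x J matrix of Eq. (12-15)
S = Z @ b_true

# Least-squares solution of S = Z b, in the spirit of Eq. (12-16).
b_hat, *_ = np.linalg.lstsq(Z, S, rcond=None)
print(b_hat)
```

Because the synthetic data lie exactly in the span of the four polynomials, the fit recovers the assumed coefficients to rounding error even though the data points cover only a noncircular domain.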

If the practice of drawing a unit circle around an interferogram and determining the
Zernike coefficients for a circular pupil is extended to a noncircular wavefront, the
coefficients thus obtained will be given by

$$b_j = \frac{1}{A}\int_{\rm pupil} W(x,y)\, Z_j(x,y)\, dx\, dy . \tag{12-17}$$

The circle polynomials in Eq. (12-17) are implicitly assumed to be orthonormal over the
noncircular pupil. The value of a circle coefficient $b_j$ does not depend on the number of
polynomials used in the expansion. Substituting Eq. (12-1) for the estimated aberration
function $\hat{W}(x,y)$ in terms of the orthonormal polynomials, we obtain

$$b_j = \sum_{j'=1}^{J} a_{j'} \frac{1}{A}\int_{\rm pupil} Z_j(x,y)\, F_{j'}(x,y)\, dx\, dy = \sum_{j'=1}^{J} a_{j'} \langle Z_j | F_{j'} \rangle , \tag{12-18}$$

or in a matrix form

$$b = C^{ZF} a , \tag{12-19}$$

where $C^{ZF}$ is a matrix representing the inner products $\langle Z_j | F_{j'} \rangle$ of the Zernike
polynomials with the orthonormal polynomials over the domain of the noncircular
wavefront. As illustrated in Sections 12.3 and 12.4, by considering an annular or a
hexagonal Seidel aberration function, respectively, the circle coefficients $b_j$ thus
obtained are incorrect in the sense that they do not yield a least-squares fit of the
aberration function $W(x,y)$, unlike the coefficients $\hat{b}_j$. This, of course, is due to the
incorrect assumption of orthonormality of the circle polynomials over the noncircular
pupil.
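To make the distinction between the $b_j$ of Eq. (12-17) and the least-squares $\hat{b}_j$ concrete, here is a small numerical sketch. It assumes an annular pupil of obscuration ratio 0.5 and an aberration that is exactly one unit of Zernike defocus, so the least-squares coefficients are $\hat{b} = (0, 0, 0, 1)$ by construction. The quadrature scheme (midpoint rule in $\rho$, uniform nodes in $\theta$) and the helper `pupil_mean` are illustrative choices, not part of the text.

```python
import numpy as np

eps = 0.5                                   # assumed obscuration ratio
n_r, n_t = 400, 64                          # assumed quadrature resolution
rho = eps + (np.arange(n_r) + 0.5) * (1 - eps) / n_r   # midpoint nodes in rho
theta = 2 * np.pi * np.arange(n_t) / n_t                # uniform nodes in theta
R, T = np.meshgrid(rho, theta, indexing="ij")
w = R / R.sum()                             # area weights (proportional to rho), summing to 1

def pupil_mean(f):
    """Approximate (1/A) times the integral of f over the annulus."""
    return float(np.sum(w * f))

# First four circle polynomials evaluated on the annulus.
Z = [np.ones_like(R),
     2 * R * np.cos(T),
     2 * R * np.sin(T),
     np.sqrt(3) * (2 * R**2 - 1)]

W = Z[3]                                    # pure Zernike defocus: b_hat = (0, 0, 0, 1)

# Coefficients from Eq. (12-17), which treats the Z_j as if orthonormal here.
b = np.array([pupil_mean(W * z) for z in Z])
print(b)
```

Under these assumptions the inner-product coefficients come out close to $b_1 = \sqrt{3}\,\epsilon^2$ and $b_4 = 1 - 2\epsilon^2 + 4\epsilon^4$ rather than 0 and 1, so a spurious piston appears and the defocus is underestimated.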

To relate the $\hat{b}$- and the $b$-circle coefficients, we equate the right-hand sides of Eqs.
(12-1) and (12-9), multiply both sides by $Z_{j'}$, and integrate over the domain of the
noncircular pupil. Thus,

$$\sum_{j=1}^{J} \hat{b}_j Z_j(x,y) = \sum_{j=1}^{J} a_j F_j(x,y) \tag{12-20}$$

and

$$\sum_{j=1}^{J} \hat{b}_j \langle Z_{j'} | Z_j \rangle = \sum_{j=1}^{J} a_j \langle Z_{j'} | F_j \rangle , \tag{12-21}$$

or

$$C^{ZZ} \hat{b} = C^{ZF} a = b , \tag{12-22}$$

where we have utilized Eq. (12-19). From Eqs. (12-13) and (12-22), it is evident that

$$C^{ZF} = C^{ZZ} M^T . \tag{12-23}$$

Typical elements of the matrices $C^{ZZ}$ and $C^{ZF}$ are given by

$$c_{jj'} = \frac{1}{A}\int_{\rm pupil} Z_j(x,y)\, Z_{j'}(x,y)\, dx\, dy \tag{12-24}$$

and

$$d_{jj'} = \frac{1}{A}\int_{\rm pupil} Z_j(x,y)\, F_{j'}(x,y)\, dx\, dy , \tag{12-25}$$

respectively. It is evident from Eq. (12-24) that $c_{jj'} = c_{j'j}$.

12.3 USE OF ZERNIKE CIRCLE POLYNOMIALS FOR THE ANALYSIS OF AN ANNULAR WAVEFRONT

12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients


Consider a system with a unit annular pupil with an obscuration ratio $\epsilon$, as illustrated
in Figure 5-1. The polynomials $A_j(\rho,\theta;\epsilon)$ that are orthonormal across it and represent
balanced aberrations for it are similar to the circle polynomials in that they are separable
in the radial coordinate $\rho$ and the azimuthal angle $\theta$ of a point on the pupil. The
dependence on the obscuration ratio $\epsilon$ is contained only in the radial portion of the
polynomial. As discussed in Chapter 5, the annular polynomials are given by

$$A_{{\rm even}\, j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\, R_n^m(\rho;\epsilon) \cos m\theta , \quad m \neq 0 , \tag{12-26a}$$

$$A_{{\rm odd}\, j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\, R_n^m(\rho;\epsilon) \sin m\theta , \quad m \neq 0 , \tag{12-26b}$$

$$A_j(\rho,\theta;\epsilon) = \sqrt{n+1}\, R_n^0(\rho;\epsilon) , \quad m = 0 , \tag{12-26c}$$

where $\epsilon \le \rho \le 1$, $n$ and $m$ are nonnegative integers, and $n - m \ge 0$ and even. The annular
polynomials are orthonormal across the annular pupil according to

$$\frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi} A_j(\rho,\theta;\epsilon)\, A_{j'}(\rho,\theta;\epsilon)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{12-27}$$

As $\epsilon \to 0$, an annular polynomial reduces to a corresponding circle polynomial. The
annular polynomials can be written in terms of the Zernike circle polynomials $Z_j(\rho,\theta)$,
as discussed in Chapter 4, according to

$$\{A_j\} = M \{Z_j\} , \tag{12-28}$$

where $M$ is the conversion matrix.

An annular aberration function $W(\rho,\theta;\epsilon)$ can be estimated by $J$ orthonormal
polynomials according to

$$\hat{W}(\rho,\theta;\epsilon) = \sum_{j=1}^{J} a_j A_j(\rho,\theta;\epsilon) , \tag{12-29}$$

where the orthonormal annular expansion coefficients are given by

$$a_j = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi} W(\rho,\theta;\epsilon)\, A_j(\rho,\theta;\epsilon)\, \rho\, d\rho\, d\theta . \tag{12-30}$$

The mean value and the variance of the estimated function are accordingly given by Eqs.
(12-4) and (12-6).

Table 12-1 lists the first 11 annular polynomials, as obtained from the annular-polynomial
Tables 5-3 and 5-4. They are given in terms of the circle polynomials in
Table 12-2. The nonzero elements of an $11 \times 11$ conversion matrix, as obtained from Table
12-2, are listed in Table 12-3. The transpose matrix $M^T$ can be obtained easily by
interchanging the rows and columns of $M$. The nonzero elements of the $11 \times 11$ matrices
$C^{ZZ}$ and $C^{ZF}$ are given in Tables 12-4 and 12-5, respectively.
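The consistency of these tables can be checked numerically. The sketch below, which assumes an obscuration ratio of 0.5 and uses only the leading $4 \times 4$ blocks of $M$ (Table 12-3) and $C^{ZZ}$ (Table 12-4), verifies that the annular polynomials built from $M$ are orthonormal over the annulus, i.e., $M C^{ZZ} M^T = I$, and forms $C^{ZF}$ from Eq. (12-23). The variable names are illustrative.

```python
import numpy as np

eps = 0.5                      # assumed obscuration ratio
e2 = eps**2

# Leading 4x4 block of the conversion matrix M (Table 12-3).
M = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, (1 + e2) ** -0.5, 0.0, 0.0],
    [0.0, 0.0, (1 + e2) ** -0.5, 0.0],
    [-np.sqrt(3) * e2 / (1 - e2), 0.0, 0.0, 1 / (1 - e2)],
])

# Leading 4x4 block of C^ZZ (Table 12-4): inner products of the circle
# polynomials over the annulus.
C_zz = np.array([
    [1.0, 0.0, 0.0, np.sqrt(3) * e2],
    [0.0, 1 + e2, 0.0, 0.0],
    [0.0, 0.0, 1 + e2, 0.0],
    [np.sqrt(3) * e2, 0.0, 0.0, 1 - 2 * e2 + 4 * e2**2],
])

gram_annular = M @ C_zz @ M.T  # inner products of A_1..A_4 over the annulus
C_zf = C_zz @ M.T              # Eq. (12-23)

print(np.round(gram_annular, 12))
print(np.round(C_zf, 6))
```

The first four annular polynomials involve only the first four circle polynomials, which is why the $4 \times 4$ blocks are self-contained for this check.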

Given a certain annular aberration function, its annular coefficients $a_j$ can be
determined from Eq. (12-30). If it is expanded in terms of only the first four circle
polynomials, i.e., if $J = 4$ in Eq. (12-9), then the expansion $\hat{b}$-coefficients according to
Eq. (12-13) are given by

$$\begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & -\sqrt{3}\epsilon^2 (1-\epsilon^2)^{-1} \\
0 & (1+\epsilon^2)^{-1/2} & 0 & 0 \\
0 & 0 & (1+\epsilon^2)^{-1/2} & 0 \\
0 & 0 & 0 & (1-\epsilon^2)^{-1}
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
= \begin{pmatrix}
a_1 - \sqrt{3}\epsilon^2 (1-\epsilon^2)^{-1} a_4 \\
(1+\epsilon^2)^{-1/2} a_2 \\
(1+\epsilon^2)^{-1/2} a_3 \\
(1-\epsilon^2)^{-1} a_4
\end{pmatrix} \tag{12-31}$$

or

$$\hat{b}_1 = a_1 - \sqrt{3}\epsilon^2 (1-\epsilon^2)^{-1} a_4 , \tag{12-32a}$$

$$\hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 , \tag{12-32b}$$

$$\hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 , \tag{12-32c}$$

$$\hat{b}_4 = (1-\epsilon^2)^{-1} a_4 . \tag{12-32d}$$

These coefficients represent the Zernike piston, tip, tilt, and defocus coefficients.

To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using the first 11 circle polynomials. The
coefficients are now given by

$$\hat{b}_1 = a_1 - \sqrt{3}\epsilon^2 (1-\epsilon^2)^{-1} a_4 + \sqrt{5}\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2} a_{11} , \tag{12-33a}$$

$$\hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 - (2\sqrt{2}\epsilon^4 / B)\, a_8 , \tag{12-33b}$$

$$\hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 - (2\sqrt{2}\epsilon^4 / B)\, a_7 , \tag{12-33c}$$

$$\hat{b}_4 = (1-\epsilon^2)^{-1} a_4 - \sqrt{15}\epsilon^2 (1-\epsilon^2)^{-2} a_{11} , \tag{12-33d}$$

$$\hat{b}_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_5 , \tag{12-33e}$$

$$\hat{b}_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_6 , \tag{12-33f}$$

$$\hat{b}_7 = [(1+\epsilon^2)/B]\, a_7 , \tag{12-33g}$$

$$\hat{b}_8 = [(1+\epsilon^2)/B]\, a_8 , \tag{12-33h}$$

$$\hat{b}_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_9 , \tag{12-33i}$$
Table 12-1. Orthonormal annular polynomials $A_j(\rho,\theta;\epsilon)$.

j  | n | m | $A_j(\rho,\theta;\epsilon)$ | Aberration Name
1  | 0 | 0 | $1$ | Piston
2  | 1 | 1 | $2\,[\rho/(1+\epsilon^2)^{1/2}]\cos\theta$ | x tilt
3  | 1 | 1 | $2\,[\rho/(1+\epsilon^2)^{1/2}]\sin\theta$ | y tilt
4  | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$ | Defocus
5  | 2 | 2 | $\sqrt{6}\,[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}]\sin 2\theta$ | 45° Primary astigmatism
6  | 2 | 2 | $\sqrt{6}\,[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}]\cos 2\theta$ | 0° Primary astigmatism
7  | 3 | 1 | $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}}\sin\theta$ | Primary y coma
8  | 3 | 1 | $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}}\cos\theta$ | Primary x coma
9  | 3 | 3 | $\sqrt{8}\,[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}]\sin 3\theta$ |
10 | 3 | 3 | $\sqrt{8}\,[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}]\cos 3\theta$ |
11 | 4 | 0 | $\sqrt{5}\,[6\rho^4 - 6(1+\epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4]/(1-\epsilon^2)^2$ | Primary spherical aberration

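Table 12-2. Annular polynomials $A_j(\rho,\theta;\epsilon)$ in terms of the Zernike circle
polynomials $Z_j(\rho,\theta)$, where $\epsilon$ is the obscuration ratio of the annular pupil.

$A_1 = Z_1$
$A_2 = (1+\epsilon^2)^{-1/2} Z_2$
$A_3 = (1+\epsilon^2)^{-1/2} Z_3$
$A_4 = (1-\epsilon^2)^{-1} (-\sqrt{3}\epsilon^2 Z_1 + Z_4)$
$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_5$
$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_6$
$A_7 = B^{-1} [-2\sqrt{2}\epsilon^4 Z_3 + (1+\epsilon^2) Z_7]$
$A_8 = B^{-1} [-2\sqrt{2}\epsilon^4 Z_2 + (1+\epsilon^2) Z_8]$
$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_9$
$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_{10}$
$A_{11} = (1-\epsilon^2)^{-2} [\sqrt{5}\epsilon^2 (1+\epsilon^2) Z_1 - \sqrt{15}\epsilon^2 Z_4 + Z_{11}]$

$B = (1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}$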

Table 12-3. Nonzero elements of an $11 \times 11$ conversion matrix $M$ for obtaining the
annular polynomials $A_j(\rho,\theta;\epsilon)$ from the Zernike circle polynomials $Z_j(\rho,\theta)$.

$M_{11} = 1$
$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\epsilon^2 (1-\epsilon^2)^{-1}$
$M_{44} = (1-\epsilon^2)^{-1}$
$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\epsilon^4 B^{-1} = M_{82}$
$M_{77} = (1+\epsilon^2) B^{-1} = M_{88}$
$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\epsilon^2 (1-\epsilon^2)^{-2}$
$M_{11,11} = (1-\epsilon^2)^{-2}$

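Table 12-4. Nonzero elements $c_{jj'}$ of $11 \times 11$ matrix $C^{ZZ}$ of the Zernike circle
polynomials over an annular pupil of obscuration ratio $\epsilon$, where $c_{jj'} = c_{j'j}$.

$c_{11} = 1$
$c_{14} = \sqrt{3}\epsilon^2 = c_{41}$
$c_{1,11} = -\sqrt{5}\epsilon^2 (1 - 2\epsilon^2) = c_{11,1}$
$c_{22} = 1 + \epsilon^2 = c_{33}$
$c_{28} = 2\sqrt{2}\epsilon^4 = c_{82} = c_{37} = c_{73}$
$c_{44} = 1 - 2\epsilon^2 + 4\epsilon^4$
$c_{4,11} = \sqrt{15}\epsilon^2 (1 - 3\epsilon^2 + 3\epsilon^4) = c_{11,4}$
$c_{55} = 1 + \epsilon^2 + \epsilon^4 = c_{66}$
$c_{77} = 1 + \epsilon^2 - 7\epsilon^4 + 9\epsilon^6 = c_{88}$
$c_{99} = 1 + \epsilon^2 + \epsilon^4 + \epsilon^6 = c_{10,10}$
$c_{11,11} = 1 - 4\epsilon^2 + 26\epsilon^4 - 54\epsilon^6 + 36\epsilon^8$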

Table 12-5. Nonzero elements $d_{jj'}$ of $11 \times 11$ matrix $C^{ZF}$ of the Zernike circle
polynomials over an annular pupil of obscuration ratio $\epsilon$.

$d_{11} = 1$
$d_{22} = (1+\epsilon^2)^{1/2} = d_{33}$
$d_{41} = \sqrt{3}\epsilon^2$
$d_{44} = (1 - 2\epsilon^2 + \epsilon^4)(1-\epsilon^2)^{-1}$
$d_{55} = (1+\epsilon^2+\epsilon^4)^{1/2} = d_{66}$
$d_{73} = 2\sqrt{2}\epsilon^4 (1+\epsilon^2)^{-1/2} = d_{82}$
$d_{77} = (1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2} (1+\epsilon^2)^{-1/2} = d_{88}$
$d_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2} = d_{10,10}$
$d_{11,1} = -\sqrt{5}\epsilon^2 (1 - 2\epsilon^2)$
$d_{11,4} = \sqrt{15}\epsilon^2 (1-\epsilon^2)$
$d_{11,11} = (1-\epsilon^2)^2$

$$\hat{b}_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_{10} , \tag{12-33j}$$

$$\hat{b}_{11} = (1-\epsilon^2)^{-2} a_{11} , \tag{12-33k}$$

where

$$B = (1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2} . \tag{12-34}$$

It is evident that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$.
The Zernike astigmatism coefficients $\hat{b}_5$ and $\hat{b}_6$ are smaller than the corresponding
annular coefficients $a_5$ and $a_6$ by a factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$. However, the Zernike
spherical aberration coefficient $\hat{b}_{11}$ is larger than the corresponding annular coefficient
$a_{11}$ by a factor of $(1-\epsilon^2)^{-2}$. For example, when $\epsilon = 0.5$, the astigmatism coefficients are
smaller by a factor of 1.1456, and the spherical aberration coefficient is larger by a factor
of 1.7778.
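The dependence of the $\hat{b}$-coefficients on the number of polynomials can be reproduced with a few lines of NumPy. The sketch below builds the $11 \times 11$ matrix $M$ from the nonzero elements of Table 12-3 and applies Eq. (12-13) with $J = 4$ and $J = 11$. The function name `annular_conversion_matrix` and the all-ones coefficient vector `a` are assumptions chosen only to make the effect visible.

```python
import numpy as np

def annular_conversion_matrix(eps):
    """11 x 11 lower triangular M of Table 12-3 (0-based indices)."""
    e2 = eps**2
    B = (1 - e2) * np.sqrt((1 + e2) * (1 + 4 * e2 + e2**2))
    M = np.zeros((11, 11))
    M[0, 0] = 1.0
    M[1, 1] = M[2, 2] = (1 + e2) ** -0.5
    M[3, 0] = -np.sqrt(3) * e2 / (1 - e2)
    M[3, 3] = 1 / (1 - e2)
    M[4, 4] = M[5, 5] = (1 + e2 + e2**2) ** -0.5
    M[6, 2] = M[7, 1] = -2 * np.sqrt(2) * e2**2 / B
    M[6, 6] = M[7, 7] = (1 + e2) / B
    M[8, 8] = M[9, 9] = (1 + e2 + e2**2 + e2**3) ** -0.5
    M[10, 0] = np.sqrt(5) * e2 * (1 + e2) / (1 - e2) ** 2
    M[10, 3] = -np.sqrt(15) * e2 / (1 - e2) ** 2
    M[10, 10] = (1 - e2) ** -2
    return M

M = annular_conversion_matrix(0.5)
a = np.ones(11)                      # hypothetical annular coefficients

b_hat_4 = M[:4, :4].T @ a[:4]        # Eq. (12-13) with J = 4
b_hat_11 = M.T @ a                   # Eq. (12-13) with J = 11

print(b_hat_4)
print(b_hat_11[:4])                  # piston, tip, tilt, and defocus all differ
print(1 / M[4, 4], M[10, 10])        # astigmatism and spherical factors at eps = 0.5
```

Only columns 1 through 4 of $M$ have off-diagonal entries within the first 11 polynomials, which is why just the first four circle coefficients pick up contributions from higher-order annular coefficients here.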

It should be evident that, because of the orthogonality of the trigonometric functions,
there is correlation between an annular and a circle polynomial only if they have the same
azimuthal dependence. As a consequence, the piston coefficient $\hat{b}_1$, for example, is a
linear combination of the piston coefficient $a_1$, defocus coefficient $a_4$, and various
orders of spherical aberration. Similarly, the tilt coefficient $\hat{b}_2$ is a linear combination of
the tilt coefficient $a_2$ and various orders of coma, and the astigmatism coefficient $\hat{b}_5$ is a
linear combination of various orders of astigmatism. Accordingly, the astigmatism
coefficients change if a 13-polynomial expansion is considered. For example, $\hat{b}_5$ then
contains a contribution from $a_{13}$, as well. The tip and tilt coefficients $\hat{b}_2$ and $\hat{b}_3$ change
further if polynomials $A_{16}$ (varying as $\cos\theta$) and $A_{17}$ (varying as $\sin\theta$) are included in
the expansion. Moreover, $A_{16}$ also contributes to the coma coefficient $\hat{b}_8$, and $A_{17}$
similarly contributes to the coma coefficient $\hat{b}_7$. The defocus coefficient $\hat{b}_4$ does not
change until the secondary spherical aberration polynomial $A_{22}$ is included with its
coefficient $a_{22}$. Its inclusion also affects the primary spherical aberration coefficient $\hat{b}_{11}$.
Thus, it is easy to see which $\hat{b}_j$ coefficients change, when, and by how much,
depending on the number of polynomials used in the expansion.

We note that the mean value of the aberration function is given by the annular piston
coefficient a1 . However, the value of the corresponding Zernike circle coefficient b̂1
depends on the number of polynomials used in the expansion, and it does not equal a1 ;
therefore, it does not represent the mean value. An orthonormal annular coefficient (other
than piston) represents the standard deviation of the corresponding aberration term in the
expansion, but a Zernike circle coefficient generally does not. The variance of the
aberration function cannot be obtained by summing the squares of the Zernike circle
coefficients b̂ j (excluding the piston coefficient). The circle coefficients b j can be

obtained from the b̂ j - or the a j -coefficients, according to Eq. (12-22). They are
considered in Section 12.3.5 for a Seidel aberration function.

12.3.2 Interferometer Setting Errors


The estimated wavefront obtained by using only the first four polynomials represents
the best-fit parabolic approximation of the aberration function in a least squares sense. In
terms of the orthonormal annular polynomials, it can be written as

$$\hat{W}(x,y) = a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4 \tag{12-35a}$$

$$= a_1 + 2(1+\epsilon^2)^{-1/2} a_2 x + 2(1+\epsilon^2)^{-1/2} a_3 y + \sqrt{3}(1-\epsilon^2)^{-1} a_4 [-\epsilon^2 + (2\rho^2 - 1)] . \tag{12-35b}$$

In terms of the circle polynomials it can be written

$$\hat{W}(x,y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-36a}$$

$$= \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\hat{b}_4 (2\rho^2 - 1) . \tag{12-36b}$$

In Eqs. (12-35) and (12-36), we have omitted the arguments of the annular and circle
polynomials for simplicity. The coefficients of $x$, $y$, and $\rho^2$ representing the tip, tilt, and
defocus values obtained from the circle coefficients are the same as those obtained from
the orthonormal coefficients. The estimated piston from the Zernike expansion of Eq.
(12-36b) is $\hat{b}_1 - \sqrt{3}\hat{b}_4$, which is the same as $a_1 - \sqrt{3}(1+\epsilon^2)(1-\epsilon^2)^{-1} a_4$ from the orthonormal
expansion in Eq. (12-35b). Accordingly, the aberration function obtained by subtracting
the piston, tip, tilt, and defocus values from the measured aberration function is
independent of the nature of the polynomials used in the expansion, so long as the
nonorthogonal expansion is in terms of only the first four circle polynomials [as may be
seen, for example, by comparing Eqs. (12-33a–d) with Eqs. (12-32a–d)]. In an
interferometer, the tip and tilt represent the lateral errors and defocus represents the
longitudinal error in the location of a point source illuminating an optical surface under
test from its center of curvature. These four terms are generally removed from the
aberration function and the remaining function is given to the optician to zero out from
the optical surface by polishing.

12.3.3 Wavefront Fitting


When an aberration function is expanded in terms of the orthonormal polynomials,
one or more polynomial terms can be added or subtracted from the aberration function
without affecting the coefficients of the other polynomials in the expansion. But that is
generally not true with the Zernike expansion. This is due to the fact that an expansion in
terms of the orthonormal polynomials gives a best fit for each polynomial, but an
expansion in terms of the circle polynomials gives it for the whole set in the expansion.
The estimated or reconstructed wavefront by the same number of corresponding
orthonormal or Zernike polynomials is the same. For example, the 4-polynomial
aberration functions of Eqs. (12-35) and (12-36) are exactly the same function.
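A quick numerical check of this statement is sketched below. It evaluates the 4-polynomial expansions of Eqs. (12-35a) and (12-36a) at random points of an annulus, using the first four annular polynomials of Table 12-1 and the coefficient relations of Eqs. (12-32a–d). The obscuration ratio, the random points, and the annular coefficients `a` are assumed values for illustration only.

```python
import numpy as np

eps = 0.5                                  # assumed obscuration ratio
e2 = eps**2
rng = np.random.default_rng(0)
rho = rng.uniform(eps, 1.0, 50)            # random points inside the annulus
theta = rng.uniform(0.0, 2 * np.pi, 50)
x, y = rho * np.cos(theta), rho * np.sin(theta)

# First four circle and annular polynomials at those points.
Z = np.array([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3) * (2 * rho**2 - 1)])
A = np.array([np.ones_like(x),
              2 * x / np.sqrt(1 + e2),
              2 * y / np.sqrt(1 + e2),
              np.sqrt(3) * (2 * rho**2 - 1 - e2) / (1 - e2)])

a = np.array([0.4, -0.3, 0.2, 0.7])        # hypothetical annular coefficients

# Circle coefficients for J = 4, Eqs. (12-32a-d).
b_hat = np.array([a[0] - np.sqrt(3) * e2 * a[3] / (1 - e2),
                  a[1] / np.sqrt(1 + e2),
                  a[2] / np.sqrt(1 + e2),
                  a[3] / (1 - e2)])

W_annular = a @ A                          # Eq. (12-35a)
W_circle = b_hat @ Z                       # Eq. (12-36a)
print(np.max(np.abs(W_annular - W_circle)))

# Constant (piston) term of the polynomial in x, y, and rho^2, from each form.
piston_circle = b_hat[0] - np.sqrt(3) * b_hat[3]
piston_annular = a[0] - np.sqrt(3) * (1 + e2) * a[3] / (1 - e2)
```

The two reconstructions agree to rounding error, and so do the two expressions for the constant term, even though the individual coefficients $a_j$ and $\hat{b}_j$ differ.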

Although the wavefront fit with a certain number of circle polynomials is as good as
the fit with a corresponding set of the orthonormal polynomials, there are pitfalls in using
the circle polynomials. Since the circle polynomials are not orthogonal over the
noncircular pupil, the advantages of orthogonality and aberration balancing are lost. Since
they do not represent the balanced classical aberrations for a noncircular pupil, the
Zernike coefficients b̂ j do not have the physical significance of their orthonormal
counterparts. For example, the mean value of a circle polynomial across a noncircular
pupil is not zero, the Zernike piston coefficient does not represent the mean value of the
aberration, the other Zernike coefficients do not represent the standard deviation of the
corresponding aberration terms, and the variance of the aberration is not equal to the sum
of the squares of these other coefficients. Moreover, the value of a Zernike coefficient
generally changes as the number of polynomials used in the expansion of an aberration
function changes. Hence, the circle polynomials are not appropriate for the analysis of a
noncircular wavefront. Of course, wavefront fitting with the improperly calculated
Zernike coefficients b j by using Eq. (12-17) will be in error, as demonstrated in Section
12.3.4 for a Seidel aberration function.

12.3.4 Application to an Annular Seidel Aberration Function


Consider an annular pupil aberrated by a Seidel aberration function given by

$$W(\rho,\theta;\epsilon) = A_t \rho\cos\theta + A_d \rho^2 + A_a \rho^2 \cos^2\theta + A_c \rho^3 \cos\theta + A_s \rho^4 , \quad \epsilon \le \rho \le 1 , \tag{12-37}$$

where $A_t$, $A_d$, $A_a$, $A_c$, and $A_s$ represent the peak values of distortion, field curvature,
astigmatism, coma, and spherical aberration, respectively. Without the explicit field
dependence, distortion is equivalent to a wavefront tilt, and field curvature is equivalent
to a wavefront defocus.

12.3.4.1 Annular Coefficients

The aberration function when approximated by only the first four annular
polynomials can be written

$$\hat{W}(\rho,\theta;\epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 , \tag{12-38}$$

where the expansion coefficients according to Eq. (12-30) are given by

$$a_1 = (1+\epsilon^2)(2A_d + A_a)/4 + (1+\epsilon^2+\epsilon^4) A_s/3 , \tag{12-39a}$$

$$a_2 = (1+\epsilon^2)^{1/2} A_t/2 + (1+\epsilon^2+\epsilon^4)(1+\epsilon^2)^{-1/2} A_c/3 , \tag{12-39b}$$

$$a_4 = (1-\epsilon^2)(2A_d + A_a)/4\sqrt{3} + (1-\epsilon^4) A_s/2\sqrt{3} . \tag{12-39c}$$

It should be evident that the coefficient $a_3$ of the annular polynomial $A_3$ varying as $\sin\theta$
is zero. The mean value of the estimated aberration function is given by $a_1$, and its
variance is given by

$$\sigma_{\hat{W}}^2 = a_2^2 + a_4^2 . \tag{12-40}$$

An expansion in terms of 11 annular polynomials can be written

$$W(\rho,\theta;\epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 + a_6 A_6 + a_8 A_8 + a_{11} A_{11} , \tag{12-41}$$

where the coefficients $a_1$, $a_2$, and $a_4$ are given by Eqs. (12-39a–c) and

$$a_6 = \frac{1}{2\sqrt{6}} (1+\epsilon^2+\epsilon^4)^{1/2} A_a , \tag{12-39d}$$

$$a_8 = \frac{1-\epsilon^2}{6\sqrt{2}} \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} A_c , \tag{12-39e}$$

$$a_{11} = \frac{(1-\epsilon^2)^2}{6\sqrt{5}} A_s . \tag{12-39f}$$

Again, it should be evident that the coefficients $a_5$, $a_7$, and $a_9$ of the polynomials $A_5$,
$A_7$, and $A_9$, respectively, each polynomial varying as $\sin m\theta$, are zero. Moreover, the
coefficient $a_{10}$ of the polynomial $A_{10}$ varying as $\cos 3\theta$ is also zero. The 11-polynomial
expansion represents the Seidel aberration function exactly. Its mean value is again $a_1$, as
given by Eq. (12-39a), and its variance is given by

$$\sigma_W^2 = a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{12-42}$$

12.3.4.2 Circle Coefficients

Next we expand the Seidel aberration function in terms of the circle polynomials. A
4-polynomial expansion can be obtained from Eqs. (12-32) and (12-39) in the form

$$\hat{W}(\rho,\theta;\epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 , \tag{12-43}$$

where

$$\hat{b}_1 = (2A_d + A_a)/4 + [1 - \epsilon^2 (1+\epsilon^2)/2]\, A_s/3 , \tag{12-44a}$$

$$\hat{b}_2 = a_2/(1+\epsilon^2)^{1/2} , \tag{12-44b}$$

$$\hat{b}_4 = a_4/(1-\epsilon^2) . \tag{12-44c}$$

The estimated aberration function in Eq. (12-43) is exactly the same as that in Eq. (12-38),
and the values of piston, x-tilt, and defocus are exactly the same as those obtained
from Eqs. (12-39a–c). It should be evident, however, that its mean value is not given by
$\hat{b}_1$. Moreover, since an expansion coefficient does not represent the standard deviation of
the corresponding aberration polynomial term, its variance is not given by $\hat{b}_2^2 + \hat{b}_4^2$.

From Eqs. (12-33) and (12-39), an 11-polynomial Zernike circle expansion can be
written

$$W(\rho,\theta;\epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 + \hat{b}_6 Z_6 + \hat{b}_8 Z_8 + \hat{b}_{11} Z_{11} , \tag{12-45}$$

where

$$\hat{b}_1 = (2A_d + A_a)/4 + A_s/3 , \tag{12-46a}$$

$$\hat{b}_2 = A_t/2 + A_c/3 , \tag{12-46b}$$

$$\hat{b}_4 = (2A_d + A_a)/4\sqrt{3} + A_s/2\sqrt{3} , \tag{12-46c}$$

$$\hat{b}_6 = A_a/2\sqrt{6} , \tag{12-46d}$$

$$\hat{b}_8 = A_c/6\sqrt{2} , \tag{12-46e}$$

$$\hat{b}_{11} = A_s/6\sqrt{5} . \tag{12-46f}$$

As in the case of annular polynomials, the eleven circle polynomials also represent the
Seidel aberration function exactly. The expansion coefficients can also be obtained by
inspection of the aberration function and the form of the circle polynomials. Indeed,
because of the form of the Seidel aberration function, the circle coefficients are
independent of the obscuration ratio $\epsilon$. Each $\hat{b}$-coefficient represents the value of the
corresponding $a$-coefficient for $\epsilon = 0$. It is clear that each of the three nonzero
coefficients of the 4-polynomial expansion changes as the number of polynomials is
increased from four to eleven. Hence, the values of piston, x-tilt, and defocus obtained
from the coefficients $\hat{b}_1$, $\hat{b}_2$, and $\hat{b}_4$ are incorrect. Again, the mean value of the aberration
function is not given by $\hat{b}_1$, and its variance is not given by the sum of the squares of the
other coefficients.

12.3.4.3 Residual Aberration Function after Removing Interferometer Setting Errors

If we consider the first four polynomial terms as representing the interferometer
setting errors and remove them from the aberration function, the residual aberration
function from the annular expansion is given by

$$W_{RA}(\rho,\theta;\epsilon) = a_6 A_6 + a_8 A_8 + a_{11} A_{11} . \tag{12-47}$$

The same residual aberration function is obtained if a 4-polynomial Zernike expansion of
Eq. (12-43) is subtracted from the aberration function $W(\rho,\theta;\epsilon)$. However, if the first
four polynomials are subtracted from the aberration function of Eq. (12-45), the residual
aberration function is given by

$$W_{RC\hat{b}}(\rho,\theta;\epsilon) = \hat{b}_6 Z_6 + \hat{b}_8 Z_8 + \hat{b}_{11} Z_{11}
= (A_a/2\sqrt{6}) Z_6 + (A_c/6\sqrt{2}) Z_8 + (A_s/6\sqrt{5}) Z_{11} . \tag{12-48}$$

Since the 11-polynomial aberration functions of Eqs. (12-41) and (12-45) are equal
to each other [and equal to the Seidel aberration function of Eq. (12-37)], the difference
between the residual aberration functions of Eqs. (12-48) and (12-47) is equal to the
difference between the interferometer setting errors given by Eq. (12-38) or (12-43) and
those given by Eq. (12-45). Accordingly, the difference or the error function consists of
piston, tilt, and defocus only. It is given by

$$\Delta W_{R\hat{b}}(\rho,\theta;\epsilon) = -\frac{1}{6}\epsilon^2 (4 + \epsilon^2) A_s + \frac{2}{3}\frac{\epsilon^4}{1+\epsilon^2} A_c\, \rho\cos\theta + \epsilon^2 A_s \rho^2 , \tag{12-49}$$

and is independent of the number $J$ of the annular and circle polynomials (e.g., 11, as
above) used in the expansion. Of course, piston does not affect the peak-to-valley value
or the variance of the aberration function. If the interferometer setting errors obtained
from Eq. (12-45) are applied in the fabrication and testing of an optical system with an
annular pupil, the difference function represents the polishing error due to the use of the
circle polynomials.
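The error function of Eq. (12-49) can be cross-checked numerically from the coefficient formulas alone. The sketch below, for assumed Seidel coefficients and an assumed obscuration ratio, forms the difference between the 4-polynomial circle coefficients [Eqs. (12-32) with Eqs. (12-39a–c)] and the first three nonzero 11-polynomial circle coefficients [Eqs. (12-46a–c)], and rewrites it as a polynomial in $\rho\cos\theta$ and $\rho^2$. The function name and the sample values are illustrative.

```python
import numpy as np

def setting_error_difference(eps, At, Ad, Aa, Ac, As):
    """Return (constant, rho*cos(theta) coefficient, rho^2 coefficient)."""
    e2 = eps**2
    # Annular coefficients, Eqs. (12-39a-c).
    a1 = (1 + e2) * (2 * Ad + Aa) / 4 + (1 + e2 + e2**2) * As / 3
    a2 = np.sqrt(1 + e2) * At / 2 + (1 + e2 + e2**2) / np.sqrt(1 + e2) * Ac / 3
    a4 = ((1 - e2) * (2 * Ad + Aa) / (4 * np.sqrt(3))
          + (1 - e2**2) * As / (2 * np.sqrt(3)))
    # Circle coefficients of the 4-polynomial expansion, Eqs. (12-32a, b, d).
    b4p = np.array([a1 - np.sqrt(3) * e2 * a4 / (1 - e2),
                    a2 / np.sqrt(1 + e2),
                    a4 / (1 - e2)])
    # Circle coefficients of the 11-polynomial expansion, Eqs. (12-46a-c).
    b11p = np.array([(2 * Ad + Aa) / 4 + As / 3,
                     At / 2 + Ac / 3,
                     (2 * Ad + Aa) / (4 * np.sqrt(3)) + As / (2 * np.sqrt(3))])
    d1, d2, d4 = b4p - b11p
    # d1*Z1 + d2*Z2 + d4*Z4 with Z2 = 2 rho cos(theta), Z4 = sqrt(3)(2 rho^2 - 1).
    return d1 - np.sqrt(3) * d4, 2 * d2, 2 * np.sqrt(3) * d4

eps, At, Ad, Aa, Ac, As = 0.5, 1.0, 1.0, 1.0, 2.0, 3.0   # assumed values
numeric = setting_error_difference(eps, At, Ad, Aa, Ac, As)

# Closed form of Eq. (12-49).
closed = (-eps**2 * (4 + eps**2) * As / 6,
          (2 / 3) * eps**4 / (1 + eps**2) * Ac,
          eps**2 * As)
print(numeric)
print(closed)
```

With these sample values both routes give a constant of about -0.531, a $\rho\cos\theta$ coefficient of about 0.067, and a $\rho^2$ coefficient of 0.75 waves.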

If we compare the annular coefficients of astigmatism, coma, and spherical
aberration given by Eqs. (12-39d–f) with the corresponding Zernike coefficients given by
Eqs. (12-46d–f), we obtain

$$\frac{a_6}{\hat{b}_6} = (1+\epsilon^2+\epsilon^4)^{1/2} , \tag{12-50a}$$

$$\frac{a_8}{\hat{b}_8} = (1-\epsilon^2) \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} , \tag{12-50b}$$

and

$$\frac{a_{11}}{\hat{b}_{11}} = (1-\epsilon^2)^2 . \tag{12-50c}$$

Since the $\hat{b}_j$-coefficients are independent of the value of $\epsilon$, the variation of a ratio
$a_j/\hat{b}_j$ with $\epsilon$ represents the variation of an annular coefficient $a_j$.

12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus

Now we consider the expansion of the Seidel aberration function in terms of the
circle polynomials by assuming them to be orthogonal over the annulus. This is what one
does when defining a center of an interferogram, drawing a unit circle around it, and
determining its circle coefficients. The aberration function in this case can be written in
the form

$$W(\rho,\theta;\epsilon) = b_1 Z_1 + b_2 Z_2 + b_4 Z_4 + b_6 Z_6 + b_8 Z_8 + b_{11} Z_{11} + \ldots , \tag{12-51}$$

where, according to Eq. (12-17), the coefficients $b_j$ are given by

$$b_j = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi} W(\rho,\theta;\epsilon)\, Z_j(\rho,\theta)\, \rho\, d\rho\, d\theta . \tag{12-52}$$

They can also be obtained from Eq. (12-22), i.e., from the annular or circle coefficients
by using the matrix $C^{ZZ}$ or $C^{ZF}$ given in Tables 12-4 and 12-5, respectively. The
“incorrect” circle coefficients $b_j$ are given by

$$b_1 = a_1 , \tag{12-53a}$$

$$b_2 = (1+\epsilon^2)^{1/2} a_2 , \tag{12-53b}$$

$$b_4 = \frac{1}{4\sqrt{3}} (1+\epsilon^2+4\epsilon^4)(2A_d + A_a) + \frac{1}{2\sqrt{3}} (1+\epsilon^2+\epsilon^4+3\epsilon^6) A_s , \tag{12-53c}$$

$$b_6 = (1+\epsilon^2+\epsilon^4)^{1/2} a_6 , \tag{12-53d}$$

$$b_8 = \sqrt{2}\epsilon^4 A_t + \frac{1}{6\sqrt{2}} (1+\epsilon^2+\epsilon^4+9\epsilon^6) A_c , \tag{12-53e}$$

$$b_{11} = \frac{\sqrt{5}}{4} \epsilon^4 (3\epsilon^2 - 1)(2A_d + A_a) + \frac{1}{6\sqrt{5}} (1+\epsilon^2+\epsilon^4-9\epsilon^6+36\epsilon^8) A_s , \tag{12-53f}$$

etc. These coefficients are incorrect in the sense that they do not yield a least-squares fit
of the aberration function. Since an annular polynomial with $n = m$ has the same form as
that for a corresponding circle polynomial except for the normalization constant, the
coefficients $b_j$ and $a_j$ for such a polynomial are also related to each other by the
normalization constant. Equations (12-53a, b, d) represent this fact for $n = m = 0, 1, 2$,
respectively. It is clear, however, that the improperly calculated circle coefficients $b_j$
depend on the obscuration ratio of the pupil. Evidently, they are different from the
corresponding $\hat{b}$-coefficients given by Eqs. (12-46a–f). While the value of the piston
coefficient $b_1$ is equal to the true mean value $a_1$, the tilt coefficient $b_2$ is larger than $a_2$
by a factor of $(1+\epsilon^2)^{1/2}$ or 1.1180, and the astigmatism coefficient $b_6$ is larger than $a_6$ by a
factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$ or 1.1456 when $\epsilon = 0.5$. Moreover, the $b$-coefficients of some of
the nonexistent higher-order aberrations are not zero. For example, the coefficients $b_{22}$,
$b_{37}$, etc. of the secondary and tertiary Zernike spherical aberrations $Z_{22}$, $Z_{37}$, etc., and
$b_{16}$, $b_{30}$, etc. of the secondary and tertiary Zernike coma $Z_{16}$ and $Z_{30}$, etc., are nonzero.
Thus, nonexistent aberrations are generated when an aberration function is expanded
improperly in terms of the circle polynomials.
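The appearance of a nonexistent aberration can be illustrated with a short calculation of $b_{22}$ from Eq. (12-52). The sketch below assumes the standard secondary spherical circle polynomial $Z_{22} = \sqrt{7}(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ and sample Seidel coefficients. Because $Z_{22}$ has no angular dependence, only the azimuthal average of the Seidel function contributes, so a one-dimensional Gauss-Legendre quadrature in $\rho$ suffices. The function names and default values are illustrative.

```python
import numpy as np

def annulus_radial_mean(f, eps, n=20):
    """Mean over the annulus eps <= rho <= 1 of a function of rho only."""
    x, w = np.polynomial.legendre.leggauss(n)       # nodes and weights on [-1, 1]
    rho = 0.5 * (1 - eps) * x + 0.5 * (1 + eps)     # map nodes to [eps, 1]
    wr = 0.5 * (1 - eps) * w
    return 2 * np.sum(wr * f(rho) * rho) / (1 - eps**2)

def b22(eps, Ad=1.0, Aa=1.0, As=3.0):
    """Coefficient of Z22 from Eq. (12-52) for the Seidel function of Eq. (12-37)."""
    Z22 = lambda r: np.sqrt(7) * (20 * r**6 - 30 * r**4 + 12 * r**2 - 1)
    # Azimuthal average of Eq. (12-37): the cos(theta) terms average to zero
    # and cos^2(theta) averages to 1/2.
    W0 = lambda r: (Ad + Aa / 2) * r**2 + As * r**4
    return annulus_radial_mean(lambda r: W0(r) * Z22(r), eps)

print(b22(0.0))   # circular pupil: vanishes to rounding error
print(b22(0.5))   # annular pupil: a spurious secondary spherical term
```

For a circular pupil the result vanishes because $Z_{22}$ is orthogonal to the lower-degree radial terms, whereas for the annulus it does not, even though the Seidel function contains no secondary spherical aberration.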

If we estimate the annular Seidel aberration function with only four circle polynomials
from Eq. (12-51), we obtain

$$\hat{W}(\rho,\theta;\epsilon) = b_1 Z_1 + b_2 Z_2 + b_4 Z_4 . \tag{12-54}$$

If we truncate the expansion in terms of the circle polynomials in Eq. (12-51) to the first
11 circle polynomials and remove the first four coefficients as interferometer setting
errors, the residual aberration function in this case is given by

$$W_{RCb}(\rho,\theta;\epsilon) = b_6 Z_6 + b_8 Z_8 + b_{11} Z_{11} . \tag{12-55}$$

The tilt error is larger by a factor of $(1+\epsilon^2)^{1/2}$ or 1.1180 when $\epsilon = 0.5$ than its true value
given by $a_2$, and the defocus error given by $b_4$ can be compared with its true value given
by $a_4$. Since the 11-polynomial aberration function from Eq. (12-51) is not equal to the
aberration function of Eq. (12-41), their difference does not consist of the difference in
their interferometer setting errors. For example, Eq. (12-53d) indicates that there will be
an astigmatism term in the difference function. Thus, wrong polishing will result if the
aberration function of Eq. (12-55) is provided to the optician to zero out.

12.3.4.5 Numerical Example

As a numerical example, we consider an annular Seidel aberration function with $A_t = A_d = A_a = 1$, $A_c = 2$, and $A_s = 3$ in waves. As illustrated in Figure 12-1, the annular and circle coefficients of a 4-polynomial expansion differ from each other, although they yield the same fit of the aberration function. We note that the mean value $a_1$ increases as $\epsilon$ increases, whereas the piston coefficient $\hat{b}_1$ decreases. However, the defocus coefficient $a_4$ decreases, while $\hat{b}_4$ increases. Both tilt coefficients $a_2$ and $\hat{b}_2$ increase. For an 11-polynomial expansion, the first four annular coefficients remain the same, but the circle coefficients become independent of $\epsilon$, as in Eqs. (12-46). Figure 12-2 shows the coefficient ratios $a_6/\hat{b}_6$ (astigmatism), $a_8/\hat{b}_8$ (coma), and $a_{11}/\hat{b}_{11}$ (spherical) for an 11-polynomial expansion. We note that the coefficient $a_6$ increases, $a_{11}$ decreases, and $a_8$ is nearly constant for small values of $\epsilon$ and then decreases as $\epsilon$ increases. Figure 12-3 shows how the $\hat{b}$-coefficients change as we change the number of polynomials from 4 to 11 for $\epsilon = 0.5$. Wrong polishing will result if the tip, tilt, and focus errors of an interferometer setting are estimated from the 11-circle-polynomial expansion instead of the 4-polynomial expansion. The variation of the standard deviation obtained from the coefficients of a 4- or 11-polynomial expansion is shown in Figure 12-4, illustrating that the circle coefficients yield incorrect results. The standard deviation obtained from the orthonormal coefficients increases slowly with $\epsilon$, starting at 1.7460 and 1.7877 for the 4- and 11-polynomial expansions, respectively. However, the standard deviation obtained from the circle coefficients is correct only when $\epsilon = 0$. It increases rapidly with $\epsilon$ for the 4-polynomial expansion, but it is constant for the 11-polynomial expansion, indicating its incorrect nature. The sigma values from the orthonormal and the circle coefficients are nearly equal to each other for $\epsilon \le 0.5$ because of the very slow increase of the orthonormal sigma.
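The two starting values of the standard deviation can be reproduced with a short calculation. The following is a minimal sketch for the unobscured case ($\epsilon = 0$) only. It assumes that the Seidel aberration function of Eq. (12-37), which is not reproduced in this section, has the standard form $W = A_t \rho\cos\theta + A_d \rho^2 + A_a \rho^2\cos^2\theta + A_c \rho^3\cos\theta + A_s \rho^4$, and that the Zernike circle polynomials carry their usual orthonormal normalization. The function names are illustrative and do not come from the text.

```python
import math

def circle_zernike_coeffs(At, Ad, Aa, Ac, As):
    """Orthonormal Zernike circle coefficients b_j (j = 2, 4, 6, 8, 11)
    of the assumed Seidel form for a circular pupil (epsilon = 0)."""
    return {
        2: At / 2 + Ac / 3,                          # x tilt
        4: (Ad + Aa / 2 + As) / (2 * math.sqrt(3)),  # defocus
        6: Aa / (2 * math.sqrt(6)),                  # astigmatism
        8: Ac / (6 * math.sqrt(2)),                  # coma
        11: As / (6 * math.sqrt(5)),                 # spherical
    }

def sigma(coeffs, j_max):
    """Standard deviation from orthonormal coefficients with 2 <= j <= j_max."""
    return math.sqrt(sum(c * c for j, c in coeffs.items() if 2 <= j <= j_max))

b = circle_zernike_coeffs(At=1, Ad=1, Aa=1, Ac=2, As=3)
sigma_4 = sigma(b, 4)     # 4-polynomial expansion
sigma_11 = sigma(b, 11)   # 11-polynomial expansion
print(f"{sigma_4:.4f} {sigma_11:.4f}")
```

Under these assumptions the two values round to 1.7460 and 1.7877 waves, matching the starting points quoted above. For $\epsilon > 0$ the annular coefficients would be needed instead, which this sketch does not attempt.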

Figure 12-5 shows the contours of the Seidel aberration function for a circular and an annular pupil with an obscuration ratio of $\epsilon = 0.5$. The case of a circular pupil is included just for reference. The dark circular region in Figure 12-5b (and others) represents the obscuration. The contours of the annular Seidel aberration function fit with only four polynomials, as in Eq. (12-38) or (12-43) and in Eq. (12-54), are shown in Figures 12-6a and 12-6b, respectively. The two figures look similar, but they are not the same. Only Figure 12-6a represents the least-squares and, therefore, the correct fit. The contours of the residual aberration function when the first four (of the eleven) polynomials are removed as interferometer setting errors, as in Eqs. (12-47), (12-48), and (12-55), are shown in Figures 12-7a, 12-7b, and 12-7c, respectively. All three figures are different from each other, as expected. Only Figure 12-7a reflects removal of the correct interferometer setting errors, and thus the correct residual aberration function. The contours of the difference of the residual functions using the circle polynomials from the one using the annular polynomials are shown in Figures 12-8a and 12-8b. They represent the error functions given by Eq. (12-49) and the difference of Eqs. (12-55) and (12-47), respectively, due to the removal of incorrect interferometer setting errors.

Figure 12-1. Orthonormal annular coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$ for a 4-polynomial expansion.

Figure 12-2. Ratio of the orthonormal annular coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$ for an 11-polynomial expansion.

Figure 12-3. Orthonormal annular coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$, illustrating how the latter change as the number of polynomials changes from 4 to 11.

Figure 12-4. Standard deviation as obtained from the orthonormal annular coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$ of a 4- and 11-polynomial expansion.

Figure 12-5. Contours of (a) Seidel aberration function of Eq. (12-37) for a circular pupil with $A_t = A_d = A_a = 1$, $A_c = 2$, and $A_s = 3$ in waves. (b) Same Seidel aberration function, but for an annular pupil with obscuration ratio $\epsilon = 0.5$.

Figure 12-6. Contours of an annular Seidel aberration function for $\epsilon = 0.5$ fit with only 4 polynomials, as in (a) Eq. (12-38) or (12-43), and (b) Eq. (12-54).

Figure 12-7. Contours of the residual aberration function after removing the interferometer setting errors. (a) $W_{RA}$ of Eq. (12-47) using annular polynomials, (b) $W_{RC\hat{b}}$ of Eq. (12-48) using circle polynomials correctly, and (c) $W_{RCb}$ of Eq. (12-55) using circle polynomials incorrectly.

Figure 12-8. Contours of the difference or error function (a) given by Eq. (12-49) and (b) obtained by subtracting Eq. (12-47) from Eq. (12-55).

12.4 USE OF ZERNIKE CIRCLE POLYNOMIALS FOR THE ANALYSIS OF A HEXAGONAL WAVEFRONT
12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients
Now, we consider a hexagonal aberration function $W(x, y)$ across a unit hexagon shown in Figure 7-7, and demonstrate the pitfalls of using Zernike circle polynomials for its expansion. Estimating the aberration function with $J$ hexagonal polynomials $H_j(x, y)$ given in Chapter 7, we may write

$$\hat{W}(x, y) = \sum_{j=1}^{J} a_j H_j(x, y) , \tag{12-56}$$

where the orthonormal hexagonal expansion coefficients are given by

$$a_j = \frac{2}{3\sqrt{3}} \int_{\text{hexagon}} W(x, y)\, H_j(x, y)\, dx\, dy . \tag{12-57}$$

The mean and the mean square values of the estimated aberration function are given by Eqs. (12-4) and (12-6).
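Equation (12-57) can be evaluated numerically when no closed form is at hand. The sketch below is one illustrative way to do so with a midpoint rule, not the procedure used in the text. It assumes that the unit hexagon has its vertices at $(\pm 1, 0)$ and $(\pm 1/2, \pm\sqrt{3}/2)$, it uses the hypothetical test function $W = x^2 + y^2$, and it takes $H_4 = \sqrt{5/43}\,(12\rho^2 - 5)$, which is the form implied by Eq. (12-62b). The helper names are invented for this example.

```python
import math

SQRT3 = math.sqrt(3.0)
HEX_AREA = 3.0 * SQRT3 / 2.0   # area of the unit hexagon, so 1/area = 2/(3*sqrt(3))

def hexagon_mean(f, n=400):
    """Midpoint-rule estimate of (1/area) * integral of f(x, y) over the hexagon.

    Assumed orientation: vertices at (+-1, 0) and (+-1/2, +-sqrt(3)/2).
    """
    total = 0.0
    dy = SQRT3 / n                        # y spans [-sqrt(3)/2, sqrt(3)/2]
    for i in range(n):
        y = -SQRT3 / 2.0 + (i + 0.5) * dy
        x_max = 1.0 - abs(y) / SQRT3      # half-width of the hexagon at height y
        dx = 2.0 * x_max / n
        for k in range(n):
            x = -x_max + (k + 0.5) * dx
            total += f(x, y) * dx * dy
    return total / HEX_AREA

def H1(x, y):
    return 1.0

def H4(x, y):
    return math.sqrt(5.0 / 43.0) * (12.0 * (x * x + y * y) - 5.0)

def W(x, y):
    return x * x + y * y                  # hypothetical test aberration

a1 = hexagon_mean(lambda x, y: W(x, y) * H1(x, y))   # mean value of W
a4 = hexagon_mean(lambda x, y: W(x, y) * H4(x, y))   # defocus coefficient
norm_H4 = hexagon_mean(lambda x, y: H4(x, y) ** 2)   # should be close to 1
print(a1, a4, norm_H4)
```

For this test function the exact values are $a_1 = 5/12$ and $a_4 = \sqrt{43/5}/12$, so the printed numbers give a direct measure of the quadrature error. A finer grid or a higher-order rule would be a natural refinement.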

An $11 \times 11$ conversion matrix $M$ for obtaining the hexagonal polynomials in terms of the Zernike circle polynomials is given in Table 12-6, as obtained from Table 7-1. Its transpose and inverse matrices are given in Tables 12-7 and 12-8, respectively. If only the first 4 polynomials are used in the expansion, then the $\hat{b}_j$ coefficients according to Eq. (12-13) are given by

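$$\left(\begin{array}{c} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{array}\right) = \left(\begin{array}{cccc} 1 & 0 & 0 & \sqrt{5/43} \\ 0 & \sqrt{6/5} & 0 & 0 \\ 0 & 0 & \sqrt{6/5} & 0 \\ 0 & 0 & 0 & 2\sqrt{15/43} \end{array}\right) \left(\begin{array}{c} a_1 \\ a_2 \\ a_3 \\ a_4 \end{array}\right) = \left(\begin{array}{c} a_1 + \sqrt{5/43}\, a_4 \\ \sqrt{6/5}\, a_2 \\ \sqrt{6/5}\, a_3 \\ 2\sqrt{15/43}\, a_4 \end{array}\right) , \tag{12-58}$$

or

$$\hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 , \tag{12-59a}$$

$$\hat{b}_2 = \sqrt{6/5}\, a_2 , \tag{12-59b}$$

$$\hat{b}_3 = \sqrt{6/5}\, a_3 , \tag{12-59c}$$

and

$$\hat{b}_4 = 2\sqrt{15/43}\, a_4 . \tag{12-59d}$$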

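It is evident that the piston coefficient $\hat{b}_1$ is not equal to $a_1$ and, therefore, does not represent the mean value of the aberration function. The coefficients $\hat{b}_2$, $\hat{b}_3$, and $\hat{b}_4$ represent the tip, tilt, and defocus circle coefficients.

Table 12-6. Conversion matrix $M$ for obtaining the Zernike coefficients $\hat{b}_j$ from the orthonormal hexagonal coefficients $a_j$, as in Eq. (12-12).

$$M = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\sqrt{5/43} & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
521/\sqrt{1072205} & 0 & 0 & 88\sqrt{15/214441} & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array}\right)$$

Table 12-7. Transpose matrix $M^T$ for use in Eq. (12-13).

$$M^T = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & \sqrt{5/43} & 0 & 0 & 0 & 0 & 0 & 0 & 521/\sqrt{1072205} \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 88\sqrt{15/214441} \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array}\right)$$

Table 12-8. Analytical matrix $M^{-1}$ for obtaining the orthonormal hexagonal coefficients $a_j$ from the Zernike coefficients $\hat{b}_j$.

$$M^{-1} = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1/(2\sqrt{3}) & 0 & 0 & \sqrt{43/15}/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 & 0 \\
0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3/(2\sqrt{5}) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{103/35}/2 & 0 \\
-1/(2\sqrt{5}) & 0 & 0 & -22/(7\sqrt{43}) & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{4987/43}/14
\end{array}\right)$$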
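The distinction between the circle piston coefficient and the mean value is easy to see numerically. The sketch below simply codes Eqs. (12-59a–d). The function name and the sample coefficient values are hypothetical and chosen only for illustration.

```python
import math

def circle_from_hex_4(a1, a2, a3, a4):
    """Zernike circle coefficients of a 4-polynomial expansion, Eqs. (12-59a-d)."""
    b1 = a1 + math.sqrt(5 / 43) * a4       # piston: picks up part of a4
    b2 = math.sqrt(6 / 5) * a2             # tip
    b3 = math.sqrt(6 / 5) * a3             # tilt
    b4 = 2 * math.sqrt(15 / 43) * a4       # defocus
    return b1, b2, b3, b4

# Hypothetical hexagonal coefficients, in waves
a = (0.5, 0.2, -0.1, 0.3)
b = circle_from_hex_4(*a)

piston_offset = b[0] - a[0]   # nonzero whenever a4 is nonzero
print(b, piston_offset)
```

With a nonzero $a_4$, the offset $\hat{b}_1 - a_1$ equals $\sqrt{5/43}\,a_4$, which is why $\hat{b}_1$ cannot be read as the mean value of the aberration function over the hexagon.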

To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using 11 polynomials. The coefficients, obtained
from Eq. (12-13), are given by

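$$\hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 + \left(521/\sqrt{1072205}\right) a_{11} , \tag{12-60a}$$

$$\hat{b}_2 = \sqrt{6/5}\, a_2 + 16\sqrt{14/11055}\, a_8 , \tag{12-60b}$$

$$\hat{b}_3 = \sqrt{6/5}\, a_3 + 16\sqrt{14/11055}\, a_7 , \tag{12-60c}$$

$$\hat{b}_4 = 2\sqrt{15/43}\, a_4 + 88\sqrt{15/214441}\, a_{11} , \tag{12-60d}$$

$$\hat{b}_5 = \sqrt{10/7}\, a_5 , \tag{12-60e}$$

$$\hat{b}_6 = \sqrt{10/7}\, a_6 , \tag{12-60f}$$

$$\hat{b}_7 = 10\sqrt{35/2211}\, a_7 , \tag{12-60g}$$

$$\hat{b}_8 = 10\sqrt{35/2211}\, a_8 , \tag{12-60h}$$

$$\hat{b}_9 = (2/3)\sqrt{5}\, a_9 , \tag{12-60i}$$

$$\hat{b}_{10} = 2\sqrt{35/103}\, a_{10} , \tag{12-60j}$$

and

$$\hat{b}_{11} = 14\sqrt{43/4987}\, a_{11} . \tag{12-60k}$$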

It is clear that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$. For astigmatism ($H_5$ and $H_6$), coma ($H_7$ and $H_8$), and spherical aberration ($H_{11}$), the $\hat{b}_j$ coefficient is larger than the corresponding hexagonal coefficient by a factor of $\sqrt{10/7} \approx 1.20$, $10\sqrt{35/2211} \approx 1.26$, and $14\sqrt{43/4987} \approx 1.30$, respectively. The astigmatism coefficients $\hat{b}_5$ and $\hat{b}_6$ change if a 15-polynomial expansion is considered. For example, $\hat{b}_5$ then contains contributions from $a_{13}$ and $a_{15}$, as well. The tip and tilt coefficients $\hat{b}_2$ and $\hat{b}_3$ change further if polynomials $H_{16}$ and $H_{17}$ are included in the expansion. Moreover, $H_{16}$ also contributes to the coma coefficient $\hat{b}_8$, and $H_{17}$ similarly contributes to the coma coefficient $\hat{b}_7$. The piston and defocus coefficients $\hat{b}_1$ and $\hat{b}_4$ do not change until the secondary spherical aberration polynomial $H_{22}$ is included with its coefficient $a_{22}$. Its inclusion also affects the primary spherical aberration coefficient $\hat{b}_{11}$. Thus, it is easy to see which $\hat{b}_j$ coefficients change, when, and by how much, depending on the number of polynomials used in the expansion.
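The dependence of the circle coefficients on the number of polynomials can be illustrated with a small calculation of $\hat{b} = M^T a$. The sketch below stores the nonzero entries of Table 12-6, applies them to a hypothetical coefficient set truncated to 4 and then 11 terms, and also checks that the entries of Table 12-8 invert Table 12-6. The off-diagonal entries of $M^{-1}$ are entered as negative here, which is what the product $M M^{-1} = I$ requires. All function names and sample values are assumptions made for this example.

```python
import math

s = math.sqrt

# Nonzero entries of M (Table 12-6), keyed by 1-based (row j, column i):
# H_j = sum_i M[j, i] * Z_i
M = {
    (1, 1): 1.0,
    (2, 2): s(6 / 5), (3, 3): s(6 / 5),
    (4, 1): s(5 / 43), (4, 4): 2 * s(15 / 43),
    (5, 5): s(10 / 7), (6, 6): s(10 / 7),
    (7, 3): 16 * s(14 / 11055), (7, 7): 10 * s(35 / 2211),
    (8, 2): 16 * s(14 / 11055), (8, 8): 10 * s(35 / 2211),
    (9, 9): 2 * s(5) / 3,
    (10, 10): 2 * s(35 / 103),
    (11, 1): 521 / s(1072205), (11, 4): 88 * s(15 / 214441),
    (11, 11): 14 * s(43 / 4987),
}

# Nonzero entries of M^{-1} (Table 12-8); off-diagonal terms taken as negative
M_INV = {
    (1, 1): 1.0,
    (2, 2): s(5 / 6), (3, 3): s(5 / 6),
    (4, 1): -1 / (2 * s(3)), (4, 4): s(43 / 15) / 2,
    (5, 5): s(7 / 10), (6, 6): s(7 / 10),
    (7, 3): -8 / (5 * s(15)), (7, 7): s(2211 / 35) / 10,
    (8, 2): -8 / (5 * s(15)), (8, 8): s(2211 / 35) / 10,
    (9, 9): 3 / (2 * s(5)),
    (10, 10): s(103 / 35) / 2,
    (11, 1): -1 / (2 * s(5)), (11, 4): -22 / (7 * s(43)),
    (11, 11): s(4987 / 43) / 14,
}

def circle_from_hex(a):
    """Return b_hat = M^T a for hexagonal coefficients a_1..a_J (J <= 11)."""
    J = len(a)
    b = [0.0] * J
    for (j, i), m in M.items():
        if j <= J:                 # M is lower triangular, so i <= j <= J
            b[i - 1] += m * a[j - 1]
    return b

def max_identity_error(A, B, n=11):
    """Largest entry of |A B - I| for two sparse n x n matrices."""
    err = 0.0
    for r in range(1, n + 1):
        for c in range(1, n + 1):
            p = sum(A.get((r, k), 0.0) * B.get((k, c), 0.0)
                    for k in range(1, n + 1))
            err = max(err, abs(p - (1.0 if r == c else 0.0)))
    return err

# Hypothetical hexagonal coefficients a_1..a_11, in waves
a = [0.5, 0.2, -0.1, 0.3, 0.0, 0.0, 0.1, 0.1, 0.0, 0.0, 0.2]
b4 = circle_from_hex(a[:4])          # 4-polynomial expansion
b11 = circle_from_hex(a)             # 11-polynomial expansion
changes = [b11[i] - b4[i] for i in range(4)]
inv_err = max_identity_error(M, M_INV)
print(changes, inv_err)
```

With nonzero $a_7$, $a_8$, and $a_{11}$, all four entries of `changes` are nonzero, in line with Eqs. (12-60a–d), while `inv_err` stays at round-off level.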

12.4.2 Interferometer Setting Errors


The estimated wavefront obtained by using only the first four polynomials represents
the best-fit parabolic approximation of the aberration function in a least-squares sense. In
terms of the Zernike polynomials, it can be written as

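$$\hat{W}(x, y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-61a}$$

$$\phantom{\hat{W}(x, y)} = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\,\hat{b}_4 \left(2\rho^2 - 1\right) . \tag{12-61b}$$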

Similarly, it can be written in terms of the orthonormal hexagonal polynomials as

$$\hat{W}(x, y) = a_1 H_1 + a_2 H_2 + a_3 H_3 + a_4 H_4 \tag{12-62a}$$

$$\phantom{\hat{W}(x, y)} = a_1 + 2\sqrt{6/5}\, a_2 x + 2\sqrt{6/5}\, a_3 y + a_4 \left[\sqrt{5/43} + 6\sqrt{5/43}\left(2\rho^2 - 1\right)\right] . \tag{12-62b}$$
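As a quick numerical cross-check of Eqs. (12-61b) and (12-62b), the two forms can be evaluated at a few points after converting the hexagonal coefficients with Eqs. (12-59a–d). This is a minimal sketch with hypothetical coefficient values and sample points; the function names are not from the text.

```python
import math

K = math.sqrt(5 / 43)

def circle_from_hex_4(a):
    """Eqs. (12-59a-d): circle coefficients from hexagonal coefficients."""
    a1, a2, a3, a4 = a
    return (a1 + K * a4,
            math.sqrt(6 / 5) * a2,
            math.sqrt(6 / 5) * a3,
            2 * math.sqrt(15 / 43) * a4)

def w_circle(b, x, y):
    """Eq. (12-61b): 4-term Zernike circle form."""
    b1, b2, b3, b4 = b
    rho2 = x * x + y * y
    return b1 + 2 * b2 * x + 2 * b3 * y + math.sqrt(3) * b4 * (2 * rho2 - 1)

def w_hex(a, x, y):
    """Eq. (12-62b): 4-term orthonormal hexagonal form."""
    a1, a2, a3, a4 = a
    rho2 = x * x + y * y
    return (a1 + 2 * math.sqrt(6 / 5) * (a2 * x + a3 * y)
            + a4 * (K + 6 * K * (2 * rho2 - 1)))

a = (0.5, 0.2, -0.1, 0.3)                      # hypothetical, in waves
b = circle_from_hex_4(a)
points = [(0.0, 0.0), (0.3, -0.4), (-0.5, 0.5), (1.0, 0.0)]
max_diff = max(abs(w_circle(b, x, y) - w_hex(a, x, y)) for x, y in points)
print(max_diff)
```

The difference stays at round-off level at every sample point, including the origin, so the two 4-term expansions describe the same estimated wavefront even though their individual coefficients differ.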

Comparing the right-hand sides of Eqs. (12-61b) and (12-62b) and utilizing Eqs. (12-59a–d), it is seen that the coefficients of $x$, $y$, and $x^2 + y^2$, representing the tip, tilt, and defocus values obtained from the Zernike coefficients, are the same as those obtained from the hexagonal coefficients. The estimated piston from the Zernike expansion of Eq. (12-61b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$. Substituting for $\hat{b}_1$ and $\hat{b}_4$ from Eqs. (12-59a–d), we find that it is
the same as a1