Optical Imaging and Aberrations

PART III

WAVEFRONT ANALYSIS

VIRENDRA N. MAHAJAN

AND

COLLEGE OF OPTICAL SCIENCES - THE UNIVERSITY OF ARIZONA

Library of Congress Cataloging-in-Publication Data

Mahajan, Virendra N.

Optical imaging and aberrations, part III: wavefront analysis / Virendra N. Mahajan

pages cm.

Includes bibliographical references and index.

ISBN 978-0-8194-9111-4

1. Optical measurements. 2. Aberration--Measurement. 3. Orthogonal decompositions.

4. Orthogonal polynomials. I. Title.

QC367.M24 2013

621.36--dc23

2013018827

Published by

SPIE

P.O. Box 10

Bellingham, Washington 98227-0010 USA

Phone: +1 360.676.3290

Fax: +1 360.647.1445

Email: Books@spie.org

Web: http://spie.org

All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means without written permission of the publisher.

The content of this book reflects the work and thought of the author(s). Every effort has been made to publish reliable and accurate information herein, but the publisher is not responsible for the validity of the information or for any outcomes resulting from reliance thereon.

First printing

Front cover: Shown from left to right are the aberration-free PSFs of optical imaging systems with circular, annular, hexagonal, elliptical, rectangular, and square pupils.

To my grandchildren

FOREWORD

For years Vini Mahajan has been publishing a book series on optical imaging and aberrations. Part I of the series on Ray Geometrical Optics was published in 1998, and Part II on Wave Diffraction Optics followed in 2001. A second edition of Part II appeared in 2011. Now Vini has written Part III on Wavefront Analysis, which should be of interest to anyone working in the fields of optical design, fabrication, or testing.

Part III addresses the analysis of optical imaging systems with pupils of different shapes. The book starts with an excellent introduction to optical imaging and aberrations. These first two chapters should be of interest to anyone working in optics. Chapter 3 describes orthonormal polynomials and the Gram–Schmidt orthonormalization process for obtaining orthonormal polynomials over one domain from those that are orthonormal over another.

Chapter 4 is a long and complete chapter on imaging and aberrations for optical systems with circular pupils. The chapter covers the PSF and OTF for aberration-free imaging, Strehl ratio and aberration balancing and tolerancing, and a very complete description of Zernike circle polynomials. Isometric, interferometric, and imaging characteristics of the circle polynomial aberrations are very nicely explained and illustrated. The important relationship between the circle polynomials and the classical aberrations is discussed. Since optical systems generally have circular pupils, this chapter will be of use to almost anyone working in optics.

The next several chapters are intended for readers interested in optical systems with noncircular or apodized circular or annular pupils. Much of this material is difficult to find in such detail elsewhere. The chapters start with a brief discussion of aberration-free imaging that includes both the PSF and the OTF of the optical system, as this is potentially the ultimate goal of any optical design or test. Then the polynomials appropriate for systems with pupils of different shapes representing balanced classical aberrations are described in detail. As in the case of the circle polynomial aberrations, the isometric, interferometric, and PSF plots of the first forty-five polynomial aberrations for systems with hexagonal, elliptical, annular, rectangular, and square pupils facilitate understanding of their significance. Systems with circular and annular pupils with Gaussian illumination, anamorphic systems with square and circular pupils, and those with circular and annular sector pupils are also discussed thoroughly.

Anyone thinking of using the Zernike circle polynomials for wavefront analysis of systems with noncircular pupils should read Chapter 12, where their pitfalls are illustrated by applying them to systems with annular and hexagonal pupils. Numerical examples on the calculation of the orthonormal aberration coefficients from the wavefront or the wavefront slope data given in Chapter 14 add to the utility and practicality of the book. A summary at the end of each chapter is quite useful, as it describes the essence of the content.

Vini is an excellent writer with the gift of writing complex topics in a simplified, yet rigorous, manner. As in the first two volumes of this book series, the material presented in Part III is thorough and detailed, and much of it is from his own publications. Wavefront Analysis is primarily analytical in nature, but it is generally easy to read with a lot of examples and numerical results. Both students and experienced optical engineers and scientists who have a need for wavefront analysis of optical systems will find it to be extremely useful.

June 2013

TABLE OF CONTENTS

Preface

Acknowledgments

Symbols and Notation

1.1 Introduction

1.2 Diffraction Image

1.2.1 Pupil Function

1.2.2 PSF

1.2.3 OTF

1.3 Strehl Ratio

1.3.1 General Expression

1.3.2 Approximate Expression in Terms of Aberration Variance

1.4 Aberration Balancing

1.5 Summary

References

2.1 Introduction

2.2 Optical Imaging

2.3 Wave and Ray Aberrations

2.4 Defocus Aberration

2.5 Wavefront Tilt

2.6 Aberration Function of a Rotationally Symmetric System

2.7 Observation of Aberrations: Interferograms

2.8 Summary

References

ORTHONORMALIZATION

3.1 Introduction

3.2 Orthonormal Polynomials

3.3 Equivalence of Orthogonality-Based Coefficients and Least-Squares Fitting

3.4 Orthonormalization of Zernike Circle Polynomials over Noncircular Pupils

3.5 Unit Pupil

3.6 Summary

References

4.1 Introduction

4.2 Pupil Function

4.3 Aberration-Free Imaging

4.3.1 PSF

4.3.2 OTF

4.4 Strehl Ratio and Aberration Tolerance

4.4.1 Strehl Ratio

4.4.2 Defocus Strehl Ratio

4.4.3 Approximate Expressions for Strehl Ratio

4.5 Balanced Aberrations

4.6 Description of Zernike Circle Polynomials

4.6.1 Analytical Form

4.6.2 Circle Polynomials in Polar Coordinates

4.6.3 Polynomial Ordering

4.6.4 Number of Circle Polynomials through a Certain Order n

4.6.5 Relationships among the Indices n, m, and j

4.6.6 Uniqueness of Circle Polynomials

4.6.7 Circle Polynomials in Cartesian Coordinates

4.7 Zernike Circle Coefficients of a Circular Aberration Function

4.8 Symmetry Properties of Images Aberrated by a Circle Polynomial Aberration

4.8.1 Symmetry of PSF

4.8.2 Symmetry of OTF

4.9 Isometric, Interferometric, and Imaging Characteristics of Circle Polynomial Aberrations

4.9.1 Isometric Characteristics

4.9.2 Interferometric Characteristics

4.9.3 PSF Characteristics

4.9.4 OTF Characteristics

4.10 Circle Polynomials and Their Relationships with Classical Aberrations

4.10.1 Introduction

4.10.2 Wavefront Tilt and Defocus

4.10.3 Astigmatism

4.10.4 Coma

4.10.5 Spherical Aberration

4.10.6 Seidel Coefficients from Zernike Coefficients

4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing

4.11 Zernike Coefficients of a Scaled Pupil

4.11.1 Theory

4.11.2 Application to a Seidel Aberration Function

4.11.3 Numerical Example

4.12 Summary

References

5.1 Introduction

5.2 Aberration-Free Imaging

5.2.1 PSF

5.2.2 OTF

5.3 Strehl Ratio and Aberration Balancing

5.4 Orthonormalization of Circle Polynomials over an Annulus

5.5 Annular Polynomials

5.6 Annular Coefficients of an Annular Aberration Function

5.7 Strehl Ratio for Annular Polynomial Aberrations

5.8 Isometric, Interferometric, and Imaging Characteristics of Annular Polynomial Aberrations

5.9 Summary

References

6.1 Introduction

6.2 Gaussian Pupil

6.3 Aberration-Free Imaging

6.3.1 PSF

6.3.2 Optimum Gaussian Radius

6.3.3 OTF

6.4 Strehl Ratio and Aberration Balancing

6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil

6.6 Gaussian Circle Polynomials Representing Balanced Primary Aberrations for a Gaussian Circular Pupil

6.7 Weakly Truncated Gaussian Pupils

6.8 Aberration Coefficients of a Gaussian Circular Aberration Function

6.9 Orthonormalization of Annular Polynomials over a Gaussian Annular Pupil

6.10 Gaussian Annular Polynomials Representing Balanced Primary Aberrations for a Gaussian Annular Pupil

6.11 Aberration Coefficients of a Gaussian Annular Aberration Function

6.12 Summary

References

7.1 Introduction

7.2 Pupil Function

7.3 Aberration-Free Imaging

7.3.1 PSF

7.3.2 OTF

7.4 Hexagonal Polynomials

7.5 Hexagonal Coefficients of a Hexagonal Aberration Function

7.6 Isometric, Interferometric, and Imaging Characteristics of Hexagonal Polynomial Aberrations

7.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio

7.7.1 Defocus

7.7.2 Astigmatism

7.7.3 Coma

7.7.4 Spherical Aberration

7.7.5 Strehl Ratio

7.8 Summary

References

8.1 Introduction

8.2 Pupil Function

8.3 Aberration-Free Imaging

8.3.1 PSF

8.3.2 OTF

8.4 Elliptical Polynomials

8.5 Elliptical Coefficients of an Elliptical Aberration Function

8.6 Isometric, Interferometric, and Imaging Characteristics of Elliptical Polynomial Aberrations

8.7 Seidel Aberrations and Their Standard Deviations

8.7.1 Defocus

8.7.2 Astigmatism

8.7.3 Coma

8.7.4 Spherical Aberration

8.8 Summary

References

CHAPTER 9: SYSTEMS WITH RECTANGULAR PUPILS

9.1 Introduction

9.2 Pupil Function

9.3 Aberration-Free Imaging

9.3.1 PSF

9.3.2 OTF

9.4 Rectangular Polynomials

9.5 Rectangular Coefficients of a Rectangular Aberration Function

9.6 Isometric, Interferometric, and Imaging Characteristics of Rectangular Polynomial Aberrations

9.7 Seidel Aberrations and Their Standard Deviations

9.7.1 Defocus

9.7.2 Astigmatism

9.7.3 Coma

9.7.4 Spherical Aberration

9.8 Summary

References

10.1 Introduction

10.2 Pupil Function

10.3 Aberration-Free Imaging

10.3.1 PSF

10.3.2 OTF

10.4 Square Polynomials

10.5 Square Coefficients of a Square Aberration Function

10.6 Isometric, Interferometric, and Imaging Characteristics of Square Polynomial Aberrations

10.7 Seidel Aberrations and Their Standard Deviations

10.7.1 Defocus

10.7.2 Astigmatism

10.7.3 Coma

10.7.4 Spherical Aberration

10.8 Summary

References

CHAPTER 11: SYSTEMS WITH SLIT PUPILS

11.1 Introduction

11.2 Aberration-Free Imaging

11.2.1 PSF

11.2.2 Image of an Incoherent Slit

11.3 Strehl Ratio and Aberration Balancing

11.3.1 Strehl Ratio

11.3.2 Aberration Balancing

11.4 Slit Polynomials

11.5 Standard Deviation of a Primary Aberration

11.6 Summary

References

NONCIRCULAR PUPILS

12.1 Introduction

12.2 Relationship Between the Orthonormal and the Corresponding Zernike Circle Coefficients

12.3 Use of Zernike Circle Polynomials for the Analysis of an Annular Wavefront

12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients

12.3.2 Interferometer Setting Errors

12.3.3 Wavefront Fitting

12.3.4 Application to an Annular Seidel Aberration Function

12.3.4.1 Annular Coefficients

12.3.4.2 Circle Coefficients

12.3.4.3 Residual Aberration Function after Removing Interferometer Setting Errors

12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus

12.3.4.5 Numerical Example

12.4 Use of Zernike Circle Polynomials for the Analysis of a Hexagonal Wavefront

12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients

12.4.2 Interferometer Setting Errors

12.4.3 Numerical Example

12.5 Aberration Coefficients from Discrete Wavefront Data

12.6 Summary

References

CHAPTER 13: ANAMORPHIC SYSTEMS

13.1 Introduction

13.2 Gaussian Imaging

13.3 Classical Aberrations

13.4 Strehl Ratio and Aberration Balancing for a Rectangular Pupil

13.5 Aberration Polynomials Orthonormal over a Rectangular Pupil

13.6 Expansion of a Rectangular Aberration Function in Terms of Orthonormal Rectangular Polynomials

13.7 Anamorphic Imaging System with a Circular Pupil

13.7.1 Balanced Aberrations

13.7.2 Orthonormal Polynomials Representing Balanced Aberrations

13.8 Comparison of Polynomials for Rotationally Symmetric and Anamorphic Imaging Systems

13.9 Summary

References

14.1 Introduction ..........................................................................................................371

14.2 Zernike Coefficients from Wavefront Data....................................................... 372

14.2.1 Theory ......................................................................................................372

14.2.2 Numerical Example ................................................................................. 373

14.3 Zernike Coefficients from Wavefront Slope Data ............................................383

14.3.1 Theory ......................................................................................................383

14.3.2 Alternative Approach for Obtaining Zernike Coefficients from

Wavefront Slope Data..............................................................................388

14.3.3 Numerical Example ................................................................................. 393

14.4 Summary............................................................................................................... 398

References ......................................................................................................................399

PREFACE

This book is Part III of a series of books on Optical Imaging and Aberrations. Part I

on Ray Geometrical Optics and Part II on Wave Diffraction Optics were published

earlier. Part III is on Wavefront Analysis, which is an integral part of optical design,

fabrication, and testing. In optical design, rays are traced to determine the wavefront and

thereby the quality of a design. In optical testing, the fabrication errors and, therefore, the

associated aberrations are measured by way of interferometry. In both cases, the quality

of the wavefront is determined from the aberrations obtained at an array of points. The

aberrations thus obtained are used to calculate the mean, the peak-to-valley, and the

standard deviation values. While such statistical measures of the wavefront are part of

wavefront analysis, the purpose of this book is to determine the content of the wavefront

by decomposing the ray-traced or test-measured data in terms of polynomials that are

orthogonal over the expected domain of the data. These polynomials must include the

basic aberrations of wavefront defocus and tilt, and represent balanced classical

aberrations.

We start Part III with an outline of optical imaging in the presence of aberrations in

Chapter 1, i.e., on how to obtain the point-spread and optical transfer functions of an

imaging system with an arbitrarily shaped pupil. The Strehl ratio of a system as a measure

of image quality is introduced in this chapter, and shown to be dependent only on the

aberration variance when the aberration is small. It is followed in Chapter 2 with a brief

discussion of the wavefronts and aberrations. This chapter introduces the nomenclature of

aberrations. How to obtain the orthogonal polynomials over a certain domain from those

over another is discussed in Chapter 3. For systems with a circular pupil, the Zernike

circle polynomials are well known for wavefront analysis. They are discussed at length in

Chapter 4. These polynomials are orthogonalized over an annular pupil in Chapter 5, and

over a Gaussian pupil in Chapter 6. They are obtained similarly for systems with

hexagonal, elliptical, rectangular, square, and slit pupils in the succeeding chapters. For

each pupil, the polynomials are given in their orthonormal form so that an expansion

coefficient (with the exception of piston) represents the standard deviation of the

corresponding polynomial aberration term. The standard deviation of a Seidel aberration

with and without aberration balancing is also discussed in these chapters.

Since the Zernike circle polynomials form a complete set, a wavefront over any

domain can be expanded in terms of them. However, the pitfalls of their use over a

domain other than circular and resulting from the lack of their orthogonality over the

chosen domain are discussed in Chapter 12. Finally, the aberrations of anamorphic

systems are discussed, and polynomials suitable for their aberration analysis are given in

Chapter 13 for both rectangular and circular pupils. The use of the orthonormal

polynomials for determining the content of a wavefront is demonstrated in Chapter 14

by computer simulations of circular wavefronts. The determination of the aberration

coefficients from the wavefront slope data, as in a Shack–Hartmann sensor, is also

discussed in this chapter.

June 2013

ACKNOWLEDGMENTS

I would like to acknowledge the support I have received over the years from my employer, The Aerospace Corporation, in preparing Part

III on Wavefront Analysis in a series of books on Optical Imaging and Aberrations. My

special thanks go to my former classmate Dr. Bill Swantner for his constant advice on

and constructive critique of my work. I have benefitted greatly from his practical

expertise in both optical design and testing. The Sanskrit verse on p. xxiii was provided

by Professor Sally Sutherland of the University of California at Berkeley. Many thanks to

Professor James W. Wyant for writing the Foreword for this book.

I am grateful to Professor José Antonio Díaz Navas for carrying out many computer

calculations and preparing many of the figures. My thanks to Drs. Barry Johnson, James

Harvey, and Daniel Topa for reading an early version of the manuscript and suggesting to

include examples of wavefront analysis. I am grateful to Professor Eva Acosta for her

help with writing Chapter 14 on Numerical Wavefront Analysis, as my response to their

suggestion. Of course, any shortcomings or errors anywhere in the book are totally my

responsibility.

As in the past, I cannot say enough about the constant support I have received from

my wife Shashi over the many years it has taken me to complete this three-part series. I

dedicate Part III to my grandchildren.

Finally, I would like to thank SPIE Press Editors Dara Burrows and Scott McNeill,

and Manager Tim Lamkins for their quality support in bringing this book to publication.

It has always been a pleasure to work with the SPIE staff, starting with the Publications

Director Eric Pepper.

SYMBOLS AND NOTATION

$a_i$  aberration coefficient
$A$  amplitude
$A_i$  peak aberration coefficient
$B_d$  defocus coefficient
$B_j$  wave aberration polynomial
$B_t$  tilt coefficient
$c$  aspect ratio
$E_j$  elliptical polynomial
$F$  focal ratio
$G_j$  Gaussian or vector polynomial
$H_j$  hexagonal polynomial
$I$  irradiance
Im  imaginary part
$j$  polynomial number
$J_n$  Bessel function
$L_j$  Legendre polynomial
$M$  magnification
MTF  modulation transfer function
OTF  optical transfer function
$P$  object point
$P'$  Gaussian image point
$P_{ex}$  power in the exit pupil
$P_i$  image power
$P_n$  polynomial
$P(\cdot)$  pupil function
PSF  point-spread function
PTF  phase transfer function
$r$  radial coordinate
$r_c$  radius of circle
$\vec{r}_i$  image point position vector
$\vec{r}_p$  pupil point position vector
$R$  radius of reference sphere
Re  real part
$R_j$  rectangular polynomial
$R_n^m(\rho)$  Zernike radial polynomial
$S$  Strehl ratio
$S_{ex}$  area of exit pupil
$S_j$  square, sector, or ray aberration polynomial
$\vec{V}$  vector polynomial
$x, y$  Cartesian coordinates of a point
$W$  wave aberration
$Z_n^m$  Zernike circle polynomial
$Z_j$  Zernike circle polynomial
$\vec{v}_i$  image spatial frequency vector
$v$  normalized spatial frequency
$\tau$  optical transfer function
$\rho = r/a$  normalized radial coordinate
$\theta$  polar angle of a position vector
$\phi$  polar angle of frequency vector
$\epsilon$  obscuration or aspect ratio
$\delta(\cdot)$  Dirac delta function
$\delta_{ij}$  Kronecker delta
$\Delta$  longitudinal defocus
$\Phi$  phase aberration
$\rho, \theta$  polar coordinates of a point
$\lambda$  optical wavelength
$\xi, \eta$  spatial frequency coordinates
$\sigma_W$  standard deviation (wave)
$\sigma_\Phi$  standard deviation (phase)

Anantaratnaprabhavasya yasya himaṃ na saubhāgyavilopi jātam

Eko hi doṣo guṇasannipāte nimajjatīndoḥ kiraṇeṣvivāṅkaḥ

The snow does not diminish the beauty of the Himalayan mountains

which are the source of countless gems. Indeed, one flaw is lost

among a host of virtues, as the moon’s dark spot is lost among its rays.

PART III

WAVEFRONT ANALYSIS

CHAPTER 1

OPTICAL IMAGING

1.5 Summary................................................................................................................. 11

References ........................................................................................................................12

Chapter 1

Optical Imaging

1.1 INTRODUCTION

The position and the size of the Gaussian image of an object formed by an optical

imaging system are determined by using its Gaussian imaging equations. The aperture stop

of the system limits the amount of light entering it the most. Its entrance pupil determines

the amount of light from an object that enters it, and the exit pupil determines how that

light is distributed in the image. The Gaussian image is an exact replica of the object,

except for its magnification. The diffraction image of an isoplanatic incoherent object is

given by the convolution of the Gaussian image and the diffraction image of a point

object, called the point-spread function (PSF). In the spatial frequency domain, the

spectrum of the image is correspondingly given by the product of the optical transfer

function (OTF), which is the Fourier transform of the PSF, and the spectrum of the

Gaussian image. The image is obtained by inverse Fourier transforming its spectrum [1].

We define a pupil function, representing the complex amplitude at the exit pupil, and give

equations for obtaining the PSF and the OTF.

A measure of the quality of an image is its Strehl ratio, which represents the ratio of the

central irradiances of the PSF with and without the aberration. This ratio is discussed and

simple but approximate expressions for it are derived for small aberrations in terms of the

variance of the aberration at the exit pupil. Since the Strehl ratio is higher for a smaller

variance, we discuss aberration balancing in which an aberration of a higher order is

balanced with one or more aberrations of lower order to minimize its variance and

thereby maximize the Strehl ratio. We discuss some general results on the effects of

nonuniform amplitude, called apodization, and nonuniform phase, called aberration, at

the exit pupil on the irradiance at the center of the reference sphere with respect to which

the aberration is defined. For a given total power in the pupil and, therefore, in the image

of a point object, maximum central irradiance is obtained for a system with an

unapodized and unaberrated pupil. Moreover, the peak value of an unaberrated image lies

at the center of curvature of the reference sphere regardless of the apodization of the

pupil. Generally, the effect of even large amplitude variations across the pupil is

relatively small compared to that of even small aberrations.

The Gaussian image of a point object formed by an imaging system is determined by

using Gaussian optics. In the Gaussian approximation, the aberrations are completely

neglected, and all of the rays originating at the point object and transmitted by the system

pass through the Gaussian image point. In reality, however, when the object rays are

traced through the system, they do not generally pass through the Gaussian image point

due to the aberrations. Instead, they are distributed in the vicinity of the image point, and

their distribution is referred to as the spot diagram. In practice, even if the aberrations are

absent or neglected, the light is distributed in a finite region around the Gaussian image

point due to its diffraction by the system. The diffraction image of a point object is called

the PSF of the system, and the aberration-free image is referred to as the diffraction-

limited image. The image of an extended object is determined by adding the amplitude or

the irradiance images of its small elements, depending on whether the object radiation is

coherent or incoherent.

A system is called isoplanatic for a small enough object if the distribution of light in

the image of any point on it is approximately the same, except for its location in the

image plane. Thus, over a small field of view, the image of a point object is shift

invariant. For an incoherent isoplanatic object, the diffraction image can be obtained by

convolving the Gaussian image (which is an exact replica of the object except for its size

and illumination scaling) with the diffraction PSF. In the spatial frequency domain, the

spectrum of the image is correspondingly given by the product of the OTF, which is the

Fourier transform of the PSF, and the spectrum of the Gaussian image. The image is

obtained by inverse Fourier transforming its spectrum [1]. We define a pupil function,

representing the complex amplitude at the exit pupil, and give equations for obtaining the

PSF and the OTF.
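
The equivalence of the two descriptions above (convolution in the space domain, product of spectra in the frequency domain) can be checked numerically. The following is a minimal sketch, assuming a one-dimensional, periodic toy scene and a Gaussian-shaped blur standing in for the PSF; the function names, the scene, and the blur width are illustrative choices and are not taken from the text.

```python
import numpy as np

def image_by_convolution(gaussian_image, psf):
    """Direct (circular) convolution of the Gaussian image with the PSF."""
    n = len(gaussian_image)
    out = np.zeros(n)
    for k in range(n):
        for m in range(n):
            out[k] += gaussian_image[m] * psf[(k - m) % n]
    return out

def image_by_spectrum(gaussian_image, psf):
    """Multiply the spectrum of the Gaussian image by the OTF, then invert."""
    otf = np.fft.fft(psf)                  # Fourier transform of the PSF
    spectrum = np.fft.fft(gaussian_image)  # spectrum of the Gaussian image
    return np.real(np.fft.ifft(otf * spectrum))

n = 64
scene = np.zeros(n)
scene[20], scene[40] = 1.0, 0.5            # two incoherent point sources
x = np.arange(n)
dist = np.minimum(x, n - x)                # periodic distance from sample 0
psf = np.exp(-0.5 * (dist / 2.0) ** 2)     # hypothetical blur, not a real system
psf /= psf.sum()                           # unit total power

img_space = image_by_convolution(scene, psf)
img_freq = image_by_spectrum(scene, psf)
print(np.max(np.abs(img_space - img_freq)))
```

Both routes give the same image to within rounding error, and the total power of the scene is preserved because the PSF is normalized to unit sum.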

Consider a point object located at $\vec{r}_o$ in the object plane radiating at a wavelength $\lambda$. Its Gaussian image formed by an imaging system determines the amount of light in the image, depending on the object intensity and on the distance from and the size of the entrance pupil. The wave at the exit pupil of the system is represented by the pupil function

$$
P(\vec{r}_p; \vec{r}_o) =
\begin{cases}
A(\vec{r}_p; \vec{r}_o) \exp[i \Phi(\vec{r}_p; \vec{r}_o)] , & \text{inside the exit pupil} \\
0 , & \text{outside the exit pupil} ,
\end{cases}
\tag{1-1}
$$

where $\vec{r}_p$ is the 2D position vector of a point in the plane of the pupil and $A(\vec{r}_p; \vec{r}_o)$ and $\Phi(\vec{r}_p; \vec{r}_o)$ are the amplitude and phase aberration functions of the system for the point object under consideration. The phase aberration $\Phi(\vec{r}_p; \vec{r}_o)$ is related to the wave aberration $W(\vec{r}_p; \vec{r}_o)$ according to

$$
\Phi(\vec{r}_p; \vec{r}_o) = \frac{2\pi}{\lambda} W(\vec{r}_p; \vec{r}_o) . \tag{1-2}
$$

The shape of the pupil is arbitrary. It may, for example, be circular or annular. The total power in the pupil and, therefore, in the image is given by

$$
P_{ex} = \int \left| P(\vec{r}_p; \vec{r}_o) \right|^2 d\vec{r}_p
= \int A^2(\vec{r}_p; \vec{r}_o) \, d\vec{r}_p . \tag{1-3}
$$

The image lies at a distance $R$ from the plane of the exit pupil, where $R$ is the radius of curvature of the Gaussian reference sphere with respect to which the aberration $W(\vec{r}_p; \vec{r}_o)$ is defined. The center of curvature of the reference sphere lies at the Gaussian image point (unless defocus is introduced). Generally, the amplitude function $A(\vec{r}_p; \vec{r}_o)$ is uniform across the exit pupil. An exception is the Gaussian pupil considered in Chapter 6. We assume a small field of view so that the dependence of the aberration function $W(\vec{r}_p; \vec{r}_o)$ on the location of the point object in the object plane can be neglected.

1.2.2 PSF

The PSF of the system imaging an incoherent object is given by [1]

$$
\mathrm{PSF}(\vec{r}_i) = \frac{1}{P_{ex} \lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( - \frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 , \tag{1-4}
$$

where the position vector $\vec{r}_i$ of the observation point is written with respect to the location $\vec{r}_g$ of the Gaussian image point, and $P_{ex}$ is the total power in the image. The irradiance distribution of the image is obtained by multiplying the PSF by the total power $P_{ex}$ in the image, i.e.,

$$
I(\vec{r}_i) = \frac{1}{\lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( - \frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-5}
$$

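For a uniformly illuminated pupil with irradiance $I_0$, the total power incident on and transmitted by the pupil is given by

$$
P_{ex} = I_0 S_{ex} , \tag{1-6}
$$

where $S_{ex}$ is the area of the exit pupil. Letting $A^2(\vec{r}_p) = I_0$, we may write the irradiance distribution

$$
I(\vec{r}_i) = \frac{I_0}{\lambda^2 R^2}
\left| \int \exp[i \Phi(\vec{r}_p)] \exp\left( - \frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-7}
$$

For an aberration-free system, i.e., for $\Phi(\vec{r}_p) = 0$, the irradiance at the Gaussian image point $\vec{r}_i = 0$ is

$$
I(0) = \frac{I_0}{\lambda^2 R^2} \left[ \int d\vec{r}_p \right]^2
= \frac{P_{ex} S_{ex}}{\lambda^2 R^2} . \tag{1-8}
$$

Normalizing Eq. (1-7) by this aberration-free central value, we obtain

$$
I(\vec{r}_i) = \frac{1}{S_{ex}^2}
\left| \int \exp[i \Phi(\vec{r}_p)] \exp\left( - \frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-9}
$$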

For convenience, we will refer to the irradiance distribution given by Eq. (1-9) as the PSF. Letting $\Phi(\vec{r}_p) = 0$, we obtain the aberration-free PSF.
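
As an illustration of Eq. (1-9), the normalized irradiance can be evaluated by direct summation over a sampled pupil. The sketch below assumes a uniformly illuminated circular pupil of radius $a$, pupil coordinates in units of $a$, and image coordinates in units of $\lambda F$ with $F = R/2a$, so that the kernel $(2\pi/\lambda R)\,\vec{r}_i \cdot \vec{r}_p$ becomes $\pi\,\vec{r}_i \cdot \vec{r}_p$. The function name `psf_at` and the grid size are hypothetical choices made for this example.

```python
import numpy as np

def psf_at(x_i, y_i, phase=None, n=201):
    """Eq. (1-9) at image point (x_i, y_i), in units of lambda*F, for a
    uniformly illuminated circular pupil of unit normalized radius."""
    c = np.linspace(-1.0, 1.0, n)
    xp, yp = np.meshgrid(c, c)
    inside = xp**2 + yp**2 <= 1.0
    # phase(xp, yp) returns the phase aberration in radians; None means aberration free
    phi = np.zeros_like(xp) if phase is None else phase(xp, yp)
    kernel = np.exp(-1j * np.pi * (x_i * xp + y_i * yp))
    field = np.exp(1j * phi) * kernel
    # Discrete version of |integral|^2 / S_ex^2: the pixel area cancels
    return np.abs(field[inside].sum())**2 / inside.sum()**2

center = psf_at(0.0, 0.0)            # aberration-free central value
first_dark_ring = psf_at(1.22, 0.0)  # near the first zero of the Airy pattern
half_unit = psf_at(0.5, 0.0)
print(center, half_unit, first_dark_ring)
```

With $\Phi = 0$ the central value is unity by construction, and the value near $1.22\,\lambda F$ is close to zero, as expected for the well-known Airy pattern of a circular pupil.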

1.2.3 OTF

The imaging process can be described in the space domain by way of the PSF, or in

the spatial frequency domain by way of the OTF. The OTF is the Fourier transform of the

PSF, defined as

$$
\tau(\vec{v}_i) = \int \mathrm{PSF}(\vec{r}_i) \exp(2\pi i \, \vec{v}_i \cdot \vec{r}_i) \, d\vec{r}_i , \tag{1-10}
$$

where $\vec{v}_i$ is a spatial frequency vector in the image plane and related to the corresponding frequency $\vec{v}_o$ in the object plane by the image magnification $M$ according to $\vec{v}_i = \vec{v}_o / M$.

Since the image of an isoplanatic incoherent object is given by the convolution of the PSF

and the Gaussian image, the (spatial frequency) spectrum of the image is given by the

product of the OTF and the spectrum of the Gaussian image. The image is obtained by

inverse Fourier transforming its spectrum.

Because of the relationship of the PSF with the pupil function, as in Eq. (1-4), the OTF can also be written as the autocorrelation of the pupil function in the form

$$
\tau(\vec{v}_i) = \frac{\int P(\vec{r}_p) P^*(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p}{\int \left| P(\vec{r}_p) \right|^2 d\vec{r}_p}
= P_{ex}^{-1} \int A(\vec{r}_p) A(\vec{r}_p - \lambda R \vec{v}_i) \exp[i Q(\vec{r}_p; \vec{v}_i)] \, d\vec{r}_p , \tag{1-11}
$$

where

$$
Q(\vec{r}_p; \vec{v}_i) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}_i) \tag{1-12}
$$

is a phase aberration difference function defined over the region of overlap of two pupils: one centered at $\vec{r}_p = 0$ and the other at $\vec{r}_p = \lambda R \vec{v}_i$. For an aberration-free system, $Q = 0$ and Eq. (1-11) reduces to

$$
\tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p) A(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p . \tag{1-13}
$$

For a uniformly illuminated pupil, the OTF is simply the fractional area of overlap of two pupils centered at $(0, 0)$ and $\lambda R (\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the spatial frequency vector $\vec{v}_i$.

The region of overlap is maximum and equal to the area of the pupil for $\vec{v}_i = 0$, giving a value of unity for $\tau(0)$. It represents the fact that the contrast of an image is zero for an object of zero contrast. Because of the finite size of the pupil, the overlap region reduces to zero at some frequency $\vec{v}_c$, called the cutoff frequency, and stays zero for larger frequencies, i.e., $\tau(\vec{v}_i) = 0$ for $|\vec{v}_i| \ge |\vec{v}_c|$. Because of isoplanatism, the spatial frequency spectrum of the image is obtained as the product of the spectrum of the Gaussian image and the OTF. Inverse Fourier transforming the image spectrum yields the space domain image.
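
The interpretation of the aberration-free OTF as a fractional area of overlap can be illustrated numerically. The sketch below assumes a uniformly illuminated circular pupil sampled on a zero-padded grid and computes the autocorrelation with FFTs; the grid size, the pupil radius in pixels, and the function names are choices made for this example. The closed-form expression used for comparison is the standard result for a circular pupil, with $v$ the frequency normalized by the cutoff frequency.

```python
import numpy as np

def otf_uniform_pupil(mask):
    """Normalized autocorrelation of a 0/1 pupil mask (aberration-free OTF)."""
    spectrum = np.fft.fft2(mask.astype(float))
    overlap = np.real(np.fft.ifft2(np.abs(spectrum)**2))  # overlap area in pixels
    return overlap / overlap[0, 0]                        # fractional overlap

n, radius = 256, 50                  # grid size and pupil radius in pixels
y, x = np.indices((n, n)) - n // 2
mask = x**2 + y**2 <= radius**2      # pupil occupies less than half the grid
otf = otf_uniform_pupil(mask)

def circular_otf(v):
    """Closed form for a circular pupil, v = frequency / cutoff frequency."""
    return (2 / np.pi) * (np.arccos(v) - v * np.sqrt(1 - v**2))

shift = 50                           # pupil shift in pixels, i.e., v = 0.5
print(otf[0, 0], otf[0, shift], circular_otf(shift / (2 * radius)))
```

A shift equal to the pupil diameter corresponds to the cutoff frequency; for larger shifts the two pupils no longer overlap and the computed OTF is zero to within rounding error.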

Since the PSF is a real function, it follows from Eq. (1-10) that

$$
\tau(\vec{v}_i) = \tau^*(-\vec{v}_i) , \tag{1-14}
$$

i.e., the OTF is complex symmetric or Hermitian. Therefore, its real part is even and its imaginary part is odd, i.e.,

$$
\mathrm{Re}\, \tau(\vec{v}_i) = \mathrm{Re}\, \tau(-\vec{v}_i) \tag{1-15}
$$

and

$$
\mathrm{Im}\, \tau(\vec{v}_i) = - \mathrm{Im}\, \tau(-\vec{v}_i) . \tag{1-16}
$$

The OTF can be written in the form

$$
\tau(\vec{v}_i) = \left| \tau(\vec{v}_i) \right| \exp[i \Psi(\vec{v}_i)] , \tag{1-17}
$$

where $|\tau(\vec{v}_i)|$ and $\Psi(\vec{v}_i)$ are its modulus and phase, called the modulation and phase transfer functions (MTF and PTF), respectively. Depending on the shape of the pupil and the type of the aberration, the OTF may be real. A phase of $\pi$ is sometimes associated with a negative value of the MTF. It represents contrast reversal, i.e., bright and dark regions in the object appear as dark and bright regions in the image.
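
The Hermitian property and the decomposition into MTF and PTF can be verified on a sampled pupil. The following sketch assumes a uniform circular pupil with a coma-like phase $3\rho^3 \cos\theta$ (in radians), chosen only because an odd aberration gives a complex OTF; the grid parameters and variable names are illustrative. The autocorrelation of Eq. (1-11) is computed with FFTs.

```python
import numpy as np

n, radius = 128, 24                        # grid size and pupil radius in pixels
y, x = np.indices((n, n)) - n // 2
xn, yn = x / radius, y / radius            # normalized pupil coordinates
inside = xn**2 + yn**2 <= 1.0
phase = 3.0 * (xn**2 + yn**2) * xn         # assumed coma-like phase, radians
pupil = inside * np.exp(1j * phase)

# Autocorrelation of the pupil function, normalized to unity at zero frequency
acorr = np.fft.ifft2(np.abs(np.fft.fft2(pupil))**2)
otf = acorr / acorr[0, 0].real
mtf, ptf = np.abs(otf), np.angle(otf)      # modulus and phase of the OTF

# OTF sampled at -v: reverse both axes, then roll so that index 0 stays in place
otf_neg = np.roll(otf[::-1, ::-1], 1, axis=(0, 1))

print(np.max(np.abs(otf - np.conj(otf_neg))))   # ~0: Hermitian
print(np.max(np.abs(otf.imag)))                 # nonzero: complex OTF for coma
```

The real part of the computed OTF is even and the imaginary part is odd, and the MTF does not exceed its zero-frequency value of unity.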

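By inverse Fourier transforming Eq. (1-10), we can obtain the PSF according to

$$
\mathrm{PSF}(\vec{r}_i) = \int \tau(\vec{v}_i) \exp(- 2\pi i \, \vec{v}_i \cdot \vec{r}_i) \, d\vec{v}_i . \tag{1-18}
$$

For a radially symmetric pupil with a radially symmetric aberration, e.g., a circular pupil aberrated by spherical aberration, the OTF and PSF Eqs. (1-10) and (1-18) yield

$$
\tau(v_i) = 2\pi \int \mathrm{PSF}(r_i) \, J_0(2\pi v_i r_i) \, r_i \, dr_i \tag{1-19}
$$

and

$$
\mathrm{PSF}(r_i) = 2\pi \int \tau(v_i) \, J_0(2\pi v_i r_i) \, v_i \, dv_i , \tag{1-20}
$$

respectively, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. The OTF is evidently real in this case.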

1.3.1 General Expression

The Strehl ratio of an image represents the ratio of its central irradiances with and without aberration. From Eq. (1-5), the ratio of the central irradiance with aberration and that at the Gaussian image point without aberration may be written [1]

$$
S = \frac{I_a(0)}{I_u(0)} , \tag{1-21}
$$

where $I_a(0)$ and $I_u(0)$ are the central irradiances with and without aberration, respectively, and $S$ is the Strehl ratio given by

$$
S = \frac{\left| \int A(\vec{r}_p) \exp[i \Phi(\vec{r}_p)] \, d\vec{r}_p \right|^2}{\left[ \int A(\vec{r}_p) \, d\vec{r}_p \right]^2} . \tag{1-22}
$$

Its value lies in the range

$$
0 \le S \le 1 . \tag{1-23}
$$

The Strehl ratio may also be determined from the OTF of the system. By definition,

$$
\mathrm{PSF}(0) = \int \tau(\vec{v}_i) \, d\vec{v}_i . \tag{1-25}
$$

Since the PSF at any point is a real quantity, only the real part of the aberrated OTF contributes to the integral, and the integral of its imaginary part must be zero. Hence, the Strehl ratio is given by

$$
S = \frac{\int \mathrm{Re}\, \tau_a(\vec{v}) \, d\vec{v}}{\int \tau_u(\vec{v}) \, d\vec{v}} . \tag{1-26}
$$

Thus, the Strehl ratio may be obtained by integrating the real part of the measured

aberrated OTF over all spatial frequencies and dividing it by a similar integral of the

calculated unaberrated OTF.
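
The agreement between the two routes to the Strehl ratio, namely the pupil integral of Eq. (1-22) and the ratio of OTF integrals of Eq. (1-26), can be checked on sampled data. The sketch below assumes a uniform circular pupil with a hypothetical coma-like phase; the OTFs are computed as normalized autocorrelations of the pupil function, and all names and grid parameters are illustrative.

```python
import numpy as np

def pupil_grid(n=128, radius=24):
    """Normalized pupil coordinates and a boolean mask of the circular pupil."""
    y, x = np.indices((n, n)) - n // 2
    xn, yn = x / radius, y / radius
    return xn, yn, xn**2 + yn**2 <= 1.0

def otf_from_pupil(pupil):
    """Normalized autocorrelation of a complex pupil function."""
    acorr = np.fft.ifft2(np.abs(np.fft.fft2(pupil))**2)
    return acorr / acorr[0, 0].real

xn, yn, inside = pupil_grid()
phase = 1.5 * (xn**2 + yn**2) * xn          # assumed coma-like phase, radians
p_aberrated = inside * np.exp(1j * phase)
p_unaberrated = inside.astype(complex)

# Eq. (1-26): integrate the real part of the aberrated OTF, divide by the
# integral of the unaberrated OTF
strehl_otf = (otf_from_pupil(p_aberrated).real.sum()
              / otf_from_pupil(p_unaberrated).real.sum())

# Eq. (1-22) with uniform amplitude
strehl_pupil = np.abs(p_aberrated.sum())**2 / inside.sum()**2

print(strehl_otf, strehl_pupil)
```

On a discrete grid the two values agree to within rounding error, because the sum of an autocorrelation over all shifts equals the squared modulus of the summed pupil function.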

The Strehl ratio gives a measure of the image quality in terms of the reduction in the

central irradiance due to the aberration in the system, including any defocus. Its value

being less than one is a consequence of the fact that the Huygens’ secondary spherical

wavelets on the reference sphere are not in phase due to the aberrations and, therefore,

they interfere nonconstructively at its center of curvature.

It can be shown that, for a given total power, the amplitude variations across the

pupil of an aberration-free system reduce the central irradiance, and any phase variations

(i.e., aberrations) further reduce it [2]. However, an irradiance reduced by phase

variations alone does not necessarily reduce any further if any amplitude variations are

also introduced. In fact, the amplitude variations can even increase this irradiance. For

example, the central value of a defocused PSF for a circular pupil decreases to zero as the

defocus aberration approaches one wave (see Section 4.4). The Huygens’ secondary

wavelets arriving at this point completely cancel each other. Hence, any amplitude

variations across the pupil will only help avoid complete cancellation and thereby

increase the central value. The maximum value of central irradiance is obtained when the system is unapodized and unaberrated [1,2]. It is shown in Chapter 6 how a Gaussian pupil, as in a Gaussian beam, yields a smaller central value.
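
The defocus example above can be reproduced numerically from Eq. (1-22). The sketch below assumes a circular pupil with one wave of defocus, $\Phi = 2\pi\rho^2$, and compares a uniform amplitude with a hypothetical Gaussian amplitude taper $A = \exp(-\rho^2)$; the taper strength, the grid size, and the function name are assumptions made for this illustration.

```python
import numpy as np

def strehl(amplitude, phase, inside):
    """Eq. (1-22) evaluated as sums over the pupil samples."""
    a = amplitude[inside]
    return np.abs(np.sum(a * np.exp(1j * phase[inside])))**2 / np.sum(a)**2

c = np.linspace(-1.0, 1.0, 801)
x, y = np.meshgrid(c, c)
rho2 = x**2 + y**2
inside = rho2 <= 1.0

defocus = 2 * np.pi * rho2        # one wave of defocus at the pupil edge
uniform = np.ones_like(rho2)      # unapodized pupil
gaussian = np.exp(-rho2)          # assumed Gaussian amplitude taper

s_uniform = strehl(uniform, defocus, inside)
s_gaussian = strehl(gaussian, defocus, inside)
print(s_uniform, s_gaussian)
```

For the uniform pupil the wavelets cancel and the central value is essentially zero, whereas the tapered amplitude leaves a small but nonzero central value. For this particular taper, carrying out the integral in $\rho^2$ gives $1/(1 + 4\pi^2) \approx 0.025$.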

The peak value of the aberrated irradiance distribution of the image of a point object

does not necessarily occur at the center of the reference sphere. However, the peak value

of an unaberrated image does occur at the center regardless of the apodization. The

Huygens’ secondary wavelets emanating from the spherical wavefront being equidistant

from this point are in phase. Hence, they interfere constructively, producing a maximum

possible value at this point.

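Equation (1-22) for the Strehl ratio can be written in an abbreviated form

$$
S = \left| \langle \exp(i \Phi) \rangle \right|^2 , \tag{1-27}
$$

where the angular brackets $\langle \cdots \rangle$ indicate a spatial average over the amplitude-weighted pupil, e.g.,

$$
\langle \Phi \rangle = \frac{\int A(\vec{r}_p) \Phi(\vec{r}_p) \, d\vec{r}_p}{\int A(\vec{r}_p) \, d\vec{r}_p} . \tag{1-28}
$$

Since $\langle \Phi \rangle$ is independent of $\vec{r}_p$, Eq. (1-27) can be written

$$
\begin{aligned}
S &= \left| \langle \exp[i(\Phi - \langle \Phi \rangle)] \rangle \right|^2 \\
  &= \langle \cos(\Phi - \langle \Phi \rangle) \rangle^2 + \langle \sin(\Phi - \langle \Phi \rangle) \rangle^2 \\
  &\ge \langle \cos(\Phi - \langle \Phi \rangle) \rangle^2 ,
\end{aligned}
\tag{1-29}
$$

equality holding when $\Phi$ is zero across the pupil, in which case $S = 1$. For small aberrations, expanding the cosine function in a power series and retaining the first two terms, we obtain the Maréchal result generalized for an apodized pupil

$$
S \gtrsim \left( 1 - \frac{\sigma_\Phi^2}{2} \right)^2 , \tag{1-30}
$$

where

$$
\sigma_\Phi^2 = \langle (\Phi - \langle \Phi \rangle)^2 \rangle \tag{1-31}
$$

is the variance of the phase aberration across the amplitude-weighted pupil. The quantity $\sigma_\Phi$ is the standard deviation of the aberration. We will refer to it as the “sigma value” or simply the “sigma” of the aberration.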

For small values of $\sigma_\Phi$, three approximate expressions have been used in the literature:

$$
S_1 \simeq \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{1-32}
$$

$$
S_2 \simeq 1 - \sigma_\Phi^2 , \tag{1-33}
$$

and

$$
S_3 \simeq \exp(- \sigma_\Phi^2) . \tag{1-34}
$$

The first is the Maréchal formula [3], the second is the commonly used expression obtained when the term in $\sigma_\Phi^4$ in the first is neglected [4,5], and the third is an empirical expression giving a better fit to the actual numerical results for various aberrations [6]. Just as $S_1 > S_2$ by $\sigma_\Phi^4 / 4$, similarly, $S_3 > S_1$ by approximately the same amount. The simplest expression to use is, of course, $S_2$, according to which $\sigma_\Phi^2$ gives the drop in the Strehl ratio. We note that, for a pupil of any shape, the Strehl ratio for a small aberration does not depend on its type but only on its variance across the apodized pupil. For a high-quality imaging system, a typical value of the Strehl ratio desired is 0.8, corresponding to a wave aberration with a sigma of $\sigma_w = \lambda / 14$, where $\sigma_w = (\lambda / 2\pi) \, \sigma_\Phi$.
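
The three approximations can be compared at $\sigma_w = \lambda/14$ with a few lines of code. The sketch below also evaluates, as a reference, the exact Strehl ratio for pure defocus $\Phi = B\rho^2$ over a uniform circular pupil, $[\sin(B/2)/(B/2)]^2$ with $\sigma_\Phi = B/(2\sqrt{3})$; this closed form is a standard result that is not derived in this section, and the function names are illustrative.

```python
import math

def strehl_approximations(sigma_phi):
    """Eqs. (1-32) to (1-34) for a phase sigma given in radians."""
    s1 = (1 - sigma_phi**2 / 2)**2     # Marechal formula
    s2 = 1 - sigma_phi**2              # quartic term neglected
    s3 = math.exp(-sigma_phi**2)       # empirical expression
    return s1, s2, s3

def strehl_defocus_exact(sigma_phi):
    """Exact Strehl ratio for defocus over a uniform circular pupil
    (assumed reference case, with B/2 = sqrt(3) * sigma_phi)."""
    half_b = math.sqrt(3) * sigma_phi
    return (math.sin(half_b) / half_b)**2

sigma_phi = 2 * math.pi / 14           # sigma_w = lambda / 14
s1, s2, s3 = strehl_approximations(sigma_phi)
s_exact = strehl_defocus_exact(sigma_phi)
print(s1, s2, s3, s_exact)
```

At this sigma value $S_2$ is very nearly 0.8, $S_1$ and $S_3$ are slightly larger, and for the defocus reference case $S_3$ lies closest to the exact value.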

1.4 ABERRATION BALANCING

In geometrical optics, we mix one aberration with another in order to minimize the

variance of the ray distribution in an image plane. For example, when we minimize the

variance by combining the primary spherical aberration with defocus aberration by

considering the ray distribution in a defocused image plane, the smallest spot, called the

circle of least confusion, has a radius that is 1/4 of its value in the Gaussian image plane

[7]. Similarly, when astigmatism is combined with defocus, the circle of least confusion

has a diameter equal to half the length of the line image in the Gaussian image plane. In

the case of coma, the ray distribution is asymmetric about the Gaussian image point and,

therefore, its centroid does not lie at this point. The centroid shift is equivalent to

introducing a wavefront tilt, or balancing coma with tilt.

Based on diffraction, the best image for small aberrations is the one for which the

variance of the wave aberration is minimum so that its Strehl ratio is maximum. Since the

value of variance depends on the shape of and the amplitude across the pupil, the value of

the balancing aberration also depends on those factors. Thus, for example, the value of

defocus for balancing spherical aberration for an annular pupil is different than that for a

circular pupil. Similarly, its value for a Gaussian circular pupil, as in the case of a circular

Gaussian beam, is different than that for a uniform circular pupil. The process of

balancing a higher-order aberration with one or more aberrations of the same and/or

lower orders to minimize the variance is called aberration balancing. Thus, for example,

secondary spherical aberration is balanced with primary spherical aberration and defocus,

and secondary coma is balanced with primary coma and tilt.
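
As a worked illustration of aberration balancing, consider primary spherical aberration $\rho^4$ combined with defocus $b\rho^2$ over a uniformly illuminated pupil. The sketch below finds the value of $b$ that minimizes the variance for a circular pupil and for an annular pupil with obscuration ratio $\epsilon = 0.5$. It assumes unit aberration coefficient and uses the fact that $t = \rho^2$ is uniformly distributed by area, so pupil averages become averages over $t$; the function name and the sampling are choices made for this example.

```python
import numpy as np

def balancing_defocus(eps=0.0, n=200001):
    """Defocus coefficient b minimizing the variance of rho^4 + b*rho^2 over a
    uniform annulus eps <= rho <= 1, with the unbalanced and balanced variances."""
    t = np.linspace(eps**2, 1.0, n)    # t = rho^2, uniform by pupil area
    spherical, defocus = t**2, t
    # Least-squares slope: b = -cov(rho^4, rho^2) / var(rho^2)
    b = -np.cov(spherical, defocus, bias=True)[0, 1] / np.var(defocus)
    return b, np.var(spherical), np.var(spherical + b * defocus)

b_circ, var_circ, var_circ_bal = balancing_defocus(0.0)
b_ann, var_ann, var_ann_bal = balancing_defocus(0.5)
print(b_circ, var_circ / var_circ_bal)   # about -1 and 16 for a circular pupil
print(b_ann)                             # about -1.25 for eps = 0.5
```

For the circular pupil the optimum is $b = -1$ and the variance drops by a factor of 16 (from $4/45$ to $1/180$), while for the annular pupil the optimum defocus is $-(1 + \epsilon^2)$, illustrating that the balancing aberration depends on the shape of the pupil.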

The balanced aberrations for a system with a certain shape of the pupil form the basis

of determining the orthogonal polynomial aberrations for the analysis of wavefronts

across the given pupil. The Zernike circle polynomials, for example, are the orthogonal

polynomial aberrations for a system with a circular pupil that represent the balanced

classical aberrations for such a system.

1.5 SUMMARY

The diffraction image of an isoplanatic incoherent object is given by the convolution

of its Gaussian image and the PSF. In the spatial frequency domain, the spectrum of the

image is given by the product of the OTF and the spectrum of the Gaussian image. The

image is obtained by inverse Fourier transforming its spectrum.

For a uniformly illuminated pupil, the aberration-free central irradiance is given by $P_{ex} S_{ex} / \lambda^2 R^2$, independent of the shape of the pupil [see Eq. (1-8)].

The aberrations of a system are neglected in Gaussian optics when determining the

location and the size of an image formed by the system. The aberration-free OTF of a

system with a uniformly illuminated pupil is simply equal to the fractional area of overlap

of two pupils whose separation depends on the spatial frequency vector.

An important measure of image quality is the Strehl ratio [see Eq. (1-21)], which

represents the ratio of the central irradiances of the image of a point object with and

without aberration. The Strehl ratio can also be obtained by integrating the real part of the

OTF of a system [see Eq. (1-26)]. For small aberrations, the Strehl ratio is determined by

the variance of the aberration according to, for example, Eq. (1-34), and it is independent

of the type of an aberration. The peak value of a PSF does not necessarily lie at its center,

as, for example, in the case of coma. For an apodized pupil, the aberration variance is

calculated over the amplitude-weighted pupil. A Strehl ratio of 0.8 is obtained when the

standard deviation $\sigma_w$ of the wave aberration is approximately $\lambda / 14$.

The variance of an aberration of a certain order can be reduced by mixing it with one

or more aberrations of lower order, thereby improving the Strehl ratio. The process of

mixing one aberration with others in this manner is called aberration balancing. The

polynomial aberrations used for wavefront analysis are not only orthogonal across the

pupil of a system, but also represent balanced classical aberrations for it.

References

Optics, 2nd ed. (SPIE Press, Bellingham, WA, 2011).

geometriques sur l'image d'un point lumineux,” Revue d'Optique 26, 257–277

(1947).

Groningen, The Netherlands (1942).

Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620

(1947).

variance,” J. Opt. Soc. Am. 73, 860–861 (1983).

(SPIE Press, Bellingham, WA, Second Printing 2001).

CHAPTER 2

2.8 Summary................................................................................................................. 31

References ........................................................................................................................33

Chapter 2

Optical Wavefronts and Their Aberrations

2.1 INTRODUCTION

The position and the size of the Gaussian image of an object formed by an optical

imaging system are determined by using its Gaussian imaging equations. We have stated in

Chapter 1 that the quality of the diffraction image depends on the aberrations of the

system. A spherical wave originating at a point object is incident on the system. The

image formed by the system is aberration free and perfect if the wave exiting from the

system is also spherical. In this case, the rays originating at the point object and traced

through the system all pass through the Gaussian image point.

If the optical wavefront exiting from the exit pupil is not spherical, its optical

deviations from a spherical form represent its wave aberrations. These wave aberrations

play a fundamental role in determining the quality of the aberrated image. The rays traced

from the object point through the system, instead of passing through the Gaussian image

point, intersect the image plane in its vicinity. The distance of the point of intersection of

a ray in the image plane from the Gaussian image point is called the transverse ray

aberration, and the distribution of the rays is referred to as the spot diagram. In this

chapter, we define the wave and ray aberrations and give a relationship between them.

We relate the longitudinal defocus of an image to the defocus wave aberration, and its

wavefront tilt to the wavefront tilt aberration. Next, the possible aberrations of an

imaging system that is rotationally symmetric about its optical axis are described. The

aberration function of the system is expanded in a power series of the object and pupil

coordinates, and primary (or Seidel), secondary (or Schwarzschild), and tertiary

aberrations are introduced [1]. We also discuss briefly how the aberrations may be

observed using a Twyman–Green interferometer and what the fringe pattern of a primary

or Seidel aberration looks like. A short summary of the chapter is given at the end.

An optical imaging system consists of a series of refracting and/or reflecting

surfaces. The surfaces refract or reflect light rays from an object to form its image. The

image obtained according to geometrical optics in the Gaussian approximation, i.e.,

according to Snell's law in which the sines of the angles are replaced by the angles, is

called the Gaussian image. The Gaussian approximation and the Gaussian image are

often referred to as the paraxial approximation and the paraxial image, respectively. We

assume that the surfaces are rotationally symmetric about a common axis called the

optical axis (OA). Figure 2-1 illustrates the imaging of an on-axis point object P0 and an

off-axis point object P, respectively, by an optical system consisting of two thin lenses.

P′ and P0′ are the corresponding Gaussian image points. An object and its image are

called conjugates of each other, i.e., if one of the two conjugates is an object, the other is

its image. The location and size of the image of an extended object is determined by

using its Gaussian imaging equations.


16 OPTICAL WAVEFRONTS AND THEIR ABERRATIONS


Figure 2-1. (a) Imaging of an on-axis point object P0 by an optical imaging system

consisting of two lenses L1 and L2 . OA is the optical axis. The Gaussian image is at

P0′. AS is the aperture stop; its image by L1 is the entrance pupil EnP, and its image

by L2 is the exit pupil ExP. CR0 is the axial chief ray, and MR0 is the axial marginal

ray. (b) Imaging of an off-axis point object P. The Gaussian image is at P′. CR is the

off-axis chief ray, and MR is the off-axis marginal ray.


An aperture in the system that physically limits the solid angle of the rays from a

point object the most is called the aperture stop (AS). For an extended (i.e., a nonpoint)

object, it is customary to consider the aperture stop as the limiting aperture for the axial

point object, and to determine vignetting, or blocking of some rays, by this stop for off-

axis object points. The object is assumed to be placed to the left of the system so that

light initially travels from left to right. The image of the stop by surfaces that precede it in

the sense of light propagation, i.e., by surfaces that lie between it and the object, is called

the entrance pupil (EnP). When observed from the object side, the entrance pupil appears

to limit the rays entering the system to form the image of the object. Similarly, the image

of the aperture stop by surfaces that follow it, i.e., by surfaces that lie between it and the

image, is called the exit pupil (ExP). The object rays reaching its image appear to be

limited by the exit pupil. Since the entrance and exit pupils are images of the stop by the

surfaces that precede and follow it, respectively, the two pupils are conjugates of each

other for the whole system, i.e., if one pupil is considered as the object, the other is its

image formed by the system.

An object ray passing through the center of the aperture stop and appearing to pass

through the centers of the entrance and exit pupils is called the chief (or the principal) ray

(CR). An object ray passing through the edge of the aperture stop is called a marginal ray

(MR). The rays lying between the center and the edge of the aperture, and, therefore,

appearing to lie between the center and edge of the entrance and exit pupils, are called

zonal rays.

It is possible that the stop of a system may also be its entrance and/or exit pupil. For

example, a stop placed to the left of a lens is also its entrance pupil. Similarly, a stop

placed to the right of a lens is also its exit pupil. Finally, a stop placed at a single thin lens

is both its entrance and exit pupils.

Consider an optical system imaging a point object P, as illustrated in Figure 2-2. The

object radiates a spherical wave. For perfect imaging, the diverging spherical wave

incident on the system is converted by it into a spherical wave converging to the Gaussian

image point P′. Generally, the wave exiting from real systems is only approximately

spherical.

The optical path length of a ray in a medium of refractive index n is equal to n times

its geometrical path length. Consider rays from a point object traced through the system

up to the exit pupil such that each one travels exactly the same optical path length. The

ray passing through the center of the pupil is called the chief ray, and represents the

reference ray with respect to which the optical path lengths of the other rays are

compared. The surface passing through the end points of the rays is called the system

wavefront, and it represents a surface of constant phase for the point object under

consideration. If the wavefront is spherical, with its center of curvature at the Gaussian


Figure 2-2. Perfect imaging of a point object P by an optical system at its Gaussian

image point P′.

image point, we say that the image is perfect. The rays transmitted by the system have

equal optical lengths in propagating from P to P′, and they all pass through P′. If,

however, the actual wavefront deviates from this spherical wavefront, called the

Gaussian reference sphere, we say that the image is aberrated. The rays reaching the

Gaussian reference sphere do not travel the same optical path length, and they intersect

the Gaussian image plane in the vicinity of P′. The optical deviations (i.e., the

geometrical deviations times the refractive index ni of the image space) of the wavefront

from a Gaussian reference sphere are called wave aberrations. The wave aberration of a

ray at a point on the reference sphere where the ray meets it is equal to the optical

deviation of the wavefront along that ray from the Gaussian reference sphere. It

represents the difference between the optical path lengths of the ray under consideration

and the chief ray in traveling from the point object to the reference sphere. Accordingly,

the wave aberration associated with the chief ray is zero. Since the optical path lengths of

the rays from the reference sphere to the Gaussian image point are equal, the wave

aberration of a ray is also equal to the difference between its optical path length from the

point object P to the Gaussian image point P′ and that of the chief ray.
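The definition of the wave aberration as an optical path length difference relative to the chief ray can be illustrated with a minimal Python sketch. It assumes that the optical path lengths of a few rays from the point object to the reference sphere are already known, for example from a ray trace. The function name and the numerical values are hypothetical.

```python
def wave_aberrations(path_lengths, chief_index=0):
    """Wave aberration of each ray: its optical path length to the
    reference sphere minus that of the chief ray."""
    reference = path_lengths[chief_index]
    return [opl - reference for opl in path_lengths]


# Hypothetical optical path lengths (mm) from the point object to the
# reference sphere; the first entry is taken to be the chief ray.
path_lengths_mm = [250.0000, 250.0003, 249.9998]
aberrations_mm = wave_aberrations(path_lengths_mm)
print(aberrations_mm)
```

By construction, the entry for the chief ray is zero, and a ray that travels an extra optical path length has a positive wave aberration.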

The wave aberration of a ray is positive if it has to travel an extra optical path length,

compared to the chief ray, in order to reach the Gaussian reference sphere. Figures 2-3a

and 2-3b illustrate the reference sphere S and the aberrated wavefront W for on-axis and

off-axis point objects, respectively. The reference sphere, which is centered at the

Gaussian image point P0′ in Figure 2-3a or P′ in Figure 2-3b, and the wavefront pass

through the center O of the exit pupil. The wave aberration ni Q̄Q of a general ray GR0

or GR, where ni is the refractive index of the image space, as shown in the figures, is

numerically positive. The coordinate system is also illustrated in these figures. We choose

a right-hand Cartesian coordinate system such that the optical axis lies along the z axis.

The object, entrance pupil, exit pupil, and Gaussian image lie in mutually parallel planes

that are perpendicular to this axis. Figure 2-4 illustrates the coordinate systems in the

object, exit pupil, and image planes. The origin of the coordinate system lies at O and the

Gaussian image plane lies at a distance zg from it along the z axis.

We assume that a point object such as P lies along the x axis. (There is no loss of

generality because of this since the system is rotationally symmetric about the optical

axis.) The zx plane containing the optical axis and the point object is called the

2.3 Wave and Ray Aberrations 19


Figure 2-3a. Aberrated wavefront for an on-axis point object. The reference sphere

S of radius of curvature R is centered at the Gaussian image point P0′. The

wavefront W and reference sphere pass through the center O of the exit pupil ExP.

A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,

where the z axis is along the optical axis OA of the imaging system. Angular

rotations α, β, and γ about the three axes are also indicated. CR0 is the chief ray,

and a general ray GR0 is shown intersecting the Gaussian image plane at P0″.


Figure 2-3b. Aberrated wavefront for an off-axis point object. The reference sphere

S of radius of curvature R is centered at the Gaussian image point P′. The value of

R in this figure is slightly larger than its value in Figure 2-3a. GR is a general ray

intersecting the Gaussian image plane at the point P″. By definition, the chief ray

(not shown) passes through O, but it may or may not pass through P′.


Figure 2-4. Right-hand coordinate system in object, exit pupil, and image planes.

The optical axis of the system is along the z axis, and the off-axis point object P is

assumed to be along the x axis, thus making the zx plane the tangential plane.

tangential or the meridional plane. The corresponding Gaussian image point P′ lying in

the Gaussian image plane along its x axis also lies in the tangential plane. This may be

seen by consideration of a tangential object ray and Snell’s law, according to which the

incident and the refracted (or reflected) rays at a surface lie in the same plane. The chief

ray always lies in the tangential plane. The plane normal to the tangential plane but

containing the chief ray is called the sagittal plane. As the chief ray bends when it is

refracted or reflected at an optical surface, so does the sagittal plane. It should be evident

that only the chief ray lies in both the tangential and sagittal planes, because it lies along

the line of intersection of these two planes.

Consider an image ray such as GR in Figure 2-3b passing through a point Q with

coordinates (x, y, z) on the reference sphere of radius of curvature R centered at the image

point. We let W(x, y) represent its wave aberration nQ̄Q, because z is related to x and y

by virtue of Q being on the reference sphere. It can be shown that the ray intersects the

Gaussian image plane at a point P″ whose coordinates with respect to the Gaussian

image point P′ are approximately given by [1,2]

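\[ (x_i, y_i) = \frac{R}{n}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right) , \tag{2-1} \]

where R is the radius of curvature of the reference sphere centered at the Gaussian

image point P′. For systems with narrow fields of view, P′ lies close to P0′, and we may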


replace R with zg. Note that in the case of an axial point object, R = zg. [Equation (2-1)

has been derived by Mahajan [1], Born and Wolf [2], and Welford [3]. Note, however,

that Welford uses a sign convention for the wave aberration that is opposite to ours.]

The displacement P0′P0″ in Figure 2-3a (or P′P″ in Figure 2-3b) of a ray from the

Gaussian image point is called its geometrical or transverse ray aberration, and its

coordinates (xi, yi) in the Gaussian image plane relative to the Gaussian image point are

called its ray aberration components. Since a ray is normal to a wavefront, the ray

aberration depends on the shape of the wavefront and, therefore, on its geometrical path

difference from the reference sphere. The division of W by n in Eq. (2-1) converts the

optical path length difference into geometrical path length difference. When an image is

formed in free space, as is often the case in practice, then n = 1. The angle δ ≈ P0′P0″/R

between the ideal ray QP0′ and the actual ray QP0″ is called the angular ray aberration.

The distribution of rays in an image plane is called the ray spot diagram.
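As a rough numerical illustration of Eq. (2-1), the following Python sketch estimates the ray aberration components from a wave aberration function by central finite differences. The function names, the sample aberration W = A(x² + y²)², and the numerical values are assumptions chosen only for illustration; image formation in free space (n = 1) is assumed by default.

```python
def ray_aberration(W, x, y, R, n=1.0, step=1e-4):
    """Approximate (x_i, y_i) = (R/n) * (dW/dx, dW/dy), as in Eq. (2-1),
    using central finite differences for the two partial derivatives."""
    dW_dx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dW_dy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return R / n * dW_dx, R / n * dW_dy


A = 1.0e-6   # assumed aberration coefficient, mm^-3
R = 100.0    # assumed radius of curvature of the reference sphere, mm


def W_example(x, y):
    """Hypothetical wave aberration W = A (x^2 + y^2)^2, in mm."""
    return A * (x * x + y * y) ** 2


# Ray through the pupil point (5 mm, 0). Analytically,
# x_i = (R/n) * 4 A x (x^2 + y^2) = 0.05 mm and y_i = 0.
xi, yi = ray_aberration(W_example, 5.0, 0.0, R)
print(xi, yi)
```

Repeating the calculation over a grid of pupil points and plotting the resulting (xi, yi) pairs would give a spot diagram.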

Consider a point Q(x, y) in the plane of the exit pupil. If (r, θ) are the polar coordinates of this point, as

illustrated in Figure 2-5, they are related to its rectangular coordinates (x, y) according to

\[ (x, y) = r(\cos\theta, \sin\theta) . \tag{2-2} \]

Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the

exit pupil plane and thus correspond to θ = 0 or π. Similarly, the sagittal rays, i.e., those

lying in a plane orthogonal to the tangential plane but containing the chief ray, lie along

the y axis of the exit pupil plane and thus correspond to θ = π/2 or 3π/2.


Figure 2-5. Circular exit pupil of radius a of an imaging system, and Cartesian and

polar coordinates (x, y) and (r, θ), respectively, of a point Q on the pupil.


We now discuss defocus wave aberration of a system and relate it to its longitudinal

defocus. Consider an imaging system for which the Gaussian image of a point object is

located at P1 . As indicated in Figure 2-6, let the wavefront for this point object be

spherical with a center of curvature at P2 (due, for example, to field curvature discussed

in Section 2.6 for an off-axis point object) such that P2 lies on the line OP1 joining the

center O of the exit pupil and the Gaussian image point P1 . The aberration of the

wavefront representing its optical deviation along a ray from the Gaussian reference

sphere is given by nQ2Q1 , where n is the refractive index of the image space, and Q2Q1,

as indicated in the figure, is approximately equal to the difference in the sags of the

reference sphere and the wavefront at a height r. (The sag of a surface at a certain point

on it represents its deviation at that point along its axis of symmetry from a plane surface

that is tangent to it at its vertex.) Thus, the defocus wave aberration at a point Q1 at a

distance r from the optical axis, representing the second-order difference, is given by

\[ W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^2 , \tag{2-3} \]

where z and R are the radii of curvature of the reference sphere S and the spherical

wavefront W centered at P1 and P2 , respectively, passing through the center O of the exit

pupil, and r is the distance of Q1 from the optical axis. We note that the defocus wave

aberration is proportional to r². If z ≈ R, then Eq. (2-3) may be written as follows:


Figure 2-6. Defocused wavefront. The wavefront W is spherical with a radius of

curvature R centered at P2. The reference sphere S with a radius of curvature z is

centered at P1 . Both W and S pass through the center O of the exit pupil ExP. The

ray Q2 P2 is normal to the wavefront at Q2 . OB represents the sag of Q1 .


\[ W(r) \simeq -\frac{n\Delta}{2R^2}\, r^2 , \tag{2-4} \]

where Δ = z − R is called the longitudinal defocus. We note that the defocus wave

aberration and the longitudinal defocus have numerically opposite signs.
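The step from Eq. (2-3) to Eq. (2-4) can be checked numerically. The sketch below evaluates both expressions for a small longitudinal defocus; the function names and the sample values of z, R, and r are assumptions made only for this illustration.

```python
def defocus_eq_2_3(r, z, R, n=1.0):
    """Eq. (2-3): W(r) = (n/2) (1/z - 1/R) r^2."""
    return 0.5 * n * (1.0 / z - 1.0 / R) * r ** 2


def defocus_eq_2_4(r, z, R, n=1.0):
    """Eq. (2-4): W(r) ~ -(n Delta / 2 R^2) r^2 with Delta = z - R."""
    delta = z - R
    return -n * delta / (2.0 * R ** 2) * r ** 2


# Assumed values in mm: the reference sphere is centered 0.1 mm beyond
# the center of curvature of the wavefront, so Delta = +0.1 mm.
z_mm, R_mm, r_mm = 100.1, 100.0, 5.0
w3 = defocus_eq_2_3(r_mm, z_mm, R_mm)
w4 = defocus_eq_2_4(r_mm, z_mm, R_mm)
print(w3, w4)
```

For these values both expressions are negative, in line with the opposite signs of the defocus wave aberration and the longitudinal defocus, and they differ by about one part in a thousand, which is the order of Δ/R.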

A defocus aberration is also introduced if the image is observed in a plane other than

the Gaussian image plane. Consider, for example, an imaging system forming an

aberration-free image at the Gaussian image point P2 (and not at P1, as in Figure 2-6).

Thus, the wavefront at the exit pupil is spherical passing through its center O with its

center of curvature at P2. Let the image be observed in a defocused plane passing through

a point P1, which lies on the line joining O and P2. For the observed image at P1 to be

aberration free, the wavefront at the exit pupil must be spherical with its center of

curvature at P1 . Such a wavefront forms the reference sphere with respect to which the

aberration of the actual wavefront must be defined. The aberration of the wavefront at a

point Q1 on the reference sphere is given by Eqs. (2-3) and (2-4).

If the exit pupil is circular with a radius a, then Eq. (2-4) may be written

\[ W(\rho) = B_d\, \rho^2 , \tag{2-5} \]

where ρ = r/a is the normalized radial coordinate and

\[ B_d \simeq -\frac{n\Delta}{8F^2} \tag{2-6} \]

represents the peak value of the defocus aberration with F = R/2a as the focal ratio or

the f-number of the image-forming light cone. Note that a positive value of Bd implies a

negative value of Δ, i.e., z smaller than R. Thus, an imaging system having a positive value of defocus

aberration Bd can be made defocus free if the image is observed in a plane lying farther

from the plane of the exit pupil, compared to the defocused image plane, by a distance

8BdF²/n. Similarly, a positive defocus aberration of Bd ≈ −nΔ/8F² is introduced into

the system if the image is observed in a plane lying closer to the plane of the exit pupil,

compared to the defocus-free image plane, by a distance |Δ|.
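To give a feel for the magnitudes involved, the following Python sketch converts between the peak defocus aberration and the longitudinal defocus using Eq. (2-6). The function names, the wavelength, and the f-number are illustrative assumptions.

```python
def peak_defocus(delta, f_number, n=1.0):
    """Eq. (2-6): B_d ~ -n Delta / (8 F^2)."""
    return -n * delta / (8.0 * f_number ** 2)


def longitudinal_defocus(b_d, f_number, n=1.0):
    """Eq. (2-6) solved for Delta: Delta ~ -8 B_d F^2 / n."""
    return -8.0 * b_d * f_number ** 2 / n


# Assumed example: one wave of defocus at a wavelength of 0.5 micrometer
# in an f/10 image-forming light cone, with the image formed in free space.
wavelength_mm = 0.5e-3
f_number = 10.0
b_d = 1.0 * wavelength_mm
delta_mm = longitudinal_defocus(b_d, f_number)
print(delta_mm)
```

With these assumed numbers, one wave of defocus corresponds to a longitudinal defocus of 0.4 mm in magnitude, and the magnitude grows as the square of the f-number.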

Now we describe the relationship between a wavefront tilt and the corresponding tilt

aberration. As indicated in Figure 2-7, consider a spherical wavefront centered at P2 in

the Gaussian image plane passing through the Gaussian image point P1 . The wave

aberration of the wavefront at Q1 is its optical deviation nQ2Q1 from a reference sphere

centered at P1 . It is evident that, for small values of the ray aberration P1P2 , the wavefront

and the reference sphere are tilted with respect to each other by an angle β. The

wavefront tilt may be due to an inadvertently tilted element of the imaging system or

distortion (discussed in Section 2.6) for an off-axis point object. The ray and the wave

aberrations can be written

\[ x_i = R\beta \tag{2-7} \]


Figure 2-7. Wavefront tilt. The spherical wavefront W is centered at P2 while the

reference sphere S is centered at P1 , such that the two spherical surfaces are tilted

with respect to each other by a small angle β = P1P2/R, where R is their radius of

curvature. The ray Q2 P2 is normal to the wavefront at Q2.

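and

\[ W(r, \theta) = n_i \beta\, r \cos\theta , \tag{2-8} \]

respectively, where P1P2 = xi, ni is the refractive index of the image space, and (r, θ) are the polar coordinates of the point Q1. Both

the wave and ray aberrations are numerically positive in Figure 2-7.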

Once again, for a system with a circular exit pupil of radius a, Eq. (2-8) may be

written

\[ W(\rho, \theta) = B_t\, \rho \cos\theta , \tag{2-9} \]

where ρ = r/a and

\[ B_t = n_i a \beta \tag{2-10} \]

is the peak value of the wavefront tilt aberration. Note that a positive value of Bt implies

that the wavefront tilt angle is also positive. Thus, if an aberration-free wavefront is

centered at P2 , then an observation with respect to P1 as the origin implies that we have

introduced a tilt aberration of Bt ρ cos θ.
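The relations between the ray displacement, the tilt angle, and the peak tilt aberration can be sketched in a few lines of Python. The sketch uses β = P1P2/R from Figure 2-7 and Eq. (2-10); the function names and the sample numbers are hypothetical.

```python
import math


def tilt_angle(x_i, R):
    """Wavefront tilt angle beta = x_i / R for a small ray displacement x_i."""
    return x_i / R


def peak_tilt(beta, a, n=1.0):
    """Eq. (2-10): B_t = n a beta for an exit pupil of radius a."""
    return n * a * beta


def tilt_aberration(rho, theta, b_t):
    """Tilt wave aberration B_t rho cos(theta) at a normalized pupil point."""
    return b_t * rho * math.cos(theta)


# Assumed values in mm: 0.01 mm image displacement, R = 100 mm, a = 5 mm.
beta = tilt_angle(0.01, 100.0)
b_t = peak_tilt(beta, 5.0)
print(beta, b_t, tilt_aberration(1.0, math.pi, b_t))
```

For these numbers the tilt angle is 0.1 mrad and the peak tilt aberration is 0.5 micrometer; the aberration changes sign across the pupil, from +Bt at θ = 0 to −Bt at θ = π.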

2.6 ABERRATION FUNCTION OF A ROTATIONALLY SYMMETRIC SYSTEM

Consider a point object with Cartesian coordinates (p, q) in the object plane. Its

image, formed by a rotationally symmetric system, is perfect if the spherical wavefront

diverging from the object point and incident on the imaging system is converted by the

system into a spherical wavefront converging to its Gaussian image point. Any deviation

of the imaging wavefront at the exit pupil of the system from a reference sphere passing

through the center of the pupil with center of curvature at the Gaussian image point

represents the aberration function. In optical design, the aberration function is determined

by tracing rays originating at the point object and propagating them through the system

and determining their optical path lengths in reaching the reference sphere relative to that

of the chief ray passing through the center of the pupil. Similarly, in optical testing the

wave aberration at a discrete array of points is determined interferometrically.

If (x, y) are the coordinates of a pupil point, the aberration function consists of terms

formed from three rotational invariants, namely, p² + q², x² + y², and px + qy. If $\vec{h}$

and $\vec{r}$ are the position vectors of the object and pupil points, then the rotational invariants

are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, and $\vec{h}\cdot\vec{r}$, or h², r², and hr cos θ, where h = $|\vec{h}|$, r = $|\vec{r}|$, and θ is the polar

angle of $\vec{r}$ with respect to that of $\vec{h}$. It is convenient to consider the aberration function

in terms of the image height h′, for example, when the object is at infinity, and let θ be

the angle for the image point. The image height is, of course, related to the object height

by the Gaussian magnification. We now expand the aberration function W(h′; r, θ) in a

power series in terms of the three rotational invariants h′², r², and h′r cos θ in the form

\[ \begin{aligned} W(h'; r, \theta) &= \sum_{l=0}^{\infty} \sum_{p=0}^{\infty} \sum_{m=0}^{\infty} C_{lpm} \left(h'^2\right)^l \left(r^2\right)^p \left(h' r \cos\theta\right)^m \\ &= \sum_{l=0}^{\infty} \sum_{p=0}^{\infty} \sum_{m=0}^{\infty} C_{lpm}\, h'^{\,2l+m}\, r^{2p+m} \cos^m\theta , \end{aligned} \tag{2-11} \]

where C_lpm are the expansion coefficients, and l, p, and m are positive integers, including

zero. There is no term with sin θ dependence. The aberration terms are called the

classical aberrations.

It is evident that the degree of each term of the series in the object or image and pupil

coordinates is even and given by 2(l + p + m). Any terms for which p = 0 = m so that

2p + m = 0, i.e., those terms that do not depend on r and, therefore, vary only as h′^(2l),

must add up to zero since the aberration associated with the chief ray (for which r = 0) is

zero. Thus, the zero-degree term C000 and terms such as C100 h′², C200 h′⁴, etc., do not

appear in Eq. (2-11). There is also no term of second degree. For example, the term

C010 r² represents defocus aberration that is independent of h′. It has the implication that

the image is being observed in a plane other than the Gaussian image plane. Similarly, the

term C001 h′r cos θ represents a wavefront tilt aberration that depends on h′. It has the

implication that the image height is not h′. Hence, a power series expansion of the

aberration function starts with terms of the fourth degree. The terms of degree 4, 6, 8, etc.,

are referred to as the primary, secondary, tertiary aberrations, etc. The primary

aberrations are also called the Seidel aberrations, and the secondary aberrations are also

called the Schwarzschild aberrations.

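Equation (2-11) may also be written in the form

\[ W(h'; r, \theta) = \sum_{l=0}^{\infty} \sum_{n=1}^{\infty} \sum_{m=0}^{n} {}_{2l+m}a_{nm}\, h'^{\,2l+m}\, r^n \cos^m\theta , \tag{2-12} \]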

where

n = 2p + m (2-13)

is a positive integer not including zero, and ${}_{2l+m}a_{nm}$ are the expansion coefficients. From

Eq. (2-13), we note that n − m = 2p ≥ 0 and even. The order i of an aberration term,

which is equal to its degree in the object and pupil coordinates, is given by

\[ i = 2l + m + n . \tag{2-14} \]

The number of terms Ni of a certain order i, i.e., the number of integer sets satisfying Eq.

(2-14) with n − m ≥ 0 and even, is given by

\[ N_i = \frac{(i + 2)(i + 4)}{8} . \tag{2-15} \]

This number includes a term with n = 0 = m , called piston aberration, although such a

term does not constitute an aberration (since it corresponds to the chief ray, which has a

zero aberration associated with it). It is included here for completeness, as interferometric

data based on the aberrations of a system may have a piston component.
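The count in Eq. (2-15) can be checked by brute-force enumeration. The Python sketch below lists all integer sets (l, n, m) of a given even order i that satisfy Eq. (2-14) with n − m ≥ 0 and even, including the piston term n = 0 = m, and compares their number with Ni. The function names are hypothetical.

```python
def terms_of_order(i):
    """All (l, n, m) with 2l + m + n = i, m >= 0, and n - m >= 0 and even."""
    terms = []
    for l in range(i // 2 + 1):
        for m in range(i - 2 * l + 1):
            n = i - 2 * l - m
            if n >= m and (n - m) % 2 == 0:
                terms.append((l, n, m))
    return terms


def number_of_terms(i):
    """Eq. (2-15): N_i = (i + 2)(i + 4) / 8 for an even order i."""
    return (i + 2) * (i + 4) // 8


for order in (4, 6, 8):
    print(order, len(terms_of_order(order)), number_of_terms(order))
```

For i = 4, 6, and 8 the enumeration gives 6, 10, and 15 sets, i.e., 5, 9, and 14 aberration terms once the piston term is excluded.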

The fourth order (i = 4), i.e., the primary or the Seidel aberration function consisting

of a sum of five fourth-order terms, can be written

\[ \begin{aligned} W_P(r, \theta; h') = {} & {}_0a_{40}\, r^4 + {}_1a_{31}\, h' r^3 \cos\theta + {}_2a_{22}\, h'^2 r^2 \cos^2\theta \\ & + {}_2a_{20}\, h'^2 r^2 + {}_3a_{11}\, h'^3 r \cos\theta . \end{aligned} \tag{2-16} \]

Since the wave aberration W has dimensions of length, the dimensions of the coefficients

${}_ia_{jk}$ are inverse length cubed. Since the ray aberrations are related to the wave

aberrations by a spatial derivative [see Eq. (2-1)], their degree is lower by one.

Accordingly, the primary aberrations are also referred to as the third-order ray

aberrations. The wave aberration coefficients ₀a₄₀, ₁a₃₁, ₂a₂₂, ₂a₂₀, and ₃a₁₁ represent

the coefficients of spherical aberration, coma, astigmatism, field curvature, and

distortion, respectively.
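For readers who want to experiment with Eq. (2-16), a minimal Python sketch of the primary aberration function follows. The function and argument names are hypothetical; the arguments a40 through a11 stand for the coefficients ₀a₄₀, ₁a₃₁, ₂a₂₂, ₂a₂₀, and ₃a₁₁, and h stands for the image height h′. The numerical values are arbitrary.

```python
import math


def seidel_wave_aberration(r, theta, h, a40, a31, a22, a20, a11):
    """Primary (Seidel) aberration function of Eq. (2-16)."""
    c = math.cos(theta)
    return (a40 * r ** 4                    # spherical aberration
            + a31 * h * r ** 3 * c          # coma
            + a22 * h ** 2 * r ** 2 * c ** 2  # astigmatism
            + a20 * h ** 2 * r ** 2         # field curvature
            + a11 * h ** 3 * r * c)         # distortion


# Arbitrary coefficients (mm^-3), chosen only to exercise the function.
coeffs = dict(a40=1e-6, a31=2e-6, a22=3e-6, a20=4e-6, a11=5e-6)
on_axis = seidel_wave_aberration(2.0, 0.3, 0.0, **coeffs)
chief = seidel_wave_aberration(0.0, 0.3, 1.5, **coeffs)
print(on_axis, chief)
```

On axis (h′ = 0) only the spherical aberration term survives, and the aberration of the chief ray (r = 0) is zero for any image height.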

From Eq. (2-16), we note that only spherical aberration is independent of the object

or image height. The field curvature, in its dependence on the pupil coordinates (r, θ), is

like the defocus aberration discussed in Section 2.4. However, the field curvature


represents a defocus aberration that depends on the field h′, thus requiring a curved

image surface for its elimination. On the other hand, pure defocus aberration, such as that

produced by observing the image in a plane other than the Gaussian image plane, is

independent of the field h′. Similarly, distortion depends on the pupil coordinates as a

wavefront tilt. However, distortion depends on the field as h′³, but the wavefront tilt

produced by a tilted element in the system would be independent of h′.

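The sixth order (i = 6), i.e., the secondary or the Schwarzschild aberration function,

can be written

\[ \begin{aligned} W_S(r, \theta; h') = {} & {}_0a_{60}\, r^6 + {}_1a_{51}\, h' r^5 \cos\theta + {}_2a_{42}\, h'^2 r^4 \cos^2\theta + {}_2a_{40}\, h'^2 r^4 + {}_3a_{33}\, h'^3 r^3 \cos^3\theta \\ & + {}_3a_{31}\, h'^3 r^3 \cos\theta + {}_4a_{22}\, h'^4 r^2 \cos^2\theta + {}_4a_{20}\, h'^4 r^2 + {}_5a_{11}\, h'^5 r \cos\theta . \end{aligned} \tag{2-17} \]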

Four of the nine aberration terms (excluding piston) correspond to l = 0. They are the

secondary spherical aberration (₀a₆₀ r⁶), secondary coma (₁a₅₁ h′r⁵ cos θ), secondary

astigmatism (₂a₄₂ h′²r⁴ cos² θ) (wings or Flügelfehler), and arrows or Pfeilfehler

(₃a₃₃ h′³r³ cos³ θ). The remaining five, corresponding to l ≠ 0 and called lateral

aberrations, are similar to the corresponding primary aberrations except for their

dependence on the image height h′. The lateral spherical aberration ₂a₄₀ h′²r⁴ is also

called the oblique spherical aberration.

Aberration terms of the eighth (i = 8) order are called the tertiary aberrations. There

are fourteen aberration terms of this order, excluding piston. Only five of them have the

dependencies on pupil coordinates that are different from those of the secondary or

primary aberrations. Four have dependence on these coordinates as for the secondary

aberrations, and the remaining five have the same dependence as the primary aberrations.

Their difference lies in their dependence on the image height.
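The breakdown of the fourteen tertiary terms into groups of five, four, and five can be reproduced by a short enumeration. The Python sketch below groups the non-piston terms of order i = 8 by the value of n + m, which identifies the pupil dependence of a term; the function name is hypothetical.

```python
from collections import Counter


def pupil_dependence_counts(order):
    """Count the non-piston terms of an even order, keyed by n + m."""
    counts = Counter()
    for l in range(order // 2):            # l = order/2 would be the piston term
        for m in range(order - 2 * l + 1):
            n = order - 2 * l - m          # from i = 2l + m + n, Eq. (2-14)
            if n >= m:                     # n - m is automatically even here
                counts[n + m] += 1
    return counts


counts = pupil_dependence_counts(8)
new_terms = counts[8]                      # pupil dependence first seen at i = 8
like_secondary = counts[6]                 # same as the l = 0 secondary terms
like_primary = counts[4] + counts[2]       # same as the five primary terms
print(new_terms, like_secondary, like_primary)
```

The terms with n + m = 2 vary with the pupil coordinates as field curvature and distortion, which is why they are counted with the primary aberrations.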

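If we combine the aberration terms having different dependencies on the object (or image)

coordinates but the same dependence on pupil coordinates so that there is only one term

for each pair of (n, m) values, Eq. (2-12) for the power-series expansion of the aberration

function may be written

\[ W(\rho, \theta) = \sum_{n=1}^{\infty} \sum_{m=0}^{n} a_{nm}\, \rho^n \cos^m\theta , \tag{2-18} \]

where the expansion coefficients are given by

\[ a_{nm} = a^n \sum_{l=0}^{\infty} {}_{2l+m}a_{nm}\, h'^{\,2l+m} . \tag{2-19} \]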

The radial coordinate r has been normalized to ρ = r/a. It has the advantage that, since

0 ≤ ρ ≤ 1 and |cos θ| ≤ 1, the coefficient a_nm of a classical aberration ρⁿ cosᵐ θ

represents the peak value or half of the peak-to-valley (P-V) value of the corresponding

aberration term, depending on whether m is even or odd, respectively. The indices n and

m represent the powers of ρ and cos θ, respectively. The index m also represents the


minimum power of h′ dependence of a coefficient (with the exception of tilt and defocus

terms corresponding to (n, m) = (1, 1) and (2, 0), respectively). The maximum power of h′

dependence is given by i − n. Moreover, the powers of h′ dependence are even or odd

according to whether n and m are even or odd, respectively. The number of terms through

a certain order i in the reduced power-series expansion of the aberration function given

by Eq. (2-18) is also given by Eq. (2-15). This number includes a nonaberration piston

term corresponding to n = 0 = m. The terms of Eq. (2-12) through a certain order i

correspond to those terms of Eq. (2-18) for which n + m ≤ i.

The Seidel aberration function of Eq. (2-16) may be written in terms of the coefficients a_nm

in the form

\[ W_P(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{40}\rho^4 , \tag{2-20} \]

where

\[ a_{11} = {}_3a_{11}\, h'^3 a , \tag{2-21a} \]

\[ a_{20} = {}_2a_{20}\, h'^2 a^2 , \tag{2-21b} \]

\[ a_{22} = {}_2a_{22}\, h'^2 a^2 , \tag{2-21c} \]

\[ a_{31} = {}_1a_{31}\, h' a^3 , \tag{2-21d} \]

and

\[ a_{40} = {}_0a_{40}\, a^4 . \tag{2-21e} \]

Comparing the distortion term a₁₁ρ cos θ with the wavefront tilt aberration given by

Eq. (2-9), we note that while the two are similar in their dependence on the pupil

coordinates, their coefficients depend on the image height differently. The distortion

coefficient a₁₁ varies with h′ as h′³, but the tilt coefficient Bt is independent of h′.

Similarly, comparing the field curvature term a₂₀ρ² with the defocus wave aberration

given by Eq. (2-5), we note that their dependence on the pupil coordinates is the same.

However, whereas the field curvature coefficient a₂₀ varies with h′ as h′², the defocus

coefficient Bd is independent of h′.
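The conversion of Eqs. (2-21a) through (2-21e) and the normalized form of Eq. (2-20) can be sketched in Python as follows. The function names are hypothetical, the arguments a40 through a11 stand for ₀a₄₀ through ₃a₁₁, h stands for h′, and the numerical values are arbitrary.

```python
import math


def normalized_seidel_coefficients(a40, a31, a22, a20, a11, h, a):
    """Eqs. (2-21a)-(2-21e): coefficients a_nm keyed by (n, m)."""
    return {
        (1, 1): a11 * h ** 3 * a,        # distortion
        (2, 0): a20 * h ** 2 * a ** 2,   # field curvature
        (2, 2): a22 * h ** 2 * a ** 2,   # astigmatism
        (3, 1): a31 * h * a ** 3,        # coma
        (4, 0): a40 * a ** 4,            # spherical aberration
    }


def seidel_normalized(rho, theta, coefficients):
    """Eq. (2-20): sum of a_nm rho^n cos^m(theta) over the five terms."""
    return sum(value * rho ** n * math.cos(theta) ** m
               for (n, m), value in coefficients.items())


# Arbitrary Seidel coefficients (mm^-3), image height 2 mm, pupil radius 5 mm.
c = normalized_seidel_coefficients(1e-6, 2e-6, 3e-6, 4e-6, 5e-6, h=2.0, a=5.0)
edge_value = seidel_normalized(1.0, 0.0, c)
print(c, edge_value)
```

Each normalized coefficient has dimensions of length, so the five values can be compared directly, for example in units of the wavelength.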

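The aberration function through the sixth order, i.e., for i ≤ 6 or n + m ≤ 6, may be

written

\[ \begin{aligned} W_S(\rho, \theta) = {} & a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{33}\rho^3\cos^3\theta \\ & + a_{40}\rho^4 + a_{42}\rho^4\cos^2\theta + a_{51}\rho^5\cos\theta + a_{60}\rho^6 , \end{aligned} \tag{2-22} \]

where

\[ a_{11} = \left({}_3a_{11}\, h'^3 + {}_5a_{11}\, h'^5\right) a , \tag{2-23a} \]

\[ a_{20} = \left({}_2a_{20}\, h'^2 + {}_4a_{20}\, h'^4\right) a^2 , \tag{2-23b} \]

\[ a_{22} = \left({}_2a_{22}\, h'^2 + {}_4a_{22}\, h'^4\right) a^2 , \tag{2-23c} \]

\[ a_{31} = \left({}_1a_{31}\, h' + {}_3a_{31}\, h'^3\right) a^3 , \tag{2-23d} \]

\[ a_{33} = {}_3a_{33}\, h'^3 a^3 , \tag{2-23e} \]

\[ a_{40} = \left({}_0a_{40} + {}_2a_{40}\, h'^2\right) a^4 , \tag{2-23f} \]

\[ a_{42} = {}_2a_{42}\, h'^2 a^4 , \tag{2-23g} \]

\[ a_{51} = {}_1a_{51}\, h' a^5 , \tag{2-23h} \]

and

\[ a_{60} = {}_0a_{60}\, a^6 . \tag{2-23i} \]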

Written in this form, the aberration function has nine aberration terms through the sixth

order or through the secondary aberrations. Since the dependence of an aberration term

on the image height h ¢ is contained in the aberration coefficient anm , it should be noted

that the primary aberrations (including distortion and field curvature terms) in Eqs. (2-23)

are not the same as those in Eq. (2-20), because they contain aberration components not

only of the fourth degree, but of the sixth degree as well. For example, a₄₀ρ⁴ consists of

spherical and lateral spherical aberrations ₀a₄₀ a⁴ρ⁴ and ₂a₄₀ h′² a⁴ρ⁴.
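The way a single coefficient a_nm collects contributions of different orders, as in Eq. (2-19), can be sketched in Python. In this hypothetical helper, the power-series coefficients are stored in a dictionary keyed by (k, n, m), where k = 2l + m is the power of h′; the names and numbers are illustrative only.

```python
def reduced_coefficient(n, m, series_coefficients, h, a):
    """Eq. (2-19): a_nm = a^n * sum over l of (2l+m)a_nm * h'^(2l+m).

    series_coefficients maps (k, n, m) to the coefficient with prefix
    k = 2l + m; only the entries actually supplied are summed.
    """
    total = 0.0
    for (k, n_key, m_key), value in series_coefficients.items():
        if (n_key, m_key) == (n, m):
            total += value * h ** k
    return a ** n * total


# Assumed coefficients: primary spherical (prefix 0) and lateral
# spherical (prefix 2) aberrations, as in Eq. (2-23f).
series = {(0, 4, 0): 1.0e-6, (2, 4, 0): 2.0e-8}
a40_on_axis = reduced_coefficient(4, 0, series, h=0.0, a=5.0)
a40_off_axis = reduced_coefficient(4, 0, series, h=2.0, a=5.0)
print(a40_on_axis, a40_off_axis)
```

On axis only the primary spherical aberration contributes, while off axis the same ρ⁴ term also carries the lateral spherical aberration.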

Similarly, the aberration function through the eighth order can be written. Once

again, an aberration term of this expansion will not be necessarily the same as a

corresponding term of the expansions of Eq. (2-20) or (2-22). We add that it is convenient

to refer to the aberration terms of a power-series expansion as the classical aberrations,

e.g., a term in ρ⁴ may be referred to as the classical primary spherical aberration.

There are a variety of interferometers that are used for detecting and measuring

aberrations of optical systems [4]. Figure 2-8 illustrates schematically a Twyman–Green

interferometer in which a collimated laser beam is divided into two parts by a beam

splitter BS. One part, called the test beam, is incident on the system under test, indicated

by the lens L, and the other, called the reference beam, is incident on a plane mirror M 1 .

The focus F of the lens system lies at the center of curvature C of a spherical mirror M 2 .

As the angle of the incident light is changed to study the off-axis aberrations of the

system, the mirror is tilted so that its center of curvature lies at the current focus of the

beam. In this arrangement the mirror does not introduce any aberration since it is forming

the image of an object lying at its center of curvature.

The two reflected beams interfere in the region of their overlap. Lens L′ is used to

observe the interference pattern on a screen S placed in a plane containing the image of L.


Figure 2-8. Twyman–Green interferometer for testing a lens system L. A laser beam

is split into two parts by a beam splitter BS. The reflected part is incident on a plane

mirror M1 and the transmitted part is incident on L. F is the image-space focal

point of L, and C is the center of curvature of a spherical mirror M2. The

interfering beams are focused by a lens L′, and the interference pattern is observed

on a screen S.

Since the test beam goes through the lens system L twice, its aberration is twice that of the

system.

If the reference beam has a uniform phase and the test beam has a phase distribution

Φ(x, y), and if their amplitudes are equal to each other, the irradiance distribution of their

interference pattern is given by

\[ \begin{aligned} I(x, y) &= I_0 \left| 1 + \exp\left[ i\Phi(x, y) \right] \right|^2 \\ &= 2 I_0 \left\{ 1 + \cos\left[ \Phi(x, y) \right] \right\} , \end{aligned} \tag{2-24} \]

where I0 is the irradiance when only one beam is present. Of course, the phase and the

wave aberration distributions are related to each other according to

\[ \Phi(x, y) = \frac{2\pi}{\lambda} W(x, y) , \tag{2-25} \]

where λ is the wavelength of the laser beam. The irradiance has a maximum value equal

to 4I0 at those points for which

\[ \Phi(x, y) = 2\pi n \tag{2-26a} \]

and a minimum value equal to zero at those points for which

\[ \Phi(x, y) = 2\pi (n + 1/2) , \tag{2-26b} \]

where n is a positive or a negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which in turn corresponds to the locus of $(x, y)$
points with phase aberration given by Eq. (2-26a) for a bright fringe and Eq. (2-26b) for a
dark fringe. If the test beam is aberration free $[\Phi(x, y) = 0]$, then the interference pattern
has a uniform irradiance of $4I_0$. Figure 2-9 shows interferograms of six waves of a
primary aberration. In Figure 2-9a for spherical aberration and 2-9d for astigmatism, a
certain amount of defocus has also been added. In Figure 2-9c, a certain amount of tilt has
been added to the coma aberration.
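Equations (2-24) and (2-25) translate directly into a few lines of code. The following is a minimal sketch in Python with NumPy (an assumption; the text prescribes no software). It evaluates the fringe irradiance for a hypothetical test wavefront of three waves of primary spherical aberration, doubled for the double pass through the lens. The function name `interferogram` and the grid size are illustrative choices only.

```python
import numpy as np

def interferogram(w_waves, i0=1.0):
    """Two-beam irradiance of Eq. (2-24) for a wave aberration given in waves."""
    phase = 2.0 * np.pi * np.asarray(w_waves)   # Eq. (2-25) with W in units of lambda
    return 2.0 * i0 * (1.0 + np.cos(phase))

# Sample a unit pupil on a square grid (grid size is an arbitrary choice).
n = 201
x, y = np.meshgrid(np.linspace(-1.0, 1.0, n), np.linspace(-1.0, 1.0, n))
rho2 = x**2 + y**2

# Hypothetical system aberration: 3 waves of spherical aberration, A_s * rho^4.
w_system = 3.0 * rho2**2
w_test_beam = 2.0 * w_system                    # double pass doubles the aberration

# Irradiance inside the pupil, zero outside it.
fringes = np.where(rho2 <= 1.0, interferogram(w_test_beam), 0.0)
print(fringes.max(), fringes[n // 2, n // 2])
```

With equal beam amplitudes the fringes swing between zero and $4I_0$, and the pupil center, where the aberration vanishes, lies on a bright fringe.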

2.8 SUMMARY

A perfect image of a point object is formed by an imaging system when a spherical

wave diverging from the object and incident on the system is converted by it into a

spherical wave converging to the Gaussian image point. If rays from the object point are

traced through the system, they all travel exactly the same optical path length from the

object point to the Gaussian image point, and they all pass through this image point.

When the wavefront exiting from the exit pupil of the system is not spherical, its optical

deviations from the spherical form represent the wave aberrations, and an aberrated

image is formed. The rays intersect the image plane in the vicinity of the Gaussian image

point, and their distribution is called the spot diagram. The wave and the ray aberrations

are related to each other by a spatial derivative, as in Eq. (2-1).

integral powers of three rotational invariants, namely, $h'^2$, $r^2$, and $h' r \cos\theta$, where $h'$ is
the height of the Gaussian image point from the optical axis and $(r, \theta)$ are the polar
coordinates of a point in the plane of the exit pupil. There is no term with $\sin\theta$
dependence. The order of an aberration term, representing its degree in the object and
pupil coordinates, is even. The aberrations of the lowest order, namely 4, are called
primary or Seidel aberrations. Similarly, the aberrations of the next order, namely 6, are
called the secondary or the Schwarzschild aberrations. When an image is observed in a
defocused image plane, the defocus aberration thus introduced varies as $r^2$. It is similar
to the field curvature aberration in its pupil dependence, but whereas the former is
independent of the image height, the latter varies as $h'^2$.

The interference pattern formed by two beams, one of which has traveled through an

aberrated system, is shown in Figure 2-9 for primary aberrations, as an illustration of

interferograms.

aberration combined with defocus $A_s\rho^4 + B_d\rho^2$, (c) coma combined with tilt
$(A_c\rho^3 + B_t\rho)\cos\theta$, and (d) astigmatism combined with defocus $A_a\rho^2\cos^2\theta + B_d\rho^2$. The
aberrations in the interferograms are twice their corresponding values in the system
under test, because the test beam goes through the system twice.


References

2nd Printing (SPIE Press, Bellingham, Washington, 2001).

2. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,

New York, 1999).

New York, 1974).

4. D. Malacara, Ed., Optical Shop Testing, 3rd ed., Wiley, New York (2007).


Chapter 3

Orthonormal Polynomials and Gram–Schmidt

Orthonormalization

3.1 INTRODUCTION

In optical design, we trace rays from a point object through a system to determine the

aberrations of the wavefront at its exit pupil. In optical testing, we determine the

aberrations of a system or an element interferometrically. In both cases, we obtain

aberration numbers at an array of points. We can calculate the PSF or other associated

image quality measures from these numbers. We can also calculate the aberration

variance, which, in turn, gives some idea of the image quality. However, such measures

do not shed light on the content of the aberration function. To understand the nature of

this function, we want to know the amount of certain familiar aberrations discussed in

Chapter 2 that are present, so that perhaps something can be done about them in

improving the design or the system under test.

decompose it into a set of orthogonal polynomials that represent balanced classical

aberrations and include wavefront defocus and tilt. The Zernike circle polynomials are in

widespread use for this purpose for systems with circular pupils. These polynomials are

unique in the sense that they are not only orthogonal across a unit circle, but they also

represent balanced aberrations yielding minimum variance, as we shall see in Chapter 4.

In this chapter, we discuss the basic properties of the orthogonal polynomials. We also

describe the Gram–Schmidt orthogonalization process for obtaining orthogonal

polynomials over one domain from those that are orthogonal over another domain, e.g.,

obtaining polynomials that are orthogonal over an annular pupil from the circle

polynomials. We emphasize the use of orthonormal polynomials so that their coefficients

represent the standard deviations of the corresponding polynomial aberration terms.

Consider a complete set of polynomials $F_j(x, y)$ in Cartesian coordinates $(x, y)$ that
are orthonormal over a certain pupil according to

$$
\frac{1}{A} \int_{\text{pupil}} F_j(x, y)\, F_{j'}(x, y)\, dx\, dy = \delta_{jj'} , \tag{3-1}
$$

where A is the area of the pupil inscribed inside a unit circle, the integration is carried out
over the area of the pupil, and $\delta_{jj'}$ is a Kronecker delta. Let $F_1 = 1$. Since it is
independent of the coordinates x and y, it is referred to as the piston polynomial. As a
result, the mean value of each polynomial, except for j = 1, is zero, i.e.,

$$
\langle F_j(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j(x, y)\, dx\, dy
= 0 \quad \text{for } j \neq 1 , \tag{3-2}
$$

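as may be seen by letting $j' = 1$ in Eq. (3-1). The angular brackets on the left-hand side
of Eq. (3-2) indicate a mean value over the area of the pupil. Similarly, the mean square
value of a polynomial is unity, i.e.,

$$
\langle F_j^2(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j^2(x, y)\, dx\, dy = 1 . \tag{3-3}
$$

An aberration function $W(x, y)$ over the pupil can be expanded in terms of these polynomials in the
form

$$
W(x, y) = \sum_{j=1}^{\infty} a_j F_j(x, y) , \tag{3-4}
$$

where $a_j$ are the expansion coefficients.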

Multiplying both sides of Eq. (3-4) by $F_{j'}(x, y)$, integrating over the pupil, and utilizing
the orthonormality Eq. (3-1), the aberration coefficients are given by

$$
\frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_{j'}(x, y)\, dx\, dy
= \frac{1}{A} \sum_{j=1}^{\infty} a_j \int_{\text{pupil}} F_j(x, y)\, F_{j'}(x, y)\, dx\, dy
= a_{j'} ,
$$

or

$$
a_j = \frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_j(x, y)\, dx\, dy . \tag{3-5}
$$

As Eq. (3-5) shows, the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Accordingly, one or more terms can be added to or
subtracted from the aberration function without affecting the other coefficients. It is a
consequence of the orthogonality of the polynomials.
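The independence of the coefficients can be checked numerically. The sketch below is a minimal illustration, not the author's procedure. It assumes a circular unit pupil and the polynomials 1, 2x, 2y, and √3(2ρ² − 1), which are the low-order circle polynomials introduced in Chapter 4. The aberration function `W` is hypothetical, and the quadrature (Gauss-Legendre in ρ, uniform sampling in θ) is just one convenient choice.

```python
import numpy as np

# Quadrature over the unit disc: Gauss-Legendre in rho, uniform samples in theta.
g, w = np.polynomial.legendre.leggauss(20)
rho, w_rho = 0.5 * (g + 1.0), 0.5 * w            # nodes and weights mapped to [0, 1]
theta = 2.0 * np.pi * np.arange(64) / 64
R, T = np.meshgrid(rho, theta, indexing="ij")
X, Y = R * np.cos(T), R * np.sin(T)

def pupil_mean(values):
    """(1/A) * integral of `values` over the unit disc (values sampled on R, T)."""
    return 2.0 * np.sum(w_rho * rho * values.mean(axis=1))

# Assumed orthonormal set: piston, x tilt, y tilt, defocus.
F = [np.ones_like(X), 2.0 * X, 2.0 * Y, np.sqrt(3.0) * (2.0 * R**2 - 1.0)]

# Hypothetical aberration function (in waves).
W = 0.3 + 0.5 * R**2 + 0.2 * X

a_two = [pupil_mean(W * Fj) for Fj in F[:2]]     # Eq. (3-5) with 2 polynomials
a_four = [pupil_mean(W * Fj) for Fj in F]        # Eq. (3-5) with 4 polynomials
print(np.round(a_two, 4), np.round(a_four, 4))
```

Each coefficient is an inner product of `W` with a single polynomial, so the first two values do not change when the defocus term is added to the expansion.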

The mean value of the aberration function is given by

$$
\langle W(x, y) \rangle = \sum_{j=1}^{\infty} a_j \langle F_j(x, y) \rangle = a_1 , \tag{3-6}
$$

where we have utilized Eq. (3-2) for the mean value of a polynomial. The mean square
value of the aberration function is given by

$$
\langle W^2(x, y) \rangle
= \frac{1}{A} \int_{\text{pupil}} \sum_{j=1}^{\infty} a_j F_j(x, y) \sum_{j'=1}^{\infty} a_{j'} F_{j'}(x, y)\, dx\, dy
= \sum_{j=1}^{\infty} a_j^2 , \tag{3-7}
$$

where we have utilized the orthonormality Eq. (3-1) and Eq. (3-3) for the mean square
value of a polynomial. The variance $\sigma_W^2$ of the aberration function is accordingly given
by

$$
\sigma_W^2 = \langle W^2(x, y) \rangle - \langle W(x, y) \rangle^2
= \sum_{j=2}^{\infty} a_j^2 , \tag{3-8}
$$

where $\sigma_W$ is the standard deviation or the sigma value of the aberration function. Since
the mean value of a polynomial (except piston) is zero, each expansion coefficient $a_j$
represents the standard deviation of the corresponding polynomial term. The variance of
the aberration function is simply the sum of the variances of the polynomial terms.
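As a short worked illustration, assume the orthonormal circle polynomials $F_1 = 1$, $F_2 = 2\rho\cos\theta$, and $F_4 = \sqrt{3}(2\rho^2 - 1)$ of Chapter 4 (used here only as an example) and a hypothetical aberration consisting of 0.5 wave of defocus and 0.2 wave of tilt over a circular pupil:

```latex
\begin{align*}
W(\rho,\theta) &= 0.5\,\rho^2 + 0.2\,\rho\cos\theta
  = 0.25\,F_1 + 0.1\,F_2 + \frac{0.25}{\sqrt{3}}\,F_4 , \\
\langle W \rangle &= a_1 = 0.25 , \\
\sigma_W^2 &= a_2^2 + a_4^2 = 0.01 + \frac{0.0625}{3} \simeq 0.0308 , \qquad
\sigma_W \simeq 0.176 .
\end{align*}
```

The same variance follows from a direct evaluation of $\langle W^2 \rangle - \langle W \rangle^2$, since the variance of $0.5\rho^2$ over the unit disc is $0.25/12$ and that of $0.2\rho\cos\theta$ is $0.04/4$.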

In the orthonormality Eq. (3-1) and those that follow it, we have assumed a

uniformly illuminated pupil, i.e., the amplitude across it is constant. If that is not the case,

as for example in a Gaussian pupil where the amplitude across the pupil varies as a

Gaussian function, then the amplitude function must be included in all the integrations

over the pupil (see Chapter 6). The quantity A in such cases would also be an amplitude-

weighted area of the pupil. Thus, the integrations, indicated by the angular brackets

implying a mean value, would be over an amplitude-weighted area of the pupil.

In practice, the number of polynomials used in the expansion will be truncated such

that the resulting variance obtained from Eq. (3-8) equals the actual value obtained from

the function $W(x, y)$ within some specified tolerance. The Strehl ratio of an image for

small aberrations can be estimated from the variance according to Eq. (1-34).

LEAST-SQUARES FITTING

It is easy to show that the expansion coefficients $a_j$ given by Eq. (3-5) and obtained
as a consequence of the orthogonality of the polynomials $F_j(x, y)$ represent a
least-squares fit of the aberration function $W(x, y)$. Suppose we estimate the function with
only J polynomials. Thus we write

$$
\hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) . \tag{3-9}
$$

The mean square error E in fitting the aberration function with J polynomials is given by

$$
E = \frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \hat{W}(x, y) \right]^2 dx\, dy
  = \frac{1}{A} \int_{\text{pupil}} \Big[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \Big]^2 dx\, dy . \tag{3-10}
$$

The error is a minimum when

$$
\frac{\partial E}{\partial a_{j'}} = 0 , \tag{3-11}
$$

or

$$
\frac{1}{A} \int_{\text{pupil}} \Big[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \Big] F_{j'}(x, y)\, dx\, dy = 0 . \tag{3-12}
$$

Using the orthonormality Eq. (3-1), Eq. (3-12) yields Eq. (3-5). The variance of the
estimated aberration function is given by

$$
\hat{\sigma}_W^2 = \langle \hat{W}^2(x, y) \rangle - \langle \hat{W}(x, y) \rangle^2
= \sum_{j=2}^{J} a_j^2 . \tag{3-13}
$$

It should be evident that each polynomial coefficient provides a best fit to the

aberration function. The fit, of course, improves as more and more polynomials are added

until there is no more improvement. We point out that, in practice, the aberration function

data is available at a discrete set of points. Hence, there will be some error in the

coefficient values, because the orthonormality Eq. (3-1) will not be satisfied exactly. This

error decreases as the number of data points increases.
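The effect of discrete sampling can be illustrated with a small numerical experiment. This is a sketch under stated assumptions rather than a prescribed procedure: the pupil is a unit disc sampled at random points, the basis is the assumed orthonormal set 1, 2x, 2y, √3(2ρ² − 1), and the "measured" aberration is synthesized from hypothetical coefficients `a_true`. It compares the discrete form of Eq. (3-5) with an explicit least-squares solution from `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)

def basis(x, y):
    """Assumed orthonormal set over the unit disc: piston, tilts, defocus."""
    r2 = x**2 + y**2
    return np.column_stack([np.ones_like(x), 2.0 * x, 2.0 * y,
                            np.sqrt(3.0) * (2.0 * r2 - 1.0)])

def sample_disc(n):
    """n points distributed uniformly over the unit disc."""
    rho = np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return rho * np.cos(theta), rho * np.sin(theta)

a_true = np.array([0.55, 0.10, 0.00, 0.25 / np.sqrt(3.0)])   # hypothetical coefficients

def fit(n):
    x, y = sample_disc(n)
    B = basis(x, y)
    w = B @ a_true                                   # synthetic aberration data
    a_inner = B.T @ w / n                            # discrete version of Eq. (3-5)
    a_lsq = np.linalg.lstsq(B, w, rcond=None)[0]     # explicit least-squares solution
    return a_inner, a_lsq

for n in (100, 10_000):
    a_inner, a_lsq = fit(n)
    print(n, np.abs(a_inner - a_true).max(), np.abs(a_lsq - a_true).max())
```

The inner-product estimate carries a sampling error because the polynomials are only approximately orthonormal over a finite point set, whereas the explicit least-squares solve recovers noise-free data to rounding error for any well-conditioned sample.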

NONCIRCULAR PUPILS

The Zernike circle polynomials (discussed in Chapter 4) are orthogonal over a

circular pupil. They uniquely represent balanced classical aberrations and include

wavefront tilt and defocus aberrations. The corresponding polynomials $F_j(x, y)$ that are
orthogonal over a noncircular pupil can be obtained by orthogonalizing the circle
polynomials $Z_j(x, y)$ using the Gram–Schmidt orthonormalization process [1]. Omitting
the argument $(x, y)$ of the polynomials for simplicity, we may write

$$
G_1 = Z_1 = 1 , \tag{3-14}
$$

$$
G_{j+1} = Z_{j+1} + \sum_{k=1}^{j} c_{j+1,k} F_k , \tag{3-15}
$$

$$
F_{j+1} = \frac{G_{j+1}}{\lVert G_{j+1} \rVert}
= \frac{G_{j+1}}{\left[ \dfrac{1}{A} \displaystyle\int_{\text{pupil}} G_{j+1}^2\, dx\, dy \right]^{1/2}} , \tag{3-16}
$$

where

$$
c_{j+1,k} = - \frac{1}{A} \int_{\text{pupil}} Z_{j+1} F_k\, dx\, dy \tag{3-17a}
$$

$$
\equiv - \langle Z_{j+1} F_k \rangle . \tag{3-17b}
$$

It is evident from Eq. (3-14) that $F_1 = 1$. Substituting Eq. (3-17b) into Eq. (3-15) and
substituting the result thus obtained into Eq. (3-16), we may write

$$
F_{j+1} = N_{j+1} \Big[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} F_k \rangle F_k \Big] , \tag{3-18}
$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
pupil under consideration, i.e., they satisfy the orthonormality condition of Eq. (3-1).

Thus, the F-polynomials are obtained recursively, starting with F1 = 1. It is clear from Eq.

(3-18) that each F-polynomial of a certain order is a linear combination of the circle

polynomials of no more than that order. It should be evident that the F-polynomials are

ordered in the same manner as the basis polynomials and that there is a one-to-one

correspondence between them.
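The recursion of Eqs. (3-14) to (3-18) can be sketched numerically. The example below is an illustration under assumptions, not the book's analytical derivation. The pupil is taken to be the unit square of half width 1/√2, the basis consists of the first four circle polynomials in Cartesian form (assumed from Chapter 4), and the pupil means are evaluated by Gauss-Legendre quadrature. The helper names `mean` and `gram_schmidt` are hypothetical.

```python
import numpy as np

# Gauss-Legendre quadrature over a square of half width 1/sqrt(2) (unit square pupil).
c = 1.0 / np.sqrt(2.0)
g, w = np.polynomial.legendre.leggauss(12)
X, Y = np.meshgrid(c * g, c * g, indexing="ij")
weights = np.outer(w, w) / 4.0                    # weights sum to 1, so sums are pupil means

def mean(values):
    """Mean value over the square pupil, (1/A) * integral of values."""
    return np.sum(weights * values)

R2 = X**2 + Y**2
# First four circle polynomials in Cartesian form (assumed from Chapter 4).
Z = [np.ones_like(X), 2.0 * X, 2.0 * Y, np.sqrt(3.0) * (2.0 * R2 - 1.0)]

def gram_schmidt(Z):
    """Orthonormalize Z over the pupil following Eqs. (3-14)-(3-18)."""
    F = []
    for Zj in Z:
        G = Zj.copy()
        for Fk in F:
            G = G - mean(Zj * Fk) * Fk            # c_{j+1,k} = -<Z_{j+1} F_k>
        F.append(G / np.sqrt(mean(G * G)))        # normalization, Eq. (3-16)
    return F

F = gram_schmidt(Z)
gram = np.array([[mean(Fi * Fj) for Fj in F] for Fi in F])
print(np.round(gram, 12))                         # close to the identity matrix
```

For this pupil the tilt polynomials differ from the circle polynomials only in normalization, while the defocus polynomial also picks up a piston term, which is the behavior described for the c-coefficients.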

Because of the biaxial symmetry of the pupils considered in this chapter and,

therefore, the symmetric limits of integration, the integral in Eq. (3-17a) is zero when the

integrand is an odd function of one or both integration variables. It should be evident that

a c-coefficient is zero unless the Z- and the G-polynomials have the same cosine or sine

dependence. If all of the c-coefficients in Eq. (3-15) are zero, then the F-polynomial has

the same form as the corresponding Zernike polynomial, except for its normalization.

The orthonormal F-polynomials represent the unit vectors of the space that span the
aberration function. They can be written in a matrix form according to

$$
F_l(x, y) = \sum_{i=1}^{l} M_{li} Z_i(x, y) \quad \text{with} \quad M_{ll} = \frac{1}{\lVert G_l \rVert} . \tag{3-19}
$$

While the diagonal elements of the M-matrix are simply equal to the normalization
constants of the G-polynomials [since there is no multiplier with the polynomial $Z_{j+1}$ in
Eq. (3-15)], there are no matrix elements above the diagonal because a polynomial $F_l$
consists of a linear combination of circle polynomials up to $Z_l$ only. The matrix is lower
triangular and the missing elements may be given a value of zero when multiplying a
Zernike column vector $(\ldots, Z_j, \ldots)$ to obtain the orthonormal column vector $(\ldots, F_j, \ldots)$.
It should be evident that the orthonormal polynomials for a noncircular pupil written in
terms of the circle polynomials immediately yield the elements of the conversion matrix
M.

The orthonormal polynomials can also be obtained by a nonrecursive
matrix approach [2], which is not only faster but also avoids the potential numerical
instability of the Gram–Schmidt approach as the number of polynomials increases.
Multiplying both sides of Eq. (3-19) by $F_k$, integrating over the pupil, and using the
orthonormality Eq. (3-1), we obtain

$$
\langle F_k F_l \rangle = \delta_{kl} = \sum_{j=1}^{J} M_{lj} \langle Z_j F_k \rangle , \tag{3-20}
$$

where, for example, $\langle Z_j F_k \rangle$ represents the inner product of the Zernike polynomial $Z_j$
and the orthonormal polynomial $F_k$ over the pupil, i.e.,

$$
\langle Z_j F_k \rangle = \frac{1}{A} \int_{\text{pupil}} Z_j(x, y)\, F_k(x, y)\, dx\, dy . \tag{3-21}
$$

In matrix form, Eq. (3-20) can be written

$$
M C^{ZF} = 1 , \tag{3-22}
$$

where $C^{ZF}$ is the matrix of the inner products of the Zernike polynomials $Z_j$
and the orthonormal polynomials $F_k$. The elements of this matrix are given by

$$
\langle Z_k F_i \rangle = \sum_{j=1}^{J} M_{ij} \langle Z_j Z_k \rangle
= \sum_{j=1}^{J} \langle Z_k Z_j \rangle \left[ M_{ij} \right]^T , \tag{3-23}
$$

where, for example, $[M_{ij}]^T$ is the transpose of the matrix with elements $M_{ij}$ (obtained
by interchanging the rows and columns of the matrix M). Equation (3-23) can be written
in the matrix form as

$$
C^{ZF} = C^{ZZ} M^T , \tag{3-24}
$$

where $C^{ZZ}$ is the matrix of the inner products of the Zernike
polynomials between themselves. Substituting Eq. (3-24) into Eq. (3-22), we obtain

$$
M C^{ZZ} M^T = 1 . \tag{3-25}
$$

Letting

$$
M = \left( Q^T \right)^{-1} , \tag{3-26}
$$

Eq. (3-25) becomes

$$
Q^T Q = C^{ZZ} . \tag{3-27}
$$

Solving Eq. (3-27) for the matrix Q, the conversion matrix M can be obtained from Eq.
(3-26). While the matrix M is lower triangular, the matrix Q is upper triangular.
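Equation (3-27) has the form of a Cholesky factorization, so the matrix route can be sketched with standard linear algebra. The example below is a minimal illustration under assumptions: the same unit square pupil and four circle polynomials as a test case, quadrature for the inner products, and `numpy.linalg.cholesky`, which returns a lower-triangular factor L with L Lᵀ = C. Identifying Q with Lᵀ is this sketch's reading of Eq. (3-27), not a statement about how reference [2] implements it.

```python
import numpy as np

# Quadrature nodes and weights over the unit square pupil (half width 1/sqrt(2)).
c = 1.0 / np.sqrt(2.0)
g, w = np.polynomial.legendre.leggauss(12)
X, Y = np.meshgrid(c * g, c * g, indexing="ij")
weights = (np.outer(w, w) / 4.0).ravel()

R2 = X**2 + Y**2
# First four circle polynomials sampled at the nodes, one row per polynomial.
Z = np.stack([np.ones_like(X), 2.0 * X, 2.0 * Y,
              np.sqrt(3.0) * (2.0 * R2 - 1.0)]).reshape(4, -1)

C_zz = (Z * weights) @ Z.T             # inner products <Z_i Z_j> over the square
L = np.linalg.cholesky(C_zz)           # lower-triangular L with L @ L.T = C_zz
Q = L.T                                # upper-triangular Q with Q.T @ Q = C_zz, Eq. (3-27)
M = np.linalg.inv(Q.T)                 # conversion matrix, Eq. (3-26)

F = M @ Z                              # orthonormal polynomials sampled at the nodes
print(np.round(M, 4))
```

The resulting `M` is lower triangular with the normalization constants on its diagonal, and no recursion over previously computed polynomials is needed.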

When considering the aberrations of a circular pupil of radius a, we normalize the
radial coordinate r by defining $\rho = r/a$. Thus, $0 \le r \le a$, but $0 \le \rho \le 1$. This
normalization has the advantage that the coefficient of a classical aberration $\rho^n \cos^m\theta$
(see Section 2.6) represents its peak value. This value occurs at the point where the x axis
intersects the circle. At this point, $\rho$ has its maximum value of unity and the value of $\theta$ is
zero giving a maximum value of unity for $\cos\theta$. For example, the coefficient $A_s$ of the
primary spherical aberration $A_s\rho^4$ represents the peak value of the aberration. Indeed,
when $A_s = 1\lambda$, we speak of one wave of spherical aberration. The same is true of primary
coma $A_c\rho^3\cos\theta$, where $A_c$ represents its peak value. Similarly, we define a unit pupil
such that the distance of the farthest point from its center is unity. Figure 3-1 shows the
noncircular pupils considered in this book. The outer radius of an annular pupil is unity,
as in Figure 3-1a. The corners of the hexagon in Figure 3-1b lie at a distance of unity.
Figure 3-1c illustrates an ellipse with an aspect ratio of b, and its semimajor axis has a
length of unity. For each of these pupils, the coefficient of a classical aberration
represents its peak value. Figure 3-1d shows a rectangle with a half width a and its
corners at a distance of unity from its center. Similarly, Figure 3-1e shows a square of
half width $1/\sqrt{2}$ so that its corners are also at a distance of unity from its center. In these

two cases, while $\rho$ has its maximum value of unity at a corner, the value of $\cos\theta$ at that
point is not unity. Hence, in these cases, the coefficient of a classical aberration does not
represent its peak value. In the case of a rectangle, the value of $\cos\theta$ depends on the

value of a, but in the case of a square its value is $1/\sqrt{2}$. For example, coma
$A_c\rho^3\cos\theta$ has a peak value of $A_c/\sqrt{2}$ at a corner. Finally, a unit slit pupil with a half
width of unity is shown in Figure 3-1f. The value of a coefficient of a classical aberration
in this case does represent its peak value.

3.6 SUMMARY

The content of an aberration function can be determined by expanding it in terms of a

complete set of polynomials that are orthogonal over its domain and have the form of

familiar aberrations, such as those discussed in Chapter 2. The Zernike circle

polynomials, for example, are not only orthogonal over a circular pupil, but they also

represent balanced classical aberrations, as discussed in Chapter 4. It is advantageous to

use the polynomials in their orthonormal form so that the piston coefficient represents the

mean value of the aberration function and the other expansion coefficients represent the

standard deviations of the corresponding polynomial aberration terms. As illustrated by

Eq. (3-5), the value of an expansion coefficient is independent of the number of

polynomials used in the expansion. Moreover, each coefficient yields a least-squares fit to

the aberration function. The variance of the aberration function is given by the sum of the

squares of the coefficients (other than the piston), as in Eq. (3-8).

Figure 3-1. Unit pupils inscribed inside a unit circle. (a) annulus of obscuration ratio
ϵ, (b) hexagon with a side of unity, (c) ellipse of aspect ratio b, (d) rectangle of half
width a, (e) square of half width $1/\sqrt{2}$, and (f) slit of half width of unity.

Given a set of polynomials that are orthonormal over a certain domain, those that are

orthonormal over another domain can be obtained from them by the recursive Gram–

Schmidt orthonormalization process. They can also be obtained by a nonrecursive matrix

approach. Each new polynomial obtained is a linear combination of the basis

polynomials, as indicated by Eq. (3-18). We use the Zernike circle polynomials as the

basis functions to obtain the polynomials that are orthonormal over an annular, Gaussian,

hexagonal, elliptical, rectangular, or a square pupil. The slit pupil is a limiting case of a

rectangular pupil whose one dimension is negligibly small compared to the other. The

concept of a unit pupil is emphasized so that the farthest point or points on a pupil are at a

distance of unity from its center. It has the advantage that the coefficient of a single

aberration term represents its peak value. Thus, in each case the pupil is inscribed inside a

unit circle.


References

(McGraw-Hill, New York, 1968).

matrix formulation,” Opt. Lett. 32, 74–76 (2007).


Chapter 4

Systems with Circular Pupils

4.1 INTRODUCTION

Optical systems generally have a circular pupil. The imaging elements of such

systems also have a circular boundary. Therefore, they are also represented by circular

pupils in fabrication and testing. As a result, the Zernike circle polynomials have been in

widespread use since Zernike introduced them in his phase contrast method for testing

circular mirrors [1]. They are used in optical design and testing to understand the

aberration content of a wavefront. They have also been used for analyzing the wavefront

aberrations introduced by atmospheric turbulence on a wave propagating through it [2].

We start this chapter with a brief discussion of the point-spread function (PSF) and

the optical transfer function (OTF) of an aberration-free system with a circular pupil. We

then consider the effect of primary aberrations on the Strehl ratio of an image. Since the

Strehl ratio for small aberrations depends on the variance of an aberration, we balance a

classical aberration of a certain order with those of lower orders to reduce its variance.

The utility of the Zernike circle polynomials stems from the fact that they are not only

orthogonal over a circular pupil, but they also uniquely represent the balanced classical

aberrations yielding minimum variance over the pupil [3–6]. Because of their

orthogonality, when a circular wavefront is expanded in terms of them, the value of a

Zernike expansion coefficient is independent of the number of polynomials used in the

expansion. Hence, one or more polynomial terms can be added or subtracted without

affecting the other coefficients. The piston coefficient represents the mean value of the

aberration function, and the variance of the function is given simply by the sum of the

squares of the other expansion coefficients.

symmetry of its interferogram, the corresponding aberrated PSF, the real and imaginary

parts of the OTF, and the modulation transfer function (MTF). It is shown that the

interferogram, the real part of the OTF, and the corresponding MTF are 2m-fold whether

m is an even or an odd integer, but the PSF and the imaginary part of the OTF are m-fold

when m is odd. Numerical examples are given to illustrate the Zernike aberrations

isometrically, interferometrically, and by the corresponding PSFs, OTFs, and MTFs.

function and the corresponding Zernike expansion coefficients are considered. In

particular, we discuss how to obtain the Seidel coefficients from the Zernike coefficients

of an aberration function. We illustrate by an example how wrong Seidel coefficients are

obtained when using only the corresponding Zernike polynomials. Finally, we show how

the Zernike coefficients of an aberration function over a circular pupil change as its

diameter is reduced.

Consider an imaging system with a circular exit pupil of radius a, diameter $D = 2a$,
and area $S_{ex} = \pi a^2$ lying in the pupil plane $x_p y_p$ with z as its optical axis. The Cartesian
and polar coordinates $(x_p, y_p)$ and $(r_p, \theta)$ of a pupil point Q, as illustrated in Figure 4-1,
are related to each other according to

$$
(x_p, y_p) = r_p (\cos\theta, \sin\theta) .
$$

With the normalized radial coordinate $\rho = r_p/a$, so that $0 \le \rho \le 1$,
we refer to the pupil in the $(\rho, \theta)$ coordinates as a unit circular pupil in the sense of a unit
disc. For a uniformly illuminated pupil with an aberration function $\Phi(\vec{r}_p)$ and power $P_{ex}$
exiting from it, the pupil function of the system can be written

$$
P(\vec{r}_p) =
\begin{cases}
A(\vec{r}_p) \exp[i\Phi(\vec{r}_p)] , & |\vec{r}_p| \le a \\
0 , & \text{otherwise} ,
\end{cases} \tag{4-3}
$$

where

$$
A(\vec{r}_p) = \left( P_{ex} / S_{ex} \right)^{1/2} . \tag{4-4}
$$

Figure 4-1. (a) Circular exit pupil of radius a of an imaging system. (b) Circular
pupil as a unit disc. The polar coordinates of a point Q are $(r_p, \theta)$ in (a) and $(\rho, \theta)$
in (b).

4.3 ABERRATION-FREE IMAGING

4.3.1 PSF

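Using polar coordinates $(r_i, \theta_i)$ for an observation point in Eq. (2-9), the PSF
representing the irradiance distribution in the image plane for a circular pupil can be
written

$$
I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \!\! \int_0^{2\pi}
\exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i)]\, \rho\, d\rho\, d\theta \right|^2 , \tag{4-5}
$$

where r is the radial coordinate $r_i$ of the observation point in units of $\lambda F$, $\Phi(\rho, \theta)$ is
the phase aberration at a point $(\rho, \theta)$ in the pupil plane, and the irradiance is normalized
by the aberration-free central value $P_{ex} S_{ex} / \lambda^2 R^2 = \pi P_{ex} / 4\lambda^2 F^2$.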

For an aberration-free system, i.e., for a spherical wavefront exiting from the pupil so
that $\Phi(\rho, \theta) = 0$, Eq. (4-5) reduces to

$$
I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \!\! \int_0^{2\pi}
\exp[-\pi i r \rho \cos(\theta_p - \theta_i)]\, \rho\, d\rho\, d\theta_p \right|^2 . \tag{4-6}
$$

Noting that

$$
\int_0^{2\pi} \exp(i x \cos\alpha)\, d\alpha = 2\pi J_0(x) , \tag{4-7}
$$

where $J_0(\cdot)$ is the zero-order Bessel function of the first kind, Eq. (4-6) reduces to

$$
I(r) = 4 \left[ \int_0^1 J_0(\pi r \rho)\, \rho\, d\rho \right]^2 . \tag{4-8}
$$

Noting further that

$$
\int_0^a x J_0(bx)\, dx = \frac{a}{b} J_1(ab) , \tag{4-9}
$$

Eq. (4-8) yields

$$
I(r) = \left[ \frac{2 J_1(\pi r)}{\pi r} \right]^2 , \tag{4-10}
$$

where $J_1(\cdot)$ is the first-order Bessel function of the first kind. Integrating over a circle of
radius $r_c$ (in units of $\lambda F$), it can be shown that it contains a fractional power given by

$$
P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c) . \tag{4-11}
$$

Figure 4-2 shows a plot of Eq. (4-10), called the Airy pattern. It consists of a bright
spot at the center, called the Airy disc, surrounded by dark and bright diffraction rings.
The fractional power is also plotted in Figure 4-2a. The Airy disc has a radius of 1.22
(in units of $\lambda F$) and contains 83.8% of the total light, as may be seen by letting
$r_c = 1.22$ in Eq. (4-11). The center of the pattern lies at the Gaussian image point.

Figure 4-2. (a) Irradiance $I(r)$ and encircled power $P(r_c)$ distributions for an aberration-free
system with a circular pupil. (b) 2D PSF, called the Airy pattern.
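The Airy-pattern results can be reproduced numerically. The sketch below is illustrative and rests on a few assumptions. To stay within NumPy, the Bessel functions are evaluated from the standard integral representation of $J_n(x)$ rather than from a special-function library. The encircled power is taken to be the standard result $1 - J_0^2(\pi r_c) - J_1^2(\pi r_c)$. The function names are hypothetical.

```python
import numpy as np

def bessel_j(n, x, samples=512):
    """J_n(x) from the integral (1/2pi) * int_0^2pi cos(n*t - x*sin t) dt."""
    t = 2.0 * np.pi * np.arange(samples) / samples
    x = np.asarray(x, dtype=float)
    return np.cos(n * t - x[..., None] * np.sin(t)).mean(axis=-1)

def airy(r):
    """Closed-form PSF of Eq. (4-10); r in units of lambda*F, r > 0."""
    return (2.0 * bessel_j(1, np.pi * r) / (np.pi * r)) ** 2

# Gauss-Legendre nodes and weights mapped to 0 <= rho <= 1.
g, w = np.polynomial.legendre.leggauss(40)
rho, w_rho = 0.5 * (g + 1.0), 0.5 * w

def psf_from_integral(r):
    """Numerical evaluation of the radial integral in Eq. (4-8) for a scalar r."""
    return 4.0 * np.sum(w_rho * bessel_j(0, np.pi * r * rho) * rho) ** 2

def encircled_power(rc):
    """Fractional power within radius rc (assumed standard form of Eq. 4-11)."""
    return 1.0 - bessel_j(0, np.pi * rc) ** 2 - bessel_j(1, np.pi * rc) ** 2

print(airy(0.7), psf_from_integral(0.7))      # the two evaluations should agree
print(airy(1.22), encircled_power(1.22))      # first dark ring and its enclosed power
```

Evaluating both forms at the same radius checks the step from Eq. (4-8) to Eq. (4-10), and the values at r = 1.22 correspond to the edge of the Airy disc.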

4.3.2 OTF

From Eq. (2-11), the aberration-free OTF can be written

$$
\tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p)\, A(\vec{r}_p - \lambda R \vec{v}_i)\, d\vec{r}_p . \tag{4-12}
$$

It is evident that the OTF represents the fractional area of overlap of two circles, each of
radius a, separated by a distance $\lambda R v_i$, where $v_i = |\vec{v}_i|$. From Figure 4-3, we note that the
area of overlap is given by four times the difference between the area of a sector of radius
a and cone angle $\beta$, and the area of the triangle OAB. Hence, the OTF can be written

$$
\tau(v_i) = \frac{4}{S_{ex}} \left( \frac{\beta}{2\pi} \pi a^2 - \frac{1}{2}\, OA \cdot AB \right) . \tag{4-13}
$$

Substituting $OA = a\cos\beta$ and $AB = a\sin\beta$ into Eq. (4-13), we obtain

$$
\tau(v_i) = \frac{2}{\pi} (\beta - \sin\beta \cos\beta) \tag{4-14}
$$

$$
= \frac{2}{\pi} \left[ \cos^{-1} v - v \left( 1 - v^2 \right)^{1/2} \right] , \quad 0 \le v \le 1 , \tag{4-15}
$$

where $v = \cos\beta = \lambda F v_i$ is the spatial frequency normalized by the cutoff frequency $1/\lambda F$, at
which the overlap area reduces to zero. The OTF is radially symmetric because the
overlap area depends only on the separation $\lambda R v_i$ of the two pupils and is independent of
the direction of $\vec{v}_i$.

Figure 4-3. Aberration-free OTF as the fractional area of overlap of two circles of
radius a whose centers are separated by a distance $\lambda R v_i$.

Figure 4-4 shows how the OTF varies with v. The integral of the aberration-free
OTF that enters into the calculation of the Strehl ratio from the real part of the complex
aberrated OTF [see Eq. (2-25)] is given by

$$
\int_0^1 \tau(v)\, v\, dv = 1/8 . \tag{4-16}
$$

The slope of the OTF at the origin is given by

$$
\tau'(0) = -4/\pi . \tag{4-17}
$$

Although obtained from the aberration-free OTF, this slope is independent of any
aberration.
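Both numbers are easy to confirm from the closed-form OTF. The following is a minimal numerical check, assuming the expression $\tau(v) = (2/\pi)[\cos^{-1} v - v(1 - v^2)^{1/2}]$ for the normalized frequency v. The quadrature order and the finite-difference step are arbitrary choices of this sketch.

```python
import numpy as np

def otf(v):
    """Aberration-free OTF of a circular pupil for normalized frequency 0 <= v <= 1."""
    v = np.asarray(v, dtype=float)
    return (2.0 / np.pi) * (np.arccos(v) - v * np.sqrt(1.0 - v**2))

# Gauss-Legendre quadrature on [0, 1] for the integral in Eq. (4-16).
g, w = np.polynomial.legendre.leggauss(200)
v, w_v = 0.5 * (g + 1.0), 0.5 * w
integral = np.sum(w_v * otf(v) * v)

# One-sided finite difference for the slope at the origin, Eq. (4-17).
h = 1e-6
slope = (otf(h) - otf(0.0)) / h

print(integral, slope)      # compare with 1/8 and -4/pi
```

The OTF falls from unity at v = 0 to zero at the cutoff v = 1, with an initial slope set only by the pupil shape.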

Letting r = 0 in Eq. (4-5) for the irradiance distribution normalized by its aberration-
free central value, we obtain the Strehl ratio of an aberrated image:

$$
S = \frac{1}{\pi^2} \left| \int_0^1 \!\! \int_0^{2\pi} \exp[i\Phi(\rho, \theta)]\, \rho\, d\rho\, d\theta \right|^2 . \tag{4-18}
$$

[Figure 4-4: aberration-free OTF $\tau$ plotted against the normalized spatial frequency v.]

4.4.2 Defocus Strehl Ratio

Consider an observation being made in an image plane passing through a point $P_1$ at
a distance z from the exit pupil of a system, while a beam with a spherical wavefront W
is focused at a point $P_2$ at a distance R, as illustrated in Figure 1-6. The spherical
wavefront is aberrated with respect to the reference sphere S of radius of curvature z due
to the longitudinal defocus $z - R$. The defocus aberration may be written

$$
\Phi(\rho) = B_d \rho^2 , \tag{4-19}
$$

where the peak value $B_d$ of the phase aberration is related to the longitudinal defocus
according to

$$
B_d \simeq - \frac{\pi}{4\lambda F^2} (z - R) . \tag{4-20}
$$

distance z R , as in Figure 1-6. Substituting Eq. (4-19) into Eq. (4-18), we obtain the

Strehl ratio of the defocused image:

$$
S = \left[ \frac{\sin(B_d/2)}{B_d/2} \right]^2 . \tag{4-21}
$$

The Strehl ratio decreases as the aberration increases until it reaches a value of zero
when the aberration becomes $2\pi$ radians or one wave. As shown in Figure 4-5, it
fluctuates for increasing value of defocus, becoming zero when the aberration is an
integral number of waves. It should be evident that the defocused Strehl ratio represents
the axial irradiance of a focused beam.

Figure 4-5. Strehl ratio S of a defocused beam, representing its axial irradiance,
where $B_d$ is the defocus aberration in units of wavelength.
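The closed form can be compared with a direct evaluation of the Strehl integral of Eq. (4-18) for $\Phi = B_d\rho^2$. The sketch below assumes $B_d$ in radians (one wave equals $2\pi$) and uses `numpy.sinc`, which is defined as $\sin(\pi x)/(\pi x)$. The function names are illustrative.

```python
import numpy as np

def strehl_defocus(b_d):
    """Closed-form defocus Strehl ratio; b_d is the peak phase aberration in radians."""
    return np.sinc(np.asarray(b_d) / (2.0 * np.pi)) ** 2   # np.sinc(x) = sin(pi x)/(pi x)

# Gauss-Legendre nodes and weights mapped to 0 <= rho <= 1.
g, w = np.polynomial.legendre.leggauss(40)
rho, w_rho = 0.5 * (g + 1.0), 0.5 * w

def strehl_numeric(b_d):
    """Eq. (4-18) for Phi = b_d * rho^2; the theta integral contributes 2*pi."""
    amplitude = 2.0 * np.sum(w_rho * np.exp(1j * b_d * rho**2) * rho)
    return np.abs(amplitude) ** 2

for waves in (0.25, 0.5, 1.0):
    b_d = 2.0 * np.pi * waves
    print(waves, strehl_defocus(b_d), strehl_numeric(b_d))
```

At one wave of defocus both evaluations drop to zero (to rounding error), which is the first axial null described in the text.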

4.4.3 Approximate Expressions for Strehl Ratio

The approximate expressions for the Strehl ratio when the aberration is small are
given by Eqs. (2-31)–(2-33), i.e.,

$$
S_1 \simeq \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{4-22a}
$$

$$
S_2 \simeq 1 - \sigma_\Phi^2 , \tag{4-22b}
$$

and

$$
S_3 \simeq \exp\left( - \sigma_\Phi^2 \right) , \tag{4-22c}
$$

where

$$
\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 \tag{4-23}
$$

is the variance of the phase aberration across the pupil. The mean and the mean square
values of the aberration are obtained from the expression

$$
\langle \Phi^n \rangle = \pi^{-1} \int_0^1 \!\! \int_0^{2\pi} \Phi^n(\rho, \theta)\, \rho\, d\rho\, d\theta \tag{4-24}
$$

with n = 1 and 2, respectively.

Table 4-1 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or a
Seidel) aberration, where an aberration coefficient $A_i$ represents the peak value of the
aberration. It also lists the aberration tolerance, i.e., the value of the aberration coefficient
$A_i$, for a Strehl ratio of 0.8. This tolerance has been obtained by using the Strehl ratio
expression $S_2$, according to which the standard deviation for a Strehl ratio of 0.8 is given
by

$$
\sigma_\Phi = \sqrt{0.2} \tag{4-25}
$$

or

$$
\sigma_w = \lambda / 14.05 , \tag{4-26}
$$

where $\sigma_w$ is the sigma value of the wave aberration. The aberration tolerance listed in
Table 4-1 is for the wave (as opposed to the phase) aberration coefficient, as is customary
in optics. It should be understood that the tolerance numbers given are not accurate to the
second decimal place. They are listed as such for consistency only. We have used the
symbol $A_d$ for the coefficient of field curvature aberration, which varies quadratically

with the angle that a point object makes with the optical axis of the system. However, to

4.4.3 Approximate Expressions for Strehl Ratio 57

Table 4-1. Standard deviation and aberration tolerance for primary aberrations.

Spherical As r 4 2 As As l 4.19

=

3 5 3.35

=

2 2 2.83

4

=

(defocus) 2 3 3.46

2

avoid confusion, we have used the symbol Bd for representing the defocus wave

aberration, which is independent of the field angle but has the same dependence on pupil

coordinates as field curvature. Similarly, we have used the symbol At for distortion,

which varies as the cube of the field angle. But, we will use the symbol Bt to represent

the wavefront tilt, which is independent of the field angle but has the same dependence on

pupil coordinates as distortion.
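The standard deviations and tolerances of Table 4-1 can be checked numerically. The following is a minimal sketch, not the author's procedure: it evaluates Eq. (4-24) with a midpoint rule for unit aberration coefficients and converts each sigma into the coefficient that gives $\sigma_w = \lambda\sqrt{0.2}/2\pi$. The names `pupil_sigma`, `tolerance_denominator`, and `PRIMARY` are hypothetical helpers introduced only for this illustration.

```python
import math

def pupil_sigma(phi, n_rho=200, n_theta=200):
    """Standard deviation of phi(rho, theta) over the unit disc (midpoint rule)."""
    mean = 0.0
    mean_sq = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) / n_rho
        # (1/pi) * rho * drho * dtheta with drho = 1/n_rho, dtheta = 2*pi/n_theta
        weight = 2.0 * rho / (n_rho * n_theta)
        for k in range(n_theta):
            theta = 2.0 * math.pi * (k + 0.5) / n_theta
            value = phi(rho, theta)
            mean += weight * value
            mean_sq += weight * value * value
    return math.sqrt(mean_sq - mean * mean)

def tolerance_denominator(sigma_per_unit_coefficient):
    """Return x such that A_i = lambda / x yields sigma_w = lambda*sqrt(0.2)/(2*pi)."""
    sigma_w = math.sqrt(0.2) / (2.0 * math.pi)  # about 1/14.05 waves, Eq. (4-26)
    return sigma_per_unit_coefficient / sigma_w

# Primary aberrations with unit coefficient (assumed forms, as in Table 4-1).
PRIMARY = {
    "spherical": lambda rho, theta: rho ** 4,
    "coma": lambda rho, theta: rho ** 3 * math.cos(theta),
    "astigmatism": lambda rho, theta: rho ** 2 * math.cos(theta) ** 2,
    "defocus": lambda rho, theta: rho ** 2,
    "tilt": lambda rho, theta: rho * math.cos(theta),
}

if __name__ == "__main__":
    for name, phi in PRIMARY.items():
        sigma = pupil_sigma(phi)
        print(f"{name:12s} sigma = A/{1.0 / sigma:.2f}  "
              f"tolerance = lambda/{tolerance_denominator(sigma):.2f}")
```

Because the tolerance follows from $S_2$ alone, the computed denominators can differ from the tabulated ones in the second decimal place, in line with the remark above about their accuracy.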

The variance of a primary aberration can be reduced by observing the image in a defocused image plane, i.e., by mixing it with defocus aberration. Thus, for example, we balance primary spherical aberration with defocus aberration and write it as

$$\Phi(\rho) = A_s \rho^4 + B_d \rho^2 , \tag{4-27}$$

as discussed in Section 4.3. The mean and the mean square value of the aberration function are given by

$$\langle \Phi \rangle = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} \left(A_s \rho^4 + B_d \rho^2\right) \rho \, d\rho \, d\theta = \frac{A_s}{3} + \frac{B_d}{2} \tag{4-28}$$

and

$$\langle \Phi^2 \rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_s B_d}{2} . \tag{4-29}$$

Hence, the variance of the aberration is given by

$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 = \frac{4 A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_s B_d}{6} . \tag{4-30}$$

The value of $B_d$ that minimizes the variance is obtained by letting

$$\frac{\partial \sigma_\Phi^2}{\partial B_d} = 0 , \tag{4-31}$$

and checking that it yields a minimum and not a maximum. Thus, we find that the optimum value is $B_d = -A_s$, and the balanced aberration is given by

$$\Phi_{bs}(\rho) = A_s \left(\rho^4 - \rho^2\right) . \tag{4-32}$$

Its standard deviation or sigma value is $A_s/(6\sqrt{5})$, which is a factor of 4 smaller than the corresponding value $2A_s/(3\sqrt{5})$ for $B_d = 0$. Since the sigma value has been reduced by a factor of 4, its tolerance has been increased by the same factor. For example, $S = 0.8$ is obtained in the Gaussian image plane for $A_s = \lambda/4$. However, the same Strehl ratio is obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that $B_d = -\lambda$.
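The balancing step can be reproduced with exact rational arithmetic. This sketch assumes only that the pupil average of $\rho^k$ over the unit disc is $2/(k+2)$; the function names are hypothetical. It rebuilds the variance of Eq. (4-30) from the moments and locates the minimizing defocus by fitting the quadratic in $B_d$.

```python
from fractions import Fraction

def radial_moment(power):
    """Pupil average of rho**power over the unit disc: 2 / (power + 2)."""
    return Fraction(2, power + 2)

def variance(a_s, b_d):
    """Variance of a_s*rho**4 + b_d*rho**2 over the unit disc, from exact moments."""
    mean = a_s * radial_moment(4) + b_d * radial_moment(2)
    mean_sq = (a_s ** 2 * radial_moment(8)
               + b_d ** 2 * radial_moment(4)
               + 2 * a_s * b_d * radial_moment(6))
    return mean_sq - mean ** 2

def optimum_defocus(a_s):
    """Minimize the variance, which is quadratic in b_d: v = c0 + c1*b_d + c2*b_d**2."""
    v0, v_plus, v_minus = variance(a_s, 0), variance(a_s, 1), variance(a_s, -1)
    c1 = (v_plus - v_minus) / 2
    c2 = (v_plus + v_minus - 2 * v0) / 2
    return -c1 / (2 * c2)

if __name__ == "__main__":
    print("variance without balancing:", variance(1, 0))   # 4/45
    print("optimum B_d / A_s:", optimum_defocus(1))          # -1
    print("variance with balancing:", variance(1, -1))      # 1/180
```

The ratio of the two variances is 16, i.e., the sigma value drops by the factor of 4 quoted above.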

Similarly, we balance astigmatism with defocus and coma with tilt. Table 4-2 lists the form of a balanced primary aberration, its standard deviation, and its tolerance for a Strehl ratio of 0.8, according to Eq. (4-16b). Also listed in the table is the location of the diffraction focus, i.e., the point with respect to which the aberration variance is minimum so that the Strehl ratio is maximum at it. The amount of balancing defocus is minus half the amount of astigmatism, or the diffraction focus lies at a distance $4F^2 A_a$ along the z axis. The balancing tilt is minus two-thirds the amount of the coma. Thus, the maximum Strehl ratio is obtained at a point that is displaced from the Gaussian image point by $4FA_c/3$ but lies in the Gaussian image plane.

Table 4-2. Balanced primary aberrations, their diffraction focus, standard deviation, and aberration tolerance.

| Balanced aberration | $\Phi(\rho,\theta)$ | Diffraction focus* | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
| Spherical | $A_s(\rho^4 - \rho^2)$ | $(0, 0, 8F^2 A_s)$ | $A_s/(6\sqrt{5})$ | $0.955\lambda$ |
| Coma | $A_c(\rho^3 - 2\rho/3)\cos\theta$ | $(4FA_c/3, 0, 0)$ | $A_c/(6\sqrt{2})$ | $0.604\lambda$ |
| Astigmatism | $A_a \rho^2(\cos^2\theta - 1/2) = (A_a/2)\rho^2\cos 2\theta$ | $(0, 0, 4F^2 A_a)$ | $A_a/(2\sqrt{6})$ | $0.349\lambda$ |

*The diffraction focus coordinates are relative to the Gaussian image point.

For primary aberrations, $S_1$ and $S_2$ underestimate the true Strehl ratio $S$. $S_3$ gives a better approximation for the true Strehl ratio than $S_1$ and $S_2$. The reason is that, for small values of $\sigma_w$, it is larger than $S_1$ by approximately $\sigma_\Phi^4/4$. Of course, $S_1$ is larger than $S_2$ by $\sigma_\Phi^4/4$. The expression $S_3$ underestimates the true Strehl ratio only for coma and astigmatism; it overestimates for the other aberrations. Numerical analysis shows that the error, defined as $100(1 - S_3/S)$, is < 10% for $S > 0.3$ [5,7].
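The relative ordering of the three approximations is easy to see numerically. The sketch below assumes the wave-aberration sigma is supplied in wavelengths, so that $\sigma_\Phi = 2\pi\sigma_w$; `strehl_approximations` is a hypothetical helper name.

```python
import math

def strehl_approximations(sigma_w):
    """Return (S1, S2, S3) of Eqs. (4-22a)-(4-22c) for sigma_w given in wavelengths."""
    var_phi = (2.0 * math.pi * sigma_w) ** 2   # phase variance sigma_Phi**2
    s1 = (1.0 - var_phi / 2.0) ** 2
    s2 = 1.0 - var_phi
    s3 = math.exp(-var_phi)
    return s1, s2, s3

if __name__ == "__main__":
    for denominator in (14.05, 20.0, 30.0):
        s1, s2, s3 = strehl_approximations(1.0 / denominator)
        print(f"sigma_w = lambda/{denominator}: S1 = {s1:.4f}, S2 = {s2:.4f}, S3 = {s3:.4f}")
```

For $\sigma_w = \lambda/14.05$ the phase variance is 0.2, so $S_2 = 0.8$, $S_1 = 0.81$, and $S_3 = \exp(-0.2)$, consistent with the $\sigma_\Phi^4/4$ differences stated above.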

Rayleigh [8] showed that a quarter-wave of primary spherical aberration reduces the irradiance at the Gaussian image point by 20%, i.e., the Strehl ratio for this aberration is 0.8. This result has brought forth Rayleigh's $\lambda/4$ rule; namely, that a Strehl ratio of approximately 0.8 is obtained if the maximum absolute value of the aberration at any point in the pupil is equal to $\lambda/4$. A variant of this definition is that an aberrated wavefront that lies between two concentric spheres spaced a quarter-wave apart will give a Strehl ratio of approximately 0.8. Thus, instead of $W_p = \lambda/4$, we require $W_{pv} = \lambda/4$, where $W_p$ is the peak absolute value and $W_{pv}$ is the peak-to-valley (P-V) value of the aberration. However, a Strehl ratio of 0.8 is obtained for $W_p = \lambda/4 = W_{pv}$ for spherical aberration only. For other primary aberrations, distinctly different values of $W_p$ and $W_{pv}$ give a Strehl ratio of 0.8 [5,9]. Thus, it is advantageous to use $\sigma_w$ for estimating the Strehl ratio. A Strehl ratio of $S \gtrsim 0.8$ is obtained for $\sigma_w \lesssim \lambda/14$.

When a certain aberration is balanced with other aberrations to minimize its variance, the balanced aberration does not necessarily yield a higher or the highest possible Strehl ratio. For small aberrations, a maximum Strehl ratio is obtained when the variance is minimum. For large aberrations, however, there is no simple relationship between the Strehl ratio and the aberration variance. For example [9], when $A_s = 3\lambda$, the optimum amount of defocus is $B_d = -3\lambda$, but the Strehl ratio is a minimum and equal to 0.12. The Strehl ratio is maximum and equal to 0.26 for $B_d \simeq -4\lambda$ or $-2\lambda$. For $A_s \lesssim 2.3\lambda$, the axial irradiance is maximum at a point with respect to which the aberration variance is minimum. Similarly, in the case of coma, the maximum irradiance in the image plane occurs at the point with respect to which the aberration variance is minimum only if $A_c \lesssim 0.7\lambda$, which in turn corresponds to $S \gtrsim 0.76$. For larger values of $A_c$, the distance of the point of maximum irradiance does not increase linearly with its value and even fluctuates in some regions [10]. Moreover, it is found that for $A_c > 2.3\lambda$, the Seidel coma gives a larger Strehl ratio than the balanced coma, i.e., the irradiance in the image plane at the origin is larger than at the point with respect to which the aberration variance is minimum. Thus, only for large Strehl ratios, the irradiance is maximum at the point associated with the minimum aberration variance.

The defocused PSFs are shown in Figure 4-6 to illustrate the zero Strehl ratio for an integral number of waves of defocus aberration. As an illustration of the improvement in the Strehl ratio by aberration balancing, Table 4-3 lists the Strehl ratio of a primary aberration with and without balancing for a quarter wave of aberration. The Strehl ratio for a quarter wave of defocus is 0.811. As shown in Figure 4-7, the Strehl ratio for a quarter wave of spherical aberration improves from a value of 0.800 to 0.986 when it is balanced with an equal and opposite amount of defocus aberration. In the case of coma, a Strehl ratio of 0.737 is obtained, but a peak of value 0.966 lies to the right of the origin, as shown in Figure 4-8. When coma is balanced with a wavefront tilt equal to 2/3 the amount of coma, the peak moves to the origin and the Strehl ratio increases from 0.737 to 0.966. In the case of astigmatism, as shown in Figure 4-9, the Strehl ratio increases from a value of 0.857 to 0.902 when it is balanced with defocus.
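The radially symmetric entries quoted above can be reproduced directly from the definition of the Strehl ratio as the squared modulus of the pupil-averaged phasor. The sketch below is an illustration under that assumption, with the wave aberration given in wavelengths; `strehl_radial` is a hypothetical name, and the integral over $\rho$ is evaluated with the substitution $t = \rho^2$ and a midpoint rule.

```python
import cmath
import math

def strehl_radial(wave_aberration, samples=4000):
    """Strehl ratio for a radially symmetric aberration W(rho) in wavelengths.

    Evaluates S = |2 * integral_0^1 exp(2*pi*i*W(rho)) rho drho|**2 using
    t = rho**2, so that 2*rho*drho = dt, and a midpoint rule in t.
    """
    total = 0j
    for k in range(samples):
        t = (k + 0.5) / samples
        rho = math.sqrt(t)
        total += cmath.exp(2j * math.pi * wave_aberration(rho))
    return abs(total / samples) ** 2

if __name__ == "__main__":
    quarter = 0.25
    print("defocus, lambda/4:           ", round(strehl_radial(lambda r: quarter * r ** 2), 3))
    print("spherical, lambda/4:         ", round(strehl_radial(lambda r: quarter * r ** 4), 3))
    print("balanced spherical, lambda/4:",
          round(strehl_radial(lambda r: quarter * (r ** 4 - r ** 2)), 3))
    print("defocus, one wave:           ", round(strehl_radial(lambda r: r ** 2), 3))
```

Coma and astigmatism are not covered by this one-dimensional integral because they depend on $\theta$.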

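The variance of secondary spherical aberration ($\rho^6$), secondary coma ($\rho^5 \cos\theta$), and secondary astigmatism ($\rho^4 \cos^2\theta$) can be reduced similarly by mixing them with appropriate aberrations of lower order. The secondary spherical aberration is balanced with primary spherical aberration and defocus to minimize its variance. The balanced secondary spherical aberration thus obtained is given by

$$\Phi_{bss}(\rho) = \rho^6 - 1.5\rho^4 + 0.6\rho^2 . \tag{4-33}$$

Similarly, secondary coma is balanced with primary coma and wavefront tilt to minimize its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsc}(\rho, \theta) = \left(\rho^5 - 1.2\rho^3 + 0.3\rho\right) \cos\theta . \tag{4-34}$$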

Figure 4-6. PSFs for a quarter-wave and one wave of defocus as a function of r in units of $\lambda F$. For clarity, the curve for $B_d = 1$ has been multiplied by ten. The aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is shown by the solid curve. (Axes: irradiance $I(r)$ versus $r$; curves for $B_d = 0$, 1/4, and 1.)

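Table 4-3. Strehl ratio S for a quarter-wave of a primary aberration with and without balancing for a circular pupil, i.e., for $B_d = A_a = A_c = A_s = \lambda/4$ and $0 \le \rho \le 1$.

| Aberration | S |
| Aberration free | 1 |
| Defocus, $B_d \rho^2$ | 0.811 |
| Astigmatism, $A_a \rho^2 \cos^2\theta$ | 0.857 |
| Balanced astigmatism, $A_a \rho^2 \left(\cos^2\theta - 1/2\right)$ | 0.902 |
| Coma, $A_c \rho^3 \cos\theta$ | 0.737 |
| Balanced coma, $A_c \left(\rho^3 - 2\rho/3\right) \cos\theta$ | 0.966 |
| Spherical aberration, $A_s \rho^4$ | 0.800 |
| Balanced spherical aberration, $A_s \left(\rho^4 - \rho^2\right)$ | 0.986 |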

Figure 4-7. PSFs for a quarter-wave of spherical aberration with and without balancing with equal and opposite amount of defocus. The aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is shown by the solid curve. (Axes: irradiance $I(r)$ versus $r$; curves labeled Spherical and Balanced Spherical.)

Figure 4-8. PSFs for a quarter-wave of coma along the x axis (in units of $\lambda F$) with and without the balancing tilt. The aberration-free PSF is shown by the solid curve. (Axes: irradiance $I(x, 0)$ versus $x$; curves labeled Coma and Balanced Coma.)

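Secondary astigmatism is balanced with primary spherical aberration, primary astigmatism, and defocus to minimize its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsa}(\rho, \theta) = \rho^4 \cos^2\theta - \frac{1}{2}\rho^4 - \frac{3}{4}\rho^2 \cos^2\theta + \frac{3}{8}\rho^2 = \frac{1}{2}\left(\rho^4 - \frac{3}{4}\rho^2\right) \cos 2\theta . \tag{4-35}$$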

Figure 4-9. PSFs for a quarter-wave of astigmatism along the x axis (in units of $\lambda F$) with and without the balancing defocus. The aberration-free PSF is shown by the solid curve. (Axes: irradiance $I(x, 0)$ versus $x$; curves labeled Astigmatism and Balanced Astigmatism.)


order aberrations to minimize their variance, it is found [11] that a maximum of Strehl

ratio is obtained only if its value comes out to be greater than about 0.5. Otherwise, a

mixture of aberrations yielding a larger-than-minimum possible variance gives a higher

Strehl ratio than the one provided by a minimum-variance mixture.

4.6.1 Analytical Form

In his phase contrast method for testing the figure of circular mirrors, which he

proposed as an improvement over the Foucault knife-edge test, Zernike introduced his

circle polynomials as eigenfunctions of a second-order differential equation in two

variables [1]. These polynomials, which form a complete orthogonal set for the interior of

a unit circle, are the well-known circle polynomials. Nijboer used these polynomials to

study the balancing of classical aberrations of a power-series expansion of the aberration

function and the effect of small aberrations on the diffraction images formed by

rotationally symmetric imaging systems with circular pupils [2].

The circle polynomials can be written in the form

$$Z_n^m(\rho, \theta) = \left[\frac{2(n+1)}{1 + \delta_{m0}}\right]^{1/2} R_n^m(\rho) \cos m\theta , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-36}$$

where n and m are positive integers including zero, $n - m \ge 0$ and even, and $R_n^m(\rho)$ is a radial polynomial given by

$$R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n-s)!}{s! \left(\frac{n+m}{2} - s\right)! \left(\frac{n-m}{2} - s\right)!} \rho^{n-2s} \tag{4-37}$$

with a degree n in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, ..., and $\rho^m$. It is clear from Eq. (4-36) that the circle polynomials are separable in the polar coordinates $\rho$ and $\theta$ of a pupil point.
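Equation (4-37) translates directly into a short routine. The following is a minimal sketch with hypothetical function names; it returns the integer coefficients of each power of $\rho$ and evaluates the polynomial from them.

```python
from math import factorial

def zernike_radial_coefficients(n, m):
    """Map each power of rho to its integer coefficient in R_n^m(rho), Eq. (4-37)."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n with n - m even")
    coefficients = {}
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (factorial(s)
                       * factorial((n + m) // 2 - s)
                       * factorial((n - m) // 2 - s))
        coefficients[n - 2 * s] = numerator // denominator  # division is exact
    return coefficients

def zernike_radial(n, m, rho):
    """Evaluate R_n^m(rho) from its coefficients."""
    return sum(c * rho ** power
               for power, c in zernike_radial_coefficients(n, m).items())

if __name__ == "__main__":
    print(zernike_radial_coefficients(4, 0))   # coefficients of 6 rho^4 - 6 rho^2 + 1
    print(zernike_radial(6, 0, 1.0))           # value at the pupil edge
```

The coefficient dictionary makes it easy to compare against the radial parts listed in Table 4-4.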

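even or odd. It is normalized such that $R_n^m(1) = 1$,

and

$$R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ -\delta_{m0} & \text{for odd } n/2 . \end{cases} \tag{4-40}$$

The polynomials $R_n^0(\rho)$ are related to the Legendre polynomial $P_n(\cdot)$ according to

$$R_{2n}^0(\rho) = P_n\left(2\rho^2 - 1\right) . \tag{4-41}$$

The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi} \cos m\theta \cos m'\theta \, d\theta = \pi \left(1 + \delta_{m0}\right) \delta_{mm'} . \tag{4-42}$$

The radial polynomials obey the orthogonality relation

$$\int_0^1 R_n^m(\rho) R_{n'}^m(\rho) \, \rho \, d\rho = \frac{1}{2(n+1)} \delta_{nn'} . \tag{4-43}$$

In Eq. (4-43), the m value is the same for both radial polynomials because of the orthogonality Eq. (4-42) of the trigonometric functions. Accordingly, the polynomials $Z_n^m(\rho, \theta)$ are orthonormal according to

$$\frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta) Z_{n'}^{m'}(\rho, \theta) \, \rho \, d\rho \, d\theta = \delta_{nn'} \delta_{mm'} . \tag{4-44}$$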
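The radial orthogonality relation of Eq. (4-43) can be verified exactly, since the integrand is a polynomial in $\rho$. The sketch below uses rational arithmetic; the helper names are hypothetical, and the coefficient routine simply restates Eq. (4-37) so the block is self-contained.

```python
from fractions import Fraction
from math import factorial

def radial_coefficients(n, m):
    """Integer coefficients of R_n^m(rho), keyed by the power of rho (Eq. 4-37)."""
    return {
        n - 2 * s: (-1) ** s * factorial(n - s)
        // (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        for s in range((n - m) // 2 + 1)
    }

def radial_inner_product(n1, n2, m):
    """Exact value of the integral of R_n1^m(rho) R_n2^m(rho) rho over 0 <= rho <= 1."""
    total = Fraction(0)
    for p1, c1 in radial_coefficients(n1, m).items():
        for p2, c2 in radial_coefficients(n2, m).items():
            # integral of rho**(p1 + p2 + 1) from 0 to 1 is 1 / (p1 + p2 + 2)
            total += Fraction(c1 * c2, p1 + p2 + 2)
    return total

if __name__ == "__main__":
    print(radial_inner_product(4, 4, 0))   # 1/10, i.e. 1 / [2 (n + 1)] for n = 4
    print(radial_inner_product(4, 2, 0))   # 0
```

Multiplying the diagonal values by $2(n+1)$ gives unity, which is the origin of the normalization factor in Eq. (4-36).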

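random in nature, we need both the cosine and the sine Zernike circle polynomials to express them. It is convenient in such cases to write their form and numbering as [5]:

$$Z_{\text{even } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \cos m\theta , \quad m \ne 0 , \tag{4-45a}$$

$$Z_{\text{odd } j}(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \sin m\theta , \quad m \ne 0 , \tag{4-45b}$$

$$Z_j(\rho, \theta) = \sqrt{n+1} \, R_n^0(\rho) , \quad m = 0 . \tag{4-45c}$$

An even number is associated with a cosine polynomial and an odd number with a sine polynomial. The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi} d\theta \begin{cases} \cos m\theta \cos m'\theta , & j \text{ and } j' \text{ are both even} \\ \cos m\theta \sin m'\theta , & j \text{ is even and } j' \text{ is odd} \\ \sin m\theta \cos m'\theta , & j \text{ is odd and } j' \text{ is even} \\ \sin m\theta \sin m'\theta , & j \text{ and } j' \text{ are both odd} \end{cases} = \begin{cases} \pi \left(1 + \delta_{m0}\right) \delta_{mm'} , & j \text{ and } j' \text{ are both even} \\ \pi \delta_{mm'} , & j \text{ and } j' \text{ are both odd} \\ 0 , & \text{otherwise} . \end{cases} \tag{4-46}$$

Therefore, the Zernike circle polynomials are orthonormal over a unit disc according to

$$\int_0^1 \int_0^{2\pi} Z_j(\rho, \theta) Z_{j'}(\rho, \theta) \, \rho \, d\rho \, d\theta \Big/ \int_0^1 \int_0^{2\pi} \rho \, d\rho \, d\theta = \delta_{jj'} . \tag{4-47}$$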

The orthonormal Zernike circle polynomials and the names associated with some of them when identified with the classical aberrations are listed in Table 4-4 in polar coordinates for $n \le 8$. The polynomials independent of $\theta$ are the spherical aberrations, those varying as $\cos\theta$ are the coma aberrations, and those varying as $\cos 2\theta$ are the astigmatism aberrations. The variation of several radial polynomials $R_n^m(\rho)$ with $\rho$ is illustrated in Figure 4-10. A polynomial with an even value of n has a value of zero at $n/2$ values of $\rho$, e.g., for defocus, astigmatism, and various orders of spherical aberration. A polynomial with an odd value of n has a value of zero at $(n+1)/2$ values of $\rho$, e.g., for various orders of coma. The larger the value of n of a polynomial, the more oscillatory the polynomial.

The index n of a Zernike polynomial represents its radial degree or the order, since it represents the highest power of $\rho$ in the polynomial. This is different from the order of a classical aberration, which represents the degree of the object (for which the aberration function is considered) and pupil points in Cartesian coordinates (see Section 1.6). The index m of a polynomial is referred to as its azimuthal frequency. The index j is a polynomial-ordering number and is a function of both n and m. The polynomials in Table 4-4 are ordered such that an even j corresponds to a symmetric polynomial varying as $\cos m\theta$, while an odd j corresponds to an antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of n is ordered first, and for a given value of n, a polynomial with a lower value of m is ordered first.

The number of circle polynomials of a given order n is $n + 1$. Their number through a certain order n is given by

$$N_n = (n+1)(n+2)/2 . \tag{4-48}$$

For a rotationally symmetric imaging system, each of the $\sin m\theta$ terms is zero, as discussed in Section 1.6. Accordingly, the number of polynomials of an even order is $(n/2) + 1$ and $(n+1)/2$ for an odd order. Their number through an order n is given by

$$N_n = \left[(n/2) + 1\right]^2 \quad \text{for even } n , \tag{4-49a}$$

and

$$N_n = (n+1)(n+3)/4 \quad \text{for odd } n . \tag{4-49b}$$

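Table 4-4. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices j, n, and m are called the polynomial number, radial degree, and azimuthal frequency, respectively. The polynomials $Z_j$ are ordered such that an even j corresponds to a symmetric polynomial varying as $\cos m\theta$, while an odd j corresponds to an antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of n is ordered first, and for a given value of n, a polynomial with a lower value of m is ordered first.

| j | n | m | $Z_j(\rho, \theta)$ | Aberration name* |
| 1 | 0 | 0 | $1$ | Piston |
| 2 | 1 | 1 | $2\rho\cos\theta$ | x-tilt |
| 3 | 1 | 1 | $2\rho\sin\theta$ | y-tilt |
| 4 | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Defocus |
| 5 | 2 | 2 | $\sqrt{6}\,\rho^2 \sin 2\theta$ | 45° Primary astigmatism |
| 6 | 2 | 2 | $\sqrt{6}\,\rho^2 \cos 2\theta$ | 0° Primary astigmatism |
| 7 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \sin\theta$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \cos\theta$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt{8}\,\rho^3 \sin 3\theta$ | |
| 10 | 3 | 3 | $\sqrt{8}\,\rho^3 \cos 3\theta$ | |
| 11 | 4 | 0 | $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$ | Primary spherical aberration |
| 12 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \cos 2\theta$ | 0° Secondary astigmatism |
| 13 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \sin 2\theta$ | 45° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt{10}\,\rho^4 \cos 4\theta$ | |
| 15 | 4 | 4 | $\sqrt{10}\,\rho^4 \sin 4\theta$ | |
| 16 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho) \cos\theta$ | Secondary x-coma |
| 17 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho) \sin\theta$ | Secondary y-coma |
| 18 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \cos 3\theta$ | |
| 19 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \sin 3\theta$ | |
| 20 | 5 | 5 | $\sqrt{12}\,\rho^5 \cos 5\theta$ | |
| 21 | 5 | 5 | $\sqrt{12}\,\rho^5 \sin 5\theta$ | |
| 22 | 6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2) \sin 2\theta$ | 45° Tertiary astigmatism |
| 24 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2) \cos 2\theta$ | 0° Tertiary astigmatism |
| 25 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \sin 4\theta$ | |
| 26 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \cos 4\theta$ | |
| 27 | 6 | 6 | $\sqrt{14}\,\rho^6 \sin 6\theta$ | |
| 28 | 6 | 6 | $\sqrt{14}\,\rho^6 \cos 6\theta$ | |
| 29 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho) \sin\theta$ | Tertiary y-coma |
| 30 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho) \cos\theta$ | Tertiary x-coma |
| 31 | 7 | 3 | $4\,(21\rho^7 - 30\rho^5 + 10\rho^3) \sin 3\theta$ | |
| 32 | 7 | 3 | $4\,(21\rho^7 - 30\rho^5 + 10\rho^3) \cos 3\theta$ | |
| 33 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \sin 5\theta$ | |
| 34 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \cos 5\theta$ | |
| 35 | 7 | 7 | $4\,\rho^7 \sin 7\theta$ | |
| 36 | 7 | 7 | $4\,\rho^7 \cos 7\theta$ | |
| 37 | 8 | 0 | $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$ | Tertiary spherical |
| 38 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2) \cos 2\theta$ | 0° Quaternary astigmatism |
| 39 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2) \sin 2\theta$ | 45° Quaternary astigmatism |
| 40 | 8 | 4 | $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4) \cos 4\theta$ | |
| 41 | 8 | 4 | $\sqrt{18}\,(28\rho^8 - 42\rho^6 + 15\rho^4) \sin 4\theta$ | |
| 42 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \cos 6\theta$ | |
| 43 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \sin 6\theta$ | |
| 44 | 8 | 8 | $\sqrt{18}\,\rho^8 \cos 8\theta$ | |
| 45 | 8 | 8 | $\sqrt{18}\,\rho^8 \sin 8\theta$ | |

*The words "orthonormal Zernike circle" are to be associated with these names, e.g., orthonormal Zernike circle 0° primary astigmatism.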

Figure 4-10. Variation of the radial polynomials $R_n^m(\rho)$ with $\rho$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism. (Axes: $R_n^m(\rho)$ from -1 to 1 versus $\rho$ from 0 to 1; curves labeled by n.)

4.6.5 Relationships among the Indices n, m, and j

The number of polynomials $N_n$ through a certain order n represents the largest value of j. Since the number of polynomials with the same value of n but different values of m is equal to $n + 1$, the smallest value of j for a given value of n is $N_n - n$. For a given value of n and $m \ne 0$, there are two j values, $N_n - n + m - 1$ and $N_n - n + m$. The even value of j represents the $\cos m\theta$ polynomial, and the odd value of j represents the $\sin m\theta$ polynomial. The value of j with $m = 0$ is $N_n - n$. For example, for $n = 5$, $N_n = 21$ and $j = 21$ represents the $\sin 5\theta$ polynomial. The number of the corresponding $\cos 5\theta$ polynomial is $j = 20$. The two polynomials with $m = 3$, for example, have j values of 18 and 19, representing the $\cos 3\theta$ and the $\sin 3\theta$ polynomials, respectively.

For a given value of j, the value of n is given by

$$n = \left[(2j - 1)^{1/2} + 0.5\right]_{\text{integer}} - 1 , \tag{4-50}$$

where the subscript integer implies the integer value of the number in brackets. Once n is known, the value of m is given by

$$m = \begin{cases} 2\left\{\left[2j + 1 - n(n+1)\right]/4\right\}_{\text{integer}} & \text{when } n \text{ is even} \quad \text{(4-51a)} \\ 2\left\{\left[2(j+1) - n(n+1)\right]/4\right\}_{\text{integer}} - 1 & \text{when } n \text{ is odd} . \quad \text{(4-51b)} \end{cases}$$

For example, suppose we want to know the values of n and m for the polynomial $j = 10$. From Eq. (4-50), $n = 3$ and from Eq. (4-51b), $m = 3$. Hence, it is a $\cos 3\theta$ polynomial.
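The index relationships of Eqs. (4-48), (4-50), and (4-51) are simple to implement. The sketch below uses hypothetical function names and interprets the "integer value" as truncation toward zero, which coincides with the floor for the positive numbers involved.

```python
import math

def polynomials_through_order(n):
    """N_n of Eq. (4-48): number of circle polynomials through order n."""
    return (n + 1) * (n + 2) // 2

def indices_from_j(j):
    """Return (n, m) for the polynomial number j using Eqs. (4-50) and (4-51)."""
    n = int(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        m = 2 * ((2 * j + 1 - n * (n + 1)) // 4)
    else:
        m = 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1
    return n, m

def angular_part(j):
    """Angular dependence implied by the ordering: even j is cosine, odd j is sine."""
    _, m = indices_from_j(j)
    if m == 0:
        return "radial"
    return "cos" if j % 2 == 0 else "sin"

if __name__ == "__main__":
    for j in (10, 18, 19, 21, 22):
        n, m = indices_from_j(j)
        print(f"j = {j}: n = {n}, m = {m}, {angular_part(j)}")
```

Looping over all j from $N_n - n$ to $N_n$ returns the same n each time, which is a quick consistency check of Eq. (4-50) against Eq. (4-48).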

The Zernike circle polynomials have certain unique mathematical properties. They are the only polynomials in two variables $\rho$ and $\theta$, which (a) are orthogonal over a circle, (b) are invariant in form with respect to rotation of the coordinate axes about the origin, and (c) include a polynomial for each permissible pair of n and m values [4,12].

From the standpoint of wavefront analysis, their uniqueness lies in the fact that they are not only orthogonal over a circular pupil, but include wavefront tilt, defocus, and balanced classical aberrations as members of the polynomial set for such a pupil. For example, $Z_6$, $Z_8$, and $Z_{11}$ represent the balanced primary aberrations of astigmatism, coma, and spherical aberration, as may be seen by comparing their forms with those given in Table 4-2. Similarly, $Z_{12}$, $Z_{16}$, and $Z_{22}$ represent the balanced secondary aberrations of astigmatism, coma, and spherical aberration, respectively, as may be seen by comparing their forms with those given in Eqs. (4-33)–(4-35), respectively. Note that the constant term in a radially symmetric aberration is needed to make its mean value zero over the pupil. A balanced classical aberration in the form of a Zernike polynomial is referred to as a Zernike or orthogonal aberration, e.g., $Z_6$ is Zernike primary astigmatism or $Z_8$ is Zernike primary coma. In Section 4.5, aberrations with only $\cos m\theta$ type dependence are considered, as would be the case for a rotationally symmetric imaging system. In general, an aberration function will also have $\sin m\theta$ type terms, for example, due to fabrication errors or those due to atmospheric turbulence. The corresponding polynomials with $\sin m\theta$ dependence are considered in Section 4.6.

The circle polynomials given in polar coordinates in Table 4-4 can be written in the Cartesian coordinates $(x, y)$ of a pupil point, and $\cos m\theta$ and $\sin m\theta$ can be written in terms of powers of $\cos\theta$ and $\sin\theta$, respectively. They are listed in Table 4-5 using the polynomial ordering index j. It is quite common in the optics literature to consider a point object lying along the y axis when imaged by a rotationally symmetric optical system, thus making the yz plane the tangential plane [4]. To maintain symmetry of the aberration function about this plane, the polar angle $\theta$ of a pupil point in Figure 4-1 is accordingly defined as the angle made by its position vector OQ with the y axis, contrary to the standard convention as the angle with the x axis. We choose a point object along the x axis so that, for example, the coma aberration is expressed as $x(x^2 + y^2)$ and not as $y(x^2 + y^2)$. A positive value of our coma aberration yields a diffraction point spread function that is symmetric about the x axis (or symmetric in y) with its peak and centroid shifted to a positive value of x with respect to the Gaussian image point.

available at a uniformly spaced array of points in Cartesian coordinates. Hence, it is

convenient to carry out numerical analysis in a Cartesian coordinate system using the

Zernike circle polynomials in Cartesian coordinates.

4.7 ZERNIKE CIRCLE COEFFICIENTS OF A CIRCULAR ABERRATION FUNCTION

The aberration function $W(\rho, \theta)$ of a rotationally symmetric imaging system for a certain point object can be expanded in terms of the orthonormal Zernike circle polynomials $Z_n^m(\rho, \theta)$ that are orthonormal over a unit disc in the form

$$W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-52}$$

where $c_{nm}$ are the orthonormal expansion coefficients that depend on the object location. The orthonormal Zernike expansion coefficients are given by

$$c_{nm} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_n^m(\rho, \theta) \, \rho \, d\rho \, d\theta , \tag{4-53}$$

as may be seen by substituting Eq. (4-52) and utilizing the orthonormality Eq. (4-44) of the polynomials.

Table 4-5. Orthonormal Zernike circle polynomials $Z_j(x, y)$ in Cartesian coordinates $(x, y)$, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $0 \le \rho = (x^2 + y^2)^{1/2} \le 1$.

| Poly. | n | m | $Z_j(x, y)$ | Name |
| $Z_1$ | 0 | 0 | $1$ | Piston |
| $Z_2$ | 1 | 1 | $2x$ | x tilt |
| $Z_3$ | 1 | 1 | $2y$ | y tilt |
| $Z_4$ | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Defocus |
| $Z_6$ | 2 | 2 | $\sqrt{6}\,(x^2 - y^2)$ | 0° Primary astig. |
| $Z_9$ | 3 | 3 | $\sqrt{8}\,y(3x^2 - y^2)$ | |
| $Z_{10}$ | 3 | 3 | $\sqrt{8}\,x(x^2 - 3y^2)$ | |
| $Z_{14}$ | 4 | 4 | $\sqrt{10}\,(\rho^4 - 8x^2y^2)$ | |
| $Z_{15}$ | 4 | 4 | $4\sqrt{10}\,xy(x^2 - y^2)$ | |
| $Z_{18}$ | 5 | 3 | $\sqrt{12}\,x(x^2 - 3y^2)(5\rho^2 - 4)$ | |
| $Z_{19}$ | 5 | 3 | $\sqrt{12}\,y(3x^2 - y^2)(5\rho^2 - 4)$ | |
| $Z_{20}$ | 5 | 5 | $\sqrt{12}\,x(16x^4 - 20x^2\rho^2 + 5\rho^4)$ | |
| $Z_{21}$ | 5 | 5 | $\sqrt{12}\,y(16y^4 - 20y^2\rho^2 + 5\rho^4)$ | |
| $Z_{22}$ | 6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical |
| $Z_{23}$ | 6 | 2 | $2\sqrt{14}\,xy(15\rho^4 - 20\rho^2 + 6)$ | |
| $Z_{26}$ | 6 | 4 | $\sqrt{14}\,(8x^4 - 8x^2\rho^2 + \rho^4)(6\rho^2 - 5)$ | |
| $Z_{27}$ | 6 | 6 | $\sqrt{14}\,xy(32x^4 - 32x^2\rho^2 + 6\rho^4)$ | |
| $Z_{28}$ | 6 | 6 | $\sqrt{14}\,(32x^6 - 48x^4\rho^2 + 18x^2\rho^4 - \rho^6)$ | |
| $Z_{29}$ | 7 | 1 | $4y(35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$ | Tertiary y-coma |
| $Z_{33}$ | 7 | 5 | $4(7\rho^2 - 6)\left[4x^2y(x^2 - y^2) + y(\rho^4 - 8x^2y^2)\right]$ | |
| $Z_{34}$ | 7 | 5 | $4(7\rho^2 - 6)\left[x(\rho^4 - 8x^2y^2) - 4xy^2(x^2 - y^2)\right]$ | |
| $Z_{35}$ | 7 | 7 | $8x^2y(3\rho^4 - 16x^2y^2) + 4y(x^2 - y^2)(\rho^4 - 16x^2y^2)$ | |
| $Z_{36}$ | 7 | 7 | $4x(x^2 - y^2)(\rho^4 - 16x^2y^2) - 8xy^2(3\rho^4 - 16x^2y^2)$ | |
| $Z_{42}$ | 8 | 6 | $\sqrt{18}\,(x^2 - y^2)(\rho^4 - 16x^2y^2)(8\rho^2 - 7)$ | |
| $Z_{43}$ | 8 | 6 | $2\sqrt{18}\,xy(3\rho^4 - 16x^2y^2)(8\rho^2 - 7)$ | |
| $Z_{44}$ | 8 | 8 | $\sqrt{18}\left[2(\rho^4 - 8x^2y^2)^2 - \rho^8\right]$ | |
| $Z_{45}$ | 8 | 8 | $8\sqrt{18}\,xy(x^2 - y^2)(\rho^4 - 8x^2y^2)$ | |

Because of the orthogonality of the Zernike polynomials, the mean value of a circle polynomial, except when $n = 0 = m$ (the piston polynomial), is zero, and its mean square value is unity, as shown in Section 3.2. Therefore, the mean and the mean square values of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = c_{00} \tag{4-54}$$

and

$$\langle W^2(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 , \tag{4-55}$$

respectively. Its variance is given by

$$\sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{4-56}$$

In practice, the expansion will be truncated at some value N of n such that the variance obtained from Eq. (4-56) will be equal to its value obtained from the actual data within some specified tolerance.

from fabrication errors or atmospheric turbulence can be expanded in terms of the

Zernike circle polynomials $Z_j(\rho, \theta)$ in the form [2,5]

$$W(\rho, \theta) = \sum_{j=1}^{J} a_j Z_j(\rho, \theta) , \tag{4-57}$$

where $a_j$ are the expansion coefficients, and we have truncated the polynomials at a maximum value J of j. Multiplying both sides of Eq. (4-57) by $Z_j(\rho, \theta)$, integrating over the unit disc, and using the orthonormality Eq. (4-47), we obtain the circle expansion coefficients:

$$a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta) Z_j(\rho, \theta) \, \rho \, d\rho \, d\theta . \tag{4-58}$$

As stated in Section 3.2, it is evident from Eq. (4-58) that the value of a circle coefficient $a_j$ is independent of the number J of the polynomials used in Eq. (4-57) for the expansion of the aberration function. Hence, one or more terms can be added to or subtracted from the aberration function without affecting the value of the coefficients of the other polynomials in the expansion.

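The mean and the mean square values of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = a_1 \tag{4-59}$$

and

$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{4-60}$$

respectively. Its variance is given by

$$\sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{4-61}$$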
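As a worked illustration of Eqs. (4-58) to (4-61), consider primary spherical aberration $W = \rho^4$ with unit coefficient. The sketch below assumes the radial polynomial forms of $Z_1$, $Z_4$, and $Z_{11}$ and evaluates Eq. (4-58) with a midpoint rule; for a radially symmetric integrand the angular integral contributes a factor of $2\pi$, so $a_j = 2\int_0^1 W Z_j \rho \, d\rho$. The function names are hypothetical.

```python
import math

def z1(rho):
    return 1.0

def z4(rho):
    return math.sqrt(3.0) * (2.0 * rho ** 2 - 1.0)

def z11(rho):
    return math.sqrt(5.0) * (6.0 * rho ** 4 - 6.0 * rho ** 2 + 1.0)

def radial_coefficient(w, z, samples=20000):
    """Eq. (4-58) for radially symmetric W and Z_j: a_j = 2 * integral of W Z_j rho."""
    total = 0.0
    for k in range(samples):
        rho = (k + 0.5) / samples
        total += w(rho) * z(rho) * rho
    return 2.0 * total / samples

def spherical(rho):
    return rho ** 4          # primary spherical aberration with A_s = 1

if __name__ == "__main__":
    a1 = radial_coefficient(spherical, z1)     # piston: the mean value, 1/3
    a4 = radial_coefficient(spherical, z4)     # defocus: 1/(2*sqrt(3))
    a11 = radial_coefficient(spherical, z11)   # balanced spherical: 1/(6*sqrt(5))
    print("variance, Eq. (4-61):", a4 ** 2 + a11 ** 2)   # compare with 4/45
    print("after removing the Z4 term:", a11 ** 2)       # compare with 1/180
```

Dropping the $Z_4$ term leaves $a_{11}$ unchanged, which is the independence property noted after Eq. (4-58), and the remaining variance 1/180 is that of the balanced aberration of Eq. (4-32).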

POLYNOMIAL ABERRATION

is m-fold symmetric, unless $m = 0$, in which case it is radially symmetric. However, the symmetry of the corresponding interferogram depends on $|\cos m\theta|$ or $|\sin m\theta|$, since it does not depend on the sign of the aberration. Hence, it is 2m-fold symmetric. Based on the symmetry of the aberration, we now determine the symmetry of the PSF, the real and the imaginary parts of the OTF, and the MTF [13,14].

Consider an m-fold symmetric aberration of the form $\cos m\theta$. From Eq. (4-5), the PSF at a distance r but an angle $\theta_i + 2\pi k/m$, where $k = 1, 2, ..., m$, can be written

$$I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[i\Phi(\rho, \theta)\right] \exp\left[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)\right] \rho \, d\rho \, d\theta \right|^2 . \tag{4-62}$$

Now,

$$\Phi(\rho, \theta - 2\pi k/m) \sim \cos\left[m(\theta - 2\pi k/m)\right] = \cos(m\theta - 2\pi k) = \cos m\theta \sim \Phi(\rho, \theta) . \tag{4-63}$$

Hence, we can write Eq. (4-62) as

$$I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[i\Phi(\rho, \theta - 2\pi k/m)\right] \exp\left[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)\right] \rho \, d\rho \, d\theta \right|^2 = I(r, \theta_i) . \tag{4-64}$$

Thus if we change the angle $\theta_i$ by $2\pi k/m$ but keep r unchanged, we obtain the same value of the PSF as at $(r, \theta_i)$. This change can occur m times over a complete cycle of $2\pi$. Therefore, Eq. (4-64) shows that the PSF is m-fold symmetric, as expected for the m-fold aberration function. However, this is true for odd values of m only.

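If m is even, the invariance of the PSF when $\theta_i$ changes by $\pi$, i.e., for $k = m/2$, implies that the PSF is symmetric or even about the origin, i.e., $I(\vec{r}) = I(-\vec{r})$. It has the consequence that the PSF is 2m-fold symmetric when m is even, as we show next. The PSF at a distance r but angle $\theta_i \pm \pi j/m$, where $j = 1, 2, ..., 2m$, is given by

$$I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[i\Phi(\rho, \theta)\right] \exp\left[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)\right] \rho \, d\rho \, d\theta \right|^2 . \tag{4-65}$$

Now

$$\Phi(\rho, \theta \pm \pi j/m) \sim \cos\left[m(\theta \pm \pi j/m)\right] = \cos(m\theta \pm \pi j) = \begin{cases} \cos m\theta & \text{for even } j \\ -\cos m\theta & \text{for odd } j \end{cases} \sim \begin{cases} \Phi(\rho, \theta) & \text{for even } j \\ -\Phi(\rho, \theta) & \text{for odd } j . \end{cases} \tag{4-66}$$

Hence, Eq. (4-65) can be written

$$I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[\pm i\Phi(\rho, \theta \mp \pi j/m)\right] \exp\left[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)\right] \rho \, d\rho \, d\theta \right|^2 \tag{4-67}$$

$$= \begin{cases} I(r, \theta_i) & \text{for even } j \\ I(r, \theta_i + \pi) \equiv I(-\vec{r}) & \text{for odd } j , \end{cases} \tag{4-68}$$

where in Eq. (4-67) the upper sign of the phase factor holds for even j and the lower sign for odd j, since we have substituted $\Phi(\rho, \theta) = \Phi(\rho, \theta \pm \pi j/m)$ for even j and $\Phi(\rho, \theta) = -\Phi(\rho, \theta \pm \pi j/m)$ for odd j to obtain Eq. (4-68). Since $I(\vec{r}) = I(-\vec{r})$ for even m, the right-hand side of Eq. (4-68) is equal to $I(r, \theta_i)$ for odd values of j also. Hence the PSF is 2m-fold symmetric when m is even. Of course, when $m = 0$, the PSF is radially symmetric, like the aberration function.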

The PSFs for two polynomial aberrations with the same n and m values, and the same sigma value, but different angular dependence as $\cos m\theta$ and $\sin m\theta$ are the same except that one is rotated by an angle $\pi/2m$ with respect to the other. If two such polynomial aberrations are present simultaneously with sigma values $a_j$ and $b_j$, we can write their sum in the form

$$W(\rho, \theta) = \sqrt{2(n+1)} \, R_n^m(\rho) \left(a_j \cos m\theta + b_j \sin m\theta\right) = \sqrt{2(n+1)} \, R_n^m(\rho) \sqrt{a_j^2 + b_j^2} \, \cos\left\{m\left[\theta - (1/m) \tan^{-1}\left(b_j/a_j\right)\right]\right\} . \tag{4-69}$$

Thus, the sum has the same form as a component aberration, with a sigma value of $(a_j^2 + b_j^2)^{1/2}$, except that its orientation is different by an angle $(1/m) \tan^{-1}(b_j/a_j)$. Hence, the orientation of the PSF (and OTF) also changes by this angle.

It is easy to see that when both $a_j$ and $b_j$ are negative, $(a_j^2 + b_j^2)^{1/2}$ in Eq. (4-69) must be replaced by $-(a_j^2 + b_j^2)^{1/2}$. However, when one of the coefficients is positive and the other is negative, then $\tan^{-1}(b_j/a_j)$ of a negative argument has two solutions: a negative acute angle or that angle increased by $\pi$. The choice is made depending on whether $a_2$ or $a_3$ is negative according to

$$\tan^{-1}\left(b_j/a_j\right) = \begin{cases} -\tan^{-1}\left|b_j/a_j\right| & \text{for positive } a_2 \text{ and negative } a_3 \quad \text{(4-70a)} \\ \pi - \tan^{-1}\left|b_j/a_j\right| & \text{for negative } a_2 \text{ and positive } a_3 . \quad \text{(4-70b)} \end{cases}$$

Alternatively, when $a_2$ is negative, we can use $-\tan^{-1}\left|b_j/a_j\right|$, as when $a_2$ is positive, but also replace $(a_j^2 + b_j^2)^{1/2}$ with $-(a_j^2 + b_j^2)^{1/2}$.
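The sign bookkeeping of Eqs. (4-69) and (4-70) can be delegated to a two-argument arctangent. The following sketch is an alternative formulation, not the author's: it keeps the amplitude $(a_j^2 + b_j^2)^{1/2}$ positive in every case and lets `math.atan2` select the quadrant. The function name `combine_cos_sin` is hypothetical.

```python
import math

def combine_cos_sin(a, b, m):
    """Return (c, alpha) such that a*cos(m*t) + b*sin(m*t) == c*cos(m*(t - alpha)).

    c is the combined sigma value (always nonnegative) and alpha the orientation
    angle; atan2 resolves the quadrant from the signs of a and b.
    """
    c = math.hypot(a, b)
    alpha = math.atan2(b, a) / m
    return c, alpha

if __name__ == "__main__":
    for a, b in ((1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)):
        c, alpha = combine_cos_sin(a, b, m=2)
        print(f"a = {a:+.0f}, b = {b:+.0f}: sigma = {c:.4f}, "
              f"orientation = {math.degrees(alpha):+.1f} deg")
```

For positive $a_j$ and negative $b_j$ this reproduces the negative acute angle of Eq. (4-70a), and for negative $a_j$ and positive $b_j$ it gives the angle of Eq. (4-70b); when both are negative it returns an angle shifted by $\pi$ in place of a negative amplitude, which describes the same aberration.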

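The complex OTF given by Eq. (2-10) can be written in terms of its real and imaginary parts:

$$\tau(\vec{v}) = \mathrm{Re}\, \tau(\vec{v}) + i \, \mathrm{Im}\, \tau(\vec{v}) , \tag{4-71}$$

where

$$\mathrm{Re}\, \tau(\vec{v}) = \int I(\vec{r}) \cos(2\pi \vec{v} \cdot \vec{r}) \, d\vec{r} \tag{4-72a}$$

and

$$\mathrm{Im}\, \tau(\vec{v}) = \int I(\vec{r}) \sin(2\pi \vec{v} \cdot \vec{r}) \, d\vec{r} . \tag{4-72b}$$

In polar coordinates, these become

$$\mathrm{Re}\, \tau(v, \phi) = \iint I(r, \theta_i) \cos\left[2\pi v r \cos(\theta_i - \phi)\right] r \, dr \, d\theta_i \tag{4-73a}$$

and

$$\mathrm{Im}\, \tau(v, \phi) = \iint I(r, \theta_i) \sin\left[2\pi v r \cos(\theta_i - \phi)\right] r \, dr \, d\theta_i . \tag{4-73b}$$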

When m is odd, the OTF is complex. To determine the symmetry of its real part, we consider it for a spatial frequency $(v, \phi + \pi j/m)$, where, as before, $j = 1, 2, ..., 2m$:

$$\mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \cos\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i . \tag{4-74}$$

From Eq. (4-68) for even j, we can replace $I(r, \theta_i)$ with $I(r, \theta_i - \pi j/m)$, and thus

$$\mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \cos\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i = \mathrm{Re}\, \tau(v, \phi) . \tag{4-75}$$

For odd j,

$$I(r, \theta_i + \pi j/m) = I(r, \theta_i + \pi) . \tag{4-76}$$

Therefore, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$, we may write Eq. (4-74) as

$$\mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i + \pi) \cos\left[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)\right] r \, dr \, d\theta_i$$

$$= \iint I(r, \theta_i + \pi j/m) \cos\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i = \mathrm{Re}\, \tau(v, \phi) . \tag{4-77}$$

Thus, the real part is unchanged for both even and odd j, i.e., it is 2m-fold symmetric.

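Now consider the imaginary part given by Eq. (4-73b). Following the same procedure as for the real part, we replace $I(r, \theta_i)$ by $I(r, \theta_i - \pi j/m)$ for even j and write

$$\mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \sin\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i = \mathrm{Im}\, \tau(v, \phi) . \tag{4-78}$$

For odd j, we start with

$$\mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \sin\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i . \tag{4-79}$$

Again, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$ and utilizing Eq. (4-68) for odd j, we may write Eq. (4-79) as

$$\mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i + \pi) \sin\left[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)\right] r \, dr \, d\theta_i$$

$$= -\iint I(r, \theta_i + \pi j/m) \sin\left[2\pi v r \cos(\theta_i - \phi - \pi j/m)\right] r \, dr \, d\theta_i = -\mathrm{Im}\, \tau(v, \phi) . \tag{4-80}$$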

Thus, the imaginary part does not change for even j, but its sign changes for odd j without

changing its magnitude. Hence, the imaginary part is only m-fold symmetric.

However, when m is even, the PSF is even about the origin, and, therefore, the

imaginary part of the OTF given by Eq. (4-72b) is zero (since its integrand is an odd

function). Accordingly, the OTF is real. Moreover, since the PSF is 2m-fold symmetric in

this case, so is the OTF. Accordingly, the MTF, which is the modulus of the OTF, is 2m-

fold symmetric whether m is even or odd. Of course, when m = 0, i.e., for a radially

symmetric aberration, the OTF is real, radially symmetric, and equal to the MTF.

The symmetry properties of the various functions discussed above for a Zernike

polynomial aberration with m-fold symmetry varying as $\cos m\theta$ or $\sin m\theta$ are

summarized in Table 4-6, where NA stands for “not applicable.” Of course, for m = 0,

the interferogram, the PSF, and the OTF are all radially symmetric. In addition, the OTF

is real when m is zero or even.

78 SYSTEMS WITH CIRCULAR PUPILS

Table 4-6. Symmetry of interferogram, PSF, real and imaginary parts of OTF, and MTF for m-fold symmetric Zernike polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$.

4.9 CHARACTERISTICS OF CIRCLE POLYNOMIAL ABERRATIONS

The circle polynomial aberrations for $n \le 8$ are illustrated in three different but

equivalent ways in Figure 4-11 for a sigma value of one wave. For each polynomial

aberration, the isometric plot is shown at the top, the interferogram on the left, and the

PSF on the right. The peak-to-valley numbers of the aberrations are given, and the Strehl

ratio and examples of the OTF characteristics are illustrated for a sigma value of 0.1 wave

[14].

The isometric plot at the top illustrates the shape of an aberration polynomial, as

produced, for example, in a deformable mirror. The corresponding P-V aberration

numbers (in units of wavelength) are given in Table 4-7. From the form of the polynomials given in Eqs. (4-45a) and (4-45b) for $m \neq 0$, these numbers are given by $2\sqrt{2(n+1)}$, since $R_n^m(1) = 1$ and $\cos m\theta$ or $\sin m\theta$ varies by 2 from –1 to 1. When m = 0 and n/2 is even, as for the primary and tertiary spherical aberrations $Z_{11}$ and $Z_{37}$, the P-V numbers are given by $(1 - b)\sqrt{n+1}$, where b is the extreme negative value of $R_n^m(\rho)$ as $\rho$ varies between 0 and 1. However, when m = 0 and n/2 is odd, as for defocus $Z_4$ and secondary spherical aberration $Z_{22}$, $R_n^m(\rho)$ varies from –1 at $\rho = 0$ to 1 at $\rho = 1$, as may be seen from Figure 4-10. The P-V numbers in this case are given by $2\sqrt{n+1}$. It

should be evident that the P-V numbers of two polynomials with the same values of n and

m are the same. The P-V numbers of a polynomial aberration representing the fabrication

errors give a measure of the depth of material to be removed in the fabrication process.
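The P-V rules above can be checked numerically. The following is a minimal sketch, assuming the standard factorial-sum form of the radial polynomial and the orthonormal normalization $\sqrt{2(n+1)}$ for $m \neq 0$ and $\sqrt{n+1}$ for m = 0; the function names `zernike_radial` and `pv_number` are illustrative and not taken from the text.

```python
import numpy as np
from math import factorial


def zernike_radial(n, m, rho):
    """Radial polynomial R_n^m(rho), assuming n - m is even and m <= n."""
    rho = np.asarray(rho, dtype=float)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def pv_number(n, m, samples=20001):
    """P-V value (in waves) of an orthonormal polynomial with sigma = 1 wave."""
    rho = np.linspace(0.0, 1.0, samples)
    radial = zernike_radial(n, m, rho)
    if m == 0:
        # Radially symmetric: P-V is the swing of the radial part itself.
        return float(np.sqrt(n + 1) * (radial.max() - radial.min()))
    # The angular factor swings between -1 and +1, doubling max|R|.
    return float(2.0 * np.sqrt(2 * (n + 1)) * np.abs(radial).max())


if __name__ == "__main__":
    for n, m in [(1, 1), (2, 0), (2, 2), (4, 0), (5, 1), (7, 1)]:
        print(f"n={n}, m={m}: P-V = {pv_number(n, m):.3f}")
```

Sampling the radial polynomial on a grid sidesteps the need to locate its extrema analytically; the grid density is an arbitrary choice.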

The symmetry of an interferogram of a polynomial aberration, as in optical testing,

can be different from that of the aberration, because a fringe is formed independent of its

sign. For example, astigmatism $Z_6$ varying as $\cos 2\theta$ is 2-fold symmetric. It has the implication that the aberration function does not change when it is rotated by $\pi$. Rotating by $\pi/2$ yields an aberration of the same magnitude but with an opposite sign. Accordingly, its interferogram is 4-fold symmetric; thus, the fringes intersecting the x axis

4.9.2 Interferometric Characteristics 79

Figure 4-11. Zernike circle polynomials shown as isometric plot on the top, interferogram on the left, and PSF on the right for a sigma value of one wave.


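Table 4-7. Peak-to-valley (P-V) numbers (in units of wavelength) of Zernike polynomial aberrations for a sigma value of one wave.

Polynomial   P-V                      Polynomial   P-V                       Polynomial   P-V
Z1           0                        Z16          $2\sqrt{12} = 6.928$      Z31          8
Z2           4                        Z17          $2\sqrt{12} = 6.928$      Z32          8
Z3           4                        Z18          $2\sqrt{12} = 6.928$      Z33          8
Z5           $2\sqrt{6} = 4.899$      Z20          $2\sqrt{12} = 6.928$      Z35          8
Z6           $2\sqrt{6} = 4.899$      Z21          $2\sqrt{12} = 6.928$      Z36          8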

are formed by a positive aberration, and those intersecting the y axis are formed by a

negative aberration. The number of fringes in an interferogram, which is equal to the

number of times the aberration changes by one wave as we move from the center to the

edges of the pupil, is different for the different polynomials. Each fringe represents a

contour of constant phase or aberration. The fringe is dark when the phase is an odd multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In the case of tilts, for

example, the aberration changes by one wave four times, which is the same as the peak-

to-valley value of 4 waves. Hence, 4 straight line fringes symmetric about the center are

obtained. The x-tilt polynomial Z2 yields vertical fringes, and the y-tilt polynomial Z3

yields horizontal fringes. Similarly, defocus aberration Z4 yields about 3.5 fringes. In the

case of spherical aberration $Z_{11}$, the aberration starts at a value of $\sqrt{5}$ waves, decreases to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then increases to $\sqrt{5}$ waves. Hence, the total number of times the aberration changes by unity is equal to $3\sqrt{5} \simeq 6.7$, and

approximately seven circular fringes are obtained.
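The fringe counts quoted above can be reproduced by summing the absolute change of the aberration along a scan. The sketch below assumes the orthonormal forms $Z_2 = 2\rho\cos\theta$, $Z_4 = \sqrt{3}(2\rho^2 - 1)$, and $Z_{11} = \sqrt{5}(6\rho^4 - 6\rho^2 + 1)$ with a sigma of one wave; the helper `total_variation` is an illustrative name.

```python
import numpy as np


def total_variation(values):
    """Sum of absolute changes along a sampled scan (in waves)."""
    return float(np.abs(np.diff(values)).sum())


# Tilt: scan across the full pupil diameter along the x axis.
x = np.linspace(-1.0, 1.0, 100001)
tilt_fringes = total_variation(2.0 * x)

# Defocus and spherical aberration: scan from the center to the edge.
rho = np.linspace(0.0, 1.0, 100001)
defocus_fringes = total_variation(np.sqrt(3) * (2 * rho**2 - 1))
spherical_fringes = total_variation(np.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1))

if __name__ == "__main__":
    print(f"tilt Z2:       {tilt_fringes:.2f}")
    print(f"defocus Z4:    {defocus_fringes:.2f}")
    print(f"spherical Z11: {spherical_fringes:.2f}")
```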

The PSF plots represent the images of a point object in the presence of a polynomial

aberration. The piston aberration represented by the Zernike polynomial Z1 has no effect

on the image. Thus the PSF it yields is the Airy pattern given by Eq. (4-10). The full width of a square displaying the PSFs in Figure 4-11 is $24\lambda F$.

The polynomial aberrations $Z_2$ and $Z_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $4(\lambda/D)a_2$ about the y axis and displaces the PSF along the x axis by $4\lambda F a_2$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4(\lambda/D)a_3$ about the x axis and displaces the PSF by $4\lambda F a_3$ along the y axis. The aberrated PSFs can be obtained from Eq. (4-5). For astigmatism $Z_5$ and $Z_6$, m = 2, and the PSF is 4-fold symmetric. For coma $Z_7$ and $Z_8$, m = 1, and the PSF is symmetric about the y and the x axis, respectively. The polynomial $Z_{10}$ corresponds to m = 3; the aberration function is 3-fold symmetric, but the interferogram is 6-fold symmetric. Since m is odd, the PSF is also 3-fold symmetric.

The Strehl ratio for the first 45 circle polynomial aberrations with a sigma value of

0.1 wave is listed in Table 4-8 and plotted in Figure 4-12 on a nominal and an expanded

scale to clearly show the variation of their values. For the tilt polynomials $Z_2$ and $Z_3$, the Strehl ratio simply represents the PSF value at a displaced point along the x or the y axis, respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.4\lambda F$.

A closed-form expression for the Strehl ratio for the defocus circle polynomial $Z_4$ can be obtained from Eq. (4-18) by letting $B_d = 2\sqrt{3}\,a_4$:

$$S = \left[\frac{\sin\left(\sqrt{3}\,a_4\right)}{\sqrt{3}\,a_4}\right]^2. \tag{4-82}$$

For a defocus sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement with the result given in Table 4-8. Note that $a_4$ is the sigma value, which in turn is equal to $B_d/2\sqrt{3}$, where $B_d$ is the peak value of the defocus aberration. Hence, Eq. (4-82) is the same as Eq. (4-21). The amount of longitudinal defocus required to produce a certain value of $a_4$, and therefore $B_d$, is given by Eq. (4-20).
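As a quick numerical check of the defocus Strehl ratio, the sketch below evaluates the closed form $S = [\sin(\sqrt{3}a_4)/(\sqrt{3}a_4)]^2$ alongside the exponential estimate $\exp(-\sigma_\Phi^2)$. It assumes the sigma is supplied in waves and converted to radians by a factor of $2\pi$; the function names are illustrative.

```python
import math


def strehl_defocus(sigma_waves):
    """Closed-form Strehl ratio for defocus with the given sigma (waves)."""
    a4 = 2.0 * math.pi * sigma_waves      # phase sigma in radians
    x = math.sqrt(3.0) * a4
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x) ** 2


def strehl_exponential(sigma_waves):
    """Approximate Strehl ratio exp(-sigma_phi^2) for a small aberration."""
    sigma_phi = 2.0 * math.pi * sigma_waves
    return math.exp(-sigma_phi ** 2)


if __name__ == "__main__":
    for sigma in (0.05, 0.1, 0.25):
        print(f"sigma = {sigma:4.2f} wave: exact = {strehl_defocus(sigma):.5f}, "
              f"exp estimate = {strehl_exponential(sigma):.5f}")
```

For a sigma of 0.1 wave the two values differ only in the second decimal place, while at 0.25 wave the exponential estimate is clearly larger than the closed-form value.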

The results of Table 4-8 and Figure 4-12 illustrate that the Strehl ratio for a small aberration is nearly independent of the type of the aberration and that it depends primarily on its sigma value. It is approximately given by Eq. (4-22c) as $\exp\left(-\sigma_\Phi^2\right)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.

Table 4-8. Strehl ratio S for Zernike circle polynomial aberrations with a sigma value of 0.1 wave.

Polynomial   S        Polynomial   S        Polynomial   S
Z1           1        Z16          0.673    Z31          0.674

An image displacement of $\vec{r}_t$ due to a wavefront tilt produces a linearly varying phase factor of $2\pi\,\vec{v}\cdot\vec{r}_t$ in the OTF, as may be seen from Eq. (1-10) by replacing $\mathrm{PSF}(\vec{r})$ with the displaced PSF $\mathrm{PSF}(\vec{r} - \vec{r}_t)$ and the OTF $\tau(\vec{v})$ by the corresponding OTF $\tau_t(\vec{v})$. Of course, the phase factor, representing the phase transfer function, has no effect on the MTF of the system.
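The effect of an image displacement on the OTF can be illustrated with a discrete Fourier transform. The sketch below is only a sampled analogue of the continuous statement: it uses a made-up asymmetric test PSF, a circular shift in whole pixels, and NumPy's FFT sign convention, which may differ from the convention of Eq. (1-10).

```python
import numpy as np


def otf_from_psf(psf):
    """Discrete OTF: Fourier transform of the PSF, normalized to 1 at zero frequency."""
    otf = np.fft.fft2(psf)
    return otf / otf[0, 0]


n = 64
y, x = np.mgrid[0:n, 0:n]
# Hypothetical asymmetric PSF built from two Gaussian blobs (illustration only).
psf = np.exp(-((x - 32) ** 2 + (y - 32) ** 2) / (2 * 3.0 ** 2))
psf += 0.3 * np.exp(-((x - 36) ** 2 + (y - 30) ** 2) / (2 * 2.0 ** 2))

shift = (3, 5)  # displacement in pixels along (rows, columns)
psf_shifted = np.roll(psf, shift, axis=(0, 1))

otf = otf_from_psf(psf)
otf_shifted = otf_from_psf(psf_shifted)

# Linear phase predicted by the DFT shift theorem (frequencies in cycles per pixel).
fy = np.fft.fftfreq(n)[:, None]
fx = np.fft.fftfreq(n)[None, :]
linear_phase = np.exp(-2j * np.pi * (fy * shift[0] + fx * shift[1]))

mtf_unchanged = np.allclose(np.abs(otf_shifted), np.abs(otf))
phase_is_linear = np.allclose(otf_shifted, otf * linear_phase)

if __name__ == "__main__":
    print("MTF unchanged by displacement:", mtf_unchanged)
    print("OTF acquires a linear phase:  ", phase_is_linear)
```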

The 3D MTF plots are shown in Figure 4-13 for the primary aberration polynomials

with a sigma value of 0.1 wave. The MTF for the piston aberration represents the

aberration-free MTF. It is included among the aberrated MTF plots by a solid line as a

4.9.4 OTF Characteristics 85

Figure 4-12. Strehl ratio for Zernike circle polynomial aberrations with a sigma value of 0.1 wave, shown on a nominal scale as well as on an expanded scale. (Axes: Strehl ratio S versus polynomial number j.)

reference. The symmetry of the MTFs is made more explicit by the contour plots shown

below each 3D MTF figure. The MTF value at the center of the contours is unity and

decreases to zero from the center out starting with a value of 0.9 and ending with zero.

The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots are also shown in this figure, i.e., for the spatial frequency vector along the x axis, y axis, and at 45° from the x axis, respectively. Because of the 4-fold symmetry of the MTF in the case of astigmatism, the tangential MTF is equal to the sagittal MTF. As expected [3,8], the aberrated MTF is lower than the aberration-free MTF at all spatial frequencies $0 < v < 1$, i.e., within the passband of the system.

Figure 4-13. 3D, tangential or along x axis (in long dashes), sagittal or along y axis (in medium dashes), and at 45° from the x axis (in small dashes) MTF plots for Zernike circle polynomial aberrations with a sigma value of 0.1 wave. The solid curve represents the aberration-free MTF. The spatial frequency v is normalized by the cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in steps of 0.1 from the center out, starting with 0.9 and ending with zero. Panels: Z1 (piston), Z4 (defocus), Z6 (primary astigmatism), Z8 (primary coma), Z10, and Z11 (primary spherical).


Figure 4-14a shows the symmetry of the real and the imaginary parts of the OTF for

coma Z 8 . The real part has even symmetry, but the imaginary part has odd symmetry.

The thick and thin contours of the imaginary part in both cases represent its positive and

negative values, respectively. The real and imaginary parts of the OTF for the aberration

Z10 are shown in Figure 4-14b. In addition to their even and odd symmetry, it shows that

the real part is 6-fold symmetric and the imaginary part is 3-fold symmetric, as expected

for a 3-fold symmetric aberration. Because of the odd symmetry of the imaginary part, its

integral over the spatial frequencies imaged by a system is zero, as expected from the

statement after Eq. (1-25).

Figure 4-14. Real and imaginary parts of the OTF for a Zernike polynomial aberration with a sigma value of 0.1 wave. (a) Z8 (primary coma) showing the even and odd symmetry of the real and imaginary parts. (b) Z10 showing the 6-fold symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to their even and odd symmetry, respectively. The thick and thin contours of the imaginary part in both cases represent its positive and negative values, respectively.


4.10 CLASSICAL ABERRATIONS

4.10.1 Introduction

It is seen from Eq. (1-18) that a classical aberration depends on the polar angle $\theta$ as $\cos^m\theta$. However, a Zernike polynomial depends on the angle as $\cos m\theta$ (or $\sin m\theta$). By expressing $\cos^m\theta$ as a series of $\cos m\theta$ terms, or $\cos m\theta$ as a power series of $\cos\theta$ terms, the coefficients of classical aberrations can be obtained from the Zernike coefficients and vice versa [15,16]. We illustrate this for primary aberrations. The names of some of the aberrations associated with the Zernike polynomials are given in Table 4-4. They are a carryover from the names associated with the classical aberrations.

The Seidel aberrations are well known in optical design, where the optical system

has an axis of rotational symmetry with the consequence that the angle-dependent terms

are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in

optical testing generally contain both the cosine and sine terms due to the assembly and

fabrication errors. We show how to define the effective Seidel coefficients in such cases.

We emphasize that the Seidel aberration coefficients determined from the primary

Zernike aberrations will be in error unless the higher-order terms that also contain Seidel

terms are negligible [16,17].

4.10.2 Wavefront Tilt and Defocus

The Zernike tilt aberration

$$a_2 Z_2(\rho, \theta) = 2 a_2\,\rho\cos\theta \tag{4-83}$$

represents a tilt of the wavefront about the y axis by an angle $4(\lambda/D)a_2$, where the aberration coefficient is in units of wavelength. It results in a displacement of the PSF along the x axis by $4\lambda F a_2$. Similarly, the Zernike tilt aberration

$$a_3 Z_3(\rho, \theta) = 2 a_3\,\rho\sin\theta \tag{4-84}$$

represents a tilt of the wavefront about the x axis by an angle $4(\lambda/D)a_3$ and results in a displacement of the PSF along the y axis by $4\lambda F a_3$.

It should be evident that when the cosine and sine terms of a certain aberration are present simultaneously, as in optical testing, their combination represents the aberration whose orientation depends on the value of the component terms. For example, if both x and y Zernike tilts are present in the form

$$W(\rho, \theta) = a_2 Z_2(\rho, \theta) + a_3 Z_3(\rho, \theta) = 2 a_2\,\rho\cos\theta + 2 a_3\,\rho\sin\theta, \tag{4-85}$$

it can be written

$$W(\rho, \theta) = 2\left(a_2^2 + a_3^2\right)^{1/2}\rho\,\cos\left[\theta - \tan^{-1}(a_3/a_2)\right]. \tag{4-86}$$

Thus, it represents a wavefront tilt of $4(\lambda/D)\left(a_2^2 + a_3^2\right)^{1/2}$ about an axis that is orthogonal to a line making an angle of $\tan^{-1}(a_3/a_2)$ with the x axis. How to decide the sign of the overall tilt and the value of its angle are discussed following Eq. (4-69).
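The combination of x and y tilts into a single tilt can be sketched numerically. The example below assumes the orthonormal forms $Z_2 = 2\rho\cos\theta$ and $Z_3 = 2\rho\sin\theta$ and uses `math.atan2` to pick the quadrant of the orientation angle; that quadrant choice is an assumption of this sketch and is not necessarily the sign convention discussed following Eq. (4-69).

```python
import math


def combined_tilt(a2, a3):
    """Return (peak coefficient, orientation angle) of the combined tilt.

    The combined aberration is magnitude * rho * cos(theta - angle).
    """
    magnitude = 2.0 * math.hypot(a2, a3)
    angle = math.atan2(a3, a2)  # quadrant-aware version of arctan(a3 / a2)
    return magnitude, angle


def tilt_aberration(a2, a3, rho, theta):
    """Sum of the x and y Zernike tilt terms at the pupil point (rho, theta)."""
    return 2.0 * a2 * rho * math.cos(theta) + 2.0 * a3 * rho * math.sin(theta)


if __name__ == "__main__":
    magnitude, angle = combined_tilt(0.3, 0.4)
    print(f"peak coefficient = {magnitude:.3f} waves, "
          f"orientation = {math.degrees(angle):.2f} deg")
```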

The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the Seidel distortion in its $(\rho, \theta)$ dependence. Similarly, the Zernike defocus aberration $Z_4(\rho)$ varies with $\rho$ as the Seidel field curvature varies with it. The constant term in $Z_4(\rho)$ makes its mean value across the circular pupil zero, without changing its standard deviation.

4.10.3 Astigmatism

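The Zernike primary astigmatism

$$a_6 Z_6(\rho, \theta) = \sqrt{6}\,a_6\,\rho^2\cos 2\theta \tag{4-87}$$

represents the classical (Seidel) astigmatism $\rho^2\cos^2\theta$ balanced with defocus aberration $\rho^2$ to yield minimum variance. It yields a uniform circular spot diagram, but a line sagittal image along the x axis (i.e., in a plane that zeroes out the defocus part). The Zernike primary astigmatism

$$a_5 Z_5(\rho, \theta) = \sqrt{6}\,a_5\,\rho^2\sin 2\theta \tag{4-88}$$

can be written

$$a_5 Z_5(\rho, \theta) = \sqrt{6}\,a_5\,\rho^2\cos\left[2(\theta - \pi/4)\right]. \tag{4-89}$$

It is called the 45° astigmatism. The secondary Zernike astigmatism given by

$$a_{12} Z_{12}(\rho, \theta) = \sqrt{10}\,a_{12}\left(4\rho^4 - 3\rho^2\right)\cos 2\theta \tag{4-90}$$

does not yield a line image in any plane. However, it is referred to as the 0° astigmatism in conformance with the corresponding primary astigmatism because of its variation with $\theta$ as $\cos 2\theta$. Similarly, the name tertiary astigmatism in Table 4-4 can be explained.

When both Zernike primary astigmatisms are present, the aberration

$$W(\rho, \theta) = a_5 Z_5(\rho, \theta) + a_6 Z_6(\rho, \theta) \tag{4-91}$$

can be written

$$W(\rho, \theta) = \left(a_5^2 + a_6^2\right)^{1/2}\sqrt{6}\,\rho^2\cos\left\{2\left[\theta - (1/2)\tan^{-1}(a_5/a_6)\right]\right\}, \tag{4-92}$$

which represents astigmatism oriented at an angle of $(1/2)\tan^{-1}(a_5/a_6)$ with the x axis.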

It should be evident that there is ambiguity in determining astigmatism, because it can be written in different but equivalent forms by separating defocus aberration from it. For example, a 0° astigmatism can be written

$$\begin{aligned}
a_6 Z_6(\rho, \theta) &= \sqrt{6}\,a_6\,\rho^2\cos 2\theta && \text{(4-93a)} \\
&= \sqrt{6}\,a_6\left(2\rho^2\cos^2\theta - \rho^2\right) && \text{(4-93b)} \\
&= \sqrt{6}\,a_6\left(-2\rho^2\sin^2\theta + \rho^2\right). && \text{(4-93c)}
\end{aligned}$$

Thus, it can be regarded as a combination of 0° positive Seidel astigmatism and a negative defocus, as in Eq. (4-93b), or a 90° negative Seidel astigmatism and a positive defocus, as in Eq. (4-93c).

4.10.4 Coma

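The Zernike coma terms $a_8 Z_8(\rho, \theta)$ and $a_7 Z_7(\rho, \theta)$ are called the x and y Zernike comas. They represent classical coma $\rho^3\cos\theta$ or $\rho^3\sin\theta$ balanced with tilt $\rho\cos\theta$ or $\rho\sin\theta$, respectively, to yield minimum variance. They yield PSFs that are symmetric about the x and y axes, respectively. Similarly, the names for the secondary and tertiary coma can be explained.

When both x- and y-Zernike comas are present, the aberration may be written

$$\begin{aligned}
W(\rho, \theta) &= a_8 Z_8(\rho, \theta) + a_7 Z_7(\rho, \theta) && \text{(4-94a)} \\
&= \sqrt{8}\,a_8\left(3\rho^3 - 2\rho\right)\cos\theta + \sqrt{8}\,a_7\left(3\rho^3 - 2\rho\right)\sin\theta && \text{(4-94b)} \\
&= \left(a_7^2 + a_8^2\right)^{1/2}\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\left[\theta - \tan^{-1}(a_7/a_8)\right], && \text{(4-94c)}
\end{aligned}$$

which represents Zernike coma oriented at an angle of $\tan^{-1}(a_7/a_8)$ with the x axis.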

The Zernike spherical aberrations represent balanced classical spherical aberrations. For example, the primary or Seidel spherical aberration varying as $\rho^4$ is balanced with defocus varying as $\rho^2$ to yield $Z_{11}(\rho)$ representing the balanced primary spherical aberration. As in the case of the Zernike defocus term $Z_4(\rho)$, the constant term in $Z_{11}(\rho)$ makes its mean value across the circular pupil zero. Similarly, the Zernike secondary and tertiary spherical aberrations $Z_{22}$ and $Z_{37}$ also contain a constant term so that their mean value is zero.

4.10.6 Seidel Coefficients from Zernike Coefficients

It should be noted that the wavefront tilt aberration given by Eq. (4-86) represents the tilt aberration obtained from Zernike tilt aberrations. However, there are other Zernike aberrations that also contain tilt aberration built into them, e.g., Zernike primary, secondary, or tertiary coma. Similarly, the Seidel coma $3\sqrt{8}\left(a_7^2 + a_8^2\right)^{1/2}$ in Eq. (4-94c) at an angle of $\tan^{-1}(a_7/a_8)$ is only from the primary Zernike comas. But the secondary and tertiary Zernike comas also contain Seidel coma. Hence, only if the higher-order Zernike comas are zero or negligible will the PSF aberrated by primary Zernike coma be symmetric about a line making an angle of $\tan^{-1}(a_7/a_8)$ with the x axis. Similarly, only if the secondary and tertiary astigmatisms are zero or negligible is the Seidel astigmatism equal to $2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2}$, as in Eq. (4-92). It yields an aberrated PSF that is symmetric about two orthogonal axes, one of which is along a line that makes an angle of $(1/2)\tan^{-1}(a_5/a_6)$ with the x axis.

To illustrate how a wrong Seidel coefficient can be inferred unless it is obtained from all of the significant Zernike terms that contain Seidel aberrations, we consider an axial image aberrated by one wave of secondary spherical aberration $\rho^6$. In terms of Zernike polynomials it will be written as

$$W(\rho) = \rho^6 = a_{22} Z_{22}(\rho) + a_{11} Z_{11}(\rho) + a_4 Z_4(\rho) + a_1 Z_1, \tag{4-95}$$

where

$$a_{22} = 1/20\sqrt{7}, \quad a_{11} = 1/4\sqrt{5}, \quad a_4 = 9/20\sqrt{3}, \quad a_1 = 1/4. \tag{4-96}$$

If we infer the Seidel spherical aberration from only the primary Zernike aberration $a_{11} Z_{11}(\rho)$, its amount would be 1.5 waves. Such a conclusion is obviously incorrect, because in reality the amount of Seidel spherical aberration is zero. Needless to say, even if we expand the aberration function up to as many as the first 21 terms, we will still incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves. However, the Seidel spherical aberration will correctly reduce to zero when at least the first 22 terms are included in the expansion. For an off-axis image, there are angle-dependent aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Hence, it is important that the expansion be carried out up to a certain number of terms such that any additional terms do not significantly change the mean square difference between the function and its estimate. Otherwise, the inferred Seidel aberrations will be erroneous.
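The expansion of $\rho^6$ and the misleading 1.5-wave estimate can be reproduced with exact polynomial integration. This is a minimal sketch, assuming the orthonormal radial forms $Z_n^0 = \sqrt{n+1}\,R_n^0(\rho)$ with the radial polynomials hard-coded in the `RADIAL` table; the helper name `radial_coefficient` is illustrative.

```python
import math
from numpy.polynomial import Polynomial

# Radial polynomials R_n^0(rho) for n = 0, 2, 4, 6 (coefficients in ascending powers).
RADIAL = {
    0: Polynomial([1]),
    2: Polynomial([-1, 0, 2]),
    4: Polynomial([1, 0, -6, 0, 6]),
    6: Polynomial([-1, 0, 12, 0, -30, 0, 20]),
}


def radial_coefficient(w, n):
    """Coefficient of sqrt(n+1) R_n^0 in w(rho): 2 * integral_0^1 w Z rho d(rho)."""
    z = math.sqrt(n + 1) * RADIAL[n]
    antiderivative = (w * z * Polynomial([0, 1])).integ()
    return 2.0 * (antiderivative(1.0) - antiderivative(0.0))


W = Polynomial([0, 0, 0, 0, 0, 0, 1])  # one wave of rho^6
coeffs = {n: radial_coefficient(W, n) for n in (0, 2, 4, 6)}

# Seidel spherical (rho^4) coefficient inferred from the primary term alone,
# and the same coefficient once the secondary spherical term is included.
seidel_primary_only = 6.0 * math.sqrt(5) * coeffs[4]
seidel_all_terms = seidel_primary_only - 30.0 * math.sqrt(7) * coeffs[6]

if __name__ == "__main__":
    print("a1, a4, a11, a22 =", [round(coeffs[n], 5) for n in (0, 2, 4, 6)])
    print(f"rho^4 from Z11 only:      {seidel_primary_only:.3f} waves")
    print(f"rho^4 with Z22 included:  {seidel_all_terms:.3f} waves")
```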

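Approximating the aberration function by its primary Zernike aberrations only, we may write [16,17]

$$\begin{aligned}
W(\rho, \theta) &\simeq \sum_{j=1}^{8} a_j Z_j(\rho, \theta) + a_{11} Z_{11}(\rho) && \text{(4-97a)} \\
&= A_p + A_t\,\rho\cos(\theta - \beta_t) + A_d\,\rho^2 + A_a\,\rho^2\cos^2(\theta - \beta_a) + A_c\,\rho^3\cos(\theta - \beta_c) + A_s\,\rho^4, && \text{(4-97b)}
\end{aligned}$$

where $A_p$ is the piston aberration, other coefficients $A_i$ represent the peak value of the corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel aberration. They are given by

$$A_p = a_1 - \sqrt{3}\,a_4 + \sqrt{5}\,a_{11}, \tag{4-98a}$$

$$A_t = 2\left[\left(a_2 - \sqrt{8}\,a_8\right)^2 + \left(a_3 - \sqrt{8}\,a_7\right)^2\right]^{1/2}, \quad \beta_t = \tan^{-1}\left(\frac{a_3 - \sqrt{8}\,a_7}{a_2 - \sqrt{8}\,a_8}\right), \tag{4-98b}$$

$$A_d = 2\left(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11}\right) - A_a/2, \tag{4-98c}$$

$$A_a = 2\sqrt{6}\left(a_5^2 + a_6^2\right)^{1/2}, \quad \beta_a = \frac{1}{2}\tan^{-1}(a_5/a_6), \tag{4-98d}$$

$$A_c = 6\sqrt{2}\left(a_7^2 + a_8^2\right)^{1/2}, \quad \beta_c = \tan^{-1}(a_7/a_8), \tag{4-98e}$$

and

$$A_s = 6\sqrt{5}\,a_{11}. \tag{4-98f}$$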

As a note of caution, we add that the approximation of Eq. (4-97a) is good only when the

higher-order Zernike aberrations that also contain Seidel aberration terms are negligible.
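The conversion from primary Zernike coefficients to Seidel coefficients can be sketched as a small function. The version below collects the powers of $\rho$ in the orthonormal polynomials $Z_1$ through $Z_8$ and $Z_{11}$; in particular, the defocus coefficient is taken as $2(\sqrt{3}a_4 - 3\sqrt{5}a_{11}) - A_a/2$, which is an assumption of this sketch obtained by matching the $\rho^2$ terms, and `math.atan2` is used for the orientation angles. The function name and dictionary keys are illustrative.

```python
import math


def seidel_from_zernike(a):
    """Map primary Zernike coefficients {j: a_j} (waves) to Seidel coefficients."""
    def g(j):
        return a.get(j, 0.0)

    root8 = math.sqrt(8)
    tilt_x = g(2) - root8 * g(8)   # coefficient of 2*rho*cos(theta)
    tilt_y = g(3) - root8 * g(7)   # coefficient of 2*rho*sin(theta)
    astig = 2.0 * math.sqrt(6) * math.hypot(g(5), g(6))
    return {
        "A_p": g(1) - math.sqrt(3) * g(4) + math.sqrt(5) * g(11),
        "A_t": 2.0 * math.hypot(tilt_x, tilt_y),
        "beta_t": math.atan2(tilt_y, tilt_x),
        # Assumed form: rho^2 terms of Z4 and Z11 minus half the astigmatism peak.
        "A_d": 2.0 * (math.sqrt(3) * g(4) - 3.0 * math.sqrt(5) * g(11)) - astig / 2.0,
        "A_a": astig,
        "beta_a": 0.5 * math.atan2(g(5), g(6)),
        "A_c": 6.0 * math.sqrt(2) * math.hypot(g(7), g(8)),
        "beta_c": math.atan2(g(7), g(8)),
        "A_s": 6.0 * math.sqrt(5) * g(11),
    }


if __name__ == "__main__":
    # Zernike coefficients of a Seidel function with all coefficients equal to one wave.
    unit = {1: 13 / 12, 2: 5 / 6, 4: 5 / (4 * math.sqrt(3)),
            6: 1 / (2 * math.sqrt(6)), 8: 1 / (6 * math.sqrt(2)),
            11: 1 / (6 * math.sqrt(5))}
    for name, value in seidel_from_zernike(unit).items():
        print(f"{name:7s} = {value:.4f}")
```

Feeding in the Zernike coefficients of a unit Seidel aberration function should return unit Seidel coefficients and zero piston, which gives a simple round-trip check on the formulas.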

4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing

In Figure 4-12, we have shown the Strehl ratio for the circle polynomial aberrations with a sigma value of one-tenth of a wave. In Figure 4-15, we show how it varies with the sigma value of a Seidel aberration, with and without balancing (as in Tables 4-1 and 4-2), for $0 \le \sigma_W \le 0.25$. Also plotted is the Strehl ratio obtained from the approximate expression $\exp\left(-\sigma_\Phi^2\right)$ as the dashed curve. As expected, the exponential expression yields a very good estimate of the Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true Strehl ratio departs from its approximate value, except in the case of balanced astigmatism, for which the difference is quite small. The exponential expression overestimates the Strehl ratio in the case of defocus, balanced coma, and spherical aberration, but underestimates it for astigmatism and coma. Moreover, for a given value of sigma, the Strehl ratio for spherical aberration is exactly the same as for the balanced spherical aberration. The aberration coefficient and the P-V number for a certain value of $\sigma_W$ of these aberrations can be obtained from Table 4-9.

4.11 ZERNIKE COEFFICIENTS OF A SCALED PUPIL

Given an aberration function across a circular pupil, its orthonormal Zernike coefficients can be obtained from Eq. (4-48). Now we discuss how these coefficients change when the size of the pupil is reduced, as when the aperture of a camera lens or the pupil of a human eye (assuming it to be circular) is reduced due to an illumination increase. We give two approaches. In one, we express a scaled Zernike radial polynomial as a linear combination of the unscaled radial polynomials and utilize the orthogonal property of the radial polynomials [18]. In the other, we use some known integrals [19].

Figure 4-15. Strehl ratio S as a function of the sigma value $\sigma_W$ of a Seidel aberration with and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical aberration.

Table 4-9. Sigma value of a Seidel aberration with and without balancing, and P-V numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.

Aberration                       Sigma value                                        P-V number
Astigmatism                      $\sigma_a = A_a/4$                                 4
Balanced astigmatism             $\sigma_{ba} = A_a/2\sqrt{6} = A_a/4.90$           4.90
Coma                             $\sigma_c = A_c/2\sqrt{2} = A_c/2.83$              2.83
Balanced coma                    $\sigma_{bc} = A_c/6\sqrt{2} = A_c/8.49$           9.212
Spherical aberration             $\sigma_s = 2A_s/3\sqrt{5} = A_s/3.35$             3.35
Balanced spherical aberration    $\sigma_{bs} = A_s/6\sqrt{5} = A_s/13.42$          3.35


An alternate approach may also be considered [20]. It is perhaps worth noting that, in

practice, one will determine the Zernike coefficients of an aberration function of a system

from its interferometric data by using Eq. (4-58). The corresponding coefficients of a

scaled pupil can also be determined in the same manner by utilizing its data, i.e., by

excluding that data of the unscaled pupil that is not part of the scaled pupil. The result

obtained can be illustrated by considering a Seidel aberration function and writing it in

terms of the Zernike polynomials for both the unscaled and the scaled pupils.

4.11.1 Theory

Consider a circular pupil with its wave aberration function $W(\rho, \theta)$ expanded in terms of the orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$, as in Eq. (4-57). For a corresponding scaled pupil with a normalized radius of $\epsilon \le 1$, as in Figure 4-16, the aberration function can be written from Eq. (4-57) in the form

$$W(\epsilon\rho, \theta) = \sum_j a_j Z_j(\epsilon\rho, \theta). \tag{4-99}$$

Normalizing the smaller pupil to a unit circle, the aberration function across it can also be written in terms of the Zernike polynomials that are orthonormal over it in the form

$$W(\epsilon\rho, \theta) = \sum_{j'} b_{j'} Z_{j'}(\rho, \theta), \tag{4-100}$$

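where the expansion coefficients are given by

$$b_{j'} = \frac{1}{\pi\epsilon^2}\int_0^{\epsilon}\!\!\int_0^{2\pi} W(\rho, \theta)\, Z_{j'}(\rho/\epsilon, \theta)\,\rho\,d\rho\,d\theta, \tag{4-101}$$

or

$$b_{j'} = \frac{1}{\pi}\int_0^{1}\!\!\int_0^{2\pi} W(\epsilon\rho, \theta)\, Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta. \tag{4-102}$$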

Figure 4-16. Scaled circular pupil, where the pupil radius is reduced from unity to $\epsilon$ by blocking the outer portion.

Substituting Eq. (4-99) into Eq. (4-102), we obtain

$$b_{j'} = \frac{1}{\pi}\sum_j a_j \int_0^{1}\!\!\int_0^{2\pi} Z_j(\epsilon\rho, \theta)\, Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta. \tag{4-103}$$

From Eq. (4-46), the angular integration in Eq. (4-103) yields $\pi(1 + \delta_{m0})\,\delta_{mm'}$. Hence, we may write

$$b_{n',m} = \sqrt{2(n'+1)}\sum_n \sqrt{2(n+1)}\;a_{n,m}\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho, \tag{4-104}$$

where we have replaced the single index j by the corresponding double indices n and m, and similarly replaced $j'$ by $n'$ and m according to Eqs. (4-50) and (4-51).

The integral in Eq. (4-104) can be solved very simply by writing the radial polynomial $R_n^m(\epsilon\rho)$ in terms of the corresponding polynomials $R_{n'}^m(\rho)$ in the form [18]

$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} h_{n'}(n; \epsilon)\,R_{n'}^m(\rho), \tag{4-105}$$

where

$$h_{n'}(n; \epsilon) = (n'+1)\sum_s\sum_{s'} \frac{(-1)^s\,(n-s)!\;\epsilon^{\,n-2s}}{s!\,s'!\,(n'+s'+1)!}, \tag{4-106}$$

$s$ and $s'$ are positive integers (including zero), and $n - n' = 2(s + s')$. Substituting Eq. (4-105) into Eq. (4-104) and utilizing Eq. (4-43) for the orthogonality of the radial polynomials, we obtain the intended result:

$$b_{n',m} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;h_{n'}(n; \epsilon)\,a_{n,m}. \tag{4-107}$$

If N is the highest order among the terms of the aberration function in Eq. (4-52), then the largest value of n in Eq. (4-107) is N or N - 1, depending on whether N - m is even or odd, respectively. From Eq. (4-105), it is easy to show that

$$h_n(n; \epsilon) = \epsilon^n, \tag{4-108a}$$

$$h_{n-2}(n; \epsilon) = -(n-1)\left(1 - \epsilon^2\right)\epsilon^{n-2}, \tag{4-108b}$$

$$h_{n-4}(n; \epsilon) = \frac{n-3}{2}\left(1 - \epsilon^2\right)\left(n - 2 - n\epsilon^2\right)\epsilon^{n-4}, \tag{4-108c}$$

$$h_{n-6}(n; \epsilon) = -\frac{n-5}{6}\left(1 - \epsilon^2\right)\left[(n-3)(n-4) - 2(n-1)(n-3)\epsilon^2 + n(n-1)\epsilon^4\right]\epsilon^{n-6}, \tag{4-108d}$$

$$h_{n-8}(n; \epsilon) = \frac{n-7}{2}\left(1 - \epsilon^2\right)\left[\frac{(n-4)(n-5)(n-6)}{12} - \frac{(n-2)(n-4)(n-5)}{4}\epsilon^2 + \frac{(n-1)(n-2)(n-4)}{4}\epsilon^4 - \frac{n(n-1)(n-2)}{12}\epsilon^6\right]\epsilon^{n-8}, \text{ etc.} \tag{4-108e}$$

Equations (4-108a)–(4-108e) are sufficient to obtain the Zernike coefficients of the scaled pupil up to and including the eighth order. The expressions for $h_{n'}(n; \epsilon)$ for $n \le 8$ are listed in Table 4-9.

Moreover, for a given value of $n'$, the multiplier of a coefficient $a_{nm}$ is independent of m, regardless of whether it is a cosine or a sine polynomial. For example, when $n' = 4$, the b-coefficients are given by

$$b_{4,0} = h_4(4; \epsilon)\,a_{4,0} + \sqrt{7/5}\;h_4(6; \epsilon)\,a_{6,0} + \sqrt{9/5}\;h_4(8; \epsilon)\,a_{8,0} + \ldots, \tag{4-109a}$$

$$b_{4,2} = h_4(4; \epsilon)\,a_{4,2} + \sqrt{7/5}\;h_4(6; \epsilon)\,a_{6,2} + \sqrt{9/5}\;h_4(8; \epsilon)\,a_{8,2} + \ldots, \tag{4-109b}$$

and

$$b_{4,4} = h_4(4; \epsilon)\,a_{4,4} + \sqrt{7/5}\;h_4(6; \epsilon)\,a_{6,4} + \sqrt{9/5}\;h_4(8; \epsilon)\,a_{8,4} + \ldots. \tag{4-109c}$$

As $\epsilon \to 1$, all the multipliers vanish except that of $a_{n'm}$, which approaches unity and yields the expected result $b_{n',m} = a_{n',m}$.
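The scaling coefficients can be evaluated directly from a double sum constrained by $n - n' = 2(s + s')$. The sketch below assumes the summand $(-1)^s (n-s)!\,\epsilon^{n-2s} / [s!\,s'!\,(n'+s'+1)!]$ with the prefactor $(n'+1)$; this form is an assumption of the sketch, chosen because it reproduces the closed-form expressions $\epsilon^n$, $-(n-1)(1-\epsilon^2)\epsilon^{n-2}$, and so on. The symbol `eps` stands for the normalized radius of the scaled pupil, and the function names are illustrative.

```python
from math import factorial, sqrt


def h_coefficient(n_prime, n, eps):
    """Scaling coefficient h_{n'}(n; eps) from the constrained double sum."""
    if n_prime > n or (n - n_prime) % 2:
        return 0.0
    k = (n - n_prime) // 2          # s + s' must equal k
    total = 0.0
    for s in range(k + 1):
        s_prime = k - s
        total += ((-1) ** s * factorial(n - s) * eps ** (n - 2 * s)
                  / (factorial(s) * factorial(s_prime)
                     * factorial(n_prime + s_prime + 1)))
    return (n_prime + 1) * total


def scaled_coefficient(n_prime, a_by_n, eps):
    """b_{n',m} from coefficients {n: a_{n,m}} that share the same m."""
    return sum(sqrt((n + 1) / (n_prime + 1)) * h_coefficient(n_prime, n, eps) * a
               for n, a in a_by_n.items())


if __name__ == "__main__":
    eps = 0.8
    for n, n_prime in [(2, 0), (4, 0), (4, 2), (6, 2), (7, 1)]:
        print(f"h_{n_prime}({n}; {eps}) = {h_coefficient(n_prime, n, eps):+.6f}")
```

Because the multiplier does not depend on m, one call to `scaled_coefficient` per value of $n'$ serves every azimuthal frequency, provided the input coefficients all share the same m.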

The integral in Eq. (4-104) can also be evaluated by using the relationship [21]

$$R_n^m(\rho) = (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\,J_m(\rho r)\,dr \tag{4-110}$$

to rewrite $R_n^m(\epsilon\rho)$, where $J_n(\cdot)$ is the nth-order Bessel function of the first kind. Thus, we obtain after interchanging the integrals,

$$\begin{aligned}
\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho &= (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\left[\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho\right]dr \\
&= (-1)^{(n+n'-2m)/2}\int_0^\infty J_{n+1}(r)\,\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\,dr \\
&= \frac{1}{2(n'+1)}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right],
\end{aligned} \tag{4-111}$$

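n    n'   $h_{n'}(n; \epsilon)$
0    0    $1$
1    1    $\epsilon$
2    0    $-(1 - \epsilon^2)$
2    2    $\epsilon^2$
3    1    $-2\epsilon(1 - \epsilon^2)$
3    3    $\epsilon^3$
4    0    $(1 - \epsilon^2)(1 - 2\epsilon^2)$
4    2    $-3\epsilon^2(1 - \epsilon^2)$
4    4    $\epsilon^4$
5    1    $\epsilon(1 - \epsilon^2)(3 - 5\epsilon^2)$
5    3    $-4\epsilon^3(1 - \epsilon^2)$
5    5    $\epsilon^5$
6    0    $-(1 - \epsilon^2)(1 - 5\epsilon^2 + 5\epsilon^4)$
6    2    $3\epsilon^2(1 - \epsilon^2)(2 - 3\epsilon^2)$
6    4    $-5\epsilon^4(1 - \epsilon^2)$
6    6    $\epsilon^6$
7    1    $-2\epsilon(1 - \epsilon^2)(2 - 8\epsilon^2 + 7\epsilon^4)$
7    3    $2\epsilon^3(1 - \epsilon^2)(5 - 7\epsilon^2)$
7    5    $-6\epsilon^5(1 - \epsilon^2)$
7    7    $\epsilon^7$
8    0    $(1 - \epsilon^2)(1 - 9\epsilon^2 + 21\epsilon^4 - 14\epsilon^6)$
8    2    $-\epsilon^2(1 - \epsilon^2)(10 - 35\epsilon^2 + 28\epsilon^4)$
8    4    $5\epsilon^4(1 - \epsilon^2)(3 - 4\epsilon^2)$
8    6    $-7\epsilon^6(1 - \epsilon^2)$
8    8    $\epsilon^8$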

where we have used

$$\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho = (-1)^{(n'-m)/2}\left[\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\right], \tag{4-112a}$$

$$\frac{J_{n+1}(r)}{r} = \frac{J_n(r) + J_{n+2}(r)}{2(n+1)}, \tag{4-112b}$$

and Eq. (4-110). Substituting Eq. (4-111) into Eq. (4-104), we obtain

$$b_{n'm} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;a_{nm}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right]. \tag{4-113}$$

The equivalence of Eqs. (4-107) and (4-113) can be established by expanding the scaled radial polynomial in terms of the orthogonal radial polynomials in the form

$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} a_{n'}(n; \epsilon)\,R_{n'}^m(\rho), \tag{4-114}$$

where, using the orthogonality of the radial polynomials, an expansion coefficient given by

$$a_{n'}(n; \epsilon) = 2(n'+1)\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho \tag{4-115}$$

is the same as $h_{n'}(n; \epsilon)$, as may be seen by comparing Eqs. (4-105) and (4-114).
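The equivalence of the two routes can be checked numerically. The sketch below compares the projection integral $2(n'+1)\int_0^1 R_n^m(\epsilon\rho)R_{n'}^m(\rho)\rho\,d\rho$, evaluated by Gauss-Legendre quadrature (exact for these polynomial integrands), with the difference of radial polynomials $R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)$. The function names are illustrative, and `eps` denotes the normalized radius of the scaled pupil.

```python
import numpy as np
from math import factorial


def zernike_radial(n, m, rho):
    """Radial polynomial R_n^m(rho), assuming n - m is even and m <= n."""
    rho = np.asarray(rho, dtype=float)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * factorial(n - s)
                 / (factorial(s)
                    * factorial((n + m) // 2 - s)
                    * factorial((n - m) // 2 - s)))
        total += coeff * rho ** (n - 2 * s)
    return total


def h_from_radial_difference(n_prime, n, eps):
    """R_n^{n'}(eps) - R_n^{n'+2}(eps); the second term is absent when n' = n."""
    upper = zernike_radial(n, n_prime + 2, eps) if n_prime + 2 <= n else 0.0
    return float(zernike_radial(n, n_prime, eps) - upper)


def h_from_integral(n_prime, n, m, eps, order=24):
    """2(n'+1) * integral_0^1 R_n^m(eps*rho) R_{n'}^m(rho) rho d(rho)."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    rho = 0.5 * (nodes + 1.0)       # map [-1, 1] onto [0, 1]
    w = 0.5 * weights
    integrand = zernike_radial(n, m, eps * rho) * zernike_radial(n_prime, m, rho) * rho
    return float(2 * (n_prime + 1) * np.sum(w * integrand))


if __name__ == "__main__":
    eps = 0.8
    for n, n_prime, m in [(4, 2, 0), (4, 2, 2), (6, 2, 2), (7, 3, 1), (8, 4, 0)]:
        print(n, n_prime, m,
              round(h_from_integral(n_prime, n, m, eps), 8),
              round(h_from_radial_difference(n_prime, n, eps), 8))
```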

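4.11.2 Application to a Seidel Aberration Function

As an example of the use of Eq. (4-107), we consider a Seidel aberration function [16]

$$W(\rho, \theta) = A_t\,\rho\cos\theta + A_d\,\rho^2 + A_a\,\rho^2\cos^2\theta + A_c\,\rho^3\cos\theta + A_s\,\rho^4, \tag{4-116}$$

where a Seidel coefficient $A_i$ represents the peak value of a Seidel aberration. It can be written in terms of the Zernike polynomials in the form

$$W(\rho, \theta) = a_{0,0} Z_0^0 + a_{1,1} Z_1^1 + a_{2,0} Z_2^0 + a_{2,2} Z_2^2 + a_{3,1} Z_3^1 + a_{4,0} Z_4^0, \tag{4-117}$$

where the argument $(\rho, \theta)$ of the orthonormal Zernike polynomials $Z_n^m$ is omitted for brevity, and the Zernike coefficients are given by

$$a_{0,0} \equiv a_1 = \frac{A_d}{2} + \frac{A_a}{4} + \frac{A_s}{3}, \tag{4-118a}$$

$$a_{1,1} \equiv a_2 = \frac{A_t}{2} + \frac{A_c}{3}, \tag{4-118b}$$

$$a_{2,0} \equiv a_4 = \frac{A_d}{2\sqrt{3}} + \frac{A_a}{4\sqrt{3}} + \frac{A_s}{2\sqrt{3}}, \tag{4-118c}$$

$$a_{2,2} \equiv a_6 = \frac{A_a}{2\sqrt{6}}, \tag{4-118d}$$

$$a_{3,1} \equiv a_8 = \frac{A_c}{6\sqrt{2}}, \tag{4-118e}$$

and

$$a_{4,0} \equiv a_{11} = \frac{A_s}{6\sqrt{5}}. \tag{4-118f}$$

Moreover, it is evident that the highest order among the aberrations is N = 4. The aberration variance in terms of the Zernike coefficients is given by

$$\begin{aligned}
\sigma^2 &= a_{1,1}^2 + a_{2,0}^2 + a_{2,2}^2 + a_{3,1}^2 + a_{4,0}^2 && \text{(4-119a)} \\
&= a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2. && \text{(4-119b)}
\end{aligned}$$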

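For a scaled pupil, the aberration function can be written in the form

$$\begin{aligned}
W(\epsilon\rho, \theta) &= b_{0,0} Z_0^0 + b_{1,1} Z_1^1 + b_{2,0} Z_2^0 + b_{2,2} Z_2^2 + b_{3,1} Z_3^1 + b_{4,0} Z_4^0 && \text{(4-120a)} \\
&= b_1 Z_1 + b_2 Z_2 + b_4 Z_4 + b_6 Z_6 + b_8 Z_8 + b_{11} Z_{11}, && \text{(4-120b)}
\end{aligned}$$

where, from Eq. (4-107) and utilizing the h-coefficients given in Table 4-9, the Zernike coefficients are given by

$$b_{0,0} = h_0(0; \epsilon)\,a_{0,0} + \sqrt{3}\,h_0(2; \epsilon)\,a_{2,0} + \sqrt{5}\,h_0(4; \epsilon)\,a_{4,0} = a_{0,0} - \sqrt{3}\left(1 - \epsilon^2\right)a_{2,0} + \sqrt{5}\left(1 - \epsilon^2\right)\left(1 - 2\epsilon^2\right)a_{4,0},$$

or

$$b_1 = a_1 - \sqrt{3}\left(1 - \epsilon^2\right)a_4 + \sqrt{5}\left(1 - \epsilon^2\right)\left(1 - 2\epsilon^2\right)a_{11}, \tag{4-121a}$$

$$b_{1,1} = h_1(1; \epsilon)\,a_{1,1} + \sqrt{2}\,h_1(3; \epsilon)\,a_{3,1} = \epsilon\left[a_{1,1} - 2\sqrt{2}\left(1 - \epsilon^2\right)a_{3,1}\right],$$

or

$$b_2 = \epsilon\left[a_2 - 2\sqrt{2}\left(1 - \epsilon^2\right)a_8\right], \tag{4-121b}$$

$$b_{2,0} = h_2(2; \epsilon)\,a_{2,0} + \sqrt{5/3}\;h_2(4; \epsilon)\,a_{4,0} = \epsilon^2\left[a_{2,0} - \sqrt{15}\left(1 - \epsilon^2\right)a_{4,0}\right],$$

or

$$b_4 = \epsilon^2\left[a_4 - \sqrt{15}\left(1 - \epsilon^2\right)a_{11}\right], \tag{4-121c}$$

$$b_{2,2} = h_2(2; \epsilon)\,a_{2,2} = \epsilon^2 a_{2,2},$$

or

$$b_6 = \epsilon^2 a_6, \tag{4-121d}$$

$$b_{3,1} = h_3(3; \epsilon)\,a_{3,1} = \epsilon^3 a_{3,1},$$

or

$$b_8 = \epsilon^3 a_8, \tag{4-121e}$$

and

$$b_{4,0} = h_4(4; \epsilon)\,a_{4,0} = \epsilon^4 a_{4,0},$$

or

$$b_{11} = \epsilon^4 a_{11}. \tag{4-121f}$$

The aberration variance for the scaled pupil is given by

$$\sigma^2 = b_2^2 + b_4^2 + b_6^2 + b_8^2 + b_{11}^2. \tag{4-122}$$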

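It can be verified that Eqs. (4-121a)–(4-121f) are indeed correct by writing the Seidel aberration function for the scaled pupil and determining its Zernike coefficients. From Eq. (4-116), the aberration function of the scaled pupil can be written

$$W(\epsilon\rho, \theta) = A_t\,\epsilon\rho\cos\theta + A_d\,\epsilon^2\rho^2 + A_a\,\epsilon^2\rho^2\cos^2\theta + A_c\,\epsilon^3\rho^3\cos\theta + A_s\,\epsilon^4\rho^4, \tag{4-123}$$

or

$$W(\epsilon\rho, \theta) = A_t'\,\rho\cos\theta + A_d'\,\rho^2 + A_a'\,\rho^2\cos^2\theta + A_c'\,\rho^3\cos\theta + A_s'\,\rho^4, \tag{4-124}$$

where

$$A_t' = \epsilon A_t, \quad A_d' = \epsilon^2 A_d, \quad A_a' = \epsilon^2 A_a, \quad A_c' = \epsilon^3 A_c, \quad A_s' = \epsilon^4 A_s. \tag{4-125}$$

Writing Eq. (4-124) in terms of Zernike polynomials, as was done in obtaining Eq. (4-117) from Eq. (4-116), it is easy to see that the Zernike coefficients thus obtained are the same as the corresponding coefficients given by Eqs. (4-121a)–(4-121f).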

4.11.3 Numerical Example

If each Seidel aberration coefficient in Eq. (4-116) is unity (e.g., one wave), then the corresponding Zernike coefficients in Eq. (4-117) for the full pupil are given by

$$a_1 = 13/12, \quad a_2 = 5/6, \quad a_4 = 5/4\sqrt{3}, \quad a_6 = 1/2\sqrt{6}, \quad a_8 = 1/6\sqrt{2}, \quad a_{11} = 1/6\sqrt{5}. \tag{4-126}$$

Substituting Eqs. (4-126) into Eq. (4-119b), the variance of the aberration function is given by $\sigma^2 = 919/720$, or its standard deviation is given by $\sigma = 1.1298$. For a pupil scaled with $\epsilon = 0.8$, the Zernike coefficients in Eq. (4-120b) are given by

$$b_1 = 0.6165, \quad b_2 = 0.5707, \quad b_4 = 0.3954, \quad b_6 = 0.1306, \quad b_8 = 0.0603, \quad b_{11} = 0.0305. \tag{4-127}$$

Substituting Eqs. (4-127) into Eq. (4-122), the aberration variance and standard deviation for the scaled pupil are given by

$$\sigma^2 = 0.5036 \tag{4-128}$$

and

$$\sigma = 0.7097, \tag{4-129}$$

respectively.
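The numerical example can be reproduced in a few lines. The sketch below assumes the Seidel-to-Zernike relations $a_1 = A_d/2 + A_a/4 + A_s/3$, $a_2 = A_t/2 + A_c/3$, $a_4 = (A_d + A_a/2 + A_s)/2\sqrt{3}$, $a_6 = A_a/2\sqrt{6}$, $a_8 = A_c/6\sqrt{2}$, $a_{11} = A_s/6\sqrt{5}$, and the closed-form scaling of the primary coefficients with the normalized pupil radius `eps`; the function names are illustrative.

```python
import math


def seidel_to_zernike(A_t, A_d, A_a, A_c, A_s):
    """Zernike coefficients {j: a_j} of a Seidel aberration function."""
    return {
        1: A_d / 2 + A_a / 4 + A_s / 3,
        2: A_t / 2 + A_c / 3,
        4: (A_d + A_a / 2 + A_s) / (2 * math.sqrt(3)),
        6: A_a / (2 * math.sqrt(6)),
        8: A_c / (6 * math.sqrt(2)),
        11: A_s / (6 * math.sqrt(5)),
    }


def scale_primary_coefficients(a, eps):
    """Zernike coefficients {j: b_j} after reducing the pupil radius to eps."""
    e2 = eps ** 2
    return {
        1: (a[1] - math.sqrt(3) * (1 - e2) * a[4]
            + math.sqrt(5) * (1 - e2) * (1 - 2 * e2) * a[11]),
        2: eps * (a[2] - 2 * math.sqrt(2) * (1 - e2) * a[8]),
        4: e2 * (a[4] - math.sqrt(15) * (1 - e2) * a[11]),
        6: e2 * a[6],
        8: eps ** 3 * a[8],
        11: eps ** 4 * a[11],
    }


def variance(coeffs):
    """Aberration variance: sum of squared coefficients, piston excluded."""
    return sum(v * v for j, v in coeffs.items() if j != 1)


if __name__ == "__main__":
    a = seidel_to_zernike(1, 1, 1, 1, 1)
    b = scale_primary_coefficients(a, 0.8)
    print("full pupil:   variance =", round(variance(a), 4),
          " sigma =", round(math.sqrt(variance(a)), 4))
    print("scaled pupil: variance =", round(variance(b), 4),
          " sigma =", round(math.sqrt(variance(b)), 4))
```

A useful cross-check is that scaling the Seidel coefficients by the appropriate powers of `eps` and converting them to Zernike coefficients gives the same result as scaling the Zernike coefficients directly.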

Thus, we have obtained the Zernike coefficients of the aberration function of a scaled pupil in terms of their values for a corresponding unscaled

pupil. It is perhaps worth noting that, in practice, one will determine the Zernike

coefficients of an aberration function of a system from its interferometric data by using

Eq. (4-58). The corresponding coefficients of a scaled pupil can also be determined in the

same manner by utilizing its data, i.e., by excluding that data of the unscaled pupil that is

not part of the scaled pupil.

4.12 SUMMARY

The aberration-free PSF, called the Airy pattern, is shown in Figure 4-2. It consists of a bright central spot of radius $1.22\lambda F$, called the Airy disc, containing 83.8% of the total light, surrounded by the diffraction rings. The corresponding OTF shown in Figure 4-4 starts at a value of unity and decreases monotonically to zero at the cutoff frequency $1/\lambda F$. Since the Strehl ratio for a small aberration increases with a decrease in the aberration variance, we explicitly consider the balancing of primary aberrations with lower-order aberrations. As seen from Tables 4-1 and 4-2, the sigma value of primary spherical aberration when balanced with defocus, primary coma balanced with tilt, and primary astigmatism balanced with defocus, is reduced by a factor of 4, 3, and $\sqrt{6}/2$,

respectively. Accordingly, the aberration tolerance for a given Strehl ratio increases by

the same factor.

The Zernike circle polynomials are in widespread use for the analysis of circular

wavefronts because of their orthogonality over a unit circle and their representation of the

balanced classical aberrations for systems with circular pupils. The polynomials are

described by three indices: j is a polynomial ordering number, n represents the radial

degree or the order of a polynomial, and m represents its azimuthal frequency. The

polynomials are ordered such that an even j corresponds to a cosine polynomial and an
odd j to a sine polynomial. A polynomial with a lower value of n is ordered
first, and, for a given value of n, a polynomial with a lower value of m is ordered first.

The expressions for the polynomials through the eighth order are given in polar

coordinates in Table 4-4 and in Cartesian coordinates in Table 4-5 in the orthonormal

form so that each expansion coefficient (except piston) of an aberration function

represents the sigma value of the corresponding polynomial term.

Only the cosine circle polynomials are needed to represent the aberration function of

a rotationally symmetric system. However, both cosine and sine polynomials are needed

to represent fabrication errors, or the aberrations introduced by atmospheric turbulence. A

circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$ is m-fold symmetric. However,
its interferogram is 2m-fold symmetric. The PSF is m-fold symmetric when m is odd, and
2m-fold symmetric when m is even, unless m = 0, in which case it is radially symmetric,
like the aberration itself. These symmetry properties (along with those of the OTF) are
summarized in Table 4-6. The PSFs for two polynomial aberrations with the same n and
m values and the same sigma value but different angular dependence as $\cos m\theta$ and
$\sin m\theta$ are the same except that one is rotated by an angle $\pi/2m$ with respect to the
other. If two such polynomial aberrations are present simultaneously with sigma values
$a_j$ and $b_j$, then the orientation of the interferogram, PSF, and OTF changes by an angle
$(1/m)\tan^{-1}(b_j/a_j)$.

The circle polynomials for $n \le 8$ are illustrated in Figure 4-11 by an isometric plot,
an interferogram, and a PSF for a sigma value of one wave. The corresponding P-V
numbers are given in Table 4-7. The Strehl ratio for a sigma value of $0.1\lambda$ for each

polynomial aberration is given in Table 4-8 and plotted in Figure 4-12, illustrating that,

for a small aberration, its value can be estimated from the aberration variance regardless

of the aberration type.

The OTF is complex with real and imaginary parts (or MTF and PTF) for odd m, but

it is real for even m. For m = 0, the OTF is real and radially symmetric. The real part of

the OTF is 2m-fold symmetric whether m is odd or even. However, its imaginary part is

m-fold symmetric for odd m, though its magnitude (i.e., if we ignore its sign) is 2m-fold

symmetric. Accordingly, the MTF is 2m-fold symmetric whether m is even or odd. The

MTF for primary aberrations and $Z_{10}$, and the real and imaginary parts of the OTF for
coma and $Z_{10}$, are given for a sigma value of 0.1 wave in Figures 4-13 and 4-14,

respectively.

The determination of the effective Seidel or primary aberration coefficients from the

corresponding coefficients of the cosine and sine polynomials is demonstrated in Section

4.9. It is emphasized that these coefficients cannot be obtained from only the primary

Zernike aberrations, but must also include the primary aberrations in the higher-order

Zernike terms. How to obtain the Zernike coefficients of a certain aberration function

when the diameter of the pupil is reduced from its nominal value is discussed in Section

4.11.


References

1. F. Zernike, “Diffraction theory of knife-edge test and its improved form, the phase

contrast method,” Mon. Not. R. Astron. Soc. 94, 377–384 (1934).

66, 207–211 (1976).

Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620

(1947)

4. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,

New York, 1999).

Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

Proc. 5173, 1–17 (2003).

variance,” J. Opt. Soc. Am. 73, 860–861 (1983).

8. Lord Rayleigh, Phil. Mag. (5) 8, 403 (1879); also in his Scientific Papers (Dover,

New York, 1964) Vol. 1, p. 432.

9. V. N. Mahajan, “Strehl ratio for primary aberrations: some analytical results for

circular and annular pupils,” J. Opt. Soc. Am. 72, 1258–1266 (1982); Errata, 10,

2092 (1993).

10. V. N. Mahajan, “Line of sight of an aberrated optical system,” J. Opt. Soc. Am. A

2, 833–846 (1985).

11. W. B. King, “Dependence of the Strehl ratio on the magnitude of the variance of

the wave aberration,” J. Opt. Soc. Am. 58, 655–661 (1968).

12. A. B. Bhatia and E. Wolf, “On the circle polynomials of Zernike and related

orthogonal sets,” Proc. Cambridge Philos. Soc. 50, 40–48 (1954).

Opt. Soc. Am. 11, 1993–2003 (1994).

14. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular

polynomial aberrations,” Appl. Opt. 52, 2062-2074 (2013).

Optics, (SPIE Press, Bellingham, Washington, Second Printing 2001).


metrology,” Applied Optics and Optical Engineering, XI, 1–53 (1992). Note that

the polynomials used in this work are not in their orthonormal form, and are

ordered differently as well.

J. Phys. 15, 203–209 (2006).

18. V. N. Mahajan, “Zernike coefficients of a scaled pupil,” Appl. Opt. 49, 5374-5377

(2010).

19. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients

of scaled pupils,” Microlith, Microfab. and Microsyst, 5, 030501 (2006).

for concentric circular scaled pupils: an equivalent expression,” J. Mod. Opt. 56,

149-155 (2009).

Groningen, The Netherlands (1942).


Chapter 5

Systems with Annular Pupils

5.1 INTRODUCTION

An important example of an imaging system with a noncircular pupil is that of a

system with an annular pupil. The two-mirror astronomical telescopes represent systems

with annular pupils. Examples of such telescopes, including their linear obscuration ratios

given in parentheses are the 200-inch telescope at Mount Palomar (0.36), the 84-inch

telescope at the Kitt-Peak observatory (0.37), the telescope at the McDonald Observatory

(0.5), and the Hubble Space Telescope (0.33 when using the Wide-Field Planetary

Camera).

We start this chapter with a brief discussion of how the obscuration affects the

aberration-free PSF and OTF of a circular pupil. We then consider its effect on the Strehl

ratio of primary aberrations, their balancing, and tolerances with and without balancing.

Next we obtain the polynomials that are orthonormal over an annular pupil by

orthogonalizing the Zernike circle polynomials by the procedure outlined in Chapter 3.

The annular polynomials are given in terms of the Zernike circle polynomials, and in both

polar and Cartesian coordinates. They are also related to the balanced aberrations. The

aberrated PSFs and OTFs are illustrated for the annular polynomial aberrations.

5.2.1 PSF

Figure 5-1 illustrates a unit annular pupil with outer and inner radii of 1 and $\epsilon$, i.e., a
pupil with a linear obscuration ratio of $\epsilon$. Thus, if $(\rho, \theta)$ are the coordinates of a point on
the pupil, then $\epsilon \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The PSF, Strehl ratio, and the OTF of a system
with an annular pupil can be obtained from the equations given in Section 2.2 in the same
manner as for a system with a circular pupil. The significant difference lies in replacing
the lower limit 0 of the radial integration by the obscuration ratio $\epsilon$ of the annular pupil.
Thus, Eq. (4-3) for the aberrated PSF for an aberration $\Phi(\rho, \theta; \epsilon)$ is replaced by

Figure 5-1. Unit annulus of obscuration ratio $\epsilon$, representing the ratio of its inner
and outer radii.

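$$I(r, \theta_i; \epsilon) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi}\exp[i\Phi(\rho,\theta;\epsilon)]\,\exp[-\pi i r\rho\cos(\theta_i-\theta)]\,\rho\,d\rho\,d\theta\right|^2 ,$$ (5-1)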

where $(r, \theta_i)$ are the polar coordinates of a point in the image plane, $r$ is in units of $\lambda F$,
and $F = R/D$ is the focal ratio of the image-forming light cone. The PSF is normalized to
unity at the center by the aberration-free central irradiance $\pi P_{ex}(1-\epsilon^2)/4\lambda^2F^2$. It is
smaller than the corresponding central value for a circular pupil by a factor of $(1-\epsilon^2)^2$,
since both the pupil area and the power $P_{ex}$ are each smaller by a factor of $(1-\epsilon^2)$.

The aberration-free PSF is given by [1,2]

$$I(r;\epsilon) = \frac{1}{(1-\epsilon^2)^2}\left[\frac{2J_1(\pi r)}{\pi r} - \epsilon^2\,\frac{2J_1(\pi\epsilon r)}{\pi\epsilon r}\right]^2 .$$ (5-2)

The effect of the obscuration is twofold. First, there is a loss of light in the image that
increases with increasing $\epsilon$. Second, the radius of the central bright spot decreases and
contains less and less light, while more and more light appears in the diffraction rings. As
$\epsilon \to 1$, the PSF approaches $J_0^2(\pi r)$, and the central bright spot radius decreases to 0.76
compared to a value of 1.22 for a circular pupil. The irradiance distribution I of the PSF
and its encircled power P are shown in Figure 5-2 for several typical values of the
obscuration ratio. The 2D PSF is shown in Figure 5-3 for obscuration ratios of 0.5 and
0.8. For large obscuration ratios, such as 0.8, the PSF consists of groups of diffraction
rings.
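The shrinking of the central bright spot can be illustrated numerically. The following is a minimal sketch, not taken from the source: it evaluates the aberration-free annular PSF of Eq. (5-2) and locates its first dark ring. To stay within the standard library, $J_1$ is computed from Bessel's integral $J_n(x) = (1/\pi)\int_0^\pi\cos(nt - x\sin t)\,dt$ with the trapezoidal rule. All function names are hypothetical.

```python
import math

def bessel_j(n, x, steps=200):
    """J_n(x) from Bessel's integral, trapezoidal rule on [0, pi]."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(n * math.pi))   # integrand at t = 0 and t = pi
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi

def jinc(x):
    """2 J_1(x) / x, with the limiting value 1 at x = 0."""
    return 1.0 if abs(x) < 1e-12 else 2.0 * bessel_j(1, x) / x

def amplitude(r, eps):
    """Bracketed amplitude of Eq. (5-2) divided by (1 - eps^2); r in units of lambda*F."""
    return (jinc(math.pi * r) - eps**2 * jinc(math.pi * eps * r)) / (1.0 - eps**2)

def psf(r, eps):
    """Aberration-free annular PSF, normalized to unity at r = 0."""
    return amplitude(r, eps) ** 2

def first_dark_ring(eps, step=0.005):
    """Radius of the first zero of the PSF: coarse scan, then bisection."""
    r = step
    while amplitude(r + step, eps) > 0.0:
        r += step
    lo, hi = r, r + step
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if amplitude(mid, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for eps in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(eps, first_dark_ring(eps))
```

The scan assumes the amplitude is positive from the center out to its first zero, which holds for the aberration-free case considered here.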

Figure 5-2. The irradiance and encircled power distributions for various values of
the obscuration ratio $\epsilon$. [Plot of $I(r)$ and $P(r_c)$ against $r$; $r_c$ from 0 to 3, with curves for $\epsilon$ = 0, 0.25, 0.50, and 0.75.]

Figure 5-3. 2D PSF for an obscuration ratio of (a) 0.5 and (b) 0.8.

5.2.2 OTF

The aberration-free OTF, representing the Fourier transform of the corresponding
PSF given by Eq. (5-2) [3], or the fractional overlap area of two unit annular circles
separated by a distance $\lambda R\nu_i$, is given by [1,4]

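$$\tau(\nu;\epsilon) = \frac{1}{1-\epsilon^2}\left[\tau(\nu) + \epsilon^2\tau(\nu/\epsilon) - \tau_{12}(\nu;\epsilon)\right] , \quad 0 \le \nu \le 1 ,$$ (5-3)

where $\tau(\nu)$ is given by Eq. (4-15) and represents the OTF of the system if there were no
obscuration, $\nu = \lambda F\nu_i$ is a normalized radial spatial frequency as in the case of a circular
pupil (since the obscuration has no effect on the cutoff frequency $1/\lambda F$), and

$$\tau_{12}(\nu;\epsilon) = 2\epsilon^2 , \quad 0 \le \nu \le (1-\epsilon)/2$$ (5-4a)

$$= (2/\pi)\left(\theta_1 + \epsilon^2\theta_2 - 2\nu\sin\theta_1\right) , \quad (1-\epsilon)/2 \le \nu \le (1+\epsilon)/2$$ (5-4b)

$$= 0 , \quad \text{otherwise} .$$ (5-4c)

The angles $\theta_1$ and $\theta_2$ are given by

$$\cos\theta_1 = \frac{4\nu^2 + 1 - \epsilon^2}{4\nu}$$ (5-5a)

and

$$\cos\theta_2 = \frac{4\nu^2 - 1 + \epsilon^2}{4\epsilon\nu} ,$$ (5-5b)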

respectively. It is evident from Eq. (5-3) that $\tau(\nu;\epsilon) > \tau(\nu)$ at least for spatial frequencies
$(1+\epsilon)/2 < \nu < 1$ by a factor of $(1-\epsilon^2)^{-1}$. This is illustrated in Figure 5-4 for the same
values of $\epsilon$ as the PSFs in Figure 5-2. The OTF decreases at the low and mid spatial
frequencies and increases at the high. This is the spatial frequency analog of the increased
light in the diffraction rings and a smaller central bright spot.
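The behavior described above can be reproduced with a few lines of code. The sketch below is illustrative only. It assumes that the circular-pupil OTF of Eq. (4-15) has the familiar form $\tau(\nu) = (2/\pi)[\cos^{-1}\nu - \nu(1-\nu^2)^{1/2}]$ for $\nu \le 1$ and is zero beyond the cutoff, and that the cross term equals $2\epsilon^2$ at frequencies below $(1-\epsilon)/2$, where the smaller circle lies entirely inside the larger one. Function names are hypothetical.

```python
import math

def tau_circ(v):
    """Aberration-free OTF of a circular pupil (assumed form of Eq. (4-15))."""
    if v >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(v) - v * math.sqrt(1.0 - v * v))

def tau12(v, eps):
    """Cross term of Eq. (5-3): overlap of a unit circle and a circle of radius eps."""
    if v <= (1.0 - eps) / 2.0:
        return 2.0 * eps**2
    if v >= (1.0 + eps) / 2.0:
        return 0.0
    c1 = (4.0 * v * v + 1.0 - eps**2) / (4.0 * v)
    c2 = (4.0 * v * v - 1.0 + eps**2) / (4.0 * eps * v)
    th1 = math.acos(max(-1.0, min(1.0, c1)))   # clamp against rounding
    th2 = math.acos(max(-1.0, min(1.0, c2)))
    return (2.0 / math.pi) * (th1 + eps**2 * th2 - 2.0 * v * math.sin(th1))

def tau_annular(v, eps):
    """Aberration-free OTF of an annular pupil of obscuration ratio eps."""
    if eps == 0.0:
        return tau_circ(v)
    return (tau_circ(v) + eps**2 * tau_circ(v / eps) - tau12(v, eps)) / (1.0 - eps**2)

def radial_integral(eps, n=4000):
    """Midpoint estimate of the integral of tau(v; eps) * v over 0 <= v <= 1."""
    return sum(tau_annular((i + 0.5) / n, eps) * (i + 0.5) / n for i in range(n)) / n

eps = 0.5
for v in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(v, tau_circ(v), tau_annular(v, eps))
print("integral:", radial_integral(eps), "expected (1 - eps^2)/8 =", (1 - eps**2) / 8)
```

The integral printed at the end can be compared with $(1-\epsilon^2)/8$, which follows from the normalization of the PSF.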

Figure 5-4. The OTF $\tau(\nu;\epsilon)$ plotted against $\nu$ for $\epsilon$ = 0, 0.25, 0.50, and 0.75.

$$\int_0^1\tau(\nu;\epsilon)\,\nu\,d\nu = \left(1-\epsilon^2\right)/8 .$$ (5-6)

$$\tau'(0;\epsilon) = -\,4/\pi(1-\epsilon) .$$ (5-7)

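5.3 STREHL RATIO AND ABERRATION BALANCING

Letting $r = 0$ in Eq. (5-1), we obtain the Strehl ratio of an image:

$$S \equiv I(0;\epsilon) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi}\exp[i\Phi(\rho,\theta;\epsilon)]\,\rho\,d\rho\,d\theta\right|^2 .$$ (5-8)

The approximate value of the Strehl ratio can be obtained from the aberration variance

$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 ,$$ (5-9)

where the mean and the mean square values of the aberration are obtained from

$$\langle\Phi^n\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi}\Phi^n(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta ,$$ (5-10)

with n = 1 and 2, respectively. Table 5-1 gives the form as well as the standard deviation
$\sigma_\Phi$ of a primary aberration.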

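Table 5-1. Primary aberrations and their standard deviations for a system with a
uniformly illuminated annular pupil of obscuration ratio $\epsilon$.

Aberration                   $\Phi(\rho, \theta)$          $\sigma_\Phi$

Spherical                    $A_s\rho^4$                   $(4 - \epsilon^2 - 6\epsilon^4 - \epsilon^6 + 4\epsilon^8)^{1/2}A_s/3\sqrt{5}$

Coma                         $A_c\rho^3\cos\theta$         $(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}A_c/2\sqrt{2}$

Astigmatism                  $A_a\rho^2\cos^2\theta$       $(1 + \epsilon^4)^{1/2}A_a/4$

Field curvature (defocus)    $A_d\rho^2$                   $(1 - \epsilon^2)A_d/2\sqrt{3}$

Distortion (tilt)            $A_t\rho\cos\theta$           $(1 + \epsilon^2)^{1/2}A_t/2$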
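The standard deviations of the primary aberrations over an annulus can be checked by direct numerical integration of the mean and mean square values. The sketch below is an illustration under stated assumptions: unit aberration coefficients, a midpoint rule on a polar grid, and the closed forms $(4 - \epsilon^2 - 6\epsilon^4 - \epsilon^6 + 4\epsilon^8)^{1/2}/3\sqrt{5}$, $(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}/2\sqrt{2}$, $(1 + \epsilon^4)^{1/2}/4$, $(1 - \epsilon^2)/2\sqrt{3}$, and $(1 + \epsilon^2)^{1/2}/2$ for spherical aberration, coma, astigmatism, defocus, and tilt, respectively. The helper names are hypothetical.

```python
import math

def annulus_mean(f, eps, n_rho=2000, n_theta=16):
    """Mean of f(rho, theta) over eps <= rho <= 1 by the midpoint rule."""
    d_rho = (1.0 - eps) / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = eps + (i + 0.5) * d_rho
        ring = sum(f(rho, (k + 0.5) * d_theta) for k in range(n_theta))
        total += ring * rho
    return total * d_rho * d_theta / (math.pi * (1.0 - eps**2))

def sigma(f, eps):
    """Standard deviation of the aberration f over the annulus."""
    mean = annulus_mean(f, eps)
    mean_sq = annulus_mean(lambda r, t: f(r, t) ** 2, eps)
    return math.sqrt(mean_sq - mean * mean)

def closed_forms(eps):
    """Closed-form standard deviations for unit aberration coefficients."""
    e2 = eps**2
    return {
        "spherical": math.sqrt(4 - e2 - 6 * e2**2 - e2**3 + 4 * e2**4) / (3 * math.sqrt(5)),
        "coma": math.sqrt(1 + e2 + e2**2 + e2**3) / (2 * math.sqrt(2)),
        "astigmatism": math.sqrt(1 + e2**2) / 4,
        "defocus": (1 - e2) / (2 * math.sqrt(3)),
        "tilt": math.sqrt(1 + e2) / 2,
    }

aberrations = {
    "spherical": lambda r, t: r**4,
    "coma": lambda r, t: r**3 * math.cos(t),
    "astigmatism": lambda r, t: r**2 * math.cos(t) ** 2,
    "defocus": lambda r, t: r**2,
    "tilt": lambda r, t: r * math.cos(t),
}

eps = 0.5
numeric = {name: sigma(f, eps) for name, f in aberrations.items()}
for name, value in closed_forms(eps).items():
    print(name, numeric[name], value)
```

Sixteen azimuthal samples suffice here because the integrands are trigonometric polynomials of low order in $\theta$.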


For a small aberration, we balance a classical aberration with one or more aberrations
of lower order to minimize its variance and thereby maximize the corresponding Strehl
ratio. Thus, for example, we balance spherical aberration with defocus, as in Chapter 4,
and write it as

$$\Phi(\rho;\epsilon) = A_s\rho^4 + B_d\rho^2 .$$ (5-11)

We determine the amount of defocus $B_d$ such that the variance $\sigma_\Phi^2$ is minimized; i.e., we
calculate $\sigma_\Phi^2$ and let

$$\frac{\partial\sigma_\Phi^2}{\partial B_d} = 0$$ (5-12)

to determine $B_d$. We thus find that
$B_d = -(1+\epsilon^2)A_s$. The corresponding standard deviation is $(1-\epsilon^2)^2A_s/6\sqrt{5}$.
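The minimization can be written out explicitly. The following worked steps are a sketch of one way to carry it out, using the annular moments $\langle\rho^{2k}\rangle$; the notation var and cov for the variance and covariance over the annulus is introduced here for convenience and is not the source's.

```latex
\begin{align*}
\sigma_\Phi^2 &= A_s^2\,\mathrm{var}(\rho^4) + 2A_sB_d\,\mathrm{cov}(\rho^4,\rho^2) + B_d^2\,\mathrm{var}(\rho^2) , \\
\langle\rho^{2k}\rangle &= \frac{2}{1-\epsilon^2}\int_\epsilon^1\rho^{2k+1}\,d\rho
  = \frac{1-\epsilon^{2(k+1)}}{(k+1)(1-\epsilon^2)} , \\
\mathrm{var}(\rho^2) &= \langle\rho^4\rangle - \langle\rho^2\rangle^2 = \frac{(1-\epsilon^2)^2}{12} , \qquad
\mathrm{cov}(\rho^4,\rho^2) = \langle\rho^6\rangle - \langle\rho^4\rangle\langle\rho^2\rangle
  = \frac{(1-\epsilon^2)^2(1+\epsilon^2)}{12} , \\
\frac{\partial\sigma_\Phi^2}{\partial B_d} = 0 \;&\Rightarrow\;
B_d = -A_s\,\frac{\mathrm{cov}(\rho^4,\rho^2)}{\mathrm{var}(\rho^2)} = -(1+\epsilon^2)A_s , \\
\sigma_{\Phi,\min}^2 &= A_s^2\left[\frac{4-\epsilon^2-6\epsilon^4-\epsilon^6+4\epsilon^8}{45}
  - \frac{(1-\epsilon^4)^2}{12}\right] = \frac{(1-\epsilon^2)^4}{180}\,A_s^2 .
\end{align*}
```

Taking the square root of the last line gives the standard deviation $(1-\epsilon^2)^2A_s/6\sqrt{5}$ of balanced spherical aberration.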

Astigmatism and coma aberrations can be treated similarly. Table 5-2 lists the form
of a balanced primary aberration and its standard deviation. Also listed in the table is the
location of the diffraction focus, i.e., the point with respect to which the aberration
variance is minimum so that the Strehl ratio at it is maximum. We note that in the case of
coma, the balancing aberration is a wavefront tilt whose amount depends on $\epsilon$. Thus,
maximum Strehl ratio is obtained at a point that is displaced from the Gaussian image
point but lies in the Gaussian image plane. In the case of astigmatism, the amount of
balancing defocus is independent of $\epsilon$. The higher-order classical aberrations can be
balanced in a similar manner.

Figure 5-5 shows how the standard deviation of an aberration, for a given value of
the aberration coefficient $A_i$, varies with the obscuration ratio of the pupil. In Figures 5-5a
and 5-5b, the amounts of defocus and tilt required to minimize the variance of
spherical aberration and coma, respectively, are also shown. We observe from these
figures that the standard deviation of spherical and balanced spherical aberrations and

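Table 5-2. Balanced primary aberrations, their standard deviation, and diffraction
focus.

Balanced aberration    $\Phi(\rho, \theta; \epsilon)$    $\sigma_\Phi$    Diffraction focus

Balanced spherical     $A_s\left[\rho^4 - (1+\epsilon^2)\rho^2\right]$     $\frac{1}{6\sqrt{5}}(1-\epsilon^2)^2A_s$     $\left[0, 0, 8(1+\epsilon^2)F^2A_s\right]$

Balanced coma          $A_c\left(\rho^3 - \frac{2}{3}\,\frac{1+\epsilon^2+\epsilon^4}{1+\epsilon^2}\,\rho\right)\cos\theta$     $\frac{(1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}}{6\sqrt{2}\,(1+\epsilon^2)^{1/2}}A_c$     $\left[\frac{4(1+\epsilon^2+\epsilon^4)}{3(1+\epsilon^2)}FA_c, 0, 0\right]$

Balanced astigmatism   $A_a\rho^2\left(\cos^2\theta - 1/2\right)$     $\frac{1}{2\sqrt{6}}(1+\epsilon^2+\epsilon^4)^{1/2}A_a$     $\left(0, 0, 4F^2A_a\right)$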

Figure 5-5. Variation of the standard deviation of an
aberration with obscuration ratio $\epsilon$. Variation of balancing defocus in the case of
spherical aberration and tilt in the case of coma are also shown. (a) Spherical
aberration, (b) coma, (c) astigmatism, (d) defocus, and (e) tilt.

aberration coefficients $A_s$ and $B_d$, for a given Strehl ratio, increases. Thus, for example,
the depth of focus for a certain value of the Strehl ratio increases as $\epsilon$ increases. The
standard deviation of coma, astigmatism, balanced astigmatism, and tilt increases as $\epsilon$
increases. The standard deviation of balanced coma first slightly increases, achieves its
maximum value at $\epsilon = 0.29$, and then decreases rapidly as $\epsilon$ increases. The factor by
which the standard deviation of an aberration is reduced by balancing it with another
aberration is reduced in the case of spherical aberration, but increases in the case of coma
and astigmatism, as $\epsilon$ increases.
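The location of the maximum of the balanced-coma standard deviation can be found with a simple grid search. This is a minimal sketch that assumes the closed form $(1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}A_c/[6\sqrt{2}(1+\epsilon^2)^{1/2}]$ for balanced coma with $A_c = 1$; the function name and the grid spacing are arbitrary choices.

```python
import math

def sigma_balanced_coma(eps):
    """Standard deviation of balanced coma for a unit coefficient A_c = 1."""
    e2 = eps**2
    return (1 - e2) * math.sqrt(1 + 4 * e2 + e2**2) / (6 * math.sqrt(2) * math.sqrt(1 + e2))

# Grid search over 0 <= eps < 1 for the obscuration ratio that maximizes sigma.
grid = [i / 1000 for i in range(1000)]
eps_max = max(grid, key=sigma_balanced_coma)

print("sigma at eps = 0:", sigma_balanced_coma(0.0))
print("maximum near eps =", eps_max, "with sigma =", sigma_balanced_coma(eps_max))
```

The rise before the maximum is slight, so a fine grid (or a derivative-based root finder) is needed to resolve it.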

5.4 ORTHONORMALIZATION OF CIRCLE POLYNOMIALS OVER AN ANNULUS

The polynomials $A_j(\rho, \theta; \epsilon)$ orthonormal over a unit annulus of obscuration ratio $\epsilon$
can be obtained recursively from the Zernike circle polynomials $Z_j(\rho, \theta)$, starting with
$A_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to [5–7]

$$A_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}A_k\rangle A_k\right] ,$$ (5-13)

where $N_{j+1}$ is a normalization constant and the
angular brackets indicate a mean value over the annulus. Thus,

$$\langle Z_{j+1}A_k\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi}Z_{j+1}A_k\,\rho\,d\rho\,d\theta .$$ (5-14)

$$\langle A_jA_{j'}\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi}A_jA_{j'}\,\rho\,d\rho\,d\theta = \delta_{jj'} .$$ (5-15)

whether j is even or odd. It is radially symmetric when $m = 0$. Because of the orthogonal
properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)], the
polynomials $A_k$ that contribute to the sum in Eq. (5-13) must also have the same angular
dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $A_{j+1}$ will also have
the same angular dependence. Thus, an annular polynomial $A_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its
radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the
annular polynomials can accordingly be written [1]

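$$A_{\text{even}\,j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\cos m\theta , \quad m \ne 0 ,$$ (5-16a)

$$A_{\text{odd}\,j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\sin m\theta , \quad m \ne 0 ,$$ (5-16b)

$$A_j(\rho,\theta;\epsilon) = \sqrt{n+1}\,R_n^0(\rho;\epsilon) , \quad m = 0 ,$$ (5-16c)

where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho;\epsilon)$ is
an annular radial polynomial.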

Substituting Eqs. (5-16a)–(5-16c) into Eq. (5-15), we find that the annular radial
polynomials obey the orthogonality condition

$$\int_\epsilon^1R_n^m(\rho;\epsilon)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho = \frac{1-\epsilon^2}{2(n+1)}\,\delta_{nn'} .$$ (5-17)

In the two-index n and m representation $A_n^m(\rho, \theta; \epsilon)$ of an annular polynomial, Eq. (5-13)
can be written

$$A_n^m = N_n^m\left[Z_n^m - \sum_{i=1}^{(n-m)/2}\langle Z_n^mA_{n-2i}^m\rangle A_{n-2i}^m\right] ,$$ (5-18)

where $N_n^m$ replaces the normalization constant $N_j$ and, as in Eq. (5-13), the angular
brackets indicate a mean value over the unit annulus. Substituting Eqs. (5-16a)–(5-16c)
into Eq. (5-18), we find that the annular radial polynomials are given by

$$R_n^m(\rho;\epsilon) = N_n^m\left[R_n^m(\rho) - \sum_{i\ge1}^{(n-m)/2}(n-2i+1)\,\langle R_n^m(\rho)R_{n-2i}^m(\rho;\epsilon)\rangle\,R_{n-2i}^m(\rho;\epsilon)\right] ,$$ (5-19)

where

$$\langle R_n^m(\rho)R_{n'}^m(\rho;\epsilon)\rangle = \frac{2}{1-\epsilon^2}\int_\epsilon^1R_n^m(\rho)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho .$$ (5-20)

and $\rho^m$ with coefficients that depend on $\epsilon$. The radial polynomials are even or odd in $\rho$
depending on whether n (or m) is even or odd.
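The recursive construction of the annular radial polynomials can be sketched in code. The example below is an illustration, not the source's procedure verbatim: for a fixed m it applies Gram-Schmidt orthogonalization to the circle radial polynomials $R_m^m, R_{m+2}^m, \ldots$ over $\epsilon \le \rho \le 1$, and normalizes each result so that its squared norm with weight $\rho$ equals $(1-\epsilon^2)/2(n+1)$. Polynomials are stored as dictionaries mapping a power of $\rho$ to its coefficient, and the radial integrals of monomials are evaluated exactly. All names are hypothetical.

```python
import math

def circle_radial(n, m):
    """Coefficients {power: value} of the Zernike circle radial polynomial R_n^m(rho)."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = num / den
    return coeffs

def inner(p, q, eps):
    """Integral of p(rho) q(rho) rho over eps <= rho <= 1, term by term."""
    return sum(a * b * (1.0 - eps ** (i + k + 2)) / (i + k + 2)
               for i, a in p.items() for k, b in q.items())

def annular_radial(n, m, eps):
    """Annular radial polynomial R_n^m(rho; eps) as a {power: value} dictionary."""
    def target(order):
        return (1.0 - eps**2) / (2.0 * (order + 1))   # required squared norm
    basis = []
    for order in range(m, n + 1, 2):
        p = circle_radial(order, m)
        for q_order, q in basis:
            weight = inner(p, q, eps) / target(q_order)   # projection onto q
            for k, b in q.items():
                p[k] = p.get(k, 0.0) - weight * b
        scale = math.sqrt(target(order) / inner(p, p, eps))
        basis.append((order, {k: scale * c for k, c in p.items()}))
    return basis[-1][1]

def evaluate(p, rho):
    """Value of the polynomial p at rho."""
    return sum(c * rho**k for k, c in p.items())

eps = 0.5
r40 = annular_radial(4, 0, eps)
r31 = annular_radial(3, 1, eps)
print("R_4^0 coefficients:", r40)
print("R_3^1 coefficients:", r31)
print("R_4^0(1; eps) =", evaluate(r40, 1.0))
```

Because the leading coefficient of each circle radial polynomial is positive and the subtraction only alters lower powers, the sign convention of the result follows that of the circle polynomials.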

For $m = 0$, the annular radial polynomials are equal to the Legendre polynomials
$P_n(\cdot)$ according to

$$R_{2n}^0(\rho;\epsilon) = P_n\left[\frac{2(\rho^2-\epsilon^2)}{1-\epsilon^2} - 1\right] .$$ (5-21)

Thus, they can be obtained from the circle radial polynomials $R_{2n}^0(\rho)$ by replacing $\rho$ with
$\left[(\rho^2-\epsilon^2)/(1-\epsilon^2)\right]^{1/2}$, i.e.,

$$R_{2n}^0(\rho;\epsilon) = R_{2n}^0\left[\left(\frac{\rho^2-\epsilon^2}{1-\epsilon^2}\right)^{1/2}\right] .$$ (5-22)

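Given that $R_n^n(\rho) = \rho^n$ [see Eq. (4-39)], it can be seen from Eqs. (5-17) and (5-19) that

$$R_n^n(\rho;\epsilon) = \rho^n\left\{(1-\epsilon^2)\Big/\left[1-\epsilon^{2(n+1)}\right]\right\}^{1/2}$$ (5-23a)

$$= \rho^n\Big/\left(\sum_{i=0}^{n}\epsilon^{2i}\right)^{1/2} .$$ (5-23b)

Moreover,

$$R_n^{n-2}(\rho;\epsilon) = \frac{n\rho^n - (n-1)\left[(1-\epsilon^{2n})/(1-\epsilon^{2(n-1)})\right]\rho^{n-2}}{\left\{(1-\epsilon^2)^{-1}\left[n^2\left(1-\epsilon^{2(n+1)}\right) - (n^2-1)\left(1-\epsilon^{2n}\right)^2\Big/\left(1-\epsilon^{2(n-1)}\right)\right]\right\}^{1/2}} .$$ (5-24)

It is evident that an annular radial polynomial $R_n^n(\rho;\epsilon)$ differs from the corresponding
circle polynomial $R_n^n(\rho)$ only in its normalization. We also note that

$$R_n^m(1;\epsilon) = 1 , \quad m = 0 ,$$ (5-25a)

$$\ne 1 , \quad m \ne 0 .$$ (5-25b)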

The annular polynomials obtained from Eq. (5-13) in terms of the Zernike circle
polynomials are given in Table 5-3 [1,7]. The elements of the matrix M to convert the
circle polynomials into the annular polynomials can be obtained easily from this table
according to $\{A_j\} = M\{Z_j\}$ [see Eq. (3-19)]. The nonzero elements of the matrix for
the first 15 polynomials are given in Table 5-4. The polynomial ordering, the number of
polynomials of a certain order or through a certain order n, and the relationships among
the indices n, m, and j are the same as discussed for circle polynomials in Chapter 4. It
should be evident that an annular polynomial $A_j(\rho, \theta; \epsilon)$ reduces to the corresponding
circle polynomial $Z_j(\rho, \theta)$ as $\epsilon \to 0$. In Table 5-5, the annular polynomials are given in
the Cartesian coordinates. The variation of several annular radial polynomials with $\rho$ is
shown in Figure 5-6 for $\epsilon = 0.5$.

The annular polynomials are also unique like the circle polynomials. They not only
are orthogonal over an annular pupil but also include wavefront tilt and defocus and
balanced classical aberrations as members of the polynomial set. For example, $A_6$, $A_8$,
and $A_{11}$ represent the balanced primary aberrations of astigmatism, coma, and spherical
aberration, as may be seen by comparing their forms with those given in Table 5-2. The
annular polynomials may be referred to as the orthogonal aberrations because of their
orthogonality over the annular pupil.

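Table 5-3. Annular polynomials $A_j(\rho, \theta; \epsilon)$ in terms of the
Zernike circle polynomials $Z_j(\rho, \theta)$, where $\epsilon$ is the obscuration ratio of the annular
pupil.

$A_1 = Z_1$

$A_2 = (1+\epsilon^2)^{-1/2}Z_2$

$A_3 = (1+\epsilon^2)^{-1/2}Z_3$

$A_4 = (1-\epsilon^2)^{-1}\left(-\sqrt{3}\,\epsilon^2Z_1 + Z_4\right)$

$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2}Z_5$

$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2}Z_6$

$A_7 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4Z_3 + (1+\epsilon^2)Z_7\right]$

$A_8 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4Z_2 + (1+\epsilon^2)Z_8\right]$

$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$

$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2}Z_9$

$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2}Z_{10}$

$A_{11} = (1-\epsilon^2)^{-2}\left[\sqrt{5}\,\epsilon^2(1+\epsilon^2)Z_1 - \sqrt{15}\,\epsilon^2Z_4 + Z_{11}\right]$

$A_{12} = \left(\frac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\frac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6}Z_6 + \frac{1}{1-\epsilon^2}Z_{12}\right)$

$A_{13} = \left(\frac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\frac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6}Z_5 + \frac{1}{1-\epsilon^2}Z_{13}\right)$

$A_{14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2}Z_{14}$

$A_{15} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2}Z_{15}$

$A_{16} = \frac{1}{(1-\epsilon^2)^2}\left\{\frac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4)Z_2 - 2\sqrt{6}\,(3+\epsilon^2)Z_8\right] + bZ_{16}\right\}$

$A_{17} = \frac{1}{(1-\epsilon^2)^2}\left\{\frac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4)Z_3 - 2\sqrt{6}\,(3+\epsilon^2)Z_7\right] + bZ_{17}\right\}$

$a = (1+13\epsilon^2+46\epsilon^4+46\epsilon^6+13\epsilon^8+\epsilon^{10})^{1/2}$ , $b = \left(\frac{1+4\epsilon^2+\epsilon^4}{1+9\epsilon^2+9\epsilon^4+\epsilon^6}\right)^{1/2}$

$A_{18} = \left(\frac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-\frac{2\sqrt{6}\,\epsilon^8}{1-\epsilon^8}Z_{10} + \frac{1}{1-\epsilon^2}Z_{18}\right)$

$A_{19} = \left(\frac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-\frac{2\sqrt{6}\,\epsilon^8}{1-\epsilon^8}Z_9 + \frac{1}{1-\epsilon^2}Z_{19}\right)$

$A_{20} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2}Z_{20}$

$A_{21} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2}Z_{21}$

$A_{22} = (1-\epsilon^2)^{-3}\left[-\sqrt{7}\,\epsilon^2(1+3\epsilon^2+\epsilon^4)Z_1 + \sqrt{21}\,\epsilon^2(1+2\epsilon^2)Z_4 - \sqrt{35}\,\epsilon^2Z_{11} + Z_{22}\right]$

$A_{23} = \frac{1}{(1-\epsilon^2)^2}\left\{\frac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6)Z_5 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4)Z_{13}\right] + \delta Z_{23}\right\}$

$A_{24} = \frac{1}{(1-\epsilon^2)^2}\left\{\frac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6)Z_6 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4)Z_{14}\right] + \delta Z_{24}\right\}$

$\gamma = (1+13\epsilon^2+91\epsilon^4+339\epsilon^6+792\epsilon^8+1028\epsilon^{10}+792\epsilon^{12}+339\epsilon^{14}+91\epsilon^{16}+13\epsilon^{18}+\epsilon^{20})^{1/2}$

$\delta = \left(\frac{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}{1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12}}\right)^{1/2}$

$A_{25} = c\left(-\frac{\sqrt{35}\,\epsilon^{10}}{1-\epsilon^{10}}Z_{15} + \frac{1}{1-\epsilon^2}Z_{25}\right)$

$A_{26} = c\left(-\frac{\sqrt{35}\,\epsilon^{10}}{1-\epsilon^{10}}Z_{14} + \frac{1}{1-\epsilon^2}Z_{26}\right)$

$c = \left(\frac{1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16}}\right)^{1/2}$

$A_{27} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2}Z_{27}$

$A_{28} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2}Z_{28}$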

It is evident from Eq. (5-13) that each annular polynomial is a linear combination of
the circle polynomials, without any mixing of the cosine and the sine terms. Similarly,
because of the same angular dependence of an annular polynomial $A_j(\rho, \theta; \epsilon)$ as the
corresponding circle polynomial $Z_j(\rho, \theta)$, each radial polynomial $R_n^m(\rho;\epsilon)$ can be written
as a linear combination of the polynomials $R_n^m(\rho)$, $R_{n-2}^m(\rho)$, etc. This, of course, is also
evident from Eq. (5-19). For example,

$$R_3^1(\rho;\epsilon) = \frac{1}{B}\left[(1+\epsilon^2)R_3^1(\rho) - 2\epsilon^4R_1^1(\rho)\right] ,$$ (5-26)

where

$$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2} ,$$ (5-27)

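Table 5-4. Nonzero elements of the matrix M for obtaining the
annular polynomials $A_j(\rho, \theta; \epsilon)$ from the Zernike circle polynomials $Z_j(\rho, \theta)$.

$M_{11} = 1$

$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$

$M_{41} = -\sqrt{3}\,\epsilon^2(1-\epsilon^2)^{-1}$

$M_{44} = (1-\epsilon^2)^{-1}$

$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$

$M_{73} = -2\sqrt{2}\,\epsilon^4/B = M_{82}$

$M_{77} = (1+\epsilon^2)/B = M_{88}$

$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$

$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$

$M_{11,1} = \sqrt{5}\,\epsilon^2(1+\epsilon^2)(1-\epsilon^2)^{-2}$

$M_{11,4} = -\sqrt{15}\,\epsilon^2(1-\epsilon^2)^{-2}$

$M_{11,11} = (1-\epsilon^2)^{-2}$

$M_{12,6} = -\frac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6}\left(\frac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,5}$

$M_{12,12} = \frac{1}{1-\epsilon^2}\left(\frac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,13}$

$M_{14,14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2} = M_{15,15}$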

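Table 5-5. Annular polynomials $A_j(x, y; \epsilon)$ in Cartesian coordinates
$(x, y)$, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $\epsilon \le \rho = (x^2+y^2)^{1/2} \le 1$.

Poly.    $A_j(x, y; \epsilon)$

$A_1$    $1$

$A_2$    $2x/(1+\epsilon^2)^{1/2}$

$A_3$    $2y/(1+\epsilon^2)^{1/2}$

$A_4$    $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$

$A_5$    $2\sqrt{6}\,xy/(1+\epsilon^2+\epsilon^4)^{1/2}$

$A_6$    $\sqrt{6}\,(x^2-y^2)/(1+\epsilon^2+\epsilon^4)^{1/2}$

$A_7$    $\frac{\sqrt{8}\,y\left[3(1+\epsilon^2)\rho^2 - 2(1+\epsilon^2+\epsilon^4)\right]}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$

$A_8$    $\frac{\sqrt{8}\,x\left[3(1+\epsilon^2)\rho^2 - 2(1+\epsilon^2+\epsilon^4)\right]}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$

$A_9$    $\sqrt{8}\,y(3x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$

$A_{10}$    $\sqrt{8}\,x(x^2-3y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$

$A_{11}$    $\sqrt{5}\left[6\rho^4 - 6(1+\epsilon^2)\rho^2 + (1+4\epsilon^2+\epsilon^4)\right]/(1-\epsilon^2)^2$

$A_{12}$    $\frac{\sqrt{10}\,(x^2-y^2)\left[4\rho^2 - 3(1-\epsilon^8)/(1-\epsilon^6)\right]}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$

$A_{13}$    $\frac{2\sqrt{10}\,xy\left[4\rho^2 - 3(1-\epsilon^8)/(1-\epsilon^6)\right]}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$

$A_{14}$    $\sqrt{10}\,(\rho^4 - 8x^2y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$

$A_{15}$    $4\sqrt{10}\,xy(x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$

$A_{16}$    $\frac{\sqrt{12}\,x\left[10(1+4\epsilon^2+\epsilon^4)\rho^4 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$

$A_{17}$    $\frac{\sqrt{12}\,y\left[10(1+4\epsilon^2+\epsilon^4)\rho^4 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$

$A_{18}$    $\frac{\sqrt{12}\,x(x^2-3y^2)\left[5\rho^2 - 4(1-\epsilon^{10})/(1-\epsilon^8)\right]}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$

$A_{19}$    $\frac{\sqrt{12}\,y(3x^2-y^2)\left[5\rho^2 - 4(1-\epsilon^{10})/(1-\epsilon^8)\right]}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$

$A_{20}$    $\sqrt{12}\,x(16x^4 - 20x^2\rho^2 + 5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$

$A_{21}$    $\sqrt{12}\,y(16y^4 - 20y^2\rho^2 + 5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$

$A_{22}$    $\sqrt{7}\left[20\rho^6 - 30(1+\epsilon^2)\rho^4 + 12(1+3\epsilon^2+\epsilon^4)\rho^2 - (1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$

$A_{23}$    $\frac{2\sqrt{14}\,xy\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$

$A_{24}$    $\frac{\sqrt{14}\,(x^2-y^2)\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$

$A_{25}$    $\frac{4\sqrt{14}\,xy(x^2-y^2)\left[6\rho^2 - 5(1-\epsilon^{12})/(1-\epsilon^{10})\right]}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14}) - 35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$

$A_{26}$    $\frac{\sqrt{14}\,(8x^4 - 8x^2\rho^2 + \rho^4)\left[6\rho^2 - 5(1-\epsilon^{12})/(1-\epsilon^{10})\right]}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14}) - 35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$

$A_{27}$    $\sqrt{14}\,xy(32x^4 - 32x^2\rho^2 + 6\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$

$A_{28}$    $\sqrt{14}\,(32x^6 - 48x^4\rho^2 + 18x^2\rho^4 - \rho^6)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$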

Figure 5-6. Variation of an annular radial polynomial $R_n^m(\rho;\epsilon)$ with $\rho$ for $\epsilon = 0.5$.
(a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.

and

$$R_4^0(\rho;\epsilon) = (1-\epsilon^2)^{-2}\left[R_4^0(\rho) - 3\epsilon^2R_2^0(\rho) + \epsilon^2(1+\epsilon^2)R_0^0(\rho)\right] .$$ (5-28)

The radial annular polynomials $R_n^m(\rho;\epsilon)$ for $n \le 8$ are listed in Table 5-6. Table 5-7 lists
the full annular polynomials, illustrating their ordering.

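5.6 ANNULAR COEFFICIENTS OF AN ANNULAR ABERRATION FUNCTION

The aberration function $W(\rho, \theta; \epsilon)$ across a unit annulus with an obscuration ratio $\epsilon$
can be expanded in terms of J annular polynomials $A_j(\rho, \theta; \epsilon)$ in the form

$$W(\rho,\theta;\epsilon) = \sum_{j=1}^{J}a_jA_j(\rho,\theta;\epsilon) , \quad 0 \le \epsilon < 1 , \quad \epsilon \le \rho \le 1 , \quad 0 \le \theta \le 2\pi ,$$ (5-29)

where $a_j$ are the expansion coefficients. Multiplying both
sides of Eq. (5-29) by $A_j(\rho, \theta; \epsilon)$, integrating over the unit annulus, and using the
orthonormality Eq. (5-15), we obtain the annular expansion coefficients:

$$a_j = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi}W(\rho,\theta;\epsilon)\,A_j(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta .$$ (5-30)

The mean and the mean square values of the aberration function are given by

$$\langle W(\rho,\theta;\epsilon)\rangle = a_1$$ (5-31)

and

$$\langle W^2(\rho,\theta;\epsilon)\rangle = \sum_{j=1}^{J}a_j^2 .$$ (5-32)

Accordingly, the aberration variance is given by

$$\sigma_W^2 = \langle W^2(\rho,\theta;\epsilon)\rangle - \langle W(\rho,\theta;\epsilon)\rangle^2 = \sum_{j=2}^{J}a_j^2 .$$ (5-33)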

As explained in Section 3.3, the annular expansion coefficients yield a least-squares fit of

the aberration function with J polynomials.
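As an illustration of the expansion, the coefficients of a simple aberration can be computed by projecting it onto the annular polynomials. The sketch below is a worked example under stated assumptions: the aberration is primary spherical aberration $W = \rho^4$ (unit coefficient), which is rotationally symmetric, so only the radially symmetric polynomials $A_1$, $A_4$, and $A_{11}$ contribute and the angular integration reduces to a factor of $2\pi$. The closed forms used for $A_4$ and $A_{11}$ are those of the annular defocus and spherical polynomials. Function names are hypothetical.

```python
import math

def poly_a1(rho, eps):
    """Annular piston polynomial."""
    return 1.0

def poly_a4(rho, eps):
    """Annular defocus polynomial."""
    return math.sqrt(3) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2)

def poly_a11(rho, eps):
    """Annular primary spherical polynomial."""
    return (math.sqrt(5) * (6 * rho**4 - 6 * (1 + eps**2) * rho**2
                            + 1 + 4 * eps**2 + eps**4) / (1 - eps**2) ** 2)

def radial_mean(f, eps, n=20000):
    """Mean over the annulus of a rotationally symmetric function f(rho)."""
    h = (1.0 - eps) / n
    total = 0.0
    for i in range(n):
        rho = eps + (i + 0.5) * h
        total += f(rho) * rho
    return 2.0 * total * h / (1.0 - eps**2)

def coefficient(w, poly, eps):
    """Expansion coefficient: mean of W times the polynomial over the annulus."""
    return radial_mean(lambda rho: w(rho) * poly(rho, eps), eps)

eps = 0.5
w = lambda rho: rho**4
a1 = coefficient(w, poly_a1, eps)
a4 = coefficient(w, poly_a4, eps)
a11 = coefficient(w, poly_a11, eps)

variance_from_coefficients = a4**2 + a11**2
variance_direct = radial_mean(lambda rho: w(rho) ** 2, eps) - a1**2
print("a1, a4, a11:", a1, a4, a11)
print("variance:", variance_from_coefficients, variance_direct)
```

The two variance estimates agree because $\rho^4$ lies in the span of $A_1$, $A_4$, and $A_{11}$, so the sum over the non-piston coefficients is complete with these three terms.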

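Table 5-6. Annular radial polynomials $R_n^m(\rho;\epsilon)$, where $\epsilon$ is the obscuration ratio
and $\epsilon \le \rho \le 1$.

n    m    $R_n^m(\rho;\epsilon)$

0    0    $1$

1    1    $\rho/(1+\epsilon^2)^{1/2}$

2    0    $(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$

2    2    $\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}$

3    1    $\frac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$

3    3    $\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$

4    0    $\left[6\rho^4 - 6(1+\epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4\right]/(1-\epsilon^2)^2$

4    2    $\frac{4\rho^4 - 3\left[(1-\epsilon^8)/(1-\epsilon^6)\right]\rho^2}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$

4    4    $\rho^4/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$

5    1    $\frac{10(1+4\epsilon^2+\epsilon^4)\rho^5 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^3 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$

5    3    $\frac{5\rho^5 - 4\left[(1-\epsilon^{10})/(1-\epsilon^8)\right]\rho^3}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$

5    5    $\rho^5/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$

6    0    $\left[20\rho^6 - 30(1+\epsilon^2)\rho^4 + 12(1+3\epsilon^2+\epsilon^4)\rho^2 - (1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$

6    2    $\frac{15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^6 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^4 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\rho^2}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$

6    4    $\frac{6\rho^6 - 5\left[(1-\epsilon^{12})/(1-\epsilon^{10})\right]\rho^4}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14}) - 35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$

6    6    $\rho^6/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$

7    1    $a_7^1\rho^7 + b_7^1\rho^5 + c_7^1\rho^3 + d_7^1\rho$

7    3    $a_7^3\rho^7 + b_7^3\rho^5 + c_7^3\rho^3$

7    5    $\frac{7\rho^7 - 6\left[(1-\epsilon^{14})/(1-\epsilon^{12})\right]\rho^5}{\left\{(1-\epsilon^2)^{-1}\left[49(1-\epsilon^{16}) - 48(1-\epsilon^{14})^2/(1-\epsilon^{12})\right]\right\}^{1/2}}$

7    7    $\rho^7/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})^{1/2}$

8    0    $\left[70\rho^8 - 140(1+\epsilon^2)\rho^6 + 30(3+8\epsilon^2+3\epsilon^4)\rho^4 - 20(1+6\epsilon^2+6\epsilon^4+\epsilon^6)\rho^2 + e_8^0\right]/(1-\epsilon^2)^4$

8    2    $a_8^2\rho^8 + b_8^2\rho^6 + c_8^2\rho^4 + d_8^2\rho^2$

8    4    $a_8^4\rho^8 + b_8^4\rho^6 + c_8^4\rho^4$

8    6    $a_8^6\rho^8 + b_8^6\rho^6$

8    8    $\rho^8/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14}+\epsilon^{16})^{1/2}$

$a_7^1 = 35(1+9\epsilon^2+9\epsilon^4+\epsilon^6)/A_7^1$

$b_7^1 = -60(1+9\epsilon^2+15\epsilon^4+9\epsilon^6+\epsilon^8)/A_7^1$

$c_7^1 = 30(1+9\epsilon^2+25\epsilon^4+25\epsilon^6+9\epsilon^8+\epsilon^{10})/A_7^1$

$d_7^1 = -4(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_7^1$

$A_7^1 = (1-\epsilon^2)^3(1+9\epsilon^2+9\epsilon^4+\epsilon^6)^{1/2}(1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8)^{1/2}$

$a_7^3 = 21(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})/A_7^3$

$b_7^3 = -30(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+20\epsilon^8+10\epsilon^{10}+4\epsilon^{12}+\epsilon^{14})/A_7^3$

$c_7^3 = 10(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_7^3$

$A_7^3 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})^{1/2}$
$\quad\times\,(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})^{1/2}$

$e_8^0 = 1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8$

$a_8^2 = 56(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_8^2$

$b_8^2 = -105(1+9\epsilon^2+45\epsilon^4+85\epsilon^6+85\epsilon^8+45\epsilon^{10}+9\epsilon^{12}+\epsilon^{14})/A_8^2$

$c_8^2 = 60(1+9\epsilon^2+45\epsilon^4+115\epsilon^6+150\epsilon^8+115\epsilon^{10}+45\epsilon^{12}+9\epsilon^{14}+\epsilon^{16})/A_8^2$

$d_8^2 = -10(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})/A_8^2$

$A_8^2 = (1-\epsilon^2)^3(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})^{1/2}$
$\quad\times\,(1+16\epsilon^2+136\epsilon^4+416\epsilon^6+626\epsilon^8+416\epsilon^{10}+136\epsilon^{12}+16\epsilon^{14}+\epsilon^{16})^{1/2}$

$a_8^4 = 28(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_8^4$

$b_8^4 = -42(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+35\epsilon^{10}+20\epsilon^{12}+10\epsilon^{14}+4\epsilon^{16}+\epsilon^{18})/A_8^4$

$c_8^4 = 15(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+35\epsilon^{12}+20\epsilon^{14}+10\epsilon^{16}+4\epsilon^{18}+\epsilon^{20})/A_8^4$

$A_8^4 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})^{1/2}$
$\quad\times\,(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+495\epsilon^8+846\epsilon^{10}+994\epsilon^{12}+846\epsilon^{14}+495\epsilon^{16}+165\epsilon^{18}+45\epsilon^{20}+9\epsilon^{22}+\epsilon^{24})^{1/2}$

$a_8^6 = 8(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})/A_8^6$

$b_8^6 = -7(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})/A_8^6$

$A_8^6 = (1-\epsilon^2)(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$
$\quad\times\,(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+84\epsilon^{12}+56\epsilon^{14}+35\epsilon^{16}+20\epsilon^{18}+10\epsilon^{20}+4\epsilon^{22}+\epsilon^{24})^{1/2}$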


manner as the circle polynomials in Table 4-3.

* The words “orthonormal annular” should be added to the name, e.g., orthonormal

annular primary spherical aberration.


5.7 STREHL RATIO FOR ANNULAR POLYNOMIAL ABERRATIONS

The Strehl ratio for an annular polynomial aberration with a sigma value of 0.1 wave is listed in Table 5-8 and plotted in Figure 5-7. For the wavefront tilt polynomials $A_2$ and $A_3$, the Strehl ratio simply represents the PSF value at a displaced point along the x or the y axis, respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.358\lambda F$.

A closed-form expression for the Strehl ratio for the annular defocus polynomial can be

obtained from Eq. (5-8) by letting

$S = \left[\dfrac{\sin(\sqrt{3}\,a_4)}{\sqrt{3}\,a_4}\right]^2$ . (5-35)

For a defocus aberration sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement with the result given in Table 5-8. Although Eq. (5-35) reads exactly the same as Eq. (4-82) for a circular pupil, the longitudinal defocus for a given value of $a_4$ is different for the annular pupil [see Eq. (5-37)].

distance z instead of the Gaussian image plane at a distance R, the longitudinal defocus is

z - R , and the aberration may be written in the form

$W(\rho) = B_d\rho^2$ , (5-36)

where $B_d$ represents its peak value given by Eq. (4-19). The annular coefficient $a_4$ is related to the longitudinal defocus $z - R$ according to

$a_4 = -\dfrac{\pi}{8\sqrt{3}\,\lambda F^2}\,(1-\epsilon^2)(z-R)$ . (5-37)

distance z < R .

The results in Table 5-8 and Figure 5-7 illustrate that the Strehl ratio for a small aberration is nearly independent of the type of the aberration, and depends primarily on its sigma value. It is approximately given by Eq. (1-34) as $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
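The comparison between the closed-form defocus result and the variance-based estimate can be reproduced numerically. The following is a minimal sketch, assuming only Eq. (5-35) and the estimate $\exp(-\sigma_\Phi^2)$ quoted above; the function names are illustrative and not from the text.

```python
import math

def strehl_annular_defocus(a4):
    """Strehl ratio of Eq. (5-35); a4 is the annular defocus coefficient in radians."""
    x = math.sqrt(3.0) * a4
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2

def strehl_from_variance(sigma_phi):
    """Small-aberration estimate exp(-sigma_phi**2), sigma_phi in radians."""
    return math.exp(-sigma_phi ** 2)

sigma_phi = 2.0 * math.pi * 0.1          # a sigma of 0.1 wave expressed in radians
S_exact = strehl_annular_defocus(sigma_phi)
S_approx = strehl_from_variance(sigma_phi)
print(f"Eq. (5-35): {S_exact:.5f}   exp(-sigma^2): {S_approx:.3f}")
```

The two values differ by about one percent, which is the sense in which the Strehl ratio for a small aberration depends primarily on its sigma value.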

Table 5-8. Strehl ratio S for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma value of 0.1 wave.

Figure 5-7. Strehl ratio for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.

5.8 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ANNULAR POLYNOMIAL ABERRATIONS

As in the case of circle polynomials (see Section 4.8), we illustrate the annular polynomials for $n \le 8$ in three different but equivalent ways in Figure 5-8 for $\epsilon = 0.5$ and a sigma value of one wave [8]. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 5-9. From Eqs. (5-16) for the form of the polynomials, it is evident that the P-V numbers of two polynomials with the same values of n and m are the same. This may also be seen from Table 5-7.

The PSF plots represent the images of a point object in the presence of an annular

polynomial aberration. Thus, for example, piston yields the aberration-free PSF (since it

has no effect on the PSF) given by Eq. (5-2). The full width of a square displaying the PSFs in Figure 5-8 is $24\lambda F$.

The polynomial aberrations $A_2$ and $A_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $4a_2\lambda/[D(1+\epsilon^2)^{1/2}]$ about the y axis and displaces the PSF along the x axis by $4a_2\lambda F/(1+\epsilon^2)^{1/2}$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4a_3\lambda/[D(1+\epsilon^2)^{1/2}]$ about the x axis and displaces the PSF by $4a_3\lambda F/(1+\epsilon^2)^{1/2}$ along the y axis. As the order of a polynomial aberration increases, the interferograms and the PSFs become more and more complex.
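As a quick numerical illustration of the tilt relation, the sketch below evaluates the PSF displacement in units of $\lambda F$ for a tilt coefficient given in waves. The function name is hypothetical; the formula is the one stated in the preceding paragraph.

```python
import math

def tilt_displacement(a_waves, eps):
    """PSF displacement in units of lambda*F for an annular tilt coefficient
    a_waves (in wavelengths) and obscuration ratio eps."""
    return 4.0 * a_waves / math.sqrt(1.0 + eps ** 2)

shift = tilt_displacement(0.1, 0.5)   # sigma of 0.1 wave, eps = 0.5
print(f"displacement = {shift:.3f} lambda*F")
```

For a coefficient of 0.1 wave and $\epsilon = 0.5$, this reproduces the displacement of $0.358\lambda F$ quoted in Section 5.7.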

The 3D MTF plots for the primary polynomial aberrations and $A_{10}$ are shown in Figure 5-9 for a sigma value of 0.1 wave. The contour plots shown below each 3D MTF figure are in steps of 0.1 from the center out, starting with a value of 0.9 and ending with zero. The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots are also shown in this figure, i.e., for the spatial frequency vector along the x axis, y axis, and at 45° from the x axis, respectively. Figure 5-10a shows the symmetry of the real and the imaginary parts of the OTF for the orthogonal primary coma $A_8$. The real part has even symmetry, but the imaginary part has odd symmetry. The real and imaginary parts of the OTF for the polynomial aberration $A_{10}$ are shown in Figure 5-10b. Since the aberration is 3-fold symmetric, the imaginary part of the OTF is 3-fold symmetric, but the real part is 6-fold symmetric, as expected.

Comparing the form of the annular polynomials with those of the circle polynomials

given in Chapter 4, it is easy to see that the symmetry properties of the interferograms,

PSFs, real and imaginary parts of the OTF and the MTFs aberrated by an annular

polynomial aberration are the same as those for a corresponding circle polynomial

aberration in a circular pupil. These properties are summarized in Table 4-6.

5.8 Isometric, Interferometric, and Imaging Characteristics of Annular Polynomial Aberrations 133

Figure 5-8. Annular polynomials shown as isometric plot on the top, interferogram on the left, and PSF on the right for $\epsilon = 0.5$ and a sigma value of one wave.


annular polynomials for $\epsilon = 0.5$ and a sigma value of one wave.

Figure 5-9. 3D, tangential or along x axis (in long dashes), sagittal or along y axis (in medium dashes), and at 45° from the x axis (in small dashes) MTF plots for annular polynomial aberrations with a sigma value of 0.1 wave for $\epsilon = 0.5$. The solid curve represents the aberration-free MTF. The spatial frequency v is normalized by the cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in steps of 0.1 from the center out, starting with 0.9 and ending with zero. Panels: $A_1$ (piston), $A_4$ (defocus), $A_6$ (primary astigmatism), $A_8$ (primary coma), $A_{10}$, and $A_{11}$ (primary spherical).

Figure 5-10. Real and imaginary parts of the OTF for an annular polynomial aberration with a sigma value of 0.1 wave for $\epsilon = 0.5$. (a) $A_8$ (primary coma) shows the even and odd symmetry of the real and imaginary parts. (b) $A_{10}$ shows the 6-fold symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to their even and odd symmetry, respectively. The thick and thin contours of the imaginary part represent its positive and negative values, respectively.


5.9 SUMMARY

A brief description of the aberration-free PSF and OTF of a system with an annular pupil is given in Section 5.2, followed by a discussion of the Strehl ratio and aberration balancing for such a system in Section 5.3. The variation of the standard

deviation of a primary aberration with the obscuration ratio is shown in Figure 5-5. It is

evident, for example, from Figure 5-5d that the standard deviation of the defocus

aberration decreases, and the depth of focus accordingly increases as the obscuration

increases.

orthonormalizing the Zernike circle polynomials, are given in Table 5-3 in terms of the

circle polynomials. This form is useful for comparing the expansions of an annular

wavefront in terms of the annular and circle polynomials, as discussed in Chapter 12. The

nonzero elements of a $15 \times 15$ conversion matrix for obtaining the annular polynomials from the circle polynomials are given in Table 5-4. The annular polynomials are given in Cartesian coordinates in Table 5-5 for numerical analyses of annular wavefronts. The radial annular polynomials for $n \le 8$ are given in Table 5-6. The ordering of the annular

polynomials in Table 5-7 is the same as that for the circle polynomials in Table 4-3.

The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration polynomial is given in Table 5-8 and illustrated in Figure 5-7. It shows that, for a small aberration, the Strehl ratio can be estimated from the aberration variance. The annular polynomials for $n \le 8$ are illustrated by an isometric plot, an interferogram, and a PSF in Figure 5-8 for $\epsilon = 0.5$ and a sigma value of one wave. Their peak-to-valley numbers are given in Table 5-9 in units of wavelength. The 3D MTFs are shown in Figure 5-9 for the primary and $A_{10}$ polynomial aberrations. The tangential, sagittal, and 45° MTF plots are also shown in Figure 5-9 for the orthogonal primary coma, i.e., for the spatial frequency vector along the x axis, y axis, and at 45° from the x axis, respectively. The real and imaginary parts of the OTFs are shown in Figure 5-10 for the $A_8$ and $A_{10}$ polynomial aberrations that have odd values of m.

The symmetry properties of an interferogram, PSF, and real and imaginary parts of

the OTF and MTF aberrated by an annular polynomial aberration are the same as those

for a corresponding circle polynomial aberration in a circular pupil. These properties are

summarized in Table 4-6.


References

Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

1820–1823 (1974).

3. E. L. O’Neill, “Transfer function for an annular aperture,” J. Opt. Soc. Am. 46, 285–288 (1956). Note that a term of $-2\eta^2$ is missing in the second of O’Neill’s Eq. (26), as was pointed out by the author in an Errata on p. 1096 in the Dec 1956 issue. Unfortunately, the obscuration ratio $\eta$ in the original paper was typed incorrectly as n in the Errata.

centrale de la pupille sur le contraste des images optiques.” Rev. Opt. (Paris) 32,

143–178 (1953).

with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).

pupils,” J. Opt. Soc. Am. 71, 75–85 (1981); 71, 1408 (1981); 1, 685 (1984).

Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, (McGraw Hill,

2009), pp. 11.3–11.41.

polynomial aberrations,” Appl. Opt. 52, 1–13 (2013).


Chapter 6

Systems with Gaussian Pupils

6.1 INTRODUCTION

In this chapter, we consider optical systems with Gaussian apodization or Gaussian

pupils, i.e., those with a Gaussian amplitude across the wavefront at their exit pupils,

which may be circular or annular [1,2]. The discussion in this chapter is equally

applicable to imaging systems with a Gaussian transmission (obtained, for example, by

placing a Gaussian filter at its exit pupil) as well as laser transmitters in which the laser

beam has a Gaussian distribution at its exit pupil. It is evident that whereas a Gaussian

function extends to infinity, the pupil of an optical system can only have a finite diameter.

The net effect is that the finite size of the pupil truncates the infinite-extent Gaussian

function. If the Gaussian function is very narrow (i.e., its standard deviation is very small)

compared to the radius of the pupil, it is said to be weakly truncated. In such cases, the

truncation can be neglected, and the pupil can be assumed to be infinitely wide.

The aberration-free image for a system with a Gaussian pupil shows that the

Gaussian illumination reduces the central value, broadens the central bright spot, but

reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,

the OTF for a Gaussian pupil is higher for low spatial frequencies, and lower for the high.

In these respects, the effect of a Gaussian illumination is opposite to that of a central

obscuration in an annular pupil. The diffraction rings practically disappear when the pupil

radius is twice the Gaussian radius, and the beam propagates as a Gaussian everywhere.

The OTF in this case is also described by a Gaussian function.

and shown to be smaller than its corresponding value for a uniform pupil. This is due to

the fact that the wave amplitude decreases as a function of the radial distance from the

center of the pupil while the aberration increases, i.e., the amplitude is smaller where the

aberration is larger. Accordingly, the Strehl ratio for a Gaussian pupil for a given amount

of a primary aberration is higher than that for a uniform pupil, or the aberration tolerance

for a given Strehl ratio is higher for a Gaussian pupil. The balanced primary aberrations

with minimum variance are also obtained, and the diffraction focus for various values of

the truncation ratio are given. The Gaussian polynomials orthonormal over a Gaussian

pupil are obtained by orthogonalizing the circle polynomials over such a pupil. As

expected, the Gaussian polynomials for primary aberrations represent balanced

aberrations. Similarly, the orthonormal Gaussian annular polynomials are obtained by

orthogonalizing the annular polynomials over a Gaussian pupil. Again, the primary

Gaussian annular polynomials represent the balanced aberrations for a Gaussian annular

pupil. The isometric, interferometric, and imaging characteristics of the Gaussian circular

and annular polynomial aberrations are not discussed because of their similarity with

those of the corresponding circle or annular polynomial aberrations for uniform pupils.

6.2 GAUSSIAN PUPIL

The pupil function for a system with a Gaussian pupil of radius a may be written [1]

$P(\rho,\theta) = A(\rho)\exp[i\Phi(\rho,\theta)]$ , (6-1)

where

$A(\rho) = A_0\exp(-\gamma\rho^2)$ . (6-2)

Here $A_0$ is a constant that is determined from the total power in the pupil and

$\gamma = (a/\omega)^2$ , (6-3)

where the quantity $\omega$, called the Gaussian radius, represents the radial distance from the center of the pupil at which the amplitude drops to $e^{-1}$ of the amplitude at the center. The pupil radius a normalized by the Gaussian radius $\omega$, i.e., $\sqrt{\gamma} = a/\omega$, is called the truncation ratio. The larger the value of $\gamma$ is, the narrower the Gaussian beam is. A uniform beam is represented by the limiting case of $\gamma \to 0$. The aberration function $\Phi(\rho,\theta)$ represents the phase aberration at a point $(\rho,\theta)$ in the plane of the exit pupil, where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The amplitude $A_0$ at its center is determined from the total power in the pupil.

when a uniform beam illuminates the pupil with a Gaussian transmission. In the former

case, the total power incident on the pupil and that exiting from it are given by

$P_{inc} = 2A_0^2 S_{ex}\int_0^\infty \exp(-2\gamma\rho^2)\,\rho\,d\rho = \dfrac{A_0^2 S_{ex}}{2\gamma}$ , (6-4)

and

$P_{ex} = 2A_0^2 S_{ex}\int_0^1 \exp(-2\gamma\rho^2)\,\rho\,d\rho = A_0^2\,(S_{ex}/2\gamma)\,[1-\exp(-2\gamma)]$ , (6-5)

respectively. The fractional transmitted power that goes on to the image is given by

$P_{trans} = P_{ex}/P_{inc} = 1-\exp(-2\gamma)$ . (6-6)
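The transmitted fraction of Eq. (6-6) can be checked against a direct numerical evaluation of the two radial integrals. The sketch below is illustrative only: the function names and the midpoint-rule step count are assumptions, and the integral to infinity is taken from its closed form $1/4\gamma$.

```python
import math

def transmitted_fraction(gamma):
    """Eq. (6-6): P_ex / P_inc for a Gaussian beam, gamma = (a/omega)**2."""
    return 1.0 - math.exp(-2.0 * gamma)

def transmitted_fraction_numeric(gamma, n=20000):
    """Ratio of the radial integrals in Eqs. (6-5) and (6-4) by the midpoint rule."""
    h = 1.0 / n
    inside = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        inside += math.exp(-2.0 * gamma * rho * rho) * rho * h
    total = 1.0 / (4.0 * gamma)   # integral of exp(-2*gamma*rho^2)*rho from 0 to infinity
    return inside / total

for gamma in (0.5, 1.0, 4.0):
    print(gamma, transmitted_fraction(gamma), transmitted_fraction_numeric(gamma))
```

The printed pairs agree, and the fraction approaches unity as $\gamma$ increases, i.e., as the beam becomes narrower relative to the pupil.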


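More and more power is transmitted as the beam becomes narrower and narrower, i.e., as $\omega$ decreases or $\gamma$ increases. The pupil irradiance $I(\rho) = A^2(\rho)$ in units of $P_{ex}/S_{ex}$ may be written

$I(\rho) = \dfrac{2\gamma\exp(-2\gamma\rho^2)}{1-\exp(-2\gamma)}$ . (6-7)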

The pupil in the latter case, where an amplitude filter is placed in the pupil plane, is said to be apodized. The power incident in this case is $P_{inc} = A_0^2 S_{ex}$. The power exiting from the pupil is again given by Eq. (6-5), but the fractional transmitted power is given by

$P_{trans} = P_{ex}/P_{inc} = \dfrac{1-\exp(-2\gamma)}{2\gamma}$ . (6-8)

6.3.1 PSF

Substituting Eq. (6-2) into Eq. (2-4), the irradiance distribution in the image plane in units of $P_{ex}S_{ex}/\lambda^2R^2$ may be written

$I(r,\theta_i;\gamma) = \pi^{-2}\left|\int_0^1\!\!\int_0^{2\pi}\sqrt{I(\rho)}\,\exp[-\pi i r\rho\cos(\theta_i-\theta)]\,\rho\,d\rho\,d\theta\right|^2$ . (6-9)

Carrying out the angular integration, we obtain

$I(r;\gamma) = 4\left[\int_0^1\sqrt{I(\rho)}\,J_0(\pi r\rho)\,\rho\,d\rho\right]^2$ . (6-10)

Its central value is

$I(0;\gamma) = \tanh(\gamma/2)/(\gamma/2)$ . (6-11)

For large values of $\gamma$, a pupil is said to be weakly truncated. For such a pupil,

$I(0;\gamma) \to 2/\gamma$ . (6-12)

The fractional power in the image plane contained in a circle of radius $r_c$ is given by

$P(r_c;\gamma) = (\pi^2/2)\int_0^{r_c} I(r;\gamma)\,r\,dr$ , (6-13)

where $r_c$ is in units of $\lambda F$.


Figure 6-1 shows the image-plane irradiance and encircled-power distributions for $\sqrt{\gamma} = 0$, 1, 2, and 3. It is evident that the Gaussian illumination reduces the central value and broadens the central bright spot, but reduces the power in the diffraction rings. For example, when $\gamma = 1$, the central value is 0.924 compared to a value of 1 for a uniform beam. Moreover, the central bright spot has a radius of 1.43 and contains 95.5% of the total power compared to a radius of 1.22 containing 83.8% of the power for a uniform beam. The diffraction rings practically disappear for $\gamma \ge 4$, and the beam propagates as a Gaussian everywhere.
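The central value quoted above follows directly from Eq. (6-11). The sketch below evaluates it in closed form and, as a cross-check, from the radial integral of Eq. (6-10) at $r = 0$. The numerical version assumes that the amplitude entering the integral is $\exp(-\gamma\rho^2)$ normalized so that the pupil irradiance is in units of $P_{ex}/S_{ex}$; that normalization and the function names are assumptions of this sketch.

```python
import math

def central_irradiance(gamma):
    """Eq. (6-11): central irradiance in units of P_ex * S_ex / (lambda^2 R^2)."""
    if gamma == 0.0:
        return 1.0                      # uniform pupil limit
    return math.tanh(gamma / 2.0) / (gamma / 2.0)

def central_irradiance_numeric(gamma, n=20000):
    """Eq. (6-10) at r = 0 by the midpoint rule (assumed amplitude normalization)."""
    norm = math.sqrt(2.0 * gamma / (1.0 - math.exp(-2.0 * gamma)))
    h = 1.0 / n
    s = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        s += norm * math.exp(-gamma * rho * rho) * rho * h
    return 4.0 * s * s

print(f"{central_irradiance(1.0):.3f}")   # compare with the value 0.924 quoted in the text
```

For large $\gamma$ the same function approaches $2/\gamma$, the weakly truncated limit of Eq. (6-12).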

Figure 6-1. PSF and encircled power for a Gaussian pupil with $\sqrt{\gamma} = 0$, 1, 2, and 3. The irradiance is in units of $P_{ex}S_{ex}/\lambda^2R^2$, and the encircled power is in units of $P_{ex}$. r and $r_c$ are in units of $\lambda F$.

6.3.2 Optimum Gaussian Radius

For a given total beam power $P_{inc}$ incident on a pupil of fixed radius a, the transmitted power $P_{ex}$ increases as $\omega$ decreases, but the corresponding central irradiance in the image plane decreases. Hence, there is an optimum value of $\omega$ that yields the maximum central value. To determine this value, we write the central irradiance given by Eq. (6-11) in units of $P_{inc}S_{ex}/\lambda^2R^2$:

$I(0;\gamma) = (2/\gamma)\,[1-\exp(-\gamma)]^2$ . (6-14)

Letting

$\dfrac{\partial I(0;\gamma)}{\partial\gamma} = 0$ , (6-15)

we find that the central irradiance is maximum when $\sqrt{\gamma} = 1.120$ or $\omega = 0.893a$. The corresponding irradiance at the edge of the pupil is 8.1%, and the transmitted power $P_{trans}$ is 91.87%. Figure 6-2 shows how $I(0;\gamma)$ varies with $\sqrt{\gamma}$.
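The optimum can be located numerically. Differentiating Eq. (6-14) and setting the result to zero gives the condition $\exp(\gamma) = 1 + 2\gamma$; this intermediate step is a derivation made for this sketch and is not quoted from the text. The code below solves it by bisection, with a bracketing interval chosen by inspection.

```python
import math

def central_irradiance_fixed_incident(gamma):
    """Eq. (6-14): central irradiance for a fixed incident power."""
    return (2.0 / gamma) * (1.0 - math.exp(-gamma)) ** 2

def optimum_gamma(lo=1.0, hi=2.0, tol=1e-12):
    """Bisection on exp(gamma) - 1 - 2*gamma = 0 (stationarity of Eq. 6-14)."""
    def h(g):
        return math.exp(g) - 1.0 - 2.0 * g
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g_opt = optimum_gamma()
root_gamma = math.sqrt(g_opt)            # truncation ratio a / omega
edge_irradiance = math.exp(-2.0 * g_opt) # pupil-edge irradiance relative to the center
print(f"sqrt(gamma) = {root_gamma:.3f}, omega/a = {1.0 / root_gamma:.3f}, "
      f"edge = {100 * edge_irradiance:.1f}%, transmitted = {100 * (1 - edge_irradiance):.1f}%")
```

The results agree with the values quoted above to within the rounding of the last digit.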

6.3.3 OTF

From Eq. (2-13), the OTF for an aberration-free Gaussian pupil is given by

$\tau(\vec{v}_i;\gamma) = P_{ex}^{-1}\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\,\vec{v}_i)\,d\vec{r}_p$ (6-16)

in the pupil coordinate system $(x_p, y_p)$. Let the spatial frequency vector $\vec{v}_i$ with its Cartesian components $(\xi, \eta)$ make an angle $\phi$ with the $x_p$ axis, as illustrated in Figure 6-3. It is convenient to write the autocorrelation integral in a $(p, q)$ coordinate system whose axes are rotated by an angle $\phi$ with respect to the $(x_p, y_p)$ system (so that the p axis lies along the direction of the spatial frequency vector $\vec{v}_i$) and whose origin lies at a distance $\lambda R v_i/2$ from that of the $(x_p, y_p)$ system along the p axis. If we further let the $(p, q)$ coordinates be normalized by the pupil radius a and the spatial frequency $v_i$ be normalized by the cutoff spatial frequency $1/\lambda F$, the OTF can be written

Figure 6-2. $I(0;\gamma)$ as a function of $\sqrt{\gamma}$, showing that its value is maximum when $\sqrt{\gamma} = 1.120$ or $\omega = 0.893a$.

Figure 6-3. Geometry for evaluating the OTF. The centers of the two pupils are located at $(0, 0)$ and $\lambda R(\xi, \eta)$ in the $(x_p, y_p)$ coordinate system and $\mp(\lambda R/2)(v_i, 0)$ in the $(p, q)$ coordinate system, where $v_i = (\xi^2+\eta^2)^{1/2}$ and $\phi = \tan^{-1}(\eta/\xi)$. The shaded area is the overlap area of the two pupils. When normalized by the pupil radius a, the centers of the two pupils of unity radius lie at $\mp v$ along the p axis.

$\tau(v;\gamma) = (a^2/P_{ex})\int\!\!\int A(p+v, q)\,A(p-v, q)\,dp\,dq$ , $0 \le v \le 1$ . (6-17)

Substituting for the amplitude $A(\rho)$ from Eq. (6-2) and for the power $P_{ex}$ from Eq. (6-5) into Eq. (6-17), we obtain

$\tau(v;\gamma) = \dfrac{8\gamma\exp(-2\gamma v^2)}{\pi[1-\exp(-2\gamma)]}\int_0^{\sqrt{1-v^2}} dq\int_0^{\sqrt{1-q^2}-v}\exp[-2\gamma(p^2+q^2)]\,dp$ , (6-18)

where the integration is over a quadrant of the overlap region of two pupils whose centers are separated by a distance v along the p axis. For large values of $\gamma$ (e.g., $\gamma \ge 4$), the contribution to the integral in Eq. (6-18) is negligible unless $v = 0$, in which case it represents the Gaussian-weighted area of a quadrant of the pupil, and the equation reduces to

$\tau(v;\gamma) = \exp(-2\gamma v^2)$ , $0 \le v \le 1$ . (6-19)

Figure 6-4 shows how the OTF varies with v for several values of $\gamma$. We note that compared to a uniform pupil (i.e., for $\gamma = 0$), the OTF of a Gaussian pupil is higher for low spatial frequencies, and lower for the high. Moreover, as $\gamma$ increases, the bandwidth of low frequencies for which the OTF is higher decreases and the OTF at high frequencies becomes increasingly smaller. This is due to the fact that the Gaussian weighting across the overlap region of two pupils whose centers are separated by small values of v is higher than that for large values of v. If we consider an apodization such that the amplitude increases from the center toward the edge of the pupil, then the OTF is lower for low frequencies and higher for the high. Thus, unlike aberrations, which reduce the MTF of a system at all frequencies within its passband, the amplitude variations can increase or decrease the MTF at any of those frequencies.

Figure 6-4. OTF $\tau(v;\gamma)$ of a Gaussian pupil. A value of $\gamma = 0$ represents a uniform pupil, and a large value of $\gamma$ represents a weakly truncated pupil.
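The behavior described above can be reproduced by evaluating Eq. (6-18) numerically. In the sketch below, the inner integral over p is done in closed form with the error function and the outer integral over q by the midpoint rule; the function name, the step count, and the handling of the $\gamma = 0$ limit are choices made for this illustration.

```python
import math

def otf_gaussian(v, gamma, n=2000):
    """OTF of an aberration-free Gaussian pupil from Eq. (6-18), 0 <= v <= 1."""
    if not 0.0 <= v <= 1.0:
        raise ValueError("normalized spatial frequency must lie in [0, 1]")
    q_max = math.sqrt(1.0 - v * v)
    h = q_max / n
    total = 0.0
    for k in range(n):
        q = (k + 0.5) * h
        p_max = math.sqrt(1.0 - q * q) - v        # upper limit of the p integral
        if gamma == 0.0:
            inner = p_max                          # uniform pupil: integrand is 1
        else:
            s = math.sqrt(2.0 * gamma)
            # integral of exp(-2*gamma*p^2) from 0 to p_max
            inner = math.sqrt(math.pi) / (2.0 * s) * math.erf(s * p_max)
            inner *= math.exp(-2.0 * gamma * q * q)
        total += inner * h
    if gamma == 0.0:
        prefactor = 4.0 / math.pi                  # limit of the prefactor as gamma -> 0
    else:
        prefactor = (8.0 * gamma * math.exp(-2.0 * gamma * v * v)
                     / (math.pi * (1.0 - math.exp(-2.0 * gamma))))
    return prefactor * total

for v in (0.1, 0.8):
    print(v, otf_gaussian(v, 0.0), otf_gaussian(v, 4.0))
```

For $\gamma = 4$ the Gaussian-pupil value exceeds the uniform-pupil value at $v = 0.1$ and falls below it at $v = 0.8$, and for large $\gamma$ the result approaches the Gaussian of Eq. (6-19).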

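6.4 STREHL RATIO AND ABERRATION BALANCING

From Eq. (2-22), the Strehl ratio (representing the ratio of the central irradiances with and without aberration) for a Gaussian pupil is given by [1–3]

$S = \left|\int_0^1\!\!\int_0^{2\pi} A(\rho)\exp[i\Phi(\rho,\theta)]\,\rho\,d\rho\,d\theta\right|^2 \Big/ \left[\int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta\right]^2$

$\;\; = \left\{\dfrac{\gamma}{\pi[1-\exp(-\gamma)]}\right\}^2\left|\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\exp[i\Phi(\rho,\theta)]\,\rho\,d\rho\,d\theta\right|^2$ . (6-20)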

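For a small aberration, it is approximately given by

$S \simeq \exp(-\sigma_\Phi^2)$ , (6-21)

where

$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2$ (6-22)

is the variance of the phase aberration across the Gaussian-amplitude weighted pupil. The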

mean and the mean square values of the aberration are obtained from the expression

$\langle\Phi^n\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)\,[\Phi(\rho,\theta)]^n\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$

$\;\; = \dfrac{\gamma}{\pi[1-\exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,[\Phi(\rho,\theta)]^n\,\rho\,d\rho\,d\theta$ , (6-23)

with n = 1 and 2, respectively. The angular brackets indicate a mean value over the Gaussian pupil.
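For a radially symmetric aberration, the averages of Eq. (6-23) reduce to moments of $t = \rho^2$ weighted by $\exp(-\gamma t)$ on the interval from 0 to 1, which can be generated by a simple recurrence. The following sketch applies this to defocus $B_d\rho^2$; the recurrence and the function names are introduced here for illustration and are not taken from the text.

```python
import math

def weighted_moment(k, gamma):
    """<rho^(2k)> over the pupil with amplitude weight exp(-gamma*rho^2), per Eq. (6-23)."""
    if gamma == 0.0:
        return 1.0 / (k + 1)                  # uniform pupil
    e = math.exp(-gamma)
    m0 = (1.0 - e) / gamma                    # integral of exp(-gamma*t) on [0, 1]
    m = m0
    for i in range(1, k + 1):
        m = (i * m - e) / gamma               # integration by parts
    return m / m0

def sigma_defocus(gamma):
    """Standard deviation of rho^2 (defocus with B_d = 1) over the Gaussian pupil."""
    p2 = weighted_moment(1, gamma)
    p4 = weighted_moment(2, gamma)
    return math.sqrt(p4 - p2 * p2)

for gamma in (0.0, 1.0, 4.0):
    print(f"gamma = {gamma}: sigma = B_d / {1.0 / sigma_defocus(gamma):.2f}")
```

The divisors increase with $\gamma$, i.e., the standard deviation of a given amount of defocus decreases as the beam becomes narrower.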

Table 6-1 lists the primary aberrations and their standard deviations for increasing values of $\gamma$. It is evident that the standard deviation of an aberration decreases as $\gamma$ increases. This is due to the fact that while an aberration increases as $\rho$ increases, the amplitude decreases more and more rapidly as $\gamma$ increases, thus reducing its effect more and more compared to that for a uniform pupil. Accordingly, for a given small amount of aberration $A_i$, the Strehl ratio for a Gaussian pupil is higher than that for a uniform pupil. Similarly, the aberration tolerance for a given Strehl ratio is higher for a Gaussian pupil. Its approximate value can be obtained from Eq. (6-21).

Table 6-1. Primary aberrations and their standard deviations for optical systems with Gaussian pupils. For comparison, the results for a uniform pupil ($\gamma = 0$) are also given.

Primary Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$)
Spherical, $A_s\rho^4$ | $2A_s/3\sqrt{5} = A_s/3.35$ | $A_s/3.67$ | $A_s/6.20$ | $2\sqrt{5}\,A_s/\gamma^2$
Coma, $A_c\rho^3\cos\theta$ | $A_c/2\sqrt{2} = A_c/2.83$ | $A_c/3.33$ | $A_c/6.08$ | $\sqrt{3}\,A_c/\gamma^{3/2}$
Astigmatism, $A_a\rho^2\cos^2\theta$ | $A_a/4$ | $A_a/4.40$ | $A_a/6.59$ | $A_a/\sqrt{2}\,\gamma$
Defocus, $B_d\rho^2$ | $B_d/2\sqrt{3} = B_d/3.46$ | $B_d/3.55$ | $B_d/4.79$ | $B_d/\gamma$
Tilt, $B_t\rho\cos\theta$ | $B_t/2$ | $B_t/2.19$ | $B_t/2.94$ | $B_t/\sqrt{2\gamma}$

Since the Strehl ratio depends on the aberration variance, we balance a given aberration with lower-order aberrations to minimize its variance. Thus, we balance spherical aberration and astigmatism with defocus aberration, and coma with tilt aberration to minimize their variance. The balanced primary aberrations thus obtained are listed in Table 6-2. For example, the defocus aberration that balances spherical aberration is given by $B_d/A_s = -1$, $-0.933$, and $-4/\gamma$ when $\sqrt{\gamma} = 0$, 1, and $\ge 3$, respectively. Similarly, the tilt aberration that balances coma for these values of $\gamma$ is given by $B_t/A_c = -(2/3)$, $-0.608$, and $-2/\gamma$, respectively. The defocus coefficient given by $B_d = -A_a/2$ to balance astigmatism is independent of the value of $\gamma$.
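The balancing defocus for spherical aberration can be obtained as a least-squares problem: the variance of $A_s\rho^4 + B_d\rho^2$ over the Gaussian-weighted pupil is minimized when $B_d/A_s$ equals minus the covariance of $\rho^4$ and $\rho^2$ divided by the variance of $\rho^2$. The sketch below uses this standard result together with a moment recurrence; both, and all function names, are introduced here for illustration rather than quoted from the text.

```python
import math

def moment(k, gamma):
    """<rho^(2k)> with amplitude weight exp(-gamma*rho^2) over the unit pupil."""
    if gamma == 0.0:
        return 1.0 / (k + 1)
    e = math.exp(-gamma)
    m0 = (1.0 - e) / gamma
    m = m0
    for i in range(1, k + 1):
        m = (i * m - e) / gamma
    return m / m0

def balancing_defocus(gamma):
    """B_d / A_s that minimizes the variance of A_s*rho^4 + B_d*rho^2."""
    p2, p4, p6 = moment(1, gamma), moment(2, gamma), moment(3, gamma)
    return -(p6 - p2 * p4) / (p4 - p2 * p2)

def sigma_balanced_spherical(gamma):
    """Standard deviation of balanced spherical aberration for A_s = 1."""
    p2, p4, p6, p8 = (moment(k, gamma) for k in (1, 2, 3, 4))
    var = (p8 - p4 * p4) - (p6 - p2 * p4) ** 2 / (p4 - p2 * p2)
    return math.sqrt(var)

for gamma in (0.0, 1.0, 4.0):
    print(f"gamma = {gamma}: B_d/A_s = {balancing_defocus(gamma):.3f}, "
          f"sigma = A_s / {1.0 / sigma_balanced_spherical(gamma):.2f}")
```

The ratios for $\gamma = 0$ and 1 match the values quoted above, and for large $\gamma$ the ratio tends to $-4/\gamma$.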

The standard deviations of the balanced primary aberrations are given in Table 6-3.

The factor by which the standard deviation of a primary aberration is reduced by

balancing it with another is listed in Table 6-4. The diffraction focus representing the

point of maximum irradiance for a small aberration is listed in Table 6-5. We note that,

although aberration balancing in the case of a uniform pupil reduces the standard

deviation of spherical aberration and coma by factors of 4 and 3, respectively, the

reduction in the case of astigmatism is only a factor of 1.22. For a Gaussian pupil, the

trend is similar but the reduction factors are smaller for spherical aberration and coma,

and are larger for astigmatism. For a Gaussian beam with $\gamma = 1$, they are 3.74, 2.64, and 1.27, corresponding to spherical aberration, coma, and astigmatism, respectively. In

Section 6.6, the balanced aberrations are identified with the Gaussian polynomials

discussed in Section 6.5.

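Table 6-2. Balanced primary aberrations.

Balanced Aberration | $\Phi(\rho,\theta;\gamma = 0)$ | $\Phi(\rho,\theta;\gamma = 1)$ | $\Phi(\rho,\theta;\sqrt{\gamma} = 2)$ | $\Phi(\rho,\theta;\sqrt{\gamma} \ge 3)$
Spherical | $A_s(\rho^4 - \rho^2)$ | $A_s(\rho^4 - 0.933\rho^2)$ | $A_s(\rho^4 - 0.728\rho^2)$ | $A_s[\rho^4 - (4/\gamma)\rho^2]$
Coma | $A_c[\rho^3 - (2/3)\rho]\cos\theta$ | $A_c(\rho^3 - 0.608\rho)\cos\theta$ | $A_c(\rho^3 - 0.419\rho)\cos\theta$ | $A_c[\rho^3 - (2/\gamma)\rho]\cos\theta$
Astigmatism | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$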

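Table 6-3. Standard deviations of balanced primary aberrations.

Balanced Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$)
Spherical | $A_s/6\sqrt{5} = A_s/13.42$ | $A_s/13.71$ | $A_s/18.29$ | $2A_s/\gamma^2$
Coma | $A_c/6\sqrt{2} = A_c/8.49$ | $A_c/8.80$ | $A_c/12.21$ | $A_c/\gamma^{3/2}$
Astigmatism | $A_a/2\sqrt{6} = A_a/4.90$ | $A_a/5.61$ | $A_a/9.08$ | $A_a/2\gamma$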

Table 6-4. Factor by which the standard deviation of a Seidel aberration across an aperture is reduced when it is optimally balanced with other aberrations.

Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$)
Spherical | 4 | 3.74 | 2.95 | $\sqrt{5} = 2.24$

Table 6-5. Diffraction focus.

Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$)
Spherical | $(0, 0, 8F^2A_s)$ | $(0, 0, 7.46F^2A_s)$ | $(0, 0, 5.82F^2A_s)$ | $(0, 0, (32/\gamma)F^2A_s)$
Astigmatism | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$

6.5 ORTHONORMALIZATION OF ZERNIKE CIRCLE POLYNOMIALS OVER A GAUSSIAN CIRCULAR PUPIL

The Gaussian circle polynomials $G_j(\rho,\theta;\gamma)$ orthonormal over a Gaussian pupil can be obtained recursively from the Zernike circle polynomials $Z_j(\rho,\theta)$ discussed in Chapter 4, starting with $G_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to

$G_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}G_k\rangle\,G_k\right]$ , (6-24)

angular brackets indicate a mean value over the Gaussian pupil. Thus

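$\langle Z_{j+1}G_k\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)\,Z_{j+1}G_k\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$

$\;\; = \dfrac{\gamma}{\pi[1-\exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,Z_{j+1}G_k\,\rho\,d\rho\,d\theta$ . (6-25)

The polynomials obey the orthonormality condition

$\langle G_jG_{j'}\rangle = \int_0^1\!\!\int_0^{2\pi} A(\rho)\,G_jG_{j'}\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta$

$\;\; = \dfrac{\gamma}{\pi[1-\exp(-\gamma)]}\int_0^1\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,G_jG_{j'}\,\rho\,d\rho\,d\theta$

$\;\; = \delta_{jj'}$ . (6-26)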

Now a circle polynomial $Z_j$ varies with the angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending on whether j is even or odd. It is radially symmetric when m = 0. Because of the orthogonal properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)], the polynomials $G_k$ that contribute to the sum in Eq. (6-24) must also have the same angular dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $G_{j+1}$ will also have the same angular dependence. Thus, a Gaussian polynomial $G_j$ is separable in polar coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the Gaussian polynomials can accordingly be written

where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho;\gamma)$ is a Gaussian radial polynomial.

Substituting Eqs. (6-27a)–(6-27c) into the orthonormality Eq. (6-26), we find that the Gaussian radial polynomials obey the orthogonality condition [1]

$\int_0^1 R_n^m(\rho;\gamma)\,R_{n'}^m(\rho;\gamma)\,A(\rho)\,\rho\,d\rho \Big/ \int_0^1 A(\rho)\,\rho\,d\rho = \dfrac{1}{n+1}\,\delta_{nn'}$ . (6-28)

Writing Eq. (6-24) in terms of two-index polynomials given by Eqs. (6-27a)–(6-27c) and substituting these equations into it, as was done in Chapter 5 for the annular polynomials, we find that the Gaussian radial polynomials are given by

$R_n^m(\rho;\gamma) = M_n^m\left[R_n^m(\rho) - \sum_{i\ge1}^{(n-m)/2}(n-2i+1)\,\langle R_n^m(\rho)\,R_{n-2i}^m(\rho;\gamma)\rangle\,R_{n-2i}^m(\rho;\gamma)\right]$ , (6-29)

where

$\langle R_n^m(\rho)\,R_{n-2i}^m(\rho;\gamma)\rangle = \int_0^1 R_n^m(\rho)\,R_{n-2i}^m(\rho;\gamma)\,A(\rho)\,\rho\,d\rho \Big/ \int_0^1 A(\rho)\,\rho\,d\rho$ . (6-30)

determined from the orthogonality Eq. (6-28) of the radial polynomials. Note that except for the normalization constant, the radial polynomial $R_n^n(\rho;\gamma)$ is identical to the corresponding polynomial for a uniformly illuminated circular pupil $R_n^n(\rho)$, i.e.,

$\rho^{n-2}$, ..., and $\rho^m$, whose coefficients depend on the Gaussian amplitude through $\gamma$, i.e., it has the form

$R_n^m(\rho;\gamma) = a_n^m\rho^n + b_n^m\rho^{n-2} + \cdots + d_n^m\rho^m$ , (6-32)

where the coefficients $a_n^m$, etc., depend on $\gamma$. The radial polynomials are even or odd in $\rho$ depending on whether n (or m) is even or odd.

certain order n, and the relationships among the indices n, m, and j are the same as

discussed for circle polynomials in Chapter 4. Moreover, a Gaussian circle polynomial $G_j(\rho,\theta;\gamma)$ reduces to the corresponding circle polynomial $Z_j(\rho,\theta)$ as $\gamma \to 0$. The

Gaussian circle polynomials are also unique like the circle polynomials. They are not

only orthogonal over a Gaussian circular pupil, but they also include wavefront tilt and

defocus and balanced classical aberrations as members of the polynomial set.

6.6 GAUSSIAN CIRCLE POLYNOMIALS REPRESENTING BALANCED PRIMARY ABERRATIONS FOR A GAUSSIAN CIRCULAR PUPIL

Table 6-6. The column “Gaussian” is for any value of $\gamma$, and the column “Weakly Truncated Gaussian” is for its large values. It can be seen that the balancing defocus for spherical aberration given by $B_d = (b_4^0/a_4^0)A_s$ and the balancing tilt for coma given by $B_t = (b_3^1/a_3^1)A_c$ are in agreement with the corresponding values given in Table 6-2. For example, the relative balancing defocus in the case of spherical aberration from Table 6-6 for $\gamma = 1$ is $-5.71948/6.12902$, which is the same as $-0.933$ in Table 6-2. From the form of the Gaussian circle polynomial $R_2^2(\rho;\gamma)\cos 2\theta$ representing balanced astigmatism and varying as $\rho^2\cos 2\theta$, it is evident that the balancing defocus of $-(1/2)\rho^2$ for astigmatism $\rho^2\cos^2\theta$ is independent of the value of $\gamma$. Similarly, comparing the form of a balanced primary aberration with the corresponding Gaussian polynomial, we can immediately write its standard deviation. Thus, we can see that the sigma values $A_s/\sqrt{5}\,a_4^0$, $A_c/2\sqrt{2}\,a_3^1$, and $A_a/2\sqrt{6}\,a_2^2$ of balanced spherical aberration, coma, and astigmatism, respectively, are in agreement with their values given in Table 6-3. For example, the balanced aberration for spherical aberration $A_s\rho^4$ can be written

$W(\rho,\theta;\gamma) = \dfrac{A_s}{a_4^0}\left(a_4^0\rho^4 + b_4^0\rho^2 + c_4^0\right) = \dfrac{A_s}{\sqrt{5}\,a_4^0}\,G_4^0(\rho,\theta;\gamma)$ . (6-33)

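for Gaussian beams. Polynomials for special cases of $\gamma = 0$ (corresponding to a uniform beam), $\gamma = 1$, and weakly truncated Gaussian beams are also given.

Aberration | Radial polynomial | Gaussian* | $\gamma = 1$ | $\gamma = 0$ | Weakly truncated Gaussian
Piston | $R_0^0$ | 1 | 1 | 1 | 1
Distortion (tilt) | $R_1^1$ | $a_1^1\rho$ | $1.09367\rho$ | $\rho$ | $\sqrt{\gamma/2}\,\rho$
Field curvature (defocus) | $R_2^0$ | $a_2^0\rho^2 + b_2^0$ | $2.04989\rho^2 - 0.85690$ | $2\rho^2 - 1$ | $(\gamma\rho^2 - 1)/\sqrt{3}$
Astigmatism | $R_2^2$ | $a_2^2\rho^2$ | $1.14541\rho^2$ | $\rho^2$ | $(\gamma/\sqrt{6})\rho^2$
Coma | $R_3^1$ | $a_3^1\rho^3 + b_3^1\rho$ | $3.11213\rho^3 - 1.89152\rho$ | $3\rho^3 - 2\rho$ | $\sqrt{\gamma/2}\,[(\gamma/2)\rho^3 - \rho]$
Spherical aberration | $R_4^0$ | $a_4^0\rho^4 + b_4^0\rho^2 + c_4^0$ | $6.12902\rho^4 - 5.71948\rho^2 + 0.83368$ | $6\rho^4 - 6\rho^2 + 1$ | $(\gamma^2\rho^4 - 4\gamma\rho^2 + 2)/(2\sqrt{5})$

*$a_1^1 = (2p_2)^{-1/2}$, $a_2^0 = [3(p_4 - p_2^2)]^{-1/2}$, $b_2^0 = -p_2a_2^0$, $a_2^2 = (3p_4)^{-1/2}$, $a_3^1 = \frac{1}{2}(p_6 - p_4^2/p_2)^{-1/2}$, $b_3^1 = -(p_4/p_2)a_3^1$,
$a_4^0 = \{5[p_8 - 2K_1p_6 + (K_1^2 + 2K_2)p_4 - 2K_1K_2p_2 + K_2^2]\}^{-1/2}$, $b_4^0 = -K_1a_4^0$, $c_4^0 = K_2a_4^0$,
$p_0 = 1$, $K_1 = (p_6 - p_2p_4)/(p_4 - p_2^2)$, $K_2 = (p_2p_6 - p_4^2)/(p_4 - p_2^2)$.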

of the balanced aberration. The balancing defocus is, of course, $A_sb_4^0/a_4^0$. As a numerical example, it yields a sigma value of $A_s/13.71$ for $\gamma = 1$, the same as in Table 6-3. The corresponding balancing defocus is $-0.933A_s$, as expected.

For a weakly truncated Gaussian pupil, we can let the upper limit of the radial integration approach infinity with negligible error. Thus, Eq. (6-20) for the Strehl ratio and Eq. (6-23) for the mean and mean square values of the aberration may be written [1]

$$S = \left(\frac{\gamma}{\pi}\right)^2\left|\int_0^\infty\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\exp[i\Phi(\rho,\theta)]\,\rho\,d\rho\,d\theta\right|^2 \tag{6-34}$$

and

$$\langle\Phi^n\rangle = \frac{\gamma}{\pi}\int_0^\infty\!\!\int_0^{2\pi}\exp(-\gamma\rho^2)\,[\Phi(\rho,\theta)]^n\,\rho\,d\rho\,d\theta , \tag{6-35}$$

respectively.

The standard deviation of a primary aberration for a large value of $\gamma$ can be obtained by calculating its mean and mean square values according to Eq. (6-36). The results thus obtained are given in the last column of Table 6-1. The corresponding balanced aberrations and their standard deviations are similarly given in Tables 6-2 and 6-3, respectively. The balancing of an aberration reduces the standard deviation by a factor of $\sqrt{5}$, $\sqrt{3}$, and $\sqrt{2}$ in the case of spherical aberration, coma, and astigmatism, respectively, as noted in Table 6-4. The diffraction focus for these aberrations is listed in Table 6-5. The amount of balancing aberration decreases as $\gamma$ increases in the case of spherical aberration and coma, but does not change in the case of astigmatism. For example, in the case of spherical aberration, the amount of balancing defocus for a weakly truncated Gaussian beam is $(4/\gamma)$ times the corresponding amount for a uniform beam. Similarly, in the case of coma, the balancing tilt for a weakly truncated Gaussian beam is $(3/\gamma)$ times the corresponding amount for a uniform beam. The location of the diffraction focus is independent of the value of $\gamma$ in the case of astigmatism, since the balancing defocus is the same regardless of the value of $\gamma$. Compared to the peak value of an aberration, its standard deviation is smaller by a factor of $\gamma^2/2$, $\gamma^{3/2}$, and $2\gamma$ in the case of spherical aberration, coma, and astigmatism, respectively.

When a Gaussian beam is weakly truncated, i.e., when $\gamma$ is large, the quantity $p_s$ in Table 6-6 reduces to

$$p_s = \langle\rho^s\rangle = \frac{s}{2\gamma}\,p_{s-2} = \frac{(s/2)!}{\gamma^{s/2}} . \tag{6-36}$$

As a result, we obtain simple expressions for the radial polynomials, which are listed in the last column in Table 6-6. They are similar to Laguerre polynomials [4]. If we normalize the radial coordinate $r$ of a point on the pupil by $\omega$ (instead of by $a$), then $\gamma$ disappears from these expressions. Since the power in a weakly truncated Gaussian beam is concentrated in a small region near the center of the pupil, the effect of the aberration in its outer region is negligible. Accordingly, the aberration tolerances in terms of the peak value of the aberration at the edge of the pupil ($\rho = 1$) may not be very meaningful. They may instead be defined in terms of their value at the Gaussian radius [1].
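The reduction factors quoted above for a weakly truncated beam can be checked numerically. The following is a minimal sketch, assuming the moments of Eq. (6-36), the coefficient formulas in the footnote of Table 6-6, and the angular averages $\langle\cos^2\theta\rangle = 1/2$ and $\langle\cos^4\theta\rangle = 3/8$; the function names are illustrative and not from the text.

```python
import math

def p_weak(s, gamma):
    """Moment p_s = <rho^s> = (s/2)! / gamma**(s/2) of Eq. (6-36), s even."""
    return math.factorial(s // 2) / gamma ** (s / 2)

def sigma_reduction_factors(gamma):
    """Ratio of classical to balanced standard deviation for unit aberration
    coefficients: (spherical, coma, astigmatism)."""
    p2, p4, p6, p8 = (p_weak(s, gamma) for s in (2, 4, 6, 8))
    d = p4 - p2 ** 2
    k1 = (p6 - p2 * p4) / d
    k2 = (p2 * p6 - p4 ** 2) / d
    # Leading coefficients from the footnote of Table 6-6.
    a22 = (3.0 * p4) ** -0.5
    a31 = 0.5 * (p6 - p4 ** 2 / p2) ** -0.5
    a40 = (5.0 * (p8 - 2 * k1 * p6 + (k1 ** 2 + 2 * k2) * p4
                  - 2 * k1 * k2 * p2 + k2 ** 2)) ** -0.5
    # Classical aberrations rho^4, rho^3 cos(theta), rho^2 cos^2(theta).
    sig_s = math.sqrt(p8 - p4 ** 2)
    sig_c = math.sqrt(p6 / 2.0)
    sig_a = math.sqrt(3.0 * p4 / 8.0 - p2 ** 2 / 4.0)
    # Balanced aberrations: sigma values quoted with Table 6-3.
    bal_s = 1.0 / (math.sqrt(5.0) * a40)
    bal_c = 1.0 / (2.0 * math.sqrt(2.0) * a31)
    bal_a = 1.0 / (2.0 * math.sqrt(6.0) * a22)
    return sig_s / bal_s, sig_c / bal_c, sig_a / bal_a

print(sigma_reduction_factors(9.0))
```

Under these assumptions the three ratios do not depend on the value of $\gamma$ passed in, which is consistent with the constant factors stated in the text.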

ABERRATION FUNCTION

The aberration function $W(\rho,\theta;\gamma)$ across a Gaussian circular pupil can be expanded in terms of a complete set of orthonormal Gaussian circle polynomials $G_j(\rho,\theta;\gamma)$ in the form

$$W(\rho,\theta;\gamma) = \sum_{j=1}^{J} a_j\,G_j(\rho,\theta;\gamma) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-37}$$

where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-37) by $G_{j'}(\rho,\theta;\gamma)$, integrating over the Gaussian pupil, and using the orthonormality Eq. (6-26), we obtain the Gaussian circle expansion coefficients:

$$a_j = \int_0^1\!\!\int_0^{2\pi} W(\rho,\theta;\gamma)\,G_j(\rho,\theta;\gamma)\,A(\rho)\,\rho\,d\rho\,d\theta \bigg/ 2\pi\int_0^1 A(\rho)\,\rho\,d\rho . \tag{6-38}$$

The mean and mean square values of the aberration function are given by

$$\langle W(\rho,\theta;\gamma)\rangle = a_1 \tag{6-39}$$

and

$$\langle W^2(\rho,\theta;\gamma)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-40}$$

The aberration variance is accordingly given by

$$\sigma_W^2 = \langle W^2(\rho,\theta;\gamma)\rangle - \langle W(\rho,\theta;\gamma)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-41}$$

6.9 ORTHONORMALIZATION OF ANNULAR POLYNOMIALS OVER A GAUSSIAN ANNULAR PUPIL

The balanced aberrations for an annular Gaussian pupil with an obscuration ratio $\epsilon$ can be obtained in a manner similar to those for a circular pupil, except that the lower limit of zero in the radial integration is replaced by $\epsilon$. The Gaussian annular polynomials $G_j(\rho,\theta;\gamma;\epsilon)$ orthonormal over a Gaussian annular pupil can be obtained recursively from the annular polynomials $A_j(\rho,\theta;\epsilon)$, starting with $G_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to

$$G_{j+1} = N_{j+1}\left[A_{j+1} - \sum_{k=1}^{j}\langle A_{j+1}G_k\rangle\,G_k\right] , \tag{6-42}$$

where $N_{j+1}$ is a normalization constant and the angular brackets indicate a mean value over the Gaussian annular pupil. Thus

$$\langle A_{j+1}G_k\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,A_{j+1}G_k\,\rho\,d\rho\,d\theta \bigg/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta . \tag{6-43}$$

The Gaussian annular polynomials obey the orthonormality condition

$$\langle G_jG_{j'}\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,G_jG_{j'}\,\rho\,d\rho\,d\theta \bigg/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{6-44}$$

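Applying the same reasoning as in the case of Gaussian circle polynomials, we find that the polynomial $G_j(\rho,\theta;\gamma;\epsilon)$ also has the same angular dependence as an annular polynomial $A_j(\rho,\theta;\epsilon)$. Thus, a Gaussian annular polynomial $G_j$ is separable in polar coordinates $\rho$ and $\theta$, and differs from the corresponding annular polynomial only in its radial dependence. Given the form of the annular polynomials by Eqs. (5-17a)–(5-17c), the Gaussian annular polynomials can accordingly be written

$$G_{\text{even}\,j}(\rho,\theta;\gamma;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\gamma;\epsilon)\cos m\theta , \quad m \ne 0 , \tag{6-45a}$$

$$G_{\text{odd}\,j}(\rho,\theta;\gamma;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\gamma;\epsilon)\sin m\theta , \quad m \ne 0 , \tag{6-45b}$$

$$G_j(\rho,\theta;\gamma;\epsilon) = \sqrt{n+1}\,R_n^0(\rho;\gamma;\epsilon) , \quad m = 0 , \tag{6-45c}$$

where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho;\gamma;\epsilon)$ is a Gaussian annular radial polynomial.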

Substituting Eqs. (6-45a)–(6-45c) into the orthonormality Eq. (6-44), we find that the Gaussian annular radial polynomials obey the orthogonality condition [1,3]

$$\int_\epsilon^1 R_n^m(\rho;\gamma;\epsilon)\,R_{n'}^m(\rho;\gamma;\epsilon)\,A(\rho)\,\rho\,d\rho \bigg/ \int_\epsilon^1 A(\rho)\,\rho\,d\rho = \frac{1}{n+1}\,\delta_{nn'} . \tag{6-46}$$

Writing Eq. (6-42) in terms of two-index polynomials given by Eqs. (6-45a)–(6-45c) and substituting these equations into it, as was done in Chapter 5 for the annular polynomials, we obtain

$$R_n^m(\rho;\gamma;\epsilon) = M_n^m\left[R_n^m(\rho;\epsilon) - \sum_{i\ge 1}^{(n-m)/2}(n-2i+1)\,\langle R_n^m(\rho;\epsilon)\,R_{n-2i}^m(\rho;\gamma;\epsilon)\rangle\,R_{n-2i}^m(\rho;\gamma;\epsilon)\right] , \tag{6-47}$$

where the angular brackets indicate an average over the annular Gaussian pupil; i.e.,

$$\langle R_n^m(\rho;\epsilon)\,R_{n-2i}^m(\rho;\gamma;\epsilon)\rangle = \int_\epsilon^1 R_n^m(\rho;\epsilon)\,R_{n-2i}^m(\rho;\gamma;\epsilon)\,A(\rho)\,\rho\,d\rho \bigg/ \int_\epsilon^1 A(\rho)\,\rho\,d\rho . \tag{6-48}$$

determined from the orthogonality Eq. (6-46) of the radial polynomials. Note that the radial polynomial $R_n^n(\rho;\gamma;\epsilon)$ is identical to the corresponding polynomial for a uniformly illuminated annular pupil $R_n^n(\rho;\epsilon)$, except for the normalization constant, i.e.,

$\rho^n$, $\rho^{n-2}$, ..., and $\rho^m$ whose coefficients depend on the Gaussian amplitude through $\gamma$, i.e., it has the form

$$R_n^m(\rho;\gamma;\epsilon) = a_n^m\rho^n + b_n^m\rho^{n-2} + \cdots + d_n^m\rho^m , \tag{6-50}$$

certain order n, and the relationships among the indices n, m, and j are the same as those discussed for the Zernike circle polynomials in Chapter 4, or the annular polynomials in Chapter 5. Moreover, a Gaussian annular polynomial $G_j(\rho,\theta;\gamma;\epsilon)$ reduces to the corresponding annular polynomial $A_j(\rho,\theta;\epsilon)$ as $\gamma \to 0$. The Gaussian annular polynomials are also unique like the Gaussian circle polynomials. They are not only orthogonal over a Gaussian annular pupil, but also include wavefront tilt and defocus and balanced classical aberrations as members of the polynomial set.

PRIMARY ABERRATIONS FOR A GAUSSIAN ANNULAR PUPIL

The radial annular polynomials $R_n^m(\rho;\gamma;\epsilon)$ for the balanced primary aberrations are given by the same expressions as for the circle radial polynomials in Table 6-6 except that now

$$p_s = \langle\rho^s\rangle = \frac{\epsilon^s\exp[\gamma(1-\epsilon^2)] - 1}{\exp[\gamma(1-\epsilon^2)] - 1} + \frac{s}{2\gamma}\,p_{s-2} . \tag{6-51}$$

Using these expressions, numerical results for the coefficients of the terms of a radial polynomial for any values of $\gamma$ and $\epsilon$ can be obtained.
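The calculation just described can be sketched in a few lines. This is a minimal illustration, assuming the recursion of Eq. (6-51) with $p_0 = 1$ and the coefficient formulas in the footnote of Table 6-6; the uniform-illumination branch for $\gamma = 0$ uses the direct integral with $A(\rho) = 1$, which is an assumption added here to avoid the division by $\gamma$. The function and key names are illustrative.

```python
import math

def moments(gamma, eps=0.0, s_max=8):
    """Even moments p_s = <rho^s>, s = 0, 2, ..., s_max, over a Gaussian
    annular pupil with obscuration ratio eps."""
    p = {0: 1.0}
    if gamma == 0.0:
        # Assumed uniform limit: direct integration with A(rho) = 1.
        for s in range(2, s_max + 1, 2):
            p[s] = 2.0 * (1.0 - eps ** (s + 2)) / ((s + 2) * (1.0 - eps ** 2))
        return p
    e = math.exp(gamma * (1.0 - eps ** 2))
    for s in range(2, s_max + 1, 2):
        # Recursion of Eq. (6-51).
        p[s] = (eps ** s * e - 1.0) / (e - 1.0) + s / (2.0 * gamma) * p[s - 2]
    return p

def primary_coefficients(gamma, eps=0.0):
    """Coefficients of the radial polynomials for the primary aberrations,
    using the footnote formulas of Table 6-6."""
    p = moments(gamma, eps)
    p2, p4, p6, p8 = p[2], p[4], p[6], p[8]
    d = p4 - p2 ** 2
    k1 = (p6 - p2 * p4) / d
    k2 = (p2 * p6 - p4 ** 2) / d
    a20 = (3.0 * d) ** -0.5
    a31 = 0.5 * (p6 - p4 ** 2 / p2) ** -0.5
    a40 = (5.0 * (p8 - 2 * k1 * p6 + (k1 ** 2 + 2 * k2) * p4
                  - 2 * k1 * k2 * p2 + k2 ** 2)) ** -0.5
    return {
        "a11": (2.0 * p2) ** -0.5,
        "a20": a20, "b20": -p2 * a20,
        "a22": (3.0 * p4) ** -0.5,
        "a31": a31, "b31": -(p4 / p2) * a31,
        "a40": a40, "b40": -k1 * a40, "c40": k2 * a40,
    }

for eps in (0.0, 0.25, 0.50, 0.75, 0.90):
    c = primary_coefficients(1.0, eps)
    print(eps, " ".join(f"{c[k]:.5f}" for k in c))
```

For $\gamma = 1$ and $\epsilon = 0$ this sketch gives the $\gamma = 1$ coefficients listed in Table 6-6, and for $\gamma = 0$, $\epsilon = 0$ it gives the Zernike radial coefficients.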

The coefficients for $\gamma = 1$ and $\epsilon$ = 0, 0.25, 0.50, 0.75, and 0.90 are given in Table 6-7. For comparison, the coefficients for a uniformly illuminated pupil, i.e., for $\gamma = 0$, are given in parentheses in this table. An increase (decrease) in the value of a coefficient $a_n^m$ of an orthogonal aberration $R_n^m(\rho;\gamma;\epsilon)\cos m\theta$ implies a decrease (increase) in the value of $\sigma_\Phi$ for a given amount of the corresponding classical aberration. This, in turn, implies that for small aberrations, the system performance as measured by the Strehl ratio is less (more) sensitive to that classical aberration when balanced with other classical aberrations to form an orthogonal aberration. Thus, as $\epsilon$ increases, irrespective of the value of $\gamma$, the system becomes less sensitive to field curvature (defocus) and spherical aberration but more sensitive to distortion (tilt) and astigmatism. In the case of coma, it first becomes slightly more sensitive but is much less sensitive for larger values of $\epsilon$. As $\gamma$ increases, i.e., as the width of the Gaussian illumination becomes narrower, the system becomes less sensitive to all classical primary aberrations. Although the results for $\gamma = 0$ and $\gamma = 1$ only are given in Table 6-7, the coefficients for $0 \le \gamma \le 3$ show that the differences between the coefficients for uniform and Gaussian illumination are small, and they decrease as $\epsilon$ increases and increase as $\gamma$ increases. This is understandable because as $\epsilon$ increases or $\gamma$ decreases, the difference between the two illuminations decreases.

Table 6-7. Coefficients of terms in Gaussian radial polynomials $R_n^m(\rho;\gamma;\epsilon)$ for $\gamma = 1$. The numbers given in parentheses are the corresponding coefficients for uniform illumination.

$\epsilon$ | $a_1^1$ | $a_2^0$ | $b_2^0$ | $a_2^2$ | $a_3^1$ | $b_3^1$ | $a_4^0$ | $b_4^0$ | $c_4^0$
0.00 | 1.09367 | 2.04989 | –0.85690 | 1.14541 | 3.11213 | –1.89152 | 6.12902 | –5.71948 | 0.83368
0.25 | 1.04364 | 2.18012 | –1.00080 | 1.08940 | 3.01573 | –1.84513 | 6.95563 | –6.98197 | 1.25153
0.50 | 0.92963 | 2.70412 | –1.56449 | 0.93620 | 3.14319 | –2.06618 | 10.79549 | –13.08900 | 3.46706
0.75 | 0.80827 | 4.59329 | –3.51548 | 0.74439 | 4.55179 | –3.57767 | 31.47560 | –48.77879 | 18.39840
0.90 | 0.74453 | 10.53581 | –9.50324 | 0.63890 | 9.60573 | –8.69629 | 166.33359 | –300.66342 | 135.36926

6.11 ABERRATION COEFFICIENTS OF A GAUSSIAN ANNULAR ABERRATION FUNCTION

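The aberration function $W(\rho,\theta;\gamma;\epsilon)$ across a Gaussian annular pupil can be expanded in terms of a complete set of orthonormal Gaussian annular polynomials $G_j(\rho,\theta;\gamma;\epsilon)$ in the form

$$W(\rho,\theta;\gamma;\epsilon) = \sum_{j=1}^{J} a_j\,G_j(\rho,\theta;\gamma;\epsilon) , \quad \epsilon \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-52}$$

where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-52) by $G_{j'}(\rho,\theta;\gamma;\epsilon)$, integrating over the Gaussian pupil, and using the orthonormality Eq. (6-44), we obtain the Gaussian annular expansion coefficients:

$$a_j = \int_\epsilon^1\!\!\int_0^{2\pi} W(\rho,\theta;\gamma;\epsilon)\,G_j(\rho,\theta;\gamma;\epsilon)\,A(\rho)\,\rho\,d\rho\,d\theta \bigg/ 2\pi\int_\epsilon^1 A(\rho)\,\rho\,d\rho . \tag{6-53}$$

The mean and mean square values of the aberration function are given by

$$\langle W(\rho,\theta;\gamma;\epsilon)\rangle = a_1 \tag{6-54}$$

and

$$\langle W^2(\rho,\theta;\gamma;\epsilon)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-55}$$

The aberration variance is accordingly given by

$$\sigma^2 = \langle W^2(\rho,\theta;\gamma;\epsilon)\rangle - \langle W(\rho,\theta;\gamma;\epsilon)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-56}$$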

6.12 SUMMARY

A pupil with Gaussian illumination is called a Gaussian pupil. The Gaussian

illumination may be due to a filter with Gaussian transmission placed at the pupil or due

to a laser beam with Gaussian amplitude distribution. The illumination is characterized by a truncation ratio $\sqrt{\gamma} = a/\omega$, where $a$ is the pupil radius and $\omega$ is the radial distance, called the Gaussian radius, where the amplitude is $1/e$ times its central value.

The aberration-free image for a system with a Gaussian pupil shows that the

Gaussian illumination reduces the central value, broadens the central bright spot, but

reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,

the OTF is higher for low spatial frequencies, and lower for the high. The diffraction

rings practically disappear when the pupil radius is twice the Gaussian radius, and the

beam propagates as a Gaussian everywhere. The OTF in this case is also described by a

Gaussian function.


The Strehl ratio for a small aberration can be estimated from its variance calculated

over the Gaussian amplitude-weighted pupil. The aberration variance decreases, and,

therefore, its tolerance increases as the truncation ratio increases (see Tables 6-1 and 6-3),

because the amplitude decreases as the aberration increases with the radial distance from

the center.

The Gaussian polynomials orthonormal over a Gaussian circular pupil are obtained

by orthonormalizing the Zernike circle polynomials over a corresponding Gaussian

amplitude-weighted pupil. They are given in Table 6-6 for the primary aberrations for $\gamma = 1$. For a weakly truncated pupil, i.e., for large values of $\gamma$, the polynomials have a simple analytical form similar to Laguerre polynomials, as shown in the last column in Table 6-6.

The orthonormal Gaussian annular polynomials for Gaussian annular pupils can be

obtained by orthonormalizing the annular polynomials. The polynomial ordering is

exactly the same as that for the circle or the annular polynomials.


References

Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).

diffraction, obscuration, and aberrations,” J. Opt. Soc. Am. A3, 470–485 (1986).

3. V. N. Mahajan, “Strehl ratio of a Gaussian beam,” J. Opt. Soc. Am. A22, 1824–

1833 (2005).

(McGraw-Hill, New York, 1968).

Optics, 49, 1–96, (2006).


Chapter 7

Systems with Hexagonal Pupils

7.1 INTRODUCTION

Although most optical imaging systems have a circular or an annular pupil, with or

without Gaussian illumination, there are times when the wavefront or the interferogram is

hexagonal. This is most notable for the primary mirrors of large telescopes, such as the

Keck [1], the James Webb [2], or the CELT [3]. Although these mirrors are circular, they

are large enough that they are segmented into small hexagonal segments. Optical testing

of a hexagonal segment yields a hexagonal wavefront or interferogram, thus requiring

polynomials that are orthogonal over a hexagon. Even a large hexagonal primary mirror

consisting of hexagonal segments has been proposed [4].

Smith and Marsh [5] have discussed the PSF of a hexagonal pupil, but their equation for it is incorrect. Sabatke et al. [4] describe the complex amplitude for a trapezoid forming the upper half of a regular hexagon, but do not carry out the summation of the diffracted amplitudes of the two trapezoids of the hexagonal pupil. We give closed-form

expressions for the six-fold symmetric aberration-free PSF and OTF [6]. Similar

expressions for the PSF have been given by others [7,8]. The PSF and OTF are plotted

along with the ensquared power, and compared with the corresponding quantities for a

system with a circular pupil. The ensquared power and the OTF are shown to be lower

than the corresponding values for a circular pupil.

chapter by orthogonalizing the Zernike circle polynomials over a unit hexagon by using

the procedure described in Chapter 3. Each of these polynomials consists of either the

cosine or the sine terms, but not both. This is a consequence of the biaxial symmetry of a

hexagonal pupil. Whereas the circle, annular, and Gaussian polynomials, described in Chapters 4, 5, and 6, respectively, are separable in their dependence on the polar coordinates $\rho$ and $\theta$ of a pupil point, only some of the hexagonal polynomials are separable. For example, the polynomial $H_{14}$ contains $\cos 2\theta$ and $\cos 4\theta$ terms. Hence,

numbering the polynomials with two indices n and m loses significance, and they must be

numbered with a single index j. A hexagonal pupil has two distinct configurations where

the hexagon in one is rotated by 30 degrees with respect to that in the other. Only some of

the polynomials are common between the two configurations.

with circular, annular, and Gaussian pupils, respectively, and showed that the

corresponding orthonormal polynomials also represented balanced aberrations. Although

not shown explicitly, as was done in Chapters 4 through 6, the hexagonal polynomials

also represent balanced classical aberrations. However, some interesting results are

obtained in this respect due to lack of the radial symmetry of the hexagonal pupil. For

example, while the polynomials H11 and H22 representing the balanced primary and


secondary spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing the balanced tertiary spherical aberration is not, because it also consists of an angle-dependent term in $Z_{28}$ or $\cos 6\theta$. The balancing defocus, however, to optimally balance

Seidel astigmatism for a hexagonal pupil is the same as that for a circular or an annular

pupil.

The isometric, interferometric, and PSF plots for the hexagonal polynomial

aberrations are shown. The P-V numbers for the polynomials with a sigma value of one

wave are given, and the Strehl ratios are calculated for a sigma value of one-tenth of a

wave to illustrate that the exponential expression for it, in terms of the aberration

variance, gives a good estimate for small aberrations.

The balancing of Seidel aberrations is considered, and their standard deviations are

obtained by expressing them in terms of the orthonormal polynomials. The diffraction

focus is shown to lie closer to the Gaussian image point in the case of coma, and closer to

the Gaussian image plane in the case of spherical aberration, compared to their

corresponding locations for a circular pupil. Plots of Strehl ratio as a function of the

sigma value of a Seidel aberration are given. They demonstrate that the exponential

expression underestimates in the case of defocus, but overestimates in the case of

astigmatism, coma, and spherical aberration. The Strehl ratio is estimated very well for

balanced astigmatism and coma, but it underestimates in the case of balanced spherical

aberration for $\sigma_W > 0.2$.

Consider an imaging system with a uniformly illuminated hexagonal exit pupil with each side of length $a$ and area $S_{ex} = (3\sqrt{3}/2)\,a^2$ lying in the $(x_p, y_p)$ plane with the z axis as its optical axis, as illustrated in Figure 7-1. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

Figure 7-1. (a) Hexagonal pupil with dimension a. (b) Unit hexagonal pupil inscribed inside a unit circle showing the coordinates of its corners. Each side of the hexagon has a length of unity. The x axis passes through the corners D and A, and the y axis bisects its parallel sides EF and CB.

$$P(x_p, y_p) = A(x_p, y_p)\exp[i\Phi(x_p, y_p)] , \tag{7-1}$$

where

$$A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} \tag{7-2}$$

7.3.1 PSF

From Eq. (1-9), the aberrated irradiance distribution in the image plane normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$ can be written

$$I(\vec r_i) = \frac{1}{S_{ex}^2}\left|\int \exp[i\Phi(\vec r_p)]\exp\left(-\frac{2\pi i}{\lambda R}\,\vec r_i\cdot\vec r_p\right)d\vec r_p\right|^2 , \tag{7-3}$$

or

$$I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\iint \exp[i\Phi(x_p, y_p)]\exp\left[-\frac{2\pi i}{\lambda R}(x_ix_p + y_iy_p)\right]dx_p\,dy_p\right|^2 . \tag{7-4}$$

Letting

$$(x_p, y_p) = a(x', y') \tag{7-5}$$

and

$$(x_i, y_i) = \lambda F_x(x, y) , \tag{7-6}$$

where

$$F_x = R/2a \tag{7-7}$$

is the focal ratio of the image-forming light cone along the x axis, Eq. (7-4) can be written

$$I(x, y) = \frac{4}{27}\left|\iint \exp[i\Phi(x', y')]\exp[-\pi i(xx' + yy')]\,dx'\,dy'\right|^2 . \tag{7-8}$$

In the aberration-free case, it reduces to

$$I(x, y) = \frac{4}{27}\left|\iint \exp[-\pi i(xx' + yy')]\,dx'\,dy'\right|^2 . \tag{7-9}$$

The hexagonal region of integration consists of a rectangle CBEF and two congruent triangles BFA and CDE with the limits of integration $(-1/2, 1/2;\ -\sqrt{3}/2, \sqrt{3}/2)$, $[1/2, 1;\ -\sqrt{3}(1 - x'), \sqrt{3}(1 - x')]$, and $[-1, -1/2;\ -\sqrt{3}(1 + x'), \sqrt{3}(1 + x')]$, respectively. In each case, the first pair of limits is on $x'$, and the second on $y'$. Hence, the irradiance distribution is given by

$$I(x, y) = \frac{4}{27}\left|\left[\int_{-1/2}^{1/2}dx'\int_{-\sqrt{3}/2}^{\sqrt{3}/2} + \int_{1/2}^{1}dx'\int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')} + \int_{-1}^{-1/2}dx'\int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')}\right]\exp[-\pi i(xx' + yy')]\,dy'\right|^2 . \tag{7-10}$$

The integrand in Eq. (7-10) is separable in the integration coordinates. We carry out the integration of each of its three parts:

$$A_1(x, y) = \int_{-1/2}^{1/2}dx'\int_{-\sqrt{3}/2}^{\sqrt{3}/2}\exp[-\pi i(xx' + yy')]\,dy' = \frac{4}{\pi^2xy}\sin(\pi x/2)\sin(\sqrt{3}\pi y/2) , \tag{7-11}$$

$$A_2(x, y) = \int_{1/2}^{1}dx'\int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')}\exp[-\pi i(xx' + yy')]\,dy' = \frac{-2}{\pi^2y(x^2 - 3y^2)}\left\{e^{-i\pi x/2}\left[-\sqrt{3}\,y\cos(\sqrt{3}\pi y/2) + ix\sin(\sqrt{3}\pi y/2)\right] + \sqrt{3}\,y\,e^{-i\pi x}\right\} , \tag{7-12}$$

and

$$A_3(x, y) = \int_{-1}^{-1/2}dx'\int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')}\exp[-\pi i(xx' + yy')]\,dy' = \frac{2}{\pi^2y(x^2 - 3y^2)}\left\{e^{i\pi x/2}\left[\sqrt{3}\,y\cos(\sqrt{3}\pi y/2) + ix\sin(\sqrt{3}\pi y/2)\right] - \sqrt{3}\,y\,e^{i\pi x}\right\} . \tag{7-13}$$

Adding Eqs. (7-12) and (7-13), we obtain

$$A_2 + A_3 = \frac{4}{\pi^2y(x^2 - 3y^2)}\left[\sqrt{3}\,y\cos(\pi x/2)\cos(\sqrt{3}\pi y/2) - x\sin(\pi x/2)\sin(\sqrt{3}\pi y/2) - \sqrt{3}\,y\cos(\pi x)\right] . \tag{7-14}$$

From Eqs. (7-11) and (7-14), we obtain

$$A_1 + A_2 + A_3 = \frac{4}{\pi^2x(x^2 - 3y^2)}\left\{\sqrt{3}\,x\left[\cos(\pi x/2)\cos(\sqrt{3}\pi y/2) - \cos(\pi x)\right] - 3y\sin(\pi x/2)\sin(\sqrt{3}\pi y/2)\right\} . \tag{7-15}$$

The sum of the three parts of diffracted amplitude is real. The irradiance distribution is given by

$$I(x, y) = \frac{4}{27}\left|A_1 + A_2 + A_3\right|^2 = \frac{4}{27}\,(A_1 + A_2 + A_3)^2 . \tag{7-16}$$

Using the L’Hopital rule, it can be shown that the PSF $I(0, 0)$ at the origin is unity, as expected from the normalization in Eq. (7-3). Rotating the $(x, y)$ coordinate system by 60°, i.e., by changing $(x, y)$ to $(1/2)\,[x + \sqrt{3}\,y,\ y - \sqrt{3}\,x]$, it can be shown that the PSF remains invariant, thus showing that the PSF is 6-fold symmetric, as expected for the 6-fold symmetric pupil. The PSF along the x and y axes can be written from Eq. (7-14) as

$$I(x, 0) = \frac{64}{9\pi^4x^4}\left[\cos(\pi x/2) - \cos(\pi x)\right]^2 \tag{7-17a}$$

and

$$I(0, y) = \frac{16}{243\pi^4y^4}\left\{2\sqrt{3}\left[1 - \cos(\sqrt{3}\pi y/2)\right] + 3\pi y\sin(\sqrt{3}\pi y/2)\right\}^2 . \tag{7-17b}$$

A 2D PSF is shown in Figure 7-2. The PSF in Figure 7-2a emphasizes the low-value details, but that in Figure 7-2b is truncated to a value of $10^{-3}$ relative to a value of unity at the center. It shows a nearly circular bright spot at the center surrounded by nearly hexagonal alternating dark and bright rings, three dark and two bright. Beyond the rings, the PSF breaks into six diffracted arms each of alternating bright and dark strips with some dim structure between two consecutive arms. Plots of the PSF along the x and y axes and at 15° from the x axis are shown in Figure 7-3 as $I(x, 0)$, $I(0, y)$, and $I(15°) \equiv I(r)$, respectively. The solid curve $I_c$ represents the Airy pattern for a circular pupil (of the same radius $a$ as the side of the hexagonal pupil imaging an object at the same wavelength $\lambda$ with the same focal ratio as $F_x$) with its first zero at 1.22, as in Figure 4-2. The central bright spot has its zero value along the x axis at 1.33, and at 1.35 along the y axis.
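The closed-form PSF lends itself to a quick numerical check. The sketch below assumes the amplitude of Eq. (7-15) and the normalization of Eq. (7-16), with x and y in units of $\lambda F_x$. The comparison routine, which applies a midpoint rule in $x'$ after doing the $y'$ integral of Eq. (7-9) analytically, is an independent check added here and is not the author's procedure. Function names are illustrative.

```python
import math

SQ3 = math.sqrt(3.0)

def hex_psf(x, y):
    """Aberration-free PSF of a hexagonal pupil from Eqs. (7-15) and (7-16).
    Not to be evaluated exactly on the removable singularities
    x = 0 and x**2 = 3*y**2."""
    a = math.pi * x / 2.0
    b = SQ3 * math.pi * y / 2.0
    amp = 4.0 / (math.pi ** 2 * x * (x * x - 3.0 * y * y)) * (
        SQ3 * x * (math.cos(a) * math.cos(b) - math.cos(math.pi * x))
        - 3.0 * y * math.sin(a) * math.sin(b))
    return 4.0 / 27.0 * amp * amp

def hex_psf_quadrature(x, y, n=20000):
    """Check of Eq. (7-9): midpoint rule in x', analytic integral in y'."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        xp = -1.0 + (k + 0.5) * h
        # Half-height of the unit hexagon at abscissa xp.
        half = SQ3 / 2.0 if abs(xp) <= 0.5 else SQ3 * (1.0 - abs(xp))
        if y != 0.0:
            inner = 2.0 * math.sin(math.pi * y * half) / (math.pi * y)
        else:
            inner = 2.0 * half
        # The imaginary part cancels because the pupil is symmetric in x'.
        total += math.cos(math.pi * x * xp) * inner * h
    return 4.0 / 27.0 * total * total

x, y = 0.7, 0.3
xr, yr = 0.5 * (x + SQ3 * y), 0.5 * (y - SQ3 * x)   # 60-degree rotation
print(hex_psf(x, y), hex_psf(xr, yr), hex_psf_quadrature(x, y))
print(hex_psf(4.0 / 3.0, 0.0))   # first zero along the x axis
```

The three values on the first printed line should agree, illustrating both the six-fold symmetry and the closed form; the second line should be zero to within rounding, consistent with the zero along the x axis at 1.33.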

The ensquared power, i.e., the fractional power in a square region centered at the Gaussian image point, is given by

$$P(s) = \int_{-s}^{s}dx\int_{-s}^{s}I(x, y)\,dy , \tag{7-18}$$

where $s$ is the half-width of the square. It is tabulated in Table 7-1 along with the corresponding value for a circular pupil. The two ensquared powers are plotted in Figure 7-4 as $P_h$ and $P_c$. The ensquared power for a hexagonal pupil, plotted as a dotted curve $P_h$, starts at zero and rises to 83.8% as $s$ increases to the first zero along the x axis at 1.33, like the Airy disc of radius 1.22 for a circular pupil (as in Figure 4-2a), and approaches 100% asymptotically. It is evident that the ensquared power for a hexagonal pupil is lower than the corresponding value for a circular pupil.

Figure 7-3. PSF along the x and y axes and at 15° from the x axis, where x, y, and r are in units of $\lambda F_x$.

Table 7-1. Ensquared power $P_h$ of a system with a hexagonal pupil, where $s$ is the half width of a square in units of $\lambda F_x$, compared with the ensquared power $P_c$ for a circular pupil.

s Ph Pc

0 0 0

0.1 0.0256 0.0310

0.2 0.0984 0.1180

0.3 0.2070 0.2449

0.4 0.3354 0.3897

0.5 0.4663 0.5302

0.6 0.5848 0.6491

0.7 0.6809 0.7369

0.8 0.7504 0.7930

0.9 0.7945 0.8229

1 0.8186 0.8360

1.2 0.8344 0.8455

1.4 0.8434 0.8624

1.6 0.8613 0.8862

1.8 0.8819 0.9043

2 0.8972 0.9135

2.2 0.9060 0.9184

2.4 0.9116 0.9241

2.6 0.9175 0.9315

2.8 0.9244 0.9384

3 0.9311 0.9426

3.5 0.9397 0.9495

4 0.9469 0.9573

4.5 0.9536 0.9615

5 0.9575 0.9662

6 0.9645 0.9722

7 0.9699 0.9765

8 0.9738 0.9798

9 0.9768 0.9823

10 0.9791 0.9843

in units of $\lambda F_x$.

7.3.2 OTF

From Eq. (1-11), the OTF for a uniformly illuminated hexagonal pupil can be obtained as the autocorrelation of the pupil function:

$$\tau(\vec v) = S_{ex}^{-1}\int \exp[iQ(\vec r_p; \vec v)]\,d\vec r_p , \tag{7-19}$$

where

$$Q(\vec r_p; \vec v) = \Phi(\vec r_p) - \Phi(\vec r_p - \lambda R\,\vec v) \tag{7-20}$$

is the phase aberration difference function, and $\vec v$ is a spatial frequency vector in the image plane. The integration in Eq. (7-19) is carried out over the overlap area of two hexagonal pupils whose centers are displaced from each other by $\lambda R\,\vec v$. In the aberration-free case, the OTF is real and simply equal to the relative area of overlap of two pupils where the center of one is displaced from that of the other by $\lambda R\,\vec v$.

For a displacement $x$ along the x axis, as in Figure 7-5a, the overlap area consists of two isosceles triangles and a rectangle when $x < a$. The area of each triangle is $\sqrt{3}\,a^2/4$, and that of the rectangle is $\sqrt{3}\,a(a - x)$. The total fractional overlap area is $1 - 2x/3a$. For $x = a$, as in Figure 7-5b, the rectangle vanishes and the two triangles meet forming a rhombus. For $x > a$, the two triangles intersect each other, thus reducing the size and therefore the area of the rhombus. The fractional area of the rhombus is given by $(1/3)(2 - x/a)^2$. The rhombus vanishes as $x \to 2a$, and the two hexagons meet at a vertex only, namely, the extreme right-hand vertex of one hexagon and the extreme left-hand vertex of the other. Replacing the displacement $x$ by $\lambda Rv_x$, where $v_x$ is a spatial frequency along the x axis, and normalizing it by the cutoff frequency $1/\lambda F_x$ along this axis, we can write the tangential or the x-OTF as

$$\tau_x(v_x) = \begin{cases} 1 - (4/3)\,v_x , & 0 \le v_x \le 1/2 \\ (4/3)(1 - v_x)^2 , & 1/2 \le v_x \le 1 . \end{cases} \tag{7-21}$$

Figure 7-5. Overlap area of two hexagonal pupils displaced from each other along the x axis in (a) and with x = a in (b), and along the y axis in (c).

Now consider a displacement $y$ along the y axis, as illustrated in Figure 7-5c. Here again, the overlap area consists of two congruent isosceles triangles and a rectangle. The area of each triangle is $(1/4\sqrt{3})(\sqrt{3}\,a - y)^2$ and that of the rectangle is $a(\sqrt{3}\,a - y)$ for $0 \le y \le \sqrt{3}\,a$. The fractional overlap area is given by $(2/3)\left[(1 - y/\sqrt{3}\,a) + (1/2)(1 - y/\sqrt{3}\,a)^2\right]$. Again, replacing $y$ by $\lambda Rv_y$, where $v_y$ is the spatial frequency along the y axis, and normalizing by the cutoff frequency $1/\lambda F_x$, the sagittal or the y-OTF can be written

$$\tau_y(v_y) = \frac{2}{3}\left[\left(1 - 2v_y/\sqrt{3}\right) + \frac{1}{2}\left(1 - 2v_y/\sqrt{3}\right)^2\right] , \quad 0 \le v_y \le \sqrt{3}/2 . \tag{7-22}$$

Note that the cutoff frequency in the y direction is $\sqrt{3}/2$ compared to a value of unity in the x direction.
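Because the aberration-free OTF is just the fractional overlap area of two displaced hexagons, Eqs. (7-21) and (7-22) can be checked by computing that area geometrically. The sketch below is one such check under stated assumptions: it clips one unit hexagon ($a = 1$) against the edges of the displaced one (a Sutherland–Hodgman style clip, a technique chosen here and not taken from the text) and uses a displacement of $2v$ for a spatial frequency $v$ in units of $1/\lambda F_x$. All function names are illustrative.

```python
import math

HEX_AREA = 1.5 * math.sqrt(3.0)   # area of a hexagon with unit sides

def hexagon(cx=0.0, cy=0.0):
    """Counterclockwise corners of a unit hexagon centered at (cx, cy)."""
    return [(cx + math.cos(k * math.pi / 3), cy + math.sin(k * math.pi / 3))
            for k in range(6)]

def clip(poly, a, b):
    """Part of the convex polygon poly lying to the left of the line a -> b."""
    def side(p):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    out = []
    for i, p in enumerate(poly):
        q = poly[(i + 1) % len(poly)]
        sp, sq = side(p), side(q)
        if sp >= 0.0:
            out.append(p)
        if sp * sq < 0.0:   # the edge p -> q crosses the clipping line
            t = sp / (sp - sq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out

def area(poly):
    """Shoelace area; zero for fewer than three corners."""
    if len(poly) < 3:
        return 0.0
    n = len(poly)
    return 0.5 * abs(sum(poly[i][0] * poly[(i + 1) % n][1]
                         - poly[(i + 1) % n][0] * poly[i][1] for i in range(n)))

def otf_overlap(v, theta):
    """Fractional overlap for frequency v (units of 1/(lambda F_x)) at angle theta."""
    shifted = hexagon(2.0 * v * math.cos(theta), 2.0 * v * math.sin(theta))
    region = hexagon()
    for i in range(6):
        region = clip(region, shifted[i], shifted[(i + 1) % 6])
    return area(region) / HEX_AREA

def otf_x(v):
    """Eq. (7-21), valid for 0 <= v <= 1."""
    return 1.0 - 4.0 * v / 3.0 if v <= 0.5 else 4.0 / 3.0 * (1.0 - v) ** 2

def otf_y(v):
    """Eq. (7-22), valid for 0 <= v <= sqrt(3)/2."""
    t = 1.0 - 2.0 * v / math.sqrt(3.0)
    return 2.0 / 3.0 * (t + 0.5 * t * t)

for v in (0.1, 0.4, 0.7):
    print(v, otf_overlap(v, 0.0), otf_x(v), otf_overlap(v, math.pi / 2), otf_y(v))
```

The same `otf_overlap` routine can be evaluated at any angle, so it can also serve as a cross-check on an expression for an arbitrary direction.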

It can be shown that the OTF for an angle $\theta$ from the x axis in the range $0 \le \theta \le \pi/6$ is given by [6]

Ï 4 È Ê2 ˆ ˘

Ô1 - vq Ísin q + 3 cos q + Á sin 2 q - sin 2q˜ vq ˙ , 0 £ vq £ v1

Ô 3 3 Î Ë 3 ¯ ˚

t(vq ) = Ì (7-23)

Ô 4 + 2 Ê sin q - 4 cos qˆ v + 1 Ê 1 - 1 sin 2q + 3 cos 2qˆ v 2 , v £ v £ v ,

Ô 3 3 ÁË 3 ˜ q

¯ 3Ë

Á

3

˜ q 1

¯ q 2

Ó

$$v_1 = \left[2\left(\cos\theta - \frac{\sin\theta}{\sqrt{3}}\right)\right]^{-1} \tag{7-24}$$

and

$$v_2 = \left(\cos\theta + \frac{\sin\theta}{\sqrt{3}}\right)^{-1} . \tag{7-25}$$

The spatial frequency $v_2$ represents the cutoff frequency as a function of angle $\theta$. It decreases monotonically from a value of unity to $\sqrt{3}/2$ as the angle $\theta$ increases from zero to $\pi/6$. By letting $\theta = 0$, we obtain the OTF along the x axis as given by Eq. (7-21). Similarly, $\theta = \pi/6$ yields the OTF along the y axis given by Eq. (7-22), since the OTFs for angles $\pi/6$ and $\pi/2$ are identical owing to the six-fold symmetry of the hexagonal pupil. The OTF for the range $\pi/6 \le \theta \le \pi/3$ is the same as that for the range $0 \le \theta \le \pi/6$, because of the symmetry of the pupil about the direction making an angle of $\pi/6$. For larger angles, we make use of the six-fold symmetry of the OTF.

Figure 7-6 shows how the OTF varies with the spatial frequency (in units of the cutoff frequency $1/\lambda F_x$) along the x and y axes, and at 15° from the x axis as $\tau(v_x)$, $\tau(v_y)$ (in long dashes), and $\tau(15°) \equiv \tau(v)$. The OTF of a system with a corresponding circular pupil of radius $a$ is also included for comparison as $\tau_c$. Note that the cutoff frequency of the hexagonal pupil is the same as that for the circular pupil only along the x axis and every 60° from it. Otherwise, it is smaller. We note that the OTF of a hexagonal pupil is lower than that for a circular pupil at all spatial frequencies. The OTF along the x axis is slightly higher than that along the y axis, and the OTF at 15° is slightly higher in the low frequency region but lower in the high. The 15° OTF is lower than that along the x axis. The differences among the three curves are relatively small.

Figure 7-6. OTF along the x and y axes, and at 15° from the x axis, where the spatial frequencies $v_x$, $v_y$, and $v$ are in units of $1/\lambda F_x$.

7.4 HEXAGONAL POLYNOMIALS

Figure 7-7 shows a unit hexagon inscribed inside a unit circle. The x axis passes through the corners D and A, and the y axis bisects its parallel sides EF and CB. The coordinates of the corners of the hexagon are labeled in the figure. Each side of the hexagon has a length of unity. The area of the unit hexagon is $A = 3\sqrt{3}/2$.

Zernike circle polynomials over a hexagon [5,6] are given by [see Eq. (3-18)]

$$H_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}H_k\rangle\,H_k\right] , \tag{7-26}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit hexagon, i.e., they satisfy the orthonormality condition

$$\frac{2}{3\sqrt{3}}\int_{\text{hexagon}} H_jH_{j'}\,dx\,dy = \delta_{jj'} . \tag{7-27}$$

The hexagonal region of integration consists of a rectangle EFCB and two congruent triangles FAB and CDE with limits of integration $(-1/2, 1/2;\ -\sqrt{3}/2, \sqrt{3}/2)$, $[1/2, 1;\ -\sqrt{3}(1 - x), \sqrt{3}(1 - x)]$, and $[-1, -1/2;\ -\sqrt{3}(1 + x), \sqrt{3}(1 + x)]$, respectively. The angular brackets indicate a mean value over the hexagonal pupil. Thus,

$$\langle Z_{j+1}H_k\rangle = \frac{2}{3\sqrt{3}}\int_{\text{hexagon}} Z_{j+1}H_k\,dx\,dy . \tag{7-28}$$
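The recursion of Eq. (7-26) can be sketched for the first six polynomials. The sketch assumes the Zernike circle polynomials $Z_1$ to $Z_6$ in Cartesian form (even j with cosine, odd j with sine) and evaluates the hexagonal means of Eq. (7-28) with a 3-point Gauss–Legendre rule on the rectangle and the two triangles. The quadrature rule is a choice made here, and it is exact only for the low-degree products that occur up to $Z_6$; higher polynomials would need a higher-order rule. All names are illustrative.

```python
import math

S3 = math.sqrt(3.0)
NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))   # 3-point Gauss-Legendre
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def gauss3(f, lo, hi):
    """3-point Gauss-Legendre integral of f over [lo, hi]."""
    half, mid = 0.5 * (hi - lo), 0.5 * (hi + lo)
    return half * sum(w * f(mid + half * t) for t, w in zip(NODES, WEIGHTS))

def hex_mean(f):
    """Mean value of f(x, y) over the unit hexagon, as in Eq. (7-28)."""
    def column(x):
        h = S3 / 2.0 if abs(x) <= 0.5 else S3 * (1.0 - abs(x))
        return gauss3(lambda y: f(x, y), -h, h)
    panels = ((-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0))
    return sum(gauss3(column, lo, hi) for lo, hi in panels) / (1.5 * S3)

# Zernike circle polynomials Z1..Z6 in Cartesian coordinates (assumed ordering).
ZERNIKE = [
    lambda x, y: 1.0,
    lambda x, y: 2.0 * x,
    lambda x, y: 2.0 * y,
    lambda x, y: S3 * (2.0 * (x * x + y * y) - 1.0),
    lambda x, y: 2.0 * math.sqrt(6.0) * x * y,
    lambda x, y: math.sqrt(6.0) * (x * x - y * y),
]

def hexagonal_coefficients():
    """Row j holds the coefficients of Z1..Z6 in H_(j+1), per Eq. (7-26)."""
    n = len(ZERNIKE)
    gram = [[hex_mean(lambda x, y, i=i, k=k: ZERNIKE[i](x, y) * ZERNIKE[k](x, y))
             for k in range(n)] for i in range(n)]
    rows = []
    for j in range(n):
        c = [0.0] * n
        c[j] = 1.0
        for prev in rows:
            proj = sum(prev[i] * gram[j][i] for i in range(n))   # <Z_j H_k>
            for i in range(n):
                c[i] -= proj * prev[i]
        norm = math.sqrt(sum(c[i] * c[k] * gram[i][k]
                             for i in range(n) for k in range(n)))
        rows.append([ci / norm for ci in c])
    return rows

for j, row in enumerate(hexagonal_coefficients(), start=1):
    print(f"H{j}:", " ".join(f"{ci:+.6f}" for ci in row))
```

The printed rows can be compared with the entries for $H_1$ to $H_6$ in Table 7-2.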

The orthonormal hexagonal polynomials are given in Tables 7-2–7-4 up to the eighth

order in three different but equivalent forms [9,10]. In Table 7-2, each hexagonal

polynomial is written in terms of the circle polynomials, thus illustrating the relationship

Figure 7-7. Unit hexagon inscribed inside a unit circle showing the coordinates of its corners: A(1, 0), B(1/2, −√3/2), C(−1/2, −√3/2), D(−1, 0), E(−1/2, √3/2), and F(1/2, √3/2). Each side of the hexagon has a length of unity. The x axis passes through the corners D and A, and the y axis bisects its parallel sides EF and CB.


Table 7-2. Orthonormal hexagonal polynomials H_j(ρ, θ) in terms of Zernike circle polynomials Z_j(ρ, θ).

H1 Z1

H2 √(6/5) Z2

H3 √(6/5) Z3

H4 √(5/43) Z1 + 2√(15/43) Z4

H5 √(10/7) Z5

H6 √(10/7) Z6

H7 16√(14/11055) Z3 + 10√(35/2211) Z7

H8 16√(14/11055) Z2 + 10√(35/2211) Z8

H9 (2√5/3) Z9

H21 0.71499594Z3 + 0.72488884Z7 + 0.46636441Z17 + 1.72029850Z21

H22 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22

H23 1.15667686Z5 + 1.10775599Z13 + 0.43375081Z15 + 1.39889072Z23

H24 1.15667686Z6 + 1.10775599Z12 0.43375081Z14 + 1.39889072Z24

H25 1.31832566Z5 + 1.14465174Z13 + 1.94724032Z15 + 0.67629133Z23 + 1.75496998Z25


H26 1.31832566Z6 1.14465174Z12 + 1.94724032Z14 0.67629133Z24 + 1.75496998Z26

H27 2 77 93 Z27

H29 0.97998834Z3 + 1.16162002Z7 +1.04573775Z17 +0.40808953Z21 +1.36410394Z29

H30 0.97998834Z2 + 1.16162002Z8 + 1.04573775Z16 0.40808953Z20 + 1.36410394Z30

H31 3.63513758Z9 + 2.92084414Z19 + 2.11189625Z31

H32 0.69734874Z10 + 0.67589740Z18 + 1.22484055Z32

H33 1.56189763Z3 + 1.69985309Z7 + 1.29338869Z17 + 2.57680871Z21

+ 0.67653220Z29 + 1.95719339Z33

H34 1.56189763Z2 1.69985309Z8 1.29338869Z16 + 2.57680871Z20

0.67653220Z30 + 1.95719339Z34

H35 1.63832594Z3 1.74759886Z7 1.27572528Z17 0.77446421Z21

0.60947360Z29 0.36228537Z33 + 2.24453237Z35

H36 1.63832594Z2 1.74759886Z8 1.27572528Z16 + 0.77446421Z20

0.60947360Z30 + 0.36228537Z34 + 2.24453237Z36

H37 0.82154671Z1 + 1.27988084Z4 + 1.32912377Z11 + 1.11636637Z22

0.54097038Z28 + 1.37406534Z37

H38 1.54526522Z6 + 1.57785242Z12 0.89280081Z14 + 1.28876176Z24

0.60514082Z26 + 1.43097780Z38

H39 1.54526522Z5 + 1.57785242Z13 + 0.89280081Z15 + 1.28876176Z23

+ 0.60514082Z25 + 1.43097780Z39

H40 2.51783502Z6 2.38279377Z12 + 3.42458933Z14 1.69296616Z24

+ 2.56612920Z26 0.85703819Z38 + 1.89468756Z40

H41 2.51783502Z5 + 2.38279377Z13 + 3.42458933Z15 + 1.69296616Z23

+ 2.56612920Z25 + 0.85703819Z39 + 1.89468756Z41

H42 2.72919646Z1 4.02313214Z4 3.69899239Z11 2.49229315Z22

+ 4.36717121Z28 1.13485132Z37 + 2.52330106Z42

+ 0.95864121Z26 0.69034812Z38 + 0.40743941Z40 + 2.56965299Z44

0.95864121Z25 0.69034812Z39 0.40743941Z41 + 2.56965299Z45

Table 7-3. Orthonormal hexagonal polynomials H_j(ρ, θ) in polar coordinates (ρ, θ).

H1 1

H2 2√(6/5) ρ cos θ

H3 2√(6/5) ρ sin θ

H4 √(5/43) (−5 + 12ρ²)

H5 2√(15/7) ρ² sin 2θ

H6 2√(15/7) ρ² cos 2θ

H9 (4√10/3) ρ³ sin 3θ

H20 ( 2.17600248ȡ + 13.23551876ȡ3 + 16.15533716ȡ5)cosș + 5.95928883ȡ5 cos5ș

H21 (2.17600248ȡ 13.23551876ȡ3 + 16.15533716ȡ5) sinș + 5.95928883ȡ5 sin5ș

H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6

H23 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)sin2ș + 1.37164051ȡ4sin4ș

H24 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)cos2ș 1.37164051ȡ4cos4ș

H25 (7.55280798ȡ2 36.13018255ȡ4 + 37.95675688ȡ6)sin2ș + ( 26.67476754ȡ4

+ 39.39897852ȡ6)sin4ș

H26 ( 7.55280798ȡ2 + 36.13018255ȡ4 37.95675688ȡ6)cos2ș + ( 26.67476754ȡ4

+ 39.39897852ȡ6)cos4ș


H27 14 22 / 93 ȡ6sin6ș

H28 0.56537219 10.44830313ȡ2 + 38.71296332ȡ4 37.27668254ȡ6 + 7.83998727ȡ6cos6ș

H29 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5

+ 190.97455178ȡ7)sinș + 1.41366362ȡ5sin5ș

H30 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5

+ 190.97455178ȡ7)cosș 1.41366362ȡ5cos5ș

H31 (54.28516840 202.83704634ȡ2 + 177.39928561ȡ4)ȡ3sin3ș

H32 (41.60051295 135.27397959ȡ2 + 102.88660624ȡ4)ȡ3cos3ș

H33 ( 3.87525156 + 41.84243767ȡ2 117.56342978ȡ4 + 94.71450820ȡ6)ȡsin ș

+ 76.09262860 + ( 38.04631430 + 54.80141514ȡ2)ȡ5sin5ș

H34 (3.87525156 + 41.84243767ȡ2 117.56342978ȡ4+ 94.71450820ȡ6)ȡcos ș

+ ( 38.04631430 + 54.80141514ȡ2)ȡ5cos5ș

H35 (3.10311187 34.93479698ȡ2 + 102.08124605ȡ4 85.32630533ȡ6)ȡsinș

+ (6.01202622 10.14399046ȡ2)ȡ5 sin 5ș + 8.978129552ȡ7sin7ș

H36 (3.10311187ȡ 34.93479698ȡ2 + 114.10529848ȡ4 87.65802721ȡ6)ȡcosș

+ (12.02405243 2.33172188ȡ2) ȡ5cos3ș + (12.02405243 + 3.68030434ȡ2)ȡ5cos5ș

+ 6.01202622ȡ7cos7ș

H37 2.74530738 60.39881618ȡ2 + 300.22087475ȡ4 518.03488742ȡ6

+ 288.55372176ȡ8 2.02412582ȡ6cos6ș

H38 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4

+ 339.98298180ȡ6)ȡ2cos2ș + (8.49786414 13.58537785ȡ2)ȡ4cos4ș

H39 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4

+ 339.98298180ȡ6)ȡ2sin2ș + (8.49786414 13.58537785ȡ2)ȡ4sin4ș

H40 (14.79181046 121.61654135ȡ2 + 286.77354559ȡ4

203.62188574ȡ )ȡ2cos2ș

6

H41 ( 14.79181046 + 121.61654135ȡ2 286.77354559ȡ4 + 203.62188574ȡ6)ȡ2sin2ș

+ (83.39879886 280.00664075ȡ2 + 225.07739907ȡ4)ȡ4sin4ș

H42 0.84269170 + 24.65387703ȡ2 158.21741244ȡ4 + 344.75780000ȡ6

238.31877895ȡ8 + ( 58.59775991 + 85.64367812ȡ2)ȡ6cos6ș

H44 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4

164.01834750ȡ6)ȡ2cos2ș + (12.67622930 51.08055822ȡ2

+ 48.40133344ȡ4)ȡ4cos4ș + 10.90211434ȡ8cos8ș

H45 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4 164.01834750ȡ6)ȡ2sin2ș

(12.67622930 51.08055822ȡ2 + 48.40133344ȡ4)ȡ4sin4ș + 10.90211434ȡ8sin8ș

Table 7-4. Orthonormal hexagonal polynomials H_j(x, y) in Cartesian coordinates (x, y), where ρ² = x² + y².

H1 1

H2 2√(6/5) x

H3 2√(6/5) y

H4 √(5/43) (−5 + 12ρ²)

H5 4√(15/7) xy

H6 2√(15/7) (x² − y²)

H7 4√(42/3685) (−14 + 25ρ²) y

H8 4√(42/3685) (−14 + 25ρ²) x

H20 ( 2.17600247 + 13.23551876ȡ2 + 13.64110699 ȡ4)x 119.18577680 ȡ2 x3

+ 95.3486212x5

H21 (2.17600247 13.23551876ȡ2 + 45.95178131ȡ 4)y 119.18577680 ȡ2y3

+ 95.34862128y5

H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6

H23 (47.45838189 175.85597460x2 186.82909872y2 + 157.02509476x4

+ 314.05018953x2y2 + 157.02509476y4)xy

H24 (23.72919094 92.04290884x2 + 78.51254738x4)x2 + ( 23.72919094

+ 8.22984309x2 + 89.29962781y2 + 78.51254738x4 78.51254738x2y2

78.51254738y4)y2


H25 (15.10561596 – 178.95943525x2 + 34.43870505y2 + 233.50942786x4

+ 151.82702751x2y2 – 81.68240034y4)xy

H26 (– 7.55280798 + 9.45541501x2 + 1.44222164x4)x2 + (7.55280798 + 160.04860523x2– 62.80495008y2

–234.95164950x4 – 159.03813574x2y2 + 77.35573540y4)y2

H27 (40.85537039x4 136.18456799 x2y2 + 40.85537039y4)xy

H28 0.56537219 – 10.44830312ȡ2 + 38.71296332x4 + 77.42592664 x2y2 + 38.71296332y4 29.43669525x6

229.42985678 x4y2 +5.76976155 x2y4 45.11666981y6

– 28.2732724ȡ2y3 + 22.61861792y5

2 3 5

H30 ( 15.56917599 + 130.07864353ȡ2 – 298.22784553ȡ4 + 190.97455178ȡ6)x + 28.27327243ȡ x – 22.61861792x

177.39928561y2ȡ4)y

102.88660624x4 – 514.43303123 x2y2 308.65981874y4)y2]x

H33 [ 3.87525156 + (41.84243767 307.79500129x2 + 368.72158389x4)x2 + (41.84243767 + 145.33628349x2

155.60974407y + 10.13644892x4

2

209.06921162 x2y2 + 149.51592334y4)y2]y

H34 [3.87525156 + ( 41.84243767 + 79.51711547x2 39.91309306x4)x2 + ( 41.84243767 + 615.59000259x2

72.66814174y2 777.35626084x4 558.15060029 x2y2 + 179.29256748y4)y2]x

H35 [3.10311187 + ( 34.93479698 + 132.14137712x2 73.19935100x4)x2 + ( 34.93479698 + 144.04222993x2

2 2 4 2

+ 108.09327226y2 519.49349681x4 + 23.85771799 x y 104.44842531y )y ]y

H36 [3.10311187 + ( 34.93479698 + 96.06921983x2 66.20418535x4)x2 + ( 34.93479698 + 264.28275425x2

2 2 4 2

+ 72.02111496y2 535.81555000x4 + 7.53566481 x y 97.45325965y )y ]x

520.05901324x6 1523.74277487 x4y2 1584.46654966 x2y4 516.01076159y6

H38 ( 42.96232789 + 296.28167478x2 578.72189394x4 + 339.98298180x6)x2 + (42.96232789 50.98718488x2

279.28594648y2 497.20962679x4 + 633.06340537 x2y2 + 551.55113822y4 + 679.96596360x6

679.96596360 x2y4 339.98298180y6)y2

H39 [ 85.92465579 + (541.57616468 1075.93152073x2 + 679.96596360x4)x2 + (609.55907786

2 2 2 4 2

2260.54606433x 1184.61454360y2 + 2039.89789081x4 + 2039.89789081x y + 679.96596360y )y ]xy

H40 (14.79181046 38.21774249x2 + 6.76690483x4 + 21.45551332x6)x2 + ( 14.79181046 500.39279319x2

2 4 2 2 4

+ 205.01534022y + 1686.80674937x + 1113.25965819 x y 566.78018634y 1307.55336779x6

4 2 2 4 6 2

2250.77399075 x y 493.06582480 x y + 428.69928482y )y

H41 [ 29.58362093 + (576.82827818 1693.57365421x2 + 1307.55336779x4)x2 +( 90.36211274

1147.09418236x2 + 546.47947184y2 + 2122.04091078x4 + 321.42171817x2y2 493.06582480y4)y2]xy

H42 0.84269170 + (24.65387703 158.21741244x2 + 286.16004008x4 152.67510082x6) x2+ (24.65387703

316.43482489x2 158.21741244y2 + 1913.23979875x4 + 155.30700127x2 y2 + 403.35555992y4

– 2152.28660953x6 – 1429.91267370x4y2 + 245.73637792x2y4 – 323.96245707y6)y2 + 403

3 3 5 2

H43 2 22 / 20334667 (6x5y 20x y +6xy )( 23443 + 32240ȡ )

2 4 6

H44 (9.64776957 72.74250912x + 164.99985615x 104.71489971x )x2

+ ( 9.64776957 –76.05737585x2 + 98.09496774y2 + 471.48320551x4

+ 39.32237674 x2y2 267.16097261y4 826.90123032x6

+ 279.13466933 x4 y2 170.82784030 x2 y4 + 223.32179529y6) y2

H45 [19.29553915 + ( 221.54239411 + 636.48306167x2 434.42511407x4)x2

+ ( 120.13255963 + 864.32165754x2 + 227.83859586y2 1788.23382186x4

179.98634818 x2y2 221.64827593y4)y2]xy


between the two. In particular, it helps determine the potential error made when a

hexagonal aberration function is expanded in terms of the circle polynomials (see Chapter

12). The coefficients of the circle polynomials are the elements of the conversion matrix

M (discussed in Chapter 3). The polynomials up to H19 are given in their analytical form,

but those with j > 19 are written in a numerical form because of the increasing

complexity of the coefficients of the circle polynomials. In Table 7-3, the hexagonal

polynomials are given in polar coordinates, showing one-to-one correspondence with the

circle polynomials but illustrating the difference between them. This form is convenient

for analytical calculations because of integration of trigonometric functions over

symmetric limits. Finally, the polynomials are given in Cartesian coordinates in Table 7-4,

for a quantitative numerical analysis of, say, an interferogram.

Several observations can be made from the polynomial tables. It is evident from

Table 7-2 that the corresponding coefficients of the Zernike polynomials that make up the

hexagonal polynomial (n, m) pairs are the same except for signs in some cases, unless m

is a multiple of 3. For example, H14 and H15 have some coefficients with different signs,

but H16 and H17 have the same signs. H9 and H10 , which correspond to n = 3 and m =

3, and H18 and H19 , which correspond to n = 5 and m = 3, have different coefficients.

From Table 7-3, we note that each hexagonal polynomial consists of cosine or sine terms,

but not both.

Unlike the circle and annular polynomials, the hexagonal polynomials are generally not separable in ρ and θ due to lack of radial symmetry of the hexagonal pupil. The first 13 polynomials, i.e., up to H13, are separable, but H14 and H15 are not; H16 through H19 are separable, but H20 and H21 are not. Accordingly, the notion of two indices n and m with dependence on m in the form of cos mθ loses significance. For example, the Zernike polynomial Z14 for n = 4 and m = 4 varies as cos 4θ, but H14 has a term in cos 2θ also. Hence, the hexagonal polynomials can be ordered by a single index only. While the polynomials H11 and H22 representing balanced primary and secondary spherical aberrations are radially symmetric, the polynomial H37 representing balanced tertiary spherical aberration is not, since it also contains an angle-dependent term in Z28 or cos 6θ. If this term is not included in the polynomial H37, the standard deviation of the aberration increases from a value of unity to 1.13339.

In Figure 7-8, the hexagon is rotated by 30° compared to that in Figure 7-7 so that the point A, for example, moves to a point A′. Whereas in Figure 7-7 the x axis passes through the corners D and A of the hexagon and the y axis bisects its parallel sides EF and CB, in Figure 7-8 the x axis bisects the parallel sides F′A′ and D′C′ of the hexagon and the y axis passes through its corners E′ and B′. As a result, some polynomials change, as may be seen by comparing the polynomials given in Table 7-5 for the 30-degree rotation with those in Table 7-2. The first eight polynomials, H11 through H13, H16, H17, H22, H27, etc., do not change. Polynomials H9 and H10, H14 and H15, and H18 and H19, etc., exchange the coefficients of the circle polynomial components.


Figure 7-8. Unit hexagon rotated clockwise by 30 degrees with respect to that in Figure 7-7, showing the coordinates of its corners: A′(√3/2, −1/2), B′(0, −1), C′(−√3/2, −1/2), D′(−√3/2, 1/2), E′(0, 1), and F′(√3/2, 1/2). The x axis bisects the parallel sides F′A′ and D′C′ of the hexagon, and the y axis passes through its corners E′ and B′.

7.5 HEXAGONAL COEFFICIENTS OF A HEXAGONAL ABERRATION FUNCTION

A hexagonal aberration function W(x, y) across a unit hexagon can be expanded in terms of J hexagonal polynomials $H_j(x, y)$ in the form

$$W(x, y) = \sum_{j=1}^{J} a_j H_j(x, y) , \tag{7-29}$$

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (7-29) by $H_j(x, y)$, integrating over the unit hexagon, and using the orthonormality Eq. (7-27), we obtain the hexagonal expansion coefficients:

$$a_j = \frac{2}{3\sqrt{3}} \int_{\mathrm{hexagon}} W(x, y) H_j(x, y) \, dx \, dy . \tag{7-30}$$

It is evident from Eq. (7-30) that the value of a hexagonal coefficient is independent of

the number J of polynomials used in the expansion of the aberration function. Hence, one

or more polynomial terms can be added to or subtracted from the aberration function

without affecting the value of the coefficients of the other polynomials in the expansion.

The mean and mean square values of the aberration function are given by

$$\langle W(\rho, \theta) \rangle = a_1 , \tag{7-31}$$

and

$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{7-32}$$

Table 7-5. Orthonormal hexagonal polynomials H_j in terms of Zernike circle polynomials Z_j(ρ, θ) for a hexagon rotated by 30°, as in Figure 7-8.

H1 Z1

H2 √(6/5) Z2

H3 √(6/5) Z3

H4 √(5/43) Z1 + 2√(15/43) Z4

H5 √(10/7) Z5

H6 √(10/7) Z6

H7 16√(14/11055) Z3 + 10√(35/2211) Z7

H8 16√(14/11055) Z2 + 10√(35/2211) Z8

H9 2√(35/103) Z9

H10 (2√5/3) Z10

H20 = 0.71499593Z2 + 0.72488884Z8 + 0.46636441Z16 + 1.72029850Z20

H21 = 0.71499593Z3 0.72488884Z7 0.46636441Z17 + 1.72029850Z21

H22 = 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22

H23 = 1.15667686Z5 + 1.10775599Z13 0.43375081Z15 + 1.39889072Z23

H24 = 1.15667686Z6 + 1.10775599Z12 + 0.43375081Z14 + 1.39889072Z24

H25 = 1.31832566Z5 1.14465174Z13 + 1.94724032Z15 0.67629133Z23 + 1.75496998Z25

H26 = 1.31832566Z6 + 1.14465174Z12 + 1.94724032Z14 + 0.67629133Z24 + 1.75496998Z26

H27 = 2 77 / 93 Z27

H28 = 1.07362889Z1 + 1.52546162Z4 + 1.28216588Z11 + 0.70446308Z22 + 2.09532473Z28


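The variance of the aberration function is given by

$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{7-33}$$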
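Equations (7-30) through (7-33) can be exercised on a synthetic aberration function. The sketch below is an illustration under stated assumptions rather than a prescribed algorithm. It assumes NumPy, uses the Cartesian forms of H1 through H6 from Table 7-4, and evaluates hexagon means with a Gauss-Legendre rule over the rectangle and two triangles that make up the hexagon. The coefficient values in `a_true` are made up for the example, and the function names are hypothetical.

```python
import numpy as np

def hexagon_rule(n=12):
    """Nodes and weights such that sum(w * f) is the mean of f over the unit hexagon."""
    t, w = np.polynomial.legendre.leggauss(n)
    s3 = np.sqrt(3.0)
    pieces = [(-0.5, 0.5, lambda x: np.full_like(x, s3 / 2)),
              (0.5, 1.0, lambda x: s3 * (1 - x)),
              (-1.0, -0.5, lambda x: s3 * (1 + x))]
    xs, ys, ws = [], [], []
    for lo, hi, half in pieces:
        x = 0.5 * (hi - lo) * t + 0.5 * (hi + lo)
        h = half(x)
        xs.append(np.repeat(x, n))
        ys.append(np.outer(h, t).ravel())
        ws.append(np.outer(0.5 * (hi - lo) * w * h, w).ravel())
    x, y, wt = (np.concatenate(a) for a in (xs, ys, ws))
    return x, y, wt / (1.5 * s3)          # divide by the hexagon area

def hex_polys(x, y):
    """H1..H6 in Cartesian coordinates, as listed in Table 7-4."""
    r2 = x**2 + y**2
    return np.array([
        np.ones_like(x),
        2 * np.sqrt(6 / 5) * x,
        2 * np.sqrt(6 / 5) * y,
        np.sqrt(5 / 43) * (12 * r2 - 5),
        4 * np.sqrt(15 / 7) * x * y,
        2 * np.sqrt(15 / 7) * (x**2 - y**2),
    ])

x, y, wt = hexagon_rule()
H = hex_polys(x, y)

a_true = np.array([0.30, 0.05, 0.0, 0.10, 0.0, -0.20])  # made-up coefficients (waves)
W = a_true @ H                                           # Eq. (7-29)

a = H @ (wt * W)                          # Eq. (7-30): a_j = <W H_j>
mean_W = np.sum(wt * W)                   # Eq. (7-31)
var_W = np.sum(wt * W**2) - mean_W**2     # left side of Eq. (7-33)

print(np.round(a, 6), mean_W, var_W, np.sum(a[1:]**2))
```

Because the polynomials are orthonormal, each recovered coefficient does not depend on how many other terms are present, which is the property noted after Eq. (7-30).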

7.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF HEXAGONAL POLYNOMIAL ABERRATIONS

As in the case of circle and annular polynomials (see Sections 4.9 and 5.7,

respectively), we illustrate the hexagonal polynomials for n ≤ 8 in three different but

equivalent ways in Figure 7-9. For each polynomial, the isometric plot at the top

illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is

shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers

(in units of wavelength) are given in Table 7-6.

The PSF plots represent the images of a point object in the presence of a polynomial

aberration. They can be obtained by applying Eq. (7-6) to a hexagonal pupil. Piston yields

the aberration-free PSF since it does not affect the PSF. The full width of a square

displaying the PSFs is 24λF_x.

The tilt polynomials H2 and H3, with aberration coefficients a_2 and a_3, displace the PSF in the image plane along the x and y axes, respectively. If the coefficient a_2 is in units of wavelength, it corresponds to a wavefront tilt angle of 2√(6/5) λa_2/a about the y axis and displaces the PSF along the x axis by 4√(6/5) λF_x a_2, where F_x = R/2a is the focal ratio of the image-forming beam along the x axis. Similarly, the coefficient a_3 corresponds to a tilt angle of 4√(2/5) λa_3/a about the x axis, and yields a displacement of the PSF along the y axis by 4√(6/5) λF_y a_3, where F_y = R/(√3 a) is the focal ratio of the image-forming beam along the y axis.

The symmetry properties of the aberrated PSFs (and OTFs) discussed for the circular pupils in Section 4.7 are generally not applicable to hexagonal pupils. For example, although the forms of the polynomials H5 and H6, representing balanced astigmatisms, are the same as those of the corresponding Zernike circle polynomials, the interferogram and the PSF for one cannot be obtained by a 45° rotation of the other. This is due to the lack of radial symmetry of the hexagonal pupil. However, the interferograms and PSFs for the polynomials H7 and H8, representing balanced comas, are different from each other only by a 90° rotation. Similarly, the polynomials H9 and H10 have the same form as the Zernike circle polynomials Z9 and Z10, respectively, and they yield 6-fold symmetric interferograms and 3-fold symmetric PSFs. The PSF for one can be obtained by a 120° rotation of the other. The interferograms and the PSFs for H11 and H22, representing the balanced primary and secondary aberrations, respectively, are radially symmetric, but

those for H37, representing the balanced tertiary aberration, are not because it contains a term in cos 6θ. As the order of a polynomial increases, the interferograms and the PSFs become more and more complex.

Figure 7-9. Hexagonal polynomials H1 through H45 shown as an isometric plot at the top, interferogram on the left, and PSF on the right for a sigma value of one wave.

Table 7-6. Peak-to-valley (P-V) numbers (in units of wavelength) of the hexagonal polynomials for a sigma value of one wave.

From Eq. (7-6), the Strehl ratio, representing the central value of an aberrated PSF relative to its aberration-free value, is given by

$$S \equiv I(0, 0) = \frac{4}{27} \left| \iint \exp\left[ i\Phi(x, y) \right] dx \, dy \right|^2 , \tag{7-34}$$


where the integration is carried out over the unit hexagon, as in Eq. (7-8). We have removed the primes on the x and y coordinates in Eq. (7-34), because the hexagonal polynomial aberrations are already written in the normalized coordinates. The Strehl ratio for these aberrations with a sigma value of 0.1 wave is listed in Table 7-7 and plotted in Figure 7-10. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial, thus illustrating its independence of the type of the aberration. It is approximately given by exp(−σ_Φ²), or 0.67, where σ_Φ = 0.2π.
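The comparison between Eq. (7-34) and the exponential estimate can be reproduced for individual polynomials. The following Python sketch is illustrative only. It assumes NumPy, takes H4 and H6 in the Cartesian forms of Table 7-4, and evaluates the hexagon mean of exp(iΦ) with a Gauss-Legendre rule (the prefactor 4/27 in Eq. (7-34) is the square of the 2/(3√3) that turns the integral into a mean). The function names are hypothetical.

```python
import numpy as np

def hexagon_rule(n=40):
    """Nodes and weights such that sum(w * f) is the mean of f over the unit hexagon."""
    t, w = np.polynomial.legendre.leggauss(n)
    s3 = np.sqrt(3.0)
    pieces = [(-0.5, 0.5, lambda x: np.full_like(x, s3 / 2)),
              (0.5, 1.0, lambda x: s3 * (1 - x)),
              (-1.0, -0.5, lambda x: s3 * (1 + x))]
    xs, ys, ws = [], [], []
    for lo, hi, half in pieces:
        x = 0.5 * (hi - lo) * t + 0.5 * (hi + lo)
        h = half(x)
        xs.append(np.repeat(x, n))
        ys.append(np.outer(h, t).ravel())
        ws.append(np.outer(0.5 * (hi - lo) * w * h, w).ravel())
    x, y, wt = (np.concatenate(a) for a in (xs, ys, ws))
    return x, y, wt / (1.5 * s3)

def strehl(W_waves, wt):
    """Eq. (7-34) as |<exp(i Phi)>|^2, with Phi = 2 pi W and W in waves."""
    return abs(np.sum(wt * np.exp(2j * np.pi * W_waves))) ** 2

x, y, wt = hexagon_rule()
r2 = x**2 + y**2
sigma_w = 0.1                                   # sigma value in waves

H4 = np.sqrt(5 / 43) * (12 * r2 - 5)            # defocus polynomial
H6 = 2 * np.sqrt(15 / 7) * (x**2 - y**2)        # astigmatism polynomial

S4 = strehl(sigma_w * H4, wt)
S6 = strehl(sigma_w * H6, wt)
S_approx = np.exp(-(2 * np.pi * sigma_w) ** 2)  # exp(-sigma_Phi^2), sigma_Phi = 0.2 pi

print(f"H4: {S4:.4f}  H6: {S6:.4f}  exp(-sigma^2): {S_approx:.4f}")
```

Each polynomial has unit sigma, so scaling it by `sigma_w` gives an aberration with a sigma value of 0.1 wave, which is the case tabulated in Table 7-7.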

Table 7-7. Strehl ratio S for hexagonal polynomial aberrations for a sigma value of

0.1 wave.


Figure 7-10. Strehl ratio for a hexagonal polynomial aberration with a sigma value

of 0.1 wave.


As discussed in the previous chapters, the Strehl ratio of an aberrated image for small

aberrations is determined by the variance of the aberration across the pupil under

consideration. Just as the Zernike circle polynomials represent balanced aberrations in the

sense of minimum variance and, in turn, maximum Strehl ratio for a small aberration,

similarly, the hexagonal polynomials also represent balanced aberrations for the

hexagonal pupils. In Chapters 4 through 6, we have given the value of sigma for a Seidel

aberration, using Ai as its coefficient, with and without balancing for circular, annular,

and Gaussian pupils. As shown below, similar results for a hexagonal pupil can be

obtained from the corresponding orthonormal polynomials. We also determine the Strehl

ratio for Seidel aberrations with and without balancing, and compare with the result

obtained by the exponential approximation.

7.7.1 Defocus

Consider the defocus aberration

$$W_d(\rho) = A_d \rho^2 . \tag{7-35}$$

From the form of the orthonormal defocus polynomial H4 given in Table 7-2, it is evident that its sigma value across a hexagonal pupil is given by

$$\sigma_d = \frac{A_d}{12} \sqrt{\frac{43}{5}} = \frac{A_d}{4.092} . \tag{7-36}$$
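The step behind Eq. (7-36) can be written out explicitly. The short derivation below is a sketch that uses only the polar form of H4, namely √(5/43)(12ρ² − 5), and the fact from Eq. (7-33) that the piston coefficient does not contribute to the variance.

```latex
\begin{align*}
H_4 = \sqrt{\tfrac{5}{43}}\,\bigl(12\rho^2 - 5\bigr)
  \;&\Rightarrow\;
  \rho^2 = \frac{1}{12}\sqrt{\frac{43}{5}}\,H_4 + \frac{5}{12}\,H_1 , \\
W_d(\rho) = A_d\,\rho^2
  &= \frac{A_d}{12}\sqrt{\frac{43}{5}}\,H_4 + \frac{5A_d}{12}\,H_1 , \\
\sigma_d = |a_4|
  &= \frac{A_d}{12}\sqrt{\frac{43}{5}} \approx 0.2444\,A_d = \frac{A_d}{4.092} .
\end{align*}
```

The same pattern, in which a Seidel term is rewritten as a sum of orthonormal polynomials and the squares of the non-piston coefficients are added, is what yields the sigma values of the remaining aberrations in this section.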

7.7.2 Astigmatism

Next consider 0° Seidel astigmatism given by

$$W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{7-37}$$

The hexagonal polynomial representing balanced astigmatism is

$$H_6 = 2\sqrt{15/7}\, \rho^2 \cos 2\theta \tag{7-38a}$$

$$\phantom{H_6} = 2\sqrt{15/7}\, \rho^2 \left( 2\cos^2\theta - 1 \right) . \tag{7-38b}$$

It shows that the relative amount of defocus ρ² that balances Seidel astigmatism ρ² cos²θ is the same for a hexagonal pupil as for a circular, annular, or a Gaussian pupil. Hence, for a small amount of astigmatism, the diffraction focus for a hexagonal pupil is the same as for a circular, annular, or a Gaussian pupil. For an image with a focal ratio of F, it lies along the z axis at a distance of −4A_aF² from the Gaussian image point. The balanced astigmatism is given by

$$W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{1}{2} \rho^2 \right) . \tag{7-39}$$

Its sigma value is

$$\sigma_{ba} = \frac{A_a}{4} \sqrt{\frac{7}{15}} = \frac{A_a}{5.855} . \tag{7-40}$$

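To obtain the sigma value of astigmatism, we write Eq. (7-37) in the form

$$W_a(\rho, \theta) = \frac{1}{2} A_a \left( \rho^2 \cos 2\theta + \rho^2 \right) = \frac{1}{4} A_a \left[ \sqrt{\frac{7}{15}}\, H_6 + \frac{1}{6} \sqrt{\frac{43}{5}}\, H_4 \right] + \mathrm{constant} . \tag{7-41}$$

Its sigma value is

$$\sigma_a = \frac{A_a}{24} \sqrt{\frac{127}{5}} = \frac{A_a}{4.762} . \tag{7-42}$$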

Comparing Eqs. (7-40) and (7-42), we find that balancing astigmatism with defocus reduces its sigma value by a factor of 1.23.

7.7.3 Coma

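Now we consider Seidel coma:

$$W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{7-43}$$

The hexagonal polynomial representing balanced coma is

$$H_8 = 4\sqrt{42/3685}\, \left( 25\rho^3 - 14\rho \right) \cos\theta . \tag{7-44}$$

It shows that the relative amount of tilt ρ cos θ that optimally balances Seidel coma ρ³ cos θ is −14/25 = −0.56 compared to −2/3 for a circular pupil. The diffraction focus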

in this case lies along the x axis at a distance of - ( 4 3) F times the amount of tilt from

the Gaussian image point. The balanced coma is given by

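$$W_{bc}(\rho, \theta) = A_c \left( \rho^3 - \frac{14}{25} \rho \right) \cos\theta . \tag{7-45}$$

Its sigma value is

$$\sigma_{bc} = \frac{A_c}{20} \sqrt{\frac{737}{210}} = \frac{A_c}{10.676} . \tag{7-46}$$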

To obtain the sigma value of Seidel coma, we write Eq. (7-43) in the form

$$W_c(\rho, \theta) = A_c \left[ \frac{1}{100} \sqrt{\frac{3685}{42}}\, H_8 + \frac{7}{25} \sqrt{\frac{5}{6}}\, H_2 \right] . \tag{7-47}$$

Its sigma value is

$$\sigma_c = \frac{A_c}{4} \sqrt{\frac{83}{70}} = \frac{A_c}{3.673} . \tag{7-48}$$

Comparing Eqs. (7-46) and (7-48), we find that balancing coma with tilt reduces its sigma value by a factor of 2.91.

7.7.4 Spherical Aberration

Finally, we consider Seidel spherical aberration:

$$W_s(\rho) = A_s \rho^4 . \tag{7-49}$$

The hexagonal polynomial representing balanced spherical aberration is

$$H_{11} = \frac{60}{\sqrt{1072205}} \left( 301\rho^4 - 257\rho^2 \right) + \mathrm{constant} . \tag{7-50}$$

It shows that the relative amount of defocus that optimally balances Seidel spherical aberration ρ⁴ is −257/301 ≈ −0.85 compared to a value of −1 for a circular pupil. The diffraction focus lies closer to the Gaussian image point in the case of coma, and closer to the Gaussian image plane in the case of spherical aberration, compared to their corresponding locations for a circular pupil. The balanced spherical aberration is given by

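$$W_{bs}(\rho) = A_s \left( \rho^4 - \frac{257}{301} \rho^2 \right) . \tag{7-51}$$

Its sigma value is

$$\sigma_{bs} = \frac{A_s}{60 \times 301} \sqrt{1072205} = \frac{A_s}{84} \sqrt{\frac{4987}{215}} = \frac{A_s}{17.441} . \tag{7-52}$$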

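To obtain the sigma value of Seidel spherical aberration, we write Eq. (7-49) in the form

$$W_s(\rho) = A_s \left[ \frac{\sqrt{1072205}}{60 \times 301}\, H_{11} + \frac{257}{12 \times 301} \sqrt{\frac{43}{5}}\, H_4 \right] + \mathrm{constant} . \tag{7-53}$$

Its sigma value is

$$\sigma_s = \frac{A_s}{6} \sqrt{\frac{59}{35}} = \frac{A_s}{4.621} . \tag{7-54}$$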

Comparing Eqs. (7-52) and (7-54), we find that balancing spherical aberration with defocus reduces its sigma value by a factor of 3.77.
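The sigma values quoted in Eqs. (7-36) through (7-54) can be cross-checked by direct integration over the hexagon. The sketch below is a numerical check under stated assumptions, not part of the original derivation. It assumes NumPy, sets every aberration coefficient A_i to unity, and uses a Gauss-Legendre rule over the rectangle and two triangles that make up the unit hexagon. The dictionary keys and function names are illustrative.

```python
import numpy as np

def hexagon_rule(n=12):
    """Nodes and weights such that sum(w * f) is the mean of f over the unit hexagon."""
    t, w = np.polynomial.legendre.leggauss(n)
    s3 = np.sqrt(3.0)
    pieces = [(-0.5, 0.5, lambda x: np.full_like(x, s3 / 2)),
              (0.5, 1.0, lambda x: s3 * (1 - x)),
              (-1.0, -0.5, lambda x: s3 * (1 + x))]
    xs, ys, ws = [], [], []
    for lo, hi, half in pieces:
        x = 0.5 * (hi - lo) * t + 0.5 * (hi + lo)
        h = half(x)
        xs.append(np.repeat(x, n))
        ys.append(np.outer(h, t).ravel())
        ws.append(np.outer(0.5 * (hi - lo) * w * h, w).ravel())
    x, y, wt = (np.concatenate(a) for a in (xs, ys, ws))
    return x, y, wt / (1.5 * s3)

x, y, wt = hexagon_rule()
r2 = x**2 + y**2                      # rho^2; note rho*cos(theta) = x

def sigma(W):
    """Standard deviation of W over the unit hexagon."""
    mean = np.sum(wt * W)
    return np.sqrt(np.sum(wt * W**2) - mean**2)

# Seidel aberrations with unit coefficient, with and without balancing
seidel = {
    "defocus":              r2,
    "astigmatism":          x**2,                      # rho^2 cos^2(theta)
    "balanced astigmatism": x**2 - r2 / 2,
    "coma":                 r2 * x,                    # rho^3 cos(theta)
    "balanced coma":        (r2 - 14 / 25) * x,
    "spherical":            r2**2,
    "balanced spherical":   r2**2 - (257 / 301) * r2,
}

inv_sigma = {name: 1.0 / sigma(W) for name, W in seidel.items()}
for name, value in inv_sigma.items():
    print(f"{name:22s} sigma = A / {value:.3f}")
```

The printed divisors can be compared with the denominators in Eqs. (7-36), (7-40), (7-42), (7-46), (7-48), (7-52), and (7-54), and the ratios between balanced and unbalanced values with the reduction factors quoted in the text.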


The sigma values of the Seidel aberrations with and without balancing are given in

Table 7-8. The corresponding peak-to-valley (P-V) numbers for a sigma value of unity

are also given in the table.

In Figure 7-10, we showed the Strehl ratio for the hexagonal polynomial aberrations with a sigma value of one-tenth of a wave. In Figure 7-11, we show how it varies with the sigma value of a Seidel aberration, with and without balancing, for 0 ≤ σ_W ≤ 0.25. Also plotted is the Strehl ratio obtained from the approximate expression exp(−σ_Φ²) as the dashed curve. As expected, the exponential expression yields a very good estimate of the Strehl ratio for σ_W ≤ 0.1. As σ_W increases, the true Strehl ratio departs from its approximate value, except in the case of balanced astigmatism and balanced coma. The approximate expression overestimates the Strehl ratio in the case of defocus, but underestimates it for the other aberrations. Moreover, the Strehl ratio for the balanced spherical aberration for large values of σ_W is larger than that for the corresponding Seidel aberration, but the opposite is true in the case of astigmatism and coma. The aberration coefficient and the P-V number for a certain value of σ_W of these aberrations can be obtained from Table 7-8.

7.8 SUMMARY

Closed-form expressions for the aberration-free PSF and OTF are given for a system

with a hexagonal pupil. They are plotted along with the ensquared power, and compared

with the corresponding quantities for a system with a corresponding circular pupil. The

ensquared power and the OTF for a hexagonal pupil are shown to be lower than the

corresponding values for a circular pupil. Generally, the quantitative differences between

the corresponding functions for the two pupils are small, perhaps because the difference

in the pupil area is only about 16%.

Table 7-8. Sigma value of a Seidel aberration with and without balancing, and P-V

numbers for a sigma value of unity, where Ai is the aberration coefficient.


Figure 7-11. Strehl ratio S as a function of the sigma value σ_W of a Seidel aberration with and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical aberration.


The polynomials that are orthonormal over a unit hexagonal pupil, and represent balanced classical aberrations over such a pupil, are given through the eighth order in Tables 7-2 through 7-4 in terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates, respectively. The polynomials are ordered in the same manner as the circle, annular, and Gaussian polynomials discussed in Chapters 4, 5, and 6, respectively. However, unlike these polynomials, the hexagonal polynomials are generally not separable in the coordinates ρ and θ of a pupil point due to a lack of radial symmetry of the hexagonal pupil. The first 13 polynomials, i.e., up to H13, are separable, but H14 and H15 are not; H16 through H19 are separable, but H20 and H21 are not. Accordingly, the concept of two indices n and m with dependence on m in the form of cos mθ or sin mθ loses significance. For example, the Zernike circle polynomial Z14 for n = 4 and m = 4 varies as cos 4θ, but H14 has a term in cos 2θ also. Hence, the hexagonal polynomials can be ordered by a single index only. Even so, each polynomial contains only the cosine or the sine terms. Thus an even j polynomial, for example, consists of only the cosine terms, as may be seen from Table 7-2.

While the polynomials H11 and H22 representing balanced primary and secondary

spherical aberrations are radially symmetric, the polynomial H37 representing balanced

tertiary spherical aberration is not, since it also contains an angle-dependent term in Z28 or

cos 6θ. If this term is not included in the polynomial H37, the standard deviation of

the aberration increases from a value of unity to 1.13339.

In practice, the polynomials in Cartesian coordinates given in Table 7-4 will be used

for the analysis of aberration data of a hexagonal wavefront. A somewhat different set of

hexagonal polynomials is obtained when the hexagon is rotated by 30 degrees. These

polynomials are given in Table 7-5.

The first 45 hexagonal polynomials, i.e., up to and including the 8th order, are

illustrated by an isometric plot, an interferogram, and a PSF in Figure 7-9. The coefficient

of each orthonormal polynomial, or the sigma value of the corresponding aberration, is

one wave. Their corresponding P-V numbers for a sigma value of one wave are given in

Table 7-6 in units of wavelength. The Strehl ratio for a sigma value of 0.1λ for each

aberration is given in Table 7-7 and illustrated in Figure 7-10. It shows that, for a small

aberration, the Strehl ratio can be estimated from the aberration variance. The sigma

values of the Seidel aberrations and their balanced forms are given, along with their P-V

numbers in Table 7-8.

The diffraction focus for a system with a hexagonal pupil is shown to lie closer to the

Gaussian image point in the case of coma, and closer to the Gaussian image plane in the

case of spherical aberration, compared to their corresponding locations for a circular

pupil. Figure 7-11 shows how the Strehl ratio varies with the sigma value of a Seidel aberration, with and without balancing. The approximate expression exp(−σ_Φ²) overestimates its value in the case of defocus, but underestimates it for the other aberrations.


References

1. keckobservatory.org/

2. [...] Bergelnad, and B. B. Gallagher, “James Webb telescope optical telescope element mirror development history and results,” in Space Telescopes and Instrumentation, Proc. SPIE, 84422 (2012).

3. [...] telescopes,” Appl. Opt. 42, 3745–3753 (2003).

4. [...] telescope with hexagonal segments for high-contrast imaging,” Appl. Opt. 44, 1360–1365 (2005).

5. [...] Soc. Am. 64, 798–803 (1974).

6. [...] Appl. Opt. 52, 5112–5122 (2013).

7. G. Chanan and M. Troy, “Strehl ratio and modulation transfer function for segmented mirror telescopes as functions of segment phase error,” Appl. Opt. 38, 6642–6647 (1999).

8. [...] telescopes: detailed theoretical point-spread function analysis and numerical simulation results,” J. Opt. Soc. Am. A 19, 1274–1285 (2003).

9. [...] analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc. Am. A 29, 1673–1674 (2012).

10. [...] Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–11.41 (McGraw Hill, 2009).


Chapter 8

Systems with Elliptical Pupils

8.1 INTRODUCTION

The pupil of a human eye is slightly elliptical [1]. The pupil for off-axis imaging by a

system with an axial circular pupil may be vignetted, but can be approximated by an

ellipse [2]. When a flat mirror is tested by shining a circular beam on it at some angle

(other than normal incidence), the illuminated spot is elliptical. Similarly, the overlap

region of two circular wavefronts that are displaced from each other, as in lateral shearing

interferometry [3] or in the calculation of the optical transfer function of a system [4], can

also be approximated by an ellipse.

Starting with the pupil function of a system with an elliptical pupil, we scale the

coordinates of a point on the pupil and transform it to a circular pupil. The aberration-free

PSF and OTF are then obtained as for a system with a circular pupil. The corresponding

PSF and OTF obtained by unscaling the coordinates represent the results for the elliptical

pupil. Then we discuss the polynomials that are orthonormal over and represent balanced

classical aberrations for a unit elliptical pupil [5]. These polynomials cannot be obtained

by scaling the coordinates of the Zernike circle polynomials. The balancing of a Seidel

aberration over an elliptical pupil is discussed, and its standard deviation with and

without balancing is determined.

As illustrated in Figure 8-1a, consider an imaging system with an elliptical exit pupil with semimajor and semiminor axes a and b and area S_ex = πab lying in the (x_p, y_p) plane with z axis as its optical axis. The pupil is described by

$$\frac{x_p^2}{a^2} + \frac{y_p^2}{b^2} \le 1 . \tag{8-1}$$

Its aspect ratio is

$$c = b/a \le 1 . \tag{8-2}$$

For a uniformly illuminated pupil with an aberration function Φ(x_p, y_p) and power P_ex exiting from it, the pupil function of the system can be written

$$P(x_p, y_p) = A(x_p, y_p) \exp\left[ i\Phi(x_p, y_p) \right] , \tag{8-3}$$

where

$$A(x_p, y_p) = \left( P_{ex} / S_{ex} \right)^{1/2} \tag{8-4}$$


Figure 8-1. (a) Elliptical pupil with semimajor and semiminor axes $a$ and $b$. (b) Elliptical pupil transformed into a circular pupil by scaling its $y_p$ coordinate.

An elliptical pupil can be transformed to a circular pupil by scaling its coordinates.

Using the results for a circular pupil, the PSF [6] and OTF [7] of an elliptical pupil can be

written in this scaled coordinate system. Unscaling the coordinates finally yields the PSF

and OTF for a system with an elliptical pupil.

8.3.1 PSF

From Eq. (1-9), the aberrated irradiance distribution in the image plane of a system with a uniformly illuminated elliptical exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written

$I(x_i, y_i) = \dfrac{1}{S_{ex}^2}\left|\displaystyle\iint \exp[i\Phi(x_p, y_p)]\exp\left[-\dfrac{2\pi i}{\lambda R}(x_ix_p + y_iy_p)\right]dx_p\,dy_p\right|^2$ ,   (8-5)

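where the integration is carried over the elliptical pupil. Using the scaled pupil coordinates $(x'_p, y'_p)$, where

$(x'_p, y'_p) = (x_p, y_p/c)$ ,   (8-6)

the elliptical pupil is transformed into a circular pupil of radius $a$. The coordinates of a point in the image plane are scaled correspondingly as

$(x'_i, y'_i) = (x_i, c\,y_i)$ ,   (8-8)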

because of the Fourier transform relationship between the pupil function and the diffracted amplitude. In the scaled coordinates, Eq. (8-5) for the aberration-free case becomes

$I(x'_i, y'_i; c) = \dfrac{c^2}{\pi^2}\left|\displaystyle\iint_{\rm circle}\exp\left[-\dfrac{2\pi i}{\lambda R}(x'_ix'_p + y'_iy'_p)\right]dx'_p\,dy'_p\right|^2$ .   (8-9)

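In polar coordinates $(\rho'_p, \theta'_p)$ and $(r'_i, \theta'_i)$ for the pupil and image points, we can write

$(x'_p, y'_p) = \rho'_p(\cos\theta'_p, \sin\theta'_p) = a\rho(\cos\theta'_p, \sin\theta'_p)$   (8-10)

and

$(x'_i, y'_i) = r'_i(\cos\theta'_i, \sin\theta'_i)$ ,   (8-11)

so that the irradiance distribution can be written in the form

$I(r, \theta'_i; c) = \dfrac{1}{\pi^2}\left|\displaystyle\int_0^1\!\!\int_0^{2\pi}\exp[-\pi i r\rho\cos(\theta'_i - \theta'_p)]\,\rho\,d\rho\,d\theta'_p\right|^2$ ,   (8-12)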

where

$r = \dfrac{r'_i}{\lambda R/2a} = \dfrac{r'_i}{\lambda F_x}$ ,   (8-13)

and

$F_x = R/2a$   (8-14)

is the focal ratio of the image-forming light cone along the $x_p$ axis.

For the aberration-free case, we let $\Phi(\rho, \theta'_p) = 0$ and perform the integration as for a circular pupil. Thus, we obtain

$I(r) = \left[\dfrac{2J_1(\pi r)}{\pi r}\right]^2$ .   (8-15)

In terms of the unscaled image-plane coordinates, for which $r = (x^2 + c^2y^2)^{1/2}$, this becomes

$I(x, y; c) = \left\{\dfrac{2J_1\left[\pi(x^2 + c^2y^2)^{1/2}\right]}{\pi(x^2 + c^2y^2)^{1/2}}\right\}^2$ ,   (8-16)

where $(x, y)$ are image plane coordinates in units of $\lambda F_x$. The fractional power contained in an elliptical ring can be obtained in a similar manner from the corresponding equation for a circular pupil, namely, Eq. (4-11). Thus, the fractional power in an elliptical ring with semimajor and semiminor axes $x_c$ and $y_c$ with $y_c = cx_c$ is given by

$P(x_c, y_c; c) = 1 - J_0^2\left(\pi\sqrt{x_c^2 + c^2y_c^2}\right) - J_1^2\left(\pi\sqrt{x_c^2 + c^2y_c^2}\right)$ .   (8-17)


The distribution given by Eq. (8-16) approaches the Airy pattern for a circular pupil as we let the aspect ratio $c \to 1$. We also note that the relative irradiance at a point $(x, y/c)$ is equal to the relative irradiance of the Airy pattern at a point $(x, y)$. However, the central irradiance for the elliptical pupil is equal to $c^2$ times the central value of the Airy pattern. This is due to the area of the elliptical pupil being equal to $c$ times that of the circular pupil, and the power incident on and exiting from the elliptical pupil also being equal to $c$ times that for the circular pupil.

Figure 8-2a shows the 2D PSF for $c = 0.85$. It is evident that the circular diffraction rings of a circular pupil have been replaced by the elliptical diffraction rings of an elliptical pupil. The dimension of a ring is larger in the direction of the smaller dimension of the pupil, with an aspect ratio of $1/c$. Figure 8-2b shows the irradiance distribution along the x and y axes, and at 45° from the x axis. The first zero occurs at 1.22 (in units of $\lambda F_x$) along the x axis, as in the Airy pattern, at 1.22/0.85 or about 1.44 along the y axis, and at about 1.32 at 45° from the x axis [see the curve $I(r) \equiv I(x = y)$].
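The numbers quoted above can be checked with a short sketch of Eq. (8-16). The function names below are illustrative and not from the text; the Bessel function is evaluated from its standard integral representation so that only the Python standard library is needed, and the constant 1.2197 is the first zero of $J_1(\pi r)$, i.e., 3.8317/π.

```python
import math

def bessel_j1(x, n=400):
    """J1(x) = (1/pi) * integral_0^pi cos(t - x sin t) dt, by the trapezoidal rule."""
    h = math.pi / n
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, n):
        t = k * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def psf_elliptical(x, y, c):
    """Aberration-free PSF of Eq. (8-16); x and y are in units of lambda*Fx."""
    r = math.hypot(x, c * y)
    if r < 1e-9:
        return 1.0                      # limiting value at the center
    return (2.0 * bessel_j1(math.pi * r) / (math.pi * r)) ** 2

def first_zero(angle_deg, c, r0=1.2197):
    """Distance to the first dark ring along a line at angle_deg from the x axis."""
    phi = math.radians(angle_deg)
    return r0 / math.hypot(math.cos(phi), c * math.sin(phi))

for angle in (0, 45, 90):
    print(angle, round(first_zero(angle, 0.85), 3))
```

With the unrounded zero 1.2197 the sketch gives slightly smaller values along the y axis and at 45° than the rounded figures quoted in the text, which start from 1.22.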

Figure 8-2. (a) 2D aberration-free PSF for $c = 0.85$. (b) Irradiance distribution along the x and y axes, and at 45° from the x axis, where x, y, and r are in units of $\lambda F_x$.

8.3.2 OTF

The OTF of an aberration-free system at a spatial frequency $\vec{v}_i$ is given by [see Eq. (2-13)]

$\tau(\vec{v}_i) = P_{ex}^{-1}\displaystyle\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\,\vec{v}_i)\,d\vec{r}_p$ .   (8-18)

It represents the fractional area of overlap of two elliptical pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the spatial frequency vector $\vec{v}_i$. In the scaled coordinates $(x'_p, y'_p)$, as in Eq. (8-6), the elliptical pupil reduces to a circular pupil of radius $a$. The overlap area of two circular pupils, each of radius $a$, with their origins at $(0, 0)$ and $(x'_0, y'_0)$ is given by

$S(x'_0, y'_0; a) = 2a^2\left\{\cos^{-1}\left(\dfrac{r'_0}{2a}\right) - \left(\dfrac{r'_0}{2a}\right)\left[1 - \left(\dfrac{r'_0}{2a}\right)^2\right]^{1/2}\right\}$ ,   (8-19)

where

$r'_0 = (x'^2_0 + y'^2_0)^{1/2}$ .   (8-20)

Letting

$(x'_0, y'_0) = \lambda R(\xi, \eta/c)$   (8-21)

and noting that the overlap area is to be multiplied by $c$ when writing it in the unscaled coordinates, the OTF of a system with an elliptical pupil can be written from Eq. (8-19) in the form

$\tau(v_x, v_y) = \dfrac{2}{\pi}\left[\cos^{-1}v_e - v_e(1 - v_e^2)^{1/2}\right]$ ,   (8-22)

where

$v_e = \left(v_x^2 + \dfrac{v_y^2}{c^2}\right)^{1/2}$   (8-23)

and

$(v_x, v_y) = \left(\dfrac{\xi}{1/\lambda F_x}, \dfrac{\eta}{1/\lambda F_x}\right)$   (8-24)

are the spatial frequency components normalized by the cutoff frequency $1/\lambda F_x$ along the x axis.

The OTF is nonzero only for $v_e < 1$, so that the frequency components are limited to $0 \le v_x \le 1$ and $0 \le v_y \le c$. Hence, the cutoff spatial frequency varies with its orientation. Thus, for example, the cutoff frequencies along the x and y axes are 1 and $c$, respectively. A smaller cutoff frequency along the y axis is the spatial frequency analog of the larger diffraction spread due to the smaller dimension of the pupil along this axis. For an arbitrary direction making an angle $\theta$ with the x axis, the cutoff frequency is given by $c\,[1 - (1 - c^2)\cos^2\theta]^{-1/2}$, and represents the distance from the center of a unit ellipse to the point where a line passing through the center and making an angle $\theta$ with the x axis meets it. For example, the cutoff frequency for 45° is equal to 0.916 when $c = 0.85$.
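As a quick numerical check of Eqs. (8-22) and (8-23) and of the cutoff expression above, the following sketch evaluates both with the standard library. The function names are illustrative only; frequencies are in units of $1/\lambda F_x$.

```python
import math

def otf_elliptical(vx, vy, c):
    """Aberration-free OTF of Eq. (8-22) for an elliptical pupil of aspect ratio c."""
    ve = math.hypot(vx, vy / c)          # Eq. (8-23)
    if ve >= 1.0:
        return 0.0                       # beyond the cutoff the pupils do not overlap
    return (2.0 / math.pi) * (math.acos(ve) - ve * math.sqrt(1.0 - ve * ve))

def cutoff_frequency(angle_deg, c):
    """Cutoff frequency along a direction at angle_deg from the x axis."""
    q = math.radians(angle_deg)
    return c / math.sqrt(1.0 - (1.0 - c * c) * math.cos(q) ** 2)

for angle in (0, 45, 90):
    print(angle, round(cutoff_frequency(angle, 0.85), 3))
```

For $c = 0.85$ the three directions give cutoffs of 1, about 0.916, and 0.85, in agreement with the values stated in the text.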

Figure 8-3 shows the OTF for $c = 0.85$ along the x and y axes, and at 45° from the x axis as $\tau(v_x)$, $\tau(v_y)$, and $\tau(v_x = v_y) \equiv \tau(v_e)$ with the corresponding cutoff spatial frequencies of 1, 0.85, and 0.916, respectively, each in units of $1/\lambda F_x$. It should be evident that $\tau(v_x)$ is obtained from Eq. (8-22) by letting $v_y = 0$. Similarly, $\tau(v_y)$ is obtained by letting $v_x = 0$. Moreover, the OTF along the x axis is the same as for a corresponding circular pupil.

Figure 8-3. OTF of a system with an elliptical pupil with aspect ratio $c = 0.85$, along the x and y axes, and at 45° from the x axis, where $v_x$, $v_y$, and $v_e$ are all in units of $1/\lambda F_x$.

8.4 ELLIPTICAL POLYNOMIALS

In Section 8.3, we obtained the aberration-free PSF and OTF by scaling the

coordinates of the elliptical pupil and thereby transforming it into a circular pupil, and

then using the PSF and OTF of a circular pupil. Similarly, by scaling the coordinates of

the Zernike circle polynomials we can obtain polynomials that are orthogonal over an

elliptical pupil. However, these elliptical polynomials do not represent the balanced

classical aberrations for a system with an elliptical pupil. To obtain the polynomials that

are orthogonal over and represent balanced aberrations for an elliptical pupil, we

orthogonalize the Zernike circle polynomials over the elliptical pupil [7,8].

Figure 8-4 shows a unit ellipse of an aspect ratio $c$ inscribed inside a unit circle. Thus the semimajor and semiminor axes $a$ and $b$ of the ellipse have been normalized by $a$ so that the farthest point(s) on the ellipse lie at a distance of unity. The unit ellipse is represented by the equation

$x^2 + y^2/c^2 = 1$ ,   (8-25)

or

$y = \pm c\sqrt{1 - x^2}$ .   (8-26)

The orthonormal elliptical polynomials $E_j$ obtained by orthogonalizing the Zernike circle polynomials $Z_j$ over a unit ellipse are given by [see Eq. (3-18)]

$E_{j+1} = N_{j+1}\left[Z_{j+1} - \displaystyle\sum_{k=1}^{j}\langle Z_{j+1}E_k\rangle E_k\right]$ ,   (8-27)

Figure 8-4. Unit ellipse of aspect ratio $c$ inscribed inside a unit circle with its semimajor axis of unity along the x axis.

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit ellipse, i.e., they satisfy the orthonormality condition

$\dfrac{1}{\pi c}\displaystyle\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}E_jE_{j'}\,dy = \delta_{jj'}$ .   (8-28)

The angular brackets indicate a mean value over the elliptical pupil. Thus, for example,

$\langle Z_jE_k\rangle = \dfrac{1}{\pi c}\displaystyle\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}Z_jE_k\,dy$ .   (8-29)

It should be evident that because of the symmetric limits of integration, a mean value is zero if the integrand is an odd function of x and/or y. If the integrand is an even function, then we may replace the lower limits of integration by zero and multiply the double integral by 4.
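The orthogonalization of Eq. (8-27) can be sketched numerically for the first six circle polynomials. This is only an illustration under stated assumptions: the helper names `ellipse_mean` and `gram_schmidt` are hypothetical, the mean of Eq. (8-29) is evaluated by Simpson's rule after the substitution $x = r\cos t$, $y = cr\sin t$ (a numerical convenience, not the analytical route used to derive the tables), and the Cartesian forms of $Z_1$ to $Z_6$ are the standard ones.

```python
import math

def ellipse_mean(f, c, n_r=40, n_t=32):
    """Mean of f(x, y) over the unit ellipse: 2 * int_0^1 r dr * (average over t)."""
    h = 1.0 / n_r
    total = 0.0
    for i in range(n_r + 1):
        r = i * h
        w = 1.0 if i in (0, n_r) else (4.0 if i % 2 else 2.0)   # Simpson weights
        ring = sum(f(r * math.cos(2 * math.pi * (k + 0.5) / n_t),
                     c * r * math.sin(2 * math.pi * (k + 0.5) / n_t))
                   for k in range(n_t)) / n_t
        total += w * r * ring
    return 2.0 * total * h / 3.0

SQ3, SQ6 = math.sqrt(3), math.sqrt(6)
ZERNIKE = [                                   # Z1..Z6 in Cartesian form
    lambda x, y: 1.0,
    lambda x, y: 2.0 * x,
    lambda x, y: 2.0 * y,
    lambda x, y: SQ3 * (2.0 * (x * x + y * y) - 1.0),
    lambda x, y: 2.0 * SQ6 * x * y,
    lambda x, y: SQ6 * (x * x - y * y),
]

def gram_schmidt(polys, c):
    """Orthonormalize polys over the unit ellipse, following Eq. (8-27)."""
    ortho = []
    for z in polys:
        proj = [ellipse_mean(lambda x, y: z(x, y) * e(x, y), c) for e in ortho]
        prev = list(ortho)
        def g(x, y, z=z, proj=proj, prev=prev):
            return z(x, y) - sum(a * e(x, y) for a, e in zip(proj, prev))
        norm = math.sqrt(ellipse_mean(lambda x, y: g(x, y) ** 2, c))
        ortho.append(lambda x, y, g=g, norm=norm: g(x, y) / norm)
    return ortho

E = gram_schmidt(ZERNIKE, 0.85)               # E[0] is E1, ..., E[5] is E6
print(round(E[3](0.0, 0.0), 4))               # constant term of E4 for c = 0.85
```

The value printed for $E_4$ at the origin can be compared with the constant term of the defocus polynomial for $c = 0.85$ in Table 8-5.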

The orthonormal elliptical polynomials up to the fourth order are given in Tables 8-1 through 8-3 in three different but equivalent forms, as in the case of hexagonal polynomials. The expressions for higher-order elliptical polynomials are very long unless the aspect ratio $c$ is specified. As in the case of a hexagonal pupil, each elliptical polynomial consists of either cosine or sine terms, but not both. For example, $E_6$ is a linear combination of $Z_6$, $Z_4$, and $Z_1$. It also shows that the balancing defocus for (zero-degree) Seidel astigmatism is different for an elliptical pupil compared to that for a circular, annular, or a Gaussian pupil, as may be seen from Table 4-2, 5-2, or 6-2, respectively. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$ as well. The elliptical polynomials are generally more complex in that they are made up of a larger number of circle polynomials. These results are a consequence of the fact that the x and y dimensions of the elliptical pupil are not equal. As expected, the elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse approaches a unit circle.

8.5 ELLIPTICAL COEFFICIENTS OF AN ELLIPTICAL ABERRATION FUNCTION

An elliptical aberration function $W(x, y)$ across a unit ellipse can be expanded in terms of $J$ elliptical polynomials $E_j(x, y)$ in the form

$W(x, y) = \displaystyle\sum_{j=1}^{J}a_jE_j(x, y)$ ,   (8-30)

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (8-30) by

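Table 8-1. Orthonormal elliptical polynomials $E_j$ in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$.

$E_1 = Z_1$

$E_2 = Z_2$

$E_3 = Z_3/c$

$E_4 = \left(1/\sqrt{3 - 2c^2 + 3c^4}\right)\left[\sqrt{3}(1 - c^2)Z_1 + 2Z_4\right]$

$E_5 = Z_5/c$

$E_6 = \left[1/\left(2\sqrt{2}\,c^2\sqrt{3 - 2c^2 + 3c^4}\right)\right]\left[-\sqrt{3}(3 - 4c^2 + c^4)Z_1 - 3(1 - c^4)Z_4 + \sqrt{2}(3 - 2c^2 + 3c^4)Z_6\right]$

$E_7 = \left[1/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[6(1 - c^2)Z_3 + 2\sqrt{2}Z_7\right]$

$E_8 = \left(2/\sqrt{9 - 6c^2 + 5c^4}\right)\left[(1 - c^2)Z_2 + \sqrt{2}Z_8\right]$

$E_9 = \left[1/\left(2\sqrt{2}\,c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-2\sqrt{2}(5 - 8c^2 + 3c^4)Z_3 - (5 - 2c^2 - 3c^4)Z_7 + (5 - 6c^2 + 9c^4)Z_9\right]$

$E_{10} = \left[1/\left(2\sqrt{2}\,c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left[-2\sqrt{2}(3 - 4c^2 + c^4)Z_2 - (3 + 2c^2 - 5c^4)Z_8 + (9 - 6c^2 + 5c^4)Z_{10}\right]$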

E12 5 / 8 c 2(195 475c2 + 558c4 422c6 + 159c8 15c10)ȕ 1Z1 15 / 8 c 2(105 205c2

+ 194c4 114c6 + 5c8 + 15c10)ȕ 1Z4 + (1/2) 15 c 2 (75 155c2 + 174c4 134c6 + 55c8 15c10) ȕ 1Z6

6

10 2 c 2(3 2c2 +2c 3c8)ȕ 1Z11 + c 2ĮȖ 1Z12

$E_{13} = \left[1/\left(c\sqrt{5 - 6c^2 + 5c^4}\right)\right]\left[\sqrt{15}(1 - c^2)Z_5 + 2Z_{13}\right]$

(7 + 2c2 c4)Ȗ 1Z4

( 15 /8)c 4 (35 70c2 + 56c4 26c6 + 5c8)Ȗ 1Z6 + (5/8 2 ) (1 c2)2c 4(7 + 10c2 + 7c4)Ȗ 1Z11

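$E_{15} = -(\sqrt{15}/4)\,c^{-3}(5 - 8c^2 + 3c^4)\,\delta^{-1}Z_5 - (5/4)(1 - c^4)\,c^{-3}\delta^{-1}Z_{13} + (\delta/2c^3)\,Z_{15}$

_______________________________________________________________________

$\alpha = (45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}$

$\beta = (1575 - 4800c^2 + 12020c^4 - 17280c^6 + 21066c^8 - 17280c^{10} + 12020c^{12} - 4800c^{14} + 1575c^{16})^{1/2}$

$\gamma = (35 - 60c^2 + 114c^4 - 60c^6 + 35c^8)^{1/2}$

$\delta = (5 - 6c^2 + 5c^4)^{1/2}$

$\alpha\gamma = \beta$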

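Table 8-2. Orthonormal elliptical polynomials $E_j(\rho, \theta)$ in polar coordinates.

$E_1 = 1$

$E_2 = 2\rho\cos\theta$

$E_3 = (2\rho\sin\theta)/c$

$E_4 = \left(\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4}\right)(-1 - c^2 + 4\rho^2)$

$E_5 = (\sqrt{6}/c)\,\rho^2\sin 2\theta$

$E_6 = \left[\sqrt{6}/\left(2c^2\sqrt{3 - 2c^2 + 3c^4}\right)\right]\left[2c^2(1 - c^2) - 3(1 - c^4)\rho^2 + (3 - 2c^2 + 3c^4)\rho^2\cos 2\theta\right]$

$E_7 = \left[4/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-(1 + 3c^2)\rho + 6\rho^3\right]\sin\theta$

$E_8 = \left(4/\sqrt{9 - 6c^2 + 5c^4}\right)\left[-(3 + c^2)\rho + 6\rho^3\right]\cos\theta$

$E_9 = \left[1/\left(c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left\{\left[12c^2(1 - c^2)\rho - 3(5 - 2c^2 - 3c^4)\rho^3\right]\sin\theta + (5 - 6c^2 + 9c^4)\rho^3\sin 3\theta\right\}$

$E_{10} = \left[1/\left(c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left\{\left[12c^2(1 - c^2)\rho - 3(3 + 2c^2 - 5c^4)\rho^3\right]\cos\theta + (9 - 6c^2 + 5c^4)\rho^3\cos 3\theta\right\}$

$E_{11} = (\sqrt{5}/\alpha)\left[3 + 2c^2 + 3c^4 - 24(1 + c^2)\rho^2 + 48\rho^4 - 12(1 - c^2)\rho^2\cos 2\theta\right]$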

E12 [ 10 Į/(Ȗc2)]( 3ȡ2 + 4ȡ4) cos2ș + [ 5 2 /(2c2ȕ)][ 12c2(5 2c2 + 2c6 5c8)

+ 4[6c2(5 7c2 + 7c4 5c6) 5(7 6c2 +6c6 7c8)ȡ2]ȡ2 cos2ș + (35 60c2

$E_{15} = (\sqrt{10}/c^3)\,\delta^{-1}\left\{\left[6c^2(1 - c^2) - 5(1 - c^4)\rho^2\right]\rho^2\sin 2\theta + \left[(5 - 6c^2 + 5c^4)/2\right]\rho^4\sin 4\theta\right\}$

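Table 8-3. Orthonormal elliptical polynomials $E_j(x, y)$ in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.

$E_1 = 1$

$E_2 = 2x$

$E_3 = 2y/c$

$E_4 = \left(\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4}\right)(-1 - c^2 + 4\rho^2)$

$E_5 = (2\sqrt{6}/c)\,xy$

$E_6 = \left[\sqrt{6}/\left(c^2\sqrt{3 - 2c^2 + 3c^4}\right)\right]\left[c^2(1 - c^2) + c^2(3c^2 - 1)x^2 - (3 - c^2)y^2\right]$

$E_7 = \left[4/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-(1 + 3c^2) + 6\rho^2\right]y$

$E_8 = \left(4/\sqrt{9 - 6c^2 + 5c^4}\right)\left[-(3 + c^2) + 6\rho^2\right]x$

$E_9 = \left[4/\left(c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[3c^2(3c^2 - 1)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2)\right]y$

$E_{10} = \left[4/\left(c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left[c^2(5c^2 - 3)x^2 - 3(3 - c^2)y^2 + 3c^2(1 - c^2)\right]x$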

2

+ 3c8)ȡ4 í 60(í 9 + 3c2 +2c4 í 6c6 +7c8 +3c10)x í 24(15 í 70c2 + 92c4 í 82c6

E14 = ( 10 /c4Ȗ)[c4(3 í 30c2 + 35c4)x4 +6c2(5 í 18c2 + 5c4)x2y2 + (35 í 30c2 +3c4)y4

$E_j(x, y)$, integrating over the unit ellipse, and using the orthonormality Eq. (8-28), we obtain the elliptical expansion coefficients:

$a_j = \dfrac{1}{\pi c}\displaystyle\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}W(x, y)\,E_j(x, y)\,dy$ .   (8-31)

As stated in Section 3.2, it is evident from Eq. (8-31) that the value of an elliptical coefficient is independent of the number $J$ of polynomials used in the expansion of the aberration function. Hence, one or more terms can be added to or subtracted from the aberration function without affecting the value of the coefficients of the other polynomials in the expansion.

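The mean and mean square values of the aberration function are given by

$\langle W(\rho, \theta)\rangle = a_1$ ,   (8-32)

and

$\langle W^2(\rho, \theta)\rangle = \displaystyle\sum_{j=1}^{J}a_j^2$ ,   (8-33)

respectively. Accordingly, the variance of the aberration function is given by

$\sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \displaystyle\sum_{j=2}^{J}a_j^2$ .   (8-34)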
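Equations (8-31) through (8-34) can be illustrated with a small sketch for $c = 0.85$. The test aberration $W = x^2$ (i.e., $\rho^2\cos^2\theta$ with unit coefficient) is chosen here only as an example, the function names are hypothetical, and the closed forms of $E_1$, $E_4$, and $E_6$ are taken from the Cartesian table of this chapter. The integral of Eq. (8-31) is evaluated by Simpson's rule after the substitution $x = r\cos t$, $y = cr\sin t$, which is an implementation choice and not part of the text.

```python
import math

C = 0.85                                        # aspect ratio of the pupil
ETA = math.sqrt(3 - 2 * C**2 + 3 * C**4)

def e1(x, y):
    return 1.0

def e4(x, y):
    return math.sqrt(3) / ETA * (-1 - C**2 + 4 * (x * x + y * y))

def e6(x, y):
    return (math.sqrt(6) / (C**2 * ETA)
            * (C**2 * (1 - C**2) + C**2 * (3 * C**2 - 1) * x * x - (3 - C**2) * y * y))

def ellipse_mean(f, c, n_r=40, n_t=32):
    """Mean of f over the unit ellipse (Simpson's rule in r, equal steps in t)."""
    h = 1.0 / n_r
    total = 0.0
    for i in range(n_r + 1):
        r = i * h
        w = 1.0 if i in (0, n_r) else (4.0 if i % 2 else 2.0)
        ring = sum(f(r * math.cos(2 * math.pi * (k + 0.5) / n_t),
                     c * r * math.sin(2 * math.pi * (k + 0.5) / n_t))
                   for k in range(n_t)) / n_t
        total += w * r * ring
    return 2.0 * total * h / 3.0

def coefficient(w, e, c=C):
    """Expansion coefficient of Eq. (8-31): the mean of W * E_j over the ellipse."""
    return ellipse_mean(lambda x, y: w(x, y) * e(x, y), c)

def w(x, y):
    return x * x                                # example aberration, unit coefficient

a1, a4, a6 = (coefficient(w, e) for e in (e1, e4, e6))
sigma = math.sqrt(a4**2 + a6**2)                # Eq. (8-34): piston term a1 excluded
print(round(a1, 4), round(sigma, 4))
```

Because $x^2$ lies in the span of $E_1$, $E_4$, and $E_6$, the three coefficients account for the whole aberration, and the sigma obtained this way does not change with the aspect ratio.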

8.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ELLIPTICAL POLYNOMIAL ABERRATIONS

The first 45 elliptical polynomials for an elliptical pupil with an aspect ratio of $c = 0.85$ are given in Tables 8-4 to 8-6. They are illustrated in three different but equivalent ways in Figure 8-5. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 8-7.

The PSF plots, representing the images of a point object in the presence of a polynomial aberration and obtained by applying Eq. (8-5), are shown in Figure 8-5. The full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $E_1$ has no effect on the PSF, it yields an aberration-free PSF.

Table 8-4. Elliptical polynomials in terms of the Zernike circle polynomials for an elliptical pupil with an aspect ratio c = 0.85.

E1 Z1

E2 Z2

E3 1.1765Z3

E4 0.2721Z1 + 1.1321Z4

E5 1.17645Z5

E7 0.8458Z3 + 1.4369Z7

E8 0.2058Z2 + 1.0486Z8

E 13 0.6987Z5 + 1.3002Z13

1.1709Z22 + 1.7128Z24

0.9739Z24 + 1.6111Z26


+ 0.3021Z24 0.9317Z26 + 1.8545Z28

+ 1.7676Z31

0.7334Z30 + 1.6725Z32

1.1466Z31 + 1.7471Z33

1.1006Z32 + 1.7428Z34

0.1273Z29 + 0.4026Z31 1.1573Z33 + 2.0935Z35

0.0816Z30 + 0.3938Z32 1.1559Z34 + 2.0934Z36

1.1483Z24 + 1.4746Z37

+ 3.5769Z24 1.1361Z26 1.6754Z37 + 2.0633Z38

3.2919Z24 + 2.8973Z26 1.0342Z28 + 0.7704Z37 1.5053Z38 + 1.8782Z40

1.2074Z39 + 1.8441Z41

+ 1.5920Z24 2.8929Z26 + 2.8550Z28 0.2315Z37 + 0.5922Z38 1.3793Z40 + 1.9027Z42

E43 2.3202Z5 + 2.0734Z13 4.1210Z15 + 1.3161Z23 2.8314Z25 + 2.8448Z27 +

0.5132Z39 1.3637Z41 + 1.9013Z43

0.5119Z24 + 1.4601Z26 3.1647Z28 + 0.0514Z37 0.1605Z38 + 0.5359Z40

1.4192Z42 + 2.3730Z44

0.1454Z39 + 0.5331Z41 1.4187Z43 + 2.3730Z45


Table 8-5. Elliptical polynomials in polar coordinates for an elliptical pupil with an

aspect ratio c = 0.85.

$E_1 = 1$

$E_2 = 2\rho\cos\theta$

$E_3 = 2.3529\rho\sin\theta$

$E_4 = -1.6888 + 3.9217\rho^2$

$E_5 = 2.8818\rho^2\sin 2\theta$

$E_6 = 0.3848 - 1.3760\rho^2 + 2.9947\rho^2\cos 2\theta$

$E_8 = (-5.5205\rho + 8.8980\rho^3)\cos\theta$

E20 (0.5810 U 4.0436 U3 + 5.4933 U5)cosș + (6.9151U 3 12.5589 U5)cos3ș + 5.7134 U5cos5ș

6.1279U6)cos2ș 2.2230 U4cos4ș

16.9552U6)cos2 + (12.2137U4 20.9176U6)cos4ș + 6.9389 U6cos6ș


E28 ( 16.9428 U + 157.9560 U3 395.9030 U5 + 291.6410 U7)sin ș + (9.5563U3 17.3422 U5)sin3ș

E29 ( 15.0992 U + 120.4040 U3 257.3300 U5 + 161.3000U7)cosș + (5.7290 U3 9.5919 U5)cos3ș

+ 148.4790 U7)sin3ș 2.9431U 5sin5ș

+ 140.4930U7)cos3 ș 2.7848 U5cos5ș

99.6445 U5 96.3144U7)sin3ș + ( 34.2032U5 + 48.9185 U7)sin5ș

96.7544 U5 92.4529 U7)cos3ș + ( 34.1909U5 + 48.7981U 7)cos5ș

33.8176 U7)sin3ș + (19.7658U 5 32.4050U 7)sin5ș + 8.3740U7sin7ș

33.0808 U7)cos3ș + (19.7502U 5 32.3638U7)cos5ș + 8.3736U7cos7ș

63.4344 U4 64.4471U 6)cos2ș + 0.6387U 4cos4ș

325.0900 U4 718.3760U 6 + 490.2030 U8)cos2ș + (14.9649U 4 5.5058U6)cos4ș

199.4850 U4 + 485.8140U6 357.6390U 8)cos2ș + (78.6963U4 269.6320U 6 +

223.11800U8)cos4ș 3.8697U 6cos6ș

268.0850 U6 + 219.0700 U8)sin4ș 3.7995U6sin6ș

64.4168 U4 174.4750U6 + 140.7070U8)cos2ș + ( 47.0816U4 + 180.8360U 6

163.8540U8)cos4ș + ( 45.8256 U6 + 64.5805U8)cos6ș

179.4400U 6 162.0030U 8)sin4ș + ( 45.8214 U6 + 64.5323U8)sin6ș

+ 42.7822U 6 38.1414U8)cos2ș + (14.3621U 4 62.7180U 6 + 63.6642U8)cos4 ș

+ (30.3057U 6 48.1679U8)cos6ș + 10.0677 U8cos8ș

+ 63.3298 U8)sin4ș + (30.2998 U6 48.1523U8)sin6ș + 10.0676U8sin8ș

63.3298 U8)sin4ș + (30.2998 U6 48.1523U 8)sin6ș + 10.0676 U8sin8ș


Table 8-6. Elliptical polynomials in Cartesian coordinates for an elliptical pupil with

an aspect ratio c = 0.85.

E1 1

E2 2x

E3 2.3529y

E5 5.7635xy

218.0810x4y2 83.8249y4 + 218.0810x2y4 + 72.6936y6

163.5690xy5

89.7452x4y2 + 166.2710y4 282.0010x2y4 158.0860y6

169.8950x4y2 96.7384y4 60.5711x2y4 + 112.7040y6

30.4124y4 + 176.3200x2y4 49.9441y6


826.4900x2y3 + 874.9230x4y3 378.5610y5 + 874.9230x2y5 + 291.6410y7

+ 483.8990x5y2 228.5560xy4 + 483.8990x3y4 + 161.300xy6

+ 122.3080x4y3 + 418.6070y5 471.6090x2y5 355.1750y7

448.5390x5y2 + 622.2940xy4 1010.5100x3y4 524.1600xy6

508.6170x4y3 213.1970y5 319.0340x2y5 + 217.7490y7

221.2630x5y2 509.6140xy4 + 343.7410x3y4 + 563.1720xy6

15.4587x4y3 + 68.1755y5 + 447.8370x2y5 92.4234y7

48.0804x5y2 + 201.6160xy4 + 255.2210x3y4 331.0990xy6

594.4580x2y2 1664.5900x4y2 + 1238.6200x6y2 + 236.3490y4 1535.7000x2y4 +

1857.9300x4y4 468.9350y6 + 1238.6200x2y6 + 309.6560y8

581.2840x2y2 + 972.1960x4y2 426.9300x6y2 555.8720y4 + 2408.9500x2y4

2111x4y4 + 1213.8800y6 2387.7400x2y6 842.0370y8

2484.1900x3y3 + 2307.5200x5y3 1162.0500xy5 + 2307.5200x3y5 + 769.1720xy7

290.0650x2y2 + 1242.1000x4y2 960.6230x6y2 + 369.2380y4 + 154.3790x2y4

1260.4900x4y4 968.2160y6 + 469.9310x2y6 + 742.5370y8

1701.0100x3y3 844.9290x5y3 + 1862.0500xy5 2597.4900x3y5 1450.0200xy7

213.2880x4y2 161.7390x6y2 133.9710y4 1239.1100x2y4 + 1346.8700x4y4 +

460.4640y6 + 1083.6900x2y6 417.7510y8

297.4110x3y3 819.8750x5y3 1302.1900xy5 + 476.1470x3y5 + 1279.0700xy7

134.1720x4y2 + 104.7100x6y2 + 32.5808y4 + 689.4340x2y4 + 132.8950x4y4

147.7920y6 1091.4200x2y6 + 170.84y8

156.3320x5y3 + 510.3640xy5 + 777.2630x3y5 691.8850xy7

The tilt polynomial aberrations $E_2$ and $E_3$, with aberration coefficients $a_2$ and $a_3$, displace the aberration-free PSF along the x and y axes, respectively. The coefficient $a_2$ corresponds to a tilt angle of $2a_2/a$ about the y axis, and yields a displacement of the PSF along the x axis by $4a_2F_x$, where $F_x = R/2a$ is the focal ratio of the image-forming beam along the x axis. Similarly, the coefficient $a_3$ corresponds to a tilt angle of $2a_3/b$ about the x axis, and yields a displacement of the PSF along the y axis by $4a_3F_y$, where $F_y = R/2b$ is the focal ratio of the image-forming beam along the y axis.

The defocus polynomial aberration $E_4$ yields a radially symmetric interferogram bounded, of course, by an ellipse. However, the PSF is biaxially and not radially symmetric because of the larger diffraction spread along the smaller dimension of the pupil. The interferograms and PSFs for the polynomial aberrations $E_5$ and $E_6$, representing balanced astigmatisms, are biaxially symmetric but distinctly different from each other for the two aberrations. The polynomial aberrations $E_7$ and $E_8$, representing balanced comas, produce biaxially symmetric interferograms, but the PSFs are symmetric about the y and x axes, respectively. The polynomial aberrations $E_{11}$, $E_{22}$, and $E_{37}$, representing balanced primary, secondary, and tertiary spherical aberrations, respectively, are not radially symmetric because of the different diffraction spreads along the x and the y axes, and because of the presence of the $\cos 2\theta$ term in $E_{11}$ and $E_{22}$, and the $\cos 2\theta$ and $\cos 4\theta$ terms in $E_{37}$.

From Eq. (8-5), the Strehl ratio, i.e., the central value of a PSF relative to its aberration-free value, can be written:

$S(c) \equiv I(0, 0; c) = \left|\dfrac{1}{\pi c}\displaystyle\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}\exp[i\Phi(x, y)]\,dy\right|^2$ ,   (8-35)

where $(x, y)$ are the pupil coordinates normalized by the pupil dimension $a$ along the $x_p$ axis, as used in the polynomials given in Table 8-3.

The Strehl ratio for elliptical polynomial aberrations with a sigma value of 0.1 wave is listed in Table 8-8 and plotted in Figure 8-6. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial. Both the table and the figure illustrate that the Strehl ratio for a small aberration is independent of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
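The approximation $\exp(-\sigma_\Phi^2)$ can be compared with a direct numerical evaluation of Eq. (8-35). The sketch below does this for the defocus polynomial $E_4$ with $c = 0.85$ and a sigma of 0.1 wave; the function names are hypothetical, and the midpoint-rule integration in the substituted variables $x = r\cos t$, $y = cr\sin t$ is an assumption of this example rather than the method used to produce Table 8-8.

```python
import cmath
import math

C = 0.85                                        # aspect ratio used in the chapter's examples
ETA = math.sqrt(3 - 2 * C**2 + 3 * C**4)

def e4(x, y):
    """Orthonormal defocus polynomial E4 in Cartesian form."""
    return math.sqrt(3) / ETA * (-1 - C**2 + 4 * (x * x + y * y))

def strehl(phase, c, n_r=200, n_t=64):
    """|<exp(i*phase)>|^2 over the unit ellipse, Eq. (8-35)."""
    h = 1.0 / n_r
    total = 0j
    for i in range(n_r):
        r = (i + 0.5) * h                       # midpoint rule in r
        for k in range(n_t):
            t = 2 * math.pi * (k + 0.5) / n_t
            total += r * cmath.exp(1j * phase(r * math.cos(t), c * r * math.sin(t)))
    mean = 2.0 * total * h / n_t
    return abs(mean) ** 2

sigma_phi = 2 * math.pi * 0.1                   # 0.1 wave expressed in radians
s_numeric = strehl(lambda x, y: sigma_phi * e4(x, y), C)
s_approx = math.exp(-sigma_phi ** 2)
print(round(s_numeric, 3), round(s_approx, 3))
```

The same `strehl` function accepts any phase function of $(x, y)$, so other polynomial aberrations can be substituted for `e4` to see how closely they follow the approximation.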

Figure 8-5. Elliptical polynomials for an elliptical pupil with an aspect ratio c = 0.85 shown as isometric plot on the top, interferogram on the left, and PSF on the right for a sigma value of one wave.

Table 8-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal elliptical polynomial aberrations with an aspect ratio c = 0.85 and a sigma value of one wave.

Table 8-8. Strehl ratio S for elliptical polynomial aberrations with an aspect ratio c = 0.85 and a sigma value of 0.1 wave.

Figure 8-6. Strehl ratio for an elliptical polynomial aberration with an aspect ratio c = 0.85 and a sigma value of 0.1 wave.

We now consider balancing of a Seidel aberration and obtain its standard deviation

with and without balancing.

8.7.1 Defocus

We start with the defocus aberration

$W_d(\rho) = A_d\rho^2$ .   (8-36)

From the form of the orthonormal defocus polynomial $E_4$ given in Table 8-2, it is evident that its sigma value across an elliptical pupil is given by

$\sigma_d = \dfrac{A_d\,\eta}{4\sqrt{3}}$ ,   (8-37)

where

$\eta = (3 - 2c^2 + 3c^4)^{1/2}$ .   (8-38)

8.7.2 Astigmatism

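Next consider 0° Seidel astigmatism given by

$W_a(\rho, \theta) = A_a\rho^2\cos^2\theta$ .   (8-39)

The orthonormal polynomial $E_6$ representing balanced astigmatism can be written

$E_6 = \dfrac{\sqrt{6}}{2c^2\eta}\left[\eta^2\rho^2\cos 2\theta - 3(1 - c^4)\rho^2 + 2c^2(1 - c^2)\right]$   (8-40a)

$\phantom{E_6} = \dfrac{\eta\sqrt{6}}{c^2}\left(\rho^2\cos^2\theta - \dfrac{3 - c^2}{\eta^2}\rho^2\right) + {\rm constant}$ .   (8-40b)

It shows that the relative amount of defocus $\rho^2$ that optimally balances Seidel astigmatism $\rho^2\cos^2\theta$ is $-(3 - c^2)/\eta^2$, or that balanced astigmatism is given by

$W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \dfrac{3 - c^2}{\eta^2}\rho^2\right)$ .   (8-41)

Its sigma value is given by

$\sigma_{ba} = \dfrac{c^2}{\eta\sqrt{6}}A_a$ .   (8-42)

To determine the sigma of Seidel astigmatism, we write the aberration in terms of the elliptical polynomials. Thus,

$W_a(\rho, \theta) = A_a\rho^2\cos^2\theta = A_a\left(\dfrac{c^2}{\eta\sqrt{6}}E_6 + \dfrac{3 - c^2}{4\eta\sqrt{3}}E_4\right) + {\rm constant}$ .   (8-43)

Utilizing the orthonormality of the polynomials, we obtain its sigma value:

$\sigma_a = A_a/4$ .   (8-44)

Its value is independent of the aspect ratio $c$ of the elliptical pupil, and thus equal to that for a circular pupil. Since Seidel astigmatism $x^2$ varies only along the x axis for which the unit ellipse has the same length as a unit circle, the sigma is independent of $c$.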

8.7.3 Coma

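Now we consider Seidel coma:

$W_c(\rho, \theta) = A_c\rho^3\cos\theta$ .   (8-45)

The orthonormal polynomial $E_8$ representing balanced coma is given by

$E_8 = \dfrac{4}{(9 - 6c^2 + 5c^4)^{1/2}}\left[6\rho^3\cos\theta - (3 + c^2)\rho\cos\theta\right]$ .   (8-46)

It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma $\rho^3\cos\theta$ is $-(3 + c^2)/6$ compared to $-2/3$ for a circular pupil. The balanced coma is given by

$W_{bc}(\rho, \theta) = A_c\left(\rho^3\cos\theta - \dfrac{3 + c^2}{6}\rho\cos\theta\right)$ .   (8-47)

Its sigma value is given by

$\sigma_{bc} = \dfrac{(9 - 6c^2 + 5c^4)^{1/2}}{24}A_c$ .   (8-48)

To obtain the sigma value of Seidel coma, we write Eq. (8-45) in the form

$W_c(\rho, \theta) = A_c\left[\dfrac{(9 - 6c^2 + 5c^4)^{1/2}}{24}E_8 + \dfrac{3 + c^2}{12}E_2\right]$ .   (8-49)

Utilizing the orthonormality of the polynomials, we obtain

$\sigma_c = \dfrac{1}{8}(5 + 2c^2 + c^4)^{1/2}A_c$ .   (8-50)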

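8.7.4 Spherical Aberration

Finally, we consider Seidel spherical aberration

$W_s(\rho) = A_s\rho^4$ .   (8-51)

The orthonormal polynomial $E_{11}$ representing balanced spherical aberration can be written

$E_{11} = (\sqrt{5}/\alpha)\left[48\rho^4 - 12(1 - c^2)\rho^2\cos 2\theta - 24(1 + c^2)\rho^2\right] + {\rm constant}$   (8-52a)

$\phantom{E_{11}} = (\sqrt{5}/\alpha)\left[48\rho^4 - 24(1 - c^2)\rho^2\cos^2\theta - 12(1 + 3c^2)\rho^2\right] + {\rm constant}$ .   (8-52b)

Hence, the balanced spherical aberration is given by

$W_{bs}(\rho, \theta) = A_s\left[\rho^4 - \dfrac{1}{4}(1 - c^2)\rho^2\cos 2\theta - \dfrac{1}{2}(1 + c^2)\rho^2\right]$   (8-53a)

$\phantom{W_{bs}(\rho, \theta)} = A_s\left[\rho^4 - \dfrac{1}{2}(1 - c^2)\rho^2\cos^2\theta - \dfrac{1}{4}(1 + 3c^2)\rho^2\right]$ .   (8-53b)

It shows that spherical aberration is balanced not only by defocus but astigmatism as well. Its sigma value is given by

$\sigma_{bs} = \dfrac{\alpha}{48\sqrt{5}}A_s$ .   (8-54)

To obtain the sigma value of Seidel spherical aberration, we write Eq. (8-51) in the form

$W_s(\rho) = A_s\left\{\dfrac{\alpha}{48\sqrt{5}}E_{11} + \dfrac{c^2(1 - c^2)}{2\eta\sqrt{6}}E_6 + \dfrac{1}{8\sqrt{3}}\left[\dfrac{3(1 - c^2)(1 - c^4)}{2\eta} + \eta(1 + c^2)\right]E_4\right\} + {\rm constant}$ .   (8-55)

Utilizing the orthonormality of the polynomials, we obtain

$\sigma_s = \dfrac{(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8)^{1/2}}{24\sqrt{10}}A_s$ .   (8-56)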

The sigma values of Seidel aberrations with and without balancing are given in Table 8-9. They reduce to the corresponding values for a circular pupil given in Table 4-3 as $c \to 1$. The variation of sigma for a primary aberration with the aspect ratio $c$ is shown in Figure 8-7. While $\sigma_a$ for astigmatism is constant, it increases monotonically in the case of coma $\sigma_c$ and spherical aberration $\sigma_s$. For defocus, its value $\sigma_d$ has a minimum for $c = 1/\sqrt{3}$. The variation of sigma of a balanced primary aberration as a function of $c$ is shown in Figure 8-8. While its variation for balanced coma $\sigma_{bc}$ and balanced spherical aberration $\sigma_{bs}$ is small, sigma of balanced astigmatism $\sigma_{ba}$ increases monotonically.
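The trends described above can be reproduced by tabulating the closed-form sigma expressions of this section for a unit aberration coefficient. The sketch below is illustrative: the function name and the dictionary keys are hypothetical, and the minimum of the defocus sigma is located by a simple grid search rather than analytically.

```python
import math

def seidel_sigmas(c):
    """Sigma of each Seidel aberration (unit coefficient) for aspect ratio c."""
    eta = math.sqrt(3 - 2 * c**2 + 3 * c**4)
    return {
        "defocus": eta / (4 * math.sqrt(3)),
        "astigmatism": 0.25,
        "balanced astigmatism": c**2 / (eta * math.sqrt(6)),
        "coma": math.sqrt(5 + 2 * c**2 + c**4) / 8,
        "balanced coma": math.sqrt(9 - 6 * c**2 + 5 * c**4) / 24,
        "spherical": math.sqrt(225 + 60 * c**2 - 58 * c**4 + 60 * c**6 + 225 * c**8)
                     / (24 * math.sqrt(10)),
        "balanced spherical": math.sqrt(45 - 60 * c**2 + 94 * c**4 - 60 * c**6 + 45 * c**8)
                              / (48 * math.sqrt(5)),
    }

# Locate the aspect ratio that minimizes the defocus sigma on a grid of c values.
grid = [k / 1000 for k in range(1, 1001)]
c_min = min(grid, key=lambda c: seidel_sigmas(c)["defocus"])
print(c_min, round(1 / math.sqrt(3), 3))

for name, value in seidel_sigmas(0.85).items():
    print(f"{name:22s} {value:.4f}")
```

Setting $c = 1$ in `seidel_sigmas` returns the familiar circular-pupil values, which is a convenient sanity check on the expressions.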

Table 8-9. Sigma of a Seidel aberration with and without balancing for an elliptical pupil of aspect ratio c.

Aberration                       Sigma

Defocus                          $\sigma_d = (A_d/4)\left[(3 - 2c^2 + 3c^4)/3\right]^{1/2}$

Astigmatism                      $\sigma_a = A_a/4$

Balanced astigmatism             $\sigma_{ba} = A_ac^2/\left[6(3 - 2c^2 + 3c^4)\right]^{1/2}$

Coma                             $\sigma_c = A_c(5 + 2c^2 + c^4)^{1/2}/8$

Balanced coma                    $\sigma_{bc} = A_c(9 - 6c^2 + 5c^4)^{1/2}/24$

Spherical aberration             $\sigma_s = A_s(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8)^{1/2}/(24\sqrt{10})$

Balanced spherical aberration    $\sigma_{bs} = A_s(45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}/(48\sqrt{5})$

Figure 8-7. Variation of sigma of a primary aberration with the aspect ratio c of a unit elliptical pupil, where the subscript d is for defocus, a for astigmatism, c for coma, and s for spherical aberration.

Figure 8-8. Variation of sigma of a balanced primary aberration with the aspect ratio c of a unit elliptical pupil, where the subscript ba is for balanced astigmatism, bc for balanced coma, and bs for balanced spherical aberration.

8.8 SUMMARY

The PSF and OTF of a system with an elliptical pupil are obtained from the

corresponding PSF and OTF of a system with a circular pupil discussed in Chapter 4 by

scaling the coordinates of the elliptical pupil and transforming it into a circular pupil. It is

explained that the orthogonal aberration polynomials for an elliptical pupil representing balanced classical aberrations for such a pupil cannot be obtained in the same manner.

These polynomials orthonormal over a unit elliptical pupil are obtained by

orthonormalizing the circle polynomials by the Gram–Schmidt orthonormalization

process. They are given through the fourth order in Tables 8-1 through 8-3 in terms of the

circle polynomials, in the polar coordinates, and in the Cartesian coordinates,

respectively. Table 8-2 shows that each polynomial consists of either the cosine or the

sine terms, but not both. Thus, an even j polynomial, for example, consists of only the

cosine terms. This is a consequence of the biaxial symmetry of the pupil. Since the

polynomials are not separable in the polar coordinates r and q of a pupil point,

polynomial numbering with two indices n and m loses significance. Hence, they must be

numbered with a single index j. Their ordering is the same as for the polynomials

discussed in previous chapters.

Only the first 15 elliptical polynomials are given for an arbitrary aspect ratio $c$ of the pupil in Tables 8-1 through 8-3. The expressions for the higher-order elliptical polynomials are very long unless $c$ is specified. The polynomial $E_6$ for astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, so that the balancing defocus for (zero-degree) Seidel astigmatism is different for an elliptical pupil compared to that for a circular, annular, or a Gaussian pupil. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$ as well. It is evidently not radially symmetric. As expected, the elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse approaches a unit circle.

The elliptical polynomials up to the eighth order for an elliptical pupil with an aspect

ratio of c = 0.85 are given in Tables 8-4 to 8-6 in terms of the Zernike circle polynomials,

in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in

three different but equivalent ways in Figure 8-5 with the isometric plot, interferogram,

and the PSF for a sigma value of one wave. The peak-to-valley aberration numbers (in

units of wavelength) are given in Table 8-7. The Strehl ratio for a sigma value of 0.1

wave is given in Table 8-8 and plotted in Figure 8-6. The Seidel aberrations are discussed

in Section 8.7 and their sigma values with and without balancing are given in Table 8-9.


References

1. H. J. Wyatt, “The form of the human pupil,” Vision Res. 35, 2021–2036 (1995).

Opt. 7, 197–201 (1968).

interferogram by use of Zernike polynomials,” Appl. Opt. 35, 6162–6172 (1996).

application to image evaluation,” Japanese J. Appl. Phys. 8, 1027–1036 (1969).

annuli,” IEEE Trans. Anten. Propa. AP-31, 360–363 (1983).

55, 107–108 (1965).

analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc.

Am. A 29, 1673–1674 (2012).

Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–

11.41 (McGraw Hill, 2009).


Chapter 9

Systems with Rectangular Pupils

9.1 INTRODUCTION

High-power laser beams have a rectangular cross-section; hence there is a need to

discuss the diffraction characteristics of a rectangular pupil. We start this chapter with a

brief discussion of the PSF and OTF of a system with such a pupil.

Although high-power rectangular laser beams have been around for a long time [1],

there is little in the literature on rectangular polynomials representing balanced

aberrations for such beams. In this chapter we discuss such polynomials that are

orthonormal over a unit rectangular pupil [2,3]. These polynomials are not separable in

the x and y coordinates of a point on the pupil. The expressions for only the first 15

orthonormal polynomials, i.e., up to and including the fourth order, are given for an

arbitrary aspect ratio of the pupil because they become quite cumbersome as their order

increases. However, expressions for the first 45 polynomials, i.e., up to and including the

eighth order, are given for an aspect ratio of 0.75. The isometric, interferometric, and PSF

plots of these polynomial aberrations with a sigma value of one wave are given along

with their P-V numbers. The Strehl ratios for these polynomial aberrations for a sigma

value of one-tenth of a wave are also given. Finally, we discuss how to obtain the

standard deviation of a Seidel aberration with and without balancing.

Products of Legendre polynomials (one for the x axis and the other for the y axis), which are also orthogonal over a rectangular pupil [4], are not suitable for the analysis of rectangular wavefronts of rotationally symmetric systems, since they do not represent classical or balanced aberrations for such systems. For example, the defocus aberration for such a system is represented by $x^2 + y^2$. While it can be expanded in terms of a

complete set of 2D Legendre polynomials, it cannot be represented by a single product of

the x- and y-Legendre polynomials. The same difficulty holds for spherical aberration,

coma, etc. However, products of such Legendre polynomials are suitable for anamorphic

systems, as discussed in Chapter 13. Products of Chebyshev polynomials, one for the x-

and the other for the y-axis, are also orthogonal over a rectangular pupil, but they are not

suitable either for the rectangular pupils considered in this chapter for the same reasons as

for the products of Legendre polynomials.
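The remark about defocus can be checked with a short numerical sketch. It assumes NumPy, hypothetical half-widths a = 0.8 and b = 0.6, and normalized coordinates x' = x_p/a and y' = y_p/b in [-1, 1]; none of these names come from the text. It expands $x_p^2 + y_p^2$ in products $P_m(x')P_n(y')$ and shows that three product terms are needed, not one.

```python
import numpy as np
from numpy.polynomial import legendre as L

a, b = 0.8, 0.6   # assumed half-widths, chosen only for illustration

# 1D identity t^2 = (1/3) P0(t) + (2/3) P2(t), obtained numerically
t2 = L.poly2leg([0.0, 0.0, 1.0])        # Legendre coefficients [P0, P1, P2]

# C[m, n] multiplies P_m(x') P_n(y'), with x' = x_p/a and y' = y_p/b
C = np.zeros((3, 3))
C[:, 0] += a**2 * t2                    # x_p^2 = a^2 x'^2, multiplied by P0(y')
C[0, :] += b**2 * t2                    # y_p^2 = b^2 y'^2, multiplied by P0(x')

# List of (m, n, coefficient) for every product term that survives
nonzero_terms = [(m, n, float(C[m, n]))
                 for m in range(3) for n in range(3)
                 if abs(C[m, n]) > 1e-12]

def defocus_from_products(xn, yn):
    """Evaluate the Legendre-product expansion at normalized coordinates."""
    return L.legval2d(xn, yn, C)
```

The surviving terms are the products $P_0P_0$, $P_0P_2$, and $P_2P_0$, which is why a single Legendre product cannot stand for defocus of a rotationally symmetric system.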

As illustrated in Figure 9-1, consider an optical system with a rectangular exit pupil with half-widths a and b and area $S_{ex} = 4ab$ lying in the $(x_p, y_p)$ plane with the z axis as its optical axis. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

\[ P(x_p, y_p) = A(x_p, y_p)\exp[i\Phi(x_p, y_p)] , \tag{9-1} \]

where

\[ A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} , \quad -a \le x_p \le a , \quad -b \le y_p \le b . \tag{9-2} \]

[Figure 9-1: rectangular exit pupil with center O and axes $x_p$ and $y_p$.]

9.3.1 PSF

From Eq. (1-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system with a uniformly illuminated rectangular exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written

\[ I(x_i, y_i) = \frac{1}{S_{ex}^2}\left| \int_{-a}^{a}\int_{-b}^{b} \exp[i\Phi(x_p, y_p)] \exp\left[-\frac{2\pi i}{\lambda R}(x_i x_p + y_i y_p)\right] dx_p\,dy_p \right|^2 . \tag{9-3} \]

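Letting

\[ (x', y') = (x_p/a,\; y_p/b) \tag{9-4} \]

and

\[ (x, y) = \frac{1}{\lambda F_x}(x_i, y_i) , \tag{9-5} \]

where

\[ F_x = R/2a \tag{9-6} \]

is the focal ratio of the image-forming light cone along the x axis, and

\[ \epsilon = b/a \tag{9-7} \]

is the aspect ratio of the pupil, the irradiance distribution can be written

\[ I(x, y) = \frac{1}{16}\left| \int_{-1}^{1}\int_{-1}^{1} \exp[i\Phi(x', y')] \exp[-\pi i(xx' + \epsilon yy')]\,dx'\,dy' \right|^2 . \tag{9-8} \]

For an aberration-free system, $\Phi = 0$ and Eq. (9-8) reduces to

\[ I(x, y) = \frac{1}{16}\left| \int_{-1}^{1}\int_{-1}^{1} \exp[-\pi i(xx' + \epsilon yy')]\,dx'\,dy' \right|^2 = \left(\frac{\sin \pi x}{\pi x}\right)^2 \left(\frac{\sin \pi\epsilon y}{\pi\epsilon y}\right)^2 . \tag{9-9} \]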

Figure 9-2a shows the 2D PSF for an aspect ratio $\epsilon = 0.75$. In particular, it shows the central bright rectangular spot of size $2 \times 2/\epsilon$, with each dimension in units of $\lambda F_x$. The PSF is zero wherever $x$ and/or $\epsilon y$ is a positive or a negative integer. Figure 9-2b shows the irradiance distribution along the x and y axes, and along the diagonal of the central bright spot as $I(x, 0)$, $I(0, y)$, and $I(x, y) \equiv I(r)$, where $r = (x^2 + y^2)^{1/2}$ and

\[ I(r) = \left[ \frac{\sin\left(\pi\epsilon r/\sqrt{1+\epsilon^2}\right)}{\pi\epsilon r/\sqrt{1+\epsilon^2}} \right]^4 . \tag{9-10} \]

Figure 9-2. (a) 2D aberration-free PSF for $\epsilon = 0.75$. (b) Irradiance distribution along the x and y axes, and along the diagonal of the central bright spot of the PSF.
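As a rough numerical check of the aberration-free PSF, the sketch below evaluates the closed form of Eq. (9-9), compares it with a direct Gauss-Legendre quadrature of Eq. (9-8) with zero aberration, and confirms that the diagonal profile follows a fourth-power sinc law. It assumes NumPy and that x and y are measured in units of $\lambda F_x$; the function names are illustrative only.

```python
import numpy as np

def psf_rect(x, y, eps):
    """Closed-form aberration-free PSF; np.sinc(t) = sin(pi t)/(pi t)."""
    return np.sinc(x) ** 2 * np.sinc(eps * y) ** 2

def psf_diagonal(r, eps):
    """PSF along the diagonal of the central bright spot at distance r."""
    return np.sinc(eps * r / np.sqrt(1.0 + eps ** 2)) ** 4

# Gauss-Legendre nodes and weights on [-1, 1] for the pupil integral
nodes, weights = np.polynomial.legendre.leggauss(64)

def psf_numeric(x, y, eps):
    """Direct quadrature of the separable pupil integral with zero aberration."""
    fx = np.sum(weights * np.exp(-1j * np.pi * x * nodes))
    fy = np.sum(weights * np.exp(-1j * np.pi * eps * y * nodes))
    return abs(fx * fy) ** 2 / 16.0

eps = 0.75
# A point on the diagonal of the central spot, i.e., on the line y = x/eps
r = 0.9
x = eps * r / np.sqrt(1.0 + eps ** 2)
y = x / eps
```

For $\epsilon = 0.75$ the diagonal profile first vanishes at $r = \sqrt{1+\epsilon^2}/\epsilon \approx 1.67$, the semidiagonal of the $2 \times 2/\epsilon$ central spot.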


9.3.2 OTF

From Eq. (1-13), the aberration-free OTF of a system with a rectangular pupil at a spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two rectangles centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 9-3. The overlap area is given by

\[ \text{overlap area} = 4ab\left(1 - \frac{\xi}{1/\lambda F_x}\right)\left(1 - \frac{1}{\epsilon}\,\frac{\eta}{1/\lambda F_x}\right) . \tag{9-11} \]

Hence, the fractional area of overlap, or the OTF of the system, may be written

\[ \tau(v_x, v_y) = (1 - v_x)\left(1 - \frac{v_y}{\epsilon}\right) , \tag{9-12} \]

where

\[ (v_x, v_y) = \left(\frac{\xi}{1/\lambda F_x}, \frac{\eta}{1/\lambda F_x}\right) \tag{9-13} \]

are the spatial frequency components in units of the cutoff frequency $1/\lambda F_x$ along the x axis. The OTF $\tau(v)$, where $v = (v_x^2 + v_y^2)^{1/2}$, along the diagonal of the pupil can be obtained from Eq. (9-12) by letting $v_y/v_x = \epsilon$. Thus

\[ \tau(v) = \left(1 - \frac{v}{\sqrt{1+\epsilon^2}}\right)^2 . \tag{9-14} \]

Figure 9-3. Overlap area of two rectangular pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$ for an aspect ratio $\epsilon = 0.75$.

Figure 9-4 shows the OTF for $\epsilon = 0.75$ along the x and y axes, and along the diagonal of the pupil, as $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$, with the corresponding cutoff frequencies 1, 0.75, and 1.25, respectively, each in units of $1/\lambda F_x$. We note that $\tau(0, v_y) < \tau(v_x, 0)$ for any value of $v_x = v_y$ due to the smaller dimension of the pupil along the y axis. Moreover, $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range $0 < v = v_x < 2\sqrt{1+\epsilon^2} - (1+\epsilon^2)$, or $0 < v = v_x < 0.9375$ in our example of $\epsilon = 0.75$. The two OTFs are equal to each other at the frequency $2\sqrt{1+\epsilon^2} - (1+\epsilon^2)$, or 0.9375. At larger frequencies, $\tau(v) > \tau(v_x, 0)$ until $v = \sqrt{1+\epsilon^2}$. Of course, the values of both OTFs in the vicinity of the unity cutoff frequency for $\tau(v_x, 0)$ are quite small in our example. Finally, $\tau(0, v_y)$ is only slightly greater than $\tau(v)$ in the frequency range $0 < v = v_y < 2\sqrt{1+\epsilon^2} - (1+\epsilon^2)/\epsilon$. The two OTFs are equal to each other at the frequency $2\sqrt{1+\epsilon^2} - (1+\epsilon^2)/\epsilon$, or 1/2.4 in our example. For larger frequencies, $\tau(v)$ is significantly greater. We point out that they are equal to each other only if $\epsilon \ge 1/\sqrt{3}$. As $\epsilon \to 1$ and the rectangular pupil becomes square, $\tau(0, v_y) \to \tau(v_x, 0)$ for any value of $v_x = v_y$, and the cutoff frequency for $\tau(v)$ approaches $\sqrt{2}$, as discussed in the next chapter.

Figure 9-4. Aberration-free OTF for $\epsilon = 0.75$, where $v_x$, $v_y$, and $v$ are in units of the cutoff frequency $1/\lambda F_x$ along the x axis.
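The crossover frequencies quoted above follow from equating the diagonal OTF with each axial OTF. The sketch below is a minimal check, assuming NumPy and frequencies already normalized by $1/\lambda F_x$; the helper names are hypothetical.

```python
import numpy as np

def otf(vx, vy, eps):
    """Aberration-free OTF of a rectangular pupil; zero beyond the cutoffs."""
    return max(0.0, 1.0 - abs(vx)) * max(0.0, 1.0 - abs(vy) / eps)

def otf_diagonal(v, eps):
    """OTF along the pupil diagonal, where vy/vx = eps."""
    return max(0.0, 1.0 - v / np.sqrt(1.0 + eps ** 2)) ** 2

eps = 0.75
s = np.sqrt(1.0 + eps ** 2)      # diagonal cutoff, 1.25 for eps = 0.75

# Solve (1 - v/s)^2 = 1 - v       for the crossing with tau(vx, 0)
v_cross_x = 2.0 * s - s ** 2
# Solve (1 - v/s)^2 = 1 - v/eps   for the crossing with tau(0, vy)
v_cross_y = 2.0 * s - s ** 2 / eps
```

With $\epsilon = 0.75$ these evaluate to 0.9375 and 1/2.4, matching the values given in the text; the second root is positive only when $\epsilon > 1/\sqrt{3}$.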


9.4 RECTANGULAR POLYNOMIALS

Figure 9-5 shows a unit rectangle inscribed inside a unit circle. The half-widths a and b of the rectangular pupil are normalized by its semidiagonal $(a^2 + b^2)^{1/2}$ so that the farthest points (such as A) on the pupil lie at a distance of unity. The half-widths of the unit rectangle along the x and y axes are $c = a/(a^2 + b^2)^{1/2}$ and $(1 - c^2)^{1/2}$, respectively, where $0 < c < 1$. Accordingly, the aspect ratio of the rectangle is $\epsilon = (1 - c^2)^{1/2}/c$, and its area is given by $A = 4c(1 - c^2)^{1/2}$. As in the case of a unit ellipse, a unit rectangle is also not unique, since c can have any value between 0 and 1. For example, when $c = 0.8$, the aspect ratio of the pupil is 0.75 and the area is 1.92. As $c \to 1/\sqrt{2}$, the rectangle becomes a square, and as $c \to 1$ or 0, it becomes a slit parallel to the x or the y axis, respectively.

The orthonormal rectangular polynomials $R_j$, obtained by orthogonalizing the Zernike circle polynomials $Z_j$ over a unit rectangle, are given by [see Eq. (3-18)]

\[ R_{j+1} = N_{j+1}\left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} R_k \rangle R_k \right] , \tag{9-15} \]

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit rectangle, i.e., they satisfy the orthonormality condition

\[ \frac{1}{4c\sqrt{1-c^2}} \int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} R_j R_{j'}\,dy = \delta_{jj'} . \tag{9-16} \]

The angular brackets indicate a mean value over the rectangular pupil. Thus

\[ \langle Z_j R_k \rangle = \frac{1}{4c\sqrt{1-c^2}} \int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} Z_j R_k\,dy . \tag{9-17} \]

Figure 9-5. Unit rectangle of half-width c inscribed inside a unit circle. Its corner points, such as A, lie at a distance of unity from its center. The corners are $A(c, \sqrt{1-c^2})$, $B(c, -\sqrt{1-c^2})$, $C(-c, -\sqrt{1-c^2})$, and $D(-c, \sqrt{1-c^2})$.

It should be evident that because of the symmetric limits of integration, a mean value is

zero if the integrand is an odd function of x and/or y. If the integrand is an even function,

then we may replace the lower limits of integration by zero and multiply the double

integral by 4.
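The recursive orthonormalization can be sketched numerically. The code below assumes NumPy, the first six Zernike circle polynomials written in Cartesian form with a Noll-type ordering (an assumption about the indexing used here), and Gauss-Legendre quadrature for the mean values; the function names are illustrative. Each $R_j$ is stored as a row of coefficients over the $Z_i$.

```python
import numpy as np

def zernike_circle(x, y):
    """Z1..Z6 in Cartesian form; the Noll-type ordering is assumed."""
    r2 = x**2 + y**2
    return np.array([
        np.ones_like(x),                     # Z1, piston
        2.0 * x,                             # Z2, x tilt
        2.0 * y,                             # Z3, y tilt
        np.sqrt(3.0) * (2.0 * r2 - 1.0),     # Z4, defocus
        2.0 * np.sqrt(6.0) * x * y,          # Z5, rho^2 sin(2 theta)
        np.sqrt(6.0) * (x**2 - y**2),        # Z6, rho^2 cos(2 theta)
    ])

def rectangular_from_zernike(c, n_nodes=16):
    """Return (M, gram) with R_j = sum_i M[j, i] Z_i over the unit rectangle."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    X, Y = np.meshgrid(c * t, np.sqrt(1.0 - c**2) * t, indexing="ij")
    weights = np.outer(w, w) / 4.0           # quadrature weights of the mean value
    Z = zernike_circle(X, Y)
    gram = np.einsum("iab,jab,ab->ij", Z, Z, weights)   # <Z_i Z_j>
    n = Z.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        v = np.zeros(n)
        v[j] = 1.0                            # start from Z_{j+1}
        for k in range(j):
            v -= (gram[j] @ M[k]) * M[k]      # subtract <Z_{j+1} R_k> R_k
        M[j] = v / np.sqrt(v @ gram @ v)      # normalize to unit mean square
    return M, gram

M, gram = rectangular_from_zernike(0.8)
```

For c = 0.8 the rows reproduce the leading entries of Table 9-4, for example $R_2 = 1.0825Z_2$ and $R_4 = 0.7613Z_1 + 1.3186Z_4$.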

The rectangular polynomials thus obtained up to the fourth order are given in Tables 9-1 through 9-3 in the same manner as the elliptical polynomials. Only the first 15 polynomials are given in these tables, because their expressions become too long unless the aspect ratio is specified. Each polynomial consists of a number of circle polynomials, but contains only the cosine or the sine terms, not both. The polynomial $R_6$ representing balanced astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the balancing defocus for $0^\circ$ Seidel astigmatism is different for a rectangular pupil compared to that, for example, for a circular pupil. Similarly, the polynomial $R_{11}$, representing balanced primary spherical aberration, is not radially symmetric, since it consists of a term in astigmatism $Z_6$ or $\cos 2\theta$. As expected, the rectangular polynomials reduce to the square polynomials as $c \to 1/\sqrt{2}$, and the slit polynomials for a slit pupil parallel to the x axis as $c \to 1$, discussed in Chapters 10 and 11, respectively.

9.5 RECTANGULAR COEFFICIENTS OF A RECTANGULAR ABERRATION FUNCTION

A rectangular aberration function $W(x, y)$ across a unit rectangle can be expanded in terms of J rectangular polynomials $R_j(x, y)$ in the form

\[ W(x, y) = \sum_{j=1}^{J} a_j R_j(x, y) , \tag{9-18} \]

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (9-18) by $R_j(x, y)$, integrating over the unit rectangle, and using the orthonormality Eq. (9-16), we obtain the rectangular expansion coefficients:

\[ a_j = \frac{1}{4c\sqrt{1-c^2}} \int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} W(x, y) R_j(x, y)\,dy . \tag{9-19} \]

As stated in Section 3.2, it is evident from Eq. (9-19) that the value of a rectangular

coefficient is independent of the number J of polynomials used in the expansion of the

aberration function. Hence, one or more terms can be added to or subtracted from the

aberration function without affecting the value of the coefficients of the other

polynomials in the expansion.

The mean and mean square values of the aberration function are given by

\[ \langle W(\rho, \theta) \rangle = a_1 \tag{9-20} \]

and

circle polynomials Z j U T .

R1 Z1

R2 ( 3 /2c)Z2

2

R3 [ 3 /(2 1 c ) ]Z3

2 4

R4 [ 5 /(4 1 2c 2c ) ](Z1 +3Z4)

2

R5 [ 3 2 /(2c 1 c ) ]Z5

2 4

R6 { 5 /[8c2(1 c2) 1 2c + 2c ]}[(3 10c2 + 12c4 8c6)Z1 + 3 (1 2c2)Z4

+ 6 (1 2c2 + 2c4)Z6]

2 4 6

R7 [ 21 /(4 2 27 81c + 116c 62c )][ 2 (1+4c2)Z3 +5Z7]

2 4

R8 [ 21 /(4 2c 35 70c + 62c )][ 2 (5 4c2)Z2 +5Z8]

§ 27 2 4 2

R9 { 5 2 ©

81c2 + 116c4 62c6)]}

2 4

R10 { 5 2 /[16c3(1 c2) 35 70c + 62c ]}[2 2 (35 112c2 + 128c4 60c6)Z2

R12 {3/[16c2ȞȘ]}{(105 550c2 + 1559c4 2836c6 + 2695c8 1078c10)Z1

2 4 6

R13 [ 21 /(16 2 c 1 3c + 4c 2c )]( 3 Z5 + 5 Z13)

2 4 6 8

R14 Ĳ[6(245 1400c + 3378c 4452c + 3466c 1488c + 496c12)Z1

10

__________________________________________________

$\mu = (9 - 36c^2 + 103c^4 - 134c^6 + 67c^8)^{1/2}$

$\nu = (49 - 196c^2 + 330c^4 - 268c^6 + 134c^8)^{1/2}$

$\tau = 1/[128\nu c^4(1 - c^2)^2]$

$\eta = 9 - 45c^2 + 139c^4 - 237c^6 + 201c^8 - 67c^{10} = (1 - c^2)\mu^2$


U, T .

R1 = 1

R2 = ( 3 /c)ȡcosș

R3 = 3 /(1 í c2)ȡsinș

2 4

R4 = [ 5 /(2 1 í 2c 2c )](3ȡ2 í 1)

2

R5 = [3/(2c 1 í c )]ȡ2 sin2ș

2 4

R6 = { 5 /[4c2(1 í c2) 1 í 2c 2c ]}[3(1 í 2c2 + 2c4)ȡ2 cos2ș + 3(1 í 2c2)ȡ2

í 2c2(1 í c2) (1 í 2c2)]

2 4 6

R7 = [ 21 /(2 27 í 81c 116c í 62c )](15ȡ2 – 9 + 4c2)ȡsinș

2 4

2 4 2

R9 = { 5 ©

í 26c4)ȡ2]ȡsinș}

2 4

R10 = { 5 /[8c3(1 í c2) 35 í 70c 62c ]}{(35 í 70c2 + 62c4)ȡ3 cos3ș

í 3[4c2(7 í 17c2 + 10c4) í (35 í 70c2 + 26c4)ȡ2]ȡcosș}

R11 = (1/8)[315ȡ4 + 30(1 í 2c2)ȡ2 cos 2ș í 240ȡ2 + 27 + 16c2 í 16c4]

R12 = [3/(8c2ȞȘ)][315(1 í 2c2) (1 í 2c2 +2c4)ȡ4 + 5(72ȡ2 í 21 + 72c2 í 225c4 + 306c 6

í 153c8)ȡ2 cos2ș í 15(1 í 2c2) (7 + 4c2 í 71c4 + 134c6 í 67c8)ȡ2

+ c2(1 í c2)(1 í 2c2)(70 í 233c2 + 233c4)]

2 4 6

R13 = [ 21 /(4c 1 í 3c 4c í 2c )](5ȡ2 í 3)ȡ2 sin2ș

R14 = 6Ĳ{5Ȟ2ȡ4 cos4ș í 20(1 í 2c2)[6c2(7 í 16c2 + 18c4 í 9c6) í 49(1 í 2c2 + 2c4)ȡ2]ȡ2

cos2ș + 8c4(1 í c2)2(21 í 62c2 + 62c4) í 120c2(7 í 30c2 + 46c4 í 23c6)ȡ2

+ 15(49 í 196c2 + 282c4 í 172c6 + 86c8)ȡ4}

R15 = { 21 /[8c3(1 í c2)3/2(1 í 2c2 +2c4)1/2]}[ í (1 í 2c2) (6c2 í 6c4 í 5ȡ2)ȡ2 sin2ș

+ (5/2)(1 í 2c2 +2c4)ȡ4 sin4ș]


x, y , where U 2 x 2 y 2 .

R1 = 1

R2 = ( 3 /c)x

3 / §© 1 í c ·¹ y

2

R3 =

2 4

R4 = [ 5 /(2 1 í 2c + 2c )](3ȡ2 í 1)

2

R5 = [3/( c 1 í c )]xy

2 4

R6 = { 5 /[2c2(1 í c2) 1 í 2c + 2c ]}[3(1 í c2)2x2 í 3c4y2 í c2(1 í 3c2 +2c4)]

2 4 6

R7 = [ 21 /(2 27 í 81c + 116c í 62c )](15ȡ2 – 9 + 4c2)y

2 4

R8 = [ 21 /(2c 35 í 70c + 62c )](15ȡ2 í 5 í 4c2)x

R9 = { 5 § 27 í 54c + 62c

2 4·

/ §©1 í c ·¹ /[2c2(27 í 81c2 + 116c4 í 62c6)]}

2

© ¹

2 4

R10 = { 5 /[2c3(1 í c2) 35 í 70c + 62c ]}[35(1 í c2)2x2 í 27c4y2 í c2(21 í 51c2

+ 30c4)]x

R11 = [1/(8)][315ȡ4 í 30(7 + 2c2)x2 í 30(9 í 2c2)y2 + 27 + 16c2 í 16c4]

R12 = [3/(8c2ȞȘ)][35(1 í c2)2(18 í 36c2 + 67c4)x4 + 630(1 í 2c2)(1 í 2c2 +2c4)x2y2

í 35c4(49 í 98c2 + 67c4)y4 í 30(1 í c2) (7 í 10c2 í 12c4 + 75c6 í 67c8)x2

í 30c2(7 í 77c2 + 189c4 í 193c6 + 67c8)y2 + c2(1 í c2) (1 í 2c2) (70 í 233c2

+ 233c4)]

2 4 6

R13 = [ 21 /(2c 1 í 3c + 4c í 2c )](5ȡ2 í 3)xy

+ 90c6(1 í c2) (2 í 9c2)y2 +3c4(1 í c2)2(21 í 2c2 + 62c4)]

2 4 6

R15 = { 21 /[2c3(1 í c2) 1 í 3c + 4c í 2c ]}[5(1 í c2)2x2 í 5c4y2 í c2(3 í 9c2

+ 6c4)]xy

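\[ \langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{9-21} \]

respectively. The variance of the aberration function is accordingly given by

\[ \sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{9-22} \]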
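A small numerical illustration of Eqs. (9-19) through (9-22) follows. It is a sketch only: it assumes NumPy, uses the Cartesian forms of $R_1$ through $R_6$ as read from Table 9-3, and projects a hypothetical quadratic aberration (coefficients chosen arbitrarily, in waves) onto them by Gauss-Legendre quadrature.

```python
import numpy as np

def rect_polys(x, y, c):
    """R1..R6 in Cartesian form, as read from Table 9-3."""
    s2 = 1.0 - c**2
    g = np.sqrt(1.0 - 2.0 * c**2 + 2.0 * c**4)
    r2 = x**2 + y**2
    return np.array([
        np.ones_like(x),
        np.sqrt(3.0) / c * x,
        np.sqrt(3.0) / np.sqrt(s2) * y,
        np.sqrt(5.0) / (2.0 * g) * (3.0 * r2 - 1.0),
        3.0 / (c * np.sqrt(s2)) * x * y,
        np.sqrt(5.0) / (2.0 * c**2 * s2 * g)
        * (3.0 * s2**2 * x**2 - 3.0 * c**4 * y**2 - c**2 * s2 * (1.0 - 2.0 * c**2)),
    ])

def expansion_coefficients(W, c, n_nodes=16):
    """Coefficients a_j plus the directly computed mean and variance of W."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    X, Y = np.meshgrid(c * t, np.sqrt(1.0 - c**2) * t, indexing="ij")
    weights = np.outer(w, w) / 4.0          # mean value over the unit rectangle
    vals = W(X, Y)
    a = np.einsum("jab,ab,ab->j", rect_polys(X, Y, c), vals, weights)
    mean = np.sum(weights * vals)
    var = np.sum(weights * vals**2) - mean**2
    return a, mean, var

c = 0.8
# Hypothetical aberration function in waves, used only for illustration
W = lambda x, y: 0.05 + 0.1 * y + 0.3 * x**2 - 0.2 * x * y
a, mean, var = expansion_coefficients(W, c)
```

Because the chosen W is quadratic, it lies entirely in the span of $R_1$ through $R_6$, so $a_1$ equals the mean and the sum of $a_j^2$ for $j \ge 2$ equals the variance.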

9.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF RECTANGULAR POLYNOMIAL ABERRATIONS

The rectangular polynomials up to the eighth order for a rectangular pupil with $c = 0.8$, corresponding to an aspect ratio of $\epsilon = 0.75$, are given in Tables 9-4 to 9-6. They

are illustrated in three different but equivalent ways in Figure 9-6. For each polynomial,

the isometric plot at the top illustrates its shape. An interferogram is shown on the left,

and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-

to-valley aberration numbers (in units of wavelength) are given in Table 9-7.

The PSF plots, representing the images of a point object in the presence of a polynomial aberration and obtained by applying Eq. (9-3), are shown in Figure 9-6. The full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $R_1$ has no effect on the PSF, it yields an aberration-free PSF. The polynomial aberrations $R_2$ and $R_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $\sqrt{3}\,\lambda a_2/ca$ about the y axis and displaces the PSF along the x axis by $2\sqrt{3}\,\lambda F_x a_2/c$, where $F_x = R/2a$ and $c = a/(a^2 + b^2)^{1/2}$ is the half-width of the rectangle along the x axis normalized by its semidiagonal. Similarly, $a_3$ corresponds to a wavefront tilt angle of $[3/(1 - c^2)]^{1/2}\lambda a_3/b$ about the x axis and displaces the PSF by $2[3/(1 - c^2)]^{1/2}\lambda F_y a_3$, where $F_y = R/2b$ is the focal ratio of the image-forming beam along the y axis.

The polynomial aberration $R_4$, representing defocus, yields a radially symmetric interferogram bounded, of course, by a rectangle. However, the PSF is biaxially and not radially symmetric because of the larger diffraction spread along the smaller dimension of the pupil. The polynomial aberrations $R_5$ and $R_6$, representing balanced astigmatism, both yield biaxially symmetric interferograms and PSFs, but they are distinctly different from each other. The polynomial aberrations $R_7$ and $R_8$, representing balanced comas, produce biaxially symmetric interferograms, but the PSFs are symmetric only about the y and x axes, respectively. The polynomial aberrations $R_{11}$, $R_{22}$, and $R_{37}$, representing balanced primary, secondary, and tertiary spherical aberrations, are not radially symmetric because of the presence of $\cos 2\theta$; $\cos 2\theta$ and $\cos 4\theta$; and $\cos 2\theta$, $\cos 4\theta$, and $\cos 6\theta$ terms, respectively.


rectangular pupil with c = 0.8 corresponding to an aspect ratio 0.75.

R1 1.Z1

R2 1.0825Z2

R3 1.4434Z3

R4 0.7613Z1 + 1.3186Z4

R5 1.2758Z5

R7 1.6096Z3 + 1.5985Z7

R8 0.8848Z2 + 1.2821Z8

1.0933Z22 + 3.4474Z24

3.6298Z22 3.3094Z24 + 4.9523Z26


rectangular pupil with c = 0.8 corresponding to an aspect ratio = 0.75. (Cont.)

5.3566Z22 + 8.1769Z24 8.2421Z26 + 10.7448Z28

3.5735Z29 + 3.5748Z31

+ 2.8331Z32

6.7194Z29 5.9855Z31 + 6.2807Z33

1.7207Z30 0.9675Z32 + 4.7946Z34

12.2438Z29 + 12.7553Z31 13.4933Z33 + 16.6422Z35

0.7039Z30 + 3.4183Z32 5.3160Z34 + 11.2833Z36

0.4267Z24 + 0.9143Z26 0.1707Z28 + 1.5680Z37

4.5139Z22 + 10.4761Z24 1.4218Z26 + 1.6720Z28 1.3388Z37 + 3.9661Z38

1.3327Z39

20.0217Z22 18.4232Z24 + 20.9849Z26 2.4001Z28 + 5.33986Z37 4.3544Z38 +

6.1988Z40

1.0209Z39 + 3.1115Z41

30.8082Z22 + 48.1556Z24 38.2527Z26 + 36.5458Z28 6.4972Z37 + 10.8145Z38

8.3071Z40 + 9.4857Z42

2.1738Z39 3.2116Z41 + 6.2555Z43

254.2440Z14 + 69.4217Z22 98.2143Z24 + 103.3860Z26 109.2310Z28 + 13.6842Z37

19.2514Z38 + 20.4330Z40 21.5294Z42 + 26.1698Z44

2.7431Z39 + 6.0701Z41 9.6463Z43 + 17.5983Z45


with c = 0.8 corresponding to an aspect ratio 0.75.

R1 = 1

R2 = 2.1651U cosT

R3 = 2.8868U cosT

R5 = 3.1250U 2cos2T

R7 = ( 5.8234 U + 13.5638U3)cosT

R8 = ( 5.4830 U + 10.8789U3)cosT

+ 14.6134U4cos4T

R15 = (2.7303U2 9.8753U4)cos2T + 9.5085 U4cos4T

R22 = 1.2407 + 3.2334( 1 + 2U2) + 3.8612(1 6U2 + 6U4) + 3.9642( 1 + 12U 2 30U4 + 20U6)

+ (2.1911U 2 4.0362 U4)cos2T + 2.0593U4cos4T

R23 = (21.6144 U2 84.877 U4 + 73.4513U6)cos2T + 0.4570U4cos4T

R24 = 3.7592 8.4102( 1 + 2U2) 7.2735(1 6U2 + 6U4) 2.8925( 1 + 12U2 30U4 + 20U6)

+ (30.3780U2 161.2260U4 + 193.4870 U6)cos2T 2.6753U 4cos4T

R26 = 14.8185 + 31.8310( 1 + 2U2) + 25.6640(1 6U2 + 6U4) + 9.60361( 1 + 12U2 30U4 + 20 U6)

+ ( 11.6421U2 + 100.6510U4 185.7370U6)cos2T + ( 49.2338 U4 + 111.1780 U6)cos4T

R28 = 30.6444 62.9091( 1 + 2U2) 45.8608(1 6U 2 + 6 U4) 14.1723( 1 + 12U2 30U4 + 20 U6)

+ (24.2988U 2 223.3660U4 + 458.9270U6)cos2 T + (54.9277U4 185.0350 U6)cos4T + 40.2033 U6cos6 T


with c = 0.8 corresponding to an aspect ratio 0.75. (Cont.)

R30 ( 13.8336U + 121.8610 U3 287.0700U5 + 196.7720U7)cosT + ( 5.7494U 3 + 12.5263U 5)cos3T + 0.2742U 5cos5T

6.7464 U5cos5T

+ 3.4819U5cos5T

+ ( 79.1341U 5 + 175.8610U7)cos5T

+ ( 71.5489U5 + 134.2480U7)cos5T

+ (116.9780U5 377.8130 U7)cos5T + 66.5688 U7cos7T

R36 ( 0.0678U 4.9868U 3 + 49.1032U 5 98.5394U 7)cos T + (20.8068U 3 160.5990U5 + 287.1390 U7)cos3T

+ (46.6489U 5 148.8470 U7)cos5T + 45.1331U 7cos7T

R37 1.4443 + 4.1359( 1 + 2U 2) + 5.8286(1 6U2 + 6U4) + 5.5594( 1 + 12U2 30U4 + 20U6)

+ 4.7041(1 20U 2 + 90 U4 140U 6 + 70U 8) + ( 5.6303U 2 + 25.9377U 4 23.9482U 6)cos2T

+ ( 12.3568U4 + 20.5270 U6)cos4T 0.6386 U6cos6T

4.0165(1 20U 2 + 90U4 140 U6 + 70U8) + ( 50.2770U 2 + 448.8090U 4 1178.8400 U6 + 942.3010U8)cos2T

+ (16.0858U4 31.9182U 6)cos4T + 6.2562U6cos6T

R39 ( 39.2423U2 + 269.5590U4 535.8400 U6 + 316.6330U8)cos2T + (0.4428U 4 + 4.0919U6)cos4T 1.6238U6cos6T

R40 39.3796 + 92.1941( 1 + 2U2) + 90.9522(1 6U 2 + 6U4) + 52.9725( 1 + 12U2 30U4 + 20U 6)

+ 16.0196(1 20U 2 + 90U4 140U6 + 70U8) + (15.8434U 2 229.3420U 4 + 905.7830U 6 1034.5500U8)cos2T

+ (131.6850U 4 633.4660U6 + 736.3840U 8)cos4T 8.9803U6cos6 T

+ 2.4823U 6cos6T

R42 78.9935 177.6270( 1 + 2U2) 161.0420(1 6U2 + 6U4) 81.5109( 1 + 12U2 30U4 + 20 U6)

19.4915(1 20 U2 + 90U4 140U6 + 70 U8) + ( 38.8745U2 + 530.9200 U4 2114.8800U6 + 2569.3900U8)cos2T

+ ( 90.8696U4 + 621.471 U6 986.8280 U8)cos4T + ( 144.9680U6 + 321.9540U8)cos6T

+ ( 115.7120 U6 + 212.3190 U8)cos6T

R44 197.7770 + 437.0330( 1 + 2U2) + 382.5600(1 6U2 + 6 U4) + 183.6730( 1 + 12 U2 30U4 + 20 U6)

+ 41.0527(1 20U2 + 90U4 140 U6 + 70U8) + (36.0550U2 619.6960 U4 + 3063.7900U6 4573.8900U8)cos2T

+ (170.1620U4 1319.9600U 6 + 2427.3200 U8)cos4T + (230.6850U6 730.7330U 8)cos6 T + 111.0290U8cos8T

+ (107.0920U6 327.4060U8)cos6 T + 74.6631U 8cos8T


with c = 0.8 corresponding to an aspect ratio 0.75.

R1 1

R2 2.1651x

R3 2.8866y

R5 6.2500xy

237.8490x4y2 89.6620y4 + 237.8490x2y4 + 79.2831y6

R24 0.2700 + 22.4881x2 120.7660x4 + 135.6370x6 38.2678y2 + 102.3210x2y2 +

19.9357x4y2 + 201.6850y4 367.0390x2y4 251.3380y6

R26 0.9521 + 13.2791x2 82.7075x4 + 117.5130x6 + 36.5633y2 + 27.1545x2y2

165.4110x4y2 284.0090y4 + 206.0630x2y4 + 488.9880y6

69.2811x4y2 + 428.2970y4 + 218.9620x2y4 967.6110y6


with c = 0.8 corresponding to an aspect ratio 0.75. (Cont.)

R29 = 13.2595y + 161.7850x2y 387.9230x4y + 255.0900x6y + 116.4450y3

725.9140x2y3 + 765.2710x4y3 311.6030y5 + 765.2710x2y5 + 255.0900y7

+ 590.3150x5y2 323.2780xy4 + 590.3150x3y4 + 196.7720xy6

0.5157x4y3 + 721.6920y5 1200.6000x2y5 800.5730y7

56.1063x5y2 + 646.3470xy4 895.8120x3y4 615.9100xy6

571.0660x4y3 1098.8600y5 + 736.6060x2y5 + 1619.3500y7

404.2550x5y2 743.2620xy4 + 457.8080x3y4 + 1155.9400xy6

225.9950x4y3 + 1723.1900y5 + 727.3160x2y5 3229.9600y7

190.9260x5y2 + 764.1440xy4 + 592.5860x3y4 2020.1200xy6

657.2590x2y2 1759.1600x4y2 + 1317.1500x6y2 + 253.2650y4 1730.4300x2y4 +

1975.7200x4y4 502.2730y6 + 1317.1500x2y6 + 329.2870y8

321.3330x2y2 142.7060x4y2 + 759.9720x6y2 545.1320y4 + 2402.6700x2y4

1686.9400x4y4 + 1464.1300y6 3009.2300x2y6 1223.4600y8

2110.8800x3y3 + 1899.8000x5y3 1097.7900xy5 + 1899.8000x3y5 + 633.2650xy7

6.4945x2y2 + 657.9390x4y2 529.1540x6y2 + 759.3280y4 1423.0400x2y4

635.6140x4y4 2713.5600y6 + 3609.0500x2y6 + 2892.3100y8

1157.0100x3y3 + 23.2280x5y3 + 2268.5100xy5 2933.8200x3y5 1963.6200xy7

5.1104x2y2 + 248.0660x4y2 878.8930x6y2 896.9530y4 + 128.7860x2y4 +

1681.8400x4y4 + 3979.9200y6 2141.7300x2y6 5242.5800y8

334.3580x3y3 1399.6900x5y3 2863.5400xy5 + 1652.4500x3y5 + 3832.9400xy7

61.1328x2y2 18.4555x4y2 240.8510x6y2 + 1269.7800y4 + 774.5320x2y4 +

741.0040x4y4 6688.3600y6 2405.7900x2y6 + 10716.7000y8

623.3740x5y3 + 3168.5700xy5 + 1970.1700x3y5 6749.5300xy7

[Figure 9-6: rectangular polynomial aberrations $R_1$, $R_2$, $R_3$, ... for $\epsilon = 0.75$, each shown as isometric plot on the top, interferogram on the left, and PSF on the right for a sigma value of one wave.]

[Table 9-7: peak-to-valley numbers of rectangular polynomial aberrations for $c = 0.8$ corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of one wave.]

The Strehl ratio, namely the central value of a PSF relative to its aberration-free value, can be obtained from Eq. (9-8) by letting $x = 0 = y$, i.e., from

\[ I(0, 0) = \frac{1}{16}\left| \int_{-1}^{1}\int_{-1}^{1} \exp[i\Phi(x', y')]\,dx'\,dy' \right|^2 . \tag{9-23} \]

Its value for a rectangular polynomial aberration with a sigma value of 0.1 wave is listed in Table 9-8 and plotted in Figure 9-7. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial. Both the table and the figure illustrate that the Strehl ratio for a small aberration is nearly independent of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$ is the standard deviation of the phase aberration in radians.
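The near-independence of the Strehl ratio from the aberration type can be illustrated with a short sketch. It assumes NumPy, evaluates the pupil mean of $\exp(i\Phi)$ over the unit rectangle with c = 0.8 by Gauss-Legendre quadrature, and uses the Cartesian forms of $R_2$, $R_4$, and $R_5$ as read from Table 9-3; the dictionary keys and helper names are illustrative.

```python
import numpy as np

def strehl(phase, c, n_nodes=64):
    """Squared modulus of the pupil-averaged exp(i*phase) over the unit rectangle."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    X, Y = np.meshgrid(c * t, np.sqrt(1.0 - c**2) * t, indexing="ij")
    weights = np.outer(w, w) / 4.0
    return abs(np.sum(weights * np.exp(1j * phase(X, Y)))) ** 2

c = 0.8
g = np.sqrt(1.0 - 2.0 * c**2 + 2.0 * c**4)
sigma_phi = 2.0 * np.pi * 0.1            # 0.1 wave expressed in radians

# Orthonormal polynomials, each with unit sigma over the unit rectangle
polys = {
    "R2": lambda x, y: np.sqrt(3.0) / c * x,
    "R4": lambda x, y: np.sqrt(5.0) / (2.0 * g) * (3.0 * (x**2 + y**2) - 1.0),
    "R5": lambda x, y: 3.0 / (c * np.sqrt(1.0 - c**2)) * x * y,
}

# Strehl ratio of each polynomial aberration scaled to sigma = 0.1 wave
S = {name: strehl(lambda x, y, R=R: sigma_phi * R(x, y), c)
     for name, R in polys.items()}
approx = np.exp(-sigma_phi**2)           # small-aberration estimate, about 0.67
```

Each computed value stays within a few percent of the estimate $\exp(-\sigma_\Phi^2)$, which is the behavior the text attributes to Table 9-8 and Figure 9-7.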

Table 9-8. Strehl ratio S for rectangular polynomial aberrations for $c = 0.8$ corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of 0.1 wave.

Figure 9-7. Strehl ratio S for rectangular polynomial aberrations for $c = 0.8$ corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of 0.1 wave.

We now consider balancing of a Seidel aberration and obtain its standard deviation

with and without balancing.

9.7.1 Defocus

We start with the defocus aberration

\[ W_d(\rho) = A_d\rho^2 . \tag{9-24} \]

From the form of the orthonormal defocus polynomial $R_4$ given in Table 9-2, it is evident that its sigma value across a rectangular pupil is given by

\[ \sigma_d = \frac{2g}{3\sqrt{5}}A_d , \tag{9-25} \]

where

\[ g = (1 - 2c^2 + 2c^4)^{1/2} . \tag{9-26} \]

9.7.2 Astigmatism

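Next consider $0^\circ$ Seidel astigmatism given by

\[ W_a(\rho, \theta) = A_a\rho^2\cos^2\theta . \tag{9-27} \]

The polynomial $R_6$ representing balanced astigmatism (Table 9-2) can be written

\[ R_6 = \frac{3\sqrt{5}}{4c^2(1 - c^2)g}\left[ g^2\rho^2\cos 2\theta + (1 - 2c^2)\rho^2 \right] + \text{constant} \tag{9-28a} \]

\[ = \frac{3\sqrt{5}\,g}{2c^2(1 - c^2)}\left( \rho^2\cos^2\theta - \frac{c^4}{g^2}\rho^2 \right) + \text{constant} , \tag{9-28b} \]

showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism $\rho^2\cos^2\theta$ is $-c^4/g^2$. It is evident that the balanced astigmatism is given by

\[ W_{ba}(\rho, \theta) = A_a\left( \rho^2\cos^2\theta - \frac{c^4}{g^2}\rho^2 \right) . \tag{9-29} \]

Its sigma value is given by

\[ \sigma_{ba} = \frac{2c^2(1 - c^2)}{3\sqrt{5}\,g}A_a . \tag{9-30} \]

To obtain the sigma value of astigmatism, we write Eq. (9-27) in the form

\[ W_a(\rho, \theta) = \frac{2A_a}{3\sqrt{5}\,g}\left[ c^2(1 - c^2)R_6 + c^4R_4 \right] + \text{constant} . \tag{9-31} \]

Utilizing the orthonormality of the polynomials, we obtain

\[ \sigma_a = \frac{2c^2}{3\sqrt{5}}A_a . \tag{9-32} \]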

9.7.3 Coma

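Now, we consider Seidel coma

\[ W_c(\rho, \theta) = A_c\rho^3\cos\theta . \tag{9-33} \]

The polynomial $R_8$ representing balanced coma (Table 9-2) can be written

\[ R_8 = \frac{1}{2c}\left( \frac{21}{35 - 70c^2 + 62c^4} \right)^{1/2}\left[ 15\rho^3\cos\theta - (5 + 4c^2)\rho\cos\theta \right] . \tag{9-34} \]

It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma $\rho^3\cos\theta$ is $-(5 + 4c^2)/15$ compared to $-2/3$ for a circular pupil. The sigma value of the balanced coma $A_c[\rho^3\cos\theta - (5 + 4c^2)\rho\cos\theta/15]$ is given by

\[ \sigma_{bc} = \frac{2c}{15}\left( \frac{35 - 70c^2 + 62c^4}{21} \right)^{1/2}A_c . \tag{9-35} \]

To obtain the sigma value of Seidel coma, we write Eq. (9-33) in the form

\[ W_c(\rho, \theta) = \frac{A_c}{15}\left[ 2c\left( \frac{35 - 70c^2 + 62c^4}{21} \right)^{1/2}R_8 + \frac{c(5 + 4c^2)}{\sqrt{3}}R_2 \right] . \tag{9-36} \]

Utilizing the orthonormality of the polynomials, we obtain

\[ \sigma_c = c\left( \frac{7 + 8c^4}{105} \right)^{1/2}A_c . \tag{9-37} \]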

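9.7.4 Spherical Aberration

Finally, we consider Seidel spherical aberration

\[ W_s(\rho) = A_s\rho^4 . \tag{9-38} \]

The polynomial $R_{11}$ representing balanced spherical aberration (Table 9-2) can be written

\[ R_{11} = \frac{1}{8\mu}\left[ 315\rho^4 + 30(1 - 2c^2)\rho^2\cos 2\theta - 240\rho^2 + \text{constant} \right] \tag{9-39a} \]

\[ = \frac{1}{8\mu}\left[ 315\rho^4 + 60(1 - 2c^2)\rho^2\cos^2\theta - 30(9 - 2c^2)\rho^2 + \text{constant} \right] , \tag{9-39b} \]

where $\mu = (9 - 36c^2 + 103c^4 - 134c^6 + 67c^8)^{1/2}$. The balanced spherical aberration is accordingly given by

\[ W_{bs}(\rho, \theta) = A_s\left[ \rho^4 + \frac{6}{63}(1 - 2c^2)\rho^2\cos 2\theta - \frac{16}{21}\rho^2 \right] \tag{9-40a} \]

\[ = A_s\left[ \rho^4 + \frac{12}{63}(1 - 2c^2)\rho^2\cos^2\theta - \frac{6}{63}(9 - 2c^2)\rho^2 \right] . \tag{9-40b} \]

It shows, as in the case of an elliptical pupil, that spherical aberration is balanced not only by defocus but astigmatism as well. Its sigma value is given by

\[ \sigma_{bs} = \frac{8\mu}{315}A_s . \tag{9-41} \]

To obtain the sigma value of Seidel spherical aberration, we write Eq. (9-38) in the form

\[ W_s(\rho) = \frac{A_s}{315}\left[ 8\mu R_{11} - \frac{40c^2(1 - c^2)(1 - 2c^2)}{\sqrt{5}\,g}R_6 + \frac{20(10g^2 - 1)}{\sqrt{5}\,g}R_4 \right] + \text{constant} . \tag{9-42} \]

Utilizing the orthonormality of the polynomials, we obtain

\[ \sigma_s = \frac{4A_s}{45\sqrt{7}}\left( 63 - 162c^2 + 206c^4 - 88c^6 + 44c^8 \right)^{1/2} . \tag{9-43} \]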

The sigma values of Seidel aberrations with and without balancing are given in Table 9-9.

Table 9-9. Sigma of a Seidel aberration with and without balancing, where $A_i$ is the coefficient of an aberration.

Aberration                       Sigma
Defocus                          $\sigma_d = [2g/(3\sqrt{5})]\,A_d$
Astigmatism                      $\sigma_a = [2c^2/(3\sqrt{5})]\,A_a$
Balanced astigmatism             $\sigma_{ba} = [2c^2(1 - c^2)/(3\sqrt{5}\,g)]\,A_a$
Coma                             $\sigma_c = c\,[(7 + 8c^4)/105]^{1/2}A_c$
Balanced coma                    $\sigma_{bc} = [2c/(15\sqrt{21})](35 - 70c^2 + 62c^4)^{1/2}A_c$
Spherical aberration             $\sigma_s = [4A_s/(45\sqrt{7})](63 - 162c^2 + 206c^4 - 88c^6 + 44c^8)^{1/2}$
Balanced spherical aberration    $\sigma_{bs} = (8\mu/315)\,A_s$
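The closed-form sigmas can be cross-checked against a direct numerical standard deviation over the unit rectangle. The following is a sketch under stated assumptions: NumPy, unit aberration coefficients, Gauss-Legendre quadrature, and the balanced forms of Eqs. (9-29), (9-35), and (9-40a) written in Cartesian coordinates with $\rho^2\cos 2\theta = x^2 - y^2$; the helper names are hypothetical.

```python
import numpy as np

def sigma_numeric(W, c, n_nodes=32):
    """Standard deviation of W(x, y) over the unit rectangle of half-width c."""
    t, w = np.polynomial.legendre.leggauss(n_nodes)
    X, Y = np.meshgrid(c * t, np.sqrt(1.0 - c**2) * t, indexing="ij")
    weights = np.outer(w, w) / 4.0
    vals = W(X, Y)
    mean = np.sum(weights * vals)
    return np.sqrt(np.sum(weights * vals**2) - mean**2)

def sigma_closed_form(c):
    """Sigma expressions of Section 9.7 for unit aberration coefficients."""
    g = np.sqrt(1 - 2 * c**2 + 2 * c**4)
    mu = np.sqrt(9 - 36 * c**2 + 103 * c**4 - 134 * c**6 + 67 * c**8)
    return {
        "defocus": 2 * g / (3 * np.sqrt(5)),
        "astigmatism": 2 * c**2 / (3 * np.sqrt(5)),
        "balanced astigmatism": 2 * c**2 * (1 - c**2) / (3 * np.sqrt(5) * g),
        "coma": c * np.sqrt((7 + 8 * c**4) / 105),
        "balanced coma": 2 * c / 15 * np.sqrt((35 - 70 * c**2 + 62 * c**4) / 21),
        "spherical": 4 / (45 * np.sqrt(7))
        * np.sqrt(63 - 162 * c**2 + 206 * c**4 - 88 * c**6 + 44 * c**8),
        "balanced spherical": 8 * mu / 315,
    }

def seidel_functions(c):
    """Seidel and balanced aberrations in Cartesian form (x = rho cos theta)."""
    g2 = 1 - 2 * c**2 + 2 * c**4
    r2 = lambda x, y: x**2 + y**2
    return {
        "defocus": lambda x, y: r2(x, y),
        "astigmatism": lambda x, y: x**2,
        "balanced astigmatism": lambda x, y: x**2 - c**4 / g2 * r2(x, y),
        "coma": lambda x, y: r2(x, y) * x,
        "balanced coma": lambda x, y: r2(x, y) * x - (5 + 4 * c**2) / 15 * x,
        "spherical": lambda x, y: r2(x, y) ** 2,
        "balanced spherical": lambda x, y: r2(x, y) ** 2
        + 2 * (1 - 2 * c**2) / 21 * (x**2 - y**2) - 16 / 21 * r2(x, y),
    }

def max_discrepancy(c):
    """Largest absolute difference between closed-form and numerical sigma."""
    closed, funcs = sigma_closed_form(c), seidel_functions(c)
    return max(abs(closed[k] - sigma_numeric(funcs[k], c)) for k in closed)
```

Calling `max_discrepancy(0.8)` compares all seven entries at once for the aspect ratio 0.75 used throughout the chapter.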


Figures 9-8 and 9-9 show the variation of sigma for a rectangular pupil as a function

of its width c along the x axis. It is evident from Figure 9-8 that defocus and spherical

sigmas have a minimum for a square pupil (i.e., for $c = 1/\sqrt{2}$), but coma and astigmatism

sigmas increase monotonically as c increases from a value of zero, representing a slit

pupil along the y axis, to a value of 1, representing a slit pupil parallel to the x axis. The

balanced spherical sigma in Figure 9-9 has a minimum for a square pupil though its

variation is relatively small. The sigma for balanced astigmatism has a distinct maximum

for a square pupil, while the monotonically increasing sigma for balanced coma has a

point of inflection.
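The trends described for Figures 9-8 and 9-9 can be reproduced by scanning the closed-form sigmas over c. This is a sketch only, assuming NumPy, unit aberration coefficients, and an arbitrary grid of c values between 0.05 and 0.95; the variable names are illustrative.

```python
import numpy as np

c = np.linspace(0.05, 0.95, 1801)           # assumed grid of half-widths
g = np.sqrt(1 - 2 * c**2 + 2 * c**4)
mu = np.sqrt(9 - 36 * c**2 + 103 * c**4 - 134 * c**6 + 67 * c**8)

sigma = {
    "defocus": 2 * g / (3 * np.sqrt(5)),
    "astigmatism": 2 * c**2 / (3 * np.sqrt(5)),
    "coma": c * np.sqrt((7 + 8 * c**4) / 105),
    "spherical": 4 / (45 * np.sqrt(7))
    * np.sqrt(63 - 162 * c**2 + 206 * c**4 - 88 * c**6 + 44 * c**8),
    "balanced astigmatism": 2 * c**2 * (1 - c**2) / (3 * np.sqrt(5) * g),
    "balanced coma": 2 * c / 15 * np.sqrt((35 - 70 * c**2 + 62 * c**4) / 21),
    "balanced spherical": 8 * mu / 315,
}

c_square = 1.0 / np.sqrt(2.0)               # half-width of the unit square

# Locations of the extrema on the grid
c_min = {k: c[np.argmin(sigma[k])]
         for k in ("defocus", "spherical", "balanced spherical")}
c_max_balanced_astigmatism = c[np.argmax(sigma["balanced astigmatism"])]

# Monotonic growth with c for astigmatism, coma, and balanced coma
monotonic = {k: bool(np.all(np.diff(sigma[k]) > 0))
             for k in ("astigmatism", "coma", "balanced coma")}
```

On this grid the minima of the defocus, spherical, and balanced spherical sigmas and the maximum of the balanced astigmatism sigma all fall at the grid point nearest $c = 1/\sqrt{2}$.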

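[Figure 9-8: sigma of the Seidel aberrations (defocus, astigmatism, coma, and spherical aberration) as a function of the half-width c of a unit rectangular pupil.]

[Figure 9-9: sigma of the balanced Seidel aberrations (balanced astigmatism, coma, and spherical aberration) as a function of the half-width c of a unit rectangular pupil.]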

9.8 SUMMARY

The aberration-free PSF and OTF are discussed in Section 9.3. The polynomials

orthonormal over a unit rectangular pupil, representing balanced aberrations over such a

pupil are given through the fourth order in Tables 9-1 through 9-3 in terms of the circle

polynomials, in polar coordinates, and in Cartesian coordinates, respectively. Each

orthonormal polynomial consists of either the cosine or the sine terms, but not both. Thus

an even j polynomial, for example, consists of only the cosine terms, as may be seen from

Table 9-2. This is a consequence of the biaxial symmetry of the pupil. Since the

polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point, polynomial numbering with two indices n and m loses significance, and the polynomials must be numbered with a single index j. They are ordered in the same manner as the polynomials

discussed in previous chapters.

As in the case of elliptical polynomials, only the first 15 rectangular polynomials are

given in the tables. The expressions for the higher-order polynomials are very long unless

the aspect ratio of the pupil is specified. The polynomial R6 for astigmatism is a linear

combination of Z 6 , Z 4 , and Z1, showing that the balancing defocus for (zero-degree)

Seidel astigmatism is different for a rectangular pupil compared to that, for example, for a

circular pupil. Moreover, $R_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$ as well. It is evidently not radially symmetric. As expected, the rectangular polynomials reduce to the square polynomials (discussed in the next chapter) as $c \to 1/\sqrt{2}$, i.e., as the unit rectangle approaches a unit square.

The first 45 rectangular polynomials, i.e., up to and including the eighth order, for a

rectangular pupil with an aspect ratio of $\epsilon = 0.75$ are given in Tables 9-4 through 9-6 in terms of Zernike circle polynomials, in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in three different but equivalent ways in Figure 9-6 with

the isometric plot, interferogram, and the PSF for a sigma value of one wave. The peak-

to-valley aberration numbers (in units of wavelength) are given in Table 9-7. The Strehl

ratio for a sigma value of 0.1 wave is given in Table 9-8 and plotted in Figure 9-7. The

Seidel aberrations are discussed in Section 9.7, and their sigma values with and without

balancing are given in Table 9-9.



Chapter 10

Systems with Square Pupils

10.1 INTRODUCTION

We start this chapter with a brief discussion of the aberration-free PSF and OTF for a

system with a square pupil, as, for example, a high-power laser beam with a square cross-

section. We can obtain these results as a special case of the rectangular pupils discussed

in the last chapter. Similarly, the square polynomials $S_k$ can be obtained as a special case of the rectangular polynomials $R_k$ discussed there, i.e., by letting $c = 1/\sqrt{2}$. However,

we describe the procedure for obtaining them independently [1,2], and give expressions

for the first 45 polynomials, i.e., up to and including the eighth order. The isometric,

interferometric, and PSF plots of these polynomial aberrations with a sigma value of one

wave are given along with their P-V numbers. The Strehl ratios for these polynomial

aberrations for a sigma value of one-tenth of a wave are also given. Finally, we discuss

how to obtain the standard deviation of a Seidel aberration with and without balancing

and then discuss the Strehl ratio as a function of it.

Bray has also obtained polynomials that are orthogonal over a square from the circle polynomials, but he chose a circle inscribed inside a square instead of the other way around [3]. Thus, his square with a full width of unity has regions that fall outside the unit circle. Defining a unit square as we have, where its semidiagonal is unity, has the advantage that the coefficient of a term in a certain polynomial represents its peak value. For example, since $\rho$ has a maximum value of unity, the coefficients of astigmatism $\rho^2\cos 2\theta$ in $S_6$, or coma $\rho^3\cos\theta$ in $S_8$, or spherical aberration $\rho^4$ in $S_{11}$ represent their peak values.

Products of Legendre polynomials, which are orthogonal over a square pupil, are not suitable for the analysis of square wavefronts [4], because they do not represent classical or balanced aberrations. For example, defocus is represented by a term in $x^2 + y^2$. While it can be expanded in terms of a complete set of Legendre polynomials, it cannot be represented by a single 2D Legendre polynomial (i.e., as a product of x- and y-Legendre polynomials). The same difficulty holds for spherical aberration, coma, etc. However, products of Legendre polynomials are the correct polynomials for an anamorphic system, as discussed in Chapter 13.

As illustrated in Figure 10-1, consider an optical system with a square exit pupil of half-width $a$ and area $S_{ex} = 4a^2$ lying in the $(x_p, y_p)$ plane with the $z$ axis as its optical axis. For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

$$P(x_p, y_p) = A(x_p, y_p)\exp\left[i\Phi(x_p, y_p)\right] , \tag{10-1}$$


where

$$A(x_p, y_p) = \left(P_{ex}/S_{ex}\right)^{1/2} , \quad -a \le x_p \le a , \quad -a \le y_p \le a . \tag{10-2}$$

10.3.1 PSF

From Eq. (2-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system with a uniformly illuminated square exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written

$$I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\int_{-a}^{a}\int_{-a}^{a}\exp\left[i\Phi(x_p, y_p)\right]\exp\left[-\frac{2\pi i}{\lambda R}\left(x_i x_p + y_i y_p\right)\right]dx_p\,dy_p\right|^2 . \tag{10-3}$$

Letting

$$(x', y') = a^{-1}(x_p, y_p) \tag{10-4}$$

and

$$(x, y) = \frac{1}{\lambda F}(x_i, y_i) , \tag{10-5}$$

where

$$F = R/2a \tag{10-6}$$

is the focal ratio of the image-forming beam along the x and the y axes, we obtain the irradiance distribution

$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi(x', y')\right]\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 . \tag{10-7}$$

For the aberration-free case, $\Phi = 0$, and Eq. (10-7) reduces to

$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 = \left(\frac{\sin \pi x}{\pi x}\right)^2\left(\frac{\sin \pi y}{\pi y}\right)^2 . \tag{10-8}$$

Figure 10-2a shows the 2D PSF, in particular, the central bright square spot of size $2 \times 2$, with each dimension in units of $\lambda F$. The PSF is zero wherever x and/or y is a positive or a negative integer. Moreover, there are rectangular spots along the x and y axes, but square spots elsewhere in the PSF. Figure 10-2b shows the irradiance distribution along the x and y axes, and along the diagonal of the central bright spot as $I(x, 0)$, $I(0, y)$, and $I(x, x) \equiv I(r)$, where $r = (x^2 + y^2)^{1/2} = \sqrt{2}\,x$ and

$$I(r) = \left[\frac{\sin\left(\pi r/\sqrt{2}\right)}{\pi r/\sqrt{2}}\right]^4 . \tag{10-9}$$


Figure 10-2. (a) 2D aberration-free PSF. (b) Irradiance distribution along the x and

y axes, and along the diagonal of the central bright spot of the PSF.
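Equations (10-7) through (10-9) can be checked against each other numerically. The following is a minimal sketch, assuming NumPy and a Gauss-Legendre quadrature of our own choosing; the function names are hypothetical and are not part of the text.

```python
import numpy as np

def psf_quadrature(x, y, n=64):
    """Eq. (10-7) with Phi = 0, evaluated by Gauss-Legendre quadrature."""
    t, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    # With Phi = 0 the double integral factors into two 1D integrals.
    fx = np.sum(w * np.exp(-1j * np.pi * x * t))
    fy = np.sum(w * np.exp(-1j * np.pi * y * t))
    return np.abs(fx * fy) ** 2 / 16.0

def psf_closed_form(x, y):
    """Eq. (10-8). Note that np.sinc(u) is sin(pi u) / (pi u)."""
    return np.sinc(x) ** 2 * np.sinc(y) ** 2

def psf_diagonal(r):
    """Eq. (10-9): PSF along the diagonal, where x = y = r / sqrt(2)."""
    return np.sinc(r / np.sqrt(2)) ** 4

if __name__ == "__main__":
    for x, y in [(0.0, 0.0), (0.3, 0.7), (1.0, 0.4)]:
        print(x, y, psf_quadrature(x, y), psf_closed_form(x, y))
```

The quadrature and the closed form agree to rounding error, and the PSF vanishes whenever x or y is a nonzero integer, as stated above.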


10.3.2 OTF

From Eq. (1-13), the aberration-free OTF of a system with a square pupil at a spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two squares centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 10-3. The overlap area is given by

$$S(\xi, \eta) = (2a - \lambda R\xi)(2a - \lambda R\eta) = 4a^2\left(1 - \frac{\xi}{1/\lambda F}\right)\left(1 - \frac{\eta}{1/\lambda F}\right) . \tag{10-10}$$

Hence, the fractional area of overlap, or the OTF of the system, may be written

$$\tau(v_x, v_y) = (1 - v_x)(1 - v_y) , \tag{10-11}$$

where

$$(v_x, v_y) = \left(\frac{\xi}{1/\lambda F}, \frac{\eta}{1/\lambda F}\right) \tag{10-12}$$

are the spatial frequency components in units of the cutoff frequency $1/\lambda F$ along the x or the y axis. The OTF $\tau(v_x, 0)$ along the x axis is the same as the OTF $\tau(0, v_y)$ along the y axis, with the same normalized cutoff frequency of unity.

Figure 10-3. Overlap area of two square pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$.

The OTF $\tau(v)$, where $v = (v_x^2 + v_y^2)^{1/2}$, along the diagonal of the pupil can be obtained from Eq. (10-11) by letting $v_x = v_y = v/\sqrt{2}$. Thus

$$\tau(v) = \left(1 - \frac{v}{\sqrt{2}}\right)^2 . \tag{10-13}$$

Figure 10-4 shows the OTF $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ along the x and y axes, and along the diagonal of the pupil with cutoff frequencies 1, 1, and $\sqrt{2}$, respectively, each in units of $1/\lambda F$. Of course, $\tau(v_x, 0) = \tau(0, v_y)$ for any $v_x = v_y$. The OTF $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range $0 < v = v_x < 2(\sqrt{2} - 1)$. They are equal to each other at the frequency $2(\sqrt{2} - 1)$ (or about 0.83), and $\tau(v) > \tau(v_x, 0)$ for frequencies in the range $2(\sqrt{2} - 1) < v = v_x < \sqrt{2}$. Of course, $\tau(v_x, 0)$ is zero for $v_x \ge 1$, but $\tau(v)$ is not until $v = \sqrt{2}$.
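The comparison between the axial and diagonal OTFs can be verified with a few lines of code. This is a sketch only: the use of absolute values and the clipping to zero beyond the cutoff are our assumptions, added so that Eq. (10-11) can be evaluated at any frequency, and the function names are hypothetical.

```python
import numpy as np

def otf_square(vx, vy):
    """Eq. (10-11); |v| and clipping beyond the cutoff are added assumptions."""
    ax = np.clip(1.0 - np.abs(vx), 0.0, None)
    ay = np.clip(1.0 - np.abs(vy), 0.0, None)
    return ax * ay

def otf_diagonal(v):
    """Eq. (10-13): along the diagonal, v_x = v_y = v / sqrt(2)."""
    return otf_square(v / np.sqrt(2), v / np.sqrt(2))

# Frequency at which the axial and diagonal curves cross (about 0.83).
V_CROSS = 2.0 * (np.sqrt(2) - 1.0)

if __name__ == "__main__":
    for v in (0.5, V_CROSS, 1.2, np.sqrt(2)):
        print(v, otf_square(v, 0.0), otf_diagonal(v))
```

Below the crossover frequency the diagonal OTF is the smaller of the two, and above it the diagonal OTF is the larger, in agreement with the ranges quoted above.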

Figure 10-4. OTF $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ of a system with a square pupil. The frequencies $v_x$, $v_y$, and $v$ are in units of the cutoff frequency $1/\lambda F$ along the x axis.

10.4 SQUARE POLYNOMIALS

Figure 10-5 shows a unit square inscribed inside a unit circle. The distance of a corner point of the square, such as A, from its center O is unity, but each of its sides has a length of $\sqrt{2}$, and its area is 2. The orthonormal square polynomials $S_j(x, y)$ obtained by orthogonalizing the Zernike circle polynomials $Z_j(x, y)$ over a unit square are given by [see Eq. (3-18)]

$$S_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}S_k\rangle S_k\right] , \tag{10-14}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit square, i.e., they satisfy the orthonormality condition

$$\frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}S_j S_{j'}\,dx = \delta_{jj'} . \tag{10-15}$$

The angular brackets indicate a mean value over the square pupil. Thus, for example,

$$\langle Z_j S_k\rangle = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}Z_j S_k\,dx . \tag{10-16}$$

If the integrand is an odd function of x and/or y, the mean value is zero because of the

symmetric limits of integration. If the integrand is an even function, then we may replace

the lower limits of integration by zero and multiply the double integral by 4.
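The recursive procedure of Eq. (10-14) can be sketched numerically for the first six polynomials. This is an illustration only, assuming NumPy and a Gauss-Legendre grid of our own choosing; the helper names `mean` and `gram_schmidt` are hypothetical, and the circle polynomials $Z_1$ through $Z_6$ are written in their standard Cartesian form.

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(20)   # exact for the low-order products used here
X, Y = np.meshgrid(H * t, H * t)
WQ = np.outer(w, w) / 4.0                    # weights scaled so that mean(1) == 1

def mean(f):
    """Mean value over the unit square, as in Eq. (10-16)."""
    return np.sum(WQ * f)

RHO2 = X**2 + Y**2
ZERNIKE = [                                  # Z1..Z6 in Cartesian form
    np.ones_like(X),
    2 * X,
    2 * Y,
    np.sqrt(3) * (2 * RHO2 - 1),
    2 * np.sqrt(6) * X * Y,
    np.sqrt(6) * (X**2 - Y**2),
]

def gram_schmidt(zs):
    """Eq. (10-14): subtract the projections on earlier S_k, then normalize."""
    ss = []
    for z in zs:
        g = z.copy()
        for s in ss:
            g = g - mean(z * s) * s
        ss.append(g / np.sqrt(mean(g * g)))
    return ss

S = gram_schmidt(ZERNIKE)                    # S[0] is S1, ..., S[5] is S6
```

On this grid `S[3]` reproduces the defocus polynomial $\sqrt{5/2}\,(3\rho^2 - 1)$ and `S[5]` reproduces $3\sqrt{5/2}\,(x^2 - y^2)$.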

The orthonormal square polynomials up to and including the eighth order, i.e., the

first 45 polynomials, in terms of the Zernike circle polynomials are given in Table 10-1.

Figure 10-5. Unit square of half-width $1/\sqrt{2}$ inscribed inside a unit circle. Its corner points, such as A, lie at a distance of unity from its center.

Table 10-1. Orthonormal square polynomials $S_j$ in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$.

S1 = Z1

S2 = √(3/2) Z2

S3 = √(3/2) Z3

S4 = (√(5/2)/2) Z1 + (√(15/2)/2) Z4

S5 = √(3/2) Z5

S6 = (√15/2) Z6

S7 = (3√(21/31)/2) Z3 + (5√(21/62)/2) Z7

S8 = (3√(21/31)/2) Z2 + (5√(21/62)/2) Z8

S14 = 261/(8 134 )Z1 + (345 3 134 /16)Z4 + (129 5 134 /16)Z11 + (3 335 /16)Z14

S16 = 1.71440511Z2 +1.71491497Z8 + 0.65048499Z10 + 1.52093102Z16

S17 = 1.71440511Z3 + 1.71491497Z7 0.65048449Z9 + 1.52093102Z17

S18 = 4.10471345Z2 + 3.45884077Z8 + 5.34411808Z10 + 1.51830574Z16 + 2.80808005Z18

S19 = 4.10471345Z3 3.45884078Z7 + 5.34411808Z9 1.51830575Z17 + 2.80808005Z19

S20 = 5.57146696Z2 + 4.44429264Z8 + 3.00807599Z10 + 1.70525179Z16 +1.16777987Z18 + 4.19716701Z20

S21 = 5.57146696Z3 + 4.44429264Z7 3.00807599Z9 + 1.70525179Z17 1.16777988Z19 + 4.19716701Z21

S22 = 1.33159935Z1 + 1.94695912Z4 + 1.74012467Z11 + 0.65624211Z14 + 1.50989174Z22

S23 = 0.95479991Z5 + 1.01511643Z13 + 1.28689496Z23

S24 = 9.87992565Z6 + 7.28853095Z12 + 3.38796312Z24

S25 = 5.61978925Z15 + 2.84975327Z25

S26 = 11.00650275Z1 + 14.00366597Z4 + 9.22698484Z11 + 13.55765720Z14

+ 3.18799971Z22 + 5.11045000Z26

S27 = 4.24396143Z5 + 2.70990074Z13 + 0.84615108Z23 + 5.17855026Z27

S29 = 2.42764289Z3 + 2.69721906Z7 1.56598064Z9 + 2.12208902Z17

0.93135653Z19 + 0.25252773Z21 + 1.59017528Z29

+ 0.93135653Z18 + 0.25252773Z20 + 1.59017528Z30


+ 7.01044701Z19 1.26347272Z21 1.90131756Z29 + 3.07960207Z31

S32 9.10300982Z2 + 8.79978208Z8 + 10.69381427Z10 +5.37383385Z16

+ 7.01044701Z18 + 1.26347272Z20 + 1.90131756Z30 + 3.07960207Z32

S33 21.39630883Z3 + 19.76696884Z7 12.70550260Z9 + 11.05819453Z17

7.02178756Z19 +15.80286172Z21 + 3.29259996Z29 2.07602718Z31

+ 5.40902889Z33

S34 21.39630883Z2 + 19.76696884Z8 + 12.70550260Z10 + 11.05819453Z16

+ 7.02178756Z18 +15.80286172Z20 + 3.29259996Z30 + 2.07602718Z32

+ 5.40902889Z34

S35 16.54454462Z3 14.89205549Z7 + 22.18054997Z9 7.94524849Z17

+ 11.85458952Z19 6.18963457Z21 2.19431441Z29 +3.24324400Z31

1.72001172Z33 + 8.16384008Z35

S36 16.54454462Z2 + 14.89205549Z8 + 22.18054997Z10 + 7.94524849Z16

+ 11.85458952Z18 + 6.18963457Z20 + 2.19431441Z30 +3.24324400Z32

+ 1.72001172Z34 + 8.16384008Z36

S37 1.75238960Z1 + 2.72870567Z4 + 2.76530671Z11 + 1.43647360Z14

+ 2.12459170Z22 + 0.92450043Z26 + 1.58545010Z37

S38 19.24848143Z6 + 16.41468913Z12 + 9.76776798Z24 + 1.47438007Z28

+ 3.83118509Z38

S39 0.46604820Z5 + 0.84124290Z13 + 1.00986774Z23 0.42520747Z27 + 1.30579570Z39

S40 28.18104531Z1 + 38.52219208Z4 + 30.18363661Z11 + 36.44278147Z14 +

15.52577202Z22 + 19.21524879Z26 + 4.44731721Z37 + 6.00189814Z40

S41 (369/4) 35 3574 Z15 + [11781/(32 3574 )]Z25 + (2145/32) 7 3574 Z41

+7.75796322Z38 + 9.37150432Z42

S43 14.30642479Z5 + 11.17404702Z13 + 5.68231935Z23 + 18.15306055Z27

+ 1.54919583Z39 + 5.90178984Z43

S44 36.12567424Z1 + 47.95305224Z4 + 35.30691679Z11 + 56.72014548Z14

+ 16.36470429Z22 + 26.32636277Z26 +3.95466397Z37 +6.33853092Z40

+ 12.38056785Z44

S45 21.45429746Z15 + 9.94633083Z25 + 2.34632890Z41 + 10.39130049Z45

Table 10-2. Orthonormal square polynomials $S_j(\rho, \theta)$ in polar coordinates.

S1 = 1

S2 = √6 ρ cos θ

S3 = √6 ρ sin θ

S4 = √(5/2) (3ρ² − 1)

S5 = 3ρ² sin 2θ

S6 = 3√(5/2) ρ² cos 2θ

S7 = √(21/31) (15ρ² − 7) ρ sin θ

S8 = √(21/31) (15ρ² − 7) ρ cos θ

S16 = √(55/1966) [11ρ³ cos 3θ + 3(19 − 97ρ² + 105ρ⁴) ρ cos θ]

S17 = √(55/1966) [−11ρ³ sin 3θ + 3(19 − 97ρ² + 105ρ⁴) ρ sin θ]

S18 = (1/4)√(3/844397) [5(−10099 + 20643ρ²) ρ³ cos 3θ + 3(3128 − 23885ρ² + 37205ρ⁴) ρ cos θ]

S19 = (1/4)√(3/844397) [5(−10099 + 20643ρ²) ρ³ sin 3θ − 3(3128 − 23885ρ² + 37205ρ⁴) ρ sin θ]

S20 = (1/16)√(7/859) [2577ρ⁵ cos 5θ − 5(272 − 717ρ²) ρ³ cos 3θ + 30(22 − 196ρ² + 349ρ⁴) ρ cos θ]

S21 = (1/16)√(7/859) [2577ρ⁵ sin 5θ + 5(272 − 717ρ²) ρ³ sin 3θ + 30(22 − 196ρ² + 349ρ⁴) ρ sin θ]

S26 = [1/(16√849)] [5(−98 + 2418ρ² − 12051ρ⁴ + 15729ρ⁶) + 3(−8195 + 17829ρ²) ρ⁴ cos 4θ]

S27 = [1/(16√7846)] [27461ρ⁶ sin 6θ + 15(348 − 2744ρ² + 4487ρ⁴) ρ² sin 2θ]

S29 = … + (8.47599260ρ³ − 16.13156842ρ⁵) sin 3θ + 0.87478174ρ⁵ sin 5θ

S30 = (−13.79189793ρ + 125.49411319ρ³ − 308.13074909ρ⁵ + 222.62454035ρ⁷) cos θ + (−8.47599260ρ³ + 16.13156842ρ⁵) cos 3θ + 0.87478174ρ⁵ cos 5θ

S31 = (6.14762642ρ − 79.44065626ρ³ + 270.16115026ρ⁵ − 266.18445920ρ⁷) sin θ + (56.29115383ρ³ − 248.12774426ρ⁵ + 258.68657393ρ⁷) sin 3θ − 4.37679791ρ⁵ sin 5θ

S32 = (−6.14762642ρ + 79.44065626ρ³ − 270.16115026ρ⁵ + 266.18445920ρ⁷) cos θ + (56.29115383ρ³ − 248.12774426ρ⁵ + 258.68657393ρ⁷) cos 3θ + 4.37679791ρ⁵ cos 5θ

S33 = (−6.78771487ρ + 103.15977419ρ³ − 407.15689696ρ⁵ + 460.96399558ρ⁷) sin θ + (−21.68093294ρ³ + 127.50233381ρ⁵ − 174.38628345ρ⁷) sin 3θ + (−75.07397471ρ⁵ + 151.45280913ρ⁷) sin 5θ

S34 = (−6.78771487ρ + 103.15977419ρ³ − 407.15689696ρ⁵ + 460.96399558ρ⁷) cos θ + (21.68093294ρ³ − 127.50233381ρ⁵ + 174.38628345ρ⁷) cos 3θ + (−75.07397471ρ⁵ + 151.45280913ρ⁷) cos 5θ

S35 = (3.69268433ρ − 59.40323317ρ³ + 251.40397826ρ⁵ − 307.20401818ρ⁷) sin θ + (28.20381860ρ³ − 183.86176738ρ⁵ + 272.43249673ρ⁷) sin 3θ + (19.83875817ρ⁵ − 48.16032819ρ⁷) sin 5θ + 32.65536033ρ⁷ sin 7θ

S36 = (−3.69268433ρ + 59.40323317ρ³ − 251.40397826ρ⁵ + 307.20401818ρ⁷) cos θ + (28.20381860ρ³ − 183.86176738ρ⁵ + 272.43249673ρ⁷) cos 3θ + (−19.83875817ρ⁵ + 48.16032819ρ⁷) cos 5θ + 32.65536033ρ⁷ cos 7θ

S37 = 2.34475558 − 55.32128002ρ² + 296.53777290ρ⁴ − 553.46621887ρ⁶ + 332.94452229ρ⁸ + (−12.75329096ρ⁴ + 20.75498320ρ⁶) cos 4θ

S38 = (−51.83202694ρ² + 451.93890159ρ⁴ − 1158.49126888ρ⁶ + 910.24313983ρ⁸) cos 2θ + 5.51662508ρ⁶ cos 6θ

S39 = (−39.56789598ρ² + 267.47071204ρ⁴ − 525.02362247ρ⁶ + 310.24123146ρ⁸) sin 2θ − 1.59098067ρ⁶ sin 6θ

S40 = 1.21593465 − 45.42224477ρ² + 373.41167834ρ⁴ − 1046.32659847ρ⁶ + 933.93661610ρ⁸ + (137.71626496ρ⁴ − 638.10242034ρ⁶ + 712.98912399ρ⁸) cos 4θ

S42 = … + (−150.76043598ρ⁶ + 318.07940431ρ⁸) cos 6θ

S43 = (−9.12193686ρ² + 110.47679089ρ⁴ − 371.21215287ρ⁶ + 368.07015240ρ⁸) sin 2θ + (−107.35168289ρ⁶ + 200.31338972ρ⁸) sin 6θ

S44 = 0.58427150 − 25.29433513ρ² + 242.54313549ρ⁴ − 795.02011474ρ⁶ + 830.47943579ρ⁸ + (90.22533813ρ⁴ − 538.44320774ρ⁶ + 752.97905752ρ⁸) cos 4θ + 52.52630092ρ⁸ cos 8θ

S45 = (31.08509142ρ⁴ − 194.79990628ρ⁶ + 278.72965314ρ⁸) sin 4θ + 44.08655427ρ⁸ sin 8θ

Table 10-3. Orthonormal square polynomials $S_j(x, y)$ in Cartesian coordinates x, y, where ρ² = x² + y².

S1 = 1

S2 = √6 x

S3 = √6 y

S4 = √(5/2) (3ρ² − 1)

S5 = 6xy

S6 = 3√(5/2) (x² − y²)

S7 = √(21/31) (15ρ² − 7) y

S8 = √(21/31) (15ρ² − 7) x

S16 = √(55/1966) (315ρ⁴ − 280x² − 324y² + 57) x

S17 = √(55/1966) (315ρ⁴ − 324x² − 280y² + 57) y

S23 = √(33/3923) (1575ρ⁴ − 1820ρ² + 471) xy

S26 = … + 707y⁴) + 6045ρ² − 245]

S28 = [21/(8√1349)] [3146x⁶ − 2250x⁴y² + 2250x²y⁴ − 3146y⁶ − 1770(x⁴ − y⁴) + 245(x² − y²)]

S29 = (−13.79189793 + 150.92209099x² + 117.01812058y² − 352.15154565x⁴ − 657.27245247x²y² − 291.12439892y⁴ + 222.62454035x⁶ + 667.87362106x⁴y² + 667.87362106x²y⁴ + 222.62454035y⁶) y

S30 = (−13.79189793 + 117.01812058x² + 150.92209099y² − 291.12439892x⁴ − 657.27245247x²y² − 352.15154565y⁴ + 222.62454035x⁶ + 667.87362106x⁴y² + 667.87362106x²y⁴ + 222.62454035y⁶) x

S31 = … + 513.91209661y⁴ + 509.87526260x⁶ + 494.87949207x⁴y² − 539.86680367x²y⁴ − 524.87103314y⁶) y

S32 = (−6.14762642 + 135.73181009x² − 89.43280522y² − 513.91209661x⁴ − 87.83479115x²y² + 496.10607212y⁴ + 524.87103314x⁶ + 539.86680367x⁴y² − 494.87949207x²y⁴ − 509.87526260y⁶) x

S33 = (−6.78771487 + 38.11697536x² + 124.84070714y² − 400.01976911x⁴ + 191.43062089x²y² − 609.73320550y⁴ + 695.06919087x⁶ − 246.30347616x⁴y² − 154.56957886x²y⁴ + 786.80308817y⁶) y

S34 = (−6.78771487 + 124.84070714x² + 38.11697536y² − 609.73320550x⁴ + 191.43062089x²y² − 400.01976911y⁴ + 786.80308817x⁶ − 154.56957886x⁴y² − 246.30347616x²y⁴ + 695.06919087y⁶) x

S35 = (3.69268433 + 25.20822264x² − 87.60705178y² − 200.98753298x⁴ − 63.30315999x²y² + 455.10450382y⁴ + 497.87935336x⁶ − 461.58554163x⁴y² + 470.02596297x²y⁴ − 660.45220344y⁶) y

S36 = (−3.69268433 + 87.60705178x² − 25.20822264y² − 455.10450382x⁴ + 63.30315999x²y² + 200.98753298y⁴ + 660.45220344x⁶ − 470.02596297x⁴y² + 461.58554163x²y⁴ − 497.87935336y⁶) x

S37 = 2.34475558 − 55.32128002ρ² + 283.78448194ρ⁴ − 532.71123567ρ⁶ + 332.94452229ρ⁸ + 8(12.75329096ρ² − 20.75498320ρ⁴) x² + 8(−12.75329096 + 20.75498320ρ²) x⁴

S38 = (−51.83202694 + 451.93890159x² − 1152.97464379x⁴ + 910.24313983x⁶) x² + (51.83202694 − 451.93890159y² − 1241.24064523x⁴ + 1241.24064523x²y² + 1152.97464379y⁴ + 1820.48627967x⁶ − 1820.48627967x²y⁴ − 910.24313983y⁶) y²

S39 = (−79.13579197 + 534.94142408x² + 534.94142408y² − 1059.59312899x⁴ − 2068.27487642x²y² − 1059.59312899y⁴ + 620.48246292x⁶ + 1861.44738877x⁴y² + 1861.44738877x²y⁴ + 620.48246292y⁶) xy

S40 = 1.21593465 + (−45.42224477 + 511.12794331x² − 1684.42901882x⁴ + 1646.92574009x⁶) x² + (−45.42224477 − 79.47423312x² + 511.12794331y² + 51.53230630x⁴ + 51.53230630x²y² − 1684.42901882y⁴ + 883.78996844x⁶ − 1526.27154329x⁴y² + 883.78996844x²y⁴ + 1646.92574009y⁶) y²

S41 = (409.79084415x² − 409.79084415y² − 1561.42985567x⁴ + 1561.42985567y⁴ + 1409.62417525x⁶ + 1409.62417525x⁴y² − 1409.62417525x²y⁴ − 1409.62417525y⁶) xy

S42 = (−40.45171657 + 494.75561036x² − 1889.40633090x⁴ + 2161.27742821x⁶) x² + (40.45171657 − 494.75561036y² + 522.76064491x⁴ − 522.76064491x²y² + 1889.40633090y⁴ − 766.71561254x⁶ + 766.71561254x²y⁴ − 2161.27742821y⁶) y²

S43 = (−18.24387372 + 220.95358178x² + 220.95358178y² − 1386.53440310x⁴ + 662.18504631x²y² − 1386.53440310y⁴ + 1938.02064313x⁶ − 595.96654168x⁴y² − 595.96654168x²y⁴ + 1938.02064313y⁶) xy

S44 = 0.58427150 + (−25.29433513 + 332.76847363x² − 1333.46332249x⁴ + 1635.98479424x⁶) x² + (−25.29433513 − 56.26575785x² + 332.76847363y² + 307.15569451x⁴ + 307.15569451x²y² − 1333.46332249y⁴ − 1160.73491284x⁶ + 1129.92710444x⁴y² − 1160.73491284x²y⁴ + 1635.98479424y⁶) y²

S45 = (124.34036571x² − 124.34036571y² − 779.19962514x⁴ + 779.19962514y⁴ + 1467.61104674x⁶ − 1353.92842666x⁴y² + 1353.92842666x²y⁴ − 1467.61104674y⁶) xy

The corresponding polynomials in polar and Cartesian coordinates are given in Tables 10-2 and 10-3, respectively. Of course, up to the fourth order, they can be obtained simply from the rectangular polynomials $R_k$ given in Tables 9-1 through 9-3 by letting $c = 1/\sqrt{2}$. The square polynomial $S_{11}$ representing the balanced primary spherical aberration is radially symmetric, but the polynomial $S_{22}$ representing balanced secondary spherical aberration is not, because it also contains a term in $Z_{14}$, which varies as $\cos 4\theta$. Similarly, the polynomial $S_{37}$ representing balanced tertiary spherical aberration is not radially symmetric, since it contains terms in $Z_{14}$ and $Z_{26}$, both varying as $\cos 4\theta$.

A square aberration function $W(x, y)$ across a unit square can be expanded in terms of $J$ square polynomials $S_j(x, y)$ in the form

$$W(x, y) = \sum_{j=1}^{J} a_j S_j(x, y) , \tag{10-17}$$

where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (10-17) by $S_j(x, y)$, integrating over the unit square, and using the orthonormality Eq. (10-15), we obtain the square expansion coefficients:

$$a_j = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}W(x, y)\,S_j(x, y)\,dx . \tag{10-18}$$

As stated in Section 3.2, it is evident from Eq. (10-18) that the value of a square

coefficient is independent of the number J of polynomials used in the expansion of the

aberration function. Hence, one or more terms can be added to or subtracted from the

aberration function without affecting the value of the coefficients of the other

polynomials in the expansion.

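The mean and mean square values of the aberration function are given by

$$\langle W(\rho, \theta)\rangle = a_1 \tag{10-19}$$

and

$$\langle W^2(\rho, \theta)\rangle = \sum_{j=1}^{J} a_j^2 , \tag{10-20}$$

respectively. Its variance is given by

$$\sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{10-21}$$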
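Equations (10-18) through (10-21) can be illustrated with a small numerical example. The sketch below assumes NumPy, a Gauss-Legendre grid of our own choosing, and a hypothetical aberration function made up of 0.3 wave of Seidel astigmatism and 0.1 wave of x-tilt; the polynomials $S_1$ through $S_6$ are taken from Table 10-3.

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(20)
X, Y = np.meshgrid(H * t, H * t)
WQ = np.outer(w, w) / 4.0                    # weights scaled so that mean(1) == 1

def mean(f):
    """Mean value over the unit square."""
    return np.sum(WQ * f)

RHO2 = X**2 + Y**2
SQUARE = [                                   # S1..S6 in Cartesian form
    np.ones_like(X),
    np.sqrt(6) * X,
    np.sqrt(6) * Y,
    np.sqrt(2.5) * (3 * RHO2 - 1),
    6 * X * Y,
    3 * np.sqrt(2.5) * (X**2 - Y**2),
]

# Hypothetical aberration function (in waves): A_a rho^2 cos^2(theta) plus x-tilt.
W = 0.3 * X**2 + 0.1 * X

a = np.array([mean(W * s) for s in SQUARE])  # Eq. (10-18)
var_from_coeffs = np.sum(a[1:] ** 2)         # Eq. (10-21)
var_direct = mean(W**2) - mean(W) ** 2
```

Here `a[0]` equals the mean value of the aberration function, Eq. (10-19), and the two variance estimates agree because this particular function lies entirely within the span of the first six polynomials.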

10.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF SQUARE POLYNOMIAL ABERRATIONS

The square polynomials are illustrated in three different but equivalent ways in

Figure 10-6. For each polynomial, the isometric plot at the top illustrates its shape. An

interferogram is shown on the left, and a corresponding PSF is shown on the right for a

sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength)

are given in Table 10-4.

The PSF plots, representing the images of a point object in the presence of a polynomial aberration and obtained by applying Eq. (10-7), are shown in Figure 10-6. The full width of a square displaying the PSFs is $24\lambda F$. Since the piston aberration $S_1$ has no effect on the PSF, it yields an aberration-free PSF.

The polynomial aberrations $S_2$ and $S_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $\sqrt{3/2}\,\lambda a_2/a$ about the y axis and displaces the PSF along the x axis by $\sqrt{6}\,a_2\lambda F$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $\sqrt{3/2}\,\lambda a_3/a$ about the x axis and displaces the PSF by $\sqrt{6}\,a_3\lambda F$.

The defocus polynomial aberration $S_4$ yields a radially symmetric interferogram bounded, of course, by a square. However, the PSF is biaxially symmetric. The polynomial aberrations $S_5$ and $S_6$, representing balanced astigmatism, yield biaxially symmetric interferograms and PSFs, but distinctly different from each other. The polynomial aberrations $S_7$ and $S_8$, representing balanced comas, produce biaxially symmetric interferograms, but the PSFs are symmetric only about the y and x axes, respectively. The polynomial aberration $S_{11}$, representing the balanced primary spherical aberration, yields a radially symmetric PSF. However, the polynomial aberrations $S_{22}$ and $S_{37}$, representing the balanced secondary and tertiary spherical aberrations, are not radially symmetric because of the presence of a $\cos 4\theta$ term. Accordingly, neither the interferograms nor the PSFs for these aberrations are radially symmetric.

The Strehl ratio, namely the central value of a PSF relative to its aberration-free value, can be obtained from Eq. (10-7) by letting $x = 0 = y$, i.e., from

$$I(0, 0) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi(x', y')\right]dx'\,dy'\right|^2 . \tag{10-22}$$

Its value for a square polynomial aberration with a sigma value of 0.1 wave is listed in Table 10-5 and plotted in Figure 10-7. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial. Both the table and the figure illustrate that the Strehl ratio for a small aberration is independent of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
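The near independence of the Strehl ratio from the type of aberration can be checked by evaluating Eq. (10-22) numerically for a few polynomials. This is a sketch under our own assumptions: NumPy, a 64-point Gauss-Legendre grid, and only $S_4$, $S_6$, and $S_8$ (taken from Table 10-3) as test cases; the function name `strehl` is hypothetical.

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(64)
X, Y = np.meshgrid(H * t, H * t)
WQ = np.outer(w, w) / 4.0                    # weights scaled so that mean(1) == 1
RHO2 = X**2 + Y**2

POLYS = {
    "S4": np.sqrt(2.5) * (3 * RHO2 - 1),
    "S6": 3 * np.sqrt(2.5) * (X**2 - Y**2),
    "S8": np.sqrt(21 / 31) * (15 * RHO2 - 7) * X,
}

def strehl(poly, sigma_waves):
    """Eq. (10-22): squared modulus of the pupil average of exp(i Phi)."""
    phase = 2 * np.pi * sigma_waves * poly   # an orthonormal polynomial has unit sigma
    return np.abs(np.sum(WQ * np.exp(1j * phase))) ** 2

SIGMA = 0.1                                  # sigma value in waves
approx = np.exp(-(2 * np.pi * SIGMA) ** 2)   # exp(-sigma_Phi^2)
exact = {name: strehl(p, SIGMA) for name, p in POLYS.items()}

if __name__ == "__main__":
    print("approximate:", approx)
    for name, value in exact.items():
        print(name, value)
```

For this sigma value the computed Strehl ratios all lie close to the approximate value of about 0.67.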

Figure 10-6. Square polynomials shown as an isometric plot on the top, an interferogram on the left, and a PSF on the right for a sigma value of one wave.

Table 10-4. Peak-to-valley numbers of the orthonormal square polynomials for a sigma value of unity.

Table 10-5. Strehl ratio S for square polynomial aberrations for a sigma value of 0.1 wave.

Figure 10-7. Strehl ratio S for square polynomial aberrations with a sigma value of 0.1 wave.

10.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO

We now consider balancing of a Seidel aberration and obtain its standard deviation

with and without balancing. We also show how the Strehl ratio varies as a function of the

standard deviation and compare it with the approximate exponential expression for it.

10.7.1 Defocus

We start with the defocus aberration

$$W_d(\rho) = A_d\rho^2 . \tag{10-23}$$

From the form of the defocus orthonormal polynomial $S_4$ given in Table 10-2, it is evident that its sigma value across a square pupil is given by

$$\sigma_d = \frac{1}{3}\sqrt{\frac{2}{5}}\,A_d = \frac{A_d}{4.743} . \tag{10-24}$$

10.7.2 Astigmatism

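Next, consider $0^\circ$ Seidel astigmatism given by

$$W_a(\rho, \theta) = A_a\rho^2\cos^2\theta . \tag{10-25}$$

The orthonormal polynomial representing balanced astigmatism is given by

$$S_6 = 3\sqrt{\frac{5}{2}}\,\rho^2\cos 2\theta \tag{10-26a}$$

$$\phantom{S_6} = 3\sqrt{10}\left(\rho^2\cos^2\theta - \frac{1}{2}\rho^2\right) , \tag{10-26b}$$

showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism $\rho^2\cos^2\theta$ is $-1/2$, as in the case of a circular, annular, or a Gaussian pupil. Thus, the balanced astigmatism is given by

$$W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \frac{1}{2}\rho^2\right) . \tag{10-27}$$

Its sigma value is given by

$$\sigma_{ba} = \frac{A_a}{3\sqrt{10}} = \frac{A_a}{9.487} . \tag{10-28}$$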

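To obtain the sigma value of astigmatism, we write Eq. (10-25) in the form

$$W_a(\rho, \theta) = \frac{A_a}{3\sqrt{10}}\left(S_6 + S_4\right) + \text{constant} . \tag{10-29}$$

Since $S_4$ and $S_6$ are orthonormal, its sigma value is given by

$$\sigma_a = \frac{A_a}{3\sqrt{5}} = \frac{A_a}{6.708} . \tag{10-30}$$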

10.7.3 Coma

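Now, we consider Seidel coma:

$$W_c(\rho, \theta) = A_c\rho^3\cos\theta . \tag{10-31}$$

The orthonormal polynomial representing balanced coma is given by

$$S_8 = \sqrt{\frac{21}{31}}\left(15\rho^3\cos\theta - 7\rho\cos\theta\right) . \tag{10-32}$$

It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma $\rho^3\cos\theta$ is $-7/15$ compared to $-2/3$ for a circular pupil. The balanced coma is given by

$$W_{bc}(\rho, \theta) = A_c\left(\rho^3\cos\theta - \frac{7}{15}\rho\cos\theta\right) . \tag{10-33}$$

Its sigma value is given by

$$\sigma_{bc} = \frac{1}{15}\sqrt{\frac{31}{21}}\,A_c = \frac{A_c}{12.346} . \tag{10-34}$$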

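To obtain the sigma value of Seidel coma, we write Eq. (10-31) in the form

$$W_c(\rho, \theta) = \frac{A_c}{15}\left(\sqrt{\frac{31}{21}}\,S_8 + \frac{7}{\sqrt{6}}\,S_2\right) . \tag{10-35}$$

Since $S_2$ and $S_8$ are orthonormal, its sigma value is given by

$$\sigma_c = \sqrt{\frac{3}{70}}\,A_c = \frac{A_c}{4.831} . \tag{10-36}$$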

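10.7.4 Spherical Aberration

Finally, we consider Seidel spherical aberration:

$$W_s(\rho) = A_s\rho^4 . \tag{10-37}$$

The orthonormal polynomial representing balanced spherical aberration is given by

$$S_{11} = \frac{1}{2\sqrt{67}}\left(315\rho^4 - 240\rho^2 + 31\right) . \tag{10-38}$$

Thus, the balanced spherical aberration is given by

$$W_{bs}(\rho) = A_s\left(\rho^4 - \frac{16}{21}\rho^2\right) . \tag{10-39}$$

It shows that spherical aberration is balanced by a relative defocus of $-16/21$. Its sigma value is given by

$$\sigma_{bs} = \frac{2}{315}\sqrt{67}\,A_s = \frac{A_s}{19.242} . \tag{10-40}$$

To obtain the sigma value of Seidel spherical aberration, we write Eq. (10-37) in the form

$$W_s(\rho) = \frac{2A_s}{315}\left(\sqrt{67}\,S_{11} + 8\sqrt{10}\,S_4\right) + \text{constant} . \tag{10-41}$$

Since $S_4$ and $S_{11}$ are orthonormal, its sigma value is given by

$$\sigma_s = \frac{2}{45}\sqrt{\frac{101}{7}}\,A_s = \frac{A_s}{5.923} . \tag{10-42}$$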

The sigma values of Seidel aberrations with and without balancing are given in Table 10-6.
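The sigma values quoted in Eqs. (10-24) through (10-42) can be reproduced by direct integration over the unit square. The following sketch assumes NumPy and a Gauss-Legendre grid of our own choosing, and sets each aberration coefficient $A_i$ to unity, so that the reciprocal of the computed sigma should match the denominator quoted in the text.

```python
import numpy as np

H = 1.0 / np.sqrt(2.0)                       # half-width of the unit square
t, w = np.polynomial.legendre.leggauss(20)
X, Y = np.meshgrid(H * t, H * t)
WQ = np.outer(w, w) / 4.0                    # weights scaled so that mean(1) == 1
RHO2 = X**2 + Y**2

def sigma(W):
    """Standard deviation of an aberration function over the unit square."""
    m = np.sum(WQ * W)
    return np.sqrt(np.sum(WQ * W**2) - m**2)

ABERRATIONS = {                              # unit coefficient, A_i = 1
    "defocus": RHO2,
    "astigmatism": X**2,                     # rho^2 cos^2(theta) = x^2
    "balanced astigmatism": X**2 - RHO2 / 2,
    "coma": RHO2 * X,                        # rho^3 cos(theta) = rho^2 x
    "balanced coma": RHO2 * X - (7 / 15) * X,
    "spherical": RHO2**2,
    "balanced spherical": RHO2**2 - (16 / 21) * RHO2,
}

INV_SIGMA = {name: 1.0 / sigma(W) for name, W in ABERRATIONS.items()}

if __name__ == "__main__":
    for name, value in INV_SIGMA.items():
        print(f"{name}: sigma = A / {value:.3f}")
```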

Table 10-6. Sigma value of a Seidel aberration with and without balancing, and P-V

numbers for a sigma value of unity, where Ai is the aberration coefficient.


In Figure 10-7, we have shown the Strehl ratio for the square polynomial aberrations with a sigma value of 0.1 wave. In Figure 10-8, we show how it varies with the sigma value of a Seidel aberration, with and without balancing, for $0 \le \sigma_W \le 0.25$. Also plotted is the Strehl ratio obtained from the approximate expression $\exp(-\sigma_\Phi^2)$ as the dashed curve. We note that this expression underestimates the Strehl ratio for defocus and Seidel astigmatism, but overestimates it for Seidel coma and Seidel spherical aberration. The agreement between the actual and the approximate values is quite good for the balanced aberrations, except that the approximate expression overestimates the Strehl ratio in the case of spherical aberration for $\sigma_W > 0.15$. The aberration coefficient or the P-V aberration for a certain value of $\sigma_W$ can be obtained from Tables 10-4 and 10-6 for the aberrations considered here.

Figure 10-8. Strehl ratio as a function of the sigma value of a Seidel aberration with

and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical

aberration.


10.8 SUMMARY

The aberration-free PSF and OTF of a square pupil are discussed in Section 10.3.

The polynomials orthonormal over a unit square pupil, representing balanced aberrations

over such a pupil, are given through the eighth order in Tables 10-1 through 10-3 in

terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates,

respectively. Each orthonormal polynomial consists of either the cosine or the sine terms,

but not both. Thus, an even j polynomial, for example, consists of only the cosine terms,

as may be seen from Table 10-1 or 10-2. This is a consequence of the four-fold symmetry

of the pupil. Since the polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point, the polynomial numbering with two indices n and m loses significance, and the polynomials must be numbered with a single index j. They are ordered in the same manner as the

polynomials discussed in previous chapters.

The form of the polynomial $S_6$ representing balanced astigmatism is the same as that for a circular pupil. Similarly, as indicated by the polynomial $S_{11}$, spherical aberration $\rho^4$ is balanced only by defocus $\rho^2$, compared to $R_{11}$ for a rectangular pupil, which consists of a term in astigmatism $\rho^2\cos^2\theta$ as well.

The first 45 square polynomials, i.e., up to and including the eighth order, are illustrated by an isometric plot, an interferogram, and a PSF in Figure 10-6. The coefficient of each orthonormal polynomial, or the sigma value of the corresponding aberration, is one wave. Their peak-to-valley numbers for a sigma value of one wave are given in Table 10-4 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration is given in Table 10-5 and illustrated in Figure 10-7. It shows that, for a small aberration, the Strehl ratio can be estimated from the aberration variance. The sigma values of the Seidel aberrations and their balanced forms are given in Table 10-6.

References

1. … analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc. Am. A 29, 1673–1674 (2012).

2. … Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–11.41 (McGraw-Hill, 2009).

3. M. Bray, “Orthogonal polynomials: A set for square areas,” Proc. SPIE 5252, 314–320 (2004).

4. … aberration function,” Appl. Opt. 31, 2223–2228 (1992).


Chapter 11

Systems with Slit Pupils

11.1 INTRODUCTION

A slit pupil is a limiting case of a rectangular pupil whose one dimension is

negligibly small. It is used in spectrographs. The power series aberrations of a

rotationally symmetric imaging system with a slit pupil are the 1D analog of the

corresponding aberration terms discussed in Chapter 1. In this chapter, we discuss the

PSF of a slit pupil and the incoherent image of a slit parallel to the slit pupil. The Strehl ratio for a slit pupil and its balanced aberrations are also discussed. It is shown that the

balanced aberrations are represented by the Legendre polynomials [1,2]. We show further

that the slit pupil is more sensitive to a primary aberration with or without balancing,

except for spherical aberration, for which it is slightly less sensitive.

11.2.1 PSF

As illustrated in Figure 11-1, consider a slit pupil, i.e., a rectangular pupil of half-widths $a$ and $b$, where $b \ll a$. Thus, the aspect ratio $b/a$ of the pupil is negligibly small. Its PSF can be obtained from that of a rectangular pupil by letting the aspect ratio be practically zero in Eq. (9-8) for the PSF of a rectangular pupil. The PSF of a slit pupil may then be written

$$I(x) = \frac{1}{4}\left|\int_{-1}^{1}\exp\left[i\Phi(x')\right]\exp\left(-\pi i x'x\right)dx'\right|^2 , \tag{11-1}$$

… distance $R$ from the focusing lens. The irradiance distribution is normalized by its central value $P_{ex}S_{ex}/\lambda^2R^2$, where $P_{ex}$ is the total power in the pattern, and $S_{ex} = 4ab$ is the pupil area. For the aberration-free case, we obtain

$$I(x) = \left(\frac{\sin \pi x}{\pi x}\right)^2 . \tag{11-2}$$

Figure 11-1. A slit pupil of half-width $a$ along the x axis, where $b \ll a$.

Figure 11-2. PSF of a slit pupil. (a) Irradiance distribution. (b) 1D PSF.

The PSF is shown in Figure 11-2. Its value is zero wherever x is a positive or a negative

integer.
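The aberration-free result can be checked numerically. The following is a minimal sketch, assuming a simple trapezoidal quadrature of Eq. (11-1); the function name `slit_psf`, its `phase` argument, and the sample count are illustrative choices, not part of the text.

```python
import numpy as np

def slit_psf(x, phase=lambda xp: np.zeros_like(xp), n=4001):
    """Evaluate Eq. (11-1) at image coordinate x by trapezoidal quadrature."""
    xp = np.linspace(-1.0, 1.0, n)                 # pupil coordinate x'
    integrand = np.exp(1j * phase(xp)) * np.exp(-1j * np.pi * xp * x)
    dx = xp[1] - xp[0]
    # Trapezoidal rule: full weight inside, half weight at the two end points
    integral = dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return 0.25 * abs(integral) ** 2

x_values = np.array([0.0, 0.5, 1.0, 2.0, 3.0])
numeric = np.array([slit_psf(x) for x in x_values])
analytic = np.sinc(x_values) ** 2                  # np.sinc(x) = sin(pi x)/(pi x)
```

Passing a nonzero `phase` function (in radians) gives the aberrated PSF under the same normalization.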

If the point source is replaced by an incoherently illuminated slit object parallel to

the slit pupil, then each point on the source forms a PSF, and the net result for an

incoherent illumination is the sum of their irradiance images. The incoherent image of the

slit object thus obtained is shown in Figure 11-3.

Figure 11-3. Image of an incoherent slit object formed by a system with a slit pupil.

11.3 STREHL RATIO AND ABERRATION BALANCING

11.3.1 Strehl Ratio

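From Eq. (11-1), the Strehl ratio, representing the ratio of the central values of the PSF with and without an aberration, can be written

\[ S \equiv I(0) = \frac{1}{4}\left| \int_{-1}^{1} \exp[i\Phi(x')]\, dx' \right|^2 . \tag{11-3} \]

Since a constant phase does not change the modulus, the mean phase $\langle\Phi\rangle$ can be subtracted, and for a small aberration the exponential can be expanded in a power series:

\[ \begin{aligned}
S &= \frac{1}{4}\left| \int_{-1}^{1} \exp\{ i[\Phi(x) - \langle\Phi\rangle] \}\, dx \right|^2 \\
  &= \left| \langle \exp\{ i[\Phi(x) - \langle\Phi\rangle] \} \rangle \right|^2 \\
  &= \left| \left\langle 1 + i[\Phi(x) - \langle\Phi\rangle] - \tfrac{1}{2}[\Phi(x) - \langle\Phi\rangle]^2 + \dots \right\rangle \right|^2 \\
  &\sim 1 - \left( \langle\Phi^2\rangle - \langle\Phi\rangle^2 \right) \\
  &\equiv 1 - \sigma_\Phi^2 ,
\end{aligned} \tag{11-4} \]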

where the angular brackets indicate a mean value across the pupil, $\langle\Phi\rangle$ is the mean value of the aberration function, $\langle\Phi^2\rangle$ is its mean square value, $\sigma_\Phi^2$ is its variance, and we have neglected the higher-order terms in the power-series expansion of the exponent. The mean value of a function g(x) is given by

\[ \langle g(x) \rangle = \frac{\int_{-1}^{1} g(x)\, dx}{\int_{-1}^{1} dx} = \frac{1}{2}\int_{-1}^{1} g(x)\, dx . \tag{11-5} \]
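To see how good the small-aberration approximation of Eq. (11-4) is, the sketch below compares it with the exact Strehl ratio of Eq. (11-3). It assumes Gauss-Legendre quadrature for the pupil mean of Eq. (11-5); the test aberration (0.3 rad of $x^3$) and the function names are hypothetical.

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1]; the weights sum to 2
nodes, weights = np.polynomial.legendre.leggauss(64)

def pupil_mean(values):
    """Mean across the unit slit pupil, Eq. (11-5)."""
    return 0.5 * np.sum(weights * values)

def strehl_exact(phi):
    """Eq. (11-3): squared modulus of the mean of exp(i*Phi)."""
    return abs(pupil_mean(np.exp(1j * phi(nodes)))) ** 2

def strehl_approx(phi):
    """Eq. (11-4): one minus the variance of the phase aberration."""
    p = phi(nodes)
    return 1.0 - (pupil_mean(p**2) - pupil_mean(p) ** 2)

phi = lambda x: 0.3 * x**3        # hypothetical small coma, in radians
exact = strehl_exact(phi)
approx = strehl_approx(phi)
```

For this aberration the variance is $0.09/7$, and the two values differ only in the fourth decimal place.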

11.3.2 Aberration Balancing

A unit slit pupil along the x axis is illustrated in Figure 11-4. Consider an aberration such as primary x-coma:

\[ W_{cx}(x) = x^3 . \tag{11-6} \]

Its variance is given by

\[ \sigma_{cx}^2 = \langle [W_{cx}(x)]^2 \rangle - \langle W_{cx}(x) \rangle^2 . \tag{11-7} \]

Figure 11-4. Unit slit pupil along the x axis inscribed inside a unit circle.

The variance can be reduced by mixing it with a certain amount b of x-tilt. Thus, the balanced aberration may be written in the form

\[ W_{bcx}(x) = x^3 + bx . \tag{11-8} \]

Its variance is given by

\[ \sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3} . \tag{11-9} \]

The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of 1/7 without any tilt. Thus, the variance is reduced by a factor of 25/4, or the standard deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding balanced aberration is given by

\[ W_{bcx}(x) = x^3 - (3/5)x . \tag{11-10} \]

A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for

a given Strehl ratio.

Similarly, the variance of primary x-spherical aberration $x^4$ can be reduced by combining it with x-defocus. Thus, consider the balanced aberration

\[ W_{bsx}(x) = x^4 + bx^2 . \tag{11-11} \]

Its variance is given by

\[ \sigma_{bsx}^2 = \frac{16}{225} + \frac{16b}{105} + \frac{4b^2}{45} . \tag{11-12} \]

Its sigma value is minimum and equal to 8/105 for $b = -6/7$ compared to a value of 4/15 with no defocus. The balanced aberration is given by

\[ W_{bsx}(x) = x^4 - (6/7)x^2 . \tag{11-13} \]

It should be evident that there is no distinction between defocus and astigmatism, since they both vary as $x^2$.

The process of minimizing the variance in this manner is called aberration balancing. The variance of the higher-order classical aberrations, e.g., secondary coma $x^5$, secondary spherical aberration $x^6$, tertiary coma $x^7$, and tertiary spherical aberration $x^8$, can also be minimized by combining them with lower-degree aberrations.
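The balancing of coma with tilt and of spherical aberration with defocus can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming only the pupil mean of Eq. (11-5); the helper names `moment`, `variance`, and `best_balance` are illustrative.

```python
from fractions import Fraction

def moment(k):
    """Mean of x**k over the unit slit pupil -1 <= x <= 1, Eq. (11-5)."""
    return Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)

def variance(coeffs):
    """Variance of sum(c * x**k), with coeffs given as {power: coefficient}."""
    mean = sum(c * moment(k) for k, c in coeffs.items())
    mean_sq = sum(c1 * c2 * moment(k1 + k2)
                  for k1, c1 in coeffs.items()
                  for k2, c2 in coeffs.items())
    return mean_sq - mean**2

def best_balance(high, low):
    """Amount b of x**low that minimizes the variance of x**high + b*x**low."""
    covariance = moment(high + low) - moment(high) * moment(low)
    var_low = moment(2 * low) - moment(low) ** 2
    return -covariance / var_low

b_coma = best_balance(3, 1)                       # tilt balancing coma
var_coma = variance({3: Fraction(1), 1: b_coma})
b_sph = best_balance(4, 2)                        # defocus balancing spherical
var_sph = variance({4: Fraction(1), 2: b_sph})
```

The same two helpers can be applied to higher powers, although balancing those against a single lower-degree term is generally not optimal; the full answer is the Legendre polynomial of that degree.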

By letting $c \to 1$ in the rectangular pupil discussed in Chapter 9, we obtain a unit slit pupil inscribed inside a unit circle that is parallel to the x axis, as illustrated in Figure 11-4. The corresponding orthonormal polynomials representing balanced aberrations for such pupils can be obtained from the rectangular polynomials $R_j(x, y)$ given in Table 9-3 by letting $y \to 0$ and $c \to 1$. Half of the rectangular polynomials thus reduce to zero.

Some of the other polynomials are redundant. For example, the 1D defocus and

astigmatism cannot be distinguished from each other. The slit polynomials are the

Legendre polynomials. Since the pupil is 1D along the x axis, the aberrations vary with x

only.

The Legendre polynomials $P_n(x)$ are orthogonal over the interval $[-1, 1]$, according to [3]

\[ \frac{1}{2}\int_{-1}^{1} P_n(x) P_{n'}(x)\, dx = \frac{1}{2n+1}\, \delta_{nn'} , \tag{11-14} \]

where n is a positive integer (including zero). A polynomial with an even (odd) value of n consists of terms with even (odd) powers of x. Thus, a polynomial is symmetric for an even n and antisymmetric for an odd n, according to

\[ P_n(-x) = (-1)^n P_n(x) . \tag{11-15} \]

Moreover,

\[ P_n(1) = 1 , \tag{11-16} \]

\[ P_n(-1) = \begin{cases} 1 & \text{for even } n \\ -1 & \text{for odd } n , \end{cases} \tag{11-17} \]

Starting with $P_0(x) = 1$ and $P_1(x) = x$, the polynomials can be obtained recursively from the relation

\[ (n+1) P_{n+1}(x) = (2n+1)\, x P_n(x) - n P_{n-1}(x) . \tag{11-19} \]

It is evident from Eq. (11-19) that $P_n(x)$ is a polynomial of degree n in x, i.e., the highest power of x in a polynomial $P_n(x)$ is n. It is perhaps worth noting that a Zernike radial polynomial $R_{2n}^0(\rho)$ is the same as a shifted Legendre polynomial $\tilde{P}_n(\rho^2) = P_n(2\rho^2 - 1)$, both of which are orthogonal over the interval [0, 1] [see Eq. (4-41)].

For clarity, the even polynomials are plotted in Figure 11-5a and the odd in Figure 11-5b. It is evident, as expressed by Eqs. (11-15)–(11-18), that an odd polynomial starts at $-1$ for $x = -1$ and ends with 1 for $x = 1$. However, the even polynomials start and end at unity. The number of peaks and valleys in a polynomial $P_n(x)$ is $n - 1$.

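In orthonormal form, the polynomials are written

\[ L_n(x) = \sqrt{2n+1}\, P_n(x) , \tag{11-20} \]

so that

\[ \frac{1}{2}\int_{-1}^{1} L_n(x) L_{n'}(x)\, dx = \delta_{nn'} . \tag{11-21} \]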

The first few $L_n(x)$ polynomials are listed in Table 11-1. The standard deviation of each polynomial is unity. The mean value of each polynomial [other than $P_0(x)$] is zero, as may be seen by letting $n' = 0$ in Eq. (11-21). It is easy to see this explicitly for a polynomial with an odd value of n, since the integral of an odd function over symmetric limits is zero. For an even value of n, the piston term in the polynomial makes its mean value zero. For example, the balanced x-spherical aberration is $x^4 - (6/7)x^2$ with a mean value of $-3/35$. The piston term of 3(3/8) in $L_4(x)$ makes its mean value zero. The slit pupil is more sensitive to a Seidel aberration with or without balancing compared to a circular pupil, except for spherical aberration for which it is slightly less sensitive.
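The recursion and the orthonormality of the $L_n(x)$ can be verified directly. The sketch below assumes the standard three-term (Bonnet) recurrence for Legendre polynomials and uses NumPy's `Polynomial` class for exact polynomial integration; the function names are illustrative.

```python
import math
import numpy as np
from numpy.polynomial import Polynomial

def legendre(n_max):
    """P_0..P_n_max from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    x = Polynomial([0.0, 1.0])
    polys = [Polynomial([1.0]), x]
    for n in range(1, n_max):
        polys.append(((2 * n + 1) * x * polys[n] - n * polys[n - 1]) / (n + 1))
    return polys[: n_max + 1]

def pupil_mean(poly):
    """Mean over -1 <= x <= 1, Eq. (11-5), from the antiderivative."""
    anti = poly.integ()
    return 0.5 * (anti(1.0) - anti(-1.0))

P = legendre(8)
L = [math.sqrt(2 * n + 1) * p for n, p in enumerate(P)]   # Eq. (11-20)

# Gram matrix of pupil means <L_n L_n'>; Eq. (11-21) says it is the identity
gram = np.array([[pupil_mean(a * b) for b in L] for a in L])

# Mean of the balanced spherical aberration x^4 - (6/7) x^2
mean_bsx = pupil_mean(Polynomial([0, 0, -6 / 7, 0, 1]))
```

`L[4].coef` holds the coefficients of $L_4(x)$ in ascending powers of x, which can be compared with the entry for n = 4 in Table 11-1.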

11.5 STANDARD DEVIATION OF A PRIMARY ABERRATION

The standard deviation of a 1D primary aberration for a slit pupil can be obtained from the orthonormal polynomials by writing it as a sum of these polynomials. Of course, the standard deviations were already obtained in Section 11.3, and they are listed in Table 11-2. Comparing them with the sigma value of a corresponding 2D aberration for a circular pupil (see Tables 4-1 and 4-2), we find that a slit pupil is more sensitive to a primary aberration with or without balancing, except for spherical aberration, for which it is slightly less sensitive.
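As a worked illustration of this procedure (a sketch using only the polynomials $L_1(x) = \sqrt{3}\,x$ and $L_3(x) = (\sqrt{7}/2)(5x^3 - 3x)$ of Table 11-1), primary x-coma with a coefficient $A_c$ can be written as

\[ A_c x^3 = A_c \left[ \frac{2}{5\sqrt{7}}\, L_3(x) + \frac{\sqrt{3}}{5}\, L_1(x) \right] . \]

Since the polynomials are orthonormal, the variance is the sum of the squares of the coefficients:

\[ \sigma^2 = A_c^2 \left( \frac{4}{175} + \frac{3}{25} \right) = \frac{A_c^2}{7} , \]

so that $\sigma = A_c/\sqrt{7}$, in agreement with Table 11-2. Dropping the $L_1$ (tilt) term leaves the balanced coma, with a variance of $4A_c^2/175$, in agreement with Section 11.3.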

Figure 11-5. Legendre polynomials $P_n(x)$ as a function of x. (a) Even n and (b) odd n.

Table 11-1. Polynomials $L_n(x)$ orthonormal over the interval $-1 \le x \le 1$.

n   Aberration                       $L_n(x)$
0   Piston                           $1$
1   Tilt                             $\sqrt{3}\,x$
2   Defocus                          $(\sqrt{5}/2)(3x^2 - 1)$
3   Primary coma                     $(\sqrt{7}/2)(5x^3 - 3x)$
4   Primary spherical aberration     $(3/8)(35x^4 - 30x^2 + 3)$
5   Secondary coma                   $(\sqrt{11}/8)(63x^5 - 70x^3 + 15x)$
6   Secondary spherical aberration   $(\sqrt{13}/16)(231x^6 - 315x^4 + 105x^2 - 5)$
7   Tertiary coma                    $(\sqrt{15}/16)(429x^7 - 693x^5 + 315x^3 - 35x)$
8   Tertiary spherical aberration    $(\sqrt{17}/128)(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$

Table 11-2. Standard deviation $\sigma$ of a primary aberration for a slit pupil, where $A_i$ is its aberration coefficient.

Aberration   $\sigma$
Tilt         $A_t/\sqrt{3} = A_t/1.732$
Coma         $A_c/\sqrt{7} = A_c/2.646$

11.6 SUMMARY

A slit pupil is a limiting case of a rectangular pupil whose one dimension is negligibly small, as illustrated in Figure 11-1. Its PSF is shown in Figure 11-2. The image of an incoherent slit object parallel to the slit pupil is shown in Figure 11-3. The balanced aberrations for a slit pupil are the Legendre polynomials. We have written them in an orthonormal form, as in Eq. (11-20). They are listed in Table 11-1 up to the eighth order and plotted in Figure 11-5. The sigma value of a 1D primary aberration with and without balancing is listed in Table 11-2. It is shown that a slit pupil is more sensitive to a primary aberration with or without balancing, except for spherical aberration, for which it is slightly less sensitive.


References

analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007).

aperture," J. Opt. Soc. Am. 55, 878–881 (1965). There is an error in their

polynomial S2 , which should read as x 2 - 1 3.

(McGraw-Hill, New York, 1968).


Chapter 12

Use of Zernike Circle Polynomials for

Noncircular Pupils

12.1 INTRODUCTION

The orthonormal polynomials for various pupils discussed in the preceding chapters

represent balanced aberrations for those pupils, just as the Zernike circle polynomials

(discussed in Chapter 4) do for a circular pupil. In this chapter, we consider the use of

circle polynomials for the analysis of a noncircular wavefront. Since the circle

polynomials form a complete set, any wavefront, regardless of the shape of the pupil

(which defines the perimeter of the wavefront), can be expanded in terms of them.

Moreover, since each orthonormal polynomial is a linear combination of the circle

polynomials [see Eq. (3-18)], the wavefront fitting with the former set of polynomials is

as good as that with the latter. However, we illustrate the pitfalls of using circle

polynomials for a noncircular pupil by considering an annular and a hexagonal pupil

[1,2].

It is shown that, unlike the orthonormal coefficients, the circle coefficients generally

change as the number of polynomials used in the expansion changes. Although the

wavefront fit with a certain number of circle polynomials is the same as that with the

corresponding orthonormal polynomials, the piston circle coefficient does not represent

the mean value of the aberration function, and the sum of the squares of the other

coefficients does not yield its variance. While the interferometer setting errors of tip, tilt, and defocus from a 4-circle-polynomial expansion are the same as those from the orthonormal polynomial expansion, the errors obtained from, say, an 11-circle-polynomial expansion are not; removing them from the aberration function and zeroing out the residual aberration function by polishing yields an incorrectly polished surface. If the common practice of defining the center of an interferogram, drawing a circle around it, and determining the circle coefficients in the same manner as for a circular interferogram is followed, then the circle coefficients of a noncircular interferogram do not yield a correct representation of the aberration function. Moreover, in this case, some of the higher-order coefficients of

aberrations that are nonexistent in the aberration function are also nonzero. Finally, the

circle coefficients, however obtained, do not represent coefficients of the balanced

aberrations for a noncircular pupil. Such results are illustrated analytically and

numerically by considering annular and hexagonal Seidel aberration functions as

examples.

12.2 RELATIONSHIP BETWEEN THE ORTHONORMAL AND THE CORRESPONDING ZERNIKE CIRCLE COEFFICIENTS

An aberration function $W(x, y)$ across a noncircular pupil can be estimated in terms of J orthonormal polynomials $F_j(x, y)$ in the form

\[ \hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) , \tag{12-1} \]

where $\hat{W}(x, y)$ is the best-fit estimate of the function with J polynomials, and $a_j$ is the coefficient of the polynomial $F_j(x, y)$. The orthonormality of the polynomials across the noncircular pupil is described by

\[ \frac{1}{A}\int_{\text{pupil}} F_j(x, y) F_{j'}(x, y)\, dx\, dy = \delta_{jj'} , \tag{12-2} \]

where A is the area of the pupil. The orthonormal expansion coefficients are given by

\[ a_j = \frac{1}{A}\int_{\text{pupil}} W(x, y) F_j(x, y)\, dx\, dy . \tag{12-3} \]

It is evident that their value does not depend on the number of polynomials J used in the

expansion.

Letting $F_1(x, y) = 1$, it is easy to see from Eq. (12-2) that the mean value of a polynomial $F_{j \ne 1}(x, y)$ across the pupil is zero. Hence, the mean and the mean square values of the estimated aberration function are given by

\[ \langle \hat{W} \rangle = a_1 \tag{12-4} \]

and

\[ \langle \hat{W}^2(x, y) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{12-5} \]

respectively. Its variance is given by

\[ \sigma_{\hat{W}}^2 = \langle \hat{W}^2(x, y) \rangle - \langle \hat{W}(x, y) \rangle^2 = \sum_{j=2}^{J} a_j^2 , \tag{12-6} \]

where $\sigma_{\hat{W}}$ is its standard deviation. The number of polynomials J used in the expansion to estimate the aberration function is increased until $\sigma_{\hat{W}}$ approaches the true value as determined from the ray-trace or interferometric data within a certain prespecified tolerance.

Each orthonormal polynomial can be written in terms of the circle polynomials $Z_i(x, y)$ as a linear sum in the form [see Eq. (3-18)]

\[ F_j(x, y) = \sum_{i=1}^{J} M_{ji} Z_i(x, y) , \tag{12-7} \]

or

\[ \{F_j\} = M \{Z_j\} , \tag{12-8} \]

where $M_{ji}$ are the elements of the lower triangular conversion matrix M. The estimated aberration function can accordingly be expanded in terms of the circle polynomials in the form

\[ \hat{W}(x, y) = \sum_{j=1}^{J} \hat{b}_j Z_j(x, y) , \tag{12-9} \]

where $\hat{b}_j$ are the circle coefficients. The circle polynomials are orthonormal over a unit circle in Cartesian coordinates according to

\[ \frac{1}{\pi}\int_{x^2 + y^2 \le 1} Z_j(x, y) Z_{j'}(x, y)\, dx\, dy = \delta_{jj'} , \tag{12-10a} \]

and in polar coordinates according to

\[ \frac{1}{\pi}\int_0^1 \int_0^{2\pi} Z_j(\rho, \theta) Z_{j'}(\rho, \theta)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{12-10b} \]

0

J j

Wˆ ( x , y ) = Â a j Â M ji Z i ( x , y )

j =1 i =1

J J

= Â Â a i M ij Z j ( x , y ) . (12-11)

j =1 i = j

J

bˆ j = Â a i M ij . (12-12)

i= j

It is clear that the value of a circle coefficient b̂ j depends on the number of polynomials J

used in the expansion. Moreover, it is a linear combination of the orthonormal

coefficients, just as an orthonormal polynomial is a linear combination of the circle

polynomials. Equation (12-12) can be written in a matrix form as

b̂ = M T a , (12-13)

where $a$ and $\hat{b}$ are the column vectors representing the orthonormal and the Zernike coefficients, respectively, and $M^T$ is the transpose of the conversion matrix M. Thus, the matrix that is used to obtain the orthonormal polynomials from the circle polynomials is also used to obtain the circle coefficients from the orthonormal coefficients. The transpose of a matrix is obtained by interchanging its rows and columns. Since M is a lower triangular matrix, $M^T$ is an upper triangular matrix. Multiplying both sides of Eq. (12-13) by the inverse $(M^T)^{-1}$ of $M^T$, we obtain

\[ a = (M^T)^{-1}\, \hat{b} . \tag{12-14} \]

Accordingly, if the circle coefficients are known, the orthonormal coefficients can be obtained from them.

If the orthonormal coefficients are not known, the circle coefficients $\hat{b}_j$ can be obtained by a least squares fit. Suppose the aberration values are known over a certain domain by way of interferometry at N data points. Equation (12-9) can be written in matrix form

\[ \hat{S} = Z \hat{b} , \tag{12-15} \]

where $\hat{S}$ is a column vector of the N values of $\hat{W}(x, y)$, and Z is an $N \times J$ matrix representing each of the J polynomials over the N data points. Solving Eq. (12-15), for example, with a standard singular-value decomposition algorithm yields

\[ \hat{b} = Z^{-1} \hat{S} , \tag{12-16} \]

where $Z^{-1}$ is a generalized inverse of the Z matrix. Of course, this procedure can also be used to determine the orthonormal coefficients by replacing the circle polynomials with the orthonormal polynomials. Except for any numerical error because of the finite number N of the data points, the $\hat{b}$-coefficients given by Eq. (12-16) are the same as those given by Eq. (12-13).
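The least-squares procedure of Eqs. (12-15) and (12-16) can be sketched as follows. The example assumes an annular pupil with an obscuration ratio of 0.5, equal-area sample points, a hypothetical test wavefront $\rho^4$, and NumPy's `lstsq` in place of an explicit generalized inverse; the first 11 circle polynomials are written in the ordering used in this chapter.

```python
import numpy as np

def circle_polynomials(rho, theta):
    """First 11 Zernike circle polynomials Z_1..Z_11 as columns (N x 11)."""
    c, s = np.cos, np.sin
    return np.column_stack([
        np.ones_like(rho),
        2 * rho * c(theta),
        2 * rho * s(theta),
        np.sqrt(3) * (2 * rho**2 - 1),
        np.sqrt(6) * rho**2 * s(2 * theta),
        np.sqrt(6) * rho**2 * c(2 * theta),
        np.sqrt(8) * (3 * rho**3 - 2 * rho) * s(theta),
        np.sqrt(8) * (3 * rho**3 - 2 * rho) * c(theta),
        np.sqrt(8) * rho**3 * s(3 * theta),
        np.sqrt(8) * rho**3 * c(3 * theta),
        np.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1),
    ])

def annulus_samples(eps, n_r=200, n_t=64):
    """Equal-area sample points on the annulus eps <= rho <= 1."""
    u = eps**2 + (1 - eps**2) * (np.arange(n_r) + 0.5) / n_r   # u = rho**2
    t = 2 * np.pi * np.arange(n_t) / n_t
    uu, tt = np.meshgrid(u, t, indexing="ij")
    return np.sqrt(uu).ravel(), tt.ravel()

def fit_circle_coefficients(w, rho, theta, J):
    """Least-squares solution of S = Z b for the first J circle polynomials."""
    Z = circle_polynomials(rho, theta)[:, :J]
    b_hat, *_ = np.linalg.lstsq(Z, w, rcond=None)
    return b_hat

eps = 0.5
rho, theta = annulus_samples(eps)
w = rho**4                                   # hypothetical test wavefront
b4 = fit_circle_coefficients(w, rho, theta, 4)
b11 = fit_circle_coefficients(w, rho, theta, 11)
```

Comparing `b4[:4]` with `b11[:4]` shows the piston and defocus circle coefficients changing as J goes from 4 to 11, which is the dependence on J noted after Eq. (12-12).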

If the practice of drawing a unit circle around an interferogram and determining the Zernike coefficients for a circular pupil is extended to a noncircular wavefront, the coefficients thus obtained will be given by

\[ b_j = \frac{1}{A}\int_{\text{pupil}} W(x, y) Z_j(x, y)\, dx\, dy . \tag{12-17} \]

The circle polynomials in Eq. (12-17) are implicitly assumed to be orthonormal over the noncircular pupil. The value of a circle coefficient $b_j$ does not depend on the number of polynomials used in the expansion. Substituting Eq. (12-1) for the estimated aberration function $\hat{W}(x, y)$ in terms of the orthonormal polynomials, we obtain

\[ b_j = \sum_{j'=1}^{J} a_{j'}\, \frac{1}{A}\int_{\text{pupil}} Z_j(x, y) F_{j'}(x, y)\, dx\, dy = \sum_{j'=1}^{J} a_{j'} \langle Z_j | F_{j'} \rangle , \tag{12-18} \]

or in a matrix form

\[ b = C^{ZF} a , \tag{12-19} \]

where $C^{ZF}$ is a $J \times J$ matrix of the inner products of the circle polynomials with the orthonormal polynomials over the domain of the noncircular wavefront. As illustrated in Sections 12.3 and 12.4, by considering an annular or a hexagonal Seidel aberration function, respectively, the circle coefficients $b_j$ thus obtained are incorrect in the sense that they do not yield a least-squares fit of the aberration function $W(x, y)$, unlike the coefficients $\hat{b}_j$. This, of course, is due to the incorrect assumption of orthonormality of the circle polynomials over the noncircular pupil.

To relate the $\hat{b}$- and the b-circle coefficients, we equate the right-hand sides of Eqs. (12-1) and (12-9), multiply both sides by $Z_{j'}$, and integrate over the domain of the noncircular pupil. Thus,

\[ \sum_{j=1}^{J} \hat{b}_j Z_j(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) \tag{12-20} \]

and

\[ \sum_{j=1}^{J} \hat{b}_j \langle Z_{j'} | Z_j \rangle = \sum_{j=1}^{J} a_j \langle Z_{j'} | F_j \rangle , \tag{12-21} \]

or in a matrix form

\[ C^{ZZ} \hat{b} = C^{ZF} a = b , \tag{12-22} \]

where we have utilized Eq. (12-19). From Eqs. (12-13) and (12-22), it is evident that

\[ C^{ZF} = C^{ZZ} M^T . \tag{12-23} \]

The elements $c_{jj'}$ of $C^{ZZ}$ and $d_{jj'}$ of $C^{ZF}$ are given by

\[ c_{jj'} = \frac{1}{A}\int_{\text{pupil}} Z_j(x, y) Z_{j'}(x, y)\, dx\, dy \tag{12-24} \]

and

\[ d_{jj'} = \frac{1}{A}\int_{\text{pupil}} Z_j(x, y) F_{j'}(x, y)\, dx\, dy . \tag{12-25} \]
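The relation $b = C^{ZZ}\hat{b}$ of Eq. (12-22) can be checked for a simple case. The sketch below assumes an annular pupil with an obscuration ratio of 0.5 and a hypothetical wavefront $W = \rho^4$, for which only the coefficients with j = 1, 4, and 11 are nonzero; the elements of $C^{ZZ}$ are taken from Table 12-4, and the closed-form radial moment is an elementary integral over the annulus.

```python
import numpy as np

def radial_moment(k, eps):
    """Mean of rho**(2k) over the annulus eps <= rho <= 1."""
    return (1 - eps ** (2 * k + 2)) / ((k + 1) * (1 - eps**2))

eps = 0.5
e2 = eps**2
m = lambda k: radial_moment(k, eps)

# Eq. (12-17) for W = rho**4: inner products with Z_1, Z_4, and Z_11
b = np.array([
    m(2),
    np.sqrt(3) * (2 * m(3) - m(2)),
    np.sqrt(5) * (6 * m(4) - 6 * m(3) + m(2)),
])

# Least-squares circle coefficients of rho**4 (it lies in the span of Z_1, Z_4, Z_11)
b_hat = np.array([1 / 3, 1 / (2 * np.sqrt(3)), 1 / (6 * np.sqrt(5))])

# Elements of C^{ZZ} for j, j' in {1, 4, 11} (Table 12-4)
c14 = np.sqrt(3) * e2
c1_11 = -np.sqrt(5) * e2 * (1 - 2 * e2)
c44 = 1 - 2 * e2 + 4 * e2**2
c4_11 = np.sqrt(15) * e2 * (1 - 3 * e2 + 3 * e2**2)
c11_11 = 1 - 4 * e2 + 26 * e2**2 - 54 * e2**3 + 36 * e2**4
C_zz = np.array([[1.0, c14, c1_11],
                 [c14, c44, c4_11],
                 [c1_11, c4_11, c11_11]])

b_from_b_hat = C_zz @ b_hat
```

The vector `b` differs from `b_hat`, which illustrates why coefficients computed from Eq. (12-17) do not give a least-squares fit over the annulus.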


AN ANNULAR WAVEFRONT

Consider a system with a unit annular pupil with an obscuration ratio $\epsilon$, as illustrated in Figure 5-1. The polynomials $A_j(\rho, \theta; \epsilon)$ that are orthonormal across it and represent balanced aberrations for it are similar to the circle polynomials in that they are separable in the radial coordinate $\rho$ and the azimuthal angle $\theta$ of a point on the pupil. The dependence on the obscuration ratio is contained only in the radial portion of the polynomial. As discussed in Chapter 5, the annular polynomials are given by

where $\epsilon \le \rho \le 1$, n and m are positive integers, and $n - m \ge 0$ and even. The annular polynomials are orthonormal across the annular pupil according to

\[ \frac{1}{\pi(1 - \epsilon^2)}\int_\epsilon^1 \int_0^{2\pi} A_j(\rho, \theta; \epsilon) A_{j'}(\rho, \theta; \epsilon)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{12-27} \]

The annular polynomials can be written in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$, as discussed in Chapter 4, according to

\[ \{A_j\} = M \{Z_j\} . \tag{12-28} \]

The aberration function across the annular pupil can be estimated by J annular polynomials according to

\[ \hat{W}(\rho, \theta; \epsilon) = \sum_{j=1}^{J} a_j A_j(\rho, \theta; \epsilon) , \tag{12-29} \]

where the expansion coefficients are given by

\[ a_j = \frac{1}{\pi(1 - \epsilon^2)}\int_\epsilon^1 \int_0^{2\pi} W(\rho, \theta; \epsilon) A_j(\rho, \theta; \epsilon)\, \rho\, d\rho\, d\theta . \tag{12-30} \]

The mean value and the variance of the estimated function are accordingly given by Eqs. (12-4) and (12-6).


Table 12-1 lists the first 11 annular polynomials, as obtained from the annular-polynomial Tables 5-3 and 5-4. They are given in terms of the circle polynomials in Table 12-2. The nonzero elements of an $11 \times 11$ conversion matrix, as obtained from Table 12-2, are listed in Table 12-3. The transpose matrix $M^T$ can be obtained easily by interchanging the rows and columns of M. The nonzero elements of the $11 \times 11$ matrices $C^{ZZ}$ and $C^{ZF}$ are given in Tables 12-4 and 12-5, respectively.

determined from Eq. (12-30). If it is expanded in terms of only the first four circle polynomials, i.e., if J = 4 in Eq. (12-9), then the expansion $\hat{b}$-coefficients according to Eq. (12-13) are given by

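\[ \begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & -\sqrt{3}\,\epsilon^2 (1 - \epsilon^2)^{-1} \\
0 & (1 + \epsilon^2)^{-1/2} & 0 & 0 \\
0 & 0 & (1 + \epsilon^2)^{-1/2} & 0 \\
0 & 0 & 0 & (1 - \epsilon^2)^{-1}
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
= \begin{pmatrix}
a_1 - \sqrt{3}\,\epsilon^2 (1 - \epsilon^2)^{-1} a_4 \\
(1 + \epsilon^2)^{-1/2} a_2 \\
(1 + \epsilon^2)^{-1/2} a_3 \\
(1 - \epsilon^2)^{-1} a_4
\end{pmatrix} \tag{12-31} \]

or

\[ \hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1 - \epsilon^2)^{-1} a_4 , \tag{12-32a} \]

\[ \hat{b}_2 = (1 + \epsilon^2)^{-1/2} a_2 , \tag{12-32b} \]

\[ \hat{b}_3 = (1 + \epsilon^2)^{-1/2} a_3 , \tag{12-32c} \]

\[ \hat{b}_4 = (1 - \epsilon^2)^{-1} a_4 . \tag{12-32d} \]

These coefficients represent the Zernike piston, tip, tilt, and defocus coefficients.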

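To see how these coefficients change with the number of polynomials used in the expansion, we consider an expansion using the first 11 circle polynomials. The coefficients are now given by

\[ \hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1 - \epsilon^2)^{-1} a_4 + \sqrt{5}\,\epsilon^2 (1 + \epsilon^2)(1 - \epsilon^2)^{-2} a_{11} , \tag{12-33a} \]

\[ \hat{b}_2 = (1 + \epsilon^2)^{-1/2} a_2 - (2\sqrt{2}\,\epsilon^4 / B)\, a_8 , \tag{12-33b} \]

\[ \hat{b}_3 = (1 + \epsilon^2)^{-1/2} a_3 - (2\sqrt{2}\,\epsilon^4 / B)\, a_7 , \tag{12-33c} \]

\[ \hat{b}_4 = (1 - \epsilon^2)^{-1} a_4 - \sqrt{15}\,\epsilon^2 (1 - \epsilon^2)^{-2} a_{11} , \tag{12-33d} \]

\[ \hat{b}_5 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} a_5 , \tag{12-33e} \]

\[ \hat{b}_6 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} a_6 , \tag{12-33f} \]

\[ \hat{b}_7 = [(1 + \epsilon^2)/B]\, a_7 , \tag{12-33g} \]

\[ \hat{b}_8 = [(1 + \epsilon^2)/B]\, a_8 , \tag{12-33h} \]

\[ \hat{b}_9 = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} a_9 , \tag{12-33i} \]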

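Table 12-1. Annular polynomials $A_j(\rho, \theta; \epsilon)$ for $j \le 11$.

j    n   m   $A_j(\rho, \theta; \epsilon)$   Aberration name
1    0   0   $1$   Piston
2    1   1   $2[\rho/(1 + \epsilon^2)^{1/2}] \cos\theta$   x tilt
3    1   1   $2[\rho/(1 + \epsilon^2)^{1/2}] \sin\theta$   y tilt
4    2   0   $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1 - \epsilon^2)$   Defocus
5    2   2   $\sqrt{6}\,[\rho^2/(1 + \epsilon^2 + \epsilon^4)^{1/2}] \sin 2\theta$   45° Primary astigmatism
6    2   2   $\sqrt{6}\,[\rho^2/(1 + \epsilon^2 + \epsilon^4)^{1/2}] \cos 2\theta$   0° Primary astigmatism
7    3   1   $\sqrt{8}\,\dfrac{3(1 + \epsilon^2)\rho^3 - 2(1 + \epsilon^2 + \epsilon^4)\rho}{(1 - \epsilon^2)[(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2}} \sin\theta$   Primary y coma
8    3   1   $\sqrt{8}\,\dfrac{3(1 + \epsilon^2)\rho^3 - 2(1 + \epsilon^2 + \epsilon^4)\rho}{(1 - \epsilon^2)[(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2}} \cos\theta$   Primary x coma
9    3   3   $\sqrt{8}\,[\rho^3/(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}] \sin 3\theta$
10   3   3   $\sqrt{8}\,[\rho^3/(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2}] \cos 3\theta$
11   4   0   $\sqrt{5}\,[6\rho^4 - 6(1 + \epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4]/(1 - \epsilon^2)^2$   Primary spherical aberration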

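Table 12-2. Annular polynomials $A_j(\rho, \theta; \epsilon)$ in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$, where $\epsilon$ is the obscuration ratio of the annular pupil.

$A_1 = Z_1$
$A_2 = (1 + \epsilon^2)^{-1/2} Z_2$
$A_3 = (1 + \epsilon^2)^{-1/2} Z_3$
$A_4 = (1 - \epsilon^2)^{-1} (-\sqrt{3}\,\epsilon^2 Z_1 + Z_4)$
$A_5 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} Z_5$
$A_6 = (1 + \epsilon^2 + \epsilon^4)^{-1/2} Z_6$
$A_7 = B^{-1} [-2\sqrt{2}\,\epsilon^4 Z_3 + (1 + \epsilon^2) Z_7]$
$A_8 = B^{-1} [-2\sqrt{2}\,\epsilon^4 Z_2 + (1 + \epsilon^2) Z_8]$
$A_9 = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} Z_9$
$A_{10} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} Z_{10}$
$A_{11} = (1 - \epsilon^2)^{-2} [\sqrt{5}\,\epsilon^2 (1 + \epsilon^2) Z_1 - \sqrt{15}\,\epsilon^2 Z_4 + Z_{11}]$

$B = (1 - \epsilon^2)[(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2}$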

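Table 12-3. Nonzero elements of the $11 \times 11$ conversion matrix M for obtaining the annular polynomials $A_j(\rho, \theta; \epsilon)$ from the Zernike circle polynomials $Z_j(\rho, \theta)$.

$M_{11} = 1$
$M_{22} = (1 + \epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\,\epsilon^2 (1 - \epsilon^2)^{-1}$
$M_{44} = (1 - \epsilon^2)^{-1}$
$M_{55} = (1 + \epsilon^2 + \epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\,\epsilon^4 B^{-1} = M_{82}$
$M_{77} = (1 + \epsilon^2) B^{-1} = M_{88}$
$M_{99} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\,\epsilon^2 (1 + \epsilon^2)(1 - \epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\,\epsilon^2 (1 - \epsilon^2)^{-2}$
$M_{11,11} = (1 - \epsilon^2)^{-2}$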

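Table 12-4. Nonzero elements $c_{jj'}$ of the $11 \times 11$ matrix $C^{ZZ}$ of the inner products of the circle polynomials over an annular pupil of obscuration ratio $\epsilon$, where $c_{jj'} = c_{j'j}$.

$c_{11} = 1$
$c_{14} = \sqrt{3}\,\epsilon^2 = c_{41}$
$c_{1,11} = -\sqrt{5}\,\epsilon^2 (1 - 2\epsilon^2) = c_{11,1}$
$c_{22} = 1 + \epsilon^2 = c_{33}$
$c_{28} = 2\sqrt{2}\,\epsilon^4 = c_{82} = c_{37} = c_{73}$
$c_{44} = 1 - 2\epsilon^2 + 4\epsilon^4$
$c_{4,11} = \sqrt{15}\,\epsilon^2 (1 - 3\epsilon^2 + 3\epsilon^4) = c_{11,4}$
$c_{55} = 1 + \epsilon^2 + \epsilon^4 = c_{66}$
$c_{77} = 1 + \epsilon^2 - 7\epsilon^4 + 9\epsilon^6 = c_{88}$
$c_{99} = 1 + \epsilon^2 + \epsilon^4 + \epsilon^6 = c_{10,10}$
$c_{11,11} = 1 - 4\epsilon^2 + 26\epsilon^4 - 54\epsilon^6 + 36\epsilon^8$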

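Table 12-5. Nonzero elements $d_{jj'}$ of the $11 \times 11$ matrix $C^{ZF}$ of the inner products of the circle polynomials with the annular polynomials over an annular pupil of obscuration ratio $\epsilon$.

$d_{11} = 1$
$d_{22} = (1 + \epsilon^2)^{1/2} = d_{33}$
$d_{41} = \sqrt{3}\,\epsilon^2$
$d_{44} = (1 - 2\epsilon^2 + \epsilon^4)(1 - \epsilon^2)^{-1}$
$d_{55} = (1 + \epsilon^2 + \epsilon^4)^{1/2} = d_{66}$
$d_{73} = 2\sqrt{2}\,\epsilon^4 (1 + \epsilon^2)^{-1/2} = d_{82}$
$d_{77} = (1 - \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)^{1/2} (1 + \epsilon^2)^{-1/2} = d_{88}$
$d_{99} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2} = d_{10,10}$
$d_{11,4} = \sqrt{15}\,\epsilon^2 (1 - \epsilon^2)$
$d_{11,11} = (1 - \epsilon^2)^2$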

\[ \hat{b}_{10} = (1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{-1/2} a_{10} , \tag{12-33j} \]

\[ \hat{b}_{11} = (1 - \epsilon^2)^{-2} a_{11} , \tag{12-33k} \]

where

\[ B = (1 - \epsilon^2)[(1 + \epsilon^2)(1 + 4\epsilon^2 + \epsilon^4)]^{1/2} . \tag{12-34} \]

It is evident that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$. The Zernike astigmatism coefficients $\hat{b}_5$ and $\hat{b}_6$ are smaller than the corresponding annular coefficients $a_5$ and $a_6$ by a factor of $(1 + \epsilon^2 + \epsilon^4)^{1/2}$. However, the Zernike spherical aberration coefficient $\hat{b}_{11}$ is larger than the corresponding annular coefficient $a_{11}$ by a factor of $(1 - \epsilon^2)^{-2}$. For example, when $\epsilon = 0.5$, the astigmatism coefficients are smaller by a factor of 1.1456, and the spherical aberration coefficient is larger by a factor of 1.7778.

There is correlation between an annular and a circle polynomial only if they have the same azimuthal dependence. As a consequence, the piston coefficient $\hat{b}_1$, for example, is a linear combination of the piston coefficient $a_1$, defocus coefficient $a_4$, and various orders of spherical aberration. Similarly, the tilt coefficient $\hat{b}_2$ is a linear combination of the tilt coefficient $a_2$ and various orders of coma, and the astigmatism coefficient $\hat{b}_5$ is a linear combination of various orders of astigmatism. Accordingly, the astigmatism coefficients change if a 13-polynomial expansion is considered. For example, $\hat{b}_5$ then contains a contribution from $a_{13}$ as well. The tip and tilt coefficients $\hat{b}_2$ and $\hat{b}_3$ change further if polynomials $A_{16}$ (varying as $\cos\theta$) and $A_{17}$ (varying as $\sin\theta$) are included in the expansion. Moreover, $A_{16}$ also contributes to the coma coefficient $\hat{b}_8$, and $A_{17}$ similarly contributes to the coma coefficient $\hat{b}_7$. The defocus coefficient $\hat{b}_4$ does not change until the secondary spherical aberration polynomial $A_{22}$ is included with its coefficient $a_{22}$. Its inclusion also affects the primary spherical aberration coefficient $\hat{b}_{11}$. Thus, it is easy to see which of the $\hat{b}_j$ coefficients change, when, and by how much, depending on the number of polynomials used in the expansion.
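The dependence of the circle coefficients on the number of polynomials can be made concrete by building the conversion matrix of Table 12-3 and applying Eq. (12-13). This is a minimal sketch; the function names and the all-ones annular coefficient vector are hypothetical, and array indices are zero-based (so `M[3, 0]` holds $M_{41}$).

```python
import numpy as np

def conversion_matrix(eps):
    """11 x 11 lower triangular matrix M with the elements of Table 12-3."""
    e2 = eps**2
    B = (1 - e2) * np.sqrt((1 + e2) * (1 + 4 * e2 + e2**2))   # Eq. (12-34)
    M = np.zeros((11, 11))
    M[0, 0] = 1
    M[1, 1] = M[2, 2] = (1 + e2) ** -0.5
    M[3, 0] = -np.sqrt(3) * e2 / (1 - e2)
    M[3, 3] = 1 / (1 - e2)
    M[4, 4] = M[5, 5] = (1 + e2 + e2**2) ** -0.5
    M[6, 2] = M[7, 1] = -2 * np.sqrt(2) * e2**2 / B
    M[6, 6] = M[7, 7] = (1 + e2) / B
    M[8, 8] = M[9, 9] = (1 + e2 + e2**2 + e2**3) ** -0.5
    M[10, 0] = np.sqrt(5) * e2 * (1 + e2) / (1 - e2) ** 2
    M[10, 3] = -np.sqrt(15) * e2 / (1 - e2) ** 2
    M[10, 10] = (1 - e2) ** -2
    return M

def circle_coefficients(a, eps):
    """b_hat = M^T a, Eq. (12-13), using the leading J x J block of M."""
    J = len(a)
    return conversion_matrix(eps)[:J, :J].T @ np.asarray(a, dtype=float)

eps = 0.5
a = np.ones(11)                              # hypothetical annular coefficients
b_hat_4 = circle_coefficients(a[:4], eps)    # Eqs. (12-32a-d)
b_hat_11 = circle_coefficients(a, eps)       # Eqs. (12-33a-k)
M = conversion_matrix(eps)
```

With these inputs, all four of the leading coefficients differ between the two expansions, and `1 / M[4, 4]` and `M[10, 10]` reproduce the factors 1.1456 and 1.7778 quoted for an obscuration ratio of 0.5.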

We note that the mean value of the aberration function is given by the annular piston coefficient $a_1$. However, the value of the corresponding Zernike circle coefficient $\hat{b}_1$ depends on the number of polynomials used in the expansion, and it does not equal $a_1$; therefore, it does not represent the mean value. An orthonormal annular coefficient (other than piston) represents the standard deviation of the corresponding aberration term in the expansion, but a Zernike circle coefficient generally does not. The variance of the aberration function cannot be obtained by summing the squares of the Zernike circle coefficients $\hat{b}_j$ (excluding the piston coefficient). The circle coefficients $b_j$ can be obtained from the $\hat{b}_j$- or the $a_j$-coefficients, according to Eq. (12-22). They are considered in Section 12.3.5 for a Seidel aberration function.

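The estimated wavefront obtained by using only the first four polynomials represents the best-fit parabolic approximation of the aberration function in a least squares sense. In terms of the orthonormal annular polynomials, it can be written as

\[ \hat{W}(x, y) = a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4 \tag{12-35a} \]

\[ = a_1 + 2(1 + \epsilon^2)^{-1/2} a_2 x + 2(1 + \epsilon^2)^{-1/2} a_3 y + \sqrt{3}\,(1 - \epsilon^2)^{-1} a_4 [-\epsilon^2 + (2\rho^2 - 1)] . \tag{12-35b} \]

Similarly, in terms of the circle polynomials, it can be written as

\[ \hat{W}(x, y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-36a} \]

\[ = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\,\hat{b}_4 (2\rho^2 - 1) . \tag{12-36b} \]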

In Eqs. (12-35) and (12-36), we have omitted the arguments of the annular and circle polynomials for simplicity. The coefficients of x, y, and $\rho^2$ representing the tip, tilt, and defocus values obtained from the circle coefficients are the same as those obtained from the orthonormal coefficients. The estimated piston from the Zernike expansion of Eq. (12-36b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$, which is the same as $a_1 - \sqrt{3}\,(1 + \epsilon^2)(1 - \epsilon^2)^{-1} a_4$ from the orthonormal expansion in Eq. (12-35b). Accordingly, the aberration function obtained by subtracting the piston, tip, tilt, and defocus values from the measured aberration function is independent of the nature of the polynomials used in the expansion, so long as the nonorthogonal expansion is in terms of only the first four circle polynomials [as may be seen, for example, by comparing Eqs. (12-33a–d) with Eqs. (12-32a–d)]. In an interferometer, the tip and tilt represent the lateral errors and defocus represents the longitudinal error in the location of a point source illuminating an optical surface under test from its center of curvature. These four terms are generally removed from the aberration function, and the remaining function is given to the optician to zero out from the optical surface by polishing.

When an aberration function is expanded in terms of the orthonormal polynomials,

one or more polynomial terms can be added or subtracted from the aberration function

without affecting the coefficients of the other polynomials in the expansion. But that is

generally not true with the Zernike expansion. This is due to the fact that an expansion in

terms of the orthonormal polynomials gives a best fit for each polynomial, but an

expansion in terms of the circle polynomials gives it for the whole set in the expansion.

12.3.3 Wavefront Fitting

orthonormal or Zernike polynomials is the same. For example, the 4-polynomial

aberration functions of Eqs. (12-35) and (12-36) are exactly the same function.

Although the wavefront fit with a certain number of circle polynomials is as good as

the fit with a corresponding set of the orthonormal polynomials, there are pitfalls in using

the circle polynomials. Since the circle polynomials are not orthogonal over the

noncircular pupil, the advantages of orthogonality and aberration balancing are lost. Since

they do not represent the balanced classical aberrations for a noncircular pupil, the

Zernike coefficients b̂ j do not have the physical significance of their orthonormal

counterparts. For example, the mean value of a circle polynomial across a noncircular

pupil is not zero, the Zernike piston coefficient does not represent the mean value of the

aberration, the other Zernike coefficients do not represent the standard deviation of the

corresponding aberration terms, and the variance of the aberration is not equal to the sum

of the squares of these other coefficients. Moreover, the value of a Zernike coefficient

generally changes as the number of polynomials used in the expansion of an aberration

function changes. Hence, the circle polynomials are not appropriate for the analysis of a

noncircular wavefront. Of course, wavefront fitting with the improperly calculated

Zernike coefficients b j by using Eq. (12-17) will be in error, as demonstrated in Section

12.3.4 for a Seidel aberration function.

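Consider an annular pupil aberrated by a Seidel aberration function given by

\[ W(\rho, \theta) = A_t \rho \cos\theta + A_d \rho^2 + A_a \rho^2 \cos^2\theta + A_c \rho^3 \cos\theta + A_s \rho^4 , \tag{12-37} \]

where $A_t$, $A_d$, $A_a$, $A_c$, and $A_s$ are the coefficients of distortion, field curvature, astigmatism, coma, and spherical aberration, respectively. Without the explicit field dependence, distortion is equivalent to a wavefront tilt, and field curvature is equivalent to a wavefront defocus.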

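The aberration function when approximated by only the first four annular polynomials can be written

\[ \hat{W}(\rho, \theta; \epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 , \tag{12-38} \]

where

\[ a_1 = \frac{(1 + \epsilon^2)(2A_d + A_a)}{4} + \frac{(1 + \epsilon^2 + \epsilon^4) A_s}{3} , \tag{12-39a} \]

\[ a_2 = \frac{(1 + \epsilon^2)^{1/2} A_t}{2} + \frac{(1 + \epsilon^2 + \epsilon^4)(1 + \epsilon^2)^{-1/2} A_c}{3} , \tag{12-39b} \]

\[ a_4 = \frac{(1 - \epsilon^2)(2A_d + A_a)}{4\sqrt{3}} + \frac{(1 - \epsilon^4) A_s}{2\sqrt{3}} . \tag{12-39c} \]

It should be evident that the coefficient $a_3$ of the annular polynomial $A_3$ varying as $\sin\theta$ is zero. The mean value of the estimated aberration function is given by $a_1$, and its variance is given by

\[ \sigma_{\hat{W}}^2 = a_2^2 + a_4^2 . \tag{12-40} \]

When the first eleven annular polynomials are used, the expansion can be written

\[ W(\rho, \theta; \epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 + a_6 A_6 + a_8 A_8 + a_{11} A_{11} , \tag{12-41} \]

where $a_1$, $a_2$, and $a_4$ are given by Eqs. (12-39a–c), and

\[ a_6 = \frac{1}{2\sqrt{6}} (1 + \epsilon^2 + \epsilon^4)^{1/2} A_a , \tag{12-39d} \]

\[ a_8 = \frac{1 - \epsilon^2}{6\sqrt{2}} \left( \frac{1 + 4\epsilon^2 + \epsilon^4}{1 + \epsilon^2} \right)^{1/2} A_c , \tag{12-39e} \]

\[ a_{11} = \frac{(1 - \epsilon^2)^2}{6\sqrt{5}} A_s . \tag{12-39f} \]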

The coefficients $a_3$, $a_5$, $a_7$, and $a_9$ of the polynomials $A_3$, $A_5$, $A_7$, and $A_9$, respectively, each polynomial varying as $\sin m\theta$, are zero. Moreover, the coefficient $a_{10}$ of the polynomial $A_{10}$ varying as $\cos 3\theta$ is also zero. The 11-polynomial expansion represents the Seidel aberration function exactly. Its mean value is again $a_1$, as given by Eq. (12-39a), and its variance is given by

\[ \sigma_W^2 = a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{12-42} \]

Next we expand the Seidel aberration function in terms of the circle polynomials. A 4-polynomial expansion can be obtained from Eqs. (12-32) and (12-39) in the form

\[ \hat{W}(\rho, \theta; \epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 , \tag{12-43} \]

where

\[ \hat{b}_1 = \frac{2A_d + A_a}{4} + \left[ 1 - \frac{\epsilon^2 (1 + \epsilon^2)}{2} \right] \frac{A_s}{3} , \tag{12-44a} \]

\[ \hat{b}_2 = \frac{a_2}{(1 + \epsilon^2)^{1/2}} , \tag{12-44b} \]

\[ \hat{b}_4 = \frac{a_4}{1 - \epsilon^2} . \tag{12-44c} \]

The estimated aberration function in Eq. (12-43) is exactly the same as that in Eq. (12-38), and the values of piston, x-tilt, and defocus are exactly the same as those obtained from Eqs. (12-39a–c). It should be evident, however, that its mean value is not given by b̂1. Moreover, since an expansion coefficient does not represent the standard deviation of the corresponding aberration polynomial term, its variance is not given by b̂2² + b̂4².
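The statement that the two four-term expansions describe the same function, while only the annular piston coefficient equals the mean, can be checked numerically. This is a sketch under stated assumptions: the coefficients follow Eqs. (12-39a–c) and (12-44a–c), and the polynomial forms A2 = 2ρ cos θ/(1+ε²)^{1/2}, A4 = √3(2ρ² − 1 − ε²)/(1 − ε²), Z2 = 2ρ cos θ, and Z4 = √3(2ρ² − 1) are the standard orthonormal forms, which are not reproduced in this section. All function names are hypothetical.

```python
import math

def coefficients(eps, At, Ad, Aa, Ac, As):
    """Annular (a1, a2, a4) and circle (b1, b2, b4) coefficients of the 4-term fit."""
    e2, e4 = eps**2, eps**4
    a1 = (1 + e2) * (2 * Ad + Aa) / 4 + (1 + e2 + e4) * As / 3
    a2 = (math.sqrt(1 + e2) * At / 2
          + (1 + e2 + e4) * Ac / (3 * math.sqrt(1 + e2)))
    a4 = ((1 - e2) * (2 * Ad + Aa) / (4 * math.sqrt(3))
          + (1 - e4) * As / (2 * math.sqrt(3)))
    b1 = (2 * Ad + Aa) / 4 + (1 - e2 * (1 + e2) / 2) * As / 3  # Eq. (12-44a)
    b2 = a2 / math.sqrt(1 + e2)                                # Eq. (12-44b)
    b4 = a4 / (1 - e2)                                         # Eq. (12-44c)
    return (a1, a2, a4), (b1, b2, b4)

def annular_fit(rho, theta, eps, a):
    A2 = 2 * rho * math.cos(theta) / math.sqrt(1 + eps**2)
    A4 = math.sqrt(3) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2)
    return a[0] + a[1] * A2 + a[2] * A4

def circle_fit(rho, theta, b):
    Z2 = 2 * rho * math.cos(theta)
    Z4 = math.sqrt(3) * (2 * rho**2 - 1)
    return b[0] + b[1] * Z2 + b[2] * Z4

eps = 0.5
a, b = coefficients(eps, At=1, Ad=1, Aa=1, Ac=2, As=3)
for rho, theta in [(0.5, 0.0), (0.75, 1.0), (1.0, 2.5)]:
    print(annular_fit(rho, theta, eps, a), circle_fit(rho, theta, b))
print("mean value a1 =", a[0], "  circle piston coefficient b1 =", b[0])
```

The fitted values coincide point by point, but for ε = 0.5 the circle piston coefficient differs from the mean value a1.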

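From Eqs. (12-33) and (12-39), an 11-polynomial Zernike circle expansion can be written

$$ W(\rho,\theta;\epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 + \hat{b}_6 Z_6 + \hat{b}_8 Z_8 + \hat{b}_{11} Z_{11} , \tag{12-45} $$

where

$$ \hat{b}_1 = \frac{2A_d + A_a}{4} + \frac{A_s}{3} , \tag{12-46a} $$

$$ \hat{b}_2 = \frac{A_t}{2} + \frac{A_c}{3} , \tag{12-46b} $$

$$ \hat{b}_4 = \frac{2A_d + A_a}{4\sqrt{3}} + \frac{A_s}{2\sqrt{3}} , \tag{12-46c} $$

$$ \hat{b}_6 = \frac{A_a}{2\sqrt{6}} , \tag{12-46d} $$

$$ \hat{b}_8 = \frac{A_c}{6\sqrt{2}} , \tag{12-46e} $$

$$ \hat{b}_{11} = \frac{A_s}{6\sqrt{5}} . \tag{12-46f} $$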

As in the case of annular polynomials, the eleven circle polynomials also represent the Seidel aberration function exactly. The expansion coefficients can also be obtained by inspection of the aberration function and the form of the circle polynomials. Indeed, because of the form of the Seidel aberration function, the circle coefficients are independent of the obscuration ratio ε. Each b̂-coefficient represents the value of the corresponding a-coefficient for ε = 0. It is clear that each of the three nonzero coefficients of the 4-polynomial expansion changes as the number of polynomials is increased from four to eleven. Hence, the values of piston, x-tilt, and defocus obtained from the coefficients b̂1, b̂2, and b̂4 are incorrect. Again, the mean value of the aberration function is not given by b̂1, and its variance is not given by the sum of the squares of the other coefficients.
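That the six nonzero circle coefficients reproduce the Seidel function exactly, independent of ε, can be confirmed by direct evaluation. The following sketch assumes the Seidel form W = A_t ρ cos θ + A_d ρ² + A_a ρ² cos² θ + A_c ρ³ cos θ + A_s ρ⁴, the coefficients of Eqs. (12-46a–f), and the standard orthonormal circle polynomials Z1, Z2, Z4, Z6, Z8, and Z11; the function names are hypothetical.

```python
import math

def seidel(rho, theta, At, Ad, Aa, Ac, As):
    c = math.cos(theta)
    return At * rho * c + Ad * rho**2 + Aa * rho**2 * c**2 + Ac * rho**3 * c + As * rho**4

def circle_expansion_11(rho, theta, At, Ad, Aa, Ac, As):
    """Sum of b_hat_j * Z_j with the coefficients of Eqs. (12-46a-f)."""
    b = {
        1: (2 * Ad + Aa) / 4 + As / 3,
        2: At / 2 + Ac / 3,
        4: (2 * Ad + Aa) / (4 * math.sqrt(3)) + As / (2 * math.sqrt(3)),
        6: Aa / (2 * math.sqrt(6)),
        8: Ac / (6 * math.sqrt(2)),
        11: As / (6 * math.sqrt(5)),
    }
    Z = {
        1: 1.0,
        2: 2 * rho * math.cos(theta),
        4: math.sqrt(3) * (2 * rho**2 - 1),
        6: math.sqrt(6) * rho**2 * math.cos(2 * theta),
        8: math.sqrt(8) * (3 * rho**3 - 2 * rho) * math.cos(theta),
        11: math.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1),
    }
    return sum(b[j] * Z[j] for j in b)

for rho, theta in [(0.5, 0.3), (0.8, 2.0), (1.0, 5.0)]:
    print(seidel(rho, theta, 1, 1, 1, 2, 3), circle_expansion_11(rho, theta, 1, 1, 1, 2, 3))
```

No obscuration ratio appears anywhere in the sketch, which is the point: the identity holds at every point of the plane, so it holds over any annulus.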

setting errors and remove them from the aberration function, the residual aberration

function from the annular expansion is given by


Eq. (12-43) is subtracted from the aberration function W(ρ, θ; ε). However, if the first

four polynomials are subtracted from the aberration function of Eq. (12-45), the residual

aberration function is given by

$$ W_{RC\hat{b}}(\rho,\theta;\epsilon) = \frac{A_a}{2\sqrt{6}} Z_6 + \frac{A_c}{6\sqrt{2}} Z_8 + \frac{A_s}{6\sqrt{5}} Z_{11} . \tag{12-48} $$

Since the 11-polynomial aberration functions of Eqs. (12-41) and (12-45) are equal

to each other [and equal to the Seidel aberration function of Eq. (12-37)], the difference

between the residual aberration functions of Eqs. (12-48) and (12-47) is equal to the

difference between the interferometer setting errors given by Eq. (12-38) or (12-43) and

those given by Eq. (12-45). Accordingly, the difference or the error function consists of

piston, tilt, and defocus only. It is given by

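$$ \Delta W_{R\hat{b}}(\rho,\theta;\epsilon) = -\frac{1}{6} \epsilon^2 (4+\epsilon^2) A_s + \frac{2\epsilon^4}{3(1+\epsilon^2)} A_c \rho\cos\theta + \epsilon^2 A_s \rho^2 , \tag{12-49} $$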

and is independent of the number J of the annular and circle polynomials (e.g., 11, as

above) used in the expansion. Of course, piston does not affect the peak-to-valley value

or the variance of the aberration function. If the interferometer setting errors obtained

from Eq. (12-45) are applied in the fabrication and testing of an optical system with an

annular pupil, the difference function represents the polishing error due to the use of the

circle polynomials.
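The error function of Eq. (12-49) can be rebuilt from the two sets of setting errors. The sketch below forms the difference between the four-term coefficients of Eqs. (12-44a–c), written out explicitly in terms of the Seidel coefficients, and the first three nonzero coefficients of Eqs. (12-46a–c), multiplies by the standard circle polynomials Z1, Z2, and Z4, and compares the result with the closed form. The function names are hypothetical.

```python
import math

def error_closed_form(rho, theta, eps, Ac, As):
    """Eq. (12-49): piston, tilt, and defocus error terms."""
    e2, e4 = eps**2, eps**4
    return (-e2 * (4 + e2) * As / 6
            + 2 * e4 * Ac * rho * math.cos(theta) / (3 * (1 + e2))
            + e2 * As * rho**2)

def error_from_coefficients(rho, theta, eps, At, Ad, Aa, Ac, As):
    """(4-polynomial fit) minus (first four terms of the 11-polynomial expansion)."""
    e2, e4 = eps**2, eps**4
    r3 = math.sqrt(3)
    # Eqs. (12-44a-c) with a2 and a4 substituted from Eqs. (12-39b, c).
    b1_4 = (2 * Ad + Aa) / 4 + (1 - e2 * (1 + e2) / 2) * As / 3
    b2_4 = At / 2 + (1 + e2 + e4) * Ac / (3 * (1 + e2))
    b4_4 = (2 * Ad + Aa) / (4 * r3) + (1 + e2) * As / (2 * r3)
    # Eqs. (12-46a-c).
    b1_11 = (2 * Ad + Aa) / 4 + As / 3
    b2_11 = At / 2 + Ac / 3
    b4_11 = (2 * Ad + Aa) / (4 * r3) + As / (2 * r3)
    Z2 = 2 * rho * math.cos(theta)
    Z4 = r3 * (2 * rho**2 - 1)
    return (b1_4 - b1_11) + (b2_4 - b2_11) * Z2 + (b4_4 - b4_11) * Z4

eps = 0.5
for rho, theta in [(0.5, 0.0), (0.8, 1.2), (1.0, 3.0)]:
    print(error_closed_form(rho, theta, eps, 2, 3),
          error_from_coefficients(rho, theta, eps, 1, 1, 1, 2, 3))
```

The coefficients A_t, A_d, and A_a cancel in the difference, so only the coma and spherical aberration coefficients appear in the error function.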

aberration given by Eqs. (12-39d–f) with the corresponding Zernike coefficients given by

Eq. (12-46d–f), we obtain

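$$ \frac{a_6}{\hat{b}_6} = (1+\epsilon^2+\epsilon^4)^{1/2} , \tag{12-50a} $$

$$ \frac{a_8}{\hat{b}_8} = (1-\epsilon^2) \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} , \tag{12-50b} $$

and

$$ \frac{a_{11}}{\hat{b}_{11}} = (1-\epsilon^2)^2 . \tag{12-50c} $$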

Since the b̂_j-coefficients are independent of the value of ε, the variation of a ratio a_j/b̂_j with ε represents the variation of an annular coefficient a_j.
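The three ratios of Eqs. (12-50a–c) are simple to tabulate. This is an illustrative sketch only; the function name `coefficient_ratios` is hypothetical.

```python
import math

def coefficient_ratios(eps):
    """Ratios a6/b6_hat, a8/b8_hat, a11/b11_hat of Eqs. (12-50a-c)."""
    e2, e4 = eps**2, eps**4
    r6 = math.sqrt(1 + e2 + e4)
    r8 = (1 - e2) * math.sqrt((1 + 4 * e2 + e4) / (1 + e2))
    r11 = (1 - e2)**2
    return r6, r8, r11

for eps in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9):
    r6, r8, r11 = coefficient_ratios(eps)
    print(f"eps={eps:.1f}  astigmatism={r6:.4f}  coma={r8:.4f}  spherical={r11:.4f}")
```

At ε = 0.5 the astigmatism ratio is about 1.1456 and the spherical aberration ratio is 0.5625, while the coma ratio stays close to unity for small ε.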

12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus 325

Now we consider the expansion of the Seidel aberration function in terms of the

circle polynomials by assuming them to be orthogonal over the annulus. This is what one

does when defining a center of an interferogram, drawing a unit circle around it, and

determining its circle coefficients. The aberration function in this case can be written in

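the form

$$ W(\rho,\theta;\epsilon) = \sum_j b_j Z_j(\rho,\theta) , \tag{12-51} $$

where the expansion coefficients are given by

$$ b_j = \frac{1}{\pi (1-\epsilon^2)} \int_\epsilon^1 \int_0^{2\pi} W(\rho,\theta;\epsilon) Z_j(\rho,\theta) \, \rho \, d\rho \, d\theta . \tag{12-52} $$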

They can also be obtained from Eq. (12-22), i.e., from the annular or circle coefficients

by using the matrix C ZZ or C ZF given in Tables 12-4 and 12-5, respectively. The

“incorrect” circle coefficients b j are given by

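$$ b_1 = a_1 , \tag{12-53a} $$

$$ b_2 = (1+\epsilon^2)^{1/2} a_2 , \tag{12-53b} $$

$$ b_4 = \frac{1}{4\sqrt{3}} (1+\epsilon^2+4\epsilon^4)(2A_d + A_a) + \frac{1}{2\sqrt{3}} (1+\epsilon^2+\epsilon^4+3\epsilon^6) A_s , \tag{12-53c} $$

$$ b_6 = (1+\epsilon^2+\epsilon^4)^{1/2} a_6 , \tag{12-53d} $$

$$ b_8 = \sqrt{2}\, \epsilon^4 A_t + \frac{1}{6\sqrt{2}} (1+\epsilon^2+\epsilon^4+9\epsilon^6) A_c , \tag{12-53e} $$

$$ b_{11} = \frac{\sqrt{5}}{4} \epsilon^4 (3\epsilon^2 - 1)(2A_d + A_a) + \frac{1}{6\sqrt{5}} (1+\epsilon^2+\epsilon^4-9\epsilon^6+36\epsilon^8) A_s , \tag{12-53f} $$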

etc. These coefficients are incorrect in the sense that they do not yield a least-squares fit

of the aberration function. Since an annular polynomial with n = m has the same form as

that for a corresponding circle polynomial except for the normalization constant, the

coefficients b j and a j for such a polynomial are also related to each other by the

normalization constant. Equations (12-53a, b, d) represent this fact for n = m = 0, 1, 2 ,

respectively. It is clear, however, that the improperly calculated circle coefficients b_j depend on the obscuration ratio ε of the pupil. Evidently, they are different from the corresponding b̂-coefficients given by Eqs. (12-46a–f). While the value of the piston coefficient b1 is equal to the true mean value a1, the tilt coefficient b2 is larger than a2 by a factor of $(1+\epsilon^2)^{1/2}$, or 1.1180, and the astigmatism coefficient b6 is larger than a6 by a factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$, or 1.1456, when ε = 0.5. Moreover, the b-coefficients of some of


the nonexistent higher-order aberrations are not zero. For example, the coefficients b22 ,

b37 , etc. of the secondary and tertiary Zernike spherical aberrations Z 22 , Z 37 , etc., and

b16 , b30 , etc. of the secondary and tertiary Zernike coma Z16 and Z 30 , etc., are nonzero.

Thus, nonexistent aberrations are generated when an aberration function is expanded

improperly in terms of the circle polynomials.
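The appearance of nonexistent aberrations can be illustrated for the rotationally symmetric terms. The sketch below evaluates Eq. (12-52) for Z4, Z11, and Z22 by averaging W Z_j over the annulus with exact moments of ρ^k, assuming the Seidel form W = A_t ρ cos θ + A_d ρ² + A_a ρ² cos² θ + A_c ρ³ cos θ + A_s ρ⁴ and the standard circle polynomials Z4 = √3(2ρ² − 1), Z11 = √5(6ρ⁴ − 6ρ² + 1), and Z22 = √7(20ρ⁶ − 30ρ⁴ + 12ρ² − 1). The helper names are hypothetical.

```python
import math

def moment(k, eps):
    """Mean of rho**k over the annulus eps <= rho <= 1."""
    return 2 * (1 - eps**(k + 2)) / ((k + 2) * (1 - eps**2))

# (normalization, {power of rho: coefficient}) for m = 0 circle polynomials.
Z4 = (math.sqrt(3), {2: 2, 0: -1})
Z11 = (math.sqrt(5), {4: 6, 2: -6, 0: 1})
Z22 = (math.sqrt(7), {6: 20, 4: -30, 2: 12, 0: -1})

def improper_coeff(Z, eps, Ad, Aa, As):
    """Eq. (12-52) for an m = 0 polynomial: annulus mean of W * Z_j."""
    norm, radial = Z
    c = Ad + Aa / 2  # theta-average of Ad*rho^2 + Aa*rho^2*cos^2(theta)
    return norm * sum(coef * (c * moment(p + 2, eps) + As * moment(p + 4, eps))
                      for p, coef in radial.items())

def b4_closed(eps, Ad, Aa, As):
    """Eq. (12-53c)."""
    e2, e4, e6 = eps**2, eps**4, eps**6
    return ((1 + e2 + 4 * e4) * (2 * Ad + Aa) / (4 * math.sqrt(3))
            + (1 + e2 + e4 + 3 * e6) * As / (2 * math.sqrt(3)))

def b11_closed(eps, Ad, Aa, As):
    """Eq. (12-53f)."""
    e2, e4, e6, e8 = eps**2, eps**4, eps**6, eps**8
    return (math.sqrt(5) / 4 * e4 * (3 * e2 - 1) * (2 * Ad + Aa)
            + (1 + e2 + e4 - 9 * e6 + 36 * e8) * As / (6 * math.sqrt(5)))

for eps in (0.0, 0.5):
    print(eps,
          improper_coeff(Z4, eps, 1, 1, 3), b4_closed(eps, 1, 1, 3),
          improper_coeff(Z11, eps, 1, 1, 3), b11_closed(eps, 1, 1, 3),
          improper_coeff(Z22, eps, 1, 1, 3))
```

For ε = 0 the secondary spherical aberration coefficient b22 vanishes to rounding error, as it must for a Seidel function; for ε = 0.5 it does not.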

If we estimate the annular Seidel aberration function with only four circle polynomials

from Eq. (12-51), we obtain

If we truncate the expansion in terms of the circle polynomials in Eq. (12-51) to the first

11 circle polynomials and remove the first four coefficients as interferometer setting

errors, the residual aberration function in this case is given by

or 1.1180 when ε = 0.5 than its true value

given by a 2 , and the defocus error given by b4 can be compared with its true value given

by a 4 . Since the 11-polynomial aberration function from Eq. (12-51) is not equal to the

aberration function of Eq. (12-41), their difference does not consist of the difference in

their interferometer setting errors. For example, Eq. (12-53d) indicates that there will be

an astigmatism term in the difference function. Thus, wrong polishing will result if the

aberration function of Eq. (12-55) is provided to the optician to zero out.

At = Ad = Aa = 1, Ac = 2, and As = 3 in waves. As illustrated in Figure 12-1, the annular and circle coefficients of a 4-polynomial expansion differ from each other, although they yield the same fit of the aberration function. We note that, whereas the mean value a1 increases as ε increases, the piston coefficient b̂1 decreases. Conversely, the defocus coefficient a4 decreases, while b̂4 increases. Both tilt coefficients a2 and b̂2 increase. For an 11-polynomial expansion, the first four annular coefficients remain the same, but the circle coefficients become independent of ε, as in Eqs. (12-46). Figure 12-2 shows the coefficient ratios a6/b̂6 (astigmatism), a8/b̂8 (coma), and a11/b̂11 (spherical) for an 11-polynomial expansion. We note that the coefficient a6 increases, a11 decreases, and a8 is nearly constant for small values of ε and then decreases as ε increases. Figure 12-3 shows how the b̂-coefficients change as we change the number of polynomials from 4 to 11 for ε = 0.5. Wrong polishing will result if the tip, tilt, and focus errors of an interferometer setting are estimated from the 11-circle-polynomial expansion instead of the 4-circle-polynomial expansion. The variation of the standard deviation with ε obtained from the coefficients of a 4- or 11-polynomial expansion is shown in Figure 12-4, illustrating that the circle coefficients yield incorrect results. The standard deviation obtained from the orthonormal coefficients increases slowly with ε, starting at 1.7460 and 1.7877 for the 4- and 11-polynomial


expansions, respectively. However, the standard deviation obtained from the circle coefficients is correct only when ε = 0. It increases rapidly with ε for the 4-polynomial expansion, but it is constant for the 11-polynomial expansion, indicating its incorrect nature. The sigma values from the orthonormal and the circle coefficients are nearly equal to each other for ε ≤ 0.5 because of the very slow increase of the orthonormal sigma.
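The standard deviations discussed above can be tabulated as a function of ε. The sketch below assumes the annular coefficients of Eqs. (12-39b–f), the circle coefficients of Eqs. (12-44b, c) and (12-46b–f), and the example values A_t = A_d = A_a = 1, A_c = 2, A_s = 3 in waves. The "circle" entries are the root sum of squares of the circle coefficients, which is the incorrect estimate the text warns against; the function name `sigmas` is hypothetical.

```python
import math

def sigmas(eps, At=1, Ad=1, Aa=1, Ac=2, As=3):
    e2, e4 = eps**2, eps**4
    s = 1 + e2 + e4
    r2, r3, r5, r6 = (math.sqrt(v) for v in (2, 3, 5, 6))
    # Orthonormal annular coefficients, Eqs. (12-39b-f).
    a2 = math.sqrt(1 + e2) * At / 2 + s * Ac / (3 * math.sqrt(1 + e2))
    a4 = (1 - e2) * (2 * Ad + Aa) / (4 * r3) + (1 - e4) * As / (2 * r3)
    a6 = math.sqrt(s) * Aa / (2 * r6)
    a8 = (1 - e2) * math.sqrt((1 + 4 * e2 + e4) / (1 + e2)) * Ac / (6 * r2)
    a11 = (1 - e2)**2 * As / (6 * r5)
    # Circle coefficients of the 4-polynomial fit, Eqs. (12-44b, c).
    b2_4, b4_4 = a2 / math.sqrt(1 + e2), a4 / (1 - e2)
    # Circle coefficients of the 11-polynomial expansion, Eqs. (12-46b-f).
    b11 = [At / 2 + Ac / 3, (2 * Ad + Aa) / (4 * r3) + As / (2 * r3),
           Aa / (2 * r6), Ac / (6 * r2), As / (6 * r5)]
    return {
        "annular_4": math.hypot(a2, a4),
        "annular_11": math.sqrt(a2**2 + a4**2 + a6**2 + a8**2 + a11**2),
        "circle_4": math.hypot(b2_4, b4_4),
        "circle_11": math.sqrt(sum(b**2 for b in b11)),
    }

for eps in (0.0, 0.25, 0.5, 0.75):
    row = sigmas(eps)
    print(f"eps={eps:.2f}  " + "  ".join(f"{k}={v:.4f}" for k, v in row.items()))
```

At ε = 0 the circle and annular values coincide (1.7460 and 1.7877); the 11-polynomial circle value then stays constant while the orthonormal value rises slowly.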

Figure 12-5 shows the contours of the Seidel aberration function for a circular and an annular pupil with obscuration ratio of ε = 0.5. The case of a circular pupil is included just for reference. The dark circular region in Figure 12-5b (and others) represents the obscuration. The contours of the annular Seidel aberration function fit with only four polynomials, as in Eq. (12-38) or (12-43) and in Eq. (12-54), are shown in Figures

for a 4-polynomial expansion.

Figure 12-2. Ratio of the orthonormal annular coefficients a_j and Zernike circle coefficients b̂_j for an 11-polynomial expansion.


b̂ j , illustrating how the latter change as the number of polynomials changes from 4

to 11.

coefficients a j and Zernike circle coefficients b̂ j of a 4- and 11-polynomial

expansion.

12.3.4.5 Numerical Example 329

Figure 12-5. Contours of (a) Seidel aberration function of Eq. (12-37) for a circular pupil with At = Ad = Aa = 1, Ac = 2, and As = 3 in waves. (b) Same Seidel aberration function, but for an annular pupil with obscuration ratio ε = 0.5.

Figure 12-6. Contours of an annular Seidel aberration function for ε = 0.5 fit with only four polynomials, as in (a) Eq. (12-38) or (12-43), and (b) Eq. (12-54).


Figure 12-7. Contours of the residual aberration function after removing the interferometer setting errors. (a) W_RA of Eq. (12-47) using annular polynomials, (b) W_RCb̂ of Eq. (12-48) using circle polynomials correctly, and (c) W_RCb of Eq. (12-55) using circle polynomials incorrectly.


Figure 12-8. Contours of the difference or the error function (a) of Eq. (12-49) and (b) obtained by subtracting Eq. (12-47) from Eq. (12-55).

12-6a and 12-6b, respectively. The two figures look similar, but they are not the same. Only Figure 12-6a represents the least-squares and, therefore, the correct fit. The contours of the residual aberration function when the first four (of the eleven) polynomials are removed as interferometer setting errors, as in Eqs. (12-47), (12-48), and (12-55), are shown in Figures 12-7a, 12-7b, and 12-7c, respectively. All three figures are

different from each other, as expected. Only Figure 12-7a reflects removal of the correct

interferometer setting errors, and thus the correct residual aberration function. The

contours of the difference of the residual functions using the circle polynomials from the

one using the annular polynomials are shown in Figures 12-8a and 12-8b. They represent

the error functions given by Eq. (12-49) and the difference of Eqs. (12-55) and (12-47),

respectively, due to the removal of incorrect interferometer setting errors.


A HEXAGONAL WAVEFRONT

12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients

Now, we consider a hexagonal aberration function W(x, y) across a unit hexagon shown in Figure 7-7, and demonstrate the pitfalls of using Zernike circle polynomials for its expansion. Estimating the aberration function with J hexagonal polynomials H_j(x, y) given in Chapter 7, we may write

$$ \hat{W}(x,y) = \sum_{j=1}^{J} a_j H_j(x,y) , \tag{12-56} $$

where the expansion coefficients are given by

$$ a_j = \frac{2}{3\sqrt{3}} \int_{\rm hexagon} W(x,y) H_j(x,y) \, dx \, dy . \tag{12-57} $$

The mean value and the variance of the estimated aberration function are given by Eqs. (12-4) and (12-6).
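The coefficient integral of Eq. (12-57) can be approximated on a grid. The following is a rough sketch, not the procedure used in the text. It assumes a unit hexagon with vertices at (±1, 0) and (±1/2, ±√3/2) (the orientation in Figure 7-7 may differ, which does not affect rotationally symmetric polynomials) and uses H4 = √(5/43)(12ρ² − 5), the form implied by row 4 of the conversion matrix, H4 = √(5/43) Z1 + 2√(15/43) Z4. The function names are hypothetical.

```python
import math

def in_unit_hexagon(x, y):
    # Assumed orientation: vertices at (+-1, 0) and (+-1/2, +-sqrt(3)/2).
    r3 = math.sqrt(3)
    return abs(y) <= r3 / 2 and r3 * abs(x) + abs(y) <= r3

def hex_coefficient(W, H, n=400):
    """Midpoint-rule estimate of a_j = (2 / (3 sqrt 3)) * integral of W * H_j."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1 + (i + 0.5) * h
        for k in range(n):
            y = -1 + (k + 0.5) * h
            if in_unit_hexagon(x, y):
                total += W(x, y) * H(x, y)
    return total * h * h * 2 / (3 * math.sqrt(3))

H1 = lambda x, y: 1.0
H4 = lambda x, y: math.sqrt(5 / 43) * (12 * (x * x + y * y) - 5)

area_check = hex_coefficient(H1, H1)  # should be close to 1
mean_H4 = hex_coefficient(H4, H1)     # should be close to 0
norm_H4 = hex_coefficient(H4, H4)     # should be close to 1
print(area_check, mean_H4, norm_H4)
```

The three printed values approximate the orthonormality conditions: unit normalization of H1 and H4, and zero mean of H4 over the hexagon.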

the Zernike circle polynomials is given in Table 12-6, as obtained from Table 7-1. Its

transpose and inverse matrices are given in Tables 12-7 and 12-8, respectively. If only the

first 4 polynomials are used in the expansion, then the b̂ j coefficients according to Eq.

(12-13) are given by

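$$ \begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & \sqrt{5/43} \\ 0 & \sqrt{6/5} & 0 & 0 \\ 0 & 0 & \sqrt{6/5} & 0 \\ 0 & 0 & 0 & 2\sqrt{15/43} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} = \begin{pmatrix} a_1 + \sqrt{5/43}\, a_4 \\ \sqrt{6/5}\, a_2 \\ \sqrt{6/5}\, a_3 \\ 2\sqrt{15/43}\, a_4 \end{pmatrix} , \tag{12-58} $$

or

$$ \hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 , \tag{12-59a} $$

$$ \hat{b}_2 = \sqrt{6/5}\, a_2 , \tag{12-59b} $$

$$ \hat{b}_3 = \sqrt{6/5}\, a_3 , \tag{12-59c} $$

and

$$ \hat{b}_4 = 2\sqrt{15/43}\, a_4 . \tag{12-59d} $$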

It is evident that the piston coefficient b̂1 is not equal to a1 and, therefore, does not


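Table 12-6. Conversion matrix M for obtaining the Zernike coefficients b̂_j from the orthonormal hexagonal coefficients a_j, as in Eq. (12-12).

$$ \left( \begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\sqrt{5/43} & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
521/\sqrt{1072205} & 0 & 0 & 88\sqrt{15/214441} & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array} \right) $$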

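Table 12-7. Transpose M^T of the conversion matrix M.

$$ \left( \begin{array}{ccccccccccc}
1 & 0 & 0 & \sqrt{5/43} & 0 & 0 & 0 & 0 & 0 & 0 & 521/\sqrt{1072205} \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 88\sqrt{15/214441} \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array} \right) $$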


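Table 12-8. Analytical matrix M⁻¹ for obtaining the orthonormal hexagonal coefficients a_j from the Zernike coefficients b̂_j.

$$ \left( \begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1/(2\sqrt{3}) & 0 & 0 & \sqrt{43/15}/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 & 0 \\
0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3/(2\sqrt{5}) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{103/35}/2 & 0 \\
-1/(2\sqrt{5}) & 0 & 0 & -22/(7\sqrt{43}) & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{4987/43}/14
\end{array} \right) $$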

represent the mean value of the aberration function. The coefficients b̂2 , b̂3 , and b̂4

represent the tip, tilt, and defocus circle coefficients.
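The conversion matrices can be checked against each other numerically. The sketch below enters the nonzero elements of M and M⁻¹ as read from Tables 12-6 and 12-8, with the off-diagonal elements of the inverse taken as negative, which is what the product M M⁻¹ = I requires. It then applies the transpose of the leading 4 × 4 block as in Eq. (12-58). The hexagonal coefficients used are hypothetical sample values, and the function names are illustrative.

```python
import math

s = math.sqrt
# Nonzero elements, keyed by 1-based (row, column).
M_ENTRIES = {
    (1, 1): 1.0, (2, 2): s(6 / 5), (3, 3): s(6 / 5),
    (4, 1): s(5 / 43), (4, 4): 2 * s(15 / 43),
    (5, 5): s(10 / 7), (6, 6): s(10 / 7),
    (7, 3): 16 * s(14 / 11055), (7, 7): 10 * s(35 / 2211),
    (8, 2): 16 * s(14 / 11055), (8, 8): 10 * s(35 / 2211),
    (9, 9): 2 * s(5) / 3, (10, 10): 2 * s(35 / 103),
    (11, 1): 521 / s(1072205), (11, 4): 88 * s(15 / 214441),
    (11, 11): 14 * s(43 / 4987),
}
MINV_ENTRIES = {
    (1, 1): 1.0, (2, 2): s(5 / 6), (3, 3): s(5 / 6),
    (4, 1): -1 / (2 * s(3)), (4, 4): s(43 / 15) / 2,
    (5, 5): s(7 / 10), (6, 6): s(7 / 10),
    (7, 3): -8 / (5 * s(15)), (7, 7): s(2211 / 35) / 10,
    (8, 2): -8 / (5 * s(15)), (8, 8): s(2211 / 35) / 10,
    (9, 9): 3 / (2 * s(5)), (10, 10): s(103 / 35) / 2,
    (11, 1): -1 / (2 * s(5)), (11, 4): -22 / (7 * s(43)),
    (11, 11): s(4987 / 43) / 14,
}

def dense(entries, n=11):
    m = [[0.0] * n for _ in range(n)]
    for (i, j), v in entries.items():
        m[i - 1][j - 1] = v
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def zernike_from_hexagonal(a):
    """b_hat = M^T a using the first len(a) polynomials (Eq. 12-58 for four)."""
    n = len(a)
    M = dense(M_ENTRIES)
    return [sum(M[j][i] * a[j] for j in range(n)) for i in range(n)]

product = matmul(dense(M_ENTRIES), dense(MINV_ENTRIES))
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(11) for j in range(11))
print("max deviation of M * Minv from identity:", max_dev)

a = [0.5, 0.2, -0.1, 0.3]  # hypothetical hexagonal coefficients a1..a4
print(zernike_from_hexagonal(a))
```

With a4 nonzero, the first element of the result differs from a1 by √(5/43) a4, which is why b̂1 does not represent the mean value.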

To see how these coefficients change with the number of polynomials used in the

expansion, we consider an expansion using 11 polynomials. The coefficients, obtained

from Eq. (12-13), are given by

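$$ \hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 + \frac{521}{\sqrt{1072205}}\, a_{11} , \tag{12-60a} $$

$$ \hat{b}_2 = \sqrt{6/5}\, a_2 + 16\sqrt{14/11055}\, a_8 , \tag{12-60b} $$

$$ \hat{b}_3 = \sqrt{6/5}\, a_3 + 16\sqrt{14/11055}\, a_7 , \tag{12-60c} $$

$$ \hat{b}_4 = 2\sqrt{15/43}\, a_4 + 88\sqrt{15/214441}\, a_{11} , \tag{12-60d} $$

$$ \hat{b}_5 = \sqrt{10/7}\, a_5 , \tag{12-60e} $$

$$ \hat{b}_6 = \sqrt{10/7}\, a_6 , \tag{12-60f} $$

$$ \hat{b}_7 = 10\sqrt{35/2211}\, a_7 , \tag{12-60g} $$

$$ \hat{b}_8 = 10\sqrt{35/2211}\, a_8 , \tag{12-60h} $$

$$ \hat{b}_9 = (2/3)\sqrt{5}\, a_9 , \tag{12-60i} $$

$$ \hat{b}_{10} = 2\sqrt{35/103}\, a_{10} , \tag{12-60j} $$

and

$$ \hat{b}_{11} = 14\sqrt{43/4987}\, a_{11} . \tag{12-60k} $$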

It is clear that all of the first four coefficients change, and b̂_j = M_jj a_j for 5 ≤ j ≤ 11. For astigmatism (H5 and H6), coma (H7 and H8), and spherical aberration (H11), the b̂_j coefficient is larger than the corresponding hexagonal coefficient by a factor of $\sqrt{10/7} \approx 1.20$, $10\sqrt{35/2211} \approx 1.26$, and $14\sqrt{43/4987} \approx 1.30$, respectively. The

astigmatism coefficients b̂5 and b̂6 change if a 15-polynomial expansion is considered.

For example, b̂5 then contains contributions from a13 and a15 , as well. The tip and tilt

coefficients b̂2 and b̂3 change further if polynomials H16 and H17 are included in the

expansion. Moreover, H16 also contributes to the coma coefficient b̂8 , and H17 similarly

contributes to the coma coefficient b̂7 . The piston and defocus coefficients b̂1 and b̂4 do

not change until the secondary spherical aberration polynomial H 22 is included with its

coefficient a 22 . Its inclusion also affects the primary spherical aberration coefficient b̂11 .

Thus, it is easy to see which, when, and by how much the b̂ j coefficients change,

depending on the number of polynomials used in the expansion.
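How the circle coefficients shift with the number of polynomials can be seen by applying the transpose of the conversion matrix for J = 4 and J = 11. This is a sketch using the nonzero elements of M from Table 12-6 and hypothetical hexagonal coefficients a_j = 1 for all j, chosen only to make the changes visible; the function name `circle_coeffs` is illustrative.

```python
import math

s = math.sqrt
# Nonzero elements of M, keyed by 1-based (row, column), as in Table 12-6.
M = {
    (1, 1): 1.0, (2, 2): s(6 / 5), (3, 3): s(6 / 5),
    (4, 1): s(5 / 43), (4, 4): 2 * s(15 / 43),
    (5, 5): s(10 / 7), (6, 6): s(10 / 7),
    (7, 3): 16 * s(14 / 11055), (7, 7): 10 * s(35 / 2211),
    (8, 2): 16 * s(14 / 11055), (8, 8): 10 * s(35 / 2211),
    (9, 9): 2 * s(5) / 3, (10, 10): 2 * s(35 / 103),
    (11, 1): 521 / s(1072205), (11, 4): 88 * s(15 / 214441),
    (11, 11): 14 * s(43 / 4987),
}

def circle_coeffs(a):
    """b_hat_i = sum over j <= J of M[j, i] * a_j, i.e. b_hat = M^T a."""
    J = len(a)
    return [sum(M.get((j, i), 0.0) * a[j - 1] for j in range(1, J + 1))
            for i in range(1, J + 1)]

a = [1.0] * 11            # hypothetical hexagonal coefficients
b_4 = circle_coeffs(a[:4])
b_11 = circle_coeffs(a)
for i in range(4):
    print(f"b_hat_{i + 1}: J=4 -> {b_4[i]:.4f}   J=11 -> {b_11[i]:.4f}")
print("diagonal factors:",
      round(M[(5, 5)], 2), round(M[(7, 7)], 2), round(M[(11, 11)], 2))
```

All four leading coefficients change between the two expansions, and the printed diagonal factors reproduce the values 1.20, 1.26, and 1.30 quoted for astigmatism, coma, and spherical aberration.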

The estimated wavefront obtained by using only the first four polynomials represents

the best-fit parabolic approximation of the aberration function in a least-squares sense. In

terms of the Zernike polynomials, it can be written as

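$$ \hat{W}(x,y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-61a} $$

$$ = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\, \hat{b}_4 (2\rho^2 - 1) . \tag{12-61b} $$

In terms of the hexagonal polynomials, it is given by

$$ \hat{W}(x,y) = a_1 H_1 + a_2 H_2 + a_3 H_3 + a_4 H_4 \tag{12-62a} $$

$$ = a_1 + 2\sqrt{6/5}\, a_2 x + 2\sqrt{6/5}\, a_3 y + a_4 \left[ \sqrt{5/43} + 6\sqrt{5/43}\, (2\rho^2 - 1) \right] . \tag{12-62b} $$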

Comparing the right-hand sides of Eqs. (12-61b) and (12-62b) and utilizing Eqs. (12-59a–d), it is seen that the coefficients of x, y, and x² + y², representing the tip, tilt, and defocus values obtained from the Zernike coefficients, are the same as those obtained from the hexagonal coefficients. The estimated piston from the Zernike expansion of Eq. (12-61b) is $\hat{b}_1 - \sqrt{3}\, \hat{b}_4$. Substituting for b̂1 and b̂4 from Eqs. (12-59a–d), we find that it is the same as a1