## Chemometrics in Spectroscopy, Second Edition

by Howard Mark and Jerry Workman, Jr.


*Chemometrics in Spectroscopy, Second Edition* provides the reader with the methodology crucial to applying chemometrics to real-world data. It allows scientists using spectroscopic instruments to find explanations and solutions to their problems when they are confronted with unexpected and unexplained results. Unlike other books on these topics, it explains the root causes of the phenomena that lead to these results. While books on NIR spectroscopy sometimes cover basic chemometrics, they do not mention many of the advanced topics this book discusses. In addition, traditional chemometrics books do not cover spectroscopy to the point of understanding the basis for the underlying phenomena.

The second edition has been expanded with 50% more content covering advances in the field that have occurred in the last 10 years, including calibration transfer, units of measure in spectroscopy, principal components, clinical data reporting, classical least squares, regression models, spectral transfer, and more.

•Written in the column format of the authors' online magazine

•Presents topical and important chapters for those involved in analysis work, both research and routine

•Focuses on practical issues in the implementation of chemometrics for NIR spectroscopy

•Includes a companion website with 350 additional color figures that illustrate CLS concepts

Publisher: Academic Press. Released: Jul 13, 2018. ISBN: 9780128053300. Format: book.


This large single volume fulfills the need for chemometric-based tutorials on topics of interest to analytical chemists and other scientists performing modern mathematical and statistical operations on analytical measurements. The book covers a very broad range of chemometric topics, as indicated in the extensive table of contents. It is a collection of the series of columns first published in *Spectroscopy*, providing detailed mathematical and philosophical discussions on the use of chemometrics and statistical methods for scientific measurements and analytical methods. In addition, the revolution in biotechnology and the slew of spectroscopic techniques it has brought provide an opportunity for those scientists to strengthen their use of mathematics and calibration through this book.

Subjects covered include those of interest to many groups of scientists, mathematicians, and practicing analysts for daily problem-solving, as well as detailed insights into subjects that are difficult for nonspecialists to grasp thoroughly. The coverage relies more on concept delineation than on rigorous mathematics, but the descriptive mathematics and derivations are included for the more rigorously minded.

Sections cover matrix algebra, analytic geometry, experimental design, instrument and system calibration, noise, derivatives and their use in data analysis, linearity, and nonlinearity. Collaborative laboratory studies, analysis of variance (ANOVA), testing for systematic error, ranking tests for collaborative studies, and efficient comparison of two analytical methods are included, as are discussions of the limitations of analytical accuracy and brief introductions to the statistics of spectral searches and the chemometrics of imaging spectroscopy.

The popularity of the *Chemometrics in Spectroscopy* series (ongoing since the early 1990s), as well as the *Statistics in Spectroscopy* series and books, has been overwhelming, and we sincerely thank our readership over the years. We have received emails from many people; one memorable message thanked us for prompting a career change, brought about by a renewed and stimulated interest in statistics and chemometrics due largely to our thought-provoking columns. We hope you find this collection useful and will continue to read the columns and write to us with your thoughts, comments, and questions regarding this stimulating topic.

This second edition of *Chemometrics in Spectroscopy *is an extension of the first edition. At the time the first edition was published, we were already hard at work writing the eponymous columns for *Spectroscopy *(the magazine), wherein the various columns corresponding to the chapters in this book were originally published. Indeed, due to the vagaries of publishing schedules for books and magazines, some of the later chapters of the first edition of *Chemometrics in Spectroscopy *were actually published in the book even before the corresponding column was published in the magazine! Nevertheless, the series of magazine columns do form a coherent sequence, as do the chapters in this book.

This second edition of *Chemometrics in Spectroscopy*, however, contains roughly 40% more chapters than the first edition contained. The first 75 chapters recapitulate those of the first edition, with some reorganization (to be described later). In addition, of course, errors that were found after publication of the first edition are corrected here, and some supplementary material was added when we realized that the discussions in the first edition were incomplete or lacking important details.

Many chapters were reorganized with respect to the original magazine columns, as well as with respect to the first edition. In the first edition, one of our goals was to faithfully reproduce the magazine columns as they initially appeared (except for correcting errors). Our goal here was a modification of that. Most chapters here correspond 1:1 with a given column in the magazine, but do not necessarily appear in the same order. There are two underlying reasons for this.

First, as time went on and we covered more complex topics that required more complicated descriptions, any given topic tended to be spread over more and more magazine columns. Since any given issue of *Spectroscopy *had only limited space for our column, a lengthy and complicated discussion had to be spread over more issues of the magazine.

Second, the editors of *Spectroscopy* wanted some variation in the topics we covered in successive issues; they did not want one topic to exclusively take up the magazine's space allocation for what would sometimes have been a year or more.

We were able to address these requirements by alternating the topics we addressed, issue to issue, although this sometimes made it necessary to wait for two and sometimes three issues to be able to read the continuation of a given discussion.

When we came to write this book, we decided early on that our original organization, or even the revised one to satisfy the editors, was not completely suited for a book. We felt that any given topic should be discussed continuously (or at least, contiguously). So while we mostly kept the correspondence between the *Spectroscopy *columns and the chapters in this book, the sequence of chapters was rearranged so that chapters discussing any given topic all appear sequentially, regardless of the number of chapters they are spread over. For this reason, the chapters here do not always correspond to the sequence of columns as originally published in *Spectroscopy*.

There is also another reason for that lack of correspondence. Sometimes, while reviewing and reorganizing the material for this book, we would realize that a topic had been omitted, and we would add an appropriate chapter to the book that never appeared (yet) in the magazine columns. When possible we would insert that as a new magazine column when an opportunity appeared, although sometimes that opportunity had not occurred before this book came to be printed. We do hope and expect to eventually include all those orphan columns in the magazine.

This new organization is also evident in the Table of Contents: a continuing discussion of a given topic is contained in successive chapters. In the Table of Contents we have therefore separated the chapter headings into blocks, with a section heading for blocks of chapters representing the same topic. Individual chapters, not part of a larger block, are simply separated from each other, without a section head.

For those readers who would like to compare the original magazine column with the corresponding chapter in this book, we include the reference to that column on the title page of the corresponding chapter here, since it would otherwise be difficult to make the correct correspondence. Some columns may not have been published by the time this book is printed. In those cases, the corresponding chapter will not include the reference.

We hope these changes make the sequence of information more useful, instead of adding more confusion to what are sometimes inherently confusing subjects in the first place! There are still many topics related to chemometrics that we have not discussed or covered in detail. Hopefully these will be dealt with in the future, either in *Spectroscopy *columns or in book form (or both!). We ask you, our readers, to stay tuned…

A feature of (at least some) Elsevier books is their Book Companion.

This is a place on the Elsevier website that contains information related to a given book, but does not appear in the book. There can be as many reasons for this as there are authors, I suppose, and more than one reason may apply for any given book.

The Book Companion for *Chemometrics in Spectroscopy, Second Edition* can be found at: https://www.elsevier.com/books-and-journals/book-companion/9780128053096

That seems like a handful to key in, but it turns out that it's not so bad. By exercising a little care, I was able to get it right on the first try! Readers who are intimidated at the thought of keying in such a lengthy URL may request a copy of a CD-ROM containing the files for the Book Companion by sending an email message to hlmark@nearinfrared.com, requesting a copy of the CD. The message must contain the reader's name, affiliation, and all contact information (especially the full postal address to which the CD will be shipped). A small fee will be requested to defray the costs of preparing and shipping the CD.

There are two reasons why information was placed on the Book Companion for *Chemometrics in Spectroscopy, Second Edition*, reasons that relate to both the nature of the information, and the location and organization of the information.

The first reason is described in Chapter 128, which describes some experiments and the results thereof (and includes a copy of the URL and a description of its use). Some of the results, and an overview, are described in Chapter 128 (no spoilers here; you'll have to read the chapter to get to the descriptions!), but the details of all the results are very voluminous and occupy 350 PowerPoint slides, far too many to include in the printed book. Additionally, there are other miscellaneous files in the set, including a copy of Table 128-7 (formatted as a Microsoft Word file) and miscellaneous slides representing the figures in that chapter. The compromise reached was to put all those slides and other information in the Book Companion, thereby making it available to readers without making the printed book impossibly large. This also conforms to the modern trend in scientific publishing, where journal articles often have auxiliary information available online. In this vein, Chapter 128 also contains more details about the Book Companion: what it is, why we developed it, why it's worth having, and how to access it.

The second reason, which is the rationale for the other set of information in the Book Companion, is the fact that *Chemometrics in Spectroscopy, Second Edition* is printed entirely in black and white (B&W). On the other hand, *Spectroscopy* (the magazine and original source for the columns from which this book is taken) is printed using color printing technology, and in creating those original columns the authors made liberal use of color figures. Conversion of those figures to B&W inevitably lost some of the information that was contained in the color figures. In some figures that loss is minor, but in other figures the loss of the color information can cause a complete lack of comprehension of what the figure is intended to demonstrate. Again, a compromise was reached, whereby those figures where the colors carry important information were added to the Book Companion, so that a reader can refer to the full-color figure when the B&W version does not suffice.

**Chapter 1 **

**A New Beginning … ☆ **

We introduce our new series of discussions of chemometric analysis, presenting some historical background, and list some of the projected topics that will be covered.

Chemometrics; Correlation matrix; Covariance matrix; Matrix; MND; Multivariate normal distribution; Reviews; Spectroscopy

**Multivariate Normal Distribution **

**Matrix Operations **

**References **

**Further Reading **

Why do we title this article *A New Beginning…*? Well, there are a lot of reasons. First of all, of course, is the simple fact that that's just the way we do things. Second is the fact that we hope to develop this series of columns the way we did our previous series, *Statistics in Spectroscopy* (SiS). Those of you out there who followed that series know that, for the most part, each column was pretty much self-contained and could stand alone, yet also fit into that series in the appropriate place and contributed to the flow of information in the series as a whole. We hope to be able to reproduce that on a larger scale. Just as the series *SiS* was self-contained and stood alone, so too will we try to make this new series stand alone, and at the same time be a worthy successor to *SiS*, and also continue to develop the concepts we began there.

Third is the fact that we are finally starting to write again. To you, our readership, it may seem like we have been writing continuously since we began *SiS*, but in fact we have been running on backlog for a longer time than you would believe. That was advantageous in that it allowed us time to pursue our personal and professional lives, including such other projects as arranging for *SiS* to be published as a book [1]. At this point, we have every intention of continuing the series; however, our other commitments may prevent us from turning out these columns as rapidly as we would like, so there may be some delays from one column to the next. We will try to keep these delays to a minimum.

The downside of our getting ahead of ourselves, on the other hand, is that we were not able to keep you abreast of the latest developments related to our favorite topic. However, since the last time we actually wrote something, there have been a number of noteworthy developments.

Our previous series dealt only with the elementary concepts of statistics related to the general practice of calibration used for ultraviolet-visible-near infrared (UV-Vis-NIR) and occasionally for infrared (IR) spectroscopy. Our purpose in writing *SiS* was to help provide a small footbridge across the gap between specialized chemometrics literature written at the expert level and those general statistics articles and texts dealing with examples and questions far removed from chemistry or spectroscopic practice. Since the beginning of the *Statistics* series in 1986, several reviews, tutorials, and textbooks have been published to begin the construction of a major highway bridging this gap. Most notable, at least in our minds, have been tutorial articles on classical least squares (CLS), principal components regression (PCR), and partial least squares regression (PLSR) by Haaland and Thomas [2,3]. Other important work includes textbooks on calibration and chemometrics by Naes and Martens [4] and Mark [5]. Chemometric reviews discussing the progress of tutorial and textbook literature appear regularly in *Analytical Chemistry* Critical Review issues. Another recent series of articles on chemometric concepts, termed The Chemometric Space, by Naes and Isaksson has appeared [6]. In addition, there is a North American chapter of the International Chemometrics Society (NAmICS), which we are told has over 300 members. Those interested in joining or obtaining further information may contact David Duewer at NIST (National Institute of Standards and Technology) (david.duewer@NIST.GOV) or send a message to the discussion group (ICSL@LISTSERV.UMD.EDU).

Finally, since imitation is the sincerest form of flattery (or so they tell us), we're pleased to see that others have also taken the route of printing longer tutorial discussions in the form of a series of related articles on a given topic. Series that we have no qualms recommending, on topics related to ours, have appeared in some of the sister publications of *Spectroscopy* [7–15] (note: there have been recent indications that the series in *Spectroscopy International* has continued beyond the ones we have listed. If we can obtain more information we will keep you posted; *Spectroscopy International* has also undergone some transformations, and it is not always easy to get copies). Tom Fearn, a Professor in the Department of Statistical Science, University College London, had a while back taken over the abovementioned series of columns, Chemometric Space, in *NIR News*. In this series, Prof. Fearn explains many of the more advanced and esoteric topics that fall under the heading of chemometrics, as well as shedding light on some of the darker corners of the more common topics, such as quantitative and qualitative modeling algorithms. Another regular column appearing in *NIR News* is the Chemometrics Mythbusters column, written by Kim Esbensen, Paul Geladi, and Anders Larsen as regular columnists, with sporadic other guest columnists.

So, overall, the chemometrics bridge between the lands of the overly simplistic and the severely complex is well under construction; one may find at least a single lane open by which to pass. So why another series? Well, it is still our labor of love to deal with specific issues that plague ourselves and our colleagues involved in the practice of multivariate qualitative and quantitative spectroscopic calibration. Having collectively worked with hundreds of instrument users over 25 combined years of calibration problems, we are compelled, like bees loaded with pollen, to disseminate the problems, answers, and questions brought about by these experiences. Then what would a series named *Chemometrics in Spectroscopy* hope to cover that would be of interest to the readers of *Spectroscopy*?

We have been taken to task (with perhaps some justice) for using the broader title *Chemometrics in Spectroscopy* for what we have claimed will be discussions of the somewhat narrower range of topics included in the field of multivariate statistical algorithms applied to chemical problems, when the term *chemometrics* actually applies to a much wider range of topics. Nevertheless, we will use this title, for a number of reasons. First, that's what we said we were going to do, and we hate to not follow through, even on such a minor point. Second, we have said previously (with all due arrogance) that this is our column, and we have been pretty fortunate that the editors of *Spectroscopy* have always pretty much let us do as we please. Finally, at this point, we consider the possibility that we may very well eventually extend our range to include some of these other topics that the broader term will cover.

As of right now, some of the topics we foresee being able to expand upon over the series will include, but not be limited to:

•The multivariate normal distribution (MND).

•Defining the bounds for a data set.

•The concept of Mahalanobis distance.

•Discriminant analysis and its subtopics:

–Sample selection.

–Spectral matching (qualitative analysis).

•Finding the maximum variance in the multivariate distribution.

•Matrix algebra refresher.

•Analytic geometry refresher.

•Principal components analysis (PCA).

•PCR.

•More on multiple linear least squares regression (MLLSR), also known as multiple linear regression (MLR), and *P*-matrix and its sibling, *K*-matrix (although the wider chemometric community is disparaging the "*K*-matrix and *P*-matrix" terminology and discouraging its use for new publications).

•More on simple linear least squares regression (SLLSR), also known as simple least squares regression (SLSR) or univariate least squares regression.

•PLSR.

•Validation of calibration models.

•Laboratory data and assessing error.

•Diagnosis of data problems.

•An attempt to standardize statistical/chemometric terms.

•Special calibration problems (and solutions).

•The concept of outliers: theory and practice.

•Standardization concepts and methods for transfer of calibrations.

•Collaborative study problems related to methods and instruments.

We also plan to include in the discussions such important statistical concepts as correlation, bias, slope, and associated errors and confidence limits. Beyond this, it is also our hope that readers will write to us with their comments or suggestions for chemometric challenges which confront them. If time and energy permit we may be able to discuss such issues as neural networks, general factor analysis, clustering techniques, maximizing graphical presentation of data, and signal processing.

We will begin with the concept of MND.

Think of a cigar, suspended in space. If you can't think of a cigar suspended in space, look at Fig. 1-1A. Now imagine the cigar filled with little flecks of stuff, as in Fig. 1-1B (it doesn't really matter what the stuff is; mathematics has never concerned itself with such unimportant details). Imagine the flecks being more densely packed toward the middle of the cigar. Now imagine a swarm of gnats surrounding the cigar; if they're attracted to the cigar then naturally there will be fewer of them far away from the cigar than close to it (Fig. 1-1C). Next take away the cigar and just leave the flecks and the gnats. By this time, of course, you should realize that the flecks and the gnats are really the same thing and are neither flecks nor gnats but simply abstract representations of points in space. What is left looks like Fig. 1-1D.

**Fig. 1-1 **Development of the concept of the multivariate normal distribution (MND) (this one shown having three dimensions)—see text for details. The density of points along a cross-section of the distribution in any direction is also an MND of lower dimension.

Fig. 1-1D, of course, is simply a pictorial/graphical representation of what an MND would look like, if you could see it. Furthermore, it is a representation of only one particular MND. First of all, this particular MND is a three-dimensional MND. A two-dimensional MND would be represented by points in a plane, and a one-dimensional MND is simply the ordinary normal distribution that we have come to know and love [16]. An MND can have any number of dimensions; unfortunately, we humans cannot visualize anything with more than three dimensions, so for our examples we are limited to such pictures. Also, the MND depicted has a particular shape and orientation. In general, an MND can have a variety of shapes and orientations, depending upon the dispersion of the data along the different axes. Thus, for example, it would not be uncommon for the dispersion along two of the axes to be equal and independent. In this case, which represents one limiting situation, an appropriate cross-section of the MND would be circular rather than elliptical. Another limiting situation, by the way, is for two or more of the variables to be perfectly correlated, in which case the data would lie along a straight line or a plane (or a hyperplane, as the corresponding higher-dimensional figure is called).

Each point in the MND can be projected onto the planes defined by each pair of the axes of the coordinate system. For example, Fig. 1-2 shows the projection of the data onto the plane at the bottom of the coordinate system. There it forms a two-dimensional MND, which is characterized by several parameters; the two-dimensional MND is the prototype for all MNDs of higher dimension, and its properties are the key defining characteristics of an MND. First of all, the data contributing to an MND have a normal distribution along any of the axes of the MND. We have discussed the normal distribution previously [16] and have seen that it is described by the expression:

**(1-1)**

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^{2}/2\sigma^{2}}$$

**Fig. 1-2 **Projecting each point of the three-dimensional MND onto any of the planes defined by two axes of the coordinate system (or, more generally, any plane passing through the coordinate system) results in the projected points being represented by a two-dimensional MND. The correlation coefficients for the projections in all planes are needed to fully describe the original MND.

The MND can be mathematically described by an expression that is similar in form, but has the characteristic that each of the individual parts of the expression represents the multivariate analog of the corresponding part of Eq. (1-1). In Eq. (1-1), *μ* represents the mean of the data; in the multivariate case each variable has its own mean, so the mean of the MND must be represented by a vector of means (each element of the vector being an individual mean).

If we project the MND onto each axis of the coordinate system containing the MND, then, as stated earlier, these projections of the data will be distributed as an ordinary normal distribution, as shown in Fig. 1-3. This distribution will itself then have a standard deviation (SD), so that another defining characteristic of the MND is the SD of the projection of the MND along each axis. This must also then be represented by a vector.

**Fig. 1-3 **Projecting the points onto a line results in a point density that is our familiar univariate normal distribution.

The final key point to note about the MND, which can also be seen from Fig. 1-2, is the fact that, when the MND is projected onto the plane defined by any two axes of the coordinate system, the data may show some correlation (as does the data in Fig. 1-2). In fact, the projection onto any of the planes defined by two of the axes will have some value for the correlation coefficient between the corresponding pair of variables. The amount of correlation between projections along any pair of axes can vary from zero, in which case the data would lie in a circular blob, to unity, in which case the data would all lie exactly on a straight line.

Since each pair of axes defines another plane, many such projections may be possible, depending on the number of dimensions in which the MND exists. Indeed, every possible pair of axes in the coordinate system defines such a plane. As we have noted, we mere mortals can't visualize more than three dimensions, so our examples and diagrams will be limited to showing data in three dimensions or fewer, but the mathematical descriptions can be extended, with all generality, to as high a dimensionality as might be needed. Thus, the full description of the MND must include all the correlations of the data between every pair of axes. This is conventionally done by creating what is known as the correlation matrix (depending on the formalisms used, the term covariance matrix may apply instead, the difference being how the matrices are scaled). This matrix is a square matrix in which any given row or column corresponds to a variable, and the individual positions (i.e., the *m*, *n* position, for example, where *m* and *n* represent indices of the variables) in the matrix represent the correlation between the variable represented by the row and the variable represented by the column. In actuality, for mathematical reasons, the correlation itself is not used; rather, the related quantity called the covariance replaces the correlation coefficient in the matrix. The elements of the matrix that lie along what's called the main diagonal (i.e., where the column and row numbers are the same) are then the variances (the square of the SD, which shows that there's a rather close relationship between the SD and the correlation) of the data. This matrix is thus called the variance-covariance matrix, and sometimes just the covariance matrix for simplicity.

Since it is necessary to represent the various quantities by vectors and matrices, the operations for the MND that correspond to operations using the univariate (simple) normal distribution must be matrix operations. Discussion of matrix operations is beyond the scope of this chapter, but for now it suffices to note that the simple arithmetic operations of addition, subtraction, multiplication, and division all have their matrix counterparts. In addition, certain matrix operations exist which do not have counterparts in simple arithmetic. The beauty of the scheme is that many manipulations of data using matrix operations can be done using the same formalism as for simple arithmetic, since, when they are expressed in matrix notation, they follow corresponding rules, whereby for simple arithmetic:

$$a + b = b + a \qquad ab = ba \qquad (a + b) + c = a + (b + c) \qquad a(bc) = (ab)c$$

these relationships hold.

However, there is one major exception to this: the commutative rule does not hold for matrix multiplication:

$$[A][B] \neq [B][A] \quad \text{(in general)}$$

That is because of the way matrix multiplication is defined (which we will not describe here, but will do so in a later chapter of this book); changing the order of appearance of the two matrices to be multiplied may produce a different matrix as the answer. Thus, instead of *f*(*x*) and the expression for it in Eq. (1-1) describing the simple normal distribution, the MND is described by the corresponding multivariate expression:

**(1-2)**

$$f(X) = \frac{1}{(2\pi)^{p/2}\left|A\right|^{1/2}}\,e^{-\frac{1}{2}(X-K)'A^{-1}(X-K)}$$

where now the capital letters *X* and *K* represent vectors (the data and the means, respectively), *p* represents the number of dimensions, and the capital letter *A* represents the covariance matrix. This is, by the way, a somewhat straightforward extension of the definition (although it may not seem so at first glance), because for the simple univariate case the matrix *A* degenerates into the single number *σ*², *X* becomes *x*, and *K* becomes *μ*.
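As a numerical cross-check of the density expression given above as Eq. (1-2) (our reconstruction), the following Python sketch evaluates it directly and compares the value with SciPy's reference implementation; the point, mean vector, and covariance matrix are arbitrary illustrative values.

```python
# Evaluate the MND density directly and compare with scipy's implementation.
import numpy as np
from scipy.stats import multivariate_normal

K = np.array([1.0, 2.0])      # mean vector
A = np.array([[2.0, 0.6],     # covariance matrix
              [0.6, 1.0]])
x = np.array([1.5, 1.5])      # the point at which to evaluate the density

p = len(K)
diff = x - K
norm_const = 1.0 / np.sqrt((2 * np.pi) ** p * np.linalg.det(A))
density = norm_const * np.exp(-0.5 * diff @ np.linalg.inv(A) @ diff)

print(density)
print(multivariate_normal(mean=K, cov=A).pdf(x))  # agrees with the line above
```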

Most texts dealing with multivariate statistics have a section on the MND, but a particularly good one, if a bit heavy on the math, is the discussion by Anderson [17]. To help with this a bit, our next few chapters will include a review of some of the elementary concepts of matrix algebra.

[1] Mark H., Workman J. *Statistics in Spectroscopy. *Boston: Academic Press; 1991.

[2] Haaland D., Thomas E. Partial least squares methods for spectral analysis. 1. Relation to other quantitative calibration methods and the extraction of qualitative information. *Anal. Chem. *1988;60:1193–1202.

[3] Haaland D., Thomas E. Partial least squares methods for spectral analysis. 2. Application to simulated and glass spectral data. *Anal. Chem. *1988;60:1202–1208.

[4] Naes T., Martens H. *Multivariate Calibration. *New York: John Wiley & Sons; 1989.

[5] Mark H. *Principles and Practice of Spectroscopic Calibration. *New York: John Wiley & Sons; 1991.

[6] Naes T., Isaksson T. The chemometric space. *NIR News. *1992.

[7] Bonate P.L. Concepts in calibration theory. *LC/GC. *1992;10(4):310–314.

[8] Bonate P.L. Concepts in calibration theory. *LC/GC. *1992;10(5):378–379.

[9] Bonate P.L. Concepts in calibration theory. *LC/GC. *1992;10(6):448–450.

[10] Bonate P.L. Concepts in calibration theory. *LC/GC. *1992;10(7):531–532.

[11] Miller J.N. Calibration methods in spectroscopy. *Spectrosc. Int. *1991;3(2):42–44.

[12] Miller J.N. Calibration methods in spectroscopy. *Spectrosc. Int. *1991;3(4):41–43.

[13] Miller J.N. Calibration methods in spectroscopy. *Spectrosc. Int. *1991;3(5):43–46.

[14] Miller J.N. Calibration methods in spectroscopy. *Spectrosc. Int. *1991;3(6):45–47.

[15] Miller J.N. Calibration methods in spectroscopy. *Spectrosc. Int. *1992;4(1):41–43.

[16] Mark H., Workman J. Statistics in spectroscopy—part 6—the normal distribution. *Spectroscopy. *1987;2(9):37–44.

[17] Anderson T.W. *An Introduction to Multivariate Statistical Analysis. *New York: Wiley; 1958.

**Further Reading **

[1] Donald B. *Personal communication.* 1993.

☆ Adapted from *Spectroscopy* 8(4) (1993) 12–15.

**Section 1 **

Elementary Matrix Algebra

**Chapter 2 **

**Elementary Matrix Algebra: Part 1 **

This chapter introduces the basic concepts involved with matrix algebra. Topics covered include matrix notation, matrix arithmetic operations, inverse, and transpose of a matrix. Elementary operations for linear equations are also described. Mathematical examples and notations are given for all concepts covered.

Matrix operations; Linear equations; Transpose of a matrix; Inverse of a matrix

**Matrix Operations **

**Matrix Addition **

**Subtraction **

**Matrix Multiplication **

**Matrix Division **

**Inverse of a Matrix **

**Transpose of a Matrix **

**Elementary Operations for Linear Equations **

**The Solution **

**Summary **

**References **

You may recall that in the first chapter, we promised that a review of elementary matrix algebra would be forthcoming, so the next several chapters will cover this topic all the way from the very basics to the more advanced spectroscopic subjects.

You may already have discovered that the term *matrix* is a fanciful name for a table or list. If you have recently made a grocery list, you have created an *n* × 1 matrix or, in more correct nomenclature, an **X***n* × 1 matrix, where *n* is the number of items you would like to buy (rows) and 1 is the number of columns. If you have become a highly sophisticated shopper and have made lists consisting of one column for Store A and a second one for Store B, you have ascended into the world of the **X***n* × 2 matrix. If you include the price of each item and put brackets around the entire column(s) of prices, you will have created a numerical matrix.

By definition, a numerical matrix is a rectangular array of numbers (termed *elements*) enclosed by square brackets []. Matrices can be used to organize information, such as size versus cost in a grocery department, or they may be used to simplify the problems associated with systems or groups of linear equations. Later in this chapter we will introduce the operations involved for linear equations (see Table 2-1 for common symbols used).

**Table 2-1** Common symbols used for matrix operations, parameters, and matrix names.ᵃ

ᵃ Where **X** or *x* is represented by any letter; generally those are listed under parameters or matrix names in this table.

The symbols below represent a matrix:

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}$$

Note that *a*1 and *a*2 are in column 1, *b*1 and *b*2 are in column 2, *a*1 and *b*1 are in row 1, and *a*2 and *b*2 are in row 2.

The above matrix is a 2 × 2 (rows × columns) matrix. The first number indicates the number of rows, and the second indicates the number of columns. Matrices can be denoted as **X**2 × 2 using a capital, boldface letter with the row and column subscript.

The following illustrations are useful to describe very basic matrix operations. Discussions covering more advanced matrix operations will be included in later chapters, but for now, just review these elementary operations.

To add two matrices, the following operation is performed:

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} + \begin{bmatrix} c_1 & d_1 \\ c_2 & d_2 \end{bmatrix} = \begin{bmatrix} a_1 + c_1 & b_1 + d_1 \\ a_2 + c_2 & b_2 + d_2 \end{bmatrix}$$

To add larger matrices, the same element-by-element operation applies, provided the two matrices have the same dimensions.

For subtraction, use the following operation:

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} - \begin{bmatrix} c_1 & d_1 \\ c_2 & d_2 \end{bmatrix} = \begin{bmatrix} a_1 - c_1 & b_1 - d_1 \\ a_2 - c_2 & b_2 - d_2 \end{bmatrix}$$

The same operation holds true, element by element, for larger matrices, and so on.

To multiply a scalar by a matrix (or a vector), we use

$$A\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} = \begin{bmatrix} Aa_1 & Ab_1 \\ Aa_2 & Ab_2 \end{bmatrix}$$

where *A* is a scalar value.

The product of two matrices (or vectors) is given by

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} c_1 & d_1 \\ c_2 & d_2 \end{bmatrix} = \begin{bmatrix} a_1c_1 + b_1c_2 & a_1d_1 + b_1d_2 \\ a_2c_1 + b_2c_2 & a_2d_1 + b_2d_2 \end{bmatrix}$$

In another example, in which an **X**1 × 2 matrix is multiplied by an **X**2 × 1 matrix, we have

$$\begin{bmatrix} a_1 & b_1 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} a_1c_1 + b_1c_2 \end{bmatrix}$$

denoted by **X**1 × 2 · **X**2 × 1 in matrix notation.

Division of a matrix by a scalar is accomplished as:

$$\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix} \big/ A = \begin{bmatrix} a_1/A & b_1/A \\ a_2/A & b_2/A \end{bmatrix}$$

where *A* is a scalar value.

The inverse of a matrix is the conceptual equivalent of its reciprocal. Therefore, if we denote our matrix by **X**, then the inverse of **X** is denoted as **X**−1 and the following relationship holds:

$$[\mathbf{X}][\mathbf{X}]^{-1} = [\mathbf{X}]^{-1}[\mathbf{X}] = [\mathbf{1}]$$

where [**1**] is an identity matrix. Only square matrices, which have an equal number of rows and columns (e.g., 2 × 2, 3 × 3, and 4 × 4), have inverses. Several computer packages provide the algorithms for calculating the inverse of square matrices. The identity matrix for a 2 × 2 matrix is

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and for a 3 × 3 matrix, the identity matrix is

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

and so on. Note that the diagonal is always composed of ones for the identity matrix, and all other values are zero. To summarize, by definition:

$$[\mathbf{X}][\mathbf{X}]^{-1} = [\mathbf{1}]$$

The basic methods for calculating **X**− 1 will be addressed in the next chapter.

The transpose of a matrix is denoted by **X**′ (or, alternatively, by **X***T*). For example, for the matrix:

$$[\mathbf{X}] = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}, \qquad [\mathbf{X}]' = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}$$

The first column of [**X**] becomes the first row of [**X**]′; the second column of [**X**] becomes the second row of [**X**]′; the third column of [**X**] becomes the third row of [**X**]′; and so on.
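These elementary operations map directly onto array operations in numerical software. A minimal Python/numpy illustration (ours, with arbitrary example matrices):

```python
# Elementary matrix operations with numpy; the values are arbitrary examples.
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Y = np.array([[5.0, 6.0],
              [7.0, 8.0]])

print(X + Y)                 # element-by-element addition
print(X - Y)                 # element-by-element subtraction
print(3 * X)                 # multiplication by a scalar
print(X @ Y)                 # matrix product
print(X / 3)                 # division by a scalar
X_inv = np.linalg.inv(X)     # inverse (square, nonsingular matrices only)
print(X @ X_inv)             # recovers the identity matrix, within rounding
print(X.T)                   # transpose: columns become rows
```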

To solve problems involving calibration equations using multivariate linear models, we need to be able to perform elementary operations on sets or systems of linear equations. So before using our newly discovered powers of matrix algebra, let us solve a problem using the algebra many of us learned very early in life.

The elementary operations used for manipulating linear equations include three simple rules [1,2]:

•Equations can be listed in any order for convenience and organizational purposes.

•Any equation may be multiplied by any real number other than zero.

•Any equation in a series of equations can be replaced by the sum of itself and any other equation in the system. As an example, we can illustrate these operations using the three equations below, which form what is termed an *equation system* or simply a *system* (Eqs. 2-1–2-3):

*(2-1)*  $a_1 + b_1 = -2$

*(2-2)*  $4a_1 + 2b_1 + c_1 = 6$

*(2-3)*  $6a_1 - 2b_1 - 4c_1 = 14$

To solve for this system of three equations, we begin by following the three elementary operations rules stated earlier:

•We can rearrange the equations in any order. In our case, the equations happen to be in a useful order.

•We decide to multiply Eq. (2-1) by a factor such that the coefficients of *a*1 are of opposite sign and of the same absolute value for Eqs. (2-1), (2-2). Therefore, we multiply Eq. (2-1) by −4 to yield

*(2-4)*  $-4a_1 - 4b_1 = 8$

•We can eliminate *a*1 in the first and the second equations by adding Eqs. (2-4), (2-2) to give Eq. (2-5):

*(2-5)*  $-2b_1 + c_1 = 14$

and we bring Eq. (2-1) back into the system by dividing Eq. (2-4) by −4 to get

*(2-6)*  $a_1 + b_1 = -2$

*(2-7)*  $-2b_1 + c_1 = 14$

*(2-8)*  $6a_1 - 2b_1 - 4c_1 = 14$

Now, to eliminate the *a*1 term in Eqs. (2-6), (2-8), we multiply Eq. (2-6) by −6 to yield

*(2-9)*  $-6a_1 - 6b_1 = 12$

Then we add Eq. (2-9) to Eq. (2-8):

*(2-10)*  $-8b_1 - 4c_1 = 26$

Now we bring back Eq. (2-6) in its original form by dividing Eq. (2-9) by −6, and our system of equations looks like this:

*(2-11)*  $a_1 + b_1 = -2$

*(2-12)*  $-2b_1 + c_1 = 14$

*(2-13)*  $-8b_1 - 4c_1 = 26$

We can eliminate the *b*1 term from Eqs. (2-12), (2-13) by multiplying Eq. (2-12) by −8 and Eq. (2-13) by 2 to obtain

*(2-14)*  $16b_1 - 8c_1 = -112$

*(2-15)*  $-16b_1 - 8c_1 = 52$

Adding these equations, we find

*(2-16)*  $-16c_1 = -60$

Restore Eq. (2-7) by dividing Eq. (2-14) by −8 to yield

*(2-17)*  $a_1 + b_1 = -2$

*(2-18)*  $-2b_1 + c_1 = 14$

*(2-19)*  $-16c_1 = -60$

Solving for *c*1, we find *c*1 = (− 60/ − 16) = 3.75.

Substituting *c*1 into Eq. (2-18), we obtain −2*b*1 + 3.75 = 14.

Solving this for *b*1, we find *b*1 = − 5.13.

Substituting *b*1 into Eq. (2-17), we find *a*1 + (−5.13) = −2.

Solving this for *a*1, we find *a*1 = 3.13.

Finally:

$$a_1 = 3.13, \qquad b_1 = -5.13, \qquad c_1 = 3.75$$

A system of equations in which the first unknown is missing from all equations after the first, the second unknown is missing from all equations after the second, and so on, is said to be in *echelon* form. Every equation system comprising linear equations can be brought into echelon form by using elementary algebraic operations. The use of *augmented* matrices can accomplish the task of solving the equation system just illustrated.
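Readers who want to check the arithmetic can hand the same system to a linear-algebra routine. A minimal Python sketch, using the coefficients of Eqs. (2-1)–(2-3) as reconstructed above:

```python
# Solve the worked three-equation system numerically.
import numpy as np

A = np.array([[1.0, 1.0, 0.0],     #  a1 +  b1       = -2
              [4.0, 2.0, 1.0],     # 4a1 + 2b1 +  c1 =  6
              [6.0, -2.0, -4.0]])  # 6a1 - 2b1 - 4c1 = 14
rhs = np.array([-2.0, 6.0, 14.0])

print(np.linalg.solve(A, rhs))     # [ 3.125 -5.125  3.75 ]
```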

For our previous example, the original equations:

*(2-20)*  $a_1 + b_1 = -2$

*(2-21)*  $4a_1 + 2b_1 + c_1 = 6$

*(2-22)*  $6a_1 - 2b_1 - 4c_1 = 14$

can be written in augmented matrix form as:

*(2-23)*

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 4 & 2 & 1 & 6 \\ 6 & -2 & -4 & 14 \end{array}\right]$$

The echelon form of the equations can also be put into matrix form as follows.

Echelon form:

*(2-24)*  $a_1 + b_1 = -2$

*(2-25)*  $-2b_1 + c_1 = 14$

*(2-26)*  $-16c_1 = -60$

Matrix form:

*(2-27)*

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & 0 & -16 & -60 \end{array}\right]$$

In this chapter, we have used elementary operations for linear equations to solve a problem. The three rules listed for these operations have a parallel set of three rules used for elementary matrix operations on linear equations. In our next chapter, we will explore the rules for solving a system of linear equations by using matrix techniques.

[1] Kowalski B.R. *Recommendations to IUPAC Chemometrics Society.* Seattle, WA: Laboratory for Chemometrics, Department of Chemistry, BG-10, University of Washington; 1985. p. 1–2.

[2] Britton J.R., Bello I. *Topics in Contemporary Mathematics.* New York: Harper & Row; 1984. p. 408–457.

**Chapter 3 **

**Elementary Matrix Algebra: Part 2 **

This chapter continues to explain the basic concepts involved with matrix algebra. Topics covered include elementary matrix operations and calculating the inverse of a matrix. Mathematical examples and notations are described for each concept covered.

Matrix operations; Inverse of a matrix

**Elementary Matrix Operations **

**Calculating the Inverse of a Matrix **

**Summary **

**References **

To solve the set of linear equations introduced in Chapter 2 [1], we will now use elementary matrix operations. These matrix operations have a set of rules which parallel the rules used for elementary algebraic operations when solving systems of linear equations. The rules for elementary matrix operations are as follows [2]:

1.Rows can be listed in any order for convenience or organizational purposes.

2.All elements within a row may be multiplied by any real number other than zero.

3.Any row can be replaced by the element-by-element sum of itself and any other row.

To solve a system of equations, our first step is to put zeros into the second and the third rows of the first column, and into the third row of the second column. For our exercise we will bring forward Eqs. (2-1)–(2-3) as (Eq. set 3-1):

**(3-1)**

$$a_1 + b_1 = -2 \qquad 4a_1 + 2b_1 + c_1 = 6 \qquad 6a_1 - 2b_1 - 4c_1 = 14$$

We can put the above *set* or *system* of equations in matrix notation as:

$$\begin{bmatrix} 1 & 1 & 0 \\ 4 & 2 & 1 \\ 6 & -2 & -4 \end{bmatrix}\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} = \begin{bmatrix} -2 \\ 6 \\ 14 \end{bmatrix}$$

and so:

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 4 & 2 & 1 \\ 6 & -2 & -4 \end{bmatrix}, \qquad C = \begin{bmatrix} -2 \\ 6 \\ 14 \end{bmatrix}$$

Matrix *A* is termed the *matrix of the equation system*.

The matrix formed by [*A* | *C*] is termed the *augmented matrix*.

For this problem the augmented matrix is given as:

$$[A \mid C] = \left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 4 & 2 & 1 & 6 \\ 6 & -2 & -4 & 14 \end{array}\right]$$

Now, if we were to find a set of equations with zeros in the second and the third rows of the first column, and in the third row of the second column, we could use Eqs. (2-17)–(2-19) [1], which look like (Eq. set 3-2):

**(3-2)**

$$a_1 + b_1 = -2 \qquad -2b_1 + c_1 = 14 \qquad -16c_1 = -60$$

We can rewrite these equations in matrix notation as:

$$\begin{bmatrix} 1 & 1 & 0 \\ 0 & -2 & 1 \\ 0 & 0 & -16 \end{bmatrix}\begin{bmatrix} a_1 \\ b_1 \\ c_1 \end{bmatrix} = \begin{bmatrix} -2 \\ 14 \\ -60 \end{bmatrix}$$

and the augmented form of the above matrices is written as:

$$[G \mid P] = \left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & 0 & -16 & -60 \end{array}\right]$$

We can reduce or simplify the third row in [*G* | *P*] by following Rule 2 of the basic matrix operations previously mentioned. As such, we can multiply row III in [*G* | *P*] by 1/2 to give

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & 0 & -8 & -30 \end{array}\right]$$

We can use *elementary row operations*, also known as *elementary matrix operations*, to obtain matrix [*G* | *P*] from [*A* | *C*]. By the way, if we can achieve [*G* | *P*] from [*A* | *C*] using these operations, the matrices are termed *row equivalent*, denoted by **X**1 ~ **X**2. To begin an illustration of the use of elementary matrix operations, let us use the following example. Our original *A* matrix can be manipulated to yield zeros in rows II and III of column I by a series of row operations. The example below illustrates this:

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 4 & 2 & 1 & 6 \\ 6 & -2 & -4 & 14 \end{array}\right] \sim \left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & -8 & -4 & 26 \end{array}\right]$$

The left-hand augmented matrix is converted to the right-hand augmented matrix by (II/II − 4I), or row II is replaced by row II minus 4 times row I; then (III/III − 6I), or row III is replaced by row III minus 6 times row I.

To complete the row operations to yield [*G* | *P*] from [*A* | *C*], we write

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & -8 & -4 & 26 \end{array}\right] \sim \left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & 0 & -8 & -30 \end{array}\right]$$

This is accomplished by (III/III − 4II), or row III is replaced by row III minus 4 times row II.

As we have just shown, using two series of row operations we have

$$\left[\begin{array}{ccc|c} 1 & 1 & 0 & -2 \\ 0 & -2 & 1 & 14 \\ 0 & 0 & -8 & -30 \end{array}\right]$$

which is *equivalent* to Eqs. (2-17)–(2-19) earlier; this is shown here as (Eq. set 3-3).

**(3-3)**

$$a_1 + b_1 = -2 \qquad -2b_1 + c_1 = 14 \qquad -8c_1 = -30$$

Now, solving for *c*1 = (−30/−8) = 3.75; substituting *c*1 into Eq. (2-18), we find −2*b*1 + 3.75 = 14, therefore *b*1 = −5.13; substituting *b*1 into Eq. (2-17), we find *a*1 + (−5.13) = −2, therefore *a*1 = 3.13, and so:

$$a_1 = 3.13, \qquad b_1 = -5.13, \qquad c_1 = 3.75$$

Thus, matrix operations provide a simplified method for solving equation systems as compared to elementary algebraic operations for linear equations.
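The same row operations can be scripted directly. A short Python sketch (ours) applies the (II/II − 4I), (III/III − 6I), and (III/III − 4II) operations to the augmented matrix and back-substitutes:

```python
# Row-reduce the augmented matrix [A | C] with elementary row operations.
import numpy as np

aug = np.array([[1.0, 1.0, 0.0, -2.0],
                [4.0, 2.0, 1.0, 6.0],
                [6.0, -2.0, -4.0, 14.0]])

aug[1] -= 4 * aug[0]   # (II/II - 4I)
aug[2] -= 6 * aug[0]   # (III/III - 6I)
aug[2] -= 4 * aug[1]   # (III/III - 4II)
print(aug)             # the echelon form [G | P]; third row is 0 0 -8 -30

# Back-substitution recovers the unknowns.
c1 = aug[2, 3] / aug[2, 2]                     # -30 / -8 = 3.75
b1 = (aug[1, 3] - aug[1, 2] * c1) / aug[1, 1]  # (14 - 3.75) / -2 = -5.125
a1 = aug[0, 3] - aug[0, 1] * b1                # -2 + 5.125 = 3.125
print(a1, b1, c1)
```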

In Chapter 2, we promised to show the steps involved in taking the inverse of a matrix. Given a 2 × 2 matrix [*X*]2 × 2, how is the inverse calculated? We can ask the question another way: "What matrix, when multiplied by a given matrix [*X*]*r* × *c*, will give the identity matrix ([**I**])?" In matrix form, we may write a specific example as:

$$\begin{bmatrix} -2 & 1 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} c_1 & d_1 \\ c_2 & d_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Therefore:

$$[A] \times [B] = [\mathbf{I}]$$

or, stated in matrix notation, [*A*] × [*B*] = [**I**], where [*B*] is the inverse matrix of [*A*], and [**I**] is the identity matrix.

By multiplying [*A*] × [*B*] we can calculate the two basic equation systems to be used in solving this problem as:

$$-2c_1 + c_2 = 1 \qquad\qquad -2d_1 + d_2 = 0$$

$$-3c_1 + 2c_2 = 0 \qquad\qquad -3d_1 + 2d_2 = 1$$

The augmented matrices are denoted as:

$$\left[\begin{array}{cc|c} -2 & 1 & 1 \\ -3 & 2 & 0 \end{array}\right] \qquad\qquad \left[\begin{array}{cc|c} -2 & 1 & 0 \\ -3 & 2 & 1 \end{array}\right]$$

The first (preceding) matrix is reduced to echelon form (a zero in the second row of column one) by

$$\left[\begin{array}{cc|c} -2 & 1 & 1 \\ -3 & 2 & 0 \end{array}\right] \sim \left[\begin{array}{cc|c} -2 & 1 & 1 \\ 0 & -1 & 3 \end{array}\right]$$

The row operation is (II/3I − 2II), or row II is replaced by three times row I minus two times row II. The next steps are as follows:

$$\left[\begin{array}{cc|c} -2 & 1 & 1 \\ 0 & -1 & 3 \end{array}\right] \sim \left[\begin{array}{cc|c} -2 & 0 & 4 \\ 0 & -1 & 3 \end{array}\right] \sim \left[\begin{array}{cc|c} 1 & 0 & -2 \\ 0 & -1 & 3 \end{array}\right]$$

with row operations as …(I/I + II) and …(I/−1/2I). The same operations applied to the second augmented matrix yield *d*1 and *d*2.

Thus, *c*1 = −2, *c*2 = −3, *d*1 = 1, and *d*2 = 2. So *B* = *A*−1 (the inverse of *A*) and

$$[B] = [A]^{-1} = \begin{bmatrix} -2 & 1 \\ -3 & 2 \end{bmatrix}$$

So now we check our work by multiplying [*A*] · [*A*]−1 as follows:

$$\begin{bmatrix} -2 & 1 \\ -3 & 2 \end{bmatrix}\begin{bmatrix} -2 & 1 \\ -3 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

By coincidence, we have found a matrix which, when multiplied by itself, gives the identity matrix; saying it another way, it is its own inverse. Of course, that does not generally happen; a matrix and its inverse are usually different.
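A quick numerical check of the self-inverse property, using the values *c*1 = −2, *c*2 = −3, *d*1 = 1, and *d*2 = 2 recovered in the text:

```python
# Verify that the 2 x 2 example matrix is its own inverse.
import numpy as np

A = np.array([[-2.0, 1.0],
              [-3.0, 2.0]])

print(np.linalg.inv(A))  # equals A itself
print(A @ A)             # the 2 x 2 identity matrix
```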

Hopefully Chapters 1 and 2 have refreshed your memory of early studies in matrix algebra. In this chapter, we have tried to review the basic steps used to solve a system of linear equations using elementary matrix algebra. In addition, basic row operations were used to calculate the inverse of a matrix. In the next chapter, we will address the matrix nomenclature used for a simple case of MLR.

[1] Workman Jr. J., Mark H. Chemometrics in spectroscopy: elementary matrix algebra, part 1. *Spectroscopy.* 1993;8(7):16–19.

[2] Britton J.R., Bello I. *Topics in Contemporary Mathematics.* New York: Harper & Row; 1984. p. 408–457.

**Section 2 **

Matrix Algebra and Multiple Linear Regression

**Chapter 4 **

**Matrix Algebra and Multiple Linear Regression: Part 1 **

This chapter continues with the basic concepts involved with matrix algebra. Topics covered include MLR, quasialgebraic operations, and the least squares method in both matrix and summation notation. Mathematical examples and notations are given for all concepts covered.

Matrix operations; Multiple linear regression; Quasialgebraic operations; Least squares method

**Quasialgebraic Operations **

**Multiple Linear Regression **

**The Least Squares Method **

**References **

In a previous chapter we noted that by augmenting the matrix of coefficients with a unit matrix (i.e., one that has all its members equal to zero except on the main diagonal, where the members of the matrix equal unity), we could arrive at the solution to the simultaneous equations that were presented. Since simultaneous equations are, in one sense, a special case of regression (i.e., the case where there are no degrees of freedom for error), it is still appropriate to discuss a few odds and ends that were left dangling.

We started in the previous chapter with the set of simultaneous equations:

*(4-1a)*  $1a + 1b + 0c = -2$

*(4-1b)*  $4a + 2b + 1c = 6$

*(4-1c)*  $6a - 2b - 4c = 14$

(where we now leave the subscripts off the variables for simplicity, with no loss of generality for our current purposes). Also note that here we write all the coefficients out explicitly, even when the ones and zeroes do not necessarily appear in the original equations; this is so that they will not be inadvertently left out of the matrix expressions (where the place-filling function must be performed). We noted that we could express these equations in matrix notation as:

$$\begin{bmatrix} 1 & 1 & 0 \\ 4 & 2 & 1 \\ 6 & -2 & -4 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} -2 \\ 6 \\ 14 \end{bmatrix}$$

where the equations then take the matrix form:

*(4-2)*  $[A][B] = [C]$

The question here is, how did we get from Eqs. (4-1a) through (4-1c) to (4-2)? The answer is that it is not at all obvious, even in such a simple and straightforward case, how to break up a group of algebraic equations into their equivalent matrix expression. It turns out, however, that going in the other direction is often much simpler and more straightforward. Thus, when setting up matrix expressions, it is often desirable to run a check on the work to verify that the matrix expression indeed correctly represents the algebraic expression of interest. In the current case, this can be done very simply by carrying out the matrix multiplication indicated on the left-hand side of Eq. (4-2).

Thus, expanding the matrix expression [*A*][*B*] into its full representation, we obtain

*(4-3)*

$$[A][B] = \begin{bmatrix} 1 & 1 & 0 \\ 4 & 2 & 1 \\ 6 & -2 & -4 \end{bmatrix}\begin{bmatrix} a \\ b \\ c \end{bmatrix}$$

From our previous chapter defining the elementary matrix operations, we recall the operation for multiplying two matrices: the *i*, *j *element of the result matrix (where *i *and *j *represent the row and the column of an element in the matrix, respectively) is the sum of cross-products of the *i*th row of the first matrix and the *j*th column of the second matrix (this is the reason that the order of multiplying matrices depends upon the order of appearance of the matrices—if the indicated *i*th row and *j*th column do not have the same number of elements, the matrices cannot be multiplied).

Now let us apply this definition to the pair of matrices listed earlier. The first matrix ([*A*]) has three rows and three columns. The second matrix ([*B*]) has three rows and one column. Since each row of [*A*] has three elements, and the single column of [*B*] has three elements, matrix multiplication is possible. The resulting matrix will have three rows, each row resulting from one of the rows of matrix [*A*], and one column, corresponding to the single column in the matrix [*B*].

Thus the first row of the resulting matrix will have the single element resulting from the sum of products of the first row of [*A*] times the column of [*B*], which will be

*(4-4)*  $1 \times a + 1 \times b + 0 \times c = a + b$

Similarly, the second row of the resulting matrix will have the single element resulting from the sum of products of the second row of [*A*] times the column of [*B*], which will be

*(4-5)*  $4 \times a + 2 \times b + 1 \times c = 4a + 2b + c$

and the third row of the resulting matrix will have the single element resulting from the sum of products of the third row of [*A*] times the column of [*B*], which will be

*(4-6)*  $6 \times a + (-2) \times b + (-4) \times c$

or, simplifying:

*(4-7)*  $6a - 2b - 4c$

The entire matrix product, then, is

$$[A][B] = \begin{bmatrix} a + b \\ 4a + 2b + c \\ 6a - 2b - 4c \end{bmatrix}$$

Eqs. (4-4)–(4-6) represent the three elements of the matrix product of [*A*] and [*B*]. Note that each row of this resulting matrix contains only one element, even though each of these elements is the result of a fairly extensive sequence of arithmetic operations. Eqs. (4-4), (4-5), (4-7), however, represent the symbolism you would normally expect to see when looking at the set of simultaneous equations that these matrix expressions replace. Note further that this matrix product [*A*][*B*] is the same as the entire left-hand side of the original set of simultaneous equations that we originally set out to solve.

Thus we have shown that these matrix expressions can be readily verified through straightforward application of the basic matrix operations, thus clearing up one of the loose ends we had left.
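The multiplication rule just applied is easy to express in code. The following Python sketch (ours) implements the sum-of-cross-products definition directly; as a check, multiplying the coefficient matrix by the solution vector found in the earlier chapters reproduces the right-hand sides:

```python
# The i, j element of the product is the sum of cross-products of the
# i-th row of A and the j-th column of B (hence the dimension rule).
def matmul(A, B):
    rows, inner, cols = len(A), len(B), len(B[0])
    assert all(len(row) == inner for row in A), "dimensions must conform"
    C = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(inner))
    return C

A = [[1, 1, 0], [4, 2, 1], [6, -2, -4]]
B = [[3.125], [-5.125], [3.75]]   # the solution vector found earlier
print(matmul(A, B))               # [[-2.0], [6.0], [14.0]]
```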

Another loose end is the relationship between the quasialgebraic expressions that matrix operations are normally written in and the computations that are used to implement those relationships. The computations themselves have been covered at some length in the previous two chapters [1,2]. To relate these to the quasialgebraic operations that matrices are subject to, let us look at those operations a bit more closely.

Thus, considering Eq. (4-2), we note that the matrix expression looks like a simple algebraic expression relating the product of two variables to a third variable, even though in this case the variables in question are entire matrices. In Eq. (4-2), the matrix [*B*] represents the unknown quantities in the original simultaneous equations. If Eq. (4-2) were a simple algebraic equation, clearly the solution would be to divide both sides of this equation by *A*, which would result in the equation *B* = *C*/*A*. Since *A* and *C* both represent known quantities, a simple calculation would give the solution for the unknown *B*.

There is no defined operation of division for matrices. However, a comparable result can be obtained by multiplying both sides of an equation (such as Eq. 4-2) by the inverse of matrix [*A*]. The inverse (e.g., of matrix [*A*]) is conventionally written as [*A*]−1. Thus, the symbolic solution to Eq. (4-2) is generated by multiplying both sides of Eq. (4-2) by [*A*]−1:

**(4-8)**  $[A]^{-1}[A][B] = [A]^{-1}[C]$

There are a couple of key points to note about this operation. The main point is that since the order of appearance of the matrices matters, it is important that the new matrix, the one we are multiplying both sides of the equation by, is placed at the beginning of the expressions on each side of the equation.

The second key point is the accomplishment of a desired goal: on the left-hand side of Eq. (4-8) we have the expression [*A*]−1[*A*]. We noted earlier that the key defining characteristic of the inverse of a matrix is the fact that when it is multiplied by the original matrix (that it is the inverse of), the result is a unit matrix. Thus Eq. (4-8) is equivalent to

**(4-9)**  $[\mathbf{1}][B] = [A]^{-1}[C]$

where [**1**] represents the unit matrix. Since the property of the unit matrix is that when multiplied by any other matrix the result is the same as the other matrix, then [**1**][*B*] = [*B*], and Eq. (4-9) becomes

**(4-10)**  $[B] = [A]^{-1}[C]$

Thus, we have symbolically solved Eq. (4-2) for the unknown matrix [*B*], the elements of which are the unknown variables of the original set of simultaneous equations. Performing the matrix multiplication of [*A*]−1[*C*] will then provide the values of these unknown variables.

Let us examine these symbolic transformations with a view toward seeing how they translate into the required arithmetic operations that will provide the answers to the original simultaneous equations. There are two key operations involved. The first is the inversion of the matrix, to provide the inverse matrix. This is an extremely intensive computational task, so much so that it is in general done only on computers, except in the simplest cases for pedagogical purposes, such as we did in our previous chapter.

In this regard we are reminded of an old, and somewhat famous, cartoon, where two obviously professor-type characters are staring at a large blackboard. On the left side of the blackboard are a large number of mathematical symbols, obviously representing some complicated and abstruse mathematical derivation. On the right side of the blackboard is a similar set of symbols. In the middle of the blackboard is a large blank space, in the middle of which is written, in big letters: "AND THEN SOME MAGIC HAPPENS," and one of the characters is saying to the other: "I think you need to be a bit more explicit here in step 10."

To some extent, we feel the same way about matrix inversions. The complications and amount of computation involved in actually doing a matrix inversion are enough to make even the most intrepid mathematician/statistician/chemometrician run to the nearest computer with a preprogrammed algorithm for the task. Indeed, there sometimes seem to be just about as many algorithms for performing a matrix inversion as there are people interested in doing them. In most cases, then, this process is in practice treated as a "black box" where some magic happens.

Except for the theoretical mathematician, however, there is usually little interest in being "more explicit," as long as the program gives the right answer. As is our wont, however, our previous chapter worked out the gory details for the simplest possible case, the case of a 2 × 2 matrix. For larger matrices, the amount of computation increases so rapidly with matrix size that even the 3 × 3 matrix is left to the computer to handle.

But how can we tell then if the answer is correct? Well, there is a way, and one that is not too overwhelming. From the definition of the inverse of a matrix, you should obtain a unit matrix if you multiply the inverse of a given matrix by the matrix itself. In our previous chapter [1] we showed this for the 2 × 2 case. For the simultaneous equations at hand, however, the process is only a little more extensive. From the original matrix of coefficients in the simultaneous equations that we are working with, the one called [*A*] earlier, we find that the inverse of this matrix is

**(4-11)**

$$[A]^{-1} = \frac{1}{16}\begin{bmatrix} -6 & 4 & 1 \\ 22 & -4 & -1 \\ -20 & 8 & -2 \end{bmatrix} = \begin{bmatrix} -0.375 & 0.25 & 0.0625 \\ 1.375 & -0.25 & -0.0625 \\ -1.25 & 0.5 & -0.125 \end{bmatrix}$$

How did we find this? Well, we used some of our magic. The details of the computations needed were described in the previous chapter for the 2 × 2 case; we will not even try to go through the computations needed for the 3 × 3 case we concern ourselves with here.

However, having a set of numbers that purports to be the inverse of a matrix, we can verify whether or not it is the inverse of that matrix: all we need to do is multiply by the original matrix and see if the result is a unit matrix. We have done this for the 2 × 2 matrix in our previous chapter. An exercise for the reader is to verify that the matrix shown in Eq. (4-11) is, in fact, the inverse of the matrix [*A*].

That was the hard part. It now remains to calculate out the expressions shown in Eq. (4-10), to find the final values for the unknowns in the original simultaneous equations. Thus, we need to form the matrix product of [*A*]−1 and [*C*]:

$$[A]^{-1}[C] = \begin{bmatrix} -0.375 & 0.25 & 0.0625 \\ 1.375 & -0.25 & -0.0625 \\ -1.25 & 0.5 & -0.125 \end{bmatrix}\begin{bmatrix} -2 \\ 6 \\ 14 \end{bmatrix}$$

This matrix multiplication is similar to the one we did before: we need to multiply a 3 × 3 matrix by a 3 × 1 matrix; the result will then also have dimensions of three rows and one column. The three rows of this matrix will thus be the result of these computations:

$$-0.375(-2) + 0.25(6) + 0.0625(14) = 3.13$$

$$1.375(-2) - 0.25(6) - 0.0625(14) = -5.13$$

$$-1.25(-2) + 0.5(6) - 0.125(14) = 3.75$$

Thus, in matrix terms, the solution matrix [*B*] is

**(4-14)**

$$[B] = [A]^{-1}[C] = \begin{bmatrix} 3.13 \\ -5.13 \\ 3.75 \end{bmatrix}$$

and this may be compared to the result we obtained algebraically in the last chapter (and found to be identical, within the limits of different roundings used).
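Both the verification of the inverse and the final multiplication take only a few lines numerically. A Python sketch with the coefficient matrix as reconstructed above (in practice, solving the system directly is usually preferred over forming the inverse explicitly):

```python
# Verify the inverse and compute [B] = [A]^-1 [C].
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [4.0, 2.0, 1.0],
              [6.0, -2.0, -4.0]])
C = np.array([-2.0, 6.0, 14.0])

A_inv = np.linalg.inv(A)
print((A_inv @ A).round(10))  # the unit matrix, confirming the inverse
print(A_inv @ C)              # [ 3.125 -5.125  3.75 ]
```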

At first glance it would seem as though this approach has the additional characteristic of requiring fewer computations than our previous method of solving similar equations. However, the computations are exactly the same, but most of them are hidden inside the matrix inversion.

It might also seem that we have been repetitive in our explanation of these simultaneous equations. This is intentional—we are attempting to explicate the relationship between the algebraic approach and the matrix approach to solving the equations. Our first solution (in the previous chapter) was strictly algebraic. Our second solution used matrix terminology and concepts, in addition to explicitly writing out all the arithmetic involved. Our third approach uses symbolic matrix manipulation, substituting numbers only in the last step.

In Chapters 2 and 3, we discussed the rules related to solving systems of linear equations using elementary algebraic manipulation, including simple matrix operations. The past chapters have described the inverse and transpose of a matrix in at least an introductory fashion. In this installment we would like to introduce the concepts of matrix algebra and their relationship to MLR. Let us start with the basic spectroscopic calibration relationship:

$$\text{Concentration} = \beta_0 + \beta_1(\text{Absorbance at } \lambda_1) + \beta_2(\text{Absorbance at } \lambda_2)$$

Also written as:

**(4-15)**  $c_j = \beta_0 + \beta_1 A_{1j} + \beta_2 A_{2j}$

In this example we state that the concentration of an analyte within a sample is a linear combination of two variables. These variables, in our case, are measured in the same units, that is, absorbance units. In this case the concentration is known as the *dependent* variable or response variable, because its magnitude depends on, or responds to, the values of the changes in absorbances at wavelengths 1 and 2. The absorbances are the *x*-variables, referred to as *independent* variables, regressor variables, or predictor variables. Thus, equations such as Eq. (4-15) attempt to explain the relationship between concentration and changes in absorbance. This calibration equation or calibration model is said to be linear because the relationship is a linear combination of multiplier terms, or regression coefficients, as predictors of the concentration (the response or dependent variable). Note that the *β*1 and *β*2 terms are called regression coefficients, multiplier terms, multipliers, or sometimes parameters. The analysis described is referred to as linear regression, least squares, linear least squares, or, most properly, MLR. In more formal notation, we can rewrite Eq. (4-15) as:

**(4-16)**  $E(c_j) = \beta_0 + \beta_1 A_{1j} + \beta_2 A_{2j}$

where *E*(*cj*) is the expected value for the concentration. Note: The difference between *E*(*cj*) and *cj *is the difference between the predicted or expected value *E*(*cj*) and the actual or observed value *cj*. This can be rewritten as:

**(4-17)**  $c_j = E(c_j) + \varepsilon_j$

and

**(4-18)**  $c_j = \beta_0 + \beta_1 A_{1j} + \beta_2 A_{2j} + \varepsilon_j$

where *ɛj* is termed the prediction error, residual error, residual, error, lack-of-fit error, or the unexplained error.

We can also rewrite the equation in matrix form as:

$$\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = \begin{bmatrix} 1 & A_{11} & A_{21} \\ 1 & A_{12} & A_{22} \\ \vdots & \vdots & \vdots \\ 1 & A_{1n} & A_{2n} \end{bmatrix}\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$

This equation of the model in matrix notation is written as:

**(4-20)**  $C = \mathbf{A}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$

The problem now becomes: How do we handle the situation in which we have more equations than unknowns? When there are fewer equations than unknowns it is clear that there is not enough information available to determine the values of the unknown variables.

When we have more equations than unknowns, however, we seem to have the opposite problem of having too much information.
