Professional Documents
Culture Documents
Abstract
terizations systematically extend existing results in literature, and they have many
applications in areas like quantum information theory. Some structural theorems and
power series over matrices are widely used in our characterizations.
1. Introduction
Preserver problem is one of the most active research areas in matrix theory (e.g. [1–4]).
Researchers would like to characterize the maps on a given space of matrices preserving
certain subsets, functions or relations. One of the preserver problems concerns maps ψ on
some sets V of matrices which preserves k-power for a fixed integer k ≥ 2, that is,
ψ Ak ¼ ψ ðAÞk for any A ∈ V (e.g. [3, 5, 6]). The k-power preservers form a special class
of polynomial preservers. One important reason of this problem lies on the fact that the
case k ¼ 2 corresponds to Jordan homomorphisms. Moreover, every k-power preserver is
also a k-potent preserver, that is, Ak ¼ A imply that ψ ðAÞ k ¼ ψ ðAÞ for any A ∈ V . Some
researches on k-potent preservers can be found in [6–8].
Given a field , let Mn ðÞ, Sn ðÞ, Dn ðÞ, N nðÞ, and T n ðÞ denote the set of n n
general, symmetric, diagonal, strictly upper triangular, and upper triangular matrices
over , respectively. When is the complex field , we may write Mn instead of
Mn ðÞ, and so on. Let Hn , P n , and P n denote the set of complex Hermitian, positive
definite, and positive semidefinite matrices, and HnðÞ ¼ S n ðÞ, P nðÞ, and Pn ðÞ
the corresponding set of real matrices, respectively. A matrix space is a subspace of
Mm,n ðÞ for certain m, n ∈ þ. Let At (resp. A ∗ ) denote the transpose (resp. conjugate
transpose) of a matrix A.
1
Matrix Theory - Classics and Advances
A ∈ MnðÞ (resp. S nðÞ) (See Theorems 3.1 and 5.1). In 1998, Brešar, Martindale,
and Miers considered additive maps of general prime rings to solve an analogous
problem by using the deep algebraic techniques ([10]). Monlár [[3], P6] described a
particular case of their result which extends Theorem 3.1 to surjective linear
operators on BðHÞ. In 2004, Cao and Zhang determined additive k-power preserver
on MnðÞ and S n ðÞ ([11]). They also characterized injective additive k-power
preserver on T nðÞ ([12] or [[6], Theorem 6.5.2]), which leads to injective linear k-
power preserver on T n ðÞ (see Theorem 8.1). In 2006, Cao and Zhang also character-
ized linear k-power preservers from Mn ðÞ to MmðÞ and from S nðÞ to Mm ðÞ
(resp. S mðÞ) [8].
In this article, given an integer k ∈ nf0, 1g , we show that a unital linear map ψ :
V ! W between matrix spaces preserving k-powers on a neighborhood of identity
must preserve all integer powers (Theorem 2.1). Then we characterize, for ¼ and
, linear operators on sets V ¼ Mn ðÞ, Hn , S nðÞ, Pn , P n ðÞ, Dn ðÞ, and T n ðÞ that
satisfy ψ Ak ¼ ψ ðAÞk on an open neighborhood of I n in V . In the following descrip-
Our results on MnðÞ and Sn ðÞ extend Chan and Lim’s results in Theorems 3.1
and 5.1, and result on T nðÞ extend Cao and Zhang’s linear version result in [12].
Another topic is the study of a linear map ϕ from a matrix set S to another matrix
set T preserving trace equation. In 1931, Wigner’s unitary-antiunitary theorem [[3],
p. 12] says that if ϕ is a bijective map defined on the set of all rank one projections on a
Hilbert space H satisfying
2
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
The authors also determined the cases if ϕ is not assuming unital, the set Pn is
replaced by another set like M n , Sn, T n, or D n. In [[17], Theorem 3.8], Leung, Ng, and
Wong considered the relation (1) on infinite dimensional space.
Let hS i denote the subspace spanned by a subset S of a vector space. Recently,
Huang and Tsai studied two maps preserving trace of product [18]. Suppose two maps
ϕ : V 1 ! W 1 and ψ : V 2 ! W2 between subsets of matrix spaces over a field under
some conditions satisfy
for all A ∈ V1 , B ∈ V2 . The authors showed that these two maps can be extended
to bijective linear maps ϕ ~ : h V 1i ! hW 1i and ψ~ : h V2 i ! hW 2 i that satisfy
tr ~
ϕðAÞψ~ðBÞ ¼ trðABÞ for all A ∈ hV 1i , B ∈ hV2 i (see Theorem 2.2). Hence when a
matrix space V is closed under conjugate transpose, every linear bijection ϕ : V ! V
corresponds to a unique linear bijection ψ : V ! V that makes (2) hold (see Corollary
2.3). Therefore, each of ϕ and ψ has no specific form.
One natural question is to ask when the following equality holds for a fixed
k ∈ nf0, 1g:
tr ϕðAÞψ ðBÞk ¼ tr ABk :
(3)
The second major work of this paper is to use our descriptions of linear k-power
preservers on an open neighborhood S of In in V to characterize maps ϕ, ψ : V ! V
under one of the assumptions:
2.equality (3) holds for all A, B ∈ S and both ϕ and ψ are linear,
3
Matrix Theory - Classics and Advances
for the sets V ¼ Mn , Hn , P n, S n , Dn, T n , and their real counterparts. These results,
together with Theorem 2.2 and the characterizations of maps ϕ1 , ⋯, ϕ m : V ! V (m ≥ 3)
that satisfy trðϕ1ð A1Þ⋯ϕ m ðA mÞÞ ¼ trð A1 ⋯Am Þ in [18], make a comprehensive picture of
the preservers of trace of matrix products in the related matrix spaces and sets.
In the following characterizations, ¼ or , P, Q ∈ Mn ðÞ are invertible,
U ∈ Mn ðÞ is unitary, O ∈ MnðÞ is orthogonal, and c ∈ n f0 g.
4. V ¼ P n and P n ðÞ (Theorem 6.4): ϕðAÞ ¼ ck U ∗ AU and ψðBÞ ¼ cU∗ BU, or
ϕðAÞ ¼ ck U ∗ At U and ψðBÞ ¼ cU ∗ B t U, in which c ∈ þ . Characterizations
under some other assumptions are also given as special cases of Theorem 6.2
(Huang, Tsai [18]).
6.V ¼ T n ðÞ (Theorem 8.5): ϕ and ψ send N nðÞ to N n ðÞ, ðD∘ϕÞj DnðÞ and
ðD∘ψ Þj D nðÞ are characterized by Theorem 7.2, and D∘ϕ ¼ D∘ϕ∘D. Here D
denotes the map that sends A ∈ T nðÞ to the diagonal matrix with the same
diagonal as A.
The sets M n, H n, Pn, S n , Dn, and their real counterparts are closed
under conjugate transpose. In these sets, trðABÞ ¼ h A ∗ , Bi for the standard inner
product. Our trace of product preservers can also be interpreted as inner product
preservers, which have wide applications in research areas like quantum information
theory.
4
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
2. Preliminary
fAB þ BA : A, B ∈ V g ⊆ V, (4)
A1 : A ∈ V is invertible ⊆ V:
(5)
In particular,
ψ A k ¼ ψ ðAÞ k,
A ∈ SV : (9)
Then
In particular,
Proof. We prove the complex case. The real case is done similarly.
1. For each A ∈ V nf0g, there is ϵ > 0 such that I p þ xA ∈ S V for all x ∈ with
n o
∣x∣ < min ϵ, ∥A1∥ . Thus
k kðk 1Þ 2
Ip þ xA ¼ Ip þ xkA þ x 2 A þ ⋯ ∈ V: (14)
2
The second derivative
5
Matrix Theory - Classics and Advances
d2
k
I p þ xA ¼ kðk 1ÞA 2 ∈ V: (15)
dx2
x¼0
2
AB þ BA ¼ ðA þ BÞ A2 B2 ∈ V: (16)
2.Now suppose (8) and (9) hold. The proof is proceeded similarly to the proof of
part (1). For
n every A ∈ V
o, there is ϵ > 0 such that for all x ∈ with
1 1
∣x∣ < min ϵ, ∥A∥ , ∥ψ ðAÞ∥ ,
k kðk 1Þ 2
¼ Iq þ xkψðAÞ þ x2
ψ Ip þ xA ψ ðAÞ þ ⋯ ∈ W, (17)
2
k kðk 1Þ 2
ψ Ip þ xA ¼ Iq þ xkψðAÞ þ x2 ψ A þ ⋯ ∈ W: (18)
2
ψ ðAÞ 2 ¼ ψ A2 ,
A ∈ V: (19)
Therefore, for A, B ∈ V ,
2 2
ψ ðA þ B Þ ¼ ψ ðA þ B Þ (20)
We get (10): ψ ðAB þ BAÞ ¼ ψ ðAÞψ ðBÞ þ ψ ðBÞψ ðAÞ. In particular ψ ðA r Þ ¼ ψ ðAÞr
for all A ∈ V and r ∈ þ.
Every invertible A ∈ V can be expressed as A1 ¼ f ðAÞ for a certain
polynomial f ðxÞ ∈ ½x. Then ψ A 1 ¼ ψ ð f ðAÞÞ ¼ f ðψðAÞÞ is commuting with ψ ðAÞ.
Hence
1
We get ψ A1 ¼ ψ ðAÞ . Therefore, ψ ðAr Þ ¼ ψ ðAÞr for all r ∈ .
We recall two results about two maps preserving trace of product in [18]. They are
handy in proving linear bijectivity of maps preserving trace of products. Recall that if
S is a subset of a vector space, then hS i denotes the subspace spanned by S.
6
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
tr ~ϕðAÞψ~ðBÞ ¼ trðABÞ,
A ∈ h V 1i, B ∈ hV 2 i: (23)
Chan and Lim described the linear k-power preservers on Mn and Mn ðÞ for k ≥ 2
in [7, Theorem 1] as follows.
Theorem 3.1. (Chan, Lim [5]) Let an integer k ≥ 2. Let be a field with charðÞ ¼ 0
or charðÞ > k. Suppose that ψ : Mn ðÞ ! Mn ðÞ is a nonzero linear operator such that
ψ Ak ¼ ψ ðAÞk for all A ∈ Mn ðÞ. Then there exist λ ∈ with λ k1 ¼ 1 and an invertible
P ∈ M nðÞ such that
matrix
7
7
Matrix Theory - Classics and Advances
(25) and (26) need not hold if ψ is zero or is a map on a subspace of MnðÞ. The
following are two examples. Another example can be found in maps on Dn ðÞ
(Theorem 7.1).
Example 3.2. The zero map ψðAÞ 0 clearly satisfies ψ Ak ¼ ψ ðAÞk for all A ∈ Mn
We now generalize Theorem 3.1 to include negative integers k and to assume the k-
power preserving condition ψ Ak ¼ ψ ðAÞk only on matrices nearby the identity.
neighborhood of In consisting of invertible matrices. Then there exist λ ∈ with λk1 ¼ 1 and
an invertible matrix P ∈ Mn ðÞ such that
Proof. We prove for the case ¼ . The case ¼ can be done similarly.
Obviously, ψ ðIn Þ ¼ ψ Ikn ¼ ψ ðIn Þ k.
1.First suppose k ≥ 2. For each A ∈ Mn , there exists ϵ > 0 such that for all x ∈
with ∣x∣ < ϵ, the following two power series converge and equal:
!
k1
k i k1i
X
ð ψð I n þ xAÞÞ ¼ ψ ðI nÞ þ x ψ ðI nÞ ψ ðAÞψ ð In Þ
i¼0
! (29)
k 2i
2 kX
i j
X k2ij
2
þx ψ ðI nÞ ψ ðAÞψ ðI n Þ ψ ðAÞψð InÞ þ⋯
i¼0 j¼0
kðk 1Þ 2
ψ ðI n þ xAÞk ¼ ψ ðI n Þ þ xkψðAÞ þ x 2 ψ A þ⋯ (30)
2
k1
ψ ðI nÞ iψ ðAÞψ ðI nÞk1i:
X
kψðAÞ ¼ (31)
i¼0
kψð In Þψ ðAÞ kψðAÞψ ðI n Þ ¼ ψ ðIn Þk ψ ðAÞ ψ ðAÞψ ðI n Þk ¼ ψ ðI nÞψ ðAÞ ψ ðAÞψ ðIn Þ:
(32)
8
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
Hence ψ ðIn Þψ ðAÞ ¼ ψ ðAÞψ ðIn Þ for A ∈ Mn , that is, ψ ðIn Þ commutes with the range
of ψ .
Now equating degree two terms of (29) and (30) and taking into account that
k ∈ f0, 1 g, we have
Define ψ 1ðAÞ ¼ ψ ðI n Þk2 ψ ðAÞ for A ∈ M n. Then ψ 1 A2 ¼ ðψ 1 ðAÞÞ2 for all A ∈ Mn.
(31) and the assumption that ψ is nonzero imply that ψ ðIn Þ 6¼ 0. So ψ1 ðIn Þψ ðIn Þ ¼
ψ ðIn Þk ¼ ψð In Þ 6¼ 0. Thus ψ 1 ðIn Þ 6¼ 0 and ψ1 is nonzero. By Theorem 3.1, there exists
an invertible P ∈ M n such that ψ 1 ðAÞ ¼ PAP1 for A ∈ M n or ψ 1ðAÞ ¼ PA t P1 for
A ∈ M n. Moreover, ψ ðI nÞ commutes with all ψ 1ðAÞ, so that ψ ðIn Þ ¼ λIn for a λ ∈ .
By In ¼ ψ 1ð In Þ ¼ ψ ðI n Þk1, we get λk1 ¼ 1. Therefore, ψ ðAÞ ¼ λψ 1ðAÞ. We get (27)
and (28).
1.Next Suppose k < 0. For every A ∈ Mn, the power series expansions of
1
k
ð ψð In þ xAÞÞ and ψ ðI n þ xAÞk are equal when ∣x∣ is sufficiently small:
!
k1
X
k 1 i k1i
ðψ ðIn þ xAÞÞ ¼ ψ ð I nÞ þx ψ ðIn Þ ψ ðAÞψð InÞ þ⋯ (34)
i¼0
1
1 1 1
ψ ðI n þ xAÞk ¼ ψ ðI n Þ xkψ ðI nÞ ψ ðAÞψ ðI nÞ þ ⋯ (35)
k1
ψ ðI nÞi ψ ðAÞψð I nÞk1i :
1 1
X
kψ ðI n Þ ψ ðAÞψ ðI n Þ ¼ (36)
i¼0
Therefore,
k ψðAÞψ ðI n Þ1 ψ ðI nÞ1 ψ ðAÞ
! !
k1
X k1
X
k1i k1i
¼ ψ ðI nÞ ψ ð InÞi ψ ðAÞψ ð In Þ ψ ðIn Þi ψ ðAÞψð In Þ ψ ðIn Þ (37)
i¼0 i¼0
k k 1 1
¼ ψ ðI nÞ ψ ðAÞ ψ ðAÞψð I nÞ ¼ ψ ð I nÞ ψ ðAÞ ψ ðAÞψ ðI n Þ :
We get ψ ðI n Þ1ψ ðAÞ ¼ ψ ðAÞψ ðI nÞ 1 for A ∈ Mn. So ψ ðI n Þ1 and ψ ðIn Þ commute
with the range of ψ . The following power series are equal for every A ∈ Mn when ∣x∣ is
sufficiently small:
k kðk 1Þ
k1
ðψ ðIn þ xAÞÞ ¼ ψ ðIn Þ þ xkψð InÞ ψ ðAÞ þ x 2 ψ ðIn Þk2ψ ðAÞ 2 þ ⋯ (38)
2
kðk 1Þ
ψ ðI n þ xAÞ k ¼ ψ ðI nÞ þ xkψðAÞ þ x2 ψ A2 þ ⋯ (39)
2
9
9
Matrix Theory - Classics and Advances
Equating degree two terms of (38) and (39), we get ψ ð I nÞ k2 ψ ðAÞ2 ¼ ψ A2 : Let
ψ 1ðAÞ ≔ ψ ðIn Þk2ψ ðAÞ ¼ ψ ðI nÞ1 ψ ðAÞ. Then ψ 1 ðAÞ2 ¼ ψ 1 A2 and ψ 1 is nonzero.
Corollary 2.3 shows that every linear bijection ϕ : Mn ðÞ ! MnðÞ corresponds
to another linear bijection ψ : M n ðÞ ! MnðÞ such that trðϕðAÞψ ðBÞÞ ¼ trðABÞ
for all A, B ∈ Mn ðÞ. When m ≥ 3, maps ϕ1 , ⋯, ϕm on M nðÞ that satisfy
trð ϕ1 ðA 1 Þ⋯ϕmð AmÞÞ ¼ tr ðA 1 ⋯A mÞ for A1 , … , A m ∈ M nðÞ are determined in [18].
If two maps on Mn ðÞ satisfy the following trace condition about k-powers, then
they have specific forms.
Theorem 3.5. Let ¼ or . Let k ∈ nf 0, 1g. Let S be an open neighborhood of I n
consisting of invertible matrices. Then two maps ϕ, ψ : M nðÞ ! Mn ðÞ satisfy that
k
tr ϕðAÞψ ðBÞ ¼ tr AB k ,
(40)
(43)
Thus tr ϕðAÞψ ðBÞ k ¼ tr AB k for A ∈ MnðÞ and B ∈ S, which leads to
assumption (1).
Now we prove the theorem under assumption (1), that is, (40) holds for all
A ∈ MnðÞ and B ∈ S, and ψ is linear. Only the necessary part is needed to prove.
Let S 0 ¼ B ∈ P n : B1=k ∈ S , which is an open neighborhood of In in P n. Define ψ~ :
k
S0 ! Mn such
that ψ~ ðBÞ ¼ ψ B 1=k . Then (40) implies that
10
10
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
k
tr ðϕðAÞ ~ψ ðBÞÞ ¼ tr ϕðAÞψ B1=k ¼ trðABÞ, A ∈ M n , B ∈ S0 : (44)
By the bijectivity of ϕ,
k1
kψ ð CiÞ k ¼ ψ ðC i Þ j ψ ðI Þψ ðCi Þk1j:
X
(50)
j¼0
Since
11
Matrix Theory - Classics and Advances
the only matrix A ∈ M n such that tr Aψ ðCi Þk ¼ 0 for all i ∈ 1, … , n2 is the zero
n o
matrix. So ψ ð CiÞk : i ¼ 1, … , n2 is a basis of Mn . (51) implies that ψð In Þ ¼ cI n for
certain c ∈ nf0 g.
(46) shows that
Therefore,
ck1 tr ϕðAÞψ Bk ¼ tr ABk ¼ tr ϕðAÞψ ðBÞk ,
A ∈ Mn , B ∈ S: (54)
k
c 1ψ Bk ¼ c1 ψ ðBÞ ,
B ∈ S: (55)
2. Now suppose k < 0. Then ψ ðIn Þ is invertible. For every B ∈ M n and sufficiently
small x, we have the power series expansion:
ðψ ðIn Þ þ xψðBÞÞk
1 ∣k∣
1 1
¼ In þ xψ ðIn Þ ψ ðBÞ ψ ð I nÞ
h i∣k∣
¼ In xψ ðIn Þ1ψ ðBÞ þ x2 ψ ðIn Þ1ψ ðBÞψ ðI n Þ1ψ ðBÞ þ ⋯ ψ ðI n Þ1
! (56)
∣k∣
k i k1þi
X
¼ ψ ð In Þ x ψ ðI n Þ ψ ðBÞψ ðIn Þ
i¼1
!
∣k∣ ∣kX
∣þ1i
i j k2þiþj
2
X
þx ψ ðI n Þ ψ ðBÞψ ðI n Þ ψ ðBÞψð In Þ þ⋯
i¼1 j¼1
Equating degree one terms and degree two terms of (45) respectively and using
(56), we get the following identities for A, B ∈ Mn :
!!
X∣k∣
i k1þi
tr ϕðAÞ ψ ðI nÞ ψ ðBÞψ ðI n Þ ¼ trðjkjABÞ, (57)
i¼1
∣þ1i
!!
∣k∣ ∣kX
X i j k2þiþj kðk 1Þ 2
tr ϕðAÞ ψ ð In Þ ψ ðBÞψð I nÞ ψ ðBÞψ ðI n Þ ¼ tr AB :
i¼1 j¼1
2
(58)
∣k∣
ψ ðIn Þ iψ Bk ψ ð In Þk1þi ¼ ∣k∣ψ ðBÞk , B ∈ S: (59)
i¼1
X
12
12
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
Let Fr ðBÞ denote the degree r coefficient in the power series of ðψ ðI nÞ þ xψðBÞÞ k .
Then (57) and (58) show that:
k 1 2
F1 B ¼ F2 ðBÞ, B ∈ M n: (60)
2
2
So ψ1 B2 ¼ ψ 1 ðBÞ for B ∈ Mn. Note that ψ 1ðI n Þ ¼ In . By Theorem 3.4, there exists
k1
ψ ðIn Þ1 F 1 B 2 F2 B2 ψ ðIn Þ1 ¼ ψ ðI nÞ1 F2 ðBÞ F2 ðBÞψ ðI n Þ1,
(62)
2
which gives
1k
ψ ð InÞ k ψ B2 ψ B2 ψ ð InÞk
2 (63)
¼ ψ ðIn Þk ψ ðBÞψ ðI nÞ 1ψ ðBÞ ψ ðBÞψ ðI nÞ 1ψ ðBÞψ ðI n Þk:
The equality on degree one terms shows that ψ ðI nÞk commutes with all ψ1 ðEÞ.
k
Hence ψ ð In Þ commutes with the range of ψ . (??) can be rewritten as
! !
∣k∣
k1þi i
X
tr ψ ð I nÞ ϕðAÞψ ðI nÞ ψ ðBÞ ¼ trðjkjABÞ, A, B ∈ M n: (66)
i¼1
13
13
Matrix Theory - Classics and Advances
(67)
ible P ∈ Mn such that ψ 1ðBÞ ¼ PBP1 or ψ 1 ðBÞ ¼ PB tP 1. Then ψ ðBÞ ¼ cPBP 1 or
ψ ðBÞ ¼ cPBtP 1 . Using (40), we get (42).
Remark 3.6 The following modifications could be applied to the proof of Theorem 3.5
for ¼ :
2.We may choose the following basis of rank 1 projections of Mn ðÞ to substitute (48):
1
fE ii : 1 ≤ i ≤ ng ∪ pffiffiffi Eii þ Ejj þ Eij þ E ji : 1 ≤ i < j ≤ n
2 (68)
∪ ω1Eii þ ω2 Ejj þ Eij Eji : 1 ≤ i < j ≤ n ,
ψ A k ¼ ψ ðA Þ k
(69)
1.When k is even,
14
14
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
2.When k is odd,
Proof. It suffices to prove the necessary part. Suppose (69) holds on an open
neighborhood S of I n in Hn .
A ∈ H n.
It implies that
ψ~ ðA þ iBÞ2 ¼ ψ~ ðA þ iBÞ2 , A, B ∈ Hn :
2.Now assume that k < 0. Replacing Mn by Hn in part (2) of the proof of Theorem
3.4, we can show that ψð InÞ commutes with the range of ψ , and furthermore the
1 2
nonzero linear map ψ1ðAÞ ≔ ψ ðI n Þ ψ ðAÞ satisfies that ψ 1 A 2 ¼ ψ1ðAÞ . By
arguments in the preceding paragraphs, we get (70) or (71).
15
15
Matrix Theory - Classics and Advances
a. When k ¼ 1, there exist an invertible matrix P ∈ Mn and c ∈ f1, 1g such that
ϕðAÞ ¼ cP ∗ AP ϕðAÞ ¼ cP ∗ At P
A, B ∈ H n; or A, B ∈ H n: (74)
ψ ðBÞ ¼ cP ∗ BP ψ ðBÞ ¼ cP ∗ B tP
Proof. Assumption (2) leads to assumption (1) (cf. the proof of Theorem 3.5). We
prove the theorem under assumption (1). It suffices to prove the necessary part.
2.When k < 0, in the part (2) of proof of Theorem 3.5, through replacing M n by Hn
and complex numbers by real numbers, we can get the corresponding equalities
of (56) (60) on Hn. The case k < 1 can be proved completely analogously
with the help of Theorem 4.1.
For the case k ¼ 1, the equality corresponding to (60) can be simplified as
ψ B2 ¼ ψ ðBÞψ ðI nÞ 1 ψ ðBÞ,
B ∈ Hn: (76)
1
Let ψ 1ðBÞ ≔ ψ ðI n Þ ψ ðBÞ. Then ψ 1 : Hn ! Mn is a nonzero real linear map that
2
satisfies ψ 1 B2 ¼ ψ 1 ðBÞ for B ∈ H n. Extend ψ1 to a complex linear map ~ ψ : Mn !
Mn such that
16
16
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
Similarly to the arguments in part (1) of the proof of Theorem 4.1, we have
~ ðA þ iBÞ 2 ¼ ðψ~ðA þ iBÞÞ 2 for all A, B ∈ Hn. Using Theorem 3.4 and the fact that
ψ
ψ ðIn Þ ¼ ψ 1ð InÞ ¼ I n, we can prove that there is an invertible P ∈ M n such that for all
~
B ∈ Hn , either ψ1 ðBÞ ¼ P1 BP or ψ 1 ðBÞ ¼ P1 Bt P. So
∗
If ψ ðBÞ ¼ ψ ð I nÞP 1BP for B ∈ H n , then ψð In ÞP1 BP ¼ ψ ðI n ÞP1 BP ¼
Obviously, ψ may be non-linear, and the choices of pairs ðϕ, ψ Þ are much more
than those in (74) and (75).
Chan and Lim described the linear k-power preservers on SnðÞ for k ≥ 2 in
[7, Theorem 2] as follows.
Theorem 5.1. (Chan, Lim [5]) Let an integer k ≥ 2. Let be an algebraic closed field
with charðÞ ¼ 0 or charðÞ > k. Suppose that ψ : Sn ðÞ ! Sn ðÞ is a nonzero linear
operator such that ψ Ak ¼ ψ ðAÞk for all A ∈ S nðÞ. Then there exist λ ∈ with λ k1 ¼ 1
and an orthogonal matrix O ∈ MnðÞ such that
ψ ðAÞ ¼ λOAO t, A ∈ Sn: (81)
17
17
Matrix Theory - Classics and Advances
We generalize Theorem 5.1 to include the case S n ðÞ, to include negative integers
k, and to assume the k-power preserving condition only on matrices nearby the
identity.
Theorem 5.2. Let k ∈ nf0, 1g. Let ¼ or . Suppose that ψ : S n ðÞ ! Sn ðÞ is a
nonzero linear map such that ψ Ak ¼ ψ ðAÞ k for all A in an open neighborhood of I n in
S nðÞ consisting of invertible matrices. Then there exist λ ∈ with λ k1 ¼ 1 and an
orthogonal matrix O ∈ Mn ðÞ such that
Proof. It suffices to prove the necessary part. In both k ≥ 2 and k < 0 cases, using
analogous arguments as parts (1) and (2) of the proof of Theorem 3.4, we get that
k2
ψ ðIn Þ commutes with the range of ψ , and the nonzero map ψ 1 ðAÞ ≔ ψ ðIn Þ ψ ðAÞ
2 2
satisfies that ψ 1 A ¼ ψ 1ðAÞ for A ∈ S n ðÞ. Then
an orthogonal matrix O ∈ MnðÞ such that ψ 1ðAÞ ¼ OAOt. Since ψ ðI nÞ commutes with
the range of ψ 1, we have ψ ðI nÞ ¼ λI n for certain λ ∈ in which λk1 ¼ 1. So ψ ðAÞ ¼
λOAOt as in (82).
Obviously, in ¼ case, (82) has λ ¼ 1 when k is even and λ ∈ f1, 1g when k is
odd.
Corollary 2.3 shows that every linear bijection ϕ : S nðÞ ! S nðÞ corresponds to
another linear bijection ψ : S nðÞ ! S n ðÞ such that trðϕðAÞψ ðBÞÞ ¼ trðABÞ for all
A, B ∈ S nðÞ. When m ≥ 3, maps ϕ 1, ⋯, ϕ m : S nðÞ ! Sn ðÞ that satisfy
trðϕ1ð A1 Þ⋯ϕm ðA mÞÞ ¼ tr ðA1⋯A m Þ are determined in [18].
We characterize the trace of power-product preserver for S n ðÞ here.
Theorem 5.3. Let ¼ or . Let k ∈ nf 0, 1g. Let S be an open neighborhood of
In in Sn ðÞ consisting of invertible matrices. Then two maps ϕ, ψ : S nðÞ ! SnðÞ satisfy
that
tr ϕðAÞψ ðBÞ k ¼ tr AB k ,
(84)
a. When k ¼ 1, there exist an invertible matrix P ∈ Mn ðÞ and c ∈ n f0 g such that
18
18
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
b. When k ∈ nf 1, 0, 1g, there exist c ∈ n f0 g and an orthogonal matrix O ∈ M nðÞ
such that
Proof. Assumption (2) leads to assumption (1) (cf. the proof of Theorem 3.5). We
prove the theorem under assumption (1). It suffices to prove the necessary part.
Obviously, S n∩Hn ¼ Sn ðÞ and Sn ∩Pn ¼ P nðÞ. Let S0 ≔ B ∈ P nðÞ : B 1=k ∈ S ,
which is an open neighborhood of I n in P nðÞ and whose real (resp. complex) span is
S nðÞ (resp. S n ). Using an analogous argument of the proof of Theorem 3.5, and
replacing Mn by S nðÞ, replacing the basis (48) of M n by the following basis of rank 1
projections in S nðÞ:
1
fE ii : 1 ≤ i ≤ n g∪ pffiffiffi Eii þ E jj þ Eij þ Eji : 1 ≤ i < j ≤ n , (87)
2
and replacing the usage of Theorem 3.4 by that of Theorem 5.2, we can prove the
case k ≥ 2, and for k < 0, we can get the corresponding equalities up to (60).
1
Define a linear map ψ 1 : S n ðÞ ! M nðÞ by ψ 1 ðBÞ ≔ ψ ðBÞψð InÞ .
2
When k ¼ 1, we get the corresponding equality of (61), so that ψ 1 B 2 ¼ ψ 1 ðBÞ
r r
for B ∈ S nðÞ. Similar to the proof of Theorem 5.2, we get ψ 1ð B Þ ¼ ψ 1 ðBÞ for all
r ∈ þ. By [26, Theorem 6.5.3], there is an invertible matrix P ∈ M nðÞ such that
ψ 1ðBÞ ¼ PBP1 , so that ψ ðBÞ ¼ PBP1 ψ ðIn Þ for B ∈ S nðÞ. Since ψ ðBÞ ¼ ψ ðBÞt , we get
Therefore, P1 ψ ðIn ÞPt ¼ cIn for certain c ∈ nf 0g, so that ψ ðBÞ ¼ cPBPt for all
B ∈ Sn ðÞ. Consequently, we get (85). The remaining claims are obvious.
When k < 1, using analogous argument as in the proof of k < 1 case of
Theorem 3.5 and applying Theorem 5.2, we can get (86).
In this section, we will determine k-power linear preservers and trace of power-
product preservers on maps Pn ! Pn (resp. P nðÞ ! P nðÞ). Properties of such
maps can be applied to maps Pn ! Pn and P n ! Pn (resp. PnðÞ ! P n ðÞ and
Pn ðÞ ! P n ðÞ).
19
Matrix Theory - Classics and Advances
ψ A k ¼ ψ ðA Þ k
(89)
Proof. We prove the case ψ : P n ! P n. The sufficient part is obvious. About the
necessary part, the nonzero linear map ψ : P n ! P n can be easily extended to a linear
map ψ~ : Hn ! H n that satisfies ~ψ Ak ¼ ψ~ðAÞ k on an open neighborhood of In . By
Now consider the maps P n ! Pn (resp. P n ðÞ ! Pn ðÞ) that preserve trace
of powered products. Unlike Mn and Hn, the set P n (resp. P nðÞ) is not a
vector space. The trace of powered product preservers of two maps have the following
forms.
Theorem 6.2 (Huang, Tsai [18]). Let a, b, c, d ∈ nf0g . Two maps ϕ, ψ : P n ! Pn
satisfy
tr ϕðAÞ a ψ ðBÞb ¼ tr AcB d ,
A, B ∈ Pn, (91)
Theorem 6.3 (Huang, Tsai [18]). Given an integer m ≥ 3 and real numbers
α1 , … , αm , β1 , … , βm ∈ n f0 g , maps ϕi : P n ! Pn (i ¼ 1, … , m) satisfy that
β
trðϕ1 ðA1 Þα1 ⋯ϕm ðA mÞ αm Þ ¼ tr A1 1 ⋯Aβm1 , A1 , … , Am ∈ P n, (93)
if and only if they have the following forms for certain c1 , … , cm ∈ þ with
c1⋯c m ¼ 1:
ϕi ðAÞ ¼ c1=α
i
i
U ∗ A βi =αi U, A ∈ Pn : (94)
1=α i 1=α i
ci M ∗ Aβ i M , i is odd,
ϕi ðAÞ ¼ A ∈ Pn : (95)
0 1=αi 1 βi ∗ 1=αi
c M A M , i is even,
@ i
20
20
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
Both Theorems 6.2 and 6.3 can be analogously extended to maps Pn ðÞ ! P nðÞ
without difficulties.
Theorem 6.2 determines maps ϕ, ψ : P n ! P n that satisfy (91) throughout
their domain. If we only assume the equality (91) for ðA, BÞ in certain
subset of Pn P n and assume certain linearity of ϕ and ψ , then ϕ and ψ
may have slightly different forms. We determine the case a ¼ c ¼ 1 and
b ¼ d ¼ k ∈ nf0g here.
Theorem 6.4. Let k ∈ nf0g. Let S be an open neighborhood of In in Pn . Two maps
ϕ, ψ : Pn ! P n satisfy
tr ϕðAÞψ ðBÞk ¼ tr AB k ,
(96)
3. for all A, B ∈ P n, or
ϕðAÞ ¼ P ∗ A tP
0
ϕðAÞ ¼ P ∗ AP
1=k or@ h
k
i1=k A, B ∈ Pn : (97)
ψ ðBÞ ¼ P1 B kP ∗ ψ ðBÞ ¼ P 1ð BtÞ P ∗
if and only if when k ∈ f1, 1g, ϕ and ψ take the form (97), and when
k ∈ nf1, 0, 1g, there exist a unitary matrix U ∈ M n and c ∈ þ such that
21
Matrix Theory - Classics and Advances
By Theorem 4.2 and taking into account the ranges of ϕ and ψ , we see that when
k ¼ 1, ϕ and ψ take the form of (97), and when k ∈ nf 1, 0, 1g , ϕ and ψ take the
form of (98).
Theorem 6.4 has counterpart results for ϕ, ψ : P n ðÞ ! P nðÞ and the proof is
analogous using Theorem 5.3 instead of Theorem 4.2.
We define the linear functionals f i : DnðÞ ! ði ¼ 0, 1, … , nÞ, such that for each
A ¼ diagða1 , … , anÞ ∈ Dn ðÞ,
ψ Ak ¼ ψ ðAÞ k, A ∈ S,
(103)
if and only if
ψ ðAÞ ¼ ψ ðIn Þdiag f pð1Þ ðAÞ, … , fpðnÞ ðAÞ , A ∈ D nðÞ, (104)
22
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
kðk 1Þ
ψ ðIn þ xAÞ k ¼ ψð InÞ þ xkψ ðI nÞk1 ψ ðAÞ þ x2 ψ ð In Þk2 ψ ðAÞ 2 þ ⋯ (107)
2
k1
ψ ðAÞ ¼ ψ ð I nÞ ψ ðAÞ, (108)
k2 2
ψ A2 ¼ ψð InÞ ψ ðAÞ :
(109)
k2
The linear map ψ 1ðAÞ ≔ ψ ðI nÞ ψ ðAÞ satisfies that
ψ1 A2 ¼ ψ 1 ðAÞ2,
A ∈ D n ðÞ: (110)
1 1
By (101), let Lψ 1 ¼ ℓij ∈ Mn ðÞ such that diag ðψ 1 ðAÞÞ ¼ Lψ 1 diag ðAÞ for
A ∈ Dn ðÞ. Then (110) implies that for all A ¼ diagða 1, … , a nÞ ∈ D nðÞ:
!2
n
X n
X
ℓ ija 2j ¼ ℓ ija j , i ¼ 1, 2, … , n: (111)
j¼1 j¼1
Therefore, each row of L ψ1 has at most one nonzero entry and each nonzero entry
must be 1. We get
ψ 1ðAÞ ¼ diag f pð1Þ ðAÞ, … , f pðnÞ ðAÞ (112)
In [18], we show that two maps ϕ, ψ : Dn ðÞ ! D nðÞ satisfy trðϕðAÞψ ðBÞÞ ¼
trðABÞ for A, B ∈ D nðÞ if and only if there exists an invertible N ∈ M nðÞ such that
23
23
Matrix Theory - Classics and Advances
if and only if there exist an invertible diagonal matrix C ∈ Dn ðÞ and a permutation
matrix P ∈ Mn ðF Þ such that
Proof. Assumption (2) leads to assumption (1) (cf. the proof of Theorem 3.5). We
prove the theorem under assumption (1).
For every B ∈ D n ðÞ, In þ xB ∈ S and the power series of ðI n þ xBÞk converges when
x ∈ is sufficiently close to 0, so that
tr ϕðAÞψ ðI n þ xBÞk ¼ tr AðI n þ xBÞk (116)
Comparing degree one terms and degree two terms in the power series of the
above equality, respectively, we get the following equalities for A, B ∈ DnðÞ:
k1
tr ϕðAÞψ ðBÞψ ðI n Þ ¼ trðABÞ, (117)
tr ϕðAÞψ ðBÞ 2 ψ ðIn Þk2 ¼ tr AB 2 :
(118)
Applying Theorem 2.2 to (117), ψ ðIn Þ is invertible and both ϕ and ψ are linear
k1 2 k2
bijections. (117) and (119) imply that ψ B2 ψ ðI nÞ
¼ ψ ðBÞ ψ ðI nÞ . Let
1 2
2
ψ 1ðBÞ ≔ ψ ðBÞψ ðIn Þ . Then ψ1 B ¼ ψ 1ðBÞ for B ∈ Dn ðÞ. By Theorem 7.1 and
ψ 1 ðIn Þ ¼ In, there exists a permutation matrix P ∈ Mn ðF Þ such that ψ 1 ðBÞ ¼ PBP1 for
B ∈ DnðÞ. So ψ ðBÞ ¼ ψ ðI n ÞPBP1 ¼ PCBP1 for C ≔ P1ψ ðI nÞP ∈ Dn ðÞ. Then (114)
implies (115).
24
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
a 11 ca12 a22 ca 12
λ , λ , (121)
0 a22 0 a11
Then each ψ satisfies that ψ A k ¼ ψ ðAÞ k for every positive integer k but it is not
Proof. Obviously ψ is a linear bijection. Follow the same process in the proof of
Theorem 3.4. In both k ≥ 2 and k < 0 cases we have ψð I nÞ commutes with the range of
ψ , so that ψ ðI n Þ ¼ λIn for λ ∈ and λ k1 ¼ 1: Moreover, let ψ 1ðAÞ ≔ ψ ðIn Þ1 ψ ðAÞ, then
ψ 1 is injective linear and ψ1 A 2 ¼ ψ 1 ðAÞ2 for A ∈ T nðÞ: Theorem 8.1 shows that
Theorem 2.2 or Corollary 2.3 does not work for maps on T n ðÞ. However, the
following trace preserving result can be easily derived from Theorem 7.2. We have
T nðÞ ¼ Dn ðÞ⊕N nðÞ. Let DðAÞ denote the diagonal matrix that takes the diagonal
of A ∈ T n ðÞ.
Theorem 8.5. Let ¼ or . Let k ∈ nf0, 1g. Let S be an open neighborhood of In in
T nðÞ consisting of invertible matrices. Then two maps ϕ, ψ : T n ðÞ ! T nðÞ satisfy that
tr ϕðAÞψ ðBÞk ¼ tr AB k ,
(125)
25
Matrix Theory - Classics and Advances
if and only if ϕ and ψ send N n ðÞ to N n ðÞ, ðD∘ϕÞjDnðÞ and ðD∘ψ Þj Dn ðÞ are linear
bijections characterized by (115) in Theorem 7.2, and D∘ϕ ¼ D∘ϕ∘D.
Proof. The sufficient part is easy to verify. We prove the necessary part here. Let
ϕ 0 ≔ ðD∘ϕÞj Dn ðÞ and ψ 0 ≔ ðD∘ψ Þj DnðÞ . Then ϕ0 , ψ 0 : Dn ðÞ ! Dn ðÞ satisfy
tr ϕ0 ðAÞψ 0ðBÞ k ¼ tr AB k for A, B ∈ D nðÞ. So they are characterized by (115). The
bijectivity of ϕ0 and ψ 0 implies that ϕ and ψ must send N n ðÞ to N nðÞ in order to
satisfy (125). Moreover, ϕ should send matrices with same diagonal to matrices with
same diagonal, which implies that D∘ϕ ¼ D∘ϕ∘D.
9. Conclusion
These results, together with Theorem 2.2 about maps satisfying trðϕðAÞψ ðBÞÞ ¼
trðABÞ and the characterizations of maps ϕ1, ⋯, ϕm : V ! V (m ≥ 3) satisfying
trð ϕ1 ðA 1 Þ⋯ϕmð AmÞÞ ¼ tr ðA 1 ⋯Am Þ in [18], make a comprehensive picture of the pre-
servers of trace of matrix products in the related matrix spaces and sets. Our results
can be interpreted as inner product preservers when V is close under conjugate
transpose, in which wide applications are found.
There are a few prospective directions to further the researches.
First, for a polynomial or an analytic function f ðxÞ and a matrix set V , we can
consider “local” linear f -preservers, that is, linear operators ψ : V ! V that satisfy
ψ ð f ðAÞÞ ¼ f ðψ ðAÞÞ on an open subset S of V . A linear f -preserver ψ on S also
preserves matrices annihilated by f on S, that is, f ðAÞ ¼ 0 (A ∈ S) implies
f ðψ ðAÞÞ ¼ 0. When S ¼ V is Mn , BðHÞ, or some operator algebras, extensive studies
have been done on operators preserving elements annihilated by a polynomial f ; for
examples, the results on Mn by R. Howard in [19], by P. Botta, S. Pierce, and W.
Watkins in [20], and by C.-K. Li and S. Pierce in [21], on BðHÞ by P. Šemrl [22], on
linear maps ψ : BðHÞ ! BðK Þ by Z. Bai and J. Hou in [23], and on some operator
algebras by J. Hou and S. Hou in [24]. We may further explore linear f -preservers for
a multivariable function f ðx 1 , … , xrÞ, that is, operator ψ satisfying
ψð f ð A1 , … , ArÞÞ ¼ f ðψ ðA 1Þ, … , ψ ðA rÞÞ. The corresponding annihilator preserver
problem has been studied in some special cases, for example, on Mn for homogeneous
multilinear polynomials by A. E. Guterman and B. Kuzma in [25].
26
26
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
27
Matrix Theory - Classics and Advances
References
[1] Li CK, Pierce S. Linear preserver [11] Cao C, Zhang X. Power preserving
problems. American Mathematical additive maps (in Chinese). Advances in
Monthly. 2001;108:591-605 Mathematics. 2004;33:103-109
[2] Li CK, Tsing NK. Linear preserver [12] Cao C, Zhang X. Power preserving
problems: A brief introduction and additive operators on triangular matrix
some special techniques. Linear Algebra algebra (in Chinese). Journal of
and its Applications. 1992;162–164: Mathematics. 2005;25:111-114
217-235
[13] Uhlhorn U. Representation of
[3] Molnár L. Selected Preserver symmetry transformations in quantum
Problems on Algebraic Structures of mechanics. Arkiv för Matematik. 1963;
Linear Operators and on Function 23:307-340
Spaces, Lecture Notes in Mathematics.
Vol. 1895. Berlin: Springer-Verlag; [14] Molnár L. Some characterizations of
2007 the automorphisms of ℬℋ and CX.
Proceedings of the American Mathematical
[4] Pierce S et al. A survey of linear Society. 2002;130(1):111-120
preserver problems. Linear and
Multilinear Algebra. 1992;33:1-129 [15] Li CK, Plevnik L, Šemrl P. Preservers
of matrix pairs with a fixed inner
[5] Chan GH, Lim MH. Linear preservers product value. Operators and Matrices.
on powers of matrices. Linear Algebra 2012;6:433-464
and its Applications. 1992;162–164:
615-626 [16] Huang H, Liu CN, Tsai MC, Szokol
P, Zhang J. Trace and determinant
[6] Zhang X, Tang X, Cao C. Preserver preserving maps of matrices. Linear
Problems on Spaces of Matrices. Beijing: Algebra and its Applications. 2016;507:
Science Press; 2006 373-388
[7] Brešar M, Šemrl P. Linear [17] Leung CW, Ng CK, Wong NC.
transformations preserving potent Transition probabilities of normal states
matrices. Proceedings of the American determine the Jordan structure of a
Mathematical Society. 1993;119:81-86 quantum system. Journal of
Mathematical Physics. 2016;57:015212
[8] Zhang X, Cao CG. Linear k-power/k-
potent preservers between matrix [18] Huang H, Tsai MC. Maps Preserving
spaces. Linear Algebra and its Trace of Products of Matrices. Available
Applications. 2006;412:373-379 from: https://arxiv.org/abs/2103.12552
[Preprint]
[9] Kadison RV. Isometries of operator
algebras. Annals of Mathematics. 1951; [19] Howard R. Linear maps that
54:325-338 preserve matrices annihilated by a
polynomial. Linear Algebra and its
[10] Brešar M, Martindale WS, Miers CR. Applications. 1980;30:167-176
Maps preserving nth powers.
Communications in Algebra. 1998;26: [20] Botta P, Pierce S, Watkins W. Linear
117-138 transformations that preserve the
28
28
Linear K-Power Preservers and Trace of Power-Product Preservers
DOI: http://dx.doi.org/10.5772/ITexLi.103713
29
Chapter 2
Abstract
1. Introduction
In this section, we will introduce the main objects of this chapter along with some
brief historical notes.
By operator pencils or operator polynomials one means polynomials of a complex
variable λ whose coefficients are linear bounded operators acting in a Banach space X:
X
m
LðλÞ ¼ λ jA j, (1)
j¼0
where A j : X ! X (j ¼ 0, … , m), see, for example, [1, 2]. Parlett in ref. [3, p. 339]
stated that the term pencil was introduced by Gantmacher in ref. [4] for matrix
expressions, and Parlett explained how this term came from optics and geometry. In
this chapter, we shall be mainly interested in pencils of banded semi-infinite matrices
that are related to different kinds of scalar orthogonal polynomials. The classical
example of such a relation is the case of orthogonal polynomials on the real line
30
Matrix Theory - Classics and Advances
∞
(OPRL) and Jacobi matrices, see, for example, refs. [5, 6]. If pnðxÞ n¼0 is a set of
orthonormal OPRL and J is the corresponding Jacobi matrix, then the following
relation holds:
!
ðJ xEÞp ðxÞ ¼ 0, (2)
! T
where p ðxÞ ¼ p 0 ðxÞ, p1ðxÞ, … , is a vector of polynomials (here the superscript
T means the transposition), and E is the identity matrix (having units on the main
!
diagonal and zeros elsewhere). In other words, p is an eigenfunction of the pencil
J xE. It is surprising that mathematicians rarely talked about the relation (2) in such
a manner. The next classical example is the case of orthogonal polynomials on the
unit circle (OPUC) and the corresponding three-term recurrence relation, see ref.
[7, p. 159]. More recently there appeared CMV matrices, which are also related to
OPUC, see, for example, ref. [8]. We should notice that besides orthogonal polyno-
mials, there are other systems of functions that are closely related to semi-infinite
matrices. Here we can mention biorthogonal polynomials and rational functions, see,
for example, [9, 10] and references therein.
A natural generalization of OPRL is matrix orthogonal polynomials on the real line
(MOPRL). MOPRL was introduced by Krein in 1949 [11]. They satisfy the relation of
!
type (2), with J replaced by a block Jacobi matrix, and with p replaced by a vector of
matrix polynomials. It turned out that MOPRL is closely related to orthogonal poly-
nomials on the radial rays in the complex plane, see refs. [12, 13]. We shall discuss this
case in Section 2.
Another possible generalization of relation (2) is the following one:
!
ðJ 5 xJ 3Þp ðxÞ ¼ 0, (3)
31
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
If f 0, then f ðBÞ≔0jP .
If H is a Hilbert space then ð, ÞH and ∥ ∥H mean the scalar product and the norm
in H, respectively. Indices may be omitted in obvious cases. For a linear operator A in
H, we denote by DðAÞ its domain, by RðAÞ its range, by Ker A its null subspace
(kernel), and A ∗ means the adjoint operator if it exists. If A is invertible then A 1
means its inverse. A means the closure of the operator, if the operator is closable. If A
is bounded then ∥A∥ denotes its norm.
Throughout this section N will denote a fixed natural number. Let J2Nþ1 be a
∞
complex Hermitian semi-infinite ð2N þ 1Þ-diagonal matrix. Let pn ðλÞ n¼0, degp n ¼ n
be a set of complex polynomials, which satisfy the following relation:
!
ðJ2Nþ1 λ nEÞp ðλÞ ¼ 0, (4)
! T
where p ðλÞ ¼ p0ðλÞ, p 1ðλÞ, … , is a vector of polynomials, and E is the identity
matrix. Polynomials, which satisfy (4) with real coefficients, were first studied by
Durán in ref. [16], following a suggestion of Marcellán. As it was already noticed in
the Introduction, these polynomials are related to MOPRL. Namely, the following
polynomials:
0 1
R N,0 p nN ðxÞ RN,1 pnN ðxÞ ⋯ R N,N 1 pnN ðxÞ
B R
B N,0 p nNþ1 ðxÞ R N,1 pnNþ1 ðxÞ ⋯ RN,N1 pnNþ1 ðxÞ C C (5)
P n ðxÞ ¼ B C
@ ⋮ ⋮ ⋱ ⋮ A
RN,0 p nNþN1 ðxÞ RN,1 pnNþN1 ðxÞ ⋯ R N,N 1 p nNþN1 ðxÞ
p ðnNþmÞð0Þ n
RN,m ðpÞðtÞ ¼ t , p ∈ , 0 ≤ m ≤ N 1: (6)
n ðnN þ mÞ!
X
3
32
Matrix Theory - Classics and Advances
n N1 o∞
Conversely, from a given set Pn ðxÞ ¼ Pn,m,j m,j¼0 of orthonormal MOPRL
n¼0
(suitably normed) one can construct scalar polynomials:
X
N 1
p nNþm ðxÞ ¼ x j Pn,m,j x N , n ∈ , 0 ≤ m ≤ N 1, (7)
j¼0
which satisfy relation (4) [12]. Writing the corresponding matrix orthonormality
conditions for Pn and equating the entries on both sides, one immediately gets
orthogonality conditions for p n:
0 1
R N,0 p m ðxÞ
ð B C
B B R N,1 pm ðxÞ C
C
R N,0 p n ðxÞ, R N,1 pn ðxÞ, … , RN,N1 pn ðxÞ dμ B C¼
B ⋮ C (8)
@ A
R N,N1 pm ðxÞ
¼ δn,m , n, m ∈ þ ,
A. OPRL;
33
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
and
!
ðJ 5 λJ 3Þp ðλÞ ¼ 0, (11)
! T ∞
where pðλÞ ¼ p 0ðλÞ, p1 ðλÞ, p2 ðλÞ, ⋯ . Polynomials p nðλÞ n¼0 are said to be asso-
ciated with the Jacobi-type pencil of matrices Θ.
Observe that for each system of OPRL with p 0 ¼ 1 one can take J 3 to be the
corresponding Jacobi matrix, J5 ¼ J 23 , and α, β being the coefficients of p1
(p1 ðλÞ ¼ αλ þ β). Then, this system is associated with Θ ¼ ðJ 3 , J5 , α, βÞ. Let us mention
two other circumstances where Jacobi-type pencils arise in a natural way.
1. Discretization of a 4-th order differential operator. Ben Amara,
Vladimirov, and Shkalikov investigated the following linear pencil of
differential operators [23]:
00 0
py λ y 0 þ cry ¼ 0: (12)
The initial conditions are: yð0Þ ¼ y0 ð0Þ ¼ yð1Þ ¼ y 0 ð1Þ ¼ 0, or yð0Þ ¼ y0 ð0Þ ¼
0
y0 ð1Þ ¼ py0 0 ð1Þ þ λαyð1Þ ¼ 0. Here p, r ∈ C½0, 1 are uniformly positive, while the
parameters c and α are real. Eq. (12) has several physical applications, which include a
motion of a partially fixed bar with additional constraints in the elasticity theory [23].
The discretization of this equation leads to aJacobi-type
∞ pencil, see ref. [24].
2. Partial sums of series of OPRL. Let g nðxÞ n¼0 (degg n ¼ n) be orthonormal
OPRL with positive leading coefficients. Let fc k g∞ k¼0 be a set of arbitrary
positive numbers. Then polynomials
n
1
pnðxÞ≔ c jg j ðxÞ, n ∈ þ, (13)
c0 g 0 j¼0
X
5
34
Matrix Theory - Classics and Advances
are associated with a Jacobi-type pencil [25, Theorem 1]. Polynomials p n are
normed partial sums of the following formal power series:
X
∞
c j g jðxÞ:
j¼0
Set
! ! ! !
u n ≔J3 e n ¼ an1 e n1 þ b n e n þ an e nþ1 , (16)
! ! ! ! ! !
wn ≔J 5 e n ¼ γ n2 e n2 þ β n1 e n1 þ α n e n þ βn e nþ1 þ γ n e nþ2 , n ∈ þ: (17)
!
Here and in what follows by e k with negative k we mean (vector) zero. The
following operator:
ζ ! !
X ∞
Af ¼ e 1 β e 0 þ ξn wn ,
α n¼0
!
X
∞
f ¼ ζe0þ ξ nu n ∈ l 2,fin, ζ, ξn ∈ , (18)
n¼0
with DðAÞ ¼ l2,fin is called the associated operator for the Jacobi-type pencil Θ. In the
sums in (18), only a finite number of ξn are nonzero. In what follows we shall always
assume this in the case of elements from the linear span. In particular, the following
relation holds:
! !
AJ 3 e n ¼ J 5 en , n ∈ þ :
Then
AJ 3 ¼ J 5: (19)
The matrices J3 and J 5 define linear operators with the domain l 2,fin , which we
denote by the same letters.
35
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
P
For an arbitrary nonzero polynomial f ðλÞ ∈ of degree d ∈ þ, f ðλÞ ¼ dk¼0 d k λk ,
P
dk ∈ , we set f ðAÞ ¼ dk¼0 d kA k. Here A 0 ≔El . For f ðλÞ 0, we set f ðAÞ ¼ 0j l2,fin .
2,fin
! !
¼ p nðAÞ e 0 ,
en n ∈ þ ; (20)
! !
pn ðAÞ e 0 , p mðAÞ e 0 ¼ δ n,m , n, m ∈ þ : (21)
l2
∞
Denote by f rnðλÞg n¼0, r0 ðλÞ ¼ 1, the system of polynomials satisfying
! ! !
J3 r ðλÞ ¼ λ r ðλÞ, r ðλÞ ¼ ðr0 ðλÞ, r1 ðλÞ, r2ðλÞ, … Þ T: (22)
These polynomials are orthonormal on the real line with respect to a (possibly
nonunique) nonnegative finite measure σ on the Borel subsets of (Favard’s
theorem). Consider the following operator:
" #
X
∞
!
X
∞
U ξn e n ¼ ξnrn ðxÞ , ξn ∈ , (23)
n¼0 n¼0
which maps l 2,fin onto P . Here, by P we denote a set of all (classes of equivalence
which contain) polynomials in L2σ . Denote
i. DðAÞ ¼ P;
36
Matrix Theory - Classics and Advances
X
k
Axk ¼ ξk,kþ1 xkþ1 þ ξk,j x j , (26)
j¼0
iii. The operator AΛ0 is symmetric. Here, by Λ0 we denote the operator of the
multiplication by an independent variable in L2σ restricted to P .
There is a general subclass of Jacobi-type pencils, for which elements much more
can be said about their associated operators and models [24]. Here we used some ideas
from the general theory of operator pencils, see ref. [1, Chapter IV, p. 163].
Let Θ ¼ ðJ3 , J5 , α, β Þ be a Jacobi-type pencil and A be a model representation in L2σ
of the associated operator of Θ. By Theorem 1.2 we see that AΛ 0 is symmetric:
Suppose that the measure σ is supported inside a finite real segment ½a, b,
0 < a < b < þ ∞, that is, σ ðn½a, bÞ ¼ 0. In this case, the operator Λ of the multiplica-
tion by an independent variable has a bounded inverse on the whole L2σ . Using (27) we
may write:
1
Λ A½λuðλÞ, ½λvðλÞ L2 ¼ Λ1 ½λuðλÞ, A½λvðλÞ L 2 , u, v ∈ P: (28)
σ σ
Then A 0 is symmetric with respect to the form Λ1 , L2 . Thus, in this case, the
σ
operator A is a perturbation of a symmetric operator.
Consider two examples of Jacobi-type pencils which show that Sobolev orthogo-
nality is close to them.
Example 3.1. ([26]). Let σ be a nonnegative measure on BðÞ with all finite power
Ð Ð 2
moments, dσ ¼ 1, jgðxÞ j dσ > 0, for any nonzero complex polynomial g. The
following operator:
where c > 1 and d ∈ , satisfies properties ðiÞ-ðiiiÞ of Theorem 1.2. Let J 3 be the
Jacobi matrix, corresponding to the measure σ, and J 5 ¼ J 23. Define α, β in the following
way:
1 ξ0,1s 1 þ ξ0,0
α¼ p , β¼ p : (31)
ξ0,1 Δ1 ξ0,1 Δ1
n
ffiffiffiffiffiffiσ, while Δ n≔det ðsffiffiffiffiffi
Here s j are the power moments of kþffil Þk,l¼0 , n ∈ þ, Δ 1≔1 are the
Hankel determinants. The coefficients ξ k,j are taken from property (ii) of Theorem 1.2.
8
37
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
∞
Let Θ ¼ ð J3 , J5, α, β Þ. Denote by pn ðλÞ n¼0 the associated polynomials to the pencil Θ,
∞
and denote by f rn ðλÞgn¼0 the orthonormal polynomials (with positive leading
coefficients) with respect to the measure σ . Then
1 d r nðλÞ rnð0Þ c
pnðλÞ ¼ r nðλÞ þ r nð0Þ, n ∈ þ; (32)
cþ1 cþ1 λ cþ1
pn ðλÞ pnðdÞ
rn ðλÞ ¼ ðc þ 1 Þp nðλÞ þ ðc þ 1Þd cp nðdÞ, n ∈ þ : (33)
λd
pn ðdÞ
λpnðλÞ ¼ ðcλ þ dÞ þ an1 p n1ðλÞ þ bnp n ðλÞ þ anp nþ1 ðλÞ, n ∈ þ, ðλ ∈ Þ:
cþ1
(34)
rn ðλÞ rnð0Þ
pnðλÞ ¼ r nðλÞ , (36)
λ
38
Matrix Theory - Classics and Advances
b ðα,βÞ 1
P0 ðxÞ ¼ qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ,
2 αþβþ1 Bðα þ 1, β þ 1Þ
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
b ðα,βÞ ð2n þ α þ β þ 1Þn!Γðn þ α þ β þ 1Þ ðα,βÞ
P n ðxÞ ¼ P ðxÞ, n ∈ :
2αþβþ1 Γðn þ α þ 1ÞΓðn þ β þ 1Þ n
d2 d
D α,β,c≔ x 2 1 2
þ ½ðα þ β þ 2Þx þ α β þ c, (38)
dx dx
l n,c ≔c þ nðn þ α þ β þ 1Þ: (39)
Xn
1 bðα,βÞ ðα,βÞ
Pn ðα, β, c, t0 ; xÞ≔ P k ðt 0Þ b
Pk ðxÞ, n ∈ þ, (40)
l
k¼0 k,c
10
39
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
Let K denote the real line or the unit circle. The following problem was stated in
ref. [28], see also ref. [29]: ∞
Problem 1. To describe all Sobolev orthogonal polynomials yn ðzÞ n¼0 on K, satisfying
the following two properties:
a. Polynomials yn ðzÞ satisfy the following differential equation:
! ! ! T
L y ðzÞ ¼ zM y ðzÞ, y ðzÞ ¼ y0 ðzÞ, y1 ðzÞ, … , (45)
where L, M are semi-infinite complex banded (i.e., having a finite number of non-
zero diagonals) matrices.
Relation (44) shows that yn is an eigenfunction of the operator pencil R λS, while
relation (45) means that vectors of ynðzÞ are eigenfunctions of the operator pencil
L zM. We emphasize that in Problem 1 we do not exclude OPRL or OPUC. They are
formally considered as Sobolev orthogonal polynomials with the derivatives of order
0. In this way, we may view systems from Problem 1 as generalizations of systems of
classical orthogonal polynomials (see, e.g., the book [30], and papers [20, 31, 32] for
more recent developments on this subject, as well as references therein). Related
topics are also studied for systems of biorthogonal rational functions, see, for example,
ref. [33]. Conditions ðaÞ, ðbÞ of Problem 1 are close to bispectral problems, and in
particular, to the Bochner-Krall problem (see refs. [31, 34–36] and papers cited
therein).
One example of Sobolev orthogonal polynomials, which satisfy conditions of
Problem 1, we have already met in Example 3.2. In ref.
∞[37] there was proposed a way
to construct such systems of polynomials. Let pn ðxÞ n¼0 (pn has degree n and real
coefficients) be orthogonal polynomials on ½a, b ⊆ with respect to a weight function
wðxÞ:
ðb
pn ðxÞp m ðxÞwðxÞdx ¼ A nδn,m , An > 0, n, m ∈ þ : (46)
a
X
ξ
D ξ yðxÞ ¼ d k ðxÞyðkÞ ðxÞ, (47)
k¼0
where dk and y are real polynomials of x: dξ 6¼ 0. Let us fix a positive integer ξ, and
consider the following differential equation:
11
40
Matrix Theory - Classics and Advances
where Dξ is defined as in Eq. (47), and n ∈ þ. The following assumption plays a
key role here.
Condition 1. Suppose that for each n ∈ þ, the differential Eq. (48) has a real n-th
degree polynomial solution yðxÞ ¼ yn ðxÞ.
If Condition
∞ 1 is satisfied, by relations (46),(48) we immediately obtain that
y nðxÞ n¼0 are Sobolev orthogonal polynomials:
0 1
ymðxÞ
ðb B 0 C
B ymðxÞ C
0 ðξÞ B C
yn ðxÞ, ynðxÞ, … , yn ðxÞ MðxÞB CwðxÞdx ¼ Anδn,m ,
a B ⋮ C (49)
@ A
ðξÞ
ym ðxÞ
n, m ∈ þ ,
where
0 1
d 0ðxÞ
B d 1 ðxÞ C
B C
MðxÞ≔ B C ðd ðxÞ, d1ðxÞ, … , d ξ ðxÞÞ, x ∈ ða, bÞ: (50)
@ ⋮ A 0
d ξ ðxÞ
Xr
dk
D¼ d kðzÞ , dk ðzÞ ∈ :
k¼0 dz k
∞
Let funðzÞ gn¼0, degu n ¼ n, be an arbitrary set of complex polynomials. The fol-
lowing statements are equivalent:
(A) The following equation:
for each n ∈ ℤþ, has a complex polynomial solution yðzÞ ¼ ynðzÞ of degree n;
(B) Dzn is a complex polynomial of degree n, ∀n ∈ ℤ þ ;
(C) The following conditions hold:
degdk ≤ k, 0 ≤ k ≤ r; (52)
X
r
½n jd j,j 6¼ 0, n ∈ þ , (53)
j¼0
12
41
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
Observe that condition (53) holds true, if the following simple condition holds:
Thus, there exists a big variety of linear differential operators with polynomial coeffi-
cients that have property ðAÞ. This leads to various Sobolev orthogonal polynomials.
In ref. [37] there were constructed families of Sobolev orthogonal polynomials on
the real line, depending on an arbitrary finite number of complex parameters.
Namely, we considered the following hypergeometric polynomials:
Ln ðxÞ ¼ L nðx; α, κ1 , … , κδ Þ ¼
(55)
¼ δþ1F δþ1 ðn, 1, … , 1; α þ 1, κ 1 þ 1, … , κδ þ 1; xÞ,
Pn ðxÞ ¼ Pn ðx; α, β, κ1, … , κ δ Þ ¼
¼δþ2 Fδþ1ðn, n þ α þ β þ 1, 1, … , 1; α þ 1, κ 1 þ 1, … , κ δ þ 1; xÞ, (56)
α, β, κ 1 , … , κδ > 1, n ∈ þ:
X
∞
wn
Gðt, wÞ ¼ f ðw ÞetuðwÞ ¼ gnðtÞ , t ∈ , ∣w∣ < R0 , ð R0 > 0Þ, (57)
n¼0
n!
where f , u are analytic functions in the circle fjwj < R 0g, uð0Þ ¼ 0. Such generating
functions for OPRL were studied by Meixner, see, for instance, ref. [39, p. 273]. In the
case of OPUC, we do not know any such a system, besides fz ngn¼0 . Consider the
∞
following function:
1 1
Fðt, wÞ ¼ Gðt, wÞ ¼ f ðwÞetuðwÞ,
pðuðwÞÞ pðuðwÞÞ (58)
t ∈ , ∣w∣ < R 1 < R0 , ðR1 > 0Þ,
where p ∈ : pð0Þ 6¼ 0. In the case uðzÞ ¼ z, one should take R 1 ≤ ∣z0 ∣, where z0 is a
root of p with the smallest modulus. This ensures that Fðt, wÞ is an analytic function of
two variables in any polydisk CT 1 ,R 1 ¼ ðt, wÞ ∈ 2 : jtj < T 1 , wj < R1 j , T 1 > 0. In the
general case, since pðuð0ÞÞ ¼ pð0Þ 6¼ 0, there also exists a suitable R1 , which
guarantees that F is analytic in CT1 ,R1 . Expand the function F ðt, wÞ in Taylor’s series by
w with a fixed t:
13
42
Matrix Theory - Classics and Advances
X
∞
wn
Fðt, wÞ ¼ φn ðtÞ , ðt, wÞ ∈ CT 1,R 1 , (59)
n¼0
n!
where φn ðtÞ are some complex-valued functions. Then the function φn ðtÞ is a
complex polynomial of degree n, ∀n ∈ þ , see [28, Lemma 3.5]. Suppose that degp ≥ 1,
and
X
d
pðzÞ ¼ ck zk, c k ∈ , cd 6¼ 0; c 0 6¼ 0; d ∈ : (60)
k¼0
∞
Theorem 1.4 ([28, Theorem 3.7]) Let d ∈ , and pðzÞ be as in (60). Let g nðtÞ n¼0
be a system of OPRL or OPUC, having a generating function Gðt, wÞ from (57) and
Fðt, wÞ be given by (58). Fix some positive T 1, R1 , such that F ðt, wÞ is analytic in the
polydisk C T1,R1 . Polynomials
Xn
n
φ nðzÞ ¼ b j gnj ðtÞ, n ∈ þ , (61)
j¼0
j
ð jÞ
1
where b j ¼ pðuðwÞÞ ð0Þ, have the following properties:
where
e ¼ ðc 0 , c1 , … , cd ÞT ð c 0, c1 , … , cdÞ:
M
i. Polynomials φn have the generating function Fðt, wÞ, and relation (59) holds.
n! 1
φ nðtÞ ¼ ∮ f ðwÞe tuðwÞ wn1 dw, n ∈ þ, (62)
2πi ∣w∣¼R2 pðuðwÞÞ
14
43
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
X
d
ck
ðn þ 1Þ φnþ1kðtÞ ¼
k¼0
ðn þ 1 kÞ!
! (63)
X
d
ck
¼t φ nk ðtÞ , n ∈ þ,
k¼0
ðn kÞ!
X
d
c k2k Xd
ck 2k
ðn þ 1Þ φ nþ1kðtÞ þ2 φn1k ðtÞ ¼
k¼0
ðn þ 1 kÞ! k¼0
ðn þ 1 kÞ!
! (64)
Xd
c k 2k
¼ 2t φnk ðtÞ , n ∈ ,
k¼0
ðn kÞ!
Observe that polynomials φn from the last two corollaries fit into the scheme of
Problem 1.
5. Conclusion
44
Matrix Theory - Classics and Advances
Sobolev orthogonality. This fact promises that Sobolev orthogonal polynomials can
find their applications in mathematical physics.
We think that Problem 1 is an appropriate framework for a search and a construc-
tion of new Sobolev orthogonal polynomials having nice properties. Notice that one
can produce such systems using classical OPRL or OPUC. The differential equation, if
it existed, is inherited by new systems of polynomials. The more complicated question
is the existence of a recurrence relation.
Besides new families of Sobolev orthogonal polynomials, it is of a big interest
finding classes of systems of Sobolev orthogonal polynomials, having recurrence
relations. One such a class (orthogonal polynomials on radial rays) was described in
Section 2. Thus, it looks reasonable to start not only from Sobolev orthogonality, but
from the other side, i.e., from recurrent relations. One such an example of derivation
was given by orthogonal polynomials on radial rays from Section 2.
Another possible way was given in Section 3, where we described Jacobi-type
pencils. The associated polynomials of a Jacobi type pencil have special orthogonality
relations. The associated operator yet has not a suitable functional calculus. As we
have seen, under some conditions this operator is a perturbation of a symmetric
operator. However, it is not clear how to calculate effectively a polynomial of this
operator.
In general, it is a classical situation that the operator theory stands behind special
classes of semi-infinite matrices and related objects. The operator theory of single
operators is well promoted and it is well recognized by any mathematician. It seems
that the theory of operator pencils is less known to the mathematical community.
This fact can explain the situation that pencils of semi-infinite matrices and related
polynomials appeared on a mathematical scene just recently. We hope that, as in the
classical case, these new orthogonal polynomial systems will shed some new light on
the theory of operator pencils.
Acknowledgements
The author is grateful to Professors Zolotarev and Yantsevich for their permanent
support.
45
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
References
17
46
Matrix Theory - Classics and Advances
18
47
Pencils of Semi-Infinite Matrices and Orthogonal Polynomials
DOI: http://dx.doi.org/10.5772/ITexLi.102422
48
Chapter 3
Abstract
1. Introduction
Sentence reordering is one of the popular tasks in reading comprehension [1, 2].
Regrettably, in the field of language testing, the scoring method, in practice, has been
overwhelmingly ‘all or nothing’, i.e., one can get a full score only if his/her answer
matches the correct sequence perfectly. ‘All or nothing’ is simple enough, but excludes
the idea of partial correctness, which Alderson, Percsich, and Szabo see as unfair [3].
There is no consideration of the difference in the test-taker’s degrees of performance.
Consider first a reordering task as an example of a reading comprehension ques-
tion. Text 1 is taken from Japan’s National Centre Examination (2013) with option [b]
being the correct answer [4]. The test-takers were told to choose the correct order of
the illustrations of a movie story they were presented.
49
Matrix Theory - Classics and Advances
Which of the following shows the order of the scenes as they appear in the movie?
Text 1.
Sample reordering task for reading comprehension. *: correct answer; element codes are
rearranged for convenience.
According to the test manual, only [b] gets the point; the other options make no
point. The correct sequence [b] contains three continuous ascending runs: A-B, B-C,
and C-D, and three discrete ascending runs: A-C, A-D, and B-D. Of these six
sequences [a] satisfies 5 matches, [c] 3 matches, and [d] 4 matches. By another
criterion, [a] is reached by dislocating one element (either C or B) from the correct
sequence, [c] by two elements, and [d] by two elements. By either case, the three
distractors are gradable in terms of proximity to the correct sequence. The ‘All or
nothing’ scoring method accepts only the perfect answer and ignores the differences in
the test-taker’s partially formed construct.
What should be sought is a rational evaluation method for a partial achievement of
item reordering tasks. This paper first reviews some past literature concerning this
issue, followed by an overview of a stretch of alternative measurement methods
developed by the author. The core issue is a new measurement scheme Linearity
Matrix (LM), which can compensate for the shortcomings of the present practices,
ensuring quicker and light-weight processing for non-specialists to handle.
2. Literature overview
Alderson, et al. examine four alternative methods to seek fairness and high dis-
criminability: (1) ‘Exact matching ’, (2) ‘Previous’, (3) ‘Next’, and (4) ‘Edges’ [3].
They were all in a test stage and no clear conclusion was reached. Above all, these
methods were empirically designed on the basis of ad hoc assumptions. Of them (4)
‘Edges’ is a linear extension of (1) ‘Exact matching’, and (2) ‘Previous’ and (3) ‘Next’
are variations of ‘Adjacent matching ’. ‘Exact matching’ requires each element to be
located exactly in the same position as in the correct answer. In option [a] of Text 1, A
and D will get points; the other options [c] and [d] will get no points because there is
no element in the correct position. In ‘Adjacent matching’ each of the three adjacent
pairs will get 1 point. None of the three distractors [a], [c], and [d] will get a point in
our example. But if we had an option
the initial pair C-D and the final pair A-B would each get 1 point.
Kendall’s coefficient tau is originally one of the measurement methods of
rank-order correlation [5]. Kendall’s tau is defined as:
50
Matrix as an Alternative Solution for Evaluating Sentence Reordering Tasks
DOI: http://dx.doi.org/10.5772/ITexLi.102868
In other words,
P
4 P
τ¼ 1 (1)
nðn 1Þ
where
P is the total number of items ranked after a given item by both rankings, and
n is the number of items
Kendall’s tau is a popular tool in evaluating correspondence between machine-
translated passages and human translations [6–8]. Papineni, Roukos, Ward and Zhu,
for example, measured tau for the degree of correspondence between the n-gram of
the reference text and that of the target text produced as a result of machine transla-
tion [9]. This study as well as other papers of the same interest deals with open-ended
elements for comparison where additions and reductions of words and phrases natu-
rally occur, hence irrelevant to the present scope of item reordering in which the
elements are closed.
Bollegala, Okazaki and Ishizuka’s ‘Average continuity’ [10] is a variation of
‘Adjacent matching’:
!
k
1 X
AC ¼ exp ∙ log ð Pi þ αÞ (2)
k 1 i¼2
where
k is the maximum number of continuous elements to be considered for calculation,
α is any small number (e.g., 0.001 in Bollegala, et al’s example),
and ‘precision of n continuous sentences ’:
Pn ¼ m=ðN n þ 1Þ (3)
where
n is the length of continuous elements,
m is the number of continuous elements in the correct order, and
N is the number of elements in the correct sequence.
For example, when evaluating an answer CDABE for a correct sequence ABCDE,
N = 5 (length of ABCDE), k = N = 5, m = 2 (count of CD and AB), and n = 2 (length of
CD or AB).
Table 1. (Columns: α, mean AC, SD of AC.)
Mean and SD of AC scores for all permutations of five elements. (A perfect sequence is excluded as an outlier.)
Lapata prepared all permutations of orders of eight sentences and calculated the
tau value for each text, and compared it with the human rating of comprehensibil-
ity [11]. She obtained a significant correlation coefficient of r = 0.45 (N = 64)
(p. 478). She also claimed to have confirmed that Kendall’s tau can predict
text cohesion by measuring the reading time of the target texts. Although this
was one of the few studies validating the effect of randomised sentence order, the
final coefficient value alone is not convincing enough to ensure the effect of
particular text disturbance on comprehension. Other measurement methods for
evaluating disrupted orders could have been equally significant while comprising
substantial linguistic differences beneath the surface. Therefore, the next step of
this study would be to analyse how the target sequences of sentences are created
and organised.
In conclusion, both ‘Exact matching’ and ‘Adjacent matching’ are incomplete and
counter to intuition. The problem with ‘Exact matching’ is that it is sensitive to the
absolute location of elements, and relative sequence is out of consideration. This
scoring method may be appropriate for a question in which absolute location is
significant, e.g., topic sentence in a paragraph, and initial cause of sequential events.
However, if we consider a sequence of combined cause-effect instances, relative
locations should also be rewarded with partial points. In contrast, ‘Adjacent matching’
puts too much weight on local adjacency. Even though [e] gets 2 points, its two pairs are
twisted in relative order.
The penalty rate is defined as

Penalty rate = Recovery distance / Maximal recovery distance,

where

Maximal recovery distance = n(n − 1)/2

and n is the number of elements in the sequence.
Table 2.
Sample recovery effects. (Bold letters indicate elements for recovery or ‘disruptors’.)
In the first step, C jumped over D, which jumped over C in the second step. This
redundancy has occurred because C jumped over an older element D when it moved
right, the relative order having been reversed, which had to be rectified in the second
step by making D jump over C. We need a set of constraints as follows:
No element can cross over an older element when moving right. (6)
No element can cross over a younger element when moving left. (7)
Figure 1 is a flow chart illustrating the procedure of calculating MRS (and Distance
as a supplement). The core mechanism of creating MRS is to concatenate ascending
elements in the answer sequence into a possible pair and connect its tail with the head
of another ascending pair. In the case of CDABE, the first seeds are CD, DE, AB, BE
and AE. Some of them grow into larger sequences CDE and ABE. The growth stops
here since no more concatenation is possible. These are the MRS and the count of
concatenation steps is the MRS score (= 2). A full set of codes is included in [18].
5. MRS by Excel
Figure 1.
Flow chart of MRS algorithm.
Since P indicates the number of elements that each element has to jump over to make
a complete reverse sequence (EDCBA), the maximum of P is

P_max = n(n − 1)/2,

where n is the number of elements.
Figure 2.
Sample excel sheet of MRS and MRS + Dist.
The answer CDABE is distant from the correct answer by dislocating C and D, requiring 4
occasions of jumping over (Table 2), and it is still distant from the complete reverse
by an additional 6 occasions of jumping over. It means that the sum of the recovery distance and
P is always constant, P_max.
Therefore the recovery distance can be calculated from tau. Figure 2 already includes the
column of Distance by tau.
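In symbols (a short derivation we add for clarity, using formula (1) and the definitions above): since τ = 4P/(n(n − 1)) − 1 and Recovery distance = P_max − P with P_max = n(n − 1)/2,

Recovery distance = n(n − 1)/2 − n(n − 1)(τ + 1)/4 = n(n − 1)(1 − τ)/4.

For CDABE (n = 5, P = 6) this gives τ = 24/20 − 1 = 0.2 and a distance of 5 · 4 · 0.8/4 = 4, matching Table 2.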
6. Linearity matrix
Despite the improved accessibility, a major problem with the Excel calculation is
its consumption of columns. If we follow the layout in Figure 2 the number of
columns required will soon reach the limit of 16,384 as we increase the size of the
sequence. It means n = 12 is the largest possible size. Table 3 shows the number of
entire columns required for n = 4 to n = 13, including columns for calculating recovery
distance by Kendall’s tau.
Yet another idea for representing the mechanism of MRS is to make use of a
matrix. Table 4 shows the framework of the matrix (Linearity Matrix, LM) in which the
relationships of elements are indicated. The value ‘1’ indicates that the row element is
correctly followed by the column element. The value is ‘0’ when the relative
order is disrupted. In Table 5 for CDABE, C-A, C-B, D-A and D-B are in the wrong
order. The sum of the values is the Linearity Matrix score, representing the
wellformedness of the answer sequence.
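A minimal sketch of the flat LM computation in Python (our own illustration; the chapter’s implementations are in Excel and Xojo, and the function and variable names here are hypothetical):

```python
from itertools import combinations

def linearity_matrix_score(answer, correct):
    """Flat LM score: count pairs whose relative order matches `correct`.

    Returns (LM, distance): LM is the number of 1-cells in the matrix and
    distance is the number of 0-cells, i.e. the recovery distance.
    """
    rank = {e: i for i, e in enumerate(correct)}
    lm = dist = 0
    for a, b in combinations(answer, 2):   # all n(n-1)/2 pairs, in answer order
        if rank[a] < rank[b]:
            lm += 1
        else:
            dist += 1
    return lm, dist

print(linearity_matrix_score("CDABE", "ABCDE"))  # (6, 4), as in Table 5
```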
The greatest advantage of LM is its efficiency in column consumption. When
transplanted to Excel, the core part of LM requires n(n − 1)/2 columns, resulting in a
small size of the entire columns (Table 6).
Table 3.
Size of entire columns (C) required by MRS + Dist, for n = 4 to 13.
     B  C  D  E
A    1  1  1  1
B       1  1  1
C          1  1
D             1
E                LM = 10
Table 4.
Linearity matrix for correct answer ABCDE.
     D  A  B  E
C    1  0  0  1
D       0  0  1
A          1  1
B             1
E                LM = 6
Table 5.
Linearity matrix for partially correct answer CDABE.
n    4   5   6   7   8   9  10  11  12  13
C   14  19  25  32  40  49  59  70  82  95
Table 6.
Size of entire columns (C) required by LM.
Whereas MRS + Dist by Excel would need
3,080,634 columns (in theory) when n = 20, LM requires only 214.
An additional advantage of LM is that it can calculate the recovery distance by
counting the number of zero values. Since a pair with ‘1’ mark indicates an ascending
run, the sum of values (= LM) is equivalent to P of Kendall’s tau. It means that the
count of zero values represents the recovery distance. In Table 5, either C and D have
to jump over A and B or A and B have to jump over C and D. In either case, the
disruption is solved by removing the elements with zero values.
Furthermore, since zero marks are caused by the elements in incorrect positions,
these disruptors are subject to displacement for the entire sequence to be corrected.
Without these disruptors, the remaining elements should all be arranged in ascending
orders in any combination. Thus we can identify the elements of MRS by identifying a
minimal number of steps to remove disruptors. In Table 7, we can get B and E as the MRS
by removing the disruptors in three steps (Tables 8–10).
     B  E  C  A
D    0  1  0  0
B       1  1  0
E          0  0
C             0
A                LM = 3, Distance = 7
Table 7.
Linearity matrix for partially correct answer DBECA.
     B  E  C
D    0  1  0
B       1  1
E          0
C             LM = 3, Distance = 3
Table 8.
Linearity matrix after removing disruptor A (step 1).
     E  C
B    1  1
E       0
C          LM = 2, Distance = 1
Table 9.
Linearity matrix after removing disruptor D (step 2).
     E
B    1
E       LM = 1, Distance = 0
Table 10.
Linearity matrix after removing disruptor C (step 3).
     B  C  D  E
A    4  3  2  1
B       4  3  2
C          4  3
D             4
E                Sum = 30
Table 11.
Linearity matrix for correct answer (n = 5) with gradient weights.
     D  A  B  E
C    4  0  0  1
D       0  0  2
A          4  3
B             4
E                Sum = 18
Table 12.
Linearity matrix for partially correct answer (n = 5) with gradient weights.
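Reading the weights off Tables 11 and 12 (for the answer CDABE), a correctly ordered pair whose elements lie d positions apart in the answer appears to receive the weight n − d, so that adjacent pairs count most. The sketch below encodes that inferred rule; as the conclusion notes, the gradient weighting is a proposed model, so this weight function is an assumption rather than an established definition.

```python
from itertools import combinations

def gradient_lm_score(answer, correct):
    """Gradient LM sketch: weight each correctly ordered pair by n - d,
    where d is the distance between the two elements in the answer.
    The n - d rule is inferred from Tables 11 and 12 (an assumption)."""
    rank = {e: i for i, e in enumerate(correct)}
    n = len(answer)
    score = 0
    for i, j in combinations(range(n), 2):
        if rank[answer[i]] < rank[answer[j]]:
            score += n - (j - i)
    return score

print(gradient_lm_score("ABCDE", "ABCDE"))  # 30, as in Table 11
print(gradient_lm_score("CDABE", "ABCDE"))  # 18, as in Table 12
```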
Figure 3.
Plot of scores by gradient LM and MRS + Dist. (The scores are standardised between 0 and 1. The oval represents a
density ellipse of probability 0.90.)
For part of the test-takers (N = 31), internal consistency was compared among
eight measurement methods where the reordering task was part of a larger reading
comprehension test including short-answer questions. Table 13 indicates that LM
methods are stable and can better capture the test-taker performance.
Table 13.
Alpha coefficients of eight measurement methods.
Table 14.
Processing time of four measurement methods (s).
Figure 4.
Processing time(s) by sequence length.
LM is even quicker than MRS in processing a vast amount of data. Table 14 shows
the processing time taken by four methods when n = 10. The data (size = 1000) were
randomly sampled from all permutations of 10 elements. The programmes for the two
versions of LM were written in Xojo, the same programming language as for MRS.
Distance in MRS + Dist used Kendall’s tau formula. Each value is an average of ten
trials.
It should also be noted that with LM the processing time did not deteriorate as the
number of elements increased. Figure 4 summarises the average processing time of
the four measurement methods for sequence length of n = 3 to n = 12.
(A) A technician at Compaq Computers told of a frantic call he received on the helpline.
(B) It was from a woman whose new computer simply would not work.
(C) She said she’d taken the computer out of the box, plugged it in, and sat
there for 20 minutes waiting for something to happen.
(D) The tech guy asked her what happened when she pressed the power switch.
(E) The woman replied, ‘What power switch?’
Text 2:
Correct sequence:
Both Text 3 and Text 4 differ from the correct sequence by one dislocation of
statement (C). However, while Text 4 is completely unacceptable, Text 3 sounds
much more acceptable, if not perfect. The use of the past perfect form in Text 3
refers back to the point that occurred before (D). The one dubious element is that the
tech guy’s question in (D) is slightly too specific, which could disrupt the natural flow
of discourse. More serious is the fact that Text 4 is much less acceptable than Text 3,
even though the recovery distance of Text 4 is shorter than that of Text 3. It means
that the recovery distance alone is not necessarily a predictor of penalty.
This type of task may be called an a priori (or jigsaw) task: test-takers must read
the fragments at sight and reconstruct the original passage. Because the fragments
contain a lot of linguistic clues such as tense, reference words, and definiteness
markers, the test-takers can use these clues to connect fragments. Yet another task
type (a posteriori task) requires test-takers to read or hear the entire passage initially
and reconstruct the outline by selecting descriptions in the correct order. Alderson
et al.’s ‘Queen task’ is of this type. In fact, they admit that the a posteriori type might
be more appropriate as a measurement tool of reading comprehension (p. 442). Text
6, based on intact Text 5 [22], is another sample of this type where linguistic clues are
as much neutralised as possible.
Text 5: Sample original passage — New neighbours
Mr and Mrs Smith married thirty years ago, and they have lived in the same house
since then. Mr Smith goes to work at eight o’clock every morning, and he gets home at
half past seven every evening, from Monday to Friday.
There are quite a lot of houses in their street, and most of the neighbours are nice.
But the old lady in the house opposite Mr and Mrs Smith died, and after a few weeks, a
young man and woman came to live in it.
Mrs Smith watched them for a few days from her window and then she said to her
husband, ‘Bill, the man in that house opposite always kisses his wife when he leaves in
the morning and he kisses her again when he comes home in the evening. Why don’t
you do that too?’
‘Well,’ Mr Smith answered, ‘I don’t know her very well yet.’
Text 6: Outline items for a posteriori reordering:
In this paper, we examined the nature of MRS and LM. Both of them are still in an
incubation stage in language testing as well as other psychometric measurements. The
differential weighting for ascending pairs in gradient LM is a proposed model without
empirical evidence. There might be clusters of items to be fixed together. Alternating
exchanges of talk in dialogue (as in the Compaq task) are an example of high
adhesion, whereas some kinds of discourse order may not be as adhesive. Nevertheless, it
is meaningful to attempt application of various measurement methods and validate
psychometric as well as semantic connectivity. For example, flat LM might be suitable for
a task of recollecting historical events because reference to the chronological order of
events is relevant to all (or most) pairs. When reconstructing a story, in contrast, MRS or
gradient LM might be a better tool, because local connections are considered more
important than remote connections, and the wellformedness of the story depends on how
much the completed sequence of items looks like a string of stories. Finally, describing
MRS by matrix is space and time-saving. LM is like a ripple in the pond; if you observe the
wave on the shore you can detect where the stone was cast.
Acknowledgements
The author expresses his deep gratitude to Dokkyo University Information Science
Research Institute and to the Multivariate Study Group at SAS Institute Japan for their
insightful discussions and suggestions.
Figure 5.
Comparison of AC and MRS ratings.
Table 15.
Correlation of AC with MRS, MRS + Dist, and LM. (A perfect sequence is excluded as an outlier).
Conflict of interest
Notes
1. The correlation of AC and MRS is 0.524, but 89% of AC scores are smaller than
0.1 while 51% of MRS is between 0.5 and 0.6 (Figure 5, created by JMP [23]). See
Section 2 for MRS.
3. While AC = 0.050, MRS + Dist = 0.350 and LM = 0.667. See Sections 2 and 3 for
MRS + Dist and LM.
5. Mac Pro (2 × 2.4 GHz 6-core Intel Xeon, 25 GB, 1.3 GHz DDR3)/OS 10.14/Xojo
2017r2.1
Chapter 4
Weighted Least Squares Perturbation Theory
Abstract
The interest in the problem of weighted pseudoinverse matrices and the problem
of weighted least squares (WLS) is largely due to their numerous applications. In
particular, the problem of WLS is used in the design and optimization of building
structures, in tomography, in statistics, etc. The first part of the chapter is devoted to
the sensitivity of the solution to the WLS problem with approximate initial data. The
second part investigates the properties of a SLAE with approximate initial data and
presents an algorithm for finding a weighted normal pseudo solution of a WLS prob-
lem with approximate initial data, an algorithm for solving a WLS problem with
symmetric positive semidefinite matrices and an approximate right side and also a
parallel algorithm for solving a WLS problem. The third part is devoted to the analysis
of the reliability of computer solutions of the WLS problem with approximate initial
data. Here, estimates of the total error of the WLS problem are presented, and also
software-algorithmic approaches to improving the accuracy of computer solutions.
1. Introduction
The interest in the problem of weighted pseudoinverse matrices and the WLS problem
is largely due to their numerous applications. In particular, the problem of weighted least
squares is used in the design and optimization of building structures, in tomography, in
statistics, etc. A number of properties of weighted pseudoinverse matrices underlie the
finding of weighted normal pseudosolutions. The field of application of weighted
pseudoinverse matrices and weighted normal pseudosolutions is constantly expanding.
The definition of a weighted pseudoinverse matrix with positive definite weights
was first introduced by Chipman in article [1]. In 1968, Milne introduced the definition
of a skew pseudoinverse matrix in paper [2]. The study of the properties of weighted
pseudoinverse matrices and weighted normal pseudosolutions, as well as the construc-
tion of methods for solving these and other problems, is the subject of works by Mitra,
Rao, Van Loan, Wang, Galba, Deineka, Sergienko, Ben-Israel, Elden, Wei, Wei, Ward,
etc. Weighted pseudoinverse matrices and weighted normal pseudosolutions with
degenerate weights were studied in [3–5]. The existence and uniqueness of weighted
pseudoinverse matrices with indefinite and mixed weights, as well as some of their
properties, were described in [6–8]. Application of the weighted pseudoinverse matrix
in statistics is presented, for example, in [9, 10]. Many results on weighted generalized
pseudo-inversions can be found in monographs [11, 12]. Much less work is devoted to
the study of weighted pseudoinversion under conditions of approximate initial data.
These issues are discussed in [13–17]. Analysis of the properties of weighted
pseudoinverses and weighted normal pseudosolutions, as well as the creation of solution
methods for these and other problems, are described in [18–20].
When solving applied problems, their mathematical models will have, as a rule,
approximate initial data as a result of measurements, observations, assumptions,
hypotheses, etc. Later, during discretization (‘arithmetization’) of the mathematical
model, these errors are transformed into errors of the matrix elements and the right-
hand sides of the resolving systems of equations. The input data of systems of linear algebraic
equations and WLS problems can be determined directly from physical observations,
and therefore they can have errors inherent in all measurements. In this case, the
original data we have is an approximation of some exact data. And, finally, the initial
data of mathematical models formulated in the form of linear algebra problems can be
specified exactly in the form of numbers or mathematical formulas, but, given the finite
length of a machine word, it is impossible to work with such an exact model on a
computer. The machine model of such a problem in the general case will be approxi-
mate either due to errors in converting numbers from the decimal system to binary or
due to rounding errors in the implementation of calculations on a computer.
The task is to study the properties of the machine model and to form a model of the
problem and an algorithm for obtaining an approximate solution in a machine envi-
ronment that will approximate the solution of a mathematical problem. The key
question of numerical simulation is the reliability of the obtained machine solutions.
The most complete systematic exposition of questions related to the approximate
nature of the initial data in problems of linear algebra is given in the monographs
[21–24]. Various approaches to the study and solution of ill-posed problems were
considered, for example, in [25–28]. Problems of the reliability of a machine solution for
problems with approximate initial data, i.e. estimates of the proximity of the machine
solution to the mathematical solution, estimates of the hereditary error in the mathe-
matical solution and refinement of the solution were considered in the publications
[12, 26, 29–33]. Much less work has been devoted to the study of similar questions for
the WLS problem. The sensitivity analysis of a weighted normal pseudosolution under
perturbation of the matrix and the right-hand side is the subject of papers [16, 34–36].
The chapter is devoted to the solution of the listed topical problems, namely the
development of the perturbation theory for the WLS problem with positive definite
weights and the development of numerical methods for the study and solution of
mathematical models with approximate initial data.
2.1 Preliminaries
Let the set of all m × n real matrices be denoted by R^{m×n}. Given a matrix A ∈ R^{m×n}, let A^T
denote the transpose of A, rank(A) the rank of A, R(A) the range (set of images) of A, and
N(A) the null space of A. Additionally, let ‖·‖ denote the vector 2-norm and the
consistent matrix 2-norm, and let I be an identity matrix.
Given an arbitrary matrix A ∈ R^{m×n} and symmetric positive definite matrices M and
N of orders m and n, respectively, there is a unique matrix X ∈ R^{n×m} satisfying the conditions

AXA = A, XAX = X, (MAX)^T = MAX, (NXA)^T = NXA; (1)

it is called the weighted Moore–Penrose pseudoinverse A^+_{MN} of A. We also use the weighted transpose

A^# = N^{−1}A^T M (2)

and the projectors

P = A^+_{MN}A, Q = AA^+_{MN}, P̄ = Ā^+_{MN}Ā, Q̄ = ĀĀ^+_{MN}. (3)

The weighted vector norms are defined by

‖y‖_N = (y, y)_N^{1/2} = (y^T N y)^{1/2} = ‖N^{1/2} y‖, y ∈ R^n, (4)

‖z‖_M = (z^T M z)^{1/2} = ‖M^{1/2} z‖, z ∈ R^m, (5)

and the corresponding weighted matrix norms by

‖A‖_{MN} = ‖M^{1/2} A N^{−1/2}‖, ‖A^+_{MN}‖_{NM} = ‖N^{1/2} A^+_{MN} M^{−1/2}‖. (6)
Let x, y ∈ R^m and (x, y)_M = 0. Then the vectors x and y are called M-orthogonal, i.e.
M^{1/2}x and M^{1/2}y are orthogonal. It is easy to show the following.
Lemma 1 (see [37]). Let A ∈ R^{m×n}, rank(A) = k, and let M and N be positive definite
matrices of orders m and n, respectively. Then there are matrices U ∈ R^{m×m} and
V ∈ R^{n×n} satisfying U^T M U = I and V^T N^{−1} V = I such that

A = U [D 0; 0 0] V^T, A^+_{MN} = N^{−1} V [D^{−1} 0; 0 0] U^T M, (7)

where D = diag(μ_1, μ_2, …, μ_k), μ_1 ≥ μ_2 ≥ … ≥ μ_k > 0, and the μ_i^2 are the nonzero eigen-
values of the matrix A^# A. The nonnegative values μ_i are called the weighted singular
values of A; moreover, ‖A‖_{MN} = μ_1 and ‖A^+_{MN}‖_{NM} = μ_k^{−1}.
The weighted singular value decomposition of A yields an M-orthonormal basis formed by
the columns of U and an N^{−1}-orthonormal basis formed by the columns of V.
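For computational experiments, A^+_{MN} can be formed from the ordinary pseudoinverse by a congruence with the square roots of the weights, via the standard identity A^+_{MN} = N^{−1/2}(M^{1/2}AN^{−1/2})^+M^{1/2} (see, e.g., [38]). The NumPy sketch below assumes symmetric positive definite M and N; it is an illustration we add here, not code from the chapter.

```python
import numpy as np

def weighted_pinv(A, M, N):
    """Weighted Moore-Penrose inverse A^+_{MN} (a sketch, assuming the
    identity A^+_{MN} = N^{-1/2} (M^{1/2} A N^{-1/2})^+ M^{1/2} for SPD M, N)."""
    def sqrt_and_invsqrt(S):
        w, V = np.linalg.eigh(S)            # S = V diag(w) V^T, w > 0 for SPD S
        return (V @ np.diag(np.sqrt(w)) @ V.T,
                V @ np.diag(1.0 / np.sqrt(w)) @ V.T)
    M12, _ = sqrt_and_invsqrt(M)
    _, N12inv = sqrt_and_invsqrt(N)
    return N12inv @ np.linalg.pinv(M12 @ A @ N12inv) @ M12
```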
In the study of the reliability of the obtained machine results, three linear systems
are considered. A system of linear algebraic equations with exact input data:

Ax = b. (8)

The corresponding weighted least squares problem with positive definite weights M and N:

min ‖x‖_N, C = {x : ‖Ax − b‖_M = min}, (9)
x∈C

and the weighted least squares problem with approximate initial data:

min ‖x‖_N, C̄ = {x : ‖Āx − b̄‖_M = min}, (10)
x∈C̄

where

Ā = A + ΔA, b̄ = b + Δb, x̄ = x + Δx. (11)

Assume that the errors in the matrix elements and the right-hand side satisfy the relations

‖ΔA‖_{MN} ≤ ε_A ‖A‖_{MN}, ‖Δb‖_M ≤ ε_b ‖b‖_M. (12)

The machine solution x̃ satisfies the perturbed system with a residual:

Āx̃ = b̄ + r̃. (13)
A linear manifold L is a nonempty subset of the space R^n closed with respect to the
operations of addition and multiplication by a scalar (if x and y are elements of L, then for all α, β
the vector αx + βy is an element of L). A vector x is N-orthogonal to the linear manifold L
(x ⊥_N L) if x is N-orthogonal to each vector from L.
Lemma 2 (see [38]). There exists a unique decomposition of a vector x, namely
x = x̂ + x̃, where x̂ ∈ L, x̃ ⊥_N L.
Let A be an arbitrary matrix. The kernel of matrix A, denoted by N(A), is the set of
vectors mapped into zero by A: N(A) = {x : Ax = 0}.
The set R(A) of images of matrix A is the set of vectors that are images of vectors
from the definition domain of A, i.e. R(A) = {b : b = Ax for some x}.
This lower bound is attained since b̂ belongs to the set of images of A, i.e. b̂ is an image
of some x_0: b̂ = Ax_0.
Thereby, for this x_0 the greatest lower bound is attainable:

‖b − Ax_0‖²_M = ‖b − b̂‖²_M = ‖b̃‖²_M. (15)
and hence, the lower bound can only be attained for x*, for which Ax* = b̂.
According to Theorem 2, each vector x* can be presented as a sum of two orthogonal
vectors: x* = x̂* + x̃*, where x̂* ∈ R(A^#), x̃* ∈ N(A).
Therefore, Ax* = Ax̂* and hence ‖b − Ax*‖²_M = ‖b − Ax̂*‖²_M. Note that

‖x*‖²_N = ‖x̂*‖²_N + ‖x̃*‖²_N ≥ ‖x̂*‖²_N, (17)

where strict inequality is possible when x* ≠ x̂* (i.e. if x* does not coincide with
its projection onto R(A^#)).
It was shown above that x_0 minimizes ‖Ax − b‖_M if and only if Ax_0 = b̂, and
among the vectors that minimize ‖Ax − b‖_M, each vector with the minimum norm
should belong to the set of images of A^#. To establish the uniqueness of a minimum-
norm vector, assume that x̂ and x* belong to R(A^#) and that Ax̂ = Ax* = b̂.
Then x* − x̂ ∈ R(A^#); however A(x* − x̂) = 0, so x* − x̂ ∈ N(A) = ⊥_N R(A^#).
As the vector x* − x̂ is N-orthogonal to itself, ‖x* − x̂‖_N = 0, i.e. x* = x̂.
Remark 2. There is another assertion that is equivalent to Theorem 4. There exists
an n-dimensional vector y such that

x̂ = N^{−1} A^T M y = A^# y, (20)

i.e. x̂ can be obtained by means of any vector y_0 that satisfies the equation
A^# A A^# y = A^# b, by the formula x̂ = A^# y_0.
Proof. According to the condition of Theorem 3, R(A^#) = R(A^# A). Since the vector
A^# b belongs to the set of images of A^#, it should belong to the set of images of A^# A and thus
should be an image of some vector x with respect to the transformation A^# A. In other
words, Eq. (21) (with respect to x) has at least one solution. If x is a solution of
Eq. (21), then x̂ is the projection of x onto R(A^#), since Ax = Ax̂ according to
Theorem 2. Since x̂ ∈ R(A^#), the vector x̂ is an image of some vector y with respect to the
transformation A^#: x̂ = A^# y.
Thus, we have established that there exists at least one solution of Eq. (21) in the
form of (20). To establish the uniqueness of this solution, we assume that x̂_1 = A^# y_1
and x̂_2 = A^# y_2 satisfy Eq. (21). Then A^# A (A^# y_1 − A^# y_2) = 0; therefore,
A^# (y_1 − y_2) ∈ N(A^# A) = N(A), from which the equality A A^# (y_1 − y_2) = 0 follows.
Therefore y_1 − y_2 ∈ N(AA^#) = N(A^#); hence, x̂_1 = A^# y_1 = A^# y_2 = x̂_2.
Thus, there exists exactly one solution of Eq. (21) in the form (20). The proof of
Theorem 5 will be completed if we can show that, by virtue of Theorem 1, the solution
found in the form (14) is also a solution of the equation Ax = b̂, where b̂ is the weighted
projection of b onto R(A), i.e. A^# b = A^# b̂.
Theorem 4 establishes that there is a unique solution from R(A^#) of the equation

Ax = b̂. (22)
We can formulate the following theorem for problem (9) in a shorter form.
Theorem 6. Let A ∈ R^{m×n}; then x = A^+_{MN} b is the M-weighted least squares solution
with the minimum N-norm of the system Ax = b.
Note that in [18] a slightly different mathematical apparatus was used to prove the
existence and uniqueness of the M-weighted least squares solution with the minimum
N-norm of the system Ax = b.
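A small illustration of Theorem 6 (a sketch we add, not the chapter’s algorithm): substituting z = N^{1/2}x turns problem (9) into an ordinary least squares problem for M^{1/2}AN^{−1/2}, whose minimum-norm solution is returned by a standard solver.

```python
import numpy as np

def weighted_lstsq(A, b, M, N):
    """Minimum-N-norm, M-weighted least squares solution x = A^+_{MN} b.

    Sketch via the substitution z = N^{1/2} x: min ||Ax - b||_M with minimal
    ||x||_N becomes an ordinary minimum-norm least squares problem.
    Function and variable names here are illustrative."""
    wM, VM = np.linalg.eigh(M)
    wN, VN = np.linalg.eigh(N)
    M12 = VM @ np.diag(np.sqrt(wM)) @ VM.T          # M^{1/2}
    N12inv = VN @ np.diag(1.0 / np.sqrt(wN)) @ VN.T  # N^{-1/2}
    z, *_ = np.linalg.lstsq(M12 @ A @ N12inv, M12 @ b, rcond=None)
    return N12inv @ z
```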
For Ā = A + ΔA, the weighted singular values of A and Ā satisfy

μ_i(A) − ‖ΔA‖_{MN} ≤ μ_i(Ā) ≤ μ_i(A) + ‖ΔA‖_{MN}. (24)

Lemma 4 (see [40]). Let A, ΔA ∈ R^{m×n}, rank(Ā) = rank(A) and
‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1. Then

‖Ā^+_{MN}‖_{NM} ≤ ‖A^+_{MN}‖_{NM} / (1 − ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM}). (25)
Lemma 5. Let G = Ā^+_{MN} − A^+_{MN}, Ā = A + ΔA and rank(Ā) = rank(A). Then G can
be represented as the sum of three matrices, G = G_1 + G_2 + G_3, where

G_1 = −Ā^+_{MN}ΔAA^+_{MN}, (26)

G_2 = (I − P̄)N^{−1}ΔA^T A^{+T}_{MN}NA^+_{MN} = (I − P̄)ΔA^# (A^+_{MN})^# A^+_{MN}, (27)

G_3 = Ā^+_{MN}(I − Q). (28)
Proof. Following [26], G can be represented as the sum of the following matrices:

G = [P̄ + (I − P̄)](Ā^+_{MN} − A^+_{MN})[Q + (I − Q)]
= P̄Ā^+_{MN}Q + P̄Ā^+_{MN}(I − Q) − P̄A^+_{MN}Q − P̄A^+_{MN}(I − Q)
+ (I − P̄)Ā^+_{MN}Q + (I − P̄)Ā^+_{MN}(I − Q) − (I − P̄)A^+_{MN}Q − (I − P̄)A^+_{MN}(I − Q). (29)

Since

P̄Ā^+_{MN} = Ā^+_{MN}, (I − P̄)Ā^+_{MN} = 0, A^+_{MN}Q = A^+_{MN}, A^+_{MN}(I − Q) = 0, (30)

we obtain

G = Ā^+_{MN}Q + Ā^+_{MN}(I − Q) − P̄A^+_{MN} − (I − P̄)A^+_{MN}
= (Ā^+_{MN}Q − P̄A^+_{MN}) − (I − P̄)A^+_{MN} + Ā^+_{MN}(I − Q). (31)

For the first term,

G_1 = Ā^+_{MN}Q − P̄A^+_{MN} = Ā^+_{MN}AA^+_{MN} − Ā^+_{MN}ĀA^+_{MN} = −Ā^+_{MN}ΔAA^+_{MN}. (32)

To estimate the second term, we use properties (1):

A^+_{MN} = A^+_{MN}AA^+_{MN} = N^{−1}A^T A^{+T}_{MN} N A^+_{MN}
= N^{−1}Ā^T A^{+T}_{MN} N A^+_{MN} − N^{−1}ΔA^T A^{+T}_{MN} N A^+_{MN}. (33)

Since

(I − P̄)N^{−1}Ā^T A^{+T}_{MN} N A^+_{MN} = N^{−1}Ā^T A^{+T}_{MN} N A^+_{MN} − Ā^+_{MN}Ā N^{−1}Ā^T A^{+T}_{MN} N A^+_{MN} = 0, (35)

we obtain

G_2 = −(I − P̄)A^+_{MN} = (I − P̄)N^{−1}ΔA^T A^{+T}_{MN} N A^+_{MN} = (I − P̄)ΔA^# (A^+_{MN})^# A^+_{MN}. (36)

Finally,

G = Ā^+_{MN} − A^+_{MN} = −Ā^+_{MN}ΔAA^+_{MN} + (I − P̄)ΔA^# (A^+_{MN})^# A^+_{MN} + Ā^+_{MN}(I − Q). (37)
Lemma 6 (see [41]). If rank(Ā) = rank(A) = k, then

‖(I − Q̄)Q‖_{MM} = ‖Q̄(I − Q)‖_{MM}. (38)

Theorem 7. Let ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1, rank(Ā) = rank(A) = k, and h = ‖A‖_{MN}‖A^+_{MN}‖_{NM}. Then

‖Ā^+_{MN} − A^+_{MN}‖_{NM} ≤ C (ε_A h) / (1 − ε_A h) ‖A^+_{MN}‖_{NM}; (40)

moreover,
if A is not a full rank matrix, then C = 3;
if m > n = k or n > m = k, then C = 2;
if m = n = k, then C = 1.
Proof. To obtain estimates, we use the results of Lemma 5:

Ā^+_{MN} − A^+_{MN} = −Ā^+_{MN}ΔAA^+_{MN} + (I − P̄)ΔA^# (A^+_{MN})^# A^+_{MN} + Ā^+_{MN}(I − Q). (41)

The three terms are bounded, respectively, by ‖Ā^+_{MN}‖_{NM}‖ΔA‖_{MN}‖A^+_{MN}‖_{NM},
by ‖ΔA‖_{MN}‖A^+_{MN}‖²_{NM}, and, by Lemma 6, again by ‖Ā^+_{MN}‖_{NM}‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} ((42)–(44)).
Using the results of Lemma 4 and the notation hε_A = ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM}, each of these bounds
does not exceed (hε_A / (1 − hε_A))‖A^+_{MN}‖_{NM}, and we obtain an estimate for the absolute error of the
weighted pseudoinverse matrix:

‖Ā^+_{MN} − A^+_{MN}‖_{NM} ≤ C (hε_A) / (1 − hε_A) ‖A^+_{MN}‖_{NM}, C = 1, 2, 3. (45)
Let us estimate the error of the weighted minimum-norm least squares solution.
Let us introduce the following notation:

α = ‖Δb‖_M / (‖A‖_{MN}‖x‖_N), β = ‖r‖_M / (‖A‖_{MN}‖x‖_N), γ = ‖r̃‖_M / (‖A‖_{MN}‖x‖_N),
α_l = ‖Δb‖_M / (‖A‖_{MN}‖x_l‖_N), β_l = ‖r‖_M / (‖A‖_{MN}‖x_l‖_N), γ_l = ‖r̃_l‖_M / (‖A‖_{MN}‖x_l‖_N),
γ_k = ‖r̃_k‖_M / (‖A‖_{MN}‖x‖_N), (47)

where r = b − Ax and r̃ denotes the residual of the computed solution. Under the assumptions of
Theorem 7, the hereditary error of the weighted normal pseudosolution satisfies

‖x̄ − x‖_N / ‖x‖_N ≤ h / (1 − hε_A) (2ε_A + α + hε_A β). (48)
Then,

x̄ − x = [−Ā^+_{MN}ΔAA^+_{MN} + (I − P̄)N^{−1}ΔA^T A^{+T}_{MN}NA^+_{MN} + Ā^+_{MN}(I − Q)] b + Ā^+_{MN}(b̄ − b)
= −Ā^+_{MN}ΔAx + (I − P̄)N^{−1}ΔA^T A^{+T}_{MN}Nx + Ā^+_{MN}(I − Q)b + Ā^+_{MN}(b̄ − b). (51)
Thus, using

(I − Q)b = (I − Q)r = r, r = b − Ax, x = A^+_{MN}b, (54)

and applying Lemma 6, the weighted norm of each term in (51) can be estimated as follows:

a. ‖Ā^+_{MN}ΔAx‖_N = ‖N^{1/2}Ā^+_{MN}M^{−1/2} · M^{1/2}ΔAN^{−1/2} · N^{1/2}x‖
≤ ‖N^{1/2}Ā^+_{MN}M^{−1/2}‖ ‖M^{1/2}ΔAN^{−1/2}‖ ‖N^{1/2}x‖ = ‖Ā^+_{MN}‖_{NM}‖ΔA‖_{MN}‖x‖_N. (55)

b. ‖(I − P̄)N^{−1}ΔA^T A^{+T}_{MN}Nx‖_N ≤ ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM}‖x‖_N. (56)

c. For the third term, Ā^+_{MN}(I − Q)b = Ā^+_{MN}Q̄(I − Q)r, where

‖Q̄(I − Q)‖_{MM} = ‖M^{−1}Ā^{+T}_{MN}Ā^T M(I − Q)‖_{MM} = ‖M^{−1}Ā^{+T}_{MN}ΔA^T M(I − Q)‖_{MM}
≤ ‖M^{1/2}ΔAN^{−1/2}‖ ‖N^{1/2}Ā^+_{MN}M^{−1/2}‖ = ‖ΔA‖_{MN}‖Ā^+_{MN}‖_{NM}, (58)

so that

‖Ā^+_{MN}(I − Q)b‖_N ≤ ‖Ā^+_{MN}‖_{NM} ‖Q̄(I − Q)‖_{MM} ‖r‖_M ≤ ‖Ā^+_{MN}‖²_{NM}‖ΔA‖_{MN}‖r‖_M. (59)

d. ‖Ā^+_{MN}(b̄ − b)‖_N = ‖N^{1/2}Ā^+_{MN}M^{−1/2} · M^{1/2}(b̄ − b)‖ ≤ ‖Ā^+_{MN}‖_{NM}‖b̄ − b‖_M. (60)
Taking into account ‖I − P̄‖_N ≤ 1 and applying Lemma 4 to replace ‖Ā^+_{MN}‖_{NM} by
‖A^+_{MN}‖_{NM} / (1 − hε_A), we obtain estimate (48), as required.
Specifically, if M = I ∈ R^{m×m} and N = I ∈ R^{n×n}, then the estimates of the hereditary
error of normal pseudosolutions of systems of linear algebraic equations follow from the
next theorem.
Theorem 8 (see [32]). Let ‖ΔA‖ ‖A^+‖ < 1 and rank(Ā) = rank(A) = k. Then

‖x̄ − x‖ / ‖x‖ ≤ h / (1 − hε_A) (2ε_A + ε_{b_k} + hε_A), ε_{b_k} = ‖b̄ − b_k‖ / ‖b_k‖, (62)

where b_k is the projection of the right-hand side of problem (8) onto the principal
left singular subspace of the matrix A [42], i.e., b_k ∈ Im A; h = h(A) = ‖A‖ ‖A^+‖ is the
condition number of A; the symbol ‖·‖, unless otherwise stated, denotes the Euclidean
vector norm and the corresponding spectral matrix norm; and A^+ is the Moore–Penrose
pseudoinverse.
Case 2. The rank of the perturbed matrix is larger than that of the original matrix A,
i.e. rank(Ā) > rank(A) = k.
Define the idempotent matrices:

P = A^+_{MN}A, Q = AA^+_{MN}, P̄_k = Ā^+_{k,MN}Ā, Q̄_k = ĀĀ^+_{k,MN}. (63)
Theorem 9. Assume that ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1/2 and rank(Ā) > rank(A) = k. Then

‖x̄_k − x‖_N / ‖x‖_N ≤ h / (1 − 2hε_A) (2ε_A + α + hε_A β), (64)

where h(A) = ‖A‖_{MN}‖A^+_{MN}‖_{NM} is the weighted condition number of A, the sym-
bols ‖·‖_{MN} and ‖·‖_{NM} denote the weighted matrix norms defined in Eqs. (4)–(6), and A^+_{MN}
is the weighted Moore–Penrose pseudoinverse.
Proof. The desired estimate is derived using the method of [32], which is based on
the singular value decomposition of matrices. Specifically, Ā is represented by its
weighted singular value decomposition:

Ā = Ū D̄ V̄^T. (65)

Set

Ā_k = Ū D̄_k V̄^T, (66)

where D̄_k is a rectangular matrix whose first k diagonal elements are nonzero and
equal to the corresponding elements of D̄, while all the other elements are zero.
The weighted minimum-norm least squares solution to problem (10) is approxi-
mated by the weighted minimum-norm least squares solution x̄_k to the problem

min ‖x‖_N, C̄ = {x : ‖Ā_k x − b̄‖_M = min}. (67)
x∈C̄

The matrix Ā_k is defined by (66) and has the same rank k as the matrix of the
unperturbed problem.
Thus, the error estimation of the least-squares solution for matrices with a
modified rank is reduced to the case of the same rank. This fact is used to estimate
‖x̄_k − x‖_N / ‖x‖_N. The error of the weighted pseudoinverse matrix then becomes:
kx
þ þ þ
Gk ¼ ½Pk þ ðI Pk Þ AkMN Aþ
MN ½Q þ ðI Q Þ ¼ P kAk MN Q þ P kA kMN ðI Q Þ
P kAþ þ þ þ
MN Q P k AMN ðI Q Þ ðI Pk ÞAk MN Q þ ð I P k ÞA k MN ðI Q Þ
ð I PkÞA þ þ þ þ þ (68)
MN Q þ ðI Pk ÞAMN ðI Q Þ ¼ Ak MNQ PkAMN ðI Pk ÞA MN þ
kþ
þA þ þ þ þ þ þ
MN ðI Q Þ ¼ Ak MNAAMN AkMN AA MN ð I Pk ÞA MN þ A kMNðI Q Þ ¼
þ Aþ ð I PkÞA þ þ Akþ ðI Q Þ,
¼ AkMN A A
MN MN MN
xk x ¼ A
k þ ðI Q kÞb þ A
k þ ΔAx þ ð I P k ÞN1 ΔAT AþT Nx A kþ b
b :
MN MN MN MN
(70)
Let us estimate ΔA_k = A − Ā_k:

‖ΔA_k‖_{MN} = ‖A − Ā_k‖_{MN} = ‖A − Ā + Ā − Ā_k‖_{MN} ≤ ‖Ā − Ā_k‖_{MN} + ‖ΔA‖_{MN}
= ‖Ū [0 0; 0 D̄_{k+1}] V̄^T‖_{MN} + ‖ΔA‖_{MN} ≤ 2‖ΔA‖_{MN}, (72)

since, by (24), the discarded weighted singular values of Ā do not exceed ‖ΔA‖_{MN}. In the
unweighted case this yields

‖x̄_k − x‖ / ‖x‖ ≤ h / (1 − 2hε_A) (2ε_A + ε_{b_k} + hε_A), ε_{b_k} = ‖b̄ − b_k‖ / ‖b_k‖. (73)
Case 3. The rank of the perturbed matrix is smaller than that of the original matrix,
i.e. rank(Ā) = l < rank(A) = k. Define the idempotent matrices:

P_l = A^+_{l,MN}A, Q_l = AA^+_{l,MN}, P̄ = Ā^+_{MN}Ā, Q̄ = ĀĀ^+_{MN}. (74)
Then the following estimate holds:

‖x_l − x̄‖_N / ‖x_l‖_N ≤ (μ_1/μ_l) / (1 − 2‖ΔA‖_{MN}/μ_l) (2ε_A + α_l + (μ_1/μ_l) ε_A β_l). (75)

Applying Lemma 4 and passing to the weighted norms yields the estimate

‖x_l − x̄‖_N / ‖x‖_N ≤ ‖Ā^+_{MN}‖_{NM}‖ΔA‖_{MN} + ‖A^+_{l,MN}‖_{NM}‖ΔA‖_{MN}
+ ‖Ā^+_{MN}‖_{NM}‖A^+_{l,MN}‖_{NM}‖ΔA‖_{MN}‖r‖_M / ‖x‖_N + ‖Ā^+_{MN}‖_{NM}‖Δb‖_M / ‖x‖_N
≤ (‖A^+_{l,MN}‖_{NM}‖A‖_{MN}) / (1 − ‖ΔA_l‖_{MN}‖A^+_{l,MN}‖_{NM}) × (2 ‖ΔA‖_{MN}/‖A‖_{MN}
+ ‖Δb‖_M / (‖A‖_{MN}‖x‖_N) + ‖A^+_{l,MN}‖_{NM}‖A‖_{MN} (‖ΔA‖_{MN}/‖A‖_{MN}) ‖r‖_M / (‖A‖_{MN}‖x‖_N)). (79)

In the unweighted case this gives

‖x_l − x̄‖ / ‖x_l‖ ≤ (μ_1/μ_l) / (1 − 2‖ΔA‖/μ_l) (2ε_A + ε_{b_l} + (μ_1/μ_l) ε_A), (80)

where x_l is the projection of the normal pseudosolution of problem (8) onto the
right principal singular subspace of the matrix A of dimension l, b_l is the projection of the
right-hand side b onto the principal left singular subspace of dimension l of the matrix
A, ε_{b_l} = ‖b̄ − b_l‖ / ‖b_l‖, and the μ_i are the singular values of the matrix A.
3.2 Estimates of the hereditary error of a weighted normal pseudosolution for full
rank matrices
If A has full column rank (m > n = k), then P̄ = I, the second term in Lemma 5 vanishes, and

Ā^+_{MN} − A^+_{MN} = −Ā^+_{MN}ΔAA^+_{MN} + Ā^+_{MN}(I − Q), (84)

which leads to the estimate

‖x̄ − x‖_N / ‖x‖_N ≤ h / (1 − hε_A) (2ε_A + ‖Δb‖_M / (‖A‖_{IN}‖x‖_N)). (86)

If A has full row rank (n > m = k), then Q = I, the third term vanishes, and

Ā^+_{MN} − A^+_{MN} = −Ā^+_{MN}ΔAA^+_{MN} + (I − P̄)N^{−1}ΔA^T A^{+T}_{MN}NA^+_{MN}. (87)
which are easy to obtain for the weighted matrix norm on the basis of the perturbation
theory for symmetric matrices.
Lemma 7. Let A, ΔA ∈ R^{m×n}, rank(Ā) = rank(A) and ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1. Then
the estimate of the relative error of the condition number of the matrix A has the form

|h̄ − h| / h ≤ ε_A (1 + h) / (1 − ε_A h), (89)

where h = h(A) = ‖A‖_{MN}‖A^+_{MN}‖_{NM} and h̄ = h(Ā) are the weighted condition numbers of A and Ā, the
symbols ‖·‖_{MN} and ‖·‖_{NM} denote the weighted matrix norms defined by Eqs. (4)–(6),
and A^+_{MN} is the weighted Moore–Penrose pseudoinverse.
Thus, estimates of the hereditary error whose right-hand sides are determined by the
approximate data can be obtained without inequalities (88). Estimates similar to
(90) can be obtained for all the previously considered cases.
Remark 4. Under the conditions of Theorem 15, using the inequality

‖x̄ − x‖_N / ‖x̄‖_N ≤ (‖x̄ − x‖_N / ‖x‖_N)(1 + ‖x̄ − x‖_N / ‖x̄‖_N), (91)

we can obtain estimates of the form (92), in which the error is measured relative to the
computed solution. Estimates similar to (92) can be obtained for all the previously
considered cases.
4.1 Investigation of the properties of WLS problem with approximate initial data
In the study of the mathematical properties of the weighted least squares problem
with approximate initial data associated with computer realization, by the approximate
model in (10), (11) we will understand exactly the computer model of the problem.
We will assume that the error of the initial data ΔA, Δb in this case contains, in
addition to everything else, the error that occurs when the matrix coefficients are written
to the computer memory or computed.
By a matrix of full rank within the error of the initial data we mean a matrix whose
rank cannot be changed by any change of its elements within ΔA.
By a matrix of full rank within the machine precision we mean a matrix whose
rank cannot be changed by changing its elements within the machine precision.
Lemma 8. If rank(A) = min(m, n) and

‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1, (93)

then rank(Ā) = rank(A).
Proof. For the proof, let, for example, rank(A) = m. Setting ‖ΔA‖_{MN} = ε,
inequality (93) can be rewritten as ε/μ_m < 1, which is equivalent to

μ_m − ε > 0. (94)

By (24), μ_m(Ā) ≥ μ_m − ε > 0, so the perturbed matrix retains full rank.
In computations, the corresponding machine criterion consists of the two conditions

ε_A h(Ā) < 1, (95)

1.0 + 1/h(Ā) ≠ 1.0, (96)

where h(Ā) = ‖Ā‖_{MN}‖Ā^+_{MN}‖_{NM} is the weighted condition number of the matrix Ā.
The fulfillment of the first condition (95) guarantees that the matrix has full rank
within the accuracy of the initial data, and the second (96), which is checked
in floating-point arithmetic, means that the matrix has full rank within the machine
precision.
Under these conditions, the solution of the machine problem exists, is unique,
and is stable. Such a machine problem should be considered correctly posed within
the accuracy of the initial data.
Otherwise, the matrix of the perturbed system may be a matrix not of full rank and,
therefore, the machine model of the problem (10), (11) should be considered ill-
posed. A key factor in studying the properties of a machine model is the criterion of
the correctness of the problem. Thereby, a useful fact is that condition (96) for studying
the machine model of the problem involves the value inverse to h(Ā). As a result,
no order overflow occurs for large condition numbers, and the underflow of 1.0/h(Ā)
for large condition numbers is not fatal: the machine result is taken to be
zero, which allows us to make the correct conclusion about the loss of rank of the
matrix of the machine problem.
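The test (96) is trivial to carry out in floating point; the sketch below assumes the weighted condition number h(Ā) has already been computed (the function name is illustrative):

```python
def full_rank_within_machine_precision(h_bar):
    """Criterion (96): in floating point, 1.0 + 1/h != 1.0 holds exactly when
    1/h stays above the rounding level; a huge weighted condition number
    makes the test fail, signalling loss of rank. h_bar is assumed given."""
    return 1.0 + 1.0 / h_bar != 1.0

print(full_rank_within_machine_precision(1e12))  # True for double precision
print(full_rank_within_machine_precision(1e17))  # False: rank is lost
```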
To analyze the properties of a machine model of problems with matrices of
incomplete rank under conditions of approximate initial data, a fundamental role is
played by the definition of the rank of a matrix.
The rank of the matrix under the condition of approximate initial data (the effective
rank, or δ-rank) is

r(δ) = min { rank(B) : ‖A − B‖_{MN} ≤ δ }. (97)

This means that the δ-rank of the matrix is equal to the minimum rank among all
matrices in the neighborhood ‖A − B‖_{MN} ≤ δ.
It follows from [28] that if r(δ) is the δ-rank of the matrix, then

μ_1 ≥ … ≥ μ_{r(δ)} > δ ≥ μ_{r(δ)+1} ≥ … (98)

The practical algorithm for finding the δ-rank can be defined as follows: find the
value of r equal to the largest value of i for which the inequality

δ / μ_i < 1, μ_i ≠ 0, i = 1, 2, … (99)

is fulfilled.
Using the effective rank of a matrix, one can always find the number of a stable
projection that approximates the solution or the projection of the solution.
To analyze the rank of a matrix within the machine precision, the value δ can be
tied to machine precision, for example, by setting it equal to macheps · ‖B‖.
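In terms of the weighted singular values, rule (99) amounts to counting the μ_i that exceed δ. A sketch (assuming μ is sorted in non-increasing order, as produced by the decomposition (7)):

```python
import numpy as np

def delta_rank(mu, delta):
    """Effective (delta-)rank per (99): the largest i with delta/mu_i < 1,
    i.e. the number of weighted singular values exceeding delta.
    Assumes mu is sorted in non-increasing order."""
    return int(np.sum(np.asarray(mu) > delta))

mu = [5.0, 1.2, 3e-14, 0.0]
print(delta_rank(mu, delta=1e-10))  # 2: the trailing values fall below delta
```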
1. If the rank of the matrix has not changed, rank(Ā) = rank(A) = k, an
approximate weighted normal pseudosolution is constructed by the formula

x̄ = Ā^+_{MN} b̄, (100)

where Ā^+_{MN} is represented by the weighted singular value decomposition (7).
2. If the rank of the matrix has increased, rank(Ā) > rank(A) = k, an approximate
weighted normal pseudosolution is constructed as

x̄_k = Ā^+_{k,MN} b̄. (101)

The weighted pseudoinverse matrix Ā^+_{k,MN} is defined as follows:

Ā^+_{k,MN} = N^{−1} V̄ D̄_k^+ Ū^T M, (102)

where D̄_k is a rectangular matrix, the first k diagonal elements of which are
nonzero and coincide with the corresponding elements of the matrix D̄ from (7),
and all other elements are equal to zero.
In this case, the weighted normal pseudosolution of system (9) is approximated
by the projection of the weighted normal pseudosolution of system (10) onto the
right principal weighted singular subspace of dimension k of the matrix Ā and, if
‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1/2, then the error of the solution is estimated by formula
(64).
3. If the rank of the matrix has decreased, rank(A) > rank(Ā) = l, an approximation
to the projection of a weighted normal pseudosolution of problem (9) is
constructed using formula (100). In this case, the projection of the weighted
normal pseudosolution of system (9) onto the principal right weighted singular
subspace of dimension l of the matrix A is approximated by the weighted normal
pseudosolution of system (10) and, if ‖ΔA‖_{MN}/μ_l < 1/2, the projection error is estimated
by formula (75).
Remark 5. If the rank of the original matrix is unknown, then the δ-rank should be
taken as the projection number in (101). In this case, it is guaranteed that a stable
approximation is found either to a weighted normal pseudosolution or to a projection,
respectively, with the corresponding error estimates.
If the rank of the original matrix is known, then it is guaranteed that an approximation
to the weighted normal pseudosolution is found, with the appropriate estimates.
Remark 6. Because of the zero columns in the matrix D̄^+, only the first n
columns of the matrix U can actually contribute to the product (100). Moreover, if
some of the weighted singular values are equal to zero, then fewer than n columns of
U are needed. If k_p is the number of nonzero weighted singular values, then U can
be reduced to the size m × k_p, D̄^+ to the size k_p × k_p, and V^T to the size k_p × n.
Formally, such matrices U and V are not M-orthogonal and N^{−1}-orthogonal, respec-
tively, since they are not square. However, their columns are weighted orthonormal
systems of vectors.
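Combining (101), (102) and Remark 6, the projection solution can be sketched as follows. U, D, V are assumed to come from the weighted SVD (7) (with D passed as the vector of weighted singular values); this is an illustration we add, not the chapter's production code.

```python
import numpy as np

def truncated_weighted_pseudosolution(U, D, V, M, N, b, k):
    """x_k = N^{-1} V D_k^+ U^T M b, cf. Eqs. (101)-(102): keep the k leading
    weighted singular values and zero the rest. U (m x m), V (n x n) and the
    vector D of weighted singular values are assumed given by the weighted
    SVD (7); M, N are the SPD weight matrices."""
    Dk_pinv = np.zeros((V.shape[1], U.shape[1]))   # n x m middle factor
    for i in range(k):
        Dk_pinv[i, i] = 1.0 / D[i]                 # invert retained mu_i only
    return np.linalg.solve(N, V @ Dk_pinv @ U.T @ M @ b)
```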
5.1 Estimates of the total error of a weighted normal pseudosolution for matrices
of arbitrary rank
Estimates of the total error take into account both the hereditary error due to the
error in the initial data and the computational error due to an approximate method for
determining the solution to the problem. In this case, the method of obtaining a
solution is not taken into account. The computational error can be a consequence of
both an approximate method of obtaining a solution and an error due to inaccuracy in
performing arithmetic operations on a computer. The residual vector r̃ = Āx̃ − b̄ takes
into account the overall effect of these errors.
Let us obtain estimates for the total error of the weighted normal pseudosolution
using the previously introduced notation (47). Let us consider three cases.
Case 1. The rank of the original matrix A remains the same under its perturbation, i.e.,
rank(A) = rank(Ā).
Theorem 17. Assume that ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1, rank(Ā) = rank(A) = k, and let
x̃ ∈ R(Ā^#). Then

‖x̃ − x‖_N / ‖x‖_N ≤ h / (1 − hε_A) (2ε_A + α + hε_A β + γ). (103)
Proof. For the hereditary error, in this case, estimate (48) holds.
To estimate the computational error x̄ − x̃, we use the relation

Ā(x̄ − x̃) = r̃ = b̄_k − Āx̃, (104)

where b̄_k is the projection of the vector b̄ onto the principal left weighted singular subspace
of the matrix Ā, i.e. b̄_k ∈ R(Ā).
Considering that x̄ − x̃ ∈ R(Ā^#) and the fact that Ā^+_{MN}Ā is a projector onto R(Ā^#), we
have

x̄ − x̃ = Ā^+_{MN}Ā(x̄ − x̃) = Ā^+_{MN}r̃. (105)

Hence

‖x̄ − x̃‖_N / ‖x̄‖_N ≤ ‖Ā‖_{MN}‖Ā^+_{MN}‖_{NM} ‖r̃‖_M / ‖b̄_k‖_M. (106)

An estimate of the total error of the normal pseudosolution follows from the
relations

‖x̃ − x‖_N / ‖x‖_N ≤ ‖x̄ − x‖_N / ‖x‖_N + ‖x̃ − x̄‖_N / ‖x‖_N, ‖x̃ − x̄‖_N / ‖x‖_N ≤ ‖A‖_{MN}‖Ā^+_{MN}‖_{NM} ‖r̃‖_M / ‖b̄_k‖_M, (108)

giving

‖x̃ − x‖_N / ‖x‖_N ≤ h / (1 − hε_A) (2ε_A + α + hε_A β + γ_k). (109)
Considering the fact that x̄_k − x̃_k ∈ R(Ā_k^#) and that the operator Ā^+_{k,MN}Ā_k is the projection
operator onto R(Ā_k^#), we obtain an estimate of the computational error for the projection of the
normal pseudosolution:

‖x̄_k − x̃_k‖_N / ‖x̄_k‖_N ≤ ‖Ā_k‖_{MN}‖Ā^+_{k,MN}‖_{NM} ‖r̃_k‖_M / ‖b̄_k‖_M. (112)

For Case 3 the total error satisfies

‖x_l − x̃‖_N / ‖x_l‖_N ≤ (μ_1/μ_l) / (1 − 2‖ΔA‖_{MN}/μ_l) (2ε_A + α_l + (μ_1/μ_l) ε_A β_l + γ_l). (114)

Proof. For the proof, along with problem (9), consider the problem
5.2 Estimates of the total error of the weighted normal pseudosolution for
matrices of full rank

For m > n = k,

‖x̃ − x‖ / ‖x‖ ≤ h / (1 − hε_A) (2ε_A + ‖Δb‖_M / (‖A‖_{MI}‖x‖) + hε_A ‖r‖_M / (‖A‖_{MI}‖x‖) + ‖r̃_k‖_M / (‖A‖_{MI}‖x‖)). (118)

The computational error satisfies

‖x̃ − x̄‖ / ‖x̄‖ ≤ ‖Ā‖_{MI}‖Ā^+_{MI}‖_{IM} ‖r̃‖_M / ‖b̄_k‖_M. (119)

The estimate for the total error (118) follows from the inequalities

‖x̃ − x‖ / ‖x‖ ≤ ‖x̄ − x‖ / ‖x‖ + ‖x̃ − x̄‖ / ‖x‖, ‖x̃ − x̄‖ / ‖x‖ ≤ ‖A‖_{MI}‖Ā^+_{MI}‖_{IM} ‖r̃‖_M / ‖b̄_k‖_M, (120)

and the estimates for the pseudoinverse matrix (25) and the hereditary error (83).
Theorem 21. Let ‖ΔA‖_{IN}‖A^+_{IN}‖_{NI} < 1, n > m = k and x̃ ∈ R(Ā^#). Then

‖x̃ − x‖_N / ‖x‖_N ≤ h / (1 − hε_A) (2ε_A + ‖Δb‖ / (‖A‖_{IN}‖x‖_N)) + ‖r̃‖ / ‖b̄_k‖. (121)

The proof of Theorem 21 is similar to the proof of the previous theorem, taking
into account the estimate for the hereditary error (86).
Remark 7. Here, we did not indicate a method for obtaining an approximate
weighted normal pseudosolution x̃ satisfying the conditions of the theorems. Algo-
rithms for obtaining such approximations are considered, for example, in Section 4.2.
Remark 8. Along with estimates (103), (109), (114), (118), (121), error estimates
can be obtained whose right-hand sides depend on the input data of systems of
linear algebraic equations with approximately given initial data. For example, the
following theorem holds.
Theorem 22. Let ‖ΔA‖_{MN}‖A^+_{MN}‖_{NM} < 1 and x̃ ∈ R(Ā^#). Then, for the total error
of the normal pseudosolution, the following estimate is fulfilled:

‖x̃ − x‖_N / ‖x̃‖_N ≤ h(Ā) / (1 − h(Ā)ε_Ā) (2ε_Ā + ‖Δb‖_M / (‖Ā‖_{MN}‖x̃‖_N)
+ h(Ā)ε_Ā ‖r̄‖_M / (‖Ā‖_{MN}‖x̃‖_N) + ‖r̃‖_M / ‖b̄_k‖_M). (122)

The estimate follows from the triangle inequality

‖x̃ − x‖_N / ‖x‖_N ≤ ‖x̄ − x‖_N / ‖x‖_N + ‖x̃ − x̄‖_N / ‖x‖_N. (123)
The numerical methods we have considered for solving systems of linear algebraic
equations and WLS problems have one common property. Namely, the actually cal-
culated solution (pseudosolution) is exact, in the sense of the backward error analysis
of [43], for some perturbed problem. These perturbations are very small and are
often commensurate with the rounding errors of the input data. If the input data are
given with an error (measurements, calculations, etc.), then they usually already
contain significantly larger errors than rounding errors. In this case, any attempt to
improve the machine solution (pseudosolution) without involving additional infor-
mation about the exact problem or the errors of the input data will be untenable.
The situation changes significantly if a mathematical problem with accurate input
data is considered. Now the criterion of bad or good conditionality of the computer
model of the problem depends on the mathematical properties of the computer model
of the problem and the mathematical properties of the processor (length of the
computer word), and it becomes possible in principle to achieve any given accuracy of
the computer solution. In this case, as follows from estimates (48), (64), (75), (83),
(86), it is obviously possible to refine the computer solution by solving a system with
increased bit depth, in particular, using the GMP library [44] for implementation of
computations with arbitrary bit depth.
To predict the length of the mantissa (machine word) that provides a given
accuracy of a solution (of consistent systems), one can use the following rule of thumb: the
number of correct decimal significant digits in a computer solution is μ − α, where μ is
the decimal order of the mantissa of a floating-point number ε, and α is the decimal order
of the condition number. Thus, knowing the conditioning of the matrix of the system
and the accuracy of calculations on a computer, it is possible to determine the required
bit depth to obtain a reliable solution.
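For example (an illustration of the rule of thumb, not a result from the chapter): with standard double precision (ε ≈ 10⁻¹⁶, so μ ≈ 16) and a condition number of order 10¹⁰ (α = 10), one can expect roughly 16 − 10 = 6 correct decimal digits; raising the working precision to, say, 32 decimal digits with GMP would raise this estimate to about 22.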
The GMP library supports work with integers, rational numbers and floating-point
numbers. The main feature of the library is that the bitness (precision) of numbers is
practically unlimited. Therefore, its main fields of application are computer algebra
computations, cryptography, etc. The functions of the GMP library allow not only
setting the bit depth at the beginning of the program and performing calculations with
this bit depth, but also changing the bit depth as needed in the course of the computation,
i.e. executing different fragments of the algorithm with different bit depths.
The library’s capabilities were tested in the study of solutions to degenerate and ill-
conditioned systems in [45].
6. Conclusions
References
[1] Chipman JS. On least squares with insufficient observation. Journal of the American Statistical Association. 1964;59(308):1078-1111

[2] Milne RD. An oblique matrix pseudoinverse. SIAM Journal on Applied Mathematics. 1968;16(5):931-944

[3] Ward JF, Boullion TL, Lewis TO. Weighted pseudoinverses with singular weights. SIAM Journal on Applied Mathematics. 1971;21(3):480-482

[4] Galba EF, Deineka VS, Sergienko IV. Weighted pseudoinverses and weighted normal pseudosolutions with singular weights. Computational Mathematics and Mathematical Physics. 2009;49(8):1281-1297

[5] Sergienko IV, Galba EF, Deineka VS. Existence and uniqueness of weighted pseudoinverse matrices and weighted normal pseudosolutions with singular weights. Ukrainian Mathematical Journal. 2011;63(1):98-124

[6] Varenyuk NA, Galba EF, Sergienko IV, Khimich AN. Weighted pseudoinversion with indefinite weights. Ukrainian Mathematical Journal. 2018;70(6):866-889

[7] Galba EF, Varenyuk NA. Representing weighted pseudoinverse matrices with mixed weights in terms of other pseudoinverses. Cybernetics and Systems Analysis. 2018;54(2):185-192

[8] Galba EF, Varenyuk NA. Expansions of weighted pseudoinverses with mixed weights into matrix power series and power products. Cybernetics and Systems Analysis. 2019;55:760-771. DOI: 10.1007/s10559-019-00186-9

[9] Goldman AJ, Zelen M. Weak generalized inverses and minimum variance linear unbiased estimation. Journal of Research of the National Bureau of Standards. 1964;68B(4):151-172

[10] Rao CR, Mitra SK. Generalized Inverse of Matrices and its Applications. New York: Wiley; 1971. p. 240

[11] Nashed MZ. Generalized Inverses and Applications. New York: Academic Press; 1976. p. 1068

[12] Ben-Israel A, Greville TNE. Generalized Inverses: Theory and Applications. New York: Springer-Verlag; 2003. p. 420. DOI: 10.1007/b97366

[13] Khimich AN. Perturbation bounds for the least squares problem. Cybernetics and Systems Analysis. 1996;32(3):434-436

[14] Khimich AN, Nikolaevskaya EA. Reliability analysis of computer solutions of systems of linear algebraic equations with approximate initial data. Cybernetics and Systems Analysis. 2008;44(6):863-874

[15] Nikolaevskaya EA, Khimich AN. Error estimation for a weighted minimum-norm least squares solution with positive definite weights. Computational Mathematics and Mathematical Physics. 2009;49(3):409-417

[16] Wei Y, Wang D. Condition numbers and perturbation of the weighted Moore-Penrose inverse and weighted linear least squares problem. Applied Mathematics and Computation. 2003;145(1):45-58

[17] Wei Y. A note on the sensitivity of the solution of the weighted linear least squares problem. Applied Mathematics and Computation. 2003;145(2–3):481-485

[18] Molchanov IN, Galba EF. A weighted pseudoinverse for complex matrices. Ukrainian Mathematical Journal. 1983;35(1):46-50

[19] Elden L. A weighted pseudoinverse, generalized singular values and constrained least squares problems. BIT. 1982;22:487-502

[20] Wei Y. The weighted Moore–Penrose inverse of modified matrices. Applied Mathematics and Computation. 2001;122(1):1-13. DOI: 10.1016/S0096-3003(00)00007-2

[21] Voevodin VV. On the regularization method. Zhurnal Vychislitel'noy Matematiki i Matematicheskoy Fiziki. 1969;9:673-675

[22] Tikhonov AN. Regularization of ill-posed problems. Doklady Akademii Nauk SSSR. 1963;153:42-52

[23] Ivanov VK, Vasin VV, Tanana VP. Theory of Linear Ill-Posed Problems and Applications. Moscow: Nauka; 1978. p. 206

[24] Morozov VA. Regularization Methods for Unstable Problems. Moscow: Moscow State University; 1987 [in Russian]

[25] Albert AE. Regression and the Moore–Penrose Pseudoinverse. New York: Academic Press; 1972. p. 180

[26] Lawson CL, Hanson RJ. Solving Least Squares Problems. Moscow: Nauka; 1986. p. 232

[27] Kirichenko NF. Analytical representation of perturbations of pseudoinverses. Kibernetika i Sistemnyi Analiz. 1997;2:98-107

[28] Golub GH, Van Loan CF. Matrix Computations. Baltimore: Johns Hopkins University Press; 1996. p. 694

[29] Elden L. Perturbation theory for the least squares problem with linear equality constraints. SIAM Journal on Numerical Analysis. 1980;17:338-350

[30] Björck A. Numerical Methods for Least Squares Problems. Linköping: Linköping University; 1996. p. 407

[31] Voevodin VV. Computational Foundations of Linear Algebra. Moscow: Nauka; 1977

[32] Khimich AN. Estimates of perturbations for least squares solutions. Kibernetika i Sistemnyi Analiz. 1996;3:142-145

[33] Khimich AN, Voitsekhovskii SA, Brusnikin VN. The reliability of solutions to linear mathematical models with approximately given initial data. Mathematical Machines and Systems. 2004;3:3-17

[34] Wei Y, Wu H. Expression for the perturbation of the weighted Moore–Penrose inverse. Computers & Mathematics with Applications. 2000;39:13-18

[35] Wei M. Supremum and Stability of Weighted Pseudoinverses and Weighted Least Squares Problems: Analysis and Computations. Huntington, New York; 2001. p. 180

[36] Wang D. Some topics on weighted Moore–Penrose inverse, weighted least squares, and weighted regularized Tikhonov problems. Applied Mathematics and Computation. 2004;157:243-267

[37] Van Loan CF. Generalizing the singular value decomposition. SIAM Journal on Numerical Analysis. 1976;13:76-83

[38] Wang G, Wei Y, Qiao S. Generalized Inverses: Theory and Computations. Beijing: Science Press; 2004. p. 390
Chapter 5
A Study on Approximation of a
Conjugate Function Using
Cesàro-Matrix Product Operator
Mohammed Hadish
Abstract
Keywords: weighted Lipschitz class, error approximation, Cesàro (C^δ) means, matrix
(T) means, C^δT product means, conjugate Fourier series, generalized Minkowski's
inequality
1. Introduction
The C^δH product operator and, for δ = 1, the C^1H, C^1N_p, C^1N_{pq}, C^1E^q and C^1E^1 product
operators are special cases of C^δT.
Therefore, we establish two theorems to obtain the best error estimation of a
function ζ̃, conjugate to a 2π-periodic function ζ in the weighted W(L_p, ξ(ω)) space, by its
CFS. Here we shall consider the two cases (i) p > 1 and (ii) p = 1 in order to have
Hölder's inequality satisfied. Our theorems generalize six previously known results.
Thus, the results of [5, 8–12] become special cases of our theorems. Some
important corollaries are also obtained from our theorems.
Note 1. The CFS is not necessarily a FS.²
Example 1. The series

Σ_{λ=2}^{∞} sin(λx) / log λ,

conjugate to the FS

Σ_{λ=2}^{∞} cos(λx) / log λ,

is not a FS.
Let C_{2π} be the Banach space of all periodic functions with period 2π, continuous on
the interval 0 ≤ x ≤ 2π, under the supremum norm.
The best λ-order error approximation of a function ζ̃ ∈ C_{2π} is defined by

E_λ(ζ̃) = inf_{t_λ} ‖ζ̃ − t_λ‖,
² FS denotes Fourier series; we use this abbreviation throughout the paper.
ζ ∈ Lip α if ζ(x + ω) − ζ(x) = O(ω^α), for 0 < α ≤ 1;
ζ ∈ Lip(α, p) if Lip(α, p) := { ζ ∈ L_p[0, 2π] : ‖ζ(x + ω) − ζ(x)‖_p = O(ω^α) },
for p ≥ 1, 0 < α ≤ 1;
ζ ∈ Lip(α, ξ(ω)) if Lip(α, ξ(ω)) := { ζ ∈ L_p[0, 2π] : ‖ζ(x + ω) − ζ(x)‖_p = O(ξ(ω)) },
where ξ(ω) > 0 is increasing with ω > 0 and L_p is the space of all 2π-periodic, integrable
functions. Under the above assumptions, for α ∈ (0, 1], p ≥ 1, ω > 0, we observe the following.
Remark 1. If ξ(ω)/ω is non-increasing, then

ξ(π/(λ+1)) / (π/(λ+1)) ≤ ξ(1/(λ+1)) / (1/(λ+1)),

i.e., ξ(π/(λ+1)) ≤ π ξ(1/(λ+1)).
Let

Σ_{λ=0}^{∞} v_λ (1)

be an infinite series with partial sums s_k = Σ_{m=0}^{k} v_m. Let

σ_r^η = (1 / C(r+η, r)) Σ_{k=0}^{r} C(r−k+η−1, r−k) s_k, for η > −1. (2)

If lim_{λ→∞} σ_λ^η = s, then we say that the series (1) is (C, η) summable to s, or summable
by the Cesàro mean of order η. If we take η = 0 in (2), (C, η) summability reduces to
ordinary summation, and if we take η = 1, then (C, η) summability reduces to (C, 1)
summability, or Cesàro summability of order 1.
Let

t_λ^{E_q} = (1 / (1+q)^λ) Σ_{k=0}^{λ} C(λ, k) q^{λ−k} s_k, q > 0.

If lim_{λ→∞} t_λ^{E_q} = s, then we say that the series (1) is (E, q) summable to s, or summable
by the Euler mean (E, q) (Hardy [15]). If q = 0, the (E, q) method reduces to ordinary
summation, and if q = 1, the (E, q) means reduce to the (E, 1) means.
An infinite series (1) with the sequence {s_λ} of its partial sums is said to be
summable by the harmonic method (Riesz [16]), or simply summable (N, 1/(λ+1)), to the sum s,
where s is a finite number, if the sequence-to-sequence transformation

t_λ = (1 / log λ) Σ_{v=0}^{λ} s_v / (λ − v + 1) → s as λ → ∞.
Let pλ be a sequence of constants, real or complex and let
X
λ
Pλ ¼ pk , ðPλ 6¼ 0Þ:
k¼0
Let
N 1 X λ
1 X λ
tλ p ¼ pλk sk ¼ p sλk : (3)
Pλ k¼0 Pλ k¼0 k
If
N
lim tλ p ¼ s
λ!∞
then we say that the series (1) is N, pλ summable to s or summable by Nörlund
N, pλ means.
Let pλ and qλ , be two sequences of constants, real or complex such that
Convolution of the two sequences pλ and qλ , is defined as
X
λ
Rλ ¼ ðp ∗ qÞλ ¼ pk qλk :
k¼0
We write
N 1 X λ
tλ pq ¼ p q sk ;
Rλ k¼0 λk k
then the generalized Nörlund means N p,q of the sequence fsλ g is denoted by the
pq pq
sequence tλ . If tλ ! s, as λ ! ∞ then, the series (1) is said to be summable to s by
N p,q method and is denoted by sλ ! s N p,q ([17]).
99
A Study on Approximation of a Conjugate Function Using Cesàro-Matrix Product Operator
DOI: http://dx.doi.org/10.5772/.103015
Let pλ be a sequence of real constants such that p0 > 0, pλ ≥ 0 and
P
Pλ ¼ λv¼0 pv 6¼ 0, such that Pλ ! ∞ as λ ! ∞.
If
1 X
tλ ¼ pv sv ! s, as λ ! ∞,
Pλ
then we say that fsλ g is summable by N, pλ means and we write
sλ ¼ s N, pλ ,
where fsλ g is the sequence of λth partial sum of the series (1).
Let T ¼ ðlλ,k Þ be an infinite triangular matrix satisfying the conditions of regularity
[18] i.e.,
8
>
> P λ
>
> lλ,k ¼ 1 as λ ! ∞
>
>
>
< k¼0
lλ,k ¼ 0 for k > λ (7)
>
> P λ
>
>
>
> ∣lλ,k ∣ ≤ M, a finite constant
>
: k¼0
X λ X
λ
tTλ ~ζ; x ≔ lλ,k sk ¼ lλ,λk sλk
k¼0 k¼0
defines the sequence tTλ ðζ; xÞ of triangular matrix means of the sequence fsλ g
generated by the sequence of coefficients ðlλ,k Þ.
P
If tTλ ~ζ; x ! s as λ ! ∞ then the infinite series ∞ λ¼0 vλ or the sequence fsλ g is
summable to s by triangular matrix (T-method) [13].
we define Cδ T means as
λrþδ1
δ Xλ
δ1 X
r
tCλ T ~ζ; x ≔ lr,k sk ~ζ; x (8)
r¼0
δþλ k¼0
δ
δ P
If tCλ T ~ζ; x ! s as λ ! ∞, then ∞ δ
λ¼0 vλ is summable to s by C T method.
δ δ
Note 2 Since C and T both are regular then C T method is also regular.
Remark 2 The special cases of Cδ T means: Cδ T transform reduces to
Pλ
ii. Cδ N p transform if lλ,k ¼
pλk
Pλ where Pλ ¼ k¼0 pk 6¼ 0;
100
Matrix Theory - Classics and Advances
δ q λ λk
iv. C E transform when aλ,k ¼ 1
ð1þqÞλ
q ;
k
δ 1 λ
v. C E when lλ,k ¼ 1
2λ
;
k
Pλ
vi. Cδ N pq transform if lλ,k ¼
pλk qk
Rλ where Rλ ¼ k¼0 pk qλk .
In above special case (ii), (iii), and (vi) pλ and qλ are two non-negative monotonic
non-increasing sequences of real constants.
Remark 3 C1 H, C1 N p , C1 N pq , C1 Eq and C1 E1 transforms are also the special cases of
δ
C T for δ ¼ 1.
Example 2 we consider
X
∞
1 1574 ð1573Þλ1 (9)
λ¼1
sλ ¼ ð1573Þλ , ∀λ ∈ 0
λ
we take lλ,k ¼ ð787
1
Þλ
ð786Þλk , then
k
in above example, we see the series is summable neither by Cesàro means nor
Matrix means, but summable by Cesàro-Matrix.
Thus, Cδ T means is more powerfull and effective than single Cδ and T means.
Example 3 we consider another infinite series
sλ ¼ ð5Þλ , ∀λ ∈ 0
101
102
Matrix Theory - Classics and Advances
3. Lemmas
1 Xω
ðr þ δ 1Þ! δ!λ!
¼ Lr,0
2ω r¼0 ðδ 1Þ!r! ðδ þ λÞ!
λ!δ Xλ
ðr þ δ 1Þ!
¼ since Lr,0 ¼ 1
2ω ðδ þ λÞ! r¼0 r!
λ!δ ðδ 1Þ! δ! ðλ þ δ 1Þ!
¼ þ ⋯þ
2ω δ!ðδ þ 1Þ⋯ðδ þ λÞ 0! 1! λ!
λ!δ δ!ðδ þ 1Þ⋯ðδ þ λ 1Þ
≤ ðλ þ 1Þ
2ω δ!ðδ þ 1Þ⋯ðδ þ λÞ λ!
δ λþ1
¼
2ω δ þ λ
δ
≤
2ω
1
¼O for all δ ≥ 1:
ω
8
103
A Study on Approximation of a Conjugate Function Using Cesàro-Matrix Product Operator
DOI: http://dx.doi.org/10.5772/.103015
Now we consider
0 1 0 1
rþδ1 rþδ1
@ A @ A
X λ Xr X ϱ Xr
δ1 iðrkþ21Þω δ1 ιðrkÞω
0 1 lr,rk e ≤ 0 1 lr,rk e
r¼0 δþλ r¼0 δþλ
k¼0 k¼0
@ A @ A
δ δ
0 1
rþδ1
@ A
X λ Xϱ
δ1 iðrkÞω
þ 0 1 lr,rk e
r¼ϱþ1 δþλ
k¼0
@ A
δ
0 1
rþδ1
@ A
X λ Xr
δ1 iðrkÞω
þ 0 1 lr,rk e
r¼ϱþ1 δþλ
k¼ϱþ1
@ A
δ
¼ Λ1 þ Λ2 þ Λ3 , say :
(19)
104
Matrix Theory - Classics and Advances
Now,
0 1
rþδ1
B C
@ A
X
ϱ
δ1 Xr
Λ1 ≤ 0 1 lr,rk eiðrkÞω
r¼0
δþλ k¼0
B C
@ A
δ
0 1
rþδ1
B C
@ A
X
ϱ
δ1
≤ 0 1 Lr,0
r¼0
δþλ
B C
@ A
δ
Xϱ
ðr þ δ 1Þ! δ!λ!
¼ since, Lr,0 ¼ 1
r¼0
ð δ 1Þ!r! ð δ þ λÞ!
λ! Xϱ
ðr þ δ 1Þ!δ
¼
ðδ þ 1Þ … ðδ þ λÞ δ! r¼0 r!
λ! δðδ þ 1Þ ðϱ þ δ 1Þ!
¼ 1þδþ þ…þ
ðδ þ 1Þ … ðδ þ ϱÞ … ðδ þ λÞ 2! ϱ!ðδ 1Þ!
λ! δðδ þ 1Þ … ðδ þ ϱ 1Þ
≤ ðϱ þ 1 Þ
ðδ þ 1Þ … ðδ þ ϱÞ … ðδ þ λÞ ϱ!
ϱ!ðϱ þ 1Þ … λ δ ϱþ1
¼
ðδ þ ϱÞ … ðδ þ λ 1Þ δ þ λ ϱ!
δ
≤ ðϱ þ 1 Þ
δþλ
δ 1
≤ þ1
δþλ ω
1
¼O ð1 þ ωÞ for all δ ≥ 1:
ωðλ þ 1Þ
10
105
106
Matrix Theory - Classics and Advances
λ!
O ω1
ðδ þ 1Þ … ðδ þ λÞ
¼
δðδ þ 1Þ … ðδ þ λÞ δ … ðδ þ ϱÞ δ … ðδ þ λ 1Þ
lϱþ1,1 þ … þ lλ,λϱ
λ!ðδ þ λÞ ðϱ þ 1Þ! λ!
λ! δðδ þ 1Þ … ðδ þ λÞ
¼ O ω1 lϱþ1,1 þ lϱþ2,2 þ … lλ,λϱ
ðδ þ 1 Þ … ðδ þ λÞ λ!ðδ þ λÞ
δ
¼ O ω1 lϱþ1,1 þ lϱþ1,2 þ … þ lϱþ1,λϱ
δþλ
1
¼O Lϱþ1,1
ωðλ þ 1Þ
1
¼O :
ωðλ þ 1Þ
12
107
108
Matrix Theory - Classics and Advances
Conditions (22, 24) can be verified by using the fact that ψ ðx, ωÞ ∈ W Lp , ξðωÞ
and ψξððx,ωωÞ Þ is bounded function.
Proof. The λth partial sums of the CFS is denoted by sλ ~ζ; x , and is given by
ðπ
1 cos λ þ 21 ω
sλ ~ζ; x ~ζ ðxÞ ¼ ψ ðx, ωÞ dω,
2π 0 sin ω2
!
λrþδ1
X
λ
δ1 X
r
tCλ
δT
~ζ; x ~ζ ðxÞ ¼ ! lr,k sk ~ζ; x ~ζ ðxÞ
r¼0
δþλ k¼0
δ
!
rþδ1
1
ðπ X
λ Xr cos r k þ ω
1 δ1 2
¼ ψ ðx, ωÞ ! lr,rk ω
dω
2π 0 r¼0
δþλ k¼0 sin
2
δ
(25)
Ðπ δ
~C T
¼ 0 ψ ðx, ωÞK λ ðωÞ dω By the notation ð13Þ
Ð λþ1
π Ðπ
¼ ~ ðξÞ dω þ
ψ ðx, ωÞK
Cδ T
π
~ ðωÞ dω
ψ ðx, ωÞK
Cδ T (26)
0 λ λþ1
λ
¼ I1 þ I2 , say
Applying (14), Lemma 3.1, Hölder’s inequality and second mean value theorem for
integral, we have
(ð π ω!p )p1 (ð π
!q )q1
λþ1 ωσ ∣ψ ðx, ωÞ∣ sin β λþ1 ξðωÞ
I1 ¼ Oð1Þ 2
dω σþ1 dω
0 ξ ð ωÞ 0 ω sin β ω2
" ð π q q1 #
σp1
λþ1 ξ ð ωÞ
¼ O ðλ þ 1Þ βþ1σ dω
0 ω
σp1 βþp1 σ π
¼ O ðλ þ 1Þ ðλ þ 1Þ ξ
λþ1
β 1
¼ O ðλ þ 1Þ ξ
λþ1
(27)
109
110
Matrix Theory - Classics and Advances
8 9
> π >
< λ þ 1 >
> ξ =
σ1
¼ O ðλ þ 1Þ
βσþ1
>
> π >
>
: λþ1 ; (29)
β 1
¼ O ðλ þ 1Þ ξ
λþ1
in view of Remark 1.
5. Corollaries
16
111
112
Matrix Theory - Classics and Advances
provided Cδ T defined in (8) and ξðωÞ satisfies the conditions (21) to (24).
Corollary 5.7 The error approximation of ~ζ ∈ W ðLp , ξðωÞÞ class by Cδ N p means
λrþδ1
Cδ N p
X
λ
δ1 1X r
tλ ¼ p sk ,
r¼0
δþλ Pr k¼0 k
δ
provided Cδ T defined in (8) and ξðωÞ satisfies the conditions (21) to (24).
Corollary 5.8 The error estimate of ~ζ ∈ W ðLp , ξðωÞÞ class by Cδ Eq means
δrþδ1
X
δ r
X
δ q δ1 1 r
tCλ E ¼ qrk sk ,
r¼0
δþδ r
ð1 þ qÞ k¼0 k
δ
provided Cδ T defined in (8) and ξðωÞ satisfies the conditions (21) to (24).
Corollary 5.9 The error estimate of ζ ∈ W ðLp , ξðωÞÞ class by Cδ E1 means
λrþδ1
X r
δ 1
λ
δ1 1X r
tCλ E ¼ sk ,
r¼0
δþλ 2r k¼0 k
δ
of the FS is given by
1
~ðxÞ∥ ¼ O ðλ þ 1Þβ ξ
1
∥tCλ
δE
~ζ; x h p
λþ1
provided Cδ T defined in (8) and ξðωÞ satisfies the conditions (21) to (24).
Remark 6 The corollaries for 5.1 to 5.9 can also be obtained for the special cases
~ p , C1 Eq and C1 E1 all things considered Remark 3.
C1 H, C1 N p , C1 N pq , C1 N
18
113
A Study on Approximation of a Conjugate Function Using Cesàro-Matrix Product Operator
DOI: http://dx.doi.org/10.5772/.103015
6. Particular cases
7. Exercise
P∞ j1
Q. 7.1. Prove that the infinite series 1 4038 j¼1 ð4038Þ is neither summa-
1
ble by matrix means(T) nor Cesáro means of order one C but it summable by Cδ T
means for δ ¼ 1.
Q. 7.2. Prove that a function f is 2π-periodic and Lebesgue integrable then the error
approximation of f in Lipα class by Cδ T product means of its Fourier series is given by
(
O½ðn þ 1Þα , 0≤α<1
En ð f Þ ¼ h i
O ðn þ 1Þ1 f log ðn þ 1Þg , α ¼ 1,
check all conditions of T method as defined in (7) and also satisfies condition (17).
[Hint: see [19]].
Q. 7.4. If the conditions of (7) and (17) holds for faλ,k g, then prove that
8
rþδ1 π
>> Oðλ þ 1Þ, ∀δ ≥ 1, 0 < ω ≤
1
Xλ
δ1 Xr
cos r k þ 2 ω <
1
λþ1
ð2π Þ lr,rk ¼ π
δþλ sin ω2 > 1
r¼0 k¼0 >: O , ∀δ ≥ 1, ≤ ω ≤ π:
δ ðλ þ 1Þω2 λþ1
Acknowledgements
114
Matrix Theory - Classics and Advances
also grateful to all my family and friends specially Mrs. Anshu Rani, Dr. Pradeep Kumar
and Niraj Pal for their timely help and giving me constant encouragements.
I also would like to thank the reviewers for their thoughtful and efforts towards
improving my chapter.
AMS classification
40C10, 40G05, 40G10, 42A10, 42A24, 40C05, 41A10, 41A25, 42B05, 42A50
115
A Study on Approximation of a Conjugate Function Using Cesàro-Matrix Product Operator
DOI: http://dx.doi.org/10.5772/.103015
References
[1] Rhaodes BE. On the degree of (functions) belonging to Lip ðα, rÞ-class by
approximation of functions belonging to ðC, 1ÞðE, qÞ means of conjugate
Lipschitz class by Hausdorff means of its trigonometric Fourier series. Journal of
Fourier series. Tamkang Journal of inequalities and applications. 2012;2012:
Mathematics. 2003;34(3):245-247 278
MR2001920 (2004g: 41023).
Zentralblatt für Mathematik und ihre [9] Deger U. On approximation to the
Grenzgebiete 1039.42001 functions in the W Lp , ξðtÞ class by a
new matrix mean. Novi Sad J. Math.
[2] Nigam HK, Sharma K. A study on 2016;46(1):1-14
degree of approximation by Karamata
summability method. J. Inequal. Appl. [10] Singh U, Srivastava SK.
2011;85(1):1-28 Trigonometric approximation of
functions belonging to certain Lipschitz
[3] Kushwaha JK. On the approximation classes by C1 T operator. Asian-European
of generalized Lipschitz function by J. Math. 2014;7(4):1450064
Euler means of conjugate series of
Fourier series. Scientific World Journal. [11] Mishra VN, Khan HH, Khatri K,
2013;2013:508026 Mishra LN. Degree of approximation of
conjugate signals (functions) belonging to
[4] Mittal ML, Singh U, Mishra VN. the generalized weighted
approximation of functions (signals) W ðLr , ξðtÞ, r ≥ 1Þ class by ðC, 1ÞðE, qÞ
belonging to W ðLp , ξðtÞÞ-class by means means of conjugate Trigonometric Fourier
of conjugate Fourier series using series. Bulletin of Mathematical Analysis
Norlund operators. Varahmihir J. Math. and Applications. 2013;5(4):40-53
Sci. India. 2006;6(1):383-392
[12] Mishra VN, Khan HH, Khatri K,
[5] Nigam HK. A study on approximation Mishra LN. Approximation of functions
of conjugate of functions belonging to belonging to the generalized Lipschitz
Lipschitz class and generalized Lipschitz class by C1 N p summability method of
class by product summability means of conjugate series of Fourier series.
conjugate series of Fourier series. Thai Mathematicki vesnik. 2014;66(2):155-164
Journal of Mathematics. 2012;10(2):
275-287 [13] Zygmund A. Trigonometric Series.
3rd rev. ed. Cambridge: Cambridge
[6] Nigam HK, Hadish M. Best Univ. Press; 2002
approximation of functions in
generalized Hölder class. Journal of [14] Bernshtein SN. On the best
Inequalities and Applications. 2018;1:1-15 approximation of continuous functions
by polynomials of given degree, in:
[7] Nigam HK, Hadish M. A study on Collected Works [in Russian], Izd.
approximation of a conjugate function Nauk SSSR, Moscow. 1952;1:11-1047
using Cesàro-Matrix product operator.
Nonlinear Functional Analysis and [15] Hardy GH. Divergent Series. Oxford
Applications. 2020;25(4):697-713 University Press; 1949
116
Matrix Theory - Classics and Advances
117
Chapter 6
Abstract
1. Introduction
The field of complex (or real) numbers is designated by (). The set of all m n
matrices over the quaternion skew field
¼ h0 þ h1 i þ h2 j þ h3 kji2 ¼ j2 ¼ k2 ¼ ijk ¼ 1, h0 , h1 , h2 , h3 ∈ ,
118
Matrix Theory - Classics and Advances
• N r ðAÞ ¼ d ∈ n1 : Ad ¼ 0 is the right null space of A;
• N l ðAÞ ¼ d ∈ 1m : dA ¼ 0 is the left null space of A.
Let us recall the definitions of some well-known generalized inverses that can be
extend to quaternion matrices as follows.
Definition 1.1. The Moore–Penrose inverse of A ∈ nm is the unique matrix A† ¼ X
determined by equations
ð1Þ AXA ¼ A; ð2Þ XAX ¼ X; ð3Þ ðAXÞ ∗ ¼ AX; ð4Þ ðXAÞ ∗ ¼ XA: (1)
Definition 1.2. The Drazin inverse of A ∈ nn is the unique Ad ¼ X that satisfying
Eq.(2) from (1) and the following equations,
where
k¼ IndðAÞ is the index of A, i.e. the smallest positive number such that
rank A kþ1
¼ rank Ak . If IndðAÞ ≤ 1, then Ad ¼ A# is the group inverse of A. If
IndðAÞ ¼ 0, then A# ¼ A† ¼ A1 .
A matrix A satisfying the conditions ðiÞ, ð jÞ, … is called an fi, j, … g-inverse of A,
and is denoted by Aði,j, … Þ . In particular, Að1Þ is called the inner inverse, Að2Þ is called the
outer inverse, and Að1,2Þ is called the reflexive inverse, and Að1,2,3,4Þ is the Moore–Penrose
inverse, etc.
Note that the Moore–Penrose inverse inducts the orthogonal projectors PA ¼ AA†
and Q A ¼ A† A onto the right column spaces of A and A ∗ , respectively.
In [1], the core-EP inverse over the quaternion skew field was presented similarly
as in [2].
Definition 1.3. The core-EP inverse of A ∈ nn is the unique matrix A† ¼ X which
satisfies
X ¼ XAX, C r ðXÞ ¼ C r Ad ¼ C r ðX ∗ Þ:
m † m d
Recall that, A◯ † ¼ ðA Þ A A for m ≥ IndðAÞ.
Since the quaternion core-EP inverse A† is related to the right space C r ðAÞ of
A ∈ nn and the quaternion dual core-EP inverse A† is related to its left space Rl ðAÞ.
So, in [1], they are also named the right and left core-EP inverses, respectively.
Various representations of core-EP inverse can be found in [1, 5–7]. In [8], conti-
nuity of core-EP inverse was investigated. Bordering and iterative methods to find the
core-EP inverse were proved in [9, 10], and its determinantal representation for
complex matrices was derived in [2]. New determinantal representations of the
2
119
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/.103087
complex core-EP inverse and its various generalizations were obtained in [11]. The
core-EP inverse was generalized to rectangular matrices [12], Hilbert space operators
[13], Banach algebra elements [14], tensors [15], and elements of rings [3]. Combining
the core-EP inverse or the dual core-EP inverse with the Moore–Penrose inverse, the
MPCEP inverse and CEPMP inverse were introduced in [16] for bounded linear
Hilbert space operators.
In the last years, interest in quaternion matrix equations is growing significantly
based on the increasing their applications in various fields, among them, robotic
manipulation [17], fluid mechanics [18, 19], quantum mechanics [20–22], signal
processing [23, 24], color image processing [25–27], and so on.
The main goals of this chapter are investigations of the MPCEP and CEPMP
inverses, introductions and representations of new right and left MPCEPMP inverses
over the quaternion skew field, and obtaining of their determinantal representations
as a direct method of their constructions. The chapter develops and continues the
topic raised in a number of other works [28–33], where determinantal representations
of various generalized inverses were obtained.
The remainder of our chapter is directed as follows. In Section 2, we introduce of
the quaternion MPCEP and CEPMP inverses and give characterizations of new gen-
eralized inverses, namely left and right MPCEPMP-inverses. In Section 3, we com-
mence with introducing determinantal representations of the projection matrices
inducted by the Moore–Penrose inverse and of core-EP inverse previously obtained
within the framework of theory of quaternion row-column determinants and, based
of them, determinantal representations of the MPCEP, CEPMP, and left and right
MPCEPMP inverses are derived. Finally, the conclusion is drawn in Section 4.
Analogously as in [16], the MPCEP inverse and CEPMP inverse can be defined for
quaternion matrices.
Definition 2.1. Let A ∈ nn. The MPCEP (or MP-Core-EP) inverse of A is the unique
solution A†,◯
†
¼ X to the system
X ¼ XAX, XA ¼ A† AA◯
†
A, AX ¼ AA◯
†
:
†
X ¼ XAX, AX ¼ AA◯
† AA , XA ¼ A◯
† A:
A†,◯
†
¼ A† AA◯
†
, (2)
A◯
† ,†
¼ A◯ †
† AA : (3)
According to our concepts, we can define the left and right MPCPMP inverses.
Definition 2.2. Suppose A ∈ nn . The right MPCEPMP inverse of A is defined as
3
120
Matrix Theory - Classics and Advances
A†,◯
† ,†,r
¼ A† AA◯
†
AA† :
A†,◯
† ,†,l
¼ A† AA◯ †
† AA :
ii.
X ¼ A†,◯
†
PA : (4)
Proof. ½ðiÞ ↦ ½ðiiÞ. By Eq. (2) and the denotation of PA , it is evident that
A†,◯
† ,†,r
¼ A† AA◯
†
AA† ¼ A†,◯
†
PA :
XA ¼ A† AA◯
†
AA† A ¼ A†,◯
†
A,
†
AX ¼ AA† A A◯ AA† ¼ AA◯
†
PA :
To prove that the system (5) has unique solution, suppose that X and X1 are two
solutions of this system. Then XA ¼ A†,◯†
A ¼ X1 A and AX ¼ AA◯ †
PA ¼ AX1 , which
give XðAXÞ ¼ ðXAÞX1 ¼ X1 AX1 ¼ X1 . Therefore, X is the unique solution to the
system. □
The next theorem can be proved in the same way.
Theorem 2.4. Let A, X ∈ nn . The following statements are equivalent:
121
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/.103087
in which Sn denotes the symmetric group on In ¼ f1, … , ng, while the permutation
σ is defined as a product of mutually disjunct subsets ordered from the left to right by
the rules
σ ¼ ði ik1 ik1 þ1 … ik1 þl1 Þðik2 ik2 þ1 … ik2 þl2 Þ … ðikr ikr þ1 … ikr þlr Þ,
ikt < ikt þs , ik2 < ik3 < ⋯ < ikr , ∀ t ¼ 2, … , r, s ¼ 1, … , lt :
in which a permutation τ is ordered from the right to left in the following way:
τ¼ jkr þlr ⋯ jkr þ1 jkr ⋯ jk2 þl2 ⋯ jk2 þ1 jk2 jk1 þl1 ⋯ jk1 þ1 jk1 j , jkt < jkt þs , jk2 < jk3 < ⋯ < jkr :
122
Matrix Theory - Classics and Advances
whose rows and columns are indexed by α. If A is Hermitian, then jAjαα stands for a
principal minor of detA. The collection of strictly increasing sequences of 1 ≤ k ≤ n
integers chosen from f1, … , ng is denoted by
Lk,n ≔ fα : α ¼ ðα1 , … , αk Þ, 1 ≤ α1 < ⋯ < αk ≤ ng: For fixed i ∈ α and j ∈ β, put
Ir,m fig ≔ fα : α ∈ Lr,m , i ∈ αg, J r,n f jg ≔ fβ : β ∈ Lr,n , j ∈ βg.
Let a:j and a:j∗ be the jth columns, ai: and ai:∗ be the ith rows of A and A ∗ ,
respectively. Suppose that Ai: ðbÞ and A:j ðcÞ stand for the matrices obtained from A by
replacing its ith row with the row vector b ∈ 1n and its jth column with the column
vector c ∈ m , respectively.
Based on determinantal representations of the Moore–Penrose inverse obtained in
[28], we have determinantal representations of the projections.
Lemma 3.2. [28] If A ∈ mn r , then the determinantal representations of the
†
projection matrices A A≕Q A ¼ qij A
and AA† ≕PA ¼ pA ij can be expressed as
nn mm
follows
β P α
P
β ∈ J r,n fig cdeti ðA ∗ AÞ:i a_ :j β α ∈ Ir,n f jg rdet j ðA ∗ AÞ:j ða_ :i Þ
P α
ij ¼
qA P ¼ , (7)
∗ β
β ∈ J r,n jA Ajβ α ∈ Ir,n jA
∗
Ajαα
P α β
∗ P
α ∈ Ir,m f jg rdet j ð AA Þ j: ð €
a i: Þ β ∈ J fi g cdet i ð AA ∗
Þ :i €:j β
a
P α
ij ¼
pA ¼ P
r,m
∗ α ∗ β
, (8)
α ∈ Ir,m jAA jα β ∈ J jAA jβr,m
where a_ :i and a_ :j , a €:j are the ith rows and the jth columns of A ∗ A ∈ nn and
€i: and a
∗
AA ∈ mm
, respectively.
Recently, D-representations of the quaternion core-EP inverses were obtained in
[1] as well.
Lemma 3.3. [1] Suppose that A ∈ nn , IndðAÞ ¼ k and rank Ak ¼ s. Then A† ¼
a†,r
ij and A† ¼ a†,lij possess the determinantal representations, respectively,
∗
α
P
α ∈ Is,n f jg rdet j Akþ1
A kþ1
^i: Þ
ða
j:
a†,r
ij ¼ P α
, (9)
kþ1 kþ1 ∗ α
α ∈ Is,n A A
α
P ∗ β
^
β ∈ J s,n fig cdeti Akþ1 Akþ1 a:j
:i β
a†,l
ij ¼ , (10)
P kþ1 ∗ kþ1 β
β ∈ J s,n A A
β
∗ ^
where a ^i: is the ith row of A^ ¼ Ak Akþ1 ^
and a:j is the jth column of A ¼
∗
Akþ1 Ak .
Theorem 3.4. Let A ∈ nn s , IndðAÞ ¼ k and rank Ak ¼ s1 . Then its MPCEP
inverse A†,† ¼ a†,† ij is expressed by componentwise
123
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/.103087
∗
α
P kþ1 kþ1 ð1Þ
α ∈ Is1 ,n f A
jg rdet j A vi:
j:
aij†,† ¼ P α (11)
∗ βP kþ1 kþ1 ∗ α
β ∈ J s,n j A A jβ α ∈ Is1 ,n A A
α
P β
∗ ð 1Þ
β ∈ J s,n fig cdeti ðA AÞ:i u:j
β
¼P , (12)
∗ βP kþ1 kþ1 ∗ α
β ∈ J s,n jA Ajβ α ∈ Is ,n A A
1 α
where
X
ð1Þ ∗ β
vi: ¼ ~:l Þ β ∈ 1n , l ¼ 1, … , n,
cdeti ðA AÞ:i ða (13)
β ∈ J s,n f ig
X ∗
α
ð1Þ
u:j ¼ rdet j A kþ1
A kþ1
~ f:
a ∈ n1 , f ¼ 1, … , n,
α ∈ Is1 ,n f jg j: α
∗
~:l and ~
and a ~ ¼ A ∗ Akþ1 Akþ1 .
a f : are the lth column and the fth row of A
Proof. By (2), we have
X
n
a†,†
ij ¼ qil a†,r
lj : (14)
l¼1
Using (7) and (9) for the determinantal representations of Q A ¼ A† A ¼ qij and
A† , respectively, from (14) it follows
∗
α
P β P
Xn Xn ∗
_ :f β Akþ1
jg rdet j A kþ1
ð ^
a Þ
β ∈ J s,n fig cdeti ðA AÞ:i a
α ∈ Is1 ,n f l:
j:
a†,† ¼ P P ∗ α α
¼
ij l¼1 f ¼1
j A ∗
Aj β kþ1 kþ1
β ∈ J s,n β α ∈ Is1 ,n A A
α
∗
α
P ∗ β P
Xn Xn rdet A kþ1
A kþ1
ðe Þ
β ∈ J s,n fig cdeti ðA AÞ:i e:f β
α ∈ Is1 ,n f jg j l:
j: α
¼ P ~fl
a P ∗ α ,
f ¼1 ∗ β
β ∈ J s,n jA Ajβ
l¼1 kþ1 kþ1
α ∈ Is ,n A A
1 α
^l: is the
where e:f and el: are the fth column and the lth row of the unit matrix In , a
∗ ∗
^ ¼ Ak Akþ1 , and a
lth row of A ~ ¼ A ∗ Akþ1 Akþ1 .
~fl is the (fl)th element of A
If we denote by
ð1Þ
X
n X β X β
vil ≔ cdeti ðA ∗ AÞ:i e:f β ~afl ¼ cdeti ðA ∗ AÞ:i ða
~:l Þ β
f ¼1 β ∈ J s,n fig β ∈ J s,n fig
h i
ð1Þ ð1Þ ð1Þ
the lth component of a row-vector vi: ¼ vi1 , … , vin , then
X
n X ∗
α
ð1Þ
vil rdet j Akþ1 Akþ1 ðel: Þ
j: α
l¼1 α ∈ Is1 ,n f jg
X ∗
α
ð1Þ
¼ rdet j A kþ1
Akþ1
vi: :
j: α
α ∈ Is1 ,n f jg
124
Matrix Theory - Classics and Advances
X
n X ∗
α
ð1Þ
ufj ≔ ~fl
a rdet j Akþ1 Akþ1 ðel: Þ
j: α
l¼1 α ∈ Is1 ,n f jg
X ∗
α
¼ rdet j A kþ1
Akþ1
~ f:
a
j: α
α ∈ Is1 ,n f jg
h iT
ð1Þ ð1Þ ð1Þ
as the fth component of a column-vector u:j ¼ u1j , … , unj , it follows
X
n X β ð1Þ X β
ð1Þ
cdeti ðA ∗ AÞ:i e:f β ufj ¼ cdeti ðA ∗ AÞ:i u:j :
β
f ¼1 β ∈ J s,n fig β ∈ J s,n fig
P kþ1 kþ1 ∗ ð1Þ α
A A vi:
α ∈ Is1 ,n f jg
j:
a†,† ¼P α
ij
∗ βP kþ1 kþ1 ∗ α
β ∈ J s,n jA Ajβ α ∈ Is ,n A A
1 α
P β
∗ ð1Þ
β ∈ J s,n fig ðA AÞ:i u:j
β
¼P P ∗ α ,
∗ β
β ∈ J s,n jA Ajβ
kþ1
α ∈ Is ,n A Akþ1
1 α
where
2 3
X
ð1Þ
¼4 ðA ∗ A Þ ða β 5
:i ~:l Þ β ∈ , l ¼ 1, … , n,
1n
vi:
β ∈ J s,n fig
2 3 (15)
∗ α
X
ð1Þ
u:j ¼4 Akþ1 Akþ1 ~ f : 5 ∈ n1 , f ¼ 1, … , n,
a
j:
α ∈ Is1 ,n f jg α
∗
and a ~
~:l and a ~ ¼ A ∗ Akþ1 Akþ1 .
are the lth column and the fth row of A
f:
Theorem 3.6. Let A ∈ snn , IndðAÞ ¼ k and rank Ak ¼ s1 . Then its CEPMP
inverse A†,† ¼ a†,†
ij has the following determinantal representations
125
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/.103087
P α
ð 2Þ
ðAA ∗ Þ j: vi:
α ∈ Is,n f jg rdet j
a†,† ¼
α
(16)
ij P ∗ αP kþ1 ∗ kþ1 β
α ∈ Is,n jAA j α β ∈ J s ,n A A
1 β
P ∗ β
kþ1 kþ1 ð 2Þ
β ∈ J s1 ,n fig cdet i A A u:j
:i β
¼ β , (17)
P ∗ αP kþ1 ∗
kþ1
α ∈ Is,n jAA jα β ∈ J s ,n A A
1 β
where
2 3
X ∗ β
vi: ¼ 4
ð 2Þ 5 ∈ 1n , l ¼ 1, … , n,
cdeti Akþ1 Akþ1 ð^a:l Þ
:i β
β ∈ J s1 ,n fig
" # (18)
X α
ð 2Þ
u:j ¼ rdet j ðAA ∗ Þ j: ^
a f: ∈ n1 , f ¼ 1, … , n:
α
α ∈ Is,n f jg
∗
^:l and a
Here a ^ f:
^ ¼ Akþ1 Akþ1 A ∗ .
are the lth column and the fth row of A
Proof. The proof is similar to the proof of Theorem 3.4 by using the representation
(3) for the CEPMP inverse.
Corollary 3.7. Let A ∈ nn
s , IndðAÞ ¼ k and rank Ak ¼ s1 . Then its CEPMP
inverse A†,† ¼ a†,†
ij has the following determinantal representations
P α
∗ ð2Þ
α ∈ Is,n f jg ð AA Þ j: vi:
a†,† ¼
α
ij P ∗ αP kþ1 ∗ kþ1 β
α ∈ Is,n jAA jα β ∈ J s ,n A A
1 β
P β ∗
kþ1 kþ1 ð2Þ
β ∈ J s1 ,n fig cdet i A A u :j
:i β
¼ ∗ β ,
P ∗ α P kþ1 kþ1
α ∈ Is,n jAA jα β ∈ J s ,n A A
1 β
where
2 3
ð2Þ
X ∗ β
vi: ¼4 ^:l Þ 5 ∈ 1n , l ¼ 1, … , n,
Akþ1 Akþ1 ða
:i β
β ∈ J s1 ,n fig
2 3 (19)
ð2Þ
X α
u:j ¼4 ^ f : 5 ∈ n1 , f ¼ 1, … , n:
ðAA ∗ Þ j: a
α
α ∈ Is,n f jg
∗
Here a ^
^:l and a
are the lth column and the fth row of A
f:
^ ¼ Akþ1 Akþ1 A ∗ .
Theorem 3.8. Let A ∈ nn
s , IndðAÞ ¼ k and rank Ak ¼ s1 . Then its right
MPCEPMP inverse A†,†,†,r ¼ a†,†,†,r
ij has the following determinantal representations
126
Matrix Theory - Classics and Advances
P α
ð1Þ
ðAA ∗ Þ j: ϕi:
α ∈ Is,n f jg rdet j
a†,†,†,r ¼ 2
α
¼ (20)
ij P ∗ α P kþ1 ∗ kþ1 β
α ∈ Is,n j AA jα β ∈ J s ,n A A
1 β
P ∗ β
ð1Þ
Akþ1 Akþ1
β ∈ J s ,n fig cdeti ψ :j
1 :i β
¼ P 2 , (21)
∗ β P kþ1 ∗ kþ1 α
β ∈ Is,n jAA jβ α ∈ J s ,n A A
1 α
where
2 3
X ∗ β
ϕi: ¼ 4
ð1Þ
cdeti ^ :l Þ 5 ∈ 1n , l ¼ 1, … , n,
Akþ1 Akþ1 ðu
:i β
β ∈ J s1 ,n fig
2 3
X α
ð1Þ
ψ :j ¼4 ∗
^ f:
rdet j ðAA Þ j: u 5 ∈ n1 , f ¼ 1, … , n:
α
α ∈ Is,n f jg
Applying (8) for the determinantal representation of PA ¼ AA† ¼ pij and (17)
for the determinantal representation of A†,† in (22), we obtain
P ∗β α
ð2Þ P
Xn β ∈ J s1 ,n fig cdeti Akþ1
Akþ1 u:t ∗
α ∈ Is,n f jg rdet j ðAA Þ j: ða €t: Þ
:i β
aij†,†,†,r ¼ P ∗ α
α
¼
t¼1 P ∗ αP kþ1 ∗ kþ1 β α ∈ jAA jα
α ∈ Is,n jAA j α β ∈ Is ,n A A I s,n
1 β
P ∗ β α
P
Xn Xn β ∈ J s1 ,n fig cdeti Akþ1 Akþ1 e:f ∗
α ∈ Is,n f jg rdet j ðAA Þ j: ðel: Þ
:i β α
¼ ^fl
u P 2 ,
l¼1 f ¼1 P kþ1 ∗ kþ1 β ∗ α
β ∈ J s ,n A A α ∈ Is,n j AA jα
1 β
^fl is
where e:f and el: are the fth column and the lth row of the unit matrix In , and u
h i
^ ¼ U2 AA . The matrix U2 ¼ u , … , u:n is constructed from
the (fl)th element of U ∗ ð2 Þ ð2 Þ
:1
the columns (18). If we denote by
X
n X ∗ β
ð1Þ
ϕil ≔ cdeti Akþ1 Akþ1 e:f ^fl
u
:i β
f ¼1 β ∈ J s1 ,n fig
X ∗ β
¼ cdeti ^ :l Þ
Akþ1 Akþ1 ðu
:i β
β ∈ J s1 ,n fig
10
127
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/103087
h i
ð1Þ ð1Þ ð1Þ
the lth component of a row-vector ϕi: ¼ ϕi1 , … , ϕin , then
X
n X α X α
ð1Þ ð1Þ
ϕil rdet j ðAA ∗ Þ j: ðel: Þ ¼ rdet j ðAA ∗ Þ j: ϕi: :
α α
l¼1 α ∈ Is,n f jg α ∈ Is,n f jg
X
n X α X α
ð1Þ
ψ fj ≔ ^fl
a rdet j ðAA ∗ Þ j: ðel: Þ ¼ rdet j ðAA ∗ Þ j: u
^ f: α
α
l¼1 α ∈ Is,n f jg α ∈ Is,n f jg
h i
ð1Þ ð1Þ ð1Þ T
as the fth component of a column-vector ψ :j ¼ ψ 1j , … , ψ nj , it follows
X
n X ∗ β X ∗ β
ð1Þ ð1Þ
cdeti Akþ1 Akþ1 e:f ψ fj ¼ cdeti Akþ1 Akþ1 ψ :j :
:i β :i β
f ¼1 β ∈ J s1 ,n fig β ∈ J s1 ,n fig
P α
∗ ð1Þ
α ∈ Is,n f jg ð AA Þ j: ϕ i:
a†,† ¼
α
ij P ∗ α P
2 kþ1 ∗ kþ1 β
α ∈ Is,n jAA j α β ∈ J s ,n A A
1 β
P ∗ β
ð1Þ
β ∈ J s1 ,n fig A kþ1
A kþ1
ψ :j
:i β
¼ P 2 P ∗ α ,
∗ β kþ1 kþ1
β ∈ Is,n j AA jβ α ∈ J s ,n A A
1 α
where
2 3
ð1Þ
X ∗ β
ϕi: ¼4 ^ :l Þ 5 ∈ 1n , l ¼ 1, … , n,
Akþ1 Akþ1 ðu
:i β
β ∈ J s1 ,n fig
2 3
ð1Þ
X α
ψ :j ¼4 ðAA ∗ Þ j: u^ f : 5 ∈ n1 , f ¼ 1, … , n:
α
α ∈ Is,n f jg
128
Matrix Theory - Classics and Advances
∗
α
P ð 2Þ
α ∈ Is1 ,n f A kþ1
jg rdet j A kþ1
ϕ i:
j:
a†,†,†,l ¼ P 2 P ¼
α
(23)
ij
∗ β kþ1 kþ1 ∗ α
β ∈ J s,n jA Ajβ α ∈ Is1 ,n A A
α
P β
∗ ð 2Þ
β ∈ J s,n fig cdeti ðA AÞ:i ψ :j
β
¼ P 2 , (24)
∗ β P kþ1 kþ1 ∗ α
β ∈ J s,n jA A jβ α ∈ Is ,n A A
1 α
where
2 3
X β
ϕi: ¼ 4
ð2Þ
~ :t Þ β 5 ∈ 1n ,
cdeti ðA ∗ AÞ:i ðv t ¼ 1, … , n,
β ∈ J s,n fig
2 3
X ∗
α
ψ :j ¼ 4
ð2Þ 5 ∈ n1 ,
rdet j Akþ1 Akþ1 ~ f:
v f ¼ 1, … , n,
j: α
α ∈ Is1 ,n f jg
ð 2Þ
X
n X β
ϕit ≔ cdeti ðA ∗ AÞ:i e:f β ~vft
f ¼1 β ∈ J s,n fig
X β
¼ cdeti ðA ∗ AÞ:i ðv
~ :t Þ β
β ∈ J s,n fig
12
129
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/103087
h i
ð2Þ ð2Þ ð 2Þ
as the lth component of a row-vector ϕi: ¼ ϕi1 , … , ϕin , then
X
n X ∗
α
ð2Þ
ϕit rdet j A kþ1
Akþ1
ðet: Þ
j: α
t¼1 α ∈ Is1 ,n f jg
X ∗
α
ð2Þ
¼ rdet j Akþ1 Akþ1 ϕi: ,
j: α
α ∈ Is1 ,n f jg
X
n X ∗
α
ð2Þ
ψ fj ≔ ~ft
u rdet j kþ1
A A kþ1
ðet: Þ ¼
j: α
t¼1 α ∈ Is1 ,n f jg
X ∗
α
¼ rdet j Akþ1 Akþ1 ~ f: α
u
j:
α ∈ Is1 ,n f jg
h i
ð2Þ ð2Þ ð2Þ T
the fth component of a column-vector ψ :j ¼ ψ 1j , … , ψ nj , then
X
n X β ð2Þ X β
ð2Þ
cdeti ðA ∗ AÞ:i e:f β ψ fj ¼ cdeti ðA ∗ AÞ:i ψ :j :
β
f ¼1 β ∈ J s,n fig β ∈ J s,n fig
where
2 3
X
ð 2Þ
¼4 ðA ∗ AÞ ðv β 5
ϕi: :i ~ :t Þ β ∈ t ¼ 1, … , n,
1n
,
β ∈ J s,n fig
2 3
∗ α
X
ð 2Þ
ψ :j ¼4 Akþ1 Akþ1 ~ f : 5 ∈ n1 ,
v f ¼ 1, … , n,
j:
α ∈ Is1 ,n f jg α
130
Matrix Theory - Classics and Advances
4. Conclusions
In this chapter, notions of the MPCEP and CEPMP inverses are extended to
quaternion matrices, and the new right and left MPCEPMP inverses are introduced
and their characterizations are explored. Their determinantal representations are
obtained within the framework of the theory of noncommutative column-row deter-
minants previously introduced by the author. Also, determinantal representations of
these generalized inverses for complex matrices are derived by using regular deter-
minants. The obtained determinantal representations give new direct methods of
calculations of these generalized inverses.
Acknowledgements
The author thanks the Erwin Schrödinger Institute for Mathematics and Physics
(ESI) at the University of Vienna for the support given by the Special Research
Fellowship Programme for Ukrainian Scientists.
131
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/103087
References
[1] Kyrchei II. Determinantal [10] Prasad KM, Raj MD, Vinay M.
representations of the quaternion core Iterative method to find core-EP inverse.
inverse and its generalizations. Advances Bulletin of Kerala Mathematics
in Applied Clifford Algebras. 2019;29(5): Association. 2018;16(1):139-152
104. DOI: 10.1007/s00006-019-1024-6
[11] Kyrchei II. Determinantal
[2] Prasad KM, Mohana KS. Core-EP representations of the core inverse
inverse. Linear and Multilinear Algebra. and its generalizations with
2014;62(6):792-802. DOI: 10.1080/ applications. Journal of Mathematics.
03081087.2013.791690 2019;1631979:13. DOI: 10.1155/2019/
1631979
[3] Gao Y, Chen J. Pseudo core inverses in
rings with involution. Communications in [12] Ferreyra DE, Levis FE, Thome N.
Algebra. 2018;46(1):38-50. DOI: 10.1080/ Revisiting the core EP inverse and its
00927872.2016.1260729 extension to rectangular matrices.
Quaestiones Mathematicae. 2018;41(2):
[4] Baksalary OM, Trenkler G. Core 265-281. DOI: 10.2989/16073606.2017.
inverse of matrices. Linear and 1377779
Multilinear Algebra. 2010;58(6):681-697.
DOI: 10.1080/03081080902778222 [13] Mosić D, Djordjević DS. The gDMP
inverse of Hilbert space operators.
[5] Ma H, Stanimirović PS. Journal of Spectral Theory. 2018;8(2):
Characterizations, approximation and 555-573. DOI: 10.4171/JST/207
perturbations of the core-EP inverse.
Applied Mathematics and Computation.
[14] Mosić D. Core-EP inverses in Banach
2019;359:404-417. DOI: 10.1080/
algebras. Linear and Multilinear Algebra.
03081087.2013.791690
2021;69(16):2976-2989. DOI: 10.1080/
03081087.2019.1701976
[6] Wang H. Core-EP decomposition and
its applications. Linear Algebra and its
[15] Sahoo JK, Behera R, Stanimirović PS,
Applications. 2016;508:289-300. DOI:
10.1016/j.laa.2016.08.008 Katsikis VN, Ma H. Core and core-EP
inverses of tensors. Computational and
[7] Zhou MM, Chen JL, Li TT, Wan DG. Applied Mathematics. 2020;39:9. DOI:
Three limit representations of the core- 10.1007/s40314-019-0983-5
EP inverse. Univerzitet u Nišu. 2018;32:
5887-5894. DOI: 10.2298/FIL1817887Z [16] Chen JL, Mosić D, Xu SZ. On a new
generalized inverse for Hilbert space
[8] Gao Y, Chen J, Patricio P. Continuity operators. Quaestiones Mathematicae.
of the core-EP inverse and its 2020;43(9):1331-1348. DOI: 10.2989/
applications. Linear and Multilinear 16073606.2019.1619104
Algebra. 2021;69(3):557-571. DOI:
10.1080/03081087.2019.1608899 [17] Udwadia F, Schttle A. An alternative
derivation of the quaternion equations of
[9] Prasad KM, Raj MD. Bordering motion for rigid-body rotational
method to compute core-EP inverse. dynamics. Journal of Applied Mechanics.
Special Matrices. 2018;6:193-200. DOI: 2010;77:044505.1-044505.4. DOI:
10.1515/spma-2018-0016 10.1115/1.4000917
15
132
Matrix Theory - Classics and Advances
133
Quaternion MPCEP, CEPMP, and MPCEPMP Generalized Inverses
DOI: http://dx.doi.org/10.5772/103087
134
Chapter 7
Abstract
1. Introduction
In 1950, Chargaff's two rules [1] were presented. One is that the percentage of
adenine is identical to that of thymine as well as the percentage of guanine is identical
to that of cytosine, which gives a hint of the composition of the base pair for the
double-strand DNA molecule. The other is that base complementarity is effective for
each DNA strand, which gives an explanation for the overall characteristics of funda-
mental bases. To make an example of COVID-19 DNA, its four bases are satisfied with
these two rules analogous to A ¼ T ¼ 31% and C ¼ G ¼ 19%. In 1953, it was discov-
ered that DNA has a double helix structure [2, 3], which results in an optimal and
economical genetic code [4].
135
Matrix Theory - Classics and Advances
A RNA base matrix ½C U; A G was based on stochastic matrices [5], which results
in the genetic code [6, 7]. A symmetric capacity is calculated by applying the Markov
process to these doubly stochastic matrices, which suggested the symmetry between
Shannon [8] and RNA stochastic transition
matrix ½C U; A G, which is defined as
below. A square matrix of P ¼ pij is stochastic, whose entries are positive as well as
its sum in rows and columns is equal to one or constant. In other words, if the sum of
all its elements in rows and columns is equal to one or invariable, it is double stochas-
tic, which is able to describe the time-invariant binary symmetric channel. For the
input xn and the output x nþ1, two states e0 and e1 are able to depict Markov processes
on an individual basis, which are indicated by two binary symbols “0” and “1”,
accordingly. The output signal is affected by the input signal whose information is fed
into given a certain error probability. Assume that these channel probabilities α and β
are less than a half, whose error probabilities have been kept steady over a time-
variant channel for a wide variety of transmitted symbols such as
Maize Zea 26.8 22.8 23.2 27.2 0.99 0.98 46.1 54.0
Octopus Octopus 33.2 17.6 17.6 31.6 1.05 1.00 35.2 64.8
Chicken Gallus 28.0 22.0 21.6 28.4 0.99 1.02 43.7 56.4
Rat Rattus 28.6 21.4 20.5 28.4 1.01 1.00 42.9 57.0
Human Homo 29.3 20.7 20.0 30.0 0.98 1.04 40.7 59.3
Grasshopper Orthoptera 29.3 20.5 20.7 29.3 1.00 0.99 41.2 58.6
Sea urchin Echinoidea 32.8 17.7 17.3 32.1 1.02 1.02 35.0 64.9
Wheat Triticum 27.3 22.7 22.8 27.1 1.01 1.00 45.5 54.4
Yeast Saccharomyces 31.3 18.7 17.1 32.9 0.95 1.09 35.8 64.4
E. coli Escherichia 24.7 26.0 25.7 23.6 1.05 1.01 51.7 48.3
φX174 PhiX174 24.0 23.3 21.5 31.2 0.77 1.08 44.8 55.2
Covid-19 SARS-CoV-2 29.9 19.6 18.4 32.1 0.93 1.07 38.0 62.0
Table 1.
Ratio of bases [1, 9–11].
136
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
In [1, 5, 12, 13], stochastic complementary RNA bases are given for the genetic
code. On the assumption that C ¼ G ¼ 19%, A ¼ T ¼ U ¼ 31%, P denotes the
transition channel matrix expressed by
" # " #
C U 0:19 0:31
P¼ ¼ : (3)
A G 0:31 0:19
On the condition that the RNA base matrix ½C U; A G for the Markov process
described by two independent probabilities of its corresponding source varies from
0:19p to 0:31p, the transition channel matrix P is defined by
" # " # " #
0:19p 1 0:19p 0:5 1 0 :5 0:5 0:5
P¼ ¼ ¼ : (4)
1 0:19p 0:19p 1 0:5 0:5 0:5 0:5
where p is 2.631.
Applying in a similar fashion to the rest of (4),
" #
0:31p 1 0:31p 0:500 1 0:500 0:5 0:5
P¼ ¼ ¼ , (6)
1 0:31p 0:31p 1 0:500 0:500 0:5 0:5
137
Matrix Theory - Classics and Advances
Table 2.
Shannon entropy for probability p.
1 1
H2 ðPÞ ¼ p log 2 þ ð1 pÞ log 2 : (9)
p 1p
The last column of Table 2 shows the result of Eq. (9). Figure 1 portrays the curve
of Shannon and RNA Entropy. Make a mental note to make sure that a vertical tangent
can be drawn when p = 0 and p = 1 on account of the fact that
d 1 1 1 1
p log 2 þ ð1 pÞ log 2 ¼ log 2 1 log 2 þ 1 log 2 e
dp p 1p p 1p
1 1
¼ log 2 log 2 ¼ 0,
p 1p
(10)
138
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
Figure 1.
Comparison between Shannon and RNA entropy for probability p.
Then, we reach
1
p ¼ 1p ) p ¼ : (12)
2
For the RNA base matrix ½C U; A G, its symmetric entropy is calculated as
1 1
H2 ðPÞRNA ¼ p log 2 þ ð1 pÞ log 2 ¼ 0:9790, (13)
p 1p
when p is either 0.38 or 0.62. By the way, the Shannon entropy is calculated as
1 1
H 2ðPÞShannon ¼ p log2 þ ð1 pÞ log 2 ¼ 1, (14)
p 1p
139
Matrix Theory - Classics and Advances
The variance for RNA random variable X is denoted by V ðX Þ is the square of the
mean, which is expressed by
Ef Xg ¼ a ¼ 0:5: (15)
Therefore, we reach
Ef ðX 1 a 1Þ ðX 2 a 2 Þg ¼ Ef ðX 1 a1 Þg Efð X2 a 2Þ g ¼ 0: (21)
Assuming that X1 and X 2 are independent random variables, the sum of its
variances is calculated as
n o
Vf X1 þ X 2g ¼ E ðX 1 þ X 2 a1 a 2 Þ 2
n o n o
¼ E ðX1 a1 Þ2 þ 2E fðX 1 a1 ÞðX 2 a2 Þ g þ E ð X2 a 2Þ 2 (22)
If over a noise immune-free binary symmetric channel the bases of RNA genetic
code ½C U; A G are complementary such as C ¼ U and A ¼ G, the conditional
6
140
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
Figure 2.
Complementary bases of RNA genetic code ½C U; A G over noise immune-free binary symmetric channel.
probability P b j jai ¼ Pi,j makes description of this channel, whose maximum amount
of information can be transmitted as depicted in Figure 2. On the assumption that C
and G are one ’s complement of its corresponding error probability as well as A and U
are interference signals, the matrix [8] for this channel is made description of by
C U
½pðX Þ 12 ½P22 ¼ ½α 1 α ¼ ½pðY Þ12 ¼ ½ pð Y1 Þ p ðY 2Þ : (23)
A G
Under the condition that p and 1-p are the selection probability ðα ¼ 0Þ and ðα ¼ 1Þ
over the uniform channel on an individual basis, the mutual information is defined by
where
i.e. H ðY Þ ¼ p log 2p ð1 pÞ log 2ð1 pÞ ¼ 0:38 log 2 0:38 0:62 log 20:62 ¼ 1.
while Shannon capacity is derived as
141
Matrix Theory - Classics and Advances
Figure 3.
Shannon and RNA capacity vary with probability p.
be reached. In other words, the difference between Shannon and RNA capacity exists,
which is identical to the sum of variances of RNA base random variables because they
are unable to become a half over a symmetric channel.
Figure 4.
Two-user symmetric Interference Channel. (a) Strong Interference Channel. (b) Weak Interference Channel.
142
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
pffiffiffiffiffiffiffiffiffiffi
H 11 ¼ H12 ¼ hd PSNR,
pffiffiffiffiffiffiffiffiffiffi
H 12 ¼ H21 ¼ hc P SNR:
The relationship between the input and output for two user symmetric channel is
described as follows [14],
pffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffi
Y 1 ¼ h d PSNRX 1 þ hc P αSNR X2 þ Z1 , (29)
pffiffiffiffiffiffiffiffiffiffi pffiffiffiffiffiffiffiffiffiffi
Y 2 ¼ h c P αSNRX1 þ h d P SNR X2 þ Z2 , (30)
where the powers of input symbols X 1, X 2, and additive white Gaussian noise
(AWGN) terms Z1 and Z2 are normalized to unity. Analogous to the definition of the
degree of freedom (DoF), the total GDoF metric d(α) is defined as
C ðP SNR , αÞ
dðαÞ ¼ lim , (31)
P SNR!∞ log ðPSNR Þ
1 1 SNR SNR
R ¼ min log ð1 þ INR þ SNRÞ þ log 2 þ 1, log 1 þ INR þ 1 :
2 2 INR INR
(35)
143
Matrix Theory - Classics and Advances
C sym ¼ R: (39)
Figure 5 makes the description of the weak and strong interference region where
the leftmost indicates a very weak interference region while the rightmost suggests a
very strong interference region.
Analysis:
In 1948, Shannon proposed the code generation method by exploiting the random
codebook in point-to-point communication with inverse Gaussian distribution
(Gaussian distribution variance towards infinity is called inverse Gaussian) to achieve
the channel capacity, which is described as follows [8],
1 S
C ¼ log 2 1 þ , (40)
2 N
1 þ NS
lim
DoF ¼ x!∞ ¼ 1, (42)
1 þ NS !
10
144
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
Figure 5.
Generalized degree of freedom for Gaussian Channel (W curve).
1
R¼ log 2ð1 þ 2SNRÞ: (45)
2
11
145
Matrix Theory - Classics and Advances
log 2INR
On the condition that the ratio α ¼ log 2SNR is fixed and the strength of the signal is
much larger than that of interference and noise, it is able to treat interference as noise.
Therefore, the achievable rate is represented by
SNR
R ¼ log 2 1 þ : (49)
1 þ INR
0 1 0 1 0 1
SNR SNR
log log 2
B R C B 2 1 þ INR C B INR C
DoF ¼ lim B C¼ B C ≈B C
SNR!∞ @ SNR A @ log 2 ðSNRÞ A @ log 2ðSNRÞ A
log 2 (50)
1 þ INR
log 2ðSNRÞ log 2ðINRÞ log 2 ðINRÞ
¼ ¼1 ¼ ð1 αÞ:
log 2ðSNRÞ log 2ðSNRÞ
A block circulant Jacket matrix (BCJM) is defined by [7, 12, 13, 15].
12
146
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
ð51Þ
C U 2
1
C U 2
C U C U 3
C U
P ¼ , P ¼ ⊗ , P ¼ ⊗ ,
A G A G A G A G A G
(53)
where ⊗ denotes the Kronecker product. RNA consists of the sequence of 4 bases
where C, U, A, and G indicate cytosine, uracil, adenine, and guanine, on an individual basis.
According to the theory of noise-immunity coding, for 64 triplets, by comparing
them with strong roots and weak roots, it is able to construct a mosaic gene matrix
½C U; A G3 . If any triplet belongs to one of the strong roots, it is substituted for 1. In
an analogous fashion, if any triplet is included with one of the weak roots, it is
replaced with 1. Here, the strong roots are ðCC, CU , CG, AC, UC, GC, GU , GGÞ and
ðCA, AA , AU , AG, UA, UU , UG, GAÞ are the weak roots, which results in the singular
Rademacher matrix R 8 is in Table 3 [6, 16].
A novel encoding scheme is proposed as
ð54Þ
13
147
Matrix Theory - Classics and Advances
Table 3.
[C U;A G]3 code [6, 16].
R8 ≜ I 0 ⊗ C 0 ⊗ P 2 þ I1 ⊗ C1 ⊗ P2 , (55)
|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl
ffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}
1 0 0 1 1 1 1 1
where I0 ¼ ,I1 ¼ , C0 ¼ , C1 ¼ , and P2 is
0 1 1 0 1 1 1 1
1 1
the double stochastic permutation matrix represented by P2 ¼ . Eq. (54) has a
1 1
series of redundant rows which just repeat and are able to be canceled. From the
Rademacher matrix R8 , one version of its mosaic gene matrices can be reached as
0 1
1 1 1 1 1 1 1 1
B 1 1 1 1 1 1 1 1 C
R08 ¼ B
B C
C: (56)
@ 1 1 1 1 1 1 1 1 A
1 1 1 1 1 1 1 1
ð57 Þ
14
148
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
1 1 1 1
where C0 ¼ and C1 ¼ . These matrices are able to be
1 1 1 1
expanded into the DNA double helix or the RNA single strand, which indicates the
process by that DNA replicates its genetic information for itself, which is transcribed
into RNA and used to synthesize protein for its translation. Therefore,
R004 ≜ I 0 ⊗ C 0 þ I1 ⊗ C 1, (58)
|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}
ð1Þ ð1Þ
where C0 has eigenvalues such that λ1 ¼ 1 þ i and λ2 ¼ 1 i, and their
eigenvectors ς1 ¼ ð 1 i ÞT and ς 2 ¼ ð 1 i ÞT , correspondingly. In addition, C 1 has
ð2Þ pffiffiffi ð2Þ pffiffiffi
eigenvalues such that λ 1 ¼ 2 and λ2 ¼ 2 where their eigenvectors ς1 ¼
pffiffiffi T pffiffiffi T
1 þ 2 1 and ς1 ¼ 1 2 1 on an individual basis [3, 17]. Then,
where k = 1.
ð60Þ
15
149
Matrix Theory - Classics and Advances
ð61Þ
which is expanded into Eq. (63) and Eq. (64). These are other versions of variants
of genomatrices.
2 3
1 1 1 1
" # 6 7
A G 6 1 1 1 17
6 7
¼6 7
U C 6 1 1 1 1 7
4 5 , (63)
1 1 1 1
" # " # " # " #
1 1 1 1 1 1
¼ ½1 0⊗ ⊗ þ ½ 0 1⊗ ⊗
1 1 1 1 1 1
2 3
1 1 1 1
" # 6 7
G U 6 1 1 1 1 7
¼66 7
7
C A 4 1 1 1 1 5
6 7
, (64)
1 1 1 1
" # " # " # " #
1 1 1 1 1 1
¼ ½1 0⊗ ⊗ þ ½ 0 1⊗ ⊗
1 1 1 1 1 1
2 3
1 1 1 1
" # 6 7
C U 6 1 1 1 1 7
6 7
¼6 7
G A 61 1 1 1 7
4 5 , (65)
1 1 1 1
1 1 1 1 1 1
¼ ½1 0⊗ ⊗ þ½0 1⊗ ⊗
"1 # "1 1 # "1# " 1 1 #
16
150
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
2 3
1 1 1 1
" # 6 7
C A 61 1 1 1 7
6 7
¼6
6
7
7
G U 61 1 1 1 7
4 5 , (66)
1
1 1 1
" # " # " # " #
1 1 1 1 1 1
¼ ½1 0 ⊗ ⊗ þ½0 1⊗ ⊗
1 1 1 1 1 1
2 3
1 1 1 1
" # 661 1
7
G A 6 1 1 7
7
¼66
7
7
C U 6 1 1 1 1 7
4 5 : (67)
1 1 1 1
" # " # " # " #
1 1 1 1 1 1
¼ ½1 0⊗ ⊗ þ½0 1⊗ ⊗
1 1 1 1 1 1
Eq. (62–67) are six versions of variants of genomatrices, which indicate six half
pairs expanded from symmetric RNA genetic matrices by an upper-lower scheme. In
other words, they are constructed by rotating the block in the direction from upper to
low or vice versa.
ð68Þ
17
151
Matrix Theory - Classics and Advances
ð69Þ
which is expanded into Eq. (71) and Eq. (72). These are other versions of variants
of genomatrices.
2 3
1 1 1 1
6 7
" # 6 7
G C 6 1 1 1 17
6 7
¼6 7
U A 6 1 1 1 1 7
6 7
4 5 , (71)
1 1 1 1
" # " # " # " #
1 1 1 0 1 1
¼ ⊗½1 1⊗ þ ⊗½1 1⊗
0 1 1 1 1 1
2 3
1 1 1 1
" # 6 7
U A 6 1
6 1 1 177
¼6
6
7
7
C G 6 1 1 1 1 7
4 5 , (72)
1 1 1 1
1 1 1 0 1 1
¼ ⊗½1 1⊗ þ ⊗½1 1⊗
" 0# " 1 1 # "1 # "1 1 #
18
152
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
2 3
1 1 1 1
" # 6 7
A U 61
6 1 1 1 7
7
¼6 7
G C 61
4 1 1 1 7
5 , (73)
1 1 1 1
" # " # " # " #
1 1 1 0 1 1
¼ ⊗½1 1⊗ þ ⊗½1 1⊗
0 1 1 1 1 1
2 3
1 1 1 1
" # 6 7
G C 6 1 1 1 1 7
¼6
6
7
7
A U 6 1 1 1 1 7
4 5 , (74)
1 1 1 1
" # " # " # " #
1 1 1 0 1 1
¼ ⊗ ½ 1 1 ⊗ þ ⊗½1 1⊗
0 1 1 1 1 1
2 3
1 1 1 1
" # 6 7
C G 6 1 1 1 1 7
6 7
¼6 7
A U 6 1 1 1 1 7
4 5 : (75)
1 1 1 1
" # " # " # " #
1 1 1 0 1 1
¼ ⊗½1 1⊗ þ ⊗½1 1⊗
0 1 1 1 1 1
Eqs. (70)-(75) are 6 versions of variants of genomatrices, which indicate six half
pairs expanded from symmetric RNA genetic matrices by the left-right scheme. In
other words, they are constructed by rotating the block in the direction from upper to
low or vice versa.
Construct a block matrix ½CN by Jacket matrices ½C0 p and ½ C1 p such as
C0 C 1
½CN ¼ where its order N is 2p. This matrix is called block circulant if only
C1 C 0
if C0 CRT RT
1 þ C 1 C0 ¼ ½0 N , where
RT
is the reciprocal transpose. In other words, ½C N is
a block circulant Jacket matrix (BCJM) [12, 13, 15, 18]. From the fact that C 0C RT0 ¼
p½Ip and C1 CRT
1 ¼ p ½I p , C0 and C1 are Jacket matrices. Look back on the fact that ½CN
is a Jacket matrix if only if ½C½CRT ¼ NIN , where RT is the reciprocal transpose.
Therefore, ½C is a Jacket matrix if only if
!
C 0 CRT RT
RT
RT C0 C1 C 0 C1 2p½Ip 1 þ C 1 C0
½C½C ¼ ¼ ¼ NI N ,
C1 C0 C1 C 0 C 0CRT RT
1 þ C1 C0 2p½I p
(76)
where RT is the reciprocal transpose. Therefore, Eq. (76) results in plenty of BCJMs.
19
153
Matrix Theory - Classics and Advances
E ⊗ fð I0 ⊗ AÞ þ ðI 1 ⊗ BÞg ⊗ |{z}
|{z} F : (78)
|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}
Position Main Body Kernel Extending
Eq. (58) is an RNA pattern by the main kernel. By applying an upper-lower or left-
right scheme to the genetic matrix, the position matrix E creates the patterns analogous
to Eq. (61, 69). Analogously, by applying the upper-lower and left-right scheme to the
genetic matrix, the extending matrix F creates the patterns analogous to Eq. (60, 68).
South Korea’s national flag stands for different symbols of trigrams and Yin-Yang
located in its middle, which is analogous to that of Figure 6. We present 24 versions of
variants of genomatrices, which distinguish from each other by replacing their subsets
1
with the kernel shown in Figure 6 like its left-hand side ⊗ ½ 1 1 , its right-hand
0
Figure 6.
General pattern by block circulant, upper-lower, and left–right scheme: Normal case.
20
154
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
0 1
side ⊗ ½ 1 1 , its upper position ð 1 0 Þ ⊗ , its lower position
1 1
1
ð0 1Þ⊗ , and its center part I0 ⊗ C0 þ I 1 ⊗ C 1 , on an individual basis.
1
1 1 1
From the fact that ð 1 0 Þ ⊗ $ ð0 1Þ⊗ and ⊗ð1 1Þ $
1 1 0
0
⊗ ð 1 1 Þ, upper symmetric genetic matrices are complementary with lower
1
ones while left ones are complementary with right ones.
In addition, the pattern is created by block circulant, upper-lower, and left–right
scheme on the ½ symmetric block, which are analyzed in three cases.
Case 1. Block circulant scheme
2 3
" 1# 1 1 1
C U 6 1 1 1 1 7
6 7
¼6 7
A G 4 1 1 1 1 5
1 1 1 1 (79)
" # " # " # " #
1 0 1 1 0 1 1 1
¼ ⊗ þ ⊗ :
0 Adiag 1 1 1 0 1 1
2 3
" # 1 1 1 1
U C 6 1 1 1 1 7
6 7
¼6 7
G A 4 1 1 1 1 5
1 1 1 1 (80)
" # " # " # " #
1 0 1 1 0 1 1 1
¼ ⊗ þ ⊗ :
0 1 1 1 AAntidiag 0 1 1
21
155
Matrix Theory - Classics and Advances
Eq. (79) is a block circulant while Eq. (80) is not. Meanwhile, one part of
Eq. (81, 82) is upper-lower symmetric while the other is not. By the way, one part of
Eq. (83, 84) is left–right symmetric while the other part is not. Figure 7 shows a
certain pattern constructed by a series of the product of ½C A; U G as well as a
distorted pattern in comparison with that in Figure 6. Therefore, these are called
sickness pattern, which can cover COVID-19.
Figure 7.
Abnormal pattern by block circulant, upper-lower, and left–right scheme.
22
156
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
9. Conclusion
Conflict of interest
23
157
Matrix Theory - Classics and Advances
158
The COVID-19 DNA-RNA Genetic Code Analysis Using Double Stochastic and Block Circulant…
DOI: http://dx.doi.org/10.5772/ITexLi.102342
References
25
159
Matrix Theory - Classics and Advances
160
Chapter 8
Abstract
The interest in quantum information processing has given rise to the development
of programming languages and tools that facilitate the design and simulation of
quantum circuits. However, since the quantum theory is fundamentally based on
linear algebra, these high-level languages partially hide the underlying structure of
quantum systems. We show that in certain cases of practical interest, keeping a handle
on the matrix representation of the quantum systems is a fruitful approach because it
allows the use of powerful tools of linear algebra to better understand their behavior
and to better implement simulation programs. We especially focus on the Joint
EigenValue Decomposition (JEVD). After giving a theoretical description of this
method, which aims at finding a common basis of eigenvectors of a set of matrices, we
show how it can easily be implemented on a Matrix-oriented programming language,
such as Matlab (or, equivalently, Octave). Then, through two examples taken from
the quantum information domain (quantum search based on a quantum walk and
quantum coding), we show that JEVD is a powerful tool both for elaborating new
theoretical developments and for simulation.
1. Introduction
161
Matrix Theory - Classics and Advances
2. Mathematical background
1 1
1
H ¼ pffiffiffi and HN ¼ H ⊗ n ðN ¼ 2n Þ (1)
2 1 1
162
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
ak1B ⋯ a klB
Assuming the sizes are such that one can form the matrix products AC and BD, an
interesting property, known as the mixed-product property, is:
The Kronecker product is associative, but not commutative. However, there exist
permutation matrices (the shuffle matrices defined in the previous subsection) such
that, if A is an a a square matrix and B a b b square matrix, then [9]:
A ¼ USV ∗ (6)
where U and V are unitary matrices, and S is diagonal. The diagonal of S contains
the Singular Values, which are real nonnegative numbers, ranked by decreasing order.
The sizes of the matrices are U ðm mÞ, Sðm nÞ and V ðn nÞ. The SVD is a very
useful linear algebra tool because it reveals a great deal about the structure of a matrix.
The image and the kernel of A are defined by:
When used in an algorithm, the notation null will also be used for a procedure that
computes a matrix whose columns are an orthonormal basis of the kernel of A.
The complement of a subspace A within a vector space H is defined by:
3
163
Matrix Theory - Classics and Advances
where D is a diagonal matrix, the diagonal of which contains the eigenvalues, and
V is a unitary matrix whose columns are the eigenvectors.
Let us note EA
λ the eigenspace of an operator A associated with an eigenvalue λ: The
,B
joint eigenspace EAλ,μ is:
,B
EA A B
λ,μ ¼ Eλ ∩E μ (13)
,B
A property of great interest in quantum information processing is that within EA
λ,μ
(and even within any union of joint eigenspaces) the operators A and B commute.
Determination of the joint eigenspace on a computer may be determined through
the complement, because:
c B cc
EA B
λ ∩E μ ¼ EA
λ ∪ Eμ (14)
C ¼ null null ðAλ Þ null Bμ (15)
However, it is not efficient in terms of complexity and in the next sections we will
propose faster computational procedures, adapted to each context.
A lower bound on the dimension of a joint eigenspace can be obtained as follows.
Let us note n the dimension of the full space. We have, obviously:
4
164
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
dim E A B
λ ∪E μ ≤ n (16)
dim E A B A B A B
λ ∪E μ ¼ dim Eλ þ dim Eμ dim Eλ ∩Eμ (17)
jψ i ¼ α 0 j0 i þ α1 j1 i (19)
2 2
where α0 and α 1 are complex numbers subject to jα 0j þ jα1 j ¼ 1.
5
165
Matrix Theory - Classics and Advances
Note that, contrary to classical digital registers, the qubits are usually not separa-
ble, hence the register must be considered as a whole. We say that the qubits are
entangled. However in the special case where the coefficients γ ab can be decomposed
in the form γab ¼ αa β b the state can be written as a tensor product of the states of two
qubits, which can be considered separately. Then, we have:
Figure 1.
Principle of quantum coding.
166
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
Figure 2.
CNOT quantum gate.
matrix U . Then, errors may occur on the encoded state: they are represented by a
unitary matrix E. The decoder is represented by another unitary matrix U ∗ which is
the transpose conjugate of the encoding matrix. Finally, we measure the last r qubits
of the decoded state, and, depending on the result of the measurement, we apply the
appropriate restoration matrix Uc (which is a unitary matrix of size 2k 2 k) to the
k-qubit register composed of the first k qubits of the decoded state.
As an illustration, let us consider n ¼ 2, k ¼ 1 and the very simple quantum
encoder shown in Figure 2. It is a basic quantum circuit known as the CNOT quantum
gate, and it is represented by the unitary matrix below:
1 0 0 0
0 1
B0 1 0 0C
U¼ B (22)
B C
@0 0 0 1A
C
0 0 1 0
Let us consider that an error may appear on the first encoded qubit and that this
error, if present, is represented by the unitary Pauli matrix X . Then, the error matrix
which acts on the encoded state is:
0 1 0 0
0 1
B 1 0 0 0C
E ¼ X⊗I ¼ B (24)
B C
@0 0 0 1 A
C
0 0 1 0
0 0 0 1
0 1
B0 0 1 0C
F ¼ U ∗ EU ¼ B C¼ X ⊗X (25)
B C
@0 1 0 0A
1 0 0 0
167
Matrix Theory - Classics and Advances
Figure 3.
Steane encoder.
Measuring the second qubit on the output of the decoder consists in decomposing
the state space into a direct sum H0⊕H 1 of two subspaces spanned by fj 00i, j10 ig and
fj01i, j 11ig. The result of the measurement will be either 0 or 1 (index of the selected
subspace), and by analogy with classical decoding, this result will be called the “syn-
drome.” The projections on these subspaces are ½ 00 T and ½ α 1α0 T. Then the proba-
bility to obtain syndrome 1 is 1.
The measurement then projects the state onto H 1. Note that in this particular case,
the information is preserved by the projection. Then, applying the operator U c ¼ X to
the projected state restores the initial state.
Similarly, if there is no error, we can see that F is the identity matrix, then the
projections on the subspaces are ½ α0 α1 T and ½ 00 T . In that case, the syndrome is 0
and the state is projected onto H0 . Correction is done by applying the operator I to the
projected state, which is equivalent to doing nothing.
The very simple code used above, as an illustration, cannot correct more complex
errors (for instance, an error Z on the first qubit). However, there exist efficient
quantum codes, such as the Steane code [13], and the Shor code [14]. A remarkable
result of quantum coding theory is that a linear combination of correctable errors is
correctable [15].
Figure 3 shows the Steane Encoder, which is a ðn ¼ 7, k ¼ 1, t ¼ 1Þ quantum
encoder. This means that it encodes k ¼ 1 qubits on n ¼ 7 qubits and it is able to
correct any error occurring on t ¼ 1 encoded qubits. It is built with Hadamard
(Eq. (1)) and CNOT (Eq. (22)) quantum gates. From this circuit description, it is
possible to obtain the coding matrix U .
The problem we address can be stated as follows (see Figure 1 for the notations)—
given a list of n independent Pauli errors Ei with corresponding diagonal outer errors
Fi , determine the unitary operator U (quantum encoder) such that:
U ∗ Ei U ¼ F i ∀i ∈ f 1, … , ng (26)
This equation shows that the columns of U are the eigenvectors of Ei. Specification
of the code by a small set of Pauli errors is very convenient and the interest of
8
168
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
Ei Fi
Z ⊗Z ⊗Z ⊗Z ⊗Z ⊗Z ⊗Z Z ⊗I ⊗I ⊗I ⊗I ⊗I ⊗I
X ⊗X ⊗I ⊗I ⊗I ⊗X ⊗X I ⊗Z ⊗I ⊗I ⊗I ⊗I ⊗I
Z⊗ Z ⊗ I ⊗ I ⊗ I ⊗ Z ⊗ Z I ⊗I ⊗I ⊗I ⊗Z ⊗I ⊗I
Z⊗I⊗I⊗Z ⊗Z ⊗Z ⊗I I⊗I⊗I⊗I⊗I⊗I⊗Z
Table 1.
Collection of Pauli errors.
Figure 4.
Diagonals of matrices F i.
169
Matrix Theory - Classics and Advances
if K ðcÞ ¼ 1 then
∣ Y 1 ¼ A1
end
for k ¼ K ðcÞ þ 1 to n do
Ck ¼ B ∗ Yk1
k
Zk ¼ nullðCkÞ
Yk ¼ Y k1Z k
• Ck ¼ Bk∗ Y k1: The orthonormal basis of Y k1 is projected on the kernel of Ak . The
components of the projected vectors are expressed in the orthonormal basis Bk of
that kernel. Consequently, Im ðCk Þ is the projection of Yk1 on the kernel of Ak ,
expressed in that kernel.
• Finally, the components of the basis vectors are restored to the original space by
Yk ¼ Y k1Z k
Let us prove that the matrices Y k have orthonormal columns. This is obviously the
case for k ¼ 1. Then, by recursion, we have:
Yk∗ Y k ¼ Z k∗ Y k1
∗
Yk1 Z k ¼ I (27)
We have also:
Then
10
170
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
Ei Fi
X⊗X⊗X ⊗X ⊗X ⊗X ⊗X X ⊗I⊗I⊗I⊗I⊗I⊗I
X⊗I⊗I⊗I⊗I⊗X⊗I I ⊗I⊗I⊗I⊗I⊗X⊗I
X⊗I⊗I⊗I⊗I⊗I⊗X I ⊗I ⊗ I ⊗I ⊗I ⊗I ⊗ X
Table 2.
Additional Collection of Pauli errors.
Then
Bk∗ Y k1 b ¼ 0 ) Ck b ¼ 0 ) ∃a : b ¼ Zk a
Figure 5.
Estimated Matrix U for the Steane encoder.
11
171
Matrix Theory - Classics and Advances
Figure 5 shows the matrix computed by our method. We have checked that it is
equal to the matrix directly computed from the circuit description.
The programmer may speed up the computation by taking into account the fact
that when computing columns c of U , some matrices Yk have already been computed
for other columns and can be reused. For instance, in Figure 4, we see that the joint
eigenvalues corresponding to columns 19 and 20 are the same, except the last one.
Then, when computing column 20, we can set Kð20Þ ¼ n in Algorithm 1 instead of the
default value Kð20Þ ¼ 1, because the Yn1 for column 20 is the same as for column 19.
More generally, Algorithm 2 written in pseudo-Octave code computes the optimal
values of the Kð cÞ.
K ¼ ½1 1
for k ¼ 2 to n do
| K ¼ reshape K ; k ∗ ones 1, 2 k1 1 2k
end
Let us consider a particle that can move on a graph. In the classical world, at the
time t this particle is localized at a node of the graph. It can then randomly choose
one of the edges linked to this node to reach one of the adjacent nodes at a time t þ 1.
The repeated iteration of this process is the concept of classical random walk.
A quantum walk [17] relies also on a graph, but contrary to the classical walk,
here the particle may be located at many nodes at the same time and can choose
many edges simultaneously. At the time t, the state of the particle is then described by
a state vector j ψ ti and the evolution
between
times t and t þ 1 is given by a
unitary matrix U ¼ SC such that ψ tþ1 ¼ Uj ψt i. The unitary matrices C and S
represent, respectively, the choice of the edges and the movement to the
adjacent nodes.
In the following, we will consider graphs associated with hypercubes [18]. We will
note n the dimension of the hypercube and N ¼ 2 n the number of nodes. Figure 6
shows the graph corresponding to a hypercube of dimension n ¼ 3. It is convenient to
label the nodes by binary words. In quantum language, these binary words κ are also
used to label the basis vectors of the so-called position space HS .
The quantum state lives in a Hilbert space built by the tensor product of the
position space HS (corresponding to the nodes) and the coin space H C (corresponding
to the possible movements along the edges) H ¼ HS ⊗ H C . The dimensions of these
state spaces are Ne ¼ nN , N and n.
It is usual to define C as [19]:
C ¼ IN ⊗ G n (31)
12
172
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
Figure 6.
Hypercube for n ¼ 3 .
Figure 7.
Matrices C (left) and O (right) for n ¼ 3.
S ¼ P^
SPT (32)
where
173
Matrix Theory - Classics and Advances
The last equation just means that, because a movement along direction d corre-
sponds to an inversion of the dth bit in κ , the shift operator permutes the values
associated to nodes that are adjacent along that dimension.
A quantum walk search can be described by repeated application of a unitary
evolution operator Q , which can be written:
Q ¼ UO (34)
Here O is the oracle, which aims at marking the solutions. An example of oracle
structure is shown in Figure 7. It is a block-diagonal matrix, whose blocks are G n
when they correspond to a solution and In otherwise. Denote M the number of
solutions and assume that M ≪ N (otherwise the quantum walk search would serve no
purpose because the probability of rapidly finding a solution with a classical search
would be high). In the example shown in the figure, there are M ¼ 2 solutions (located
at positions 1 and 4 on Figure 6).
Let t denote the number of iterations until a measurement is performed. Starting
from an initial state j ψ 0i , repeated iterations lead to the state jψ ti ¼ Q t jψ 0 i which is
then measured. The theory of quantum walk search [19] shows that the probability of
success (that is the probability of obtaining a solution by measurement) oscillates as a
function of t. This means that theoretical tools which help to understand and simulate
quantum walk search lead to the development of methods to determine the optimal
time of measurement.
In the sequel, we will show that JEVD is a fruitful tool in this context. Indeed, set E
to be the union of the joint eigenspaces of U and O, and E its complement. Inside E,
the operators commute. So, if we note with index E the restrictions of the operators to
E, we have:
Set
F ¼ HN ⊗ In (36)
14
174
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
FSF ¼ ðH N ⊗ I n ÞPN ,n ^
SP n,Nð HN ⊗ I nÞ (37)
¼ Pn,N ðIn ⊗ H NÞ^
S ðIn ⊗ H NÞPN ,n (38)
¼ PT diag … , HN ^ Sd H N, … P (39)
The latter term is diagonal because the mixed product property, H2 ¼ I and
HXH ¼ Z , shows that:
Once more, using the mixed product property, we can also prove that F keeps C
unchanged, that is:
FCF ¼ C (41)
Note that the diagonal of Sκ contains k times 1 and n k times þ1 (where k is the
Hamming weight of κ ).
Then, because F2 ¼ I , FUF is a block diagonal matrix:
Block κ is then
U κ ¼ S κG n (45)
We have:
dim EU S κ,Gn
≥ dim E þ,
κ
(46)
≥ dim ESþκ þ dim EG
n
n
(47)
≥ ðn kÞ þ ðn 1Þ n (48)
≥n k 1 (49)
and
dim EU S κ,Gn
þ ≥ dim E ,
κ
(50)
≥ dim ESκ þ dim EG
n
n
(51)
≥ k þ ðn 1Þ n (52)
≥k 1 (53)
Then, there is only room left for at most 2 eigenvalues, specifically, at most a pair
of conjugate ones.
15
175
Matrix Theory - Classics and Advances
Assume that this pair of eigenvalues exists. Since the diagonal of Gn contains
1 þ 2n, the trace of Uκ is:
2
trace ðU κ Þ ¼ ððn kÞ ðþ1Þ þ kð1ÞÞ 1 þ (54)
n
k
¼ n þ 2k þ 2 1 2 (55)
n
The sum of the eigenvalues is equal to the trace and we already have eigenvalue 1
with multiplicity n k 1 and eigenvalue þ1 with multiplicity k 1. The sum of
these n 2 eigenvalues is n þ 2k. Then the sum of the two missing eigenvalues must
be 2 1 2 kn . Let us denote them by λk and λk∗ . We must have Re ðλ kÞ ¼ 1 2 kn. Then,
Considering the eigenvalues of Gn and In it is trivial to show that the dimensions
of the eigenspaces of the oracle are:
dim EO
¼M and dim EO
þ ¼ Ne M (57)
2n
X 2n
X
βj≥ α j 2nM (58)
j¼1 j¼1
P 2n
Obviously, we have j¼1 α j ¼ Ne, so that
2n
X
β j ≥ Ne 2nM (59)
j¼1
This is a remarkable result—despite the fact that the dimension of the Hilbert
space grows exponentially (N e ¼ n2n), the dimension of the complement grows only
linearly with n.
16
176
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
5.4.1 Introduction
To check our theoretical upper bound, we propose an efficient algorithm for fast
computation of the joint eigenspaces.
We have to compute orthonormal bases of joint eigenspaces of U and O. The
dimension of EO is small, hence, it makes sense to define it by an orthonormal basis
generating the eigenspace. However, the dimension of EO þ is large (greater than Ne =2).
Hence, it is computationally more efficient to define it by an orthonormal basis of its
complement (which is EO O O
). Indeed dim E ≪ dim E þ . We then have to design an
algorithm adapted to each case.
C ¼ A ∗B (61)
C ¼ Uc Sc V c∗ (62)
Denote by sk the singular values (the diagonal elements of S c ) and determine r such
that s k ≥ 1 ε for k ¼ 1, … , r, and sk < 1 ε for k > r. Here ε ≪ 1 is a very small positive
value introduced to take into account the presence of small errors due to computer
finite precision arithmetic. Finally:
J ¼ AUc ð: , 1 : rÞ (63)
5.4.3 Intersection of two eigenspaces, one of them being defined by an orthonormal basis of
its complement
Z ¼ null ðC ∗ Þ (64)
17
177
Matrix Theory - Classics and Advances
,U
λO λU dimEO
λ O,λ U
1 any λk or λ∗k 0
þ1 λ0 ¼ þ1 321
þ1 λ1 or λ1∗ 4
þ1 λ2 or λ∗2 18
þ1 λ3 or λ∗3 32
þ1 λ4 or λ∗4 32
þ1 λ5 or λ5∗ 18
þ1 λ6 or λ∗6 4
þ1 λ7 ¼ 1 321
Table 3.
Joint eigenspaces of O and U for n ¼ 7 and M ¼ 3 solutions located at nodes 2, 8, 9.
,O
and we obtain an Ne r matrix J whose columns are an orthonormal basis of EU
λ,þ
from:
J ¼ AZ (65)
The justification of the algorithm is as follows. The q columns of C are a basis of the
projection of ImðBÞ into ImðAÞ, the components being expressed in the basis of ImðAÞ
The complement of ImðCÞ in ImðAÞ is the desired intersection (expressed in ImðAÞ).
The columns of Z are an orthonormal basis of this intersection. Finally, Eq. (65)
restores the components in the original space.
2n
X
dim E c ¼ N e β j ¼ 38 (66)
j¼1
We can see that, as expected, this dimension (dim Ec ¼ 38) is much smaller than
the dimension of the original state space (N e ¼ 896). We can also check that it is less
than the theoretical upper bound (2nM ¼ 42), as expected.
6. Conclusions
178
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
theory concepts, such as JEVD, are powerful tools to propose new theoretical results as
well as efficient simulation algorithms.
In the domain of quantum coding, we have shown how to determine the encoding
matrix of a quantum code from a collection of Pauli errors. On a more speculative note
to be part of future work concerning interception of quantum channels, it might also
be useful to identify the quantum coder used by a noncooperative transmitter.
In the domain of quantum walk search, thanks to JEVD we have proved that there
exists a small subspace of the whole Hilbert space which captures the essence of the
search process, and we have given an algorithm that allows us to check this result by
simulation.
Acknowledgements
Abbreviations
179
Matrix Theory - Classics and Advances
References
[1] Humble TS. Quantum security for the [11] Wootters W, Zurek W. A single
physical layer. IEEE Communications quantum cannot be cloned. Nature. 1982;
Magazine. 2013;51(8):56-62 299(5886):802-803. DOI: 10.1038/299
802a0
[2] Liao SK et al. Satellite-to-ground
quantum key distribution. Nature. 2017; [12] Nielsen MA, Chuang IL. Quantum
549:43-47 Computation and Quantum Information.
Cambridge, UK: Cambridge University
[3] Ren JG et al. Ground-to-satellite Press; 2010
quantum teleportation. Nature. 2017;
549:70-73 [13] Steane A. Multiple-particle
interference and quantum error
[4] Valivarthi R et al. Quantum correction. Proceeding of the Royal
teleportation across a metropolitan fibre Society of London. 1996;452(1954):
network. Nature Photonics. 2016;10: 2551-2577. DOI: 10.1098/rspa.1996.0136
676-680
[14] Shor PW. Scheme for reducing
[5] Yin J et al. Quantum teleportation and decoherence in quantum computer
entanglement distribution over 100- memory. Physical Review A. 1995;52(4):
kilometre freespace channels. Nature. R2493-R2496. DOI: 10.1103/PhysRevA.
2012;488:185-188 52.R2493
20
180
Joint EigenValue Decomposition for Quantum Information Theory and Processing
DOI: http://dx.doi.org/10.5772/ITexLi.102899
181
Chapter 9
Abstract
1. Introduction
Three-phase, doubly-fed induction (DFI) machines have a long history [1–4] and
continue to be key constituents in energy conversion processes [5, 6]. Motivation for
modeling a DFI generator comes from the need to deal with intermittency in the
primary energy supply (e.g., the wind field) and with uncertainty in the load (i.e., the
grid). Similarly, the modeling and control of a DFI motor can improve the efficiency
and reliability of electric-to-mechanical work conversion. The equations modeling the
ideal DFI machine have been studied for more than a century. Results in modeling and
control [7–13], including those which derive from numerical simulation, demonstrate
how attention to the DFI machine is being continuously paid. In essence, the ideal three-
phase machine model centers on two matrices, on which this work focuses: the rotor-to-
stator mutual inductance matrix, Lsr ½:, which depends on the “rotor angle” θr and
characterizes the machine itself, and the Blondel-Park transformation matrix, K½:,
182
Matrix Theory - Classics and Advances
which depends on another angle and describes a change of variables, from the fabcg
reference frame (Section 2) to the fdq0 g reference frame (Section 4). Both matrices,
Lsr ½: and K½:, appear in the Electric Torque Theorem (ETT) which relates mechanical
to electrical variables and as such represents the raison d’être of the DFI machine. Stated
in the fabcg frame, the ETT is a straightforward application of energy balance, once a
Legendre transformation (Section 5.2) has been introduced and co-energy accordingly
defined. Instead, the proof, in fact the translation of the ETT in the fdq0g frame
(Section 5.3), requires all relevant properties of Lsr ½: and K½: to be known. For this
reason, in Section 3 the kernel, the range (Proposition 1), the classical adjoint and the
left zero divisors (Proposition 3) of L sr½: are determined. Derivation benefits from L sr ½:
being a circulant matrix [14] (Lemma 1) and from its eigenvalues representing the
discrete Fourier transform of a 3-sequence [15] (Proposition 2). Special attention need
two constant matrices, A0 and its square (Lemma 2), because they relate differentiation
of Lsr with respect to θr to multiplication (Theorem 1). In a suitable subspace of 3 , Lsr
admits an exponential representation (Theorem 2) with A0 as infinitesimal generator.
Section 4 is devoted to K½:: its structure, as well as its exponential representation with
generator A0 , is inferred by satisfying, in sequence, a list of requirements (Proposition
4 and Theorem 4). The key formula for the product of matrices (Theorem 5) is then
applied to prove the ETT in the fdq0g frame in one step (Theorem 6). An attempt is
finally made in Section 6 to deal with a “realistic” machine model, where the three-fold
rotor symmetry is broken: the b and c rotor axes are misaligned by angles ϵb and ϵ c. To
second-order in ϵb and ϵ c there exists a constant B which relates differentiation to
multiplication of the approximate inductance matrix (Proposition 7).
Definition 1. (Three-phase, ideal DFI machine) [3, 4]. A three-phase DFI machine is
said ideal whenever its stator and rotor windings exhibit three-fold symmetry. More-
over, magnetomotive forces and flux waves created by the windings are sinusoidally
distributed and windings give rise to a linear electric network.
Remark 1. (Neglected phenomena). Higher harmonics, hysteresis, and eddy currents
are excluded by the model. Deviations from three-fold symmetry will be addressed in
Section 6.
Notation. (fabcg frames). The most natural frames where three-phase stator and
rotor voltages and currents can be represented are the f abcg frames. For example, the
stator currents are an ordered triple which one agrees to represent as a vector
! Trs
∈ 3 , (1)
j fabcgs ¼ jas jbs jcs
whose components are functions of time t ∈ T. A similar notation will hold for
other electric quantities. Unless otherwise specified, all vectors are understood in 3 .
Hp. (Function class). Dependence of all quantities of interest on time is assumed as
smooth as required.
Definition 2. (Balanced triple). An fabcg current triple is balanced or is a trivial
zero sequence, whenever
ja þ jb þ j c ¼ 0, ∀t ∈ T: (2)
183
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
Such sequences define the subspace B ⊂ 3 , a plane through the origin; the
!
corresponding notation is j B ∈ B.
Notation.
θr ½t ∈ ½0, 2π Þ is the electric rotor angle at a time t, formed by the rotor ar axis with
respect to the stator as axis.
βr ½t ∈ ½0, 2π Þ is the electric rotor angle at a time t, formed by d axis with respect to
the rotor ar axis (Section 4).
βs ½t ∈ ½0, 2π Þ is the electric rotor angle at a time t, formed by d axis with respect to
the stator as axis (Section 4).
!0 !
j fabcgr ≔ NN s j fabcgr
r
is the stator-referred (ständer-bezogen) vector of rotor currents,
where Nr and Ns are the rotor and stator turns.
A boldface, roman capital denotes a matrix in M½Nrow , N col of N row ð ≥ 2Þ rows
N col ð ≥ 2Þ columns.
am∗ ,n is the cofactor of entry f m, ng in A ∈ M ½N, N , where Nrow ¼ N col ¼ N ð ≥ 2Þ.
∗
am,n is the corresponding matrix.
adj½A is the matrix adjoint to A ∈ M½N, N , obtained by transposing a ∗m,n .
! ! ! !
u )( v is the dyadic product of the column vector u by the row vector v , both ∈ 3.
1 3 is the 3 3 identity matrix.
The end of a Proof is marked by ⊳⊲, that of a general statement or of a Remark
by ◊.
The electrical network equations which describe the dynamics of the three-phase,
ideal DFI machine are well-known [3, 4], as a consequence they are omitted.
Nr
L0sr ½θ r ≔ Lms Lsr ½θ r , (4)
Ns
184
Matrix Theory - Classics and Advances
• det ½L sr ¼ 0, ∀θr ,
n! o
• dim ½Ker ½L sr ¼ 1 and Ker½ L sr ¼ j ∈ 3j j1 ¼ j 2 ¼ j3 ≔ K sr, ∀θr .
¼ Ksr ⊕ B , (5)
! ! !
j ¼ j Ksr þ jB
B : j 1 þ j 2 þ j 3 ¼ 0 of Eq. (2).
N 1
X
ψ ½ζ ≔ cn ζ n : (6)
n¼0
N 1
c n ¼ ψ ϵ0 when ℓ ¼ N :
X
ψ ϵN ¼ (7)
n¼1
N1
X
cn ¼ 0 (8)
n¼1
ψ ϵN ¼ 0 ¼ μ0 : (9)
185
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
2 3
c0 c2 c1
6c
Lsr½θ ¼ 4 1 c0 c2 7
5 (10)
c2 c1 c0
and
1 1 1
2 3
ðb0 b 1 b2 Þ ¼ ðc0 c 1 c2 Þ 4 1 ϵ
6 ϵ2 7
5 ≔ ðc0 c1 c 2 Þ T : (12)
1 ϵ2 ϵ
b0 0 0
2 3
h i
P cð3Þ ¼ T 1 4 0 b1 0 75T : (13)
6
0 0 b2
1 0 0
2 3
ð μ0 μ1 μ 2Þ ¼ ðc0 c2 c1Þ 6
4 0 0 175 T : (14)
0 1 0
Lemma 2 (The matrix A0 and its properties). Let the matrix A0 be defined by
0 1 1
2 3
1 6
A 0 ≔ pffiffiffi 4 1 0 1 5 : (15)
7
3
1 1 0
186
Matrix Theory - Classics and Advances
• (Left zero divisors). The constant, nontrivial left zero divisors of A0 are given by
dyads
! !
h i
!
Z c ¼ c Þð 1 (19)
! !
A0 ¼ a Þðb ðfalseÞ (20)
with constant ak , b k ∈ , k ¼ 1, 2, 3.
1! !
A20 ¼ 1 3 þ 1 Þð 1 , (21)
3
!
• Let a Þ in Eq. (20) be replaced by ∇Þ, then there exists a C 1, divergence-free
!
vector field f giving rise to the “differential” dyadic representation of A0
1 !
A 0 ¼ p ∇Þð f : (23)
3
!
The system of first-order linear partial differential
ffiffiffi equations to which f is the
solution is obtained by comparing like terms in the arrays.
187
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
• As already noticed in the proof, A0 cannot be a left zero divisor of itself. (In fact,
A 20 is given by Eq. (21)).
• As one can easily verify, the eigenvalues of A20 are λ0 ¼ 0, λ1 ¼ 1. The latter has
algebraic multiplicity α1 ¼ 2 and geometric multiplicity γ1 ¼ 1. As a consequence,
its eigenspace, X 1 A 20 , not only has a dimension α 1 γ 1 þ 1 ¼ 2 but complies with
X1 A20 ¼ B (24)
0 1 1
2 3
pffiffiffi
3 1! ! 7 3
C ¼ 13 1 Þð 1 ; S ¼4 1 0 1 5 : (27)
6
2 2 2
1 1 0
188
Matrix Theory - Classics and Advances
Ker½Z ¼ B , ∀Z ∈ ℨ: (31)
and holds for a general A ∈ M½N N ; then one recalls det½Lsr ¼ 0 and the θ r-
invariance of μ 0 of Eq. (9): one can thus compute the classical adjoint to either
constant matrix, C or S, whichever is simpler to deal with; the result is the dyad on the
right side of Eq. (29), a result which holds ∀θ r. The search for left zero divisors of Lsr
as dyads like that of Eq. (30) is suggested by Eqs. (29) and (32) because adj½L sr must
be a zero divisor of Lsr. The most general form of a left zero divisor Z is inferred from
Eq. (11): since all column-wise sums of Lsr vanish, the columns of Z must be equal.
Therefore such divisor, if non-trivial, has rank one and is obtained from the dyadic
product of Eq. (30). Finally, Eq. (32) and the orthogonal decomposition Eq. (5) imply
⊳⊲
Remark 5. (Duality; divisors; other properties of L sr).
• From Eqs. (32–34) one says L sr and adj½L sr are dual to each other.
• The constant, nontrivial left zero divisors of Lsr are those of Eq. (19). By
consistency, the matrices C and S of Eq. (26) not only have the same classical
adjoint as Lsr½θr has but have all and the same zero divisors, because Eq. (26)
holds ∀θr .
! 3
• If L sr stands for the subspace of functions f ½: ∈ C0 ð½0, 2π Þ complying with
! ! ðkÞ ðkÞ
∮ Lsr½θr f ½θr dθr ¼ 0, and if am , m ¼ 0, 1, 2, ⋯, bm , m ¼ 1, 2, ⋯, are the cosine
and, respectively, the sine Fourier coefficients of the (real-valued) components
!
fk ½:, k ¼ 1, 2, 3 of f ½:, then
! nn
ð1Þ ð2Þ ð3Þ
o n oo
f ∈ Lsr⇔ a 1 ¼ a1 ¼ a1 and b1ð1Þ ¼ b1ð2Þ ¼ b ð13Þ : (35)
◊
Remark 6. (The physical meaning of L sr, C and S). As Eq. (5) suggests, given any
!
instantaneous current vector j ½t ∈ 3 , left multiplication by Lsr ½θ r½t returns a bal-
!
anced current triple j B ½t ∈ B. This follows from the three-fold symmetry of the ideal
189
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
DFI machine, mirrored by the structure of Lsr½:. Moreover, the C term of Eq. (26)
represents the opposite of reactive torque, whereas the S term represents active
torque.◊ h i
!
Theorem 1. (Differentiation of Lsr with respect to θ r). Let A 0 and Z c be respec-
tively given by Eqs. (15) and (19). Then the derivative of Lsr with respect to θ r is the
set-valued map
h i
! ∂Lsr
A0 þ Z c L sr ½θr ∈ ½ θr : (36)
∂θr
Notation: a convenient notation is ∂Lsr
∂θ r
½ θr M½θ r.
Proof of Theorem 1. Differentiation of Lsr ½θ r as represented by Eq. (26), and the
use of Eq. (28) yield
3 3
∂L sr
½ θ r ¼ A 0 cos ½θr þ A20 sin ½θ r : (37)
∂θr 2 2
3 3
A0 cos ½θ r þ A20 sin ½θr ¼ B1 C cos ½ θr þ B1 S sin ½θ r: (38)
2 2
B 1 ¼ A 0: (39)
! 3 ! !
L sr½θr j ¼ eθr A0 j , ∀ j ∈ B (41)
2
and
! ! !
M½θr j ¼ M ½0 eθr A0 j , ∀ j ∈ B (42)
190
Matrix Theory - Classics and Advances
with
3
M½0 ¼ A : (43)
2 0
Proof of Theorem 2. A matrix J ½θr is sought for, which, like Lsr ½θ r, splits into a
cos ½: and a sin ½: term as in Eq. (26) and, unlike Lsr ½:, satisfies J½0 ¼ 13. As Eqs. (28)
suggest, one solution is
2 2 1 ! ! 2
J½θr ¼ 1 3 cos θr þ S sin θ r ¼ Cþ 1 1 cos θr þ S sin θ r : (44)
3 3 2 3
2 2
n ! ! ! o
H ¼ S ; H S ¼ 13 i:e: H2 j ¼ 13 j , ∀ j ∈ B : (46)
3 3
One solution, modulo left zero divisors, comes from the properties of A20 in
Eq. (25):
H ¼ A0 : (47)
3 1! !
L sr½θr ¼ e θr A 0 1 Þð 1 cos θr : (48)
2 2
Consistency with ∂L sr
∂θr
½: implies
3 1! !
∂Lsr
½θr ¼ A 0 J½ θ r þ 1 Þð 1 sin θr : (49)
∂θr 2 2
! !
Since j ∈ B, then the rightmost dyads in Eqs. (48) and (49) return 0 when right
!
multiplied by j . Replacing A0 in Eq. (47) by the B of Eq. (40) does not change the
results (Eqs. (48) and (49)) because
! ! ! ! h i
eθ rZ ½c j ¼ 0 , ∀ j ∈ B , ∀Z c ∈ ℨ :
!
(50)
⊳⊲
Theorem 3. (Symmetry properties).
LTrs
sr ½θr ¼ Lsr ½ θr (51)
10
191
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
Proof of Theorem 3. The first two relations are immediate. The third one follows
from the representation of Lsr as linear combination of powers of A0 according to
Eqs. (26) and (28). ⊳⊲
The Kirchhoff voltage and current laws bring redundancy into the f abcg frame
representations. In order to remove said redundancy, another frame, called fdq0g, is
introduced, where only two components of a vector shall matter, the direct one, d,
and the quadrature component, q.
Definition 3. (The dq0 frame). Let f dq0g denote a reference frame for electric
quantities of axes d and q, subject to five specifications.
d:1Þ The new frame shall be suitable to represent both stator-referenced and rotor-
referenced quantities.
d:2Þ The component of a stator-referenced quantity with respect to both the direct
or d axis and the stator as axis shall be represented by the same function of angle,
evaluated at arguments which differ by βs . Similarly for variables pertaining to the
rotor ar: the phase difference shall be βr.
d:3Þ The above angles are related by
β s ¼ βr þ θr : (54)
d:4Þ The quadrature or q axis shall be orthogonal to d in the L 2 ð½0, 2π Þ sense: if w d ½:
!
and wq ½: are the d- and q-components of a (generally complex-valued) signal w½:
Ð 2π
which depends on η, then: 0 wd½η ∗ wq ½ηdη ¼ 0.
!
d:5Þ The third entry w0½: of a vector w ½: in the fdq0g frame shall be equal to the
sum of its fabc g components. (For this reason, such a sum is called “zero sequence”,
or Nullfolge, and may be trivial or not).
Problem 1. (The fabcg to fdq0g transformation problem [3, 4, 6]). Find a transfor-
!
mation K½: that maps a vector wabc (a physical quantity) from the fabcgs and,
!
respectively, the fabc gr frames to a vector wdq0 in the fdq0g frame, as specified by
Definition 3 and
K:1Þ is invertible and linear;
K:2Þ conserves instantaneous electric power;
K:3Þ has the same functional form for both stator and rotor quantities,
K:4Þ depends at most on one real parameter, an “electric angle”, which may be
different for stator or rotor quantities;
K:5Þ is of class C 1 at least with respect to that parameter;
K:6Þ magnetically decouples flux linkages [6].
Proposition 4. (Matrix representation). A solution to Problem 1 which applies to a
three-phase machine exists and is the Blondel-Park [1, 2, 7] transformation K½:
11
192
Matrix Theory - Classics and Advances
!
and w0 ½t ¼ ð wa þ w b þ w cÞ½t, ∀t and, if w ½t has a period 2π, ∮ w d½twq ½tdt ¼ 0.
Proof of Proposition 4. The structure of K½: can be inferred by satisfying, in
sequence, requirements K:1, K:2, K:6, d:4, d:5. The Ansatz
R½η ≔ K 1
0 K½η (58)
and R½0 is the unit element. The existence of the composition law is implied by the
Ansatz. Obviously, det½K½η ¼ 1 implies det½R½η ¼ 1, ∀η. ◊.
Theorem 4. (Infinitesimal generator). The infinitesimal generator F of K½: is K0 -
similar to the opposite of the infinitesimal generator A3 of rotations about the x^3 axis
of 3 according to
F ¼ K½01 A 3 K ½0 (59)
F ¼ A 0 : (60)
2π 2π
2 3
sin ½η sin η sin η þ
rffiffiffi 6 3 3 7
dK½η 26 7
¼ 2π 2π 7 (61)
6 7
dη 36 cos ½η cos η cos η þ
6
3 3 5
7
4
0 0 0
!
and the constant matrix B satisfying dK½η
dη ¼ B K ½η reads
0 1 0
B ¼ 2 1 0 0 3 ¼ A 3 : (62)
6 0 0 07
4 5
12
193
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
dR½η
¼ K1 1
0 B K½η ¼ K 0 B K 0 R½η ≔ F R½η: (63)
dη
3
∂Lsr
K ½βs ½θr K ½β r 1 ¼ K0 M ½0 K1
0 ¼ A3 : (64)
∂θ r 2
Proof of Theorem 5. The proof branches out according to which current triple is
being dealt with.
!
• (Balanced current triple trivial zero sequence). Let j fabc g ∈ B, then, by Eqs. (42),
(57), and (60) and applying transposition
∂L sr
K½ β s ½θr K½ βr 1 ¼ K ½β s M½ θr K½β r 1 ¼ K 0 eβs A0 M½βr þ θr K1
0 ¼
∂θ r
3
¼ K0 M½ βs þ β r þ θr K 1 0 ¼ K0 M ½0 K0 ¼
1
A:
2 3
(65)
!
• (General current triple). For general j fabcg ∈ 3 no exponential representation is
available. In analogy with Eq. (26) one identifies the constant matrices P, Q and
qffiffi qffiffi
R giving rise to K½η ¼ 23 P cos η þ 23 Q sin η þ p1ffiffi3 R. The product M½θ r K1 ½βr ,
after simplification, turns out to be an affine function of cos ½θr þ β r and
sin ½θr þ β r which in turn depend on angle sums: products of the involved
matrices hide an addition formula for angles on which trigonometric functions
depend. Taking Eq. (54) into account, left multiplication by K ½β s leads to a
polynomial in cos ½β s and sin ½ βs with coefficients like P C Q Trs, R C P Trs
and so forth. All β s-dependent terms in the polynomial disappear. Eventually, the
only non-zero term is 23 P C Q Trs ¼ 32 A3 , a constant. ⊳⊲
Remark 8. (Prior results). To the best of the authors’ knowledge, the role of the
Blondel-Park transformation in realization theory was pointed out by J.L. Willems [8],
who derived the exponential representation of K½: while obtaining a time-invariant
system from time-varying electric machine equations. The group properties of K½:
have been known for some time (e.g., [9], p. 1060). Instead, the relation of F to A0 , at
least in the form of Eq. (60), the relation between exponential representations of K½:
and L sr½:, and the roles played by the left zero divisors of Lsr½: and by the subspace B,
seem to have been overlooked so far.◊
13
194
Matrix Theory - Classics and Advances
5. Electric torque
From the principles of analytical mechanics, the following relation can be deduced
[3, 4] for the ideal DFI machine in generator mode. The relation involves previously
defined quantities, namely stator and rotor currents and a machine parameter, the
L0sr ½: of Eq. (4), and a quantity, the electric torque T el,g , which has not yet been
mentioned herewith. As a consequence, the relation can be regarded as the physical
“law” which defines T el,g .
Definition 4. (Electric torque in the fabcg frame). Let the ideal DFI machine have P
! !0
poles and be described by current vectors j fabcgs and j fabc gr. The electric torque in
generator mode is defined by
The relevance of Eq. (66) sits in the link it establishes between electric quantities and
a mechanical one: in generator mode, it is the torque produced by, usually a working
fluid, on the DFI machine shaft which, through a suitably excited rotor, gives rise to
electric currents in the stator coils; in motor mode power flow from machine coils to the
shaft is reversed. All machine control laws rely on Eq. (66) in order to be implemented.
! !
dU ¼ T dS þ j d λ T el,mdθ m, (67)
!
where λ is the vector of flux linkages, T el,m is mechanical torque in motor mode
(opposite to that in generator mode), and θm is the shaft angle.
A consequence of the Ansatz is the following.
Proposition 5. (Relation between motor torque and internal energy).
∂U
T el,m ¼ : (68)
∂θ m S,V,!N,!λ
Definition 5. (Legendre transform of energy with respect to flux linkage). Let Λ ⊂ IR3
!
be a subset where U is at least twice differentiable and convex with respect to λ and
! !
let p denote the variable conjugate to λ . Then the Legendre transform Y of energy U
!
with respect to λ is defined by
! ! ! ! ! !
Y S, V , N, p , θm ≔ sup p λ U S, V , N , λ , θ m : (69)
h i !λ ∈ Λ h i
14
195
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
! !
• p coincides with j and one has
! !
Y þ U ¼ j λ: (70)
• Motor torque can thus be rewritten as
∂Y
T el,m ¼ : (71)
∂θ m S,V,N!,!j
! !
• The extensive variables on which energy depends are S, N , λ and θm, and as such
are coordinates of the dynamical system’s manifold N . Instead, the intensive
! !
variables T, μ (vector of chemical potentials), j and T el,m belong to the system’s
∗
co-tangent bundle T N [11, 13]. Upon a multivariate Legendre transformation,
as many extensive variables can be replaced by their conjugates, which are
intensive variables.◊
Remark 10. (Y vs. W0fld ). By identifying U with “the energy W fld stored in the
coupling fields” of an electric machine having P poles, the rotor of which forms the
mechanical angle θm in the stator frame, one has the following relations:
In other words,
W 0fld ¼ Y : (73)
! !
S,V ,N , j
◊
Remark 11. (Models of real machines). The relation between Y and torque applies to
!
any machine and can, in principle, deal with any functional dependence between λ
! !h !i
and j . Nonlinear λ j relations [9, 10, 12] become of interest when saturation of the
magnetic circuit has to be modeled. Hysteresis and the related energy losses pose
further difficulties. ◊
Translating Eq. (66) into the fdq0g relies on relations between K½:-transformed
current vectors which involve all three angles, βs , β r , θr . Translation is made
remarkably simpler by Theorem 5.
15
196
Matrix Theory - Classics and Advances
Theorem 6. (Electric torque in the dq0 frame). For an ideal DFI machine, the electric
torque in generator mode and in the f dq0g-frame is the following bilinear form for the
matrix A3 :
2 0 3
jdr
P 3 Nr h i
6 0 7
T el,g ¼ Lms jds jqs ✓ A3 4 jqr 5 (74)
2 2 Ns
✓
which simplifies to
P 3 Nr
T el,g ¼þ L jds j0qr j qs j0dr : (75)
2 2 Ns ms
Proof of Theorem 6. It suffices to combine Eqs. (66), (56) and (64). The matrix A3
makes the 3rd entries of current vectors not relevant (✓). ⊳⊲
Real machines deviate from the hypotheses which have led to the relatively simple
form of the equations discussed so far. A satisfactory model shall account for one or
more of the following features:
(a) the effects of tooth saliency and slots on the linked fluxes,
Models which, step-wise, account for features ðaÞ to ðdÞ are “realistic” in the sense
of Fitzgerald and Kingsley [3]. Features listed under ðaÞ are relatively simple to model
if three-fold symmetry is assumed: very briefly, higher harmonics are introduced
which, because of linearity, can be dealt with separately. Instead, broken symmetry
may be of some interest: the model outlined herewith focuses on feature ðbÞ and
consists of constructing a “realistic” mutual inductance matrix, then determining its
algebraic (determinant, eigenvalues) and analytical (θ r -derivative) properties.
Definition 6. (Broken symmetry in the rotor). At fixed θr the rotor ar axis forms
angles θr , θr φ and θr þ φ with the as, bs and cs axes respectively. With ϵb and ϵc
satisfying
3ϵ 3ϵ
0 ≤ b , c < < 1 (76)
2π 2π
the rotor br axis forms angles θr þ φ þ ϵ b, θr þ ϵ b and θ r φ þ ϵb with the as, bs
and cs axes respectively. Similar relations hold for the rotor cr axis in terms of ϵc .
As a consequence the mutual inductance matrix is
16
197
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
where the four new matrices have to be defined. Cϵð2b Þ,ϵc and Sϵð2b,ϵÞ c are obtained from
C and S of Eq. (27) when their second columns are multiplied by 1 12 ϵ2b and their
third columns are multiplied by 1 12 ϵ 2c . Similarly, C ϵð1b Þ,ϵc and Sϵð1bÞ,ϵc derive from
splitting the cos ½θ r and sin ½ θr terms in the following matrix
∂Gϵð2b,ϵÞ c
!
! ! !
½θ r j ¼ B Gϵð2b Þ,ϵc ½ θr j , ∀ j ∈ 3 : (81)
∂θ r
7. Conclusion
198
Matrix Theory - Classics and Advances
199
Transformation Groups of the Doubly-Fed Induction Machine
DOI: http://dx.doi.org/10.5772/ITexLi.102869
References
[1] Blondel A. Synchronous Motor and [11] Maschke BM, van der AJS,
Converters - Part III. New York: Breedveld PC. An intrinsic Hamiltonian
McGraw-Hill; 1913 formulation of the dynamics of LC-
circuits. IEEE Transactions on Circuits
[2] Park RH. Two-reaction theory of and Systems I: Fundamental Theory and
synchronous machines - Part I. AIEE Applications. 1995;42(2):73-82
Transactions. 1929;48(2):716-730
[12] Sullivan CR, Sanders SR. Models for
[3] Fitzgerald AE, Kingsley C Jr. Electric induction machines with magnetic
Machinery. 2nd ed. New York, NY: saturation of the main flux path. IEEE
McGraw-Hill; 1961. p. 568 Transactions on Industry Applications.
1995;31(4):907-917
[4] Krause PC. Analysis of Electric
Machinery. New York, NY: McGraw- [13] Eberard D, Maschke B, van der AJ S.
Hill; 1986 Conservative systems with ports on
contact manifolds. In: Proceedings of the
[5] Hau E. Windkraftanlagen - 4. 16th IFAC World Congress; July 4–8,
Auflage. Berlin: Springer; 2008 2005. IFAC: Prague, CZ; 2005
200
Chapter 10
Abstract
This chapter deals with the validity/limits of the integral transform technique on finite
domains. The integral transform technique based upon eigenvalues and eigenfunctions
can serve as an appropriate tool for solving the Fourier heat equation, in the case of both
laser and electron beam processing. The crux of the method consists in the fact that the
solutions by mentioned technique demonstrate strong convergence after the 10 eigen-
values iterations, only. Nevertheless, the method meets with difficulties to extend to the
case of non-Fourier equations. A solution is however possible, but it is bulky with a weak
convergence and requires the use of extra-boundary conditions. To surpass this difficulty,
a new mix approach is proposed with this chapter resorting to experimental data, in order
to support a more appropriate solution. The proposed method opens in our opinion a
beneficial prospective for either laser or electron beam processing.
1. Introduction
The heat equation can be solved in a simpler mode via the Fourier heat equation,
which involves the propagation of heat waves with infinite speed. This hypothesis is in
particular valid for many applications, such as laser-metal interaction in the frame of
two-temperature model [1, 2].
The solution of Fourier equations can be inferred using different mathematical
techniques via Green function, integral, Laplace transform, or complex analysis. The
201
Matrix Theory - Classics and Advances
2. Non-Fourier equation
∂2 T 1 ∂T ∂ 2 T 1 ∂ 2 T 1 ∂T τ0 ∂2 T Pðr, φ, z, tÞ
2
þ þ 2þ 2 2
2
¼ : (1)
∂r r ∂r ∂z r ∂φ γ ∂t γ ∂t K
∂T
¼ 0: ¼ > T ¼ T ðr, z, tÞ: (2)
∂φ
∂T ðr, z, tÞ
K jr¼b þ hT ðb, z, tÞ ¼ 0: (3)
∂r
Here r = b is the cylinder radius, while h is the heat transfer coefficient. The
boundary conditions for the z coordinate are:
∂T ðr, z, tÞ
K jz¼0 hT ðr, 0, tÞ ¼ 0, (4)
∂z
2
202
A New Approach to Solve Non-Fourier Heat Equation via Empirical Methods Combined with…
DOI: http://dx.doi.org/10.5772/ITexLi.104499
and
∂T ðr, z, tÞ
K j z¼a þ hT ðr, a, tÞ ¼ 0, (5)
∂z
where a is the cylinder length. We will pass on the effective solution of the
equation. The operator Dr was defined as:
∂2
1 ∂
Dr ¼ 2 þ : (6)
∂r r ∂r
∂ 2Kr 1 ∂K r
þ þ μ2 Kr ¼ 0, (7)
∂r 2 r ∂r
and
∂Kr ðr, z, tÞ
K jr¼b þ hK r ðb, z, tÞ ¼ 0: (8)
∂r
h
J ðμ bÞ μiJ 1 ðμi bÞ ¼ 0: (9)
K 0 i
One can deduce, based upon the theory of finite integral transforms, the
~
eigenfunction K r ðr, μ i Þ corresponding to the eigenvalue μi :
~ 1
Krðr, μi Þ ¼ JO ð μi rÞr , (10)
Ci
One defines:
ðb
~ 1
T μi, z, t ¼ T ðr, z, tÞr J 0ð μi rÞdr, (12)
Ci 0
and
ðb
~ 1
P ðμi , z, tÞ ¼ Pðr, z, tÞ r J0 ðμi rÞdr: (13)
Ci 0
2~ ~ 2~ ~
~ ∂ T 1 ∂T τ0 ∂ T P
μ 2i T þ 2 ¼ : (14)
∂z γ ∂t γ ∂t 2 K
203
Matrix Theory - Classics and Advances
∂2 K z
þ λ 2 K z ¼ 0, (15)
∂z2
and
∂K Z
K h Kz ¼ 0, (16)
∂z z¼0
as well as:
∂K Z
K þ h Kz ¼ 0: (17)
∂z z¼a
The and + signs for h in Eqs. (16) and (17) denote the target heat absorption and
emission, respectively. One has:
h
Kz ðz, λÞ ¼ cos ðλ zÞ þ sin ðλ zÞ, (18)
λK
and
λ jk h
2 cot λ j a ¼ : (19)
h λj k
Here, λj denotes eigenvalues along the z-axis. From theory [7], it follows:
~ 1
K z z, λ j ¼ K z z, λ j , (20)
Cj
and
ða
K2z z, λ j dz:
Cj ¼ (21)
0
Note that Eqs. (12) and (13) discuss the eigenvalues along the r-axis, only. After
introducing the eigenvalues along the z-axis, one can step ahead to generalize
Eqs. (12) and (13) as:
ð að b
1
T μi , λ j, t ¼ T ðr, z, tÞrK r ð μ i , rÞKz λ j , z drdz, (22)
C i Cj 0 0
and
ðað b
1
P μ i, λ j, t ¼ Pðr, z, tÞrKr ð μ i, rÞK z λ j , z drdz: (23)
CiC j 0 0
1 ∂T τ0 ∂2 T P
μ2i þ λ2j T þ þ 2
¼ : (24)
γ ∂t γ ∂t K
204
A New Approach to Solve Non-Fourier Heat Equation via Empirical Methods Combined with…
DOI: http://dx.doi.org/10.5772/ITexLi.104499
We next applied the direct and inverse Laplace integral transform to solve Eq. (1)
in relation to time. C[1] and C[2] stand for the normalizing coefficients with respect to
the experimental data. The results are as follows:
2 p ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 3
1γ 1 2τ0 2τ0
4 μ γ 4 λ γ ∙t
γ2 i j
6 P μi , λ j 2τ 0
7
þ C½1℮ þ C ½2 7
6 γ
7
10 X
10 6
μ2i þ λ2j
X
T ðr, z, t, τ 0Þ ¼ 7 K r ðμi , r ÞKz λ jÞ , z :
6 7
6
i¼1 j¼1 6
6 7
2 τ0 2 τ0
1
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
1
γ þ t
4 μ γ 4 λ γ
7
4 γ2 i j 5
2τ0
℮ γ
(25)
and
ð aðb
1
T μ i, λ j ¼ Pðr, z, tÞ rK r ðμ i, rÞKz λ j, z drdz: (26)
Ci C j 0 0
We finally mention that for an intermediate point in the experimental curve, one has:
Kτ 0 ∂2 T e
2
∂ T e ∂2 T e ∂2 T e
∂T e
!
AT e þ ¼ K þ þ G ðT e Ti Þ þ Pa r, t ,
∂t γ ∂t2 ∂x 2 ∂y 2 ∂z 2
(30)
∂T i
Ci ¼ G ðT e T i Þ: (31)
∂t
Here T e and T i stand for the electron and phonon temperatures, respectively. G is
!
the coupling factor between electrons and phonons. P a r , t is the heat source, which is
induced via laser-metal interaction. The interaction could be considered of either clas-
sical or steady-state quantum mechanical type. A is the electron heat capacity, and K is
the thermal conductivity of the metal. According to Ref. [8], G can be determined from:
ðT e=T
π2mNv2 T e 4
d
G¼ x4=ðe x 1Þ dx, (32)
6τT i Ti 0
where m is the electron mass, N is the conduction electron density, v is the velocity
of sound in the metal, τ is the electron–phonon collision time, and T D is the Debye
5
205
Matrix Theory - Classics and Advances
temperature. The exact data for each metal (Cu, Ag, Al, or Fe) are available from text
books and current literature (e.g., [8–10]). As for all metals, Ci > > A, one may
assume, in a first approximation, that:
∂ 2T e ∂2 T e ∂ 2 T e ∂Ti Kτ0 ∂2 Te
!
K þ þ Ci ¼ Pa r,t : (33)
∂x2 ∂y 2 ∂z2 ∂t γ ∂t 2
T i ¼ κT e , (34)
where
τL
κ¼ : (35)
τL þ τ i
Here, τ i is the lattice cooling time while τL is the pulse duration time. Conse-
quently, one has:
∂T i ∂T e
¼κ : (36)
∂t ∂t
∂ 2T e ∂2 Te ∂ 2Te ∂Te Kτ 0 ∂2 Te
!
K þ 2 þ 2 Ci κ ¼ P a r ,t : (37)
∂x2 ∂y ∂z ∂t γ ∂t2
It follows that:
!
2 2 2
∂ Te ∂ T e ∂ T e
2
1 ∂Te τ0 ∂ Te Pa r,t
þ þ ¼ (38)
∂x 2 ∂y2 ∂z2 γ ∂t γ ∂t2 K
with
K
γ¼ : (39)
Ci κ
Under the most general form, the heat source reads as:
X
Pa ¼ I ðy, zÞ ðαmn eα mnx Þð1
m,n mn
rSmn Þ þ rSmn δðxÞ þ qcÞ ðH ðt Þ H ðt t0 ÞÞ: (40)
Here, I mn ðy, zÞ stands for the laser transverse mode {m,n} while αmn is the linear
absorption coefficient, and r Smn is the surface absorption coefficient. The quantum
corrections (qc) are steady state for the respective mode. H stands for the step
function, and t0 for the exposure time. One model explains the continuous laser beam
irradiation, while the other one, more realistic in our opinion, illustrates the laser
beam in pulse form. The equivalence between the two models requires therefore that
the intensity versus time plot should cover the same area.
In order to make a comparison with experiments, one needs besides analytical
description, concrete numerical values. The next step is therefore to estimate the
6
206
A New Approach to Solve Non-Fourier Heat Equation via Empirical Methods Combined with…
DOI: http://dx.doi.org/10.5772/ITexLi.104499
eigenvalues, numerically. For this purpose, Eq. (7) can be solved using the integral
transform technique, and eigenfunctions and eigenvalues could be calculated. One has
three differential equations as follows (K represents the eigenfunctions, while
λ, μ, and ξ are the eigenvalues) [5]:
∂ 2 Kx
þ λ2i K x ¼ 0, (41)
∂x2
∂2 Ky
þ μ 2jK y ¼ 0, (42)
∂y 2
∂ 2 Kz
þ ξ 2k K z ¼ 0: (43)
∂z 2
h
Kx ¼ cos ð λi xÞ þ sin ðλi xÞ, (44)
Kλ i
h
Ky ¼ cos μ j y þ sin μ j y , (45)
Kμ j
h
Kz ¼ cos ðξk zÞ þ sin ðξ k zÞ: (46)
Kξ k
∂Ky hK y ∂K y hK y
þ ¼ 0; þ ¼ 0, (48)
∂y K y¼0 ∂y K y¼b
∂K z hK z ∂K z hK z
þ ¼ 0; þ ¼ 0: (49)
∂z K z¼0 ∂z K z¼c
Here, a, b, and c are the metal sample dimensions. As for the boundary conditions,
the eigenvalues (h is the heat transfer coefficient) can be inferred from Eqs. (47) to
(49), as:
λiK h
2 cot ðλi aÞ ¼ , (50)
h Kλ i
μK h
j
2 cot μ j b ¼ , (51)
h Kμ j
ξk K h
2 cot ðξk cÞ ¼ : (52)
h Kξ k
207
Matrix Theory - Classics and Advances
Tð x, y, z, t, C½1, C½2, τ0 Þ
2 3
2 τ0 2 τ0 2 τ0
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
1γ 1 4 μ γ 4 λ γ 4 ξ γ ∙t
γ2 i j k
6 P μ i, λ j,ξ k 2τ0
7
þ C½1℮ þ C½2 7
6 γ
7
μ2i þ λ2j þ ξ2k
6
10 P
10 P
6
10 6
7
P 7
¼ 6
6
7
7 (53)
i¼1 j¼1 k¼1 6 7
6 ffi
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 7
6 1γ þ 1 2 τ0 2τ 0
4 μ γ 4 λ γ 4ξ γ2τ 0 ∙t
7
4 γ2 i j k 5
2τ0
℮ γ
K xðμi , xÞK y λ j, y Kz ðξ k, zÞ
The advantage of Eq. (53) in our model is related to a quick converging series.
Thus, after 10 iterations, the solution’s accuracy reaches already 10 2 K in the case of
thermal distribution [11].
4. Experimental details
The experimental setup is operated by a Nd:YAG pulsed laser source (λ = 355 nm)
(Surelite II from Continuum), generating pulses of 6 ns duration with (130 0.6) mJ
energy at a frequency repetition rate of 10 Hz. The laser beam had a spatial top-hat
distribution. The laser beam was focused onto the metallic target surface by a lens
with 240 mm focal lengths. An Al bulk target of (10 10 5) mm 3 was used in
experiments. The laser fluence was set at 7.5 J/cm2 to surpass the ablation threshold
but also to avoid the excessive plasma formation. A crater of 18 μm depth was dig into
the sample after the application of 1000 subsequent laser pulses, as checked up by a
Vernier Caliper instrument. During the multipulse laser irradiation, the thermal dis-
tribution was monitored on the sample back side via a FLUKA thermocouple
connected to a computer having Lab view software, while the sample was irradiated at
the top. All experiments were performed on an in-house developed equipment at
Laser Department, National Institute for Laser, Plasma and Radiation Physics
(INFLPR), Magurele, Romania.
Experiments and simulations were carried out during the heating of a metallic
target. The boundary conditions were described by Eq. (28). In all figures, the exper-
imental data are plotted with dots while the simulations are represented by a
continuous line. Relaxation time,τ0 , was assumed 0.5 ps (Figure 1), 1 ns (Figure 2),
and 1 μs (Figure 3), respectively. For simulation, a heat transfer coeffi-
cient = 3 107 W mm 2 K 1 was selected. As known [12–14], for a very low heat
transfer coefficient, the eigenvalues are positive very small numbers, resulting in a
linear thermal distribution curve, as visible in Figures 1–3.
208
A New Approach to Solve Non-Fourier Heat Equation via Empirical Methods Combined with…
DOI: http://dx.doi.org/10.5772/ITexLi.104499
Figure 1.
Time evolution of temperature for a relaxation time of 0.5 ps: experiments (dotted line) vs. simulation (continuous
line).
Figure 2.
Time evolution of the temperature for a relaxation time of 1 ns: experiments (dotted line) vs. simulation
(continuous line).
Figure 3.
Time evolution of the temperature for a relaxation time of 1 μs: experiments (dotted line) vs. simulation
(continuous line).
209
Matrix Theory - Classics and Advances
The best agreement between theory and experiment was achieved for a relaxation
time, τ 0, of 0.5 ps, as visible from Figure 1. We note that this is in accordance with
available literature on subject [15].
Acknowledgements
CNM, MO, NM, and CR acknowledge for financial support by Romanian Ministry
of Research, Innovation and Digitalization, under Romanian National NUCLEU Pro-
gram LAPLAS VI–contract no. 16N/2019. CNM, NM, CR, and INM thank for the
financial support from a grant of the Romanian Ministry of Education and Research,
CNCS-UEFISCDI, project number ID code RO-NO-2019-0498 and UEFISCDI under
the TE_196/2021 and PED_306/2020. M.A.M. received financial support from the
European Union’s Horizon 2020 (H2020) research and innovation program under the
MarieSkłodowska–Curiegrantagreementno.764935.
210
A New Approach to Solve Non-Fourier Heat Equation via Empirical Methods Combined with…
DOI: http://dx.doi.org/10.5772/ITexLi.104499
References
211
Chapter 11
Abstract
1. Introduction
The nonlinear eigenvalue approach is used in this chapter for the study of the
properties to solutions of the generalized phase problem related to the synthesis of
radiation systems through the incomplete data. Such incompleteness is considered
here in the example of an indeterminate phase characteristic of function, which
characterizes the radiation of the plane antenna arrays.
The problems with an indeterminate phase of the wave field arise in various
applications and are widely described in the literature. The most well-known of these
is the so-called phase problem (see, for example, [1–4]). It consists in restoring the
phase distribution (argument) of the Fourier transform of a finite function by its
amplitude (module) given (measured) along the entire real axis. This problem belongs
to the classical problems of recovery (identification) and requires the conditions of
existence of a unique solution.
212
Matrix Theory - Classics and Advances
In the physical relation, the radiation system represents the plane array with the
rectangular or hexagonal placement of radiators. Firstly, we consider the array with
the rectangular ordering of separate elements (Figure 1a).
Consider a plane rectangular array consisting of N 2 M2 ¼ ð2N þ 1Þ ð2M þ 1Þ
identical elements (radiators), which are located in the xOy plane of the Cartesian
coordinate system equidistantly for each of the coordinates. Since the radiators are
identical, it is possible to formulate the synthesis problem not for the whole three-
2
213
Advanced Methods for Solving Nonlinear Eigenvalue Problems of Generalized Phase…
DOI: http://dx.doi.org/10.5772/ITexLi.103948
Figure 1.
The placement of elements for the considered arrays. a) rectangular. b) hexagonal.
dimensional vector directivity pattern (DP), but only for some complex scalar func-
tion fð x1 , x2Þ that is termed as the array multiplier. This function for a rectangular
equidistant array has the form [25]:
N
X M
X
f ðx1 , x2Þ ¼ AI Inm eið c1 nx1 þc2 mx2 Þ (1)
n¼N m¼M
M2
X NX
1ðmÞ
where M ¼ 2M2 þ 1 is quantity of the linear subarrays, then N ¼ 2N1 ðmÞ þ 1 is the
number of elements in the mth subarray.
Eqs. (1) and (2) for DP f ð x1 , x2Þ represent the result of a linear operator A, which
acts on a complex-valued space H I ¼ CN 2 M2 (rectangular case) or HI ¼ C N0 M (hex-
agonal case) to the space of complex functions of two variables defined in the domain
Ω. The value N0 determines the number of elements in the central linear subarray in
the hexagonal case.
Assume that the desired power DP P ðs 1, s 2Þ is not equal to zero in some regions
G ⊆ Ω, and it is equal to zero outside. The optimization problem is formulated as the
minimization problem of the functional
2
σ αðIÞ ¼ P jAIj 2 f
þ α kI k2I (3)
where kk f and k k I determine the norms inthe space of DPs and space of currents
respectively, which are defined by the inner products
214
Matrix Theory - Classics and Advances
ðð
k f k2f
¼ f 1 , f2 f
¼ f 1ð x1 , x 2Þf 2 ð x1, x 2Þdx 1dx2 (4)
Ω
N M
4π 2 X
kIk2I ¼ ð I1, I 2ÞI ¼
X
I 1nmI 2nm (5)
c 1c 2 n¼N m¼M
If to act by operator A on both the parts of (6), we get a nonlinear integral equation
of Hammerstein type for the function f
where
c 1 sin N2 c1 x 1 x01 =2
K 1 x 1, x 01, c 1
¼ (10)
π sin c1 x 1 x01 =2
The kernel of the AA ∗ operator for the hexagonal array is more complicated
because we can not to present it in the form of two multipliers
0
> sin c 1ð N1ðmÞ 1=2Þ x1 x1 , N1 ðmÞ is odd,
8
>
sin 1=2c1 x1 x01 (12)
M2
þ2 cos mc2 x 2 x 02
N 1ðmÞ
m¼1 >
cos c1 ðn 1=2Þ x1 x 01 , N 1ðmÞ is even:
>
<2
X >
>
>
> n¼1
X
>
>
:
4
215
Advanced Methods for Solving Nonlinear Eigenvalue Problems of Generalized Phase…
DOI: http://dx.doi.org/10.5772/ITexLi.103948
The kernels (9) and (12) of the integral Eq. (8) are real and degenerate. Since
Eqs. (6) and (8) are nonlinear ones, both may have a non-unique solution. The
number of solutions and their properties is studied according to the method proposed
in [16, 27]. In the practical applications, the solution of Eqs. (6) and (8) is performed
by the method of successive approximations. The convergence of the method depends
on the parameter α, desired DP Pð x1, x2 Þ, as well as the parameters c 1 and c2 contained
in the kernels (9) and (12).
We should use the linear integral equation to define the bifurcation curves
according to [16]. Based on this equation, we pass to the respective eigenvalue prob-
lems, solutions of which allow us to find the characteristic values of parameters c 1 and
c2 in the kernel of equation, at which the bifurcation appears.
As stated by the branching theory of solutions of the nonlinear equations [16], the
bifurcation points can be those values of c 1and c2 at which Eq. (14) has nonzero solutions.
Using the properties of the degeneracy of the kernel AA ∗ , we reduce Eq. (14) to
the equivalent system of the linear algebraic equations (SLAE). The coefficients of
matrix of this equation depend on the parameters c1 and c 2 analytically. To this end,
the equations for eigenfunctions corresponding to (13) are written as
N
X M
X
gð x 1, x 2Þ ¼ x nme ið c1nx 1 þc2 mx2 Þ (15)
n¼N m¼M
where
c1 c2
P x01, x 02 g x01, x 02 e ið c1 nx1 þc 2mx2Þ dx01dx02
0 0
xnm ¼ 2
(16)
4π ðð
Ω
5
216
Matrix Theory - Classics and Advances
Multiplying both the parts of (15) on P x01 , x02 e ið c1kx1 þc2 lx 2Þ at k ¼ N , N þ
0 0
N
X M
X
ðklÞ
x kl ¼ anm ðc1 , c2 Þxnm , k ¼ N , N þ 1, … , N 1, N ,
n¼N m¼M
(17)
l ¼ M, M þ 1, … , M 1, M,
where
ðð
c c
ðklÞ
anm ¼ 1 22 Pðx1 , x2 Þei½ðc1 ðnkÞx1 þc2 ðmlÞx2 Þ dx1dx 2 (18)
4π
Ω
ðklÞ
and matrix of the coefficients anm is self-adjoint and Hermitian.
Thus, we obtained a two-parameter nonlinear spectral problem corresponding to a
homogeneous SLAE (17). This problem can be given as
ðklÞ
where AM is the matrix of coefficients anm , EM is a unit matrix of dimension N 2 M2.
For the system (19), the equality
x E M A~ M ðc1 Þ ~
ðEM AMðc 1, γc 1ÞÞ~ x¼0 (23)
Ψð c1 , γc 1Þ ¼ det EM ~
AM ðc1 Þ ¼ 0 (24)
6
217
Advanced Methods for Solving Nonlinear Eigenvalue Problems of Generalized Phase…
DOI: http://dx.doi.org/10.5772/ITexLi.103948
ð0Þ ð0Þ ð0Þ
Let c 1 be the solution of the Eq. (24), then c1 , c 2 ¼ cð10Þ , γcð10Þ is the point
that corresponds to eigenvalue
λ0 ≈ 1 of Eq. (15). By solving Eqs. (21) and (22) in a
ð0Þ ð0Þ
small vicinity of point c1 , c2 , we find the spectral curve of the matrix-function
AMðc1, c2 Þ, which is the curve c2 ðc1Þ defining a set of the bifurcation points.
The eigenfunctions of Eq. (14) are defined as the eigenvectors of matrix A Mðc 1 , c2 Þ
using the resulting solution of the Cauchy problem with the sought solutions Ψðc1 , c2 Þ.
In this procedure, a four-dimensional matrix A Mð c1, c 2Þ is reduced to a two-
dimensional one by the relevant renouncement of its elements.
M 2 N2c 1c2
K c1, c 2, x 1 , x 01 , x2 , x02 ≈
(25)
π2
Assuming that f ð x 1, x2Þ is constant, the integral Eq. (8) can be rewritten as (usually
for small c1 and c2 f ðx1 , x 2Þ ≈ const).
ð1 ð1
π 2α
¼ Pðx1 , x 2Þdx1 dx2 4 j f ðx1 , x2Þj2 (26)
2M2 N2 c1 c2
1 1
The area of integration Ω in Eq. (8) is reduced in the last formula to the area
½1, 1 ½1, 1 because of definition of both the arguments x 1, x 2 and parameters c1, c 2.
Taking into account that j f ðx1 , x 2 Þj2 is positive, we get the following relationship
between the function P ðx 1, x2 Þ and the parameters c1, c 2, and α:
ð1 ð1
π 2α
P ðx1 , x 2 Þdx1dx2 >0 (27)
2M2N 2c1c 2
1 1
π2 α
c1 c2 > (28)
8M2 N 2
In fact; inequality (28) determines the area of parameters c1, c2 , α, where nonzero
solutions exist. In Figure 2, the dependence curves c2 ¼ c2 ð c1 Þ for three different
values M2 and N 2 are shown. The results are given for array with the number of
elements M 2 ¼ N 2 ¼ 3 (curve 1), M 2 ¼ N 2 ¼ 5 (curve 2) and M2 ¼ N 2 ¼ 11 (curve 3).
The area of values c 1 and c2 , where the existence of zero solutions is possible,
according to the estimate (28) is located below and to the left of the presented curves.
218
Matrix Theory - Classics and Advances
Figure 2.
The curves c 2 ¼ c2 ðc 1 Þ at the different M2 and N 2 .
As can be seen, the area of zero values decreases significantly with increasing N2 and
M2 . The obtained results testify that the zero solutions of Eq. (8) for a given constant
power DP can exist either at a small value c1 c2 corresponding to low frequencies (at a
given size of array), or at the values of c1 that significantly exceeding c 2 and vice versa.
The last case corresponds to arrays with a large difference in distance of elements
along the coordinate axes. Such arrays are usually rarely used in practice.
The finding of bifurcation lines of the nonlinear Eq. (8) was performed for the array
containing N2 ∗ M 2 ¼ 11 ∗ 11 ¼ 121 radiators for the desired power DP Pð x1 , x2 Þ ¼ 1 at
Λ c ¼ fð c1 , c2 Þ, 0 < c1 , c2 ≤ 2 g for the different values of the parameter α in (3).
The search for bifurcation lines can be performed directly by investigating the
properties of the determinant (20) as a function of the parameters c1 and c2 . In
addition, the function (20) depends on the parameter α; so the set of its eigenvalues
also depends on this parameter, i.e. the set of spectral curves that separate the areas of
zero and nonzero solutions.
The behavior of the corresponding curves when the parameter α changes is shown in Figure 3. The behavior of the determinant (24) depending on the parameters c_1 and c_2 at α = 0.5 is given in Figure 3a; in Figure 3b–d, the intersection of this function with the plane Ψ(c_1, γc_1) = 0 is illustrated at different α. This results in a set of curves that correspond to a set of spectral lines separating the areas of zero and nonzero solutions. At a fixed size of array, the region where zero solutions can exist expands as the parameter α increases; this region is located below and to the left of the first curve.
Figure 3. The spectral curves of Eq. (19) at different α.

The curves marked by number 1 correspond to the solutions with constant (zero or even) phase DP; the curves marked by number 2 correspond to the solutions with a phase DP that is even with respect to the Ox_1 axis and odd with respect to the Ox_2 axis; and the curves marked by number 3 correspond to the solutions with a phase DP odd with respect to both coordinate axes. The proposed procedure is quite approximate: it does not allow one to separate the curves that correspond to different types of solutions and thus to identify the areas where a nonzero solution for the synthesized power DP with the specified phase property exists.
The implicit function method proposed in [23] and developed for the plane array in [30] is free of this drawback.
At the first step of this method, a series of one-dimensional eigenvalue problems is solved: different values of the parameter γ are prescribed by the relation c_2 = γc_1, and a one-dimensional problem is solved with respect to c_1. In Figures 4 and 5, the first four eigenvalues of the problem at γ = 1.0 and γ = 0.2 are shown. The values (c_1^{(i)}, c_2^{(i)} = γc_1^{(i)}), i = 1, 2, 3, 4, at which λ_i = 1, are the bifurcation points in the plane c_1Oc_2. In this way, the set of points (c_1, c_2) at which the eigenvalue λ^{(i)} = 1 is determined approximately from the graphical data.

The next step is to refine the values (c_1^{(i)}, c_2^{(i)}) by solving the transcendental Eq. (20), the point (c_1^{(i)}, c_2^{(i)}) found graphically being taken as the initial approximation.
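A minimal sketch of this two-step procedure is given below, assuming a user-supplied routine a_matrix(c1, c2) that returns the matrix A_M(c_1, c_2) produced by the Cauchy problem; the routine name and the bracketing strategy are illustrative assumptions, not the authors' code. The roots of the determinant (24) along a ray c_2 = γc_1 are refined by the half-division (bisection) method mentioned in the text:

```python
import numpy as np

def psi(c1, gamma, a_matrix):
    """Determinant (24): Psi(c1, gamma*c1) = det(E_M - A_M(c1, gamma*c1))."""
    A = a_matrix(c1, gamma * c1)
    return np.linalg.det(np.eye(A.shape[0]) - A)

def bifurcation_point(gamma, a_matrix, c_lo, c_hi, tol=1e-10):
    """Refine a root of (24) on the ray c2 = gamma*c1 by half-division,
    starting from a bracket [c_lo, c_hi] over which psi changes sign."""
    f_lo = psi(c_lo, gamma, a_matrix)
    while c_hi - c_lo > tol:
        c_mid = 0.5 * (c_lo + c_hi)
        f_mid = psi(c_mid, gamma, a_matrix)
        if f_lo * f_mid <= 0.0:
            c_hi = c_mid                 # root lies in the left half
        else:
            c_lo, f_lo = c_mid, f_mid    # root lies in the right half
    return 0.5 * (c_lo + c_hi)           # c1; the point is (c1, gamma*c1)
```

The brackets [c_lo, c_hi] can be read off the graphical data of Figures 4 and 5, i.e. from the values of c_1 at which the curves λ^{(i)}(c_1) cross the level λ = 1.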
Figure 4. The first eigenvalues at the ray c_1 = c_2, α = 0.5.

Figure 5. The first eigenvalues at the ray c_2 = 0.2c_1, α = 0.5.
Figure 6. The bifurcation curves corresponding to the set (c_1^{(i)}, c_2^{(i)}), i = 1, …, 4; N_2 = M_2 = 11.
Figure 7. The surface of the determinant (24) values for the hexagonal array, α = 0.5.

Figure 8. Zero lines of the determinant (24) for the hexagonal array, α = 0.5.
(c_1^{(i)}, c_2^{(i)}), i = 1, 2, 3, 4, are the bifurcation points in the plane (c_1, c_2); the points (c_1^{(i)}, c_2^{(i)}) for which the eigenvalues λ^{(i)} = 1 are determined approximately in this step.

The specification of the values (c_1^{(i)}, c_2^{(i)}) by solving Eq. (20) is carried out in the next step, the points (c_1^{(i)}, c_2^{(i)}) taken from the graphical data of Figures 9 and 10 being used as initial approximations. The usual numerical half-division method is used for this goal.
The bifurcation points (c_1^{(i)}, c_2^{(i)}), i = 1, 2, 3, 4, for the first four eigenvalues on the rays c_2 = γc_1 are shown in Figure 11. The respective bifurcation curves, obtained by solving Eqs. (21) and (22), are shown in Figure 12. As in the case of a rectangular array, precise computations should be carried out to obtain the necessary data.
Figure 9. The first eigenvalues for the ray c_1 = c_2 at α = 0.5.

Figure 10. The first eigenvalues for the ray c_2 = 0.2c_1, α = 0.5.
The results presented in this Section demonstrate how the knowledge about the points of bifurcation, obtained as the solutions of the nonlinear eigenvalue problems, allows us to understand better the process of bifurcation and how to obtain the solutions that are optimal in the sense of the optimization criterion used.
The properties of the solutions to Eq. (8) obtained by using the method of successive approximations are related directly to the properties of the phase characteristic of the created DP.

Figure 11. The bifurcation points at the rays c_2 = γc_1.

Figure 12. The bifurcation curves in the plane (c_1, c_2).

The solutions are computed by the iterative process
f_{n+1} − βf_n − (1 − β)B(f_n) = 0,  n = 0, 1, 2, …   (29)
B(f) = (2/α) A[A*((P − |f|²) f)]   (30)
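The following sketch of the successive-approximation process (29)–(30) assumes that the operator A and its adjoint A* are represented by complex matrices and that the desired power DP P is sampled on the corresponding grid; all names are illustrative assumptions:

```python
import numpy as np

def B(f, A, P, alpha):
    """Nonlinear operator (30): B(f) = (2/alpha) * A( A*( (P - |f|^2) f ) )."""
    return (2.0 / alpha) * (A @ (A.conj().T @ ((P - np.abs(f) ** 2) * f)))

def successive_approximations(f0, A, P, alpha, beta, eps=1e-3, max_iter=500):
    """Iterative process (29): f_{n+1} = beta*f_n + (1 - beta)*B(f_n)."""
    f = f0
    for _ in range(max_iter):
        f_next = beta * f + (1.0 - beta) * B(f, A, P, alpha)
        # stop when the relative change drops below the required accuracy
        if np.linalg.norm(f_next - f) <= eps * np.linalg.norm(f_next):
            return f_next
        f = f_next
    return f
```

With β = 0 the scheme reduces to the plain successive approximation f_{n+1} = B(f_n); β acts as a relaxation weight, whose influence on the convergence rate is studied below.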
In Figure 13, the dependence of the convergence of the iterative process (29) on the value of the parameter α is shown for the desired power DP P(x_1, x_2) = 1 at the fixed value β = 0.1, c_1 = c_2 = 2.0, and the number of radiators M_2 × N_2 = 11 × 11 = 121. The required accuracy is ε = 10⁻³.
The results of solving the optimizing problem for this desired power DP at α = 0.5 are shown in Figure 14. The approximation quality to a desired DP P significantly depends on the parameters c_1 and c_2 at fixed N_2 and M_2. The mean-square deviation (MSD) (the first term in (3)) is equal to 0.0847 for c_1 = c_2 = 1.0, and to 0.0075 for c_1 = c_2 = 3.14.

The synthesized DP |f| for larger c_1, c_2 not only gives a better mean-square approximation, but is also closer to the shape of the desired DP P. The optimal amplitudes |I_nm| of the currents in the array's elements are close to constant at such parameters c_1 and c_2.
Figure 13. The character of convergence of the iterative process (29) at different α, M_2 = N_2 = 11.

Figure 14. The created power RP |f| (a) and the optimal distribution of currents |I_nm| (b) at c_1 = c_2 = 1.0.

When solving the optimizing problem for a desired power DP of a more complex form, the quality of the approximation significantly depends on both the parameter α
and the type of initial approximation for the phase of the given DP. The results for the desired power DP at c_1 = c_2 = 3.14 and α = 0.2 are shown in Figure 15. Despite the fact that the shape of the desired DP P(x_1, x_2) is more complex than in the previous example, the decrease of α (from 0.5 to 0.2) and the simultaneous increase of c_1 and c_2 (from 1.0 to 3.14) allow us to obtain an amplitude |f(x_1, x_2)| of the created DP which is very close to P(x_1, x_2). The optimal distribution of the currents' amplitudes |I_nm| (Figure 15b) approaches the shape of the created DP.
We have used an additional optimization parameter β in Eq. (29), which, as shown by the results of numerical calculations, accelerates the convergence of the iterative process significantly. In Figure 16, the results of the study of the influence of this parameter on the rate of convergence at a fixed value of the parameter α = 0.5 are shown. The results are given for the desired power DP P(x_1, x_2) = 1 at c_1 = c_2 = 2.0. In order to achieve a computational accuracy of 10⁻³, one needs 157 iterations at β = 0.01; as β increases to β = 0.05, 0.10, 0.15, 0.20, and 0.25, the number of iterations decreases significantly.
Figure 15. The created power RP |f|² at c_1 = c_2 = 3.14.
Figure 16. The convergence of the iterative process (29) for different β.

Figure 17. The values of the functional (3) versus the phase of the created DP.
One can see from Figure 17 that the values of the functional at a fixed c (frequency) significantly depend on the phase of the created DP. The value σ = 0.2 is achieved for the "even-even" solution at c = 0.79, for the "even-odd" solution at c = 0.71, and for the "odd-odd" solution at c = 0.625. That is, within the used criterion, the latter type of solution is 21% better than the first one. From this fact it follows that, at a fixed distance between the radiators, for the desired DP P(x_1, x_2) = 1 the number of array elements can be reduced by 21% with the same value of MSD. A similar situation is observed for the characteristics of the DP at σ = 0.1, i.e. the "odd-odd" solution is 19.4% better than the "even-even" one.
The results of the solution of the optimization problem for the two given power DPs P_1(x_1, x_2) ≡ 1 and

P_2(x_1, x_2) = 2 √(x_1² + x_2²) √(1 − x_1² − x_2²) for x_1² + x_2² ≤ 1, and P_2(x_1, x_2) = 0 for x_1² + x_2² > 1   (33)

are illustrated in Figures 18 and 19.
Figure 18. The amplitude of the created DP |f|² for P_1 at c_1 = 2.0, c_2 = 2.236.
Figure 19. The amplitude |f|² of the created DP for P_2 at c_1 = 2.0, c_2 = 2.236.

Figure 20. The MSD versus the parameter α for the DP P_1.
As the parameter α decreases, the MSD decreases, but the norm ‖I‖_{H_I} of the current grows, which is unacceptable from the engineering point of view.
The approximation quality to the desired DP depends also on the type of the initial data prescribed for the iterative process (29). The dependence of the values of the functional (3) on the parity of the phase of the initial approximation f_0 for the DP P_1 is shown in Figure 22. The results are shown for four types of initial approximation: odd with respect to both axes (dotted curve), even with respect to the Ox_1 axis and odd with respect to the Ox_2 axis (dashed curve), odd with respect to the Ox_1 axis and even with respect to the Ox_2 axis (dash-dot curve), and even with respect to both axes (solid curve). The initial approximation f_0 corresponding to the even phase with respect to both axes is optimal for this DP; moreover, the values of σ_α for small values of the parameters c_1 and c_2 differ significantly, but starting from c_1 = 0.8 this difference does not exceed 10%.
Figure 21. The MSD versus the parameter α for the DP P_2.

Figure 22. The values of σ_α versus the initial approximation for the iterative process (29).
Figure 23. The convergence of the iterative process (29) versus α.

Figure 24. The convergence of the iterative process (29) versus the number of iterations, α = 0.5.
The iterative process becomes slow at the subsequent growth of β: for example, one then needs 37 iterations to achieve this accuracy, and the iterative process (29) becomes non-convergent at β > 0.40. This testifies that in the process of computations one should be limited to moderate values of β (β ≤ 0.20), which guarantees convergence and considerably increases its speed in contrast with small β (β ≤ 0.01).
More information about the problem under investigation can be found in [30–34].
5. Conclusions
The problem of finding the solutions of the nonlinear integral equations and their properties is reduced to nonlinear two-dimensional eigenvalue problems, followed by the application of an implicit function method for solving the Cauchy problem for the linear differential equation. The region of nonzero solutions of the above equations is determined by solving a transcendental equation, obtained by equating to zero the determinant related to the eigenvalue problem. The results of solving the nonlinear eigenvalue problems are applied subsequently for the specification of the bifurcation points and for obtaining the bifurcation curves. The approach does not depend on the form of the operator determining the radiation properties of the physical system (plane rectangular and hexagonal arrays). The obtained results are the constructive basis on which a series of practical engineering optimization problems was solved numerically.
References
[1] Burge RE, Fiddy MA, Greenaway AH, Ross G. The phase problem. Proceedings of the Royal Society of London A. 1976;350(1661):191-212. DOI: 10.1098/rspa.1976.0103

[2] Zenkova CY, Gorsky MP, Ryabiy PA, Angelskaya AO. Additional approaches to solving the phase problem in optics. Applied Optics. 2016;55:B78-B84. DOI: 10.1364/AO.55.000B78

[3] Taylor G. The phase problem. Acta Crystallographica Section D. 2003;59(11):1881-1890. DOI: 10.1107/S0907444903017815

[4] Hauptman HA, Langs DA. The phase problem in neutron crystallography. Acta Crystallographica Section A. 2003;59(3):250-254. DOI: 10.1107/S010876730300521X

[5] Dumber AS. On the theory of antenna beam scanning. Journal of Applied Physics. 1958;13(5):31-39

[6] Tartakovskyi LB, Tikhonova VK. Synthesis of a linear radiator with a given distribution of amplitudes. Radiotechnica & Electronica. 1959;4(12):2016-2019 (In Russian)

[7] Chony YI. To the synthesis of an antenna system for a given amplitude radiation pattern. Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika. 1968;11(2):1325-1327 (In Russian)

[8] Sklyanin EK. The method of the inverse scattering problem and the quantum nonlinear Schrödinger equation. Doklady Akademii Nauk SSSR. 1979;244(6):1337-1341 (In Russian)

[9] Ramm AG. Multidimensional Inverse Scattering Problems. New York: Longman Scientific & Wiley; 1992. p. 385

[10] Ikehata M. Reconstruction of an obstacle from the scattering amplitude at a fixed frequency. Inverse Problems. 1998;14:949-954

[11] Colton D, Kirsch A. A simple method for solving inverse scattering problems in the resonance region. Inverse Problems. 1996;12:383-393

[12] Precup R. Methods in Nonlinear Integral Equations. Alphen aan den Rijn, Netherlands: Kluwer; 2002

[13] Masujima M. Applied Mathematical Methods in Theoretical Physics. Weinheim, Germany: Wiley-VCH; 2005

[14] Bauschke HH, Combettes PL, Luke DR. Phase retrieval, error reduction algorithm, and Fienup variants: A view from convex optimization. Journal of the Optical Society of America A. 2002;19(7):1334-1345

[15] Fienup JR. Phase retrieval algorithms: A comparison. Applied Optics. 1982;21:2758-2769

[16] Weinberg MM, Trenogin VA. Branching Theory of Solutions to Nonlinear Equations. Moscow: Nauka; 1969. p. 528 (In Russian)

[17] Deuflhard P. Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms. Springer Series in Computational Mathematics, vol. 35. Berlin, Heidelberg: Springer-Verlag; 2004. p. xii+424

[18] Güttel S, Tisseur F. The nonlinear eigenvalue problem. Acta Numerica. 2017;26:1-94. DOI: 10.1017/S0962492917000034

[19] Voss H. Nonlinear eigenvalue problems. In: Hogben L, editor. Handbook of Linear Algebra. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC; 2014. p. 1904

[20] Ruhe A. Algorithms for the nonlinear eigenvalue problem. SIAM Journal on Numerical Analysis. 1973;10(4):674-689. DOI: 10.1137/0710059

[21] Mehrmann V, Voss H. Nonlinear eigenvalue problems: A challenge for modern eigenvalue methods. GAMM-Mitteilungen. 2004;27(2):121-152. DOI: 10.1002/gamm.201490007

[22] Jittorntrum K. An implicit function theorem. Journal of Optimization Theory and Applications. 1978;25(4):575-577. DOI: 10.1007/BF00933522

[23] Savenko PA, Protsakh LP. Implicit function method in solving a two-dimensional nonlinear spectral problem. Russian Mathematics. 2007;51(11):41-44. DOI: 10.3103/s1066369x07110060

[24] Savenko P, Tkach M. Numerical approximation of real finite nonnegative function by the modulus of discrete Fourier transformation. Applied Mathematics. 2010;1:41-51

[25] Andriychuk MI, Voitovich NN, Savenko PA, Tkachuk VP. Antenna Synthesis According to the Amplitude Radiation Pattern. Numerical Methods and Algorithms. Kyiv: Naukova Dumka; 1993. p. 256 (In Russian)

[26] Savenko PO. Nonlinear Problems of the Radiation System Synthesis. Theory and Methods of Solution. Lviv: IAPMM of NASU; 2002. p. 320 (In Ukrainian)

[27] Kravchenko VF, Protsakh LP, Savenko PA, Tkach MD. Mathematical peculiarities of plane equidistant array synthesis by given amplitude radiation pattern. Antennas. 2010;3(154):34-48

[28] Gantmacher FR. The Theory of Matrices. Volume One. New York: Chelsea Publishing Company; 1959. p. 276

[29] Savenko P. Computational methods in the theory of synthesis of radio and acoustic radiating systems. Applied Mathematics. 2013;4:523-549. DOI: 10.4236/am.2013.43078

[30] Andriychuk MI, Savenko PO, Tkach MD. Synthesis of plane equidistant array according to power radiation pattern. In: Proceedings of the XVIIth International Seminar/Workshop on Direct and Inverse Problems of Electromagnetic and Acoustic Wave Theory (DIPED-2012); 24-27 September 2012; Tbilisi. New York: IEEE; 2012. pp. 68-74

[31] Andriychuk MI, Voitovich NN. Antenna synthesis according to power radiation pattern with condition of norm equality. In: Proceedings of the 2013 XVIIIth International Seminar/Workshop on Direct and Inverse Problems of Electromagnetic and Acoustic Wave Theory (DIPED); 23-26 September 2013; Lviv. New York: IEEE; 2013. pp. 137-140

[32] Andriychuk MI, Kravchenko VF, Savenko PO, Tkach MD. The plane radiation system synthesis according to the given power radiation pattern. Fizicheskiye Problemy Priborostroeniya. 2013;2(3):40-55 (In Russian)

[33] Andriychuk M, Savenko P, Tkach M. Non-linear synthesis problems for plane radiating systems according to the prescribed power directivity pattern. Open Journal of Antennas and Propagation. 2013;1:23-34. DOI: 10.4236/ojapr.2013.12006

[34] Andriychuk MI. Antenna Synthesis through the Characteristics of Desired Amplitude. Newcastle, UK: Cambridge Scholars Publishing; 2019. p. xvi+150
Chapter 12

Using Matrix Differential Equations for Solving Systems of Linear Algebraic Equations
Abstract
Various ordinary differential equations of the first order have recently been used by the author for the solution of general, large linear systems of algebraic equations. Exact solutions were derived in terms of a new kind of infinite series of matrices which are truncated and applied repeatedly to approximate the solution. In these matrix series, each new term is obtained from the preceding one by multiplication with a matrix which becomes better and better conditioned, tending to the identity matrix. Obviously, this helps the numerical computations. For a more efficient computation of approximate solutions of the algebraic systems, we consider new differential equations which are solved by simple techniques of numerical integration. The solution procedure allows one to easily control and monitor the magnitude of the residual vector at each step of integration. A related iterative method is also proposed. The solution methods are flexible, permitting various intervening parameters to be changed whenever necessary in order to increase their efficiency. Efficient computation of a rough approximation of the solution, applicable even to poorly conditioned systems, is also performed based on the alternate application of two different types of minimization of associated functionals. A smaller amount of computation is needed to obtain an approximate solution of large linear systems as compared to existing methods.
1. Introduction
Exact analytic expressions in the form of infinite series of matrices for the solution
of linear systems of algebraic equations were derived in [1] by integrating associated
ordinary differential equations. These differential equations were obtained using a
quadratic functional related to the system of algebraic equations and describe the
orthogonal trajectories of the family of hypersurfaces representing the functional.
More rapidly convergent matrix series, which can be applied to approximate the solution of the system of equations, were presented in [2]. Solution of linear systems based on the numerical integration of differential equations was originally formulated in [3].
In Section 2 of the present book chapter, we use recently derived highly convergent series formulae for matrix exponentials [3] in order to construct improved iterative methods for solving approximately large systems of algebraic equations. In Section 3, we use novel functionals that allow us to formulate differential equations which lead to a substantial increase in the efficiency of the solution process [4]. Independently of the starting point, the paths of integration of these equations all converge to the solution point of the system considered. At each step of the numerical solution one passes, in fact, from one path to a slightly different one due to computation errors. The procedure does not require finding an entire path accurately, but only the solution point, which is common to all the paths. This is why we apply the simple Euler method [5] to integrate the differential equations. The computation errors are now determined by the magnitude of the second derivative of the position vector with respect to the parameter defining the location along the path. A related iterative method [6] is also described. In Section 4, two different kinds of minimization of the system functionals are applied alternately for quick computation of a rough solution of large linear systems [7].
2. Improved iterative methods based on series formulae for matrix exponentials

Consider the system of linear algebraic equations

Ax − b = 0   (1)

and the associated differential equation

dx/dv = f(v)(Ax − b)   (2)
Imposing the condition x(v_o) = x_o, x_o being a chosen position vector, (2) has a unique solution over a specified interval [4], namely,

x(v) = x_o + e^{−A g(v_o)} Σ_{k=0}^{∞} [A^k/((k + 1)!)] [(g(v))^{k+1} − (g(v_o))^{k+1}] (Ax_o − b)   (3)
where g(v) ≡ ∫ f(v) dv is a primitive of f(v), i.e., f(v) = dg/dv. If f(v) is taken to be f(v) = 1/v, then g(v) = ln v. Choosing v_o = 1 gives g(v_o) = 0, and (3) becomes

x(v) = x_o + (ln v) Σ_{k=0}^{∞} [(A ln v)^k/((k + 1)!)] (Ax_o − b)   (4)
Over the interval v ∈ (0, 2), x(v) can also be expressed in the form [3]
" #
ð1 vÞk
∞
X A A
xðvÞ ¼ x o ð 1 vÞ I þ ðI AÞ I ⋯ I ðAxo bÞ (5)
k¼1
kþ1 2 k
⋯ (I + c_q NA/2) ⋯ (I + c_q NA/k)   (7)
e^{NA} = (1 + 10^{−q})^p { I + p! Σ_{k=1}^{p} [10^{−qk}/(k!(p − k)!)] (I − c_q NA/p)(I − c_q NA/(p − 1)) ⋯ (I − c_q NA/(p − k + 1)) − 10^{−(p+1)q} c_q NA (I − c_q NA/2) ⋯ (I − c_q NA/p)(I − c_q NA/(p + 1))^{−1} + p! Σ_{k=1}^{∞} [(−1)^k 10^{−qk}/((k + 1)(k + 2) ⋯ (k + p + 1))] (I + c_q NA)(I + c_q NA/2) ⋯ (I + c_q NA/k) },  p = 1, 2, …   (9)
The multiplication with A^{−1} from (6) is avoided by arranging the first summation in (9) in terms of powers of A and by taking into account that

p! Σ_{k=1}^{p} 10^{−qk}/(k!(p − k)!) = (1 + 10^{−q})^p − 1   (10)
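Identity (10) is simply the binomial expansion of (1 + 10^{−q})^p with the k = 0 term moved to the right-hand side, since p!/(k!(p − k)!) is the binomial coefficient. A quick numerical check:

```python
import math

# Numerical check of identity (10) for a few values of p and q
for p in (1, 2, 5):
    for q in (1, 2):
        lhs = math.factorial(p) * sum(
            10 ** (-q * k) / (math.factorial(k) * math.factorial(p - k))
            for k in range(1, p + 1))
        rhs = (1 + 10 ** (-q)) ** p - 1
        assert abs(lhs - rhs) < 1e-12
```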
It is obvious that using (9) leads to computations with a much smaller number of terms retained in the series for a given N. With the same amount of computation, we obtain an approximation x̃_N which is much closer to the exact solution x(0) of the system of equations (1).
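To make the use of these series concrete, the sketch below repeatedly applies a truncated form of series (4); it illustrates the principle only, and does not implement the accelerated formula (9). It assumes that the eigenvalues of A have positive real parts, so that the residual factor v^A is contractive for v < 1:

```python
import numpy as np

def series_step(A, b, x, v, n_terms):
    """One application of series (4), truncated after n_terms terms:
    x(v) = x + sum_{k=0}^{n_terms-1} (ln v)^{k+1} A^k (A x - b) / (k+1)!."""
    r = A @ x - b
    lv = np.log(v)          # for v in (0, 1) the residual is damped toward v^A r
    term = lv * r           # k = 0 term
    s = term.copy()
    for k in range(1, n_terms):
        term = (lv / (k + 1)) * (A @ term)   # each new term from the preceding one
        s += term
    return x + s

def solve_by_series(A, b, x0, v=0.5, n_terms=20, n_apps=30):
    """Apply the truncated series repeatedly, as described in the abstract."""
    x = x0
    for _ in range(n_apps):
        x = series_step(A, b, x, v, n_terms)
    return x
```

After one exact application the residual becomes v^A (Ax_o − b); repeating the truncated step therefore drives the residual toward zero at a rate controlled by v and by the truncation level.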
3. Differential equations derived from special functionals

In this Section, we use special kinds of functionals that lead to the construction of differential equations which allow a substantial increase in the efficiency of the solution of large linear algebraic systems.

3.1 Vector differential equations and their application to the solution of (1)
Consider the functional

F(x) ≡ ‖Ax − b‖^α   (11)

and impose that it varies along the path of integration as

F(x(v)) = F(x^{(0)}) h(v)   (12)

where h(v) is an appropriately selected real function with definite first and second derivatives in (v_o, v_S), v_o corresponding to a starting point x(v_o) ≡ x^{(0)}, with h(v_o) = 1, and v_S corresponding to the solution vector of (1), x(v_S) ≡ x_S, with h(v_S) = 0. Denoting F(x^{(0)}) ≡ F_o, we have
F_o (dh/dv) = α ‖Ax − b‖^{α−2} (A^T(Ax − b))^T (dx/dv)   (13)
Thus,
dx/dv = (F_o/α)(dh/dv) A^T(Ax − b) / (‖Ax − b‖^{α−2} ‖A^T(Ax − b)‖²)   (14)
d²x/dv² = (F_o/α)(d²h/dv²) A^T(Ax − b) / (‖Ax − b‖^{α−2} ‖A^T(Ax − b)‖²)
+ (F_o/α)² (dh/dv)² [1/(‖Ax − b‖^{2(α−1)} ‖A^T(Ax − b)‖²)] { (‖Ax − b‖²/‖A^T(Ax − b)‖²) [A^T A − 2 (‖AA^T(Ax − b)‖²/‖A^T(Ax − b)‖²) I] + (2 − α) I } A^T(Ax − b)   (15)
From (11) and (12),

‖Ax − b‖ = (F_o h(v))^{1/α}   (16)

which allows one to simply monitor the magnitude of the residual vector of (1) during the computation process.
As explained in the Introduction, we apply the Euler method for the solution of (14) and compute successively

x^{(i+1)} = x^{(i)} + η (dx/dv)|_{x=x^{(i)}},  i = 0, 1, 2, …   (17)
where η is the step size. In the absence of any hint about a good starting point x^{(0)} corresponding to v = v_o, we have used the point along the normal from the origin x = 0 to the surface F(x) = const in (11) which is the closest to the solution point x_S [8], i.e.,

x^{(0)} = (‖b‖²/‖A^T b‖²) A^T b   (18)
AT b 2
The function hðvÞ and the parameter α are chosen such that the first and the second
derivatives of x in (14) and (15) remain finite when v ! v S, i.e., when the residual
kAx bk ! 0, while the errors evaluated with the second derivative are kept reason-
ably small along the interval of integration v ∈ ðvo , v SÞ. For each hðvÞ, α is determined
by imposing the condition that the second derivative in (15) tends to zero as
kAx bk ! 0. To decide on the value of α for a given hðvÞ, we require to have a good
rate of decrease of kAx bk at the beginning of the computation process, for instance
to have (see (11) and (12))
Axð1Þ b
ð0Þ b k
¼ ðhðηÞÞ 1=α ffi 0:8 (19)
k
Ax
when η = 0.1. Finally, to solve (1), we determine the actual step size to be used for the numerical solution such that the errors at the beginning of the computation process are small, say

(η²/2) (1/‖x^{(1)}‖) ‖(d²x/dv²)|_{x=x^{(0)}}‖ < 0.01   (20)
If the magnitude of the residual vector, which is computed at each step, does not decrease anymore, the computation is continued with a new cycle of integration by applying (17) with x^{(0)} replaced by a new starting point, i.e.,

x^{(new)} = x^{(last)} − (‖Ax^{(last)} − b‖²/‖A^T(Ax^{(last)} − b)‖²) A^T(Ax^{(last)} − b)   (21)
where x^{(last)} is the position vector from the preceding step; x^{(new)} is the closest point to x_S along the normal to the surface F(x) = const taken at x^{(last)} [8]. Subsequent cycles of integration are performed in the same way until a satisfactory approximate solution of (1) is obtained. If the difference between x^{(new)} and x^{(last)} is insignificant, we find the point along the normal to F(x) = const, taken at x^{(new)}, where F has a minimum and then apply (21) again. It should be noted that, as one approaches the solution point, the direction of the gradient A^T(Ax − b) tends to become more and more perpendicular to the direction of x_S − x, and thus the residual ‖Ax − b‖ and the relative error ‖x − x_S‖/‖x_S‖ will no longer decrease as expected. This is why the computation has to be continued by opening a new cycle of integration. For systems with higher condition numbers, this happens more quickly.
Numerical experiments using h(v) = 1 − v with α = 0.45, and also using h(v) = (1 − v)² with α = 0.9, show that only two to five integration cycles with a step size of 0.1 are needed in order to get an accuracy of about 1% for the solution of systems with condition numbers up to 100.
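The following sketch assembles one version of this procedure for h(v) = 1 − v (so dh/dv = −1): the starting point (18), Euler steps (17) with the direction (14), and the restart rule (21). The step count per cycle and the parameter values follow the text, but the code as a whole is an illustrative reading of the method, not the author's implementation:

```python
import numpy as np

def euler_cycle(A, b, x0, alpha=0.45, eta=0.1, n_steps=10):
    """One cycle of Euler integration (17) of Eq. (14) with h(v) = 1 - v."""
    F0 = np.linalg.norm(A @ x0 - b) ** alpha      # F_o = F(x^(0)), see (12)
    x = x0
    for _ in range(n_steps):
        r = A @ x - b
        g = A.T @ r                               # gradient direction A^T(Ax - b)
        # dx/dv from (14) with dh/dv = -1:
        dxdv = -(F0 / alpha) * g / (
            np.linalg.norm(r) ** (alpha - 2.0) * np.linalg.norm(g) ** 2)
        x = x + eta * dxdv                        # Euler update (17)
    return x

def restart_point(A, b, x_last):
    """New starting point (21): the point closest to x_S along the normal
    to the surface F(x) = const taken at x_last."""
    r = A @ x_last - b
    g = A.T @ r
    return x_last - (np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2) * g

def solve_by_cycles(A, b, n_cycles=5):
    x = (np.linalg.norm(b) ** 2 / np.linalg.norm(A.T @ b) ** 2) * (A.T @ b)  # (18)
    for _ in range(n_cycles):
        x = restart_point(A, b, euler_cycle(A, b, x))                        # (21)
    return x
```

Two to five such cycles correspond to the accuracy of about 1% reported above for condition numbers up to 100.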
3.2 A related iterative method

The basic idea of this method is to find, starting from a point x^{(0)}, a point x^{(1)} along the gradient of a functional (11) associated with the general system (1), such that the magnitude of the new residual vector is an established fraction of its initial value, ‖Ax^{(1)} − b‖ = τ‖Ax^{(0)} − b‖, τ < 1. Instead of performing an integration as in Section 3.1, we impose

‖Ax^{(i+1)} − b‖ = τ ‖Ax^{(i)} − b‖,  i = 0, 1, 2, …   (22)

for each iteration, the starting point being the point found in the preceding iteration, with τ maintained as tightly as possible at the same value. To do this, we impose that F(x) in (11) varies from x^{(i)} to x^{(i+1)} in the same way for each iteration, namely,
F(x) = F(x^{(i)}) h(v)   (23)

where h(v) is now a real function of a real variable, monotone decreasing in the interval v ∈ [v_o, v_o + η], v_o ≥ 0, η > 0, with definite first and second derivatives and with h(v_o) = 1, such that

F(x^{(i+1)}) = F(x^{(i)}) h(v_o + η),  i = 0, 1, 2, …   (24)
x^{(i+1)} is computed as

x^{(i+1)} = x^{(i)} + η (dx/dv)|_{v=v_o}   (27)
i.e., using (14),

x^{(i+1)} = x^{(i)} + (η/α)(dh/dv)|_{v=v_o} (‖Ax^{(i)} − b‖²/‖A^T(Ax^{(i)} − b)‖²) A^T(Ax^{(i)} − b),  i = 0, 1, 2, …   (28)
in which, taking into account (21), one has to have |(η/α)(dh/dv)|_{v=v_o}| < 1. This expression corresponds to the first step in the numerical integration by Euler's method of the differential Eq. (14), starting from x^{(i)} with a step η, or to the first two terms of the Taylor series expansion of x(v_o + η).
The starting value x^{(0)} and the function h(v) are chosen in the same way as in Section 3.1. For selected ratios η/α, the iteration cycle continues as long as the residual ‖Ax^{(i)} − b‖ decreases at a proper rate. Theoretically, ‖Ax^{(i)} − b‖ = τ^i ‖Ax^{(0)} − b‖; however, since computation errors are introduced at each iteration (as in the Euler method), the initially chosen value of τ cannot be maintained the same as the iterative process continues. An approximate solution of (1) is obtained at the end of the iteration cycle as before, applying (21). Subsequent iteration cycles are performed with the starting point in each cycle being the point determined in the preceding cycle.
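A compact sketch of iteration (28) with h(v) = 1 − v (so (dh/dv)|_{v=v_o} = −1) and a fixed ratio η/α follows; the stall test standing in for "decreases at a proper rate" is an illustrative simplification:

```python
import numpy as np

def iterate_28(A, b, x0, eta_over_alpha=0.2, stall=0.999, max_iter=200):
    """Iterative method (28): each step is one Euler step of (14) from x^(i)."""
    x = x0
    r = A @ x - b
    for _ in range(max_iter):
        g = A.T @ r
        x_next = x - eta_over_alpha * (
            np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2) * g     # (28)
        r_next = A @ x_next - b
        if np.linalg.norm(r_next) > stall * np.linalg.norm(r):
            break        # the residual no longer decreases at a proper rate
        x, r = x_next, r_next
    return x
```

At the end of such a cycle, the restart formula (21) is applied to produce the approximate solution, exactly as in Section 3.1.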
Numerical results were generated using, as in the method presented in Section 3.1, h(v) = 1 − v but with η/α = 0.2, and h(v) = (1 − v)² with η/α = 0.1.
where

x_d^{(ℓ)} ≡ x^{(ℓ)} − x^{(ℓ+1)} = −(η/α)(dh/dv)|_{v=v_o} (‖Ax^{(ℓ)} − b‖²/‖A^T(Ax^{(ℓ)} − b)‖²) A^T(Ax^{(ℓ)} − b)   (30)
where

b^{(ℓ)} = A x_d^{(ℓ−1)} − b^{(ℓ−1)},  ℓ = 1, 2, … ;  b^{(0)} = Ax^{(0)} − b   (33)
This shows that the difference x_d^{(ℓ)} is a "rough approximation" to the solution of a system (1) whose right-hand side is, at each iteration, just the residual vector of the system from the preceding iteration, which decreases in magnitude from one iteration to the next. Thus, the method presented in this Section represents a practical, concrete implementation of the well-known idea of successive approximations/perturbations [9].
To search for a possible increase in efficiency, more general functionals of the form

F(x) = F(‖Ax − b‖)   (34)
could be tested, where F and its first derivative are finite and continuous at all points within the interval of integration. Now the equation corresponding to (14) is

dx/dv = F_o (dh/dv) (dF/d‖Ax − b‖)^{−1} (‖Ax − b‖/‖A^T(Ax − b)‖²) A^T(Ax − b)   (35)
Note. The paths of integration in the methods presented can be looked at as the field lines of a Poissonian electrostatic field in a homogeneous region bounded by a surface of constant potential F(x) = ‖Ax_o − b‖^α and with a zero potential at the solution point, F(x_S) = 0. In the particular case of α = 2, the ratio of the volume density of charge within the region to the material permittivity is constant, namely, 2 Σ_{i=1}^{n} Σ_{k=1}^{n} a_{ik}², where a_{ik} are the entries of A. By altering this electrostatic field one could eventually speed up the approach to the solution point along the integration path.
4. Alternate minimizations for a rough solution of large linear systems

This simple method is based on the property of a functional of the form (34), associated with a general system of equations (1), to allow minimizing not only the value of the functional but also the distance to the solution point of (1). Using only minimizations along ordinary gradients of the functional is not efficient unless the system is very well-conditioned.
For computing efficiently an approximate solution of general, large linear systems of algebraic equations, we propose in this Section the alternate application of minimizations of a functional and of the distance to the solution point, along the direction of the gradient of the functional.
Consider the functional in (34), where F is a real function defined for all x from a chosen starting point x^{(s)} to the solution point x_S of (1), monotone decreasing with ‖Ax − b‖, with F(x_S) = F(0). The gradient of F(x) is

∇F(x) = (dF(‖Ax − b‖)/d‖Ax − b‖) A^T(Ax − b)/‖Ax − b‖   (36)
The point at which F(x) is minimum along the normal taken at x^{(s)} is determined by replacing d with A^T(Ax^{(s)} − b) in (37):

x_{min F} = x^{(s)} − (‖A^T(Ax^{(s)} − b)‖²/‖AA^T(Ax^{(s)} − b)‖²) A^T(Ax^{(s)} − b)   (38)
The minimum distance between the solution point x_S and a point x along the direction of the gradient of F(x) taken at x^{(s)} is attained at

x_{min D} = x^{(s)} − (‖Ax^{(s)} − b‖²/‖A^T(Ax^{(s)} − b)‖²) A^T(Ax^{(s)} − b)   (40)

which depends only on the residual vector and on A^T(Ax − b) at the point x^{(s)}, independently of the form of F.
As already mentioned, repeated minimizations using only (38) are not efficient for solving a general system (1); the same is true when using only (40). To obtain an approximate solution of (1), we apply the formulae (40) and (38) alternately, the starting point x^{(s)} being each time the point determined in the preceding minimization. As in Section 3, when there is no indication about a convenient first starting point x^{(s)}, one can use the origin x^{(s)} = 0. Only a few iterations, up to ten, are needed for a solution with a relative error ‖x̃ − x_S‖/‖x_S‖ of about 1% for systems with condition numbers of up to 100.
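A sketch of this alternation, starting from x^{(s)} = 0, is given below; the Hilbert system used for illustration is the one discussed next in the text, while the iteration count is a free parameter:

```python
import numpy as np

def alternate_minimizations(A, b, n_pairs=5):
    """Alternate the minimum-distance step (40) and the minimum-F step (38)."""
    x = np.zeros(A.shape[1])                  # starting point x^(s) = 0
    for _ in range(n_pairs):
        r = A @ x - b
        g = A.T @ r
        x = x - (np.linalg.norm(r) ** 2 / np.linalg.norm(g) ** 2) * g       # (40)
        r = A @ x - b
        g = A.T @ r
        x = x - (np.linalg.norm(g) ** 2 / np.linalg.norm(A @ g) ** 2) * g   # (38)
    return x

# Illustration on the Hilbert system of order eight with x_S = [1, ..., 1]^T:
n = 8
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x = alternate_minimizations(H, H @ np.ones(n))
print(np.linalg.norm(x - np.ones(n)) / np.linalg.norm(np.ones(n)))  # relative error
```

No safeguards are included (e.g., against a vanishing gradient when the residual is already zero); the sketch is meant only to show how cheap the two minimizations are: one step of each kind costs a few matrix-vector products.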
The procedure is surprisingly efficient for a rough solution even for very ill-conditioned systems. For example, for the system Hx = b, where H is the Hilbert matrix of order eight and the solution is x_S = [1, 1, … , 1]^T [10], a solution of 6% accuracy is obtained with the starting point x^{(s)} = 0 by performing only seven alternate minimizations (40), (38), with no equilibration or regularization applied beforehand to the system. By comparison [10], for the Gauss elimination method the accuracy is only 40.6%, for the Gauss elimination with equilibration 9.15%, and for the Householder method with equilibration 5.6%.
Experimental results show that, as the new point x given by (40), (38) becomes closer to the solution point x_S, the direction A^T(Ax − b) of the gradient in (36), i.e., of the correction terms in (40) and (38), becomes more and more orthogonal to the direction of x − x_S. This causes the magnitude ‖Ax − b‖ of the residual vector of (1) and the relative difference ‖x − x_S‖/‖x_S‖ to no longer decrease significantly. To progress with the computation, one can intercalate minimizations of F(x) along directions different from that of A^T(Ax − b). Numerical testing shows that x − x^{(0)}, where x^{(0)} is the original starting point, has at this stage, in most cases, a significant component along the direction of x − x_S; thus, one can use such a minimization direction to try to improve the solution accuracy.
5. Conclusions
Highly convergent iteration formulae for solving general, large linear systems of algebraic equations are derived from exact analytic solutions of particular differential equations.
References
[1] Ciric IR. Series solutions of vector differential equations related to linear systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 317-321. ISBN: 978-0-9638425-1-7

[2] Ciric IR. Rapidly convergent matrix series for solving linear systems of equations. In: Proceedings of the 17th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Montréal, Canada: IEEE; 2016. DOI: 10.1109/ANTEM.2016.755022. ISBN: 978-1-4673-8478-0

[3] Ciric IR. New matrix series formulae for matrix exponentials and for the solution of linear systems of algebraic equations. In: Shah K, Okutmustur B, editors. Functional Calculus. London: IntechOpen; 2020. DOI: 10.5772/ITexLi.2022.77599. ISBN: 978-1-83880-007-9

[4] Ciric IR. Approximate solution of large linear algebraic systems using differential equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[5] …

[6] … Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[7] Ciric IR. Alternate minimizations for an approximate solution of general systems of linear algebraic equations. In: Proceedings of the 19th International Symposium on Antenna Technology and Applied Electromagnetics (ANTEM). Winnipeg, Canada: IEEE; 2021. ISBN: 978-1-6654-0335-1

[8] Ciric IR. A geometric property of the functional associated with general systems of algebraic equations. In: Proceedings of the 12th International Symposium on Antenna Technology and Applied Electromagnetics and Canadian Radio Sciences Conference (ANTEM/URSI). Montréal, Canada: IEEE; 2006. pp. 311-315. ISBN: 978-0-9638425-1-7

[9] Lanczos C. Applied Analysis. New York: Dover Publications, Inc.; 1988. ISBN: 0-486-65656-X (Paperback)

[10] Hämmerlin G, Hoffmann K-H. Numerical Mathematics. New York: Springer-Verlag; 1991