# A DERIVATION OF THE LORENTZ TRANSFORMATION

BETHANY HERB

Abstract. The ideas surrounding special relativity were born through the abstract theoretical
results of great men such as Albert A. Michelson, Hendrik Lorentz, and Henri Poincaré, but culminated
in the Theory of Special Relativity proposed by Albert Einstein. Using the mathematical concepts
of linear algebra, we are able to deduce the Lorentz transformation matrix,

$$T_v = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

from the three postulates published in Einstein's paper. Later, we will use the results of the
transformation to prove the well-known equation $E = Mc^2$.

1. Necessary Postulates
We will require the use of three postulates regarding special relativity:
Postulate 1 (Principle of Relativity). All reference frames are equivalent, or no single reference
frame is in any way special.
Postulate 2 (Constancy of Light). The speed of light, measured in any reference frame and in any
direction, is c.
Postulate 3 (Homogeneity of Space). Equally spaced increments of space and time in one reference
frame correspond to equally spaced increments of space and time in any other reference frame.

2. Background
Let us review the concepts surrounding relative reference frames. Imagine two entities, A and B,
in space. We assume that the same physical laws govern both entities in the same way. B
begins to move away from A at a constant velocity $v$ along the x-axis. If we make each entity the
origin of its own space, we need to define two coordinate systems, $S$ and $S'$ respectively. These
are called reference frames, as shown in Figure 1. B is moving at a velocity $v$ relative to A. We
can set the clocks so that $t = t' = 0$ when $x = x'$, that is, at the exact moment at which B passed
A. Therefore, $(x, y, z, t)$ are coordinates in the resting frame $S$, while $(x', y', z', t')$ are
coordinates in the inertial reference frame $S'$. An event is something that occurs at one point in
$S'$. B in $S'$ sees this event occur at $(x', y', z', t')$, while A in $S$ sees this event occur at
some point $(x, y, z, t)$. Because the reference frames move relative to each other, the observers
will disagree on the location of the event, since each measures it from their own origin. The
coordinate systems can be converted by the simple Galilean transformations, given as:

$$x = x' + vt', \qquad y = y', \qquad z = z'.$$

However, due to the constancy of light from Postulate 2, the transformations from one reference
frame to another are not nearly this simple. Suppose B emits a light beam in the direction of $v$,
and that A, in reference frame $S$, decides to measure the velocity of the beam. By the Galilean
transformations, A would measure the light traveling at $v_s = c + v$, where $v_s$ is the speed
observed by A, $c$ is the speed of light, and $v$ is the speed of B relative to A. Unless B is at rest
relative to A, the velocity observed by A would be greater than $c$, which contradicts Postulate 2.
Thus, we must drop the assumption that time is absolute, since $c$ must be constant regardless of
the relative velocity between the reference frames. Therefore, we must create a transformation
between reference frames that maintains the constancy of light. Hence, the Lorentz equations were born.

Date: May 8, 2017.

Figure 1. Relative reference frames between observer A and B
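
The conflict between the Galilean transformations and Postulate 2 can be illustrated numerically. The following is a minimal sketch; the function name and the sample speed are illustrative assumptions, not part of the original argument.

```python
# Minimal sketch: Galilean velocity addition applied to a light beam.
# The function name and the sample speed are illustrative assumptions.

C = 299_792_458.0  # speed of light in m/s


def galilean_speed_in_S(speed_in_S_prime: float, v: float) -> float:
    """Speed measured by A in S for something moving at speed_in_S_prime in S',
    when S' moves at speed v relative to S (from x = x' + v t' with t = t')."""
    return speed_in_S_prime + v


v = 0.5 * C                       # B recedes from A at half the speed of light
v_s = galilean_speed_in_S(C, v)   # light emitted by B, as "measured" by A
print(v_s / C)                    # ratio to c under the Galilean rule
```

Under the Galilean rule the ratio exceeds 1 for any $v > 0$, which is exactly what Postulate 2 forbids.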

3. Preliminaries
Before we begin our derivation of the Lorentz transformation equations, we must define our vector
space. Our space will represent physical space-time, so we will use three distance components
$x, y, z$ and one component to represent time. Because each component should have the same unit, we
will multiply time $t$ by the speed of light $c$. Thus, our vector space $M$, known as Minkowski space
after Hermann Minkowski, consists of all $w$ such that

$$w = \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix}.$$

Our objective is to derive an isomorphism from $M$ to $M$. In the physical world, this isomorphism
represents a conversion between two different reference frames that preserves both the constancy of
the speed of light (Postulate 2) and the homogeneity of space (Postulate 3).
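Definition 1 (Reference Frame). Let a reference frame be a point $o$ such that

$$o = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

Likewise, we will let our unit vectors be defined as:

$$e_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad e_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad e_3 = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$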

Let this reference frame be called $S$, and let $S'$ be any reference frame moving with speed $v$
relative to $S$. The conversion from $S$ to $S'$ is the Lorentz transformation, which we will derive.

4. Proof the Transformation is Linear
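Proof. We will prove that a transformation from one reference frame $S$ to another reference frame
$S'$ is linear. Let $x_i \in S$, and let there be constant motion in $S$. Then $\frac{dx_i}{dt}$ is a
constant, which implies that $\frac{d^2 x_i}{dt^2} = 0$.
Now consider a constant motion in $S'$ with $x'_i \in S'$. We can see that

$$\frac{dx'_i}{dt} = \sum_j \frac{\partial x'_i}{\partial x_j} \frac{dx_j}{dt}.$$

We can also see that

$$\frac{d^2 x'_i}{dt^2} = 0,$$

as a result of Postulate 3, the homogeneity of space. Note that the sums over $j$ and $k$ below run
over all possible indices of $x_i$.
By the Chain Rule,

$$\begin{aligned}
\frac{d^2 x'_i}{dt^2} &= \sum_j \frac{d}{dt}\left( \frac{\partial x'_i}{\partial x_j} \frac{dx_j}{dt} \right) \\
&= \sum_j \frac{\partial x'_i}{\partial x_j} \frac{d^2 x_j}{dt^2} + \sum_j \sum_k \frac{\partial^2 x'_i}{\partial x_j \partial x_k} \frac{dx_j}{dt} \frac{dx_k}{dt} \\
&= 0.
\end{aligned}$$

Recall that $\frac{d^2 x_i}{dt^2} = 0$ for all $i$, so

$$\sum_j \frac{\partial x'_i}{\partial x_j} \frac{d^2 x_j}{dt^2} = 0.$$

This implies that

$$\sum_j \sum_k \frac{\partial^2 x'_i}{\partial x_j \partial x_k} \frac{dx_j}{dt} \frac{dx_k}{dt} = 0.$$

Since each $\frac{dx_i}{dt}$ is an arbitrary constant for all $i$, all
$\frac{\partial^2 x'_i}{\partial x_j \partial x_k}$ must equal 0. Therefore, the transformation
must be linear.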


5. Proof the Transformation is an Isomorphism
Having shown that the transformation is linear, we will now show that it is an isomorphism.
Definition 2 (Isomorphism). A linear transformation T : V → W that is one-to-one and onto is
called an isomorphism.
Proof. Let $v$ be the velocity between two reference frames. We need to verify that the linear
transformation $T_v$ is an isomorphism from $M$ to $M$. It is enough to show that $M$ is finite
dimensional and that the nullspace of $T_v$ is $\{\vec{0}\}$, since a one-to-one linear
transformation from a finite-dimensional space to itself is also onto. Clearly, every vector in $M$
has four components, so $M$ is finite dimensional.
By Postulate 3, equally spaced increments of space and time in one reference frame correspond to
equally spaced increments of space and time in all other reference frames. There is no transformation
that will transform an existing interval of space and time into a non-existing interval. Thus, the
nullspace of $T_v$ is $\{\vec{0}\}$, and $T_v$ is an isomorphism by Definition 2.

6. Setting Up the Transformation
We will assume that $v$ is in the $e_1$ direction, so that $S'$ moves along the x-axis. Therefore,

$$T_v \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} \quad \text{such that } y = y' \text{ and } z = z'.$$

Note that only the $x$ and $t$ components are of importance, since the motion of reference frame
$S'$ is only in the x direction. Since $T_v(0) = 0$ and the $y$ and $z$ components are unchanged, we
can conclude that

$$T_v \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \quad \text{and} \quad T_v \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix},$$

in other words, $T_v(e_2) = e_2$ and $T_v(e_3) = e_3$. Thus, the span of $\{e_2, e_3\}$ is invariant
under $T_v$. We can see that these properties also hold true for the adjoint $T_v^*$ of $T_v$.

7. Defining the Inner Product Space
Suppose an event happens in reference frame $S$. Consider a point some distance from the
event; news of the event will travel at the speed of light to that point. We can use a sphere of
radius $ct$ to relate the distance from the event to the point. The equation of the sphere is given
by $(ct)^2 = x^2 + y^2 + z^2$, or $x^2 + y^2 + z^2 - (ct)^2 = 0$. Notice this looks similar to the
dot product defined as

$$\langle w, u \rangle = \sum_{i=0}^{3} w_i u_i.$$

However, let us modify this dot product to fit our equation. Let $\eta$ be the matrix

$$\eta = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

Thus,

$$\langle w, \eta(w) \rangle = \langle \eta(w), w \rangle = \sum_{i=1}^{3} w_i w_i - w_0 w_0,$$

where $w$ is in $M$. Notice that our modified dot product, known as the interval in the context of
relativity, fits the equation of news traveling out from an event.
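
As a quick numerical illustration of the modified dot product, the sketch below evaluates $\langle \eta(w), w \rangle$ for two sample vectors. The function name `interval` and the sample points are assumptions made for this example.

```python
# Minimal sketch of the modified dot product <eta(w), w> for w = (ct, x, y, z).
# The function name `interval` and the sample vectors are illustrative choices.

ETA_DIAGONAL = (-1.0, 1.0, 1.0, 1.0)  # diagonal entries of the matrix eta


def interval(w):
    """Return <eta(w), w> = -w0^2 + w1^2 + w2^2 + w3^2."""
    return sum(s * wi * wi for s, wi in zip(ETA_DIAGONAL, w))


# News of an event travelling at light speed: when ct = 5 it has reached
# the point (x, y, z) = (3, 4, 0), which lies on the sphere of radius 5.
on_sphere = (5.0, 3.0, 4.0, 0.0)
inside_sphere = (5.0, 1.0, 0.0, 0.0)   # a point the news has already passed

print(interval(on_sphere))      # 0.0
print(interval(inside_sphere))  # -24.0
```

The interval vanishes exactly for points on the expanding sphere, which is the property the derivation relies on.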

Recall the following properties of matrices:
Definition 3 (Adjoint). The adjoint $T_v^*$ of $T_v$ is the transpose of the matrix of cofactors of $T_v$.
Definition 4 (Transpose). If given a matrix $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, the transpose of $A$ is $\begin{pmatrix} A_{11} & A_{21} \\ A_{12} & A_{22} \end{pmatrix}$.
Definition 5 (Cofactor). The cofactor $C_{ij}$ of an element $a_{ij}$ of a matrix $A$ is given by
$C_{ij} = (-1)^{i+j} M_{ij}$, where $M_{ij}$ is the minor of $a_{ij}$.
Definition 6 (Minor). If $A$ is a square matrix, then the minor $M_{ij}$ of the element $a_{ij}$ is the
determinant of the matrix obtained by deleting the $i$th row and $j$th column of $A$.
We can see from Postulate 2 that light traveling from an event in any reference frame will always
form a sphere. In other words, if $\langle \eta(w), w \rangle = 0$ in one reference frame, then the
same is true in any reference frame. Suppose also that $T_v(w) = w'$. Given these, we conclude that:

$$\langle \eta(w'), w' \rangle = 0$$

and thus,

$$\langle \eta T_v(w), T_v(w) \rangle = 0.$$

Let $T^*$ be the adjoint of $T$ as defined by Definition 3. Recall the property that
$\langle a, T(b) \rangle = \langle T^*(a), b \rangle$. Let $a = \eta T_v(w)$ and $b = w$. Then

$$\langle \eta(w'), w' \rangle = \langle \eta T_v(w), T_v(w) \rangle = \langle T_v^* \eta T_v(w), w \rangle.$$

Finally, our conclusion is that
(1) if $\langle \eta(w), w \rangle = 0$, then $\langle T_v^* \eta T_v(w), w \rangle = 0$.

8. Deriving the Transformation Matrix
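We will begin our derivation with some observations. Consider two vectors in $M$:

$$w_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad w_2 = \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix}.$$

By taking the modified inner product, we can see each produces 0.

$$\langle \eta(w_1), w_1 \rangle = \begin{pmatrix} 1 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = 0$$

$$\langle \eta(w_2), w_2 \rangle = \begin{pmatrix} -1 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \\ 0 \\ 0 \end{pmatrix} = 0$$

Using the common inner product, we see that $w_1 \cdot w_2 = (1)(-1) + (1)(1) + (0)(0) + (0)(0) = 0$, so
$w_1$ and $w_2$ are orthogonal. We can also see that $\{w_1, w_2\}$ spans the same subspace as
$\{e_0, e_1\}$ and that $w_1$ and $w_2$ are linearly independent. Therefore, $w_1$ and $w_2$ form an
orthogonal basis for $\mathrm{Span}\{e_0, e_1\}$.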
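
The three claims about $w_1$ and $w_2$ (zero interval, orthogonality, and spanning the same plane as $e_0$ and $e_1$) can be checked directly. This is a minimal sketch with plain Python lists; the helper names are illustrative and not notation from the paper.

```python
# Minimal sketch checking the claims about w1 and w2 with plain Python lists.
# Helper names (dot, eta, scale, add) are illustrative, not from the paper.

def dot(a, b):
    """Ordinary dot product of two 4-vectors."""
    return sum(x * y for x, y in zip(a, b))


def eta(w):
    """Apply eta = diag(-1, 1, 1, 1) to w."""
    return [-w[0], w[1], w[2], w[3]]


def scale(k, w):
    """Multiply every component of w by the scalar k."""
    return [k * x for x in w]


def add(a, b):
    """Component-wise sum of two 4-vectors."""
    return [x + y for x, y in zip(a, b)]


w1 = [1, 1, 0, 0]
w2 = [-1, 1, 0, 0]

print(dot(eta(w1), w1), dot(eta(w2), w2))  # both intervals vanish
print(dot(w1, w2))                          # zero, so w1 and w2 are orthogonal

# e0 and e1 are recovered as combinations of w1 and w2, so both pairs span the same plane.
e0 = scale(0.5, add(w1, scale(-1, w2)))     # (w1 - w2) / 2
e1 = scale(0.5, add(w1, w2))                # (w1 + w2) / 2
print(e0, e1)
```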

Recall:
Definition 7 (Span). The span of a set $S = \{v_1, v_2, \ldots, v_k\}$ of vectors in a vector space $V$
is the set of all linear combinations of the vectors in $S$.
We can see that the span of $\{w_1, w_2\}$ is invariant under $T_v$, $T_v^*$, and $\eta$. Therefore,

$$T_v^* \eta T_v (\mathrm{Span}\{w_1, w_2\}) = \mathrm{Span}\{w_1, w_2\}.$$

By Eq. (1), if $\langle \eta(w), w \rangle = 0$, then $\langle T_v^* \eta T_v(w), w \rangle = 0$. In
particular, $\langle T_v^* \eta T_v(w_1), w_1 \rangle = 0$. Because $T_v^* \eta T_v(w_1)$ lies in
$\mathrm{Span}\{w_1, w_2\}$ and is orthogonal to $w_1$, this implies that
$T_v^* \eta T_v(w_1) = a w_2$ for some constant $a$. We will verify this below.

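$$\begin{aligned}
0 &= \langle T_v^* \eta T_v(w_1), w_1 \rangle \\
&= \langle a w_2, w_1 \rangle \\
&= \begin{pmatrix} -a & a & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} \\
&= -a + a = 0
\end{aligned}$$

Therefore, $T_v^* \eta T_v(w_1) = a w_2$. A similar approach may be taken to verify that
$T_v^* \eta T_v(w_2) = b w_1$.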

By substituting $e_0 + e_1$ for $w_1$ and $-e_0 + e_1$ for $w_2$, we see that:

$$T_v^* \eta T_v(w_1) = T_v^* \eta T_v(e_0 + e_1) = T_v^* \eta T_v(e_1) + T_v^* \eta T_v(e_0) = a w_2$$

$$T_v^* \eta T_v(w_2) = T_v^* \eta T_v(-e_0 + e_1) = T_v^* \eta T_v(e_1) - T_v^* \eta T_v(e_0) = b w_1$$

By adding the last equality in each line and simplifying, we see that

$$T_v^* \eta T_v(e_1) = \frac{a w_2 + b w_1}{2} = \begin{pmatrix} \frac{b-a}{2} \\ \frac{a+b}{2} \\ 0 \\ 0 \end{pmatrix}.$$

By subtracting the last equality of the second line from that of the first and simplifying, we see that

$$T_v^* \eta T_v(e_0) = \frac{a w_2 - b w_1}{2} = \begin{pmatrix} \frac{-a-b}{2} \\ \frac{a-b}{2} \\ 0 \\ 0 \end{pmatrix}.$$
Since $T_v^* \eta T_v(e_2) = e_2$ and $T_v^* \eta T_v(e_3) = e_3$, we can write

$$T_v^* \eta T_v = \begin{pmatrix} -\frac{a+b}{2} & -\frac{a-b}{2} & 0 & 0 \\ \frac{a-b}{2} & \frac{a+b}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -p & -q & 0 & 0 \\ q & p & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

by letting $p = \frac{a+b}{2}$ and $q = \frac{a-b}{2}$.
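
The algebra leading to this matrix can be sanity-checked with exact arithmetic. The sketch below picks arbitrary sample values for $a$ and $b$ (an assumption made purely for illustration), builds the matrix in terms of $p$ and $q$, and confirms that it sends $w_1$ to $a w_2$ and $w_2$ to $b w_1$.

```python
# Minimal sketch: build the matrix of T_v^* eta T_v from sample values of a and b
# and confirm it sends w1 to a*w2 and w2 to b*w1. The sample values are arbitrary
# assumptions chosen only for illustration.
from fractions import Fraction


def mat_vec(m, w):
    """Multiply a 4x4 matrix (given as a list of rows) by a 4-vector."""
    return [sum(row[j] * w[j] for j in range(4)) for row in m]


a, b = Fraction(3), Fraction(-5)
p, q = (a + b) / 2, (a - b) / 2

A = [
    [-p, -q, 0, 0],
    [q, p, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

w1 = [1, 1, 0, 0]
w2 = [-1, 1, 0, 0]

print(mat_vec(A, w1) == [a * x for x in w2])  # True
print(mat_vec(A, w2) == [b * x for x in w1])  # True
```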

Consider $(T_v^* \eta T_v)^*$, the adjoint of $T_v^* \eta T_v$. Because this operator admits an
orthonormal basis, it is self-adjoint. Thus, $T_v^* \eta T_v = (T_v^* \eta T_v)^*$. Two of its
cofactors are

$$C_{12} = (-1)^{1+2} \begin{vmatrix} q & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = (-1)(-1)^{1+1}(q)[(1)(1) - (0)(0)] = -q$$

$$C_{21} = (-1)^{2+1} \begin{vmatrix} -q & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = (-1)(-1)^{1+1}(-q)[(1)(1) - (0)(0)] = q$$

Since $(T_v^* \eta T_v)^* = T_v^* \eta T_v$, $q = -q$, which implies that $q = 0$.
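Now consider the vector

$$\mathbf{v} = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}.$$

Then,

$$\langle \eta(\mathbf{v}), \mathbf{v} \rangle = \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = -1 + 1 = 0.$$

Since $\langle \eta(\mathbf{v}), \mathbf{v} \rangle = 0$, $\langle T_v^* \eta T_v(\mathbf{v}), \mathbf{v} \rangle = 0$
by Eq. (1). Therefore, $(-p)(1) + (q)(0) + (1)(1) = -p + 1 = 0$, which implies $p = 1$. By
substituting for $q$ and $p$, we find that $T_v^* \eta T_v = \eta$. Thus,

(2) $\langle \eta(w'), w' \rangle = \langle \eta(w), w \rangle$.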

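Now we will construct different vectors to establish more properties of the transformation. Suppose
an event occurs at the origin of $S$ so that

$$u = \begin{pmatrix} ct \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

Consider reference frame $S'$, where $S'$ is moving past $S$ at speed $v$. Then

$$u' = \begin{pmatrix} ct' \\ -vt' \\ 0 \\ 0 \end{pmatrix}.$$

Our transformation on $u$ yields $T_v(u) = u'$. By Eq. (2), we know
$\langle \eta(u'), u' \rangle = \langle \eta(u), u \rangle$. Evaluating each side separately gives
some helpful insight.

$$\langle \eta(u), u \rangle = \begin{pmatrix} ct & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -ct & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} ct \\ 0 \\ 0 \\ 0 \end{pmatrix} = -(ct)^2.$$

$$\langle \eta(u'), u' \rangle = \begin{pmatrix} ct' & -vt' & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ -vt' \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -ct' & -vt' & 0 & 0 \end{pmatrix} \begin{pmatrix} ct' \\ -vt' \\ 0 \\ 0 \end{pmatrix} = -(ct')^2 + (-vt')^2.$$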

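Therefore,

$$\begin{aligned}
-(ct)^2 &= -(ct')^2 + (-vt')^2 \\
-c^2 t^2 &= -c^2 t'^2 + v^2 t'^2 \\
-t^2 &= -t'^2 + \frac{v^2}{c^2} t'^2 \\
t^2 &= t'^2 - \frac{v^2}{c^2} t'^2 \\
t^2 &= t'^2 \left( 1 - \frac{v^2}{c^2} \right) \\
t &= t' \sqrt{1 - \frac{v^2}{c^2}}.
\end{aligned}$$

We may choose $t = \frac{1}{c}$, since $t$ is arbitrary. Then

$$\begin{aligned}
\frac{1}{c} &= t' \sqrt{1 - \frac{v^2}{c^2}} \\
1 &= ct' \sqrt{1 - \left( \frac{v}{c} \right)^2} \\
t' &= \frac{1}{c} \cdot \frac{1}{\sqrt{1 - \left( \frac{v}{c} \right)^2}},
\end{aligned}$$

so

$$ct' = \frac{1}{\sqrt{1 - \left( \frac{v}{c} \right)^2}} \quad \text{and} \quad -vt' = -\frac{v}{c \sqrt{1 - \left( \frac{v}{c} \right)^2}}.$$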
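Now we can see that

$$T_v \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{1 - \left( \frac{v}{c} \right)^2}} \\ -\frac{v}{c \sqrt{1 - \left( \frac{v}{c} \right)^2}} \\ 0 \\ 0 \end{pmatrix}.$$

Let $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \left( \frac{v}{c} \right)^2}}$, so that

$$T_v(e_0) = T_v \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma \\ -\beta\gamma \\ 0 \\ 0 \end{pmatrix}.$$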
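
To make $\beta$ and $\gamma$ concrete, the following sketch computes them for a sample speed and forms the image of $e_0$. The function name, the sample value $v/c = 0.6$, and the choice to pass the speed as a fraction of $c$ are assumptions made for this example.

```python
# Minimal sketch: compute beta and gamma for a given speed and form T_v(e0).
# Function names are illustrative; the speed is passed as the ratio v/c.
import math


def beta_gamma(v_over_c: float):
    """Return (beta, gamma) for a relative speed expressed as v/c, with |v/c| < 1."""
    if not -1.0 < v_over_c < 1.0:
        raise ValueError("the relative speed must be below the speed of light")
    beta = v_over_c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return beta, gamma


beta, gamma = beta_gamma(0.6)
image_of_e0 = [gamma, -beta * gamma, 0.0, 0.0]   # first column of the matrix of T_v

print(gamma)         # approximately 1.25
print(image_of_e0)   # approximately [1.25, -0.75, 0.0, 0.0]

# The interval of e0 is -1, and Eq. (2) says its image has the same interval.
print(-image_of_e0[0] ** 2 + image_of_e0[1] ** 2)  # approximately -1.0
```

Note that $\gamma$ grows without bound as $v$ approaches $c$, which is why the sketch rejects speeds at or above the speed of light.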

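Now we will observe an event occurring at the origin in $S'$ as $S$ moves at speed $v$. Therefore,

$$u = \begin{pmatrix} ct \\ vt \\ 0 \\ 0 \end{pmatrix}, \quad u' = \begin{pmatrix} ct' \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

We will again consider Eq. (2). In particular,

$$\langle \eta(u), u \rangle = \begin{pmatrix} ct & vt & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ vt \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -ct & vt & 0 & 0 \end{pmatrix} \begin{pmatrix} ct \\ vt \\ 0 \\ 0 \end{pmatrix} = -(ct)^2 + (vt)^2.$$

$$\langle \eta(u'), u' \rangle = \begin{pmatrix} ct' & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct' \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -ct' & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} ct' \\ 0 \\ 0 \\ 0 \end{pmatrix} = -(ct')^2.$$

Therefore, $-(ct')^2 = -(ct)^2 + (vt)^2$.
Through a similar process, we can see that

$$T_v(e_1) = T_v \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} -\beta\gamma \\ \gamma \\ 0 \\ 0 \end{pmatrix}.$$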

Since the basis vectors $e_2$, $e_3$ remain unaffected by the transformation, we have derived the
Lorentz transformation matrix as

$$T_v = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
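
The derived matrix can be checked against the property $T_v^* \eta T_v = \eta$ obtained earlier. The sketch below is a numerical check under one stated assumption: the adjoint is taken to be the matrix transpose, which is the sense in which $\langle a, T(b) \rangle = \langle T^*(a), b \rangle$ holds for the ordinary dot product. The helper names and the sample speed are illustrative.

```python
# Minimal sketch: assemble the matrix of T_v and check that T^T eta T = eta,
# taking the adjoint to be the matrix transpose (an assumption made for this check).
import math


def lorentz_matrix(v_over_c: float):
    """Matrix of T_v for motion along x, with the speed given as v/c."""
    beta = v_over_c
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -beta * gamma, 0.0, 0.0],
        [-beta * gamma, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_mul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


ETA = [[-1.0, 0, 0, 0], [0, 1.0, 0, 0], [0, 0, 1.0, 0], [0, 0, 0, 1.0]]

T = lorentz_matrix(0.8)
product = mat_mul(transpose(T), mat_mul(ETA, T))

max_error = max(abs(product[i][j] - ETA[i][j]) for i in range(4) for j in range(4))
print(max_error < 1e-12)  # True
```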

We will now use this Lorentz transformation matrix to obtain the Lorentz transformation equations.
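
9. Deriving Transformation Equations
Recall that

$$T_v \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix}.$$

Since we have defined $T_v$, we can substitute in the Lorentz transformation matrix:

$$\begin{pmatrix} ct' \\ x' \\ y' \\ z' \end{pmatrix} = T_v \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \gamma ct - \beta\gamma x \\ -\beta\gamma ct + \gamma x \\ y \\ z \end{pmatrix}.$$

We will now use $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \left( \frac{v}{c} \right)^2}}$ to
find equations for $ct'$ and $x'$.

$$ct' = \frac{ct}{\sqrt{1 - \left( \frac{v}{c} \right)^2}} - \frac{vx}{c \sqrt{1 - \left( \frac{v}{c} \right)^2}} = \frac{c^2 t - vx}{c \sqrt{1 - \left( \frac{v}{c} \right)^2}},$$

and it follows that

$$t' = \frac{c^2 t - vx}{c^2 \sqrt{1 - \left( \frac{v}{c} \right)^2}} = \frac{t - \frac{vx}{c^2}}{\sqrt{1 - \left( \frac{v}{c} \right)^2}}.$$

Also,

$$x' = -\frac{vct}{c \sqrt{1 - \left( \frac{v}{c} \right)^2}} + \frac{x}{\sqrt{1 - \left( \frac{v}{c} \right)^2}} = \frac{x - vt}{\sqrt{1 - \left( \frac{v}{c} \right)^2}}.$$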
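
The equations for $t'$ and $x'$ translate directly into a short function. The sketch below works in units where $c = 1$ (an assumption made for convenience), uses an illustrative function name, and checks two facts used earlier in the derivation: a light pulse travels at speed $c$ in both frames, and the origin of $S'$ maps to $x' = 0$.

```python
# Minimal sketch of the transformation equations for t' and x'.
# Units are chosen so that c = 1; the function name is illustrative.
import math

C = 1.0  # speed of light in the chosen units (an assumption for this example)


def lorentz_transform(t: float, x: float, v: float):
    """Return (t', x') for an event (t, x) in S, where S' moves at speed v along x."""
    root = math.sqrt(1.0 - (v / C) ** 2)
    t_prime = (t - v * x / C**2) / root
    x_prime = (x - v * t) / root
    return t_prime, x_prime


v = 0.6

# A light pulse satisfies x = c t in S; it should also satisfy x' = c t' in S'.
t_p, x_p = lorentz_transform(2.0, 2.0 * C, v)
print(abs(x_p - C * t_p) < 1e-12)   # True

# The origin of S' sits at x = v t in S, so it should map to x' = 0.
t_p, x_p = lorentz_transform(5.0, v * 5.0, v)
print(abs(x_p) < 1e-12)             # True
```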

10. Proof of $E = Mc^2$
One application of the Lorentz transformation is a proof of $E = Mc^2$. The concepts used to derive
the Lorentz transformation can be used to derive the following relativistic mass equation:

$$m' = \frac{m}{\sqrt{1 - \frac{v'^2}{c^2}}},$$

where $m \in S$ and $m', v' \in S'$. The derivation of this equation is left to the diligent student.
Let us consider the Work-Energy Theorem, $E = \int_0^x F \, dx$, where $E$ is energy and $F$ is force.
Because mass is relative, $F = ma$ does not hold in all cases. Rather, force can be defined as
$F = \frac{d}{dt}(mv)$. By substitution,

$$F = \frac{d}{dt'}(m'v') = \frac{d}{dt'} \left( \frac{mv'}{\sqrt{1 - \frac{v'^2}{c^2}}} \right),$$

where $t' \in S'$.
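We can see that

$$\frac{d}{dt'} \left( \frac{mv'}{\sqrt{1 - \frac{v'^2}{c^2}}} \right) = \frac{ma'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}},$$

where $a' \in S'$. The following simplification serves as proof.

$$\begin{aligned}
\frac{d}{dt'} \left( \frac{mv'}{\sqrt{1 - \frac{v'^2}{c^2}}} \right) &= \frac{ma'}{\sqrt{1 - \frac{v'^2}{c^2}}} + \frac{mv' \left( -\frac{1}{2} \right)(-2)(a') \frac{v'}{c^2}}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \\
&= \frac{ma' \left( 1 - \frac{v'^2}{c^2} \right) + ma' \frac{v'^2}{c^2}}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \\
&= \frac{ma' - ma' \frac{v'^2}{c^2} + ma' \frac{v'^2}{c^2}}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \\
&= \frac{ma'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}}
\end{aligned}$$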

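By substitution and manipulation, we can see that

$$E = \int_0^x F \, dx = \int_0^x \frac{ma'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \, dx = m \int_0^x \frac{a'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \, dx.$$

By recalling that $a' = \frac{dv'}{dt'} = \frac{dv'}{dx} \frac{dx}{dt'} = v' \frac{dv'}{dx}$, we can
substitute for $a'$:

$$E = m \int_0^x \frac{v'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \frac{dv'}{dx} \, dx.$$

The Chain Rule gives us a change in bounds such that

$$E = m \int_0^{v'} \frac{v'}{\left( 1 - \frac{v'^2}{c^2} \right)^{3/2}} \, dv'.$$

After integrating, using the antiderivative $c^2 \left( 1 - \frac{v'^2}{c^2} \right)^{-1/2}$, we see that

$$E = mc^2 \left( \frac{1}{\sqrt{1 - \frac{v'^2}{c^2}}} - 1 \right) = c^2 (m' - m) = Mc^2,$$

where $M = m' - m$ is the change in mass. Thus, we have used the concept of Lorentz transformations
to prove the well-known formula $E = Mc^2$.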
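
The final integration step can be checked numerically. The sketch below approximates the integral with a midpoint rule and compares it with $c^2(m' - m)$. The units with $c = 1$, the sample mass and speed, the step count, and the function names are all assumptions chosen for illustration.

```python
# Minimal sketch: numerically integrate m * v / (1 - v^2/c^2)^(3/2) from 0 to a final
# speed and compare with c^2 * (m' - m). Units with c = 1, the sample mass and speed,
# and the midpoint rule are all assumptions chosen for illustration.
import math

C = 1.0


def relativistic_mass(m: float, v: float) -> float:
    """m' = m / sqrt(1 - v^2 / c^2)."""
    return m / math.sqrt(1.0 - (v / C) ** 2)


def work_integral(m: float, v_final: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral for E."""
    dv = v_final / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv                      # midpoint of the i-th subinterval
        total += v / (1.0 - (v / C) ** 2) ** 1.5 * dv
    return m * total


m, v_final = 2.0, 0.6
energy_numeric = work_integral(m, v_final)
energy_formula = C**2 * (relativistic_mass(m, v_final) - m)

print(abs(energy_numeric - energy_formula) < 1e-8)  # True
```

The two values agree to within the accuracy of the numerical rule, which is consistent with the closed form $E = c^2(m' - m)$.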
