
COMMENTARY

40 Gigabit Ethernet and 100 Gigabit Ethernet:
The Development of a Flexible Architecture
John D'Ambrosia
Introduction
In December 2007 the IEEE Standards Asso-
ciation approved the formation of the IEEE
P802.3ba Task Force, which was chartered with
the development of 40 Gb Ethernet and 100 Gb
Ethernet. The decision to do both rates of Ether-
net was scrutinized by the industry at the time,
but ultimately the Higher Speed Study Group
provided a vital forum for the stakeholders in the
next generation of Ethernet to debate this very
issue. The fact that this debate actually occurred
is in itself a testament to the success of Ethernet.
Networking applications, whose bandwidth
requirements are doubling approximately every
18 months, have greater bandwidth demands than
computing applications, where the bandwidth
capabilities for servers are doubling approximate-
ly every 24 months. The impact of this difference
in bandwidth growth is illustrated in Fig. 1. It is
clear from these trend lines that if Ethernet is to
provide a solution for both the computing and
network application space, it needs to evolve past
its own tradition of 10x leaps in operation rates
with each successive generation.
The decision to do two rates was not taken lightly by par-
ticipants in the Higher Speed Study Group. In hindsight, this
author, who was in the thick of this debate, feels that the deci-
sion to do both 40 Gb and 100 Gb Ethernet was the correct
decision for Ethernet. Ultimately, it was the IEEE standards
development process itself that proved to be the key to resolv-
ing this difficult decision.
Support of two differing data rates as well as different
physical layer specifications selected for this project presented
the task force with a dilemma. The task force needed to
develop an architecture that could support both rates simulta-
neously and the various physical layer specifications being
developed today, as well as what might be developed in the
future. This column will provide the reader with insight into
the IEEE P802.3ba architecture, and highlight its inherent
flexibility and scalability.
The Physical Layer Specifications
Closely examining the different application spaces where
40 Gb and 100 Gb Ethernet will be used led to the identifica-
tion of the physical layer (PHY) specifications being targeted
by the Task Force. For computing applications, copper and
optical physical layer solutions are being developed for dis-
tances up to 100 m for a full range of server form factors
including blade, rack, and pedestal configurations. For net-
work aggregation applications, copper and optical solutions
are being developed to support distances and media types
appropriate for data center networking, as well as service
provider intra-office and interoffice connection.
Table 1 provides a summary of the different PHY specifi-
cations that were ultimately targeted by the task force with
their respective port type names. Below is a description of
each of the different physical medium dependents (PMDs):
• 40GBASE-KR4: This PMD supports backplane transmission over four channels in each direction at 40 Gb/s. It leverages the 10GBASE-KR architecture, already developed channel requirements, and PMD.
• 40GBASE-CR4 and 100GBASE-CR10: The 40GBASE-CR4 PMD supports transmission at 40 Gb/s across four differential pairs in each direction over a twin axial copper cable assembly. The 100GBASE-CR10 PMD supports transmission at 100 Gb/s across 10 differential pairs in each direction over a twin axial copper cable assembly. Both PMDs leverage the 10GBASE-KR architecture, already developed channel requirements, and PMD.
• 40GBASE-SR4 and 100GBASE-SR10: This PMD is based on 850 nm technology and supports transmission over at least 100 m of OM3 parallel fiber. The effective data rate per lane is 10 Gb/s. Therefore, the 40GBASE-SR4 PMD supports transmission of 40 Gb Ethernet over a parallel fiber medium consisting of four parallel OM3 fibers in each direction, while the 100GBASE-SR10 PMD will support the transmission of 100 Gb Ethernet over a parallel fiber medium consisting of 10 parallel OM3 fibers in each direction.
• 40GBASE-LR4: This PMD is based on 1310 nm coarse wavelength-division multiplexing (CWDM) technology and supports transmission over at least 10 km over single-mode fiber (SMF). The grid is based on the ITU G.694.2 specification, and the wavelengths used are 1270, 1290, 1310, and 1330 nm. The effective data rate per lambda is 10 Gb/s, which will help maximize reuse of existing 10G PMD technology. Therefore, the 40GBASE-LR4 PMD supports transmission of 40 Gb Ethernet over four wavelengths on each SMF in each direction.
• 100GBASE-LR4: This PMD is based on 1310 nm dense WDM (DWDM) technology and supports transmission of at least 10 km over single-mode fiber. The grid is based on the ITU G.694.1 specification, and the wavelengths used are 1295, 1300, 1305, and 1310 nm. The effective data rate per lambda is 25 Gb/s. Therefore, the 100GBASE-LR4 PMD supports transmission of 100 Gb Ethernet over four wavelengths on each SMF in each direction.
• 100GBASE-ER4: This PMD is based on 1310 nm DWDM technology and supports transmission over at least 40 km over single-mode fiber. The grid is based on the ITU G.694.1 specification, and the wavelengths used are 1295, 1300, 1305, and 1310 nm. The effective data rate per lambda is 25 Gb/s. Therefore, the 100GBASE-ER4 PMD supports transmission of 100 Gb Ethernet over four wavelengths on each SMF in each direction. To achieve the 40 km reaches called for, it is anticipated that implementations may need to include semiconductor optical amplifier (SOA) technology.
Figure 1. Bandwidth growth forecasts (curves labeled Gigabit Ethernet, 10 Gigabit Ethernet, 40 Gigabit Ethernet, and 100 Gigabit Ethernet).
Table 1. Summary of IEEE P802.3ba physical layer specifications.
IEEE Communications Magazine, March 2009
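The lane arithmetic behind the PMD descriptions above can be checked with a short sketch. The dictionary layout and function names are illustrative assumptions, not part of the standard. The 10 Gb/s per-lane figure for the backplane and copper PMDs is inferred here from their reuse of 10GBASE-KR rather than stated explicitly in the text.

```python
# Lanes (pairs, fibers, or wavelengths per direction) and effective per-lane
# data rate in Gb/s, as read from the PMD descriptions. The per-lane rate of
# the -KR4/-CR4/-CR10 entries is an inference, not a quoted figure.
PMDS = {
    "40GBASE-KR4": (4, 10),
    "40GBASE-CR4": (4, 10),
    "100GBASE-CR10": (10, 10),
    "40GBASE-SR4": (4, 10),
    "100GBASE-SR10": (10, 10),
    "40GBASE-LR4": (4, 10),
    "100GBASE-LR4": (4, 25),
    "100GBASE-ER4": (4, 25),
}


def aggregate_rate_gbps(lanes, per_lane_gbps):
    """Aggregate effective data rate carried by all lanes in one direction."""
    return lanes * per_lane_gbps


def nominal_rate_gbps(port_type):
    """Nominal Ethernet rate encoded in the port type name, e.g. 40 for 40GBASE-SR4."""
    return int(port_type.split("GBASE")[0])


for name, (lanes, rate) in PMDS.items():
    total = aggregate_rate_gbps(lanes, rate)
    print(f"{name:14s} {lanes:2d} x {rate:2d} Gb/s = {total:3d} Gb/s")
```

In every case the lane count times the per-lane rate equals the nominal rate in the port type name, which is why one PCS/PMA scheme has to cope with 4-lane and 10-lane media.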
The Architecture
During the proposal selection process for the different PHY specifications, it became evident that the task force would need to develop an architecture that would be both flexible and scalable in order to simultaneously support 40 Gb and 100 Gb Ethernet. These architectural aspects would be necessary in order to deal with the PHY specifications being developed by the IEEE P802.3ba Task Force, as well as those that may be developed by future task forces.
Figure 2. IEEE P802.3ba architecture.
Figure 2 illustrates the overall IEEE P802.3ba architecture that supports both 40 Gb and 100 Gb Ethernet. While all of the PHYs have a physical coding sublayer (PCS), physical medium attachment (PMA) sublayer, and physical medium dependent (PMD) sublayer, only the copper cable (-CR) and backplane (-KR) PHYs have an auto-negotiation (AN) sublayer and an optional forward error correction (FEC) sublayer.
For 40 Gb Ethernet the respective PCS and PMA sublayers need to support PMDs being developed by the IEEE P802.3ba Task Force that operate electrically across four differential pairs in each direction, or optically across four optical fibers or four wavelengths in each direction. It was realized, however, that in the future, the IEEE P802.3ba architecture might need to support other 40 Gb PMDs that could operate either across two lanes or a single serial lane.
Likewise, for 100 Gb Ethernet the respective PCS and PMA sublayers need to support PMDs being developed by the IEEE P802.3ba Task Force that operate electrically across 10 differential pairs in each direction, or optically across 10 optical fibers or four optical wavelengths in each direction. It was also realized that in the future the IEEE P802.3ba architecture might need to support other 100 Gb PMDs that might potentially operate across five lanes, two lanes, or a single serial lane.
The task force leveraged the relationship between the respective sublayers to develop the flexible and scalable architecture it needed for 40 Gb and 100 Gb Ethernet, as well as for future rates of Ethernet.
The PCS sublayer couples the respective media independent interface (MII) to the PMA sublayer. For 40 Gb Ethernet, the MII is called XLGMII, and for 100 Gb Ethernet, the MII is called CGMII. The PMA sublayer interconnects the PCS to the PMD sublayer. Therefore, the functionality embedded in the PCS and PMA represents a two-stage process that couples the respective MII to the different PMDs that were envisioned for 40 Gb and 100 Gb Ethernet. Furthermore, this scheme can be scaled in the future to support the next higher rates of Ethernet.
As noted above, the PCS sublayer couples the respective MII to the PMA sublayer. The aggregate stream coming from the MII into the PCS sublayer undergoes the 64B/66B coding scheme that was used in 10 Gb Ethernet. Using a round-robin distribution scheme, 66-bit blocks are then distributed across [...] PCS lanes for 100 Gb Ethernet. The number of PCS lanes for each rate [...] number of lanes that might [...] given rate and then [...] of those implemen[...] Gb Ethernet will [...] four channels or [...] PCS lanes for 10[...] employ 1, 2, 4, 5, [...] each direction.
Figure 3. PCS lane distribution concept (an aggregate stream of 64B/66B words is dealt out by simple 66-bit word round robin to PCS lane 1, PCS lane 2, through PCS lane n, with lane markers on each lane).
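The round-robin distribution of 66-bit blocks across PCS lanes described above can be sketched as follows. This is a minimal illustration under stated assumptions: integers stand in for 66-bit blocks, the lane count of four is chosen only for the example, and the lane markers and skew handling shown in Figure 3 are not modeled. The function names are hypothetical.

```python
from typing import List


def distribute_round_robin(blocks: List[int], num_lanes: int) -> List[List[int]]:
    """Deal blocks to PCS lanes one at a time, in round-robin order."""
    lanes: List[List[int]] = [[] for _ in range(num_lanes)]
    for index, block in enumerate(blocks):
        lanes[index % num_lanes].append(block)
    return lanes


def reassemble(lanes: List[List[int]]) -> List[int]:
    """Inverse operation: interleave the lanes back into the aggregate stream."""
    num_lanes = len(lanes)
    total = sum(len(lane) for lane in lanes)
    return [lanes[i % num_lanes][i // num_lanes] for i in range(total)]


blocks = list(range(10))            # stand-ins for ten consecutive 66-bit blocks
lanes = distribute_round_robin(blocks, num_lanes=4)
print(lanes)                        # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
print(reassemble(lanes) == blocks)  # True
```

Because each lane carries whole 66-bit blocks, the receiver can rebuild the aggregate stream by interleaving the lanes in the same order, provided it knows which lane is which.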
Figure 4. Example implementation of 100GBASE-LR4 (layer stack, top to bottom: MAC client, MAC control (optional), MAC, Reconciliation, CGMII, PCS, PMA (20:10), CAUI, PMA (10:4), PMD, MDI, Medium).
Figure 5. IEEE P802.3ba timeline.
[...] CAUI interface, which is then multiplexed into four lambdas, each with an effective data rate of 25 Gb/s, and carried across 10 km of SMF.
Conclusion
The IEEE P802.3ba Task Force [...] is preparing a request to go to Working Group Ballot, the next stage in the development of 40 Gb and 100 Gb Ethernet. The adopted schedule for the project is shown in Fig. 5.
Regardl[...] regarding th[...] this project [...] fashion and [...] standards approval in June 2010. Furthermore, the architecture this task force has adopted will allow Ethernet to scale to even greater speeds in the future, which should interest those parties already starting to call for Terabit Ethernet.
applications CORNER
Derek J. Walvoord and Roger L. Easton, Jr.
Digital Object Identifier 10.1109/MSP.2008.[...]
IEEE SIGNAL PROCESSING MAGAZINE [100] JULY 2008    1053-5888/08/$25.00©2008IEEE

Digital Transcription of the Archimedes Palimpsest

The recent and concurrent development of the technology of optical sensors and digital computers, and the consequent decreases in their cost, make possible their use in the transcription of historical manuscripts. Some of these documents may have been deliberately erased and overwritten to make a palimpsest, which may further suffer from many other forms of deterioration.
The Archimedes Palimpsest includes partial texts from seven treatises by Archimedes, including the only extant copy in the original Greek of his most famous work On Floating Bodies, the only copies in any form of On the Method of Mechanical Theorems (which provides insight into his mathematical thought process) and of Stomachion (which has been identified to be a very early study in combinatorics [1]). In this article we present the workflow and methods involved in the digital transcription of the Archimedes palimpsest.
THE PALIMPSEST
The original codex of Archimedes texts was copied onto parchment from other sources in the 10th century to make a bound book. During the Fourth Crusade in 1204, the book was disbound, the original text was erased, and the pages were cut in half. Along with pages from other manuscripts, the erased Archimedes pages were then overwritten with the Euchologion (a Christian prayer book). In 1906, Johan L. Heiberg, a Danish philologist, identified that the undertext of the palimpsest was the work of Archimedes, and he had 65 photographs made of the book. The manuscript resurfaced in 1998 when Christie's auctioned the codex for US$2 million to an anonymous American collector, who has lent it to the Walters Art Museum in Baltimore, Maryland, and generously funded its conservation, imaging, and study. The condition of the book has deteriorated markedly since Heiberg's work, as shown by the comparison of a Heiberg photograph to the current appearance of the same page in Figure 1.
The manuscript has great historical importance, but unfortunately much of the text is very difficult to discern due to its poor condition. Our underlying objective in the transcription of the Archimedes palimpsest was to apply modern imaging techniques to preserve the manuscript and to assist transcription by scholars. In this process, we used multispectral image collection and processing, which facilitated transcription of nearly 80% of the manuscript, and a digital imaging tool that used this partial transcription to assist scholars in reading damaged characters and words.
[FIG1] Leaf 57 verso of the Archimedes palimpsest: (a) photograph from 1906 by Heiberg and (b) condition in 2007, showing the damage that occurred in 100 years. (Courtesy of the owner of the Archimedes palimpsest.)
MULTISPECTRAL IMAGING ACQUISITION
Both the erased Archimedes text (the underwriting) and the Euchologion text (overwriting) were written using iron gall ink. The original proposal for imaging was directed at capturing and enhancing a small color difference between the two texts [Figure 2(a)]. Each page was imaged under two illuminations (ultraviolet light at λ = 365 nm and low-wattage tungsten lights) through five different bandpass filters (blue, green, red, and two IR bands) [2] to create a multispectral data set for subsequent processing.
LEAST-SQUARES SPECTRAL UNMIXING
The initial approach used to provide scholars with enhanced Archimedes text employed a supervised least-squares spectral unmixing algorithm [3]. After registering each band in the multispectral data, an observer selected groups of pixels that belonged to four object classes: overwriting, underwriting, parchment, and mold. The image set was processed to estimate the class membership of each pixel. Figure 2(b) shows leaf 28 verso (the reverse side of leaf 28) of the Euchologion under normal illumination and the image of the class map for the original Archimedes text. Note that the gray value of each pixel is a measure of the membership in this class; underwriting is mapped to white and other classes to black. The image shows how the method successfully stripped off much of the ink from the Euchologion, leaving only the remnants of the original Archimedes text. However, the additional noise in the processed images was distracting to the scholars who were transcribing the text. In addition, any Archimedes character that is partially obscured by a Euchologion character showed distinct breaks in the ink that might lead to ambiguous readings. The scholars desired a much simpler result that preserved the visibility of both writings, while distinguishing the texts in some manner.
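A per-pixel least-squares unmixing step of the kind described above can be sketched as follows. This is an illustration only, not the authors' implementation: it uses unconstrained least squares through the normal equations, five bands instead of the full multispectral set, and made-up class spectra (in a supervised setting these would be, for instance, the mean spectra of the observer-selected pixel groups). All names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b for a small square system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def unmix(pixel, endmembers):
    """Class memberships a that minimise ||E a - pixel||^2 (normal equations)."""
    k = len(endmembers)
    gram = [[sum(x * y for x, y in zip(endmembers[i], endmembers[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(x * y for x, y in zip(endmembers[i], pixel)) for i in range(k)]
    return solve_linear(gram, rhs)


CLASSES = ["overwriting", "underwriting", "parchment", "mold"]
# Hypothetical class spectra over five bands (illustrative values only).
ENDMEMBERS = [
    [0.10, 0.12, 0.15, 0.20, 0.30],
    [0.30, 0.35, 0.50, 0.60, 0.65],
    [0.60, 0.70, 0.75, 0.80, 0.85],
    [0.20, 0.50, 0.30, 0.40, 0.25],
]

# A synthetic pixel that is 70% underwriting and 30% parchment.
true_mix = [0.0, 0.7, 0.3, 0.0]
pixel = [sum(a * e[band] for a, e in zip(true_mix, ENDMEMBERS)) for band in range(5)]

memberships = unmix(pixel, ENDMEMBERS)
for name, value in zip(CLASSES, memberships):
    print(f"{name:12s} {value:.3f}")
```

Mapping the underwriting membership of every pixel to a gray value would give a class map like the one described for Figure 2(b).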
PSEUDOCOLOR IMAGE ENHANCEMENT
After receiving the feedback from the scholars, the imaging team developed a processing method to produce pseudocolor images that allocated different colors to the underwriting, overwriting, and parchment classes. The process is based on the observation that the underwriting is barely detectable under red light, while both texts are visible when viewed under ultraviolet light. UV illumination generates visible fluorescence in the parchment, which enhances the contrast of both texts when viewed through a blue filter. A pseudocolor image is constructed by assigning the red channel of the image under tungsten illumination to its red channel and the blue channel under UV illumination to its green and blue channels. The reddish Archimedes text appears dark in the green and blue channels and brighter in red light, thereby showing a distinct reddish tint in the pseudocolor image. The Euchologion text is dark in all three channels and so appears in a dark neutral shade. This color cue assists the scholars in the transcription of the text. An example of a pseudocolor image is shown in Figure 2(c).
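The channel assignment described above is simple enough to sketch directly. The example assumes two registered RGB images stored as nested lists of (r, g, b) tuples; the pixel values are invented to mimic the three cases in the text and are not measurements.

```python
def pseudocolor(tungsten_rgb, uv_rgb):
    """Red from the tungsten image; green and blue from the UV image's blue channel."""
    out = []
    for row_t, row_uv in zip(tungsten_rgb, uv_rgb):
        out_row = []
        for (red_t, _, _), (_, _, blue_uv) in zip(row_t, row_uv):
            out_row.append((red_t, blue_uv, blue_uv))
        out.append(out_row)
    return out


# One row of three hypothetical pixels: parchment, underwriting, overwriting.
tungsten = [[(220, 200, 170), (180, 150, 120), (40, 35, 30)]]
uv = [[(90, 150, 200), (40, 50, 60), (20, 25, 30)]]

result = pseudocolor(tungsten, uv)
print(result)  # [[(220, 200, 200), (180, 60, 60), (40, 30, 30)]]
```

The underwriting pixel comes out strongly red (180, 60, 60), while the overwriting pixel stays dark and nearly neutral (40, 30, 30), which is the color cue the scholars relied on.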
THE TRANSCRIPTION SYSTEM
While the pseudocolor system works sufficiently well for much of the undertext, transcription of some pages is still problematic due to severe damage. An interactive image processing and analysis system has been developed to present additional information to the scholar and utilize his or her feedback. The processing uses a series of spatial correlations between character fragments in the images and a training library of characters extracted from relatively clean regions in the manuscript. A high-level diagram of the overall transcription system is shown in Figure 3(a) and a low-level diagram of the processing is shown in Figure 3(b).
[FIG2] (a) Images of a section of the palimpsest under illumination with wavelength [...]sing from right to left. The text becomes more visible as the wavelength of the illumination decreases. (b) A comparison of the original strobe illuminated image of leaf 28 verso before and after least-squares spectral unmixing. (Courtesy of the owner of the Archimedes palimpsest.) (c) Pseudocolor image of disbound leaves 98 verso-102 recto of the Euchologion (Archimedes treatise On Spiral Lines). The horizontal Archimedes text and the diagram appear with reddish tints, while the prayer book text appears black. (Courtesy of the owner of the Archimedes palimpsest.)
FEATURE EXTRACTION
The features used for character classification are extracted using advanced correlation techniques [Figure 3(b)]. Several matching schemes are used simultaneously to account for variability in the spatial structure of the character regions under scrutiny, thus providing adequate features for classification.
ADVANCED CORRELATION FILTERING
The feature extraction process benefits heavily from the inclusion of filter designs that incorporate a set of training images into the filter mask development. Composite correlation designs use a training set from a particular class to provide some degree of distortion tolerance for within-class variation. The maximum average correlation height (MACH) filter [4], when used with other classical filtering designs, provided acceptable correlation results for feature extraction using the palimpsest imagery. The MACH filter has the form

$$\mathbf{h} = \gamma(\mathbf{S} + \mathbf{I})^{-1}\mathbf{m}, \tag{1}$$

where h is the vector representation of the filter transfer function, m is the vector containing the mean training image Fourier transform, I is the identity matrix, and γ is a normalization constant. Note that lower-case bold-faced symbols represent vectors while upper-case symbols refer to matrices. The matrix S in the MACH filter is given by

$$\mathbf{S} = \frac{1}{dN}\sum_{i=1}^{N}(\mathbf{X}_i - \mathbf{M})^{*}(\mathbf{X}_i - \mathbf{M}), \tag{2}$$

where N is the number of training images, d is the number of pixels in each image, and X_i and M are diagonal matrices containing the ith training Fourier transform and the average Fourier transform, respectively.
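Because X_i and M are diagonal, S in (2) is diagonal as well, and the matrix inverse in (1) reduces to an element-wise division. The sketch below shows this for short 1-D signals with a plain DFT. It is a simplified illustration under those assumptions (the palimpsest work uses 2-D images), and the function names and sample signals are hypothetical.

```python
import cmath


def dft(signal):
    """Unnormalised discrete Fourier transform of a 1-D sequence."""
    d = len(signal)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * n / d) for n, x in enumerate(signal))
            for k in range(d)]


def mach_filter(training, gamma=1.0):
    """Frequency-domain filter h = gamma * (S + I)^{-1} m for 1-D training signals."""
    n_train = len(training)   # N in Eq. (2)
    d = len(training[0])      # d in Eq. (2): samples per training signal
    spectra = [dft(x) for x in training]
    # m: mean training Fourier transform (also the diagonal of M).
    m = [sum(s[k] for s in spectra) / n_train for k in range(d)]
    # Diagonal of S: average squared deviation of each spectrum from the mean.
    s_diag = [sum(abs(s[k] - m[k]) ** 2 for s in spectra) / (d * n_train)
              for k in range(d)]
    # (S + I)^{-1} m is an element-wise division because S + I is diagonal.
    return [gamma * m[k] / (s_diag[k] + 1.0) for k in range(d)]


training = [
    [0, 1, 2, 1, 0, 0, 0, 0],
    [0, 1, 3, 1, 0, 0, 0, 0],
]
h = mach_filter(training)
print([round(abs(value), 3) for value in h])
```

Frequencies where the training spectra disagree get a larger S entry and are attenuated, which is how the filter gains tolerance to within-class variation.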
To achieve high tolerance to within-class distortion, the filter is designed to minimize the average similarity measure (ASM) between output correlation planes for each of the training images used to construct the filter mask. In addition, the output noise variance (ONV) is also minimized, and the average correlation height (ACH) is maximized. These criteria are [4]

$$\mathrm{ASM} = \frac{1}{N}\sum_{i=1}^{N}\sum_{m}\sum_{n}\left|g_i(m,n) - \bar{g}(m,n)\right|^{2} = \mathbf{h}^{+}\mathbf{S}\mathbf{h} \tag{3}$$

$$\mathrm{ONV} = E\{\mathbf{h}^{+}\mathbf{C}\mathbf{h}\} \tag{4}$$

$$\mathrm{ACH} = \frac{1}{N}\sum_{i=1}^{N} g_i(0,0), \tag{5}$$

where g_i(m, n) is the correlation plane corresponding to the ith training image, ḡ(m, n) is the average of the N correlation planes, and C is the covariance matrix of the input noise estimation.
An example of a typical correlation plane produced by the MACH filter using Greek characters from the underwriting as training and targets is shown in Figure 4.
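The equality between the spatial-domain and quadratic-form expressions for the ASM follows from Parseval's relation. The short derivation below is a sketch, assuming a d-point DFT with the 1/d factor on the inverse transform. Here G_i, \bar{G}, H, X_i, and M denote the frequency-domain samples of g_i, ḡ, the filter, the ith training image, and the mean training image, and k indexes the d frequency samples. These symbol choices are introduced for illustration.

```latex
\begin{align*}
G_i(k) &= H^{*}(k)\,X_i(k), \qquad \bar{G}(k) = H^{*}(k)\,M(k) \\
\sum_{m}\sum_{n}\left|g_i(m,n)-\bar{g}(m,n)\right|^{2}
  &= \frac{1}{d}\sum_{k}\left|G_i(k)-\bar{G}(k)\right|^{2}
   = \frac{1}{d}\sum_{k}\left|H(k)\right|^{2}\left|X_i(k)-M(k)\right|^{2} \\
\mathrm{ASM}
  &= \frac{1}{N}\sum_{i=1}^{N}\frac{1}{d}\sum_{k}\left|H(k)\right|^{2}\left|X_i(k)-M(k)\right|^{2}
   = \sum_{k} H^{*}(k)\left[\frac{1}{dN}\sum_{i=1}^{N}\left|X_i(k)-M(k)\right|^{2}\right]H(k)
   = \mathbf{h}^{+}\mathbf{S}\,\mathbf{h}
\end{align*}
```

The bracketed term is the kth diagonal entry of the matrix S, so minimizing the ASM penalizes filter energy at frequencies where the training images differ most from their mean.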
[FIG3] (a) Block diagram of the transcription system and (b) detailed block diagram of the character recognition of (a).
[FIG4] Example of normalized correlation planes generated using the MACH filter for (a) true class targets and (b) false class targets.
COMPACT REPRESENTATION OF CORRELATION PEAKS
While the correlation plane itself may be considered an extracted feature of the image data, correlation peaks are often represented more compactly by associated performance metrics. The values obtained from these metrics are then used as the extracted features for classification. A metric that characterizes the sharpness of the correlation peak is the peak-to-sidelobe ratio (PSR)

$$\mathrm{PSR} = \frac{\mathrm{peak} - \mu}{\sigma}, \tag{6}$$

where μ and σ are the mean and the standard deviation of the peak, respectively. Other metrics also may be used to characterize the quality of the spatial matching, including the signal-to-noise ratio, the peak-to-correlation energy ratio, and full area at half maximum [5]. The result of this process is a feature vector of correlation performance for each class and for the region-of-interest (ROI) itself (autocorrelation vector) [7].
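As a rough illustration of the PSR metric, the sketch below takes the mean and standard deviation over a sidelobe region, defined here as everything outside a small window centered on the peak. That region definition, the window size, and the sample correlation planes are assumptions made for the example, not details taken from the article.

```python
import statistics


def peak_to_sidelobe_ratio(plane, exclude_radius=1):
    """PSR = (peak - mean) / std, with statistics taken outside a window around the peak."""
    rows, cols = len(plane), len(plane[0])
    peak, peak_r, peak_c = max(
        (plane[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    sidelobe = [
        plane[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - peak_r) > exclude_radius or abs(c - peak_c) > exclude_radius
    ]
    return (peak - statistics.mean(sidelobe)) / statistics.pstdev(sidelobe)


def make_plane(center_value):
    """5 x 5 checkerboard of 0.1/0.2 background values with a chosen central peak."""
    plane = [[0.1 if (r + c) % 2 == 0 else 0.2 for c in range(5)] for r in range(5)]
    plane[2][2] = center_value
    return plane


sharp = make_plane(1.0)   # strong match: tall, isolated peak
weak = make_plane(0.3)    # poor match: peak barely above the background

print(round(peak_to_sidelobe_ratio(sharp), 3))
print(round(peak_to_sidelobe_ratio(weak), 3))
```

A tall, isolated peak yields a much larger PSR than a peak that barely rises above the background, so a single number per class can stand in for the whole correlation plane.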
CHARACTER CLASSIFICATION
After calculating the distance from each class feature vector to the ROI feature vector in a principal component (PC) space of reduced dimensionality, a probability distribution may be assigned to the character classes. While typical target detection problems produce a single classification decision, the current pattern recognition system is designed to provide intermediate results that allow the user to apply his/her own decisions to arrive at a conclusion.
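The article does not state how distances are converted to probabilities. One plausible sketch, offered purely as an assumption, weights each class by the exponential of its negative Euclidean distance and normalizes the weights to sum to one. The class labels and feature vectors below are hypothetical.

```python
import math


def class_probabilities(roi_vector, class_vectors):
    """Map distances in feature space to a probability distribution over classes."""
    distances = {
        label: math.dist(roi_vector, vector) for label, vector in class_vectors.items()
    }
    # Assumed mapping: closer classes get exponentially larger weight.
    weights = {label: math.exp(-dist) for label, dist in distances.items()}
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


# Hypothetical class feature vectors in a 2-D principal component space.
class_vectors = {
    "alpha": [0.9, 0.1],
    "kappa": [0.2, 0.8],
    "omicron": [0.5, 0.5],
}
roi = [0.85, 0.2]

probs = class_probabilities(roi, class_vectors)
for label, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{label:8s} {p:.3f}")
```

Returning the full distribution instead of only the best class matches the stated design goal of giving the user intermediate results to reason with.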
INTEGRATION OF A PROBABILISTIC NETWORK
The Archimedes palimpsest is a mathematical text and is therefore limited in vocabulary. A word dictionary look-up table (LUT) was constructed using the partial transcription produced by Reviel Netz of Stanford University. Assuming the user can accurately determine the number of characters in the word under investigation and approximately segment the character locations, a probabilistic network may be used to model the uncertainty in the character probabilities associated with each ROI. Each ROI is a variable (or node) in the network. For simplicity, the network design is a polytree, in which each node is conditionally dependent on the previous node. This allows practical usage of a variable elimination algorithm [6] for exact inference. The network may now be initialized by generating the associated conditional probabilities for each character location from the word dictionary LUT.
For example, the polytree for a word consisting of six characters would be generated to represent the six ROI nodes. The resulting network structure is illustrated in Figure 5(a) where the symbols labeled "CPR" denote the results of correlation pattern recognition generated prior to the construction of the polytree. These results are not nodes in the network but rather probabilities that contribute to initialization and query results independently of the contextual information probabilities using naive Bayes' rule.
APPLYING EVIDENCE TO THE MODEL
From the block diagram in Figure 3(b), it is apparent that the output character and word probabilities are computed from multiple sources of information. The user can base decisions in the transcription on information from the spatial structure of the palimpsest images, the LUT created from the partial transcription results, and prior knowledge of the context and language.
The entire decision-making process to transcribe a degraded word region is illustrated in Figure 5(b). As shown, eight ROIs were identified by the user in the pseudocolor image as the assumed locations of eight characters that constitute a word. The first table, labeled "Correlation Pattern Recognition Results," shows the resulting probabilities for the five character classes deemed "most probable" at the locations of each ROI. Note that 36 character classes were present in the training set. For ROI 5, CPR results indicate that "cv" is the most probable character (10.03% likelihood). The second table, labeled "Initialized Network Results," lists updated probabilities after initializing the probabilistic network. In this example, hard evidence is now applied to ROI 1 ("r") and ROI 4 ("p") based on character likelihood and visual information from the image itself. The third table of "Query Results" shows a large increase in probability for the "correct" character classes for all but ROI 6. At this point, the user can either make an informed decision using the probable words from the LUT or perform another query [8].
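The way dictionary context and hard evidence sharpen the character probabilities can be illustrated with a deliberately simplified stand-in for the polytree. Instead of variable elimination over ROI nodes, the sketch enumerates the words of the right length in a small dictionary and scores each by the product of its per-ROI CPR probabilities. The dictionary, the Latin stand-in characters, the probabilities, and the floor value are all hypothetical.

```python
def word_posteriors(cpr, dictionary, evidence=None):
    """Posterior over dictionary words given per-ROI character probabilities.

    cpr[i] maps characters to CPR probabilities at ROI i. evidence maps an ROI
    index to a character fixed by the user (hard evidence).
    """
    evidence = evidence or {}
    floor = 1e-6  # assumed small probability for characters CPR did not list
    scores = {}
    for word in dictionary:
        if len(word) != len(cpr):
            continue  # the user has fixed the number of characters
        if any(word[i] != ch for i, ch in evidence.items()):
            continue  # hard evidence rules this word out
        score = 1.0
        for i, ch in enumerate(word):
            if i not in evidence:
                score *= cpr[i].get(ch, floor)
        scores[word] = score
    total = sum(scores.values())
    return {word: score / total for word, score in scores.items()}


dictionary = ["cat", "cot", "cut", "dog", "door"]
cpr = [
    {"c": 0.5, "d": 0.5},
    {"a": 0.3, "o": 0.4, "u": 0.3},
    {"t": 0.5, "g": 0.5},
]

before = word_posteriors(cpr, dictionary)
after = word_posteriors(cpr, dictionary, evidence={0: "c"})
print(before)
print(after)
```

Before any evidence is applied, "cot" and "dog" are equally likely. Fixing the first character to "c" removes "dog" and raises "cot" to 0.4, the same qualitative effect that the "Query Results" table shows after hard evidence is applied to two ROIs.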
CONCLUSIONS
We have applied multispectral imaging techniques and character recognition methods based on a library of identified characters to digitized data from the Archimedes palimpsest in a unified image analysis and classification framework. The process presented in this article assisted scholars in their transcription of the palimpsest, which is one of the most important documents in the history of science. Among them, Reviel Netz, the principal scholar in translating the Archimedes text, has commented very positively on the value of the character recognition tool. In addition, the digital transcription workflow has been applied to a tenth century Hebrew colophon [7] and even outside the digital transcription of documents (for instance, it is currently being used to locate centroids of registration markers in three-dimensional MRI breast imaging).
ACKNOWLEDGMENTS
The authors thank the other members of the Archimedes palimpsest imaging team, Dr. Keith Knox of Boeing LTS, and Dr. William A. Christens-Barry of Equipoise Imaging, LLC. In addition, we thank the owner of the Archimedes Palimpsest, Dr. William Noel and Abigail Quandt of the Walters Art Museum in Baltimore, and Dr. Reviel Netz of Stanford University. Photographs of the Archimedes Palimpsest were produced by William A. Christens-Barry, Roger L. Easton, Jr., and Keith T. Knox.
AUTHORS
Derek J. Walvoord (djw6430@rit.edu) and Roger L. Easton, Jr. (rlepci@cis.rit-[...]) are with the Chester F. Carlson Center for Imaging Science at the Rochester Institute of Technology, Rochester, New York.
REFERENCES
[1] R. Netz and W. Noel, The Archimedes Codex. New York: Da Capo Press, 2007.
[2] R.L. Easton, Jr. and W. Noel, "[...] imaging of the Archimedes palimpsest," [...] Livre Médiéval, vol. 45, pp. 39-49, 2004.
[3] J.R. Schott, Remote Sensing: The Image Chain Approach. New York: Oxford Univ. Press, [...].
[4] B.V.K. Vijaya Kumar, A. Mahalanobis, and R.D. Juday, Correlation Pattern Recognition. New York: Cambridge Univ. Press, 2005.
[5] B.V.K. Vijaya Kumar and L. Hassebrook, "Performance measures for correlation filters," Applied Opt., vol. 20, no. 20, pp. 299[...].
[6] S.J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 2nd ed. Englewood Cliffs, NJ: Prentice Hall, 2002.
[7] D. Walvoord, R.L. Easton, Jr., [...] Heimbueger, "Enhancement and [...]tion of the erased colophon of [...] Hebrew prayer book," Proc. SPIE, [...] pp. 157-166.
[8] D.J. Walvoord, R.L. Easton, Jr., [...], "Adding contextual information to [...] character recognition on the Archimedes palimpsest," [...] SPIE, vol. 6500, p. 650008-1, 20[...].
[FIG5] (a) Example polytree network for Bayesian inference across six ROIs. (b) Illustration of the passing of contextual knowledge as evidence into the character recognition block. (Panels: ROI Selection; Correlation Pattern Recognition Results; Initialized Network Results; Evidence Applied; Query Results; Probable Words.)