Abstract—Named entity recognition is an important topic in the field of natural language processing, whereas in document image processing such recognition is quite challenging without employing any linguistic knowledge. In this paper we propose an approach to detect named entities (NEs) directly from offline handwritten unstructured document images, without explicit character/word recognition and with very little aid from natural language and script rules. At the preprocessing stage, the document image is binarized and the text is segmented into words; slant/skew/baseline corrections of the words are also performed. After preprocessing, the words are sent for NE recognition. We analyze the structural and positional characteristics of NEs and extract some relevant features from the word image. A BLSTM neural network is then used for NE recognition. Our system also contains a post-processing stage to reduce the true NE rejection rate. The proposed approach produces encouraging results on both historical and modern document images, including those from an Australian archive, which are reported here for the very first time.

Keywords: BLSTM neural network, Document image analysis, Dual layer bagging, Information retrieval, Named entity recognition.

1. Introduction

A document may contain text as well as non-text such as maps, drawings, photographic images, etc. In handwritten documents, some special types of non-text (e.g., doodles, struck-out words, annotations) may also appear. Sometimes a document is a mixture (hybrid) of printed and handwritten text, where artifacts from both groups may exist.

Important studies have been undertaken on layout analysis, printed/handwritten text separation and text/non-text segregation in documents. Moreover, good accuracy and efficiency have been achieved in preprocessing modules such as text-line identification and word/character segmentation, as well as in Optical Character Recognition (OCR) engines.

On degraded documents, where OCR engines do not work well, keyword spotting can play a remarkable role in identifying important words. It may also be necessary to find certain keywords, e.g., date fields, named entities, numeric values, currency and time information, in a document image. These are useful for document indexing/retrieval without reading the full text of a document image.

In this paper, we strive to identify word images that represent a Named Entity (NE). An NE usually refers to the name of something; such entities may be living or non-living, e.g., a person, place, company, organic/inorganic chemical compound, currency, time, or month. Named Entity Recognition (NER) [1] has been a popular topic in the fields of Natural Language Processing (NLP) and Information Retrieval (IR) for the last two decades, and a detailed survey of NER research on natural language text is reported in [2]. However, work on NE identification from document images is rare. One approach to NER from a document image was attempted by Zhu et al. [3], who considered semi-structured and unstructured printed documents of a special type, namely "automated expense reimbursement". In their method, a Conditional Random Field (CRF) framework with rich page-layout features was used. They also employed an OCR engine and recognized the NEs with assistance from the OCR output.

Without character/word recognition, NE detection from a document image is very difficult, because NLP-based knowledge can hardly be used in such a situation. However, such detection is essential where linguistic knowledge cannot be applied due to the poor performance of handwritten text recognition engines. In this paper, we attempt to fill this gap by proposing an approach to NE recognition that does not perform OCR on the document image. We extract features based on the structural and positional characteristics of NEs and feed them to a BLSTM neural network for NE recognition. Our method can process unstructured, offline handwritten historical and contemporary English documents. We have not found any published work on NE recognition directly from document images that is similar to the one described here.

The present problem is completely different from the classical keyword spotting [4] problem. In keyword spotting, a template (of a keyword) is given and its matches are to be found in a document image. In our proposed work, no template is provided at all; we use only structural and positional characteristics of an NE for its identification. The rest of the paper is organized as follows. Section 2 describes our proposed method in detail. The results and evaluation of the proposed approach are discussed in Section 3.
middle and last positioned words are generally NEs when the first character is a capital letter. Some exceptions exist, which we handle during post-processing.

Now, the difficulty arises with first-position NEs, since any word at the first position of a sentence starts with a capital letter. To know how frequently such NEs occur, we performed a positional occurrence analysis of NEs in English sentences. For this purpose, we took 300 articles (100 from newspapers, 100 from story books and 100 from online sources). Among these, 10 articles on various topics were chosen from each of 10 English newspapers having the highest circulation in 10 countries. Also, 10 pages were selected from each of 10 different English story books and novels by distinct writers from various countries. Moreover, 100 online articles on dissimilar topics were chosen from different websites (e.g., various blogs, Wikipedia pages). After processing these 300 articles with the Stanford Named Entity Recognizer [1], we found that per article at most 16.66% (minimum: 0%) of NEs reside in the first position of a sentence.

Taking guidance from the SMART project [10], we categorized four special classes of frequently occurring words (in English literature) with respect to their structures and positions in a sentence. These classes are described below.

i) Wh-word class: This class contains Wh-question words starting with the characters "Wh", generally positioned at the beginning of a sentence, e.g., Why, Who, Whom, What, When, Where, Which, Whose.

2.5. Feature Extraction

The features used in our method of detecting NE words are described below. All features are normalized to the range [0, 1].

2.5.1. Object Pixel Distribution.

These features are based on the distribution of foreground object (ink) pixels throughout the word image. We consider a vertical sliding window of size h × 1, where 'h' is the height of the word bounding-box. This window is moved from left to right over a word. Then we calculate the weight (f1), center of gravity (f2) and 2nd-order moment (f3) of the window as in [9]. Sometimes f1 is called the vertical projection profile. For example,

    f3(c) = (1/h^2) · Σ_{r=1}^{h} r^2 · p(r, c)    (3)

Here p(r, c) is the binary image of I(r, c) obtained in Section 2.1; p(r, c) takes the value 0 or 1, where 1 denotes a foreground pixel.

Additionally, we employ a horizontal sliding window of size 1 × w, where 'w' is the width of the word bounding-box, and calculate the features f4, f5 and f6 in an analogous manner.
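As a concrete illustration, the column-wise features f1, f2 and f3 described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: f3 follows Eq. (3) directly, while the normalizations used for f1 and f2 are assumptions chosen by analogy.

```python
import numpy as np

def vertical_window_features(p):
    """Column-wise object-pixel-distribution features of a binary
    word image p (shape h x w, 1 = foreground ink pixel).

    f1: weight of each h x 1 window (the vertical projection profile).
    f2: center of gravity of each window (normalization is an assumption).
    f3: 2nd-order moment of each window, following Eq. (3).
    """
    h, _w = p.shape
    r = np.arange(1, h + 1).reshape(-1, 1)   # row indices r = 1..h
    f1 = p.sum(axis=0) / h                   # fraction of ink per column
    f2 = (r * p).sum(axis=0) / h**2          # assumed 1/h^2 normalization
    f3 = (r**2 * p).sum(axis=0) / h**2       # Eq. (3): (1/h^2) sum r^2 p(r,c)
    return f1, f2, f3
```

Moving the h × 1 window left to right then simply corresponds to reading these arrays column by column.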
black-white transition (f9) [9]. Additionally, we compute horizontal black-white transitions (f10) of a word.

Furthermore, a vertical run-length histogram (f11) and a horizontal run-length histogram (f12) are obtained. The run-length is calculated as the sum of consecutive object pixels in the vertical/horizontal direction.

2.5.3. Special Weighted Features.

In addition to the above twelve features, two weighted features, described below, are employed.

On the upper projection profile of a word image, we count the transitions between peaks and valleys. Since an NE generally starts with a capital letter, we put some weight on the peak-valley transition count. Let n1, n2 and n3 be the counts of peak-valley transitions on the left 1/3rd, middle 1/3rd and right 1/3rd portions (in the x-direction), respectively, of the upper projection profile. This may be used as a feature (f13):

    f13 = k1 · n1 + k2 · n2 + k3 · n3    (7)

Here, k1, k2 and k3 are weight factors. By testing on a training dataset, we obtained optimum results with k1 = 0.8, k2 = 0.1 and k3 = 0.1.

On the horizontal projection profile (f4), we mark the upper, middle and lower zones, as described in Section 2.4.1. Since capital letters normally reside in the upper and middle zones, we apply some weights to this profile feature and obtain f14:

    f14 = k4 · Σ_{c=1}^{w/3} p(r, c) + k5 · Σ_{c>w/3}^{2w/3} p(r, c) + k6 · Σ_{c>2w/3}^{w} p(r, c)    (8)

Here, 'w' is the width of a word. The weight factors k4 = 0.6, k5 = 0.3 and k6 = 0.1 perform well for this problem.

2.6. Neural Network-based Classification

Among a few standard classifiers, we have found that a Neural Network (NN) yields the highest accuracy for our problem. A detailed survey on NN-based classification is reported in [12]. Of the various NN classifiers, we use the "Bidirectional Long Short-Term Memory" (BLSTM) neural network, since BLSTM-NNs have shown promising results in other handwriting analysis/recognition problems [13]. Also, as shown in [14], a BLSTM-NN is time-efficient in comparison with Dynamic Time Warping (DTW) for shape matching.

In our approach, we employ the above 14 features. To feed them into the BLSTM-NNs, the features are categorized into 3 types: (i) the vertical feature set fV: {f1, f2, f3, f7, f8, f9, f11}, (ii) the horizontal feature set fH: {f4, f5, f6, f10, f12}, and (iii) the special feature set fS: {f13, f14}. With these 3 sets of features (fV, fH and fS), we employ 3 different BLSTM-NNs (B1, B2 and B3, respectively). The input layer of a BLSTM-NN has d nodes, where 'd' is the number of features; so B1, B2 and B3 have 7, 5 and 2 input nodes, respectively. The input nodes are connected to 2 distinct recurrent hidden layers, one for the forward and another for the backward sequence. The output layer is also connected to both hidden layers. We primarily want to distinguish NE from non-NE words. Four add-on special classes (Wh-word, Th-word, Small-width and All-caps: described in Section 2.4) are also considered, so the total number of classes is K = 2 + 4 = 6. At the output layer, one extra node is added for a non-word class covering, e.g., noisy components and punctuation marks. Thus, the output layer (for each of B1, B2 and B3) contains K + 1 = 7 nodes. Details of the BLSTM-NN working principles can be found in [13]. The outputs of B1, B2 and B3 are joined with a fusion scheme; combining classifiers in this way, called a bagging strategy, is applied to decrease the error rate. The setup of the BLSTM-NN classifier for our problem is described in Section 3.2.

2.7. Post-processing

After classification, we check the positional occurrence (as described in Section 2.4.2) of the marked NEs. Some non-NEs, including first-position words, may be misclassified as NEs. We check against the special classes (Wh-word, Th-word, Small-width and All-caps: see Section 2.4) to separate out such non-NEs. In this way, we attempt to reduce the false acceptance of first-position non-NE words. We focus on minimizing the genuine NE rejection rate, sacrificing some accuracy: it is more likely that we will find a subset of words that almost certainly contains the NEs. So, we mark some words in this subset as potential NEs with a degree of confidence [15].

In certain writing, we note that NEs with many occurrences are indicative of the writing topic and thus assist in context analysis. So, we also compute the NE-word frequency count using the above BLSTM-NNs to extend our work toward context analysis. Here we employ the strategy of [4], but use the features presented in Section 2.5.

3. Experimental Results and Discussions

Before discussing the experimental results, we describe the datasets used to evaluate our scheme.

3.1. Dataset Employed

For our experiments, we used three offline handwritten datasets: i) the George Washington database (GWdb) [16], ii) the Queensland State Archives database (QSAdb) [17], and iii) the IAM database (IAMdb) [18].

GWdb and QSAdb are historical text databases. The GWdb contains 20 handwritten pages of G. Washington. We took 66 pages from the QSAdb containing handwritten text only. Finally, the IAMdb contains modern data, with 1539 handwritten pages written by 657 writers.
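Looking back at the special weighted feature f13 of Eq. (7) in Section 2.5.3, the computation can be sketched in Python as below. This is a hedged reconstruction: the paper does not spell out exactly how a peak-valley "transition" is counted, so treating it as the number of direction changes in the profile is an assumption of this sketch; the weights are the values reported with Eq. (7).

```python
import numpy as np

K1, K2, K3 = 0.8, 0.1, 0.1   # weight factors reported with Eq. (7)

def peak_valley_transitions(profile):
    """Count direction changes (peak/valley alternations) in a 1-D
    profile; this counting rule is an assumption of the sketch."""
    d = np.sign(np.diff(profile))
    d = d[d != 0]                        # ignore flat segments
    return int(np.sum(d[1:] != d[:-1]))  # each sign change = one extremum

def f13(upper_profile):
    """Eq. (7): weighted peak-valley counts on the left, middle and
    right thirds (in the x-direction) of the upper projection profile."""
    w = len(upper_profile)
    n1 = peak_valley_transitions(upper_profile[: w // 3])
    n2 = peak_valley_transitions(upper_profile[w // 3 : 2 * w // 3])
    n3 = peak_valley_transitions(upper_profile[2 * w // 3 :])
    return K1 * n1 + K2 * n2 + K3 * n3
```

The heavy weight on the left third reflects the observation that NEs tend to start with a capital letter.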
For ground-truth generation of the NEs on such pages, a semi-automatic approach with human intervention was employed. In addition, the manuscripts having publicly available transcripts were fed into the Stanford Named Entity Recognizer [1] to generate the ground truth.
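The transcript-based part of this ground-truth step can be sketched as follows: given character-level entity spans produced by an NER tagger on a page's transcript, each transcript word is labeled NE or non-NE. The span format and the simple whitespace alignment are assumptions of this sketch, not details taken from the paper.

```python
def words_to_labels(transcript_words, entity_spans):
    """Label each transcript word 1 (NE) or 0 (non-NE), given
    character-level (start, end) entity spans from an NER tagger.
    Assumes words are separated by single spaces in the transcript;
    the span format is an assumption of this sketch."""
    labels, pos = [], 0
    for word in transcript_words:
        start, end = pos, pos + len(word)
        overlaps = any(s < end and start < e for s, e in entity_spans)
        labels.append(1 if overlaps else 0)
        pos = end + 1        # skip the separating space
    return labels
```

The resulting word-level labels can then be checked and corrected manually, matching the semi-automatic procedure described above.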
3.2. Results and Evaluation
a document image) only, not its machine-readable output with the aid of any name-dictionary. This will also be the scope of our future work.

References

[1] J. R. Finkel, T. Grenager and C. Manning, "Incorporating non-local information into information extraction systems by Gibbs sampling", Proc. Annual Meeting of the Association for Computational Linguistics (ACL), pp.363-370, 2005.

[2] D. Nadeau and S. Sekine, "A survey of named entity recognition and classification", Lingvisticae Investigationes, vol.30, no.1, John Benjamins Pub. Co., pp.3-26, 2007.

[3] G. Zhu, T. J. Bethea and V. Krishna, "Extracting relevant named entities for automated expense reimbursement", Proc. ACM Conf. on Knowledge Discovery and Data Mining (KDD), pp.1004-1012, 2007.

[4] V. Frinken, A. Fischer, H. Bunke and R. Manmatha, "Adapting BLSTM neural network based keyword spotting trained on modern data to historical documents", Proc. Int. Conf. on Frontiers in Handwriting Recognition (ICFHR), pp.352-357, 2010.

[5] K. Ntirogiannis, B. Gatos and I. Pratikakis, "A combined approach for the binarization of handwritten document images", Pattern Recognition Letters, vol.35, pp.3-15, 2014.

[6] N. Otsu, "A threshold selection method from gray-level histograms", IEEE Trans. on Systems, Man and Cybernetics, vol.9, no.1, pp.62-66, 1979.

[7] E. Kavallieratou, N. Fakotakis and G. K. Kokkinakis, "Slant estimation algorithm for OCR systems", Pattern Recognition, vol.34, no.12, pp.2515-2522, 2001.

[8] E. Wigner, "On the quantum correction for thermodynamic equilibrium", Physical Review, vol.40, pp.749-759, 1932.

[9] U.-V. Marti and H. Bunke, "Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system", Hidden Markov Models: Applications in Computer Vision, World Scientific, ISBN: 981-02-4564-5, pp.65-90, 2002.

[10] G. Salton, "The SMART retrieval system: experiments in automatic document processing", Prentice-Hall Inc., 1971.

[11] T. M. Rath and R. Manmatha, "Word image matching using dynamic time warping", Proc. Computer Vision and Pattern Recognition (CVPR), vol.2, pp.521-527, 2003.

[12] G. P. Zhang, "Neural networks for classification: a survey", IEEE Trans. on SMC, Part C: Applications and Reviews, vol.30, no.4, pp.451-462, 2000.

[13] A. Graves, M. Liwicki, S. Fernandez, R. Bertolami, H. Bunke and J. Schmidhuber, "A novel connectionist system for unconstrained handwriting recognition", IEEE Trans. on PAMI, vol.31, no.5, pp.855-868, 2009.

[14] R. Jain, V. Frinken, C. V. Jawahar and R. Manmatha, "BLSTM neural network based word retrieval for Hindi documents", Proc. Int. Conf. on Document Analysis and Recognition (ICDAR), pp.83-87, 2011.

[15] H. Zaragoza and F. d'Alché-Buc, "Confidence measures for neural network classifiers", IPMU, vol.1, pp.886-893, 1998.

[16] "George Washington Papers", The Library of Congress, USA. Web: http://memory.loc.gov/ammem/gwhtml/gwhome.html

[17] "Queensland State Archives", Australia-4113. Online: http://www.archivessearch.qld.gov.au/Search/BasicSearch.aspx

[18] U.-V. Marti and H. Bunke, "The IAM-database: an English sentence database for off-line handwriting recognition", Int. J. on Document Analysis and Recognition, vol.5, pp.39-46, 2002.

[19] J. Kittler, M. Hatef, R. P. W. Duin and J. Matas, "On combining classifiers", IEEE Trans. on PAMI, vol.20, no.3, pp.226-239, 1998.