Model
∗
Anukriti Bansal Gaurav Harit Sumantra Dutta Roy
IIT Delhi IIT Jodhpur IIT Delhi
anukriti1107@gmail.com gharit@iitj.ac.in sumantra@ee.iitd.ac.in
get labeled as table trailer and others get misclassified as table-cells. This possibly happened because our dataset has few tables with multiple rows in the table trailer. However, this will not have much effect on accurate localization of tables using header and trailer regions. Table-cells and non-table blocks are labeled with high accuracy due to their strong appearance and contextual features.

We compared our results with CRF for labeling document entities and extracting tables. For CRF, we used the UGM toolbox [22] with loopy belief propagation for inference. Table 2 shows the quantitative comparison of the performance of both classifiers on our dataset. In Figure 5 we analyze the labeling results of CRF and the fixed point model on a document page from our dataset. CRF was not able to distinguish between table header, table trailer and table cell. A possible reason is that CRF, being a probability-based classifier, learns the context for table rows (which are more numerous) better than that for header or trailer.

Figure 3 shows the results of our approach on a few images picked from the test set. Single page tables, tables spanning multiple columns and tables confined to single columns are labeled correctly. Different feature combinations have proved effective for different layouts. For example, in Figure 3(a) the font color of the text region and the thickness of horizontal separators around text provide a strong clue for the detection of table header and trailer regions. In the first table, a few rows have the same pattern as that of the table trailer, but the thickness of the horizontal separator helps in determining the correct labels. Similarly, in Figure 3(e), the presence of the ruling line between table rows differentiates the table-header blocks from the rest of the table and provides correct labeling to multi-row header blocks. Our approach gives correct labeling for tables present in different page layouts. Thus it is evident that the proposed method learns the layout effectively and can be used on images with complex layouts.

5. CONCLUSION
This paper presents a novel learning-based framework for the problem of layout analysis and table detection in document images with complex layouts. The method contributes an efficient way of analyzing and labeling different document elements, which in turn defines the table boundary. A combination of foreground and background features is extracted and used with the fixed point model for learning the context information. This helps in learning the document layout and labeling the different regions. The experimental results are promising and our approach works well on a heterogeneous collection of documents. Unlike other existing techniques, the method is general and does not rely on heuristic rules
Figure 4: Results with incorrect labeling of table trailer blocks: (a) Original Image (b) Ground truth image with correct
labeling, (c) Results of our approach which misclassified table-trailer blocks as table cell.
Figure 5: Comparison of Fixed point model and CRF: (a) Original Image (b) Labeling results of Fixed point model, (c)
Labeling results of CRF
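As a rough illustration of the fixed-point idea behind the model compared in Figure 5, the sketch below labels a sequence of page blocks by repeatedly re-estimating each block's label distribution from its own appearance scores and the current label distributions of its neighbours, until the estimates stop changing. It is a toy under stated assumptions, not the authors' implementation: the function `fixed_point_labels`, the hand-set appearance scores, the `coupling` weight and the simple neighbour-averaging context function are all hypothetical, whereas the fixed point model of [18] learns its contextual prediction function from training data.

```python
import math

LABELS = ["non-table", "table-header", "table-cell", "table-trailer"]


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fixed_point_labels(appearance, coupling=0.5, tol=1e-6, max_iter=100):
    """Iterate q_i = softmax(appearance_i + coupling * mean(q_neighbours)).

    appearance: one list of per-label scores for each block, in reading order
                (hypothetical hand-set values, standing in for learned features).
    Returns (labels, iterations_used).
    """
    n = len(appearance)
    k = len(LABELS)
    q = [softmax(a) for a in appearance]  # start from appearance-only beliefs
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_q = []
        for i in range(n):
            # Context: average label distribution of the previous and next block.
            neighbours = [q[j] for j in (i - 1, i + 1) if 0 <= j < n]
            if neighbours:
                context = [sum(nb[c] for nb in neighbours) / len(neighbours)
                           for c in range(k)]
            else:
                context = [0.0] * k
            new_q.append(softmax([a + coupling * c
                                  for a, c in zip(appearance[i], context)]))
        change = max(abs(x - y) for old, new in zip(q, new_q)
                     for x, y in zip(old, new))
        q = new_q
        if change < tol:  # beliefs no longer move: a fixed point is reached
            break
    labels = [LABELS[max(range(k), key=dist.__getitem__)] for dist in q]
    return labels, iteration


# Toy page: the fourth block is ambiguous by appearance alone.
appearance = [
    [3.0, 0.0, 0.0, 0.0],  # clearly non-table
    [0.0, 3.0, 0.0, 0.0],  # clearly table-header
    [0.0, 0.0, 3.0, 0.0],  # clearly table-cell
    [0.0, 0.0, 0.1, 0.2],  # ambiguous, leans slightly towards table-trailer
    [0.0, 0.0, 3.0, 0.0],  # clearly table-cell
    [0.0, 0.0, 0.0, 3.0],  # clearly table-trailer
]
labels, iterations = fixed_point_labels(appearance)
print(labels, iterations)
```

With `coupling=0.0` the ambiguous block keeps its appearance-only label; with the neighbour context switched on it takes the label of the table cells on either side. The small coupling weight keeps the update a contraction, which is why the iteration settles quickly in this sketch.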
such as the presence of horizontal and vertical ruling lines and their intersection. It provides an alternative to most of the present rule-based and learning-based systems. While the method correctly locates tables placed one above another, it merges tables present side by side. Future work includes a comprehensive analysis of the applicability of the proposed approach to other types of images and the use of additional clues to handle tables present side by side. We shall also explore other context-handling approaches like Recurrent Neural Networks (RNNs) for this task.

Acknowledgements
We are grateful to Prof. Santanu Chaudhury and Prof. J. B. Srivastava for introducing us to the fixed point model and its applications.

6. REFERENCES
[1] A. Bansal, S. Chaudhury, S. Dutta Roy, and J. B. Srivastava. Newspaper article extraction using hierarchical fixed point model. In Proc. IAPR International Workshop on Document Analysis Systems, DAS ’14, pages 257–261, 2014.
[2] D. S. Bloomberg. Multiresolution morphological approach to document image analysis. In Proceedings of the 1991 International Conference on Document Analysis and Recognition, ICDAR ’91, 1991.
[3] F. Cesari, S. Marinai, L. Sarti, and G. Soda. Trainable table location in document images. In Proceedings of the 16th International Conference on Pattern Recognition (ICPR ’02) Volume 3, 2002.
[4] B. B. Chaudhuri and U. Garain. Automatic detection of italic, bold and all-capital words in document images. In Proceedings of the 14th International Conference on Pattern Recognition - Volume 1, ICPR ’98, 1998.
[5] J. Chen and D. Lopresti. Table detection in noisy
off-line handwritten documents. In Document Analysis and Recognition (ICDAR), 2011 International Conference on, pages 399–403, 2011.
[6] A. C. e Silva. Learning rich hidden Markov models in document analysis: Table location. In ICDAR, pages 843–847. IEEE Computer Society, 2009.
[7] D. W. Embley, M. Hurst, D. P. Lopresti, and G. Nagy. Table-processing paradigms: a research survey. IJDAR, 8(2):66–86, 2006.
[8] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. Liblinear: A library for large linear classification. J. Mach. Learn. Res., 9:1871–1874, June 2008.
[9] J. Fang, P. Mitra, Z. Tang, and C. L. Giles. Table header detection and classification. In AAAI, 2012.
[10] S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6:721–741, 1984.
[11] J. C. Handley. Electronic Imaging Technology. Document Recognition, SPIE, 1999.
[12] G. Harit and A. Bansal. Table detection in document images using header and trailer patterns. In Proceedings of the Eighth Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP ’12, pages 62:1–62:8, 2012.
[13] J. Hu, R. S. Kashi, D. P. Lopresti, and G. Wilfong. Medium-independent table detection. In Proc. SPIE, volume 3967, pages 291–302, 1999.
[14] T. Kasar, P. Barlas, S. Adam, C. Chatelain, and T. Paquet. Learning to detect tables in scanned document images using line information. In ICDAR, pages 1185–1189, 2013.
[15] D. Keysers, F. Shafait, and T. M. Breuel. Document image zone classification - a simple high-performance approach. In 2nd Int. Conf. on Computer Vision Theory and Applications, pages 44–51, 2007.
[16] T. Kieninger. Table structure recognition based on robust block segmentation. In Document Recognition V, SPIE, pages 22–32, 1998.
[17] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of the Eighteenth International Conference on Machine Learning, ICML ’01, pages 282–289, 2001.
[18] Q. Li, J. Wang, Z. Tu, and D. P. Wipf. Fixed-point model for structured labeling. In Proc. of the 30th International Conference on Machine Learning (ICML-13), volume 28:1, pages 214–221, 2013.
[19] Y. Liu. Tableseer: Automatic Table Extraction and Search and Understanding. Ph.D. Thesis, The Pennsylvania State University, 2009.
[20] D. P. Lopresti and G. Nagy. A tabular survey of automated table processing. In Selected Papers from the Third International Workshop on Graphics Recognition, Recent Advances, GREC ’99, pages 93–120, 2000.
[21] S. Mandal, S. P. Chowdhury, A. K. Das, and B. Chanda. A simple and effective table detection system from document images. IJDAR, 8:172–182, 2006.
[22] S. Mark. UGM: Matlab code for undirected graphical models, 2011. http://www.cs.ubc.ca/~schmidtm/Software/UGM.html.
[23] G. Nagy. Twenty years of document image analysis in PAMI. IEEE Trans. Pattern Anal. Mach. Intell., 22:38–62, 2000.
[24] D. Pinto, A. McCallum, X. Wei, and W. B. Croft. Table extraction using conditional random fields. In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’03, pages 235–242, 2003.
[25] J.-Y. Ramel, M. Crucianu, N. Vincent, and C. Faure. Detection, extraction and representation of tables. In Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1, ICDAR ’03, 2003.
[26] F. Shafait and R. Smith. Table detection in heterogeneous documents. In Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, DAS ’10, pages 65–72, 2010.
[27] A. Shahab, F. Shafait, T. Kieninger, and A. Dengel. An open approach towards the benchmarking of table structure recognition systems. In Proceedings of the 9th IAPR International Workshop on Document Analysis Systems, DAS ’10, pages 113–120, 2010.
[28] A. C. Silva, A. M. Jorge, and L. Torgo. Design of an end-to-end method to extract information from tables. International Journal Document Analysis Research, 8:144–171, 2006.
[29] R. W. Smith. Hybrid page layout analysis via tab-stop detection. In Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, ICDAR ’09, pages 241–245, 2009.
[30] W. Tersteegen and C. Wenzel. Scantab: Table recognition by reference tables. In Proceedings of Document Analysis Systems (DAS ’98), 1998.
[31] Y. Wang, R. Haralick, and I. T. Phillips. Improvement of zone content classification by using background analysis. In Fourth IAPR International Workshop on Document Analysis Systems (DAS 2000), Rio de Janeiro, pages 10–13, 2000.
[32] Y. Wang, R. M. Haralick, and I. T. Phillips. Automatic table ground truth generation and a background-analysis-based table structure extraction method. In ICDAR, pages 528–532, 2001.
[33] Y. Wang and J. Hu. A machine learning based approach for table detection on the web. In Proceedings of the 11th International Conference on World Wide Web, WWW ’02, pages 242–250, 2002.
[34] Y. Wang, I. T. Phillips, and R. M. Haralick. Table detection via probability optimization. In Proceedings of Document Analysis Systems (DAS ’02), pages 272–282. Springer-Verlag, 2002.
[35] Y. Wang, I. T. Phillips, and R. M. Haralick. Document zone content classification and its performance evaluation. Pattern Recogn., 39(1):57–73, Jan. 2006.
[36] R. Zanibbi, D. Blostein, and R. Cordy. A survey of table recognition: Models, observations, transformations, and inferences. Int. J. Doc. Anal. Recognit., 7:1–16, 2004.