
International Journal of Advanced Computer Science, Vol. 2, No. 2, Pp. 79-84, Feb. 2012.

An Improved Processing Technique With Image Mining Method For Classification of Textual Images Using Low-Level Image Features
Rahul Jagtap
Abstract This paper proposes a system that classifies the text encountered in textual images using low-level image features. Image mining concerns the extraction of image data relationships and other patterns that are not explicitly stored in the images, and image classification is a large and growing field within image processing. Image classification is useful in Content-Based Image Retrieval (CBIR). In this paper a novel unsupervised method is proposed for image classification based on the distribution of various features of textual images. From these features, differences between images can be computed and used to classify a textual image into one of three types: Document image, Caption Text image or Scene Text image. Based on low-level image features such as mean, skewness, energy, contrast and homogeneity, the proposed method can classify various textual images. The paper also contributes methods for Web-based image mining and management systems using content-based image retrieval (CBIR). The proposed method was experimented on 60 different images.

Manuscript received: 30 Jun. 2011; revised: 29 Nov. 2011; accepted: 29 Jan. 2012; published: 15 Mar. 2012.

Keywords
Image classification, Histogram features, GLCM features

1. Introduction

Advances in multimedia technologies such as image digitization, storage and transmission, along with the growth of the World Wide Web, mobile devices and cameras, have led to a proliferation of online digital images. Content-based image classification has been an interesting subject for many researchers in recent years, and there have been great efforts in developing classification approaches and techniques to improve classification accuracy [1]. Recently, many advanced classification approaches, such as artificial neural networks, fuzzy sets and expert systems, have been widely applied to the problem of image classification. Classification is an important form of knowledge extraction and can help make key decisions [2]. Textual images are those which contain text in the image. These textual images can be of three types: 1) Document images, i.e. images of a document; the document may also contain a picture, but the image text is not embedded in the picture. 2) Caption Text images, i.e. a caption is manually put on the image, so the text is not an intrinsic part of the image, e.g. subtitles in sports videos. These images carry important information such as the names of players or the score, and give more description of the picture. 3) Scene Text images, i.e. the text is an intrinsic part of the image, e.g. photographs of the boards of institutions and shops, or boards showing traffic signs, which provide important information such as the name of the institution or shop. It is important to classify these images so that the most relevant images are considered for content-based image retrieval.

In this paper, I classify textual images into these three categories based on their various low-level features. Features like mean, skewness, energy, contrast and homogeneity are used for classification. First I check whether the image is a document image or a non-document image, based on the features mean and skewness. If the image is a non-document image, it is either a Caption Text image or a Scene Text image; to differentiate between these two types, the features energy, contrast and homogeneity are used. I have used the WEKA J48 decision tree classifier for classification, trained and tested on about 60 images.

With the impact of the information superhighway, image repositories are evolving in a decentralized fashion on the Internet. Although more images are available and could be important and useful to many applications, it is difficult to differentiate relevant from irrelevant images in this unbounded image source. Similarly, managing these images effectively is another important research area in image database management systems. A trend of recent research in CBIR is a focus on object-oriented techniques. These new techniques extract portions of the images and encapsulate the low-level features of these portions in data objects. Based on these objects, images can then be classified and categorized in new ways that are not possible in existing commercial image database systems.


2. Related work
Most image mining work concentrates on images from the medical field or on astronomical images, but in my work I have considered different kinds of images. Other methods also rely on high-level features, which lead to complex calculations and longer processing times; in my work I have considered low-level image features, which are easy to calculate. Before this, very few efforts were made to classify images based on low-level features [3]. S. Chitrakala, P. Shamini and D. Manjula classified textual images using eight low-level features: mean, skewness, variance, energy, entropy, correlation, homogeneity and contrast [3]. My experimental results have shown that only five features, i.e. mean, skewness, variance, energy and contrast, are sufficient to classify the various textual images, which improves the time efficiency of classification.

3. Proposed System

First, I studied the literature to find existing methods for image classification. Several approaches have already been proposed to analyze and classify pictures, for example using support-vector machines (SVM) on image histograms or hidden Markov models on multiresolution features [4]. One approach combines multiple classifiers by a new method called meta-classification, which makes the final decision by re-classifying the result each classifier returns. These existing approaches have not focused on images with incorporated text, and this paper aims to classify textual images, i.e. images with incorporated text in various forms. I classify three types of images, Scene Text, Caption Text and Document images, depending on various low-level image features. The advantage of using low-level features is that they can be extracted directly from the image without the image undergoing any transformation. The two main parts of my method are:

A. Image Processing: Preprocessing of the image consists of color-to-grey conversion and slicing of grey levels. In the first phase, I calculate histogram features of the image (mean, variance, skewness), and these values are used for training and testing the J48 decision tree classifier. In the second phase, I slice the grey values into 8 levels, form a GLCM matrix over a window of the sliced image, calculate GLCM features such as contrast and energy, and again use these values for training and testing the J48 decision tree classifier.

The major steps in my method are:
1. Convert the given color image into grey scale.
2. Calculate the histogram features mean and skewness.
3. Depending on these values, decide whether the image is a document image or not.
4. If the image is a non-document image, slice it into 8 color levels and prepare the GLCM matrix for the image.
5. Calculate the energy and contrast values.
6. Depending on those values, decide whether the image is a Caption Text image or a Scene Text image.

Some of the images I have used:

Fig. 1 Caption Text Images

Fig. 2 Document Images

Fig. 3 Scene Text Images
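The six steps above can be sketched as a small pipeline. This is an illustrative sketch, not the paper's implementation: the function names and the fixed thresholds are my own assumptions (the paper instead learns the decision boundaries with a WEKA J48 tree), and only the first level of the classification is shown in full.

```python
import numpy as np

def to_grayscale(rgb):
    """Step 1: convert an HxWx3 color image to grey scale
    using the weights given later in the paper."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.56 * r + 0.33 * g + 0.11 * b

def histogram_features(gray):
    """Step 2: mean and skewness of the grey-level distribution."""
    mean = gray.mean()
    std = gray.std()
    skew = ((gray - mean) ** 3).mean() / (std ** 3 + 1e-12)
    return mean, skew

def classify(rgb, mean_thresh=175.0):
    """Steps 3-6, with step 3 written out and steps 4-6 stubbed."""
    gray = to_grayscale(rgb)
    mean, skew = histogram_features(gray)
    # Step 3: document images have a high mean (white background)
    # and negative skewness; the threshold here is a placeholder.
    if mean > mean_thresh and skew < 0:
        return "Document"
    # Steps 4-6 would quantize to 8 levels, build a GLCM, and
    # compare energy/contrast to separate Caption vs Scene text.
    return "Caption or Scene Text"
```

Under these assumptions a mostly-white image with a few dark (text) pixels is labelled a document, while a mid-grey image falls through to the second level.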

International Journal Publishers Group (IJPG)


Fig. 4 Two Level Classifier

B. Training and Testing with WEKA

WEKA is a multipurpose data mining tool; I use it here for classification. Input is given in the ARFF file format. I prepare the training file from the image-processing output and obtain a trained J48 decision tree classifier; I then prepare a testing file and test it against the training set, obtaining the actual class and the class predicted by the J48 decision tree classifier. The steps for training and testing the decision tree classifier in WEKA are: create an instance to read the .arff file for training; create an object for the J48 classifier; build the classifier using the training instances; create an instance to read the .arff file for testing.

4. System Description

Pre-processing

Our input image will mostly be a color image, and with a color image it is difficult to calculate the various GLCM features, so it is better to convert it into a grey scale image. As preprocessing, I therefore convert the given color image into grey scale: I take the values of the red, green and blue components of each pixel and combine them with the following weights to get the grey scale value for each pixel.

Gray scale value = 0.56*red + 0.33*green + 0.11*blue

5. Check for Document Image

As a Document image can easily be distinguished from the other kinds of images, i.e. from Scene Text and Caption Text images, I first check whether the image is a Document image; if it is, my work is over and there is no need to go further. For a document image the mean value is very high, because almost all document images have a white background: since the majority of pixel values are high, their mean is also high. The skewness value is also negative for document images. The formulae for mean, variance and skewness are as follows:

M1 (mean) = Σx x·P(x)
M2 (variance) = Σx (x − M1)²·P(x)
M3 (skewness) = Σx (x − M1)³·P(x) / (√M2)³

where x is the grey level associated with a pixel and P(x) is the probability of occurrence of a pixel with intensity x.

Another feature which might also help is variance, since variance lies between mean and skewness; but most of the time the variance values were not found to be useful for classification. Thus, by considering the mean and skewness of the grey scale image, I decide whether the image is a Document image or not. If the image is not a Document image, it must be either a Scene Text image or a Caption Text image, and I go further to decide its actual type.

6. Check for Scene Text Image or Caption Text Image

The GLCM features I use for this classification are energy and contrast, because these two features are the most useful. In Scene Text images there is very high contrast between the text and its background, as can be seen in the following image.

Fig. 5 Scene Text Image
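The histogram features mean, variance and skewness can be transcribed into code directly from their standard histogram-moment definitions. This is a sketch with variable names of my own choosing; `P[x]` is the probability of occurrence of a pixel with intensity `x`, as in the text.

```python
import numpy as np

def histogram_moments(gray):
    """Mean, variance and skewness of a grey-level histogram.

    gray: 2-D array of integer grey levels in [0, 255].
    """
    hist = np.bincount(gray.ravel(), minlength=256)
    P = hist / hist.sum()          # P[x]: probability of intensity x
    x = np.arange(256)
    m1 = (x * P).sum()                          # mean
    m2 = (((x - m1) ** 2) * P).sum()            # variance
    sigma = np.sqrt(m2)
    m3 = (((x - m1) ** 3) * P).sum() / (sigma ** 3 + 1e-12)  # skewness
    return m1, m2, m3
```

A mostly-white page with a small number of dark text pixels gives a high mean and a negative skewness, which is the signature used here for document images.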


Fig. 6 Caption Text Image

Fig. 7 Decision Tree Classifier

Due to the above fact, the contrast of a Scene Text image differs from that of a Caption Text image, and the same holds for energy: in a Caption Text image there is no such contrast between the text and its background. Other differences are that in a Caption Text image the edges of the text are sharp and the boundaries rigid, whereas in a Scene Text image the edges of the text are less sharp and the boundaries less rigid. Also, in Caption Text images the text is in a standard font, which is not the case for Scene Text images. The formulas for the above features are as follows:

Energy = Σi Σj P(i,j)²

Contrast = Σi Σj (i − j)²·P(i,j)

where P(i,j) is the normalized grey-level co-occurrence matrix.

7. Decision Tree Classifier

Decision Tree Classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems and speech recognition, to name only a few. Perhaps their most important feature is the capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret [5]. In this paper I used the WEKA J48 decision tree classifier. Each phase has two parts, training and testing. In the first phase, out of the 60 images I have, 45 were used for training and the remaining 15 for testing; I calculated the mean, skewness and variance for each of these images, and these values were kept in two different files in the .arff format, which were fed to the WEKA J48 decision tree classifier for training and testing. In the second phase, out of 40 images, 30 were used for training and the remaining 10 for testing; I calculated the energy, contrast, homogeneity and entropy for each of these images, and these values were kept in another two .arff files used for training and testing the second phase.

8. Use of Content-Based Image Retrieval

Existing image engines allow users to search for images via a keyword interface or via query by image example [4-11]. Most are based on visual similarity measures between a reference image and a test image. Nevertheless, most CBIR engines allow the user to form a query only by keywords. In the case of a web search engine, the image index is built with keywords extracted heuristically from the HTML documents containing each image and/or from the image URL; by asking too precise a query with many keywords, the user may narrow the scope of accurate images. Each class Ck of REF (the reference set) is represented by the textual vector T(Ck) averaging the textual vectors of the images of class Ck. The text-only class of a document d of TEST is then estimated by:

CT(d) = argmin over all Ck of L(T(d), T(Ck))

Classification is an important form of knowledge extraction and can help make key decisions. The explosive growth in the volume of images available on the Web and in enterprise databases continues unabated, but images on the Web vary widely in type and format. Selecting relevant images based on image content is a difficult problem, and managing a large number of images by their semantic meaning with content-based retrieval remains a challenge for both database and image processing researchers.
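The GLCM energy and contrast features discussed above can be sketched as follows. This is an illustrative implementation under assumptions the excerpt does not fix: the image is sliced into 8 grey levels as in step 4 of the method, and a horizontal-neighbour offset is assumed for the co-occurrence counts.

```python
import numpy as np

def glcm_features(gray, levels=8):
    """Energy and contrast from a grey-level co-occurrence matrix.

    gray: 2-D array of grey levels in [0, 255].
    The image is quantized into `levels` slices; co-occurrences are
    counted for horizontally adjacent pixel pairs (an assumption).
    """
    q = (gray.astype(np.float64) * levels / 256.0).astype(np.int64)
    q = np.clip(q, 0, levels - 1)
    glcm = np.zeros((levels, levels))
    # count co-occurrences of horizontally adjacent pixel pairs
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1
    P = glcm / glcm.sum()                # normalized GLCM, P(i,j)
    i, j = np.indices((levels, levels))
    energy = (P ** 2).sum()              # Energy = Σi Σj P(i,j)²
    contrast = ((i - j) ** 2 * P).sum()  # Contrast = Σi Σj (i−j)²·P(i,j)
    return energy, contrast
```

A perfectly uniform image concentrates all co-occurrences in one GLCM cell, giving the maximal energy of 1 and zero contrast, which matches the intuition that high contrast between text and background lowers energy and raises contrast.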

9. Experimental Results
In the first level of classification, out of 60 images, 45 were used for training and 15 for testing; at this level the accuracy was found to be 100%. For the second level of classification, out of 40 images, 30 were used for training and 10 for testing; the accuracy for the second level was found to be 92%. Below are some results for various images: the values decided by the decision tree classifier for the histogram features in the first level of training.
TABLE 1: HISTOGRAM FEATURES IN FIRST LEVEL TRAINING

          Mean   Skewness
Doc       234    -94
Non-Doc   116    45
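Feature values such as those in Table 1 are handed to WEKA as .arff files, as described in the decision tree section. A minimal sketch of writing a first-level training file follows; the relation and attribute names are my own assumptions, but the @relation/@attribute/@data layout is the standard ARFF format WEKA reads.

```python
def write_arff(path, rows):
    """Write (mean, skewness, class) rows as a WEKA .arff file.

    rows: list of (mean, skewness, label) tuples, label being
    'doc' or 'nondoc'. Names here are illustrative assumptions.
    """
    with open(path, "w") as f:
        f.write("@relation textual-images\n\n")
        f.write("@attribute mean numeric\n")
        f.write("@attribute skewness numeric\n")
        f.write("@attribute class {doc,nondoc}\n\n")
        f.write("@data\n")
        for mean, skew, label in rows:
            f.write(f"{mean},{skew},{label}\n")

# example: the representative values from Table 1
write_arff("train.arff", [(234, -94, "doc"), (116, 45, "nondoc")])
```

The resulting file can be loaded in the WEKA Explorer or by the J48 classifier programmatically, for both the training and the testing sets.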



The values decided by the decision tree classifier for the GLCM features in the second level of training:

TABLE 2: GLCM FEATURES IN SECOND LEVEL TRAINING

           Caption Text   Scene Text
Energy     0.7047         0.3982
Contrast   0.6542         0.5049

TABLE 3: NO. OF IMAGES USED FOR EACH TYPE

                             Doc   Caption Text   Scene Text
No. of Given Images          20    20             20
Correctly Classified Images  20    17             19

TABLE 4: ACCURACY FOR EACH TYPE OF IMAGE

Image          Accuracy
Doc            100%
Caption Text   86%
Scene Text     96%

10. Future Work

In the future, I plan to classify these images so that only the most relevant images are considered for various applications, such as intelligent glasses for the blind to read signboards on the street, annotating news videos based on their content, automatic vehicle movement tracking, and other areas of machine vision. I plan to consider images in which scene and caption text are combined within a single image. I also plan to investigate many other low-level features, such as those based on the GLRM (Grey Level Run Length Matrix) and edge strength and edge intensity features, for robust classification.

11. Conclusion

I have presented a method which is very useful for the classification of textual images. It uses low-level features such as mean, skewness, variance, energy and contrast, which are easy to calculate. Using these low-level features, images from image databases can be accurately classified into one of three types: Document image, Scene Text image or Caption Text image.

References
[1] B. Le Saux & G. Amato, "Image Classifiers for Scene Analysis," 2003.
[2] K. Matsuo, K. Ueda & U. Michio, "Extraction of character string from scene image by binarizing local target area," Transactions of The Institute of Electrical Engineers of Japan, vol. 122-C(2), pp. 232-241, 2002.
[3] S. Chitrakala, P. Shamini & D. Manjula, "Multi-class Enhanced Image Mining of Heterogeneous Textual Images Using Multiple Image Features," IEEE International Advance Computing Conference (IACC 2009), Patiala, India, 2009.
[4] Y. Arzhaeva, B. van Ginneken & D. Tax, "Image classification from Generalized Image Distance Features: Application to Detection of Interstitial Disease in Chest Radiographs," IEEE, 2006.
[5] L. Breiman, J. H. Friedman, R. A. Olshen & C. J. Stone, Classification and Regression Trees. New York: Chapman & Hall, 1984.
[6] B. Rafkind, M. Lee, S.-F. Chang & H. Yu, "Exploring text and image features to classify images in bioscience literature," Proceedings of the BioNLP Workshop on Linking Natural Language Processing and Biology at HLT-NAACL 06, pp. 73-80, 2006.
[7] G. Holmes, A. Donkin & I. H. Witten, "WEKA: a machine learning workbench," Proceedings of the Second Australia and New Zealand Conference on Intelligent Information Systems, Brisbane, Australia, pp. 357-361, 1994.
[8] S. M. Hanif & L. Prevost, "Texture based text detection in natural scene images: a help to blind and visually impaired persons," Conference & Workshop on Assistive Technologies for People with Vision & Hearing Impairments (CVHI 2007).
[9] S. R. Safavian & D. Landgrebe, "A Survey of decision tree classifier methodology," IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 3, 1991.

Fig. 8 Decision Tree for Second Level Training

Fig. 9 Histogram Features of Level-1 Classification

Fig. 10 GLCM Features of Level-2 Classification


[10] T.-T. Duong, J.-H. Lim, H.-Q. Vu & J.-P. Chevallet, "Unsupervised Learning for Image Classification based on Distribution of Hierarchical Feature Tree," IEEE, 2008.
[11] W.-H. Lin, R. Jin & A. Hauptmann, "Meta-classification of Multimedia Classifiers," International Workshop on Knowledge Discovery in Multimedia and Complex Data, 2002.

Rahul Jagtap was born in India. He received the B.E. in Computer Engineering from Pune University in 2010. He is the author or co-author of more than three national and international papers. His current research interests include image mining, grid computing, etc.
