This article appeared in a journal published by Elsevier.


Author's personal copy

Pattern Recognition 45 (2012) 3114–3124

Contents lists available at SciVerse ScienceDirect — Pattern Recognition
journal homepage: www.elsevier.com/locate/pr

A saliency map based on sampling an image into random rectangular regions of interest

Tadmeri Narayan Vikram a,b,*, Marko Tscherepanow a, Britta Wrede a,b

a Applied Informatics Group, Bielefeld University, Bielefeld 33615, Germany
b Research Institute for Cognition and Robotics (CoR-Lab), Bielefeld University, Bielefeld 33615, Germany

Article info: Available online 22 February 2012

Keywords: Visual attention; Saliency map; Salient region detection; Eye-gaze prediction; Bottom-up attention

Abstract: In this article we propose a novel approach to compute an image saliency map based on computing local saliencies over random rectangular regions of interest. Unlike many of the existing methods, the proposed approach does not require any training bases, operates on the image at its original scale and has only a single parameter which requires tuning. It has been tested on the two distinct tasks of salient region detection (using the MSRA dataset) and eye-gaze prediction (using the York University and MIT datasets). The proposed method is shown to have performance comparable with existing methods for the task of salient region detection. Furthermore, the proposed method achieves state-of-the-art performance on the eye-gaze prediction task as compared with nine other state-of-the-art methods.

* Corresponding author at: Applied Informatics Group, Bielefeld University, Bielefeld 33615, Germany. Tel.: +49 521 106 12229.
E-mail addresses: nvikram@techfak.uni-bielefeld.de, tnvikram@gmail.com (T.N. Vikram), marko@techfak.uni-bielefeld.de (M. Tscherepanow), bwrede@techfak.uni-bielefeld.de (B. Wrede).

0031-3203/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2012.02.009

1. Introduction

Visual attention and its associated cognitive neurobiology have been a subject of intense research over the last five decades. The attention mechanism in humans helps them focus their limited cognitive resources on context-relevant stimuli while suppressing those which are not important. An artificial system which could simulate this attention mechanism could positively impact the way in which various cognitive and interactive systems are designed. Motivated by this, the computer science community has invested considerable time and effort to realize a computational model of visual attention which at least partially exhibits the characteristics of the human visual attention system.

Several computational approaches to visual attention have been proposed in the literature since the seminal work of Koch and Ullman in 1985 [1]. The same authors proposed the concept of a saliency map, which constitutes the surrogate representation of the visual attention span over an input image for a free-viewing task. Saliency maps are now employed for several applications in robotics, computer vision and data transmission. Some of the successful applications are object detection [2,3], object recognition [4], image summarization [5], image segmentation [6,7], image compression [8] and attention guidance for human–robot interaction [9,10].

Most of the existing visual attention approaches process pixels or image windows sequentially, which is in contrast to the way the human visual system operates. It has been shown in the literature that the receptive fields inside the retina operate randomly at both positional and scale spaces while processing visual stimuli [11]. In view of this, we propose a simple and efficient method which is based on random sampling of the image into rectangular regions of interest and computing local saliency values. The method has been tested on the MSRA [12], York University [13] and MIT [14] datasets and benchmarked along with nine other existing state-of-the-art methods to corroborate its performance.

The paper is organized as follows. We provide a review of the existing computational approaches to visual attention with a brief description of their strengths and shortcomings in Section 2. The motivation leading to the current work is described in Section 3. The proposed approach is described by means of pseudo-code and a graphical illustration in Section 4. In Section 5, we describe the experiments carried out to validate the performance of the proposed method for the tasks of eye-gaze prediction and salient region detection. Finally, we conclude this work by highlighting its current shortcomings together with a brief discussion of the future directions of the current work in Section 6.

2. Literature review

The theoretical framework for the computation of saliency maps was first proposed in [1] and later realized in [15].

This approach has served as a classical benchmark system, and it has subsequently led to the development of several other saliency approaches based on different mathematical and computational paradigms. The existing approaches can be classified into seven distinct categories based on the computational scheme they employ:

- Hierarchical approaches: They perform multi-scale image processing and aggregate the inputs across different scales to compute the final saliency map.
- Spectral approaches: They operate by decomposing the input image into Fourier or Gabor spectrum channels and obtain the saliency map by selecting the prominent spectral coefficients.
- Entropy-based approaches: The mutual information between patterns is employed to optimize the entropy value, where a larger entropy value indicates that a given pattern is salient.
- Power law based approaches: These approaches compute saliency maps by removing redundant patterns based on their frequency of occurrence. Rarely occurring patterns are considered salient while frequently occurring patterns are labeled redundant.
- Image contrast approaches: The mean pixel intensity value of the entire image or of a specified sub-window is utilized to compute the contrast of each pixel in the image. The contrast is analogously treated as the image saliency.
- Center-surround approaches: These approaches compute the saliency of a pixel by contrasting the image features within a window centered on it.
- Hybrid approaches: Models of this paradigm employ a classifier in combination with one or more of the above approaches to compute saliency.

We briefly explore the existing saliency approaches based on the aforementioned categories in the sub-sections to follow. Interested readers are pointed to [16,17] for a more detailed and exhaustive review of saliency approaches.

2.1. Hierarchical approaches

The most popular approach in this category is the one proposed by Itti et al. [15]. This approach computes 41 different feature maps for a given input image based on color, texture, gradient and orientation information. The resulting feature maps are fused into a single map using a winner-takes-all (WTA) network and an inhibition-of-return (IOR) mechanism. A Gabor wavelet analysis and multi-scale image processing based approach was proposed in [18]. The fusion method employed to compute the master saliency map from the various feature maps plays a vital role in its accuracy. Arriving at a generalized rule of fusion for the various maps is complicated and requires intensive cross-validation to fine-tune the fusion parameters. Despite being biologically inspired and efficient, the performance of [15] is constrained by the fusion of multiple maps and the requirement to process multiple features. In general, hierarchical methods tend to ignore visually significant patterns which occur locally, as they are primarily driven by the global statistics of an image [15].

2.2. Spectral approaches

Several approaches to compute a saliency map which operate in the spectral domain have been proposed in the literature. The Fourier transform has been extensively utilized to generate saliency maps, as seen from [8,19]. In the approach of [8], the input image is decomposed into four different channels based on opponent colors and intensity. A quaternion Fourier analysis of the decomposed channels is then carried out to compute the master saliency map. As can be seen from the illustrations in [8], the method results in a loss of information during image reconstruction and is compromised by illumination, noise and other image artifacts. It also adds a spurious border effect to the resultant image and requires re-scaling of the original image to a lower scale in order to make the computational process more tractable. Two-dimensional Gabor wavelets of the Fourier spectrum have also been utilized for computing saliency maps in [20], where the final saliency map is reweighted using a center-bias matrix.

Spectral residual approaches proposed in [21,22] are another class of approaches which highlight saliency by removing redundancy from the Fourier spectrum. The saliency map is thus constructed from a residual frequency spectrum obtained as the difference between an image frequency spectrum and a mean frequency spectrum. These methods are found to be successful in detecting proto-objects in images. Despite its theoretic appeal, the method is compounded by the fact that the pattern of the mean frequency spectrum is subjective. In general, Fourier-based methods are affected by the number of coefficients selected for image reconstruction and the scale at which the input image is processed.

2.3. Power law based approaches

Power laws are basically used to model human visual perception, and several approaches to saliency which employ power laws have thus been proposed. A Zipf's law based saliency map has been proposed in [23]. This work is similar to those of [21,22], except that it operates directly on the pixels rather than on the spectral space. An integrated Weibull distribution based saliency map has been proposed in [24]. The method employs maximum likelihood estimates (MLE) to set the parameters of the various distributions, and its performance on eye-gaze data is shown to be better than that of other power law based approaches. Though straightforward, the Weibull distribution based saliency map has a large parameter set, and the heuristics to fix and optimize these parameters constitute a major drawback.

2.4. Image contrast approaches

A simple but efficient approach based on computing the absolute difference of a pixel and the image mean was proposed in [25]. Similar approaches proposed by the same authors in [26,7] employ maximal symmetric regions and novel dissimilarity metrics respectively to compute the final saliency map. Another symmetry based saliency approach was proposed in [27], which employs color, orientation and contrast symmetry features to generate the final saliency map. A distance transform based approach was proposed in [28], which computes an edge map for each gray scale threshold and fuses them to generate a final saliency map. The method is shown to be robust to noise and drastic illumination changes, but it is constrained by its emphasis on edges and neglects salient regions [30]. Poor global contrast of an image affects the performance of these methods, and local-statistics based approaches for saliency computation have been proposed in the literature to address this problem.

2.5. Entropy-based approaches

The popular approach of [29] is based on local contrasts and maximizes the mutual information between features by employing independent component analysis (ICA) bases. A set of ICA bases is pre-computed using a patch size of 7 × 7 pixels. Subsequently it is used to compute the conditional and joint distributions of features for information maximization. Experiments conducted on the York University eye-gaze dataset [13] have proven its efficiency. It also reasonably simulates eye-gaze for a free-viewing task in the case of natural sceneries with no specific salient objects in them. Another ICA based approach was proposed in [31], where image self-information is utilized to estimate the probability of a target at each pixel position. The method proposed in [32] employs sparse bases to extract sub-band features from an image. The mutual information between the sub-band features is calculated by realizing a random walk on them and initializing the site entropy rate as the weight of the edges in the graph. An extension of this paradigm can be seen in the recent approach proposed in [33]. Other entropy-based approaches like [34,35] employ incremental coding length to compute the final saliency map. Methods which rely on information theoretic approaches are in general constrained by the requirement of training bases, the patch size parameters and the size of the training bases.

2.6. Center-surround approaches

A saliency map based on discriminant center-surround entropy contrast which does not require training bases was proposed in [36], where the entropy of a center versus a surround region is computed as the saliency value of a pixel. A multi-scale center-surround saliency map was proposed in the work of [38], where the saliency of a pixel is determined by the dissimilarity of the center to its surround over multiple scales. A recent center-surround contrast method [37] identifies pre-attentive segments and computes the mutual saliency between them. The method outputs an object saliency map, which resembles image segmentation. Selecting an appropriate window size for the center and surround patches plays an important role in obtaining a higher quality saliency map.

Local analysis of gradients along with the center-surround paradigm is advocated as an alternative to multi-scale image analysis, as can be seen in [39,40]. In [39], local steering kernels are employed to compute the saliency of a pixel by contrasting the gradient covariance of a surrounding region. An innovative application of this saliency map has been found useful for pedestrian detection. In order to achieve a tractable runtime, the image is down-scaled. The approach proposed in [40] uses color and edge orientation information to compute the saliency map, using a framework similar to the one presented in [39]. This method avoids multi-scale image analysis but fails when the image has a low color or gradient contrast. A graph-based visual saliency approach is proposed in [41], which implements a Markovian representation of feature maps and utilizes a psycho-visual contrast measure to compute the dissimilarities between features. It correlates well with human eye-gaze data but is constrained by the subjectivity involved in the computation of weights for fusing the different maps into the master saliency map.

2.7. Hybrid approaches

To overcome the issue of patch size parameters, many machine learning based saliency approaches have been proposed. A few hybrid approaches attempt to combine the positive aspects of multi-scale analysis, sub-band decomposition, color boosting as well as center-surround paradigms. A saliency map based on modeling eye-movements using support vector machines (SVMs) is proposed in [42]. A linear classifier is trained on an image dataset to build a bag-of-features to arrive at a prior for an object in an image. However, this approach requires enormous amounts of data to learn the saliency weights reasonably. Another successful approach based on kernel density estimation (KDE) is proposed in [43]. The authors recommend constructing a set of KDEs based on region segmentation using the mean shift algorithm; the color saliency and spatial saliency of each KDE are then evaluated based on its color distinctiveness and spatial distribution. The method is found to be successful in salient object detection, but has an unwieldy run-time as it relies on image segmentation.

A salient object selection mechanism based on an oscillatory correlation approach was proposed in [44]. The method requires fine tuning of neural network weights for an inhibitory mechanism which is heuristic, and hence it cannot be used for real-time visual attention. A Hebbian neural network based saliency mechanism which simulates lateral surround inhibition is proposed in [45]. Unlike [44], the method in [45] is computationally efficient as it employs the pulsed cosine transform to produce the final saliency map. A color saliency boosting function is introduced in [46], which is obtained from an isosalient image surface. The final saliency map is based on the statistics of color image derivatives and employs an information theoretic approach to boost the color information content of an image. An extension of this work can be found in [47]. In [48], the probability distribution of colors and orientations of an image is computed and either of these features is selected to compute the final saliency map. Fusion of the center-surround hypothesis and oriented sub-band decomposition to develop a saliency map has been proposed in [49]. It performs reasonably well on both eye-gaze fixation and salient object detection datasets, but is subject to high variations in performance as too many features, maps and parameters are involved which require fine tuning. A radically different approach which tries to detect salient regions by estimating the probability of detecting an object in a given sliding window is proposed in [50]. The authors employ the concept of super-pixel straddling, coupled with edge density histograms, color contrasts and the saliency map of [15]. The method is theoretically very attractive, but it is computationally expensive as a set of compound features needs to be computed for each pixel in the image.

3. Motivation

As can be observed from our previous review, many of the existing computational approaches to visual attention are constrained by one or more of the following issues: the requirement of training bases, downscaling of the original image in order to obtain a tractable computational run-time, the integration of complex classifiers, large sets of tunable parameters, high complexity associated with programming, and not being biologically motivated. Most of the existing approaches to compute saliency maps process input image pixels sequentially with fixed sliding windows. But salient regions or objects can occur at arbitrary positions, shapes and scales in an image. Research in the vision sciences has shown that the human visual system has its receptive fields scattered randomly and does not process input stimuli in a sequential manner [11,51]. Furthermore, it has been shown in [52] that each visual stimulus is biased by every other stimulus present in the attention space.

In view of the aforementioned discussion, we propose a random center-surround method which operates by computing local saliencies over random regions of an image. This captures local contrasts, unlike the global methods for computing saliency maps. The proposed method also avoids competition between multiple saliency maps, which is in contrast to the approaches proposed by [15,41]. Furthermore, it does not require any training priors and has only a single parameter which needs tuning.

4. The proposed saliency map

We consider a scenario where the input I is a color image of dimension r × c × 3, where r and c are the number of rows and columns respectively. The final saliency map S is also of dimension r × c. The input image I is initially subjected to a Gaussian filter in order to remove noise and abrupt onsets. It is further converted into the CIE1976 L*a*b* (1) space and decomposed into the respective L*, a* and b* channels, each of dimension r × c. The L*a*b* space is preferred over the other color spaces because of its similarity to the human psycho-visual space [26], and it is also recommended as a standard space in which to compute saliency maps in [30].

Subsequently, n random sub-windows are generated over each of the L*, a* and b* channels. A discrete uniform probability distribution function is used to generate the random sub-window co-ordinates, as this helps in placing windows without having any bias towards a specific window size or spatial region of the image. This property is important because salient regions or objects in an image can occur at arbitrary positions and scales. The upper left and lower right co-ordinates of the ith random sub-window are denoted by (x1i, y1i) and (x2i, y2i) respectively. The saliency value at a particular co-ordinate position in a given channel is defined as the sum of the absolute differences between the pixel intensity value and the mean intensity values of the random sub-windows in which it is contained. The saliency maps due to the L*, a* and b* channels are referred to as SL, Sa and Sb respectively, and their dimensions are also equal to r × c.

Computing the final saliency value at a given position as the Euclidean norm of the saliencies over the different channels of the L*a*b* space is recommended as an ideal fusion rule in [25,26,30]. The resulting saliency map S is normalized to the interval [0, 255] and subsequently subjected to median filtering. The median filter is chosen because of its ability to preserve edges while eliminating noise. To further enhance the contrast of S, we subject it to histogram equalization as recommended in [36,53]. This image enhancement is in consonance with the fact that the human visual system enhances the perceptual contrast of the salient stimulus in a visual scene [54,55]. Note that our approach does not downscale the input image to a lower resolution like other approaches [41,15,39]. In addition, our method does not require prior training bases, in contrast to [29,31]. Also, n is the only parameter which requires tuning, the details of which are presented in the next section.

The algorithm for the proposed approach is as follows.

Algorithm: Random center-surround saliency
  Input:  (1) IMG (RGB image) of size r × c × 3
          (2) n — number of random windows
  Output: S — saliency map of size r × c
  Method
    Step 1: Apply Gaussian filter on IMG
    Step 2: Convert IMG to the L*a*b* space
    Step 3: [x1, x2, y1, y2] = Generate_Windows(n, r, c)
    Step 4: SL = CompSal(L*, n, x1, x2, y1, y2)
    Step 5: Sa = CompSal(a*, n, x1, x2, y1, y2)
            Sb = CompSal(b*, n, x1, x2, y1, y2)
    Step 6: S = Fusion(SL, Sa, Sb)   (pixel-wise Euclidean norm)
    Step 7: Apply median filter on S and normalize S to [0, 255]
    Step 8: Apply histogram equalization to S

Algorithm: Generate_Windows
  Input:  (1) n — number of random windows
          (2) r — number of rows
          (3) c — number of columns
  Output: Co-ordinate vectors x1, x2, y1, y2, each of size n
  Method
    Set all elements of x1, x2, y1, y2 to 0
    for i = 1 to n
      x1i = random number in [1, r−1]
      y1i = random number in [1, c−1]
      x2i = random number in [x1i + 1, r]
      y2i = random number in [y1i + 1, c]
    end-i

Algorithm: CompSal
  Input:  (1) I (gray scale image, i.e. a single channel) of size r × c
          (2) n — number of random windows
          (3) Co-ordinate vectors x1, x2, y1, y2, each of size n
  Output: ISM — interim saliency map of size r × c
  Method
    Set all elements of ISM to 0
    for i = 1 to n
      Area_i = (x2i − x1i + 1) · (y2i − y1i + 1)
      Sum_i = 0
      for j = x1i to x2i
        for k = y1i to y2i
          Sum_i = Sum_i + I(j, k)
        end-k
      end-j
      Mean_i = Sum_i / Area_i
      for j = x1i to x2i
        for k = y1i to y2i
          ISM(j, k) = ISM(j, k) + |I(j, k) − Mean_i|
        end-k
      end-j
    end-i

Algorithm: Fusion
  Input:  (1) Matrices A, B and C of size r × c
  Output: Fused matrix FM of size r × c
  Method
    Set all elements of FM to 0
    for i = 1 to r
      for j = 1 to c
        FM(i, j) = sqrt(A(i, j)·A(i, j) + B(i, j)·B(i, j) + C(i, j)·C(i, j))
      end-j
    end-i

(1) The original CIE document — http://www.electropedia.org/iev/iev.nsf/display?openform&ievref=845-03-56

An illustration of the above paradigm is given in Fig. 1.

5. Experimental results

Several experiments were conducted to validate the performance of the proposed saliency map for the two distinct tasks of eye-gaze prediction and salient region detection in free viewing conditions.
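The random center-surround procedure described above can be sketched in NumPy. This is a minimal illustration under our own naming, not the authors' implementation: the Gaussian pre-filter, RGB-to-L*a*b* conversion, median filtering and histogram equalization are omitted for brevity, and indices are zero-based.

```python
import numpy as np

def generate_windows(n, r, c, rng):
    """Uniform random rectangles: top-left (x1, y1), bottom-right (x2, y2)."""
    x1 = rng.integers(0, r - 1, n)   # top rows in 0 .. r-2
    y1 = rng.integers(0, c - 1, n)   # left cols in 0 .. c-2
    x2 = rng.integers(x1 + 1, r)     # strictly below the top-left corner
    y2 = rng.integers(y1 + 1, c)     # strictly right of the top-left corner
    return x1, x2, y1, y2

def comp_sal(channel, windows):
    """Interim saliency map: for every random window, add |pixel - window mean|
    to each pixel that the window covers (CompSal in the pseudo-code)."""
    x1, x2, y1, y2 = windows
    ism = np.zeros(channel.shape, dtype=float)
    for a, b, p, q in zip(x1, x2, y1, y2):
        patch = channel[a:b + 1, p:q + 1]
        ism[a:b + 1, p:q + 1] += np.abs(patch - patch.mean())
    return ism

def random_cs_saliency(lab, n, rng):
    """lab: r x c x 3 array holding the L*, a* and b* channels."""
    r, c, _ = lab.shape
    windows = generate_windows(n, r, c, rng)
    s_l, s_a, s_b = (comp_sal(lab[..., k], windows) for k in range(3))
    s = np.sqrt(s_l**2 + s_a**2 + s_b**2)   # pixel-wise Euclidean norm fusion
    span = s.max() - s.min()
    return 255.0 * (s - s.min()) / span if span > 0 else s  # normalize to [0, 255]
```

Note that the same window co-ordinates are reused for all three channels, which mirrors Step 3 of the pseudo-code being executed once before the per-channel CompSal calls.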

Fig. 1. An illustration of the proposed method. The input image is subjected to a Gaussian filter in the first stage. This results in L*, a* and b* channels whose pixel intensity values are normalized in the range of [0, 255]. For the sake of simplicity we have considered three random regions of interest (ROI) on the respective L*, a* and b* channels. Local saliencies are computed over each of these ROIs and the channel-specific saliency maps (SL, Sa and Sb) are updated. The final saliency map is then computed by fusing the channel-specific saliency maps by a pixel-wise Euclidean norm.

Salient region detection and eye-gaze prediction are the two most significant applications of saliency maps. The contemporary saliency maps are either employed to detect salient regions, as in the case of [30,28], or are used to predict eye-gaze patterns, as in [31,39]. Although these two tasks appear similar, there are subtle differences between them. Salient regions of an image are those which are visually interesting, whereas human eye-gaze, which focuses mainly on salient regions, is also distracted by semantically relevant regions [56]. Salient region detection is relevant in the context of computer vision tasks like object detection, object localization and object tracking in videos [25]. Automatic prediction of eye-gaze is important in the context of image aesthetics, image quality assessment, human–robot interaction and other tasks which involve detecting image regions that are semantically interesting [14]. Only a few of the existing saliency approaches, like [41,15], have consistent performance on both of these tasks.

The following parameter settings were used as a standard for all the experiments carried out. A rotationally symmetric Gaussian low-pass filter (size 3 × 3 with σ = 0.5, the default Matlab configuration) was used as a pre-processor on the images for noise removal, as recommended in [25,26]. The inbuilt srgb2lab Matlab routine was used to convert the input image from the RGB colorspace to the L*a*b* colorspace. The number of distinct random sub-windows n was set to 0.02 × r × c, as this led to a more stable performance; details about fine tuning n are given in Section 5. A median filter of size 11 × 11 was employed to smooth the resultant saliency map before it was enhanced by histogram equalization. The employed Gaussian filter, median filter and histogram equalization are based on the usual straightforward methods. We preferred not to rescale the input images, as this might lead to a bias during the evaluation. All experiments were conducted using Matlab v7.10.0 (R2010a) on an Intel Core 2 Duo processor with Ubuntu 10.04.1 LTS (Lucid Lynx) as the operating system.

We selected nine state-of-the-art methods of computing saliency maps to compare and contrast with the proposed method. The methods are the global contrast2 [25], entropy3 [29], graphical approach4 [41], multi-scale5 [15], local steering kernel6 [39], distance transform7 [28], self-information8 [31], weighted contrast9 [30] and symmetric contrast10 [26] based image saliency approaches. The performance on the eye-gaze prediction task was validated on two different datasets from York University [13] and MIT [14]. Subsequent experiments to corroborate the performance on the salient object detection task were conducted on the popular MSRA dataset [12]. The experiments conducted to corroborate the performance of the proposed method for eye-gaze correlation in a free viewing task are described in the following sub-section.

5.1. Experiments on eye-gaze prediction task

In order to empirically evaluate the performance on the eye-gaze correlation task, we have employed the receiver operating characteristic (ROC)–area under the curve (AUC) as a benchmarking metric. An ROC graph is a general technique for visualizing, ranking and selecting classifiers based on their performance [58]. Several popular and recent works like [41,57,30] employ the ROC–AUC metric to evaluate eye-gaze fixation correlation. ROC graphs are two-dimensional graphs in which the true positive rate (TPR) is plotted on the Y axis and the false positive rate (FPR) is plotted on the X axis.

2 http://ivrg.epfl.ch/supplementary_material/RK_CVPR09/SourceCode/Saliency_CVPR2009.zip
3 http://www-sop.inria.fr/members/Neil.Bruce/AIM.zip
4 http://www.klab.caltech.edu/~harel/share/gbvs.zip
5 http://www.klab.caltech.edu/~harel/share/simpsal.zip
6 http://users.soe.ucsc.edu/~rokaf/download.php
7 http://users.cs.cf.ac.uk/Paul.Rosin/resources/salience/salience.m
8 http://cseweb.ucsd.edu/~l6zhang/code/imagesaliency.zip
9 Available from our previous work
10 http://ivrg.epfl.ch/supplementary_material/RK_ICIP2010/code/Saliency_MSSS_ICIP2010.zip
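The fixed pre- and post-processing settings reported above (a 3 × 3 Gaussian with σ = 0.5, n = 0.02 × r × c random windows, an 11 × 11 median filter, then histogram equalization) can be sketched as a few helpers. This is our illustrative reading of the reported configuration, not the authors' Matlab code; in particular, passing `truncate=2.0` to SciPy's Gaussian filter is our way of limiting its kernel to a 3 × 3 support.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def preprocess_channel(channel):
    """3x3 Gaussian, sigma = 0.5 (truncate=2.0 yields a radius-1, i.e. 3x3, kernel)."""
    return gaussian_filter(channel.astype(float), sigma=0.5, truncate=2.0)

def default_window_count(r, c):
    """n = 0.02 * r * c random windows, the setting used in all experiments."""
    return int(0.02 * r * c)

def hist_equalize(s):
    """Histogram equalization of an 8-bit map via its empirical CDF."""
    vals = s.astype(np.uint8)
    hist = np.bincount(vals.ravel(), minlength=256)
    cdf = np.cumsum(hist) / vals.size
    return (255.0 * cdf[vals]).astype(np.uint8)

def postprocess(raw_saliency):
    """Median-filter (11x11) the [0, 255] saliency map, then equalize its histogram."""
    smoothed = median_filter(raw_saliency, size=11)
    return hist_equalize(smoothed)
```

The 11 × 11 median filter removes speckle left by the random sampling while preserving edges, and the equalization step stretches the contrast of the final map, matching the order of Steps 7 and 8 of the pseudo-code.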

We have considered two popular datasets of York University [13] and MIT [14] for the eye-gaze prediction task. The York University [13] dataset contains eye fixation records from 20 subjects for a total of 120 images of size 681 × 511 pixels. This dataset does not consist of images with human faces, animals or images of activities like sports, music concerts etc., which have semantic meanings attributed to them. The MIT dataset [14] consists of eye fixation records from 15 subjects for 1003 images of size 1024 × 768 pixels. Unlike the York University dataset [13], the MIT dataset [14] consists of images which may contain faces or sporting activities, which attract attention and are semantically relevant. Also note that the age group of the viewers, the eye-gaze recording equipment and the viewing conditions were not identical while creating these two eye-gaze datasets. The variation in image dimensions, semantics and topology of arrangement of objects between these two considered eye-gaze datasets offers a test-bed to evaluate the robustness of a saliency map. We preferred not to rescale the input images as this might lead to a bias during the evaluation.

The saliency map and the corresponding ground truth fixation density map are binarized at each discrete threshold in [0, 255]. This results in a predicted binary mask (from the saliency map) and a ground truth binary mask (from the fixation density map) for each binarizing threshold. The TPR (also called hit rate and recall) and FPR (also called false alarm rate) metrics for each threshold are subsequently computed as in [30,32]:

TPR = tp / (tp + fn)    (1)

FPR = fp / (fp + tn)    (2)

where tp is the number of true positives, tn is the number of true negatives, fp is the number of false positives and fn is the number of false negatives. The ROC curve is generated by plotting the obtained FPRs versus TPRs and the AUC is calculated (source code used for ROC–AUC computation in this article: http://mark.goadrich.com/programs/AUC/auc.jar). An ROC graph depicts relative trade-offs between benefits (TPR) and costs (FPR). Since the AUC is a portion of the area of the unit square, its value will always be between 0 and 1. An ideal classifier would give an AUC of 1.0 while random guessing produces an AUC of about 0.5 [59]. In our case, the AUC indicates how well the saliency map predicts actual human eye fixations. This measurement also has the desired characteristic of transformation invariance, in that the ROC–AUC does not change when applying any monotonically increasing function (such as logarithm) to the saliency measure [31].

The performance in terms of ROC–AUC is measured and the resulting plots for the York University dataset [13] and the MIT dataset [14] are shown in Figs. 2 and 3 respectively. It can be observed from Fig. 2 that the proposed method has a similar ROC plot as compared to the graph-based [41] saliency approach and clearly outperforms all the other methods in consideration. Note that the symmetric contrast [26] and global contrast [25] based methods have near linear slopes for their ROC plots, which indicates their ineffectiveness for the task of eye-gaze prediction. The same holds true for the performance on the MIT dataset [14], as can be seen in Fig. 3, which is similar to the performance on the York dataset [13] shown in Fig. 2. For the MIT dataset [14] we have not been able to obtain the ROC plots and the area under the curve for the self-information [31] and random center-surround [30] approaches, since the original source codes of these methods do not work on images as large as 1024 × 768 pixels.

Fig. 2. ROC performance plots for the York University [13] dataset.

Fig. 3. ROC performance plots for the MIT [14] dataset.

To quantitatively evaluate the performance we computed the ROC–AUC, which is shown in Table 1. Note that the proposed method has the highest ROC–AUC performance on the York dataset [13] and attains a performance equivalent to the graph-based [41] method on the MIT dataset [14]. Observe that the performance of the proposed method and the graph-based [41] method does not suffer despite the change in eye fixation dataset. It can be observed that the proposed method has state-of-the-art performance on both of the eye fixation datasets. This is despite the fact that the proposed method is simple and is devoid of the sophistication associated with the graph-based [41], multi-scale [15] or self-information [31] based saliency methods.

Table 1
The performance of the methods under consideration (global contrast [25], symmetric contrast [26], graph-based [41], multi-scale [15], entropy-based [29], self information [31], local steering kernel [39], weighted contrast [30], distance transform [28] and the proposed method) in terms of ROC–AUC on the York University [13] and MIT [14] datasets. The proposed method attains the highest ROC–AUC (0.85) on the York dataset [13]; the MIT entries for the self-information [31] and weighted contrast [30] methods are not available (NA).
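The threshold-sweeping protocol of Eqs. (1) and (2) can be sketched as below. For brevity this illustration binarizes the ground truth fixation density map once at a fixed level instead of at every threshold; that simplification, and the helper name `roc_auc`, are ours:

```python
import numpy as np

def roc_auc(saliency, fixation_density, gt_level=128):
    """Sweep binarization thresholds over the saliency map, collect
    (FPR, TPR) points as in Eqs. (1)-(2), and integrate the ROC curve."""
    fix = fixation_density >= gt_level            # ground-truth binary mask
    points = []
    for t in range(256):
        pred = saliency >= t                      # predicted binary mask
        tp = int(np.sum(pred & fix))
        fn = int(np.sum(~pred & fix))
        fp = int(np.sum(pred & ~fix))
        tn = int(np.sum(~pred & ~fix))
        tpr = tp / max(tp + fn, 1)                # Eq. (1)
        fpr = fp / max(fp + tn, 1)                # Eq. (2)
        points.append((fpr, tpr))
    points.sort()
    # Trapezoidal integration of TPR over FPR gives the AUC.
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A saliency map identical to the binarized ground truth yields an AUC of 1.0, and an inverted map yields 0.0, matching the interpretation of the AUC given above.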

The experimental parameter n is correlated with the scale of the image, where the standard setting is 0.02 × r × c. We therefore evaluated the performance of the proposed method on the York dataset [13] by varying n and recorded the ROC–AUC values. We varied n (where n = 10, 60, 110, 260 and the standard configuration) and obtained ROC plots on the York dataset [13], which are given in Fig. 5. Observe that for n = 10 the proposed method obtains a comparable ROC plot to the local steering kernel [39] and distance transform [28], as seen from Fig. 5. We deliberately included a minimal number of ROC plots, as all the ROC plots looked very similar when n > 60. It can be observed from Fig. 4 that for 10 random samplings the proposed method already achieves an ROC–AUC of 0.76, which is better than the optimal performance of the local steering kernel [39] (ROC–AUC = 0.75) and self-information [31] (ROC–AUC = 0.67) based approaches as seen from Table 1, and that it saturates to an ROC–AUC of 0.85 for 60 random samplings and above. With n = 60 the performance of the proposed method starts saturating, and this helps in processing large images without compromising on run-time. Since the method saturates in terms of ROC–AUC for a small value of n, rigorous testing to fix this parameter could be avoided. The lack of oscillation in the ROC–AUC values versus n shows that the proposed method is stable despite changes in the value of n. In general, we recommend to use the standard configuration for n: as the size of the image increases, so does the possibility that it consists of more objects, and sampling the image in tandem with its size fundamentally helps in computing a better saliency map.

Fig. 4. Variations in ROC–AUC of the proposed method due to change in n on the York University [13] dataset.

Fig. 5. Variations in ROC plots of the proposed method due to change in n on the York University [13] dataset.

Our method employs random sampling, which means that the selected windows at each trial may be different. Hence we decided to study the stability of the proposed method in terms of ROC–AUC for the standard configuration of n on the York University [13] dataset. The results shown in Fig. 6 reveal that the variations in ROC–AUC values over 10 different trials are not significant. The mean ROC–AUC is 0.8554 and the standard deviation (σ) is 9.3743e-04. The low standard deviation indicates the stability of the proposed method.

Fig. 6. ROC–AUC performance of the proposed method for different trials on the York University [13] dataset.

5.2. Experiments on salient region detection task

The MSRA dataset [12] consists of 5000 images annotated by nine users. The annotators were asked to enclose what they thought was the most salient part of the image with a rectangle. In this context the eye movements were not recorded, and the annotators were required to enclose the most visually interesting region of the image. Such an annotation task involves high-level cognitive mechanisms and is not stimulus driven. Previous studies [61,57] have shown that the positions of the principal maxima in a saliency map are significantly correlated to the positions of areas that people would choose to put a label indicating a region of interest. Despite involving a high-level visual task, the MSRA dataset [12] is therefore still relevant to test whether there is an association between the predicted saliency map and the ground truth masks of this dataset. Note that the MSRA dataset [12] is significantly different from the eye-gaze datasets in terms of test protocol, content and image size.

Fig. 7 shows an example of this labeling. Naturally occurring objects do not necessarily have a regular contour which can be enclosed accurately inside a rectangle. In general, unnecessary background is enclosed by this kind of annotation, and thus there is a weak association between the exact segmentation masks and the saliency of the image. It has been recently shown in [25,60] that a more precise-to-contour ground truth leads to a more accurate evaluation. Motivated by this, a precise-to-contour ground truth was released in [25] for a subset of 1000 images from the MSRA dataset [12]. We therefore consider this subset of 1000 images, which have accurate ground truth annotations, for our experiments.
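The sampling step that the parameter n controls can be illustrated as follows. The text specifies only the count (standard setting n = 0.02 × r × c); drawing the rectangle corners uniformly at random is our assumption, and the helper name `sample_rois` is ours:

```python
import random

def sample_rois(r, c, n=None, seed=0):
    """Sample n random rectangular ROIs inside an r x c image.
    Standard configuration from the text: n = 0.02 * r * c."""
    if n is None:
        n = int(round(0.02 * r * c))
    rng = random.Random(seed)
    rois = []
    for _ in range(n):
        y0, y1 = sorted(rng.sample(range(r + 1), 2))   # distinct row bounds
        x0, x1 = sorted(rng.sample(range(c + 1), 2))   # distinct column bounds
        rois.append((y0, x0, y1, x1))
    return rois
```

For a 681 × 511 York image the standard configuration yields several thousand rectangles, which is why the saturation at n = 60 observed above matters for keeping the run-time low on large images.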

In order to quantitatively evaluate the performance for the task of detecting salient regions, we followed the method recommended in [25,30], where the saliency map is binarized and compared with the ground truth mask. The saliency map is thresholded within [0, 255] to obtain a binary mask which is further compared with the ground truth. The thresholds are varied from 0 to 255 and the recall–precision metrics are computed at each binarizing threshold. The recall (also called TPR or hit-rate) and precision metrics are computed as in [30]:

recall = tp / (tp + fn)    (3)

precision = tp / (tp + fp)    (4)

The resulting recall versus precision curve is shown in Fig. 8. This curve provides a reliable comparison of how well the various saliency maps highlight salient regions in images. The recall–precision metric is found to be more suitable to evaluate the salient region detection performance than the ROC–AUC metric [25,30]. Observe that the proposed method has a superior recall–precision plot as compared to all the methods except for the graph-based [41] and weighted contrast [30] based approaches. Also, it could be noted from our previous experiments on the eye fixation datasets that the weighted contrast [30] based method fails to work when the size of the input is large and has a far lower performance in terms of ROC–AUC as compared to the graph-based [41] approach and the proposed method.

Fig. 7. Rectangular and exact segmentation masks. To the left is the sample image from the MSRA dataset [12], the image in the center shows the original rectangular annotation and at right is the accurate-to-contour annotation. Observe the reduction in background information between rectangular and exact annotations.

Fig. 8. Recall–precision plot on the MSRA [12] dataset.

A common threshold to binarize the different saliency maps might not be suitable, as the binarization threshold depends on the image specific statistics. Thus we employ Tsai's moment preserving algorithm [62] (source code: http://www.cs.tut.fi/~ant/histthresh/HistThresh.zip), which has also been recommended in [28], to select the map specific binarization threshold. We further evaluate the accuracy of the obtained binary masks against the ground truth masks by employing the F-measure as suggested in [49,30]. The formula for the F-measure is given in Eq. (5):

F-measure = 2 · precision · recall / (precision + recall)    (5)

It can be observed from Table 2 that the proposed method and the graph-based method have an identical F-measure performance of 0.64. The weighted contrast-based [30] saliency attains the highest F-measure of 0.71, but does not perform well in the eye-gaze correlation task. The entropy-based [29] method has the lowest F-measure performance of 0.43 despite achieving a good performance on the eye-gaze task. Apart from the proposed method and the graph-based [41] saliency, only the multi-scale approach [15], with an F-measure of 0.58, is seen to perform equally well on the eye-gaze detection task.

Table 2
The recall, precision and F-measure performance of the saliency maps from the various methods under consideration (global contrast [25], symmetric contrast [26], graph-based [41], multi-scale [15], entropy-based [29], self information [31], local steering kernel [39], weighted contrast [30], distance transform [28] and the proposed method).

We demonstrate the performance of the various approaches considered on the task of salient region detection with three sample images from the MSRA [12] dataset, as shown in Fig. 9. For the purpose of illustration we overlaid the original image with the ground truth masks and the other masks obtained after binarizing the respective saliency maps by Tsai's moment preserving algorithm [62]. It can be observed from Fig. 9 that the entropy-based [29] saliency approach fails to effectively localize the salient object in the image. Despite being efficient, the global contrast [25] and symmetric contrast [26] based methods ignore the finer details of the salient region, as both the methods operate on the mean pixel intensity value of the image. In the case of the image consisting of the yellow flower, the entire image is shown as salient, while the salient region of the glowing bulb is treated as irrelevant. Conversely, the multi-scale [15] and local steering kernel [39] based saliencies highlight the minute details but suppress the salient regions, as they have a bias towards strong gradients. Despite being simple, the proposed method fares well in both the tasks of eye-gaze correlation and salient region detection and has an equivalent performance to the graph-based [41] approach.
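Tsai's bilevel moment-preserving threshold has a closed form: fit two gray levels z0, z1 and a mixing fraction p0 that preserve the first three moments of the histogram, then cut at the gray level where the cumulative frequency reaches p0. The sketch below is our own implementation of the published formulas, not the HistThresh toolbox referenced above:

```python
import numpy as np

def tsai_threshold(img, levels=256):
    """Bilevel moment-preserving threshold (Tsai, 1985). Preserves the
    first three gray-level moments m1, m2, m3 with a two-level image and
    returns the gray level whose cumulative frequency reaches the dark
    fraction p0."""
    z = img.astype(np.float64).ravel()
    m1, m2, m3 = z.mean(), (z ** 2).mean(), (z ** 3).mean()
    cd = m2 - m1 * m1
    if cd <= 0:
        return int(m1)                    # flat image: nothing to separate
    c0 = (m1 * m3 - m2 * m2) / cd
    c1 = (m1 * m2 - m3) / cd
    disc = np.sqrt(max(c1 * c1 - 4.0 * c0, 0.0))
    z0, z1 = (-c1 - disc) / 2.0, (-c1 + disc) / 2.0   # representative levels
    p0 = (z1 - m1) / (z1 - z0)            # fraction of pixels assigned to z0
    hist = np.bincount(img.astype(np.int64).ravel(), minlength=levels)
    cum = np.cumsum(hist) / z.size
    return int(np.searchsorted(cum, p0))  # first level with CDF >= p0
```

The returned level t can then be used as `mask = saliency > t` before computing Eqs. (3)-(5) against the ground truth mask.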

The self information [31] based approach highlights rare segments of the image as salient, and consequently it discards regions of uniform appearance, which can be observed from column nine of Fig. 9. Only the weighted contrast [30] based method has a near perfect performance on detecting the salient regions, which most of the other methods in consideration fail to achieve; but the previous experiments on eye-gaze prediction have shown that the weighted contrast [30] based approach is not highly effective there. The proposed method highlights the salient regions along with the finer details effectively, which shows that it is not biased towards edges like the other methods.

Fig. 9. Qualitative analysis of results for the MSRA [12] dataset. The first column contains the original image. The remaining columns contain the images overlaid with thresholded (using Tsai's algorithm [62]) saliency maps obtained from the respective methods mentioned in the figure header. Observe that the proposed method highlights the salient regions along with its finer details effectively.

5.3. Computational run time

We evaluated the run-time of the proposed saliency approach with reference to the other methods in consideration. The run-times of the various methods were benchmarked on three different scales of a color image (205 × 103, 308 × 195 and 410 × 259 pixels), as shown in Table 3. Run-times were computed using Matlab v7.0 (R2010a) on an Intel Core 2 Duo processor with Ubuntu 10.04.1 LTS (Lucid Lynx) as operating system. While the codes pertaining to the global contrast [25], symmetric contrast [26], entropy-based [29], self information [31], local steering kernel [39] and weighted contrast [30] saliency approaches are pure Matlab codes, the original plugins of the multi-scale [15] and graph-based [41] methods are quasi Matlab codes which call C++ functions for run-time optimization. The original plugin for the distance transform [28] is a binary executable, while its original coding language is unknown. The proposed method is programmed in Matlab. An absolute comparison on the basis of run-time might penalize the methods which are coded in Matlab script, as they are relatively slower than their C++ counterparts. Nevertheless, it gives a relative overview of the run-time performance of all the methods under consideration.

Table 3
The computational run-time (in s) of the various saliency methods under consideration with respect to image sizes 205 × 103, 308 × 195 and 410 × 259. Code types: global contrast [25], symmetric contrast [26], entropy-based [29], self information [31], local steering kernel [39], weighted contrast [30] and the proposed method are Matlab; graph-based [41] and multi-scale [15] are Matlab with C++; distance transform [28] is a binary.
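A relative run-time comparison like the one in Table 3 only needs a consistent timing harness. The sketch below is a generic Python helper under our own naming (the paper's benchmarks were run in Matlab):

```python
import time

def median_runtime(fn, image, repeats=5):
    """Median wall-clock time of one saliency call; the median resists
    one-off scheduler hiccups better than a single measurement."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(image)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]
```

Running the same harness on the same machine for every method keeps the comparison relative, which is all the discussion above relies on.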

pp. Shan. R. Saliency based on information maximization.L. Hamidi. Image and Vision Computing 28 (2010) 1130–1145. Q. 967–973. Caron. [30] T. Hemami. F. J. [34] X. [14] T. Susstrunk. pp. ¨ [7] R. In future. A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression.L. T. [16] M. Wils. 2005.H. Walther. Dynamic visual attention: searching for coding length increments. Schomaker. L. Pattern Recognition 42 (2009) 2363–2371. [4] J.K. S. 2010. Most of the existing approaches to compute a saliency map fail to perform equally well on both of the aforementioned tasks. Jurie. Metaxas. Sun: a Bayesian framework for saliency using natural statistics. pp. Y. in: Proceedings of the International Conference on Computer Vision Theory and Applications. Xu. 166–173. [31] L. Nagai. N. Susstrunk. The rest of the methods process the input image in its original scale and hence the run-time changes with the input image dimension. Moosmann. Ehinger. 22. T. 2010.N. [5] L. [27] G. Makris. S.B. Despite this we recommend to tune the sampling rate in direct proportion to the size of the image. 2106–2113.N. [10] Y. [29] N. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition—Workshops. In spite of its simplicity and elegance the proposed method fails to capture saliency when color contrast is either low or not being the principal influence. Yanulevskaya. Bruce. pp. [33] Y. 1–8. Zhang. 270–273. 235065).D. 355–362. Saliency detection based on 2d log-gabor wavelets and center bias. 2009. A computational model for saliency maps by using local entropy. in: Proceedings of the Annual Review of Neuroscience. pp.I. G. pp. From bottom-up visual attention to robot action learning. [35] Y. [2] F. Learning to detect a salient object. F.M. in: Proceedings of the European Conference on Computer Vision. 2011. Estrada. A. 
in: Proceedings of the ECCV International Workshop on the Representation and Use of Prior Knowledge in Vision. S. Zhang. in: Proceedings of the IEEE International Conference on Image Processing. ¨ [25] R. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Erich. pp. It should be observed that the proposed method achieves this performance despite being simple as compared to graph-based [41]. 2006. Computational visual attention systems and their cognitive foundations: a survey. F. 2010. Ahmadabadi. 1–8. [3] P. Spatio-temporal saliency detection using phase spectrum of quaternion fourier transform. Combining attention and recognition for rapid scene analysis. in: Proceedings of the IEEE Workshop on Applications of Computer Vision. Ma. pp. pp. B. Neurocomputing 65-66 (2005) 125–133. Jia. Susstrunk. Duan.N. Vincent. Huang. Pattern Recognition 40 (2007) 2521–2529. 2009. in: Proceedings of the International Conference on Multimedia. we plan to address this bottleneck and further improve our method to capture higher level semantics like detecting pop-outs in terms of orientation. Saliency driven total variation segmentation. Wang. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Our results have shown that the proposed method attains state-of-the-art performance on the task of eye-gaze correlation and also has a comparable performance with the other methods for the task of salient region detection. Itti. [13] N. N. Object of interest detection by saliency learning. Larlus.N. in: Proceedings of the IEEE International Conference on Multimedia and Expo. Colby. 2007. This work has been supported by the RobotDoC Marie Curie Initial Training Network funded by the European Commission under the 7th Framework Programme (Contract no. Frintrop. Learning saliency maps for object categorization. L.D. 3093–3096. L. in: Proceedings of the International Conference on Computer Vision Systems. 
local steering kernel-based [39] and other existing sophisticated methods. 2368–2375. Lu. Hou. Xu. Liu. 617–620. Wang. 2008. [22] X. Hirzer. in: Proceedings of the IEEE International Conference on Development and Learning. Rosin. pp. Christian Schnier and Subhashree Vikram who proof read this article at various stages. IEEE Transactions on Image Processing 19 (2010) 185–198. Geusebroek. C. C. Cottrell. [20] M. Zhang. J. M. Wang. [18] D. . Torben Toniges. Tang. Koch. pp. M. / Pattern Recognition 45 (2012) 3114–3124 3123 It can be observed from Table 3 that the run-time of the local steering kernel [39]. 1999. Bischof. vol. Zheng. L. [23] Y.-N. Sun. Borji. Incremental sparse saliency detection. Tang. Koch. G. M. [19] C. L. Visual attention for robotic cognition: a survey. M. H. F.K. J. Araabi. P. 1–8. Achanta. F. Features that draw visual attention: an information theoretic perspective. Ullman.B. [21] X. Begum. Measuring visual saliency by site entropy rate. Q. as generating large number of arbitrary sized image windows often leads to a higher probability of completely enclosing a salient region or an object. X. Marks. [15] L. the graph-based [41] and the multi-scalebased [15] saliency approaches do not change significantly irrespective of the size of the input image. Vikram et al. pp. X. B. Tian. [17] S. D. 2009. [24] V. Judd. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. A. E. Torralba. Khuwuthyakorn. L. Niebur. Advances in Neural Information Processing Systems (2008) 681–688. Zhang. But our method along with the graph-based [41] approach has a reliable performance on both these diverse tasks. ¨ [26] R.-Y. Acknowledgments Tadmeri Narayan Vikram would like to thank his colleagues ¨ Miranda Grahl. Urschler. Fang. Robles-Kelly. Space and attention in parietal cortex. ACM Transactions on Applied Perception 7 (2010). 636–649. 2007. L. 
This is on account that the input image is rescaled to pre-specified dimension mentioned in their source codes. in: Proceeding of the IEEE International Conference on Computer Vision. Yang. 2010. [11] C. Wang.Author's personal copy T. 2009. Itti. Hou. Cognitive Computation 3 (2011) 223–240. L. H. 2009. Human Neurobiology 4 (1985) 219–227. L. de Boer. 6:1–6:39. in: Proceedings of the ACM International Conference on Multimedia. Use of power law models in detecting region of interest. B. [6] M. IEEE Transactions on Autonomous Mental Development 3 (2011) 92–105. Tsotsos. 66–75. Karray. [32] W. Liu. Zhou. Significance of the Weibull distribution and its sub-models in natural image statistics. S. Koch. Learning to predict where humans look. pp. Online learning of taskdriven object-based visual attention control. Guo. Achanta. It has also been shown in the experiments that a very small number of random rectangles (n¼60) is adequate to attain good results. Advances in Neural Information Processing Systems (2005) 155–162. 1597–1604. Q. 2009. P. Li. Experiments were carried out on large publicly available datasets of York University [13]. Achanta. Zhang. Tscherepanow. [8] C. pp. M. Bonaiuto. References [1] C. C. H. Bruce. Donoser. 90–97. 319–349. B. Estrada. J. J. H. 817–824. pp. pp. shape and proximities. S. Y. [12] T. Attention in hierarchical models of object recognition. Yang. Tong. 2653–2656. Y. K. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. H. Context saliency based image summarization. M. Guo. in: Proceedings of the IEEE International Conference on Computer Vision. Lin. Discussion and conclusion We have compared and contrasted nine existing state-of-the-art approaches to compute a saliency map along with the proposed method for the two different tasks of eye-gaze correlation and salient region detection. A model of saliency-based visual attention for rapid scene analysis. Goldberg. Huang. 
Salient region detection and segmentation. Predicting eye fixations on complex visual stimuli using local symmetry. pp. Durand. Zhou.W. Xu. Wrede. Frequency-tuned salient region detection. Saliency detection: a spectral residual approach. 6. Temporal spectral residual: fast motion saliency detection. Li. 2009. Shi.B.E. Shifts in selective visual attention: towards the underlying neural circuitry. 2010. in: Proceedings of the AAAI Conference on Artificial Intelligence. Progress in Brain Research 165 (2007) 57–78. Vikram. IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (1998) 1254–1259. Cui. [28] P. W. Kootstra. D. pp. in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Shum. A simple method for detecting salient regions. MIT [14] for eyegaze correlation and the MSRA dataset [12] for salient region detection. A random center surround bottom up visual attention model useful for salient region detection. pp. Gao. in: Proceeding of the IEEE International Conference on Image Processing. Saliency detection using maximum symmetric surround. 979–982. Christensen. J. Exploiting the ability of the proposed method to detect a salient region as a pre-processor for object detection is also envisaged. 2008. pp. Journal of Vision 8 (2008) 1–20. M. J. [9] A. 1–6. Y. 2009.

From 1999 to 2002 she pursued a Ph. Selecting salient objects in real scenes: an oscillatory correlation model. Sang. Perona. [58] P. Tsai. L. Mozer. [46] R.F. [54] J. Learning receptor positions. Speech and Signal Processing. Shen. in: Proceedings of International Symposium on Image and Signal Processing and Analysis. Marko Tscherepanow received his diploma degree in computer science from Ilmenau Technical University. Q. B. France between 2008 and 2010.A. Gevers. Tadmeri Narayan Vikram has received Masters of Computer Applications from University of Mysore.-C. Scholkopf. in 2006. A nonparametric approach to bottom-up visual saliency. 73–80.J. [52] R. Static and space-time visual saliency detection by selfresemblance. [47] G. and Image Processing 29 (1985) 377–393. Milanfar. [56] A. T. pp. L.M. Stimulus bias. in: Proceedings of the European Workshop on Visual Information Processing. Salient region detection by modeling distributions of color and orientation. at the Applied Informatics Group in Bielefeld University. Cao. in: Proceedings of the IEEE International Conference on Multimedia and Expo. 13–16. F. University of Caen. Le Meur. pp. Le Callet. His research interests include computational modeling of visual attention and its applications in developmental robotics. Liu. pp. Y. 2010. M. Relevance of a feed-forward model of visual attention for goal-oriented and free-viewing tasks. Afterwards. L. Goadrich. in: Proceedings of the International Conference on Neural Information Processing. Image and Vision Computing 26 (2008) 114–126. A two-stage approach to saliency detection in images. After her Ph. D. thesis entitled ‘‘Image Analysis Methods for Location Proteomics’’. Alexe. pp. in: Proceedings of International Conference on Machine Learning. Face saliency in various human visual saliency models. Mahadevan.A. V. Esaliency (extended saliency): meaningful attention using stochastic image modeling. Wang. in: Proceedings of IEEE International Conference on Acoustics. 
¨ [42] W. pp.-S. [38] R. C. Rothenstein. R. N. [39] H. Huang. Journal of Vision 9 (12) (2009) 1–27. Saliency based on multi-scale ratio of dissimilarity.N. [60] Z. Deselaers. 965–968. Valenti. 23–34. Hu. Cheikh. 2010. The relationship between precision–recall and ROC curves. What is an object? in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. he joined the Applied Informatics Group at Bielefeld University. Voluntary attention enhances contrast appearance. Carrasco.D. 15. 327–332. Y. and classification. Franz.D. Nosofsky. J. Computer Vision. [50] B. C. Liu. [61] L. Image saliency by isocentric curvedness and color. Elazary. experience-based theory of attentional control. pp. [48] V. He worked as a research engineer at GREYC.A. degree (Dr. M. he finished his Ph. B. asymmetric similarity. J. Gopalakrishnan.G. Ferrari. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (2010) 693–708. M. Koch. . 1–9. B. Avraham. 2009.D. to pursue Ph. Chevet. [51] A. Yu. D. Vikram et al. Currently she heads the Applied Informatics Group at Bielefeld University. feature selection. [41] J. C. 253–256. 1991. Hebbian-based neural networks for bottom-up visual attention systems.O. Ahumada. Huang. 2010. Jung. C. N. pp. Le Meur. Wang. [44] Y.C. IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2006) 802–817. D. 13–18. Zhang. [59] J. [37] T. / Pattern Recognition 45 (2012) 3114–3124 [36] D. Attention links sensing to recognition. T. Seo. Graphics. P. [49] O. pp. F. Li. Advances in Neural Information Processing Systems (2007) 545–552. pp. F. Lindenbaum. Cheikh. Kim. in: Proceedings of IEEE International Conference on Computer Vision. Advances in Neural Information Processing Systems (2007) 689–696.-Ing. 2006. in: Proceedings of the IEEE International Conference on Image Processing. [53] M. Zhao. Taosheng Liu. Harel. Moment-preserving thresholding: a new approach. Treue. V. 
Britta Wrede received her masters degree in computational linguistics and the Ph. in: Proceedings of the International Conference on Pattern Recognition.L. in: Computational Models of Visual Processing. USA. L.D. and the application of such methods in cognitive robotics. Journal of Vision 8 (2008) 1–15. Tang. Barba.) in computer science from Bielefeld University in 1999 and 2002 respectively. evolutionary optimization. D. Vasconcelos. 2009. pp. Rajan. Perceptual enhancement of contrast by attention. Kim. 2010. IEEE Transactions on Multimedia 11 (2009) 892–905. Salient region detection with opponent color boosting. Trends in Cognitive Sciences 8 (10) (2004) 435–437. Saliency detection: a self-ordinal resemblance approach. 2010. Sharma.D. Graph-based visual saliency. Nonparametric saliency detection using kernel density estimation. Wilder. Wickens. [55] S. Davis. His current research interests include incremental on-line learning. P. M. Wichmann. L.H. she received a DAAD PostDoc fellowship at the speech group of the International Computer Science Institute (ICSI) at Berkeley. The discriminant center-surround hypothesis for bottom-up saliency. Germany. P.Author's personal copy 3124 T. Thoreau. Germany. [40] W. Advances in Neural Information Processing Systems (2008) 497–504. In December 2007. Wang.D. Cognitive Psychology 23 (1991) 94–140. D. N. India.H.A. 1260–1265. He is awarded the prestigious Marie Curie fellowship since 2010. MIT Press.K. [62] W. in 2003. Xue. Romero. program at Bielefeld University in joint affiliation with the Applied Informatics Group and the graduate program taskoriented communication. Zhang. Itti. Kienzle. 2008. neural networks. pp. IEEE Transactions on Image Processing 19 (2010) 2801–2813. Z. 233–240. Gao. Psychological Science 20 (3) (2009) 354–362. Journal of Vision 11 (2) (2011) 1–30. Sebe. J. Tsotsos. Hardeberg.Y. 2009. A coherent computational approach to model bottom-up visual attention. An integrative. 2185–2192. 
Interesting objects are visually salient. Neural Networks 24 (2011) 54–64. Germany. [45] M.J. [57] O. Quiles. [43] Z. M.

