
2009 Canadian Conference on Computer and Robot Vision

A Support Vector Machine Based Online Learning Approach for Automated Visual Inspection
Jun Sun¹,² and Qiao Sun¹
¹ Department of Mechanical and Manufacturing Engineering, University of Calgary, Calgary, Alberta T2N 1N4, Canada
² Alberta Research Council, Calgary, Alberta T2L 2A6, Canada
Abstract

In the manufacturing industry there is a need for an adaptable automated visual inspection (AVI) system that can be used for different inspection tasks under different operation conditions without requiring excessive retuning or retraining. This paper proposes an adaptable AVI scheme using an efficient and effective online learning approach. The scheme uses a novel inspection model that consists of two sub-models, for localization and verification. The region localization module uses a template-matching technique to locate the subject to be inspected based on the localization sub-model. The defect detection module uses the representative features obtained from the feature extraction module and executes the verification sub-model built in the model training module. A support vector machine (SVM) based online learning algorithm is proposed for training and updating the verification sub-model. In the case studies, the adaptable AVI scheme demonstrated promising performance with respect to training efficiency and inspection accuracy. The expected outcome of this research will be beneficial to the manufacturing industry.

978-0-7695-3651-4/09 $25.00 © 2009 IEEE  DOI 10.1109/CRV.2009.13

1. Introduction

Increasingly, automated visual inspection (AVI) systems are being used for quality assurance of product assembly processes in production lines. Instead of human inspectors, an AVI system can perform inspection tasks to verify that parts are properly installed and to reject improper assemblies. Conventionally, most existing AVI systems use matching-template or rule-based inspection models, which are usually pre-defined manually by the system developer through trial and error. It has been recognized that defining either an appropriate matching template or effective inspection rules for a particular inspection problem is not a trivial task [1]-[3]. Hence, a conventional AVI system lacks adaptability when changes happen in the production line. For instance:

- The production line is reconfigured to manufacture a new product. This situation requires adapting the existing AVI system to deal with new assembly parts.
- The system operation conditions, such as the illumination condition and camera settings, have changed or drifted after a certain period of time. As a result, the image representation of the inspected assembly part may change, which renders the existing system obsolete. This situation requires adjusting or retuning the existing system so that it can work properly under the new operation conditions.

For the past two decades, researchers have attempted to apply machine-learning techniques, such as neural networks and neuro-fuzzy systems, to improve the adaptability of AVI systems [4]. These learning techniques are commonly used in AVI systems to build inspection models or functions from training samples in an offline fashion. In offline learning, all training samples are obtained beforehand and each training sample is assigned a class label (e.g., defective or non-defective) as the desired system output (i.e., the inspection result). The role of the human inspector is to label the training samples according to the quality standard. The established inspection model or function is not changed after the initial training process has been completed.

With this learning capability, the system can be trained to handle different inspection problems. However, end users often raise the concern that the performance of an offline-learning approach relies heavily on the quality of the initial training data. In many situations it may be difficult or even impossible to collect sufficient representative training samples over a limited period of time, since doing so requires simulating all possible scenarios of future events. To address this issue, there is an emerging research interest in applying online learning to the development of adaptable AVI systems [2][5]. The idea is to incorporate new inspection patterns into the inspection model as they are encountered during system operation; as such, the system does not require excessive initial training before it can function. This paper presents an adaptable AVI scheme with a support vector machine (SVM) based online learning approach. In particular, the following objectives are emphasized: (i) utilizing an adaptable inspection model that can be trained online, adapting itself to different inspection problems; (ii) developing an efficient and effective online learning algorithm that minimizes the cost of sample labeling while building an accurate inspection model.

[Figure 1. Adaptable inspection model: an acquired image of the assembly base with the assembly part (Clip A) and the verification region (VR); the localization sub-model (ML) enclosing the VR; and the verification sub-model (MV) holding representative VR patterns (Pattern 1, 2, 3).]

2. An adaptable AVI scheme

In this paper, we propose a novel adaptable inspection model that can adapt to different inspection problems through online learning. Take an assembly process as an example: the process involves installing a part at a required site on an assembly base. As illustrated in Figure 1, a gray-scale image is acquired by a camera for the inspection of the assembly part. Within the image, the region of the installation site is specified as the verification region (VR) for visual inspection. The VR is the subset of the image that contains the subject being inspected. For a defect detection problem, the VR may show different appearances reflecting non-defective or defective part assembly situations. The adaptable inspection model consists of two sub-models:

The Localization Sub-model (ML) encloses the VR and contains features independent of the subject being inspected within the VR. The features in ML are usually invariant across both defective and non-defective samples, so the VR can be located by using ML as a reference or landmark within an acquired image.

The Verification Sub-model (MV) is a classification model that incorporates a set of representative VR patterns and can then be used to identify the inspected image as defective or non-defective.

Since ML is invariant across all inspection samples, the system developer can specify it by simply drawing a box enclosing the inspected subject. MV is built and updated from training samples using the SVM learning technique.

The adaptable AVI scheme is developed with four major modules: region localization, feature extraction, defect detection, and model training, as illustrated in Figure 2. Given an inspection sample, the region localization module applies an image template-matching technique with the predefined ML to locate the VR within an acquired image. The VR image, representing the subject to be inspected, is then processed by the feature extraction module, in which l representative features are extracted to generate the feature vector x representing the inspection sample. With an existing MV, defect detection is done by generating the inspection result y using x as input. The model training module updates the existing MV through the efficient and effective online learning algorithm proposed in this paper, summarized as follows:

[Figure 2. Framework of the adaptable AVI scheme: region localization (VR) → feature extraction → defect detection; certain results are output directly, uncertain samples go to manual inspection, and the labeled samples feed the model training module that updates the adaptable AVI model.]

Online Learning Algorithm for the Adaptable AVI Scheme

Initialization:
Build MV from an initial training set of size n, D = {(x1, y1), (x2, y2), ..., (xn, yn)}, where y denotes the numerical class label +1 or -1, representing the defective or non-defective class, respectively. Calculate and set the confidence threshold pt for the determination of uncertain/certain results based on the estimated prediction accuracy of MV. An inspection sample with an uncertain result, called an uncertain sample, is requested to be labeled through manual inspection.

Updating:
For each newly arrived sample xi without a class label:
1) Classify xi using the existing MV, i.e., yi = MV(xi).
2) Calculate the certainty of the classification result, p(xi).
3) If the certainty is unsatisfactory, i.e., p(xi) < pt:
   a) Request a class label yi for xi from the manual inspection;
   b) Add the labeled sample (xi, yi) to D;
   c) Update (i.e., retrain) the existing MV with D;
   d) Estimate the prediction accuracy of MV;
   e) Reset the confidence threshold pt.
   Otherwise (p(xi) >= pt), output yi as the certain result, i.e., assign the class label +1 or -1.
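The update loop above can be sketched as follows. Note this is a toy illustration: a trivial centroid-midpoint hyperplane stands in for the SVM-based MV (the actual scheme trains an SVM with SMO, Section 5), `oracle` plays the role of the human inspector, and all names are assumptions of this sketch, not the paper's code.

```python
# Sketch of the online update loop. A centroid-based linear classifier
# stands in for the SVM-based verification sub-model MV; the certainty
# p(x) is approximated by the unsigned score w.x + b.

class CentroidModel:
    """Stand-in for MV: hyperplane midway between the two class centroids.
    Assumes both classes are present in the training set D."""
    def fit(self, D):
        pos = [x for x, y in D if y == +1]
        neg = [x for x, y in D if y == -1]
        cp = [sum(v) / len(pos) for v in zip(*pos)]   # +1 centroid
        cn = [sum(v) / len(neg) for v in zip(*neg)]   # -1 centroid
        self.w = [a - b for a, b in zip(cp, cn)]      # normal vector
        mid = [(a + b) / 2 for a, b in zip(cp, cn)]   # midpoint
        self.b = -sum(wi * mi for wi, mi in zip(self.w, mid))

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

def online_inspect(model, stream, D, p_t, oracle):
    """For each unlabeled sample: classify; if certainty < p_t, request a
    label (manual inspection), add it to D and retrain; else emit result."""
    results, n_manual = [], 0
    for x in stream:
        s = model.score(x)
        y_hat = +1 if s >= 0 else -1
        if abs(s) < p_t:              # uncertain sample
            y = oracle(x)             # manual inspection
            D.append((x, y))
            model.fit(D)              # retrain MV with updated D
            n_manual += 1
            results.append(y)
        else:
            results.append(y_hat)     # certain result
    return results, n_manual
```

In the real scheme the threshold pt would also be re-estimated after each retraining (steps d and e); the sketch keeps it fixed for brevity.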

3. Region localization

In the region localization module, ML is treated as a matching template and the module seeks the best-matching occurrence of ML within the image. The matched ML occurrence is normalized through scaling, rotation, and landmark alignment, and the VR is then identified within the matched ML occurrence to represent the subject to be inspected. A template-matching technique, namely the edge-based pattern-matching method, is employed in this module. Instead of comparing every pixel of the whole image, the edge-based technique compares only edge pixels with the matching template. This offers several advantages over the pixel-to-pixel correlation method. For example, it provides reliable pattern identification when part of an object is obstructed, as long as a certain percentage of its edges remains visible. Since only edge information is used, the technique can rotate and scale the edge data to find an object regardless of its orientation or size. In addition, it tolerates lighting variations well. We used the Geometric Model Finder in the Matrox Imaging Library to implement the edge-based geometric pattern-matching function for region localization.
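The Geometric Model Finder itself is a proprietary library function. Purely to illustrate the core idea of scoring candidate placements by edge-pixel agreement (and ignoring the rotation, scale, and occlusion handling the real tool provides), a toy sketch on binary edge maps:

```python
# Toy illustration of matching on edge pixels only: slide a binary edge
# template over a binary edge image and score each offset by the fraction
# of template edge pixels that coincide with image edge pixels.

def edge_match(image, template):
    """Return ((row, col), score) of the best placement of `template`."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n_edges = sum(sum(row) for row in template)
    best = ((0, 0), -1.0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            hit = sum(
                template[i][j] and image[r + i][c + j]
                for i in range(th) for j in range(tw)
            )
            score = hit / n_edges
            if score > best[1]:
                best = ((r, c), score)
    return best
```

A partial-occlusion tolerance falls out naturally: a placement still scores highly when only a percentage of the template's edge pixels find matches.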

Three learning strategies are used to facilitate the online learning algorithm. The first is an efficient and effective method to estimate the prediction accuracy of the SVM-based MV. The second is an adaptive margin sampling approach that reduces the cost of sample labeling and model updating. The third is a grid-search method to choose the optimal SVM model parameters. The details of the three strategies are described in Section 6.

4. Feature extraction

The feature extraction module generates the feature vector that represents the VR image as the input of MV. As illustrated in Figure 3, the following steps are taken in the feature extraction process:

1) The image is first binarized using Otsu's method, which chooses a globally optimal threshold to maximize the separability of the inspected subject and the background in gray levels [6].
2) On the binarized image, a binary pixel mask is placed around the subject to be inspected to reduce unnecessary background noise.
3) A blob analysis technique is used for a further reduction of noise pixels. The blob analysis identifies blobs formed by sets of connected pixels; identified blobs are removed if they are considered noise according to certain criteria.
4) The perimeter pixels on the boundary of the inspected subject are determined using the image dilation and erosion operations.
5) The principal component analysis (PCA) technique is used to determine the center and orientation of the inspected subject. In this step, each perimeter pixel is denoted by a two-dimensional vector consisting of its x and y coordinates in the image, and the center of the perimeter pixels is denoted by the two-dimensional vector [X0, Y0]T. The two eigenvectors e1 and e2 and their corresponding eigenvalues λ1 and λ2 are computed using the PCA technique. Since the eigenvectors e1 and e2 are orthogonal, only the orientation θ1 and the accumulated contribution ratio r1 of the eigenvector e1 are selected as representative features.

Upon completing the feature extraction process, the four representative features X0, Y0, θ1, and r1 are extracted to generate the feature vector representing a given inspection sample:

x = [X0, Y0, θ1, r1]T
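Step 5 can be sketched directly on a list of perimeter coordinates; the eigendecomposition is worked out by hand for the 2×2 covariance case. This is an illustrative reimplementation under the definitions above, not the paper's MATLAB code:

```python
import math

# Sketch of step 5: PCA on perimeter pixel coordinates to recover the
# center (X0, Y0), the principal orientation theta1, and the contribution
# ratio r1 = lambda1 / (lambda1 + lambda2) used as representative features.

def pca_features(pixels):
    """pixels: list of (x, y) perimeter coordinates."""
    n = len(pixels)
    x0 = sum(p[0] for p in pixels) / n
    y0 = sum(p[1] for p in pixels) / n
    sxx = sum((p[0] - x0) ** 2 for p in pixels) / n
    syy = sum((p[1] - y0) ** 2 for p in pixels) / n
    sxy = sum((p[0] - x0) * (p[1] - y0) for p in pixels) / n
    # eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    gap = math.sqrt(max(tr * tr / 4 - det, 0.0))
    lam1, lam2 = tr / 2 + gap, tr / 2 - gap
    # eigenvector for lam1 is (sxy, lam1 - sxx) when sxy != 0
    if sxy:
        theta1 = math.atan2(lam1 - sxx, sxy)
    else:
        theta1 = 0.0 if sxx >= syy else math.pi / 2
    r1 = lam1 / (lam1 + lam2) if lam1 + lam2 else 1.0
    return x0, y0, theta1, r1
```

For perimeter pixels spread along a 45° line, for instance, the sketch recovers theta1 = π/4 and r1 = 1 (all variance along e1).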

The feature extraction module is implemented using the relevant functions provided by the Matlab image processing toolbox.

[Figure 3. Feature extraction process: the VR image is binarized and masked to isolate the inspected subject, blob analysis removes noise, the perimeter is identified, and PCA yields the center (X0, Y0) and the principal orientation θ1 along the eigenvector e1.]

5. Background of SVM

The support vector machine (SVM) is an effective machine learning technique for classification [7][8]. For a non-linear classification problem, a classification function is to be found from a given set of labeled training samples (xi, yi), i = 1, 2, ..., n, where xi is the sample vector and yi ∈ {+1, -1}. The sample vector x is first mapped into a higher-dimensional space, in which a linear hyperplane is constructed:

w·z + b = 0,  z = φ(x)    (1)

where φ(x) is the space-transforming function, and w and b are the weight vector and bias of the linear hyperplane. The hyperplane has the following two margins:

w·z + b = -1  and  w·z + b = +1    (2)

Finding the optimal hyperplane is a constrained optimization problem with the following primal objective function and constraints:

min P(w, b, ξ) = (1/2)‖w‖² + C Σi ξi
s.t. yi(w·zi + b) ≥ 1 - ξi,  ξi ≥ 0,  ∀i    (3)

where ξi denotes the classification error for each training sample and C is the penalty parameter for the error term. If the error term is not included in (3), the hyperplane is a hard-margin classifier that attempts to separate all samples correctly between the two classes. Because there are noises and outliers in the training data, in many situations such a hard-margin classifier may not achieve good performance in classifying future unseen data. By adding the error term and the penalty parameter, the hyperplane obtained from (3) becomes a soft-margin classifier, which reduces the influence of noise and outliers in the training data. The constrained optimization problem (3) can be solved through its dual problem:

max D(α) = Σi αi - (1/2) Σi,j αi αj yi yj (zi·zj)
s.t. 0 ≤ αi ≤ C, ∀i,  and  Σi αi yi = 0    (4)


With the identified optimal αi, i = 1, 2, ..., n, the solution for the weight vector w is:

w* = Σi yi αi zi    (5)

All training samples with αi > 0 at the solution are called support vectors; they represent all the patterns relevant to the classification problem. The samples with 0 < αi < C are called unbounded support vectors and lie on or between the two margins described by (2), while the samples with αi = C are called bounded support vectors and are misclassified training samples. Assuming there are m support vectors (si, yi), i = 1, 2, ..., m, the optimal weight vector w can be described using only these support vectors:

w* = Σi yi αi si    (6)

The bias b can be calculated in terms of w* and any unbounded support vector (s, y):

b* = y - w*·s    (7)

The corresponding classification function is then obtained as:

f(z) = sign(w*·z + b*) = sign(Σi yi αi (si·z) + b*)    (8)

If a kernel function K is used to substitute the dot products in (4) and (8), the calculation depends only on the kernel function, without directly dealing with the mapping function φ(x). The kernel function K is expressed as:

K(xi, xj) = zi·zj = φ(xi)·φ(xj)    (9)

In this paper, the radial basis function (RBF) is used as the kernel function:

K(xi, xj) = exp(-γ‖xj - xi‖²)    (10)

Based on the above descriptions, the SVM generates the following non-linear classification function with a set of support vectors:

f(x) = sign(Σi yi αi K(si, x) + b*)    (11)

In this paper we use the sequential minimal optimization (SMO) algorithm for training the SVM [9]. SMO is an iterative method that converges quickly to the optimal solution by iteratively updating the hypothesis.

6. Model training

In the proposed AVI scheme, the online learning algorithm described in Section 2 is used to build and update MV, i.e., the SVM-based classification function (11). To make the online learning algorithm effective and efficient, the following three learning strategies are employed.

6.1. Estimating prediction accuracy

In order to evaluate training sufficiency, it is useful to estimate the prediction accuracy, which reflects the model's performance on future unseen inspection samples. In the proposed online learning algorithm, the leave-one-out (LOO) method is used to estimate the prediction accuracy of the trained classification model MV; the LOO error is an unbiased estimator of the true generalization error, in contrast to the estimators obtained by k-fold cross-validation and splitting-sample methods. In the LOO method, the classification model is tested on a held-out training sample; if the sample is classified incorrectly, it is said to produce a LOO error. This process is repeated for all training samples, and the LOO error rate equals the number of LOO errors divided by the total number of training samples [10]. A disadvantage of the LOO method is its computational inefficiency, because it must build the model n times for a training set of size n. An efficient and effective method for estimating the LOO error of an SVM was developed by Joachims [10] and is adopted in the proposed online learning algorithm. In particular, this method does not require performing n rounds of re-sampling and retraining, but can be applied directly after training the model. Based on the solution of the SVM training problem and the corresponding classification error of each training sample, the LOO error rate can be calculated by

E = d / n    (12)

d = |{ i : (2αi R² + ξi) ≥ 1, i ∈ {1, ..., n} }|

where d counts the number of training samples for which the inequality holds. In the inequality, R² is an upper bound on the kernel function K(xi, xj) for any two training samples; for the RBF kernel, R² = 1.
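Computing this estimate from a trained solution is a one-liner over the multipliers and slacks; the α and ξ values in the test below are made-up illustrative numbers, not a real trained model:

```python
# Sketch of Joachims' LOO error estimate (12): count training samples with
# 2*alpha_i*R^2 + xi_i >= 1 and divide by n. For the RBF kernel, R^2 = 1.

def loo_error_rate(alphas, xis, r2=1.0):
    """alphas: dual multipliers; xis: slack (classification error) terms."""
    n = len(alphas)
    d = sum(1 for a, x in zip(alphas, xis) if 2 * a * r2 + x >= 1)
    return d / n
```

Because the estimate needs only quantities already produced by training, it can be recomputed after every online update at negligible cost, which is exactly why it suits the retraining loop of Section 2.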

6.2. Reducing online updates

In the online learning process, reducing the number of updates of MV also reduces the costs of manual sample labeling and of model retraining. The goal is to achieve good prediction performance of MV while requesting as few online updates as possible. In this paper, we propose an adaptive margin sampling method, in which a sampling heuristic determines whether a given sample is informative and should be labeled for updating the model. A given sample x can be classified by the existing MV. Let us define the certainty p(x) of the classification result as the distance from the given sample to the hyperplane of MV:

p(x) = | Σi yi αi K(si, x) + b |    (13)

That is, the closer a sample is to the hyperplane, the less certain its classification result. As illustrated in Figure 4, a confidence threshold pt is calculated based on the LOO error rate E (described in (12)) of the existing MV currently used for classification:

pt = 1 + E·H    (14)

where H is a user-defined parameter that denotes the default distance between the threshold and the margin when E equals 1 (i.e., 100%). As E decreases, the confidence threshold pt decreases adaptively. The sampling heuristic is summarized as follows: a given sample is considered to have a certain classification result if p(x) > pt; otherwise, it is an uncertain sample that is selected and labeled manually for updating the existing MV. With the adaptive margin sampling method, an online update is required only when an uncertain sample is encountered. Once the estimated LOO error of the trained MV converges after a certain number of samples has been processed, MV is considered stabilized and online updating may become unnecessary.

6.3. Selecting the SVM parameters C and γ

As described in Section 5, two important parameters, C and γ, must be pre-determined when applying the SVM technique with the RBF kernel function. Identifying the optimal parameters (C, γ) helps prevent over-fitting to outliers and noise in the training data. In the online learning algorithm, we use a grid-search method to choose the parameters (C, γ) [11], with minimization of the LOO error as the objective of the search; the pair (C, γ) with the minimum LOO error is chosen as the optimal parameters. Practically, the grid search is implemented on exponentially growing sequences of C and γ of the form {2^begin, 2^(begin+step), ..., 2^(begin+k·step), ..., 2^end}. For example, the search grids for C and γ are C = 2^-5, 2^-3, ..., 2^12 and γ = 2^-15, 2^-13, ..., 2^3. To reduce the computational cost of the search, a coarse grid is used first; after identifying a promising region on the grid, a finer grid search on that region can be conducted. Researchers have found the grid-search method to be simple but quite effective compared with more advanced heuristic search methods [11].
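The exponential grid and the search loop might be sketched as follows. Here `loo_error` is a placeholder for training a model with a given (C, γ) pair and estimating its LOO error; the synthetic function in the test merely stands in for that expensive step:

```python
# Sketch of the grid search over exponentially spaced (C, gamma) pairs.
# A coarse grid like grid(-5, 12, 2) x grid(-15, 3, 2) is searched first;
# a finer grid around the best pair can then be searched the same way.

def grid(begin, end, step):
    """Exponent sequence begin, begin+step, ..., up to end -> powers of 2."""
    return [2.0 ** e for e in range(begin, end + 1, step)]

def grid_search(loo_error, c_grid, g_grid):
    """Return the (C, gamma) pair with the smallest LOO error."""
    return min(((c, g) for c in c_grid for g in g_grid),
               key=lambda pair: loo_error(*pair))
```

Swapping in a finer `step` over a narrowed exponent range gives the coarse-to-fine refinement described above without any change to the search code.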

7. Defect detection

In the proposed AVI scheme, the defect detection module classifies inspection samples as defective or non-defective using the trained MV. During the online learning process, this module identifies the uncertain samples that must be inspected manually and invokes the online learning process to update MV. Meanwhile, it provides confident inspection results for the certain samples.
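The classification carried out by this module is the evaluation of the SVM decision function (11) with the RBF kernel (10). A minimal sketch, where the support vectors, multipliers, and bias are made-up illustrative values rather than a trained MV:

```python
import math

# Sketch of the decision step: f(x) = sign(sum_i yi*ai*K(si, x) + b),
# with the RBF kernel K(a, b) = exp(-gamma * ||a - b||^2).

def rbf(a, b, gamma):
    """RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def decision(x, svs, b, gamma=1.0):
    """svs: list of (s_i, y_i, alpha_i). Returns the class label +1 or -1."""
    g = sum(y * alpha * rbf(s, x, gamma) for s, y, alpha in svs) + b
    return 1 if g >= 0 else -1
```

In the full scheme, the unsigned value |g| would also be compared against the threshold pt of (14) to decide whether the result is certain or the sample must go to manual inspection.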

8. Case studies

In the case studies, we applied the proposed adaptable AVI scheme to field data collected from an existing fastener inspection system for a truck cross-car beam assembly. In the assembly line, an AVI system examines a total of 46 metal clips inserted by assembly robots for their proper installation. The existing system works well, but only after an excessive amount of manual tuning; improving the system's adaptability to changes has therefore been a top priority.

[Figure 4. Confidence threshold: the margins at distance 1 from the hyperplane, with the threshold pt set a distance E·H beyond the margin.]

8.1. Performance measures

The following performance measures are used to evaluate the training efficiency and inspection accuracy of the proposed AVI scheme.

Manual Inspection Rate (RMI) and Automatic Inspection Rate (RAI). The efficiency of the online training is affected by RMI. Given the total number of processed inspection samples N, the number of manually inspected samples NMI, and the number of automatically inspected samples NAI, RMI and RAI are defined as:

RMI = NMI / N    (15)
RAI = NAI / N = 1 - RMI    (16)

The higher the RMI (i.e., the lower the RAI), the more human involvement is needed in the online training process.

False Positive Rate (RFP) and False Negative Rate (RFN). RFP and RFN measure the accuracy of automatic inspection. A false positive occurs when a non-defective sample is incorrectly classified as defective. Given the total number of non-defective samples NN and the number of false positives NFP, RFP is defined as:

RFP = NFP / NN    (17)

A false negative occurs when a defective sample is incorrectly classified as non-defective. Given the number of defective samples NP and the number of false negatives NFN, RFN is defined as:

RFN = NFN / NP    (18)

Although both RFP and RFN are important measures of inspection accuracy, RFN is more critical to manufacturers, as they desire not to release any defective product to customers.

8.2. Experimental results and analysis

This section presents the experimental results of inspecting one type of metal clip, Clip A, as shown in Figure 1. The camera was mounted at the top of the assembly robot, and illumination was provided by the overhead lighting from the ceiling of the plant. A sequence of inspection samples was collected for the experiment, including 101 non-defective (i.e., installed properly) and 105 defective (i.e., missing or installed improperly) samples. The first 140 samples in the sequence were used for training, while the remaining samples were held out as a test dataset. In the experiment with the online learning approach, the first 40 samples served as the initial training data. It was observed that the prediction accuracy of MV became stable after the following 100 samples (indexed 41 to 140) were processed. The experimental results, titled Online-Learning, are summarized in Table 1.

To demonstrate the performance of the online learning approach, three experimental results were also generated with the offline learning approach for comparison. Table 1 also presents these results, titled Offline-Learning I, II, and III, respectively. Using the same sample sequence as the online approach, the SVM-based MV was trained in batch mode with the samples indexed 1 to 40, 1 to 90, and 1 to 140, respectively. All samples used for the offline approach were assumed to be manually labeled (as defective or non-defective) beforehand, and the trained MV was evaluated on the same test data as the online approach. From Table 1, it can be seen that the inspection accuracy of the offline approach depends on the size of the training dataset collected beforehand. Compared to the offline approach, the online approach provided higher training efficiency: it achieved the same inspection accuracy (RFP = 5.41% and RFN = 0%) as Offline-Learning III after sequentially processing the same samples indexed 41~140, yet only 42% of the 100 processed samples required manual labeling. At the same time, the online approach provided highly accurate inspection (RFP = 0% and RFN = 0%) on the 58% of samples that were inspected automatically during online learning. To demonstrate the adaptability of the proposed AVI scheme, we also conducted several experiments with datasets acquired in different scenarios, inspecting different clips under changing operation conditions. The results again showed that the proposed AVI scheme performs well with respect to training efficiency and inspection accuracy.

Table 1. Experimental results of the online and offline learning approaches

Approach             | Sample Indexes                    | Training Performance (RMI, RAI, RFN, RFP) | Test Accuracy (RFN, RFP)
Online-Learning      | 1~140 (1~40 as initial training)  | 42%, 58%, 0%, 0%                          | 0%, 5.41%
Offline-Learning III | 1~140                             | -                                         | 0%, 5.41%
Offline-Learning II  | 1~90                              | -                                         | 6.90%, 5.41%
Offline-Learning I   | 1~40                              | -                                         | 24.1%, 8.11%
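The measures (15)-(18) are straightforward to compute from the raw counts; a small helper (the function and argument names are this sketch's own, not from the paper):

```python
# Sketch of the performance measures (15)-(18): manual/automatic inspection
# rates and false positive/negative rates, computed from counts.

def rates(n_manual, n_total, n_fp, n_nondefective, n_fn, n_defective):
    """Return RMI, RAI, RFP, RFN as fractions in [0, 1]."""
    r_mi = n_manual / n_total
    return {
        "RMI": r_mi,                      # manual inspection rate (15)
        "RAI": 1.0 - r_mi,                # automatic inspection rate (16)
        "RFP": n_fp / n_nondefective,     # false positive rate (17)
        "RFN": n_fn / n_defective,        # false negative rate (18)
    }
```

For example, the Online-Learning row of Table 1 corresponds to 42 manual requests out of 100 processed samples, giving RMI = 0.42 and RAI = 0.58.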

9. Conclusions

In this paper, we have presented an adaptable AVI scheme for the application of part-assembly inspection. The proposed scheme can adapt to changing inspection tasks and operation conditions through an online learning process, without requiring excessive retuning or retraining. An adaptable inspection model, consisting of the two sub-models ML and MV, plays a key role in the proposed scheme, which comprises four major modules: region localization, feature extraction, model training, and defect detection. Region localization is implemented with an edge-based geometric template-matching technique that locates the subject to be inspected based on ML. Defect detection is realized by using the representative features obtained by feature extraction and executing the MV built by model training. An efficient and effective online learning algorithm is developed using the SVM technique. In the case studies, the proposed AVI scheme demonstrated promising adaptability with respect to training efficiency and inspection accuracy. Future work will focus on refining and validating the adaptable AVI scheme with a greater range of data reflecting variations in both inspected parts and operation conditions. The expected outcome of this research will be beneficial to the manufacturing industry.

10. Acknowledgements

The authors would like to thank Van-Rob Stampings Inc. and Dr. Brian W. Surgenor's research group at Queen's University for providing sample images for the case studies.

11. References

[1] T.S. Newman and A.K. Jain, "A Survey of Automated Visual Inspection," Computer Vision and Image Understanding, vol. 61, no. 2, March 1995, pp. 231-262.
[2] G. Abramovich, J. Weng, and D. Dutta, "Adaptive Part Inspection through Developmental Vision," Journal of Manufacturing Science and Engineering, vol. 127, November 2005, pp. 1-11.
[3] H.C. Garcia, J. Rene-Villalobos, and G.C. Runger, "An Automated Feature Selection Method for Visual Inspection Systems," IEEE Transactions on Automation Science and Engineering, vol. 3, no. 4, October 2006, pp. 394-406.
[4] E.N. Malamas, E.G.M. Petrakis, M. Zervakis, L. Petit, and J.-D. Legat, "A Survey on Industrial Vision Systems, Applications and Tools," Image and Vision Computing, vol. 21, no. 2, February 2003, pp. 171-188.
[5] H. Jia, Y.L. Murphey, J. Shi, and T. Chang, "An Intelligent Real-time Vision System for Surface Defect Detection," Proceedings of the 17th International Conference on Pattern Recognition (ICPR'04), vol. 3, Cambridge, UK, August 23-26, 2004, pp. 239-242.
[6] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-9, no. 1, January 1979, pp. 62-66.
[7] V.N. Vapnik, Statistical Learning Theory, Wiley, New York, 1998.
[8] M.A. Hearst, S.T. Dumais, E. Osuna, J. Platt, and B. Scholkopf, "Support Vector Machines," IEEE Intelligent Systems and Their Applications, vol. 13, no. 4, July/August 1998, pp. 18-28.
[9] J.C. Platt, "Fast Training of Support Vector Machines Using Sequential Minimal Optimization," in Advances in Kernel Methods: Support Vector Learning (edited by B. Scholkopf, C.J.C. Burges, and A.J. Smola), The MIT Press, London, England, 1998.
[10] T. Joachims, "Estimating the Generalization Performance of an SVM Efficiently," LS-8 Report 625, Universität Dortmund, Fachbereich Informatik, 1999.
[11] C.W. Hsu, C.C. Chang, and C.J. Lin, "A Practical Guide to Support Vector Classification" (online). Available: