Semi-Supervised Learning
ABSTRACT
As a supervised learning algorithm, the standard Gaussian process achieves excellent classification performance. In this report, we present a semi-supervised algorithm for learning a Gaussian process classifier: it incorporates a graph-based construction of semi-supervised kernels in the presence of labelled and unlabeled data, and extends the standard Gaussian process algorithm into the semi-supervised learning framework. Our algorithm uses spectral decomposition to obtain the kernel matrices and employs a convex optimization method to learn an optimal semi-supervised kernel, which is then incorporated into the Gaussian process model. For Gaussian process classification, the expectation propagation algorithm is applied to approximate the Gaussian posterior distribution. The main characteristic of the proposed algorithm is that it incorporates the geometric properties of unlabeled data through globally defined kernel functions. The semi-supervised Gaussian process model has an explicit probabilistic interpretation, can model the uncertainty in the data, and can solve complex non-linear inference problems. In the presence of few labelled examples, the proposed algorithm outperforms cross-validation methods; we present experimental results demonstrating its effectiveness in comparison with other related work in the literature.
CHAPTER 1
INTRODUCTION
Semi-supervised learning [1] has attracted an increasing amount of attention in recent years and spans many research areas, such as semi-supervised classification, semi-supervised regression, semi-supervised clustering, and co-training. In this report, we primarily consider semi-supervised classification. Standard supervised learning methods use only labelled data (or features) to train classifiers. Due to the diversity of data, labelled instances are often difficult, expensive, and time-consuming to obtain, while unlabeled data may be relatively easy to collect in practice. Compared with supervised learning, semi-supervised learning can build better classifiers by using a large amount of unlabeled data together with a few labelled data.

In the fields of statistics and machine learning, much of the basic theory and many of the algorithms are shared. The primary differences between the two fields are the goal of learning and the type of problem solved. Statistics mainly considers how to understand the relationships between data and models, such as linearity or independence. In contrast, machine learning primarily focuses on making accurate predictions and on understanding the behaviour of algorithms. Because of these different objectives, the two fields have developed along different lines: learning algorithms in machine learning are widely used as black boxes, where one worries only about the input and the output, whereas in statistics it is usually difficult to describe such models and obtain satisfactory results. To some extent, the Gaussian process model [2-9] effectively bridges the two fields: it has an explicit probabilistic interpretation, which facilitates modelling the uncertainty of complex data sets, and it simultaneously provides a complete theoretical framework for model selection and probabilistic prediction.

Because the standard Gaussian process is a supervised learning algorithm, its posterior distribution is unaffected by unlabeled data, so unlabeled data cannot influence the location of the decision boundary. In this report, we show how to effectively extend the Gaussian process model into the semi-supervised framework by incorporating unlabeled data, and thereby improve the performance of Gaussian process classifiers. Since Gaussian processes rest on a Bayesian framework, we can address this problem through either the likelihood function or the prior distribution: (1) combine a Gaussian process prior with a likelihood function chosen so that the posterior distribution incorporates the cluster assumption and influences the location of the decision boundary. Lawrence [4] proposed the Null-Category Noise Model (NCNM), which amounts to a probabilistic margin, and Rogers [5] replaced the NCNM with a multinomial probit likelihood function, generalizing the binary setting to the multi-class setting. (2) Directly modify the kernel function of the Gaussian process prior so that it has the properties of a semi-supervised kernel and incorporates the information in both labelled and unlabeled data. Spectral clustering [11], diffusion kernels [12], and Gaussian random fields [13] are semi-supervised kernel methods of this kind; they are parametric approaches, for which it is difficult to choose an appropriate function family and to model the data accurately without enough degrees of freedom. In this report, we take the second route, working with the Gaussian process prior distribution. The proposed algorithm incorporates the geometric properties of unlabeled data through a graph-based spectral decomposition, and obtains an optimal non-parametric semi-supervised kernel which is combined with the Gaussian process model.
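To make the kernel construction concrete, here is a minimal sketch, not the implementation evaluated in this report: it builds a k-nearest-neighbour graph over the labelled and unlabeled points, takes the normalized graph Laplacian, and re-weights its eigenvectors with a fixed decreasing spectral transform r(λ) = 1/(λ + ε). That fixed transform is a simple stand-in for the convex optimization over spectral transforms described above; the function name and parameter values are illustrative only.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def semi_supervised_kernel(X, n_neighbors=10, eps=1e-2):
        """Illustrative graph-based semi-supervised kernel over all points X.

        Builds a kNN graph, forms the normalized Laplacian
        L = I - D^{-1/2} W D^{-1/2}, and re-weights its eigenvectors with
        r(lam) = 1 / (lam + eps), so that smooth eigenvectors (small lam)
        dominate. The fixed transform stands in for the convex optimization
        over spectral transforms described in the report.
        """
        # Symmetric adjacency matrix of the kNN graph over labelled
        # and unlabeled points alike.
        W = kneighbors_graph(X, n_neighbors=n_neighbors).toarray()
        W = np.maximum(W, W.T)

        # Normalized graph Laplacian.
        d = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt

        # Spectral decomposition (L is symmetric, so eigh applies).
        lam, U = np.linalg.eigh(L)

        # Decreasing spectral transform: K = sum_i r(lam_i) u_i u_i^T.
        mu = 1.0 / (lam + eps)
        return (U * mu) @ U.T   # Gram matrix over all n points

With the Gram matrix K in hand, K[i, j] serves directly as the covariance between points i and j; in the transductive setting, the labelled rows and columns train the Gaussian process classifier and the unlabeled ones are the test inputs.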
1.1 SUPERVISED, UNSUPERVISED, AND SEMI-SUPERVISED LEARNING
In order to understand the nature of semi-supervised learning, it will be useful first to take a look at supervised and unsupervised learning.
1.1.1 SUPERVISED AND UNSUPERVISED LEARNING
Traditionally, there have been two fundamentally different types of tasks in machine learning. The first one is unsupervised learning. Let X = (x_1, …, x_n) be a set of n examples (or points), where x_i ∈ X for all i ∈ [n] := {1, …, n}. Typically it is assumed that the points are drawn i.i.d. (independently and identically distributed) from a common distribution on X. It is often convenient to define the (n × d)-matrix X = (x_i)_{i ∈ [n]} that contains the data points as its rows, where d is the dimension of the points. The goal of unsupervised learning is to find interesting structure in the data X. It has been argued that the problem of unsupervised learning is fundamentally that of estimating a density which is likely to have generated X. However, there are also weaker forms of unsupervised learning, such as quantile estimation, clustering, outlier detection, and dimensionality reduction.
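To make the density-estimation view concrete, the following sketch, a toy example using scikit-learn rather than anything prescribed in this report, fits a kernel density estimate to an (n × d) data matrix X laid out with one point per row, exactly the convention defined above.

    import numpy as np
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)

    # (n x d)-matrix X: n = 200 points drawn i.i.d. from a 2-D standard
    # Gaussian, one point per row, matching the convention above.
    n, d = 200, 2
    X = rng.normal(size=(n, d))

    # Unsupervised learning as density estimation: fit a model of the
    # distribution that is likely to have generated X.
    kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

    # Log-density log p(x) at two query points under the estimate;
    # the origin should score higher than a point far in the tail.
    queries = np.array([[0.0, 0.0], [3.0, 3.0]])
    print(kde.score_samples(queries))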