An Application of Independent Component Analysis to DS-CDMA Detection

A Thesis Submitted to the College of Graduate Studies and Research in Partial Fulfillment of the Requirements for the Degree of Master of Science in the Department of Electrical Engineering, University of Saskatchewan

by Yue Fang

© Copyright Yue Fang, October 2006. All rights reserved.

PERMISSION TO USE

In presenting this thesis in partial fulfillment of the requirements for a Postgraduate degree from the University of Saskatchewan, it is agreed that the Libraries of this University may make it freely available for inspection. Permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professors who supervised this thesis work or, in their absence, by the Head of the Department of Electrical Engineering or the Dean of the College of Graduate Studies and Research at the University of Saskatchewan. Any copying, publication, or use of this thesis, or parts thereof, for financial gain without the written permission of the author is strictly prohibited. Proper recognition shall be given to the author and to the University of Saskatchewan in any scholarly use which may be made of any material in this thesis.

Requests for permission to copy or to make any other use of material in this thesis in whole or in part should be addressed to:

Head of the Department of Electrical Engineering
57 Campus Drive
University of Saskatchewan
Saskatoon, Saskatchewan, Canada S7N 5A9

ACKNOWLEDGMENTS

I would like to express my deepest gratitude to my supervisor, Professor Kunio Takaya, for his continuous support and guidance throughout my thesis work. My appreciation goes to Dr. Dodds and Dr. Ha Nguyen for their time and suggestions rendered to this thesis. I wish to thank all of you not just for being on my committee but for all of your guidance and assistance. A special thanks goes out to my wife for her endless encouragement.
ABSTRACT

This work presents the application of the theory and algorithms of Independent Component Analysis (ICA) to blind multiuser symbol estimation in the downlink of a Direct-Sequence Code Division Multiple Access (DS-CDMA) communication system. The main focus is on blind separation of the convolved CDMA mixture and the improvement of downlink symbol estimation. The term blind implies that the separation is performed based upon the observation only. Since the knowledge of system parameters is available only in the downlink environment, a blind multiuser detection algorithm is an attractive option in the downlink.

Firstly, the basic principles of ICA are introduced. The objective functions and optimization algorithms of ICA are discussed. A typical ICA method and one of the benchmark methods for ICA, FastICA, is considered in detail. Another typical ICA algorithm, InfoMAX, is introduced as well, followed by numerical experiments to evaluate the two ICA algorithms.

Secondly, FastICA is proposed for blind multiuser symbol estimation, as the statistical independence condition of the source signals is always met. The system model for simulation of the DS-CDMA downlink is discussed, and an ICA-based DS-CDMA downlink detector has been implemented with MATLAB. A comparison between the conventional Single User Detection (SUD) receiver and the ICA detector has been made, and the simulation results are analyzed as well. The results show that the ICA detector is capable of blindly solving the multiuser symbol estimation problem in the downlink of a DS-CDMA system. The convergence of the ICA algorithm is then discussed to obtain more stable simulation results. A joint detector, which combines ICA and SUD and where ICA is considered as an additional element attached to the SUD detector, has been implemented.
It was demonstrated that the joint detector gives the lowest error probability compared to the conventional SUD receiver and the pure ICA detector with training sequences.

Keywords: Direct-Sequence Code Division Multiple Access (DS-CDMA), FastICA, Independent Component Analysis (ICA), Principal Component Analysis (PCA)

LIST OF ABBREVIATIONS

3rd Generation (3G)
Additive White Gaussian Noise (AWGN)
Bit Error Rate (BER)
Blind Source Separation (BSS)
Direct-Sequence Code Division Multiple Access (DS-CDMA)
Direct Sequence Spread Spectrum (DSSS)
Double Sideband Suppressed Carrier (DSB-SC)
Frequency Division Multiple Access (FDMA)
Frequency-Shift Keying (FSK)
Independent Component (IC)
Independent Component Analysis (ICA)
Inter-Symbol Interference (ISI)
Multiple Access Interference (MAI)
Principal Component Analysis (PCA)
Probability Density Function (pdf)
Pseudo-Noise (PN) Code
Phase-Shift Keying (PSK)
Single User Detection (SUD)
Spread Spectrum (SS)
Time Division Multiple Access (TDMA)

Table of Contents

PERMISSION TO USE
ACKNOWLEDGMENTS
ABSTRACT
LIST OF ABBREVIATIONS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
1 INTRODUCTION
1.1 Motivation of Research
1.2 Research Objectives
1.3 Outline of the Thesis
2 INDEPENDENT COMPONENT ANALYSIS (ICA)
2.1 Preliminaries
2.2 History of ICA
2.3 Principal Component vs. Independent Component
2.4 Independence and Nongaussianity
2.5 Objective (Contrast) Functions for ICA
2.5.1 Mutual Information
2.5.2 Kurtosis and Negentropy
2.6 Algorithms for ICA
2.6.1 Preprocessing of the Data
2.6.2 InfoMAX
2.6.3 FastICA
2.7 Simulation of ICA
2.7.1 InfoMAX
2.7.2 FastICA
2.8 Summary of ICA Algorithm
3 DIRECT SEQUENCE CODE DIVISION MULTIPLE ACCESS (DS-CDMA)
3.1 Multiple Access
3.2 Spread Spectrum
3.3 Signal Model of a DSSS System
3.3.1 Signal Modulation
3.3.2 Signal Demodulation
3.4 Signal Model of DS-CDMA System
3.5 Probability of Error of DSSS System
4 APPLICATION OF ICA TO DS-CDMA DETECTION
4.1 Signal Model of DS-CDMA Downlink
4.2 Gold Code Generation
4.3 Application of ICA to DS-CDMA System
5 SYSTEM SIMULATION OF ICA BASED DETECTOR
5.1 Simulation of Conventional DS-CDMA Detector
5.2 Simulation of ICA-Based Detector
5.2.1 The First ICA-Based Detector
5.2.2 Some Factors in ICA Algorithm
5.3 Simulation of ICA-SUD Detector
5.3.1 Ambiguities of ICA Detector
5.3.2 ICA-SUD Detector
6 CONCLUSIONS AND RECOMMENDATIONS
6.1 Conclusions
6.2 Recommendations
REFERENCES
APPENDIX MATLAB SOURCE CODES

List of Figures

2.1 A set of four arbitrary source signals
2.2 Four mixtures of the source signals in Figure 2.1
2.3 The ICA estimates of the original source signals
2.4 An illustration of ICA model
2.5 Two principal components (p1, p2) of two-dimensional Gaussian data by PCA (left), and two independent components (q1, q2) of two-dimensional non-Gaussian data by ICA superimposed by PCA vectors (p1, p2) (right)
2.6 An illustration of density functions with various kurtosis
2.7 Flowchart of InfoMAX algorithm
2.8 The comparison of source signals and estimates
2.9 Four mixtures of the source signals in Figure 2.8
2.10 Flowchart of FastICA algorithm
3.1 Cells in cellular communication system
3.2 Model of Spread Spectrum Communications
3.3 Generation of a DSSS signal
3.4 Convolution of spectra V(f) and C(f)
3.5 Demodulation of DS spread spectrum signal
4.1 Generation of Gold sequences of length 31
4.2 Example of 33 Gold sequences of length 31
5.1 Model of DSSS for Monte Carlo Simulation
5.2 Results of DSSS performance with Monte Carlo Simulation
5.3 Model of ICA Detector with Monte Carlo Simulation
5.4 Flowchart of Simulation Data Set Generation with Monte Carlo Simulation
5.5 Flowchart of ICA Detector with Monte Carlo Simulation
5.6 Results of ICA Detector with Monte Carlo Simulation
5.7 Results of ICA Detector with Monte Carlo Simulation
5.8 Results of ICA Detector with Monte Carlo Simulation, M = 1000, 2000, 5000 and 10000
5.9 Model of ICA-SUD Detector for Monte Carlo Simulation
5.10 Result of ICA-SUD Detector for Monte Carlo Simulation

List of Tables

5.1 Results of Conventional Detector with Monte Carlo Simulation
5.2 Results of ICA Detector with Monte Carlo Simulation, M = 1000, 2000, 5000 and 10000
5.3 Results of ICA Detector with Monte Carlo Simulation with Zero, Random and Gold Code Initial Values
5.4 Results of ICA-SUD Detector with Monte Carlo Simulation

1. INTRODUCTION

1.1 Motivation of Research

Code Division Multiple Access (CDMA) efficiently provides high-quality voice services and high-speed packet data access, and it has been selected as a promising solution for 3rd Generation (3G) wireless communications. Direct-Sequence Code Division Multiple Access (DS-CDMA) is one of the required radio interfaces of these solutions. In a DS-CDMA system, users share the same band of frequencies and the same time slots, but they are separated by unique spreading codes. The main sources of errors at the receiver are the Multiple Access Interference (MAI), the Inter-Symbol Interference (ISI), the asynchronism of users and the near-far problem. The last two problems only occur in the uplink (from the mobile stations to the base stations), because in the downlink (from the base stations to the mobile stations) the information-bearing signals from the base stations are transmitted in a synchronous way and with the same power to all users.
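The separation by spreading codes described above can be illustrated in a few lines. The following Python/NumPy sketch (the thesis's own simulations are in MATLAB) spreads two users' BPSK symbols with short orthogonal codes, sums the chips as in a synchronous downlink, and despreads each user by correlating with its own code. The length-4 codes and symbol values are illustrative assumptions, not the Gold codes used later in the thesis.

```python
import numpy as np

# Illustrative orthogonal spreading codes (length-4 Walsh codes); practical
# DS-CDMA systems use longer codes such as the Gold codes discussed later.
c1 = np.array([+1, +1, +1, +1])
c2 = np.array([+1, -1, +1, -1])

b1, b2 = +1, -1            # BPSK symbols of user 1 and user 2
r = b1 * c1 + b2 * c2      # synchronous downlink: chip-wise sum of both users

# Matched-filter (SUD) despreading: correlate with each code and take the sign
b1_hat = np.sign(r @ c1 / len(c1))
b2_hat = np.sign(r @ c2 / len(c2))
print(b1_hat, b2_hat)      # recovers +1 and -1
```

With perfectly orthogonal, synchronous codes the cross-correlation term vanishes and each symbol is recovered exactly; with real codes, asynchronism, or multipath, the residual cross-correlation is precisely the MAI discussed above.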
The traditional way to estimate symbols, Single User Detection (SUD), considers interference as additional noise and thus ignores the structure of MAI. This leads to matched filtering. The optimum Multiuser Detection (MUD) strategy, the maximum likelihood method, leads to a joint estimation of each user's symbols, and it thus increases computational complexity. Dr. S. Verdú took into account the inherent structure of the interference in his classic work [16], and thus a significant computational gain was introduced. The maximum likelihood sequence detector consists of a bank of matched filters followed by a Viterbi algorithm. Dr. Verdú's invention meant that the interference of multiple access and the near-far effect can be mitigated, although not by a conventional SUD receiver. Although MUD receivers have been an attractive field of research, we will not use optimal MUD as a reference because:

- it is still quite computationally complex, since there is less processing power in the mobile station than in the base station;
- it is not applicable in downlink environments, since the codes of the interfering users should be known, and each mobile station knows only its own code while the codes of the others are unknown.

These features of downlink processing call for new, efficient and simple solutions. Independent Component Analysis (ICA), one of the Blind Source Separation (BSS) techniques, provides a promising new approach to the downlink signal processing of DS-CDMA systems using short spreading codes. In blind MUD, no knowledge of interference parameters is needed, but the structure of MAI is still exploited in the demodulation of a particular user. To exploit the structure of MAI, the most intuitive way is to separate each user from the mixture of DS-CDMA signals, or to eliminate all the interference when demodulating a particular user's symbol in the presence of multiple access interference. This is also the idea behind optimal MUD, i.e.,
the multiple access interference is not considered as additional noise but as information-bearing data streams. The only interference left is background noise.

ICA is a recently developed, useful extension of standard Principal Component Analysis (PCA). It has been proposed and implemented to take advantage of Blind Source Separation (BSS) algorithms. However, the performance and robustness of the algorithm are not yet satisfactory.

1.2 Research Objectives

The objective of this research is to build an ICA-based blind multiuser detector for symbol estimation in the downlink environment of a DS-CDMA system. At first, an ICA-based blind multiuser detector is considered. We then propose an ICA-SUD joint detector, in which the ICA detector is considered as an additional element attached to the conventional SUD receiver. The results of the numerical experiments are compared to the conventional SUD receiver.

1.3 Outline of the Thesis

This thesis is organized into 6 chapters.

Chapter 1 provides background and motivation for the thesis. The objectives of this research are also stated.

Background information is provided in Chapters 2 and 3 to give the reader knowledge necessary to better understand and appreciate the ICA-based DS-CDMA downlink detector. Chapter 2 describes the characteristics of the ICA algorithms used in this work. Two typical ICA methods, InfoMAX and FastICA, have been implemented with MATLAB. The results of the numerical experiments have been analyzed as well. It shows that FastICA is a better solution for an ICA-based detector. Chapter 3 provides an overview of DS-CDMA systems. The signal model of DS-CDMA is discussed as well.

Chapter 4 reviews the related literature, in which the signal model of the ICA-based multiuser detector is discussed. Also, the short spreading code, the Gold code, is discussed.

Chapter 5 describes the detailed numerical implementation of the ICA-based blind multiuser detector in the downlink of a DS-CDMA system.
At first, the simulation model and the MATLAB implementations of the ICA-based blind multiuser detector are discussed. Next, some factors are discussed to improve the convergence properties of the ICA detector. Finally, the ICA-SUD joint detector is simulated.

Chapter 6 summarizes the research performed and the results. Future work which can be done to improve on aspects of this work is also discussed.

2. INDEPENDENT COMPONENT ANALYSIS (ICA)

2.1 Preliminaries

In the past two decades, a particular method for finding underlying factors or components from multivariate (multidimensional) statistical data, called Independent Component Analysis (ICA), has attracted a lot of interest in both the statistical signal processing and neural network communities. Several good algorithms utilizing higher-order statistics or suitable nonlinearities, either directly or indirectly, are now available for solving the basic linear Blind Source Separation (BSS, also known as Blind Signal Separation) problem. Nonlinearities cause the statistical distribution of variables to be non-Gaussian.

Imagine that four people are speaking simultaneously in a room and four microphones positioned at different locations give you four recorded time signals, which are denoted by x1(t), x2(t), x3(t) and x4(t). Each of these recorded signals is a weighted sum of the source signals emitted by the four speakers, which are denoted by s1(t), s2(t), s3(t) and s4(t). We can express this system of linear equations as:

x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t) + a14 s4(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t) + a24 s4(t)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t) + a34 s4(t)
x4(t) = a41 s1(t) + a42 s2(t) + a43 s3(t) + a44 s4(t)    (2.1)

where aij, i, j ∈ {1, 2, 3, 4}, are some parameters that depend on the distances from the microphones to the speakers. It would be interesting if we could now estimate the sources s1(t), s2(t), s3(t) and s4(t) using only the recordings x1(t), x2(t), x3(t) and x4(t).
This problem is called the cocktail party problem because one tries to eavesdrop on a particular person's talking from the noisy mixture of all conversations.

[Figure 2.1: A set of four arbitrary source signals]

As the name implies, ICA can be used to estimate the parameters aij, i, j ∈ {1, 2, 3, 4}. This allows us to separate the source signals from the mixed recordings. As an illustration, the four source waveforms are shown in Figure 2.1. Let us pretend for a moment that these waveforms represent sound signals from four speakers. Four recordings from four different positions could look something like the four mixtures shown in Figure 2.2. The problem is now to recover the signals shown in Figure 2.1 using only the data shown in Figure 2.2. If the parameters aij, i, j ∈ {1, 2, 3, 4}, were known, the source signals could be easily extracted by linear algebraic methods. Unfortunately, since the parameters aij are unknown, the problem seems impossible. However, it turns out that the source signals can be separated from the mixtures under the realistic assumption that the four source signals s1(t), s2(t), s3(t) and s4(t) are statistically independent.

[Figure 2.2: Four mixtures of the source signals in Figure 2.1]

Figure 2.3 gives an illustration of the four signals estimated by the ICA method. It can be seen that the signals are very close to the original source signals. Some of the estimates are scaled by some positive or negative factor, but the waveforms were restored. This cocktail party problem can be extended to n sources and m mixtures, where m ≥ n, and solved by the ICA method as illustrated above.
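The mixing stage of this toy problem is easy to reproduce. Below is a Python/NumPy sketch (the thesis itself uses MATLAB) that builds four statistically independent sources and mixes them with an arbitrary, randomly chosen matrix; the waveforms and the mixing matrix are illustrative assumptions, not the exact signals of Figure 2.1.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)

# Four arbitrary, statistically independent source signals (cf. Figure 2.1)
s = np.vstack([
    np.sin(2 * np.pi * 5 * t),             # sinusoid
    np.sign(np.sin(2 * np.pi * 3 * t)),    # square wave
    ((4 * t) % 1) * 2 - 1,                 # sawtooth
    rng.uniform(-1, 1, t.size),            # uniform noise
])

A = rng.uniform(-1, 1, (4, 4))   # unknown mixing matrix aij (illustrative)
x = A @ s                        # the four "microphone" recordings, as in (2.1)
print(x.shape)                   # (4, 500)
```

An ICA algorithm is then handed only the rows of x and asked to recover the rows of s, up to the scaling and ordering ambiguities noted above.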
This system can be written in matrix form as:

x = As    (2.2)

The problem is to find the estimate of the source signals, ŝ = Wx, with the observed mixed signals x only. An illustration of the ICA model is shown in Figure 2.4.

[Figure 2.3: The ICA estimates of the original source signals]

ICA has many interesting applications that are similar to the cocktail party problem. A brief literature review is given in Section 2.2, and a brief theoretical framework for ICA is given in the next sections.

2.2 History of ICA

Source separation is an old problem, and many algorithms exist depending on the nature of the mixed signals. The problem of Blind Source Separation (BSS) is usually difficult without knowing the structure of the mixed signals. The technique of ICA was introduced in the early 1980s by J. Hérault, C. Jutten and B. Ans [7] [8] [9]. At a meeting held in 1986 on Neural Networks for Computing, Jeanny Hérault and Christian Jutten contributed a research paper entitled "Space or time adaptive signal processing by neural network models" [11]. This paper announced the birth of ICA, although the name ICA was not mentioned.

[Figure 2.4: An illustration of ICA model]

The only assumption made by Hérault and Jutten was statistical independence, but additional constraints are needed on the probability distribution of the sources. Subsequent research has shown that the best performance was obtained by the Hérault-Jutten network when the source signals were sub-Gaussian, that is, for signals whose kurtosis¹ was less than that of a Gaussian distribution.

The general framework for ICA introduced by Hérault and Jutten is most clearly stated by P. Comon [14]. There are many other algorithms available in the literature. J. F.
Cardoso [10] used algebraic methods, especially higher-order cumulant tensors, which eventually led to the JADE algorithm. ICA attained wider attention and growing interest after A. J. Bell and T. J. Sejnowski [5] published their approach based on the Information Maximization (InfoMAX) principle in 1995. This algorithm was further refined by S. Amari et al. [15] using the natural gradient. In 1997, A. Hyvärinen and E. Oja [4] presented the fixed-point or FastICA algorithm, which has contributed to the application of ICA to large-scale problems due to its computational efficiency.

¹ In probability theory and statistics, kurtosis is a measure of the "peakedness" of the probability distribution of a real-valued random variable. The mathematical definition will be given in Section 2.4.

Since the mid-1990s, there has been a growing wave of papers, workshops, and special sessions devoted to ICA. The first international conference on ICA was held in Aussois, France, in January 1999; the second followed in June 2000 in Helsinki, Finland; the third in December 2001 in San Diego, California, USA; the fourth in December 2002 at the Whistler ski resort, Canada; the fifth in September 2004 in Granada, Spain; and the sixth in March 2006 in Charleston, South Carolina, USA. Each gathered more than 100 researchers working on ICA and BSS, and contributed to the transformation of ICA into an established and mature field of research.

2.3 Principal Component vs. Independent Component

In this work, the problem of representing discrete-valued multidimensional variables is considered.
Let x denote an observed m-dimensional random variable; the problem is then to find a matrix W so that the n-dimensional transform ŝ = (ŝ1, ŝ2, ..., ŝn)ᵀ defined by

ŝ = Wx    (2.3)

can be estimated from the observed variable x, where x is a linear mixture of the source signal vector s as shown in Equation 2.2, and W is an n × m matrix to be determined. In most cases, the linear transform of the observed variable x is considered.

The most popular methods for finding a linear transform as in Equation 2.3 are second-order methods, e.g. Principal Component Analysis, factor analysis and many more. Such methods use only the information contained in the covariance matrix of the data vector x, since its distribution is completely determined by this second-order information if the variable x has a normal, or Gaussian, distribution.

Principal Component Analysis (PCA) is a common statistical method for analyzing variations within a set of data, and is known in several fields applying statistical techniques, e.g. image compression and pattern recognition. PCA is also known as the Karhunen-Loève transform (or KLT, named after Kari Karhunen and Michel Loève) or the Hotelling transform (in honor of Harold Hotelling) [2]. The aim of PCA is to find a set of n orthogonal vectors in data space that account for as much of the variance in the data as possible. The data is projected from its original m-dimensional space onto the n-dimensional subspace spanned by the so-called principal components. In this way, the first principal component is taken to be along the direction with maximum variance. The second principal component is constrained to lie in the subspace perpendicular to the first. In general, PCA is a decorrelation-based method that finds a linear transform W that satisfies Equation 2.3 so that the following three criteria are met [2]:
1. The output vectors ŝ are uncorrelated.
2. The basis vectors of W are orthogonal to each other.
3. The basis vectors of W are ordered according to their eigenvalues.

Compared to PCA, ICA provides a solution to the problem of representing multi-dimensional data by a set of their independent components. The principal components of multi-dimensional Gaussian data can be found by PCA as shown in Figure 2.5 (left) for a 2-dimensional case. Figure 2.5 (right) illustrates the case of a non-Gaussian distribution which gives the same PCA components (p1, p2), but yields the basis vectors (q1, q2) of data independence, provided that the statistical distributions of the independent components are not Gaussian. (One of the basic goals of PCA is to reduce the dimension of the data; in this case one chooses n < m.)

[Figure 2.5: Two principal components (p1, p2) of two-dimensional Gaussian data by PCA (left), and two independent components (q1, q2) of two-dimensional non-Gaussian data by ICA superimposed by PCA vectors (p1, p2) (right)]

The reason for comparing PCA with ICA is their close relationship. Even though PCA works quite well in some cases, the situation is quite different when it comes to the BSS problem. In fact, by using the well-known decorrelation methods, any linear mixture of the independent components can be transformed into uncorrelated components, in which case the mixing is orthogonal. Thus, the trick in ICA is to estimate the orthogonal transformation that is left after decorrelation. This is something that classic methods such as PCA cannot estimate, because they are based on essentially the same covariance information as decorrelation. Typical algorithms for ICA use PCA or singular value decomposition as preprocessing steps in order to simplify and reduce the complexity of the actual iterative algorithm. Preprocessing ensures that all dimensions are treated equally a priori before the algorithm is run.

2.4 Independence and Nongaussianity

Consider two scalar-valued random variables s1 and s2: if information on the value of s1 does not give any information on the value of s2 and vice versa,
Preprocessing ensures that all dimensions are treated equally a priori before the algorithm is run, 2.4 Independence and Nongaussianity Consider ovo scalar-valued random yatiables 6) sud s2, if information on (he value of s; does not give any information on the value of sy and vice versa, u 1 and 82 are said to be independent, In mathematical terms, independence is defined by probability density function (pdf). Random variables s1, 62,-.-, are (mutually) independent, if the joint pdf can be factorized (6) as: P(S1,--+5n) = Pa(s1)P2(s2) ---Pa(n) (2.4) where p(si,-..,5n) denotes the joint pdf of s1,52,.-.5 and pi(s,) denotes the marginal pdf of s;. Note that independence must be distinguished from ‘uncorrelation, which is defined by [6] E{(si - Bfsi})(s— Efsj)} =0, for iA 7 (25) or equivalently Efsis;} — E{s,JE(s;} = 0, for 4 j (26) where E{e} denote expected value. Independence is, in general, a much stronger requirement than uncorrelation since independence implies uncor- relation, but uncorrelation does not imply independence, Indeed, if random variables s1,2,-.-,, are said to be independent, it must satisfy [6] E{91(si)9a(s5)} — E{si(si)}E{ge(s))} = 0, for i #7 (2.7) {for any nonlinear transformation functions g,(si) and go(sj) (in the sense that their covariance is zero). It is obvious that Equation 2.6 or Equation 2.5 is a special case where both g:(s;) and go(s,) are linear only. Equation 2.7 can be proved as follows [6] Bla(sdore)} = f f ov(sials,)p(6us;)dsi80 =f [ alsdr(sidor(s)ry(ss)asise = falsdetedsr [ an(s;,(s;)ds2 = Bfa(s)}Etals,)} (28) Independence implies “xonlincar™ uncerrelation, thus it gives a basic ICA estimation principle: By finding a matrix W for any i # J, the components 12 sj and s, are uncorrelated, as well as the transformed components gx(s;) and. (sj) ate uncorrelated. 
However, an important special ease where indepen- dence and uncorrelation are equivalent, that is, s1,2,--,5, have Gaussian distribution, Hence, the fundamental restriction of ICA is that independent components must be nongaussian for ICA to be possible® Another basic ICA estimation principle is thus given: By finding the local ‘maxima of non-Gaussianity of a linear combination x; = jis; under the constraint that the variance of x is constant, each local maximum gives one independent component. ‘This is because if x were a real mixture of two or mote components s,, it would be closer to a Gaussian distribution than its source components s;, due to the central limit theorem. Simply speaking, the key to estimate the ICA model is non-Gaussianity. Non-Gaussianity, motivated by the central limit theorem, is one method for ‘measuring the independence of the components. The more details will be given in following sections. 2.5 Objective (Contrast) Functions for ICA For assuring the identifiability of the ICA model, in this work, the following. fundamental restrictions are imposed [2] 1. All the independent components si, with the possible exception of one component, must be non-Gaussian, 2. The number of observed linear mixtures m must be at least as large as ‘the number of independent components n, i.e., m > n. 3, The mixing matrix A must be of full column rank so that § = Wx based. on x = As) Usually, it is also assumed that x and s are centered at zero, which is in practice no restriction, as this ean always be accomplished by subtracting the Actually, if only one of the indepenclent components is Gaussinn, the ICA model eam still bo estimated 1B ‘mean from the random veetor x. From proceeding sections, it is known that the basic principle of ICA es- timation is to find a set of estimated source signals $ by maximizing non- Gaussianity. 
Thus the estimation of the ICA data model is usually performed by formulating such an objective function and then minimizing or maximizing it. Often such a function is called an objective or contrast function (the terms loss function or cost function are also used). One might express this in the following 'equation':

ICA method = Objective function + Optimization algorithm    (2.9)

Non-Gaussianity can be measured, for instance, by kurtosis or by approximations of negentropy. Mutual information is another popular criterion for measuring the statistical independence of signals. The optimization algorithm is the subject of optimization theory. Most multivariate function optimization methods are based on the gradients of the objective functions, which are considered in this work. Different optimization methods can be used to optimize a single objective function, and a single optimization method may be used to optimize different objective functions.

2.5.1 Mutual Information

One objective function for ICA estimation, inspired by information theory, is the minimization of mutual information.

The most basic concept of information theory, the entropy H(y) (often called differential entropy), is defined for a continuous-valued random variable y with pdf p(y) as [5]

H(y) = −E{ln p(y)} = −∫ p(y) ln p(y) dy    (2.10)

where E{·} denotes the expected value. The mutual information I(y, x) between the random variables y and x gives the relative entropy [5]

I(y, x) = H(y) − H(y|x)    (2.11)

where y is considered as the output of a neural network processor and contains information about its input x. H(y) is the entropy of the output, while H(y|x) is whatever entropy the output has which did not come from the input. Here the gradient of information-theoretic quantities with respect to some parameter, say w, is considered. The above equation can be differentiated as follows with respect to a parameter w involved in the mapping from x to y [5]:

∂I(y, x)/∂w = ∂H(y)/∂w    (2.12)

because H(y|x) does not depend on w.
No matter what the level of H(y|x), maximization of the mutual information I(y, x) is equivalent to maximization of the output entropy H(y). This principle leads to an ICA algorithm, InfoMAX, which will be discussed in the following sections.

2.5.2 Kurtosis and Negentropy

Another classical measure of non-Gaussianity is kurtosis, or the 4th-order cumulant. The kurtosis of a random variable y is classically defined by [2]

kurt(y) = E{y⁴} − 3(E{y²})² = E{y⁴} − 3    (2.13)

since y is assumed to have unit variance³, i.e., E{y²} = 1. For a Gaussian y, the 4th moment E{y⁴} equals 3(E{y²})². Thus kurtosis is zero for a Gaussian random variable. For most non-Gaussian random variables, kurtosis is nonzero. Kurtosis can be both positive and negative, corresponding to super-Gaussian and sub-Gaussian distributions respectively. Super-Gaussian random variables typically have a "spiky" pdf with heavy tails, e.g. the Laplacian distribution, and sub-Gaussian random variables have a "flat" pdf, e.g. the uniform distribution. Their pdfs are illustrated in Figure 2.6.

Typically, non-Gaussianity is measured by the absolute value of kurtosis. Computationally, kurtosis can be simply estimated by using the 4th moment of the sample data. Theoretically, kurtosis has the following linearity property.

³ To simplify, y is assumed to have zero mean and unit variance. Actually, one of the functions of preprocessing in ICA algorithms is to make this simplification possible.
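As a concrete illustration of the sample-based estimate just mentioned, the following Python/NumPy sketch estimates kurtosis from the 4th moment of standardized data; the sample size and the particular distributions are chosen for illustration only.

```python
import numpy as np

def kurtosis(y):
    """Sample kurtosis E{y^4} - 3 of a signal, after standardization."""
    y = (y - y.mean()) / y.std()   # center to zero mean, scale to unit variance
    return np.mean(y ** 4) - 3.0

rng = np.random.default_rng(1)
n = 200_000
print(kurtosis(rng.normal(size=n)))      # Gaussian: close to 0
print(kurtosis(rng.laplace(size=n)))     # Laplacian (super-Gaussian): close to +3
print(kurtosis(rng.uniform(-1, 1, n)))   # uniform (sub-Gaussian): close to -1.2
```

The signs match the classification above: positive for the spiky Laplacian, negative for the flat uniform, and (up to sampling error) zero for the Gaussian.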
kurtosis can be very sensitive to outliers. Its value may depend on only a few observations in the tails of the distribution, this means that kurtosis is not a robust measure of non-Gaussianity. A basic result from information theory is that a variable with a Gaussian distribution has the largest entropy among all other random variables of equal variance. This means that differential entropy (calculated based on the de- finition of entropy given by Equation 2.10 on Page 14) could be used as a measure of non-Gaussianity. The measure should give the difference between the entropy of a Gaussian random variable Yguse and @ non-Gaussian random variable y. This measure is called negentropy, denoted by .J, and is defined as Jig) = Hy, /) 2.10) ‘under the requirement that Yyjauss and y are of the same covariance matrix. The 16 problem in using negentropy is very difficult to calculate. Therefore, simpler approximations of negentropy are very useful ‘The classical method of approximating negentropy is using higher-order cumulants, for example, as follows [4]: ete ae Jy) = BWP + Fkurt(y) (az) Unfortunately, the reservations made with respect to kurtosis are also valid here, since the cumulant-based approximations of negentropy are inaccurate, and in many cases too sensitive to outliers. New approximations of negenrtopy were therefore introduced. In the simplest case, new approximation is of the form [4] (y) © elE{a(y)} ~ E{o(v)}]* (2.18) where g is practically any non-quadratic fumetion, ¢ is an irrelevant constant, and v is a Gaussian variable of zero mean and unit variance (i.e. standardized) For the practical choice of g, these approximations were shown to be more robust than the cumulant-based ones. 2.6 Algorithms for ICA Alter choosing one of the objective (contrast) functions for ICA discussed in preceding section, one needs a practical method for its implementation. 
Usually, this means that an optimization method needs to be chosen to optimize the objective function. In this section, the optimization problem will be discussed.

2.6.1 Preprocessing of the Data

Typical algorithms for ICA use centering, whitening and dimensionality reduction as preprocessing steps in order to simplify and reduce the complexity of the problem for the actual iterative algorithm. The most basic and necessary preprocessing is to center the observed signal vector x, i.e. subtract its mean vector E{x} so as to make x a zero-mean variable. This implies that s is zero-mean as well, as can be seen by taking expectations on both sides of Equation 2.3.

Another useful preprocessing step in ICA, often called whitening, is to transform the observed vector x linearly so that the components of the new vector x~ are uncorrelated and their variances equal unity, i.e. E{x~ x~^T} = I. The whitening transformation is always possible and can be performed as

x~ = E D^(-1/2) E^T x    (2.19)

where E is the orthogonal matrix of eigenvectors of E{x x^T} and D is the diagonal matrix of its eigenvalues, D = diag(d1, ..., dn). Whitening and dimension reduction can be achieved with principal component analysis or singular value decomposition.

Next, two typical ICA algorithms, InfoMAX and FastICA, will be introduced.

2.6.2 InfoMAX

Bell and Sejnowski [5] have shown that maximizing the joint entropy H(y) of the output of a neural network processor can approximately minimize the mutual information among the output components y = g(u), where g(u) is an invertible monotonic nonlinearity, e.g. the logistic function g(u) = 1/(1 + e^(-u)) with u = wx + w0. When a single input x is passed through a transform function g(x) to give an output variable y, both I(y, x) and H(y) are maximized when the high-density part of the pdf of x is aligned with the highly sloping parts of the function g(x). When g(x) is monotonically increasing or decreasing (i.e.
has a unique inverse), the pdf of the output, p_y(y), can be written as a function of the pdf of the input, p_x(x) [5]:

p_y(y) = p_x(x) / |∂y/∂x|    (2.20)

Substituting Equation 2.20 into Equation 2.10 gives [5]

H(y) = E{ln |∂y/∂x|} - E{ln p_x(x)}    (2.21)

where the entropy is measured with the natural logarithm.(1) The second term on the right (the entropy of x) may be considered to be unaffected by changes in w. Therefore, in order to maximize the entropy of y by changing w, it is only necessary to maximize the first term, which is the average log of how the input affects the output. This can be done by considering the 'training set' of x's to approximate the density p_x(x), and deriving an 'online', stochastic gradient ascent learning rule [5]:

Δw ∝ ∂H/∂w = ∂/∂w (ln |∂y/∂x|) = (∂y/∂x)^(-1) ∂/∂w (∂y/∂x)    (2.22)

In the case of the logistic transfer function [5]

y = 1/(1 + e^(-u)),  u = wx + w0    (2.23)

in which the input x is multiplied by a weight w and added to a bias-weight w0, we have

∂y/∂x = w y(1 - y)    (2.24)

∂/∂w (∂y/∂x) = y(1 - y)(1 + wx(1 - 2y))    (2.25)

The learning rule for the logistic function for one input and one output is then given by [5]

Δw ∝ 1/w + x(1 - 2y)    (2.26)

as well as the rule for the bias weight

Δw0 ∝ 1 - 2y    (2.27)

For an input vector x, a weight matrix W, a bias vector w0 and a monotonically transformed output vector y = g(Wx + w0), the resulting learning rules are familiar in form [5]:

ΔW ∝ [W^T]^(-1) + (1 - 2y)x^T
Δw0 ∝ 1 - 2y    (2.28)

(1) In this work, the aim is the application of the algorithms; the details of the derivations will not be given.

2.6.3 FastICA

A practical ICA algorithm, the FastICA algorithm, is a fixed-point iteration scheme for finding a maximum of the non-Gaussianity, or negentropy, of w^T x, proposed by A. Hyvärinen and E. Oja [3] [4]. Its objective function is shown in Equation 2.18 on Page 17. Since the objective function is a measure of non-Gaussianity, many practical functions could be used. To begin with, Equation 2.18 on Page 17 can be modified as follows:

J_G(y) = |E{G(y)} - E{G(v)}|^p
(2.29)

Note that the notation J_G should not be confused with the notation for negentropy, J, and that the exponent p is typically 1 or 2. Clearly, J_G can be considered a generalization of kurtosis: for G(y) = y^4, J_G becomes simply the modulus of the kurtosis of y. Note that G must not be quadratic, because then J_G would be trivially zero for all distributions. Thus, it seems plausible that J_G could be a contrast function in the same way as kurtosis. In fact, for p = 2, J_G coincides with the approximation of negentropy given in Equation 2.18 on Page 17.

In [6], the finite-sample statistical properties of the estimators based on optimizing such a general contrast function were analyzed. It was found that for a suitable choice of G, the statistical properties of the estimator (asymptotic variance and robustness) are considerably better than the properties of cumulant-based estimators. The following choices of G were proposed [4]:

G1(u) = (1/a1) log cosh(a1 u)    (2.30)

G2(u) = -(1/a2) exp(-a2 u^2 / 2)    (2.31)

where a1, a2 ≥ 1 are suitable constants. Experimentally, it was found that especially the values 1 ≤ a1 ≤ 2, a2 = 1 give good approximations. These approximations of negentropy give a very good compromise between the properties of the two classical non-Gaussianity measures given by kurtosis and negentropy.

For one computational unit, or neuron, the basic FastICA scheme is then as follows [4]:

1. Choose an initial (e.g. random) weight vector w.

2. Let w+ = E{x g(w^T x)} - E{g'(w^T x)} w, where the non-quadratic function g is the derivative of G, e.g. g(u) = tanh(a1 u).

3. Let w = w+ / ||w+||.

4. If not converged, go back to step 2.

Note that convergence means that the old and new values of w point in the same direction, i.e. their dot-product is (almost) equal to 1. The algorithm estimates just one of the independent components at a time. To estimate several independent components, FastICA needs to be run several times.(1)

(1) The FastICA MATLAB package is available at www.cis.hut.fi/projects/ica/fastica/.
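The four steps above can be sketched in NumPy as follows. This is an illustrative re-implementation, not the thesis's MATLAB code; the mixing matrix, sample sizes, tolerance and seeds are arbitrary assumptions. The data must be centered and whitened first, as described in Section 2.6.1, which the demo does inline.

```python
import numpy as np

def fastica_one_unit(z, g, gp, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA on centered, whitened data z (rows = signals).
    Fixed point (step 2): w+ = E{z g(w^T z)} - E{g'(w^T z)} w."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)                                    # step 1
    for _ in range(max_iter):
        wz = w @ z
        w_new = (z * g(wz)).mean(axis=1) - gp(wz).mean() * w  # step 2
        w_new /= np.linalg.norm(w_new)                        # step 3
        if abs(abs(w_new @ w) - 1.0) < tol:                   # step 4: directions align
            return w_new
        w = w_new
    return w

# g = tanh is the derivative of G1 = log cosh (a1 = 1); g' = 1 - tanh^2
g, gp = np.tanh, lambda u: 1.0 - np.tanh(u)**2

# Demo: mix a super-Gaussian and a sub-Gaussian source, whiten, extract one IC
rng = np.random.default_rng(5)
s = np.vstack([rng.laplace(size=50_000), rng.uniform(-1, 1, 50_000)])
x = np.array([[1.0, 0.5], [0.3, 1.0]]) @ s
x = x - x.mean(axis=1, keepdims=True)                         # centering
d, E = np.linalg.eigh((x @ x.T) / x.shape[1])
z = E @ np.diag(d**-0.5) @ E.T @ x                            # whitening, Eq. 2.19
w = fastica_one_unit(z, g, gp)
y = w @ z
corr = [abs(np.corrcoef(y, si)[0, 1]) for si in s]
print(round(max(corr), 2))
```

The extracted component y correlates almost perfectly with one of the two sources, up to the sign and scale ambiguities inherent to ICA.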
2.7 Simulation of ICA

ICA can be applied to a variety of problems involving the separation of mixed signals and multivariate analysis, and to almost every case where PCA is used. In this section, the two ICA methods, InfoMAX (discussed in Section 2.6.2) and FastICA (discussed in Section 2.6.3), are simulated. Each algorithm was applied to two BSS applications. The first application is a Cocktail Party Problem which consists of 4 speakers and the same number of microphones. The second application is a demonstration of the example shown in Figure 2.1 on Page 5 and Figure 2.2 on Page 6.

2.7.1 InfoMAX

The principle of the algorithm is discussed in Section 2.6.2. The algorithm has been simulated with MATLAB, and its flowchart is shown in Figure 2.7.

Figure 2.7 Flowchart of InfoMAX algorithm (permute the mixed signal X; subtract the means from X; get the decorrelating covariance matrix C; decorrelate X; initialize W, L, B; update W until all data have been processed; print results)

At first, the Cocktail Party Problem which consists of 4 speakers is applied. The experiments presented here were obtained using speech segments recorded from various speakers. All signals were sampled at 8 kHz with 50000 sample points. The method of training was stochastic gradient ascent, but the learning rule is multiplied by W^T W, where W is the unmixing matrix estimated by the InfoMAX algorithm, as proposed by Amari, Cichocki and Yang [15]. This 'natural gradient' method speeds convergence and avoids the matrix inverse in the learning rule. Weights were usually adjusted based on the summed ΔW's of small 'batches' of length B, where B = 30, and various learning rates(1) were used (0.01 was typical). To ensure that the input ensemble was stationary in time, the time index of the signals was permuted.

(1) The learning rate is the proportionality constant in Equation 2.28.

Figure 2.8 The comparison of source signals and estimates

An example is run with 4 speech sources. The mixtures, x, are formed with a random mixture matrix A. The unmixed solution was reached and is reflected in the permutation structure of the matrix WA.

Figure 2.9 Four mixtures of the source signals in Figure 2.8

As one typical result out of many trials, the matrix WA is shown below:

       |  0.0157    0.0809   [38.7406]  -0.0966 |
WA =   | -0.0510  [70.3998]   0.0523     0.1968 |    (2.32)
       |  0.0149    0.0609    0.0647   [12.5733] |
       | [5.481]    0.0293    0.1328     0.0317 |

As can be seen, only one substantial entry (shown in brackets) exists in each row and column. The interference was attenuated by between 34.5 and 65.3 dB. The original source signals and their estimates are shown in Figure 2.8, and the four observed mixtures are shown in Figure 2.9.

Since both s and A are unknown in the equation x = As, any scalar multiplier in one of the sources could always be canceled by dividing the corresponding column a_i of A by the same scalar; consequently, the variances (magnitudes or energies) of the independent components cannot be determined. Note that the ambiguity of the sign remains as well, since an independent component can be multiplied by -1 without changing the model [2].

Figure 2.10 Flowchart of FastICA Algorithm (remove the mean of X; initialize; PCA for dimension reduction; whiten the data; initialize ICA; calculate independent components until all are done; print results)

Secondly, the mixed sources shown in Figure 2.2 on Page 6 are applied. Unfortunately, the unmixed solution cannot be reached, since the objective function of InfoMAX works for the estimation of most super-Gaussian independent components; for sub-Gaussian independent components, other objective functions must be used.
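The natural-gradient InfoMAX update used in these experiments can be sketched as follows. This is an illustrative NumPy re-implementation rather than the thesis's MATLAB code; the two synthetic Laplacian (super-Gaussian) sources, the mixing matrix, the learning rate and the epoch count are all arbitrary assumptions. Multiplying the gradient rule of Equation 2.28 on the right by W^T W gives ΔW ∝ (I + (1 - 2y)u^T)W with u = Wx, which avoids the matrix inverse.

```python
import numpy as np

rng = np.random.default_rng(7)
n_src, n_samp = 2, 20_000
S = rng.laplace(size=(n_src, n_samp)) / np.sqrt(2)   # super-Gaussian sources, unit variance
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                           # arbitrary mixing matrix
X = A @ S                                            # observed mixtures (zero-mean)

W = np.eye(n_src)
lr = 0.01
for epoch in range(5):
    for i in rng.permutation(n_samp):                # permute the time index, as in the text
        x = X[:, i:i+1]
        u = W @ x
        y = 1.0 / (1.0 + np.exp(-u))                 # logistic nonlinearity
        W += lr * (np.eye(n_src) + (1 - 2*y) @ u.T) @ W   # natural-gradient InfoMAX step

P = W @ A                                            # should approach a scaled permutation
print(np.round(P, 2))
```

After training, each row and column of P = WA has a single dominant entry, mirroring the permutation structure of Equation 2.32.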
Because two sub-Gaussian components existed in the sample data, the attempt to unmix the independent components failed.

2.7.2 FastICA

The two BSS problems that are the same as those in the preceding section have been applied to the FastICA algorithm as well. The MATLAB code is obtained from A. Hyvärinen's FastICA MATLAB package with default parameters, and its flowchart is shown in Figure 2.10.

At first, the speech signal test is applied. The mixtures, x, are formed with a random mixture matrix A. The unmixed solution can be reached. When the same mixture matrix A as in the trial of Equation 2.32 on Page 24 is applied, the corresponding matrix WA is shown below:

       | [4.5418]   0.0356   -0.0362    0.0324 |
WA =   |  0.0291    0.0836    0.0962   [2.013] |    (2.33)
       |  0.0081    0.1576  [12.1829]  -0.0848 |
       |  0.0372   [5.5457]  -0.3557   -0.0648 |

where W is the unmixing matrix. As can be seen, only one substantial entry (shown in brackets) exists in each row and column. The interference was attenuated by between 23.9 and 63.2 dB. Note that the order of the independent components need not be the same as the order in Equation 2.32 on Page 24 because of the ambiguity of ICA. The reason is that, in the equation x = As, both s and A are unknown, so the order of the terms can be freely changed, e.g. the first independent component can be exchanged with any of the others. In a more ambitious attempt, nine sources were successfully separated.

Next, the source example shown in Figure 2.1 on Page 5 and the mixed sources shown in Figure 2.2 on Page 6 are applied, and the unmixed solution shown in Figure 2.3 on Page 7 is reached. This demonstrates that FastICA is more robust and has faster convergence than the InfoMAX method.

2.8 Summary of ICA Algorithm

PCA is a classical tool within multivariate analysis. However, PCA is not well suited to separating statistically independent signals that have been mixed. This is because uncorrelated variables need not be independent.
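The distinction between uncorrelatedness and independence can be illustrated with a standard toy example (not from the thesis): take y = x^2 with x symmetric about zero. The two variables are uncorrelated, yet y is completely determined by x.

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.standard_normal(10**6)
y = x**2                      # fully dependent on x ...

# Uncorrelated: cov(x, y) = E{x^3} = 0 for a symmetric x
print(round(float(np.cov(x, y)[0, 1]), 3))        # ~ 0

# ... but not independent: conditioning on |x| changes the distribution of y
print(round(float(y[np.abs(x) > 2].mean()), 2),   # E{y | |x| > 2} is large
      round(float(y[np.abs(x) < 0.5].mean()), 2)) # E{y | |x| < 0.5} is small
```

PCA, which only decorrelates, cannot tell such a pair apart, while ICA's independence criterion can, which is exactly why ICA is needed for BSS.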
Independent Component Analysis (ICA) is a general-purpose statistical technique in which observed random data are transformed in a way that makes the estimates maximally independent from each other, under the assumption that not more than one of the sources has a Gaussian distribution. ICA can be formulated with different algorithms, and its basic principle is mainly based upon nonlinear decorrelation, achieved by maximizing non-Gaussianity measures such as negentropy, or by minimizing the mutual information among the outputs. In this section, no account has been given of cases where there is known noise in the inputs, which slightly impacts the accuracy of the solution.

The InfoMAX algorithm presented in Section 2.6.2 is limited. Firstly, since only single-layer networks are used, the optimal mappings discovered are constrained to be linear, while some multi-layer system could be more powerful. Secondly, a more diversified set of objective functions could improve the performance on various problems. Despite these concerns, the information maximization approach serves as a guiding principle for further advances.

The FastICA package is a good multi-purpose implementation of ICA. This algorithm is based on the fixed-point method and has the following advantages:

• Fast convergence.

• Contrary to gradient-based algorithms like InfoMAX, there is no learning rate or other adjustable parameters in the algorithm.

• Suitable for both super-Gaussian and sub-Gaussian components, as well as mixtures of them.

In summary, the basic principles of ICA have been discussed. It has been demonstrated that ICA can solve the cocktail party problem and some other basic BSS problems. Two typical methods, InfoMAX and FastICA, have been implemented. In spite of the large amount of research conducted on this basic problem by several authors, this area of research is by no means exhausted.
ICA has found many applications in diversified fields, such as blind separation of electroencephalographic (EEG) and magnetoencephalographic (MEG) signals, economic time series analysis, feature extraction of images, and so forth. Recently, applications in telecommunications have also been published, where ICA is useful for separating a user's own signal from the interfering signals of other users in DS-CDMA. The use of ICA as a post-processing tool for a conventional array receiver shows that it is capable of mitigating continuous-wave jamming in a DS-CDMA system, especially at moderate-to-high signal-to-noise ratio values [17]. It is an area of great potential. To this end, nonlinear transformations, higher numbers of estimated parameters, and signal separation in the presence of noise must be researched further.

3. DIRECT SEQUENCE CODE DIVISION MULTIPLE ACCESS (DS-CDMA)

Design and development of future-generation wireless systems is one of the most active and important areas of research and development in wireless communications. In November of 1999, the International Telecommunication Union (ITU) approved an industry standard for third-generation (3G) wireless networks. This standard, called International Mobile Telecommunications-2000 (IMT-2000), consists of 5 operating modes, including 3 based on CDMA technology. They are most commonly known as CDMA2000, WCDMA (UMTS) and TD-SCDMA. The standard defines three radio interfaces enabling interoperability: IMT-MC (Multi-Carrier), IMT-DS (Direct Spread) and IMT-TC (Time Code).

Code Division Multiple Access (CDMA) was originally developed in World War II for secure military communications. Today, CDMA offers significant advantages over other analog and digital cellular technologies. CDMA is a modulation and multiple-access scheme based on spread-spectrum communication.
In this scheme, multiple users share the same frequency band simultaneously by spreading the spectrum of their transmitted signals, so that each user's signal is pseudo-orthogonal to the signals of the other users.

3.1 Multiple Access

Multiple access refers to the sharing of a communications resource, in other words a common transmission medium such as time or frequency, among several users. The sharing of a communications resource is an essential question, particularly in wireless communication systems, since wireless is a broadcast medium. However, as the number of users in the system grows, the demand to use the common resources as efficiently as possible also arises. Thus, multiple access schemes have been developed for this purpose. The most common multiple access schemes are [12]:

1. Frequency Division Multiple Access (FDMA): Each user is given a frequency slot in which one and only one user is allowed to operate. Interference from other users is thus easily prevented by assigning slots that do not overlap.

2. Time Division Multiple Access (TDMA): A similar idea is realized in the time domain, where each user is given a unique time period. One user can hence transmit and receive data only during its own time slot while the others are silent.

3. Code Division Multiple Access (CDMA): As distinct from both FDMA and TDMA, each user occupies the same frequency band simultaneously. The users are identified by so-called codes, which are unique to each user. The signal of each user is first multiplied by its unique code, and the signals are then mixed together before they are transmitted through a common medium. The unique codes, however, ensure that the signal of each user can be resolved at the receiver.

In its simplest form, the code of CDMA is a sequence of ±1's, also called a chip sequence. This is the so-called direct sequence spread spectrum technique, also known as Direct Sequence Code Division Multiple Access (DS-CDMA).
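The code-division idea in item 3 can be sketched with a toy example (illustrative only; the two short orthogonal ±1 codes below are arbitrary, not the Gold codes used later in the thesis): each user's bits are spread by its own chip sequence, the spread signals add in the channel, and correlating against a user's code recovers that user's bits.

```python
import numpy as np

codes = np.array([[+1, +1, -1, -1],
                  [+1, -1, +1, -1]])          # orthogonal: codes[0] @ codes[1] == 0
bits  = np.array([[+1, -1, -1, +1, +1],
                  [-1, -1, +1, +1, -1]])
C, M = codes.shape[1], bits.shape[1]          # chips per bit, bits per user

# Each user spreads its bits with its own chip sequence; the signals add in the channel
tx = sum(np.repeat(bits[k], C) * np.tile(codes[k], M) for k in range(2))

# Receiver for user k: correlate each C-chip block with that user's code
rx = tx.reshape(M, C)
for k in range(2):
    recovered = np.sign(rx @ codes[k])
    print((recovered == bits[k]).all())       # True for both users
```

Because the codes are orthogonal, the correlation of a block with user k's code cancels the other user's contribution exactly, leaving ±C times user k's bit.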
DS-CDMA uses the direct-sequence approach, one of three approaches to spread spectrum modulation for digital signal transmission [12]:

• Frequency Hopping (FH): The signal is rapidly switched between different frequencies within the hopping bandwidth pseudo-randomly, and the receiver knows beforehand where to find the signal at any given time.

• Time Hopping (TH): The signal is transmitted in short bursts pseudo-randomly, and the receiver knows beforehand when to expect the bursts.

• Direct Sequence (DS): The digital data is directly coded at a much higher bandwidth. The code is generated pseudo-randomly, the receiver knows how to generate the same code, and it correlates the received signal with that code to extract the data.

Figure 3.1 Cells in cellular communication system

Two types of digital modulation, Phase-Shift Keying (PSK) and Frequency-Shift Keying (FSK), are usually used in conjunction with spread spectrum systems. PSK is generally used with Direct Sequence Spread Spectrum (DSSS) systems and FSK is commonly used with FH spread spectrum systems.

Since the available radio frequencies are a limited resource, they should be used as efficiently as possible. This brings up the question of frequency planning. In Figure 3.1, each cell (illustrated as a hexagon) corresponds to a geographical area served by a base station. In a randomly picked cell, A, certain predetermined frequencies are assigned. In all the adjacent cells of cell A, the assigned frequencies must not be the same as those in the frequency slot of cell A. Due to the limited availability of frequencies, it is desirable to reuse the same frequencies in another existing cell, e.g. cell B. In an FDMA or TDMA system, the reuse pattern must be carefully designed to prevent different cells from interfering with each other; in other words, cells A and B must be far enough apart that the signals from cell B are sufficiently attenuated when they are received in cell A, and vice versa. In a CDMA system, no such design problem exists, since the same frequencies are used.
This is the main advantage achieved by CDMA, increasing the system capacity.

3.2 Spread Spectrum

The basic elements of a spread spectrum digital communication system are illustrated in Figure 3.2.

Figure 3.2 Model of Spread Spectrum Communications (information sequence, encoder, modulator, channel, demodulator, decoder, output data, with a PN generator at both transmitter and receiver)

In general, spread spectrum communications is distinguished by three key elements [12]:

1. The signal occupies a bandwidth much greater than that necessary to send the information. This results in many benefits, such as immunity to interference and jamming, and multi-user access.

2. The bandwidth is spread by means of a code which is independent of the data.

3. The receiver synchronizes to the code to recover the data.

The use of independent codes and synchronous reception allows multiple users to access the same frequency band at the same time. In order to protect the signal, the code used is pseudo-random. It appears random, but is actually deterministic, so that the receiver can reconstruct the code for synchronous detection. This pseudo-random code is also called a Pseudo-Noise (PN) code. The code helps the signal resist narrowband jamming or interference and also enables the original data to be recovered if chip bits are damaged during transmission. Time synchronization of the two identical pseudo-random sequence generators in Figure 3.2 is also required. A fixed PN bit pattern is designed and transmitted prior to the transmission of information, so that the receiver will detect it with high probability in the presence of interference.

In practice, since the codes may have periods as long as 2^n - 1 for large n, it is difficult to eavesdrop. It is, however, possible. Hence, spread spectrum methods do not offer security in a strict sense; encryption is still necessary.
3.3 Signal Model of a DSSS System

In the next two sections, the signal model of a DSSS system is briefly introduced and the transmission of a DSSS signal by means of binary PSK is considered [13].

Figure 3.3 Generation of a DSSS signal

3.3.1 Signal Modulation

In Figure 3.3, the information-bearing baseband signal is denoted as v(t) and can be expressed as

v(t) = Σ_{m=-∞}^{∞} b_m g(t - mT_b)    (3.1)

where {b_m = ±1, -∞ < m < ∞} is the information-bearing symbol stream and g(t) is a rectangular pulse of duration T_b. This signal is then multiplied by the PN signal c(t) generated from the PN sequence generator,

c(t) = Σ_{n=-∞}^{∞} c_n p(t - nT_c)    (3.2)

where {c_n = ±1, -∞ < n < ∞} and p(t) is a rectangular pulse of duration T_c, with T_c << T_b. This multiplication operation spreads the bandwidth of v(t) to the much wider bandwidth of the PN signal c(t). In other words, it is equivalent to a convolution operation in the frequency domain, i.e. the narrow-spectrum signal V(f) is spread to the much wider spectrum V(f) * C(f) by the PN signal with spectrum C(f), as illustrated in Figure 3.4.

The product signal v(t)c(t) is modulated by the carrier A_c cos(2π f_c t), and thus the Double Sideband Suppressed Carrier (DSB-SC) signal is

u(t) = A_c v(t) c(t) cos(2π f_c t)    (3.3)

Since v(t)c(t) = ±1 for any t, u(t) may also be expressed as

u(t) = A_c cos(2π f_c t + θ(t))    (3.4)

where θ(t) = 0 when v(t)c(t) = 1 and θ(t) = π when v(t)c(t) = -1. Therefore, u(t) is a BPSK signal whose phase varies at the rate 1/T_c.

Figure 3.4 Convolution of spectra V(f) and C(f)

3.3.2 Signal Demodulation

The signal demodulation is performed as illustrated in Figure 3.5. The received signal r(t) is first multiplied by a synchronized local PN signal c(t) at the receiver. This operation is called spectrum despreading, since the effect of multiplication by c(t) at the receiver is to undo the spreading operation at the transmitter. Thus we have

r(t)c(t) = A_c v(t) c^2(t) cos(2π f_c t)    (3.5)

since c^2(t) = 1 for all t.
The demodulator for the despread signal

r(t)c(t) = A_c v(t) cos(2π f_c t)    (3.6)

is simply the matched filter shown in Figure 3.5.

Figure 3.5 Demodulation of DS spread spectrum signal

3.4 Signal Model of DS-CDMA System

There are two main applications of DSSS systems. First, the spread spectrum signal can be transmitted at very low power, so that a listener would encounter great difficulty in trying to detect the presence of the signal. We discussed the signal model of this system in the last section. A second application is multiple-access radio communications, i.e. the DS-CDMA system.

As shown in Figure 3.3, the bit and chip durations are T_b and T_c, respectively. The processing gain is thus determined by the ratio C = T_b / T_c. For orthogonal synchronous DS-CDMA systems, it gives a theoretical upper limit on the number of users that can be supported.

In a DS-CDMA system, users share the same band of frequencies and the same time slots, but they are separated in code. The main source of errors at the detector is Multi-Access Interference (MAI). The conventional Inter-Symbol Interference (ISI) and channel noise are certainly the other sources of errors.

In Equation 3.1 and Equation 3.2, c(t) is usually called the chip sequence. In a practical DS-CDMA system, it does not run from -∞ to ∞; the code is a finite sequence of C chips repeated every symbol. We consider the signal after demodulation in this work. In Figure 3.5, let us denote the m-th data symbol (information bit) by b_m, and the chip sequence with finite length C by s(t). The number of bits in the observation interval is denoted by M. Then we have

r(t) = Σ_{m=0}^{M} b_m s(t - mT_b)    (3.7)

where r(t) denotes the transmitted CDMA baseband signal of one single user. The signal model for multiple users will be discussed in the next chapter.
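The spreading and despreading operations of Equations 3.1 to 3.6 can be sketched at baseband as follows. This is an illustrative NumPy simulation, not the thesis's MATLAB code; the chip count, noise level and sample sizes are arbitrary assumptions, and the carrier and matched filter are replaced by chip-rate correlation. The measured bit error rate is compared against the standard BPSK-over-AWGN result Q(sqrt(2 E_b / N_0)).

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(13)
C, M = 31, 100_000                       # chips per bit, number of bits
ebn0_db = 4.0
ebn0 = 10 ** (ebn0_db / 10)

bits  = rng.choice([-1.0, 1.0], size=M)  # b_m
chips = rng.choice([-1.0, 1.0], size=C)  # PN chip sequence c_n

tx = np.repeat(bits, C) * np.tile(chips, M)         # spreading: v(t)c(t)
sigma = sqrt(C / (2 * ebn0))                        # per-chip noise std for this Eb/N0
rx = tx + sigma * rng.standard_normal(tx.size)      # AWGN channel

despread = rx * np.tile(chips, M)                   # multiply by synchronized c(t)
decisions = np.sign(despread.reshape(M, C).sum(axis=1))   # integrate over T_b, decide

ber = float(np.mean(decisions != bits))
theory = 0.5 * erfc(sqrt(ebn0))                     # Q(sqrt(2 Eb/N0))
print(f"simulated {ber:.4f}  vs  theory {theory:.4f}")
```

Since c^2(t) = 1, despreading collapses each bit back to C identical chips, and the correlator output is ±C plus Gaussian noise, so the simulated error rate tracks the BPSK curve regardless of the spreading factor.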
3.5 Probability of Error of a DSSS System

Consider a spreading signal with BPSK modulation over an Additive White Gaussian Noise (AWGN) downlink channel. If we consider the signal after spreading as the transmitted bit stream, the probability of error for this system is identical to the probability of error for conventional BPSK, that is,

P_e = Q(sqrt(2 E_b / N_0))    (3.8)

where E_b is the energy per bit and N_0 is the noise power spectral density.

On the basis of the Central Limit Theorem, the MAI from the other users in a DS-CDMA system can be approximately treated as an AWGN signal(1) when the number of users is large, e.g. 30. As mentioned above, the probability of error for the bit stream after spreading for one single user is therefore approximately equal to the probability of error for conventional BPSK, provided that, in defining the Signal-to-Noise Ratio (SNR) for this calculation, we take the power of the single user's signal as the signal power, and the sum of the powers of the other users' signals and the AWGN as the noise power. The simulation is shown in the following chapters.

(1) In practice, the PN codes used in a CDMA system usually are not orthogonal; the MAI will then include this effect.

4. APPLICATION OF ICA TO DS-CDMA DETECTION

In the last chapter, we briefly introduced the signal model of the DS-CDMA system. The system model of ICA-based multiuser detection in the downlink environment of DS-CDMA is discussed in what follows.

4.1 Signal Model of DS-CDMA Downlink

In this section, we represent the DS-CDMA signal model mathematically. In a mobile receiver over an Additive White Gaussian Noise (AWGN) channel, the received (baseband) signal r(t) for K users and M symbols follows easily from Equation 3.7:

r(t) = Σ_{k=1}^{K} Σ_{m=1}^{M} b_{k,m} s_k(t - mT_b) + n(t)    (4.1)

where n(t) denotes additive noise. This signal model is not yet realistic, because it does not consider the effects of multipath propagation and fading. Our desired signal model including these factors therefore has the form

r(t) = Σ_{l=1}^{L} Σ_{k=1}^{K} Σ_{m=1}^{M} a_l b_{k,m} s_k(t - mT_b - d_l T_c) + n(t)    (4.2)
where

l, k, m  =>  path, user and symbol indices, respectively
a_l  =>  path gain(1)
b_{k,m}  =>  symbol
s_k(·)  =>  spreading code (chip sequence)
C  =>  number of chips per symbol
t, T_b, T_c  =>  time, symbol and chip duration, respectively
d_l  =>  propagation delay factor, with d_l ∈ {0, ..., (C - 1)/2}
n(t)  =>  Gaussian noise(2)

(1) In the downlink model, the path gain does not differ among the users, since all the users' signals are sent together; the path gain a_l and the delay factor d_l depend only on the path.
(2) The noise is the sum of all the background noise and will be considered as the last independent component.

The signal is then sampled, and chip-rate sampling is assumed. Here, both code timing and channel estimation are often prerequisite tasks. The discrete data samples are

r[n] = Σ_{l=1}^{L} Σ_{k=1}^{K} Σ_{m=1}^{M} ( a_l b_{k,m-1} s̄_{k,l}[n] + a_l b_{k,m} s_{k,l}[n] ) + n[n]    (4.3)

where n[n] denotes the discrete noise, and the "late" and "early" parts of the code vectors are

s̄_{k,l}[n] = [ s_k(C - d_l + 1)  ...  s_k(C)  0  ...  0 ]^T
s_{k,l}[n] = [ 0  ...  0  s_k(1)  ...  s_k(C - d_l) ]^T    (4.4)

Having been sampled, the samples are collected into a C × M' matrix R built from subsequent discrete data samples r[n]:

R = [ r(mC)          r(mC + C)         ...  r(m'C)
      r(mC + 1)      r(mC + C + 1)     ...  r(m'C + 1)
      ...            ...                    ...
      r(mC + C - 1)  r(mC + 2C - 1)    ...  r(m'C + C - 1) ]    (4.5)

or, in C-vector format,

R = [ r_m, r_{m+1}, ..., r_{m'} ]    (4.6)

where m' > m, M' = m' - m + 1 and m, m' ∈ {0, ..., M}. The length M' is required by the ICA algorithm and is usually much greater than the length of the chip sequence, C. Although in practice the chip codes can be very long, M' still has to be greater than C. If the delay is shorter than a symbol duration, the length M' can be as short as the ICA algorithm permits. The data vector then has the form

r_m = Σ_{l=1}^{L} Σ_{k=1}^{K} [ a_l b_{k,m-1} s̄_{k,l} + a_l b_{k,m} s_{k,l} ] + n_m
    = Σ_{k=1}^{K} [ b_{k,m-1} Σ_{l=1}^{L} a_l s̄_{k,l} + b_{k,m} Σ_{l=1}^{L} a_l s_{k,l} ] + n_m    (4.7)

where n_m denotes the noise vector.
Equation 4.7 can be represented in the more compact form

r_m = G b_m + n_m    (4.8)

where the C × 2KL matrix G contains all the 2KL "early" and "late" code vectors and fading terms,

G = [ s̄_{1,1}, s_{1,1}, ..., s̄_{K,L}, s_{K,L} ]    (4.9)

and the 2KL vector b_m contains the symbols,

b_m = [ a_1 b_{1,m-1}, a_1 b_{1,m}, ..., a_L b_{K,m-1}, a_L b_{K,m} ]^T    (4.10)

The C × M' data matrix R can then be represented in pure matrix form as

R = GB + N    (4.11)

where the matrices G, B and N denote the unknown mixing matrix, the symbols and the noise:

R = [ r_m, ..., r_{m'} ]
B = [ b_m, ..., b_{m'} ]    (4.12)
N = [ n_m, ..., n_{m'} ]

Comparing with the model of linear ICA in Equation 2.2 on Page 6,

x = As    (4.13)

B is the source signal s that needs to be estimated, R is the observed mixed signal x, and G is the unknown mixing matrix A. The noise matrix N in Equation 4.11 can be treated as an independent component to be added into x.

4.2 Gold Code Generation

The generation of PN sequences for spread spectrum system simulation is necessary. By far the most widely known binary PN sequences are the maximum-length shift-register sequences, or m-sequences for short. The periodic autocorrelation function of an m-sequence of period n is

φ(j) = n,    j = 0
φ(j) = -1,   1 ≤ j ≤ n - 1    (4.14)

In view of the ICA algorithm, the AWGN can be treated as one of the independent components. Then Equation 5.2 is changed to

R = GB    (5.4)

where the dimensions of the matrices R, G and B are C × M', C × K and K × M', respectively.

The simulated DS-CDMA downlink data in the presence of AWGN is used to test the algorithm. The short Gold codes shown in Figure 4.1 on Page 41 are used. Thus the maximum number of users is K = 30, as the 31st "user" is the AWGN. The number of symbols is 1000. The process of data set generation is shown in Figure 5.4.

Figure 5.4 Flowchart of Simulation Data Set Generation with Monte Carlo Simulation (load saved Gold codes; set K, M; generate the K × M test symbol matrix over {1, -1}; spread to a K × CM chip matrix; generate AWGN data; save the generated data set)

A MATLAB simulation is implemented for verification of the validity of the system model.
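The m-sequences described in Section 4.2 can be generated with a simple linear feedback shift register, and Gold codes are obtained by combining a preferred pair of m-sequences. The sketch below is illustrative only: the feedback taps correspond to a classic preferred pair of degree-5 polynomials giving length-31 codes, and are an assumption rather than the exact generator polynomials used in the thesis.

```python
import numpy as np

def m_sequence(taps, n):
    """Fibonacci LFSR: one period (2^n - 1 bits) of an m-sequence.
    taps are 1-indexed stage numbers XORed into the feedback."""
    state = [1] * n                      # any nonzero initial state
    out = []
    for _ in range(2**n - 1):
        out.append(state[-1])            # output the last stage
        fb = 0
        for t in taps:
            fb ^= state[t - 1]
        state = [fb] + state[:-1]        # shift in the feedback bit
    return np.array(out)

u = m_sequence([5, 2], 5)                # preferred pair of degree-5 LFSRs
v = m_sequence([5, 4, 3, 2], 5)

# m-sequence property (Eq. 4.14): periodic autocorrelation is 31 at shift 0, else -1
b = 1 - 2 * u                            # map {0,1} -> {+1,-1}
autocorr = [int(b @ np.roll(b, j)) for j in range(31)]
print(autocorr[0], set(autocorr[1:]))    # 31 {-1}

# A Gold family: XOR of u with all 31 cyclic shifts of v, plus u and v themselves
gold = [np.bitwise_xor(u, np.roll(v, k)) for k in range(31)] + [u, v]
print(len(gold), len(gold[0]))           # 33 31
```

The family of 33 length-31 codes is what makes 30-plus simultaneous users possible in the simulations that follow, with each user assigned one code.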
Parameters were as follows: the number of symbols was M = 1000; the number of users was K = 30; the number of paths was L = 1; the Signal-to-Noise Ratio (SNR) with respect to the individual user was varied from -10 dB to 0 dB. The signals of all users are sent at the same power, and the AWGN is treated as an IC. The simulation model is shown in Figure 5.3; the flowchart of the simulation is shown in Figure 5.5, and two typical simulation results are shown in Figure 5.6 and Figure 5.7.

Figure 5.5 Flowchart of ICA Detector with Monte Carlo Simulation (load Gold codes; load the simulation data set; SUD; ICA decoding; match ICs to users; calculate BER for SUD and ICA)

We can easily observe that the ICA detector functionally works. It can semi-blindly estimate the DS-CDMA downlink signal: the decoding itself is blind, but some user information, such as the user's assigned chip code or training sequences, is required to match the decoded bit stream to the user who should receive it. However, in this simulation, more than 10% of the simulation trials do not converge. In other words, the ICA decoder is not working at all in these trials, since no IC can be found during the iteration. In Figure 5.6, we also notice that at SNR = -8 dB and -4 dB the symbol error rates of the ICA decoder are unreasonably high because some ICs cannot be found, as is also the case in Figure 5.7 at -5 dB. Therefore, the basic ICA detector needs to be improved, though it functionally works in most cases.

Figure 5.6 Results of ICA Detector with Monte Carlo Simulation (symbol error rate versus SNR in dB)

5.2.2 Some Factors in the ICA Algorithm

In the last section, the ICA detector was implemented. However, the consistency of the simulation results is not satisfactory. In this section, some factors, such as the number of symbols M (also known as the vector size required by the ICA algorithm) and the initial value of the iteration in the ICA algorithm, are discussed.

Number of Symbols

The number of symbols involved in a single trial of ICA detection is usually M = 1000. This value is used in much of the literature [17] [18] [20] and in the simulation in the last section. We test the ICA detector with M = 500, 1000, 2000, 5000 and 10000
Number of Symbols

The number of symbols involved in a single trial of ICA detection is usually M = 1000. This value is used in much of the literature [17] [18] [20] and in the simulation in the last section. We test the ICA detector with M = 500, 1000, 2000, 5000 and 10000.

Table 5.2  Results of ICA Detector with Monte Carlo Simulation, M = 1000, 2000, 5000 and 10000 (symbol error rates of three randomly picked trials for each M, versus SNR)
Table 5.3  Results of ICA Detector with Monte Carlo Simulation with Zero, Random and Gold Code Initial Values (symbol error rates versus SNR)

Figure 5.7  Results of ICA Detector with Monte Carlo Simulation

Typical simulation results from a randomly picked trial are shown in Figure 5.8, and three randomly picked simulation results are shown in Table 5.2 as well. First, the ICA detector cannot find any ICs at all for M = 500 because the algorithm does not converge, so this result is not shown. Second, the computing load grows heavier as M increases. Finally, 100 simulations are performed for each M. In Figure 5.8 we can observe that the consistency of the simulation results improves and the symbol error rate decreases as M increases. In particular, Table 5.2 shows that the consistency of the results is acceptable at M = 10000. Therefore, M = 10000 will be used for all subsequent simulations. We also find that the ICA detector can even provide a lower symbol error rate than the conventional SUD detector for SNR > -5 dB and M = 5000 or 10000.

Figure 5.8  Results of ICA Detector with Monte Carlo Simulation, M = 1000, 2000, 5000 and 10000

Initial Value

The initial value is another important factor that affects the convergence of an iterative algorithm. Table 5.3 shows the simulation results of the ICA detector when the initial value is set to all-zero, random, and the Gold code. From the simulation results, no significant difference among the initial value settings is observed. However, we can notice that the number of iterations is slightly smaller when the Gold code is used. Finally, we also attempted to change the convergence criterion epsilon, but no difference was observed.

5.3 Simulation of ICA-SUD Detector

In the preceding sections, the ICA detector has been implemented and discussed.
The blind multi-user detection can be performed by the ICA based detector. The term blind implies that the detection can be performed from the received signal only. Although the system parameters are not necessary, the use of some pilot or training sequence is desirable to identify the signal of the desired user.

Figure 5.9  Model of ICA-SUD Detector for Monte Carlo Simulation

Table 5.4  Results of ICA-SUD Detector with Monte Carlo Simulation

SNR (dB) |   ICA    |   SUD    | ICA-SUD
  -10    | 0.027047 | 0.011077 | 0.010930
   -9    | 0.014180 | 0.005890 | 0.005213
   -8    | 0.006273 | 0.002773 | 0.002227
   -7    | 0.002207 | 0.001140 | 0.000753
   -6    | 0.000727 | 0.000473 | 0.000203
   -5    | 0.000187 | 0.000188 | 0.000070
   -4    | 0.000027 | 0.000073 | 0.000010
   -3    | 0.000000 | 0.000030 | 0.000000
   -2    | 0.000000 | 0.000007 | 0.000000
   -1    | 0.000000 | 0.000003 | 0.000000
    0    | 0.000000 | 0.000000 | 0.000000

Figure 5.10  Result of ICA-SUD Detector for Monte Carlo Simulation

5.3.1 Ambiguities of ICA Detector

Independent Components (ICs) can normally be found by the ICA detector. However, the ICA detector sometimes encounters the following ambiguities or indeterminacies.

Variances (energies) of the ICs cannot be determined

As a consequence, we have to fix the magnitudes of the ICs. This is not a problem in our simulation. Since the transmitted data streams are binomially distributed random variables with equal probabilities, the most natural way to do so is to assign +1 to all positive magnitudes and -1 to all negative magnitudes. Note that this still leaves the ambiguity of the sign. Therefore, prerequisite information such as training sequences is required to identify the sign of the decoded data stream.
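A short known prefix per user is enough to fix this indeterminacy. The following Python sketch is illustrative (the pilot length P = 32 and all other values are assumptions of this sketch, not thesis parameters): the ICA stage is modeled as returning the true symbol streams in an unknown order with unknown signs, and each user is identified by correlating its pilot bits against every IC, with the sign flipped to make the pilots agree.

```python
import numpy as np

rng = np.random.default_rng(1)
K, M, P = 5, 400, 32                      # users, symbols, pilot bits per user
B = rng.choice([-1, 1], size=(K, M))      # true symbol streams (pilot = first P bits)
perm = rng.permutation(K)                 # unknown IC ordering
signs = rng.choice([-1, 1], size=K)       # unknown per-IC sign
S = signs[:, None] * B[perm]              # what a converged ICA stage hands back

# Match each user to the IC whose pilot segment correlates most strongly,
# then flip the sign so that the pilots agree.
recovered = np.empty_like(B)
for k in range(K):
    c = S[:, :P] @ B[k, :P]               # pilot correlation with every IC
    j = int(np.abs(c).argmax())           # the owning IC scores |c| = P exactly
    recovered[k] = S[j] * np.sign(c[j])

print(bool(np.array_equal(recovered, B)))  # True
```

The true match always attains the maximum possible pilot correlation P, so the argmax picks it out unless two users happen to share a pilot pattern, which a modest P makes vanishingly unlikely.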
Order of the ICs cannot be determined

Because of the two indeterminacies above, the ICA detector cannot be used independently as a DS-CDMA detector. It requires a training sequence to help identify the owners of the decoded data streams. Therefore, ICA detection may perform better as an additional part of a conventional detector.

5.3.2 ICA-SUD Detector

The ICA-SUD detector is implemented. The simulation model is shown in Figure 5.9, and one randomly picked set of results of the numerical experiment is shown in Figure 5.10 and Table 5.4. Parameters were as follows: the number of symbols was M = 10000; the number of users was K = 30; the number of paths was L = 1; the Signal-to-Noise Ratio (SNR) was varied with respect to the individual user from -10 dB to 0 dB. The signals for all users are sent at the same power.

In this illustrative experiment, all the algorithms worked quite well. Two independent ICA detectors are used as the post-processing stage of the SUD detector. Figure 5.10 and Table 5.4 show the performance of these methods as the average bit-error-rate (BER) over all users. It is obvious that ICA-SUD gives the lowest BER.

6. CONCLUSIONS AND RECOMMENDATIONS

6.1 Conclusions

CDMA techniques are currently studied extensively in telecommunications, because they will be used in one form or another in future high-performance mobile communications systems. A specific feature of ICA, the semi-blind nature of source separation, provides a useful means for CDMA demodulation. Where the receiver has more or less prior information on the communication system, typically at least the spreading code of the desired user, ICA can be relieved of user identification and concentrate on decoding. This prior information should be combined in a suitable way with blind ICA techniques to achieve optimal results.
Another important design consideration is that practical algorithms should not be too computationally demanding, so that they can be realized in real time, though this still requires more effort in future work.

First of all, we began with the basic concepts of ICA. FastICA [4] was taken into deeper consideration. FastICA is a good multi-purpose implementation of ICA. The algorithm is based on the fixed-point method, so it has advantages such as fast convergence and suitability for both super-Gaussian and sub-Gaussian components.

Next, the basic concepts of multiple access, DSSS and DS-CDMA were discussed. The DS-CDMA signal model was introduced, as well as its discrete-time vector representation, followed by the system model of the ICA based DS-CDMA downlink detector.

Finally, the ICA DS-CDMA downlink detector has been simulated, and it was demonstrated that the ICA detector can solve the symbol estimation problem with no spreading code required, though the spreading code should be utilized to identify each user. Thus an ICA-SUD detector has been proposed, and the numerical experiments show that its symbol error rate is lower than that of the conventional SUD detector. Even if the powers of the signals are the same, additional multiple access interference can be mitigated by ICA, thus improving the performance of SUD. When the number of symbols involved in a single simulation run grows, e.g. from 1000 to 10000, the number of iterations is significantly reduced, e.g. from 20 down to 5 on average. The larger the number of symbols, the lower the symbol error rate; of course, more computation power is also required.

6.2 Recommendations

Compared to the conventional RAKE detector and the subspace MMSE detector, the SUD detector is the simpler method. Taking ICA as an additional element of RAKE and MMSE detectors in the presence of multi-path fading channels will be considered in future work.

References

[1] A. Hyvärinen, "One-unit contrast functions for independent component analysis: A statistical analysis," in Neural Networks for Signal Processing VII (Proc. IEEE Workshop on Neural Networks for Signal Processing), pp. 388-397, Amelia Island, Florida, USA, 1997.

[2] A. Hyvärinen, "Survey on independent component analysis," [Online]. Available: http://www.cis.hut.fi/aapo/

[3] A. Hyvärinen and E. Oja, "Independent Component Analysis: A Tutorial," [Online]. Available: http://www.cis.hut.fi/aapo/

[4] A. Hyvärinen and E. Oja, "A fast fixed-point algorithm for independent component analysis," Neural Computation, vol. 9, no. 7, pp. 1483-1492, 1997.

[5] A. J. Bell and T. J. Sejnowski, "An information-maximization approach to blind separation and blind deconvolution," Neural Computation, vol. 7, pp. 1129-1159, 1995.

[6] A. Papoulis, Probability, Random Variables and Stochastic Processes, 3rd ed., McGraw-Hill, 1991.

[7] C. Jutten and J. Hérault, "Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture," Signal Processing, vol. 24, pp. 1-10, 1991.

[8] C. Jutten and J. Hérault, "Blind separation of sources, part II: Problems statement," Signal Processing, vol. 24, pp. 11-20, 1991.

[9] C. Jutten and J. Hérault, "Blind separation of sources, part III: Stability analysis," Signal Processing, vol. 24, pp. 21-29, 1991.

[10] J. F. Cardoso, "Source separation using higher order moments," in Proc. ICASSP'89, pp. 2109-2112, 1989.

[11] J. Hérault and C. Jutten, "Space or time adaptive signal processing by neural network models," in Neural Networks for Computing: AIP Conference Proceedings 151, J. S. Denker, Ed. New York: American Institute of Physics, 1986, pp. 206-211.

[12] J. G. Proakis, Digital Communications, 4th ed., McGraw-Hill, 2001.

[13] J. G. Proakis, M. Salehi, and G. Bauch, Contemporary Communication Systems Using MATLAB and Simulink, 2nd ed., Thomson-Brooks/Cole, 2004.

[14] P. Comon, "Independent component analysis - a new concept?" Signal Processing, Elsevier, vol. 36, pp. 287-314, Apr. 1994. Special issue on Higher-Order Statistics.

[15] S. Amari, A. Cichocki, and H. H. Yang, "A new learning algorithm for blind signal separation," in Advances in Neural Information Processing Systems 8, Cambridge, MA: MIT Press, 1996, pp. 757-763.

[16] S. Verdú, Multiuser Detection, Cambridge University Press, 1998.

[17] T. Ristaniemi and R. Wu, "Mitigation of continuous-wave jamming in DS-CDMA systems using source separation techniques," Proc. Third International Workshop on Independent Component Analysis and Signal Separation (ICA2001), San Diego, CA, USA, December 2001, pp. 55-58.

[18] J. Joutsensalo and T. Ristaniemi, "Learning Algorithms for Blind Multiuser Detection in CDMA Downlink," Proc. PIMRC'98, Boston, USA, September 1998, pp. 1040-1044.

[19] R. Cristescu, T. Ristaniemi, J. Joutsensalo, and J. Karhunen, "Delay Estimation in CDMA Communications Using a FastICA Algorithm," Proc. International Workshop on Independent Component Analysis and Blind Signal Separation (ICA 2000), Helsinki, Finland, June 2000, pp. 105-110.

[20] T. Ristaniemi and J. Joutsensalo, "On the Performance of Blind Source Separation in CDMA Downlink," Proc. International Workshop on Independent Component Analysis and Signal Separation (ICA'99), Aussois, France, January 1999, pp. 437-442.

[21] T. Ristaniemi and R. Wu, "On the Performance of Multisensor Reception in CDMA by Fast Independent Component Analysis," Proc. 53rd IEEE Vehicular Technology Conference, Rhodes, Greece, May 6-9, 2001, vol. 3, pp. 1829-1833.
02 APPENDIX MATLAB SOURCE CODES File Descriptions [#### Signal generation wa goldSgen.m - Generates Gold code siggeni.m - Generates mixed signals I for ICA simulation (Speeches) siggen2.m - Generates mixed signals II for ICA simulation siggen3.m - Generates mixed signal sets for CDMA simulation #### InfoMAK simulation files #### infomax.m - Main program sep-m - The code for one learning pass thru the data wehange.m - Tracks size and direction of weight changes #### FastICA simulation files #### fastica.m - Main function program fpica.m - Main algorithm for calculating ICA remmean.m ~ Function for removing mean pcamat.m - Calculates the PCA for data whitenv.m - Function for whitening data }#### CDMA simulation files #### SUDICA.m ~ Main program SUD, ICA, SUD-ICA Detector yfastica.m ~ Modified fastica main program with Gold Code initial value gold5gen.m 1 Gotd code generator 1 P¥s.25: Sebite ALEN sequence (2 5) with éattsal value [1.0.0.0 01 X PwBLI266: Sebtea MLSh sequence (1.246) with daitiel value {1 0.0 0 0 golds generated Gold code, total 1 chipe tanta ton: itiairs aig: boldezeroe( 3,31): eldCd, PS 1265.6P¥5_25.0(-295 ns. 262 Pn, 26(2:31) 796_26(093; siggenl.m {Generate X containe m mixtures of m source speeches 63 Eas aseing anteee 1X + ized signal vector Read vay £12 Lat faleenrrend(sourcet?); [62 te}ovaveend( source?!) [es f}nrorrend( sourced") Let talewnmrend( sources"); [as f}-raveena(‘aources?);[e6 fa)nwarrend(source6'); [at floveveeed( source?) [66 feloworrend(aourc [ao fatmenreend sourced”); See. j02.1 508." 
985,66, 9:67.588.7189.°15 ta pleeizis) 1 ps000d, weds for example seed) aes 2:1 0s siieg natrsz eves {six Snpoteignale siggen2.m eso; vet0s8-11:80; Sct, derinco/2): 1 eumeota 5C2 Crone 29)-119/99-°6 2 funy eave 5(3, Dnt Ceone 2719/9) IT enetooth 5C4, oranda 4 Ganasian signal for’ emis4, 8b, D98(¢:) /as(e,2)): ead Israna(eize(S, 03; 08 nix oigaete siggen3.m_ 1 mixed Signat generator ~ Dovalink efgnal generator 1k: # of User, mast aeea-0; aeaesar(agreatO)s TL gst docorreteting tat ie, 4K aacorestate nines => cove oh oy) weayeos 4 sate. uanteing matrix, or errand: 4 eize(.2); Kara urualy oepe0; Woldek; elddelvarones(3,n02) 1 should converge on 4 pias for 292 ast but anmesling ‘LW iaprove sola even nore and 20 om 1=0,0%; B30; sop; U>0. 001; B30; sep Shataueveero; ‘haste sanized source sep.m 1 SEP goes ence through the mixed signals, I sm bateh blocks of size B, adjusting weighte, ¥, at the end of each Block swscpravacptl; tel noblockastix(p/8) BI-Be4d SKC ete Weve CbteCie2e Cd. /Csexp(-wd))) 00") {change olddatea,ang1«] change (W014, ¥,o1ddelta);WoldWs sprints Cesecenenpria, changorlct sagleeI. if dog.» (ad, 0%4,p%@,4,L.S4) \o", swoop, change, 180raseie/pi0.8 PBL) woureh ' “tahould be 4 persatatioa ‘wchange.m| function (change, delta, anglel-vehange (Wold o1daelt=) (eralneize(Q)s deleerrostape( olde, Sy090)) chingesdoltecdetta'; nglevacoa((deltavolddel te’) /aqrt((deitaedetta’)+(olddeiteraiddelts02)5 fastica.m fmction (Dutt, Out2, Outs} = fastica(atzedesg, varargia? Wlseaetg, ty Ml = FASTICA Gebnedsie) Yo ieerig ""—"setisated {ndepondent conponects tes ising anerts XW Seeioeted unsistog eetriz 4 mixedang — nied aignat XL Aamove the mean and check the data Ueteedsig, aizedeeen) = rennosa(ateedeig) Ue of rove, # of cotumel (Die, NiadeSaspt] = eize(asxedese)s 4 dotante vaiues for “peamst’ parazetes Tastasg Dis; A Defaait values for "fpice’ peraneters hema Sens : See ‘isplaytode ‘ieplayineervad 05 1 Paranatere for faatIOh - i.e. 
this file only = 34 ueeriaOfTC = 0; 1 priae information about data Spriatt(Munber of signals! id\a!, Dia) printeC manor of ssnpleet d\n’, SanD#Sexp)); 1 ack 4¢ the data bas Doon entered the wrong way, ‘Lue vara only... st may be on parpore prince earning: {printf(Toe signal aatrix may be oriented sm the wrong way.\2"?s sprints (In that cage traepore the tateix.\\n)s X eatoutating Poa (eb}=poanat(asxedstg,ttrotRig, last, interactsvePCR); TL mivening the data [ehitonig,whiveniagMotris,dauhieningMetréx}vahstanr(aszedsig, 6,0); 4 catealating the 10h 1 Ghece sone paraneters [1 Tee dimension of tbe data aay have been reduced during PCk calculations. 4s original dinsneion ie colesiated fron the data ty default, and the Hanser of IC Sa by dataule sat vo equal that dinension. Din » aszeCanstenig, 1) ‘Toe auber of C'S must be 2eas or equal to the dinaneion of data se'aimlete > bie {Show warsing oaly Sf varboce ~ ‘on! and user suplied a value for *uug0470" {printtC'veraing: eotinating only 14 Sodependeat conpensats\n', minDf1C); ‘sprines( Can!" satinate more ANdependent componente than diaeaeion of a8ts)\n"); caLcatate the ICA vita tized point algorstaa, a, M's tpica Giscesig, vaitensnglatrix, dovhit ‘opreuch, aunlfle, , finatoney aly ad, ayy, stabitcention, ‘potion, susnfeerations, macFinotane, iniestate, guess, HimpleSize, ieplayMode, dieplayinterval): agate, 1 cece for walsa zevara fe “Laoapey@) {Lda the ean back So sprinee(Assing the saan BACK to the data. \a?); Seaaig = W'+ eszedaig + (W's wizednean) + ones(l, NuOtSaspl); FprinteCote taae the plote don!" have the mesh added, \n")} + teasig = Os ead (uet'= Sennigiout2 = A:0u69 = Ws fpica.m function (A, W) = fpicaUt, whiteniaghatrie, dewhite ‘mmOtlO, gy fsnevane, at, o2, ayy, stabilization, Spetloa, sachunTtorations, mexFiautane, saitat guess, eampleSize, diaplayode, displayiocervel; inghatrix, approach, 1 Pectora sadepentent component analyais wring Hyvarinen’s fixed point ‘elgorteas. 
Outpute an eatinnte of the mixing aatrix A and ies Snverse W. x 1H wasvens se shiteaed date es row vectors 1 wbieentagtotrse can be obtained ith function shitenr 4X devitentngiateis ean be obtalaed with futction whiten Xapprouch C veyna? | ‘doth? ]_ ihe approach used (deflation or eYmeetsic) Anmoric [= Di of whitesig ] immber of indopendant. components estinated Xe C Ypovse I vemar | sthe aonliearsey used 66 1 ‘aus? | Yakov? 7 Ttioeeune (anne az g + 'oft'] the nonlinearity used ia finetuning tar [perareter for tuning “tanh? te [permeeter for tuning "get? ta step size in eeabilived algorsthe IT stabiniaetion ( Yon! 1 off? fit m <1 then attonatically co spetlon stopping criterion. {{sashariteratsene sarin masher of iterations 1 nasPinetune fenrinwe muaber of iteretions for finetuning Edaitstate [Prana | tguece? ]sinieian guece or random saitinl state. See beloe TK guess nitiai guess for A. Ipuored sf iaveseate = "rand? Usmplesice C0 - 1) ‘percentage of the sugples weed ia one iteretion {aleplaytode [ Vougaale! | ‘uecte! | splot running eatsaate t vesieere! | '2te" FE ateptaytete amber of iterstions ve take between plots sf nargin © 3, exror( ot enough sepuments!?); end [vectorsice, sunsunplee] = 2:58! Tevwcxing the data Se "aareal(D, error('Topst bas an imaginary part./5 end ey se seen SrrorCoprinte(TI2egal value [ Ye } for paranetor: "approach! 
\a!, approach); Epriote(Ueed approach (lg J.\a', approach Knocking the vanue for muxorze 4 voctorSiza < nant IG, error('Mist dave ouETC <= Dimeneson!”) send 1H cncceing the emptesize 6 samplesize > 1 sumpledize = 1ifprist#(varaing: Setting "emplesize?” to 1.\a"; sf aumplesice © 2 ‘St Cenmpledize © tantanplee) < 1000 ‘samplasizn = win(looo/aumsanples, 1); Aprintt Warning: Setting "sampieStz0’* to 10.3 (ld eanplee).\2", ‘sumplesize, floor (aumplesize * aunSaxplee));, 32 Caamplasize <1) sprint Ueing about 10.08%% of the eamplas sn random ordar So every atep.\a?,eamplasizes100); 4X checeang the valve for nonlinearity seiteh Loree) (ase pow", gOeig = 0: "ean", goeig = 205, Cqaaa!, gases, gorse = 20, (are ‘eter, plese = Or ‘eror(oprinte(Tilegal value [ te ] for paraseter: '75!"\a!, 9) WE ssaplesize “= 1, plrig = worse + ie egy "se atcig'~ gieig trend sprinteCvsed nontsaeaesey [2 Da, ‘snetuntngEnt seiten lover (ti (aso ‘psd, gFine = 10+ 15 ‘am’, gFine = 204 35 (guns! gauss’), gFine © 90 + ty ‘eev, gFine = 409 1p or At nyy “1, grine = gOrigs iso, gFie = gOrig » tyead ‘Hastaningtnabled = 0; eror(eprint¢(/TIlepal value [ie } for paramatar: /¢inetane! \e! Tinotase)): ona sf sinataninginahled sprint C Fanetaning enabled (aabinearéey: (ie })-\n!, #eeesae) switch lover (atabslsaation) ae ayy ‘abiliaationfaabled = 45 ‘error sprint #(Tiiogal value { Xs ] for paranetor: *vetabstizatson’*\n', stebinéeation)): st stabiltationbuabled, fprintf (Using stabilized algorsthn. a?) send 1am ster preeters sya = yh Pihense Re ciw-aing wc ony = at + 7 lee may tne dave ty for comets wl v0 g06 op ssodilsoearity = glrigistroks = Oja0tFine = Sslong = 0; oneceing the vaiue for initial 2 besten lover intestate) fase "rand", initialseatatods = 0 Af eize(guese,1) “+ size(whitesingtatrix,2) fpranttC'arning: size of initial guess 16 incorrect, Ueing randea initial gueee.\3"): Hf sizaCguese,2) < mimoETe fprinte('darning: initial guoca only for f3ret *) sprintt("Id coaponenta. 
Using Tandon Soitial queen for others.\a',siea(gu0ss,2)); eize(gunes, 2) + SsmuOfI6) ~ rand(vectorSiae,sunltTo-eize(gusee,2))-.8) Doasaeie Sprines('darniag: Initial guess too Largs. The excess colina are dropped. \n?)s print CUsiag Snitial gumes.\2?); rror(eprint¢(/Tilagal valse (ie ] for parameter: 'snseseate’ + anieseat)s 1X Gecking the valve for dieplay node. Peston lover(tpaytode) face Come Fy aeedbseplay = 0} case Gon", ‘seaDioplay Sf (unbanptoe > 10000, fprintt warning: Data vectors are very long. Plotting may take Long tine.\n') send if (aunOETe > 25), tpraterWarniagt There aro to nany sigasie to plot. Plot nay act look good.\a));end Af (ouDEC > 25),fprinte(*Warning: There are too many signals to plot. Plot may aot look good.\n’) end 68 ‘woaDiapray = IF Cvectordize > 25) fprinteCVaraing: There are too many signala to plat. Plot aay not Look good. \n") send ‘rvor eprint (1129 1 value {a T for pareweters ‘*4teplayMode!"\a', dtepteytode)): 1 the ateplaytnterval can't be 1eee than 1. Se dlaplayinterval € 1) dlaplayinterval = yea sprinte(‘Sterting 104 ealculation...\3 ‘Tremere APPROACH SE appreachede == 1, ‘eee sone paranctare wore eediainearity = girigyatroke = OjpotPine = t;long = 0: zeros(vectorsize, au0ei0); Devhitened baste vectors Sf saieiaseatefods = 0 ‘{ Take rancon orthonormal Saitial vectors, 3 = oreh(rendGrectarSize, sunDfIC) ~ -8); ‘Lise the given initial vector ae the éaiesal a ="thstaninglatrix + guess; bold = zeroo(esee(8)):B0142 = zerea(eiza(8)): 2h inthe acanl hee pola seraton op. ‘printf ("No convergence after Yd stepa\n’, sarualterations); ‘eine oce tha the pote axe probly wrong as 1% Symotric orthogonalszatioa, Baas reekGine(B! © BY°C1/29) {Test for termination condition. Note that we consider opposite {directions here a0 wll ‘iaibedoe = nin(abe(ding(S" * 8014)))saiaAbaCond = win(abe(aing(®? + 80122))); st (L~ minfoeton < epeston) Af fluetaningSnabled & notFine sprint! 
(attial convergence, fine-tuning: \n")s totFine ~ 0; ueedilanesrity gFine; ayy ~ ayyK * ayyOrig: Bold = zeros(asee(8)); Ula? = zerea(ot2a(8)) ‘sprintt( Convergence after YA 0" round; Y cuculate the davuhitened vectors A= domniveningyateie + 8; break: SE Cetrone) b (1 = aialbetoe? < epsilon) priate Cstroxstw): Beroke = ayy: ayy = «Seuyy: Ef nod(asedhlinearsty 2) = ‘ay = stroke; stroke = 0; BF Gayy = 1) b (ood (used¥ inearity,2) cg measnnetey = tetany = ‘elself ("long) & (coundaasthatterations/2) ‘priate Texing tong (reducing step 2i25)\e")5 eng = 45 ayy = Sonny: Af medCuseatrinearsey,2) = 0 vag HesMineaty = eam inearsty #45 saiinearity = waedminearity + Asem o 69 oiaa = soa;Bora = 8; {Show the progress Se sound m1, fprsned('Seep no. d\n, round ‘printe('Step no. 1d, cbaage im value of estinate: 1.3f \n!, round, 1 ~ sinkbeCos) end {Ades plot the current stats ‘Sf ven(round, ateplaytaterval) — 0, ‘There as and may still be ether Aseplaynodee 2D afgnale Seaplot depres "8)")5, ‘Af ven(round, ateptaytatervan) Cas. and now there are #-) iD ete AS arehiteninglatrix + Bjicaplot(diepeie? AY; caoe'3 ‘sf ven(rosnd, dteplapiacerval) ‘hows ad sow there are) Lip eieere ag PTB? P eblenunpattataplot se resten asedilinearsty face 10 pow Be Ge (CH #8). 9)) / sansaapl nee 1 "fl opeinoste ~ opationin korotsia eroja te on optiaoite ood, kateo rants ood of X aluaisemniata versioiste inten 2.0 bets YE) #18; Gpovs = Y .” 3; Bota = sunt + Gpows); D> tag(s ./ (beea = 3 » nomtanples)); BBs ayy Bs (+ Opeed = Giag(Beta)) + D; ‘xaubex(:, gotSanpios(amsonptee, earplessze)): Bi Giaah 6 (Cabs #8) 7 3)) Paize(Rew,2) ~ 9 «Bs 1 Optinoses YaubeXC, gotSanples ouseaples, samples)’ + 8; Gpoud = Yeah. 
3; Beta = sum(feub © Gpow3); Dim aiag(t «/ eta = 3 + sizeCteub', 2)) Bae ayy » Bs (laub’ + Gpead = diag(@ets)) + D; "20 1 tanh hyptan = eamh(al = X° + 8s Br ks byptan / auntaaples — cnne(eize(B,1),3) eam ~ yan ~ 2) 1 opeinosea ~ epationin kototaia York's By hyptan = tamh(at + 7D; Bota = euacY + bypraa)s DT diag(t ./ Coats ~ at's runt = byptan ©" 29334 Ba Bey s Bs ( + hypten ~ dinglQets)) + Dy case 22 HaubeX(:, getSanples(aunSanplee, sexplasize)): bypfan =/tanb(al = xd" + 8); B= Kab + ayptan / sizeGies, 2) — ‘see(ize(By3),1) #sun(l'~ hyptan -” 2) = B / stzoCtaub, 2» ats *Copesnosea YaKG, getSanples(ounsaaples, samplesiss))? © 6: byptan + tanh(al +103 Beta = sun(Y «+ hypraa) ie Ws 1 sonsaples + ats 70 De aiag(t «/ (Beta = at » mutt - bypten -* 29) BoBe ay eB (0 © byptan = diegitets)) © D; 1 gaoes Ue + 8; UequarednU .” 2 ex = exp(-at + Uequared / 29) guise = Use oxy aug = (1> a2 + Unquared) tors Beek > gauss / uonsan nastaize(®,),1) + sun(dlauae) va) aontanples ¢ opeinosea Yards By ox = anp(oad © (1 ." 2) / 2); gauss Beta = tune + gouse)s Dev dingG -/ (ante sam((L ~ a2 + F - 2)) «+ «8909; Bboy e Bs (Oo gues ~ ding(Bein) + 0; sabes, get Sample CoanSanple fer 7 exp(-a2 ¢ Usquared / 2); gus = + 0 dinner © (1 = a2 = Uoquered) ten Be Kaob + gauss / aszedaeus.D) ~~. sanptessee)): ‘nea ize 29,1) + sudden). 7 eizabtons,2) § 1 Opeinotea YaKG, getGamples(ounSanples, sanpleSize))? © 8; 7D: ase Yo ax Seve = cua? "+ geass); Di diag(s «/ (Bota = en((L = a2 + (2) + oD): BeBeayy eB (0 + gauss ~ dingibeia)) © iaser B= (Le (Qk? 6B) .* 29) / aman case "C Opesnotes Yo; Gozae = 4.” 2; Beta.» sun(Y + Cakes) F aiagt./ Geta): = Bsinyy + B+ C1" + Gokow ~ dingtBera)) + Ds “oubex(:, getSonplae(ounSugplee, saxplesize)) Bs Gub © (Goub! 8) 2 a)) / eisedleabDs 1 Vast optinoita Ya XG, getGanples(omSemplee, semplaSize))? + skew ='Yo-> 2; Beta = gun(Y -+ Geko); Dis diag(t -/ (Bata); Ba Begs ¢ Be (0 » Gakew ~ ding(Bsta)) « Di ‘rror (Code for desiced nonlinearity aot found!; oot Homuate 108 ti2ters. 
W'B + vbttentngiatria; 1 Ao plot the ast oe beiech hawabiepley ae wt aay se: be otber dseplaynaes.. Teapoe aiepee! CO OUST. ana no enero 50 =) 0 baa Seaplot(apetg! AM sAraemows ‘boss aad ow thare are =) aD divers a caplet (asepotg? i) :deeimovs 1 pertarion aprnonce Tae search for a basis vector is repeated auntie tin while round <= nun0fTC, fry = myyOrig; ueedMisnearity = gOrigi atsoke = 0; aotPine = 1; long - 0; andFinetuning = 0 1 show whe progress. HpriateCre ka", sound): 1 Take « random tnitia vector of lenght { and exthogonadice it ‘Lith respect to ene other vectors ‘rvhitentagatetamgueas (round); Pee BoB ew ww / norated: wind = zeros(eiz0(s) 282 = aaroaCstenCed?: 1X mse 40 the actual fixed-point Steration Loop. Lo fort =1's eaxtunteeretione #2 {the projection with matrix B, since he toro entries do not 1 conteibate te the projection Sipe Brew wow / norm: ‘printeC\econponeat aunber Ya aid et conver ‘Found, eaehunteerationa) ‘printf (Too wany failures to converge (ld). Giving wp.\a', suaFotTaree) Sfround =r 0, MoUs Hes o0d KnunFastores > faturelinit Wi = mecnuntterations + 1 case "ts¢ aotrine wOld'= wp So the algorstha will atop on the sext test ona Wie notrsne sn 1d Sterations.\a!, 1X show the progr priate; 1 Tost for tarnioation condition. ote that *he algorstha bas 4 converged if the direction of and wld Sa the 4an6, thie se may we eect the tro cares Se norma = xOld) < epeiion | sora(e + wOld) € opetlon 72 46 Hinetuniaginabied # sotFine ‘printf initiel convergence, fSne-tuniag: ’): rotting = 0; gaboa = nasPinecune; Old = ceree(eiae()); Ola? zaroaCeiza(e))s unedllinearity = eFin Soy = ayy ¢ wyyOrsg; cndPioctaning = nani Save te vector BC:, round) = 1 Chicalate the do-ehitened vector. AC roma) = dovasteniogéatris * Gemeulae tet titeer. Uczound, ) = 9? 
+ whiteningtarics Show the progeece sprinteCecrpated (id etepe ) \a!, 2) Waieo plot the current etate seseen tneaiepay ‘SE rn(rosnd, displayloterra) = 0, {There wae and may still be other dteplayzodee iD signe eg OP HPA: Scape CAtopose?stenp(¢ tsm0F0) "9s arms ‘Mf ren(eound, dspreyinterval) ss and pow there ate ib oarte Sesplot( dtepeig! A") Aramow cue} ‘sf ran(rosnd, dteplayloverva) Tass and now there are) TAD cicere eng SAE Apetg? 05 dren Pavsven weeaniapley Deeaky X10 ready = next TE cinetuntngiaabled & actFine 22 Catroke) Georm(e ~ 60142) < epaston | noon(y # W062) < «0. patie) eteoke = ayy: fprinte(strokel!) ary = Sears: Ef wodCGoeatisneertty,2) == 0 ‘Soodilsnearity = seedilsneassty + 45 ‘ay = stroke; stroke * 0: BF ayy = 1) (aodGosedthsneartty,2) ‘cedilinearity = ueediineerity = ‘nse (0otFine) # (long) & (4 > sentntverations / 2) ‘tprinttCTansag Long (eeduciag step eize) ')? ang = 35 a3y = Seay: $2 soa(aredtinearity,2) == 0 % poo (+ (CO +) = 2) / masamplos ~ 9 6 vs Eps = (& + (Ca!) 9) / manSesptess 2B Bota sw! + Etapoxs; wine ayy * Exped ~ Bata sv) / (2 Betads ‘esbet(: get Sunples(susSanplee, caaplesiza)) ws Gleat+ (Gud! ex) -” 3)) /aszeGiaus, 2) ~3 2 5 suupresi2e)) 3) / stzebteub, 2); wae ayy © GXtpond - Bata + ») / (2 Bots); i tam case 20 Typtan = tamkiat © x) + 2): wn GL bypfan ~/at ¢'eum(t - aypfan * 2)/ + w) / sunsaaplen ‘ypten = tamnéat © x7 6 05 Baas ys ks hyptan wre = yy © (ke bypfan = Bota #3) / Get onCoayptan ‘oubeC+gotSanptosounsanp fypTen ="tana(a! © Keub! +) se Gleab » byptan ~ at + eun(t - byptan .* 2)! 0 x) / aize(ews, 2); ‘oubekC+ gotSanpten ounsanp fyplen ="fonnGe! ¢ Kevbl = 8 Betas e's daub © hyptany ‘yy (Gtexb © hyptan = Bata 6's) / sumplesize)): samptesize)): at # tun(Crnypran 12)") = Bete)? goes "E Tase has bean split for performsace reasons Uae ees udu. 2; excexpleed w2/? 
pase eee Sonuce (a2 © a2) 00 GE + geuce = cunCachuce)’ + w) / sunsunptes: case 34 "en 2 ows wu."2; exconpt-al «92/25 (geuse =) Givexy Osage = (a2 42) ae; Betas ys Ts gauss See nyy * Ck gauss ~ Bata #¥) / (eum(dinnas)? * Bees); “eubeX( gotSunplee(aunSanplee, sunplestze)): is Haubl = yy seusr3p aneeapicad » 2/2; (reuse = aleexy dOnuge = (282 * 42) ten; i (toub o gauss ~ sun(dlauae)! + =) /stzaCtoub, 25 ‘Houb-XC+,gotSaapLoo(oumSanplee, sanplesiee)): abn ey ea ceerp(a2 * 20/2); gauss = ate; aoauee = (2a? tw) on Betas ws few © gna, See ayy (Kea = gu {eun(dinuas)’ = Bees); anew pata ew / ck + (GP ow) «7 2) / untangle: (Ls (#9) = 2) / msampless ‘aeub-XC+,gotSanptos(ounsanples, sanplesize)): B= Ghsh's (ab! # ws 3D) 7 aise, 235 outed(+ getSonplee(aunSunples, samplasszs)): EXGeuey = Glaub + (Gud? © ¥) .° 29) / atzedienb, 2); 4 go> ayy = (XGaLen ~ Bataes) /(-Bated; ‘rror("Gade for desired nonlinearity not found"); X Wormalion the ney wee / normte)st = #4 45 ‘peints(Done.\8"95 aso plot the onse thet nay act have been plotted Er CoseaDieplay > 6) & Ceon(Gound-i, dieplayineerval) “ 0) este aeenepiny ‘ Teera was and say still be other dteplaymidae 1D agnae ‘oup = K's8stcaplot(‘asepasg? tenp(:,S:mun0#10)") drei cue? a and sow there are ib vests Scaplot(asepess! 1?) 
% Also plot the ones that may not have been plotted.
if (usedDisplay > 0) & (rem(round - 1, displayInterval) ~= 0)
 % There was and may still be other displaymodes...
 switch usedDisplay
  case 1
   temp = W * mixedsig;
   icaplot('dispsig', temp(:, 1:numOfIC)');
  case 2
   icaplot('dispsig', A');
  case 3
   icaplot('dispsig', W);
 end
 drawnow;
end

% In the end let's check the data for some security.
if ~isreal(A)
 fprintf('Warning: removing the imaginary part from the result.\n');
 A = real(A);
 W = real(W);
end

% SUBFUNCTION
% Calculates tanh simplier and faster than Matlab tanh function.
function y = tanh(x)
y = 1 - 2 ./ (exp(2 * x) + 1);

function Samples = getSamples(max, percentage)
Samples = find(rand(1, max) < percentage);

remmean.m

function [newVectors, meanValue] = remmean(vectors)
%REMMEAN - remove the mean from vectors
newVectors = zeros(size(vectors));
meanValue = mean(vectors')';
newVectors = vectors - meanValue * ones(1, size(vectors, 2));

pcamat.m

function [E, D] = pcamat(vectors, firstEig, lastEig, s_interactive)
% [E, D] = pcamat(vectors, firstEig, lastEig, interactive, verbose)
%
% Calculates the PCA matrices for given data (row) vectors. Returns
% the eigenvector (E) and diagonal eigenvalue (D) matrices containing
% the selected subspaces. Dimensionality reduction is controlled with
% the parameters 'firstEig' and 'lastEig' - but it can also be done
% interactively by setting parameter 'interactive' to 'on' or 'gui'.
%
% ARGUMENTS
% vectors      Data in row vectors.
% firstEig     Index of the largest eigenvalue to keep.
%              Default is 1.
% lastEig      Index of the last eigenvalue to keep.
%              Default is equal to dimension of vectors.
% interactive  Specify eigenvalues to keep interactively. Note that if
%              you set 'interactive' to 'on' or 'gui' then the values
%              for 'firstEig' and 'lastEig' will be ignored, but they
%              still have to be entered. If the value is set to 'gui'
%              then the same graphical user interface as in FASTICAG
%              will be used.
%
% For historical reasons this version does not sort the eigenvalues or
% the eigenvectors in any way. Therefore neither does the FASTICA or
% FASTICAG. Generally it means that the components returned from
% whitening is almost in reversed order. (That means, they usually are,
% but sometimes they are not - depends on the eig-command of matlab.)

% 16.6.2000
% Hugo Gävert

% Check the optional parameters;
switch lower(s_interactive)
 case 'off', b_interactive = 0;
 case 'on',  b_interactive = 1;
 case 'gui', b_interactive = 2;
 otherwise
  error(sprintf('Illegal value [ %s ] for parameter: ''interactive''\n', ...
                s_interactive));
end

oldDimension = size(vectors, 1);
if ~b_interactive
 if lastEig < 1 | lastEig > oldDimension
  error(sprintf('Illegal value [ %d ] for parameter: ''lastEig''\n', lastEig));
 end
 if firstEig < 1 | firstEig > lastEig
  error(sprintf('Illegal value [ %d ] for parameter: ''firstEig''\n', firstEig));
 end
end

% Calculate PCA

% Calculate the covariance matrix.
fprintf('Calculating covariance...\n');
covarianceMatrix = cov(vectors', 1);
maxLastEig = rank(covarianceMatrix, 1e-9);

% Calculate the eigenvalues and eigenvectors of covariance matrix.
[E, D] = eig(covarianceMatrix);

% Sort the eigenvalues - descending.
eigenvalues = flipud(sort(diag(D)));

% Interactive part - command-line
if b_interactive == 1
 % Show the eigenvalues to the user
 hEigFig = figure;
 bar(eigenvalues);
 title('Eigenvalues');

 % Ask the range from the user...
 % ... and keep on asking until the range is valid :-)
 rangeOK = 0;
 while rangeOK == 0
  firstEig = input('The index of the largest eigenvalue to keep? (1) ');
  lastEig = input(['The index of the smallest eigenvalue to keep? (' ...
                   int2str(oldDimension) ') ']);
  % Check the new values...
  % if they are empty then use default values
  if isempty(firstEig), firstEig = 1; end
  if isempty(lastEig), lastEig = oldDimension; end
  % Check that the entered values are within the range
  rangeOK = 1;
  if lastEig < 1 | lastEig > oldDimension
   fprintf('Illegal number for the last eigenvalue.\n');
   rangeOK = 0;
  end
  if firstEig < 1 | firstEig > lastEig
   fprintf('Illegal number for the first eigenvalue.\n');
   rangeOK = 0;
  end
 end
 % Close the window if necessary
 if ishandle(hEigFig), close(hEigFig); end
end

% See if the user has reduced the dimension enough
if lastEig > maxLastEig
 lastEig = maxLastEig;
 fprintf(['Dimension reduced to %d due to the singularity of covariance ' ...
          'matrix\n'], lastEig - firstEig + 1);
else
 % Reduce the dimensionality of the problem.
 if oldDimension == (lastEig - firstEig + 1)
  fprintf('Dimension not reduced.\n');
 else
  fprintf('Reducing dimension...\n');
 end
end
% Drop the smaller eigenvalues
if lastEig < oldDimension
 lowerLimitValue = (eigenvalues(lastEig) + eigenvalues(lastEig + 1)) / 2;
else
 lowerLimitValue = eigenvalues(oldDimension) - 1;
end
lowerColumns = diag(D) > lowerLimitValue;

% Drop the larger eigenvalues
if firstEig > 1
 higherLimitValue = (eigenvalues(firstEig - 1) + eigenvalues(firstEig)) / 2;
else
 higherLimitValue = eigenvalues(1) + 1;
end
higherColumns = diag(D) < higherLimitValue;

% Combine the results from above
selectedColumns = lowerColumns & higherColumns;

% Print some info for the user
fprintf('Selected [ %d ] dimensions.\n', sum(selectedColumns));
if sum(selectedColumns) ~= lastEig - firstEig + 1
 error('Selected a wrong number of dimensions.');
end
fprintf('Smallest remaining (non-zero) eigenvalue [ %g ]\n', eigenvalues(lastEig));
fprintf('Largest remaining (non-zero) eigenvalue [ %g ]\n', eigenvalues(firstEig));
fprintf('Sum of removed eigenvalues [ %g ]\n', ...
        sum(diag(D) .* (~selectedColumns)));

% Select the columns which correspond to the desired range
% of eigenvalues.
E = selcol(E, selectedColumns);
D = selcol(selcol(D, selectedColumns)', selectedColumns);

% Some more information
sumAll = sum(eigenvalues);
sumUsed = sum(diag(D));
retained = (sumUsed / sumAll) * 100;
fprintf('[ %g ] %% of (non-zero) eigenvalues retained.\n', retained);

function newMatrix = selcol(oldMatrix, maskVector)
% newMatrix = selcol(oldMatrix, maskVector);
%
% Selects the columns of the matrix that marked by one in the given
% vector. The maskVector is a column vector.
if size(maskVector, 1) ~= size(oldMatrix, 2)
 error('The mask vector and matrix are of incompatible size.');
end
numTaken = 0;
for i = 1 : size(maskVector, 1)
 if maskVector(i, 1) == 1
  takingMask(1, numTaken + 1) = i;
  numTaken = numTaken + 1;
 end
end
newMatrix = oldMatrix(:, takingMask);

whitenv.m

function [newVectors, whiteningMatrix, dewhiteningMatrix] = whitenv(vectors, E, D)
%WHITENV - Whitens the data (row vectors) and reduces dimension. Returns
% the whitened vectors, whitening and dewhitening matrices.
%
% ARGUMENTS
% vectors  Data in row vectors
% E        Eigenvector matrix from function 'pcamat'
% D        Diagonal eigenvalue matrix from function 'pcamat'

% Calculate the whitening and dewhitening matrices
% (these handle dimensionality simultaneously).
whiteningMatrix = inv(sqrt(D)) * E';
dewhiteningMatrix = E * sqrt(D);

% Project to the eigenvectors of the covariance matrix.
% Whiten the samples and reduce dimension simultaneously.
fprintf('Whitening...\n');
newVectors = whiteningMatrix * vectors;

% Just some security...
if ~isreal(newVectors)
 error('Whitened vectors have imaginary values.');
end

fprintf('Check: covariance differs from identity by [ %g ].\n', ...
        max(max(abs(cov(newVectors', 1) - eye(size(newVectors, 1))))));

SUDICA.m

% Comparison: SUD vs ICA vs ICA-SUD over AWGN downlink.
% User number K = 30; the 31th is the Gaussian noise.
% N = # of symbols.
load golds.mat;
load spreadsig.mat;

low_SNRindB = -10; high_SNRindB = 0; step = 5;
SNRindB = low_SNRindB : step : high_SNRindB;

for ii = 1 : length(SNRindB)
 Ap = 10^(SNRindB(ii) / 20);
 r = Ap * spreadsig + noise;             % received downlink chip stream
 R = reshape(r, 31, length(r) / 31);

 % ICA detection
 [Bhat_ICA1] = yfastica(R_ICA1);
 [Bhat_ICA2] = yfastica(R_ICA2);
 [Bhat_ICA3] = yfastica(R_ICA3);

 for i = 1 : K
  % the max value will be expected
  for j = 1 : K
   temp(j) = sum(bitstream(i, :) .* sign(Bhat_ICA1(j, :)));
  end
  BER(i) = 1 - max(abs(temp)) / N;
 end
 BER_mean_ICA(ii) = mean(BER);
 clear temp; clear BER;

 sp = reshape(R, 31, size(R, 2));
 for k = 1 : K
  % SUD decision
  for j = 1 : N
   temp = sp(:, j) .* gold(k, :)';
   Bhat_SUD(k, j) = sign(sum(temp));
  end
  BER(k) = sum(Bhat_SUD(k, :) ~= bitstream(k, :)) / N;

  % SUDICA cal - compare with ICA1
  Bhat_SUDICA(k, :) = Bhat_SUD(k, :);
  temp = sign(Bhat_ICA1) * Bhat_SUD(k, :)';
  if max(temp) > abs(min(temp))
   [tmp1, tmp2] = max(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) + sign(Bhat_ICA1(tmp2, :));
  else
   [tmp1, tmp2] = min(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) - sign(Bhat_ICA1(tmp2, :));
  end

  % SUDICA cal - compare with ICA2
  temp = sign(Bhat_ICA2) * Bhat_SUD(k, :)';
  if max(temp) > abs(min(temp))
   [tmp1, tmp2] = max(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) + sign(Bhat_ICA2(tmp2, :));
  else
   [tmp1, tmp2] = min(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) - sign(Bhat_ICA2(tmp2, :));
  end

  % SUDICA cal - compare with ICA3
  temp = sign(Bhat_ICA3) * Bhat_SUD(k, :)';
  if max(temp) > abs(min(temp))
   [tmp1, tmp2] = max(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) + sign(Bhat_ICA3(tmp2, :));
  else
   [tmp1, tmp2] = min(temp);
   Bhat_SUDICA(k, :) = Bhat_SUDICA(k, :) - sign(Bhat_ICA3(tmp2, :));
  end

  Bhat_SUDICA(k, :) = sign(Bhat_SUDICA(k, :));
  BER_SUDICA(k) = sum(abs(Bhat_SUDICA(k, :) - bitstream(k, :))) / 2 / N;
 end
 BER_mean_SUD(ii) = mean(BER);
 BER_mean_SUDICA(ii) = mean(BER_SUDICA);
 clear temp; clear BER; clear BER_SUDICA;
end

save result.mat BER_mean_SUD BER_mean_ICA BER_mean_SUDICA ...
     low_SNRindB high_SNRindB N;

yfastica.m

function [Out1, Out2, Out3] = yfastica(mixedsig)

[mixedsig, mixedmean] = remmean(mixedsig);
[Dim, NumOfSampl] = size(mixedsig);

% Calculating PCA: we need the results of PCA for whitening, but if we
% don't need to do whitening... then we don't need PCA.
covarianceMatrix = cov(mixedsig', 1);  % Calculate the covariance matrix.

% Calculate the eigenvalues and eigenvectors of covariance matrix.
[E, D] = eig(covarianceMatrix);

% Calculate the whitening and dewhitening matrices
% (these handle dimensionality simultaneously).
whiteningMatrix = inv(sqrt(D)) * E';
dewhiteningMatrix = E * sqrt(D);

% Project to the eigenvectors of the covariance matrix.
% Whiten the samples and reduce dimension simultaneously.
whitesig = whiteningMatrix * mixedsig;

% Calculate the ICA with fixed-point algorithm.
[A, W] = yfpica(whitesig, whiteningMatrix, dewhiteningMatrix);

Out1 = W * mixedsig + (W * mixedmean) * ones(1, NumOfSampl);
Out2 = A;
Out3 = W;

function [A, W] = yfpica(X, whiteningMatrix, dewhiteningMatrix)

failureLimit = 5; maxNumIterations = 1000;
fineTuningEnabled = 0; notFine = 1;
epsilon = 0.0001;
[vectorSize, numSamples] = size(X);
numOfIC = vectorSize;
B = zeros(vectorSize);

% DEFLATION APPROACH: the search for a basis vector is repeated numOfIC times.
round = 1; numFailures = 0;
while round <= numOfIC
 fprintf('IC %d ', round); % Show the progress...

 % Take a random initial vector of length 1 and orthogonalize it
 % with respect to the other vectors.
 % w = rand(vectorSize, 1) - .5;          % random seed initial value
 load golds.mat; w = (gold(round, :))';   % Gold code initial value
 w = w - B * B' * w;
 w = w / norm(w);

 wOld = zeros(size(w)); wOld2 = zeros(size(w));

 % This is the actual fixed-point iteration loop.
 i = 1; gabba = 1;
 while i <= maxNumIterations + gabba
  % Project the vector into the space orthogonal to the space
  % spanned by the earlier found basis vectors. Note that we can do
  % the projection with matrix B, since the zero entries do not
  % contribute to the projection.
  w = w - B * B' * w;
  w = w / norm(w);

  if i == maxNumIterations + 1
   fprintf('\nIC %d did not converge in %d iterations.\n', ...
           round, maxNumIterations);
   numFailures = numFailures + 1;
   if numFailures > failureLimit
    fprintf('Too many failures to converge (%d).\n', numFailures);
    return;
   end
   break;
  end

  % Test for termination condition. Note that the algorithm has
  % converged if the direction of w and wOld is the same, this
  % is why we test the two cases.
  if norm(w - wOld) < epsilon | norm(w + wOld) < epsilon
   % if abs(sum(sign(X' * w)) - sum(sign(X' * wOld))) < 10
   numFailures = 0;
   B(:, round) = w;                        % Save the vector
   A(:, round) = dewhiteningMatrix * w;    % Calculate the de-whitened vector.
   W(round, :) = w' * whiteningMatrix;     % Calculate ICA filter.
   fprintf('computed ( %d steps ) \n', i); % Show the progress...
   temp = X' * B;
   icaplot('dispsig', temp(:, 1:numOfIC)');
   drawnow;
   break; % IC ready - next...
  end

  wOld2 = wOld; wOld = w;

  w = (X * ((X' * w) .^ 3)) / numSamples - 3 * w;
  w = w / norm(w); % Normalize the new w.

  i = i + 1;
 end

 round = round + 1;
end
fprintf('Done.\n');

% In the end let's check the data for some security.
if ~isreal(A)
 fprintf('Warning: removing the imaginary part from the result.\n');
 A = real(A);
 W = real(W);
end

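Putting the pieces together, the deflationary fixed-point loop of yfpica can be sketched end-to-end in NumPy: whiten the mixture, then extract one unit at a time with the pow3 rule w ← E{z (wᵀz)³} − 3w, re-orthogonalizing each candidate against the rows already found. This is a hedged sketch under simplifying assumptions: random initial vectors (the thesis seeds with Gold codes), the pow3 nonlinearity only, and none of the fine-tuning or step-size-reduction logic of the full FastICA package.

```python
import numpy as np

# Illustrative deflationary FastICA (pow3), mirroring the structure of
# yfpica but not its exact code. x: (dim, n_samples) mixed observations.

def fastica_deflation(x, n_components, max_iter=1000, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    x = x - x.mean(axis=1, keepdims=True)          # remove the mean (remmean)
    d, e = np.linalg.eigh(np.cov(x, bias=True))
    z = (np.diag(d ** -0.5) @ e.T) @ x             # whitened data
    b = np.zeros((z.shape[0], n_components))       # found basis vectors
    for k in range(n_components):
        w = rng.standard_normal(z.shape[0])        # random initial vector
        w -= b @ (b.T @ w)                         # orthogonalize against B
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            # pow3 update: w <- E{z (w'z)^3} - 3w
            w_new = (z * (z.T @ w) ** 3).mean(axis=1) - 3 * w
            w_new -= b @ (b.T @ w_new)
            w_new /= np.linalg.norm(w_new)
            # converged if the direction of w and w_new is the same
            done = min(np.linalg.norm(w_new - w),
                       np.linalg.norm(w_new + w)) < eps
            w = w_new
            if done:
                break
        b[:, k] = w
    return b.T @ z                                 # estimated sources
```

As in the thesis experiments, the recovered components come back in arbitrary order and with arbitrary sign, which is why SUDICA.m matches ICA outputs against the SUD decision before combining them.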