By
ASHWANI SINGH
117BM0731
ACKNOWLEDGEMENT
Table of Contents
1 Introduction
6 Discussion
INTRODUCTION
The wavelet decomposition technique was used to filter the signal into different sub-bands, each occupying a unique frequency range. The sampling frequency used while acquiring the signal was 256 Hz, so by the Nyquist criterion the maximum frequency of the EEG signal is 128 Hz. A 7-level decomposition was therefore used, with a Daubechies 4th-order (db4) filter. The following features were used for classification:
This makes a total of 16 features for each EEG epoch, giving a 200 x 16 feature matrix. The MATLAB code to extract sample entropy is as follows.
Various classification algorithms were also explored, such as least-squares support vector machines, extreme learning machines, discriminant analysis, resilient back-propagation neural networks and the random forest algorithm. The classification accuracy using the various features was studied and compared. ROC curves were also plotted to study the sensitivity of the various features. Other miscellaneous analysis tools such as box plot diagrams, the frost algorithm, scalograms and time-frequency plots were also used to study the signals.
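Since the sampling frequency is 256 Hz, each decomposition level halves the remaining band. A small Python sketch (illustrative only; `subband_ranges` is a hypothetical helper, not part of the project's MATLAB code) makes the sub-band frequency ranges explicit:

```python
# Hypothetical helper (not from the report): frequency ranges of the detail
# sub-bands d1..dn and the final approximation an for an n-level dyadic
# decomposition at sampling rate fs.
def subband_ranges(fs, levels):
    nyquist = fs / 2.0          # highest frequency representable in the signal
    bands = {}
    hi = nyquist
    for level in range(1, levels + 1):
        lo = hi / 2.0
        bands["d%d" % level] = (lo, hi)   # detail band at this level
        hi = lo
    bands["a%d" % levels] = (0.0, hi)     # final approximation band
    return bands

# 7-level decomposition at 256 Hz, as used in this project
ranges = subband_ranges(256, 7)
```

Under this layout, d3 covers 16-32 Hz and the level-7 approximation covers 0-1 Hz.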
EXTRACTED FEATURES
Step 2: The distance between each template and the other templates, denoted d[Xm(i), Xm(j)], is computed as the maximum absolute difference between their scalar components.
Step 3: For a given template Xm(i), count the number of template matches, denoted Φi, i.e. the number of j (1 ≤ j ≤ N−m+1) satisfying d[Xm(i), Xm(j)] ≤ r. Then Φi^m(r), the probability that any template Xm(j) matches Xm(i), is
Φi^m(r) = Φi / (N−m+1)
Step 4: Average the logarithms of these probabilities over all templates:
Φ^m(r) = (1/(N−m+1)) Σi ln Φi^m(r)
Step 5: Increase the dimension to m+1 and repeat steps 1-4 to compute Φi^{m+1}(r) and Φ^{m+1}(r). The approximate entropy is then ApEn(m, r) = Φ^m(r) − Φ^{m+1}(r).
The code for computing approximate entropy for all the data is given in Appendix I. This was done similarly for all sub-bands, and the data was stored in Excel files.
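For intuition, Steps 1-5 can be sketched in a few lines of Python (an illustrative pure-Python re-implementation under the same m, r and Chebyshev-distance definitions; the extraction actually used in this project is the MATLAB code in Appendix I):

```python
import math

def approx_entropy(m, r, signal):
    # Approximate entropy following Steps 1-5 above.
    n = len(signal)

    def phi(dim):
        # Step 1: templates of length dim
        templates = [signal[i:i + dim] for i in range(n - dim + 1)]
        count = len(templates)
        total = 0.0
        for t1 in templates:
            # Step 2: Chebyshev distance; Step 3: count matches within r
            matches = sum(
                1 for t2 in templates
                if max(abs(a - b) for a, b in zip(t1, t2)) <= r
            )
            total += math.log(matches / count)
        # Step 4: average of the log-probabilities
        return total / count

    # Step 5: ApEn(m, r) = phi(m) - phi(m + 1)
    return phi(m) - phi(m + 1)
```

For a constant signal every template matches every other, so the two averages coincide and ApEn is zero; irregular signals give larger values.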
The approximate entropies were extracted and scatter plots were made for the same. The scatter plots for the different sub-bands are shown below.
Approximate entropy of both signals for the d4 sub-band (blue: ictal, red: inter-ictal)
Approximate entropy of both signals (blue: ictal, red: inter-ictal)
The code for computing the same is given in Appendix II. This feature was extracted for all the data and stored in Excel files.
The code for computing the same is given in Appendix III. This feature was extracted for all the data and stored in Excel files.
%DET refers to the ratio of recurrence points forming diagonal structures to all recurrence points. Diagonal lines represent epochs with similar time evolution of states; therefore, %DET is related to the determinism of the system. %DET is calculated as
%DET = Σ(l ≥ Lmin) l·P(l) / Σ(l ≥ 1) l·P(l)
where P(l) is the frequency distribution of the lengths of the diagonal structures in the RP, and Lmin is the threshold that excludes the diagonal lines formed by the tangential motion of a phase space trajectory; in this research Lmin = 2.
d) LEN refers to the average diagonal line length of the recurrence plot. The code for the same is in Appendix III.
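As a rough illustration of these quantities, the recurrence matrix and %DET with Lmin = 2 can be sketched in pure Python (scalar series, no embedding; this is not the RPplot/Recu_RQA code of Appendix III):

```python
def recurrence_plot(x, eps):
    # Binary recurrence matrix: R[i][j] = 1 if |x_i - x_j| <= eps.
    n = len(x)
    return [[1 if abs(x[i] - x[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]

def det_percentage(rp, lmin=2):
    # Fraction of recurrence points lying on diagonal lines of length >= lmin.
    n = len(rp)
    total = sum(sum(row) for row in rp)
    if total == 0:
        return 0.0
    on_lines = 0
    for k in range(-(n - 1), n):                 # every diagonal offset
        diag = [rp[i][i + k] for i in range(max(0, -k), min(n, n - k))]
        run = 0
        for v in diag + [0]:                     # sentinel flushes the last run
            if v:
                run += 1
            else:
                if run >= lmin:
                    on_lines += run
                run = 0
    return on_lines / total
```

Highly deterministic signals place most recurrence points on long diagonals, pushing %DET toward 1.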
RECURRENCE PLOTS:
4) Sub-Band Energy: For comparison with the three non-linear methods introduced above, the energy in each of the sub-band signals {d3, d4, d5, d6, d7} is computed. The sub-band signals themselves are not used directly as entries of the feature vector, since such a direct representation of the EEG waveform is too sensitive to noise and slight variations in morphology. Instead, the energy in each sub-band signal is used. An explicit representation of the features computed for each sub-band is:
The code for computing the sub-band energies for all the data is given in Appendix IV. This was done for all sub-bands and the data was stored in Excel files.
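The project computes these energies with MATLAB's wavedec/wenergy and db4. Purely for illustration, the same energy bookkeeping can be sketched in Python with the simplest orthonormal wavelet (Haar), whose transform preserves total energy across the sub-bands:

```python
import math

def haar_step(x):
    # One level of the orthonormal Haar transform: (approximation, detail).
    s = math.sqrt(2.0)
    approx = [(x[2 * i] + x[2 * i + 1]) / s for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / s for i in range(len(x) // 2)]
    return approx, detail

def subband_energies(x, levels):
    # Energy of each detail sub-band plus the final approximation.
    energies = {}
    approx = list(x)
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        energies["d%d" % level] = sum(v * v for v in detail)
    energies["a%d" % levels] = sum(v * v for v in approx)
    return energies
```

Because the Haar transform here is orthonormal, the sub-band energies sum to the total signal energy, which is a handy sanity check on any energy-feature extraction.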
CLASSIFICATION TECHNIQUES AND CORRESPONDING MATLAB CODES
The extracted features were all stored in an Excel file of size 200 x 16, the first 100 rows being ictal (seizure) signal features and rows 101 to 200 being inter-ictal signal features. This data was then fed as training data to the classifiers. Various types of classifiers and their accuracies were studied in this report.
1) An LS-SVM (least-squares support vector machine) was used for classification of this data. First a target vector was created for binary classification of the data: seizure signals were given a target of 0 and inter-ictal signals a target of 1. The code for classification using the LS-SVM in MATLAB is presented in Appendix 4.
The obtained accuracy for the given training data was found to be 99.5%.
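The LS-SVM toolbox itself is not reproduced here, but the least-squares idea behind the 0/1 target vector can be shown with a minimal Python sketch: a plain least-squares linear classifier fitted via the normal equations on hypothetical toy data (not the project's actual feature matrix):

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_least_squares(X, t):
    # Solve the normal equations (X^T X) w = X^T t.
    p = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    b = [sum(row[a] * ti for row, ti in zip(X, t)) for a in range(p)]
    return solve(A, b)

# Hypothetical toy data: class 0 clustered near (0, 0), class 1 near (3, 3);
# the first column is a bias term.
X = [[1.0, x1, x2] for x1, x2 in
     [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (3.1, 2.9), (2.8, 3.2), (3.0, 3.1)]]
t = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]

w = fit_least_squares(X, t)
pred = [1 if sum(wi * xi for wi, xi in zip(w, row)) > 0.5 else 0 for row in X]
accuracy = sum(int(p == int(y)) for p, y in zip(pred, t)) / len(t)
```

Thresholding the fitted output at 0.5 mirrors how the 0/1 targets are turned back into class labels.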
2) Artificial neural networks with resilient back-propagation:
ACCURACY : 98.64 %
The accuracy obtained using the random forest classifier was 99%. This usually depends on the number of trees used. The code for the same is in Appendix 7.
In theory, an ELM can approximate any continuous target function and classify any disjoint regions. Compared to the ELM, SVM, LS-SVM and PSVM achieve suboptimal solutions and require higher computational complexity (cf. the details on the reasons why SVM/LS-SVM provide suboptimal solutions). The accuracy obtained using the ELM was 93.5% and the code is in Appendix 8.
The t-test is used to check whether two data sets come from normal distributions with the same mean. If h = 1, the null hypothesis of equal means is rejected, i.e. the two data sets likely come from different distributions; if h = 0, the null hypothesis is accepted. Since the features should separate the seizure and inter-ictal classes, ideally all features should show h = 1. The code for the same is in Appendix 9.
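For intuition, the statistic behind ttest2 (the pooled, equal-variance two-sample t) can be sketched in Python; this shows the computation of t only, not the h/p decision output that MATLAB returns:

```python
import math
from statistics import mean, variance

def t_statistic(x, y):
    # Pooled two-sample t statistic (the equal-variance form used by ttest2
    # by default): difference of means scaled by the pooled standard error.
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
```

Identical samples give t = 0, so the null hypothesis would not be rejected (h = 0); well-separated samples give a large |t| and hence h = 1.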
MISCELLANEOUS ANALYSIS TOOLS:
SPECTROGRAM PLOTS FOR WAVELET SUB-BANDS (APP. 11)
How sensitive are the features to classification?
ROC plots of the different features after learning are shown below. ROC analysis provides tools to select possibly optimal models and to discard suboptimal ones independently of (and prior to specifying) the cost context or the class distribution. ROC analysis is related in a direct and natural way to the cost/benefit analysis of diagnostic decision making.
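The area under such an ROC curve can also be computed directly with the rank formulation (a Python sketch; the scores and labels here are hypothetical, not the project's classifier outputs):

```python
def roc_auc(scores, labels):
    # AUC via the rank formulation: the probability that a randomly chosen
    # positive receives a higher score than a randomly chosen negative,
    # counting ties as half a win.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfectly separating feature gives AUC = 1, while an uninformative one hovers around 0.5.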
ROC Plot of Approximate Entropy
ROC Plot of Sample Entropy
To further compare the two signals statistically, through the mean, median, quartiles, etc., the box plots of the two signals for the different features were obtained.
Box plots for the d3 sub-band energy of the seizure (left) and inter-ictal (right) signals are shown, clearly indicating that the seizure signal has more energy than a non-seizure signal.
Discussion
In the course of this project the wavelet-based signal processing technique was studied and used for feature extraction. Features such as sample entropy, approximate entropy, recurrence quantifiers and wavelet energy of the main signal as well as of the decomposed signals were extracted. These features were then used to classify the EEG signals.
APPENDIX
% Approximate entropy of a raw EEG record (m = 2, r = 0.5)
a = textread('S001.txt','%f');
approx_entropy(2,0.5,a)
% Approximate entropy of the d3 wavelet sub-band (repeated for each record)
anNw = zeros(1,1); i = 1;
a = textread('N001.txt','%f');
[C,L] = wavedec(a,7,'db4');          % 7-level db4 decomposition
D3 = wrcoef('d',C,L,'db4',3);        % reconstruct the d3 detail signal
anNw(i) = approx_entropy(2,0.5,D3); disp(anNw(i)); i = i+1;
% Sample entropy (m = 2, r = 0.5), repeated for each record
amp = zeros(1,1);
i = 1;
a = textread('N001.txt','%f');
amp(i) = SampEn(2,0.5,a,1); i = i+1;
% Recurrence quantification analysis, repeated for each record
rec = zeros(1,1);
det = zeros(1,1);
entr = zeros(1,1);
lent = zeros(1,1);
i=1; j=1; k=1; y=1;
a = textread('N001.txt','%f');
Q = transpose(a);
[RP,DD] = RPplot(Q,3,1,.5,0);        % recurrence plot of the record
[RR,DET,ENTR,L] = Recu_RQA(RP,0)
rec(i)=RR; i=i+1;                    % store %REC
det(j)=DET; j=j+1;
entr(k)=ENTR; k=k+1;
lent(y)=L; y=y+1;
% Wavelet sub-band energies, repeated for each record
E3 = zeros(1,1); E4 = zeros(1,1); E5 = zeros(1,1);
E6 = zeros(1,1); E7 = zeros(1,1);
i=1; j=1; k=1; l=1; m=1;
a = textread('S001.txt','%f');
[C,L] = wavedec(a,7,'db4');
[Ea,Ed] = wenergy(C,L);              % Ea: approximation energy, Ed: detail energies
E3(i)=Ed(3); i=i+1;
E4(j)=Ed(4); j=j+1;
E5(k)=Ed(5); k=k+1;
E6(l)=Ed(6); l=l+1;
E7(m)=Ed(7); m=m+1;
% LS-SVM classification: 0 = ictal (rows 1-100), 1 = inter-ictal (rows 101-200)
X=xlsread('Final.xls');
X=transpose(X);
T=zeros(200,1);
for i=101:200
T(i)=1;
end
T=transpose(T);
SVMStruct = svmtrain(X,T,'method','LS')
plotroc(T,X)
%Define input A
Group = svmclassify(SVMStruct,A)
plotroc(T,X)
% Resilient back-propagation neural network (trainrp)
X=xlsread('Final.xls');
X=transpose(X);
T=zeros(200,1);
for i=101:200
T(i)=1;
end
T=transpose(T);
net = feedforwardnet(10,'trainrp')
net = train(net,X,T);
y=net(X);
plotroc(T,y)
% Discriminant analysis
X=xlsread('Final.xls');
T=zeros(200,1);
for i=101:200
T(i)=1;
end
class = classify(Y,X,T);             % Y: sample(s) to classify
ACCURACY : 99%
X=xlsread('Final.xls');
T=zeros(200,1);
for i=101:200
T(i)=1;
end
count=0;
A = X;                               % classify each stored feature vector
for i=1:100
Sample=A(i, :) ;
class = classify(Sample,X,T);
if(class==0) count=count+1;
end
end
for i=101:200
Sample=A(i, :) ;
class = classify(Sample,X,T);
if(class==1) count=count+1;
end
end
Accuracy= count/200*100
X=xlsread('Final.xls');
X=transpose(X);
T=zeros(200,1);
for i=101:200
T(i)=1;
end
X=transpose(X);
% Random forest via bagged decision trees
B = TreeBagger(2,X,T,'NVarToSample',1)
Class = predict(B,P)                 % P: sample(s) to classify
T=zeros(200,1);
for i=101:200
T(i)=1;
end
B = TreeBagger(2,X,T,'NVarToSample',1)
count=0;
A=xlsread('Final.xls');
for i=1:100
Sample=A(i, :) ;
class = predict(B,Sample);
if(strcmp(class,'0'))count=count+1;
end
end
for i=101:200
Sample=A(i, :) ;
class = predict(B,Sample);
if(strcmp(class,'1')) count=count+1;
end
end
Accuracy= count/200*100
ACCURACY : 93.5 %
% Extreme learning machine: elm_train(training file, classification flag, hidden neurons, activation)
[TrainingTime,TrainingAccuracy] = elm_train('Final.txt', 1, 20,'sig')
X=xlsread('Final.xls');
%X=transpose(X);
T=zeros(200,1);
for i=101:200
T(i)=1;
end
%T=transpose(T);
plotroc(T,X)
9) T-TESTS:
% X: seizure feature vector, Y: inter-ictal feature vector for one feature;
% the block below is repeated for each extracted feature.
X=X';
Y=Y';
[h,p] = ttest2(X,Y)
% Sub-band energy t-tests. Note: X and Y below load the same file; Y should
% load the corresponding inter-ictal energy file for a meaningful comparison.
X=xlsread('S E3.xls');
X=X';
Y=xlsread('S E3.xls');
Y=Y';
[h,p] = ttest2(X,Y)
X=xlsread('S E4.xls');
X=X';
Y=xlsread('S E4.xls');
Y=Y';
[h,p] = ttest2(X,Y)
X=xlsread('S E5.xls');
X=X';
Y=xlsread('S E5.xls');
Y=Y';
[h,p] = ttest2(X,Y)
X=xlsread('S E6.xls');
X=X';
Y=xlsread('S E6.xls');
Y=Y';
[h,p] = ttest2(X,Y)
10) SCALOGRAM:
COEFS = cwt(a,1:1,'db4');            % continuous wavelet transform coefficients
SC = wscalogram('image',COEFS);
% Spectrograms of the wavelet sub-bands
[C,L] = wavedec(a,3,'db4');
D1 = wrcoef('d',C,L,'db4',1);
D2 = wrcoef('d',C,L,'db4',2);
D3 = wrcoef('d',C,L,'db4',3);
figure(1)
spectrogram(D1);
figure(2)
spectrogram(D2);
figure(3)
spectrogram(D3);