UDK 621.391:004.93
IFAC IA 5.8.6
Preliminary communication
Extraction of relevant features is an essential stage in a pattern recognition and classification system. The goal of a feature extraction algorithm is to find a feature subset in which the information relevant for recognition is contained in a minimal number of features. The proposed algorithms are based on the assumption that features with better individual discrimination ability will also perform better in combination with other features. Features are first extracted from the initial set and then sorted according to their individual fitness. The sorted set is used to form the search tree. Two heuristic algorithms are proposed: the first performs a depth-first search bounded by a required increase of the fitness function, and the second is based on a genetic algorithm. Their performance is compared with complete search and sequential search (FSS, BSS) algorithms.
Key words: feature extraction, pattern recognition, signal analysis
The proposed algorithms are based on the assumption that a feature that is better individually is also likely to produce a better overall result in combination with other features. This is achieved by appropriate construction of the search tree and definition of the pruning criteria, and by adjusting the parameters of the genetic algorithm [12], respectively. They are compared with some of the well-known feature extraction algorithms [5]: complete search and the standard versions of forward and backward sequential selection (FSS and BSS) [1, 7, 8, 10], with respect to the quality of the extracted feature sets and efficiency.

2 AUTOMATIC GENERATION OF THE INITIAL FEATURE SET

The initial feature set may contain various data collected from the signal. It should be created by consistently applying an appropriate transformation to different parts of the signal.

In the case of spectral analysis, a feature is defined as the average energy in a frequency window of variable width and position, represented by the frequency interval [f0, f0 + ∆fi], where f0 takes values from the interval [fmin, fmax − ∆fi]. The window width ∆fi takes discrete values from the interval [fs/N, fmax], where fs is the sampling frequency and N is the number of samples, which is also the number of frequency channels in the spectrum. It is convenient to express the parameters fmin, fmax and ∆f in channel numbers, where the channel width is defined as fs/N.

If the relevant part of the spectrum contains 1024 channels, and the above parameters are defined as

fmin = 1 ⋅ fs/N, fmax = 1024 ⋅ fs/N, ∆fi ∈ {8, 12, ..., 80} ⋅ fs/N,

the initial set will contain 19 ⋅ 1024 − 4 ⋅ (2 + 3 + … + 20) = 18 620 features (there are 19 window widths, and a window of width ∆fi can start at 1024 − ∆fi positions). The size of the corresponding feature space is about 10^5600, which is unmanageable for any feature extraction algorithm. It is therefore necessary to reduce the number of initial features to a manageable size.

3 REDUCTION OF THE INITIAL FEATURE SET

The initial feature set is highly redundant, because many features contain information from the same channels. In the above example the minimal width of the frequency window is eight channels, which implies that the reduced feature set may contain at most 128 non-overlapping features covering the frequency range of 1024 channels. The initial feature set is reduced by selecting the individually best features. The fitness function is calculated for each feature individually, and the best f features are chosen. The fitness function used for feature evaluation is the ratio D_0/D_1 between the average Euclidean distance between instances from different classes and the average distance between instances belonging to the same class (1) [4, 9]:

D_1 = \frac{\sum_{i=1}^{C} \sum_{j=1}^{n_i-1} \sum_{k=j+1}^{n_i} d(x_{ij}, x_{ik})}{\sum_{i=1}^{C} \binom{n_i}{2}}, \qquad
D_0 = \frac{\sum_{i=1}^{C-1} \sum_{j=i+1}^{C} \sum_{k=1}^{n_i} \sum_{l=1}^{n_j} d(x_{ik}, x_{jl})}{\binom{n}{2} - \sum_{i=1}^{C} \binom{n_i}{2}},    (1)

where:
C – number of classes
n – total number of instances
n_i – number of instances in the i-th class
x_ij – j-th instance from the i-th class
d(x_ij, x_kl) – Euclidean distance between x_ij and x_kl

Because this algorithm does not take interactions between features into consideration, it usually gives poor results when applied alone. Pseudo code of the algorithm for extracting the individually best features from the initial feature set is shown in Figure 2.

    while InitialSet is not empty
        sort InitialSet according to Fitness
        extract best feature from InitialSet
        remove overlapping features from InitialSet
    end

Fig. 2 Algorithm for extracting individually best features

The array InitialSet contains the value of the fitness function for each feature from the initial set, as well as the starting and ending frequency of the frequency range covered by the particular feature. In each iteration the initial set is sorted according to fitness and the current best feature is extracted. All features whose frequency range overlaps with the range of the extracted feature (i.e. which are partially redundant) are removed from the set. The described procedure is repeated until all features are either extracted or removed from the initial set. The reduced set covers the entire frequency range, and the frequency ranges of its features do not overlap. Executable sketches of the fitness computation (1) and of this reduction procedure are given at the end of this section.

The first few features will usually have a significantly higher individual discrimination ability. Therefore, it is justified to organize the search in a way that favors them. The feature extraction algorithm should, however, also take into consideration other features, which may contain significant information that should be included in the extracted feature subset.
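As an illustration, the fitness ratio (1) can be computed directly from its definition. The following minimal Python sketch assumes instances are plain numeric vectors grouped by class; the names fitness and class_instances are illustrative and do not come from the paper.

    import itertools
    import math

    def fitness(class_instances):
        # class_instances[i] holds the instances x_i1 ... x_in_i of class i
        # D1: average Euclidean distance within the same class
        within = [math.dist(x, y)
                  for cls in class_instances
                  for x, y in itertools.combinations(cls, 2)]
        # D0: average Euclidean distance between different classes
        between = [math.dist(x, y)
                   for ci, cj in itertools.combinations(class_instances, 2)
                   for x in ci for y in cj]
        # the list lengths equal the binomial-coefficient denominators in (1)
        return (sum(between) / len(between)) / (sum(within) / len(within))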
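The reduction procedure of Figure 2 can be sketched in the same way, assuming each entry of InitialSet is a (fitness, f_start, f_end) triple as described above; the helper name reduce_initial_set is illustrative.

    def reduce_initial_set(initial_set):
        # initial_set: list of (fitness, f_start, f_end) triples
        extracted = []
        remaining = sorted(initial_set, reverse=True)   # best fitness first
        while remaining:
            best = remaining.pop(0)       # extract the current best feature
            extracted.append(best)
            # drop features whose frequency range overlaps the extracted one
            remaining = [f for f in remaining
                         if f[2] <= best[1] or f[1] >= best[2]]
        return extracted

Sorting once is enough here, because removing elements preserves the order, so the head of remaining is always the current best feature.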
f′ = f / 1.1^{N_f},    (2)

where f is the value of the original fitness function and N_f is the number of features in the particular feature set. This modification emphasizes feature sets containing a smaller number of features. Currently there is no method to automatically adjust the fitness function, so the denominator in equation (2) was chosen experimentally. The chosen value favors feature subsets containing between three and eight features.

Fig. 3 PDF for determining the crossover point
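A quick numeric check of (2), with an illustrative helper name, shows how mild the penalty is in the intended range:

    def scaled_fitness(f, n_features):
        return f / 1.1 ** n_features      # equation (2)

    # 1.1**3 ≈ 1.33, 1.1**8 ≈ 2.14, 1.1**20 ≈ 6.73: a 20-feature subset
    # must have about five times the raw fitness of a 3-feature subset
    # to be ranked higher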
5 HEURISTIC PRUNING OF THE SEARCH TREE

A feature space of some 10^2 features is still too large to be searched exhaustively: evaluating all possible combinations of e.g. 100 features would require about 10^30 iterations. Knowledge about the feature space can guide the search and significantly reduce the size of the search space. The proposed algorithm uses the information about the quality of individual features attained during the reduction of the initial set.

The algorithm performs a depth-first search, bounded by a required increase of the fitness function. Features are sorted according to their fitness, directing the search to evaluate combinations of individually better features earlier. An example of the search tree for a set of five features is presented in Figure 4.

On the other hand, examining all nodes at a certain level would significantly increase the number of iterations. Testing the algorithm on real data shows that even with such a restrictive policy, the algorithm produces comparable results in a fraction of the time required by other algorithms.

Pseudo code of the algorithm is shown in Figure 6.

    // Return from recursion if the bottom of the
    // tree is eventually reached
    if Level > MaxDepth
        return
    end

    if Level == 1
        StartingNode = 0
    else
        StartingNode = FeatureSet(Level-1)
    end
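The Figure 6 excerpt shows only the recursion entry. The following Python sketch completes it under explicit assumptions about the parts not reproduced here: MIN_GAIN stands for the required increase of the fitness function and fitness_of for the subset evaluation; both names are hypothetical.

    MIN_GAIN = 0.05     # assumed value of the required fitness increase

    def search(fitness_of, n_features, subset, prev_fit,
               level, max_depth, results):
        # return from recursion if the bottom of the tree is reached
        if level > max_depth:
            return
        # children extend the subset with higher-indexed features only,
        # mirroring StartingNode = FeatureSet(Level-1) in Figure 6
        start = subset[-1] + 1 if subset else 0
        for i in range(start, n_features):
            candidate = subset + [i]
            fit = fitness_of(candidate)
            if fit - prev_fit < MIN_GAIN:   # prune: insufficient increase
                continue
            results.append((fit, candidate))
            search(fitness_of, n_features, candidate, fit,
                   level + 1, max_depth, results)

Since the features are indexed in order of decreasing individual fitness, the depth-first order automatically evaluates combinations of individually better features first.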
…from the same class. The fitness function is defined as the minimal distance between neighboring classes. In this way, feature subsets resulting in a balanced distribution of classes will prevail over subsets that separate one class well while leaving the others close together. It can be proved that a function defined in this way satisfies the monotonicity property: adding a new feature to any subset leaves the fitness value unchanged or increases it.

Due to the monotonicity property, the »best« subset will be the one containing all features, because it will have the highest fitness. From the classifier's point of view, however, it is better to work with a subset containing a smaller number of features and having sufficient discrimination ability. For this reason, the described algorithms classify extracted feature subsets according to the number of features in a set, which makes it possible to determine the best subset containing an appropriate number of features, as sketched below.
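A minimal sketch of that per-size bookkeeping (the names are illustrative): because of the monotonicity property, each evaluated subset is compared only against subsets with the same number of features.

    best_by_size = {}   # number of features -> (fitness, subset)

    def record(subset, fit):
        k = len(subset)
        if k not in best_by_size or fit > best_by_size[k][0]:
            best_by_size[k] = (fit, subset)

After recording every evaluated subset, best_by_size[k] holds the best subset of exactly k features, from which a subset with an appropriate size/fitness trade-off can be picked.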
6 EXPERIMENTAL RESULTS

Forty measurements were performed for each of the samples of four different materials, making a total of 160 measurements. Half of the measurements were used for training, and the other half for testing the extracted feature sets.
Heuristic Methods of Feature Extraction in Signal Processing. The extraction of relevant features is a key step in a pattern recognition and classification system. The goal of the feature extraction procedure is to find the smallest feature set that contains the information needed for pattern recognition. The proposed procedure is based on the assumption that features which individually discriminate better between patterns from different classes will retain this property in combination with other features. After extraction from the initial set, the features are sorted by decreasing value of the fitness function. A search tree is formed from the sorted feature set, so that sets containing individually better features are searched earlier. Two feature extraction procedures are proposed: the first performs a depth-first search of the tree, bounded by a given increase of the fitness function value, and the second is based on a genetic algorithm. With respect to the quality of the extracted feature sets and efficiency, the procedures are compared with complete search and with the sequential procedures (FSS, BSS).
AUTHORS’ ADDRESSES:
Dr. sc. Davor Antonić, dipl. ing.,
HR-10000 Zagreb, Klaićeva 21, CROATIA
Prof. Dr. sc. Mario Žagar, dipl. ing.
Faculty of Electrical Engineering and Computing,
HR-10000 Zagreb, Unska 3, CROATIA
Received: 2002−10−05