1 Mathematical Institute of Slovak Academy of Sciences, Ďumbierska 1, Banská Bystrica

2 University of Žilina, Univerzitná 8215, Žilina, Slovakia

arXiv:1311.0819v1 [cs.SD] 4 Nov 2013

We consider the ability of a very simple feed-forward neural network to discriminate phonemes based on just the relative power spectrum. The network consists of two neurons with symmetric nonlinear response over a spectral range. The output of the neurons is subsequently fed to a comparator. We show that often this is enough to achieve complete separation of data. We compare the performance of the discriminants found with that of more general neurons. Our conclusion is that not much is gained in passing to real-valued weights. More likely, a higher number of neurons and preprocessing of input will yield better discrimination results. The networks considered are directly amenable to hardware (neuromorphic) designs. Other advantages include interpretability, guarantees of performance on unseen data and low Kolmogorov complexity.

Index Terms: phoneme discrimination, feed-forward neural network, neuromorphic hardware, TIMIT, memristor

Research partially supported by grants APVV-0219-12 and VEGA 2/0112/11. Usage of computational facilities of University of Žilina and University of Matej Bel is gratefully acknowledged.

1. INTRODUCTION

Artificial neural networks have emerged as one of the most powerful tools in speech recognition [1], and more generally in machine learning [2, Chapter 5]. If there is a downside to employing neural networks, it is their opaqueness. Much like the human brain which inspired them, it is often not clear why they work so well. It is however possible, and even advisable [3, page 148], to address this opaqueness by adopting design principles that will provide guarantees about their performance in situations that did not occur during training.

In our work we introduce a new class of neural networks that may be used for discrimination between phonemes. They arose in connection with the investigation of the computational power of memristor-based networks. As such, they use predominantly min and max processing primitives. There are other approaches that use the same processing elements [4], [5], [6], [7], [8], [9] as well as a vast body of research on more general fuzzy logic systems. The difference in our work is that we [...] network.

First, we demand that the output of the neural network is balanced, i.e. invariant with respect to loudness. We achieve this requirement by restricting our attention to neural networks carrying out the computation

$$ f(s) = f_1(s) - f_2(s), \tag{1} $$

where $s$ represents sound and $f_1$, $f_2$ are nonlinear functions that grow additively with increase in loudness, i.e.

$$ f_1(s) - f_1(s') = f_2(s) - f_2(s') = d, \tag{2} $$

if sounds $s$ and $s'$ differ purely by $d$ decibels in loudness.

Secondly, we demand a symmetric response of neurons $f_1$, $f_2$ over a spectral range. Write $s = (s_1, \dots, s_n)$ for the discrete log-periodogram of a sound, i.e. $s_i$ represents the log of power at frequency $(i-1)\,f_S/(2n)$. A spectral range $r$ is any subsequence $(s_i, s_{i+1}, \dots, s_j)$ with $i \le j$. Symmetric response over a spectral range $r$ of a neuron represented by a $k$-ary function $f$ means that

$$ f(r) = f(\pi(r)) \tag{3} $$

for any permutation $\pi$ of $k$ elements. This requirement is a strong form of requiring that the response be invariant to small shifts of formant frequencies.

Fig. 1. Sketch of power spectra of three signals with a similar formant frequency.

Consider a family of signals, the spectra of three of which are sketched in Figure 1. Suppose one wants to construct a function with strong response for signals in this family with formants varying between frequencies $\omega_1$ and $\omega_2$. If one restricts oneself to linear forms, there is essentially a

single expression that has uniformly strong response for all signals in the family, namely

$$ s_i + s_{i+1} + \dots + s_j. \tag{4} $$

This is not true if one tries expressions from nonlinear algebras, even as simple as the algebra generated by binary min and max functions. For instance, consider the functions

$$ \max(s_i, s_{i+1}, \dots, s_j) \tag{5} $$

$$ \max\nolimits_2(s_i, s_{i+1}, \dots, s_j) \tag{6} $$

where $\max_2$ denotes the second largest element of the set. Both of these functions have strong and symmetric response uniformly over all signals in the family.

In our work we shall be concerned with induced B-classifiers that determine the class of a phoneme based on comparing $f(s)$ with a threshold $\theta$, and a special subclass of Z-classifiers, for which $\theta = 0$.

The vast majority of machine learning techniques, such as support vector machines, neural networks, or various regressions, have a parameter space that forms a Riemannian manifold. Consequently, with these techniques one may use gradient-based optimization mechanisms and sometimes even convex optimization. The situation with our class of neural networks is different. We need to find an optimal structure in a discrete parameter space. There are two hurdles that need to be addressed.

First, in the absence of a clever trick, one needs to search the discrete space in a reasonable amount of time. One may opt for a local search whereby an initial network is optimized by small twists, or perhaps by a genetic algorithm. The disadvantage of the former is that the search may end in a suboptimal local minimum, whereas the latter may take a long time to find a good network. In our work we opt for an exhaustive search of a polynomially growing family; namely, we consider only functions of the form

$$ f(s) = q_1(r_1) - q_2(r_2), \tag{7} $$

where $q_1$, $q_2$ are quantiles (e.g. $\max$, $\max_2$), and $r_1$, $r_2$ are spectral ranges. If one considers $r_i$ of length at most $L$ and the whole spectrum has $N$ power points, then there are $O(N^2 L^4)$ such functions and an exhaustive search is feasible.

Secondly, one needs to define a goodness criterion. Typically this is classification success. However, in the case when multiple discrimination functions achieve perfect separation on training data, a more refined criterion is needed. In this context one needs to distinguish between training Z-classifiers and B-classifiers. For B-classifiers, the usual Fisher discriminant may be used, which is defined as

$$ F_B(f) := \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}, \tag{8} $$

where $\mu_i$, $\sigma_i^2$ are the means and variances of evaluations of the discriminant function $f$ over the two classes. For Z-classifiers we have maximized the following secondary tie-breaking criterion

$$ F_Z(f) := \min\left( \frac{\mu_1^2}{\sigma_1^2}, \frac{\mu_2^2}{\sigma_2^2} \right), \quad \text{if } \operatorname{sign}(\mu_1) \ne \operatorname{sign}(\mu_2). \tag{9} $$

For our data set we used discrete spectra created by A. Buja, W. Stuetzle and M. Maechler and used in work [10]. It is freely available in the ElemStatLearn package of the R statistics software as well as online [11]. The data set contains spectra of five English phonemes computed from the TIMIT database.

We have used custom-built C++ software for results obtained in the next two sections, R for verification, graphing and Nelder-Mead optimization in section 4, and Matlab for computing data in Figure 4.

Figure 2 presents the results of classification by Z-classifiers. From the graphs it is clear that in the majority of cases, the discriminant functions we considered are able to completely separate the two classes. Moreover, separation occurs with relatively short spectral ranges, with size at most 12. There are just two cases where separation does not occur, namely discrimination of the pairs aa-ao and dcl-iy.

It is interesting to note that passing from Z-classifiers to more general B-classifiers does not improve results very much, as can be seen in Table 1.

| phonemes | Change on train data | Change on test data |
|----------|----------------------|---------------------|
| aa-ao    | 0.78 %               | 1.82 %              |
| aa-dcl   | 0.09 %               | 0 %                 |
| aa-iy    | 0 %                  | 0 %                 |
| aa-sh    | 0 %                  | 0 %                 |
| ao-dcl   | 0.08 %               | 0 %                 |
| ao-iy    | 0 %                  | -0.17 %             |
| ao-sh    | 0 %                  | 0 %                 |
| dcl-iy   | 2.48 %               | 0.99 %              |
| dcl-sh   | 0 %                  | -0.48 %             |
| iy-sh    | 0.07 %               | 0.19 %              |

Table 1. Improvement (positive) or worsening (negative) of performance of B-classifiers compared to Z-classifiers

Unlike many other classes of neural networks, the structure (and not only the response) of our networks can be clearly visualized, as seen in Figure 3. In the figure, the horizontal scale indicates the spectral ranges to which the two neurons are sensitive. The vertical scale indicates the maximum length of spectral ranges $r_1$, $r_2$.
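To illustrate functions of the form (7) together with the criteria (8) and (9), the following Python sketch evaluates a quantile discriminant on a made-up log-spectrum and checks the balance and symmetry requirements numerically. The function names, the 0-based half-open index pairs and the choice of population variance are assumptions made for this illustration; they are not taken from the custom C++ software mentioned above.

```python
from statistics import mean, pvariance


def quantile(values, k):
    """Return the k-th largest element: k=1 is max, k=2 is max2."""
    return sorted(values, reverse=True)[k - 1]


def discriminant(s, r1, k1, r2, k2):
    """Evaluate f(s) = q1(r1) - q2(r2) as in (7).

    s is a log-periodogram (list of floats); r1 and r2 are hypothetical
    (start, stop) index pairs, 0-based with stop exclusive.
    """
    q1 = quantile(s[r1[0]:r1[1]], k1)
    q2 = quantile(s[r2[0]:r2[1]], k2)
    return q1 - q2


def fisher_b(f_class1, f_class2):
    """Criterion (8), computed from the values of f over the two classes."""
    m1, m2 = mean(f_class1), mean(f_class2)
    # Assumption: population variance; the source does not say which one is used.
    v1, v2 = pvariance(f_class1), pvariance(f_class2)
    return (m1 - m2) ** 2 / (v1 + v2)


def fisher_z(f_class1, f_class2):
    """Criterion (9); returns None when the class means do not differ in sign."""
    m1, m2 = mean(f_class1), mean(f_class2)
    if m1 * m2 >= 0:
        return None
    return min(m1 ** 2 / pvariance(f_class1), m2 ** 2 / pvariance(f_class2))


# Toy spectrum with six made-up log-power values.
s = [3.0, 1.0, 4.0, 1.5, 5.0, 2.0]
f = discriminant(s, (0, 3), 1, (3, 6), 2)   # max of first half minus max2 of second half
louder = [x + 7.5 for x in s]               # the same sound, 7.5 dB louder
shuffled = [4.0, 3.0, 1.0, 2.0, 1.5, 5.0]   # values permuted within each range
assert discriminant(louder, (0, 3), 1, (3, 6), 2) == f    # balanced, cf. (1) and (2)
assert discriminant(shuffled, (0, 3), 1, (3, 6), 2) == f  # symmetric, cf. (3)
```

Both criteria divide by a variance, so a class on which $f$ is constant would need separate handling in a fuller implementation.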

Fig. 2. Train and test success rates for various pairwise Z-classifiers. [Panels: aa-ao, aa-dcl, aa-iy, aa-sh, ao-dcl, ao-iy, ao-sh, dcl-iy, dcl-sh, iy-sh; horizontal axis: maximal width, 5 to 30; vertical axis: % correctly classified, 80 to 100.]
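The exhaustive search over functions of the form (7) that produces such Z-classifiers can be sketched as follows. This is a minimal illustration under stated assumptions, not the C++ implementation used for the reported results: the data layout (one list of log-periodograms per class), all function names and the toy data are hypothetical, and the sketch ranks candidates by training accuracy only, omitting the tie-breaking criterion (9).

```python
from itertools import product


def quantile(values, k):
    """Return the k-th largest element: k=1 is max, k=2 is max2."""
    return sorted(values, reverse=True)[k - 1]


def neurons(n_points, max_len):
    """Yield (start, length, k): a spectral range plus a quantile rank on it."""
    for start in range(n_points):
        for length in range(1, min(max_len, n_points - start) + 1):
            for k in range(1, length + 1):
                yield start, length, k


def evaluate(s, neuron):
    """Apply one quantile neuron to the spectrum s (0-based indexing)."""
    start, length, k = neuron
    return quantile(s[start:start + length], k)


def search_z_classifier(class_pos, class_neg, max_len):
    """Try every pair of neurons and keep the most accurate f = q1(r1) - q2(r2).

    Spectra in class_pos should give f > 0 and spectra in class_neg f < 0.
    """
    candidates = list(neurons(len(class_pos[0]), max_len))
    total = len(class_pos) + len(class_neg)
    best, best_acc = None, -1.0
    for n1, n2 in product(candidates, repeat=2):
        correct = sum(evaluate(s, n1) - evaluate(s, n2) > 0 for s in class_pos)
        correct += sum(evaluate(s, n1) - evaluate(s, n2) < 0 for s in class_neg)
        if correct / total > best_acc:
            best, best_acc = (n1, n2), correct / total
    return best, best_acc


# Made-up spectra with five power points each.
pos = [[0, 5, 1, 0, 1], [0, 1, 6, 0, 2]]
neg = [[0, 1, 1, 0, 5], [0, 2, 0, 1, 6]]
best, acc = search_z_classifier(pos, neg, max_len=2)
```

Each neuron is one of roughly $N L^2$ triples, so the pair loop visits on the order of $N^2 L^4$ candidates, matching the count given for (7).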

4. CONTINUOUS NEIGHBORHOOD OF OPTIMAL QUANTILE CLASSIFIERS

One may try to improve the discrimination results by allowing more general functions of a spectral range. So let us suppose that we have spectral ranges $r_1$, $r_2$. Let us write sort() for the function that orders the entries of its vector argument. The commonly used LDA tries to optimize Fisher's discriminant of functions

$$ w_1 \cdot r_1 - w_2 \cdot r_2, \tag{10} $$

with real-valued weight vectors $w_1$, $w_2$, whereas in the previous section we optimized separations of expressions

$$ q_1(r_1) - q_2(r_2) = b_1 \cdot \operatorname{sort}(r_1) - b_2 \cdot \operatorname{sort}(r_2), \tag{11} $$

with exactly one nonzero entry in both $b_1$ and $b_2$. One may relax this condition on $b_i$ to obtain other classifiers.

Example 1. Consider the B-classifier defined by the function

$$ f(s) = \max(s_{62}, s_{63}, \dots, s_{74}) - s_1, \tag{12} $$

with threshold value $\theta = 4.03279$ that decides

$$ \begin{cases} \text{phoneme is dcl} & \text{if } f(s) < 4.03279, \\ \text{phoneme is iy} & \text{if } f(s) > 4.03279. \end{cases} \tag{13} $$

We can consider vectors $b_i$ in (11) of the following categories

(monotone OWA) nonnegative, increasing entries in each $b_i$, with total sum equal to one,

[...] sum equal to one,

[...] methods in the table use only the spectral component $s_1$ and the spectral range $s_{62}, s_{63}, \dots, s_{72}$. Balanced LDA is a variant of LDA in which the sum of all coefficients is 0.

| method        | train error | test error | Fisher train score | Fisher test score |
|---------------|-------------|------------|--------------------|-------------------|
| quantiles (f) | 2.7 %       | 5.1 %      | 6.22               | 6.54              |
| monotone OWA  | 3 %         | 4 %        | 6.24               | 6.5               |
| OWA           | 3 %         | 4.5 %      | 6.28               | 6.54              |
| ordered LDA   | 0.8 %       | 1.8 %      | 15                 | 12.83             |
| balanced LDA  | 4.2 %       | 5.7 %      | 5.69               | 5.85              |
| LDA           | 1.2 %       | 2.2 %      | 13.81              | 12.22             |

Table 2. Comparison of LDA with ordered methods. See text for explanation of various methods.

Our conclusion is that not much is gained by passing from discrete structures represented by quantiles to weighted ones, in line with similar research [14].

Fig. 3. Spectral ranges of optimal neural networks for discrimination ao-iy plotted against increasing spectral range width. White circle indicates the position of the quantile (right is the maximum, the left end and the middle would represent the minimum and the median respectively). More graphs are available in [12, page 83]. [Panel: ao-iy; horizontal axis: Frequency (Hz), 0 to 8000; vertical axis: spectral range width, 5 to 30.]

5. FUTURE WORK

[...] results due to limited space. It is clear however that more research is needed for this class of networks to find applications. Let us outline the directions further research may take.

First, one should take into account known psychoacoustic phenomena of human hearing. It may prove advantageous to [...] frequency [15]. Our experiments showed that often low frequency power was crucial for discrimination, and thus it may prove useful to use the Q-transform [16], which provides more data points in lower frequencies compared to the ordinary FFT.

Secondly, it is well known that it is two and sometimes up to four formants that characterize a vowel. It is therefore necessary to consider a more complex set of discrimination functions. In [17] we proposed an algebra whose elements are candidates for describing the structure of more complex networks.

Let us conclude by summarizing the advantages of the proposed networks. By design, they provide guaranteed performance on variations of trained data unseen during training, they are interpretable, and they have very low Kolmogorov complexity, as the example (12) shows.

Fig. 4. Evaluation of dcl-iy classifier over a Slovak word (IPA: /odiSla/) superimposed on PCM signal. Note that the classifier was trained on English data (TIMIT).

Last but not least, the networks can be easily implemented in hardware, since BJT [18], CMOS [19] and even passive memristor implementations [20] of min, max and comparison operators exist. In this context it is worthwhile to point out that it is the change of, rather than the absolute, spectral content that can be read off from these discriminants. This is illustrated in Figure 4, where the transition between phonemes is quite strong. One may thus hypothesize that an analog hardware speech recognizer could be based on a silicon cochlea [21], followed by processing by a neural network of the kind described here, whose output would be fed to an adaptive differentiator like that of Delbrück and Mead [22], and finally to a memristive switch [23].
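To make the low-complexity remark about example (12) concrete, the classifier of Example 1 and the ordered representation (11) fit in a few lines of Python. This is a sketch under stated assumptions: the paper indexes the spectrum from 1 while the code indexes from 0, sort() is taken to be ascending (so the maximum corresponds to a weight vector whose only nonzero entry is last), and the function names as well as the `None` result for $f(s) = \theta$, a case that (13) leaves undecided, are choices made for illustration.

```python
def owa(b, r):
    """Ordered weighted form b . sort(r) from (11), with ascending sort assumed."""
    return sum(w * x for w, x in zip(b, sorted(r)))


def example_classifier(s, theta=4.03279):
    """Apply (12) and (13) to a log-periodogram s, where s[0] is s_1."""
    # s_62 .. s_74 in 1-based indexing is the slice s[61:74] in 0-based indexing.
    f = max(s[61:74]) - s[0]
    if f < theta:
        return "dcl"
    if f > theta:
        return "iy"
    return None  # f == theta is not covered by (13)


# Made-up spectrum with 256 points and a single strong component at s_66.
spectrum = [0.0] * 256
spectrum[65] = 10.0
label = example_classifier(spectrum)
```

With a weight vector such as `[0, 0, 1]`, `owa` returns the maximum of a three-element range, which is the single-nonzero-entry case of (11); relaxing the weights gives the OWA variants compared in Table 2.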

References

[1] G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, "Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 82–97, 2012, ISSN: 1053-5888. DOI: 10.1109/MSP.2012.2205597.

[2] C. M. Bishop, Pattern recognition and machine learning. Springer, 2006.

[3] D. H. Ackley, G. E. Hinton, and T. J. Sejnowski, "A learning algorithm for Boltzmann machines," Cognitive Science, vol. 9, no. 1, pp. 147–169, 1985, ISSN: 0364-0213. DOI: http://dx.doi.org/10.1016/S0364-0213(85)80012-4. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0364021385800124.

[4] S. Badura, "Learning of fuzzy logic circuits with memory for speech recognition using video," Ph.D. thesis, Žilinská univerzita, 2012.

[5] S. Foltán, "Speech recognition by means of fuzzy logical circuits," Ph.D. thesis, Žilinská univerzita, 2012.

[6] S. Foltán, "Speech recognition by means of fuzzy logical circuits," in 18th international conference on soft computing, MENDEL 2012, Brno, Jun. 27–29, 2012.

[7] S. Foltán and J. Smieško, "Phoneme recognition by application of genetic programming," in Sborník příspěvků z mezinárodní vědecké konference, Mezinárodní konference pro doktorandy a mladé vědecké pracovníky, Jun. 27–29, 2012, pp. 2234–2241.

[8] S. Badura, M. Klimo, and O. Škvarek, "Lip reading using fuzzy logic network with memory," in Application of Information and Communication Technologies (AICT), 6th International Conference, Oct. 2012, pp. 1–4. [Online]. Available: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6398471&isnumber=6398461.

[9] S. Badura, S. Foltán, and M. Klimo, "Fuzzy logic networks for speech recognition," Communications, Scientific Letters of University of Žilina, pp. 13–18, 2 2013.

[10] T. Hastie, A. Buja, and R. Tibshirani, "Penalized discriminant analysis," The Annals of Statistics, vol. 23, no. 1, pp. 73–102, 1995.

[11] (Dec. 7, 2012). English phonemes, [Online]. Available: http://www-stat.stanford.edu/~tibs/ElemStatLearn/datasets/phoneme.data.

[12] O. Šuch. (2013). Using min-max circuits for speech recognition, [Online]. Available: http://www.savbb.sk/~ondrejs/Phoneme/book.pdf.

[13] R. Yager, "On ordered weighted averaging aggregation operators in multicriteria decisionmaking," Systems, Man and Cybernetics, IEEE Transactions on, vol. 18, no. 1, pp. 183–190, 1988, ISSN: 0018-9472. DOI: 10.1109/21.87068.

[14] D. Soudry and R. Meir. (2013). Mean Field Bayes Backpropagation: scalable training of multilayer neural networks with binary weights, [Online]. Available: http://arxiv.org/abs/1310.1867.

[15] R. Stern and N. Morgan, "Hearing is believing: biologically inspired methods for robust automatic speech recognition," Signal Processing Magazine, IEEE, vol. 29, no. 6, pp. 34–43, 2012, ISSN: 1053-5888. DOI: 10.1109/MSP.2012.2207989.

[16] J. C. Brown, "Calculation of a constant Q spectral transform," The Journal of the Acoustical Society of America, vol. 89, no. 1, pp. 425–434, 1991. DOI: 10.1121/1.400476. [Online]. Available: http://scitation.aip.org/content/asa/journal/jasa/89/1/10.1121/1.400476.

[17] O. Šuch. (2013). Phoneme discrimination using KS algebra I, [Online]. Available: arxiv.org/abs/1302.6031.

[18] T. Yamakawa, "Fuzzy logic computers and circuits," U.S. Patent 4,875,184, 1987.

[19] I. Baturone, S. Sanchez-Solano, A. Barriga, and J. L. Huertas, "Implementation of CMOS fuzzy controllers as mixed-signal integrated circuits," IEEE J. of Solid State Circuits, vol. 5, no. 1, pp. 1–19, 1997.

[20] M. Klimo and O. Šuch. (2011). Memristors can implement fuzzy logic, [Online]. Available: http://arxiv.org/abs/1110.2074.

[21] B. Wen and K. Boahen, "A silicon cochlea with active coupling," Biomedical Circuits and Systems, IEEE Transactions on, vol. 3, no. 6, pp. 444–455, Dec. 2009, ISSN: 1932-4545. DOI: 10.1109/TBCAS.2009.2027127.

[22] T. Delbrück and C. A. Mead, "Adaptive photoreceptor with wide dynamic range," in 1994 International Symposium on Circuits and Systems, ISCAS '94, 1994, pp. 339–342, vol. 4.

[23] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, pp. 80–83, 2008, doi:10.1038/nature06932.
