(12) United States Patent
     Terez

(10) Patent No.: US 7,124,075 B2
(45) Date of Patent: Oct. 17, 2006

(54) METHODS AND APPARATUS FOR PITCH DETERMINATION

(75) Inventor: Dmitry Edward Terez, Millville, NJ (US) 08332

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 935 days.

(21) Appl. No.: 10/140,211

(22) Filed: May 7, 2002

(65) Prior Publication Data
     US 2003/0088401 A1, May 8, 2003

Related U.S. Application Data
(60) Provisional application No. 60/348,883, filed on Oct. 26, 2001.

Primary Examiner: Martin Lerner
(74) Attorney, Agent, or Firm: Straub and Pokotylo; Michael P. Straub

(57) ABSTRACT

Methods and apparatus for detecting periodicity and/or for determining the fundamental period of a signal such as speech. The methods include embedding a portion of a sampled digitized signal into an m-dimensional state space to obtain a sequence of m-dimensional vectors, selecting closest pairs of vectors in state space from a plurality of possible pairs of m-dimensional vectors in said sequence of m-dimensional vectors, accumulating total numbers of selected closest pairs of vectors having the same time separation values to produce a histogram of accumulated numbers, and locating at least a highest peak in a portion of said histogram to obtain a value indicating the fundamental period of the signal. Various embodiments are directed to speech and audio signal processing and other speech related applications. However, the methods have a general nature and can be applied to other types of periodic or quasi-periodic signals as well.

62 Claims, 16 Drawing Sheets
U.S. Patent    Oct. 17, 2006    Sheets 1-11 of 16    US 7,124,075 B2

[Sheets 1-9: FIGS. 1A, 2A-2C, 3A-3C, 4A, 5A-5C, 6A-6C, 7A, 8A-8C and 9A-9C. FIGS. 1A, 4A and 7A are plotted against sample number. The remaining panels are plotted against time separation (samples), with vertical axes that include Euclidean distance (FIGS. 2A, 5A), number of pairs (FIG. 5B), normalized number (FIGS. 2C, 3A, 5C, 8C, 9A) and amplitude (FIGS. 3C, 9C). FIG. 2A is annotated "559 pairs".]

[Sheet 10: FIG. 10, flowchart]
SPEECH SIGNAL
102  Signal pre-processing (including A/D conversion)
104  Embed a portion of sampled signal into m-dimensional state space to obtain a sequence of m-dimensional vectors
106  Select closest pairs of vectors in state space from a plurality of possible pairs of m-dimensional vectors
108  Accumulate total number of selected closest pairs of vectors for each of a plurality of time separation values to produce a histogram of accumulated numbers
110  Locate one or more highest peaks in the histogram
112  Post-processing of the located highest peaks
PITCH VALUE

[Sheet 11: FIG. 11, flowchart, partial]
Signal frame
Embed into m-dimensional state space, and normalize the trajectory
216  Select pairs of vectors closer than r_max in state space from a plurality of possible pairs and assign total number of the selected pairs to ntotal
218  Compute normalized histogram with the ntotal selected pairs and determine the highest peak's magnitude

... among possible pairs of vectors in the sequence of m-dimensional vectors x(i) (i = 1, ..., M).

Closest pairs of vectors can be selected by choosing some neighborhood radius r in state space and identifying pairs of vectors with a distance between vectors in state space less than this radius. This procedure can be visualized by dissecting a space-time separation plot with a horizontal line at the vertical position corresponding to a chosen r and selecting all data points below this line. For example, horizontal solid line 22 in FIG. 2A defines the neighborhood radius r = 0.1. In FIG. 2A, there are 559 data points below the line, corresponding to the selected closest pairs of vectors in state space.

In one embodiment, distances D[x(i), x(j)] are computed for all possible non-repeating pairs of vectors in the sequence of m-dimensional vectors (x(i), x(j)), where 1 ≤ i < j ≤ M, and the computed distances are then compared with the predetermined value of r, and pairs with distance D[x(i), x(j)] < r are selected as closest pairs. In the exemplary embodiment, squared Euclidean distances are computed. The computed distances are compared with the squared value of r. The value of r should be chosen appropriately.
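The radius-based selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the helper names `embed` and `select_closest_pairs`, the unit-delay time-delay embedding, and the sine test signal are all introduced here for illustration. Only the comparison of squared Euclidean distances against the squared radius follows the exemplary embodiment.

```python
import math

def embed(signal, m, delay=1):
    """Time-delay embedding (an assumption here): x(i) = (s[i], s[i+delay], ...)."""
    count = len(signal) - (m - 1) * delay
    return [tuple(signal[i + j * delay] for j in range(m)) for i in range(count)]

def select_closest_pairs(vectors, r):
    """Return index pairs (i, j), i < j, whose vectors are closer than r."""
    r_squared = r * r  # compare squared distances, so no square root is needed
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):  # non-repeating pairs only
            d2 = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))
            if d2 < r_squared:
                pairs.append((i, j))
    return pairs

# Hypothetical input: 100 samples of a sine with a 20-sample period.
signal = [math.sin(2 * math.pi * t / 20) for t in range(100)]
vectors = embed(signal, m=3)
pairs = select_closest_pairs(vectors, r=0.1)
separations = sorted({j - i for i, j in pairs})  # time separations in samples
print(len(pairs), separations)
```

For a clean periodic input such as this one, the selected pairs should be separated by multiples of the period. The double loop visits every non-repeating pair, so its cost grows quadratically with the number of vectors M.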
As an example of choosing r, in one embodiment reconstructed trajectories for all frames are normalized to fit into a unit cube in state space, and a constant radius value of 0.15 is used.

One can also select a predetermined number of vector pairs with the smallest distances between vectors in state space from a set of vector pairs. In one embodiment, closest pairs of vectors are selected by comparing spatial distances D[x(i), x(j)] for all possible non-repeating pairs of vectors in the sequence of vectors x(i) (i = 1, ..., M), ordering vector pairs by their spatial distances in increasing order, and selecting a predetermined number of closest pairs from the ordered set of vector pairs. The selection can be easily performed as a result of the ordering. For the selected closest pairs of vectors (x(i), x(j)), the corresponding time separations Δt between vectors (expressed as an integer number of samples) are recorded.

Periodicity histogram

A periodicity histogram is computed based on time separation values of the selected closest pairs of vectors. Each bin in the periodicity histogram accumulates a total number of selected closest pairs having the same time separation between vectors, e.g., as expressed by the number of samples corresponding to a bin index. The term "histogram" in this description is used to refer to a one-dimensional array of numbers, where each bin in the histogram corresponds to an element of the one-dimensional array.

Periodicity histogram computation can be performed by summing up data points with the same horizontal positions (that is, lined up vertically) and located below line 22 in the space-time separation plot of FIG. 2A, to yield the histogram shown in FIG. 2B.

For the sequence of vectors x(i) (i = 1, ..., M) representing a trajectory in m-dimensional state space, a periodicity histogram can be formally defined as

\[ n(k) = \sum_{i=1}^{M-k} H\bigl(r - D[x(i), x(i+k)]\bigr) \tag{4} \]
where k is the bin index corresponding to the time separation in samples between vectors x(i) and x(i+k), r is a predetermined neighborhood radius, D[x(i), x(i+k)] is a spatial distance between the vectors, and H is the Heaviside function.

As discussed above, the Euclidean spatial distance between vectors, used in the exemplary embodiment, can be replaced with some other distance norm in m-dimensional space.

FIGS. 2B, 5B and 8B show periodicity histograms computed according to EQ. 4 with r = 0.1 for a sustained vowel, a transitional voiced segment and an unvoiced fricative /S/, respectively. FIG. 2B shows a sharp peak 24 corresponding to the fundamental pitch period of the periodic vowel, and a second sharp peak 26 corresponding to twice the pitch period value. The periodicity histogram in FIG. 5B, computed for the transitional voiced segment, shows a peak 52 corresponding to a fundamental pitch period. However, in this case the peak 52 is much lower and is not sharp. The periodicity histogram for the unvoiced fricative /S/ in FIG. 8B shows many random low peaks distributed along the time separation axis.

In general, a periodicity histogram computed according to EQ. 4 with an appropriately chosen value of r (or, equivalently, with an appropriate number of selected closest pairs of vectors) will have distinct peaks corresponding to a fundamental period and its integer multiples for periodic signals. Periodicity histograms corresponding to aperiodic signals will lack such characteristic peaks.

Histogram bins with small index values of k near or equal to zero should be excluded from consideration when searching for histogram peaks. Those bins correspond to pairs of vectors with small time separations between vectors in samples. Such pairs of vectors represent successive points on the reconstructed trajectory and, therefore, are normally close in state space. In particular, the highest histogram peak according to EQ. 4 is always at k = 0 and its magnitude is equal to M.
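The histogram of EQ. 4 can be computed directly from its definition, as in the sketch below. It is an illustration under stated assumptions rather than the patent's implementation: the function name `periodicity_histogram`, the sine test signal, and the unit-delay three-dimensional embedding are hypothetical choices made here, and the Heaviside function is taken to equal 1 only for strictly positive arguments, which matches selecting pairs with distance less than r.

```python
import math

def periodicity_histogram(vectors, r):
    """Return n(k) for k = 0 .. M-1: count of pairs (x(i), x(i+k)) closer than r."""
    M = len(vectors)
    r_squared = r * r
    hist = [0] * M
    for k in range(M):
        for i in range(M - k):  # the summation range shrinks as k grows
            d2 = sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[i + k]))
            if d2 < r_squared:  # Heaviside step, assumed 1 only when r - D > 0
                hist[k] += 1
    return hist

# Hypothetical input: a sine with a 20-sample period, embedded with m = 3.
signal = [math.sin(2 * math.pi * t / 20) for t in range(100)]
vectors = [tuple(signal[i:i + 3]) for i in range(len(signal) - 2)]
hist = periodicity_histogram(vectors, r=0.1)

# Skip the trivial k = 0 bin when looking for the highest peak.
best_k = max(range(1, len(hist)), key=lambda k: hist[k])
print(hist[0], best_k, hist[best_k])
```

In this sketch the k = 0 bin equals the number of vectors M, since every vector is at zero distance from itself, which is why the peak search starts at k = 1. A practical search would likely exclude more of the small-k bins, as the text recommends.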
Since the summation interval in EQ. 4 linearly shrinks with an increasing value of k, a periodicity histogram has a bias: an upper bound is not the same for all bins and is a linearly decaying function of k, as shown by the slanting line in FIG. 2B. This causes the magnitudes of histogram peaks to decay with increasing values of k, as observed in FIG. 2B. Due to this decay, the main histogram peak, corresponding to the lowest sub-multiple and representing a true fundamental period, is usually the largest of all peaks for clean and steady periodic signals, as is evidenced by peak 24 in FIG. 2B. Thus, locating the highest peak in the periodicity histogram can give a reliable pitch period estimate for clean and steady periodic frames.

For larger values of k approaching M, only a few numbers can be accumulated when computing the corresponding histogram bins. Hence, histogram bins close to the right edge are statistically unreliable and should also be excluded from consideration when searching for peaks.

In the exemplary embodiment, a periodicity histogram is computed and searched for peaks for the values of k in the predetermined interval of possible pitch periods and not for other values of k. Thus, in such an embodiment, only pairs of vectors with time separation values k satisfying p_low
