In Tech 0305

Published by: Rainy Shady Mistu on Jun 27, 2012
Copyright:Attribution Non-commercial







A Mechanical Voice System: Construction of Vocal Cords and its Pitch Control
Toshio Higashimoto and Hideyuki Sawada
Dept. of Intelligent Mechanical Systems Engineering, Faculty of Engineering, Kagawa University, 2217-20, Hayashi-cho, Takamatsu-city, Kagawa, 761-0396, JAPAN
A mechanical model of the human vocal system is being developed based on mechatronics technology under feedback control. While various ways of producing vocal sounds have been actively studied so far, a mechanical construction of the vocal system is considered advantageous for realizing natural vocalization through its fluid dynamics. Several motors are used to manipulate the mechanical voice system. By mimicking human vocalization, the system is able to learn the relations between motor control parameters and the produced vocal sounds using auditory feedback with neural networks. This paper introduces the construction of the vocal cords and their adaptive control for pitch learning. The mechanical system generates vowel and consonant sounds of different pitches by dynamically controlling the vocal cords, vocal tract and nasal cavity.
1. Introduction
Only humans are able to use words as the primary medium of verbal communication, while almost all animals have voices. Different vocal sounds are generated by the complex movements of the vocal organs under feedback control mechanisms that use the auditory system. Vocal sounds and human vocalization mechanisms have interested many researchers for a long time, and computerized voice production and recognition have become essential technologies in recent developments of flexible human-machine interfaces.

In research on sound production, various methods and techniques have been reported. Algorithmic synthesis has taken the place of analogue circuit synthesis and has become widely used. Sound sampling methods and physical-model-based synthesis are typical techniques, which are expected to provide different types of realistic vocal sounds. In addition to these algorithmic synthesis techniques, a mechanical approach using a phonetic or vocal model that imitates the human vocalization mechanism would be a valuable and notable objective.

Several mechanical constructions of a human vocal system to realize human-like speech have been reported. In most of this research, however, the mechanical reproductions of the human vocal system were mainly designed by referring to X-ray images and FEM analysis, and control methods for natural vocalization have not been considered so far. In fact, since the behaviors of the vocal organs have not been sufficiently investigated, owing to nonlinear factors of fluid dynamics that are yet to be overcome, control of such mechanical systems has often been difficult to establish.

We are developing a mechanical voice generation system together with its control mechanism for voice production imitating human vocalization. The fundamental frequency and the spectrum envelope determine the principal characteristics of a sound. The former is a characteristic of the source sound generated by a vibrating object, and the latter is shaped by resonance effects. In vocalization, the vibration of the vocal cords generates a source sound, and the sound wave is then led to the vocal tract, which works as a filter that determines the spectrum envelope.

We have so far constructed a motor-controlled mechanical model with a vocal cord and a vocal tract. By introducing an auditory feedback mechanism with an adaptive control algorithm for pitch and phoneme, the system is able to autonomously acquire the control method of the mechanical system to produce stable vocal sounds imitating human vocalization. That system used an artificial vocal cord of the kind used by people who have had their vocal cords removed because of a glottal disease. The vibration of a piece of rubber 5 mm wide, stretched over a plastic body, served as the vocal sound source. The tension of the rubber was manipulated by applying a tensile force with a motor, so that the fundamental frequency of the generated vocal sound could be changed easily.

To bring the quality of the sound generated by the voice system closer to that of a human voice, we worked to develop human-like vocal cords. This paper describes the construction of the vocal cords and their control for changing the fundamental frequency, together with an adaptive learning mechanism.
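The link between rubber tension and fundamental frequency, and the idea of adjusting the motor until the heard pitch matches a target, can be illustrated with a small sketch. It rests on strong assumptions that are not in the paper: the rubber is treated as an ideal stretched string, the length and mass values are hypothetical, and a simple multiplicative feedback rule stands in for the adaptive learning algorithm.

```python
import math

# Hypothetical values chosen for illustration, not taken from the paper.
LENGTH_M = 0.02   # assumed vibrating length of the rubber
MU_KG_M = 0.002   # assumed mass per unit length of the rubber


def string_f0(tension_n, length_m=LENGTH_M, mu_kg_m=MU_KG_M):
    """Fundamental frequency (Hz) of an ideal stretched string."""
    return math.sqrt(tension_n / mu_kg_m) / (2.0 * length_m)


def tune_tension(target_hz, tension_n=1.0, gain=0.5, steps=200, tol_hz=0.1):
    """Adjust tension by feedback until the simulated pitch matches the target.

    This is a plain feedback rule, not the neural-network learning
    that the paper refers to.
    """
    for _ in range(steps):
        heard_hz = string_f0(tension_n)
        if abs(heard_hz - target_hz) < tol_hz:
            break
        # Pitch grows with the square root of tension, so a flat pitch
        # calls for more tension and a sharp pitch for less.
        tension_n *= (target_hz / heard_hz) ** (2.0 * gain)
    return tension_n, string_f0(tension_n)


if __name__ == "__main__":
    tension, pitch = tune_tension(220.0)
    print(f"tension = {tension:.3f} N, pitch = {pitch:.2f} Hz")
```

In this idealized model, quadrupling the tension doubles the pitch, which is why a single motor pulling on the rubber is enough to sweep the fundamental frequency.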
2. Human Vocal System and Voice Generation
Human vocal sounds are generated by the coordinated operation of vocal organs such as the lung, trachea, vocal cords, vocal tract, tongue and muscles. In human verbal communication, the sound is perceived as words, which consist of vowels and consonants.

The lung functions as an air tank, and the airflow through the trachea causes the vocal cords to vibrate, producing the source sound of the voice. The glottal wave is led to the vocal tract, which works as a sound filter that forms the spectrum envelope of the voice. The fundamental frequency and the volume of the sound source are varied by changing physical parameters such as the stiffness of the vocal cords and the amount of airflow from the lung, and these parameters are specifically controlled when we sing. In contrast, the spectrum envelope, which is necessary for pronouncing words consisting of vowels and consonants, is formed by the inner shape of the vocal tract and the mouth, which is governed by the complex movements of the jaw, tongue and muscles. Vowel sounds are radiated from a relatively stable configuration of the vocal tract, while consonants are generally produced by short, dynamic motions of the vocal apparatus. The dampness and viscosity of the organs greatly influence the timbre of the generated sounds, as we may experience when we have a sore throat. Appropriate configurations of the vocal tract for producing phonemes are acquired as infants grow, through repeated trial and error in hearing and producing vocal sounds.
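The source and filter roles described above can be mimicked numerically. The following is a minimal sketch, assuming an impulse train as a crude glottal source and a cascade of two-pole digital resonators as a stand-in for vocal-tract resonances; the function names, formant values and sample rate are illustrative assumptions, and this is a software analogy rather than the mechanical system of the paper.

```python
import math


def impulse_train(f0_hz, sample_rate, n_samples):
    """Source: one unit pulse per pitch period (a crude glottal source)."""
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Filter: two-pole resonator standing in for one tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(f0_hz, formants, sample_rate=16000, duration_s=0.1):
    """Pass the source through one resonator per (centre, bandwidth) pair."""
    n_samples = round(sample_rate * duration_s)
    wave = impulse_train(f0_hz, sample_rate, n_samples)
    for centre_hz, bandwidth_hz in formants:
        wave = resonator(wave, centre_hz, bandwidth_hz, sample_rate)
    return wave


if __name__ == "__main__":
    # Hypothetical formant values, chosen only for illustration.
    wave = synthesize(100.0, [(700.0, 80.0), (1200.0, 90.0)])
    print(len(wave), max(abs(v) for v in wave))
```

Changing `f0_hz` alters only the pitch of the source, while changing `formants` alters only the spectrum envelope, which mirrors the separation between vocal cords and vocal tract described in the text.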
3. Mechanical Model for Vocalization
3-1. Configuration of Mechanical Voice System
As shown in Figure 1, the mechanical voice system mainly consists of an air compressor, artificial vocal cords, a resonance tube, a nasal cavity, and a microphone connected to a sound analyzer, which correspond to the lung, vocal cords, vocal tract, nasal cavity and hearing of a human.

The air in the compressor is compressed to 8000 hPa, whereas the pressure of the air from the lungs is about 200 hPa above atmospheric pressure. A pressure reduction valve is fitted at the outlet of the air compressor so that the pressure is reduced to be nearly equal to the air pressure in the trachea. The valve also reduces the fluctuation of the pressure in the compressor during its compression and decompression processes. The decompressed air is led to the vocal cords via an airflow control valve, which controls the voice volume. The resonance tube is attached to the vocal cords to modify the resonance characteristics.

The sound analyzer plays the role of the auditory system. It performs pitch extraction and analysis of the resonance characteristics of the generated sound in real time, both of which are necessary for the auditory feedback control. The system controller manages the whole system by listening to the produced sounds and generating motor control commands, based on the auditory feedback control mechanism.
Figure 1: System Configuration (block labels: Air Compressor, Air Flow, Vocal Cords, Resonance Tube, Nasal Cavity, 5 Motors, System Controller, Learning Part; corresponding human functions: Lung and Trachea, Vocal Cords, Vocal Tract & Nasal Cavity, Muscles, Audition)
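The paper does not say how the sound analyzer extracts pitch. As one common possibility, the sketch below estimates the fundamental frequency from the strongest autocorrelation peak within a plausible pitch range. The function name, the search range and the test tone are assumptions made for illustration.

```python
import math


def estimate_pitch(samples, sample_rate, fmin_hz=60.0, fmax_hz=500.0):
    """Return the pitch (Hz) at the strongest autocorrelation lag, or None."""
    min_lag = int(sample_rate / fmax_hz)
    max_lag = min(int(sample_rate / fmin_hz), len(samples) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # The unnormalized sum shrinks for long lags, which favours the
        # shortest period over its integer multiples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


if __name__ == "__main__":
    rate = 8000
    # Synthetic test tone: 200 Hz fundamental plus a weaker second harmonic.
    tone = [math.sin(2 * math.pi * 200 * i / rate)
            + 0.5 * math.sin(2 * math.pi * 400 * i / rate)
            for i in range(800)]
    print(estimate_pitch(tone, rate))
```

A controller could compare this estimate with a target pitch and turn the difference into a motor command, which is the role the text assigns to the auditory feedback loop.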
3-2. Construction of Resonance Tube and Nasal Cavity
The human vocal tract is a non-uniform tube about 170 mm long in a man. Its cross-sectional area varies from 0 to 20 cm² under the control for vocalization. A nasal tract with a total volume of 60 cm³ is coupled to the vocal tract. Nasal sounds such as /m/ and /n/ are normally excited by the vocal cords and resonated in the nasal cavity. Nasal sounds are generated by closing the soft palate and lips, so that air is not radiated from the mouth and the sound resonates in the nasal cavity. The closed vocal tract works as a lateral branch resonator and also contributes resonance characteristics to the nasal sounds. The /m/ and /n/ sounds can be distinguished from each other by the difference in the articulatory positions of the tongue and mouth.

In the mechanical system, a resonance tube serving as the vocal tract is attached to the sound outlet of the artificial vocal cords. It works as a resonator for the source sound generated by the vocal cords. It is made of silicone rubber with a length of 180 mm and a diameter of 36 mm, which corresponds to a cross-sectional area of 10.2 cm², as shown in Figures 2 and 3. The silicone rubber is molded with the softness of human skin, which contributes to the quality of the resonance characteristics. In addition, a nasal cavity made of plaster is connected to the intake part of the resonance tube to vocalize nasal sounds like /m/ and /n/.

Figure 2: Construction of Vocal Tract and Nasal Cavity (dimension labels: 40, diameter 36, length 180; "Intake side")
Figure 3: Structural View of Mechanical System
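The stated cross-sectional area follows directly from the 36 mm diameter, and the 180 mm length gives a rough idea of where the resonances of the undeformed tube might lie. The sketch below checks the area and, under the assumption of an ideal uniform tube closed at the vocal-cord end and open at the outlet, lists its quarter-wavelength resonances. The speed of sound of 343 m/s and the ideal-tube model are assumptions; the soft silicone wall and the attached nasal cavity will shift the real values.

```python
import math


def cross_section_cm2(diameter_mm):
    """Area of a circular cross-section, in square centimetres."""
    radius_cm = diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2


def quarter_wave_resonances(length_m, count=3, speed_of_sound=343.0):
    """Resonances (Hz) of an ideal uniform tube, closed at one end."""
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, count + 1)]


if __name__ == "__main__":
    print(round(cross_section_cm2(36.0), 1))       # area of the 36 mm tube
    for f in quarter_wave_resonances(0.18):        # 180 mm tube
        print(round(f))
```

For the 36 mm tube the area rounds to 10.2 cm², matching the text, and the lowest idealized resonance of a 180 mm tube comes out near 476 Hz. Pressing the stainless bars into the tube changes the area profile and moves these resonances away from the uniform-tube values.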
By applying displacement forces with stainless bars from the outside, the cross-sectional area of the tube is manipulated, so that the resonance characteristics change with the deformation of the inner area of the resonator. DC motors are placed at 5 positions (numbered 1-5) from the intake side of the tube to the outlet side, as shown in Figure 2, and the displacement forces are applied according to the control commands from the phoneme-motor controller.

A nasal cavity is coupled with the resonance tube serving as the vocal tract, so that human-like nasal sounds can be vocalized by controlling the mechanical parts. A sliding valve, which plays the role of the soft palate, is installed at the connection between the resonance tube and the nasal cavity to select between nasal and normal sounds.

For the generation of the nasal sounds /n/ and /m/, the sliding valve is opened to lead the air into the nasal cavity, as shown in Figure 4(a). The /n/ consonant is generated by closing the middle position of the vocal tract and then releasing the air to speak a vowel sound. For the /m/ consonant, the outlet part is first closed to stop the air and is then opened to vocalize a vowel. The difference between generating the /n/ and /m/ consonants is basically the position at which the vocal tract is narrowed.

In generating the plosive sounds /p/ and /t/, the mechanical system closes the sliding valve so that air is not released through the nasal cavity. By closing one point of the vocal tract, the air provided from the lung is stopped and compressed in the tract, as shown in Figure 4(b). The released air then generates plosive consonant sounds like /p/ and /t/.
Figure 4: Motor Control for Nasal and Plosive Sound Generation. (a) Airflow control for nasal sound /n/ (sliding valve open). (b) Airflow control for plosive sound /p/ (sliding valve closed).
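The valve and closure settings described for nasal and plosive consonants can be summarized as a small lookup table that a controller might expand into a command sequence. This is only a sketch: the names `ConsonantPlan`, `PLANS` and `command_sequence` are hypothetical, and the closure positions for /p/ and /t/ are assumptions, since the text states only that one point of the tract is closed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsonantPlan:
    nasal_valve_open: bool  # state of the sliding valve (soft palate role)
    closure: str            # where the tract is closed before release


PLANS = {
    "n": ConsonantPlan(nasal_valve_open=True, closure="middle"),
    "m": ConsonantPlan(nasal_valve_open=True, closure="outlet"),
    # Assumed closure positions: the paper does not specify them.
    "p": ConsonantPlan(nasal_valve_open=False, closure="outlet"),
    "t": ConsonantPlan(nasal_valve_open=False, closure="middle"),
}


def command_sequence(consonant):
    """Expand a consonant into ordered (action, argument) commands."""
    plan = PLANS[consonant]
    return [
        ("sliding_valve", "open" if plan.nasal_valve_open else "closed"),
        ("close_tract", plan.closure),
        ("release_to_vowel", plan.closure),
    ]


if __name__ == "__main__":
    for name in ("n", "m", "p", "t"):
        print(name, command_sequence(name))
```

Keeping the valve state and the closure position as separate fields reflects the text: nasals and plosives differ in the valve, while /n/ and /m/ differ only in where the tract is narrowed.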
